PeriscopeAI
285 million people have a visual impairment, and 39 million are completely blind. Most of them live in less developed countries, with inadequate health systems and less support than in developed countries.
As phone prices drop fast and even cheap secondhand devices now have strong processing power, we want to provide a powerful app that lets people with visual impairment use our image captioning neural networks to detect danger or simply figure out what is going on around them.
The App
An Android application takes a snapshot every 8 to 10 seconds and sends it as a byte array through a REST API to our TensorFlow neural network, which runs on Python 3. The machine learning model (the image captioning model im2txt [2]) analyses the picture and sends an English caption to a Microsoft Azure translation API. That API translates the sentence into one of 60 languages and forwards it to another Azure API, which converts the text into an MP3. After the audio file is sent back to the mobile app, the process repeats until the user turns off the app.
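The server-side flow above (snapshot → caption → translation → speech) can be sketched as a simple pipeline. This is a minimal illustration with stand-in functions: the function names and the stubbed bodies are hypothetical, standing in for real im2txt inference and the Azure Translator and text-to-speech calls.

```python
# Sketch of the PeriscopeAI server-side pipeline. All function bodies are
# hypothetical stubs; the real app runs im2txt inference in TensorFlow and
# calls the Azure Translator and text-to-speech APIs over HTTP.

def caption_image(image_bytes: bytes) -> str:
    """Stand-in for the im2txt captioning model: returns an English caption."""
    # A real implementation would run TensorFlow inference here.
    return "a person riding a bicycle down a street"

def translate(text: str, target_lang: str) -> str:
    """Stand-in for the Azure translation call (one of 60 target languages)."""
    # A real implementation would POST `text` to the Translator endpoint.
    translations = {"de": "eine Person faehrt mit dem Fahrrad eine Strasse entlang"}
    return translations.get(target_lang, text)

def synthesize_speech(text: str) -> bytes:
    """Stand-in for the Azure text-to-speech call: returns MP3 bytes."""
    # Fake payload for illustration; a real MP3 would come back from Azure.
    return b"ID3" + text.encode("utf-8")

def process_snapshot(image_bytes: bytes, target_lang: str) -> bytes:
    """Full pipeline: snapshot -> English caption -> translation -> audio."""
    caption = caption_image(image_bytes)
    translated = translate(caption, target_lang)
    return synthesize_speech(translated)
```

On the phone side, the app would loop: capture a frame, call `process_snapshot` via the REST API, play the returned MP3, and repeat every 8 to 10 seconds.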
Challenges
Sending the audio files back to the mobile app and building a user-friendly interface that accounts for our users' visual impairment.
What's next for PeriscopeAI
- Danger Detection
- Modification of the model for the task of Audio recognition and understanding for deaf people (voice 2 text)
- Only play new events which happen in front of the camera (learned customization of the models)
- Detection of known people or family members (open source model OpenFace [3])
- Create a lite version for remote areas without mobile network (+/-2 sec inference time on low powered machine)
References / Research
Our work was inspired by the research of:
- [1] Andrej Karpathy's research project Neural Talk.
- [2] Vinyals, Oriol, et al. "Show and tell: Lessons learned from the 2015 mscoco image captioning challenge." IEEE transactions on pattern analysis and machine intelligence 39.4 (2017): 652-663. [https://arxiv.org/abs/1609.06647]
Further customization to the user by recognition of their family members and close ones with open source model OpenFace:
- [3] B. Amos, B. Ludwiczuk, M. Satyanarayanan, "Openface: A general-purpose face recognition library with mobile applications," CMU-CS-16-118, CMU School of Computer Science, Tech. Rep., 2016. [http://cmusatyalab.github.io/openface/]
Thanks
Thanks to Rafael of the University of St. Gallen for the logo. Thanks to the whole START Hack team for the hackathon.