PeriscopeAI
285 million people have a visual impairment, and 39 million are completely blind. Most of them live in less developed countries, with inadequate health systems and less support than in developed countries.
As phone prices drop fast and even cheap secondhand devices now have strong processing power, we want to provide a powerful app that lets people with visual impairment use our image captioning neural networks to detect danger or simply figure out what is going on around them.
The App
An Android application takes a snapshot every 8 to 10 seconds and sends it as a byte array through a REST API to our TensorFlow neural network, which runs on Python 3. The machine learning model (the image captioning model im2txt [2]) analyses the picture and sends an English caption to a Microsoft Azure translation API. That API translates the sentence into one of 60 languages and forwards it to another Azure API, which converts the text into an MP3. After the audio file is sent back to the mobile app, the process repeats until the user turns off the app.
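The server-side flow above (snapshot → caption → translation → speech) can be sketched as a simple pipeline. This is a minimal illustration with stand-in functions: the function names and the stubbed bodies are hypothetical, standing in for real im2txt inference and the Azure Translator and text-to-speech calls.

```python
# Sketch of the PeriscopeAI server-side pipeline. All function bodies are
# hypothetical stubs; the real app runs im2txt inference in TensorFlow and
# calls the Azure Translator and text-to-speech APIs over HTTP.

def caption_image(image_bytes: bytes) -> str:
    """Stand-in for the im2txt captioning model: returns an English caption."""
    # A real implementation would run TensorFlow inference here.
    return "a person riding a bicycle down a street"

def translate(text: str, target_lang: str) -> str:
    """Stand-in for the Azure translation call (one of 60 target languages)."""
    # A real implementation would POST `text` to the Translator endpoint.
    translations = {"de": "eine Person faehrt mit dem Fahrrad eine Strasse entlang"}
    return translations.get(target_lang, text)

def synthesize_speech(text: str) -> bytes:
    """Stand-in for the Azure text-to-speech call: returns MP3 bytes."""
    # Fake payload for illustration; a real MP3 would come back from Azure.
    return b"ID3" + text.encode("utf-8")

def process_snapshot(image_bytes: bytes, target_lang: str) -> bytes:
    """Full pipeline: snapshot -> English caption -> translation -> audio."""
    caption = caption_image(image_bytes)
    translated = translate(caption, target_lang)
    return synthesize_speech(translated)
```

On the phone side, the app would loop: capture a frame, call `process_snapshot` via the REST API, play the returned MP3, and repeat every 8 to 10 seconds.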
Challenges
Sending the audio files back to the mobile app and building a user-friendly interface that accounts for our users' visual impairment.
What's next for PeriscopeAI
- Danger Detection
- Modification of the model for the task of Audio recognition and understanding for deaf people (voice 2 text)
- Only play new events which happen in front of the camera (learned customization of the models)
- Detection of known people or family members (open source model OpenFace [3])
- Create a lite version for remote areas without mobile network (+/-2 sec inference time on low powered machine)
References / Research
Our work was inspired by the research of:
- [1] Andrej Karpathy's research project Neural Talk.
- [2] Vinyals, Oriol, et al. "Show and tell: Lessons learned from the 2015 mscoco image captioning challenge." IEEE transactions on pattern analysis and machine intelligence 39.4 (2017): 652-663. [https://arxiv.org/abs/1609.06647]
Further customization to the user by recognition of their family members and close ones with open source model OpenFace:
- [3] B. Amos, B. Ludwiczuk, M. Satyanarayanan, "Openface: A general-purpose face recognition library with mobile applications," CMU-CS-16-118, CMU School of Computer Science, Tech. Rep., 2016. [http://cmusatyalab.github.io/openface/]
Thanks
Thanks to Rafael of the University of St. Gallen for the logo. Thanks to the whole START Hack team for the hackathon.