Our Inspiration
We’ve seen that visually impaired individuals face challenges navigating everyday environments and accessing information: reading documents, detecting nearby objects, and perceiving their surroundings through sound. Current assistive technologies often force users to manually switch between multiple tools, creating inefficiency.
Introducing VisionMate
We created an iOS app, a webpage, and an Apple Watch app designed around a suite of capabilities: text-to-braille conversion, image-to-audio conversion, and object-awareness detection. VisionMate is the first AI Agent Suite designed to help visually impaired individuals. The iOS app automatically detects nearby objects, sending a pulsing haptic signal to the Apple Watch when an object comes within one foot. While video is streaming, the agent automatically decides whether to apply text-to-braille or image-to-audio conversion. The Apple Watch app was trained on motion datasets to detect a thumbs-up trigger: when the user gives a thumbs-up, the app stops the video stream on the webpage and any audio or braille being played. We built a convolutional neural network and converted it to the Core ML format for watchOS.
How we built it
We came into TreeHacks with our project idea and got right to building from minute one. Jeet worked on all things front-end, Pratham worked on object detection, Anaïs worked on text-to-braille and image-to-audio conversion, and Eric worked on training the watch's motion detection using a convolutional neural net. Object detection flags objects within a one-foot radius of the user: we use ARKit, which gives us depth information about nearby objects, and once an object is detected we send a signal to the watch app to trigger haptic feedback. We used the Stanford Product Lab to create the hardware component of the project for the braille output.
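The thresholding logic behind the haptic alert can be sketched in a few lines. This is a hedged Python illustration only; in the real app the distances come from ARKit depth data in Swift, and the function name and exact handling here are our own illustrative choices.

```python
FOOT_IN_METERS = 0.3048  # ARKit reports distances in meters

def should_buzz(object_distances_m: list[float]) -> bool:
    """Return True if any detected object is within one foot of the user.

    object_distances_m: distances (in meters) to objects detected in the
    current camera frame, e.g. derived from ARKit depth data.
    """
    return any(d <= FOOT_IN_METERS for d in object_distances_m)

# One object at 0.25 m (~10 in) should trigger the watch haptic.
print(should_buzz([1.2, 0.25]))  # True
print(should_buzz([0.5, 2.0]))   # False
```

When `should_buzz` returns True, the iOS app forwards a signal to the paired watch app, which plays the pulsing haptic.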
Challenges we ran into and accomplishments that we're proud of
We ran into the challenge of connecting the frontend to the backend (classic) because we had a bunch of different languages (Python, Node.js, Swift, TypeScript, CSS, JavaScript, etc.) that did not talk to each other easily. We also ran into the challenges of connecting the watch and iOS apps and of building the agent that decides the context. With lots of debugging and hours of work, we overcame these challenges and built a suite of products that we are very proud of!
What we learned
We learned about a variety of frameworks and how to integrate them, including how to train convolutional neural networks on motion data. We also went deeper into Apple Watch and iOS development.
What's next for VisionMate: the AI agent to help visually impaired people
We are excited to expand this product and hopefully grow it into a real AI Agent that can be brought to market. We are looking into speaking with accelerator programs and/or venture capital firms about our product.
Tracks
For the Edge AI Track Challenge 1: We created a Core ML model embedded in the watchOS app. We built a convolutional neural network to classify motion, trained on a dataset of accelerometer and gyroscope data. Our model is 181 kB, trained with TensorFlow for 15 epochs.
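To illustrate how 6-axis motion data feeds a CNN like ours, here is a hedged NumPy sketch of windowing accelerometer + gyroscope samples into fixed-length training examples. The window length, stride, and array shapes are assumptions for illustration, not our exact pipeline.

```python
import numpy as np

def make_windows(samples: np.ndarray, window: int = 50, stride: int = 25) -> np.ndarray:
    """Slice a (T, 6) stream of accel+gyro readings into overlapping
    (window, 6) segments suitable as inputs to a 1D CNN."""
    segments = [samples[i:i + window]
                for i in range(0, len(samples) - window + 1, stride)]
    return np.stack(segments)

# 200 timesteps of fake 6-axis motion data -> 7 overlapping windows.
stream = np.random.randn(200, 6)
X = make_windows(stream)
print(X.shape)  # (7, 50, 6)
```

Each window would then be labeled (thumbs-up vs. other motion) and fed to the TensorFlow model before conversion to Core ML.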
For Rox, DAIN Labs, LumaLabs, Vespa.ai, and EigenLayer: We created an iOS app, a webpage, and an Apple Watch app designed around this suite of capabilities: text-to-braille conversion, image-to-audio conversion, and object-awareness detection. VisionMate is the first AI Agent Suite designed to help visually impaired individuals.
For the Vercel Track: We created a Core ML model embedded in the watchOS app. We built a convolutional neural network to classify motion, trained on a dataset of accelerometer and gyroscope data. Our model is 181 kB, trained with TensorFlow for 15 epochs.
For OpenAI: We used the OpenAI API to detect what type of image was used and based on that, chose text-to-braille or image-to-audio.
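The agent's routing decision can be sketched as below. The category names are hypothetical: in the app, the OpenAI API classifies the current frame, and the returned category determines which conversion pipeline runs.

```python
def choose_pipeline(image_category: str) -> str:
    """Map the image category (e.g. returned by an OpenAI vision call)
    to one of VisionMate's two conversion pipelines.

    The category strings here are illustrative, not the exact labels
    used in our prompt.
    """
    text_like = {"document", "sign", "printed_text"}
    if image_category in text_like:
        return "text-to-braille"
    return "image-to-audio"

print(choose_pipeline("document"))      # text-to-braille
print(choose_pipeline("street_scene"))  # image-to-audio
```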
For Hudson River Trading (HRT): We created a Core ML model embedded in the watchOS app. We built a convolutional neural network to classify motion, trained on a dataset of accelerometer and gyroscope data. Our model is 181 kB, trained with TensorFlow for 15 epochs.
Built With
- c++
- openai
- python
- swift
- typescript
- watchos

