My desire to help the visually impaired started in elementary school, where I had a friend who was blind. Seeing him struggle to perform simple tasks without help inspired me to research how machine learning could solve some of the problems the visually impaired commonly face. While writing a research paper on how object identification could aid the visually impaired, I found that most of the models in use were built for general cases, which made them inaccurate for objects that object detection models don't commonly cover. This gave me the idea to develop an application that helps the visually impaired count spare change.
¢oinVision accesses the user's camera to take a picture and returns the total value of the coins in that picture. To do this, the JavaScript front end sends the image to a Python back end through an AJAX request handled by Flask. The cv2 (OpenCV) library detects circles in the image, and each circle is cropped out as its own segment. Each segment is sent through an image classifier that decides whether it is a type of coin or a false detection. After the classifier returns the value of each detected coin, the values are added and the total is placed in a string, which the gTTS library converts to an audio file.
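The last step, turning the classifier's labels into a spoken total, can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the label names, the `COIN_VALUES` table (US coins are assumed), and all function names are hypothetical. The only third-party call is `gTTS(text=..., lang=...)` followed by `.save(...)`, imported inside the function so the arithmetic runs without the library installed.

```python
# Hypothetical label-to-value table in cents; US coins are an assumption.
COIN_VALUES = {"penny": 1, "nickel": 5, "dime": 10, "quarter": 25}


def total_cents(labels):
    """Sum the values of the classified segments.

    Any label that is not a known coin (a false detection) counts as 0.
    """
    return sum(COIN_VALUES.get(label, 0) for label in labels)


def _plural(count, word):
    """Return e.g. '1 dollar' or '2 dollars'."""
    return f"{count} {word}" + ("" if count == 1 else "s")


def spoken_total(cents):
    """Build the sentence that will be read aloud to the user."""
    dollars, remainder = divmod(cents, 100)
    return f"You have {_plural(dollars, 'dollar')} and {_plural(remainder, 'cent')}."


def save_speech(message, path="total.mp3"):
    """Convert the sentence to an MP3 file with gTTS (needs network access)."""
    from gtts import gTTS  # imported here so the functions above work without it

    gTTS(text=message, lang="en").save(path)
    return path
```

For example, `spoken_total(total_cents(["quarter", "dime", "background"]))` builds the sentence for 35 cents, and `save_speech` would then write it to an audio file for the front end to play.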
¢oinVision can also guide the user in taking the picture. Every five seconds, it grabs a frame of the video and checks whether all the coins the user wants to photograph are in the frame. It then either confirms verbally that the picture can be taken or tells the user which direction to move the camera.
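One way such a framing check could work is sketched below. The write-up does not say how the direction is chosen, so the rule here is an assumption: a detected circle that touches a border of the frame is treated as a coin that is partly cut off, and the user is told to move the camera toward that border. The function name `framing_hint`, the `margin` parameter, and the wording of the messages are all hypothetical; the five-second timing would be handled by whatever code grabs the frames.

```python
def framing_hint(circles, width, height, margin=10):
    """Return a spoken hint for one video frame.

    circles: iterable of (x, y, radius) detections in pixel coordinates.
    width, height: frame size in pixels.
    margin: how close to a border a circle may get before it counts as cut off.
    """
    if not circles:
        return "No coins detected."

    cut_off = set()
    for x, y, r in circles:
        if x - r < margin:
            cut_off.add("left")
        if x + r > width - margin:
            cut_off.add("right")
        if y - r < margin:
            cut_off.add("up")  # image y coordinates grow downward
        if y + r > height - margin:
            cut_off.add("down")

    if not cut_off:
        return "All coins are in the frame. You can take the picture."

    # Coins cut off on opposite sides cannot be fixed by panning.
    if {"left", "right"} <= cut_off or {"up", "down"} <= cut_off:
        return "Move the camera farther from the coins."

    moves = [d for d in ("left", "right", "up", "down") if d in cut_off]
    return "Move the camera " + " and ".join(moves) + "."
```

The returned string could then be passed to the same text-to-speech step used for the coin total, which keeps all feedback verbal.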
At first, I tried to build an object detection model to find the coins in the image. I found this too difficult, since detecting multiple objects requires a much more complicated algorithm, so I opted for an image classification model instead. After training the model, I saved its weights to a folder in my project so I didn't have to retrain it every time. Using an image classifier led to the second major problem: a classifier assigns a single label to the entire image, which makes it useless when there is more than one object in the picture. To overcome this, I used the cv2 Python library to detect circles in the image as potential coins and cut each detected circle out of the image. These image segments were then sent through the classifier, and the total value was returned.
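The detect-and-crop step might look something like the sketch below. The write-up only says that cv2 was used to detect circles, so the use of `cv2.HoughCircles` is an assumption, and the parameter values are untuned placeholders. The names `crop_boxes` and `detect_and_crop` and the `pad` parameter are hypothetical. OpenCV is imported inside the function so the box arithmetic can be checked on its own.

```python
def crop_boxes(circles, width, height, pad=4):
    """Turn (x, y, radius) circles into (x0, y0, x1, y1) boxes.

    Each box is the circle's bounding square plus a small padding,
    clamped so it never extends past the image borders.
    """
    boxes = []
    for x, y, r in circles:
        x0 = max(int(x - r - pad), 0)
        y0 = max(int(y - r - pad), 0)
        x1 = min(int(x + r + pad), width)
        y1 = min(int(y + r + pad), height)
        if x1 > x0 and y1 > y0:  # skip boxes that collapse to nothing
            boxes.append((x0, y0, x1, y1))
    return boxes


def detect_and_crop(image_path):
    """Detect circles with OpenCV and return one cropped segment per circle."""
    import cv2  # imported here so crop_boxes works without OpenCV installed

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Hough circle detection works on a blurred grayscale image.
    gray = cv2.medianBlur(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), 5)
    found = cv2.HoughCircles(
        gray, cv2.HOUGH_GRADIENT, 1.2, 40,  # dp, minDist (placeholder values)
        param1=100, param2=40, minRadius=15, maxRadius=150,
    )
    if found is None:  # no circles detected
        return []

    height, width = gray.shape
    circles = [tuple(c) for c in found[0]]  # each entry is (x, y, radius)
    return [image[y0:y1, x0:x1]
            for x0, y0, x1, y1 in crop_boxes(circles, width, height)]
```

Each returned segment holds a single candidate coin, which is what lets a whole-image classifier be applied once per coin instead of once per picture.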
During this project, I learned how to create an image classification model and how to build a site around its users' needs. Designing a site for the visually impaired added some important constraints, such as keeping the design simple and its elements large to prevent misclicks, and keeping all communication verbal.
