Caption Glasses

Showing the insides (mirror reflects OLED display onto glasses)
Initial glasses attachment design made in cardboard. Final design was 3D printed blue shown above
Lens
Our text appearing on the glasses!
Showing how to use product
Learning how to use the SSD1306 OLED Display! (Why did we choose the SPI version)

Inspiration

In a world where the voices of the minority are often not heard, technology must be adapted to fit the equitable needs of these groups. Picture the millions who live in a realm of silence, where for those who are deaf, you are constantly silenced and misinterpreted. Of the 50 million people in the United States with hearing loss, less than 500,000 — or about 1% — use sign language, according to Acessibility.com and a recent US Census. Over 466 million people across the globe struggle with deafness, a reality known to each in the deaf community. Imagine the pain where only 0.15% of people (in the United States) can understand you. As a mother, father, teacher, friend, or ally, there is a strong gap in communication that impacts deaf people every day. The need for a new technology is urgent from both an innovation perspective and a human rights perspective.

Amidst this urgent disaster of an industry, a revolutionary vision emerges – Caption Glasses, a beacon of hope for the American Sign Language (ASL) community. Caption Glasses bring the magic of real-time translation to life, using artificial neural networks (machine learning) to detect ASL "fingerspeaking" (their one-to-one version of the alphabet), and creating instant subtitles displayed on glasses. This revolutionary piece effortlessly bridges the divide between English and sign language. Instant captions allow for the deaf child to request food from their parents. Instant captions allow TAs to answer questions in sign language. Instant captions allow for the nurse to understand the deaf community seeking urgent care at hospitals. Amplifying communication for the deaf community to the unprecedented level that Caption Glasses does increases the diversity of humankind through equitable accessibility means!

With Caption Glasses, every sign becomes a verse, every gesture an eloquent expression. It's a revolution, a testament to humanity's potential to converse with one another. In a society where miscommunication causes wars, there is a huge profit associated with developing Caption Glasses. Join us in this journey as we redefine the meaning of connection, one word, one sign, and one profound moment at a time.

What it does

The Caption Glasses provide captions displayed on glasses after detecting American Sign Language (ASL). The captions are instant and in real-time, allowing for effective translations into the English Language for the glasses wearer.

How we built it

Recognizing the high learning curve of ASL, we began brainstorming for possible solutions to make sign language more approachable to everyone. We eventually settled on using AR-style glasses to display subtitles that can help an ASL learner quickly identify what sign they are looking at.

We started our build with hardware and design, starting off by programming a SSD1306 OLED 0.96'' display with an Arduino Nano. We also began designing our main apparatus around the key hardware components, and created a quick prototype using foam.

Next, we got to loading computer vision models onto a Raspberry Pi4. Although we were successful in loading a basic model that looks at generic object recognition, we were unable to find an ASL gesture recognition model that was compact enough to fit on the RPi.

To circumvent this problem, we made an approach change that involved more use of the MediaPipe Hand Recognition models. The particular model we chose marked out 21 landmarks of the human hand (including wrist, fingertips, knuckles, etc.). We then created and trained a custom Artificial Neural Network that takes the position of these landmarks, and determines what letter we are trying to sign.

At the same time, we 3D printed the main apparatus with a Prusa I3 3D printer, and put in all the key hardware components. This is when we became absolute best friends with hot glue!

Challenges we ran into

The main challenges we ran into during this project mainly had to do with programming on an RPi and 3D printing.

Initially, we wanted to look for pre-trained models for recognizing ASL, but there were none that were compact enough to fit in the limited processing capability of the Raspberry Pi. We were able to circumvent the problem by creating a new model using MediaPipe and PyTorch, but we were unsuccessful in downloading the necessary libraries on the RPi to get the new model working. Thus, we were forced to use a laptop for the time being, but we will try to mitigate this problem by potentially looking into using ESP32i's in the future.

As a team, we were new to 3D printing, and we had a great experience learning about the importance of calibrating the 3D printer, and had the opportunity to deal with a severe printer jam. While this greatly slowed down the progression of our project, we were lucky enough to be able to fix our printer's jam!

Accomplishments that we're proud of

Our biggest accomplishment is that we've brought our vision to life in the form of a physical working model. Employing the power of 3D printing through leveraging our expertise in SolidWorks design, we meticulously crafted the components, ensuring precision and functionality.

Our prototype seamlessly integrates into a pair of glasses, a sleek and practical design. At its heart lies an Arduino Nano, wired to synchronize with a 40mm lens and a precisely positioned mirror. This connection facilitates real-time translation and instant captioning. Though having extensive hardware is challenging and extremely time-consuming, we greatly take the attention of the deaf community seriously and believe having a practical model adds great value.

Another large accomplishment is creating our object detection model through a machine learning approach of detecting 21 points in a user's hand and creating the 'finger spelling' dataset. Training the machine learning model was fun but also an extensively difficult task. The process of developing the dataset through practicing ASL caused our team to pick up the useful language of ASL.

What we learned

Our journey in developing Caption Glasses revealed the profound need within the deaf community for inclusive, diverse, and accessible communication solutions. As we delved deeper into understanding the daily lives of over 466 million deaf individuals worldwide, including more than 500,000 users of American Sign Language (ASL) in the United States alone, we became acutely aware of the barriers they face in a predominantly spoken word.

The hardware and machine learning development phases presented significant challenges. Integrating advanced technology into a compact, wearable form required a delicate balance of precision engineering and user-centric design. 3D printing, SolidWorks design, and intricate wiring demanded meticulous attention to detail. Overcoming these hurdles and achieving a seamless blend of hardware components within a pair of glasses was a monumental accomplishment.

The machine learning aspect, essential for real-time translation and captioning, was equally demanding. Developing a model capable of accurately interpreting finger spelling and converting it into meaningful captions involved extensive training and fine-tuning. Balancing accuracy, speed, and efficiency pushed the boundaries of our understanding and capabilities in this rapidly evolving field.

Through this journey, we've gained profound insights into the transformative potential of technology when harnessed for a noble cause. We've learned the true power of collaboration, dedication, and empathy. Our experiences have cemented our belief that innovation, coupled with a deep understanding of community needs, can drive positive change and improve the lives of many. With Caption Glasses, we're on a mission to redefine how the world communicates, striving for a future where every voice is heard, regardless of the language it speaks.

What's next for Caption Glasses

The market for Caption Glasses is insanely large, with infinite potential for advancements and innovations. In terms of user design and wearability, we can improve user comfort and style. The prototype given can easily scale to be less bulky and lighter. We can allow for customization and design patterns (aesthetic choices to integrate into the fashion community).

In terms of our ML object detection model, we foresee its capability to decipher and translate various sign languages from across the globe pretty easily, not just ASL, promoting a universal mode of communication for the deaf community. Additionally, the potential to extend this technology to interpret and translate spoken languages, making Caption Glasses a tool for breaking down language barriers worldwide, is a vision that fuels our future endeavors. The possibilities are limitless, and we're dedicated to pushing boundaries, ensuring Caption Glasses evolve to embrace diverse forms of human expression, thus fostering an interconnected world.