Inspiration
At HackPrinceton 2018, Christian worked with a deaf/mute student to create an AR application that captions sign language. The attempt failed, but his new friend's story was inspiring. During their pitch, the student said that growing up deaf was lonely because having to sign to communicate was isolating. If applications like the one we had been planning to build had existed, they would have helped many other hearing-impaired people like him.
For the first time in history, a group of undergraduates who learned from free introductory machine learning videos could create technology advanced enough to challenge impairments and meaningfully impact lives.
So we built a first iteration as a proof of concept for a robust, generalized sign language model based on keypoint estimation. To make the model useful in the short term, we are tackling smart-home accessibility by synthesizing speech to issue voice commands.
What it does
- It detects your body's keypoints (finds where your head, hands, fingers, etc. are in 2D space)
- We feed these keypoints into our deep learning model to predict which gesture you're most likely making
- We then map your gesture to a command and use Google Cloud Text-to-Speech to send voice commands to any smart home device
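The three steps above can be sketched end to end. This is a minimal illustration, not our actual code: the keypoint detector (OpenPose) and the classifier are stubbed out, and all names and the gesture-to-command mapping are hypothetical examples.

```python
# Hypothetical sketch of the pipeline: keypoints -> gesture -> voice command.
# detect_keypoints and classify_gesture are stubs standing in for OpenPose
# and our deep learning model, so this runs on its own.

GESTURE_TO_COMMAND = {  # assumed example mapping
    "lights_on": "Alexa, turn on the lights",
    "lights_off": "Alexa, turn off the lights",
}

def detect_keypoints(frame):
    """Stub for OpenPose: returns (x, y) keypoints in 2D image space."""
    return [(0.5, 0.2), (0.4, 0.6), (0.6, 0.6)]  # e.g. head, left hand, right hand

def classify_gesture(keypoints):
    """Stub for the classifier: a real model scores all gestures and takes the argmax."""
    return "lights_on"

def gesture_to_command(frame):
    keypoints = detect_keypoints(frame)
    gesture = classify_gesture(keypoints)
    return GESTURE_TO_COMMAND[gesture]

print(gesture_to_command(frame=None))  # -> "Alexa, turn on the lights"
```

The returned string would then be handed to a text-to-speech service and played aloud near the smart home device.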
How we built it
We used OpenPose to estimate keypoints and PyTorch to build the deep learning model. We then used Google Cloud Text-to-Speech to synthesize a voice command for Alexa.
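As a rough illustration of the PyTorch side, a keypoint-based gesture classifier can be a small network over flattened (x, y) coordinates. The dimensions and class count below are assumptions for the sketch, not our actual architecture.

```python
import torch
import torch.nn as nn

# Assumed dimensions: 25 body keypoints x (x, y) = 50 inputs, 8 gesture classes.
NUM_KEYPOINTS = 25
NUM_GESTURES = 8

class GestureClassifier(nn.Module):
    """Toy MLP over flattened 2D keypoints; a stand-in for the real model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_KEYPOINTS * 2, 128),
            nn.ReLU(),
            nn.Linear(128, NUM_GESTURES),
        )

    def forward(self, keypoints):
        # keypoints: (batch, NUM_KEYPOINTS * 2) tensor of flattened (x, y) pairs
        return self.net(keypoints)

model = GestureClassifier()
logits = model(torch.randn(1, NUM_KEYPOINTS * 2))
pred = logits.argmax(dim=1)  # index of the most likely gesture
```

Because the inputs are keypoints rather than raw pixels, the model stays small and is largely invariant to camera type and background.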
Challenges we ran into
- Hardware challenges
- Data, data, data
- Time
Accomplishments that we're proud of
We built a robust model that works with any camera (webcam, phone camera, etc.), against any background, and does not require special equipment. Most modern deep learning approaches depend on expensive hardware or rigid constraints such as solid-colored backgrounds. Ours works in the wild!
What we learned
Babysitting the training process, PyTorch, deep learning concepts, and teamwork.
What's next for SignToSpeech
This project shows that, with good data, a generalized sign/gesture-to-text model can work. That opens up a whole new world of innovation for HCI (more gesture-based UIs), AR, and of course a potential novel solution to the sign language computer vision problem.
