Inspiration

We were inspired by how often we forget things we need to know, be it passwords, where we put something the day before, when that one meeting was supposed to be, the due date for the CS lab, or our full leg day workout routine.

We also strongly disliked how disorganized the Notes app can be when storing information; it's impossible to keep track of 20 different notes scattered around with no effective way to find something.

What it does

Overview

Our web app lets users create notes and reminders that they can add to and query by voice, so they can store "memories" in an organized way. These notes can be anything, from passwords for certain websites to the locations of items.

Functionality

After you log in, you have two options:

  1. Add a new "memory" - you speak into the website which converts your voice into text and stores it in the database to be queried later on.
  2. Query old "memories" - you ask the website, using your voice, what you have forgotten or need to know, and it will tell you.
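
As a rough illustration of these two options, here is a minimal sketch of a per-user memory store. It assumes the speech has already been converted to text, and it uses plain keyword overlap as a stand-in for the question-answering model, so it is not the method the app actually uses. The names `MemoryStore`, `add`, `query`, and `tokenize` are hypothetical.

```python
import re
from dataclasses import dataclass, field


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class MemoryStore:
    # Hypothetical in-memory stand-in for the database: username -> list of memories.
    memories: dict = field(default_factory=dict)

    def add(self, username: str, text: str) -> None:
        """Option 1: store a transcribed memory for this user."""
        self.memories.setdefault(username, []).append(text)

    def query(self, username: str, question: str):
        """Option 2: return the stored memory sharing the most words with the question.

        Keyword overlap is an assumption made for this sketch; a real system
        would pass the candidate memories to a question-answering model.
        """
        question_words = tokenize(question)
        best, best_score = None, 0
        for memory in self.memories.get(username, []):
            score = len(question_words & tokenize(memory))
            if score > best_score:
                best, best_score = memory, score
        return best


store = MemoryStore()
store.add("ana", "My gym locker code is 4812")
store.add("ana", "The CS lab is due on Friday")
answer = store.query("ana", "When is the CS lab due?")
```

Memories are keyed by username, so one user's query never searches another user's notes.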

This is all authenticated via voice: you don't need a password, only a username, and the system verifies your identity by comparing your voice to a sample recorded when you signed up. In our view, this is also more secure than a typical password.
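
The comparison step can be sketched as follows, assuming each recording has already been turned into a fixed-length embedding vector by a speaker model. The cosine-similarity decision rule, the function names, and the `0.75` threshold are assumptions made for illustration, not values taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_same_speaker(enrolled, attempt, threshold=0.75):
    """Accept the login if the attempt embedding is close enough to the enrolled one.

    The threshold is a hypothetical value; in practice it would be tuned on
    held-out recordings to balance false accepts against false rejects.
    """
    if len(enrolled) != len(attempt):
        raise ValueError("embeddings must have the same length")
    return cosine_similarity(enrolled, attempt) >= threshold


# Toy 3-dimensional embeddings; real speaker embeddings have far more dimensions.
enrolled_voice = [1.0, 0.0, 0.0]
similar_attempt = [0.9, 0.1, 0.0]
different_attempt = [0.0, 1.0, 0.0]
```

Raising the threshold makes impersonation harder but also rejects more genuine logins, for example when the user has a cold or is in a noisy room.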

At its core, this is a drastic improvement on the Notes app, powered by voice recognition and natural language processing.

While creating the app, we also noted its potential for blind or visually impaired users. Paired with Apple's built-in accessibility features, it could be a great way for blind people to store information.

One potential niche use case is for professionals looking to develop theories in real time and in the real world. For example, a detective at a crime scene could record and organize their ideas in VoiceVault and easily retrieve them later. Scientists reviewing case studies or working in the field could use the app in a similar way.

How we built it

For the NLP and audio models, we used pretrained PyTorch models from various APIs, then fine-tuned and evaluated them on our tasks. We used BERT for question answering, a Hidden Markov Model for speech recognition, and ECAPA-TDNN for voice authentication.

We used Flask to handle routing for the website and Tailwind CSS for styling. We used RecorderJS to record audio from the user in the browser.
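
To show how the pieces could fit together, here is a minimal sketch of a Flask route that accepts a recorded audio clip as a multipart upload. The `/memories` path and the `audio` form field are hypothetical names chosen for this example. The transcription and storage steps are left as a comment because the project's actual routes are not shown here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/memories", methods=["POST"])
def add_memory():
    """Receive a recorded clip from the browser as a multipart file upload."""
    audio = request.files.get("audio")  # "audio" is an assumed form field name
    if audio is None:
        return jsonify(error="missing 'audio' file field"), 400

    data = audio.read()
    # A real handler would pass `data` to the speech recognition model
    # and store the resulting text; this sketch only reports the upload size.
    return jsonify(received_bytes=len(data)), 201
```

On the frontend, the recorded clip would be attached to a `FormData` object under the same field name and sent with a `POST` request.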

Challenges we ran into

We faced three major challenges:

  1. Voice authentication was very challenging. While the other models were fairly simple to take off the internet and fine-tune, voice authentication required a lot of data preprocessing that we had to track down ourselves. The hardest part was working out which preprocessing steps the authors had used and which they hadn't; if we got this wrong, the pretrained model wouldn't work at all.
  2. Using audio in general was difficult. None of us had ever worked with audio data, so it took time to learn the concepts and libraries we needed. It was also difficult to record audio on the frontend and send it to the backend.
  3. We had a lot of trouble attempting to deploy our app to AWS EC2, as its data size limits were too low for our project, even on a basic paid tier. In the end, we decided not to deploy, but rather have the repo act as a prototype for similar projects of this revolutionary nature.
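
One cheap safeguard against the preprocessing mismatch described in the first challenge is to check each recording's format before it reaches a pretrained model. The sketch below uses Python's standard `wave` module. The expected values (16 kHz, mono, 16-bit PCM) are assumptions made for illustration and should be replaced with whatever the chosen model documents. `check_wav_format` and `make_silent_wav` are hypothetical helper names.

```python
import io
import wave

# Assumed input format; replace with the values the pretrained model documents.
EXPECTED_RATE = 16_000     # samples per second
EXPECTED_CHANNELS = 1      # mono
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM


def check_wav_format(wav_bytes: bytes) -> list:
    """Return a list of human-readable format problems (empty if the clip matches)."""
    problems = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append(f"sample rate is {wav.getframerate()}, expected {EXPECTED_RATE}")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels, expected {EXPECTED_CHANNELS}")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth()} bytes, expected {EXPECTED_SAMPLE_WIDTH}")
    return problems


def make_silent_wav(rate: int, channels: int, seconds: float = 0.1) -> bytes:
    """Build a short silent 16-bit WAV clip in memory, used here only for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(rate * seconds))
    return buffer.getvalue()


matching = check_wav_format(make_silent_wav(16_000, 1))
mismatched = check_wav_format(make_silent_wav(44_100, 2))
```

Browser recordings are often stereo at 44.1 or 48 kHz, so a check like this fails loudly instead of letting a model quietly produce poor results.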

Accomplishments that we're proud of

We're proud of successfully creating a fully functional, ready-to-deploy app that we believe is genuinely useful for the everyday person. We initially felt that this project would be too ambitious, but we managed to pull it off.

What we learned

We learned how to work with audio data, Flask, Tailwind CSS, and a variety of other tools.

What's next for VoiceVault

We want to add more functionality, such as personal preferences and optional additional security features.

Further, a web app has the advantage of being accessible from any device, so losing a device wouldn't mean losing access to your information. Even so, we think a mobile app would also be useful, since it would offer a more streamlined experience.
