Inspiration
Vocalytics was inspired by the team's shared desire to help people who struggle with speech impediments, something people close to us have experienced firsthand. Around the world, many people miss out on job opportunities and other important moments because of difficulties with pronunciation. With this in mind, we wanted to build a solution that could make a real difference in their lives.
What it does
The application lets individuals work on their English speaking skills, specifically how they pronounce certain English words and phrases. Users sign up for an account and are directed to a web page where they either upload an audio file or speak directly into a microphone. The application captures this audio and, on the backend, sends it to two speech-to-text models: one very well trained and one deliberately less well trained. It then compares the outputs of the two models, using custom logic and criteria to determine which words were mispronounced based on differences between the captured pronunciations. The user then sees the phrase they spoke along with the words they said incorrectly.
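The comparison step can be sketched roughly like this. This is an illustrative sketch, not our production logic: the function name and the exact matching criteria are hypothetical, and it assumes the weaker model transcribes closer to what was literally said while the stronger model recovers the intended words.

```python
from difflib import SequenceMatcher

def flag_mispronounced(strong_transcript: str, weak_transcript: str) -> list[str]:
    """Flag words where the two speech-to-text models disagree.

    Intuition: a well-trained model recovers the intended word even from
    imperfect audio, while a weaker model stays closer to the literal
    sounds. Words where the transcripts diverge are candidate
    mispronunciations.
    """
    strong = strong_transcript.lower().split()
    weak = weak_transcript.lower().split()
    matcher = SequenceMatcher(a=strong, b=weak)
    flagged = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            # Collect the intended words that the weaker model misheard.
            flagged.extend(strong[i1:i2])
    return flagged

print(flag_mispronounced("the weather is nice today",
                         "the wedder is nice today"))  # ['weather']
```

A real pipeline would also need to handle punctuation, word alignment across insertions/deletions, and confidence scores from the models, but the word-level diff captures the core idea.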
How we built it
The technical architecture of the project is as follows:
**Frontend**: The frontend was built with React JS to provide visually appealing pages that immediately catch the eye of potential users.
**Backend**: The backend was built with Flask, a lightweight web framework, which allowed users to customize their profiles and tailor the app's language settings to their own development.
**Machine Learning**: The machine learning pipeline was built using torchaudio as well as AssemblyAI and was integrated deep within the backend.
We also used Docker and Terraform early on to collaborate on the project and handle deployment.
Challenges we ran into
One major issue we ran into was integration, specifically between the backend and the ML model we were using. Because of PyTorch's heavy computational requirements and Python dependency constraints, it was hard to write a Dockerfile that containerized the application. This posed a barrier to collaboration within the team, and after many hours of debugging we ultimately decided to host the application locally, which was disappointing for us but a very valuable lesson.
Accomplishments that we're proud of
We are very proud that our application was able to effectively capture real-time audio data and send it to the machine learning model on the backend for analysis. As mentioned above, integration was a major hurdle, so seeing the frontend, backend, and AI model work together seamlessly was something we were very proud of.
What we learned
We learned a lot about project management during our time here at HackTrent, especially how important it is to allocate tasks effectively and communicate well. Communication was key to our group's ability to deliver an MVP for the hackathon and continues to be an important part of our success. We also learned a great deal about collaborating on code and just how many issues that can present, which was a great experience for all of us.
What's next for Vocalytics
The next step for Vocalytics is to continue developing the application's machine learning capabilities, which involves training the model on much larger datasets. We would also like to expand the backend and frontend, including customizing the frontend display and offering more personalized settings for users.