Inspiration

As the COVID-19 pandemic has moved more of our lives online, taking care of our online interactions has become increasingly important. For example, while gaming with friends, many of us have experienced unintentional toxicity or excessive raging from the other end of the call. This can strain friendships and hurt feelings, so we decided to create a bot that detects these tense situations and automatically defuses them with an amusing gif.

What it does

Our product, eMuter, moderates calls to create safer online environments through emotion recognition. It transcribes a call in real time, classifies the emotion of each user’s statements, and notifies the user (and sends a tension-defusing gif) if they sound overly angry.

How we built it

Overall, our service is built with Node.js, a JavaScript runtime. Our user interface uses Discord.js, a framework that lets us make API calls to send messages and listen to voice calls on Discord, a widely used chat and voice platform. Our emotion recognition pipeline first converts speech to text using Google Cloud’s web speech API (google-web-speech-api), a state-of-the-art speech-to-text model. Then, our program performs tone analysis on the converted text using ibm-watson.
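The glue between these stages can be sketched as below. This is a minimal, hedged sketch: `analyzeTone` here is a toy stand-in for the real ibm-watson Tone Analyzer call (which returns a `document_tone` with `{ tone_id, score }` entries), and the function names and threshold are illustrative, not the actual implementation.

```javascript
// Stand-in for the IBM Watson Tone Analyzer call; the real service returns
// a document_tone object containing an array of { tone_id, score } entries.
// This toy heuristic exists only so the sketch runs without credentials.
function analyzeTone(text) {
  const angry = /(!{2,}|\b(hate|stupid|rage)\b)/i.test(text);
  return {
    document_tone: {
      tones: [{ tone_id: "anger", score: angry ? 0.8 : 0.1 }],
    },
  };
}

// Illustrative cutoff: anger scores above this trigger a notice/gif.
const ANGER_THRESHOLD = 0.5;

// Given one transcribed utterance, decide whether the bot should step in.
function moderateUtterance(text) {
  const { tones } = analyzeTone(text).document_tone;
  const anger = tones.find((t) => t.tone_id === "anger");
  return Boolean(anger && anger.score > ANGER_THRESHOLD);
}
```

In the real bot, a `true` result would be followed by sending the calming gif back into the Discord channel via Discord.js.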

Challenges we ran into

Our largest challenge was using the Discord.js library, which is constantly being updated and doesn’t have the best documentation.

Accomplishments that we're proud of

We are proud that even though it was the first hackathon for many of our team members, we collaborated very well and ended up with a great project.

What we learned

We learned about the many nuances and limitations of emotion detection. For example, we needed to establish a score threshold before classifying a statement as a certain emotion. Many of our statements were still incorrectly or insufficiently classified, even with one of the best emotion detection models available.
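The thresholding idea above can be illustrated as follows. This is a hypothetical sketch, not our exact code: it assumes Watson-style `{ tone_id, score }` entries and treats an utterance as unclassified when no tone clears the cutoff.

```javascript
// Pick the strongest tone from a Watson-style result, but only accept it
// as a classification if its score clears the threshold; otherwise return
// null, meaning the utterance was insufficiently classified.
function dominantEmotion(tones, threshold = 0.5) {
  const best = tones.reduce(
    (a, b) => (b.score > a.score ? b : a),
    { score: -1 }
  );
  return best.score >= threshold ? best.tone_id : null;
}
```

Tuning the threshold was a trade-off: too low and calm statements were flagged as angry; too high and genuine rage slipped through unclassified.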

What's next for eMuter

We can extend eMuter’s capabilities to cheering people up when they’re sad, or muting them when they’ve raged too many times or used excessive profanities. We can also let users create their own rules for speech detection and save user transcripts for future use.
