-
This page shows important details about the movie: filming locations, who it was filmed by, the budget, release date, ratings, and more.
-
Since we're not fans of websites that force you to log in to use their services, we made login optional for those who want to save movies.
-
The user profiles. Since login is optional, not all users will have them, but those that do will see their saved movies here!
-
The home page! The user goes here after the web app detects their mood, and it's populated with suggestions fetched from the API.
Inspiration
When the theme "Adventure Awaits" was announced, we thought of things that remind us of adventure.
Adventure? In my boring life? I like to ride my bicycle - but I can't imagine how in the world I'd integrate that into a web app of all places. And strolling through the crowded roads of New York isn't exactly adventurous either.
The Oxford Dictionary defines "adventure" as an unusual, exciting or dangerous experience, journey or series of events. Then I realised that, as software engineers, most of our adventure probably doesn't come from our life experiences. This led me to think about movies. I believe every movie is an adventure movie, as they all aim to appeal to the viewer's emotions through excitement.
I then set out to look for a team that was passionate about this Hackathon, which is where I met my fellow teammates. Now that you know my side of the story: our machine learning team had an idea for an application that would analyse your facial features to detect whether you were studying or not. While that's cool, it's not exactly adventurous. But wouldn't it be cool to get movie recommendations based on how you're currently feeling?
So to sum it up - Face APIs are cool, and I watch so much Netflix that it's hard to find half-decent movies to watch nowadays. Those are the two random thoughts that we combined to build out this project, Moodv.
What it does
What does Moodv do? Well, it's quite simple and direct: we generate movie suggestions based on your facial expression. Within this application we've integrated user accounts with ratings/comments, along with features such as learning about a movie's details from inside the app itself (the languages it's available in, where it was filmed, ratings, an overview, and more!).
Moodv applies the state-of-the-art facial emotion classification model Local Multi-Head Channel Self-Attention (LHC-Net) [1] to accurately classify facial emotion into seven classes: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral. LHC-Net achieves 74.42% test accuracy on the FER2013 dataset (https://paperswithcode.com/dataset/fer2013).
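The mapping from the model's class indices to emotion labels can be sketched in a few lines of Python. This is an illustrative snippet (the function name and the plain-list argmax are our own, not Moodv's actual code), showing how a seven-element score vector from LHC-Net turns into a label:

```python
# Map LHC-Net's seven FER2013 class indices to emotion labels.
EMOTIONS = {0: "Angry", 1: "Disgust", 2: "Fear", 3: "Happy",
            4: "Sad", 5: "Surprise", 6: "Neutral"}

def label_from_scores(scores):
    """Return the emotion label for the highest-scoring class index."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return EMOTIONS[best]
```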
How we built it
Front-end:
It was a journey integrating an immense number of features into our application and making it all come together with an aesthetic colour scheme and interactive animations.
Since this is a frontend serverless application, we built it with none other than Next.js, whose features such as server-side rendering drastically improve our SEO and performance. As Next.js has amazing support for TypeScript (and because I hate JavaScript), we chose that language for better type safety and tooling.
For styling, we decided to build out our own components, and since we were short on time, there wasn't a better candidate than TailwindCSS. Enabling us to rapidly build beautiful interfaces with simple HTML classes, TailwindCSS saved us a lot of time and energy.
Staying on the topic of styling, we used Framer Motion, an open-source and production-ready animation library for React.
I'm a huge fan of CockroachDB Serverless, and the easiest way to connect our serverless application to our database and generate a schema within it is Prisma, a next-generation ORM (object-relational mapping) for Node.js and TypeScript.
For the API, we used TheMovieDB, which features a free and flexible API allowing us to fetch a wide collection of movies.
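To give a flavour of how a detected mood could translate into a TheMovieDB query, here is a minimal sketch in Python. The `/discover/movie` endpoint, `api_key`, `with_genres` and `sort_by` parameters are real parts of the TMDB API, but the mood-to-genre pairing below is our own illustrative assumption, not necessarily Moodv's actual mapping:

```python
from urllib.parse import urlencode

# Illustrative mood -> TMDB genre-ID mapping. The genre IDs come from
# TMDB's genre list (35=Comedy, 18=Drama, 53=Thriller, 28=Action);
# the pairing with moods is an assumption for this sketch.
MOOD_TO_GENRE = {"Happy": 35, "Sad": 18, "Fear": 53, "Angry": 28}

def discover_url(mood, api_key):
    """Build a TMDB /discover/movie request URL for a detected mood."""
    params = {
        "api_key": api_key,
        "with_genres": MOOD_TO_GENRE.get(mood, 35),  # fall back to Comedy
        "sort_by": "popularity.desc",
    }
    return "https://api.themoviedb.org/3/discover/movie?" + urlencode(params)
```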
To deploy this monster, we used none other than Vercel, from the creators of Next.js, which scales this application dynamically to millions of users around the globe without breaking a sweat. This lets us take full advantage of CockroachDB Serverless, making our application fully serverless.
For the backend machine learning model, we decided to use Local Multi-Head Channel Self-Attention (LHC-Net) because it is:
- An attention-based model with reliable, state-of-the-art accuracy across seven classes
- Equipped with pre-trained models, saving us a lot of time
We built the environment for LHC-Net with Anaconda3, and we built a simple interface for LHC-Net to read our Base64 string input using Flask. We configured a free Amazon EC2 server to host the modified LHC-Net and provide a web-service model.
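The core of that Base64 interface is decoding the webcam frame back into raw image bytes before it reaches the model. The sketch below shows that preprocessing step under one assumption: the frontend sends a data-URL string of the form `data:image/jpeg;base64,<payload>` (the format react-webcam screenshots use); the function name is our own:

```python
import base64

def decode_webcam_frame(data_url):
    """Strip the data-URL header and decode the Base64 payload into raw
    image bytes, ready to be loaded and fed to the model."""
    # Data URLs look like "data:image/jpeg;base64,<payload>"; everything
    # after the first comma is the Base64-encoded image.
    _, _, payload = data_url.partition(",")
    return base64.b64decode(payload)
```

In the real service this function would sit inside a Flask route handler, with the decoded bytes passed on to the LHC-Net inference code.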
Challenges we ran into
It wasn't easy integrating all those features into an application we had to build in 36 hours, nor was it easy to create our own machine learning web service with a state-of-the-art face recognition model rather than using a public API. We had our fair share of challenges and struggles, the biggest being deploying our machine learning model onto AWS. However, after hours of scrolling through documentation and online blog posts, and with assistance from the Bitcamp volunteers/sponsors, we were able to deploy our application successfully.
Another problem we faced was training machine learning models over the Wi-Fi at Bitcamp. Running an internet speedtest, we measured a low 9.5 Mbps download speed and a 10.16 Mbps upload speed. Those are fairly low speeds, and it's difficult to configure and train a machine learning model on a remote server with a poor internet connection. We didn't give up though - instead, we searched around for a while and found a perfect state-of-the-art model to do exactly what we wanted!
Predicting a user's facial expression isn't a piece of cake. We felt that getting a premade model running is often harder than building our own, as the environment install frequently breaks unexpectedly due to system differences, package versions, and other compatibility issues. Rather than wasting time trying models that might not work at all, training our own model tailored to the needs of this application seemed the logical route. However, we found the facial emotion dataset FER2013 extremely challenging, and overkill for our purposes, as it requires a far more complicated model than a simple CNN, making it hard to build and train one within 36 hours. We spent a while researching usable state-of-the-art models, and luckily enough we were able to find LHC-Net and make it work in the limited time we had.
Deployment wasn't a walk up a crystal stair either. Our machine learning stack had the usual package-version compatibility issues, and the AWS EC2 server ran out of storage, as the machine learning model we used is large and requires a massive number of dependencies. In the end, we hosted our machine learning model live on AWS EC2, using Flask to build out the web service.
Accomplishments that we're proud of
- Building our own face detection API with LHC-Net in TensorFlow with state-of-the-art performance
- Crafting an environment for our machine learning model
- Successfully hosting a machine learning model on AWS EC2
- Successfully collaborating on the code for the repository through a smooth workflow using Git and GitHub
- Integrating authentication with GitHub's OAuth services
- Finishing 5 hours before the deadline
- Building out a clean, functional and fluid UI which feels awesome to interact with
- Participating in our first Hackathon!
- ...driving 7 hours from New York to participate in this Hackathon 🎉
What we learned
Bitcamp 2022 has undoubtedly been a huge learning experience for everyone on our team. The machine learning team learned how to use AWS to host their machine learning models live on the internet, along with learning Flask and coming out of their comfort zone by dipping their feet into minimal amounts of HTML and some JavaScript.
The application team (...which consists of just me) learned to work with powerful libraries such as react-webcam. I learned just how powerful this library is, along with brushing up on my skills in Next.js and similar serverless technologies.
What's next for Moodv
What's next? Well, for now, I believe we first want to focus on improving the authentication aspect of this application. Currently, for the Hackathon, users can only log in with GitHub, as it's the easiest place to set up an OAuth workflow - although signing up is not required, as many people simply want to test out the application.
After I tackle that, I want to focus on performance optimisation. Since we load a lot of information on this website (fetching large images, etc.), it currently takes a while to load, leaving plenty of room to optimise.
On the machine learning side, the model we use now is purely trained on the FER2013 dataset. One piece of future work to improve the model's performance is to gather users' images (with their consent) and further train the current checkpoint of the model. Another is to improve the inference time of the AWS web service.
References:
- Pecoraro, Roberto, et al. "Local Multi-Head Channel Self-Attention for Facial Expression Recognition." arXiv preprint arXiv:2111.07224 (2021).
Built With
- amazon-web-services
- anaconda
- cockroachdb
- lhc-net
- nextjs
- prisma
- python
- self-attention
- shell
- tailwindcss
- tensorflow
- themoviedb
- typescript
- vercel