[41] - Image Caption AI

Inspiration

We wanted to dip our toes into machine learning, since it seems like a really powerful technology and none of us had ever used it before. We thought that generating a caption from an image would be a realistic task for an AI, and one that would be easy to turn into a cool project.

What it does

Our project is a web app where you can upload a photo, generate a caption for it, and then automatically tweet the photo and caption from our Twitter bot account.

How we built it

Our application is a web app built with Flask. For caption generation, we used the Keras deep learning library. We first used it locally to train our model on a dataset of images and captions, then loaded the trained model into the web app and used it to generate captions. The images and captions are then tweeted using the Twitter API.
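The caption-generation step can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the write-up does not say how the model turns an image into a sentence, so the sketch uses greedy decoding, a common approach in which the most probable next word is appended until an end token or a length limit is reached. The names `generate_caption`, `predict_next`, `toy_predict_next`, and the `<start>`/`<end>` tokens are hypothetical. In a real app, `predict_next` would wrap a call to the trained Keras model.

```python
from typing import Callable, Dict, List

# Hypothetical special tokens marking the start and end of a caption.
START, END = "<start>", "<end>"


def generate_caption(
    image_features: object,
    predict_next: Callable[[object, List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Greedy decoding: repeatedly append the most probable next word."""
    words = [START]
    for _ in range(max_len):
        # Ask the model for a probability per candidate next word.
        probs = predict_next(image_features, words)
        best = max(probs, key=probs.get)
        if best == END:
            break
        words.append(best)
    # Drop the start token before returning the caption text.
    return " ".join(words[1:])


def toy_predict_next(image_features: object, words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained model: follows a fixed script and ignores the image."""
    script = ["a", "dog", "on", "grass", END]
    position = len(words) - 1  # number of words generated so far
    target = script[min(position, len(script) - 1)]
    return {word: (0.9 if word == target else 0.025) for word in set(script)}


if __name__ == "__main__":
    print(generate_caption(None, toy_predict_next))
```

With the toy stand-in, `generate_caption(None, toy_predict_next)` returns "a dog on grass". Keeping the decoding loop separate from the model call means the same function could be reused unchanged once a real trained model is plugged in.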

Challenges we ran into

The biggest challenge we encountered was limited computing power for training the AI model. We didn't have access to cloud servers, so we had to train locally, which took 4+ hours. This meant we were only able to go through two iterations of our model. Near the end of the project, when we tried to deploy our code to a Heroku server, we ran into trouble with large dependencies and struggled to get our code running on the server. We also had a lot of trouble making our web page look good.

Accomplishments that we're proud of

We got a semi-functional machine learning model to work, and we successfully integrated it into a website. This was our first real hackathon, and we did it all with technologies that none of us had ever used before, so we're happy just to have a functional, cohesive result.

What we learned

We learned how to use Flask and, more generally, how to build a website backend. We learned how to build a website front end using HTML and CSS. We learned how to train a machine learning model from a dataset using Keras, and how to use the Twitter API.

What's next for Image Caption AI

In the future, we would like to train our model on a larger, more diverse dataset, which should improve our captions considerably. If possible, we would use a dataset of image captions from social media, so our captions would be more appropriate for social media. We would also like to train multiple models to generate captions with different moods, so the user can select the most appropriate one.

Updates

Riley Bridges posted an update — Apr 18, 2021 03:59 PM EDT

We realized too late that our Heroku server doesn't have enough memory to run our code. Since we can't afford to upgrade to a more powerful server, our website doesn't work. Everything works on localhost, though, so we can demo it if you want.

Raymond Yang started this project — Apr 17, 2021 04:24 PM EDT
