RagTagAI

Image Capture Demo
Image Upload Demo
Image Capture Advice Generation
Image Upload Advice Generation

Inspiration

One of our friends could never seem to get his fit right. We made a whole web app to help him (and others like ourselves who sometimes struggle to piece together an outfit).

What it does

Our app allows the user to capture pictures of themselves with a webcam, or upload photos of themselves in their current outfit, and get suggestions and ratings based on the latest social media trends.

How we built it

At the heart of our product is honest, straight-from-the-street fashion advice drawn from recent trending social media posts. We gave each outfit a street score by running sentiment analysis on its comments and taking a weighted average based on each comment's score. For context-aware querying, we used a CLIP model to create an embedding for each image we scraped from fashion message boards. Each embedding also incorporated a description of the image, which we generated with a BLIP (Bootstrapping Language-Image Pre-training) model.
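To illustrate the street score, here is a minimal sketch of a weighted average over comment sentiment. It rests on several assumptions that are not confirmed above: that sentiment lies between -1 and 1, that each comment's weight is its Reddit score (upvotes), that comments with non-positive scores are ignored, and that the result is rescaled to 0-10. The names `Comment` and `street_score` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Comment:
    sentiment: float  # assumed range: -1.0 (negative) to 1.0 (positive)
    upvotes: int      # assumed weight: the comment's own Reddit score


def street_score(comments: list[Comment]) -> float | None:
    """Upvote-weighted mean sentiment, rescaled from [-1, 1] to [0, 10]."""
    # Assumption: drop comments with zero or negative scores so every weight is positive.
    weighted = [(c.sentiment, c.upvotes) for c in comments if c.upvotes > 0]
    total_weight = sum(weight for _, weight in weighted)
    if total_weight == 0:
        return None  # nothing to score
    mean = sum(sentiment * weight for sentiment, weight in weighted) / total_weight
    return round((mean + 1.0) * 5.0, 2)
```

For example, a post with one strongly positive comment at 3 upvotes and one strongly negative comment at 1 upvote would land above the midpoint, because the positive comment carries three times the weight.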

Our web app is hosted on Google Cloud App Engine with a Flask backend. When the web app receives an image from the user, it embeds the image and runs a vector search against a MongoDB Atlas database to find similar styles for comparison. A Gemini LLM generates personalised suggestions through RAG (Retrieval-Augmented Generation), using the closest embeddings from Reddit as context. A custom-trained deep learning model maps the embedded image to its street score, providing a comprehensive evaluation of the user’s fit.
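The retrieval step might look roughly like the sketch below. It builds an Atlas `$vectorSearch` aggregation pipeline and then assembles a RAG prompt from the retrieved posts. The index name, field names (`embedding`, `caption`, `street_score`), the candidate multiplier and the prompt wording are all assumptions for illustration, not the project's actual schema or prompt.

```python
from __future__ import annotations


def build_vector_search_pipeline(query_vector: list[float], k: int = 5) -> list[dict]:
    """Aggregation pipeline returning the k stored outfits nearest to query_vector."""
    return [
        {
            "$vectorSearch": {
                "index": "outfit_vector_index",  # hypothetical index name
                "path": "embedding",             # hypothetical field holding the image embedding
                "queryVector": list(query_vector),
                "numCandidates": k * 20,         # assumed oversampling factor
                "limit": k,
            }
        },
        {
            "$project": {
                "_id": 0,
                "caption": 1,
                "street_score": 1,
                "similarity": {"$meta": "vectorSearchScore"},
            }
        },
    ]


def build_prompt(user_caption: str, neighbours: list[dict]) -> str:
    """Assemble a RAG prompt with the retrieved posts as context."""
    context = "\n".join(
        f"- {n['caption']} (street score {n['street_score']})" for n in neighbours
    )
    return (
        "You are a fashion advisor. Similar outfits from recent posts:\n"
        f"{context}\n\n"
        f"The user's outfit: {user_caption}\n"
        "Suggest concrete improvements."
    )
```

With a PyMongo collection, the pipeline would be passed to `collection.aggregate(...)`, and the resulting documents handed to `build_prompt` before calling the LLM.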

Challenges we ran into

Gathering data that reflects contemporary fashion trends was a challenge. Public datasets are a few years old, so they do not accurately represent today's fashion standards, and we turned to the forum platform Reddit for data instead. However, Reddit's API returns at most 1000 posts per query, which restricted how much data we could retrieve. We mitigated this by sourcing images from as many subreddits as we could, covering multiple fashion styles and user demographics. To avoid overfitting to our limited dataset, we tried to reduce the dimensionality of our image embeddings, but Principal Component Analysis did not identify any dominant components, so we used overparameterised models with early stopping instead.
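Early stopping can be sketched independently of any training framework: track the best validation loss seen so far and stop once it has failed to improve for a set number of epochs. The `patience` and `min_delta` parameters and the class below are illustrative assumptions, not the settings or code we actually trained with.

```python
from __future__ import annotations


class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def stopping_epoch(val_losses: list[float], patience: int = 3) -> int | None:
    """Index of the epoch at which training would stop, or None if it never would."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses):
        if stopper.step(loss):
            return epoch
    return None
```

In a real training loop, `step` would be called once per epoch with the held-out validation loss, and the weights from the best epoch would be restored after stopping.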

We also had trouble training larger models and choosing deployment options because of our limited budget, credits and time. Although the competition generously provided some free credits, they were not enough to experiment with different types of models and solution approaches. Through research and reading, we found a solution architecture that fit our constraints and could feasibly be completed. Deploying the web app was also new to most of us and required us to learn the basics of Docker and containers. Although this added to our time crunch, we are glad to have learned a new technology.

Accomplishments that we're proud of

We are proud to have pushed out a functional product in our very first hackathon, especially since many of the technologies, like MongoDB Vector Search, were new to us and none of us had experience with software engineering at such a scale. We were able to use our respective strengths and work together on the various components of the web app, whilst sharing our knowledge of web development, data mining and AI training along the way.

What's next for RagTagAI

We intend to add more multimodal sources of data: in the future, users of RagTagAI can look forward to fashion advice drawn from TikTok videos, Instagram Reels, YouTube Shorts and online articles. If possible, we would partner with a social media platform and add online-shopping capabilities to our web app, turning it into a storefront where e-commerce platforms can make sales directly through social media.