Inspiration

Visual media is the dominant language of the internet, yet the way we interact with it hasn't caught up. Today, video is where we find our next favorite outfit, a piece of home decor, or a hidden travel destination. To act on what we see, though, we have to pause the experience and translate a visual impression into words, which often fail to capture the specifics of what we saw. We changed that by making every frame of a video interactive, so that inspiration can be captured the moment it happens.

What it does

MINT is a browser extension that turns passive viewing into a searchable experience. When you see an object you like in a video on the page, you click the MINT icon and drag to select it. MINT then identifies the item with AI and finds matching products. Beyond searching, it provides a dedicated space to save and organize those products for later.

How we built it

On the frontend, we used HTML, CSS, and React to build a responsive, aesthetically pleasing interface, featuring a custom Canvas layer designed to track user interactions over live video. The backend was written primarily in Python; it processes image data and interfaces with the Google Gemini and Dedalus Labs APIs for analysis. We chose Gemini for its high performance and ease of use. We managed the entire project through GitHub, which was essential for keeping our frontend selection logic and our backend AI processing in sync.
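As a rough illustration of the handoff between the Canvas layer and the Python backend, here is a minimal sketch of decoding a selection on the server side. It assumes the frontend sends the selected region as a base64 data URL (the format that the browser's `canvas.toDataURL()` produces); the writeup does not specify the actual transport, and the function name `decode_data_url` is hypothetical.

```python
import base64
import binascii

# First eight bytes of every PNG file, used as a cheap sanity check.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes).

    Assumption: the frontend sends something like
    "data:image/png;base64,<payload>". This is an illustrative helper,
    not MINT's actual code.
    """
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    try:
        raw = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid base64") from exc
    return mime_type, raw


def looks_like_png(raw: bytes) -> bool:
    """Return True if the bytes start with the PNG file signature."""
    return raw.startswith(PNG_SIGNATURE)
```

The decoded bytes could then be passed to whichever image-analysis client the backend uses. Rejecting malformed payloads early keeps bad input from reaching the AI call.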

Challenges we ran into

Because we were inspired by existing software such as Google Lens and Pinterest, we had to work hard to ensure our own program wasn't too similar. We had to find the right balance, learning from their strengths while innovating on the delivery.

Accomplishments that we're proud of

We are proud to have delivered an efficient, end-to-end project that actually works in real-world conditions. We built a precise connection between our frontend selection tool and our backend AI processing: ensuring that a user’s mouse drag on a shifting video player translates accurately into an AI-ready image was a technical feat we nailed. We also succeeded in keeping the user experience simple.
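The coordinate arithmetic behind that drag-to-image step can be sketched as follows. This is not our exact implementation, only a minimal illustration written in Python for consistency with the backend; in the extension the same math would run in the browser. It assumes the player letterboxes the frame (as with CSS `object-fit: contain`), and the names `Rect` and `drag_to_frame_rect` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float


def drag_to_frame_rect(start, end, player: Rect, frame_w: int, frame_h: int) -> Rect:
    """Map a mouse drag in page coordinates to a crop box in frame pixels.

    start, end: (x, y) page coordinates where the drag began and ended.
    player:     current bounding box of the video element on the page;
                pass a fresh value per drag, since the player can move.
    frame_w/h:  intrinsic resolution of the video frame.
    Assumption: the frame is scaled uniformly and centered (letterboxed).
    """
    # Uniform scale that fits the whole frame inside the player box.
    scale = min(player.width / frame_w, player.height / frame_h)
    shown_w, shown_h = frame_w * scale, frame_h * scale
    # Top-left corner of the visible picture, after centering.
    off_x = player.x + (player.width - shown_w) / 2
    off_y = player.y + (player.height - shown_h) / 2

    def to_frame(px, py):
        # Undo the offset and scale, then clamp to the frame bounds.
        fx = (px - off_x) / scale
        fy = (py - off_y) / scale
        return min(max(fx, 0), frame_w), min(max(fy, 0), frame_h)

    x0, y0 = to_frame(*start)
    x1, y1 = to_frame(*end)
    # Taking min/abs lets the user drag in any direction.
    return Rect(min(x0, x1), min(y0, y1), abs(x1 - x0), abs(y1 - y0))
```

For example, with a 1280x720 frame shown in a 640x480 player positioned at (100, 50), a drag from (420, 290) to (260, 200) maps to a 320x180 crop whose top-left corner is at (320, 180) in frame pixels. Clamping keeps drags that stray into the letterbox bars or outside the player from producing an out-of-bounds crop.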

What we learned

We learned a lot about building an app efficiently and about troubleshooting an application while four team members merged many different features.

What's next for MINT

Next, we would like to add an avatar feature to enhance the user experience. It will draw on the user's saved clothing bookmarks, letting them dress their own avatar and explore different looks.
