🔍 SmartSearch – Multimodal AI-Powered Product Finder
📌 Overview
SmartSearch is an AI-driven product discovery platform that helps users find the exact products they need—especially for tech accessories like chargers, adapters, or cables—by allowing them to search using text, image, or voice input. The app intelligently analyzes these inputs using Google Gemini’s multimodal AI, matches them with product data embedded in MongoDB Atlas Vector Search, and returns the most compatible and high-quality results.
🎯 Problem Statement
In today’s digital marketplaces, especially on platforms like Amazon, finding the right product can be overwhelming:
- Multiple brands sell nearly identical items (e.g., USB-C chargers).
- Product listings are fragmented, poorly categorized, or misleading.
- Users often make purchases that are incompatible with their devices.
These problems stem from competition among manufacturers, inconsistent listings for the same item, and poor metadata quality.
🚀 Solution
SmartSearch provides an intelligent, unified product search experience powered by:
- Multimodal AI understanding: Users can upload an image, type a query, or speak a request.
- Vector-based product retrieval: Every product is embedded into vector space using Gemini, and the resulting embedding is stored in MongoDB Atlas.
- AI-enhanced filtering: Gemini Gen AI helps reason over vague or partial queries, clarifying and filtering results before returning them.
- Accurate results: Metadata filtering is done before vector search to narrow down candidates, improve speed, and ensure compatibility.
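The metadata-first idea above can be illustrated with a small in-memory sketch. It is not the project's code: the product records, the three-dimensional vectors, and the `search` helper are made up for illustration, and a real deployment would delegate both steps to MongoDB Atlas rather than looping in Python.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(products, query_vec, filters, k=3):
    """Filter by exact metadata match first, then rank survivors by similarity."""
    # Step 1: keep only products whose metadata matches every filter.
    candidates = [
        p for p in products
        if all(p.get(field) == value for field, value in filters.items())
    ]
    # Step 2: rank the remaining candidates by cosine similarity to the query.
    ranked = sorted(
        candidates,
        key=lambda p: cosine(p["embedding"], query_vec),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy catalog; real embeddings have hundreds of dimensions.
PRODUCTS = [
    {"name": "65W USB-C charger", "type": "charger", "embedding": [0.9, 0.1, 0.0]},
    {"name": "USB-C to HDMI adapter", "type": "adapter", "embedding": [0.8, 0.2, 0.1]},
    {"name": "20W Lightning charger", "type": "charger", "embedding": [0.1, 0.9, 0.0]},
]

results = search(PRODUCTS, [1.0, 0.0, 0.0], {"type": "charger"})
```

Note that the adapter is excluded even though its vector is close to the query: the filter removes incompatible items before similarity is ever computed.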
💡 Key Features
- 🧠 Gemini Multimodal Search – Understands both images and text simultaneously.
- 🔍 Vector Similarity Search – Uses MongoDB to search high-dimensional product embeddings.
- 🧹 Metadata-First Filtering – Filters by product type, brand, wattage, and compatibility before vector search to optimize relevance.
- 🖼️ Image Upload & GCS Integration – Uploaded images go directly to Google Cloud Storage and are then analyzed by Gemini.
- 🗣️ Voice Search – Voice input is converted to text for natural, hands-free searching.
- ⚡ Fast, Accurate Results – Combines filtering, embeddings, and Gen AI validation to return the most relevant matches in real time.
🧱 Architecture

Backend Pipeline:
- User submits a text, image, or voice search query.
- Flask API receives input, uploads image to Google Cloud Storage if applicable.
- Gemini:
  - Extracts embeddings for text, image, or both (multimodal).
  - Uses Gen AI to reason over and clarify the user’s intent.
- Metadata filters (e.g., device brand = “Lenovo”, type = “charger”) are applied to narrow down the MongoDB vector search space.
- Vector search is performed on MongoDB Atlas, returning a list of semantically similar and compatible products.
- Gemini validates and ranks results based on compatibility and confidence.
- Flask API sends results to the Next.js frontend for display.
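The filter and vector search steps of this pipeline could be expressed as a single Atlas aggregation pipeline. The sketch below only builds the pipeline document; the index name `product_vector_index`, the `embedding` field, and the projected fields are assumptions, not names taken from the project. It relies on the `filter` option of the `$vectorSearch` stage, which requires the filtered fields to be declared as filter fields in the vector index definition.

```python
def build_pipeline(query_vector, filters, limit=10, num_candidates=100):
    """Build a $vectorSearch aggregation pipeline with a metadata pre-filter.

    Index, path, and field names are hypothetical placeholders.
    """
    if num_candidates < limit:
        # Atlas requires numCandidates to be at least as large as limit.
        raise ValueError("num_candidates must be >= limit")

    stage = {
        "index": "product_vector_index",  # assumed index name
        "path": "embedding",              # assumed field holding the vector
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if filters:
        # Every metadata condition must hold, so combine them with $and.
        stage["filter"] = {
            "$and": [{field: {"$eq": value}} for field, value in filters.items()]
        }

    return [
        {"$vectorSearch": stage},
        # Expose the similarity score next to a few display fields.
        {"$project": {
            "_id": 0,
            "name": 1,
            "brand": 1,
            "score": {"$meta": "vectorSearchScore"},
        }},
    ]


pipeline = build_pipeline(
    [0.12, 0.48, 0.31],
    {"type": "charger", "brand": "Lenovo"},
    limit=5,
)
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` on the collection that holds the product embeddings.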
🧰 Tech Stack
| Layer | Tools / Technologies |
|---|---|
| Frontend | Next.js (React), TypeScript, Tailwind CSS |
| Backend | Flask, Python |
| Database | MongoDB Atlas (with Vector Search) |
| AI Models | Google Gemini (Multimodal embeddings + Gen AI) |
| Storage | Google Cloud Storage (for image uploads) |
| Other | LangChain (optional advanced logic handling) |
🏗️ Implementation Highlights
- Product Data: Amazon product metadata scraped and cleaned, including brand, model, wattage, port type, etc.
- Embedding Generation: Product data passed to Gemini to create vector embeddings for text and image content.
- Query Processing: User queries also embedded using Gemini. The query is filtered, clarified, and enhanced using generative reasoning.
- Search & Match: Metadata filters (e.g., category, brand) reduce search space; vector search matches semantically similar items.
- Ranking & Display: Final results validated and scored using Gemini Gen AI before display.
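As an illustration of the cleaning step for product data, the sketch below pulls wattage and port type out of a free-text listing title. The regular expressions, the `extract_metadata` helper, and the sample titles are assumptions made for this example; the project's actual cleaning rules are not described here and are likely broader.

```python
import re

# Matches "65W", "65 W", "65 watt", "65 watts" (case-insensitive).
WATTAGE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:W|watts?)\b", re.IGNORECASE)

# Canonical port name -> pattern covering common spellings in titles.
PORT_PATTERNS = {
    "USB-C": re.compile(r"usb[\s-]?c\b|type[\s-]?c\b", re.IGNORECASE),
    "Lightning": re.compile(r"\blightning\b", re.IGNORECASE),
    "Micro-USB": re.compile(r"micro[\s-]?usb\b", re.IGNORECASE),
}


def extract_metadata(title):
    """Return wattage (float or None) and a list of canonical port names."""
    match = WATTAGE_RE.search(title)
    wattage = float(match.group(1)) if match else None
    ports = [name for name, pattern in PORT_PATTERNS.items() if pattern.search(title)]
    return {"wattage": wattage, "ports": ports}


charger = extract_metadata("65W USB C Laptop Charger")
cable = extract_metadata("Lightning cable 1m")
```

Normalizing spellings such as "USB C", "USB-C", and "Type-C" to one canonical value is what makes exact-match metadata filters usable later in the search step.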
🧗 Challenges We Faced
- Designing a schema that supports filtering + vector similarity without overloading the system.
- Handling inconsistent product descriptions from Amazon (e.g., multiple names for the same item).
- Combining image and text into a single query embedding reliably.
- Integrating Google Cloud services and MongoDB into a unified pipeline.
- Maintaining speed and reliability under multimodal and filtered query load.
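For the challenge of combining image and text into one query embedding, one common approach is a weighted average of the two normalized vectors. This is a sketch of that technique, not a description of how SmartSearch does it. It assumes both vectors come from the same multimodal embedding space and have the same dimension; the `fuse_embeddings` helper and the `text_weight` parameter are hypothetical.

```python
from math import sqrt


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse_embeddings(text_vec, image_vec, text_weight=0.5):
    """Blend a text and an image embedding into one unit-length query vector."""
    if len(text_vec) != len(image_vec):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")

    # Normalize first so neither modality dominates because of its magnitude.
    text_unit = l2_normalize(text_vec)
    image_unit = l2_normalize(image_vec)
    mixed = [
        text_weight * t + (1.0 - text_weight) * i
        for t, i in zip(text_unit, image_unit)
    ]
    # Re-normalize so cosine and dot-product similarity behave consistently.
    return l2_normalize(mixed)


fused = fuse_embeddings([2.0, 0.0], [0.0, 5.0])
```

Raising `text_weight` favors the typed query over the uploaded photo, which can help when the image is blurry or shows several objects.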
🏆 Achievements
- Built a full-stack AI application integrating real-time image and text-based product search.
- Achieved sub-second search times on a dataset of Amazon products using MongoDB vector search + metadata filtering.
- Leveraged Gemini’s generative capabilities to improve search query interpretation and result validation.
- Seamlessly connected the backend to a responsive, interactive frontend built with modern tools.
📚 What We Learned
- Metadata + vector search = 🔥 powerful and accurate product discovery.
- Gemini’s strength isn’t just in generating content—it’s incredibly useful for filtering, reasoning, and ranking.
- Google Cloud’s seamless integration with storage and AI APIs makes deployment fast and scalable.
- Embedding strategies differ significantly between pure text and multimodal input; multimodal gives better context.
🔮 Future Plans
- 🌍 Add multilingual support for global users.
- 📱 Launch a mobile version with camera-based live matching.
- 🛍️ Integrate affiliate links or checkout with vendors.
- 🤝 Collaborate with manufacturers and retailers to ensure updated listings.
- 🎯 Add personalization: learning from user preferences and history.
- 🎙️ Expand voice assistant capabilities and smart device integration.
Built With
- cloudrun
- gcp
- gemini
- mongodb
- python
- typescript
