CARmen

Inspiration

We took on the SEAT challenge because we wanted to push ourselves and test the limits of our knowledge, and a project of this scope was sure to do both.

Project Overview

We developed an intuitive and intelligent platform that helps potential buyers interact with and explore the Cupra Tavascan. Through a combination of conversational interfaces, image understanding, and immersive visualization, users can get to know the car in a more engaging and informative way.

Key Features

Chatbot: Provides accurate, conversational responses to questions about features, specifications, pricing, and more.
3D Simulation: Offers an interactive experience to explore the interior of the Cupra Tavascan.
Image Recognition: Allows users to upload photos of the vehicle or specific components to receive detailed information.
Object Detection: Automatically identifies and labels parts in images.
User Experience: The interface is designed with Cupra’s visual identity in mind: clean, modern, and easy to navigate.

How It Works

1. Data Preprocessing

We begin by processing the Cupra Tavascan user manual (PDF):

The content is divided into meaningful sections (e.g., safety, dashboard, controls).
Text and images are extracted and indexed for later retrieval.
Metadata such as section titles and page numbers is added to enable context-aware search.
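
The segmentation step above can be sketched roughly as follows. This is a minimal illustration, not our actual parser: it assumes the page text has already been extracted from the PDF, and the Chunk class, the split_into_chunks function, and the set of section titles are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    section: str  # section title the text belongs to
    page: int     # 1-based page number in the manual
    text: str     # body text of the chunk


def split_into_chunks(pages, section_titles):
    """Group page text under the most recent section title seen.

    pages: list of strings, one per page (text already extracted from the PDF).
    section_titles: hypothetical set of heading lines that start a new section.
    """
    chunks = []
    current = "Front matter"  # label used before the first known heading
    for page_number, page_text in enumerate(pages, start=1):
        buffer = []
        for line in page_text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line in section_titles:
                # Flush what was collected for the previous section on this page.
                if buffer:
                    chunks.append(Chunk(current, page_number, " ".join(buffer)))
                    buffer = []
                current = line
            else:
                buffer.append(line)
        if buffer:
            chunks.append(Chunk(current, page_number, " ".join(buffer)))
    return chunks


# Invented sample pages, not taken from the real manual.
pages = [
    "Safety\nAlways fasten your seat belt.\nDashboard\nThe display shows speed.",
    "Warning lights appear on the left.\nControls\nUse the stalk for wipers.",
]
chunks = split_into_chunks(pages, {"Safety", "Dashboard", "Controls"})
for chunk in chunks:
    print(chunk.section, chunk.page, chunk.text)
```

Keeping the section title and page number on every chunk is what lets a later search step report where in the manual an answer came from.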

2. Retrieval-Augmented Generation (RAG)

To answer user questions, we use a Retrieval-Augmented Generation approach:

A vector database stores embeddings of the manual's content, including both text and images.
When a user submits a query, either as text or by selecting a region of an image, the system retrieves the most relevant chunks.
A language model (LLM) uses the retrieved information to generate accurate and grounded answers.

This system supports both textual questions and interactions based on specific regions within uploaded images.
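
To make the retrieve-then-generate flow concrete, here is a small text-only sketch. It is an illustration under stated assumptions rather than our production pipeline: a word-count vector stands in for a learned embedding model, a plain Python list stands in for the vector database, and the function names (embed, cosine, retrieve, build_prompt) and the sample sentences are invented for the example. The final call to the language model is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(query, context_chunks):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the manual excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {query}\n"
    )


# Invented example sentences, not quotes from the actual manual.
manual_chunks = [
    "The seat belt warning light stays on until all belts are fastened.",
    "Use the left stalk to control the windscreen wipers.",
    "The charging port is located on the rear left side of the vehicle.",
]
question = "Where is the charging port?"
top = retrieve(question, manual_chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Restricting the prompt to the retrieved excerpts is what keeps the generated answer grounded in the manual instead of in the model's general knowledge.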

3. User Interface

The frontend is designed for clarity, responsiveness, and ease of use:

Users can upload an image and select an area to receive relevant information.
A chatbot provides natural language interaction for asking questions about the vehicle.
A 3D simulation allows users to visually explore the interior of the Cupra Tavascan.
All results are shown in a clear and informative way, maintaining a consistent visual design aligned with the Cupra brand.

Technical Implementation

Our system is composed of the following main components:

Data Preprocessing: The Cupra manual is parsed, segmented, and enriched with metadata for efficient search.
Vector Indexing: Text and image embeddings are stored in a vector database to enable semantic similarity search.
RAG Pipeline: Combines vector-based retrieval with a large language model to generate context-based answers.
Frontend: A web application that handles image uploads, region selection, chatbot interaction, and 3D visualization.
Image Processing: Uses the Canvas API to extract selected image regions and send them in base64 format to the backend.
Backend API: Processes inputs, retrieves relevant content, and returns formatted explanations for the frontend to display.
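
As an illustration of the hand-off between Image Processing and the Backend API, the sketch below shows one way a backend could decode the base64 region it receives. It assumes the frontend sends either a data URL (the form the Canvas API's toDataURL produces) or a bare base64 string; the decode_region function and the sample payload are hypothetical and not taken from our codebase.

```python
import base64
import binascii

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_region(payload):
    """Decode a base64 image payload, with or without a data-URL prefix.

    Returns (mime_type, raw_bytes). mime_type is None when no prefix is given.
    """
    mime_type = None
    if payload.startswith("data:"):
        header, _, payload = payload.partition(",")
        # header looks like "data:image/png;base64"
        mime_type = header[len("data:"):].split(";")[0] or None
    try:
        raw = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid base64") from exc
    return mime_type, raw


# Simulate what the frontend might send: a data URL wrapping some PNG-like bytes.
fake_png = PNG_SIGNATURE + b"example-bytes"
data_url = "data:image/png;base64," + base64.b64encode(fake_png).decode("ascii")

mime, image_bytes = decode_region(data_url)
print(mime, image_bytes.startswith(PNG_SIGNATURE))
```

Validating the payload before passing it on means a malformed upload fails early with a clear error instead of reaching the image-embedding step.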