
KalleHahl/data-science-group-project


Pocket Sommelier is a wine recommendation web app built with FastAPI.
It uses TF-IDF vectorization of wine descriptions to suggest similar wines based on text similarity.
All model artifacts are precomputed offline and loaded at runtime for fast responses.

🚀 Features

  • 🔍 Content-based recommendations: Uses TF-IDF + cosine similarity between wine descriptions
  • FastAPI backend: Simple REST API for querying wine similarities
  • 🧠 Precomputed artifacts: TF-IDF vectorizer, sparse matrix, and metadata CSV built via Jupyter Notebook
  • 🐳 Dockerized environment: One-command setup for local development
  • 🧩 Reproducible pipeline: Notebook for preprocessing and asset generation

🧰 Technical Overview

  1. Data Preparation:

    • Combine multiple open-source wine datasets
    • Clean and normalize descriptions
    • Build TF-IDF model and compute cosine similarity matrix
  2. Model Artifacts:

    • tfidf_vectorizer.joblib – serialized scikit-learn vectorizer
    • tfidf_matrix.npz – sparse matrix representation
    • wines_data.csv – metadata (name, country, variety, etc.)
  3. API Endpoints:

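    | Endpoint                        | Method | Description                                        |
    |---------------------------------|--------|----------------------------------------------------|
    | /by_description?description=... | GET    | Returns the 5 wines most similar to the input text |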
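The offline/online split described above can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the project uses a serialized scikit-learn vectorizer, whereas this stand-in computes TF-IDF by hand with the same smoothed IDF form that scikit-learn uses by default. The sample wines and the names `WINES`, `tokenize`, `vectorize`, `build_index`, and `recommend` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical sample rows; the real metadata lives in wines_data.csv.
WINES = [
    {"name": "Wine A", "description": "Crisp citrus and green apple with bright acidity"},
    {"name": "Wine B", "description": "Dark cherry, leather and firm tannins"},
    {"name": "Wine C", "description": "Zesty lemon citrus, crisp and refreshing"},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(tokens, idf):
    """Turn tokens into an L2-normalised sparse TF-IDF vector (a dict)."""
    counts = Counter(t for t in tokens if t in idf)  # unknown terms are dropped
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def build_index(wines):
    """Offline step: compute IDF weights and one vector per wine."""
    docs = [tokenize(w["description"]) for w in wines]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return idf, [vectorize(doc, idf) for doc in docs]


def recommend(query, idf, matrix, wines, k=5):
    """Online step: rank wines by cosine similarity to the query text."""
    q = vectorize(tokenize(query), idf)
    # Vectors are unit length, so the dot product equals the cosine similarity.
    scores = [sum(w * vec.get(t, 0.0) for t, w in q.items()) for vec in matrix]
    ranked = sorted(range(len(wines)), key=lambda i: (-scores[i], i))[:k]
    return [(wines[i]["name"], scores[i]) for i in ranked]


idf, matrix = build_index(WINES)
for name, score in recommend("crisp citrus white", idf, matrix, WINES, k=2):
    print(f"{name}: {score:.3f}")
```

In the real app, `build_index` corresponds to the notebook that writes the artifacts, and only the equivalent of `recommend` runs per request, which is why responses stay fast.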

🐳 Quick Start (with Docker)

These instructions assume you have Docker and docker-compose installed.

  1. Start the API container:
    docker-compose up -d
  2. Open http://0.0.0.0:8000/ in your browser (if that address does not load, try http://localhost:8000/)
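Once the container is running, the `/by_description` endpoint can also be queried from a script. The sketch below uses only the Python standard library. It assumes the API is reachable at `http://localhost:8000` and that the endpoint responds with JSON; neither is confirmed here, and the helper names `build_query_url` and `fetch_similar_wines` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the container publishes the API on localhost port 8000.
BASE_URL = "http://localhost:8000"


def build_query_url(description, base_url=BASE_URL):
    """Build the GET URL for /by_description with a safely encoded query."""
    return f"{base_url}/by_description?{urlencode({'description': description})}"


def fetch_similar_wines(description, base_url=BASE_URL, timeout=10):
    """Call the endpoint and decode the body (assumes a JSON response)."""
    with urlopen(build_query_url(description, base_url), timeout=timeout) as resp:
        return json.load(resp)


# Building the URL needs no running server.
print(build_query_url("dry red with dark cherry"))
```

Calling `fetch_similar_wines("dry red with dark cherry")` performs the actual request, so it only works while the API container is up.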

🏆 Acknowledgements

📊 Datasets

The datasets used in this project are publicly available and were sourced from the following repositories:

🎨 UI Styling

Parts of the user interface use ready-made CSS templates from uiverse.io.
These templates are used under the terms of the MIT License.

About

Data science group project for the University of Helsinki's Introduction to Data Science course.
