Enhancing Scent Selection with Machine Learning
- Synesthesia: a neurological phenomenon where one sensory stimulus involuntarily triggers another perception.
- Color Psychology: the study of how colors influence human emotions, perceptions, and behavior.
- Fragrance Expertise: combining these principles to design a system that connects visual, textual, and olfactory signals into meaningful recommendations.
Finding the right fragrance is highly personal — it reflects identity, emotions, and moments. This system acts as an expert advisor, helping users discover scents that match both mood and style.
- Data Analysis: Explore historical fragrance data to identify key industry trends and the most frequently used ingredients.
- Recommendation Engine: Use AI to provide personalized fragrance recommendations.
- Fashion Integration: Link fragrances with fashion and lifestyle to enhance self-identity.
- Source: Fragrantica.com (via Kaggle)
- Size: 24,063 rows × 18 columns
- Features: brand, country, year, gender, ratings, olfactive notes (top, middle, base), perfumer, accords, family.
- More than 1,600 fragrance notes across 20 olfactive families.
- Data Cleaning & Exploration
- Standardization of the fragrance dataset.
- Textual Processing
- Descriptions vectorized with TF-IDF.
- Similarity computed with a k-nearest-neighbors (KNN) search.
- Query expansion with synonyms (WordNet).
- Image Processing
- Background removal (rembg).
- Face removal (OpenCV Haar Cascade).
- Dominant color extraction (KMeans).
- Color-to-tag mapping (e.g., blue → aquatic).
- Recommendation Engine
- Combines text-based and image-based signals.
- Provides personalized fragrance suggestions.
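
The textual-processing steps listed above (TF-IDF vectors queried with a nearest-neighbors model) could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy descriptions, the `recommend_by_text` helper, and the choice of cosine distance are all assumptions, and the WordNet query expansion is left out to keep the example self-contained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy descriptions; the real project would read them from the dataset.
perfumes = {
    "Ocean Mist": "fresh aquatic marine citrus sea salt",
    "Velvet Rose": "floral rose peony powdery musk",
    "Amber Night": "warm amber vanilla woody spicy",
    "Citrus Splash": "fresh citrus lemon bergamot aquatic",
}
names = list(perfumes)

# Turn each description into a TF-IDF vector.
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(list(perfumes.values()))

# Index the vectors; cosine distance is an assumed (but common) choice for text.
knn = NearestNeighbors(n_neighbors=2, metric="cosine")
knn.fit(tfidf_matrix)


def recommend_by_text(query, k=2):
    """Return the k perfumes closest to the query as (name, similarity) pairs."""
    query_vector = vectorizer.transform([query])
    distances, indices = knn.kneighbors(query_vector, n_neighbors=k)
    return [
        (names[i], round(float(1.0 - d), 3))
        for i, d in zip(indices[0], distances[0])
    ]


if __name__ == "__main__":
    for name, similarity in recommend_by_text("fresh aquatic citrus"):
        print(name, similarity)
```

With this toy data, only the two descriptions that share words with the query have a non-zero similarity, so they are the ones returned.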
- Image-Based Recommender: Maps uploaded images and palettes to fragrance families.
- Text-Based Recommender: Suggests similar perfumes based on olfactive descriptions.
- Hybrid Model: Integrates image + text to improve precision and personalization.
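
As a rough illustration of how the image-based recommender might go from a palette to fragrance tags, the sketch below extracts dominant colors with KMeans and maps each one to the nearest reference color. The `COLOR_TAGS` table, the helper names, and the nearest-color rule are hypothetical; the project's own mapping (for example in `colors.csv`) may differ. The input is assumed to be pixels that already had the background and faces removed.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical reference palette: tag -> representative RGB color.
COLOR_TAGS = {
    "aquatic": (30, 90, 200),    # blue
    "green": (40, 160, 60),
    "citrus": (250, 210, 40),    # yellow
    "floral": (240, 130, 170),   # pink
    "woody": (120, 80, 40),      # brown
}


def dominant_colors(pixels, n_colors=2, seed=0):
    """Cluster an (N, 3) array of RGB pixels; return centers, largest cluster first."""
    kmeans = KMeans(n_clusters=n_colors, n_init=10, random_state=seed).fit(pixels)
    counts = np.bincount(kmeans.labels_, minlength=n_colors)
    order = np.argsort(counts)[::-1]
    return kmeans.cluster_centers_[order]


def color_to_tag(rgb):
    """Map an RGB color to the tag whose reference color is closest (Euclidean)."""
    rgb = np.asarray(rgb, dtype=float)
    return min(
        COLOR_TAGS,
        key=lambda tag: np.linalg.norm(rgb - np.asarray(COLOR_TAGS[tag], dtype=float)),
    )


def tags_from_pixels(pixels, n_colors=2):
    """Return one tag per dominant color, ordered by how much of the image it covers."""
    return [color_to_tag(center) for center in dominant_colors(pixels, n_colors)]


def make_demo_pixels():
    """Build a synthetic 'image': mostly blue pixels with a smaller yellow region."""
    rng = np.random.default_rng(0)
    blue = rng.normal((30, 90, 200), 5, size=(300, 3))
    yellow = rng.normal((250, 210, 40), 5, size=(100, 3))
    return np.clip(np.vstack([blue, yellow]), 0, 255)


if __name__ == "__main__":
    print(tags_from_pixels(make_demo_pixels()))
```

For a real photo, the pixel array could be obtained by reshaping the image to `(-1, 3)` after the background and face removal steps.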
- Language: Python
- Data Processing: Pandas, Matplotlib
- Image Processing: OpenCV, PIL, rembg, scikit-learn (KMeans)
- Text Processing: NLTK, TF-IDF, Nearest Neighbors
- Deployment (optional): Streamlit
.
├── data/
│ ├── fragrance_ML_model.csv
│ ├── fragrance_database.csv
│ └── colors.csv
├── fragrance_code/
│ ├── data_loader.py
│ ├── image_processing.py
│ ├── processing_text.py
│ ├── model_text.py
│ ├── recommender_image_based.py
│ └── recommender_text_based.py
├── notebooks/
│ └── fragrance_EDA.ipynb
├── models/
│ ├── tfidf_knn_model.pkl
│ ├── vectorizer.pkl
│ └── ...
└── README.md