This is the Analytics Service component of the Splitwise Clone project, implementing 4 core responsibilities:
- Spending pattern analysis and visualization
- Group spending statistics aggregation
- ML-based expense categorization
- Historical trend analysis
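As a rough illustration of the historical trend analysis listed above, the sketch below groups expenses by calendar month and reports the month-over-month change. It is not the code in analysis.py: the `(date, amount)` record shape and the function name `monthly_trend` are assumptions made for this example.

```python
from collections import defaultdict
from datetime import date


def monthly_trend(expenses):
    """Sum (date, amount) records per month and compute month-over-month change.

    Hypothetical helper: the record shape is an assumption, not the schema in init.sql.
    """
    totals = defaultdict(float)
    for spent_on, amount in expenses:
        totals[(spent_on.year, spent_on.month)] += amount

    trend = []
    previous = None
    for year, month in sorted(totals):
        total = round(totals[(year, month)], 2)
        # The first month has no predecessor, so its change is None.
        change = None if previous is None else round(total - previous, 2)
        trend.append({"month": f"{year:04d}-{month:02d}", "total": total, "change": change})
        previous = total
    return trend


if __name__ == "__main__":
    sample = [
        (date(2024, 1, 5), 40.0),
        (date(2024, 1, 20), 10.0),
        (date(2024, 2, 3), 65.0),
    ]
    for row in monthly_trend(sample):
        print(row)
```

Months with no expenses are simply absent from the result; a real implementation might fill them with zero totals so that charts show a continuous axis.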
split_data/
├── Core Python Files
│ ├── analysis.py # Spending patterns, group stats, trend analysis
│ ├── chart_data.py # Chart data generation for visualization
│ ├── expense_classifier.py # ML-based expense categorization
│ ├── database.py # Database connection and queries
│ └── api_server.py # Flask API server
│
├── ML Components
│ ├── train_classifier.py # Train the ML model
│ └── expense_classifier_model.pkl # Trained model (run train_classifier.py first)
│
├── Database
│ └── init.sql # Database schema
│
├── Docker
│ ├── Dockerfile # Container configuration
│ └── docker-compose.yml # Full stack setup (API + MySQL)
│
├── Testing
│ ├── test_all_responsibilities.py # Test all 4 responsibilities
│ ├── test_classifier.py # Test ML classifier
│ ├── test_workflow.py # Test complete workflow
│ └── test_settlement_logic.py # Test settlement logic
│
├── Utilities
│ ├── generate_dummy_data.py # Generate test data
│ ├── settlement_checker.py # Background settlement checker service
│ ├── chart_data_api.py # CLI script for chart data
│ └── api_example.py # API usage examples
│
└── Documentation
  ├── RESPONSIBILITIES_GUIDE.md # Complete API documentation
  ├── HOW_TO_TEST.md # Testing guide
  └── DOCKER_SETUP.md # Docker setup guide
- Python 3.8+
- MySQL (via Docker)
- Docker & Docker Compose
- Install dependencies:
  pip install -r requirements.txt
- Train the ML model (first time only):
  python3 train_classifier.py
- Start the database:
  docker-compose up -d mysql
- Generate test data (optional):
  python3 generate_dummy_data.py
- Start the API server:
  python3 api_server.py

Or use Docker Compose for everything:
docker-compose up -d
- GET /api/users/<user_id>/analysis/patterns - Spending pattern analysis
- GET /api/groups/<group_id>/statistics - Group spending statistics
- POST /api/tags/suggest - ML-based expense categorization (top 3 suggestions)
- GET /api/users/<user_id>/analysis/trends - Historical trend analysis

- GET /api/users/<user_id>/charts - All chart data
- GET /api/users/<user_id>/charts/weekly - Weekly expenses
- GET /api/users/<user_id>/charts/monthly - Monthly expenses
- GET /api/users/<user_id>/charts/categories - Expenses by category
See RESPONSIBILITIES_GUIDE.md for complete API documentation.
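The GET endpoints above can be called from Python with the standard library alone. The following is a minimal sketch, assuming the server listens on `http://localhost:5000` (Flask's default port, which may differ from what api_server.py uses) and responds with JSON. The helper names `endpoint` and `get_json` are hypothetical and not part of this project.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # assumption: Flask's default port; adjust to your setup


def endpoint(path_template, **params):
    """Fill a path such as '/api/users/<user_id>/charts' with URL-quoted values."""
    path = path_template
    for name, value in params.items():
        path = path.replace(f"<{name}>", quote(str(value), safe=""))
    return BASE_URL + path


def get_json(url, timeout=5):
    """GET the URL and decode the body as JSON (assumes the service responds with JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    url = endpoint("/api/users/<user_id>/analysis/patterns", user_id=1)
    print(url)
    # get_json(url) performs the actual request; it requires the API server to be running.
```

The request body for POST /api/tags/suggest is not described here, so the sketch covers only the GET routes; see RESPONSIBILITIES_GUIDE.md for the payload format.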
Run all tests:
python3 test_all_responsibilities.py
Test individual components:
python3 test_classifier.py # Test ML classifier
python3 analysis.py # Test analysis functions
python3 test_workflow.py # Test complete workflow
See HOW_TO_TEST.md for detailed testing instructions.
- Language: Python 3.8+
- Framework: Flask (REST API)
- ML Library: scikit-learn (TF-IDF + Naive Bayes)
- Database: MySQL
- Containerization: Docker & Docker Compose
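To make the "TF-IDF + Naive Bayes" entry concrete, here is a minimal scikit-learn sketch that trains a text classifier and returns the top 3 category suggestions with their probabilities. It is an illustration under stated assumptions, not the contents of expense_classifier.py or train_classifier.py: the training phrases, category names, and the `train` and `suggest` helpers are all invented for this example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented for illustration only.
descriptions = [
    "pizza dinner", "lunch at cafe", "grocery store run",
    "uber ride", "bus ticket", "train pass",
    "electricity bill", "internet bill", "water bill",
    "movie tickets", "concert tickets", "cinema night",
]
labels = ["food"] * 3 + ["transport"] * 3 + ["utilities"] * 3 + ["entertainment"] * 3


def train(texts, categories):
    """Fit a TF-IDF vectorizer followed by a multinomial Naive Bayes classifier."""
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(texts, categories)
    return model


def suggest(model, text, k=3):
    """Return the k most probable (category, probability) pairs for a description."""
    probabilities = model.predict_proba([text])[0]
    ranked = sorted(zip(model.classes_, probabilities), key=lambda pair: pair[1], reverse=True)
    return [(category, float(p)) for category, p in ranked[:k]]


if __name__ == "__main__":
    classifier = train(descriptions, labels)
    for category, probability in suggest(classifier, "uber ride to airport"):
        print(f"{category}: {probability:.2f}")
```

Returning ranked probabilities rather than a single label lets a client present several suggestions and leave the final choice to the user.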
| Responsibility | Main Files |
|---|---|
| #1: Spending Patterns | analysis.py, chart_data.py |
| #2: Group Statistics | analysis.py |
| #3: ML Categorization | expense_classifier.py, train_classifier.py |
| #4: Historical Trends | analysis.py |
- RESPONSIBILITIES_GUIDE.md - Complete API documentation with examples
- HOW_TO_TEST.md - Testing guide for all 4 responsibilities
- DOCKER_SETUP.md - Docker deployment guide
Jiawei Li - Analytics Service Implementation