A comprehensive tutorial project demonstrating production-ready model deployment and monitoring practices using LightGBM, FastAPI, Docker, Kubernetes, and AWS services.
This project demonstrates best practices for deploying and monitoring machine learning models in production. It includes:
- Model Serving: FastAPI-based REST API for real-time predictions
- Containerization: Docker and Kubernetes deployment configurations
- CI/CD: GitHub Actions and AWS CodeBuild/CodePipeline integration
- Monitoring: CloudWatch, Athena, and custom metrics
- Scalability: Horizontal scaling with Kubernetes
- FastAPI REST API for model inference
- LightGBM Model serving with batch prediction support
- Docker containerization
- Kubernetes deployment manifests with HPA (auto-scaling)
- CI/CD with GitHub Actions and AWS CodeBuild
- CloudWatch logging and monitoring
- Athena for querying prediction logs (with partitioned tables)
- S3 model storage and prediction logging (Parquet format)
- AWS ALB Ingress for production routing
- Health checks and metrics endpoints
- Feature extraction and validation
- Error handling and logging
- Load testing script
- Makefile for convenient operations
┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│  FastAPI API    │
│  (Port 8000)    │
└──────┬──────────┘
       │
       ├──► Model Loader ──► LightGBM Model
       │
       ├──► Feature Extractor
       │
       ├──► Metrics Collector
       │
       ├──► CloudWatch Logger
       │
       └──► Data Pipeline ──► S3 ──► Athena
- Clone and set up:
cd model_deployment_monitoring_tutorial
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
- Run the API:
python -m uvicorn src.api:app --reload
- Test the API:
curl http://localhost:8000/health
# Build image
docker build -t model-deployment-tutorial .
# Run container
docker run -p 8000:8000 model-deployment-tutorial
# Or use docker-compose
docker-compose up
curl -X POST "http://localhost:8000/predict" \
-H "Content-Type: application/json" \
-d '{
"user_id": 259,
"movie_id": 298,
"age": 21,
"gender": "M",
"occupation_new": "student",
"release_year": 1997.0,
"Action": 0,
"Adventure": 1,
"War": 1,
"user_total_ratings": 2,
"user_liked_ratings": 2,
"user_like_rate": 1.0
}'
docker build -t model-deployment-tutorial .
docker run -p 8000:8000 \
-e AWS_REGION=eu-central-1 \
-e S3_BUCKET=your-bucket \
model-deployment-tutorial
- Update image in deployment.yaml:
image: <YOUR_ECR_REGISTRY>/model-deployment-tutorial:latest
- Update secrets (if using secrets.yaml):
# Edit k8s/secrets.yaml with your actual values
# Or use k8s/secrets.yaml.example as a template
- Apply manifests:
# Deploy ConfigMap and Secrets
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml
# Deploy application
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
# Deploy HPA for auto-scaling (optional)
kubectl apply -f k8s/hpa.yaml
# Deploy Ingress for external access (optional)
kubectl apply -f k8s/ingress.yaml
- Check status:
kubectl get pods
kubectl get services
kubectl get hpa # Check auto-scaling
kubectl get ingress # Check ingress (if deployed)
Or use Makefile:
make k8s-deploy # Deploy application
make k8s-apply-hpa # Apply HPA
make k8s-apply-ingress # Apply Ingress
For the complete EKS deployment guide, see EKS_DEPLOYMENT_GUIDE.md.
Quick start:
- Create EKS cluster:
./scripts/setup_eks_cluster.sh --cluster-name model-deployment-cluster
- Deploy application:
./scripts/deploy_to_eks.sh --cluster-name model-deployment-cluster
- Or use CI/CD (see DEPLOYMENT_GUIDE.md):
  - Configure CodeBuild with DEPLOY_TO_K8S=true
  - Automatic deployment on every push
- Push to ECR:
aws ecr get-login-password --region eu-central-1 | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com
docker tag model-deployment-tutorial:latest <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com/model-deployment-tutorial:latest
docker push <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com/model-deployment-tutorial:latest
- Create ECS service using the ECR image
The GitHub Actions workflow automatically builds and deploys on every push to the main branch. Configure these secrets:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- Create CodeBuild project using buildspec.yaml
  - For testing: Use buildspec.test.yaml for test pipeline
  - For deployment: Use buildspec.yaml for build and deploy
- Create CodePipeline with stages:
  - Source: GitHub repository
  - Build: CodeBuild (test) using buildspec.test.yaml
  - Build: CodeBuild (deploy) using buildspec.yaml
  - Deploy: Automatic (via buildspec) or manual approval
- Environment Variables (set in CodeBuild):
  - ECR_REPO_REGISTRY: Your ECR registry URL
  - APP_NAME: model-deployment-tutorial
  - AWS_DEFAULT_REGION: eu-central-1
  - EKS_CLUSTER_NAME: Your EKS cluster name
  - DEPLOY_TO_K8S: true (to enable auto-deployment)
curl http://localhost:8000/metrics
Returns:
- Total requests and predictions
- Average, P95, P99 inference times
- Error rate
- Requests per second
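The figures returned by the metrics endpoint can be derived from a small in-memory collector. The following is a minimal sketch and not the project's actual implementation: the class name `MetricsCollector`, its field names, the output keys, and the nearest-rank percentile method are all assumptions made for illustration.

```python
import math
import time
from dataclasses import dataclass, field


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list (pct in 0..100)."""
    if not sorted_values:
        return 0.0
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


@dataclass
class MetricsCollector:
    """Hypothetical in-memory collector (illustrative only)."""

    started_at: float = field(default_factory=time.monotonic)
    total_requests: int = 0
    total_errors: int = 0
    inference_times_ms: list = field(default_factory=list)

    def record(self, inference_ms, error=False):
        """Count one request; only successful ones contribute a timing."""
        self.total_requests += 1
        if error:
            self.total_errors += 1
        else:
            self.inference_times_ms.append(inference_ms)

    def snapshot(self, now=None):
        """Summarise everything recorded since the collector was created."""
        now = time.monotonic() if now is None else now
        elapsed = max(now - self.started_at, 1e-9)  # avoid division by zero
        times = sorted(self.inference_times_ms)
        requests = self.total_requests
        return {
            "total_requests": requests,
            "total_predictions": len(times),
            "avg_inference_ms": sum(times) / len(times) if times else 0.0,
            "p95_inference_ms": percentile(times, 95),
            "p99_inference_ms": percentile(times, 99),
            "error_rate": self.total_errors / requests if requests else 0.0,
            "requests_per_second": requests / elapsed,
        }
```

A request handler would call `record(...)` once per request and the metrics route would return `snapshot()`. Keeping every timing in a list is fine for a tutorial, but a long-running service would likely bound it with a sliding window.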
Enable CloudWatch logging and metrics:
export ENABLE_CLOUDWATCH=true
export AWS_REGION=eu-central-1
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
Set up the CloudWatch dashboard:
aws cloudwatch put-dashboard \
--dashboard-name ModelDeployment \
--dashboard-body file://scripts/setup_cloudwatch_dashboard.json
The application automatically sends metrics to CloudWatch:
- RequestCount: Total number of requests
- ErrorCount: Number of errors
- InferenceTime: Model inference time (ms)
- RequestTime: Total request time (ms)
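As a rough illustration of how these four metrics could be published, the sketch below builds a payload in the shape expected by boto3's CloudWatch `put_metric_data` call. The namespace `ModelDeployment` (borrowed from the dashboard name), the helper names `build_metric_data` and `publish`, and the choice to emit one datum per request are assumptions; the project's own logger may batch or name things differently.

```python
from datetime import datetime, timezone

NAMESPACE = "ModelDeployment"  # assumed namespace, not confirmed by the project


def build_metric_data(inference_ms, request_ms, is_error, timestamp=None):
    """Build the MetricData list for one handled request."""
    ts = timestamp or datetime.now(timezone.utc)

    def datum(name, value, unit):
        return {
            "MetricName": name,
            "Value": float(value),
            "Unit": unit,
            "Timestamp": ts,
        }

    return [
        datum("RequestCount", 1, "Count"),
        datum("ErrorCount", 1 if is_error else 0, "Count"),
        datum("InferenceTime", inference_ms, "Milliseconds"),
        datum("RequestTime", request_ms, "Milliseconds"),
    ]


def publish(metric_data, region="eu-central-1"):
    """Send the payload to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it

    client = boto3.client("cloudwatch", region_name=region)
    client.put_metric_data(Namespace=NAMESPACE, MetricData=metric_data)
```

Separating the payload builder from the network call keeps the builder easy to unit test without AWS access.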
Set up the Athena table:
# Update scripts/create_athena_table.sql with your bucket name
aws athena start-query-execution \
--query-string "$(cat scripts/create_athena_table.sql)" \
--result-configuration OutputLocation=s3://your-bucket/athena-results/
Query prediction logs:
-- Get recent predictions
SELECT * FROM model_predictions
WHERE year = 2024 AND month = 12
ORDER BY timestamp DESC
LIMIT 100;
-- Get average prediction by user
SELECT user_id, AVG(prediction) as avg_prediction, COUNT(*) as prediction_count
FROM model_predictions
WHERE year = 2024 AND month = 12
GROUP BY user_id
ORDER BY prediction_count DESC
LIMIT 10;
Note: Predictions are automatically saved to S3 in Parquet format with date/hour partitioning for efficient querying.
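To make the date/hour partitioning concrete, here is a minimal sketch of how an S3 object key could be derived from a prediction timestamp. The Hive-style `year=/month=/day=/hour=` layout, the zero padding, the `predictions` base prefix, and both function names are assumptions; only the `year` and `month` columns appear in the queries above, and the real layout is defined by scripts/create_athena_table.sql.

```python
from datetime import datetime, timezone


def partition_prefix(ts, base="predictions"):
    """Return a Hive-style partition prefix for a timezone-aware datetime."""
    ts = ts.astimezone(timezone.utc)  # partition on UTC to avoid DST gaps
    return (
        f"{base}/year={ts.year}/month={ts.month:02d}"
        f"/day={ts.day:02d}/hour={ts.hour:02d}/"
    )


def object_key(ts, request_id, base="predictions"):
    """Return the full key for one Parquet file inside its partition."""
    return f"{partition_prefix(ts, base)}{request_id}.parquet"
```

Filtering on the partition columns, as the `WHERE year = 2024 AND month = 12` clauses do, lets Athena skip objects stored under other prefixes.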
See training/ directory for:
- Slides: Presentation slides (PDF/PPTX)
- Outline: Training session outline
- Exercises: Hands-on exercises
- AWS Setup Guide: Step-by-step AWS account setup
Environment variables:
| Variable | Description | Default |
|---|---|---|
| ENVIRONMENT | Environment (dev/prod) | dev |
| DEBUG | Debug mode | false |
| MODEL_PATH | Path to model file | model.txt |
| AWS_REGION | AWS region | eu-central-1 |
| S3_BUCKET | S3 bucket for model | - |
| ENABLE_CLOUDWATCH | Enable CloudWatch logging | false |
| ENABLE_REDIS | Enable Redis caching | false |
| ATHENA_DATABASE | Athena database name | - |
| ATHENA_TABLE | Athena table name | model_predictions |
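One way to read these variables with the listed defaults is sketched below. The `Settings` dataclass, the `load_settings` function, and the set of strings accepted as "true" are hypothetical; the project may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings of "enabled"


def _flag(env, name, default=False):
    """Parse a boolean-like environment variable."""
    return env.get(name, str(default)).strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class Settings:
    environment: str
    debug: bool
    model_path: str
    aws_region: str
    s3_bucket: Optional[str]
    enable_cloudwatch: bool
    enable_redis: bool
    athena_database: Optional[str]
    athena_table: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, falling back to the documented defaults."""
    return Settings(
        environment=env.get("ENVIRONMENT", "dev"),
        debug=_flag(env, "DEBUG"),
        model_path=env.get("MODEL_PATH", "model.txt"),
        aws_region=env.get("AWS_REGION", "eu-central-1"),
        s3_bucket=env.get("S3_BUCKET"),  # no default in the table
        enable_cloudwatch=_flag(env, "ENABLE_CLOUDWATCH"),
        enable_redis=_flag(env, "ENABLE_REDIS"),
        athena_database=env.get("ATHENA_DATABASE"),  # no default in the table
        athena_table=env.get("ATHENA_TABLE", "model_predictions"),
    )
```

Accepting any mapping rather than reading `os.environ` directly makes the loader easy to exercise in tests with a plain dictionary.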
# Run all tests
pytest tests/ -v
# Run specific test files
pytest tests/test_api.py -v
pytest tests/test_model_loader.py -v
pytest tests/test_feature_extractor.py -v
# Run with coverage
pytest tests/ --cov=src --cov-report=html
# Health check
curl http://localhost:8000/health
# Metrics
curl http://localhost:8000/metrics
# Prediction (using test request file)
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d @test_request.json
# Using the load test script
./scripts/load_test.sh http://localhost:8000/predict 10 100
# Or use Makefile
make load-test
Once the server is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
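The prediction request shown earlier with curl can also be sent from Python. The sketch below uses only the standard library; the helper names `build_predict_request` and `predict` are hypothetical, and since the response schema is not described here, the parsed JSON is returned as-is.

```python
import json
import urllib.request

# Same payload as the curl example for the /predict endpoint.
EXAMPLE_FEATURES = {
    "user_id": 259,
    "movie_id": 298,
    "age": 21,
    "gender": "M",
    "occupation_new": "student",
    "release_year": 1997.0,
    "Action": 0,
    "Adventure": 1,
    "War": 1,
    "user_total_ratings": 2,
    "user_liked_ratings": 2,
    "user_like_rate": 1.0,
}


def build_predict_request(features, base_url="http://localhost:8000"):
    """Create a POST request for /predict without sending it."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url="http://localhost:8000", timeout=5.0):
    """Send the request to a running server and return the decoded JSON body."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, `predict(EXAMPLE_FEATURES)` sends the same request as the curl command.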
- Use IAM roles instead of access keys when possible
- Store secrets in AWS Secrets Manager or Kubernetes secrets
- Enable HTTPS in production
- Implement authentication/authorization
- Validate all inputs
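As an illustration of the last point, the sketch below checks a prediction payload before it reaches the model. FastAPI applications commonly do this with Pydantic models; this version uses only the standard library, and every rule in it (the required fields, the age range, the 0/1 genre flags, the function name) is a hypothetical example inferred from the sample request, not the project's actual validation logic.

```python
REQUIRED_FIELDS = {
    "user_id": int,
    "movie_id": int,
    "age": int,
    "gender": str,
    "occupation_new": str,
}
GENRE_FLAGS = ("Action", "Adventure", "War")  # genres seen in the sample request


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_features(payload):
    """Return a list of error messages; an empty list means the payload passed."""
    if not isinstance(payload, dict):
        return ["payload: expected a JSON object"]
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        value = payload.get(name)
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
    age = payload.get("age")
    if _is_number(age) and not 0 < age < 120:  # assumed plausible range
        errors.append("age: out of range")
    rate = payload.get("user_like_rate")
    if rate is not None and not (_is_number(rate) and 0.0 <= rate <= 1.0):
        errors.append("user_like_rate: expected a number in [0, 1]")
    for genre in GENRE_FLAGS:
        if genre in payload and payload[genre] not in (0, 1):
            errors.append(f"{genre}: expected 0 or 1")
    return errors
```

Returning all errors at once, rather than failing on the first, lets the API report every problem in a single response.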
This project is for educational/tutorial purposes.
This is a tutorial project. Feel free to use it as a reference for your own projects.
For questions or issues, refer to the training materials in the training/ directory.