unalbahadir/model-deployment-tutorial

Model Deployment and Monitoring Tutorial

A comprehensive tutorial project demonstrating production-ready model deployment and monitoring practices using LightGBM, FastAPI, Docker, Kubernetes, and AWS services.

🎯 Overview

This project demonstrates best practices for deploying and monitoring machine learning models in production. It includes:

  • Model Serving: FastAPI-based REST API for real-time predictions
  • Containerization: Docker and Kubernetes deployment configurations
  • CI/CD: GitHub Actions and AWS CodeBuild/CodePipeline integration
  • Monitoring: CloudWatch, Athena, and custom metrics
  • Scalability: Horizontal scaling with Kubernetes

✨ Features

  • ✅ FastAPI REST API for model inference
  • ✅ LightGBM model serving with batch prediction support
  • ✅ Docker containerization
  • ✅ Kubernetes deployment manifests with HPA (auto-scaling)
  • ✅ CI/CD with GitHub Actions and AWS CodeBuild
  • ✅ CloudWatch logging and monitoring
  • ✅ Athena for querying prediction logs (with partitioned tables)
  • ✅ S3 model storage and prediction logging (Parquet format)
  • ✅ AWS ALB Ingress for production routing
  • ✅ Health checks and metrics endpoints
  • ✅ Feature extraction and validation
  • ✅ Error handling and logging
  • ✅ Load testing script
  • ✅ Makefile for convenient operations

πŸ—οΈ Architecture

┌─────────────┐
│   Client    │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│   FastAPI API   │
│   (Port 8000)   │
└──────┬──────────┘
       │
       ├──► Model Loader ──► LightGBM Model
       │
       ├──► Feature Extractor
       │
       ├──► Metrics Collector
       │
       ├──► CloudWatch Logger
       │
       └──► Data Pipeline ──► S3 ──► Athena
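The request path above can be sketched as plain Python objects before any FastAPI wiring. The class and method names here are hypothetical illustrations, not the repository's actual src/ modules, and the model is a dummy callable standing in for a loaded LightGBM booster:

```python
import time

class ModelLoader:
    """Holds a prediction callable; in the real app this would wrap lightgbm.Booster.predict."""
    def __init__(self, predict_fn):
        self.predict_fn = predict_fn

class FeatureExtractor:
    """Orders raw request fields into the feature vector the model expects."""
    FEATURE_ORDER = ["age", "release_year", "user_like_rate"]  # hypothetical subset
    def extract(self, payload: dict) -> list:
        return [payload[name] for name in self.FEATURE_ORDER]

class MetricsCollector:
    """Records per-request inference latency in milliseconds."""
    def __init__(self):
        self.latencies_ms = []
    def record(self, ms: float):
        self.latencies_ms.append(ms)

class PredictionService:
    """Glues loader, extractor, and metrics together, as the API layer would."""
    def __init__(self, loader, extractor, metrics):
        self.loader, self.extractor, self.metrics = loader, extractor, metrics
    def predict(self, payload: dict) -> float:
        start = time.perf_counter()
        features = self.extractor.extract(payload)
        score = self.loader.predict_fn(features)
        self.metrics.record((time.perf_counter() - start) * 1000)
        return score

service = PredictionService(
    ModelLoader(predict_fn=lambda feats: sum(feats) / len(feats)),  # dummy model
    FeatureExtractor(),
    MetricsCollector(),
)
score = service.predict({"age": 21, "release_year": 1997.0, "user_like_rate": 1.0})
```

Keeping the components as separate objects like this is what makes the latency metrics and feature logic independently testable.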

🚀 Quick Start

Local Development

  1. Clone and set up:
cd model_deployment_monitoring_tutorial
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
  2. Run the API:
python -m uvicorn src.api:app --reload
  3. Test the API:
curl http://localhost:8000/health

Docker

# Build image
docker build -t model-deployment-tutorial .

# Run container
docker run -p 8000:8000 model-deployment-tutorial

# Or use docker-compose
docker-compose up

Make a Prediction

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": 259,
    "movie_id": 298,
    "age": 21,
    "gender": "M",
    "occupation_new": "student",
    "release_year": 1997.0,
    "Action": 0,
    "Adventure": 1,
    "War": 1,
    "user_total_ratings": 2,
    "user_liked_ratings": 2,
    "user_like_rate": 1.0
  }'
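The payload must contain every feature the model was trained on. A minimal validation sketch follows, with the required fields taken from the example request above; in the actual API this role is typically played by a Pydantic model, so treat this stdlib version as illustrative only:

```python
# Required fields and expected types, mirroring the example /predict payload.
REQUIRED_FIELDS = {
    "user_id": int, "movie_id": int, "age": int, "gender": str,
    "occupation_new": str, "release_year": float,
    "Action": int, "Adventure": int, "War": int,
    "user_total_ratings": int, "user_liked_ratings": int, "user_like_rate": float,
}

def validate_request(payload: dict) -> list:
    """Return a list of validation errors; an empty list means the payload is usable."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors

example = {
    "user_id": 259, "movie_id": 298, "age": 21, "gender": "M",
    "occupation_new": "student", "release_year": 1997.0,
    "Action": 0, "Adventure": 1, "War": 1,
    "user_total_ratings": 2, "user_liked_ratings": 2, "user_like_rate": 1.0,
}
```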

📦 Deployment

Docker

docker build -t model-deployment-tutorial .
docker run -p 8000:8000 \
  -e AWS_REGION=eu-central-1 \
  -e S3_BUCKET=your-bucket \
  model-deployment-tutorial

Kubernetes

Local/Minikube Deployment

  1. Update image in deployment.yaml:
image: <YOUR_ECR_REGISTRY>/model-deployment-tutorial:latest
  2. Update secrets (if using secrets.yaml):
# Edit k8s/secrets.yaml with your actual values
# Or use k8s/secrets.yaml.example as a template
  3. Apply the manifests:
# Deploy ConfigMap and Secrets
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secrets.yaml

# Deploy application
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml

# Deploy HPA for auto-scaling (optional)
kubectl apply -f k8s/hpa.yaml

# Deploy Ingress for external access (optional)
kubectl apply -f k8s/ingress.yaml
  4. Check status:
kubectl get pods
kubectl get services
kubectl get hpa          # Check auto-scaling
kubectl get ingress      # Check ingress (if deployed)

Or use the Makefile:

make k8s-deploy          # Deploy application
make k8s-apply-hpa       # Apply HPA
make k8s-apply-ingress   # Apply Ingress

Amazon EKS Deployment

📘 For the complete EKS deployment guide, see EKS_DEPLOYMENT_GUIDE.md

Quick start:

  1. Create the EKS cluster:
./scripts/setup_eks_cluster.sh --cluster-name model-deployment-cluster
  2. Deploy the application:
./scripts/deploy_to_eks.sh --cluster-name model-deployment-cluster
  3. Or use CI/CD (see DEPLOYMENT_GUIDE.md):
    • Configure CodeBuild with DEPLOY_TO_K8S=true
    • Automatic deployment on every push

AWS ECS/Fargate

  1. Push to ECR:
aws ecr get-login-password --region eu-central-1 | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com
docker tag model-deployment-tutorial:latest <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com/model-deployment-tutorial:latest
docker push <ACCOUNT_ID>.dkr.ecr.eu-central-1.amazonaws.com/model-deployment-tutorial:latest
  2. Create an ECS service using the ECR image

CI/CD

GitHub Actions

The workflow automatically builds and deploys on every push to the main branch. Configure these repository secrets:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

AWS CodeBuild/CodePipeline

  1. Create CodeBuild project using buildspec.yaml

    • For testing: use buildspec.test.yaml for the test pipeline
    • For deployment: use buildspec.yaml for build and deploy
  2. Create CodePipeline with stages:

    • Source: GitHub repository
    • Build: CodeBuild (test) using buildspec.test.yaml
    • Build: CodeBuild (deploy) using buildspec.yaml
    • Deploy: Automatic (via buildspec) or manual approval
  3. Environment Variables (set in CodeBuild):

    • ECR_REPO_REGISTRY: Your ECR registry URL
    • APP_NAME: model-deployment-tutorial
    • AWS_DEFAULT_REGION: eu-central-1
    • EKS_CLUSTER_NAME: Your EKS cluster name
    • DEPLOY_TO_K8S: true (to enable auto-deployment)

📊 Monitoring

Metrics Endpoint

curl http://localhost:8000/metrics

Returns:

  • Total requests and predictions
  • Average, P95, P99 inference times
  • Error rate
  • Requests per second
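The P95/P99 figures can be derived from recorded inference times with a simple nearest-rank percentile. A sketch of how the metrics endpoint might compute its summary (function names are hypothetical, not the project's actual API):

```python
import math

def percentile(sorted_values: list, pct: float) -> float:
    """Nearest-rank percentile over an already-sorted list of latencies."""
    if not sorted_values:
        return 0.0
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def summarize(latencies_ms: list) -> dict:
    """Compute the average, P95, and P99 latency statistics."""
    data = sorted(latencies_ms)
    return {
        "avg_ms": sum(data) / len(data) if data else 0.0,
        "p95_ms": percentile(data, 95),
        "p99_ms": percentile(data, 99),
    }
```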

CloudWatch

Enable CloudWatch logging and metrics:

export ENABLE_CLOUDWATCH=true
export AWS_REGION=eu-central-1
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret

Set up a CloudWatch dashboard:

aws cloudwatch put-dashboard \
  --dashboard-name ModelDeployment \
  --dashboard-body file://scripts/setup_cloudwatch_dashboard.json

The application automatically sends metrics to CloudWatch:

  • RequestCount: Total number of requests
  • ErrorCount: Number of errors
  • InferenceTime: Model inference time (ms)
  • RequestTime: Total request time (ms)
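A sketch of how such metrics could be pushed with boto3's put_metric_data; the namespace and dimension values below are assumptions for illustration, not necessarily what this project configures:

```python
import datetime

def build_metric(name: str, value: float, unit: str = "Milliseconds") -> dict:
    """One entry for the MetricData list accepted by CloudWatch put_metric_data."""
    return {
        "MetricName": name,
        "Value": value,
        "Unit": unit,
        "Timestamp": datetime.datetime.now(datetime.timezone.utc),
        "Dimensions": [{"Name": "Service", "Value": "model-deployment-tutorial"}],
    }

def send_metrics(metrics: list, namespace: str = "ModelDeployment") -> None:
    """Push metrics to CloudWatch. Requires AWS credentials; boto3 is imported
    lazily so the sketch loads even without it installed."""
    import boto3
    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=metrics)

# send_metrics([build_metric("InferenceTime", 12.5)])  # uncomment with credentials configured
```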

Athena

Set up the Athena table:

# Update scripts/create_athena_table.sql with your bucket name
aws athena start-query-execution \
  --query-string "$(cat scripts/create_athena_table.sql)" \
  --result-configuration OutputLocation=s3://your-bucket/athena-results/

Query prediction logs:

-- Get recent predictions
SELECT * FROM model_predictions 
WHERE year = 2024 AND month = 12 
ORDER BY timestamp DESC 
LIMIT 100;

-- Get average prediction by user
SELECT user_id, AVG(prediction) as avg_prediction, COUNT(*) as prediction_count 
FROM model_predictions 
WHERE year = 2024 AND month = 12 
GROUP BY user_id 
ORDER BY prediction_count DESC 
LIMIT 10;

Note: Predictions are automatically saved to S3 in Parquet format with date/hour partitioning for efficient querying.
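Date/hour partitioning typically maps to Hive-style key prefixes, which is what lets Athena prune partitions with the WHERE year/month filters shown above. A sketch of the key layout (the prefix and filename are hypothetical, not the project's exact paths):

```python
from datetime import datetime, timezone

def partitioned_key(prefix: str, ts: datetime, filename: str) -> str:
    """Build a Hive-style partitioned S3 key,
    e.g. prediction-logs/year=2024/month=12/day=01/hour=09/batch.parquet"""
    return (
        f"{prefix}/year={ts.year}/month={ts.month:02d}/"
        f"day={ts.day:02d}/hour={ts.hour:02d}/{filename}"
    )

key = partitioned_key(
    "prediction-logs",
    datetime(2024, 12, 1, 9, 30, tzinfo=timezone.utc),
    "batch.parquet",
)
```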

📚 Training Materials

See training/ directory for:

  • Slides: Presentation slides (PDF/PPTX)
  • Outline: Training session outline
  • Exercises: Hands-on exercises
  • AWS Setup Guide: Step-by-step AWS account setup

🔧 Configuration

Environment variables:

Variable            Description                  Default
ENVIRONMENT         Environment (dev/prod)       dev
DEBUG               Debug mode                   false
MODEL_PATH          Path to model file           model.txt
AWS_REGION          AWS region                   eu-central-1
S3_BUCKET           S3 bucket for model          -
ENABLE_CLOUDWATCH   Enable CloudWatch logging    false
ENABLE_REDIS        Enable Redis caching         false
ATHENA_DATABASE     Athena database name         -
ATHENA_TABLE        Athena table name            model_predictions
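These variables can be read once at startup with the stdlib, with fallbacks matching the defaults in the table above; the real project may centralize this differently (e.g. in a settings module), so treat this as a sketch:

```python
import os

def load_config() -> dict:
    """Read the environment variables from the table above, falling back to defaults."""
    return {
        "environment": os.environ.get("ENVIRONMENT", "dev"),
        "debug": os.environ.get("DEBUG", "false").lower() == "true",
        "model_path": os.environ.get("MODEL_PATH", "model.txt"),
        "aws_region": os.environ.get("AWS_REGION", "eu-central-1"),
        "s3_bucket": os.environ.get("S3_BUCKET"),  # no default: None if unset
        "enable_cloudwatch": os.environ.get("ENABLE_CLOUDWATCH", "false").lower() == "true",
        "enable_redis": os.environ.get("ENABLE_REDIS", "false").lower() == "true",
        "athena_database": os.environ.get("ATHENA_DATABASE"),
        "athena_table": os.environ.get("ATHENA_TABLE", "model_predictions"),
    }
```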

🧪 Testing

Unit Tests

# Run all tests
pytest tests/ -v

# Run specific test files
pytest tests/test_api.py -v
pytest tests/test_model_loader.py -v
pytest tests/test_feature_extractor.py -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html

Manual Testing

# Health check
curl http://localhost:8000/health

# Metrics
curl http://localhost:8000/metrics

# Prediction (using test request file)
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d @test_request.json

Load Testing

# Using the load test script
./scripts/load_test.sh http://localhost:8000/predict 10 100

# Or use Makefile
make load-test
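A similar load driver can be sketched with a thread pool; the request function is injected so the timing logic is exercisable without a live server (to drive the real API, you would pass a callable that POSTs to /predict):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load_test(send_request, concurrency: int, total_requests: int) -> dict:
    """Fire total_requests calls across a thread pool and collect per-call latencies."""
    def timed_call(_):
        start = time.perf_counter()
        send_request()
        return (time.perf_counter() - start) * 1000  # milliseconds
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies_ms = list(pool.map(timed_call, range(total_requests)))
    return {
        "requests": len(latencies_ms),
        "avg_ms": sum(latencies_ms) / len(latencies_ms),
        "max_ms": max(latencies_ms),
    }

# Dummy request simulating ~1 ms of work; replace with a real HTTP call.
stats = run_load_test(lambda: time.sleep(0.001), concurrency=10, total_requests=50)
```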

📖 API Documentation

Once the server is running, visit the interactive docs FastAPI serves by default:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
🔒 Security

  • Use IAM roles instead of access keys when possible
  • Store secrets in AWS Secrets Manager or Kubernetes secrets
  • Enable HTTPS in production
  • Implement authentication/authorization
  • Validate all inputs

πŸ“ License

This project is for educational/tutorial purposes.

🤝 Contributing

This is a tutorial project. Feel free to use it as a reference for your own projects.

📞 Support

For questions or issues, refer to the training materials in the training/ directory.
