Danselem/weather_health

🌦️ Weather Disease Prediction - MLOps Project

👤 Author

Daniel Egbo – @Danselem


📖 Project Description

An end-to-end MLOps pipeline for predicting weather-sensitive diseases using machine learning. The project demonstrates professional DevOps practices including:

  • Configuration Management with Hydra (multi-environment support)
  • Data Versioning with DVC
  • Experiment Tracking with MLflow
  • Orchestration with Prefect
  • CI/CD with GitHub Actions
  • Infrastructure as Code with Terraform (AWS & GCP)
  • Container Orchestration (ECS Fargate, Cloud Run, Kubernetes)
  • Monitoring with Evidently and Grafana

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚    Data     │────▢│   Training  │────▢│  Deployment β”‚
β”‚  (DVC/GCS)  β”‚     β”‚ (MLflow)    β”‚     β”‚ (AWS/GCP)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”
                    β”‚  Monitoring β”‚
                    β”‚ (Evidently) β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“ Project Structure

weather-health/
β”œβ”€β”€ config/                    # Hydra configuration
β”‚   β”œβ”€β”€ config.yaml           # Base config
β”‚   β”œβ”€β”€ env/                  # Environment configs (dev/staging/prod)
β”‚   β”œβ”€β”€ model/                # Model configs (logistic_regression, lightgbm, etc.)
β”‚   β”œβ”€β”€ cloud/                # Cloud configs (aws.yaml, gcp.yaml)
β”‚   └── mlflow.yaml           # MLflow settings
β”‚
β”œβ”€β”€ src/                      # Source code
β”‚   β”œβ”€β”€ clean_data.py         # Data cleaning
β”‚   β”œβ”€β”€ transform.py          # Data preprocessing
β”‚   β”œβ”€β”€ train.py              # Model training
β”‚   β”œβ”€β”€ pipeline.py           # Full pipeline orchestration
β”‚   └── utils/                # Utilities (MLflow, optimization)
β”‚
β”œβ”€β”€ tests/                   # Unit tests
β”‚   β”œβ”€β”€ test_clean_data.py
β”‚   β”œβ”€β”€ test_transform.py
β”‚   └── test_utils.py
β”‚
β”œβ”€β”€ infra-aws/               # AWS Terraform (separate repo recommended)
β”‚   β”œβ”€β”€ modules/             # Reusable Terraform modules
β”‚   β”‚   β”œβ”€β”€ ecr/             # ECR container registry
β”‚   β”‚   β”œβ”€β”€ ecs-fargate/    # ECS Fargate service
β”‚   β”‚   └── s3-artifacts/   # S3 storage
β”‚   └── environments/        # dev/staging/prod
β”‚
β”œβ”€β”€ infra-gcp/               # GCP Terraform (separate repo recommended)
β”‚   β”œβ”€β”€ modules/
β”‚   β”‚   β”œβ”€β”€ cloudrun/        # Cloud Run service
β”‚   β”‚   β”œβ”€β”€ artifact-registry/
β”‚   β”‚   └── gcs-artifacts/   # GCS storage
β”‚   └── environments/
β”‚
β”œβ”€β”€ .github/workflows/       # CI/CD pipelines
β”‚   β”œβ”€β”€ ci.yml               # Lint, type-check, test
β”‚   └── (deploy-aws.yml, deploy-gcp.yml in infra repos)
β”‚
└── Makefile                 # Development commands

🚀 Quick Start

1. Clone & Setup

git clone https://github.com/Danselem/weather_health.git
cd weather_health
make init
make install

2. Configure Environment

make env  # Creates .env from .env.example
# Edit .env with your MLflow/DVC credentials
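
For reference, a `.env` file is a plain list of `KEY=VALUE` lines. The sketch below parses that format with the standard library only. The sample keys are assumptions for illustration (`MLFLOW_TRACKING_URI` and `MLFLOW_EXPERIMENT_NAME` are standard MLflow variables, but the actual contents of this project's `.env.example` are not shown here), and `parse_env` is a hypothetical helper, not part of the repository.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical values for illustration only.
SAMPLE = """
# MLflow settings
MLFLOW_TRACKING_URI=http://localhost:5000
MLFLOW_EXPERIMENT_NAME="weather-health-dev"
"""

settings = parse_env(SAMPLE)
print(settings["MLFLOW_TRACKING_URI"])
```

In practice a library such as python-dotenv would usually handle this; the sketch only shows what the file is expected to contain.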

3. Run Pipeline

# Development (default)
make pipeline

# Production
make pipeline-prod

# Specific model
make train MODEL=lightgbm
make train ENV=prod MODEL=random_forest

βš™οΈ Configuration (Hydra)

Environments

make train ENV=dev         # Dev environment
make train ENV=staging    # Staging environment  
make train ENV=prod       # Production environment

Models

make train MODEL=logistic_regression
make train MODEL=random_forest
make train MODEL=gradient_boosting
make train MODEL=lightgbm

Combined

make train ENV=prod MODEL=lightgbm n_trials=10
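
Hydra selects config groups and sets values through `key=value` arguments on the command line. As a rough sketch of what the `make` variables above might translate to, the helper below builds such a command. It assumes the config groups are named `env` and `model` (matching the `config/env/` and `config/model/` directories), that the entry point is `src/train.py`, and that the defaults are `dev` and `logistic_regression`; the Makefile itself may do this differently.

```python
import shlex


def hydra_command(script="src/train.py", env="dev",
                  model="logistic_regression", **overrides):
    """Build a Hydra-style command line (hypothetical helper).

    `env` and `model` pick a config group option; any extra keyword
    argument becomes a plain `key=value` override.
    """
    args = ["python", script, f"env={env}", f"model={model}"]
    args += [f"{key}={value}" for key, value in overrides.items()]
    return args


cmd = hydra_command(env="prod", model="lightgbm", n_trials=10)
print(shlex.join(cmd))
```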

🧪 Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/test_transform.py -v
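
To give a feel for the style of test pytest collects, here is a minimal sketch. `drop_incomplete_rows` is a hypothetical cleaning step written for this example; it is not the function defined in `src/clean_data.py`, and the column names are made up.

```python
def drop_incomplete_rows(rows):
    """Hypothetical cleaning step: keep only records with no missing values."""
    return [row for row in rows if all(v is not None for v in row.values())]


def test_drop_incomplete_rows_removes_missing():
    rows = [
        {"temperature": 30.5, "humidity": 0.8},
        {"temperature": None, "humidity": 0.6},
    ]
    assert drop_incomplete_rows(rows) == [{"temperature": 30.5, "humidity": 0.8}]


def test_drop_incomplete_rows_keeps_complete_input():
    rows = [{"temperature": 21.0, "humidity": 0.4}]
    assert drop_incomplete_rows(rows) == rows
```

pytest discovers functions whose names start with `test_` and treats a failing `assert` as a test failure.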

🔧 Available Make Commands

Command                 Description
make init               Initialize project with uv
make install            Install dependencies
make pipeline           Run full ML pipeline
make train              Train model
make test               Run tests
make quality_checks     Run linters (ruff, black, mypy)
make dvc                Run DVC pipeline
make build              Build Docker image
make serve_local        Run FastAPI locally
make start-monitoring   Start Grafana/PostgreSQL

Prefect Server

make prefect          # Start Prefect server
make prefect-stop    # Stop Prefect server
make prefect-reset  # Reset Prefect database

☁️ Cloud Deployment

AWS (ECS Fargate)

# Step 1: Initialize Terraform (first time only)
make aws-init

# Step 2: Plan infrastructure changes
make aws-plan

# Step 3: Deploy (builds Docker, creates infra, pushes to ECR, deploys to ECS)
make aws-destroy  # First clean up any existing resources
make aws-deploy

# Step 4: Destroy infrastructure when done
make aws-destroy

Workflow:

  1. aws-init - Initialize Terraform providers
  2. aws-plan - Fetch model from MLflow and preview infrastructure changes
  3. aws-deploy - Build Docker image → Create ECR/VPC/ECS → Push to ECR → Update ECS service
  4. aws-destroy - Delete ECR repository and destroy all infrastructure

Outputs:

  • ALB DNS: Check Terraform output for alb_dns_name
  • ECR: 828221019178.dkr.ecr.us-east-1.amazonaws.com/weather-health
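
The ECR address above follows the standard `<account>.dkr.ecr.<region>.amazonaws.com/<repository>` pattern. A small sketch of how a full image reference could be assembled from those parts is shown below; `ecr_image_uri` is a hypothetical helper and the `latest` tag is an assumption, since the tag used by `make aws-deploy` is not stated here.

```python
def ecr_image_uri(account_id: str, region: str, repository: str,
                  tag: str = "latest") -> str:
    """Compose an ECR image reference from its parts (hypothetical helper)."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


uri = ecr_image_uri("828221019178", "us-east-1", "weather-health")
print(uri)
```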

GCP (Cloud Run)

# Initialize
make gcp-init

# Plan
make gcp-plan

# Deploy
make gcp-deploy

# Destroy
make gcp-destroy

📊 Monitoring

# Start monitoring stack
make start-monitoring
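
The idea behind drift monitoring can be illustrated without any library. The sketch below computes a Population Stability Index (PSI) between a reference sample and a current sample of one numeric feature. This is a stand-in technique chosen for illustration: it is not Evidently's API, and nothing here confirms which drift metrics this project actually reports. It assumes both samples are non-empty.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges come from the reference sample; values outside that
    range are counted in the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]

print(psi(reference, reference))  # identical samples: no drift
print(psi(reference, shifted))    # shifted sample: large score
```

A common rule of thumb treats a PSI above roughly 0.25 as a significant shift, though suitable thresholds depend on the feature and the use case.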

🔄 CI/CD Pipeline

# .github/workflows/ci.yml runs:
1. Lint (ruff)
2. Type Check (mypy)
3. Test (pytest --cov)
4. Docker Build
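
The same four steps can be approximated locally. The sketch below runs them in order and stops at the first failure, which mirrors how a CI job typically behaves. The exact commands and arguments are assumptions (the contents of `ci.yml` are not reproduced here), and `first_failure` is a hypothetical helper.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Assumed commands; the real workflow may pass different arguments.
CI_STEPS: list[tuple[str, list[str]]] = [
    ("lint", ["ruff", "check", "."]),
    ("type-check", ["mypy", "src"]),
    ("test", ["pytest", "--cov=src"]),
    ("docker-build", ["docker", "build", "-t", "weather-health:ci", "."]),
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(cmd).returncode


def first_failure(steps, runner: Callable[[Sequence[str]], int] = run_command) -> Optional[str]:
    """Run steps in order; return the name of the first failing step, or None."""
    for name, cmd in steps:
        if runner(cmd) != 0:
            return name
    return None
```

Calling `first_failure(CI_STEPS)` from the repository root would run the real tools; passing a custom `runner` makes the ordering logic testable without them.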

✅ Requirements

  • Python 3.10+
  • uv (package manager)
  • Docker
  • Terraform (for cloud deployment)
  • AWS CLI / Google Cloud CLI (gcloud) (for cloud deployment)

πŸ“ Notes

  • Use Hydra for multi-environment configuration
  • DVC handles data versioning
  • MLflow tracks experiments
  • Separate infra repos recommended for production

📜 License

MIT License - See LICENSE


🙋 Contact

Created by Daniel Egbo

About

An MLOps project for predicting illnesses based on weather conditions
