A complete MLOps platform built from scratch — experiment tracking, model registry, containerized serving, Kubernetes deployment, CI/CD pipeline, and monitoring.
```
Developer pushes code
        ↓
GitHub Actions (CI/CD)
 ├── Train model (MLflow tracked)
 ├── Register model (MLflow Registry)
 ├── Build Docker image (versioned)
 └── Verify predictions
        ↓
Docker image on ECR/DockerHub
        ↓
Kubernetes deployment (Minikube/GKE/EKS)
 ├── 2 replicas
 ├── Health + readiness probes
 └── FastAPI serving predictions
        ↓
Prometheus scraping /metrics
        ↓
Grafana dashboards
```
| Component | Tool |
|---|---|
| Experiment Tracking | MLflow |
| Model Registry | MLflow Registry |
| Model Serving | FastAPI + Uvicorn |
| Containerization | Docker |
| Container Registry | ECR / DockerHub |
| Infrastructure as Code | Terraform |
| Orchestration | Kubernetes |
| CI/CD | GitHub Actions |
| Metrics | Prometheus |
| Dashboards | Grafana |
| Cloud | AWS (S3, ECR) |
```
mlops-lab/
├── .github/
│   └── workflows/
│       └── mlops-pipeline.yml    # CI/CD pipeline
├── manifests/
│   ├── app/
│   │   ├── deployment.yaml       # K8s deployment
│   │   └── service.yaml          # K8s service
│   ├── monitoring/
│   │   └── servicemonitor.yaml   # Prometheus scrape config
│   └── namespaces/
│       ├── mlops.yaml            # mlops namespace
│       └── monitoring.yaml       # monitoring namespace
├── scripts/
│   ├── train.py                  # Train + track with MLflow
│   ├── register_model.py         # Register best model
│   ├── serve.py                  # FastAPI serving API
│   ├── save_model.py             # Save model to local file
│   ├── local-start.sh            # Start local environment
│   └── local-stop.sh             # Stop local environment
├── terraform/
│   ├── local/                    # Minikube + K8s + monitoring
│   │   ├── main.tf
│   │   ├── k8s.tf
│   │   └── monitoring.tf
│   └── aws/                      # S3 + ECR + IAM
│       ├── main.tf
│       ├── variables.tf
│       └── aws.tf
├── Dockerfile                    # Container definition
├── requirements.txt              # Full dev dependencies
└── requirements_serve.txt        # Minimal serving dependencies
```
1. Clone repository:

```bash
git clone [email protected]:atharvspathak/mlops-lab.git
cd mlops-lab
```

2. Create Python virtual environment:

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

3. Start Minikube:

```bash
minikube start --driver=docker --cpus=2 --memory=2048
```

4. Deploy to K8s:

```bash
cd terraform/local
terraform init
terraform apply
```

Start everything:
```bash
# Full command
bash scripts/local-start.sh

# With alias (add to ~/.bashrc first)
mlops-start
```

Activate Python environment:

```bash
# Full command
source ~/mlops-lab/venv/bin/activate && cd ~/mlops-lab

# With alias
mlops
```

Stop everything:

```bash
# Full command
bash scripts/local-stop.sh

# With alias
mlops-stop
```

Add to `~/.bashrc` for convenience:

```bash
alias mlops='source ~/mlops-lab/venv/bin/activate && cd ~/mlops-lab'
alias mlops-start='~/mlops-lab/scripts/local-start.sh'
alias mlops-stop='~/mlops-lab/scripts/local-stop.sh'
alias tf='terraform'
```

Prerequisites:

- WSL2 (Ubuntu 22.04)
- Docker
- Minikube
- kubectl
- Terraform
- Python 3.11
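A quick way to confirm these tools are on your PATH before starting (a hypothetical helper, not part of the repo):

```python
import shutil

def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

required = ["docker", "minikube", "kubectl", "terraform", "python3"]
gaps = missing_tools(required)
print("Missing:", gaps if gaps else "none")
```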
Start everything:

```bash
mlops-start
```

Activate Python environment:

```bash
mlops
```

Train a model:

```bash
python scripts/train.py
```

Register best model:

```bash
python scripts/register_model.py
```

Test prediction:

```bash
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2}'
```

Stop everything:

```bash
mlops-stop
```

| Service | URL | Credentials |
|---|---|---|
| iris-serving API | http://localhost:8080 | - |
| API Docs (Swagger) | http://localhost:8080/docs | - |
| MLflow UI | http://localhost:5000 | - |
| Grafana | http://localhost:3000 | admin / mlops123 |
| Prometheus | http://localhost:9090 | - |
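The prediction request from the curl example above can also be issued from Python using only the standard library (the `predict` helper name is illustrative, not part of the repo):

```python
import json
import urllib.request

def predict(url, features, timeout=5):
    """POST iris measurements as JSON and return the parsed response dict."""
    req = urllib.request.Request(
        url,
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

# Example (requires the serving API to be running locally):
# predict("http://localhost:8080/predict",
#         {"sepal_length": 5.1, "sepal_width": 3.5,
#          "petal_length": 1.4, "petal_width": 0.2})
```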
A push to the `main` branch triggers:
- Train model + track with MLflow
- Register best model
- Build Docker image tagged with `{run_number}-{commit_sha}`
- Push to DockerHub
- Health check + prediction verification
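The image tag can be derived from the environment variables GitHub Actions provides (`GITHUB_RUN_NUMBER` and `GITHUB_SHA` are standard; the 7-character short SHA is an assumption, not taken from the pipeline):

```python
import os

def image_tag(env=None):
    """Build a {run_number}-{commit_sha} Docker tag from GitHub Actions env vars."""
    env = os.environ if env is None else env
    run_number = env["GITHUB_RUN_NUMBER"]
    commit_sha = env["GITHUB_SHA"][:7]  # assumption: short SHA keeps tags readable
    return f"{run_number}-{commit_sha}"
```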
```bash
cd terraform/aws
terraform init
terraform apply
```

Provisions:
- S3 bucket for model artifacts
- ECR repository for Docker images
- IAM user for GitHub Actions
`GET /health`

`POST /predict` with request body:

```json
{
  "sepal_length": 5.1,
  "sepal_width": 3.5,
  "petal_length": 1.4,
  "petal_width": 0.2
}
```

Response:

```json
{
  "species_id": 0,
  "species_name": "setosa",
  "confidence": 0.9985
}
```

Prometheus scrapes `/metrics` from iris-serving pods every 15s.
Key metrics:
- `http_requests_total`: total requests by endpoint
- `http_request_duration_seconds`: request latency
- `python_gc_objects_collected_total`: GC stats
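These are exposed in the Prometheus text exposition format; a minimal stdlib sketch for summing one counter across its label combinations in a `/metrics` scrape (the sample input is illustrative, not real scrape output):

```python
def parse_metric(text, name):
    """Sum all samples of metric `name` in Prometheus text exposition format.

    Minimal sketch: matches by prefix, so it does not distinguish `name`
    from metrics that merely start with the same string.
    """
    total = 0.0
    for line in text.splitlines():
        if line.startswith(name) and not line.startswith("#"):
            total += float(line.rsplit(" ", 1)[1])
    return total

sample = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{handler="/predict",method="POST"} 128.0
http_requests_total{handler="/health",method="GET"} 512.0
"""
print(parse_metric(sample, "http_requests_total"))  # 640.0
```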
Atharv Pathak — DevOps → MLOps transition