This project demonstrates how to package machine learning models into Docker containers and serve them via a REST API. It's a complete end-to-end example covering model training, dependency management, Docker builds, and debugging best practices for deployment.
ml-docker-demo/
├── train_model.py # Model training script
├── capture_requirements.py # Smart dependency capture tool
├── app.py # Flask API service
├── Dockerfile # Docker configuration
├── requirements.txt # Auto-generated dependency list
├── model.joblib # Saved model file
└── README.MD # Project documentation
Create and activate virtual environment:
# Create virtual environment
conda create -n ml-deploy python=3.9
conda activate ml-deploy
# Or use Python venv
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/macOS
source .venv/bin/activate

Install required dependencies:
pip install flask==2.3.2 scikit-learn==1.3.0 numpy==1.23.5 joblib==1.3.2

# 1. Train the model
python train_model.py
# 2. Auto-capture exact dependency versions
python capture_requirements.py

This generates requirements.txt with pinned versions:
scikit-learn==1.3.0
numpy==1.23.5
flask==2.3.2
joblib==1.3.2
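The README doesn't show `capture_requirements.py` itself, so here is a minimal sketch of what such a tool might look like, using only the standard library. The package list and output format are assumptions, not the project's actual script:

```python
# capture_requirements.py -- illustrative sketch, not the project's actual script.
# Pins the exact installed version of each package the model service needs.
from importlib import metadata

PACKAGES = ["flask", "scikit-learn", "numpy", "joblib"]  # assumed dependency list

def pin(packages):
    """Return 'name==version' lines for installed packages, skipping absent ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            pass  # not installed in this environment
    return lines

if __name__ == "__main__":
    # Redirect to a file to produce requirements.txt:
    #   python capture_requirements.py > requirements.txt
    print("\n".join(pin(PACKAGES)))
```

Unlike a bare `pip freeze`, this pins only the packages the service actually needs, which is the behavior the README attributes to the real script.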
# Build Docker image
docker build -t ml-model-api .
# Run container
docker run -p 5000:5000 ml-model-api
# One-command build and run
docker build -t ml-model-api . && docker run -p 5000:5000 ml-model-api

Follow this sequence for the most efficient debugging and validation:
# Single command test - most efficient validation
docker run --rm ml-model-api \
python -c "import joblib; print('Prediction:', joblib.load('model.joblib').predict([[5.1,3.5,1.4,0.2]]))"

Expected Output:
Prediction: [0]
# Start interactive session for comprehensive testing
docker run --rm -it --entrypoint /bin/sh ml-model-api
# Then run multiple tests inside the container:
python -c "import joblib; model = joblib.load('model.joblib'); print(model.predict([[5.1,3.5,1.4,0.2]]))"
python -c "import sklearn; print('scikit-learn version:', sklearn.__version__)"
python -c "import numpy; print('numpy version:', numpy.__version__)"

Using curl (Recommended):
Windows PowerShell/CMD:
curl -v -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d "{\"features\": [5.1, 3.5, 1.4, 0.2]}"

Linux/macOS/Git Bash:
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Expected Output:
{"prediction": 0}

Using Python:
# test_api.py
import requests

response = requests.post(
    "http://localhost:5000/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]}
)
print(response.json())

Direct Container Testing is Better Because:
- No Network Dependencies - Eliminates Flask/API-related issues, tests model in isolation
- Faster Feedback Loop - Instant results without HTTP overhead
- More Reliable Validation - Confirms the model loads correctly and dependency versions match
Testing Priority Sequence:
- ✅ Direct model test - Fastest validation
- ✅ Interactive testing - Comprehensive validation
- ✅ API testing - Integration validation
- ✅ External testing - End-to-end validation
At a glance, this command uses the tool curl to send an HTTP POST request to a web server running on your local machine (localhost) on port 5000. It's sending a piece of data formatted as JSON to the /predict endpoint, likely to get a machine learning prediction in return.
This is the name of the command-line tool itself. cURL (short for "Client for URLs") is a powerful and versatile tool used to transfer data to or from a server. It can communicate over dozens of protocols, including HTTP and HTTPS, which are used for the web. Think of it as a web browser for your terminal, without the graphical interface.
- `-X`: This flag (short for `--request`) allows you to specify the HTTP request method to be used. While `curl` can often infer the method, explicitly stating it with `-X` is clear and good practice.
- `POST`: This is the specified HTTP method. Unlike a `GET` request (which is used to retrieve data from a server), a `POST` request is used to send data to the server for processing. In this case, you are posting data to be used for a prediction.
This is the destination URL for the request. Let's break down the URL itself:
- `http://`: This is the protocol being used, Hypertext Transfer Protocol, the standard protocol for web communication.
- `localhost`: This is a special hostname that always points back to your own computer. You are not sending the request out to the internet; you are sending it to a server program running on the same machine you run the `curl` command from.
- `:5000`: This is the port number. Since a computer can have many different server applications running at once, a port directs the request to the correct application. Web servers typically use port 80 for HTTP, but development servers often use other ports like `5000`, `8000`, or `8080` to avoid conflicts.
- `/predict`: This is the path on the server. It specifies which "endpoint" should handle this request. In an API, this path is likely routed to a function that takes the incoming data, feeds it to a machine learning model, and generates a prediction.
- `-H`: This flag (short for `--header`) allows you to include an HTTP header in your request. Headers provide additional metadata about the request to the server.
- `"Content-Type: application/json"`: This is one of the most common headers. It explicitly tells the server what kind of data format is being sent in the body of the request. By specifying `application/json`, you let the server know it should expect to parse a JSON object, which helps it process the request correctly.
- `-d`: This flag (short for `--data`) is used to include data in the body of the request. Since this is a `POST` request, the data specified here is what gets sent to the server.
- `'{"features": [...]}'`: This is the actual data payload.
  - The outer single quotes (`'...'`) make the shell treat the entire JSON string as a single argument, protecting the inner double quotes from being misinterpreted by the terminal.
  - The inner content (`{"features": [5.1, 3.5, 1.4, 0.2]}`) is the data itself, formatted as a JSON object. It has a single key named `"features"` whose value is an array of four numbers, likely the feature vector for a single data point you want the model to make a prediction on.
When you execute this command, the following happens:
1. `curl` constructs an HTTP `POST` request.
2. It sets the destination to the `/predict` endpoint of a server on your own machine at port `5000`.
3. It adds a header indicating the data payload is in JSON format.
4. It attaches the JSON data `{"features": [5.1, 3.5, 1.4, 0.2]}` as the body of the request.
5. It sends the request and waits for a response from the server, which it then prints to your terminal.
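The same request can be built with Python's standard library. This sketch only constructs the request object; it doesn't send it, since that would require the container to be running on port 5000:

```python
import json
import urllib.request

# Build the same POST request the curl command sends
req = urllib.request.Request(
    "http://localhost:5000/predict",
    data=json.dumps({"features": [5.1, 3.5, 1.4, 0.2]}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send it and return the server's response.
```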
# Build and run
docker build -t ml-model-api .
docker run -p 5000:5000 ml-model-api
docker build -t ml-model-api . && docker run -p 5000:5000 ml-model-api
# Container lifecycle management
docker ps # List running containers
docker logs <container_id> # View container logs
docker stop <container_id> # Stop container
docker stop $(docker ps -q --filter ancestor=ml-model-api) # Stop all related containers
docker stats <container_id> # View resource usage (CPU, memory, network I/O)
docker attach <container_id> # Reattach to view container output (Ctrl+C to exit)
# Container inspection
docker run -it ml-model-api pip list # Verify dependencies inside container
docker run -it ml-model-api ls -l /app # Check container file structure

# Get container ID and enter shell
docker ps
docker exec -it <container_id> /bin/sh
# Install curl inside container for internal testing
apk add curl # Alpine image
# OR
apt-get update && apt-get install -y curl # Debian image
# Test API from inside the container
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'

# Start container in detached mode with a named reference
docker run -d --name my-api ml-model-api

Command breakdown:
- `-d`: Runs the container in "detached" mode (in the background)
- `--name my-api`: Gives the container an easy-to-remember name
- This command starts the container and runs the default ENTRYPOINT, so your API server will be running inside it
# Access the running container's shell
docker exec -it my-api /bin/sh
# Inside the container's shell - install curl for testing
apk add curl # Alpine image
# OR
apt-get update && apt-get install -y curl # Debian image
# Test API from inside the container
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'
# Additional debugging commands
ls -la /app # Check file structure
cat requirements.txt # Verify dependencies
python -c "import joblib; print(joblib.load('model.joblib'))" # Test model loading

# Start container in interactive mode without running the API
docker run -it --entrypoint /bin/sh ml-model-api
# Debug inside container (API is NOT running in this mode)
ls -la /app
cat requirements.txt
python app.py # Manually start the API for testing

Understanding the difference between these Docker commands:
What it does:
- Creates a new container from the `ml-model-api` image
- Overrides the default startup command (which would normally run `python app.py`)
- Instead runs `/bin/sh` (a shell) as the main process
- `-it` flags make it interactive with a terminal
Key points:
- ❌ Your Flask API is NOT running when you use this command
- ✅ You get direct shell access to explore the container
- ✅ You can manually start the API with `python app.py` if needed
- 🔄 When you exit the shell, the container stops (and is removed if you started it with `--rm`)
What it does:
- First command creates and starts a container running your Flask API in the background
- Second command opens a shell session inside the already-running container
- The API continues running while you explore
Key points:
- ✅ Your Flask API IS running and accessible
- ✅ You can test the API from inside the container
- ✅ You can also test from outside (host machine) simultaneously
- 🔄 When you exit the shell, the API keeps running
Method 3: docker run -it --entrypoint /bin/sh
┌─────────────────────┐
│ Container │
│ ┌─────────────────┐ │
│ │ /bin/sh │ │ ← You are here (shell is the main process)
│ │ (interactive) │ │
│ └─────────────────┘ │
│ │
│ python app.py │ ← API is NOT running
│ (not started) │
└─────────────────────┘
Method 2: docker run -d + docker exec
┌─────────────────────┐
│ Container │
│ ┌─────────────────┐ │
│ │ python app.py │ │ ← API running as main process
│ │ (Flask server) │ │
│ └─────────────────┘ │
│ ┌─────────────────┐ │
│ │ /bin/sh │ │ ← You are here (additional shell session)
│ │ (interactive) │ │
│ └─────────────────┘ │
└─────────────────────┘
Use Method 3 (--entrypoint /bin/sh) when:
- 🔍 You want to explore the container environment
- 🐛 Debug file permissions, missing files, or environment issues
- 🧪 Test individual components before running the full application
- 📝 You want to understand what's inside the image
Use Method 2 (docker run -d + docker exec) when:
- 🌐 You want to test the running API from inside the container
- 🔄 You need both internal and external testing simultaneously
- 🏃‍♂️ You want to debug a live, running application
- 🔍 You want to monitor logs while testing
# 1. Start a new container and get a shell inside it
docker run --rm -it ml-model-api /bin/sh
# 2. Once inside the shell, run this Python command
python -c "import joblib; print(joblib.load('model.joblib').predict([[5.1,3.5,1.4,0.2]]))"

What this proves: the model works even without port mapping. This test doesn't involve the Flask web application or any networking at all. It's a direct test of the files (model.joblib) and the Python environment inside the container, verifying that the core logic is sound.
Prerequisite: A running container named ml-model-api:
# Start the container in the background and give it a name
docker run -d --name ml-model-api -p 5000:5000 ml-model-api

The Command:
# On your host machine:
docker exec ml-model-api ps aux

What this shows: The output will be a list of all the processes running inside that specific container. If you started the container normally, you should see the main `python` process running your Flask application.
Once you have a shell inside the container, you become a detective. Here's a systematic approach to investigating issues:
ls -lha

Narration: "Are the files I expect to be here actually here? Do they have the right permissions?"
- Check for `app.py`, `model.joblib`, `requirements.txt`
- Verify file sizes and timestamps
- Look for permission issues (executable bits, ownership)
pip list

Narration: "Let me check the installed package versions to see if they match requirements.txt."
- Compare installed versions with `requirements.txt`
- Look for missing dependencies
- Check for version conflicts
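This version check can be automated with the standard library. A sketch, where the pin list mirrors the README's versions and is otherwise an assumption:

```python
# Report packages whose installed version differs from the pinned requirement.
from importlib import metadata

PINS = {"flask": "2.3.2", "numpy": "1.23.5", "joblib": "1.3.2"}  # assumed pins

def mismatches(pins):
    """Return {name: installed_version_or_None} for every pin that doesn't match."""
    bad = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # missing entirely
        if installed != wanted:
            bad[name] = installed
    return bad

# An empty dict means the environment matches requirements.txt exactly.
```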
env

Narration: "Let me check the environment variables. Is the PYTHONPATH set correctly? Is a required API key or config path missing?"
- Verify `PYTHONPATH` configuration
- Check for missing environment variables
- Look for configuration paths
python app.py

Narration: "I'll try to run the application manually from the shell. This will give me a direct, interactive traceback if it crashes."
- Get immediate Python error messages
- See startup logs and error details
- Identify import errors or runtime issues
curl localhost:5000/predict -X POST -H "Content-Type: application/json" -d '{"features": [5.1,3.5,1.4,0.2]}'

Narration: "I'll test the service from inside the container to isolate whether it's an application problem or a Docker networking problem."
- Confirms the Flask app is responding internally
- Tests the prediction endpoint functionality
- Eliminates port mapping issues from diagnosis
# Enter container
docker exec -it <container_id> /bin/sh
# Step 1: Are the files I expect to be here actually here? Do they have the right permissions?
ls -lha # Check your app.py, your model file, etc.
ls -la /app # Verify files in the working directory
# Step 2: Let me check the installed package versions to see if they match requirements.txt
pip list # Compare with requirements.txt for version mismatches
pip list | grep -E "flask|sklearn|numpy|joblib" # Focus on key packages
# Step 3: Let me check the environment variables. Is the PYTHONPATH set correctly? Is a required API key or config path missing?
env # Check all environment variables
env | grep -E "PYTHON|PATH" # Focus on Python-related variables
# Step 4: Try to run the application manually from the shell. This will give you a direct, interactive traceback if it crashes.
python app.py # Manual startup - see immediate error messages
# Step 5: I'll test the service from inside the container to isolate whether it's an application problem or a Docker networking problem.
curl localhost:5000/predict \
-X POST \
-H "Content-Type: application/json" \
-d '{"features": [5.1,3.5,1.4,0.2]}' # Test internal service response

This systematic approach helps you quickly identify whether issues are:
- File-related: Missing files, wrong permissions
- Dependency-related: Package version mismatches
- Configuration-related: Missing environment variables
- Application-related: Python code errors
- Network-related: Port mapping or service binding issues
| Issue | Container Test Command |
|---|---|
| Model Loading | docker run --rm ml-model-api python -c "import joblib; joblib.load('model.joblib')" |
| Dependency Check | docker run --rm ml-model-api python -c "import sklearn,numpy; print(sklearn.__version__, numpy.__version__)" |
| Data Shape Test | docker run --rm ml-model-api python -c "import joblib; print(joblib.load('model.joblib').predict([[1,2,3,4]]).shape)" |
| Feature Count | docker run --rm ml-model-api python -c "import joblib; print('Features expected:', joblib.load('model.joblib').n_features_in_)" |
| Model Type | docker run --rm ml-model-api python -c "import joblib; print('Model type:', type(joblib.load('model.joblib')))" |
Use these techniques as part of a comprehensive debugging approach:
- Direct Model Test: Verify the model loads and predicts correctly
- Process Inspection: Confirm the Flask application is running
- Internal API Test: Use curl from inside the container
- External API Test: Use curl from the host machine
This layered approach helps isolate issues at each level of the stack.
Using capture_requirements.py instead of pip freeze:
- ✅ Exact Version Matching: Ensures production environment matches training environment exactly
- ✅ Avoid Version Conflicts: Only includes dependencies actually needed by the model
- ✅ Reproducible Builds: Eliminates "works on my machine" problems
Windows:
pip freeze | findstr "flask scikit-learn numpy joblib" > requirements.txt

Linux/macOS:
pip freeze | grep -E "flask|scikit-learn|numpy|joblib" > requirements.txt

Verify generated file:
type requirements.txt # Windows
cat requirements.txt # Linux/macOS

| Symptom | Possible Cause | Solution |
|---|---|---|
| `curl` command hangs | Container not started or port mapping error | `docker ps` to check container status, verify port mapping |
| Port conflict error | Port 5000 already in use | Change host port: `-p 5001:5000` |
| Container startup failure | Dependency installation error or missing files | Check `docker logs <container_id>` |
| API returns 500 error | Model file missing or corrupted | Retrain model, check `model.joblib` |
1. Check Container Status
   docker ps -a
2. View Container Logs
   docker logs <container_id>
3. Verify Container Internals
   docker exec -it <container_id> /bin/sh
   ls -la /app
   python -c "import joblib; print('Model loaded:', joblib.load('model.joblib'))"
4. Test API Response
   - First test inside container
   - Then test from host
   - Compare results to locate issue
The two testing methods look similar, but they are fundamentally different and test different parts of your setup.
curl -v -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d "{\"features\": [5.1, 3.5, 1.4, 0.2]}"

What is happening?
You are running curl on your own computer's command line (the "host"). The request originates from your host machine and travels through its network stack to the container.
- Meaning of `localhost:5000`: In this context, `localhost` refers to your host machine. The request is sent to port `5000` on your computer.
- The Critical Prerequisite: This test will only succeed if you have mapped the container's port to the host's port. You must have started your container with the `--publish` or `-p` flag, like this: `docker run -p 5000:5000 my-api-image`
What it tests:
- External Connectivity: This is an end-to-end test that simulates how a real external client would access your service.
- Port Mapping: It directly verifies that your `-p 5000:5000` mapping is configured correctly.
- Host Firewall Rules: It confirms that your host machine's firewall is not blocking traffic on port `5000`.
Analogy: You are calling the restaurant's public-facing phone number. You are testing the entire connection from the outside world to the kitchen.
docker exec -it <container_id> /bin/sh
# Now inside the container's shell...
curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

What is happening?
You first use docker exec to get a command-line shell inside the running container. Then, you run curl from that shell. The entire network request originates and terminates within the container's isolated network environment.
- Meaning of `localhost:5000`: In this context, `localhost` refers to the container itself. The request never leaves the container.
- The Critical Prerequisite: The container must be running, but no port mapping is required for this test. You are bypassing the entire Docker port forwarding mechanism.
What it tests:
- Internal Application Health: This test verifies that your Python/Flask application is running correctly inside its own environment and is listening on the correct internal port (`5000`).
- Application Configuration: It confirms your application code (`app.run(host="0.0.0.0", ...)`) is bound correctly to listen for connections within the container.
- Debugging: This is primarily a diagnostic tool. If the test from the host fails, this is your next step. If this internal test succeeds but the external one fails, the problem is with your port mapping or a firewall. If this internal test also fails, the problem is with your application code itself (it crashed, or it's not listening on the right port).
Analogy: You are already inside the restaurant's kitchen, and you shout an order directly to the chef. You are only testing if the chef is there and can hear you. You are not testing the public phone lines at all.
| Feature | Testing from Host Machine | Testing from Inside Container |
|---|---|---|
| What it Tests | End-to-end external connectivity & port mapping | Internal application health & configuration |
| Network Path | Host Machine → Docker Network → Container | Container → Container (Internal Loopback) |
| Meaning of `localhost` | Your main computer (the host) | The container itself |
| Prerequisite | `docker run -p <host_port>:<container_port>` is required | No port mapping is required |
| Primary Use Case | Simulating a real user; integration testing | Debugging and isolating application-level problems |
You may have noticed a subtle difference in the commands:
- From Host (`cmd`): `-d "{\"features\": [5.1, 3.5, 1.4, 0.2]}"`
- Inside Container (`sh`): `-d '{"features": [5.1, 3.5, 1.4, 0.2]}'`
This is due to differences in how command-line shells handle quotes:
- Windows Command Prompt (`cmd.exe`): Requires escaping inner double quotes with a backslash (`\`) inside an outer double-quoted string.
- Linux shells (`sh` or `bash`): Allow you to enclose the entire JSON string in single quotes (`'`), which tells the shell to treat everything inside literally.
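You can see the single-quote behavior directly with Python's `shlex` module, which tokenizes strings the way a POSIX shell would:

```python
import shlex

# The single-quoted JSON body survives shell tokenization as ONE argument,
# with its inner double quotes intact.
cmd = """curl -X POST http://localhost:5000/predict -d '{"features": [5.1, 3.5, 1.4, 0.2]}'"""
args = shlex.split(cmd)
print(args[-1])  # {"features": [5.1, 3.5, 1.4, 0.2]}
```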
Step 1: Test from Inside Container First
docker exec -it <container_id> /bin/sh
curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Step 2: Interpret Results
- ✅ Internal test succeeds + External test succeeds: Everything is working perfectly
- ✅ Internal test succeeds + External test fails: Problem is with port mapping (`-p` flag) or firewall
- ❌ Internal test fails: Problem is with your application code, dependencies, or container configuration
Step 3: Fix Based on Results
- If internal test fails: Check container logs, verify model files, check dependencies
- If only external test fails: Verify port mapping, check firewall settings, ensure correct host/port
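The interpretation rules above amount to a small decision function; encoding them makes the logic explicit (a sketch, with the return strings as plain labels):

```python
def diagnose(internal_ok: bool, external_ok: bool) -> str:
    """Map the two curl test outcomes to the likely fault location."""
    if not internal_ok:
        return "application code, dependencies, or container configuration"
    if not external_ok:
        return "port mapping (-p flag) or firewall"
    return "everything working"
```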
Dockerfile follows best practices:
# Use a lightweight base image
FROM python:3.9-slim
# Set the working directory
WORKDIR /app
# Copy the dependency file first so this layer is cached
COPY requirements.txt .
# Install dependencies (cached layer)
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY . .
# Declare the port
EXPOSE 5000
# Startup command
CMD ["python", "app.py"]

Key configuration notes:
- `host="0.0.0.0"`: Allow external access (required for Docker containers)
- `port=5000`: Must match the Dockerfile's `EXPOSE` directive
- Production environments should use a WSGI server such as Gunicorn
# 1. Development and training
python train_model.py
# 2. Generate exact dependencies
python capture_requirements.py
# 3. Local testing
python app.py
# 4. Docker deployment
docker build -t ml-model-api .
docker run -p 5000:5000 ml-model-api
# 5. Validation testing
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'That is an excellent question that gets to the very heart of how modern web frameworks like Flask operate. The connection between app.run() and your predict() function is not direct; it's a beautifully orchestrated process involving a web server, a standardized interface, and a routing system.
Let's walk through the entire lifecycle of a request "under the hood."
Think of your Flask application as a specialized restaurant:
- The Server (`app.run`): This is the restaurant itself opening for business. It opens the front door (`0.0.0.0:5000`) and waits for customers.
- The Routing Map (`@app.route`): This is the restaurant's menu and floor plan. It tells the host where to send a customer based on what they ask for.
- Your Function (`predict`): This is a specific chef in the kitchen who knows how to prepare one particular dish.
- The Request (`curl`): This is a customer walking in and placing a specific order.
- WSGI: This is the universal language that the host, waiters, and chefs all agree to speak so that orders are handled consistently.
Here is the detailed sequence of events that connects a curl request to your predict() function.
1. The Menu is Written (The Routing Map is Built)
This is the most important "magic" and it happens the moment you run python app.py, even before the server starts waiting for requests.
- Python executes your script from top to bottom.
- It sees `@app.route("/predict", methods=["POST"])`. This is a Python decorator.
- A decorator is a special function that wraps another function. In this case, the `@app.route()` decorator wraps your `predict()` function.
- Its job is not to run `predict()` right now; it is to register it. It tells the main `app` object: "If you ever receive a request for the path `/predict` and the method is `POST`, the function you need to call is `predict`."
- This builds an internal "routing map" or "URL map" inside the `app` object. It's essentially a dictionary mapping URL rules to specific Python functions.
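The registration pattern can be shown in a few lines of plain Python. This is a toy illustration of the mechanism, not Flask's actual implementation:

```python
# A toy routing map: the decorator REGISTERS the function, it doesn't call it.
routes = {}

def route(path, methods=("GET",)):
    def decorator(func):
        for method in methods:
            routes[(method, path)] = func  # record the handler in the map
        return func                        # hand the function back unchanged
    return decorator

@route("/predict", methods=("POST",))
def predict():
    return {"prediction": 0}

# Later, a dispatcher looks the handler up by (method, path) and calls it:
handler = routes[("POST", "/predict")]
```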
2. The Restaurant Opens (The Server Starts)
- The script reaches the `if __name__ == "__main__":` block. This standard Python construct ensures the code inside only runs when the script is executed directly (not when imported as a module).
- `app.run(host="0.0.0.0", port=5000)` is called. This starts a development web server. Flask uses a library called Werkzeug (German for "tool") for this.
- This Werkzeug server creates a listening socket on your computer. It listens on all available network interfaces (`host="0.0.0.0"`) on port `5000`. It is now in a loop, waiting patiently for an incoming network connection.
3. A Customer Arrives (The curl Request is Made)
You run the command: curl -X POST http://localhost:5000/predict ...
- Your computer sends a raw HTTP request over the network to port `5000`.
- The server sees this incoming connection.
4. The Server Greets the Customer (Werkzeug Parses the Request)
- The Werkzeug server accepts the connection. It reads the raw HTTP text, which looks something like this:
POST /predict HTTP/1.1
Host: localhost:5000
Content-Type: application/json

{"features": [5.1, 3.5, 1.4, 0.2]}
- Werkzeug's job is to parse this raw text into a clean, structured format.
5. The Universal Language (WSGI)
- Werkzeug now needs to pass this request information to your Flask application. It doesn't just call a random function. It uses a standard interface called WSGI (Web Server Gateway Interface).
- Werkzeug packages all the request details (path, method, headers, body, etc.) into a standardized Python dictionary. It then calls your Flask `app` object, passing it this information according to the WSGI standard.
- This standard is what allows you to swap out the development server for a production-grade server (like Gunicorn or uWSGI) without changing your Flask code at all.
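The WSGI contract itself is tiny: a callable that takes `environ` and `start_response`. Flask's `app` object satisfies this same interface, which is exactly why Werkzeug, Gunicorn, and uWSGI can all serve it. A minimal standalone example:

```python
# A complete WSGI application: a callable taking (environ, start_response).
def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/json")])
    return [b'{"prediction": 0}']

# Exercise it without any network, the way a WSGI server would:
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

body = b"".join(application({"REQUEST_METHOD": "POST", "PATH_INFO": "/predict"},
                            fake_start_response))
```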
6. The Host Directs Traffic (Flask's Routing)
- Your Flask `app` object receives the request information via WSGI.
- It looks at the key pieces of information: the path (`/predict`) and the method (`POST`).
- It now consults the routing map it built back in Step 1.
- It finds a match! The map says: "A `POST` request to `/predict` should be handled by the `predict` function."
7. The Chef Cooks the Meal (Your Function is Executed)
- Finally! Flask calls your `predict()` function.
- To make your life easier, Flask creates helpful "context-aware" objects like `request`. The `request` object is a user-friendly way to access the data that Werkzeug originally parsed.
- `data = request.get_json()`: This Flask helper reads the request body and parses it from a JSON string into a Python dictionary.
- `prediction = model.predict(...)`: This is your own application logic, which has nothing to do with Flask itself.
- `return jsonify({"prediction": prediction})`: You don't just return a dictionary. The `jsonify` helper creates a proper Flask `Response` object. It converts your Python dictionary back into a JSON string and, crucially, sets the `Content-Type` header to `application/json`.
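What `jsonify` adds over returning a bare dictionary can be approximated with the standard library (a rough sketch, not Flask's implementation):

```python
import json

def jsonify_sketch(payload):
    """Serialize the payload and pair it with the correct Content-Type header."""
    body = json.dumps(payload)
    return body, 200, {"Content-Type": "application/json"}

body, status, headers = jsonify_sketch({"prediction": 0})
```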
8. The Food is Delivered (The Response is Sent Back)
- Your `predict` function returns the `Response` object to Flask.
- Flask passes this `Response` object back to the Werkzeug server (again, using the WSGI standard).
- The Werkzeug server translates the `Response` object back into raw HTTP response text.
- It sends this text back over the network to the `curl` client, which then prints the response body to your terminal.
So, app.run doesn't call predict directly. It starts a server that listens for requests, and when a request comes in that matches a rule you defined with @app.route, the server uses the WSGI standard to hand it off to Flask, which then looks up and calls your function.
This architecture provides several benefits:
- Separation of Concerns: Web server logic is separate from application logic
- Flexibility: You can easily switch between development and production servers
- Scalability: Multiple workers can handle requests simultaneously
- Standards Compliance: WSGI ensures compatibility across different Python web frameworks
This is primarily a diagnostic tool. The debugging workflow follows this logic:
- If external test fails but internal test succeeds → Problem is with port mapping or firewall
- If internal test also fails → Problem is with application code itself (crashed or wrong port)
- Always check container logs first: `docker logs <container_id>`
- Environment Consistency: Use `capture_requirements.py` to ensure exact version matching
- Container Health Checks: Implement health endpoints for monitoring
- Resource Limits: Set memory and CPU limits in production
- Security: Never expose debug endpoints in production
- 🔐 Security: Add API authentication and input validation
- 📊 Monitoring: Integrate logging and performance monitoring
- 🚀 Scalability: Use Kubernetes for container orchestration
- 🧪 Testing: Add unit tests and integration tests
- 📦 CI/CD: Configure automated build and deployment pipelines
This project serves as a comprehensive learning example for dockerizing machine learning models, covering the complete pipeline from model training to production deployment.