This project demonstrates how to package machine learning models into Docker containers and serve them via a REST API. It's a complete end-to-end example covering model training, dependency management, Docker builds, and debugging best practices for deployment.
ml-docker-demo/
├── train_model.py # Model training script
├── capture_requirements.py # Smart dependency capture tool
├── app.py # Flask API service
├── Dockerfile # Docker configuration
├── requirements.txt # Auto-generated dependency list
├── model.joblib # Saved model file
└── README.MD # Project documentation
Create and activate virtual environment:
# Create virtual environment
conda create -n ml-deploy python=3.9
conda activate ml-deploy
# Or use Python venv
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/macOS
source .venv/bin/activate

Install required dependencies:
pip install flask==2.3.2 scikit-learn==1.3.0 numpy==1.23.5 joblib==1.3.2

# 1. Train the model
python train_model.py
# 2. Auto-capture exact dependency versions
python capture_requirements.py

This generates requirements.txt with pinned versions:
scikit-learn==1.3.0
numpy==1.23.5
flask==2.3.2
joblib==1.3.2
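The README doesn't show `capture_requirements.py` itself, so here is a minimal sketch of what such a tool might look like, using only the standard library. The package list and output format are assumptions, not the project's actual script:

```python
# capture_requirements.py -- illustrative sketch, not the project's actual script.
# Pins the exact installed version of each package the model service needs.
from importlib import metadata

PACKAGES = ["flask", "scikit-learn", "numpy", "joblib"]  # assumed dependency list

def pin(packages):
    """Return 'name==version' lines for installed packages, skipping absent ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            pass  # not installed in this environment
    return lines

if __name__ == "__main__":
    # Redirect to a file to produce requirements.txt:
    #   python capture_requirements.py > requirements.txt
    print("\n".join(pin(PACKAGES)))
```

Unlike a bare `pip freeze`, this pins only the packages the service actually needs, which is the behavior the README attributes to the real script.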
# Build Docker image
docker build -t ml-model-api .
# Run container
docker run -p 5000:5000 ml-model-api
# One-command build and run
docker build -t ml-model-api . && docker run -p 5000:5000 ml-model-api

Follow this sequence for the most efficient debugging and validation:
# Single command test - most efficient validation
docker run --rm ml-model-api \
python -c "import joblib; print('Prediction:', joblib.load('model.joblib').predict([[5.1,3.5,1.4,0.2]]))"

Expected Output:
Prediction: [0]
# Start interactive session for comprehensive testing
docker run --rm -it --entrypoint /bin/sh ml-model-api
# Then run multiple tests inside the container:
python -c "import joblib; model = joblib.load('model.joblib'); print(model.predict([[5.1,3.5,1.4,0.2]]))"
python -c "import sklearn; print('scikit-learn version:', sklearn.__version__)"
python -c "import numpy; print('numpy version:', numpy.__version__)"

Using curl (Recommended):
Windows PowerShell/CMD:
curl -v -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d "{\"features\": [5.1, 3.5, 1.4, 0.2]}"

Linux/macOS/Git Bash:
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Expected Output:
{"prediction": 0}

Using Python:
# test_api.py
import requests

response = requests.post(
    "http://localhost:5000/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]}
)
print(response.json())

Direct Container Testing is Better Because:
- No Network Dependencies - Eliminates Flask/API-related issues, tests model in isolation
- Faster Feedback Loop - Instant results without HTTP overhead
- More Reliable Validation - Confirms the model loads correctly and dependency versions match
Testing Priority Sequence:
- ✅ Direct model test - Fastest validation
- ✅ Interactive testing - Comprehensive validation
- ✅ API testing - Integration validation
- ✅ External testing - End-to-end validation
At a glance, this command uses the tool curl to send an HTTP POST request to a web server running on your local machine (localhost) on port 5000. It's sending a piece of data formatted as JSON to the /predict endpoint, likely to get a machine learning prediction in return.
This is the name of the command-line tool itself. cURL (short for "Client for URLs") is a powerful and versatile tool used to transfer data to or from a server. It can communicate over dozens of protocols, including HTTP and HTTPS, which are used for the web. Think of it as a web browser for your terminal, without the graphical interface.
- `-X`: This flag (short for `--request`) allows you to specify the HTTP request method to be used. While `curl` can often infer the method, explicitly stating it with `-X` is clear and good practice.
- `POST`: This is the specified HTTP method. Unlike a `GET` request (which is used to retrieve data from a server), a `POST` request is used to send data to the server for processing. In this case, you are posting data to be used for a prediction.
This is the destination URL for the request. Let's break down the URL itself:
- `http://`: This is the protocol being used, Hypertext Transfer Protocol, the standard protocol for web communication.
- `localhost`: This is a special hostname that always points back to your own computer. You are not sending the request out to the internet; you are sending it to a server program running on the same machine you run the `curl` command from.
- `:5000`: This is the port number. Since a computer can have many different server applications running at once, a port directs the request to the correct application. Web servers typically use port 80 for HTTP, but development servers often use other ports like `5000`, `8000`, or `8080` to avoid conflicts.
- `/predict`: This is the path on the server. It specifies which "endpoint" should handle this request. In an API, this path is likely routed to a function that takes the incoming data, feeds it to a machine learning model, and generates a prediction.
- `-H`: This flag (short for `--header`) allows you to include an HTTP header in your request. Headers provide additional metadata about the request to the server.
- `"Content-Type: application/json"`: This is one of the most common headers. It explicitly tells the server what kind of data format is being sent in the body of the request. By specifying `application/json`, you let the server know it should expect to parse a JSON object, which helps it process the request correctly.
- `-d`: This flag (short for `--data`) is used to include data in the body of the request. Since this is a `POST` request, the data specified here is what gets sent to the server.
- `'{"features": [...]}'`: This is the actual data payload.
  - The outer single quotes (`'...'`) make the shell treat the entire JSON string as a single argument, protecting the inner double quotes from being misinterpreted by the terminal.
  - The inner content (`{"features": [5.1, 3.5, 1.4, 0.2]}`) is the data itself, formatted as a JSON object. It has a single key named `"features"` whose value is an array of four numbers, likely the feature vector for a single data point you want the model to make a prediction on.
When you execute this command, the following happens:
1. `curl` constructs an HTTP `POST` request.
2. It sets the destination to the `/predict` endpoint of a server on your own machine at port `5000`.
3. It adds a header indicating the data payload is in JSON format.
4. It attaches the JSON data `{"features": [5.1, 3.5, 1.4, 0.2]}` as the body of the request.
5. It sends the request and waits for a response from the server, which it then prints to your terminal.
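The same request can be built with Python's standard library. This sketch only constructs the request object; it doesn't send it, since that would require the container to be running on port 5000:

```python
import json
import urllib.request

# Build the same POST request the curl command sends
req = urllib.request.Request(
    "http://localhost:5000/predict",
    data=json.dumps({"features": [5.1, 3.5, 1.4, 0.2]}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send it and return the server's response.
```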
# Build and run
docker build -t ml-model-api .
docker run -p 5000:5000 ml-model-api
docker build -t ml-model-api . && docker run -p 5000:5000 ml-model-api
# Container lifecycle management
docker ps # List running containers
docker logs <container_id> # View container logs
docker stop <container_id> # Stop container
docker stop $(docker ps -q --filter ancestor=ml-model-api) # Stop all related containers
docker stats <container_id> # View resource usage (CPU, memory, network I/O)
docker attach <container_id> # Reattach to view container output (Ctrl+C to exit)
# Container inspection
docker run -it ml-model-api pip list # Verify dependencies inside container
docker run -it ml-model-api ls -l /app # Check container file structure

# Get container ID and enter shell
docker ps
docker exec -it <container_id> /bin/sh
# Install curl inside container for internal testing
apk add curl # Alpine image
# OR
apt-get update && apt-get install -y curl # Debian image
# Test API from inside the container
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'

# Start container in detached mode with a named reference
docker run -d --name my-api ml-model-api

Command breakdown:
- `-d`: Runs the container in "detached" mode (in the background)
- `--name my-api`: Gives the container an easy-to-remember name
- This command starts the container and runs the default ENTRYPOINT, so your API server will be running inside it
# Access the running container's shell
docker exec -it my-api /bin/sh
# Inside the container's shell - install curl for testing
apk add curl # Alpine image
# OR
apt-get update && apt-get install -y curl # Debian image
# Test API from inside the container
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'
# Additional debugging commands
ls -la /app # Check file structure
cat requirements.txt # Verify dependencies
python -c "import joblib; print(joblib.load('model.joblib'))" # Test model loading

# Start container in interactive mode without running the API
docker run -it --entrypoint /bin/sh ml-model-api
# Debug inside container (API is NOT running in this mode)
ls -la /app
cat requirements.txt
python app.py # Manually start the API for testing

Understanding the difference between these Docker commands:
What it does:
- Creates a new container from the `ml-model-api` image
- Overrides the default startup command (which would normally run `python app.py`)
- Instead runs `/bin/sh` (a shell) as the main process
- `-it` flags make it interactive with a terminal
Key points:
- ❌ Your Flask API is NOT running when you use this command
- ✅ You get direct shell access to explore the container
- ✅ You can manually start the API with `python app.py` if needed
- 🔄 When you exit the shell, the container stops (and is removed if you started it with `--rm`)
What it does:
- First command creates and starts a container running your Flask API in the background
- Second command opens a shell session inside the already-running container
- The API continues running while you explore
Key points:
- ✅ Your Flask API IS running and accessible
- ✅ You can test the API from inside the container
- ✅ You can also test from outside (host machine) simultaneously
- 🔄 When you exit the shell, the API keeps running
Method 3: docker run -it --entrypoint /bin/sh
┌─────────────────────┐
│ Container │
│ ┌─────────────────┐ │
│ │ /bin/sh │ │ ← You are here (shell is the main process)
│ │ (interactive) │ │
│ └─────────────────┘ │
│ │
│ python app.py │ ← API is NOT running
│ (not started) │
└─────────────────────┘
Method 2: docker run -d + docker exec
┌─────────────────────┐
│ Container │
│ ┌─────────────────┐ │
│ │ python app.py │ │ ← API running as main process
│ │ (Flask server) │ │
│ └─────────────────┘ │
│ ┌─────────────────┐ │
│ │ /bin/sh │ │ ← You are here (additional shell session)
│ │ (interactive) │ │
│ └─────────────────┘ │
└─────────────────────┘
Use Method 3 (--entrypoint /bin/sh) when:
- 🔍 You want to explore the container environment
- 🐛 Debug file permissions, missing files, or environment issues
- 🧪 Test individual components before running the full application
- 📝 You want to understand what's inside the image
Use Method 2 (docker run -d + docker exec) when:
- 🌐 You want to test the running API from inside the container
- 🔄 You need both internal and external testing simultaneously
- 🏃‍♂️ You want to debug a live, running application
- 🔍 You want to monitor logs while testing
# 1. Start a new container and get a shell inside it
docker run --rm -it ml-model-api /bin/sh
# 2. Once inside the shell, run this Python command
python -c "import joblib; print(joblib.load('model.joblib').predict([[5.1,3.5,1.4,0.2]]))"

What this proves: the model works even without port mapping. This test doesn't involve the Flask web application or any networking at all. It's a direct test of the files (model.joblib) and the Python environment inside the container, verifying that the core logic is sound.
Prerequisite: A running container named ml-model-api:
# Start the container in the background and give it a name
docker run -d --name ml-model-api -p 5000:5000 ml-model-api

The Command:
# On your host machine:
docker exec ml-model-api ps aux

What this shows: The output will be a list of all the processes running inside that specific container. If you started the container normally, you should see the main `python` process running your Flask application.
Once you have a shell inside the container, you become a detective. Here's a systematic approach to investigating issues:
ls -lha

Narration: "Are the files I expect to be here actually here? Do they have the right permissions?"
- Check for `app.py`, `model.joblib`, `requirements.txt`
- Verify file sizes and timestamps
- Look for permission issues (executable bits, ownership)
pip list

Narration: "Let me check the installed package versions to see if they match requirements.txt."
- Compare installed versions with `requirements.txt`
- Look for missing dependencies
- Check for version conflicts
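This version check can be automated with the standard library. A sketch, where the pin list mirrors the README's versions and is otherwise an assumption:

```python
# Report packages whose installed version differs from the pinned requirement.
from importlib import metadata

PINS = {"flask": "2.3.2", "numpy": "1.23.5", "joblib": "1.3.2"}  # assumed pins

def mismatches(pins):
    """Return {name: installed_version_or_None} for every pin that doesn't match."""
    bad = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # missing entirely
        if installed != wanted:
            bad[name] = installed
    return bad

# An empty dict means the environment matches requirements.txt exactly.
```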
env

Narration: "Let me check the environment variables. Is the PYTHONPATH set correctly? Is a required API key or config path missing?"
- Verify `PYTHONPATH` configuration
- Check for missing environment variables
- Look for configuration paths
python app.py

Narration: "I'll try to run the application manually from the shell. This will give me a direct, interactive traceback if it crashes."
- Get immediate Python error messages
- See startup logs and error details
- Identify import errors or runtime issues
curl localhost:5000/predict -X POST -H "Content-Type: application/json" -d '{"features": [5.1,3.5,1.4,0.2]}'

Narration: "I'll test the service from inside the container to isolate whether it's an application problem or a Docker networking problem."
- Confirms the Flask app is responding internally
- Tests the prediction endpoint functionality
- Eliminates port mapping issues from diagnosis
# Enter container
docker exec -it <container_id> /bin/sh
# Step 1: Are the files I expect to be here actually here? Do they have the right permissions?
ls -lha # Check your app.py, your model file, etc.
ls -la /app # Verify files in the working directory
# Step 2: Let me check the installed package versions to see if they match requirements.txt
pip list # Compare with requirements.txt for version mismatches
pip list | grep -E "flask|sklearn|numpy|joblib" # Focus on key packages
# Step 3: Let me check the environment variables. Is the PYTHONPATH set correctly? Is a required API key or config path missing?
env # Check all environment variables
env | grep -E "PYTHON|PATH" # Focus on Python-related variables
# Step 4: Try to run the application manually from the shell. This will give you a direct, interactive traceback if it crashes.
python app.py # Manual startup - see immediate error messages
# Step 5: I'll test the service from inside the container to isolate whether it's an application problem or a Docker networking problem.
curl localhost:5000/predict \
-X POST \
-H "Content-Type: application/json" \
-d '{"features": [5.1,3.5,1.4,0.2]}' # Test internal service response

This systematic approach helps you quickly identify whether issues are:
- File-related: Missing files, wrong permissions
- Dependency-related: Package version mismatches
- Configuration-related: Missing environment variables
- Application-related: Python code errors
- Network-related: Port mapping or service binding issues
| Issue | Container Test Command |
|---|---|
| Model Loading | docker run --rm ml-model-api python -c "import joblib; joblib.load('model.joblib')" |
| Dependency Check | docker run --rm ml-model-api python -c "import sklearn,numpy; print(sklearn.__version__, numpy.__version__)" |
| Data Shape Test | docker run --rm ml-model-api python -c "import joblib; print(joblib.load('model.joblib').predict([[1,2,3,4]]).shape)" |
| Feature Count | docker run --rm ml-model-api python -c "import joblib; print('Features expected:', joblib.load('model.joblib').n_features_in_)" |
| Model Type | docker run --rm ml-model-api python -c "import joblib; print('Model type:', type(joblib.load('model.joblib')))" |
Use these techniques as part of a comprehensive debugging approach:
- Direct Model Test: Verify the model loads and predicts correctly
- Process Inspection: Confirm the Flask application is running
- Internal API Test: Use curl from inside the container
- External API Test: Use curl from the host machine
This layered approach helps isolate issues at each level of the stack.
Using capture_requirements.py instead of pip freeze:
- ✅ Exact Version Matching: Ensures production environment matches training environment exactly
- ✅ Avoid Version Conflicts: Only includes dependencies actually needed by the model
- ✅ Reproducible Builds: Eliminates "works on my machine" problems
Windows:
pip freeze | findstr "flask scikit-learn numpy joblib" > requirements.txt

Linux/macOS:
pip freeze | grep -E "flask|scikit-learn|numpy|joblib" > requirements.txt

Verify generated file:
type requirements.txt # Windows
cat requirements.txt # Linux/macOS

| Symptom | Possible Cause | Solution |
|---|---|---|
| `curl` command hangs | Container not started or port mapping error | `docker ps` to check container status, verify port mapping |
| Port conflict error | Port 5000 already in use | Change host port: `-p 5001:5000` |
| Container startup failure | Dependency installation error or missing files | Check `docker logs <container_id>` |
| API returns 500 error | Model file missing or corrupted | Retrain model, check `model.joblib` |
1. Check Container Status
   docker ps -a
2. View Container Logs
   docker logs <container_id>
3. Verify Container Internals
   docker exec -it <container_id> /bin/sh
   ls -la /app
   python -c "import joblib; print('Model loaded:', joblib.load('model.joblib'))"
4. Test API Response
   - First test inside container
   - Then test from host
   - Compare results to locate issue
The two testing methods look similar, but they are fundamentally different and test different parts of your setup.
curl -v -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d "{\"features\": [5.1, 3.5, 1.4, 0.2]}"

What is happening?
You are running curl on your own computer's command line (the "host"). The request originates from your host machine and travels through its network stack to the container.
- Meaning of `localhost:5000`: In this context, `localhost` refers to your host machine. The request is sent to port `5000` on your computer.
- The Critical Prerequisite: This test will only succeed if you have mapped the container's port to the host's port. You must have started your container with the `--publish` or `-p` flag, like this: `docker run -p 5000:5000 my-api-image`
What it tests:
- External Connectivity: This is an end-to-end test that simulates how a real external client would access your service.
- Port Mapping: It directly verifies that your `-p 5000:5000` mapping is configured correctly.
- Host Firewall Rules: It confirms that your host machine's firewall is not blocking traffic on port `5000`.
Analogy: You are calling the restaurant's public-facing phone number. You are testing the entire connection from the outside world to the kitchen.
docker exec -it <container_id> /bin/sh
# Now inside the container's shell...
curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

What is happening?
You first use docker exec to get a command-line shell inside the running container. Then, you run curl from that shell. The entire network request originates and terminates within the container's isolated network environment.
- Meaning of `localhost:5000`: In this context, `localhost` refers to the container itself. The request never leaves the container.
- The Critical Prerequisite: The container must be running, but no port mapping is required for this test. You are bypassing the entire Docker port forwarding mechanism.
What it tests:
- Internal Application Health: This test verifies that your Python/Flask application is running correctly inside its own environment and is listening on the correct internal port (`5000`).
- Application Configuration: It confirms your application code (`app.run(host="0.0.0.0", ...)`) is bound correctly to listen for connections within the container.
- Debugging: This is primarily a diagnostic tool. If the test from the host fails, this is your next step. If this internal test succeeds but the external one fails, the problem is with your port mapping or a firewall. If this internal test also fails, the problem is with your application code itself (it crashed, or it's not listening on the right port).
Analogy: You are already inside the restaurant's kitchen, and you shout an order directly to the chef. You are only testing if the chef is there and can hear you. You are not testing the public phone lines at all.
| Feature | Testing from Host Machine | Testing from Inside Container |
|---|---|---|
| What it Tests | End-to-end external connectivity & port mapping | Internal application health & configuration |
| Network Path | Host Machine → Docker Network → Container | Container → Container (Internal Loopback) |
| Meaning of `localhost` | Your main computer (the host) | The container itself |
| Prerequisite | `docker run -p <host_port>:<container_port>` is required | No port mapping is required |
| Primary Use Case | Simulating a real user; integration testing | Debugging and isolating application-level problems |
You may have noticed a subtle difference in the commands:
- From Host (`cmd`): `-d "{\"features\": [5.1, 3.5, 1.4, 0.2]}"`
- Inside Container (`sh`): `-d '{"features": [5.1, 3.5, 1.4, 0.2]}'`
This is due to differences in how command-line shells handle quotes:
- Windows Command Prompt (`cmd.exe`): Requires escaping inner double quotes with a backslash (`\`) inside an outer double-quoted string.
- Linux shells (`sh` or `bash`): Allow you to enclose the entire JSON string in single quotes (`'`), which tells the shell to treat everything inside literally.
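You can see the single-quote behavior directly with Python's `shlex` module, which tokenizes strings the way a POSIX shell would:

```python
import shlex

# The single-quoted JSON body survives shell tokenization as ONE argument,
# with its inner double quotes intact.
cmd = """curl -X POST http://localhost:5000/predict -d '{"features": [5.1, 3.5, 1.4, 0.2]}'"""
args = shlex.split(cmd)
print(args[-1])  # {"features": [5.1, 3.5, 1.4, 0.2]}
```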
Step 1: Test from Inside Container First
docker exec -it <container_id> /bin/sh
curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Step 2: Interpret Results
- ✅ Internal test succeeds + External test succeeds: Everything is working perfectly
- ✅ Internal test succeeds + External test fails: Problem is with port mapping (`-p` flag) or firewall
- ❌ Internal test fails: Problem is with your application code, dependencies, or container configuration
Step 3: Fix Based on Results
- If internal test fails: Check container logs, verify model files, check dependencies
- If only external test fails: Verify port mapping, check firewall settings, ensure correct host/port
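The interpretation rules above amount to a small decision function; encoding them makes the logic explicit (a sketch, with the return strings as plain labels):

```python
def diagnose(internal_ok: bool, external_ok: bool) -> str:
    """Map the two curl test outcomes to the likely fault location."""
    if not internal_ok:
        return "application code, dependencies, or container configuration"
    if not external_ok:
        return "port mapping (-p flag) or firewall"
    return "everything working"
```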
Dockerfile follows best practices:
# Use a lightweight base image
FROM python:3.9-slim
# Set the working directory
WORKDIR /app
# Copy the dependency file first so this layer is cached
COPY requirements.txt .
# Install dependencies (cached layer)
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY . .
# Declare the port
EXPOSE 5000
# Startup command
CMD ["python", "app.py"]

Key configuration notes:
- `host="0.0.0.0"`: Allow external access (required for Docker containers)
- `port=5000`: Must match the Dockerfile's `EXPOSE` directive
- Production environments should use a WSGI server such as Gunicorn
# 1. Development and training
python train_model.py
# 2. Generate exact dependencies
python capture_requirements.py
# 3. Local testing
python app.py
# 4. Docker deployment
docker build -t ml-model-api .
docker run -p 5000:5000 ml-model-api
# 5. Validation testing
curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'That is an excellent question that gets to the very heart of how modern web frameworks like Flask operate. The connection between app.run() and your predict() function is not direct; it's a beautifully orchestrated process involving a web server, a standardized interface, and a routing system.
Let's walk through the entire lifecycle of a request "under the hood."
Think of your Flask application as a specialized restaurant:
- The Server (`app.run`): This is the restaurant itself opening for business. It opens the front door (`0.0.0.0:5000`) and waits for customers.
- The Routing Map (`@app.route`): This is the restaurant's menu and floor plan. It tells the host where to send a customer based on what they ask for.
- Your Function (`predict`): This is a specific chef in the kitchen who knows how to prepare one particular dish.
- The Request (`curl`): This is a customer walking in and placing a specific order.
- WSGI: This is the universal language that the host, waiters, and chefs all agree to speak so that orders are handled consistently.
Here is the detailed sequence of events that connects a curl request to your predict() function.
1. The Menu is Written (The Routing Map is Built)
This is the most important "magic" and it happens the moment you run python app.py, even before the server starts waiting for requests.
- Python executes your script from top to bottom.
- It sees `@app.route("/predict", methods=["POST"])`. This is a Python decorator.
- A decorator is a special function that wraps another function. In this case, the `@app.route()` decorator wraps your `predict()` function.
- Its job is not to run `predict()` right now; it is to register it. It tells the main `app` object: "If you ever receive a request for the path `/predict` and the method is `POST`, the function you need to call is `predict`."
- This builds an internal "routing map" or "URL map" inside the `app` object. It's essentially a dictionary mapping URL rules to specific Python functions.
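The registration pattern can be shown in a few lines of plain Python. This is a toy illustration of the mechanism, not Flask's actual implementation:

```python
# A toy routing map: the decorator REGISTERS the function, it doesn't call it.
routes = {}

def route(path, methods=("GET",)):
    def decorator(func):
        for method in methods:
            routes[(method, path)] = func  # record the handler in the map
        return func                        # hand the function back unchanged
    return decorator

@route("/predict", methods=("POST",))
def predict():
    return {"prediction": 0}

# Later, a dispatcher looks the handler up by (method, path) and calls it:
handler = routes[("POST", "/predict")]
```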
2. The Restaurant Opens (The Server Starts)
- The script reaches the `if __name__ == "__main__":` block. This standard Python construct ensures the code inside only runs when the script is executed directly (not when imported as a module).
- `app.run(host="0.0.0.0", port=5000)` is called. This starts a development web server. Flask uses a library called Werkzeug (German for "tool") for this.
- This Werkzeug server creates a listening socket on your computer. It listens on all available network interfaces (`host="0.0.0.0"`) on port `5000`. It is now in a loop, waiting patiently for an incoming network connection.
3. A Customer Arrives (The curl Request is Made)
You run the command: curl -X POST http://localhost:5000/predict ...
- Your computer sends a raw HTTP request over the network to port `5000`.
- The server sees this incoming connection.
4. The Server Greets the Customer (Werkzeug Parses the Request)
- The Werkzeug server accepts the connection. It reads the raw HTTP text, which looks something like this:
POST /predict HTTP/1.1
Host: localhost:5000
Content-Type: application/json

{"features": [5.1, 3.5, 1.4, 0.2]}
- Werkzeug's job is to parse this raw text into a clean, structured format.
5. The Universal Language (WSGI)
- Werkzeug now needs to pass this request information to your Flask application. It doesn't just call a random function. It uses a standard interface called WSGI (Web Server Gateway Interface).
- Werkzeug packages all the request details (path, method, headers, body, etc.) into a standardized Python dictionary. It then calls your Flask `app` object, passing it this information according to the WSGI standard.
- This standard is what allows you to swap out the development server for a production-grade server (like Gunicorn or uWSGI) without changing your Flask code at all.
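The WSGI contract itself is tiny: a callable that takes `environ` and `start_response`. Flask's `app` object satisfies this same interface, which is exactly why Werkzeug, Gunicorn, and uWSGI can all serve it. A minimal standalone example:

```python
# A complete WSGI application: a callable taking (environ, start_response).
def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/json")])
    return [b'{"prediction": 0}']

# Exercise it without any network, the way a WSGI server would:
captured = {}

def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

body = b"".join(application({"REQUEST_METHOD": "POST", "PATH_INFO": "/predict"},
                            fake_start_response))
```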
6. The Host Directs Traffic (Flask's Routing)
- Your Flask `app` object receives the request information via WSGI.
- It looks at the key pieces of information: the path (`/predict`) and the method (`POST`).
- It now consults the routing map it built back in Step 1.
- It finds a match! The map says: "A `POST` request to `/predict` should be handled by the `predict` function."
7. The Chef Cooks the Meal (Your Function is Executed)
- Finally! Flask calls your `predict()` function.
- To make your life easier, Flask creates helpful "context-aware" objects like `request`. The `request` object is a user-friendly way to access the data that Werkzeug originally parsed.
- `data = request.get_json()`: This Flask helper reads the request body and parses it from a JSON string into a Python dictionary.
- `prediction = model.predict(...)`: This is your own application logic, which has nothing to do with Flask itself.
- `return jsonify({"prediction": prediction})`: You don't just return a dictionary. The `jsonify` helper creates a proper Flask `Response` object. It converts your Python dictionary back into a JSON string and, crucially, sets the `Content-Type` header to `application/json`.
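What `jsonify` adds over returning a bare dictionary can be approximated with the standard library (a rough sketch, not Flask's implementation):

```python
import json

def jsonify_sketch(payload):
    """Serialize the payload and pair it with the correct Content-Type header."""
    body = json.dumps(payload)
    return body, 200, {"Content-Type": "application/json"}

body, status, headers = jsonify_sketch({"prediction": 0})
```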
8. The Food is Delivered (The Response is Sent Back)
- Your `predict` function returns the `Response` object to Flask.
- Flask passes this `Response` object back to the Werkzeug server (again, using the WSGI standard).
- The Werkzeug server translates the `Response` object back into raw HTTP response text.
- It sends this text back over the network to the `curl` client, which then prints the response body to your terminal.
So, app.run doesn't call predict directly. It starts a server that listens for requests, and when a request comes in that matches a rule you defined with @app.route, the server uses the WSGI standard to hand it off to Flask, which then looks up and calls your function.
This architecture provides several benefits:
- Separation of Concerns: Web server logic is separate from application logic
- Flexibility: You can easily switch between development and production servers
- Scalability: Multiple workers can handle requests simultaneously
- Standards Compliance: WSGI ensures compatibility across different Python web frameworks
This is primarily a diagnostic tool. The debugging workflow follows this logic:
- If external test fails but internal test succeeds → Problem is with port mapping or firewall
- If internal test also fails → Problem is with application code itself (crashed or wrong port)
- Always check container logs first: `docker logs <container_id>`
- Environment Consistency: Use `capture_requirements.py` to ensure exact version matching
- Container Health Checks: Implement health endpoints for monitoring
- Resource Limits: Set memory and CPU limits in production
- Security: Never expose debug endpoints in production
- 🔐 Security: Add API authentication and input validation
- 📊 Monitoring: Integrate logging and performance monitoring
- 🚀 Scalability: Use Kubernetes for container orchestration
- 🧪 Testing: Add unit tests and integration tests
- 📦 CI/CD: Configure automated build and deployment pipelines
This project serves as a comprehensive learning example for dockerizing machine learning models, covering the complete pipeline from model training to production deployment.