This example shows how to deploy a classifier trained on the famous iris data set using scikit-learn.
- Create a Python file `trainer.py`.
- Use scikit-learn's `LogisticRegression` to train your model.
- Add code to pickle your model (you can use other serialization libraries such as joblib).
- Upload it to S3 (boto3 will need access to valid AWS credentials).
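Pickling a fitted scikit-learn estimator is a straightforward round-trip. As a quick local sanity check (no S3 involved), you can verify that serialization preserves the model's predictions exactly:

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Fit a small model on the iris data set.
iris = load_iris()
model = LogisticRegression(max_iter=1000)
model.fit(iris.data, iris.target)

# Serialize and deserialize in memory, then confirm the restored
# estimator predicts identically to the original.
restored = pickle.loads(pickle.dumps(model))
assert (restored.predict(iris.data) == model.predict(iris.data)).all()
print("round-trip OK")
```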
```python
# trainer.py
import pickle

import boto3
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Train the model
iris = load_iris()
data, labels = iris.data, iris.target
training_data, test_data, training_labels, test_labels = train_test_split(data, labels)
model = LogisticRegression(solver="lbfgs", multi_class="multinomial")
model.fit(training_data, training_labels)
accuracy = model.score(test_data, test_labels)
print("accuracy: {:.2f}".format(accuracy))

# Upload the model
pickle.dump(model, open("model.pkl", "wb"))
s3 = boto3.client("s3")
s3.upload_file("model.pkl", "my-bucket", "sklearn/iris-classifier/model.pkl")
```

Run the script locally:
```bash
# Install scikit-learn and boto3
$ pip3 install scikit-learn boto3

# Run the script
$ python3 trainer.py
```

- Create another Python file `predictor.py`.
- Add code to load and initialize your pickled model.
- Add a prediction function that accepts a sample and returns a prediction from your model.
```python
# predictor.py
import pickle

import numpy as np

model = None
labels = ["setosa", "versicolor", "virginica"]


def init(model_path, metadata):
    global model
    model = pickle.load(open(model_path, "rb"))


def predict(sample, metadata):
    measurements = [
        sample["sepal_length"],
        sample["sepal_width"],
        sample["petal_length"],
        sample["petal_width"],
    ]
    label_id = model.predict(np.array([measurements]))[0]
    return labels[label_id]
```

Create a `requirements.txt` file to specify the dependencies needed by `predictor.py`. Cortex will automatically install them into your runtime once you deploy:
```text
# requirements.txt
numpy
```

You can skip dependencies that are pre-installed to speed up the deployment process. Note that `pickle` is part of the Python standard library, so it doesn't need to be included.
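Before deploying, you can exercise the `predict` interface locally. A minimal sketch that substitutes a hypothetical `StubModel` for the pickled classifier, so the input handling can be checked without a trained artifact on disk:

```python
import numpy as np

labels = ["setosa", "versicolor", "virginica"]


class StubModel:
    # Hypothetical stand-in for the pickled classifier: it always
    # predicts class 0 ("setosa"), which is enough to test the
    # request-handling logic without a real model file.
    def predict(self, X):
        return np.zeros(len(X), dtype=int)


model = StubModel()


def predict(sample, metadata):
    # Same body as the predict function in predictor.py.
    measurements = [
        sample["sepal_length"],
        sample["sepal_width"],
        sample["petal_length"],
        sample["petal_width"],
    ]
    label_id = model.predict(np.array([measurements]))[0]
    return labels[label_id]


print(predict({"sepal_length": 5.2, "sepal_width": 3.6,
               "petal_length": 1.4, "petal_width": 0.3}, {}))  # -> setosa
```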
Create a `cortex.yaml` file and add the configuration below. A `deployment` specifies a set of resources that are deployed together. An `api` provides a runtime for inference and makes our `predictor.py` implementation available as a web service that can serve real-time predictions:
```yaml
# cortex.yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
  model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
```

`cortex deploy` takes the declarative configuration from `cortex.yaml` and creates it on your Cortex cluster:
```bash
$ cortex deploy

creating classifier api
```

Track the status of your deployment using `cortex get`:
```bash
$ cortex get classifier --watch

status   up-to-date   available   requested   last update   avg latency
live     1            1           1           8s            -

endpoint: http://***.amazonaws.com/iris/classifier
```

The output above indicates that one replica of the API was requested and is available to serve predictions. Cortex will automatically launch more replicas if the load increases and spin down replicas if there is unused capacity.
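You can also call the endpoint programmatically. A minimal Python client sketch using only the standard library (the `classify` helper and the placeholder endpoint are illustrative, not part of Cortex):

```python
import json
import urllib.request


def classify(endpoint, sample):
    # POST the sample as JSON to the API and return the decoded response.
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(sample).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


sample = {"sepal_length": 5.2, "sepal_width": 3.6,
          "petal_length": 1.4, "petal_width": 0.3}

# With a live endpoint, substitute your actual host:
# classify("http://<your-endpoint>/iris/classifier", sample)
print(json.dumps(sample))
```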
We can use `curl` to test our prediction service:
```bash
$ curl http://***.amazonaws.com/iris/classifier \
    -X POST -H "Content-Type: application/json" \
    -d '{"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}'

"setosa"
```

Add a `tracker` to your `cortex.yaml` and specify that this is a classification model:
```yaml
# cortex.yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
  tracker:
    model_type: classification
```

Run `cortex deploy` again to perform a rolling update to your API with the new configuration:
```bash
$ cortex deploy

updating classifier api
```

After making more predictions, your `cortex get` command will show information about your API's past predictions:
```bash
$ cortex get classifier --watch

status   up-to-date   available   requested   last update   avg latency
live     1            1           1           16s           28ms

class         count
setosa        8
versicolor    2
virginica     4
```

This model is fairly small, but larger models may require more compute resources. You can configure this in your `cortex.yaml`:
```yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G
```

You could also configure GPU compute here if your cluster supports it. Adding compute resources may help reduce your inference latency. Run `cortex deploy` again to update your API with this configuration:
```bash
$ cortex deploy

updating classifier api
```

Run `cortex get` again:
```bash
$ cortex get classifier --watch

status   up-to-date   available   requested   last update   avg latency
live     1            1           1           16s           24ms

class         count
setosa        8
versicolor    2
virginica     4
```

If you trained another model and want to A/B test it with your previous model, simply add another `api` to your configuration and specify the new model:
```yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
  model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G

- kind: api
  name: another-classifier
  predictor:
    path: predictor.py
  model: s3://cortex-examples/sklearn/iris-classifier/another-model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G
```

Run `cortex deploy` to create the new API:
```bash
$ cortex deploy

creating another-classifier api
```

`cortex deploy` is declarative, so the classifier API is unchanged while another-classifier is created:

```bash
$ cortex get --watch

api                  status   up-to-date   available   requested   last update
classifier           live     1            1           1           5m
another-classifier   live     1            1           1           8s
```

First, implement `batch-predictor.py` with a `predict` function that can process an array of samples:
```python
# batch-predictor.py
import pickle

import numpy as np

model = None
labels = ["setosa", "versicolor", "virginica"]


def init(model_path, metadata):
    global model
    model = pickle.load(open(model_path, "rb"))


def predict(sample, metadata):
    # Here, sample is a list of samples; build one row per sample.
    measurements = [
        [s["sepal_length"], s["sepal_width"], s["petal_length"], s["petal_width"]] for s in sample
    ]
    label_ids = model.predict(np.array(measurements))
    return [labels[label_id] for label_id in label_ids]
```

Next, add the `api` to `cortex.yaml`:
```yaml
- kind: deployment
  name: iris

- kind: api
  name: classifier
  predictor:
    path: predictor.py
  model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G

- kind: api
  name: another-classifier
  predictor:
    path: predictor.py
  model: s3://cortex-examples/sklearn/iris-classifier/another-model.pkl
  tracker:
    model_type: classification
  compute:
    cpu: 0.5
    mem: 1G

- kind: api
  name: batch-classifier
  predictor:
    path: batch-predictor.py
  model: s3://cortex-examples/sklearn/iris-classifier/model.pkl
  compute:
    cpu: 0.5
    mem: 1G
```

Run `cortex deploy` to create the batch API:
```bash
$ cortex deploy

creating batch-classifier api
```

`cortex get` should show all three APIs now:

```bash
$ cortex get --watch

api                  status   up-to-date   available   requested   last update
classifier           live     1            1           1           10m
another-classifier   live     1            1           1           5m
batch-classifier     live     1            1           1           8s
```

We can send an array of samples to the batch API:

```bash
$ curl http://***.amazonaws.com/iris/batch-classifier \
    -X POST -H "Content-Type: application/json" \
    -d '[
          {
            "sepal_length": 5.2,
            "sepal_width": 3.6,
            "petal_length": 1.5,
            "petal_width": 0.3
          },
          {
            "sepal_length": 7.1,
            "sepal_width": 3.3,
            "petal_length": 4.8,
            "petal_width": 1.5
          },
          {
            "sepal_length": 6.4,
            "sepal_width": 3.4,
            "petal_length": 6.1,
            "petal_width": 2.6
          }
        ]'

["setosa","versicolor","virginica"]
```

Run `cortex delete` to spin down your APIs:
```bash
$ cortex delete

deleting classifier api
deleting another-classifier api
deleting batch-classifier api
```

Running `cortex delete` will free up cluster resources and allow Cortex to scale down to the minimum number of instances you specified during cluster installation. It will not spin down your cluster.
Any questions? Chat with us.