# serve-model-pipeline
Simple prototype of deploying model pipelines from a config on Ray Serve.

- Requires that you're running on the Ray nightly wheels (for the Serve CLI).
- "Models" are defined in `serve_pipeline/__init__.py`. They're currently just random number generators, but any Python code could be filled in (see the first sketch at the end of this README).
- The pipeline is defined in `example_pipeline.json`. The `"class"` field must be the import path to a class that's installed in the Python environment on the Ray cluster (see the second sketch below).

To run:

```shell
pip install -e serve_pipeline
ray start --head
serve start
python deploy.py example_pipeline.json
curl -X GET localhost:8000/api  # Should return a random float in [0, 1).
```

Other notes:

- Deploying the pipeline (`deploy.py`) could be done from a remote machine via the Ray client (see the third sketch below).
- The model code definitions should be built into the Docker image that the Ray cluster is running.
- This can support all of the Ray Serve features: scaling each model, batching requests, using GPUs, etc. (see the last sketch below).
- Error handling is really sloppy right now (e.g., it won't clean up resources if it fails halfway through), but we could make this declarative.
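For orientation, here is a minimal sketch of what a "model" in `serve_pipeline/__init__.py` might look like. The class name and method signature are assumptions for illustration, not the repo's actual code; the only behavior described above is returning a random number:

```python
import random


class RandomModel:
    """Hypothetical model class (not the repo's actual code).

    Any Python logic could go in __call__; this one just returns a
    random float in [0, 1), matching the curl output described above.
    """

    def __call__(self, request) -> float:
        return random.random()
```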
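The only field of `example_pipeline.json` documented above is `"class"`; the surrounding structure in this sketch is purely illustrative and may well differ from the real file (it reuses the hypothetical `RandomModel` from the previous sketch):

```json
{
  "pipeline": [
    {
      "class": "serve_pipeline.RandomModel"
    }
  ]
}
```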
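As a sketch of the remote-deployment note: the Ray client lets a script connect to a running cluster over the network. The hostname is a placeholder and the exact calls depend on the Ray version, but the general shape on Ray 1.x-era nightlies is:

```python
import ray

# Connect to the remote cluster's Ray client server. "head-node-host"
# is a placeholder; 10001 is the Ray client server's default port.
ray.init(address="ray://head-node-host:10001")

# From here, the same deployment logic that deploy.py runs locally
# (parsing the JSON config and deploying each model) would work
# unchanged against the remote cluster.
```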
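Finally, a sketch of the Serve features mentioned in the notes (per-model scaling, batching, GPUs), using the deployment API from the same Ray 1.x era. The class name and option values are illustrative, not taken from this repo:

```python
from ray import serve


# Hypothetical deployment showing the knobs Serve exposes per model:
# replica count for scaling and actor options for GPU scheduling.
@serve.deployment(num_replicas=2, ray_actor_options={"num_gpus": 1})
class ScaledModel:
    # serve.batch collects concurrent calls into a single list invocation.
    @serve.batch(max_batch_size=8)
    async def handle_batch(self, requests):
        # Placeholder per-request results; real model code would go here.
        return [0.0 for _ in requests]

    async def __call__(self, request):
        return await self.handle_batch(request)


ScaledModel.deploy()  # Old (Ray 1.x) deploy API, matching `serve start`.
```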