LLM Ops

This is the final project for Group 6 of IDS 721. The project takes an open source model from Hugging Face and builds a web service in Rust to serve inferences from that model. The web service must be deployed on a Kubernetes cluster and set up with monitoring and metrics. The repository also needs a CI/CD pipeline that automates testing, building, and deploying the service.

Build and Dockerization Process

The build process for this project requires some assets. First, you need to download a model from Hugging Face and keep it in a known location in the repository.

The model/get_model.sh script simplifies this step: it downloads the correct model and places it under the correct name and location without any user input.

You can then build the binary using cargo build. The model then needs to be copied alongside the built application binary into a Docker container for deployment.

To simplify this, the binary has been packaged with a Dockerfile. This Dockerfile builds the binary and includes the predownloaded assets. Once the image is built, it is pushed to ECR using:

export ECR_URL=<ECR Repo URL>
docker build -t $ECR_URL:latest .
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin $ECR_URL
docker push $ECR_URL:latest

A simpler way to do this is to call the

deploy/build.sh

script instead, which will run these instructions for you. You need to pass a few options. Example usage could be

./deploy/build.sh -u <ECR_REPO> -n

which will build the container but not deploy it to ECR, or

./deploy/build.sh -u <ECR_REPO> -r <REGION>

which will build the container and push it to ECR in the given region instead of us-west-2.
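The option handling described above can be sketched with getopts. This is a minimal illustration, not the contents of deploy/build.sh: only the flags -u, -r, and -n come from this README, while the function name parse_build_args, the variable names, and the us-west-2 default (inferred from the sentence above) are assumptions.

```shell
# Hypothetical sketch of option parsing for a script like deploy/build.sh.
# Only the flags -u, -r and -n are taken from the README; the rest is assumed.
parse_build_args() {
  ECR_URL=""
  REGION="us-west-2"   # assumed default, implied by the README
  PUSH=1               # -n switches the push to ECR off
  OPTIND=1             # reset so the function can be called more than once
  while getopts "u:r:n" opt; do
    case "$opt" in
      u) ECR_URL="$OPTARG" ;;
      r) REGION="$OPTARG" ;;
      n) PUSH=0 ;;
      *) echo "usage: build.sh -u <ECR_REPO> [-r <REGION>] [-n]" >&2; return 2 ;;
    esac
  done
  # The repository URL is needed to tag the image, so treat it as required.
  if [ -z "$ECR_URL" ]; then
    echo "error: -u <ECR_REPO> is required" >&2
    return 2
  fi
}
```

After a call such as parse_build_args -u <ECR_REPO> -n, a script could run the docker build step unconditionally and run the login and push steps only when PUSH is 1.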

Once you have a built container, you can run it locally to test using a command like

docker run -it --rm -p 8080:8080 $ECR_URL:latest

and hit the endpoint with a command like

curl -X POST -H "Content-Type: application/json" --data '{"input": "tigers are cool because"}' http://localhost:8080/message

run from a new terminal (since the original terminal will be running the service).

Kubernetes - Container Orchestration

Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service that makes it easy for you to run Kubernetes on AWS without needing to install and operate your own Kubernetes clusters.

Installation

To use the command line interface and interact with EKS, you need to install two tools: kubectl and eksctl. Run the following commands (assuming you have Homebrew) to install them.

kubectl
- brew install kubectl
eksctl
- brew tap weaveworks/tap
- brew install weaveworks/tap/eksctl

Cluster and Namespace Creation

First, we need to create a Kubernetes cluster and a namespace.

To create a cluster, run:

eksctl create cluster 'sample-cluster2' --zones us-east-1a,us-east-1b,us-east-1c

To create a namespace, run:

kubectl create namespace {name of namespace}

Deploy Docker Image to Cluster

To deploy the Docker image to the cluster, we need to write two YAML files and run two commands.

The deployment.yaml file pulls a container image from a public repository and deploys three replicas (known as pods).
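As a rough illustration of what such a file might contain, the sketch below prints a Deployment manifest with three replicas. The repository's actual deployment.yaml is not shown in this README, so the function name render_deployment, the app label, and the arguments are hypothetical; container port 8080 is taken from the docker run example earlier.

```shell
# Prints a hypothetical Deployment manifest to stdout.
# $1 = app name, $2 = namespace, $3 = image URI (all placeholders).
render_deployment() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $1
  namespace: $2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: $1
  template:
    metadata:
      labels:
        app: $1
    spec:
      containers:
        - name: $1
          image: $3
          ports:
            - containerPort: 8080
EOF
}
```

The output can be redirected to a file, for example render_deployment my-app my-namespace "$ECR_URL:latest" > deployment.yaml, where my-app and my-namespace are placeholder names.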

The service.yaml file allows you to access all replicas through a single IP address or name. Load balancing can be added here, which is also how the project obtains its monitoring and metrics.
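A Service of this kind might look like the sketch below. It is an assumption-laden illustration rather than the repository's service.yaml: the function name render_service and its arguments are hypothetical, it assumes the pods carry an app label equal to the service name, and it assumes the load balancer listens on port 80 and forwards to container port 8080.

```shell
# Prints a hypothetical Service manifest to stdout.
# $1 = app name (assumed to match the pods' "app" label), $2 = namespace.
render_service() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: $1
  namespace: $2
spec:
  type: LoadBalancer
  selector:
    app: $1
  ports:
    - port: 80
      targetPort: 8080
EOF
}
```

With type LoadBalancer, the cluster provisions an external load balancer whose hostname fronts all matching pods.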

Once the .yaml files are written, run the following two lines of code:

kubectl apply -f {name of deployment yaml file}.yaml

kubectl apply -f {name of service yaml file}.yaml

QC Deployment

Run the following command to confirm that there are actively running pods:

kubectl get all -n {name of namespace}

Details of the deployment can be found by running:

kubectl -n {name of namespace} describe service {name of service}

Usage

This web service is intended to be used with cURL. A command for usage should look like

curl -X POST -H "Content-Type: application/json" --data '{"context": "mary is my mother", "query": "who is mary"}' http://a32c9ee0e4f5e4632beea263c7bce134-2036735149.us-west-2.elb.amazonaws.com/answer_question

and should return an answer from the model like

my mother
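When the context or query contains double quotes, typing the JSON body by hand becomes error-prone. The helper below is a small sketch for building the request body shown above; json_escape and qa_payload are hypothetical names, and the escaping only covers backslashes and double quotes, not newlines or other control characters.

```shell
# Escape backslashes and double quotes so a string can sit inside a JSON string.
# Limitation: newlines and other control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for the answer_question endpoint.
# $1 = context, $2 = query.
qa_payload() {
  printf '{"context": "%s", "query": "%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}
```

The result can be passed to cURL with --data "$(qa_payload 'mary is my mother' 'who is mary')" in place of the literal JSON string.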

CI/CD

This repository includes CI/CD pipelines using GitHub Actions to automate testing, linting, formatting, building, and deployment processes for your Rust application.

Continuous Integration (CI)

The CI pipeline runs on every push to the main branch and pull requests targeting the main branch. It ensures code quality and correctness by performing tests, linting, and formatting.
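The same three checks can be run locally before pushing. The sketch below assumes the pipeline uses the standard cargo commands for formatting, linting, and testing; the exact steps in main.yaml are not reproduced here, and run_step and ci_checks are hypothetical names.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Formatting, linting, and tests, in that order (assumed to mirror the CI steps).
ci_checks() {
  run_step cargo fmt --all -- --check &&
  run_step cargo clippy --all-targets -- -D warnings &&
  run_step cargo test
}
```

Calling ci_checks from the repository root stops at the first failing step; setting DRY_RUN=1 prints the commands without running them.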

Continuous Deployment (CD)

The CD pipeline runs on every push to the main branch. It requires successful completion of the CI pipeline and then pushes the application's container image to Amazon Elastic Container Registry (ECR).

Setup

To use the CD pipeline, you'll need to:

Set up AWS access and secret keys as repository secrets (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY).
Ensure the IAM user associated with the provided keys has permissions to:
- Log in to AWS.
- Deploy images to Amazon ECR.

For more details on configuring permissions, refer to AWS IAM documentation.

For detailed configuration, see main.yaml in this repository.

Demo Video

https://youtu.be/P7U5G8eKj98

Screenshots

Create Public ECR

EKS Cluster Creation

Monitoring and Metrics

Working Rust Web Service

Continuous Deployment Image in ECR
