katwre/Genetype-classifier-api

Introduction

Figure. An example of a gene and its nucleotide sequence. Figure copied from this post.

The aim of this project is to predict gene type directly from a raw DNA sequence using classical machine learning models, convolutional neural networks, and a custom nucleotide-level Transformer, while also exploring practical deployment of machine learning services with Docker Compose, Kubernetes (locally with Kind), and cloud infrastructure (AWS Kubernetes EKS).

A gene is a defined segment of DNA that contains the nucleotide sequence (A, T, C, and G) encoding the instructions for producing a functional product, most commonly a protein. The nucleotide sequence determines the amino acid sequence, which in turn specifies a protein’s structure and function. Genes typically begin with a start codon and end with a stop codon, which together define the boundaries of the coding region.

Genes can be broadly categorized into protein-coding genes, which are translated into proteins, and non-coding genes, which give rise to functional RNA molecules. Non-coding sequences constitute the majority of the genome, and non-coding RNAs carry out diverse regulatory and structural roles without being translated into proteins. They are often classified by size into small non-coding RNAs (e.g., snRNAs) and long non-coding RNAs (lncRNAs).

Project overview

Gene types (such as protein-coding genes, non-coding RNAs, and pseudogenes) and their corresponding DNA sequences were obtained from the NCBI Gene Database via the following Kaggle dataset: https://www.kaggle.com/datasets/harshvardhan21/dna-sequence-prediction.

The dataset contains:

  • Gene symbols and textual descriptions
  • Gene functional types
  • DNA nucleotide sequences represented as A, C, G, T
  • Predefined train / validation / test splits

Each sequence consists of a variable number of nucleotides and is capped at a maximum length of 1024 bp for transformer training.

Modeling Approaches

To systematically evaluate different learning paradigms for gene-type prediction, the following models were implemented and compared:

  1. Classical machine learning (baseline)
  • 6-mer generation from nucleotide sequences
  • k-mer frequency counting
  • Classification using:
    • Logistic Regression
    • XGBoost
  • This baseline captures local motif statistics but ignores long-range dependencies.
  2. Deep learning with convolutions
  • 1D Convolutional Neural Network (CNN)
  • One-hot encoded DNA input
  • It captures local sequence patterns and motifs.
  3. Nucleotide-level transformer
  • A custom, lightweight, encoder-only Transformer trained from scratch for genomic sequence classification.
  • Architecture:
    • Nucleotide-level, encoder-only Transformer
    • 3 encoder layers
    • Hidden dimension (d_model) of 192
    • 6 attention heads
    • Feed-forward hidden dimension of 512
    • Dropout 0.2
    • Early stopping based on validation macro-F1
    • Max sequence length (max_len): 1024 to cover all sequences without truncation
    • Tokenization: simply one token per nucleotide:
      • Vocabulary: ["A", "C", "G", "T", "N", "[PAD]", "[CLS]"] (plus maybe [UNK])
      • Encode each sequence as [CLS] + tokens + [PAD] up to max_len
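The tokenization scheme above can be sketched as follows. The exact vocabulary IDs here are illustrative, not necessarily the ones used in training:

```python
# Illustrative nucleotide-level tokenizer: [CLS] + one token per base + [PAD]s.
# The actual ID assignment used for training may differ.
VOCAB = {"[PAD]": 0, "[CLS]": 1, "A": 2, "C": 3, "G": 4, "T": 5, "N": 6}

def encode(seq: str, max_len: int = 1024) -> list[int]:
    tokens = [VOCAB["[CLS]"]]
    for base in seq.upper()[: max_len - 1]:          # reserve one slot for [CLS]
        tokens.append(VOCAB.get(base, VOCAB["N"]))   # unknown bases map to N
    tokens += [VOCAB["[PAD]"]] * (max_len - len(tokens))
    return tokens

print(encode("ACGTN", max_len=8))  # [1, 2, 3, 4, 5, 6, 0, 0]
```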

Tech Stack

ML | Data Science

🧠 numpy • pandas • scikit-learn • PyTorch • XGBoost • CNN • transformer-encoder

📊 seaborn • matplotlib

Backend

🌐 FastAPI

🚀 gunicorn

MLOps | Deployment

🐳 Docker • Docker Compose

📦 BentoML

⚙️ ONNX Runtime

☁️ Kind • AWS Kubernetes (EKS)

Usage

⚠️ Service availability note

The public Kubernetes endpoint shown below was used during development and testing on AWS EKS. To avoid unnecessary cloud costs, the EKS cluster and external load balancer are not kept running continuously. As a result, the endpoint may be unavailable at the time of reading. All inference steps can be fully reproduced locally using Docker Compose or local Kubernetes (Kind), as described below.

Example request in Python (EKS deployment - illustrative):

import requests

url = "http://a180c213e0dc94bf3b569493738d5e85-1066055406.eu-west-1.elb.amazonaws.com/predict"

# request JSON
data = {'sequence': 'CTTTCTGCCGCCATCTTGCTTCCGCGTTCCCTGCACAAAATGCCGGGCGAAGCCACAGAAACCGTCCCTGCTACAGAGCAGGAGTTGCCGCAGTCCCAGGCTGAGACAGGGTCTGGAACAGCATCTGATAGTGGTGAATCAGTACCAGGGATTGAAGAACAGGATTCCACCCAGACCACCACACAAAAAGCCTGGCTGGTGGCAGCAGCTGAAATTGATGAAGAACCAGTCGGTAAAGCAAAACAGAGTCGGAGTGAAAAGAGGGCACGGAAGGCTATGTCCAAACTGGGTCTTCTACAGGTTACAGGAGTTACTAGAGTCACTATCTGGAAATCTAAGAATATCCTCTTTGT'}

result = requests.post(url, json=data).json()
print(result)

Example response:

{
  "predictions": {
    "BIOLOGICAL_REGION": 0.00,
    "OTHER": 1.57,
    "PROTEIN_CODING": 89.0,
    "PSEUDO": 0.10,
    "ncRNA": 8.56,
    "snoRNA": 0.75,
    "tRNA": 0.02
  },
  "top_class": "PROTEIN_CODING",
  "top_probability": 89.0
}
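Since the probabilities are returned as percentages, a client can sanity-check and consume such a response with plain Python (the JSON below mirrors the example above):

```python
import json

# Example response, copied from the service output above
response = json.loads("""{
  "predictions": {"BIOLOGICAL_REGION": 0.00, "OTHER": 1.57, "PROTEIN_CODING": 89.0,
                  "PSEUDO": 0.10, "ncRNA": 8.56, "snoRNA": 0.75, "tRNA": 0.02},
  "top_class": "PROTEIN_CODING",
  "top_probability": 89.0
}""")

preds = response["predictions"]
top_class = max(preds, key=preds.get)        # recompute the argmax client-side
assert top_class == response["top_class"]
assert abs(sum(preds.values()) - 100) < 1.0  # percentages should sum to ~100
print(top_class, preds[top_class])           # PROTEIN_CODING 89.0
```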

Deployment

After evaluating multiple modeling approaches, a custom Transformer encoder was selected as the final model due to its superior performance in gene-type classification. The trained PyTorch model was exported to ONNX and packaged as a BentoML service using ONNX Runtime, exposing a stable HTTP /predict endpoint.

The service can be run locally with Docker Compose for reproducible end-to-end testing and deployed unchanged to AWS EKS via Amazon ECR, where it operates as a standard Kubernetes workload. Local Kubernetes deployment using Kind is also described to mirror the cloud setup during development.

To sum up, this project is deployed and tested incrementally across multiple environments, starting from local development and progressing to cloud-native model serving:

  1. Local deployment with Docker Compose (a gateway service + a model service): A multi-container setup consisting of a FastAPI gateway service and a BentoML-based model service running ONNX Runtime.

  2. Kubernetes-based deployment with ONNX Runtime

    • 2A. AWS EKS: Deployment on Amazon Elastic Kubernetes Service (EKS)
    • 2B. Local Kubernetes with Kind: A local Kubernetes setup using Kind to mirror the EKS deployment for development and testing.

Organization of the files in the repository:

Genetype-classifier-api/
├── notebooks/
│   ├── 01_EDA.ipynb                 # Data exploration: class balance, sequence length, cleaning checks
│   ├── 02_baseline_kmers.ipynb       # Baseline model using k-mers of the input (e.g., Logistic Regression / XGBoost)
│   ├── 02_baseline_CNN.ipynb         # Baseline deep model: CNN sequence classifier
│   ├── 02_transformer.ipynb          # Transformer-encoder training + evaluation
│   └── 03_model_comparison.ipynb     # Model comparison across baselines and transformer, metrics + plots
│
├── data/
│   ├── train.csv                    # Training split (raw/processed sequences + labels)
│   ├── validation.csv               # Validation split
│   ├── test.csv                     # Test split
│   ├── train_classimb.csv           # Training split with class imbalance variant
│   ├── validation_classimb.csv      # Validation split with class imbalance variant
│   └── test_classimb.csv            # Test split with class imbalance variant
│
├── models/
│   ├── transformer_model.pkl         # Serialized Transformer model (training artifact / checkpoint)
│   ├── transformer_classifier.onnx   # Exported ONNX model for optimized inference
│   ├── transformer_classifier.onnx.data # External tensor weights for ONNX (large model data)
│   ├── transformer_report.txt        # Evaluation report (metrics, per-class performance)
│   ├── cnn_model.pkl                 # Serialized CNN baseline model
│   ├── cnn_report.txt                # CNN evaluation report
│   ├── logistic_regression_model.pkl # Baseline classical model artifact
│   ├── logistic_regression_report.txt # Baseline evaluation report
│   ├── xboost_model.pkl              # XGBoost baseline model artifact
│   └── xboost_report.txt             # XGBoost evaluation report
│
├── service/
│   ├── app.py                        # FastAPI application (inference endpoint(s))
│   ├── Dockerfile_k8s                # Service image build for Kubernetes deployment
│   ├── test.py                       # Local service tests / smoke tests
│   └── __init__.py                   # Package marker
│
├── docker_compose/
│   ├── bentofile.yaml                # BentoML build configuration for model packaging
│   ├── service.py                    # Bento service definition (runner, API adapters, etc.)
│   ├── gateway.py                    # Gateway / routing layer (fronts model service)
│   ├── docker-compose.yaml           # Local multi-container setup (gateway + model service)
│   ├── image-model.dockerfile        # Docker build for model service container
│   ├── image-gateway.dockerfile      # Docker build for gateway container
│   └── __init__.py                   # Package marker
│
├── eks/
│   ├── eks-config.yaml               # EKS cluster configuration (cluster, nodegroups, region, VPC settings)
│   ├── model-deployment.yaml         # Kubernetes Deployment for model service
│   ├── model-service.yaml            # Kubernetes Service for model service (ClusterIP, ports)
│   ├── gateway-deployment.yaml       # Kubernetes Deployment for gateway
│   ├── gateway-service.yaml          # Kubernetes Service for gateway (LoadBalancer / ingress entrypoint)
│   └── test.py                       # EKS-related tests / helper script(s)
│
├── k8s/
│   ├── deployment.yaml    # Kubernetes Deployment for the inference service
│   ├── service.yaml       # Kubernetes Service exposing the API
│   ├── hpa.yaml           # Horizontal Pod Autoscaler for scaling under load
│   └── load_test.py       # Load testing script for the deployed service
│
├── plots/                            # Training curves and confusion matrices for the ML models
├── img/                              # README assets (logo/illustrations)
├── pyproject.toml                    # Python project config + dependencies
├── uv.lock                           # Locked dependency versions (uv)
└── README.md                         # Project overview, setup, usage, deployment notes

1. Local deployment with Docker Compose (a gateway service + a model service)

Figure. Deployment schema: a FastAPI gateway (HTTP) forwards requests to a BentoML model service that runs ONNX Runtime inference and returns JSON.

Docker Compose provides a simple and deterministic way to run multi-container applications on a single host. In this project, it is used to validate the full inference stack end-to-end (gateway routing, model loading, request/response schema, and container networking) in a production-like environment without Kubernetes. This same container-first approach transfers cleanly to AWS EKS later.

The deployment uses a lightweight FastAPI gateway for request routing and a BentoML model service for inference. BentoML is an open-source framework designed to package model execution code, runtime dependencies, and serving logic into a reproducible, containerized service. In this project, BentoML exposes the ONNX Runtime inference pipeline as a production-grade HTTP API, decoupling model training from deployment and runtime concerns.

Architecture overview. This deployment contains two services:

  1. Gateway service (FastAPI, port 9696)
  • Receives HTTP requests from the client (e.g., website or curl).
  • Performs lightweight request validation (optional).
  • Forwards the request to the model service via an internal Docker network.
  2. Model service (BentoML + ONNX Runtime, port 3000)
  • Loads the exported ONNX model at startup.
  • Runs domain-specific sequence preprocessing (tokenization, attention mask, padding).
  • Executes inference via ONNX Runtime.
  • Applies postprocessing (softmax, class mapping) and returns a JSON response.
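The postprocessing step (softmax + class mapping) can be sketched roughly as below. The class list matches the example responses, but the logits are made up and the actual service code lives in docker_compose/service.py:

```python
import math

CLASSES = ["BIOLOGICAL_REGION", "OTHER", "PROTEIN_CODING",
           "PSEUDO", "ncRNA", "snoRNA", "tRNA"]

def postprocess(logits: list[float]) -> dict:
    """Softmax over raw logits, then map each class to a percentage score."""
    m = max(logits)                                   # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [round(100 * e / total, 2) for e in exps]
    predictions = dict(zip(CLASSES, probs))
    top = max(predictions, key=predictions.get)
    return {"predictions": predictions,
            "top_class": top,
            "top_probability": predictions[top]}

out = postprocess([0.1, 1.2, 5.3, 0.0, 3.0, 0.8, -1.0])  # illustrative logits
print(out["top_class"])  # PROTEIN_CODING
```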

Run the model service locally (without Docker)

This is useful for validating ONNX inference and the BentoML endpoint before containerizing.

bentoml serve docker_compose.service:GeneFunctionService --port 3000

Test in another terminal:

curl -X POST http://localhost:3000/predict \
  -H "Content-Type: application/json" \
  -d '{"sequence":"AACGGCTC"}'

Build and run the model service in Docker

Build/run locally:

# Build the model image (run from repository root):
docker build -t genetype-bento:001 -f docker_compose/image-model.dockerfile .
# Run the container:
docker run --rm -p 3000:3000 genetype-bento:001

Test it:

curl -X POST http://localhost:3000/predict \
  -H "Content-Type: application/json" \
  -d '{"sequence":"AACGGCTC"}'

Build the gateway image

The gateway forwards requests to the model service using:

MODEL_URL=http://onnx-model:3000/predict

Build the gateway image (run from repository root; adjust Dockerfile path if needed):

docker build -t genetype-gateway:001 -f docker_compose/image-gateway.dockerfile .

Run the full stack with Docker Compose

Start both containers and their internal network:

docker-compose -f docker_compose/docker-compose.yaml up --build

Test end-to-end via the gateway:

curl -X POST http://localhost:9696/predict \
  -H "Content-Type: application/json" \
  -d '{"sequence":"AACGGCTC"}'

You should receive the JSON response produced by BentoML (predictions + top_class + top_probability).

Useful commands:

  • docker-compose up: start the stack
  • docker-compose up -d: start the stack in detached mode
  • docker ps: list the running containers
  • docker-compose down: stop and remove the containers

2. Kubernetes-based deployment

Figure. Schematic overview of a Kubernetes cluster showing ingress-based request routing through a gateway service to replicated model-serving deployments across cluster nodes.

2A. AWS EKS

Here, I'll create an Elastic Kubernetes Service (EKS) cluster on AWS using the CLI, publish the images to Amazon ECR, and configure kubectl.

What needs to be installed locally:

  • aws CLI configured for your AWS account (command-line tool for working with AWS services)
  • kubectl (manages Kubernetes objects within your Amazon EKS clusters)
  • eksctl CLI (interacts with AWS to create, modify, and delete Amazon EKS clusters)
  • helm (a popular package manager for Kubernetes)
  • access to an EKS cluster (existing or newly created)

To create and manage the cluster on EKS, I'll use the CLI tool eksctl, which can be downloaded from here. Then let's follow these steps:

1. In the eks folder, create the EKS config file eks-config.yaml:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: genetype-eks
  region: eu-west-1

nodeGroups: # for our case, we need only one node group (CPU)
  - name: ng-m5-xlarge
    instanceType: m5.xlarge
    desiredCapacity: 1

Create the EKS cluster:

eksctl create cluster -f eks-config.yaml

2. Push local BentoML image to Amazon ECR

Create an AWS ECR repository for the EKS cluster (one-time):

aws ecr create-repository --repository-name genetype-bento-images  --region eu-west-1

Bash commands to run in the terminal to tag and push the Docker images to the ECR repository:

# Registry URI
ACCOUNT_ID=XXXXXXXXXX
REGION=eu-west-1
REGISTRY_NAME=genetype-bento-images
PREFIX=${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REGISTRY_NAME}

# Tag local docker images to remote tag 
GATEWAY_LOCAL=genetype-gateway:001 # gateway service
GATEWAY_REMOTE=${PREFIX}:genetype-gateway-001 # notice the ':' is replaced with '-' before 001
docker tag ${GATEWAY_LOCAL} ${GATEWAY_REMOTE}

MODEL_LOCAL=genetype-bento:001 # ml model
MODEL_REMOTE=${PREFIX}:genetype-bento-001 # same thing ':' is replaced with '-' before genetype-bento
docker tag ${MODEL_LOCAL} ${MODEL_REMOTE}

# Push tagged docker images
docker push ${MODEL_REMOTE}
docker push ${GATEWAY_REMOTE}

Note that docker push requires an authenticated Docker client, so log in to ECR before running the push commands above (AWS CLI v2):

aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com

(with the older AWS CLI v1, $(aws ecr get-login --no-include-email) does the same). Push the model image first, then the gateway image.

Get the URIs of these images:

echo ${MODEL_REMOTE} 

and

echo ${GATEWAY_REMOTE} 

and add them to model-deployment.yaml and gateway-deployment.yaml respectively.

Apply all the YAML config files to the EKS cluster (its remote nodes appear in kubectl get nodes):

    kubectl apply -f model-deployment.yaml
    kubectl apply -f model-service.yaml
    kubectl apply -f gateway-deployment.yaml
    kubectl apply -f gateway-service.yaml

Once the Deployments and Services are up, testing them should give us predictions.

Check nodes:

kubectl get nodes

My example output:

NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-24-82.eu-west-1.compute.internal   Ready    <none>   15m   v1.32.9-eks-ecaa3a6

Check the services:

kubectl get service

My example output:

NAME             TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
gateway          LoadBalancer   10.100.212.63   a180c213e0dc94bf3b569493738d5e85-1066055406.eu-west-1.elb.amazonaws.com   80:31910/TCP   5m15s
genetype-bento   ClusterIP      10.100.68.77    <none>                                                                    8500/TCP       5m20s
kubernetes       ClusterIP      10.100.0.1      <none>                                                                    443/TCP        24m

Executing kubectl get service gives us the external address of the gateway, which needs to be added in eks/test.py as the access URL for predictions (e.g., url = 'http://a3399e***-5180***.ap-south-123.elb.amazonaws.com/predict').

Note: this load balancer is now open to everyone! Remember to restrict it.

Test it!

python3 eks/test.py

And the output:

{
  "predictions": {
    "BIOLOGICAL_REGION": 0.00,
    "OTHER": 1.57,
    "PROTEIN_CODING": 89.0,
    "PSEUDO": 0.10,
    "ncRNA": 8.56,
    "snoRNA": 0.75,
    "tRNA": 0.02
  },
  "top_class": "PROTEIN_CODING",
  "top_probability": 89.0
}

Figure. AWS EC2 console showing a Classic Load Balancer provisioned for the EKS deployment. The load balancer is internet-facing, spans three availability zones in eu-west-1, and currently has no registered instances in service.

Figure. Amazon EKS console displaying the active genetype-eks cluster running Kubernetes v1.32 in the eu-west-1 region, created recently and operating under standard support.

To delete the remote cluster:

eksctl delete cluster --name genetype-eks

2B. Local Kubernetes with Kind

Install kubectl on linux:

cd
mkdir bin && cd bin

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl

cd
export PATH="${PATH}:${HOME}/bin"
# add the export line to .bashrc as well

# verify installation
which kubectl
kubectl version --client

Install Kind. Kind (Kubernetes in Docker) allows you to run Kubernetes clusters locally using Docker containers.

curl -Lo ${HOME}/bin/kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
chmod +x ${HOME}/bin/kind

Install uv:

pip install uv

Create a Kind Cluster. Let's create a local Kubernetes cluster:

kind create cluster --name seq2genetype

This command will:

  • Create a single-node Kubernetes cluster
  • Configure kubectl to use this cluster
  • Take a few minutes on first run (downloads images)

Verify the cluster is running:

kubectl cluster-info
kubectl get nodes

We should see one node in "Ready" status.

2B.1. Model Preparation

We will use the trained Transformer model from this project, already converted to ONNX format. It predicts the gene type from a raw DNA sequence.

Download the ONNX model (use the raw file URLs, not the GitHub blob pages):

mkdir service 
cd service

wget https://raw.githubusercontent.com/katwre/Genetype-classifier-api/main/models/transformer_classifier.onnx -O transformer_classifier.onnx
wget https://raw.githubusercontent.com/katwre/Genetype-classifier-api/main/models/transformer_classifier.onnx.data -O transformer_classifier.onnx.data

2B.2. Building the FastAPI service

Here, I'll create a FastAPI application that serves the ONNX model for inference (prediction).

Initialize the project with uv:

uv init
rm main.py
uv add fastapi uvicorn onnxruntime numpy requests

FastAPI application was created in service/app.py. To test locally, run the service:

uv run uvicorn service.app:app --host 0.0.0.0 --port 8080 --reload

Open http://localhost:8080/docs to see the API documentation.

And then test the service with:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"sequence": "ACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTA"}'

It returns something like:

{"predictions":{"BIOLOGICAL_REGION":0.04,"OTHER":14.28,"PROTEIN_CODING":0.0,"PSEUDO":0.9,"ncRNA":77.85,"snoRNA":6.84,"tRNA":0.08},"top_class":"ncRNA","top_probability":77.85}

2B.3. Docker Containerization

Let's containerize the application.

Build the image:

docker build -t gene-type-classifier:v1 .

Test the container locally:

docker run -it --rm -p 8080:8080 gene-type-classifier:v1

In another terminal, run the test script:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"sequence": "ACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTA"}'

2B.4. Loading Image to Kind

Kind clusters run in Docker, so they can't access images from the local Docker daemon by default. I need to load the image into Kind.

kind load docker-image gene-type-classifier:v1 --name seq2genetype

2B.5. Kubernetes Deployment

Understanding kubernetes resources:

  • Pod: The smallest deployable unit in Kubernetes (one or more containers)
  • Deployment: Manages a set of identical Pods, handles updates and scaling
  • Service: Exposes Pods to network traffic
  • HPA (Horizontal Pod Autoscaler): Automatically scales Pods based on metrics

2B.6. Create Deployment Manifest

Key configuration in k8s/deployment.yaml:

  • replicas: 2 - Run 2 copies of our service
  • imagePullPolicy: Never - Use local image (don't pull from registry)
  • resources - Memory and CPU limits/requests
  • livenessProbe - Restart container if unhealthy
  • readinessProbe - Only send traffic when ready

Deploy it:

kubectl apply -f k8s/deployment.yaml

Check the deployment:

kubectl get deployments
kubectl get pods
kubectl describe deployment gene-type-classifier

View logs:

kubectl logs -l app=gene-type-classifier --tail=20

2B.7. Create Service Manifest

Key configuration in k8s/service.yaml:

  • type: NodePort - Expose on a static port on each node
  • nodePort: 30080 - Accessible on port 30080 from host
  • selector - Routes traffic to Pods with matching labels

Deploy it:

kubectl apply -f k8s/service.yaml

Check the service:

kubectl get services
kubectl describe service gene-type-classifier

2B.8. Testing the Deployed Service

With NodePort, the service would normally be accessible on localhost:30080. Check the health endpoint:

curl http://localhost:30080/health

Our Kind cluster is not configured to expose NodePorts on the host, so this won't work out of the box. We don't really need it for local testing, so let's use a quick fix: kubectl port-forward. It starts a temporary TCP tunnel from our local machine to the Kubernetes Service (mapping local port 30080 to service port 8080). Visually: curl localhost:30080 -> kubectl (port-forward) -> Kubernetes API -> Service (gene-type-classifier) -> Pod :8080.

kubectl port-forward service/gene-type-classifier 30080:8080

Now the service is accessible on port 30080:

curl http://localhost:30080/health

When we deploy to EKS or another managed Kubernetes in the cloud this isn't an issue: the Elastic Load Balancer exposes the service externally.

2B.9. Horizontal Pod Autoscaling

Kubernetes can automatically scale your application based on CPU or memory usage. For HPA to work, we first need metrics-server. Install it with kubectl:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

For Kind, we need to patch metrics-server to work without TLS:

kubectl patch -n kube-system deployment metrics-server --type=json -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Wait for metrics-server to be ready:

kubectl get deployment metrics-server -n kube-system

We should see something like this:

NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   1/1     1            1           72s

The file k8s/hpa.yaml contains a configuration to:

  • Scale between 2 and 10 replicas
  • Target 50% CPU utilization
  • Automatically add/remove Pods based on load

To deploy HPA:

kubectl apply -f k8s/hpa.yaml

Check HPA status:

kubectl get hpa
kubectl describe hpa gene-type-classifier-hpa

2B.10. Testing Autoscaling

Generate load to trigger autoscaling. We can use a simple load test (see load_test.py).
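In essence, the load test is many concurrent POSTs to /predict. A minimal stdlib-only sketch of such a script (the real k8s/load_test.py may differ; the URL assumes the port-forward from above):

```python
import json
import random
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:30080/predict"  # assumes the kubectl port-forward is running

def random_sequence(length: int = 300) -> str:
    """Generate a random DNA sequence to use as a request payload."""
    return "".join(random.choice("ACGT") for _ in range(length))

def hit(_: int) -> int:
    """Send one POST request and return the HTTP status code."""
    body = json.dumps({"sequence": random_sequence()}).encode()
    req = urllib.request.Request(URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    # 200 requests across 20 threads: enough sustained CPU load to trigger the HPA.
    with ThreadPoolExecutor(max_workers=20) as pool:
        statuses = list(pool.map(hit, range(200)))
    print(sum(s == 200 for s in statuses), "successful requests")
```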

First, check that you can access the endpoint:

curl http://localhost:30080/health

Run the test:

uv run python k8s/load_test.py

While running the load test, watch the HPA in another terminal:

kubectl get hpa -w

We should see the number of replicas increase as CPU usage rises.

Check Pods:

kubectl get pods -w

2B.11. Managing the Deployment

If we make changes to the code:

  1. Rebuild the image with a new tag:
docker build -t gene-type-classifier:v2 .
  2. Load it into Kind:
kind load docker-image gene-type-classifier:v2 --name seq2genetype
  3. Update the deployment:
kubectl set image deployment/gene-type-classifier gene-type-classifier=gene-type-classifier:v2

Or update the YAML file and apply:

kubectl apply -f k8s/deployment.yaml

Watch the rollout:

kubectl rollout status deployment/gene-type-classifier

2B.12. Scaling Manually

Scale to 5 replicas:

kubectl scale deployment gene-type-classifier --replicas=5

2B.13. Viewing Logs

All Pods:

kubectl logs -l app=gene-type-classifier --tail=50

Specific Pod:

kubectl logs <pod-name>

Follow logs:

kubectl logs -f -l app=gene-type-classifier

2B.14. Debugging

Describe resources:

kubectl describe deployment gene-type-classifier
kubectl describe pod <pod-name>
kubectl describe service gene-type-classifier

Get events:

kubectl get events --sort-by='.lastTimestamp'

Execute commands in a Pod:

kubectl exec -it <pod-name> -- /bin/bash

2B.15. Clean-up

Delete the deployment and service:

kubectl delete -f k8s/deployment.yaml
kubectl delete -f k8s/service.yaml
kubectl delete -f k8s/hpa.yaml

Or delete everything at once:

kubectl delete all -l app=gene-type-classifier
kubectl delete hpa gene-type-classifier-hpa

Delete the Kind cluster:

kind delete cluster --name seq2genetype

About

Gene type prediction from DNA sequence using a Transformer encoder - ONNX Runtime inference, FastAPI + BentoML serving, deployed on AWS EKS (Kubernetes)
