The aim of this project is to predict gene type directly from a raw DNA sequence using classical machine learning models, convolutional neural networks, and a custom nucleotide-level Transformer, while also exploring practical deployment of machine learning services with Docker Compose, Kubernetes (locally with Kind), and cloud infrastructure (AWS EKS).
A gene is a defined segment of DNA that contains the nucleotide sequence (A, T, C, and G) encoding the instructions for producing a functional product, most commonly a protein. The nucleotide sequence determines the amino acid sequence, which in turn specifies a protein’s structure and function. Genes typically begin with a start codon and end with a stop codon, which together define the boundaries of the coding region.
Genes can be broadly categorized into protein-coding genes, which are translated into proteins, and non-coding genes, which give rise to functional RNA molecules. Although most of the genome does not encode protein, non-coding RNAs carry out diverse regulatory and structural roles without being translated. They are often classified by size into small non-coding RNAs (e.g., snRNAs) and long non-coding RNAs (lncRNAs).
Gene types (such as protein-coding genes, non-coding RNAs, and pseudogenes) and their corresponding DNA sequences were obtained from the NCBI Gene Database via the following Kaggle dataset: https://www.kaggle.com/datasets/harshvardhan21/dna-sequence-prediction.
The dataset contains:
- Gene symbols and textual descriptions
- Gene functional types
- DNA nucleotide sequences represented as A, C, G, T
- Predefined train / validation / test splits
Each sequence consists of a variable number of nucleotides and is capped at a maximum length of 1024 bp for transformer training.
To systematically evaluate different learning paradigms for gene-type prediction, the following models were implemented and compared:
- Classical machine learning (baseline)
- 6-mer generation from nucleotide sequences
- k-mer frequency counting
- Classification using:
- Logistic Regression
- XGBoost
- This baseline captures local motif statistics but ignores long-range dependencies.
- Deep learning with convolutions
- 1D Convolutional Neural Network (CNN)
- One-hot encoded DNA input
- It captures local sequence patterns and motifs.
- Nucleotide-level transformer
- A custom, lightweight, encoder-only Transformer trained from scratch for genomic sequence classification.
- Architecture:
- Nucleotide-level, encoder-only Transformer
- 3 encoder layers
- Hidden dimension (d_model) of 192
- 6 attention heads
- Feed-forward hidden dimension of 512
- Dropout 0.2
- Early stopping based on validation macro-F1
- Max sequence length (max_len): 1024 to cover all sequences without truncation
- Tokenization: one token per nucleotide
- Vocabulary: ["A", "C", "G", "T", "N", "[PAD]", "[CLS]"] (plus optionally [UNK])
- Each sequence is encoded as [CLS] + tokens + [PAD] up to max_len
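The encoding scheme above can be sketched in a few lines of Python. This is a minimal illustration only; the exact token-to-id mapping used during training is an assumption.

```python
# Minimal sketch of nucleotide-level tokenization (illustrative;
# the exact token-to-id mapping used in training may differ).
VOCAB = {"[PAD]": 0, "[CLS]": 1, "A": 2, "C": 3, "G": 4, "T": 5, "N": 6}

def encode(sequence: str, max_len: int = 1024) -> list[int]:
    """Encode a DNA sequence as [CLS] + one token per nucleotide + [PAD]s."""
    ids = [VOCAB["[CLS]"]]
    # Unknown characters fall back to "N" (an [UNK] token could be used instead).
    ids += [VOCAB.get(base, VOCAB["N"]) for base in sequence[: max_len - 1]]
    ids += [VOCAB["[PAD]"]] * (max_len - len(ids))  # pad up to max_len
    return ids

print(encode("ACGT", max_len=8))  # [1, 2, 3, 4, 5, 0, 0, 0]
```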
ML | Data Science
🧠 numpy • pandas • scikit-learn • PyTorch • XGBoost • CNN • transformer-encoder
📊 seaborn • matplotlib
Backend
🌐 FastAPI
🚀 gunicorn
MLOps | Deployment
🐳 Docker • Docker Compose
📦 BentoML
⚙️ ONNX Runtime
☁️ Kind • AWS Kubernetes (EKS)
The public Kubernetes endpoint shown below was used during development and testing on AWS EKS. To avoid unnecessary cloud costs, the EKS cluster and external load balancer are not kept running continuously. As a result, the endpoint may be unavailable at the time of reading. All inference steps can be fully reproduced locally using Docker Compose or local Kubernetes (Kind), as described below.
Example request in Python (EKS deployment, illustrative):
import requests

url = "http://a180c213e0dc94bf3b569493738d5e85-1066055406.eu-west-1.elb.amazonaws.com/predict"
# request JSON
data = {'sequence': 'CTTTCTGCCGCCATCTTGCTTCCGCGTTCCCTGCACAAAATGCCGGGCGAAGCCACAGAAACCGTCCCTGCTACAGAGCAGGAGTTGCCGCAGTCCCAGGCTGAGACAGGGTCTGGAACAGCATCTGATAGTGGTGAATCAGTACCAGGGATTGAAGAACAGGATTCCACCCAGACCACCACACAAAAAGCCTGGCTGGTGGCAGCAGCTGAAATTGATGAAGAACCAGTCGGTAAAGCAAAACAGAGTCGGAGTGAAAAGAGGGCACGGAAGGCTATGTCCAAACTGGGTCTTCTACAGGTTACAGGAGTTACTAGAGTCACTATCTGGAAATCTAAGAATATCCTCTTTGT'}
result = requests.post(url, json=data).json()
print(result)
Example response:
{
"predictions": {
"BIOLOGICAL_REGION": 0.00,
"OTHER": 1.57,
"PROTEIN_CODING": 89.0,
"PSEUDO": 0.10,
"ncRNA": 8.56,
"snoRNA": 0.75,
"tRNA": 0.02
},
"top_class": "PROTEIN_CODING",
"top_probability": 89.0
}
After evaluating multiple modeling approaches, a custom Transformer encoder was selected as the final model due to its superior performance in gene-type classification. The trained PyTorch model was exported to ONNX and packaged as a BentoML service using ONNX Runtime, exposing a stable HTTP /predict endpoint.
The service can be run locally with Docker Compose for reproducible end-to-end testing and deployed unchanged to AWS EKS via Amazon ECR, where it operates as a standard Kubernetes workload. Local Kubernetes deployment using Kind is also described to mirror the cloud setup during development.
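The percent-style scores in the example response above come from a softmax over the model's raw output logits. A minimal sketch of that postprocessing step follows; the class list matches the response, but the logit values are made up for illustration:

```python
import math

CLASSES = ["BIOLOGICAL_REGION", "OTHER", "PROTEIN_CODING",
           "PSEUDO", "ncRNA", "snoRNA", "tRNA"]

def postprocess(logits: list[float]) -> dict:
    """Softmax the logits and report per-class percentages plus the top class."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for stability
    total = sum(exps)
    probs = {c: round(100 * e / total, 2) for c, e in zip(CLASSES, exps)}
    top = max(probs, key=probs.get)
    return {"predictions": probs, "top_class": top, "top_probability": probs[top]}

# Hypothetical logits, for illustration only.
result = postprocess([-2.0, 1.0, 5.0, -1.0, 3.0, 0.5, -3.0])
print(result["top_class"])  # PROTEIN_CODING
```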
To sum up, this project is deployed and tested incrementally across multiple environments, starting from local development and progressing to cloud-native model serving:
1. Local deployment with Docker Compose (a gateway service + a model service): A multi-container setup consisting of a FastAPI gateway service and a BentoML-based model service running ONNX Runtime.
2. Kubernetes-based deployment with ONNX Runtime
- 2A. AWS EKS: Deployment on Amazon Elastic Kubernetes Service (EKS)
- 2B. Local Kubernetes with Kind: A local Kubernetes setup using Kind to mirror the EKS deployment for development and testing.
Organization of the files in the repository:
Genetype-classifier-api/
├── notebooks/
│ ├── 01_EDA.ipynb # Data exploration: class balance, sequence length, cleaning checks
│ ├── 02_baseline_kmers.ipynb # Baseline model using k-mers of the input (e.g., Logistic Regression / XGBoost)
│ ├── 02_baseline_CNN.ipynb # Baseline deep model: CNN sequence classifier
│ ├── 02_transformer.ipynb # Transformer-encoder training + evaluation
│ └── 03_model_comparison.ipynb # Model comparison across baselines and transformer, metrics + plots
│
├── data/
│ ├── train.csv # Training split (raw/processed sequences + labels)
│ ├── validation.csv # Validation split
│ ├── test.csv # Test split
│ ├── train_classimb.csv # Training split with class imbalance variant
│ ├── validation_classimb.csv # Validation split with class imbalance variant
│ └── test_classimb.csv # Test split with class imbalance variant
│
├── models/
│ ├── transformer_model.pkl # Serialized Transformer model (training artifact / checkpoint)
│ ├── transformer_classifier.onnx # Exported ONNX model for optimized inference
│ ├── transformer_classifier.onnx.data # External tensor weights for ONNX (large model data)
│ ├── transformer_report.txt # Evaluation report (metrics, per-class performance)
│ ├── cnn_model.pkl # Serialized CNN baseline model
│ ├── cnn_report.txt # CNN evaluation report
│ ├── logistic_regression_model.pkl # Baseline classical model artifact
│ ├── logistic_regression_report.txt # Baseline evaluation report
│ ├── xboost_model.pkl # XGBoost baseline model artifact
│ └── xboost_report.txt # XGBoost evaluation report
│
├── service/
│ ├── app.py # FastAPI application (inference endpoint(s))
│ ├── Dockerfile_k8s # Service image build for Kubernetes deployment
│ ├── test.py # Local service tests / smoke tests
│ └── __init__.py # Package marker
│
├── docker_compose/
│ ├── bentofile.yaml # BentoML build configuration for model packaging
│ ├── service.py # Bento service definition (runner, API adapters, etc.)
│ ├── gateway.py # Gateway / routing layer (fronts model service)
│ ├── docker-compose.yaml # Local multi-container setup (gateway + model service)
│ ├── image-model.dockerfile # Docker build for model service container
│ ├── image-gateway.dockerfile # Docker build for gateway container
│ └── __init__.py # Package marker
│
├── eks/
│ ├── eks-config.yaml # EKS cluster configuration (cluster, nodegroups, region, VPC settings)
│ ├── model-deployment.yaml # Kubernetes Deployment for model service
│ ├── model-service.yaml # Kubernetes Service for model service (ClusterIP, ports)
│ ├── gateway-deployment.yaml # Kubernetes Deployment for gateway
│ ├── gateway-service.yaml # Kubernetes Service for gateway (LoadBalancer / ingress entrypoint)
│ └── test.py # EKS-related tests / helper script(s)
│
├── k8s/
│ ├── deployment.yaml # Kubernetes Deployment for the inference service
│ ├── service.yaml # Kubernetes Service exposing the API
│ ├── hpa.yaml # Horizontal Pod Autoscaler for scaling under load
│ └── load_test.py # Load testing script for the deployed service
│
├── plots/ # Training curves and confusion matrices for the ML models
├── img/ # README assets (logo/illustrations)
├── pyproject.toml # Python project config + dependencies
├── uv.lock # Locked dependency versions (uv)
└── README.md # Project overview, setup, usage, deployment notes
Docker Compose provides a simple and deterministic way to run multi-container applications on a single host. In this project, it is used to validate the full inference stack end-to-end (gateway routing, model loading, request/response schema, and container networking) in a production-like environment without Kubernetes. This same container-first approach transfers cleanly to AWS EKS later.
The deployment uses a lightweight FastAPI gateway for request routing and a BentoML model service for inference. BentoML is an open-source framework designed to package model execution code, runtime dependencies, and serving logic into a reproducible, containerized service. In this project, BentoML exposes the ONNX Runtime inference pipeline as a production-grade HTTP API, decoupling model training from deployment and runtime concerns.
Architecture overview. This deployment contains two services:
- Gateway service (FastAPI, port 9696)
- Receives HTTP requests from the client (e.g., website or curl).
- Performs lightweight request validation (optional).
- Forwards the request to the model service via an internal Docker network.
- Model service (BentoML + ONNX Runtime, port 3000)
- Loads the exported ONNX model at startup.
- Runs domain-specific sequence preprocessing (tokenization, attention mask, padding).
- Executes inference via ONNX Runtime.
- Applies postprocessing (softmax, class mapping) and returns a JSON response.
Run the model service locally (without Docker)
This is useful for validating ONNX inference and the BentoML endpoint before containerizing.
bentoml serve docker_compose.service:GeneFunctionService --port 3000
Test in another terminal:
curl -X POST http://localhost:3000/predict -H "Content-Type: application/json" -d '{"sequence":"AACGGCTC"}'
Build and run the model service in Docker
Build/run locally:
# Build the model image (run from repository root):
docker build -t genetype-bento:001 -f docker_compose/image-model.dockerfile .
# Run the container:
docker run --rm -p 3000:3000 genetype-bento:001
Test it:
curl -X POST http://localhost:3000/predict \
-H "Content-Type: application/json" \
-d '{"sequence":"AACGGCTC"}'
Build the gateway image
The gateway forwards requests to the model service using:
MODEL_URL=http://onnx-model:3000/predict
Build the gateway image (run from repository root; adjust the Dockerfile path if needed):
docker build -t genetype-gateway:001 -f docker_compose/image-gateway.dockerfile .
Run the full stack with Docker Compose
Start both containers and their internal network:
docker-compose -f docker_compose/docker-compose.yaml up --build
Test end-to-end via the gateway:
curl -X POST http://localhost:9696/predict \
-H "Content-Type: application/json" \
-d '{"sequence":"AACGGCTC"}'
You should receive the JSON response produced by BentoML (predictions + top_class + top_probability).
Useful commands:
- docker-compose up: run Docker Compose
- docker-compose up -d: run Docker Compose in detached mode
- docker ps: list the running containers
- docker-compose down: stop Docker Compose
Here, I'll create an Elastic Kubernetes Service (EKS) cluster on Amazon using the CLI, publish images to ECR, and configure kubectl.
What needs to be installed locally:
- aws CLI configured for your AWS account (command line tool for working with AWS services)
- kubectl (it manages Kubernetes objects within your Amazon EKS clusters)
- eksctl CLI (it interacts with AWS to create, modify, and delete Amazon EKS clusters)
- helm (popular package manager for kubernetes)
- access to an EKS cluster (existing or newly created)
To create and manage a cluster on EKS, I'll use the CLI tool eksctl, which can be downloaded from the eksctl releases page. Next, let's follow these steps:
1. In the eks folder, create the EKS config file eks-config.yaml:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: genetype-ml-eks
  region: eu-west-1
nodeGroups: # for our case, we need only one node group (CPU)
  - name: ng-m5-xlarge
    instanceType: m5.xlarge
    desiredCapacity: 1
Create the EKS cluster:
eksctl create cluster -f eks-config.yaml
2. Push local BentoML images to Amazon ECR
Create aws ecr repository for eks cluster (one-time):
aws ecr create-repository --repository-name genetype-bento-images --region eu-west-1
Bash commands to run in the terminal to push Docker images to the ECR repository:
# Registry URI
ACCOUNT_ID=XXXXXXXXXX
REGION=eu-west-1
REGISTRY_NAME=genetype-bento-images
PREFIX=${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REGISTRY_NAME}
# Tag local docker images to remote tag
GATEWAY_LOCAL=genetype-gateway:001 # gateway service
GATEWAY_REMOTE=${PREFIX}:genetype-gateway-001 # notice the ':' is replaced with '-' before 001
docker tag ${GATEWAY_LOCAL} ${GATEWAY_REMOTE}
MODEL_LOCAL=genetype-bento:001 # ml model
MODEL_REMOTE=${PREFIX}:genetype-bento-001 # same thing ':' is replaced with '-' before genetype-bento
docker tag ${MODEL_LOCAL} ${MODEL_REMOTE}
# Push tagged docker images
docker push ${MODEL_REMOTE}
docker push ${GATEWAY_REMOTE}
Note: log in to ECR before pushing (with AWS CLI v2; the older `aws ecr get-login` command is deprecated):
aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${PREFIX}
Push the model image first, then the gateway image.
Get the URIs of these images:
echo ${MODEL_REMOTE}
echo ${GATEWAY_REMOTE}
and add them to model-deployment.yaml and gateway-deployment.yaml respectively.
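The tag-rewriting convention above (the local ':001' suffix becomes '-001' inside the single ECR repository tag) can be captured in a small helper. Illustrative only; the account ID below is a made-up placeholder:

```python
def ecr_remote_tag(local_image: str, account_id: str, region: str,
                   repo: str = "genetype-bento-images") -> str:
    """Map a local image tag like 'genetype-gateway:001' to its ECR tag.

    The ':' before the version is replaced with '-', because both images
    live in one ECR repository and are distinguished by tag alone.
    """
    name, _, version = local_image.partition(":")
    prefix = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}"
    return f"{prefix}:{name}-{version}"

# Hypothetical 12-digit account ID, for illustration only.
print(ecr_remote_tag("genetype-gateway:001", "123456789012", "eu-west-1"))
# 123456789012.dkr.ecr.eu-west-1.amazonaws.com/genetype-bento-images:genetype-gateway-001
```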
Apply all the yaml config files to remote node coming from eks (kubectl get nodes):
kubectl apply -f model-deployment.yaml
kubectl apply -f model-service.yaml
kubectl apply -f gateway-deployment.yaml
kubectl apply -f gateway-service.yaml
Testing the deployment pods and services should give us predictions.
Check nodes:
kubectl get nodes
My example output:
NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-24-82.eu-west-1.compute.internal   Ready    <none>   15m   v1.32.9-eks-ecaa3a6
kubectl get service
My example output:
NAME             TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)        AGE
gateway          LoadBalancer   10.100.212.63   a180c213e0dc94bf3b569493738d5e85-1066055406.eu-west-1.elb.amazonaws.com   80:31910/TCP   5m15s
genetype-bento   ClusterIP      10.100.68.77    <none>                                                                    8500/TCP       5m20s
kubernetes       ClusterIP      10.100.0.1      <none>                                                                    443/TCP        24m
Executing kubectl get service gives us the external address, which we need to add to test.py as the access URL for predictions (e.g., url = 'http://a3399e***-5180***.ap-south-123.elb.amazonaws.com/predict').
Note: this load balancer is now open to everyone! Remember to restrict it.
Test it!
python3 eks/test.py
And the output:
{
"predictions": {
"BIOLOGICAL_REGION": 0.00,
"OTHER": 1.57,
"PROTEIN_CODING": 89.0,
"PSEUDO": 0.10,
"ncRNA": 8.56,
"snoRNA": 0.75,
"tRNA": 0.02
},
"top_class": "PROTEIN_CODING",
"top_probability": 89.0
}
To delete the remote cluster:
eksctl delete cluster --name genetype-ml-eks
Install kubectl on Linux:
cd
mkdir bin && cd bin
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
cd
export PATH="${PATH}:${HOME}/bin"
# add it to .bashrc
# verify installation
which kubectl
kubectl version --client
Install Kind. Kind (Kubernetes in Docker) allows you to run Kubernetes clusters locally using Docker containers.
curl -Lo ${HOME}/bin/kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
chmod +x ${HOME}/bin/kind
Install uv:
pip install uv
Create a Kind Cluster. Let's create a local Kubernetes cluster:
kind create cluster --name seq2genetype
This command will:
- Create a single-node Kubernetes cluster
- Configure kubectl to use this cluster
- Take a few minutes on first run (downloads images)
Verify the cluster is running:
kubectl cluster-info
kubectl get nodes
We should see one node in "Ready" status.
We will use the gene-type Transformer model trained earlier, already converted to ONNX format. The model predicts one of the gene-type classes from a raw DNA sequence.
Download the ONNX model:
mkdir service
cd service
# Use the raw URLs (the GitHub blob pages return HTML, not the model files):
wget https://raw.githubusercontent.com/katwre/Genetype-classifier-api/main/models/transformer_classifier.onnx -O transformer_classifier.onnx
wget https://raw.githubusercontent.com/katwre/Genetype-classifier-api/main/models/transformer_classifier.onnx.data -O transformer_classifier.onnx.data
Here, I'll create a FastAPI application that serves the ONNX model for inference (prediction).
Initialize the project with uv:
uv init
rm main.py
uv add fastapi uvicorn onnxruntime numpy requests
FastAPI application was created in service/app.py. To test locally, run the service:
uv run uvicorn service.app:app --host 0.0.0.0 --port 8080 --reload
Open http://localhost:8080/docs to see the API documentation.
And then test the service with:
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"sequence": "ACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTA"}'
It's going to return:
{"predictions":{"BIOLOGICAL_REGION":0.04,"OTHER":14.28,"PROTEIN_CODING":0.0,"PSEUDO":0.9,"ncRNA":77.85,"snoRNA":6.84,"tRNA":0.08},"top_class":"ncRNA","top_probability":77.85}
Let's containerize the application.
Build the image:
docker build -t gene-type-classifier:v1 .
Test the container locally:
docker run -it --rm -p 8080:8080 gene-type-classifier:v1
In another terminal, run the test script:
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"sequence": "ACGTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTA"}'
Kind clusters run in Docker, so they can't access images from the local Docker daemon by default. I need to load the image into Kind.
kind load docker-image gene-type-classifier:v1 --name seq2genetype
Understanding Kubernetes resources:
- Pod: The smallest deployable unit in Kubernetes (one or more containers)
- Deployment: Manages a set of identical Pods, handles updates and scaling
- Service: Exposes Pods to network traffic
- HPA (Horizontal Pod Autoscaler): Automatically scales Pods based on metrics
Key configuration in k8s/deployment.yaml:
- replicas: 2 - Run 2 copies of our service
- imagePullPolicy: Never - Use local image (don't pull from registry)
- resources - Memory and CPU limits/requests
- livenessProbe - Restart container if unhealthy
- readinessProbe - Only send traffic when ready
Deploy it:
kubectl apply -f k8s/deployment.yaml
Check the deployment:
kubectl get deployments
kubectl get pods
kubectl describe deployment gene-type-classifier
View logs:
kubectl logs -l app=gene-type-classifier --tail=20
Key configuration in k8s/service.yaml:
- type: NodePort - Expose on a static port on each node
- nodePort: 30080 - Accessible on port 30080 from host
- selector - Routes traffic to Pods with matching labels
Deploy it:
kubectl apply -f k8s/service.yaml
Check the service:
kubectl get services
kubectl describe service gene-type-classifier
With NodePort, the service is accessible on localhost:30080:
Check the health endpoint:
curl http://localhost:30080/health
Our Kind cluster is not configured for NodePort, so that won't work. We don't really need it for local testing, so let's use a quick fix: kubectl port-forward. It starts a temporary TCP tunnel from our local machine to the Kubernetes Service (mapping local port 30080 to service port 8080). Visually: curl localhost:30080 -> kubectl (port-forward) -> Kubernetes API -> Service (gene-type-classifier) -> Pod :8080.
kubectl port-forward service/gene-type-classifier 30080:8080
Now it's accessible on port 30080:
curl http://localhost:30080/health
When we deploy to EKS or another cloud Kubernetes service, this won't be a problem: an Elastic Load Balancer will expose the service externally.
Kubernetes can automatically scale your application based on CPU or memory usage. First, we need metrics-server for HPA to work. Install it with kubectl:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
For Kind, we need to patch metrics-server to work without TLS:
kubectl patch -n kube-system deployment metrics-server --type=json -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
Wait for metrics-server to be ready:
kubectl get deployment metrics-server -n kube-system
We should see something like this:
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1 1 72s
The file k8s/hpa.yaml contains a configuration to:
- Scale between 2 and 10 replicas
- Target 50% CPU utilization
- Automatically add or remove Pods based on load
To deploy HPA:
kubectl apply -f k8s/hpa.yaml
Check HPA status:
kubectl get hpa
kubectl describe hpa gene-type-classifier-hpa
Generate load to trigger autoscaling. We can use a simple load test (see load_test.py).
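load_test.py is not reproduced here, but a simple load generator can look like the sketch below (stdlib only; the URL, request count, and concurrency are assumptions — adjust them to your setup):

```python
import concurrent.futures
import json
import math
import time
import urllib.request

URL = "http://localhost:30080/predict"  # assumed endpoint (via port-forward)
PAYLOAD = json.dumps({"sequence": "AACGGCTC"}).encode()

def one_request(_):
    """Send a single POST and return its latency in seconds."""
    req = urllib.request.Request(
        URL, data=PAYLOAD, headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start

def percentile(values, p):
    """p-th percentile (nearest-rank) of a list of latencies."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(p / 100 * len(ordered)) - 1)]

def run_load_test(n=500, workers=20):
    """Fire n requests with a thread pool to drive CPU usage up."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one_request, range(n)))

# Uncomment to run against a live deployment:
# lats = run_load_test()
# print(f"p50={percentile(lats, 50):.3f}s  p95={percentile(lats, 95):.3f}s")
```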
First, check that you can access the endpoint:
curl http://localhost:30080/health
Run the test:
uv run python k8s/load_test.py
While running the load test, watch the HPA in another terminal:
kubectl get hpa -w
We should see the number of replicas increase as CPU usage rises.
Check Pods:
kubectl get pods -w
If we make changes to the code:
- Rebuild the image with a new tag:
docker build -t gene-type-classifier:v2 .
- Load to Kind:
kind load docker-image gene-type-classifier:v2 --name seq2genetype
- Update deployment:
kubectl set image deployment/gene-type-classifier gene-type-classifier=gene-type-classifier:v2
Or update the YAML file and apply:
kubectl apply -f k8s/deployment.yaml
Watch the rollout:
kubectl rollout status deployment/gene-type-classifier
Scale to 5 replicas:
kubectl scale deployment gene-type-classifier --replicas=5
All Pods:
kubectl logs -l app=gene-type-classifier --tail=50
Specific Pod:
kubectl logs <pod-name>
Follow logs:
kubectl logs -f -l app=gene-type-classifier
Describe resources:
kubectl describe deployment gene-type-classifier
kubectl describe pod <pod-name>
kubectl describe service gene-type-classifier
Get events:
kubectl get events --sort-by='.lastTimestamp'
Execute commands in a Pod:
kubectl exec -it <pod-name> -- /bin/bash
Delete the deployment and service:
kubectl delete -f k8s/deployment.yaml
kubectl delete -f k8s/service.yaml
kubectl delete -f k8s/hpa.yaml
Or delete everything at once:
kubectl delete all -l app=gene-type-classifier
kubectl delete hpa gene-type-classifier-hpa
Delete the Kind cluster:
kind delete cluster --name seq2genetype



