feat(helm): Medcat service - support custom init container for model downloading (#51)

alhendrickson · web-flow · commit e7727420acca · 2026-03-05T14:32:40.000Z
diff --git a/deployment/kubernetes/charts/medcat-service-helm/README.md b/deployment/kubernetes/charts/medcat-service-helm/README.md
@@ -5,61 +5,76 @@ This Helm chart deploys the MedCAT service to a Kubernetes cluster.
 ## Installation
 
 ```sh
-helm install my-medcat-service oci://registry-1.docker.io/cogstacksystems/medcat-service-helm
+helm install medcat-service-helm oci://registry-1.docker.io/cogstacksystems/medcat-service-helm
 ```
 
+## Usage
+For local testing, by default you can port forward the service using this command:
+
+```sh
+kubectl port-forward svc/medcat-service-helm 5000:5000
+```
+
+Then navigate to http://localhost:5000 to try the service. You can also open http://localhost:5000/docs to view the REST API documentation.
+
+
 ## Configuration
 
+To configure medcat service, create a values.yaml file and install with helm.
+
+### Model Pack
 You should specify a model pack to be used by the service. By default it will use a small bundled model, which can be used for testing
 
 ---
-### Option 1: Use the demo model pack
+#### Default: Use the demo model pack
 
 There is a model pack already bundled into medcat service, and is the default in this chart.
 
 This pack is only really used for testing, and has just a few concepts built in. 
 
-###  Option 2: Download Model on Startup
+#### Recommended: Download Model on Startup
 
 Enable MedCAT to download the model from a remote URL on container startup.
 
-Create a values file like `values-model-download.yaml` and update the env vars with: 
+Create a values file like `values-model-download.yaml` and set these values:
 ```yaml
-env:
-  ENABLE_MODEL_DOWNLOAD: "true"
-  MODEL_NAME: "medmen"
-  MODEL_VOCAB_URL: "https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/vocab.dat"
-  MODEL_CDB_URL: "https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/cdb-medmen-v1.dat"
-  MODEL_META_URL: "https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/mc_status.zip"
-  APP_MODEL_CDB_PATH: "/cat/models/medmen/cdb.dat"
+model:
+  downloadUrl: "http://localhost:9000/models/my-model.zip"
+  name: my-model.zip
 ```
 
 Use this if you prefer dynamic loading of models at runtime.
 
-### Option 3: Get a model into a k8s volume, and mount it
+#### Advanced: Create a custom volume and load a model into it
 
 The service can use a model pack if you want to setup your own download flow. For example, setup an initContainer pattern that downloads to a volume, then mount the volume yourself.
 
-Use this env variable to point to the file:
+1. Create a persistent volume and PVC in Kubernetes following the official documentation. Alternatively, specify it in `values.extraManifests` and it will be created.
+
+2. Create a values file like the following, which mounts the volume, and defines a custom init container.
 
-Create a values file like `values-model-pack.yaml` and update the env vars with: 
 ```yaml
 env:
-  # This defines the Model Pack used by the medcat service
-  APP_MEDCAT_MODEL_PACK: "/cat/models/examples/example-medcat-v1-model-pack.zip"
-```
+  APP_MEDCAT_MODEL_PACK: "/my/models/custom-model.zip"
+volumeMounts:
+  - name: model-volume
+    mountPath: /my/models
+
+volumes:
+- name: model-volume
+  persistentVolumeClaim:
+    claimName: my-custom-pvc
+extraInitContainers:
+ - name: model-downloader
+   image: busybox:1.28
+   # In this command, you can write custom code required to download a file. For example you could configure authentication.
+   command: ["sh", "-c", "wget -O /my/models/custom-model.zip http://example.com"]
+   volumeMounts:
+    - name: model-volume
+      mountPath: /my/models
 
-## Example
-
-```sh
-helm install my-medcat ./medcat-chart -f values-model-pack.yaml
 ```
 
-or
-
-```sh
-helm install my-medcat ./medcat-chart -f values-model-download.yaml
-```
 
 ### DeID Mode
 
@@ -73,7 +88,7 @@ env:
 ```
 
 
-## GPU Support
+### GPU Support
 
 To run MedCAT Service with GPU acceleration, use the GPU-enabled image and set the pod runtime class accordingly.
 
@@ -101,7 +116,7 @@ env:
 > - [NVIDIA GPU Feature Discovery](https://github.com/NVIDIA/gpu-feature-discovery)
 > - The [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/)
 
-### Test GPU support
+#### Test GPU support
 You can verify that the MedCAT Service pod has access to the GPU by executing `nvidia-smi` inside the pod.
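
The README now says to create a values file and install with Helm, but this commit drops the old `## Example` commands. As a rough sketch (not part of the commit), the "Recommended" download option could be applied as below. The values keys and the OCI chart reference come from the README above; `RUN_INSTALL` is a hypothetical guard added here so the script can be run without touching a cluster.

```sh
#!/bin/sh
# Sketch only: write the values file from the "Recommended" option and install with it.
# The downloadUrl is the README's placeholder; replace it with a reachable model pack URL.
cat > values-model-download.yaml <<'EOF'
model:
  downloadUrl: "http://localhost:9000/models/my-model.zip"
  name: my-model.zip
EOF

# RUN_INSTALL is a hypothetical guard (not a Helm feature) so nothing is installed by accident.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
  helm install medcat-service-helm \
    oci://registry-1.docker.io/cogstacksystems/medcat-service-helm \
    -f values-model-download.yaml
else
  echo "values-model-download.yaml written; set RUN_INSTALL=1 to install"
fi
```

The same `-f` flag would apply to a values file for the "Advanced" option.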
 
 
diff --git a/deployment/kubernetes/charts/medcat-service-helm/ci/default-values.yaml b/deployment/kubernetes/charts/medcat-service-helm/ci/default-values.yaml
@@ -0,0 +1 @@
+# Empty values file to run CI tests against the defaults
diff --git a/deployment/kubernetes/charts/medcat-service-helm/ci/initcontainer-values.yaml b/deployment/kubernetes/charts/medcat-service-helm/ci/initcontainer-values.yaml
@@ -0,0 +1,38 @@
+# This values file is used to test the init container functionality
+# It shows using an initContainer and a volume to run custom code to provide a model pack
+env:
+  APP_MEDCAT_MODEL_PACK: "/my/models/custom-model.zip"
+volumeMounts:
+  - name: model-volume
+    mountPath: /my/models
+volumes:
+  - name: model-volume
+    persistentVolumeClaim:
+      claimName: medcat-model-pvc
+
+extraInitContainers:
+  - name: custom-init-container
+    image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+    # For demo purposes - copy the example model pack and reference it under a new path.
+    # This demonstrates that we can run any custom code desired to get a model pack.
+    command:
+      [
+        "sh",
+        "-c",
+        "cp /cat/models/examples/example-medcat-v2-model-pack.zip /my/models/custom-model.zip",
+      ]
+    volumeMounts:
+      - name: model-volume
+        mountPath: /my/models
+
+extraManifests:
+  - apiVersion: v1
+    kind: PersistentVolumeClaim
+    metadata:
+      name: medcat-model-pvc
+    spec:
+      accessModes:
+        - ReadWriteOnce
+      resources:
+        requests:
+          storage: 100Mi
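
One way to check this values file locally is to render the chart with it and look for the init container and the extra PVC in the output. This is a minimal sketch, assuming the repository root as the working directory and a local `helm` binary; the release name `medcat-test` and the output file `rendered.yaml` are hypothetical.

```sh
#!/bin/sh
# Sketch: render the chart with the init-container CI values (paths taken from this diff).
CHART_DIR="deployment/kubernetes/charts/medcat-service-helm"
VALUES_FILE="$CHART_DIR/ci/initcontainer-values.yaml"

if command -v helm >/dev/null 2>&1 && [ -d "$CHART_DIR" ]; then
  # helm template renders the manifests locally without contacting a cluster
  helm template medcat-test "$CHART_DIR" -f "$VALUES_FILE" > rendered.yaml
  # Both the custom init container and the PVC from extraManifests should appear
  grep -n "custom-init-container" rendered.yaml
  grep -n "kind: PersistentVolumeClaim" rendered.yaml
else
  echo "helm or chart directory not found; skipping render"
fi
```

Because the `image` value is passed through `tpl`, the rendered init container should show the resolved repository and tag rather than the template expression.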
diff --git a/deployment/kubernetes/charts/medcat-service-helm/templates/deployment.yaml b/deployment/kubernetes/charts/medcat-service-helm/templates/deployment.yaml
@@ -79,15 +79,17 @@ spec:
           {{- if or .Values.model.downloadUrl .Values.volumeMounts }}
           volumeMounts:
             {{- if .Values.volumeMounts }}
-            {{- toYaml .Values.volumeMounts | nindent 2 }}
+            {{- toYaml .Values.volumeMounts | nindent 12 }}
             {{- end }}
             {{- if .Values.model.downloadUrl }}
             - name: models
               mountPath: /models
             {{- end }}
           {{- end }}
-      {{- if .Values.model.downloadUrl }}
+      {{- if or .Values.extraInitContainers .Values.model.downloadUrl }}
       initContainers:
+      {{- end }}
+      {{- if .Values.model.downloadUrl }}
         - name: model-downloader
           image: busybox:1.28
           command: 
@@ -111,11 +113,14 @@ spec:
             - name: models
               mountPath: /models
       {{- end }}
-      
+      {{- $root := . -}}
+      {{- with .Values.extraInitContainers }}
+        {{- tpl (toYaml .) $root | nindent 8 }}
+      {{- end }}
       {{- if or .Values.model.downloadUrl .Values.volumes }}
       volumes:
         {{- if .Values.volumes }}
-        {{- toYaml .Values.volumes | nindent 2 }}
+        {{- toYaml .Values.volumes | nindent 8 }}
         {{- end }}
         {{- if .Values.model.downloadUrl }}
         - name: models
diff --git a/deployment/kubernetes/charts/medcat-service-helm/templates/extraManifests.yaml b/deployment/kubernetes/charts/medcat-service-helm/templates/extraManifests.yaml
@@ -0,0 +1,4 @@
+{{ range .Values.extraManifests }}
+---
+{{ tpl (toYaml .) $ }}
+{{ end }}
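
Since each entry is serialized with `toYaml` and then passed through `tpl`, template expressions inside `extraManifests` should be evaluated at render time. A hedged sketch of what that allows: the ConfigMap, its name, and the release name `demo` are all hypothetical and only illustrate the mechanism.

```sh
#!/bin/sh
# Hypothetical values file: an extra ConfigMap whose name uses a template expression.
# The heredoc delimiter is quoted so the shell leaves the content untouched.
cat > values-extra-manifests.yaml <<'EOF'
extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      # Quoted so YAML parses the template expression as a plain string
      name: "{{ .Release.Name }}-extra-config"
    data:
      note: "created through extraManifests"
EOF

# Render locally if helm and the chart are available (repository root assumed).
CHART_DIR="deployment/kubernetes/charts/medcat-service-helm"
if command -v helm >/dev/null 2>&1 && [ -d "$CHART_DIR" ]; then
  # With release name "demo", the rendered name should contain "demo-extra-config"
  helm template demo "$CHART_DIR" -f values-extra-manifests.yaml | grep -n "extra-config"
fi
```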
diff --git a/deployment/kubernetes/charts/medcat-service-helm/values.yaml b/deployment/kubernetes/charts/medcat-service-helm/values.yaml
@@ -233,3 +233,9 @@ networkPolicy:
       #           app.kubernetes.io/name: model-downloader
       #   ports:
       #     - port: 5000
+
+# Additional init containers to run before the main container. Can be templated
+extraInitContainers: []
+
+# Additional manifests to deploy to kubernetes. Can be templated
+extraManifests: []
diff --git a/docs/docs/platform/deployment/helm/charts/medcat-service-helm.md b/docs/docs/platform/deployment/helm/charts/medcat-service-helm.md
@@ -5,58 +5,125 @@ This Helm chart deploys the MedCAT service to a Kubernetes cluster.
 ## Installation
 
 ```sh
-helm install my-medcat-service oci://registry-1.docker.io/cogstacksystems/medcat-service-helm
+helm install medcat-service-helm oci://registry-1.docker.io/cogstacksystems/medcat-service-helm
 ```
 
+## Usage
+
+For local testing, by default you can port forward the service using this command:
+
+```sh
+kubectl port-forward svc/medcat-service-helm 5000:5000
+```
+
+Then navigate to http://localhost:5000 to try the service. You can also open http://localhost:5000/docs to view the REST API documentation.
+
 ## Configuration
 
+To configure medcat service, create a values.yaml file and install with helm.
+
+### Model Pack
+
 You should specify a model pack to be used by the service. By default it will use a small bundled model, which can be used for testing
 
 ---
-### Option 1: Use the demo model pack
+
+#### Default: Use the demo model pack
 
 There is a model pack already bundled into medcat service, and is the default in this chart.
 
-This pack is only really used for testing, and has just a few concepts built in. 
+This pack is only really used for testing, and has just a few concepts built in.
 
-###  Option 2: Download Model on Startup
+#### Recommended: Download Model on Startup
 
 Enable MedCAT to download the model from a remote URL on container startup.
 
-Create a values file like `values-model-download.yaml` and update the env vars with: 
+Create a values file like `values-model-download.yaml` and set these values:
+
 ```yaml
-env:
-  ENABLE_MODEL_DOWNLOAD: "true"
-  MODEL_NAME: "medmen"
-  MODEL_VOCAB_URL: "https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/vocab.dat"
-  MODEL_CDB_URL: "https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/cdb-medmen-v1.dat"
-  MODEL_META_URL: "https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/mc_status.zip"
-  APP_MODEL_CDB_PATH: "/cat/models/medmen/cdb.dat"
+model:
+  downloadUrl: "http://localhost:9000/models/my-model.zip"
+  name: my-model.zip
 ```
 
 Use this if you prefer dynamic loading of models at runtime.
 
-### Option 3: Get a model into a k8s volume, and mount it
+#### Advanced: Create a custom volume and load a model into it
 
 The service can use a model pack if you want to setup your own download flow. For example, setup an initContainer pattern that downloads to a volume, then mount the volume yourself.
 
-Use this env variable to point to the file:
+1. Create a persistent volume and PVC in Kubernetes following the official documentation. Alternatively, specify it in `values.extraManifests` and it will be created.
+
+2. Create a values file like the following, which mounts the volume, and defines a custom init container.
 
-Create a values file like `values-model-pack.yaml` and update the env vars with: 
 ```yaml
 env:
-  # This defines the Model Pack used by the medcat service
-  APP_MEDCAT_MODEL_PACK: "/cat/models/examples/example-medcat-v1-model-pack.zip"
+  APP_MEDCAT_MODEL_PACK: "/my/models/custom-model.zip"
+volumeMounts:
+  - name: model-volume
+    mountPath: /my/models
+
+volumes:
+  - name: model-volume
+    persistentVolumeClaim:
+      claimName: my-custom-pvc
+extraInitContainers:
+  - name: model-downloader
+    image: busybox:1.28
+    # In this command, you can write custom code required to download a file. For example you could configure authentication.
+    command:
+      ["sh", "-c", "wget -O /my/models/custom-model.zip http://example.com"]
+    volumeMounts:
+      - name: model-volume
+        mountPath: /my/models
 ```
 
-## Example
+### DeID Mode
+
+The service can perform DeID of EHRs by switching to the following values:
 
-```sh
-helm install my-medcat ./medcat-chart -f values-model-pack.yaml
 ```
+env:
+  APP_MEDCAT_MODEL_PACK: "/cat/models/examples/example-deid-model-pack.zip"
+  DEID_MODE: "true"
+  DEID_REDACT: "true"
+```
+
+### GPU Support
+
+To run MedCAT Service with GPU acceleration, use the GPU-enabled image and set the pod runtime class accordingly.
+
+Note: GPU support is only used for deidentification.
+
+Create a values file like `values-gpu.yaml` with the following content:
+
+```yaml
+image:
+  repository: ghcr.io/cogstack/medcat-service-gpu
 
-or
+runtimeClassName: nvidia
+
+resources:
+  limits:
+    nvidia.com/gpu: 1
+env:
+  APP_CUDA_DEVICE_COUNT: 1
+  APP_TORCH_THREADS: -1
+  DEID_MODE: true
+```
+
+> To use GPU acceleration, your Kubernetes cluster should be configured with the NVIDIA GPU Operator or the following components:
+>
+> - [NVIDIA device plugin for Kubernetes](https://github.com/NVIDIA/k8s-device-plugin)
+> - [NVIDIA GPU Feature Discovery](https://github.com/NVIDIA/gpu-feature-discovery)
+> - The [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/)
+
+#### Test GPU support
+
+You can verify that the MedCAT Service pod has access to the GPU by executing `nvidia-smi` inside the pod.
 
 ```sh
-helm install my-medcat ./medcat-chart -f values-model-download.yaml
+kubectl exec -it <POD_NAME> -- nvidia-smi
 ```
+
+You should see the NVIDIA GPU device listing if the GPU is properly accessible.
diff --git a/docs/docs/platform/deployment/helm/tutorial.md b/docs/docs/platform/deployment/helm/tutorial.md
