[gpu-operator] Update to v26.3.0 and add vGPU variant #2323

Draft
lexfrei wants to merge 4 commits into `main` from `feat/update-gpu-operator-v26.3.0`

Conversation


@lexfrei lexfrei commented Apr 2, 2026

What this PR does

Updates the NVIDIA GPU Operator from v25.3.0 to v26.3.0 and adds a new vgpu variant for GPU sharing between VMs using mediated devices.

Changes:

  • Update vendored upstream chart to v26.3.0 (CDI enabled by default, component version bumps)
  • Pin chart version in Makefile for reproducibility
  • Add values-talos-vgpu.yaml for vGPU mode (vGPU Manager + vGPU Device Manager)
  • Add vgpu variant to PackageSource alongside existing default (passthrough)
  • Add internal documentation for vGPU setup

The vGPU variant enables users to deploy GPU Operator with NVIDIA vGPU Manager. Users must provide their own proprietary vGPU Manager image via Package CR component values.
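As a rough sketch, selecting the variant and supplying the image coordinates through a Package CR might look like the following. The `apiVersion`, `kind`, and field layout are assumptions for illustration (the Cozystack Package CR schema is authoritative); only the `vgpu` variant name and the idea of passing vGPU Manager image coordinates come from this PR.

```yaml
# Hypothetical example; check field names against the Package CR schema.
apiVersion: cozystack.io/v1alpha1
kind: Package
metadata:
  name: gpu-operator
spec:
  variant: vgpu
  values:
    gpu-operator:
      vgpuManager:
        repository: registry.example.com/nvidia   # user-supplied private registry
        version: "550.90.05"                      # proprietary vGPU Manager image tag
```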

Release note

[gpu-operator] Update NVIDIA GPU Operator to v26.3.0 and add vGPU variant for GPU sharing between VMs

Summary by CodeRabbit

  • New Features

    • vGPU support for the GPU Operator with KubeVirt integration and a new end-to-end vGPU setup guide.
    • Kata sandbox device plugin and expanded CDI/NRI configuration options.
  • Upgrades

    • GPU Operator bumped to v26.3.0.
    • Node Feature Discovery bumped to v0.18.3.
  • Improvements

    • HTTP-based health probes, PodDisruptionBudget templates, imagePullSecrets helper, enhanced RBAC/namespace scoping, CRD schema additions (licensing secret, vGPU fields, toleration ops), and chart/values refinements.

lexfrei added 2 commits April 2, 2026 14:58
Update vendored upstream chart to v26.3.0. Key changes:
- CDI enabled by default
- Driver version 580.126.20
- NFD subchart updated to 0.18.3
- Component versions bumped (toolkit v1.19.0, device-plugin v0.19.0, dcgm-exporter 4.5.2)
- Pin chart version in Makefile for reproducibility

Assisted-By: Claude <[email protected]>
Signed-off-by: Aleksei Sviridkin <[email protected]>

Add vGPU mode as a new variant alongside the existing passthrough (default) variant.
This enables sharing a single physical GPU across multiple VMs using NVIDIA mediated devices.

Changes:
- Add values-talos-vgpu.yaml with vGPU Manager and vGPU Device Manager enabled
- Add 'vgpu' variant to PackageSource
- Add documentation for vGPU setup (driver image, NLS licensing, profiles, KubeVirt)

Users select the vgpu variant via Package CR and provide their proprietary
vGPU Manager image coordinates through component values.

Assisted-By: Claude <[email protected]>
Signed-off-by: Aleksei Sviridkin <[email protected]>

coderabbitai Bot commented Apr 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 50f62eaa-c639-4f6c-acb7-1e95d4184b65

📥 Commits

Reviewing files that changed from the base of the PR and between ab5c618 and 783b51b.

📒 Files selected for processing (1)
  • docs/gpu-vgpu.md
✅ Files skipped from review due to trivial changes (1)
  • docs/gpu-vgpu.md

📝 Walkthrough

Walkthrough

Adds vGPU support to the GPU Operator (new vgpu variant and Talos vGPU values), upgrades GPU Operator to v26.3.0 and Node Feature Discovery to v0.18.3, migrates NFD probes from gRPC to HTTP, enhances multiple CRDs/templates for licensing, vGPU/kernel-module config, CDI/NRI and kata sandbox integration, and adds documentation.

Changes

- **Documentation** (`docs/gpu-vgpu.md`): New end-to-end guide for configuring GPU Operator with NVIDIA vGPU: prerequisites, driver image build/publish, Package CR deployment, NLS licensing Secret, vGPU profile selection, vgpuDeviceManager config, and KubeVirt integration examples.
- **Platform variant** (`packages/core/platform/sources/gpu-operator.yaml`): Added a new `vgpu` `spec.variant` referencing `values-talos-vgpu.yaml`, enabling a vGPU-specific installation path alongside the existing default variant.
- **Chart pinning & metadata** (`packages/system/gpu-operator/Makefile`, `packages/system/gpu-operator/charts/gpu-operator/Chart.yaml`): Pinned the Helm pull to `--version v26.3.0`; bumped the chart appVersion/version to v26.3.0 and updated the NFD dependency to `oci://registry.k8s.io/nfd/charts` 0.18.3.
- **Node Feature Discovery (NFD) chart** (`packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/...`): Migrated NFD components from gRPC probes to HTTP `/healthz`, consolidated metrics/health ports into a single port, added `dnsPolicy` values, an `imagePullSecrets` helper, and PodDisruptionBudget templates, and updated values/README. CRDs extended with `Ge`/`Le`/`GeLe`, `type`, and rule `vars`/`varsTemplate`.
- **CRD extensions** (`.../crds/nvidia.com_clusterpolicies.yaml`, `.../crds/nvidia.com_nvidiadrivers.yaml`): Extended the ClusterPolicy and NVIDIADriver CRDs: added `vgpuManager`/`kernelModuleConfig`, licensing `secretName` (the ConfigMap field is deprecated), `secretEnv`, `podSecurityContext`, `kataSandboxDevicePlugin`, `sandboxWorkloads.mode`, NRI plugin/CDI-related fields, expanded toleration operators, and a `disabled` state.
- **Templates & rendering logic** (`packages/system/gpu-operator/charts/gpu-operator/templates/*.yaml`): Numerous template changes: namespace-scoped RBAC/ServiceAccounts, use of the `manage-crds` binary for CRD apply/delete in hooks, an added `extraObjects` template, refined ClusterPolicy rendering for new fields (vGPU/Kata/CDI), conditional NVIDIADriver/license-secret handling, simplified nodefeaturerules down to kernel-module rules, and a validation preventing NRI enablement when CDI is disabled.
- **Values & Talos vGPU overrides** (`packages/system/gpu-operator/charts/gpu-operator/values.yaml`, `packages/system/gpu-operator/values-talos-vgpu.yaml`): Major values updates: CDI enabled by default with `nriPluginEnabled: false`, introduced `sandboxWorkloads.mode`, switched many env lists to `env: []`, added `extraObjects`, consolidated NFD ports to 8080, added PodDisruptionBudgets and driver/licensing secret wiring, and added `values-talos-vgpu.yaml` enabling vgpuManager/vgpuDeviceManager and disabling the standard driver/devicePlugin for Talos vGPU setups.
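Based on the summary above, the shape of the Talos vGPU override values is roughly the following. This is an illustrative sketch only; the actual `values-talos-vgpu.yaml` in the PR is authoritative, and the key nesting is an assumption based on the upstream GPU Operator chart values.

```yaml
# Sketch of the vGPU override values described above.
driver:
  enabled: false          # standard datacenter driver is not used in vGPU mode
devicePlugin:
  enabled: false          # VMs, not pods, consume the GPU
sandboxWorkloads:
  enabled: true           # assumption: enables the sandbox (VM) workload path
vgpuManager:
  enabled: true           # image coordinates are supplied by the user
vgpuDeviceManager:
  enabled: true           # creates mediated devices from selected vGPU profiles
```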

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Admin
    participant BuildServer
    participant Registry
    participant GPU_Operator
    participant NLS as "NVIDIA License Server"
    participant KubeVirt

    Admin->>BuildServer: Provide NVIDIA .run files & EULA
    BuildServer->>Registry: Build & push vgpu-manager driver image
    Admin->>GPU_Operator: Install Package CR with variant=vgpu (values-talos-vgpu)
    GPU_Operator->>Registry: Pull vgpu-manager image
    GPU_Operator->>NLS: Reference licensing Secret (gridd.conf) -> configure licensing
    Admin->>KubeVirt: Create VirtualMachine spec referencing vGPU resource
    KubeVirt->>GPU_Operator: Request mediated device allocation on host
    GPU_Operator->>Registry: Ensure driver/kernel modules loaded (vgpuManager)
    GPU_Operator->>KubeVirt: Expose vGPU resource to VM (permittedHostDevices/mediatedDeviceTypes)
```
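The final step involves KubeVirt's host-device wiring (`permittedHostDevices`/`mediatedDeviceTypes`). A hedged sketch of that configuration follows; the mdev selector and resource name are examples for an A10 profile, not values taken from this PR, and the exact names depend on the GPU model and the profile selected via the vGPU Device Manager.

```yaml
# Illustrative only: profile and resource names vary by GPU model and profile.
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    permittedHostDevices:
      mediatedDevices:
        - mdevNameSelector: "NVIDIA A10-4Q"          # example vGPU profile
          resourceName: "nvidia.com/NVIDIA_A10-4Q"   # requested by VM specs
```

A VirtualMachine then requests the device by name under `spec.template.spec.domain.devices.gpus`.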

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐇 I nibble bytes and build a crate,
I stitch vGPU to the cluster gate.
From driver image to license key,
I hop through pods so VMs roam free.
Hooray — kernels, secrets, configs done; let rabbits dance beneath the sun!

🚥 Pre-merge checks: ✅ 3 passed

- Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
- Title Check: ✅ Passed. The title '[gpu-operator] Update to v26.3.0 and add vGPU variant' accurately and concisely summarizes the main changes: upgrading the GPU Operator chart to v26.3.0 and introducing a new vGPU variant for GPU sharing. It is specific, clear, and reflects the primary objectives of the PR.
- Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; the check was skipped.



@lexfrei lexfrei self-assigned this Apr 2, 2026

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces NVIDIA vGPU support, updates the GPU Operator to version v26.3.0, and upgrades Node Feature Discovery (NFD) to 0.18.3. Key changes include the addition of a vgpu package variant, transition to HTTP-based health probes in NFD, and the adoption of a dedicated utility for CRD management. Feedback suggests updating the documentation to use Secrets for licensing configuration to align with security best practices and removing the deprecated cdi.default field from the ClusterPolicy template.

Comment thread docs/gpu-vgpu.md Outdated
Comment on lines +79 to +104
1. Create a ConfigMap with the NLS client configuration in the `cozy-gpu-operator` namespace:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: licensing-config
  namespace: cozy-gpu-operator
data:
  gridd.conf: |
    ServerAddress=nls.example.com
    ServerPort=443
    FeatureType=1
```

2. Reference the ConfigMap in the Package values:

```yaml
gpu-operator:
  vgpuManager:
    repository: registry.example.com/nvidia
    version: "550.90.05"
  driver:
    licensingConfig:
      configMapName: licensing-config
```


Severity: medium

The example uses ConfigMap and configMapName for licensing configuration. However, the updated CRDs (nvidia.com_clusterpolicies.yaml and nvidia.com_nvidiadrivers.yaml) mark configMapName as deprecated in favor of secretName. It is recommended to update the documentation to use a Secret for licensing configuration to align with the new best practices for security.
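Following that recommendation, a Secret-based equivalent of the doc's example might look like the sketch below (reusing the same hypothetical NLS endpoint as the original snippet):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: licensing-config
  namespace: cozy-gpu-operator
type: Opaque
stringData:
  gridd.conf: |
    ServerAddress=nls.example.com
    ServerPort=443
    FeatureType=1
```

The Package values would then reference it via `driver.licensingConfig.secretName: licensing-config` instead of `configMapName`.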

Comment on lines +152 to +154
```yaml
{{- if .Values.cdi.default }}
default: {{ .Values.cdi.default }}
{{- end }}
```


Severity: medium

The cdi.default field is deprecated in the ClusterPolicy CRD and is no longer used by the GPU Operator (setting cdi.enabled=true now automatically configures it as the default). This block should be removed to align with the updated CRD schema and avoid confusion.

Replace simplified Containerfile with NVIDIA's Makefile-based build
system from gitlab.com/nvidia/container-images/driver. The GPU Operator
expects pre-compiled kernel modules, not a raw .run file.

Add EULA notice about public redistribution of vGPU driver images.
Add note about NLS ServerPort being deployment-dependent.

Assisted-By: Claude <[email protected]>
Signed-off-by: Aleksei Sviridkin <[email protected]>
@lexfrei lexfrei marked this pull request as ready for review April 2, 2026 12:51
@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Apr 2, 2026
ConfigMap-based licensing configuration (configMapName) is deprecated
in GPU Operator v26.x. Switch to Secret-based configuration (secretName)
as recommended by the updated CRD schema.

Assisted-By: Claude <[email protected]>
Signed-off-by: Aleksei Sviridkin <[email protected]>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/gpu-vgpu.md`:
- Around line 81-109: The doc incorrectly instructs creating a ConfigMap and
using driver.licensingConfig.configMapName; update it to create a Secret (kind:
Secret) containing the gridd.conf under stringData and change the package values
example to use driver.licensingConfig.secretName: licensing-config so the
chart/schema picks up the NLS licensing configuration.

In
`@packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/post-delete-job.yaml`:
- Around line 97-100: The resources block is being rendered at the Pod spec
level using .Values.master.resources but must be moved into the nfd-master
container spec; update the template so the resources: {{- toYaml
.Values.master.resources }} block is placed inside the container definition for
"nfd-master" (e.g., immediately after that container's fields) rather than after
tolerations, ensuring the resources stanza is nested under the nfd-master
container.

In
`@packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_nvidiadrivers.yaml`:
- Around line 917-924: The CRD docs for the `operator` field currently list
`Lt`/`Gt` without noting they require the Kubernetes feature gate; update the
`operator` description under the `operator` field (in the NVIDIADriver CRD) to
state that `Lt` and `Gt` require Kubernetes v1.35+ with the
TaintTolerationComparisonOperators feature gate enabled, or alternatively update
the chart's `kubeVersion` constraint to >=1.35.0 so that the chart and the
`NVIDIADriver` spec only allow these operators on supported Kubernetes versions;
pick one approach and make the matching change to either the `operator`
description or the chart `kubeVersion`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 51f9626b-f0b3-498a-82e0-c845bb0b24ad

📥 Commits

Reviewing files that changed from the base of the PR and between 73ec5a5 and ab5c618.

⛔ Files ignored due to path filters (1)
  • packages/system/gpu-operator/charts/gpu-operator/Chart.lock is excluded by !**/*.lock
📒 Files selected for processing (32)
  • docs/gpu-vgpu.md
  • packages/core/platform/sources/gpu-operator.yaml
  • packages/system/gpu-operator/Makefile
  • packages/system/gpu-operator/charts/gpu-operator/Chart.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/Chart.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/README.md
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/crds/nfd-api-crds.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/_helpers.tpl
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/master-pdb.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/master.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-gc-pdb.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-gc.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/post-delete-job.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/prometheus.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/topologyupdater.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/worker.yaml
  • packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/values.yaml
  • packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_clusterpolicies.yaml
  • packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_nvidiadrivers.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/cleanup_crd.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/clusterpolicy.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/extra-objects.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/nodefeaturerules.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/nvidiadriver.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/operator.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/role.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/rolebinding.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/serviceaccount.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/upgrade_crd.yaml
  • packages/system/gpu-operator/charts/gpu-operator/templates/validations.yaml
  • packages/system/gpu-operator/charts/gpu-operator/values.yaml
  • packages/system/gpu-operator/values-talos-vgpu.yaml

Comment thread packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/post-delete-job.yaml Outdated
Comment on lines +97 to +100
```yaml
{{- with .Values.master.resources }}
resources:
  {{- toYaml . | nindent 8 }}
{{- end }}
```


⚠️ Potential issue | 🔴 Critical

Resources block is misplaced outside the container spec.

The resources block is rendered at the pod spec level (after tolerations) rather than inside the nfd-master container definition. This would cause a Kubernetes API validation error since resources is not a valid pod spec field.

🐛 Proposed fix to move resources inside container spec

The resources block should be inside the container definition (after line 83, inside the nfd-master container):

```diff
           args:
             - "-prune"
             {{- if .Values.master.instance | empty | not }}
             - "-instance={{ .Values.master.instance }}"
             {{- end }}
+          {{- with .Values.master.resources }}
+          resources:
+            {{- toYaml . | nindent 12 }}
+          {{- end }}
       restartPolicy: Never
       {{- with .Values.master.nodeSelector }}
       nodeSelector:
         {{- toYaml . | nindent 8 }}
       {{- end }}
       {{- with .Values.master.affinity }}
       affinity:
         {{- toYaml . | nindent 8 }}
       {{- end }}
       {{- with .Values.master.tolerations }}
       tolerations:
         {{- toYaml . | nindent 8 }}
       {{- end }}
-      {{- with .Values.master.resources }}
-      resources:
-        {{- toYaml . | nindent 8 }}
-      {{- end }}
 {{- end }}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/post-delete-job.yaml`
around lines 97 - 100, The resources block is being rendered at the Pod spec
level using .Values.master.resources but must be moved into the nfd-master
container spec; update the template so the resources: {{- toYaml
.Values.master.resources }} block is placed inside the container definition for
"nfd-master" (e.g., immediately after that container's fields) rather than after
tolerations, ensuring the resources stanza is nested under the nfd-master
container.

Comment on lines 917 to 924
```diff
       operator:
         description: |-
           Operator represents a key's relationship to the value.
-          Valid operators are Exists and Equal. Defaults to Equal.
+          Valid operators are Exists, Equal, Lt, and Gt. Defaults to Equal.
           Exists is equivalent to wildcard for value, so that a pod can
           tolerate all taints of a particular category.
+          Lt and Gt perform numeric comparisons (requires feature gate TaintTolerationComparisonOperators).
         type: string
```


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Do Kubernetes Pod tolerations support Lt and Gt operators behind the TaintTolerationComparisonOperators feature gate, and in which Kubernetes versions is that feature available or enabled by default?

💡 Result:

Yes, Kubernetes Pod tolerations support the Lt (Less Than) and Gt (Greater Than) operators behind the TaintTolerationComparisonOperators feature gate. These operators enable numeric comparisons between taint values and toleration values (both must be valid 64-bit positive integers), allowing threshold-based scheduling. The feature is available starting in Kubernetes v1.35 as an alpha feature (disabled by default). It must be explicitly enabled via the --feature-gates=TaintTolerationComparisonOperators=true flag on the API server and kube-scheduler. As of 2026-04-02 (during v1.36 development), it remains alpha and is not enabled by default in any version.


🏁 Script executed:

```bash
#!/bin/bash
# Search for Kubernetes version constraints in the repository
fd -t f -e yaml -e txt -e md | xargs grep -l "kubernetes" | head -20

# Also search for version info in common locations
if [ -f "Chart.yaml" ]; then echo "=== Chart.yaml ===" && cat Chart.yaml; fi
if [ -f "requirements.yaml" ]; then echo "=== requirements.yaml ===" && cat requirements.yaml; fi
if [ -f "README.md" ]; then echo "=== README.md (first 50 lines) ===" && head -50 README.md; fi
if [ -f "go.mod" ]; then echo "=== go.mod ===" && cat go.mod; fi

# Search for version constraints in the gpu-operator chart
if [ -d "packages/system/gpu-operator" ]; then
  echo "=== Files in gpu-operator ==="
  ls -la packages/system/gpu-operator/ | head -20
fi
```

Repository: cozystack/cozystack

Length of output: 10781


🏁 Script executed:

```bash
#!/bin/bash
# Search for references to TaintTolerationComparisonOperators or kubernetes version requirements
rg -i "kubernetes|k8s|version|feature.*gate|taint" --type yaml -A 2 -B 2 | head -100
```

Repository: cozystack/cozystack

Length of output: 13930


🏁 Script executed:

```bash
#!/bin/bash
# Check if this is a vendored chart by looking for upstream references
if [ -d "packages/system/gpu-operator/charts" ]; then
  echo "=== GPU Operator Chart Structure ==="
  find packages/system/gpu-operator/charts -type f -name "Chart.yaml" -o -name "values.yaml" | xargs head -20
fi
```

Repository: cozystack/cozystack

Length of output: 2228


Verify Lt/Gt toleration operators are supported in supported Kubernetes versions.

The GPU operator chart supports Kubernetes >= 1.16.0, but the TaintTolerationComparisonOperators feature gate is available only in Kubernetes v1.35+ and is disabled by default. Users can define valid NVIDIADriver specs with Lt/Gt operators that will be rejected by the API server on Kubernetes versions 1.16–1.34 or on v1.35+ without the feature gate explicitly enabled. Either update the CRD documentation to clarify this requirement or update the chart's kubeVersion constraint to reflect the actual compatibility.
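For reference, a toleration using the numeric operators would look like the sketch below. It is only accepted by an API server running Kubernetes v1.35+ with the alpha TaintTolerationComparisonOperators feature gate enabled on kube-apiserver and kube-scheduler; the key and value here are made up for illustration.

```yaml
tolerations:
  - key: "example.com/gpu-priority"   # hypothetical numeric taint key
    operator: "Gt"                    # numeric comparison; taint and toleration
    value: "100"                      # values must be positive 64-bit integers
    effect: "NoSchedule"
```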

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_nvidiadrivers.yaml`
around lines 917 - 924, The CRD docs for the `operator` field currently list
`Lt`/`Gt` without noting they require the Kubernetes feature gate; update the
`operator` description under the `operator` field (in the NVIDIADriver CRD) to
state that `Lt` and `Gt` require Kubernetes v1.35+ with the
TaintTolerationComparisonOperators feature gate enabled, or alternatively update
the chart's `kubeVersion` constraint to >=1.35.0 so that the chart and the
`NVIDIADriver` spec only allow these operators on supported Kubernetes versions;
pick one approach and make the matching change to either the `operator`
description or the chart `kubeVersion`.

@IvanHunters IvanHunters self-assigned this Apr 9, 2026
@lexfrei lexfrei marked this pull request as draft April 10, 2026 11:46
@Arsolitt

Testing Report: gpu-operator v26.3.0 on Talos Linux

Tested this PR on a Talos v1.12.6 cluster (3× VM nodes with NVIDIA A10 GPUs, Kubernetes v1.34.3).

Critical: Talos OS version parsing failure

The ClusterPolicy enters notReady state immediately after upgrade:

```
failed to retrieve GPU node OS tag: error processing OS major version v1: strconv.Atoi: parsing "v1": invalid syntax
```

This is a known bug in gpu-operator v26.3.0 (NVIDIA/gpu-operator#2239). The getGPUNodeOSTag() function in controllers/state_manager.go runs unconditionally during ClusterPolicy controller init for any cluster with GPU-labeled nodes — it is not gated by driver.enabled or any other ClusterPolicy setting. Since Talos sets VERSION_ID=v1.12.6 (with v prefix), strconv.Atoi fails on the major version.
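The failure is straightforward to reproduce outside the operator. The sketch below mimics the version parsing; `parseMajor` is a stand-in written for illustration, not the operator's actual function.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajor is a stand-in for the major-version parsing done in
// getGPUNodeOSTag(): take the text before the first dot and convert it.
func parseMajor(versionID string) (int, error) {
	return strconv.Atoi(strings.SplitN(versionID, ".", 2)[0])
}

func main() {
	// Talos /etc/os-release reports VERSION_ID=v1.12.6, so the major
	// component is "v1", which strconv.Atoi rejects.
	if _, err := parseMajor("v1.12.6"); err != nil {
		fmt.Println("unpatched:", err)
	}
	// Tolerating the "v" prefix before converting, as the upstream fix
	// does, succeeds.
	major, err := parseMajor(strings.TrimPrefix("v1.12.6", "v"))
	fmt.Println("patched:", major, err)
}
```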

Already fixed upstream in NVIDIA/gpu-operator#2244, released as v26.3.1 on 2026-04-18. The vendored chart should be bumped from v26.3.0 to v26.3.1.

This affects all Talos deployments regardless of variant (default or vgpu).

Secondary: nvidia-operator-validator permission error

On nodes where the validator pod was re-rolled, the driver-validation init container crashes with:

```
error creating symlink creator: failed to create device node nvidia-modeset: permission denied
```

This happens during /dev/char symlink creation. Could not fully isolate this issue since the ClusterPolicy failure above prevents normal reconciliation. Might be resolved by v26.3.1 as well, but worth verifying separately. See NVIDIA/gpu-operator#1687 for related Talos context.

Summary

The primary blocker is the OS version parsing bug — any Talos cluster with GPU nodes will hit it. Bumping the vendored chart to v26.3.1 should resolve this. The validator permission issue needs verification after the bump.

@IvanHunters IvanHunters removed their assignment Apr 22, 2026

Labels

do-not-merge enhancement New feature or request size:XXL This PR changes 1000+ lines, ignoring generated files.
