fix(kubernetes): set explicit MTU for Cilium in tenant clusters #2147

Merged
kvaps merged 1 commit into cozystack:main from IvanHunters:fix/cilium-mtu-kubevirt-tenant on Mar 6, 2026

IvanHunters (Collaborator) commented on Mar 3, 2026

Summary

  • Set explicit MTU 1350 for Cilium in KubeVirt-based tenant Kubernetes clusters to prevent packet drops caused by VXLAN encapsulation overhead

Problem

Cilium's MTU auto-detection does not account for VXLAN overhead when running inside KubeVirt VMs. The VM network interface inherits MTU 1400 from the parent cluster's OVN/Geneve overlay (1500 - 100 Geneve overhead). Cilium detects this MTU and applies it to all tunnel interfaces without subtracting the 50-byte VXLAN encapsulation overhead.

This results in:

  • Large packets (> 1350 bytes) being silently dropped when crossing VXLAN tunnels between nodes
  • Intermittent connectivity issues for services in tenant clusters (TLS handshakes, HTTP responses with data)
  • HTTP 499 errors and timeouts observed under load
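
The drop condition can be illustrated with a small Python sketch. It assumes only the figures quoted in this description (a 1400-byte VM interface MTU and 50 bytes of VXLAN overhead); the function name is hypothetical and this is not Cilium's own logic.

```python
VM_LINK_MTU = 1400     # MTU the VM interface inherits from the parent overlay
VXLAN_OVERHEAD = 50    # bytes added when Cilium encapsulates a pod packet


def survives_tunnel(inner_packet_size: int) -> bool:
    """Return True if the encapsulated packet still fits the VM interface MTU."""
    return inner_packet_size + VXLAN_OVERHEAD <= VM_LINK_MTU


for size in (576, 1350, 1351, 1400):
    outcome = "delivered" if survives_tunnel(size) else "dropped"
    print(f"{size}-byte pod packet -> {size + VXLAN_OVERHEAD} bytes on the wire: {outcome}")
```

Small packets still get through in this model, which is consistent with the connectivity problems being intermittent rather than total.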

Fix

Explicitly set MTU: 1350 (1400 - 50 VXLAN overhead) in the default Cilium values for tenant clusters. This value can still be overridden via addons.cilium.valuesOverride if needed.

Test plan

  • Deploy a tenant Kubernetes cluster and verify Cilium interfaces use MTU 1350
  • Verify large packet connectivity from pods inside the tenant cluster
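
One possible way to run the second check (not specified in this PR) is to ping across nodes with fragmentation forbidden. The sketch below computes the ICMP payload size that yields an IPv4 packet of a given size and builds the matching iputils `ping` command. The helper names and the target address are placeholders, and the 28-byte figure assumes an IPv4 header without options.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
ICMP_HEADER = 8    # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that yields an IPv4 packet of exactly `mtu` bytes."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(target: str, mtu: int) -> str:
    """Build an iputils ping command that forbids fragmentation (-M do)."""
    return f"ping -M do -c 3 -s {icmp_payload_for_mtu(mtu)} {target}"


# 10.0.0.10 is a placeholder for a pod IP on another node, not an address from the PR.
print(ping_command("10.0.0.10", 1350))  # largest packet that should pass with MTU 1350
print(ping_command("10.0.0.10", 1400))  # should be rejected once the pod MTU is 1350
```

Running the first command from a pod and getting replies, while the second reports that the message is too long, would suggest the 1350-byte MTU is in effect.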

Summary by CodeRabbit

  • Chores
    • Set an explicit Cilium MTU of 1350 in the tenant Kubernetes cluster deployment configuration.

Cilium's MTU auto-detection does not account for VXLAN overhead when
running inside KubeVirt VMs. The VM interface inherits MTU 1400 from
the parent OVN/Geneve overlay, and Cilium sets all interfaces
(cilium_vxlan, lxc*, cilium_host/net) to 1400 without subtracting
the 50-byte VXLAN encapsulation overhead.

This causes intermittent packet drops for large packets (TLS
handshakes, HTTP responses with data), resulting in timeouts and
499 errors for services running in tenant clusters.

Set MTU to 1350 (1400 - 50 VXLAN overhead) explicitly in the default
Cilium values for tenant Kubernetes clusters.

Signed-off-by: IvanHunters <[email protected]>

coderabbitai bot commented Mar 3, 2026

Caution: Review failed. The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 1429b94 and 79b2546.

📒 Files selected for processing (1)
  • packages/apps/kubernetes/templates/helmreleases/cilium.yaml

Walkthrough

A single configuration parameter was added to the Cilium HelmRelease template, specifying an MTU value of 1350 in the cilium values block. This minimal change affects the network settings applied during HelmRelease rendering.

Changes

Cohort / File(s) | Summary
Cilium HelmRelease Configuration (packages/apps/kubernetes/templates/helmreleases/cilium.yaml) | Added the MTU: 1350 parameter to the cilium values block to configure the network maximum transmission unit size.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes


@gemini-code-assist (Contributor)

Summary of Changes

This pull request resolves critical connectivity issues within tenant Kubernetes clusters by adjusting the Maximum Transmission Unit (MTU) for Cilium. By setting a specific MTU value, it prevents silent packet drops that occurred due to the overhead introduced by VXLAN encapsulation, thereby improving network stability and reliability for services running in these environments.

Highlights

  • Cilium MTU Configuration: Explicitly set the MTU for Cilium to 1350 in KubeVirt-based tenant Kubernetes clusters. This addresses packet drops caused by VXLAN encapsulation overhead, which was not accounted for by Cilium's auto-detection.

Changelog
  • packages/apps/kubernetes/templates/helmreleases/cilium.yaml
    • Added an explicit MTU setting of 1350 for Cilium.

gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses packet drops in KubeVirt-based tenant clusters by setting an explicit MTU of 1350 for Cilium. This change accounts for the VXLAN encapsulation overhead missed by auto-detection, resolving the described connectivity issues. I've added a suggestion to include a comment documenting the reason for this specific MTU value to improve long-term maintainability.

k8sServiceHost: {{ .Release.Name }}.{{ .Release.Namespace }}.svc
k8sServicePort: 6443
routingMode: tunnel
MTU: 1350

Severity: medium

While the value 1350 is correct according to the pull request description, it appears as a 'magic number' in the code. For better long-term maintainability, it's good practice to add a comment explaining how this value is derived (i.e., 1400 MTU from the host - 50 bytes for VXLAN overhead).

  # Set MTU to 1350 to account for VXLAN overhead (50 bytes) in KubeVirt VMs,
  # which inherit a 1400 MTU from the host cluster's overlay network.
  MTU: 1350

dosubot bot added the lgtm label (This PR has been approved by a maintainer) on Mar 6, 2026
kvaps marked this pull request as ready for review on March 6, 2026 08:10
dosubot bot added the size:XS label (This PR changes 0-9 lines, ignoring generated files.) on Mar 6, 2026
kvaps merged commit 2731972 into cozystack:main on Mar 6, 2026
7 of 8 checks passed
dosubot bot added the bug label (Something isn't working) on Mar 6, 2026
Labels

  • bug: Something isn't working
  • lgtm: This PR has been approved by a maintainer
  • size:XS: This PR changes 0-9 lines, ignoring generated files.

2 participants