Releases: cozystack/cozystack
v1.3.0
Cozystack v1.3.0
Cozystack v1.3.0 brings storage-aware pod scheduling via a LINSTOR scheduler extender, a managed LINSTOR GUI web console with Keycloak SSO, a curated VM Default Images catalog for out-of-the-box virtual-machine provisioning, a new WorkloadsReady / Events observability surface with S3 bucket metering, and cross-namespace VMInstance backup restore with a full RestoreJob dashboard flow. The release also ships stricter tenant-name validation, VMInstance network-selector improvements, Keycloak theme injection and SMTP configuration, a host-runtime preflight check, and rolls up every fix from the v1.2.1 → v1.2.4 patch line.
Note: Items marked (backported to v1.2.x) were also shipped in v1.2.1, v1.2.2, v1.2.3, or v1.2.4 patch releases.
Feature Highlights
Storage-Aware Scheduling via the LINSTOR Extender
The cozystack-scheduler now calls a LINSTOR scheduler extender for storage-locality-aware pod placement. When a pod declares both a SchedulingClass and LINSTOR-backed PVCs, the scheduler consults LINSTOR to prefer nodes where volume replicas already exist — reducing cross-node replication traffic and improving I/O latency for storage-heavy workloads such as databases, object stores, and VMs.
The integration builds on the existing SchedulingClass tenant workload placement system introduced in v1.2.0 and requires no tenant-side configuration — workloads simply benefit once a SchedulingClass is assigned. Administrators can mix storage locality with the existing data-center / hardware-generation constraints defined on SchedulingClass CRs (@lllamnyp in #2330).
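The locality preference described above can be illustrated with a small scoring sketch. This is not the extender's actual algorithm. The function names, the node and PVC names, and the "one point per local replica" scoring rule are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def score_nodes(candidates: List[str], pvc_replicas: Dict[str, Set[str]]) -> Dict[str, int]:
    """Give each candidate node one point per volume that already has a replica on it.

    pvc_replicas maps a (hypothetical) PVC name to the set of nodes holding a replica.
    """
    return {
        node: sum(1 for nodes in pvc_replicas.values() if node in nodes)
        for node in candidates
    }


def rank_nodes(candidates: List[str], pvc_replicas: Dict[str, Set[str]]) -> List[str]:
    """Order candidates by descending locality score, breaking ties by node name."""
    scores = score_nodes(candidates, pvc_replicas)
    return sorted(candidates, key=lambda node: (-scores[node], node))


# Hypothetical pod with two LINSTOR-backed volumes.
replicas = {
    "data-pvc": {"node-a", "node-b"},
    "wal-pvc": {"node-b", "node-c"},
}
ranked = rank_nodes(["node-a", "node-b", "node-c", "node-d"], replicas)
print(ranked)
```

In this sketch node-b ranks first because both volumes already have a replica there, while node-d, which holds no replica, ranks last.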
LINSTOR GUI: Managed Web Console for Storage Administration
A new opt-in linstor-gui system package deploys LINBIT's linstor-gui web UI alongside the LINSTOR controller with mTLS client authentication, non-root security context, and a ClusterIP-only service by default. When OIDC is configured on the platform, an optional Keycloak-protected Ingress (via oauth2-proxy) exposes the UI for browser access. Access is restricted to members of the cozystack-cluster-admin Keycloak group, consistent with host-cluster admin RBAC, and the gatekeeper blocks in-app LINSTOR authentication setup at the nginx proxy layer so the managed configuration cannot be subverted through the UI.
Operators who prefer CLI access keep the existing linstor command; the GUI is strictly additive and stays disabled by default (@myasnikovdaniil in #2382, #2390, #2415, #2419).
VM Default Images: Out-of-the-Box VM Provisioning
The new vm-default-images package provides a curated set of cluster-wide virtual-machine images (Ubuntu, Debian, CentOS Stream, and others) as pre-populated DataVolumes, so tenants can provision VMs against well-known base images without first having to upload them. The package is opt-in via the iaas bundle and defaults to replicated storage for high availability. Migration 38 renames legacy vm-image-* DataVolumes to the new vm-default-images-* naming scheme, and the vm-disk chart gains a new "disk" source type for cloning from existing vm-disks in the same namespace (@myasnikovdaniil in #2258).
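As a rough illustration of the rename performed by migration 38, the sketch below maps a legacy name to the new scheme. It only shows the prefix rewrite. The helper name and the example DataVolume names are hypothetical, and the real migration operates on cluster resources rather than strings.

```python
OLD_PREFIX = "vm-image-"
NEW_PREFIX = "vm-default-images-"


def migrated_name(name: str) -> str:
    """Return the new-style name for a legacy DataVolume name; leave other names untouched."""
    if name.startswith(OLD_PREFIX):
        return NEW_PREFIX + name[len(OLD_PREFIX):]
    return name


# Hypothetical names, used only to show the mapping.
print(migrated_name("vm-image-ubuntu"))
print(migrated_name("vm-default-images-debian"))
```

Names that already follow the new scheme are left unchanged, so applying the mapping twice gives the same result as applying it once.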
Application Observability: WorkloadsReady, Events, and S3 Bucket Metering
Applications now expose a WorkloadsReady condition on their status by querying associated WorkloadMonitor resources, giving operators a single place to check whether all underlying workloads (Deployments, StatefulSets, DaemonSets, PVCs) are healthy. The dashboard gains a new Events tab showing namespace-scoped Kubernetes events per application, with fallback to .firstTimestamp when .eventTime is absent. A long-standing bug where WorkloadMonitor's Operational status was never persisted is fixed in the same change (@lexfrei in #2356).
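A minimal sketch of the two behaviours described above, written against plain dictionaries. The eventTime and firstTimestamp keys are the standard Kubernetes Event fields. The status.operational field path on WorkloadMonitor and the helper names are assumptions, not the controller's actual code.

```python
from typing import Any, Dict, List, Optional


def event_timestamp(event: Dict[str, Any]) -> Optional[str]:
    """Prefer .eventTime and fall back to .firstTimestamp when it is missing or null."""
    return event.get("eventTime") or event.get("firstTimestamp")


def workloads_ready(monitors: List[Dict[str, Any]]) -> bool:
    """Treat the application as ready only if every monitor reports operational.

    The status.operational path is an assumption; an empty list counts as ready here.
    """
    return all(m.get("status", {}).get("operational") is True for m in monitors)


legacy_event = {"eventTime": None, "firstTimestamp": "2026-04-20T10:00:00Z"}
modern_event = {"eventTime": "2026-04-20T10:05:00.000000Z", "firstTimestamp": None}
print(event_timestamp(legacy_event), event_timestamp(modern_event))

monitors = [{"status": {"operational": True}}, {"status": {"operational": False}}]
print(workloads_ready(monitors))
```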
The WorkloadMonitor reconciler is extended to track COSI BucketClaim objects as first-class Workloads, and the bucket controller now queries SeaweedFS logical and physical bucket-size metrics from VictoriaMetrics via a namespace-scoped monitoring endpoint, enabling S3 billing integration on par with Pods and PVCs (@kitsunoff in #2391). Workloads are also enriched with workloads.cozystack.io/resource-preset and source-object labels so downstream billing pipelines can correlate monitors with the tenant preset that produced them (@androndo in #2416).
Cross-Namespace VM Backup Restore and RestoreJob Dashboard
The backup system now supports restoring VMInstance backups into a different namespace (cross-namespace copy restores) with IP/MAC preservation and safe rename semantics. In-place backup and restore flows for VMDisk and VMInstance are improved: HelmReleases and DataVolumes are properly handled, and Velero failure messages are propagated to the Application status. The backup status structure has been refactored to store underlying resources as a generic opaque JSON object, enabling arbitrary application-specific metadata without status-schema churn (@androndo in #2251, #2319, #2329).
The dashboard now ships a complete RestoreJob experience: list view, details page, create form, and sidebar entry, with a "Same as backup" fallback rendering when spec.targetApplicationRef is omitted. Non-CRD-backed sidebar factories (kube-*, plan, backupjob, backup, restorejob) are marked static so they pick up consistent managed-by labels across reconciles (@myasnikovdaniil in #2437).
Major Features and Improvements
- [api] Reject tenant names with dashes at Create time: Enforces alphanumeric-only naming for Tenants at the API level, preventing names with hyphens that would silently fail during Helm reconciliation. A corresponding regex tightening and regression test suite harden the validation (@lexfrei in #2380).
- [platform] Validate computed tenant namespace length: Rejects Tenant creation when the computed ancestor-chain namespace would exceed the 63-character Kubernetes namespace limit, preventing opaque HelmRelease reconcile errors downstream (@lexfrei in #2376).
- [vm-instance] Rename subnets to networks and add dropdown selector: Renames the misleading subnets field to networks in VMInstance for clarity, adds a dropdown selector for available networks in the dashboard form, and includes migration 36 to copy existing subnets values. The old field remains supported for backward compatibility (@sircthulhu in #2263).
- [keycloak] Enable injecting themes: Cozystack administrators can now inject custom Keycloak themes via initContainers for UI white-labeling and customization (@lllamnyp in #2142).
- [keycloak-configure] Add email verification and SMTP configuration: Adds configurable Keycloak settings for user self-registration, email verification, and SMTP server configuration, enabling automated user onboarding flows (@BROngineer in #2318).
- [postgres] Pin system PostgreSQL to 17.7-standard-trixie: Pins the PostgreSQL image for system databases (Grafana, Alerta, Harbor, Keycloak, SeaweedFS) to 17.7-standard-trixie across chart templates and values.yaml, and ships migration 37 to patch existing CNPG Cluster imageName fields to the same variant (handling unset, any PG 17 tag, and bare-version tags). This prevents CNPG from defaulting to PostgreSQL 18 and locks system databases to the trixie variant consistent with the monitoring stack requirements (related backports shipped in v1.2.1 via #2309 and v1.2.2 via #2364) (@myasnikovdaniil in #2369).
- [platform] Prevent installed packages deletion: Adds the helm.sh/resource-policy: keep annotation to platform packages so disabling a package no longer triggers automatic Helm deletion, restoring the documented behavior where operators must explicitly delete a package (backported to v1.2.1) (@kvaps in #2273).
- [mariadb] Always enable replication for consistent service naming: MariaDB now always enables replication, creating -primary/-secondary services even for single-replica instances. This fixes dashboard visibility and backup functionality for single-replica setups (@sircthulhu in #2279).
- [hack] Add host runtime preflight check: New check-host-runtime.sh script and make preflight target that warn operators when a standalone containerd or docker runtime is running alongside the embedded k3s runtime, helping diagnose container-runtime conflicts early in an installation (@lexfrei in #2371).
- [hack] Add check-readiness.sh diagnostic script: A new diagnostic script for tracking platform reconciliation by checking readiness of Packages, ArtifactGenerators, ExternalArtifacts, and HelmReleases, with support for watch mode and continuous monitoring (@myasnikovdaniil in #2294).
- [platform] Add resourcePreset labels to WorkloadMonitor labels: WorkloadMonitor labels with the workloads.cozystack.io/ prefix are now propagated onto created Workloads; created Workloads always include the reserved workloads.cozystack.io/monitor label, and Helm app charts add workloads.cozystack.io/resource-preset metadata to WorkloadMonitor manifests, enabling downstream billing pipelines to correlate monitors with the tenant preset that produced them (@androndo in #2416)....
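The label propagation described in the last item can be sketched as a simple filter over the monitor's labels. This is an illustrative model, not the reconciler's code. The function name, the example labels, and the choice to store the monitor's name as the value of the reserved label are assumptions.

```python
from typing import Dict

LABEL_PREFIX = "workloads.cozystack.io/"
MONITOR_LABEL = LABEL_PREFIX + "monitor"


def workload_labels(monitor_name: str, monitor_labels: Dict[str, str]) -> Dict[str, str]:
    """Copy only prefixed labels from the monitor, then set the reserved monitor label."""
    labels = {k: v for k, v in monitor_labels.items() if k.startswith(LABEL_PREFIX)}
    # The reserved label is always written last so a user-supplied value cannot override it.
    labels[MONITOR_LABEL] = monitor_name
    return labels


# Hypothetical WorkloadMonitor labels.
labels = workload_labels(
    "postgres-db",
    {
        "workloads.cozystack.io/resource-preset": "small",
        "app.kubernetes.io/name": "postgres",
    },
)
print(labels)
```

Labels outside the prefix, such as app.kubernetes.io/name in this example, are not copied onto the Workload.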
v1.2.3
v1.2.3 (2026-04-20)
A patch release with bug fixes and documentation updates.
Features and Improvements
No notable features in this patch release.
Fixes
- fix(kubernetes): set explicit ephemeral-storage on virt-launcher pods: Prevents VM crashes caused by ephemeral-storage eviction by setting explicit domain.resources ephemeral-storage on the VirtualMachine spec. Uses sanitized limits and requests so virt-launcher pods do not inherit too-small namespace defaults. (@kvaps in #2317, backport #2423).
Documentation
- [website] feat: add Telemetry page under OSS Health section: Add Telemetry page and initial data seeding to OSS Health docs (@tym83 in cozystack/website#471).
- [website] Refactor docs versions to major.minor variants: Move docs to major.minor versioning for v1.x series (@myasnikovdaniil in cozystack/website#477).
- [website] docs(tenants): document namespace layout and parent/child derivation (@lexfrei in cozystack/website#479).
- [website] docs(tenants): document the checkbox-then-edit-CR customization pattern (@lexfrei in cozystack/website#485).
- [website] docs: fix 14 broken links and stale talm anchor across v1 docs (@lexfrei in cozystack/website#486).
- [website] fix(og): update social badge image and title (@tym83 in cozystack/website#487).
- [website] docs(external-apps): rewrite guide for ApplicationDefinition API (@kitsunoff in cozystack/website#488).
- [website] docs: add CLAUDE.md for AI agent guidance (@myasnikovdaniil in cozystack/website#489).
- [website] fix: update /docs/v1/ redirect to latest v1.2 (@myasnikovdaniil in cozystack/website#492).
- [website] fix(ci): add OpenAPI spec download to GitHub Pages build (@myasnikovdaniil in cozystack/website#494).
- [website] feat(blog): add managed PostgreSQL with synchronous replication post (@tym83 in cozystack/website#497).
- [website] chore(blog): add images frontmatter for social preview on existing posts (@tym83 in cozystack/website#498).
- [website] feat(blog): taxonomies and client-side filter UI (@tym83 in cozystack/website#499).
- [website] style(oss-health): add breathing room between navbar and hero (@tym83 in cozystack/website#500).
Other repositories
- [talm] feat(config): migrate to Talos v1.12 multi-document config format: Upgrade Talos config format and modernize configuration handling (@lexfrei in cozystack/talm#116).
- [talm] chore(deps): bump dependencies and modernize codebase (@lexfrei in cozystack/talm#124).
- [external-apps-example] feat: replace MongoDB example with Minecraft apps from cozylex (@lexfrei in cozystack/external-apps-example#2).
- [ansible-cozystack] fix(examples): add v prefix to collection version in requirements.yml (@lexfrei in cozystack/ansible-cozystack#23).
- [ansible-cozystack] fix(plugins): replace ansible.utils.ipaddr with stdlib-based test plugin (@lexfrei in cozystack/ansible-cozystack#24).
- [ansible-cozystack] feat(examples): comprehensive node prerequisites audit (fixes #19) (@lexfrei in cozystack/ansible-cozystack#27).
- [ansible-cozystack] chore(deps): update dependency cozystack.installer to v1.2.3 (@app/renovate in cozystack/ansible-cozystack#29).
- [ansible-cozystack] feat(role): expose publishing.externalIPs and tenant-root ingress via role variables (@lexfrei in cozystack/ansible-cozystack#30).
Full Changelog: v1.2.2...v1.2.3
v1.3.0-rc.1
Cozystack v1.3.0-rc.1
Cozystack v1.3.0-rc.1 is the first release candidate for v1.3.0, bringing storage-aware scheduling via the LINSTOR scheduler extender, a managed LINSTOR GUI web UI with Keycloak SSO, a VM Default Images catalog for out-of-the-box virtual machine provisioning, WorkloadsReady conditions with a real-time Events tab in the dashboard, and cross-namespace VM backup restore capabilities. Additional highlights include stricter tenant name validation, VM network selector improvements, Keycloak theme injection and SMTP configuration, and a comprehensive host runtime preflight check.
Note: Fixes marked with (backported to v1.2.x) were also included in v1.2.1 or v1.2.2 patch releases.
Feature Highlights
Storage-Aware Scheduling via LINSTOR Extender
The cozystack-scheduler now calls the LINSTOR scheduler extender for storage-locality-aware pod placement. When a pod declares both a SchedulingClass and LINSTOR-backed PVCs, the scheduler consults LINSTOR to prefer nodes where volume replicas already exist — reducing cross-node replication traffic and improving I/O latency for storage-heavy workloads (@lllamnyp in #2330).
LINSTOR GUI: Managed Web UI for Storage Administration
A new opt-in linstor-gui system package deploys LINBIT's linstor-gui web UI alongside the LINSTOR controller with mTLS client authentication, non-root security context, and ClusterIP-only service. An optional Keycloak-protected Ingress (via oauth2-proxy) can be enabled for SSO-authenticated browser access when OIDC is configured on the platform (@myasnikovdaniil in #2382, #2390).
VM Default Images: Out-of-the-Box VM Provisioning
The new vm-default-images package provides a curated set of cluster-wide virtual machine images (Ubuntu, Debian, CentOS Stream, and others) as pre-populated DataVolumes. The package is opt-in via the iaas bundle and defaults to replicated storage for high availability. A companion migration (migration 38) renames legacy vm-image-* DataVolumes to the new vm-default-images-* naming scheme. The vm-disk chart also gains a new "disk" source type for cloning from existing vm-disks in the same namespace (@myasnikovdaniil in #2258).
WorkloadsReady Condition and Events Tab
Applications now expose a WorkloadsReady condition on their status by querying associated WorkloadMonitor resources, giving operators a single place to check whether all underlying workloads (Deployments, StatefulSets, DaemonSets) are healthy. The dashboard gains a new Events tab showing namespace-scoped Kubernetes events for each application, with fallback to .firstTimestamp when .eventTime is absent. A bug where WorkloadMonitor's Operational status was never persisted is also fixed (@lexfrei in #2356).
Cross-Namespace VM Backup Restore
The backup system now supports restoring VMInstance backups into a different namespace (cross-namespace copy restores), with IP/MAC preservation and safe rename semantics. In-place backup/restores for VMDisk and VMInstance are improved: HelmReleases and DataVolumes are properly handled, and Velero failure messages are propagated to the Application status. The backup status structure has been refactored to store underlying resources as a generic opaque JSON object, enabling arbitrary application-specific metadata (@androndo in #2251, #2329, #2319).
Major Features and Improvements
- [api] Reject tenant names with dashes at Create time: Enforces alphanumeric-only naming for Tenants at the API level, preventing names with hyphens that would silently fail during Helm reconciliation. A corresponding regex tightening and regression test suite harden the validation (@lexfrei in #2380).
- [platform] Validate computed tenant namespace length: Rejects Tenant creation when the computed ancestor-chain namespace would exceed the 63-character Kubernetes namespace limit, preventing opaque HelmRelease reconcile errors downstream (@lexfrei in #2376).
- [vm-instance] Rename subnets to networks and add dropdown selector: Renames the misleading subnets field to networks in VMInstance for clarity, adds a dropdown selector for available networks in the dashboard form, and includes a migration to copy existing subnets values. The old field remains supported for backward compatibility (@sircthulhu in #2263).
- [keycloak] Enable injecting themes: Cozystack administrators can now inject custom Keycloak themes via initContainers for UI white-labeling and customization (@lllamnyp in #2142).
- [keycloak-configure] Add email verification and SMTP configuration: Adds configurable Keycloak settings for user self-registration, email verification, and SMTP server configuration, enabling automated user onboarding flows (@BROngineer in #2318).
- [postgres] Hardcode PostgreSQL 17 for monitoring databases: Pins PostgreSQL 17.7 images for system databases (Grafana, Alerta, Harbor, Keycloak, SeaweedFS) and adds migration 37 to backfill spec.version=v17 for existing PostgreSQL resources, preventing CNPG from defaulting to PostgreSQL 18 (backported to v1.2.1) (@IvanHunters in #2304).
- [hack] Add host runtime preflight check: New check-host-runtime.sh script and make preflight target that warn operators when a standalone containerd or docker runtime is running alongside the embedded k3s runtime, helping diagnose container runtime conflicts (@lexfrei in #2371).
- [hack] Add check-readiness.sh diagnostic script: A new diagnostic script for tracking platform reconciliation by checking readiness of Packages, ArtifactGenerators, ExternalArtifacts, and HelmReleases, with support for watch mode and continuous monitoring (@myasnikovdaniil in #2294).
- [mariadb] Always enable replication for consistent service naming: MariaDB now always enables replication, creating -primary/-secondary services even for single-replica instances. This fixes dashboard visibility and backup functionality for single-replica setups (@sircthulhu in #2279).
- [platform] Prevent installed packages deletion: Adds the helm.sh/resource-policy: keep annotation to packages, preventing automatic deletion when packages are disabled and restoring documented behavior (backported to v1.2.1) (@kvaps in #2273).
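The two tenant checks at the top of this list can be sketched together as follows. The 63-character namespace limit is a Kubernetes rule. The exact regex, the way the namespace is computed from the parent namespace, and every name used here are assumptions for illustration, not the API server's real validation code.

```python
import re
from typing import List

# Assumption: "alphanumeric-only" is modelled as lowercase letters and digits.
TENANT_NAME_RE = re.compile(r"[a-z0-9]+")
MAX_NAMESPACE_LEN = 63  # Kubernetes namespace names are limited to 63 characters.


def validate_tenant(name: str, parent_namespace: str = "tenant") -> List[str]:
    """Return a list of validation errors; an empty list means the name is accepted.

    Assumption: the child namespace is formed as "<parent_namespace>-<name>".
    """
    errors = []
    if not TENANT_NAME_RE.fullmatch(name):
        errors.append(f"tenant name {name!r} must be alphanumeric (no dashes)")
    namespace = f"{parent_namespace}-{name}"
    if len(namespace) > MAX_NAMESPACE_LEN:
        errors.append(
            f"computed namespace is {len(namespace)} characters, limit is {MAX_NAMESPACE_LEN}"
        )
    return errors


print(validate_tenant("team1"))
print(validate_tenant("team-1"))
print(validate_tenant("a" * 60, parent_namespace="tenant-parent"))
```

Rejecting both cases at Create time surfaces the error to the caller immediately rather than later as a failed HelmRelease reconcile.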
Bug Fixes
- [cilium] Opt-out of cri-containerd.apparmor.d for nsenter init containers: Opts cilium-agent init containers out of the cri-containerd.apparmor.d AppArmor profile on non-Talos variants, fixing Init:CrashLoopBackOff on Ubuntu 22.04+ and Debian (backported to v1.2.2) (@lexfrei in #2370).
- [virtual-machine] Exclude external VM services from Cilium BPF LB: Adds the service-proxy-name: cozy-proxy label to VM LoadBalancer services, telling Cilium to skip BPF processing. Fixes inter-tenant connectivity via public LB IPs and WholeIP functionality on Cilium 1.19+ (backported to v1.2.2) (@mattia-eleuteri in #2357).
- [monitoring] Fix infra dashboards missing in default variant: Includes the cozy-monitoring namespace in the dashboard rendering condition, fixing infrastructure Grafana dashboards not rendering in the default platform variant (backported to v1.2.2) (@mattia-eleuteri in #2365).
- [postgres] Fix system PostgreSQL images to 17.7-standard-trixie: Normalizes system PostgreSQL image tags to use the 17.7-standard-trixie variant with migration logic for existing CNPG clusters (backported to v1.2.2) (@myasnikovdaniil in #2364).
- [build] Filter git describe to match only v* tags: Adds --match 'v*' to git describe calls, preventing API subtags from being picked up instead of release tags and producing invalid Docker image tags (backported to v1.2.2) (@kvaps in #2386).
- [platform] Fix resource allocation ratios not propagated to packages: Restores propagation of CPU, memory, and ephemeral-storage allocation ratios to managed applications and KubeVirt, which were silently ignored since the bundle restructure (backported to v1.2.1) (@sircthulhu in #2296).
- [kubernetes] Set explicit ephemeral-storage on virt-launcher pods: Sets explicit domain.resources with ephemeral-storage on the VirtualMachine spec to prevent virt-launcher pods from being evicted due to LimitRange defaults being too low for actual emptyDisk capacity (@kvaps in #2317).
- [multus] Pin master CNI to 05-cilium.conflist: Prevents a boot-time race condition where multus could auto-detect kube-ovn's conflist instead of Cilium's (backported to v1.2.1) (@kvaps in #2315).
- [multus] Build custom image with DEL cache fix: Fixes sandbox cleanup deadlock when CNI ADD never completes, preventing stale sandbox name reservations from permanently blocking pod creation (backported to v1.2.1) (@kvaps in #2313).
- [linstor] Set verify-alg to crc32c: Prevents DRBD connection failures on kernels where crct10dif is unavailable (e.g., Talos v1.12.6 with kernel 6.18.18) (backported to v1.2.1) (@kvaps in #2303).
- **[lin...
v1.2.2
Features and Improvements
- [linstor] Update piraeus-server to v1.33.2 with selected backports: Bumps LINSTOR server from v1.33.1 to v1.33.2 and adds backported patches for improved storage reliability: a stale bitmap adjust retry mechanism for automatic recovery after bitmap attach errors, LUKS2 header sizing and optimal I/O size detection improvements for more reliable disk formatting, and the maintainer implementation backport. All patches verified against upstream v1.33.2 with git apply --check and gradlew compileJava (@kvaps in #2331, #2377).
Fixes
- [postgres] Fix system PostgreSQL images to 17.7-standard-trixie: Hardcodes PostgreSQL 17.7-standard-trixie images for system PostgreSQL instances. This ensures system databases use the correct image variant consistent with the monitoring stack requirements introduced in v1.2.1 (@myasnikovdaniil in #2364, #2369).
- [cilium] Opt-out of cri-containerd.apparmor.d for nsenter init containers: On Ubuntu 22.04+, Debian, and other distributions that load the cri-containerd.apparmor.d AppArmor profile by default for containerd workloads, the kernel denied nsenter namespace entry in cilium-agent init containers (mount-cgroup, apply-sysctl-overwrites, clean-cilium-state), causing the agent to land in Init:CrashLoopBackOff and cascading platform failures. Per-container container.apparmor.security.beta.kubernetes.io annotations now opt the affected containers out of this profile, applied only on non-Talos cilium variants (cilium-generic, kubeovn-cilium-generic). The vendored daemonset template is also patched to strip the upstream semverCompare "<1.30.0" AppArmor block, preventing duplicate annotation keys. Talos variants are untouched as Talos does not load the AppArmor LSM (@lexfrei in #2370, #2378).
- [virtual-machine] Exclude external VM services from Cilium BPF LB: Adds the service.kubernetes.io/service-proxy-name: "cozy-proxy" label to VM LoadBalancer services when external: true, telling Cilium to skip BPF processing entirely for these services. This fixes two issues: inter-tenant connectivity via public LB IPs (Cilium's DNAT caused cross-tenant pod-to-pod flow classification, triggering CiliumClusterwideNetworkPolicy blocks) and WholeIP broken on Cilium 1.19+ (wildcard service drop entries blocked traffic to LB IPs on undeclared ports before it reached netfilter/cozy-proxy). MetalLB L2 advertisement and kube-ovn routing remain unaffected (@mattia-eleuteri in #2357, #2361).
- [monitoring] Fix infra dashboards missing in default variant: The default platform variant deploys the monitoring chart to the cozy-monitoring namespace, but the dashboard rendering condition introduced in #2197 only checked for tenant-root. Infrastructure dashboards were not rendered in the default variant. The cozy-monitoring namespace is now included in the rendering condition, consistent with the existing pattern in vmagent.yaml (@mattia-eleuteri in #2365, #2367).
- [build] Filter git describe to match only v* tags: Adds --match 'v*' to all git describe calls in hack/common-envs.mk. The api/apps/v1alpha1/* subtags share the same commit as release tags, causing git describe --exact-match to pick api/apps/v1alpha1/vX.Y.Z instead of vX.Y.Z, producing invalid Docker image tags (@kvaps in #2386, #2389).
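The effect of the --match 'v*' filter from the last item can be approximated in a few lines. Python's fnmatch stands in for git's own glob matching here, and the tag names are hypothetical examples following the pattern described above. The sketch shows only the filtering step, not how git describe chooses among the tags that remain.

```python
from fnmatch import fnmatchcase
from typing import List


def release_tags(tags: List[str], pattern: str = "v*") -> List[str]:
    """Keep only tags whose full name matches the glob, as --match 'v*' is meant to do."""
    return [tag for tag in tags if fnmatchcase(tag, pattern)]


# Two tags pointing at the same commit (hypothetical version numbers).
tags_on_commit = ["api/apps/v1alpha1/v1.2.2", "v1.2.2"]
print(release_tags(tags_on_commit))
```

The API subtag starts with "api/", so it never matches the pattern, and only the release tag is left to become the image tag.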
Development, Testing, and CI/CD
- [ci] Replace cozystack-bot PAT with cozystack-ci GitHub App: Replaces the long-lived cozystack-bot personal access token with short-lived, scoped tokens from the cozystack-ci GitHub App across all release workflows (tags.yaml, auto-release.yaml, pull-requests-release.yaml). Improves security and auditability of CI operations (@tym83 in #2351).
- [ci] Use cozystack org noreply email for bot commits: Updates CI workflows to use the cozystack organization noreply email for bot commits (@kvaps in #2392, #2393).
- [ci] Replace GH_PAT with cozystack-ci GitHub App token in pull-requests workflow: Switches the pull-requests release workflow to use the cozystack-ci GitHub App token instead of the personal access token (@kvaps in #2383, #2384).
Documentation
- [website] Add ApplicationDefinition naming convention reference: Added reference documentation on ApplicationDefinition naming conventions and how cozystack-api resolves kinds to their backing definitions (@lexfrei in cozystack/website#478).
- [website] Document Talos / talosctl / Cozystack version pairing: Added documentation covering the Talos, talosctl, and Cozystack version compatibility matrix for installation (@lexfrei in cozystack/website#484).
- [website] Fix KubeOVN MASTER_NODES example path and key in troubleshooting: Corrected the MASTER_NODES example path and key in the KubeOVN troubleshooting guide (@lexfrei in cozystack/website#483).
- [website] Prefix bundle package names with cozystack. in v1 examples: Updated documentation examples to use the correct cozystack. prefix for bundle package names in enabled/disabledPackages (@lexfrei in cozystack/website#482).
- [website] Finish isolated-field removal and document opt-in policy labels: Removed the obsolete isolated field from tenant documentation and documented the new opt-in policy labels approach (@lexfrei in cozystack/website#481).
- [website] Add --take-ownership flag and describe networking.* fields: Added documentation for the --take-ownership flag and described the networking.* fields in the installation guide (@lexfrei in cozystack/website#480).
- [website] Add bonding (LACP) configuration how-to guide: Added a guide for configuring network bonding with LACP on Cozystack installations (@sircthulhu in cozystack/website#459).
- [website] Improve registry mirrors for tenant Kubernetes in air-gapped guide: Improved documentation for configuring registry mirrors in tenant Kubernetes clusters for air-gapped environments (@sircthulhu in cozystack/website#461).
- [website] Update backup/restore documentation for VMI/VMDisk: Updated backup documentation with information related to VM instance and VM disk restore improvements (@androndo in cozystack/website#466).
- [website] Add updated OpenAPI spec: Updated the OpenAPI specification for managed applications reference (@myasnikovdaniil in cozystack/website#469).
- [website] Add OSS Health pages and OpenSSF badge: Added OSS Health section with OpenSSF Scorecard and Best Practices badge to the website footer (@tym83 in cozystack/website#470).
- [website] Add CozySummit Virtual 2026 program announcement: Published the CozySummit Virtual 2026 program announcement blog post (@tym83 in cozystack/website#472).
- [website] Add missing release announcements for v0.1–v0.41: Backfilled missing release announcement blog posts for Cozystack versions v0.1 through v0.41 (@tym83 in cozystack/website#468).
- [talm] Render templates online in apply to resolve lookups: Fixed the talm apply command to render templates online, resolving template lookup failures when using modeline templates (@myasnikovdaniil in cozystack/talm#119).
- [talm] Update default Talos image to v1.12.6: Updated the default Talos image version to v1.12.6 in talm (@kvaps in cozystack/talm@03e9b6e).
Full Changelog: v1.2.1...v1.2.2
v1.1.6
Fixes
- [build] Filter git describe to match only v* tags: Adds --match 'v*' to all git describe calls in hack/common-envs.mk. The api/apps/v1alpha1/* subtags share the same commit as release tags, causing git describe --exact-match to pick api/apps/v1alpha1/vX.Y.Z instead of vX.Y.Z, producing invalid Docker image tags (@kvaps in #2386, #2388).
Development, Testing, and CI/CD
- [ci] Replace cozystack-bot PAT with cozystack-ci GitHub App: Replaces the long-lived cozystack-bot personal access token with short-lived, scoped tokens from the cozystack-ci GitHub App across all release workflows. Improves security and auditability of CI operations (@tym83 in #2351).
- [ci] Replace GH_PAT with cozystack-ci GitHub App token in pull-requests workflow: Switches the pull-requests release workflow to use the cozystack-ci GitHub App token instead of the personal access token (@kvaps in #2383).
- [ci] Use cozystack org noreply email for bot commits: Updates CI workflows to use the cozystack organization noreply email for bot commits (@kvaps in #2392).
Full Changelog: v1.1.5...v1.1.6
v1.2.1
Features and Improvements
- [postgres] Hardcode PostgreSQL 17 for monitoring databases and add migration: The CloudNativePG operator defaults to PostgreSQL 18.3 when no explicit image is specified, but monitoring queries in Grafana and Alerta rely on PostgreSQL 17 features such as pg_stat_checkpointer and the updated pg_stat_bgwriter. This mismatch could break monitoring after fresh installs or database recreation. PostgreSQL 17.7 images are now hardcoded for monitoring databases, and migration 37 is added to set version v17 for any existing PostgreSQL resources (@IvanHunters in #2304, #2309).
Fixes
-
[platform] Prevent installed packages deletion: Added the
helm.sh/resource-policy: keepannotation to all platform packages. Previously, moving a package todisabledPackagesor removing it fromenabledPackagescaused Helm to automatically delete the corresponding resource, contradicting the documented behavior that requires the platform administrator to manually delete packages when needed (@myasnikovdaniil in #2273, #2297). -
[linstor] Preserve TCP ports during toggle-disk operations: During toggle-disk operations,
removeLayerData()freed TCP ports from the number pool andensureStackDataExists()could then allocate different ports. If a satellite missed the resulting update (e.g. due to a controller restart), it retained the old ports while peers received the new ones, causing DRBD connections to fail with StandAlone state. The fix addscopyDrbdTcpPortsIfExists()which saves existing TCP ports into theLayerPayloadbeforeremoveLayerData()deletes them (@kvaps in #2292, #2299). -
[platform] Fix resource allocation ratios not propagated to managed packages: A regression introduced in the bundle restructure caused
cpuAllocationRatio,memoryAllocationRatio, andephemeralStorageAllocationRatioset inplatform/values.yamlto become no-ops — they were never written to thecozystack-valuesSecret that cozy-lib reads in child packages. This meant all managed applications silently used the hardcoded defaults (10, 1, 40) regardless of operator-configured values. The fix restores propagation by writing the ratios into the_clustersection of thecozystack-valuesSecret and passingcpuAllocationRatioto the KubeVirt Package component (@sircthulhu in #2296, #2301). -
[linstor] Fix DRBD connectivity failures on kernels without
crct10difby setting verify-alg tocrc32c: LINSTOR's auto-verify algorithm selection defaults tocrct10dif, but this kernel crypto module is no longer available in newer kernels (e.g. Talos v1.12.6, kernel 6.18.18). Whencrct10difis unavailable, DRBD peer connections fail withVERIFYAlgNotAvail: failed to allocate crct10dif for verify, causing all DRBD resources to enter Diskless state and lose quorum.DrbdOptions/Net/verify-algis now set tocrc32cat the controller level (@kvaps in #2303, #2312). -
[multus] Fix stale sandbox reservations permanently blocking pod creation after CNI ADD failure: After a node disruption (e.g. DRBD or kube-ovn issues during upgrade), containerd accumulated stale sandbox name reservations. Cleanup failed because multus called delegate plugins for DEL without cached state and they rejected the incomplete config, causing DEL to fail instead of succeeding. Stale entries were never released, permanently blocking new pod creation on the affected node. A custom multus-cni image is now built with a patch that returns success from DEL when CNI ADD never completed (@kvaps in #2313, #2314).
-
[multus] Pin master CNI to 05-cilium.conflist to prevent race condition at boot: During node boot or Talos upgrade, multus auto-detects the master CNI conflist by scanning the CNI config directory. If kube-ovn writes 10-kube-ovn.conflist before Cilium writes 05-cilium.conflist, multus selects the wrong file and pods bypass the Cilium chain entirely, have no Cilium endpoint, and their traffic is blocked by cluster-wide network policies. multusMasterCNI is now pinned to 05-cilium.conflist (@kvaps in #2315, #2316).
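The boot-time race in the multus item above can be illustrated with a small sketch. It assumes that auto-detection picks the lexicographically first conflist present at scan time; the function name and its arguments are hypothetical and are not part of multus.

```python
from typing import Iterable, Optional


def select_master_cni(present: Iterable[str], pinned: Optional[str] = None) -> Optional[str]:
    """Return the conflist a multus-like meta-plugin would delegate to.

    Assumption: without a pin, the lexicographically first *.conflist that
    exists at scan time wins. With a pin, only that exact file is accepted.
    """
    conflists = sorted(f for f in present if f.endswith(".conflist"))
    if pinned is not None:
        # A pinned name is never substituted; None means "not ready yet".
        return pinned if pinned in conflists else None
    return conflists[0] if conflists else None


# kube-ovn has written its file, Cilium has not yet.
early_boot = ["10-kube-ovn.conflist"]
print(select_master_cni(early_boot))                               # auto-detect
print(select_master_cni(early_boot, pinned="05-cilium.conflist"))  # pinned

# Once both files exist, auto-detection and pinning agree.
settled = ["10-kube-ovn.conflist", "05-cilium.conflist"]
print(select_master_cni(settled))
```

With auto-detection the early scan settles on the kube-ovn file, while the pinned variant reports nothing until the Cilium file exists, which is the behavior the fix relies on.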
Documentation
-
[website] Add custom Keycloak themes documentation: Added documentation for custom Keycloak theme injection to the White Labeling guide, covering the theme image contract (/themes/ directory structure), configuration via the cozystack.keycloak Package resource, imagePullSecrets for private registries, and theme activation in the Keycloak admin console (@lexfrei in cozystack/website#463).
-
[website] Add documentation for Go types usage: Added a guide for using the generated Go types for Cozystack managed applications as a Go module, including installation instructions, programmatic resource management examples, and deployment approaches (@myasnikovdaniil in cozystack/website#465).
Full Changelog: v1.2.0...v1.2.1
v1.1.5
Fixes
-
[platform] Prevent installed packages deletion: Added the helm.sh/resource-policy: keep annotation to all platform packages. Previously, moving a package to disabledPackages or removing it from enabledPackages caused Helm to automatically delete it, contradicting the documented behavior that requires the platform administrator to manually delete packages when needed (@myasnikovdaniil in #2273, #2298).
-
[linstor] Fix TCP port mismatches after toggle-disk operations causing DRBD resources to enter StandAlone state: During toggle-disk operations, removeLayerData() freed TCP ports from the number pool and ensureStackDataExists() could then allocate different ports. If a satellite missed the resulting update (e.g. due to a controller restart), it retained the old ports while peers received the new ones, causing DRBD connections to fail with StandAlone state. The fix introduces copyDrbdTcpPortsIfExists(), which preserves existing TCP ports in the LayerPayload before removeLayerData() releases them (@kvaps in #2292, #2300).
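The port mismatch described in the toggle-disk item can be reproduced with a toy number pool. This is only a sketch of the idea of copying ports before release; PortPool, rebuild_layer, and the port range are hypothetical and do not mirror LINSTOR's actual classes.

```python
class PortPool:
    """Tiny stand-in for a number pool that hands out the lowest free port."""

    def __init__(self, start: int, end: int) -> None:
        self.free = set(range(start, end + 1))

    def allocate(self) -> int:
        port = min(self.free)
        self.free.remove(port)
        return port

    def reserve(self, port: int) -> int:
        self.free.discard(port)
        return port

    def release(self, port: int) -> None:
        self.free.add(port)


def rebuild_layer(pool: PortPool, resource: dict, preserve: bool) -> None:
    """Drop and recreate a resource's layer data, as a toggle-disk would."""
    payload = {}
    if preserve and "tcp_port" in resource:
        payload["tcp_port"] = resource["tcp_port"]  # copy before removal
    pool.release(resource.pop("tcp_port"))          # removal frees the port
    if "tcp_port" in payload:
        resource["tcp_port"] = pool.reserve(payload["tcp_port"])
    else:
        resource["tcp_port"] = pool.allocate()      # may differ from before


def demo(preserve: bool) -> int:
    pool = PortPool(7000, 7010)
    a = {"tcp_port": pool.allocate()}
    b = {"tcp_port": pool.allocate()}
    pool.release(a.pop("tcp_port"))  # another resource goes away first
    rebuild_layer(pool, b, preserve)
    return b["tcp_port"]


print(demo(preserve=False))  # port changes because a lower one became free
print(demo(preserve=True))   # port stays what the peers already know
```

Without preservation the rebuilt resource picks up whichever port happens to be lowest in the pool, so a satellite that missed the update ends up disagreeing with its peers.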
Full Changelog: v1.1.4...v1.1.5
v1.1.4
Features and Improvements
-
[boot-to-talos] Add support for ISO, RAW, and HTTP image sources: The boot-to-talos tool can now use ISO files, raw disk images, and HTTP URLs as Talos image sources in addition to container registry images. This allows bootstrapping nodes in air-gapped environments or from locally stored images without requiring a container registry (@lexfrei in cozystack/boot-to-talos#13).
-
[boot-to-talos] Use permanent MAC address for predictable network interface names: Interface name detection now reads the permanent MAC address directly from sysfs instead of relying on udev data, providing a stable hardware MAC that is unaffected by user modifications to the active MAC address. This makes network interface naming more reliable across reboots and hardware changes (@IvanHunters in cozystack/boot-to-talos#14).
Fixes
-
[dashboard] Fix broken backup menu links missing cluster context: Backup resources (plans, backupjobs, backups) are not ApplicationDefinitions, so ensureNavigation() never created their baseFactoriesMapping entries. Without these entries the OpenUI frontend could not resolve the {cluster} context for backup pages, producing broken sidebar links with an empty cluster segment (e.g. /openapi-ui//tenant-root/...). The missing baseFactoriesMapping entries for all backup resource types are now added to the static Navigation resource (@sircthulhu in #2232, #2269).
-
[platform] Fix tenant admins unable to create FoundationDB, Harbor, MongoDB, OpenBAO, OpenSearch, Qdrant, and VPN applications: The cozy:tenant:admin:base ClusterRole was missing seven application resources from apps.cozystack.io (foundationdbs, harbors, mongodbs, openbaos, opensearches, qdrants, vpns). Without these permissions, tenant admins could not create these applications: the "Add" button was inactive in the dashboard. The missing resources have been added to the ClusterRole (@sircthulhu in #2268, #2272).
-
[dashboard] Fix StorageClass dropdown showing "Error" in application forms: The dashboard UI fetches StorageClass resources to populate dropdowns (e.g. in the Postgres form), but the cozystack-dashboard-readonly ClusterRole did not include storage.k8s.io/storageclasses. This caused authenticated users to see "Error" instead of the StorageClass name. get/list/watch permissions for storageclasses have been added to the dashboard readonly role (@sircthulhu in #2267, #2274).
-
[system] Fix 403 error on Service details page by granting tenants read access to EndpointSlices: The dashboard requested EndpointSlices from the discovery.k8s.io API group to display the "Pod serving" section on the Service details page, but the cozy:tenant:base and cozy:tenant:view:base ClusterRoles lacked permissions for this resource. Tenant users received a 403 error when opening the Service details page. get/list/watch permissions for endpointslices have been added to both tenant ClusterRoles (@sircthulhu in #2257, #2285).
-
[dashboard] Fix "Pod serving" table displaying "Raw:" and "Invalid Date" on Service details page: The Service details page EndpointSlice table showed "Raw:" prefixes and "Invalid Date" values because the EnrichedTable referenced customizationId factory-kube-service-details-endpointslice, which had no corresponding CustomColumnsOverride. Column definitions for Pod (.targetRef.name), Addresses (.addresses), Ready (.conditions.ready), and Node (.nodeName) have been added (@sircthulhu in #2266, #2283).
-
[piraeus-operator] Fix LINSTOR satellite alert labels, reduce scrape-flap false positives, and improve controller alerting: Three alerting issues in cozy-piraeus-operator have been addressed: (1) linstorSatelliteErrorRate used a non-existent name label in annotations, resulting in Satellite "" in alert notifications (corrected to {{ $labels.hostname }}); (2) linstorSatelliteErrorRate could produce false positives when the linstor-controller scrape flapped and historical linstor_error_reports_count counters reappeared inside the alert window (fixed by adding a minimum scrape-count guard); (3) the LinstorControllerOffline alert has been split into separate availability and metrics-availability alerts with configurable hold time to reduce noise during brief connectivity interruptions (@sasha-sup in #2265, #2286).
-
[linstor] Fix swapped VMPodScrape job labels causing incorrect controller offline alerts: The cozy-linstor VictoriaMetrics VMPodScrape templates had the job relabeling rules swapped: linstor-satellite metrics were labeled as job=linstor-controller and vice versa. This caused linstorControllerOffline alerts to fire for satellite endpoints (:9942) while reporting that the controller was unreachable. The job labels are now correctly assigned to their respective targets (@sasha-sup in #2264, #2289).
-
[boot-to-talos] Fix triple-fault on hosts with 5-level paging (LA57) enabled: On hosts with CONFIG_X86_5LEVEL=y in the kernel, kexec into Talos caused a triple-fault because the Talos kernel does not support 5-level page tables. boot-to-talos now detects LA57 before kexec and automatically patches GRUB with no5lvl, runs update-grub, and reboots. After reboot with 5-level paging disabled, boot-to-talos proceeds normally (@IvanHunters in cozystack/boot-to-talos#15).
-
[boot-to-talos] Fix EFI boot entry creation when using loop device images: The Talos installer skips EFI variable creation when running on loop devices. boot-to-talos now creates a proper UEFI boot entry with an HD() device path pointing to the real target disk's ESP by reading the GPT partition table from the target disk after image copy, instead of relying on the Talos installer (@kvaps in cozystack/boot-to-talos#16).
-
[talm] Fix silent empty output when no template files are specified: Running talm template without --file or --template flags previously produced minimal or empty output without any error. Validation has been added to engine.Render to return a clear error message when no template files are specified, making misconfigured invocations immediately apparent (@kvaps in cozystack/talm#112).
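For the boot-to-talos LA57 fix in the list above, the detect-then-patch step might look roughly like the following. The notes do not say how boot-to-talos performs the detection; this sketch assumes the la57 CPU flag in /proc/cpuinfo-style text is a usable signal, and both helper names are hypothetical.

```python
def has_la57(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists la57."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "la57" in value.split():
            return True
    return False


def add_kernel_arg(cmdline: str, arg: str = "no5lvl") -> str:
    """Append a kernel argument to a GRUB-style command line if it is missing."""
    args = cmdline.split()
    if arg not in args:
        args.append(arg)
    return " ".join(args)


# Inline sample instead of reading the real /proc/cpuinfo, so the sketch runs anywhere.
sample = "processor\t: 0\nflags\t\t: fpu vme pae la57 avx2\n"
if has_la57(sample):
    print(add_kernel_arg("quiet splash"))
```

Making add_kernel_arg idempotent matters here because the tool runs again after the reboot and should not keep appending the same argument.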
Documentation
-
[website] Add documentation for VMInstance and VMDisk backups: Added a new virtualization-focused Backup and Recovery guide covering one-off and scheduled backups for VMInstance and VMDisk resources, restore procedures, status verification commands, and troubleshooting notes including Velero-related issues (@myasnikovdaniil in cozystack/website#456).
-
[website] Update developer guide with operator-driven architecture and OCIRepository migration flow: Rewrote the development guide to describe the operator-driven in-cluster architecture, bootstrap flow, operator responsibilities, and the platform install/update sequence. Added an "OCIRepositories and Migration Flow" section with migration hook examples and sequencing rules for pre-upgrade hooks (@myasnikovdaniil in cozystack/website#458).
Full Changelog: v1.1.3...v1.1.4
v1.0.7
Fixes
-
[platform] Fix tenant admins unable to create FoundationDB, Harbor, MongoDB, OpenBAO, OpenSearch, Qdrant, and VPN applications: The cozy:tenant:admin:base ClusterRole was missing RBAC entries for the foundationdbs, harbors, mongodbs, openbaos, opensearches, qdrants, and vpns resources from apps.cozystack.io. Without these permissions, tenant admins could not create these applications: the "Add" button was inactive in the dashboard. The fix adds RBAC entries for all seven missing resources (@sircthulhu in #2268, #2271).
-
[system] Fix 403 error on Service details page for tenant users: The cozy:tenant:base and cozy:tenant:view:base ClusterRoles were missing read permissions for discovery.k8s.io/endpointslices. The dashboard requests EndpointSlices to display the "Pod serving" section on the Service details page, and without this permission tenant users received a 403 error. The fix adds get, list, and watch verbs for endpointslices to both tenant roles (@sircthulhu in #2257, #2284).
-
[dashboard] Fix "Pod serving" table showing "Raw:" prefixes and "Invalid Date" on Service details page: The EndpointSlice table on the service details page displayed raw data and broken timestamps because the EnrichedTable component referenced the factory-kube-service-details-endpointslice customization ID, which had no corresponding CustomColumnsOverride. The fix adds column definitions for Pod (.targetRef.name), Addresses (.addresses), Ready (.conditions.ready), and Node (.nodeName) (@sircthulhu in #2266, #2282).
-
[dashboard] Fix broken backup menu links missing cluster context: Backup resources (plans, backupjobs, backups) are not ApplicationDefinitions, so ensureNavigation() never created their baseFactoriesMapping entries. Without these mappings, the OpenUI frontend could not resolve the {cluster} context for backup pages, producing broken sidebar links with an empty cluster segment (e.g. /openapi-ui//tenant-root/... instead of /openapi-ui/default/tenant-root/...). The fix adds the three missing static entries to the Navigation resource (@sircthulhu in #2232, #2270).
-
[linstor] Fix swapped VMPodScrape job labels causing incorrect alerts: The job labels in the cozy-linstor VictoriaMetrics VMPodScrape templates were swapped: linstor-satellite metrics were relabeled as job=linstor-controller and vice versa. This caused linstorControllerOffline alerts to fire against satellite endpoints (:9942) while reporting the controller as unreachable. The fix ensures linstor-satellite metrics keep job=linstor-satellite and linstor-controller metrics keep job=linstor-controller, restoring consistent alerting and dashboard semantics (@sasha-sup in #2264, #2288).
-
[piraeus-operator] Fix LINSTOR satellite alert annotations and reduce false-positive alerts: Two issues in the LINSTOR alerts shipped by cozy-piraeus-operator were fixed. First, linstorSatelliteErrorRate used a non-existent name label in annotations, resulting in Satellite "" in alert notifications; it now uses {{ $labels.hostname }}. Second, linstorSatelliteErrorRate produced false positives when the linstor-controller scrape flapped and historical linstor_error_reports_count counters reappeared inside the alert window; this is fixed by requiring stable up{job="linstor-controller"} for the full 15-minute window. Additionally, the controller availability alert was split to add a dedicated warning for metrics scrape failures with a 10-minute hold time to reduce transient noise (@sasha-sup in #2265, #2287).
Documentation
-
[website] Add Backup and Recovery guide for VMInstance and VMDisk: Replaced the generic Kubernetes Backup and Recovery guide with a virtualization-focused Backup and Recovery doc covering VMInstance and VMDisk one-off and scheduled backups, restores, status checks, and troubleshooting (including Velero-related notes) (@myasnikovdaniil in cozystack/website#456).
-
[website] Update developer guide with operator-driven architecture and OCIRepository/migration flow: Rewrote the development guide to describe the operator-driven in-cluster architecture, bootstrap flow, operator responsibilities, and platform install/update sequence. Added documentation for OCIRepositories and the migration flow with migration hook examples and sequencing rules for pre-upgrade/install migrations. Also updated the concepts guide with the two-repository update model, dependency ordering rules, namespace creation behavior, and cluster-wide values injection (@myasnikovdaniil in cozystack/website#458).
Full Changelog: v1.0.6...v1.0.7
v1.2.0
Cozystack v1.2.0
⚠️ WARNING: Do not use this release. This version includes CloudNativePG operator, which updates the default PostgreSQL image to version 18. CNPG is unable to perform the migration from the previous major version automatically, which will cause PostgreSQL clusters to fail to start after the upgrade. Please use v1.2.1 instead.
Cozystack v1.2.0 delivers significant platform enhancements: a fully managed OpenSearch service joining the application catalog, VPC peering for secure inter-tenant networking, tenant workload placement control via the new SchedulingClass system, a highly-available VictoriaLogs cluster replacing the single-node setup, and Linstor volume relocation for optimized clone and snapshot restore placement. Additional highlights include external-dns as a standalone extra package, multi-node RWX volume fixes, and a wave of dashboard and monitoring improvements.
Feature Highlights
OpenSearch: Managed Search and Analytics Service
Cozystack now ships OpenSearch as a fully managed PaaS application — supporting OpenSearch v1, v2, and v3 in a multi-role topology with dedicated master, data, ingest, coordinating, and ML nodes. TLS is enabled by default, HTTP Basic auth is provided out of the box, and custom user definitions allow per-application credentials. The optional OpenSearch Dashboards UI can be enabled alongside the engine. External access, topology spread policies, and a comprehensive JSON schema are all included.
A companion opensearch-operator system package wraps the upstream Opster OpenSearch Operator v2.8.0 and adds a sysctl DaemonSet to configure the required vm.max_map_count kernel parameter on every node automatically. An ApplicationDefinition package ties everything into the Cozystack platform dashboard with schema validation and resource management.
SchedulingClass: Tenant Workload Placement
Cozystack now supports a SchedulingClass CRD that allows platform operators to define cluster-wide scheduling constraints — pinning tenant workloads to specific data centers, hardware generations, or node groups without requiring tenants to manage scheduler configuration themselves. Tenants declare a schedulingClass in their Tenant spec; the platform injects the appropriate schedulerName into all workloads in that namespace.
The lineage-controller-webhook has been extended to verify the referenced SchedulingClass CR before injection, and child tenants inherit their parent's scheduling constraints (children cannot override). A SchedulingClass dropdown in the Tenant creation form in the dashboard makes the feature fully self-service. The underlying cozystack-scheduler — a custom kube-scheduler extension with SchedulingClass-aware affinity plugins — is now installed and enabled by default as part of the platform.
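The verify-then-inject flow described above can be sketched as a pure function over plain dictionaries. This is an illustration under assumptions, not the webhook's code: the namespace label scheduling.cozystack.io/class is taken from these notes, while the schedulerName value, the function names, and the choice to leave the pod untouched when the class is unknown are all assumed.

```python
import copy
from typing import Optional

CLASS_LABEL = "scheduling.cozystack.io/class"  # namespace label named in the notes
SCHEDULER_NAME = "cozystack-scheduler"         # assumed value, not confirmed


def effective_class(parent_class: Optional[str], requested: Optional[str]) -> Optional[str]:
    """A child tenant inherits the parent's class and cannot override it."""
    return parent_class if parent_class is not None else requested


def inject_scheduler(pod: dict, namespace_labels: dict, known_classes: set) -> dict:
    """Return a copy of the pod with schedulerName set when the namespace
    references a SchedulingClass that actually exists."""
    patched = copy.deepcopy(pod)
    scheduling_class = namespace_labels.get(CLASS_LABEL)
    if scheduling_class is None or scheduling_class not in known_classes:
        return patched  # nothing to inject, or the reference failed verification
    patched.setdefault("spec", {})["schedulerName"] = SCHEDULER_NAME
    return patched


pod = {"metadata": {"name": "db-0"}, "spec": {"containers": []}}
labels = {CLASS_LABEL: effective_class(parent_class="dc-east", requested="dc-west")}
print(inject_scheduler(pod, labels, known_classes={"dc-east"})["spec"])
```

Returning a patched copy rather than mutating the input keeps the sketch close to how an admission webhook computes a patch against the original object.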
VPC Peering for Multi-Tenant Environments
The vpc application gains bilateral VPC peering using Kube-OVN's native vpcPeerings mechanism, allowing tenants to securely interconnect their private networks without routing traffic through public endpoints. Peering link-local IPs (169.254.0.0/16) are allocated deterministically from a hash of the sorted VPC pair names, ensuring stable addresses across reconciliations. Static route support (staticRoutes) enables fine-grained inter-VPC routing policies. A cozy-lib helper (hexToInt) performs the deterministic IP allocation, and a JSON Schema validation enforces the ^tenant- namespace pattern for peered VPCs.
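To make the deterministic allocation concrete, here is one way a sorted pair of VPC names could be hashed into link-local addresses. The notes do not state the hash function, how many hex digits are consumed, or how each side's address is derived; SHA-256, four hex digits, and a 4-address block are choices made purely for illustration, and peering_ips is a hypothetical name. Only the hex-to-integer step mirrors the hexToInt helper mentioned above.

```python
import hashlib
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def hex_to_int(hex_digits: str) -> int:
    """Convert a hex string to an integer."""
    return int(hex_digits, 16)


def peering_ips(vpc_a: str, vpc_b: str) -> dict:
    """Derive a stable pair of link-local addresses for two peered VPCs."""
    # Sorting makes the result independent of which side is listed first.
    first, second = sorted([vpc_a, vpc_b])
    digest = hashlib.sha256(f"{first}/{second}".encode()).hexdigest()
    # Four hex digits give 0..65535; clear the low two bits to align to a block of 4.
    offset = hex_to_int(digest[:4]) & ~0x3
    base = LINK_LOCAL.network_address + offset
    return {first: str(base + 1), second: str(base + 2)}


ips = peering_ips("tenant-b/vpc-2", "tenant-a/vpc-1")
print(ips)
print(ips == peering_ips("tenant-a/vpc-1", "tenant-b/vpc-2"))
```

Because the inputs are sorted before hashing, both sides of a peering compute the same pair of addresses on every reconciliation. A real implementation would also need to handle two different pairs hashing to the same block, which this sketch ignores.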
VictoriaLogs: Clustered Mode for High Availability
The platform's log storage has been upgraded from the deprecated single-node VLogs CR to a VLCluster deployment with separate vlinsert, vlselect, and vlstorage components, each running with 2 replicas by default — consistent with the existing VMCluster setup. This brings horizontal scalability and resilience to the logging tier. VPA autoscaling is enabled for all VLCluster components, and the victoria-metrics-operator has been upgraded from v0.55.0 to v0.68.1 to add VLCluster CRD support.
Linstor CSI: Volume Relocation After Clone and Restore
The Linstor CSI driver now carries upstream patches enabling automatic replica relocation after PVC clone and snapshot restore operations. Two new parameters control the behavior: linstor.csi.linbit.com/relocateAfterClone on StorageClasses moves replicas to optimal nodes after a clone, and snap.linstor.csi.linbit.com/relocate-after-restore on VolumeSnapshotClasses does the same after a restore. VolumeSnapshotClasses for Velero and Kasten use cases are pre-configured. This enables full PVC-level backup and restore workflows with automatic data rebalancing, a key prerequisite for production Velero/Kasten integrations.
Major Features and Improvements
-
[apps] Add managed OpenSearch service: Deployed as a PaaS application supporting OpenSearch v1/v2/v3 with multi-role node topology, TLS, HTTP Basic auth, custom users, optional OpenSearch Dashboards UI, external access, and topology spread policies; backed by the Opster OpenSearch Operator v2.8.0 and a sysctl DaemonSet for vm.max_map_count (@matthieu-robin in #1953).
-
[vpc] Add VPC peering support for multi-tenant environments: Bilateral VPC peering via Kube-OVN's vpcPeerings, deterministic link-local IP allocation from sorted VPC pair hash, static routes support, ConfigMap peer discovery enrichment, and JSON Schema validation enforcing the ^tenant- namespace pattern (@mattia-eleuteri in #2152).
-
[monitoring] Migrate VictoriaLogs from VLogs to VLCluster: Replaced the deprecated single-node VLogs CR with a clustered VLCluster (vlinsert/vlselect/vlstorage, 2 replicas each), added VPA for all components, upgraded victoria-metrics-operator to v0.68.1 (@sircthulhu in #2153).
-
[scheduler] Integrate SchedulingClass support for tenant workloads: Added schedulingClass Tenant parameter with inheritance enforcement, scheduling.cozystack.io/class namespace label, lineage-webhook extension to verify and inject schedulerName, SchedulingClass dropdown in Tenant dashboard form (@sircthulhu in #2223).
-
[cozystack-scheduler] Add custom scheduler as an optional system package: Vendored cozystack-scheduler from github.com/cozystack/cozystack-scheduler, a kube-scheduler extension with SchedulingClass-aware affinity plugins, including Helm chart with RBAC, ConfigMap, Deployment, and CRD (@lllamnyp in #2205).
-
[platform] Enable cozystack-scheduler by default: The cozystack-scheduler and SchedulingClass CRD are now installed as default system packages; the backup tool has been moved to optional packages (@lllamnyp in #2253).
-
[extra] Add external-dns as a standalone extra package: Packaged external-dns as an installable extra (tenant-level) component for automatic DNS record management from Kubernetes Service and Ingress resources (@mattia-eleuteri in #1988).
-
[linstor] Add linstor-csi patches for clone/snapshot relocation: New patch enabling the relocateAfterClone StorageClass parameter and the relocate-after-restore VolumeSnapshotClass parameter; pre-configured VolumeSnapshotClasses for Velero and relocation workflows; CDI switched to csi-clone strategy (@kvaps in #2133).
-
[monitoring] Add inlineScrapeConfig support to tenant vmagent: Tenants can now define inline scrape configurations directly in their VMAgent spec, enabling custom metrics collection from services that are not discoverable via standard Kubernetes service discovery (@mattia-eleuteri in #2200).
-
[monitoring] Add Slack dashboard URL, vmagent environment label, and dynamictext Grafana plugin: Added SLACK_DASHBOARD_URL and SLACK_SUMMARY_FMT environment variables for richer alert notifications, a per-vmagent environment label for metric source identification, and the dynamictext-panel plugin for Grafana dashboards (@vnyakas in #2210).
-
[monitoring] Scope infrastructure dashboards to tenant-root only: Infrastructure-level Grafana dashboards are now scoped to the tenant-root namespace only, preventing them from appearing in tenant sub-namespaces and reducing dashboard noise (@mattia-eleuteri in #2197).
-
[tenant] Allow egress to virt-handler for VM metrics scraping: Extended tenant NetworkPolicy to permit egress to virt-handler pods, enabling Prometheus to scrape VM-level metrics from KubeVirt without additional policy exceptions (@mattia-eleuteri in #2199).
-
[dashboard] Add keycloakInternalUrl for backend-to-backend OIDC requests: Added a keycloakInternalUrl platform value for the dashboard backend to perform OIDC token introspection via an internal cluster URL, avoiding external round-trips and improving reliability in air-gapped environments (@sircthulhu in #2224).
-
[dashboard] Add secret-hash annotation to KeycloakClient for secret sync: Added a secret-hash annotation to the KeycloakClient resource so that changes to the client secret trigger automatic reconciliation and propagation to dependent components (@sircthulhu in #2231).
-
[docs] Add OpenAPI and Go types code generation for apps: Added tooling to generate OpenAPI schemas and Go types from Helm chart values, enabling type-safe programmatic access to managed application configurations and automatic API reference ge...