Production-Ready Kubernetes Operator for Zero-Code Python Application Instrumentation
The OpenLIT Operator automatically instruments existing Python applications in your Kubernetes cluster with comprehensive observability - without changing a single line of application code.
- No application restarts required for existing workloads
- No code changes - just add a label to your pods
- Runtime injection using admission controller webhook
- Automatic OpenLIT SDK installation and configuration
- Automatic TLS certificate management following Velotio best practices
- Self-signed certificates with automatic rotation
- RBAC with minimal required permissions
- Secure webhook with configurable failure policies
- Environment-based configuration - no hardcoded values
- Flexible OTLP endpoints for any backend
- Customizable service names and environments
- Configurable certificate validity and rotation
- Multiple failure policy options
```shell
# 1. OpenLIT backend deployed
helm repo add openlit https://openlit.github.io/helm/
helm repo update
helm install openlit openlit/openlit

# 2. Kubernetes cluster (1.20+)
kubectl version --client
```

```shell
# Single command deployment - everything included!
kubectl apply -f https://raw.githubusercontent.com/openlit/openlit/main/operator/deploy/openlit-operator.yaml
```

```shell
# Just add a label to your existing deployments
kubectl patch deployment your-python-app -p '{"spec":{"template":{"metadata":{"labels":{"openlit.io/instrument":"true"}}}}}'

# Or add to your YAML:
# metadata:
#   labels:
#     openlit.io/instrument: "true"
```

```mermaid
graph TB
    A[Your Python App] -->|Labeled| B[Kubernetes API]
    B -->|Pod Create| C[OpenLIT Webhook]
    C -->|Mutate Pod| D[Init Container Injection]
    D -->|SDK Install| E[Auto-Instrumented App]
    E -->|Traces/Metrics| F[OpenLIT Backend]
    G[Certificate Manager] -->|TLS Certs| C
    H[Webhook Config] -->|CA Bundle| B
```
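In the flow above, the webhook's mutation step reduces to a label check plus a JSONPatch response. A minimal sketch, treating the AdmissionReview pod as a plain dict; the init-container image, command, and mount path are illustrative assumptions, not the operator's actual values:

```python
# Sketch of the webhook's mutation decision. Only the label key/value and
# the "openlit-init" container name come from this document; everything
# else is assumed for illustration.
def should_instrument(pod: dict) -> bool:
    labels = pod.get("metadata", {}).get("labels", {})
    return labels.get("openlit.io/instrument") == "true"

def build_patch(pod: dict) -> list:
    """Return JSONPatch ops injecting an SDK-installing init container."""
    if not should_instrument(pod):
        return []  # leave unlabeled pods untouched
    init_container = {
        "name": "openlit-init",
        "image": "python:3.11-slim",  # assumption
        "command": ["sh", "-c", "pip install --target=/instrumentation openlit"],
        "volumeMounts": [{"name": "openlit-sdk", "mountPath": "/instrumentation"}],
    }
    ops = []
    if "initContainers" not in pod.get("spec", {}):
        # JSONPatch "/-" appends only to an existing array, so create it first
        ops.append({"op": "add", "path": "/spec/initContainers", "value": []})
    ops.append(
        {"op": "add", "path": "/spec/initContainers/-", "value": init_container}
    )
    return ops
```

Returning an empty patch list for unlabeled pods keeps the webhook a no-op for the rest of the cluster.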
- Pod Creation: Kubernetes API receives a pod creation request
- Webhook Interception: OpenLIT webhook intercepts pods with `openlit.io/instrument: "true"`
- Init Container Injection: adds an init container that installs the OpenLIT SDK
- Runtime Configuration: sets environment variables and the Python path
- Automatic Initialization: SDK auto-initializes via `sitecustomize.py`
- Zero-Code Tracing: application runs normally with full observability
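The last two steps work because Python automatically imports a `sitecustomize` module found on `PYTHONPATH` at interpreter startup. A minimal sketch of what the injected module might do (the `openlit.init` keyword arguments are assumptions based on the configuration table):

```python
# Hypothetical sitecustomize.py dropped onto PYTHONPATH by the init
# container. Python imports this module automatically at startup, so the
# application needs no code changes. It must never raise, or it would
# break every labeled pod at boot.
import os

def auto_init() -> bool:
    try:
        import openlit  # installed into the shared volume by the init container
    except ImportError:
        return False  # SDK missing: run uninstrumented rather than crash
    try:
        openlit.init(
            otlp_endpoint=os.environ.get(
                "OPENLIT_OTLP_ENDPOINT",
                "http://openlit.default.svc.cluster.local:4318",
            ),
            environment=os.environ.get(
                "OPENLIT_DEFAULT_ENVIRONMENT", "kubernetes"
            ),
        )
        return True
    except Exception:
        return False  # instrumentation failure must not take down the app

initialized = auto_init()
```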
The operator automatically handles TLS certificates using the Velotio approach:
- ✅ Self-signed CA generation on startup
- ✅ Server certificate creation with proper SANs
- ✅ Kubernetes Secret storage for certificate persistence
- ✅ Automatic webhook registration with CA bundle
- ✅ Certificate rotation based on configurable thresholds
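The rotation bullet boils down to a threshold check against the certificate's expiry timestamp; a sketch using the `CERT_REFRESH_DAYS` semantics (the helper name is illustrative):

```python
from datetime import datetime, timedelta, timezone

def needs_rotation(not_after: datetime, refresh_days: int = 30) -> bool:
    """True when the certificate expires within the refresh threshold
    (CERT_REFRESH_DAYS), meaning a replacement should be issued now."""
    remaining = not_after - datetime.now(timezone.utc)
    return remaining <= timedelta(days=refresh_days)
```

With the defaults (365-day validity, 30-day threshold), a certificate is reissued once fewer than 30 days remain.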
| Variable | Default | Description |
|---|---|---|
| `OPENLIT_OTLP_ENDPOINT` | `http://openlit.default.svc.cluster.local:4318` | OpenLIT backend OTLP endpoint |
| `OPENLIT_DEFAULT_ENVIRONMENT` | `kubernetes` | Default environment tag |
| `WEBHOOK_SERVICE_NAME` | `openlit-webhook-service` | Webhook service name |
| `WEBHOOK_PORT` | `9443` | Webhook server port |
| `CERT_VALIDITY_DAYS` | `365` | Certificate validity period |
| `CERT_REFRESH_DAYS` | `30` | Certificate refresh threshold |
| `WEBHOOK_FAILURE_POLICY` | `Ignore` | Webhook failure behavior (`Ignore`/`Fail`) |
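These variables might be resolved in one place inside the operator process; a sketch with names and defaults taken from the table (the loader itself is illustrative):

```python
import os

# Defaults mirror the configuration table above.
DEFAULTS = {
    "OPENLIT_OTLP_ENDPOINT": "http://openlit.default.svc.cluster.local:4318",
    "OPENLIT_DEFAULT_ENVIRONMENT": "kubernetes",
    "WEBHOOK_SERVICE_NAME": "openlit-webhook-service",
    "WEBHOOK_PORT": "9443",
    "CERT_VALIDITY_DAYS": "365",
    "CERT_REFRESH_DAYS": "30",
    "WEBHOOK_FAILURE_POLICY": "Ignore",
}

def load_config(env=None) -> dict:
    """Resolve settings from the environment, falling back to defaults."""
    env = os.environ if env is None else env
    cfg = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    # Parse numeric settings up front so bad values fail fast at startup.
    for key in ("WEBHOOK_PORT", "CERT_VALIDITY_DAYS", "CERT_REFRESH_DAYS"):
        cfg[key] = int(cfg[key])
    if cfg["WEBHOOK_FAILURE_POLICY"] not in ("Ignore", "Fail"):
        raise ValueError("WEBHOOK_FAILURE_POLICY must be Ignore or Fail")
    return cfg
```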
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: openlit-operator-config
  namespace: openlit
data:
  OPENLIT_OTLP_ENDPOINT: "http://your-backend:4318"
  OPENLIT_DEFAULT_ENVIRONMENT: "production"
  OPENLIT_CAPTURE_MESSAGE_CONTENT: "true"
```

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-python-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-python-app
  template:
    metadata:
      labels:
        app: my-python-app
        # 🎯 This label triggers automatic instrumentation
        openlit.io/instrument: "true"
    spec:
      containers:
      - name: app
        image: python:3.11-slim
        command: ["python", "-c"]
        args:
        - |
          # 🚨 ZERO OpenLIT imports - completely vanilla code!
          import requests
          import time
          while True:
              response = requests.get("https://httpbin.org/json")
              print(f"Response: {response.status_code}")
              time.sleep(10)
```

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: langchain-app
spec:
  selector:
    matchLabels:
      app: langchain-app
  template:
    metadata:
      labels:
        app: langchain-app
        openlit.io/instrument: "true"  # 🎯 Auto-instrumentation
    spec:
      containers:
      - name: langchain-app
        image: python:3.11
        command: ["python", "-c"]
        args:
        - |
          # Pure LangChain code - no OpenLIT imports!
          from langchain.llms import OpenAI
          llm = OpenAI(api_key="your-key")
          response = llm("Hello, world!")
          print(response)
```

```shell
# Operator health
kubectl get pods -n openlit
kubectl logs -n openlit deployment/openlit-operator

# Webhook configuration
kubectl get mutatingwebhookconfigurations openlit-instrumentation-webhook

# TLS certificates
kubectl get secrets -n openlit openlit-webhook-certs
```

```shell
# Check if pods are instrumented
kubectl get pods -l openlit.io/instrument=true
kubectl describe pod <pod-name> | grep openlit

# Check application logs for SDK initialization
kubectl logs <pod-name> | grep "OpenLIT instrumentation initialized"
```

- 🌐 OpenLIT Dashboard: http://localhost:3000 (after `kubectl port-forward`)
- 📊 Check traces from your instrumented applications
- 🔍 Monitor metrics and performance data
```shell
# Check operator logs
kubectl logs -n openlit deployment/openlit-operator

# Common causes:
# - Missing RBAC permissions
# - TLS certificate generation failure
# - Webhook configuration issues
```

```shell
# Verify label is present
kubectl get pods --show-labels | grep openlit

# Check webhook logs
kubectl logs -n openlit deployment/openlit-operator | grep "pod not marked"

# Verify webhook is registered
kubectl get mutatingwebhookconfigurations openlit-instrumentation-webhook -o yaml
```

```shell
# Check init container logs
kubectl logs <pod-name> -c openlit-init

# Check main container logs
kubectl logs <pod-name> -c <main-container>

# Common causes:
# - Python path issues
# - SDK installation failure
# - Missing dependencies
```

```shell
# Enable verbose logging
kubectl set env deployment/openlit-operator -n openlit LOG_LEVEL=debug

# Check detailed webhook processing
kubectl logs -n openlit deployment/openlit-operator -f
```

```yaml
# Scale operator for HA
spec:
  replicas: 2
  # Add anti-affinity
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: openlit-operator
          topologyKey: kubernetes.io/hostname
```

```yaml
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi
```

```yaml
# For external OpenLIT backend
env:
- name: OPENLIT_OTLP_ENDPOINT
  value: "https://your-openlit-backend.com:4318"
```

If you have applications already instrumented with the OpenLIT SDK:
- Remove OpenLIT imports from your application code
- Add the instrumentation label `openlit.io/instrument: "true"`
- Redeploy - the operator will handle everything automatically
- Verify traces continue flowing to your backend
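The first step on a hypothetical app: delete the manual SDK setup and keep only business code; the operator re-creates the equivalent initialization at runtime.

```python
# Before migration, application code carried explicit SDK setup, e.g.:
#   import openlit
#   openlit.init(otlp_endpoint="http://openlit:4318")
#
# After migration: vanilla code only - the pod label, not an import,
# turns on instrumentation. (handle_request is a stand-in for your app.)
def handle_request(name: str) -> str:
    return f"hello, {name}"

print(handle_request("world"))
```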
- ✅ Any Python application (3.8+)
- ✅ LangChain applications
- ✅ OpenAI API clients
- ✅ Requests/HTTP clients
- ✅ Database libraries (SQLAlchemy, psycopg2, etc.)
- ✅ Web frameworks (FastAPI, Flask, Django)
- ✅ Kubernetes 1.20+
- ✅ All distributions (EKS, GKE, AKS, k3s, etc.)
- ✅ ARM64 and AMD64 architectures
- 🔍 Distributed tracing across all instrumented services
- 📊 Custom metrics and performance monitoring
- 💰 Cost tracking for LLM API calls
- 🔒 Privacy controls for sensitive data
- 🎯 Business intelligence and analytics
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- 📖 Documentation: docs.openlit.io
- 💬 Community: OpenLIT Slack
- 🐛 Issues: GitHub Issues
- 📧 Email: [email protected]
Built with ❤️ by the OpenLIT team | Production-ready zero-code observability for Kubernetes