A production-ready observability stack template with OpenTelemetry, Prometheus, Loki, Tempo, and Grafana, all routed through Traefik with automatic HTTPS.
This template provides a complete observability stack that collects, stores, and visualizes:
- Metrics (Prometheus)
- Logs (Loki)
- Traces (Tempo)
- Dashboards (Grafana)
All externally exposed services sit behind the Traefik reverse proxy, which redirects HTTP to HTTPS and terminates TLS.
Real-time monitoring dashboard showing system status, resource utilization, and container metrics
┌─────────────────────────────────────────────────────────┐
│                         Traefik                         │
│             (Reverse Proxy & Load Balancer)             │
│       Routes: *.localhost with HTTPS/TLS support        │
└─────────────────────────────────────────────────────────┘
                             │
         ┌───────────────────┼─────────────────────┐
         │                   │                     │
         ▼                   ▼                     ▼
    ┌─────────┐         ┌──────────┐          ┌──────────┐
    │   Web   │         │ Grafana  │          │Portainer │
    │   :80   │         │  :3000   │          │  :9000   │
    │ (Nginx) │         └──────────┘          └──────────┘
    └─────────┘              │
                             │
          ┌──────────────────────────────────────┐
          │                                      │
          ▼                                      ▼
   ┌──────────────┐                       ┌──────────────┐
   │OpenTelemetry │◄─────────────────────►│  Prometheus  │
   │  Collector   │  (Metrics & Traces)   │    :9090     │
   │    :4318     │                       └──────────────┘
   └──────────────┘                              │
          │                                      │
      ┌───┴────┬──────────────────┐              │
      ▼        ▼                  ▼              ▼
  ┌──────┐ ┌───────┐         ┌──────────┐   ┌──────────┐
  │ Loki │ │ Tempo │         │Node Exp. │   │ cAdvisor │
  │:3100 │ │ :3200 │         │  :9100   │   │  :8080   │
  └──────┘ └───────┘         └──────────┘   └──────────┘
   (Logs)  (Traces)         (Host Metrics)  (Container)
- Docker & Docker Compose
- mkcert (for local HTTPS certificates)
# Install mkcert (if not already installed)
# macOS
brew install mkcert
# Linux
apt install mkcert # or your package manager
# Generate certificates
cd infrastructure/traefik/certs
mkcert -install
mkcert -cert-file local-dev.crt -key-file local-dev.key \
  "localhost" \
  "*.localhost" \
  "traefik.localhost" \
  "portainer.localhost" \
  "grafana.localhost" \
  "prometheus.localhost" \
  "cadvisor.localhost"

# Copy the example environment file
cp example.env .env
# Edit .env with your preferences
nano .env

# Start all services
docker compose up -d
# Check service status
docker compose ps
# View logs
docker compose logs -f

Once running, access services at:
| Service | URL | Description |
|---|---|---|
| Web | https://localhost | Landing page (index.html) |
| Grafana | https://grafana.localhost | Main dashboard & visualization |
| Traefik | https://traefik.localhost | Reverse proxy dashboard |
| Portainer | https://portainer.localhost | Docker management UI |
| Prometheus | Internal only | Metrics database |
- Purpose: Serves the main landing page at the root domain
- Technology: Nginx Alpine
- Features:
  - Lightweight static file server
  - Hosts index.html with animated clock
  - Accessible at the root domain (e.g., https://localhost)
- Resources: 32MB memory, 0.1 CPU
- Purpose: Routes all traffic with automatic HTTPS
- Features:
  - Automatic service discovery via Docker labels
  - HTTP to HTTPS redirection
  - TLS certificate management
  - Access logs, metrics, and traces sent to OpenTelemetry
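
The features above correspond to a handful of Traefik static configuration settings. The following is a minimal sketch, assuming Traefik v3 and a YAML static configuration file. The entry point name websecure comes from the routing labels in this README; the file name, the web entry point name, and the OTLP paths are assumptions, not the template's actual configuration.

```yaml
# traefik.yml (hypothetical static configuration, Traefik v3 syntax)
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure   # send all plain HTTP traffic to the HTTPS entry point
          scheme: https
  websecure:
    address: ":443"

providers:
  docker:
    exposedByDefault: false   # route only containers labeled traefik.enable=true

accessLog:
  format: json   # JSON access logs are easier to parse downstream

metrics:
  otlp:
    http:
      endpoint: http://otel-collector:4318/v1/metrics   # assumed collector address

tracing:
  otlp:
    http:
      endpoint: http://otel-collector:4318/v1/traces    # assumed collector address
```
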
Example routing configuration:
labels:
  - traefik.enable=true
  - traefik.http.routers.myapp.rule=Host(`myapp.${DOMAIN_NAME}`)
  - traefik.http.routers.myapp.entrypoints=websecure
  - traefik.http.routers.myapp.tls=true
  - traefik.http.services.myapp.loadbalancer.server.port=8080

- Purpose: Central telemetry data collector
- Receives: Logs, metrics, and traces from Traefik and applications
- Exports to: Loki (logs), Tempo (traces), Prometheus (metrics)
- Endpoints:
  - HTTP: http://otel-collector:4318
  - gRPC: http://otel-collector:4317
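
The receive-and-export flow described above could be expressed in a Collector configuration along these lines. This is a sketch under stated assumptions, not the template's actual file: it assumes the contrib distribution of the Collector, a Loki version that accepts OTLP at /otlp, a Tempo instance with its OTLP gRPC receiver enabled on port 4317, and the hostnames loki and tempo. Port 8889 for the Prometheus exporter is also an assumption.

```yaml
# otel-collector config (hypothetical sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}   # batch telemetry before export to reduce request overhead

exporters:
  otlphttp/loki:
    endpoint: http://loki:3100/otlp   # assumed Loki OTLP ingestion endpoint
  otlp/tempo:
    endpoint: tempo:4317              # assumed Tempo OTLP gRPC receiver
    tls:
      insecure: true                  # plain-text traffic inside the Docker network
  prometheus:
    endpoint: 0.0.0.0:8889            # exposes metrics for Prometheus to scrape

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
```
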
- Purpose: Time-series metrics database
- Scrapes:
  - Prometheus itself
  - OpenTelemetry Collector
  - Node Exporter (host metrics)
  - cAdvisor (container metrics)
- Retention: Configurable via environment variables
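
One plausible way to wire the retention variables into Prometheus is through its command-line flags. The sketch below assumes the service is named prometheus and uses the prom/prometheus image; the variable names come from the environment section of this README, but how the template actually passes them is not confirmed here.

```yaml
# compose.yml fragment (hypothetical sketch)
services:
  prometheus:
    image: prom/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      # Delete samples older than this duration (e.g. 30d)
      - --storage.tsdb.retention.time=${PROMETHEUS_RETENTION_TIME}
      # Delete oldest data once storage exceeds this size (e.g. 10GB)
      - --storage.tsdb.retention.size=${PROMETHEUS_RETENTION_SIZE}
```

When both flags are set, whichever limit is reached first triggers cleanup.
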
- Purpose: Log aggregation system
- Receives: Logs from OpenTelemetry Collector
- Query: Via Grafana with LogQL
- Purpose: Distributed tracing backend
- Receives: Traces from OpenTelemetry Collector
- Query: Via Grafana with TraceQL
- Purpose: Visualization and dashboards
- Pre-configured datasources:
  - Prometheus (metrics)
  - Loki (logs)
  - Tempo (traces)
- Pre-loaded dashboards:
  - System Overview
  - Traefik Dashboard
  - Node Exporter
  - cAdvisor
  - Loki Logs
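
Pre-configured datasources in Grafana are typically declared in a provisioning file. The following is a minimal sketch in Grafana's datasource provisioning format; the hostnames prometheus, loki, and tempo are assumptions, and the ports are taken from the architecture diagram.

```yaml
# grafana/provisioning/datasources/datasources.yml (hypothetical path)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090   # assumed service hostname
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100         # assumed service hostname
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo:3200        # assumed service hostname
```
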
- Purpose: Host machine metrics
- Exports: CPU, memory, disk, network stats
- Purpose: Container metrics
- Exports: Container resource usage and performance
- Purpose: Docker container management
- Features: Web UI for managing containers, images, volumes, networks
# Domain configuration
DOMAIN_NAME=localhost # Base domain
TRAEFIK_SUBDOMAIN=traefik # Traefik dashboard subdomain
PORTAINER_SUBDOMAIN=portainer # Portainer subdomain
GRAFANA_SUBDOMAIN=grafana # Grafana subdomain
PROMETHEUS_SUBDOMAIN=prometheus # Prometheus subdomain
CADVISOR_SUBDOMAIN=cadvisor # cAdvisor subdomain
# Prometheus retention
PROMETHEUS_RETENTION_TIME=30d # How long to keep metrics
PROMETHEUS_RETENTION_SIZE=10GB # Max storage size
# Grafana credentials
GRAFANA_ADMIN_USER=admin # Grafana admin username
GRAFANA_ADMIN_PASSWORD=secret # Grafana admin password

To add your own application to this stack with Traefik routing and observability:
services:
  myapp:
    image: myapp:latest
    container_name: myapp
    restart: unless-stopped
    networks:
      - frontend_network
    labels:
      # Enable Traefik
      - traefik.enable=true
      # Configure routing
      - traefik.http.routers.myapp.rule=Host(`myapp.${DOMAIN_NAME}`)
      - traefik.http.routers.myapp.entrypoints=websecure
      - traefik.http.routers.myapp.tls=true
      # Configure service
      - traefik.http.services.myapp.loadbalancer.server.port=8080
      - traefik.docker.network=frontend_network
    # Optional: Send telemetry to OpenTelemetry Collector
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
      - OTEL_SERVICE_NAME=myapp

The stack includes pre-configured dashboards:
- System Health Overview - Overall system health and metrics with real-time stats
  - System status indicator
  - Running containers count
  - CPU, Memory, and Disk usage at a glance
  - HTTP request rate and error monitoring
  - Average response time tracking
  - Detailed resource utilization graphs (CPU by core, Memory breakdown, Disk I/O)
  - Container-level metrics
- Traefik Dashboard - Request rates, latency, status codes
- Node Exporter - Host machine metrics
- cAdvisor - Container resource usage
- Loki Logs - Log exploration and analysis
Access at: https://grafana.localhost
Default credentials (if set in .env):
- Username: ${GRAFANA_ADMIN_USER}
- Password: ${GRAFANA_ADMIN_PASSWORD}
- Collected by Prometheus every 15 seconds
- Available in Grafana for visualization
- Pre-configured scrape configs for all services
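
A scrape configuration matching the points above might look like the following sketch. The 15-second interval and the list of scrape targets come from this README; the job names, hostnames, and the Collector's metrics port 8889 are assumptions.

```yaml
# prometheus.yml (hypothetical sketch)
global:
  scrape_interval: 15s   # collect metrics every 15 seconds

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]        # Prometheus scraping itself
  - job_name: otel-collector
    static_configs:
      - targets: ["otel-collector:8889"]   # assumed Collector metrics port
  - job_name: node-exporter
    static_configs:
      - targets: ["node-exporter:9100"]    # assumed hostname, host metrics
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]         # assumed hostname, container metrics
```
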
- Traefik access logs sent to Loki via OpenTelemetry
- JSON format for easy parsing
- Queryable via LogQL in Grafana
- Traefik distributed tracing enabled
- Traces sent to Tempo via OpenTelemetry
- Queryable via TraceQL in Grafana
- Correlations with logs and metrics
- Prometheus alert rules in infrastructure/prometheus/alerts/
- System alerts (CPU, memory, disk)
- Traefik alerts (error rates, latency)
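
For orientation, a system alert of the kind listed above could be written as the following Prometheus rule. This is an illustrative sketch only: the group name, alert name, 90% threshold, and 5-minute window are hypothetical and are not taken from the rules shipped in infrastructure/prometheus/alerts/. The metric node_cpu_seconds_total is the standard Node Exporter CPU counter.

```yaml
# example alert rule file (hypothetical)
groups:
  - name: system
    rules:
      - alert: HighCpuUsage
        # Percentage of non-idle CPU time per instance over the last 5 minutes
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m   # condition must hold for 5 minutes before the alert fires
        labels:
          severity: warning
        annotations:
          summary: "CPU usage above 90% on {{ $labels.instance }}"
```
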
- All services run with the no-new-privileges security option
- Backend services isolated in internal network
- Frontend services accessible only via Traefik
- Read-only root filesystems where applicable
- Non-root users for Loki, Tempo, Prometheus, Grafana
- TLS encryption for all external traffic
- Docker socket mounted as read-only
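
In Docker Compose terms, the hardening measures above could look roughly like this. The service name example, the user ID, and the network name backend_network are hypothetical; the template's own compose.yml may combine these options differently per service.

```yaml
# compose.yml fragment (hypothetical sketch)
services:
  example:
    image: example:latest
    security_opt:
      - no-new-privileges:true   # block privilege escalation via setuid binaries
    read_only: true              # read-only root filesystem
    tmpfs:
      - /tmp                     # writable scratch space despite read_only
    user: "65534:65534"          # run as a non-root user (assumed UID/GID)
    volumes:
      # Only needed by services that query Docker; mounted read-only
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - backend_network

networks:
  backend_network:
    internal: true               # no direct access to or from outside the host
```
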
All services have configured resource limits:
| Service | Memory | CPU |
|---|---|---|
| Web | 32MB | 0.1 |
| Traefik | 128MB | 0.1 |
| OpenTelemetry | 128MB | 0.1 |
| Prometheus | 128MB | 0.1 |
| Grafana | 128MB | 0.1 |
| Loki | 128MB | 0.1 |
| Tempo | 64MB | 0.1 |
| Node Exporter | 16MB | 0.1 |
| cAdvisor | 64MB | 0.1 |
Adjust in compose.yml based on your needs.
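
As a reference point for adjusting limits, a per-service limit in Compose can be written as below. The values mirror the Grafana row of the table; whether the template uses deploy.resources.limits or another mechanism such as mem_limit is an assumption.

```yaml
# compose.yml fragment (hypothetical sketch)
services:
  grafana:
    deploy:
      resources:
        limits:
          memory: 128M   # hard memory cap for the container
          cpus: "0.1"    # at most 10% of one CPU core
```
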
# All services
docker compose logs -f
# Specific service
docker compose logs -f grafana

# All services
docker compose restart
# Specific service
docker compose restart prometheus

# Pull latest images
docker compose pull
# Recreate containers
docker compose up -d

# Backup all volumes
docker compose down
sudo tar -czf o11y-backup.tar.gz \
/var/lib/docker/volumes/o11y-stack-template_*
docker compose up -d

# Check logs
docker compose logs <service-name>
# Check health status
docker compose ps

- Ensure certificates are generated correctly
- Check if mkcert root CA is installed: mkcert -install
- Verify DNS resolution: nslookup grafana.localhost

- Adjust resource limits in compose.yml
- Reduce Prometheus retention time
- Increase scrape intervals (scrape less often)
- Traefik Documentation
- OpenTelemetry Collector
- Prometheus Documentation
- Loki Documentation
- Tempo Documentation
- Grafana Documentation
This template is provided as-is for use in your projects.
Feel free to submit issues and enhancement requests!
Built with ❤️ for modern observability

