Get Ampel's observability stack running in 5 minutes.
Prerequisites:
- Docker and Docker Compose installed
- Ampel services running (API, Worker, PostgreSQL, Redis)
# Start monitoring services
make monitoring-up
# Or manually
docker-compose -f docker/docker-compose.monitoring.yml up -d
This starts:
- Prometheus (http://localhost:9090) - Metrics storage
- Grafana (http://localhost:3000) - Dashboards (admin/admin)
- Postgres Exporter (http://localhost:9187/metrics)
- Redis Exporter (http://localhost:9121/metrics)
- Loki (http://localhost:3100) - Log aggregation
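The services listed above can take a few seconds to accept connections after startup. The following is a minimal sketch of a polling helper, not part of Ampel's tooling: `retry` and `check_url` are hypothetical names introduced here, and the example URLs assume the default ports from this guide together with the standard readiness endpoints of Prometheus (`/-/ready`), Grafana (`/api/health`), and Loki (`/ready`).

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or ATTEMPTS runs have failed.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$retry_i" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_i=$((retry_i + 1))
  done
  return 1
}

# check_url URL
# Succeeds if the request completes; curl -f fails on HTTP status >= 400.
check_url() {
  curl -fsS -o /dev/null --max-time 5 "$1"
}

# Example usage (assumes the default ports from this guide):
#   retry 10 3 check_url http://localhost:9090/-/ready
#   retry 10 3 check_url http://localhost:3000/api/health
#   retry 10 3 check_url http://localhost:3100/ready
```

Keeping the retry loop separate from the URL check means the same helper can wrap any other command, for example a `docker-compose` status check.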
# API health check
curl http://localhost:8080/health
# API readiness check
curl http://localhost:8080/ready
# View Prometheus metrics
curl http://localhost:8080/metrics
- Open http://localhost:3000
- Login with admin/admin
- Navigate to Dashboards > Ampel Overview
- Open http://localhost:9090
- Try these queries:
# Request rate
rate(http_requests_total[5m])
# Error rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
# Database connections
pg_stat_database_numbackends
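The same queries can also be run from a terminal through the Prometheus HTTP API instead of the web UI. This is a rough sketch: `prom_query` is a hypothetical helper name, and the default base URL assumes the Prometheus port used in this guide.

```shell
# prom_query QUERY [BASE_URL]
# Sends an instant query to the Prometheus HTTP API and prints the JSON response.
# BASE_URL defaults to the Prometheus address used in this guide.
prom_query() {
  # -G turns the url-encoded data into a query string on a GET request.
  curl -fsS -G "${2:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Example usage:
#   prom_query 'rate(http_requests_total[5m])'
#   prom_query 'pg_stat_database_numbackends'
```

The response is a JSON document with a `status` field and the matching series under `data.result`.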
The pre-configured dashboard shows:
- HTTP request rate by endpoint
- Request duration (p95)
- Status code distribution
- Database connection count
- Active pull requests
# View logs
make monitoring-logs
# Restart services
make monitoring-restart
# Check health
make monitoring-health
# Stop monitoring
make monitoring-down
# Clean all data
make monitoring-clean
Alerts are configured in /monitoring/alerts/ampel.yml:
- High error rate (>5% for 5min)
- High latency (p95 >1s for 10min)
- Database down
- Service unavailable
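For orientation, a rule for the first alert in the list might look roughly like the sketch below. The actual contents of ampel.yml are not reproduced here, so the group name, alert name, `severity` label, and annotation text are all assumptions; only the expression (taken from the error-rate query in this guide) and the 5% / 5 minute thresholds come from the text above.

```yaml
groups:
  - name: ampel-example          # hypothetical group name
    rules:
      - alert: HighErrorRate     # hypothetical alert name
        # Share of 5xx responses over the last 5 minutes, compared to 5%.
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m                  # must hold for 5 minutes before firing
        labels:
          severity: critical     # assumed label; adjust to your routing
        annotations:
          summary: "5xx responses exceed 5% of requests"
```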
Check that the targets in prometheus.yml match your service ports:
- targets: ['api:8080'] # Should match your API port
- Verify the Prometheus data source in Grafana: Configuration > Data Sources
- Check Prometheus is scraping: http://localhost:9090/targets
- Verify metrics endpoint returns data: curl http://localhost:8080/metrics
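The scrape check above can also be done without the browser by reading Prometheus's `/api/v1/targets` endpoint. The sketch below counts targets whose reported health is anything other than `up`; `count_targets_not_up` is a hypothetical helper, and it relies on simple text splitting rather than a real JSON parser, so treat it as a rough check (a tool such as `jq` would be more robust if it is available).

```shell
# count_targets_not_up
# Reads a /api/v1/targets JSON response on stdin and prints the number of
# targets whose "health" field is not "up" (for example "down" or "unknown").
count_targets_not_up() {
  # Put each JSON field on its own line, then inspect only the health fields.
  tr '{},' '\n\n\n' |
    awk '/^"health":/ && $0 != "\"health\":\"up\"" { bad++ } END { print bad + 0 }'
}

# Example usage (assumes the Prometheus port from this guide):
#   curl -fsS http://localhost:9090/api/v1/targets | count_targets_not_up
```

A non-zero result suggests that at least one configured target is failing to scrape and is worth inspecting on the targets page.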
To reduce the amount of data Prometheus stores, increase the scrape interval in prometheus.yml:
global:
  scrape_interval: 30s # Increase from 15s
Or limit retention with storage flags:
--storage.tsdb.retention.time=15d
--storage.tsdb.retention.size=10GB
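These flags are command-line arguments to the Prometheus binary, so in a Compose setup they would typically go under the service's `command:`. The fragment below is a sketch under assumptions: the service name `prometheus`, the image, and the config path are not taken from Ampel's docker-compose.monitoring.yml and may differ there. Setting `command:` replaces the image's default arguments, which is why `--config.file` is repeated.

```yaml
services:
  prometheus:                      # assumed service name
    image: prom/prometheus         # assumed image
    command:
      - --config.file=/etc/prometheus/prometheus.yml   # assumed config path
      - --storage.tsdb.retention.time=15d
      - --storage.tsdb.retention.size=10GB
```

When both limits are set, Prometheus removes old data as soon as either one is exceeded.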