Observability Quick Start

Get Ampel's observability stack running in 5 minutes.

Prerequisites

  • Docker and Docker Compose installed
  • Ampel services running (API, Worker, PostgreSQL, Redis)

Start Monitoring Stack

# Start monitoring services
make monitoring-up

# Or manually
docker-compose -f docker/docker-compose.monitoring.yml up -d

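This starts the monitoring services, including Prometheus (http://localhost:9090) and Grafana (http://localhost:3000).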

Verify Setup

Check Health Endpoints

# API health check
curl http://localhost:8080/health

# API readiness check
curl http://localhost:8080/ready

# View Prometheus metrics
curl http://localhost:8080/metrics
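
If the API was started moments ago, the checks above can fail until it finishes booting. The following is a minimal Python sketch that polls both endpoints until they answer. It assumes that a 200 status means healthy, which this guide does not state explicitly; `http_status` and `wait_until_ready` are illustrative names, not part of Ampel.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

API_BASE = "http://localhost:8080"  # port taken from the curl commands above


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure, ...


def wait_until_ready(
    paths=("/health", "/ready"),
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], int] = http_status,
) -> bool:
    """Poll every path until all return 200 (assumed to mean healthy)."""
    for attempt in range(attempts):
        if all(probe(API_BASE + path) == 200 for path in paths):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_ready() else "not ready")
```

The `probe` parameter exists only so the polling loop can be exercised without a running API.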

Access Grafana

  1. Open http://localhost:3000
  2. Log in with admin/admin
  3. Navigate to Dashboards > Ampel Overview

View Metrics

Prometheus UI

  1. Open http://localhost:9090
  2. Try these queries:
# Request rate
rate(http_requests_total[5m])

# Error rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

# Database connections
pg_stat_database_numbackends
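
The same queries can be run outside the UI through Prometheus' HTTP API (`/api/v1/query`). Here is a rough Python sketch that assumes Prometheus is reachable at http://localhost:9090 as above; the helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS = "http://localhost:9090"  # Prometheus address used in this guide

ERROR_RATE = (
    'sum(rate(http_requests_total{status=~"5.."}[5m]))'
    " / sum(rate(http_requests_total[5m]))"
)


def query_url(promql: str, base: str = PROMETHEUS) -> str:
    """Build an instant-query URL for the /api/v1/query endpoint."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def first_value(payload: dict):
    """Return the first sample value as a float, or None if the result is empty."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    result = payload["data"]["result"]
    if not result:
        return None
    # Instant vectors carry each sample as [unix_timestamp, "value-as-string"].
    return float(result[0]["value"][1])


def run_query(promql: str):
    """Run promql against Prometheus and return the first value (needs the stack up)."""
    with urllib.request.urlopen(query_url(promql), timeout=5) as resp:
        return first_value(json.load(resp))


if __name__ == "__main__":
    print("error rate:", run_query(ERROR_RATE))
```

An empty result (`None`) typically means the metric has no samples in the window yet, for example right after startup.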

Grafana Dashboards

The pre-configured dashboard shows:

  • HTTP request rate by endpoint
  • Request duration (p95)
  • Status code distribution
  • Database connection count
  • Active pull requests

Common Commands

# View logs
make monitoring-logs

# Restart services
make monitoring-restart

# Check health
make monitoring-health

# Stop monitoring
make monitoring-down

# Clean all data
make monitoring-clean

Alerts

Alerts are configured in /monitoring/alerts/ampel.yml:

  • High error rate (>5% for 5min)
  • High latency (p95 >1s for 10min)
  • Database down
  • Service unavailable
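
The "for 5min" and "for 10min" qualifiers mean a threshold must be exceeded continuously before an alert fires; a brief spike does not trigger it. The Python sketch below is a simplified model of that behaviour. It is not the contents of ampel.yml, which are not shown here, and `first_firing_time` is a hypothetical helper.

```python
from typing import Iterable, Optional, Tuple


def first_firing_time(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    hold_seconds: float,
) -> Optional[float]:
    """Return the timestamp at which the alert would start firing, or None.

    samples are (unix_timestamp, value) pairs in time order,
    one per rule evaluation.
    """
    breach_started = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_started is None:
                breach_started = timestamp  # condition just became true
            if timestamp - breach_started >= hold_seconds:
                return timestamp  # held long enough: the alert fires
        else:
            breach_started = None  # any recovery resets the timer
    return None


if __name__ == "__main__":
    # Error rate evaluated once a minute; the threshold and hold time mirror
    # the "High error rate (>5% for 5min)" alert listed above.
    series = [(0, 0.01), (60, 0.08), (120, 0.09), (180, 0.02)]
    series += [(t, 0.07) for t in range(240, 600, 60)]
    print(first_firing_time(series, threshold=0.05, hold_seconds=300))
```

In the sample series, the dip at t=180 resets the timer, so the two earlier breaches do not count toward the five minutes.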

Next Steps

Troubleshooting

Prometheus not scraping metrics

Check that the targets in prometheus.yml match your service ports:

- targets: ['api:8080'] # Should match your API port

Grafana dashboard empty

  1. Verify Prometheus datasource: Configuration > Data Sources
  2. Check Prometheus is scraping: http://localhost:9090/targets
  3. Verify metrics endpoint returns data: curl http://localhost:8080/metrics
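
Step 2 can also be checked from a script: the information on the targets page is available as JSON from Prometheus' `/api/v1/targets` endpoint. The Python sketch below lists targets that are not healthy, assuming Prometheus on http://localhost:9090; `unhealthy_targets` is an illustrative name.

```python
import json
import urllib.request

TARGETS_URL = "http://localhost:9090/api/v1/targets"


def unhealthy_targets(payload: dict) -> list:
    """Return (scrape_url, last_error) for every active target that is not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [
        (target.get("scrapeUrl", "?"), target.get("lastError", ""))
        for target in active
        if target.get("health") != "up"
    ]


if __name__ == "__main__":
    # Needs the monitoring stack to be running.
    with urllib.request.urlopen(TARGETS_URL, timeout=5) as resp:
        problems = unhealthy_targets(json.load(resp))
    for url, error in problems:
        print(f"DOWN {url}: {error}")
    if not problems:
        print("all active targets are up")
```

A target's last error (for example a refused connection) usually points at a wrong host or port in prometheus.yml.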

High memory usage

Reduce the volume of stored samples by increasing the scrape interval in prometheus.yml:

global:
  scrape_interval: 30s # Increase from 15s

Or limit how much data is kept by adding storage retention flags to the Prometheus command line:

--storage.tsdb.retention.time=15d
--storage.tsdb.retention.size=10GB
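
To get a feel for how these settings interact, the Prometheus storage documentation gives a rough sizing rule: disk space ≈ retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample. The Python sketch below applies that rule. The series count of 10,000 is a made-up figure for illustration, and the estimate covers on-disk size only, not memory.

```python
def estimated_disk_bytes(
    active_series: int,
    scrape_interval_s: float,
    retention_days: float,
    bytes_per_sample: float = 2.0,  # upper end of the documented 1-2 byte range
) -> float:
    """Rough TSDB size: retention seconds * samples per second * bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    series = 10_000  # hypothetical number of active time series
    for interval in (15, 30):
        size_gb = estimated_disk_bytes(series, interval, retention_days=15) / 1e9
        print(f"scrape_interval={interval}s -> about {size_gb:.2f} GB for 15d")
```

Doubling the scrape interval halves the estimate, which is why the 15s to 30s change above helps; the retention flags cap the total regardless of ingestion rate.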