This repository contains an Elasticsearch Rally setup with Docker and custom tracks for several scenarios. It demonstrates the effect that oversharding, or correct sharding that simply results in a large number of shards, can have on the non-functional requirements of an Elasticsearch cluster.
A detailed analysis is contained in the write-up.
To run:
# Run the setup script
./setup.sh
# Or manually:
poetry install
poetry run poe cluster_up
poetry run poe rally_help
- Docker (Compose v2 CLI)
- Poetry ≥ 1.6
- curl and jq (for health checks)
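The prerequisites above can be checked before running the setup script. The following is a minimal sketch and not part of the repository: the helper name `missing_tools` is hypothetical, and it only checks that each executable is on the PATH, not that its version is recent enough.

```python
import shutil

# Executables named in the prerequisites list above.
REQUIRED_TOOLS = ["docker", "poetry", "curl", "jq"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.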
This project provides:
- 3-node Elasticsearch cluster optimized for load testing
- Custom Rally tracks for different workload scenarios
- Helper scripts for common benchmarking tasks
- Docker Compose configuration tuned for performance
Use case: Product catalog search and filtering
- Challenges: index-and-search, search-heavy
- Operations: Product indexing, search, filtering, aggregations
- Data: 1M products with categories, prices, ratings, locations
# Run ecommerce benchmark
poetry run poe rally_ecommerce challenge=search-heavy user_tag=test-run-1
Use case: High-volume log ingestion and analysis
- Challenges: high-throughput-ingestion, search-heavy, mixed-workload
- Operations: Log indexing, search, filtering, aggregations
- Data: 10M log entries with timestamps, levels, services
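To give a feel for the shape of such a corpus, here is a hedged sketch that produces synthetic log entries as newline-delimited JSON, the one-document-per-line layout Rally uses for document corpora. The field names (`@timestamp`, `level`, `service`, `message`) and the value lists are assumptions for illustration; the repository's actual track data may look different.

```python
import json
import random
from datetime import datetime, timedelta, timezone

# Assumed example values; not taken from the repository's tracks.
LEVELS = ["DEBUG", "INFO", "WARN", "ERROR"]
SERVICES = ["auth", "checkout", "search", "payments"]


def generate_log_lines(count, seed=0,
                       start=datetime(2024, 1, 1, tzinfo=timezone.utc)):
    """Yield `count` JSON strings, one synthetic log document per line."""
    rng = random.Random(seed)  # seeded so repeated runs give the same corpus
    for i in range(count):
        doc = {
            "@timestamp": (start + timedelta(seconds=i)).isoformat(),
            "level": rng.choice(LEVELS),
            "service": rng.choice(SERVICES),
            "message": f"synthetic log entry {i}",
        }
        yield json.dumps(doc)


if __name__ == "__main__":
    for line in generate_log_lines(3):
        print(line)
```

Seeding the generator keeps the corpus reproducible, which matters when two benchmark runs are meant to be compared.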
# Run log aggregation benchmark
poetry run poe rally_logs challenge=high-throughput-ingestion user_tag=logs-test
Use case: Metrics collection and monitoring
- Challenges: metrics-ingestion, metrics-analysis
- Operations: Metrics indexing, time-series queries, aggregations
- Data: 5M metrics with timestamps, hosts, services
# Run metrics benchmark
poetry run poe rally_metrics challenge=metrics-analysis user_tag=metrics-test
poetry run poe cluster_up # Start Elasticsearch cluster
poetry run poe cluster_down # Stop cluster (keeps data)
poetry run poe cluster_status # Check container status
poetry run poe cluster_logs # Follow cluster logs
poetry run poe rally_help # Show Rally help
poetry run poe rally_list_tracks # List available tracks
poetry run poe rally_list_races # List completed races
poetry run poe rally_compare baseline=<id> contender=<id> # Compare runs
poetry run poe rally_ecommerce # E-commerce products track
poetry run poe rally_logs # Log aggregation track
poetry run poe rally_metrics # Time-series metrics track
poetry run poe rally_quick_benchmark # Quick test with geonames track
# Create track from live cluster
poetry run poe rally_create_track target_hosts=localhost:9200 indices=my-index track_name=my-track
# Run custom track
poetry run poe rally_race target_hosts=localhost:9200 track_path=tracks/my-track
The docker-compose.yml is optimized for load testing:
- Memory: 2GB heap per node (3GB container limit)
- Performance: Disabled monitoring, increased thread pools
- Storage: Persistent volumes for data retention
Create a .env file to customize:
ELASTIC_VERSION=8.12.2
RALLY_HOME=.rally
Rally stores its state in the .rally/ directory. Configure advanced options:
poetry run esrally configure --advanced
# Check cluster status
curl -s http://localhost:9200/_cluster/health | jq .
# Monitor resource usage
poetry run poe cluster_logs
# List recent races
poetry run poe rally_list_races
# Compare performance
poetry run poe rally_compare baseline=2024-01-01-01-01-01 contender=2024-01-01-02-02-02
- Create track directory: tracks/my-track/
- Add track.json with track definition
- Add index mapping: index.json
- Add query files: search_queries.json, etc.
- Add poe task in pyproject.toml
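The first steps above (track directory, track.json, index.json) could be scaffolded roughly as follows. This is a sketch under stated assumptions, not the repository's tooling: `scaffold_track`, the challenge name `search-only`, and the single `name` keyword field are all hypothetical placeholders, and the track deliberately omits a document corpus and bulk operation, which a real track would normally add.

```python
import json
from pathlib import Path


def scaffold_track(track_dir, index_name="my-index", shards=1, replicas=0):
    """Write a minimal track.json and index.json into `track_dir`."""
    track_dir = Path(track_dir)
    track_dir.mkdir(parents=True, exist_ok=True)

    # Index settings and mapping; the single field is a placeholder.
    index_body = {
        "settings": {
            "index.number_of_shards": shards,
            "index.number_of_replicas": replicas,
        },
        "mappings": {"properties": {"name": {"type": "keyword"}}},
    }

    # Track definition with one challenge that recreates the index
    # and runs a match_all search against it.
    track = {
        "version": 2,
        "description": "Placeholder custom track",
        "indices": [{"name": index_name, "body": "index.json"}],
        "challenges": [
            {
                "name": "search-only",  # hypothetical challenge name
                "default": True,
                "schedule": [
                    {"operation": {"operation-type": "delete-index"}},
                    {"operation": {"operation-type": "create-index"}},
                    {
                        "operation": {
                            "name": "match-all",
                            "operation-type": "search",
                            "body": {"query": {"match_all": {}}},
                        },
                        "clients": 1,
                        "warmup-iterations": 10,
                        "iterations": 100,
                    },
                ],
            }
        ],
    }

    (track_dir / "index.json").write_text(json.dumps(index_body, indent=2))
    (track_dir / "track.json").write_text(json.dumps(track, indent=2))
    return track_dir
```

Calling `scaffold_track("tracks/my-track", shards=3)` would create both files; the shard count lives in index.json, so it can be varied per experiment without touching the schedule.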
- Edit track definitions in tracks/*/track.json
- Modify queries in tracks/*/*.json
- Adjust challenges and schedules as needed
- Modify docker-compose.yml for different cluster sizes
- Adjust JVM settings in ES_JAVA_OPTS
- Change shard/replica counts in track definitions
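When changing shard and replica counts, it helps to know how many shards the cluster ends up hosting. The arithmetic is sketched below; the function name `shard_footprint` and the example numbers are illustrative assumptions, and the per-node figure assumes shards are spread evenly across nodes.

```python
import math


def shard_footprint(indices, primaries, replicas, nodes):
    """Total shard copies in the cluster and the busiest node's share.

    Each index has `primaries` primary shards, and every primary has
    `replicas` additional copies.
    """
    total = indices * primaries * (1 + replicas)
    return {
        "total_shards": total,
        "max_shards_per_node": math.ceil(total / nodes),
    }


if __name__ == "__main__":
    # Hypothetical example: 100 indices, 5 primaries, 1 replica, 3 nodes.
    print(shard_footprint(indices=100, primaries=5, replicas=1, nodes=3))
    # → {'total_shards': 1000, 'max_shards_per_node': 334}
```

With the 2GB heap per node described earlier, this makes it easy to see how quickly a modest number of indices turns into hundreds of shards per node.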
# Start cluster
poetry run poe cluster_up
# Run quick benchmark
poetry run poe rally_quick_benchmark
# Check results
poetry run poe rally_list_races
# Run product indexing and search
poetry run poe rally_ecommerce challenge=index-and-search user_tag=ecommerce-test-1
# Run search-heavy workload
poetry run poe rally_ecommerce challenge=search-heavy user_tag=ecommerce-test-2
# Compare results
poetry run poe rally_compare baseline=ecommerce-test-1 contender=ecommerce-test-2
# Test log ingestion performance
poetry run poe rally_logs challenge=high-throughput-ingestion user_tag=logs-ingestion
# Test mixed workload
poetry run poe rally_logs challenge=mixed-workload user_tag=logs-mixed
# Check cluster status
poetry run poe cluster_status
# View logs
poetry run poe cluster_logs
# Restart cluster
poetry run poe cluster_down
poetry run poe cluster_up
# Check Rally configuration
poetry run poe rally_help
# List available tracks
poetry run poe rally_list_tracks
# Check race results
poetry run poe rally_list_races
- Increase heap size in docker-compose.yml
- Adjust bulk sizes in track definitions
- Monitor system resources during tests
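One lightweight way to keep an eye on the cluster during a test is to read the same `_cluster/health` endpoint used in the curl check and keep only a few fields. The sketch below assumes the cluster is reachable without authentication at `http://localhost:9200`; the helper names `fetch_health` and `summarize_health` are hypothetical.

```python
import json
from urllib.request import urlopen


def summarize_health(health):
    """Keep the cluster-health fields most relevant to shard experiments."""
    fields = ("status", "number_of_nodes", "active_shards", "unassigned_shards")
    return {name: health.get(name) for name in fields}


def fetch_health(base_url="http://localhost:9200", timeout=5):
    """Fetch the cluster health document as a dict (assumes no auth)."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as response:
        return json.load(response)
```

With the cluster running, `print(summarize_health(fetch_health()))` gives a one-line view of status and shard counts that can be logged alongside a race.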
- Fork the repository
- Create a feature branch
- Add your custom tracks or improvements
- Test with ./setup.sh
- Submit a pull request
MIT License - see LICENSE file for details.