Multi-Agent User Simulation & Chaos Lab

Python 3.11+ · LangGraph · Playwright · License: MIT · Tests: 27 passed

A production-grade autonomous simulation platform that generates diverse user personas, executes multi-step journeys with chaos injection, detects anomalies across three analytical layers, and discovers novel edge cases — all orchestrated as a 12-agent LangGraph pipeline.


Architecture

The system is organized into four planes, each containing specialized agents that communicate through a shared CampaignState.

Architecture Diagram
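The shared state that agents read from and write to can be pictured as a typed dictionary. The sketch below is illustrative — field names are assumptions, not the repository's actual schema — but it shows the key idea: fields written concurrently by parallel simulators use an additive reducer rather than last-write-wins.

```python
from typing import Annotated, TypedDict
import operator

# Hypothetical sketch of the shared CampaignState; field names are
# illustrative, not the repository's actual schema.
class CampaignState(TypedDict):
    execution_plan: dict                 # CampaignPlannerAgent output
    personas: list[dict]                 # PersonaGeneratorAgent output
    vulnerability_map: dict              # VulnScanAgent output
    journeys: list[dict]                 # JourneyGeneratorAgent output
    # Parallel SimulatorAgent instances append results concurrently,
    # so this field uses an additive reducer instead of overwrite.
    simulation_results: Annotated[list[dict], operator.add]
    anomalies: list[dict]
    edge_cases: list[dict]
    resilience_score: float
```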


Persona Taxonomy

| Persona Class | Behavior Profile | Typical Session Depth | API Preference |
|---|---|---|---|
| POWER_USER | Deep navigation, complex workflows, high feature coverage | 10+ steps | Mixed |
| CASUAL_USER | Shallow browsing, quick tasks, low error tolerance | 2-4 steps | UI only |
| EDGE_CASE_USER | Unusual inputs, rare paths, boundary exploration | 5-8 steps | Mixed |
| ADVERSARIAL_ACTOR | Injection attempts, auth bypass, rate limit testing | 6-12 steps | API heavy |
| ACCESSIBILITY_USER | Screen reader paths, keyboard-only navigation, high-contrast | 4-7 steps | UI only |
| LEGACY_MIGRANT | Outdated patterns, deprecated endpoints, version mismatches | 3-6 steps | API heavy |
| CONCURRENCY_USER | Parallel requests, race conditions, session overlap | 8-15 steps | API only |

Personas are generated via K-Means clustering on clickstream data (real or synthetic), then enriched by LLM synthesis per cluster, and finally mutated deterministically (Unicode injection, boundary inputs, session timing variants) — no LLM needed for mutations.
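The deterministic mutation step can be sketched as follows. This is a minimal illustration of the idea (seeded RNG, so the same seed always yields the same mutant) — function and field names are assumptions, not the repository's `persona/mutator.py` API:

```python
import random

# Minimal sketch of deterministic, LLM-free persona mutation, assuming a
# persona is a plain dict with string fields; names are illustrative.
ZERO_WIDTH = "\u200b"          # zero-width space for Unicode injection
BOUNDARY_STRINGS = ["", "a" * 256, "0", "-1"]

def mutate_persona(persona: dict, seed: int) -> dict:
    """Apply one reproducible mutation chosen entirely by the seed."""
    rng = random.Random(seed)          # seeded RNG => deterministic output
    mutant = dict(persona)             # never modify the original in place
    field = rng.choice(sorted(k for k, v in persona.items()
                              if isinstance(v, str)))
    if rng.choice(["unicode", "boundary"]) == "unicode":
        value = mutant[field]
        # Interleave zero-width characters to probe normalization bugs.
        mutant[field] = ZERO_WIDTH.join(value) if value else ZERO_WIDTH
    else:
        mutant[field] = rng.choice(BOUNDARY_STRINGS)
    return mutant
```

Because mutation is keyed only on the seed, any edge case it uncovers can be reproduced exactly without re-running an LLM.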


Agent Roster

| # | Agent | Plane | Input | Output |
|---|---|---|---|---|
| 1 | CampaignPlannerAgent | Control | Campaign spec | ExecutionPlan |
| 2 | PersonaGeneratorAgent | Control | ExecutionPlan + clickstream | List[Persona] |
| 3 | VulnScanAgent | Control | OpenAPI spec + prod metrics | VulnerabilityMap |
| 4 | JourneyGeneratorAgent | Execution | Personas + VulnerabilityMap | List[UserJourney] |
| 5 | SimulatorAgent (x N) | Execution | UserJourney | SimulationResult |
| 6 | ChaosAgent | Execution | Journeys + VulnerabilityMap | List[chaos events] |
| 7 | ObserverAgent | Observation | SimulationResults + traces | List[AnomalyEvent] (Layer 1) |
| 8 | AnomalyDetectorAgent | Observation | Results + L1 anomalies | Enriched List[AnomalyEvent] (L2+L3) |
| 9 | EdgeCaseAgent | Observation | Anomalies | List[EdgeCase] + pytest stubs |
| 10 | ResilienceScorerAgent | Reporting | Results + EdgeCases + VulnMap | ResilienceScore (0-100) |
| 11 | ReportGeneratorAgent | Reporting | Score + EdgeCases + Anomalies | CampaignReport |
| 12 | QABridgeAgent | Reporting | EdgeCases + Report | Push to QA pipeline |

Installation

# Clone
git clone <repo-url>
cd simulation-chaos-lab

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
.venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

# Install Playwright browsers (optional — for UI simulation)
playwright install chromium

Configuration (.env template)

Create a .env file in the project root:

# LLM — required
OPENAI_API_KEY=voc-your-key-here
OPENAI_BASE_URL=https://openai.vocareum.com/v1

# External services — all optional (graceful fallback)
KAFKA_BOOTSTRAP_SERVERS=localhost:9092
DATABASE_URL=postgresql://user:pass@localhost/simulation_lab
REDIS_URL=redis://localhost:6379

# Chaos tuning
CHAOS_HALT_THRESHOLD=0.40
CHAOS_INTENSITY=medium

# Cost governance
MAX_LLM_SPEND_USD_PER_CAMPAIGN=10.0

# Simulation
MAX_CONCURRENT_SIMULATORS=10
TARGET_BASE_URL=http://localhost:3000
SLA_RESPONSE_TIME_MS=2000

# QA Bridge
QA_AGENT_API_URL=http://localhost:8001
ADIP_API_URL=http://localhost:8002
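The two governance knobs above (`CHAOS_HALT_THRESHOLD` and `MAX_LLM_SPEND_USD_PER_CAMPAIGN`) gate whether a campaign keeps running. A hedged sketch of that check — the function name and exact policy are illustrative, not the repository's actual budget governor:

```python
import os

# Illustrative halt check combining the chaos threshold and spend cap;
# defaults mirror the .env template above.
def should_halt(failure_rate: float, spent_usd: float) -> bool:
    halt_threshold = float(os.getenv("CHAOS_HALT_THRESHOLD", "0.40"))
    spend_cap = float(os.getenv("MAX_LLM_SPEND_USD_PER_CAMPAIGN", "10.0"))
    # Halt when chaos pushes too many journeys into failure, or when
    # cumulative LLM spend reaches the per-campaign budget.
    return failure_rate > halt_threshold or spent_usd >= spend_cap
```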

Usage

Mock Mode (zero external dependencies)

python main.py --mock

Runs the entire 12-agent pipeline with:

  • Synthetic clickstream data for persona generation
  • Mock OpenAPI spec + production metrics for vulnerability scanning
  • Simulated HTTP responses (no real network calls)
  • In-memory fallbacks for Kafka, PostgreSQL, Redis, pgvector
  • Real LLM calls to GPT-4o (only external dependency)
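The graceful-fallback pattern behind mock mode can be sketched like this — try the real client only when needed, otherwise substitute an in-memory stand-in. The names here are illustrative, not the repository's actual classes:

```python
import os

# In-memory stand-in that mimics the minimal surface of a message queue.
class InMemoryQueue:
    def __init__(self):
        self._items: list[bytes] = []

    def send(self, payload: bytes) -> None:
        self._items.append(payload)

    def poll(self) -> list[bytes]:
        # Drain and return everything buffered so far.
        items, self._items = self._items, []
        return items

def make_event_sink(mock: bool):
    if mock:
        return InMemoryQueue()
    # Real dependency imported lazily, so mock mode never needs kafka-python.
    from kafka import KafkaProducer
    return KafkaProducer(bootstrap_servers=os.environ["KAFKA_BOOTSTRAP_SERVERS"])
```

The same shape (lazy import plus in-memory fallback) applies to the PostgreSQL, Redis, and pgvector stand-ins.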

Real Mode (full infrastructure)

# Start services
docker compose up -d  # Kafka, PostgreSQL, Redis

python main.py

API Mode (FastAPI server)

python main.py --server
# Server starts at http://localhost:8000

# Trigger a campaign
curl -X POST http://localhost:8000/api/campaigns \
  -H "Content-Type: application/json" \
  -d '{
    "target_workflows": ["login", "checkout", "search"],
    "chaos_intensity": "medium",
    "max_llm_spend_usd": 5.0
  }'

# Health check
curl http://localhost:8000/api/health

Save Report to File

python main.py --mock --output report.json

Pipeline Walkthrough

A typical campaign execution (mock mode shown) proceeds as follows:

$ python main.py --mock

[INFO] Starting simulation campaign (mock=True, model=gpt-4o)

# 1. CONTROL PLANE
[INFO] LLM call by campaign_planner | spent: $0.00 / $10.00
[INFO] LLM call by campaign_planner complete | cost: $0.03
[INFO] LLM call by persona_generator x3 clusters | cost: $0.09
[INFO] Generated 6 personas across 3 clusters
[INFO] Static analysis flagged 7 endpoints
[INFO] Dynamic analysis flagged 7 endpoints
[INFO] LLM call by vuln_scan_agent | cost: $0.03

# 2. EXECUTION PLANE
[INFO] LLM call by journey_generator x6 journeys | cost: $0.18
[INFO] Generated 6 journeys
[INFO] Journey j1: CHAOS_TRIGGERED (6 steps, 24ms)
[INFO] Journey j2: CHAOS_TRIGGERED (5 steps, 33ms)
...
[INFO] Collected 6 simulation results

# 3. OBSERVATION PLANE
[INFO] Kafka stream processor initialized
[INFO] ObserverAgent identified 12 rule-based anomalies
[INFO] AnomalyDetectorAgent enriched 12 anomalies (3 statistical, 2 LLM semantic)
[INFO] EdgeCaseAgent discovered 4 novel edge cases

# 4. REPORTING PLANE
[INFO] ResilienceScorerAgent calculated score: 85/100
[INFO] ReportGeneratorAgent created campaign report
[INFO] QABridgeAgent pushed 4 edge cases to QA pipeline

[INFO] Campaign completed in 120 seconds.
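The 85/100 score in the reporting plane comes from the ResilienceScorerAgent. Its actual formula is not shown in this README; the sketch below is a hypothetical weighted-penalty model just to illustrate how a 0-100 score could be derived from the campaign's counts:

```python
# Hypothetical resilience scoring: start from 100 and subtract weighted
# penalties, clamped to the 0-100 range. Weights are illustrative.
def resilience_score(total_journeys: int, failed: int,
                     anomalies: int, edge_cases: int) -> int:
    base = 100.0
    base -= 40.0 * (failed / max(total_journeys, 1))  # journey failures
    base -= 1.0 * anomalies                           # per-anomaly penalty
    base -= 2.0 * edge_cases                          # per-edge-case penalty
    return max(0, min(100, round(base)))
```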

Project Structure

simulation-chaos-lab/
├── .github/                        # GitHub Actions workflows
│   └── workflows/
│       └── ci.yml
├── docs/                           # Documentation and diagrams
├── agents/                         # Core agent implementations
│   ├── control/
│   │   ├── campaign_planner.py
│   │   ├── persona_generator.py
│   │   └── vuln_scan_agent.py
│   ├── execution/
│   │   ├── journey_generator.py
│   │   ├── simulator.py
│   │   └── chaos_agent.py
│   ├── observation/
│   │   ├── observer.py
│   │   ├── anomaly_detector.py
│   │   └── edge_case_agent.py
│   └── reporting/
│       ├── resilience_scorer.py
│       ├── report_generator.py
│       └── qa_bridge_agent.py
├── persona/
│   ├── clustering.py               # K-Means + elbow method + PII scrubber
│   ├── models.py                   # PersonaPopulation + diversity scoring
│   └── mutator.py                  # Unicode/boundary/timing mutations
├── chaos/
│   ├── vulnerability_map.py        # 3-stage OpenAPI + metrics analysis
│   ├── injection_engine.py         # mitmproxy + MockChaosEngine fallback
│   └── budget_governor.py          # Redis atomic counters + auto-halt
├── observation/
│   ├── stream_processor.py         # Kafka consumer + mock list fallback
│   ├── anomaly_rules.py            # Layer 1 rule definitions
│   └── edge_case_discovery.py      # DBSCAN + novelty scoring
├── persistence/
│   ├── campaign_db.py              # PostgreSQL/JSONB -> SQLite fallback
│   └── knowledge_base.py           # pgvector -> numpy fallback
├── llm/
│   ├── prompts.py                  # 12 prompt templates (6 system/user pairs)
│   └── factory.py                  # LLM factory + CostGovernor wrapper
├── observability/
│   ├── metrics.py                  # Prometheus counters/histograms/gauges
│   └── tracing.py                  # OpenTelemetry setup
├── api/app.py                      # FastAPI (POST /api/campaigns, GET /health)
├── main.py                         # CLI: --mock, --server, --output
├── requirements.txt
└── tests/
    ├── unit/                       # 24 unit tests
    └── integration/                # 3 integration tests (full pipeline)

Roadmap

  • Playwright UI simulation: execute browser-based journeys alongside API journeys
  • Real Kafka integration: production trace ingestion with Avro schema registry
  • Checkpoint replay: LangGraph checkpoint-based edge case reproduction validation
  • Sentence-transformers embeddings: replace hash-based embeddings with SBERT for higher-fidelity novelty scoring
  • Multi-campaign comparison: track resilience score trends across campaigns over time
  • Grafana dashboard: pre-built dashboards consuming Prometheus metrics
  • CI/CD integration: GitHub Actions workflow running --mock pipeline on every PR
  • Distributed simulation: Celery or Ray-based worker pool for large persona populations
  • A/B chaos profiles: compare resilience under different chaos intensity configurations
  • Auto-generated regression suite: export validated edge cases as permanent pytest fixtures

License

MIT
