lsmithg12/ai-detection-engineering

AI Detection Engineering Lab

A template for building an AI-powered detection engineering pipeline using Claude Code as an autonomous blue team agent. Deploy a full SIEM lab, generate simulated attack telemetry, and let an AI agent build, validate, tune, and deploy security detections — all mapped to the MITRE ATT&CK framework.

What This Does

An AI agent (Claude Code) acts as a senior detection engineer, executing the full lifecycle:

INTEL → DISCOVER → AUTHOR → VALIDATE → DEPLOY → TUNE → REPORT

For each detection the agent:

  1. Reads threat intel about the Fawkes C2 agent (59 commands mapped to ATT&CK)
  2. Discovers available log data in your SIEM
  3. Authors a Sigma rule with full MITRE ATT&CK mapping
  4. Validates against simulated attack telemetry (true positive + false positive testing)
  5. Deploys to Elastic Security and/or Splunk saved searches
  6. Tunes based on alert feedback — adding exclusions, tightening thresholds
  7. Updates coverage tracking and commits to git with conventional messages
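The lifecycle above can be read as a linear state machine. The following is a minimal sketch under that assumption, not the repo's orchestration code: the `Stage` enum, `next_stage`, and `run_lifecycle` are hypothetical names introduced here for illustration.

```python
from enum import Enum


class Stage(Enum):
    """Lifecycle stages, in the order the README lists them."""
    INTEL = 1
    DISCOVER = 2
    AUTHOR = 3
    VALIDATE = 4
    DEPLOY = 5
    TUNE = 6
    REPORT = 7


def next_stage(stage):
    """Return the stage that follows `stage`, or None after REPORT."""
    members = list(Stage)
    idx = members.index(stage)
    return members[idx + 1] if idx + 1 < len(members) else None


def run_lifecycle(handlers):
    """Run each stage's handler in order and stop at the first failure.

    `handlers` maps Stage -> zero-argument callable returning True on success.
    Stages without a handler are treated as succeeding (an assumption of this
    sketch). Returns the list of stages that completed.
    """
    completed = []
    stage = Stage.INTEL
    while stage is not None:
        handler = handlers.get(stage, lambda: True)
        if not handler():
            break
        completed.append(stage)
        stage = next_stage(stage)
    return completed
```

With this shape, a failed validation stops the run before anything is deployed: `run_lifecycle({Stage.VALIDATE: lambda: False})` returns only the INTEL, DISCOVER, and AUTHOR stages.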

Lab Architecture

┌──────────────────────────────────────────────────────────────────┐
│                      Lab Network (Docker)                         │
│                                                                    │
│  ┌──────────────┐    ┌─────────────────┐    ┌─────────────────┐  │
│  │ Log Simulator│───▶│  Cribl Stream   │───▶│ Elasticsearch   │  │
│  │ Fawkes TTPs  │    │ :9000 (Phase 3) │    │ :9200           │  │
│  │ + baseline   │    │ CIM normalize   │───▶│ Kibana :5601    │  │
│  └──────────────┘    │ Log reduction   │    └─────────────────┘  │
│         │             │ Route by tactic │                          │
│         │             │ (streaming path)│    ┌─────────────────┐  │
│         │             └────────┬────────┘───▶│ Splunk          │  │
│         └───────────────────────────────────▶│ :8000 (optional)│  │
│                      ┌───────────────┐       └─────────────────┘  │
│                      │  Claude Code  │                            │
│                      │  (AI Agent)   │ ◀── MCP: Elasticsearch     │
│                      └───────────────┘                            │
└──────────────────────────────────────────────────────────────────┘

Quick Start

# 1. Clone the repo
git clone <your-repo-url>
cd ai-detection-engineering

# 2. Run setup (interactive — picks your SIEM, installs tooling)
./setup.sh

# 3. Launch the AI agent
claude
# Paste the first-run prompt from PROMPTS.md

See QUICKSTART.md for a detailed walkthrough.

Startup Options

| Command | What Runs |
|---|---|
| ./setup.sh --elastic | Elasticsearch + Kibana + Simulator |
| ./setup.sh --splunk | Splunk + Simulator |
| ./setup.sh --both | Both SIEMs + Simulator |
| ./setup.sh --cribl | Elastic + Cribl Stream + Simulator |
| ./setup.sh --full | Everything |

Or use make setup for the same interactive experience.

Credentials

| Service | URL | Username | Password |
|---|---|---|---|
| Kibana | http://localhost:5601 | elastic | changeme |
| Elasticsearch | http://localhost:9200 | elastic | changeme |
| Splunk Web | http://localhost:8000 | admin | BlueTeamLab1! |
| Splunk REST API | https://localhost:8089 | admin | BlueTeamLab1! |
| Splunk HEC | http://localhost:8288 | | blue-team-lab-hec-token |
| Cribl Stream | http://localhost:9000 | admin | admin |

Data Sources

Simulated Telemetry (Always Available)

| Index (Elastic) | Index (Splunk) | Content |
|---|---|---|
| sim-baseline | sysmon | Normal enterprise Windows/Linux activity |
| sim-attack | attack_simulation | Fawkes C2 TTP simulations |

Event types generated: Sysmon EID 1, 3, 7, 8, 10, 11, 13, 17/18, 22 + WinEvent 4624, 4104, 7045

Attack Scenarios

The simulator generates 13 attack scenarios matching Fawkes C2 + Scattered Spider capabilities:

  • Process injection (vanilla-injection) — EID 8 + 10
  • Registry persistence — EID 13
  • PowerShell with bypass flags — EID 1
  • Scheduled task creation — EID 1
  • Discovery command burst — EID 1
  • LSASS token theft — EID 10
  • C2 beaconing — EID 3
  • AMSI/CLR bypass — EID 7
  • Encoded PowerShell / Mimikatz via Script Block — EID 4104
  • RMM tool binary drops (AnyDesk, TeamViewer, ScreenConnect) — EID 11
  • RMM tool DNS resolution — EID 22
  • C2 named pipe communication — EID 17/18
  • Malicious service persistence — EID 7045
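The scenario list above doubles as a data-source checklist: a scenario can only be detected if its event IDs reach the SIEM. Below is a small sketch that encodes the list as a dictionary and answers two questions: which scenarios share an event ID, and which are blocked when an event ID is missing. The scenario names and event IDs come from the list; the function names are hypothetical, and treating a scenario as blocked when any one of its IDs is absent is an assumption.

```python
from collections import defaultdict

# Scenario -> event IDs, copied from the attack scenario list.
SCENARIO_EVENT_IDS = {
    "Process injection (vanilla-injection)": [8, 10],
    "Registry persistence": [13],
    "PowerShell with bypass flags": [1],
    "Scheduled task creation": [1],
    "Discovery command burst": [1],
    "LSASS token theft": [10],
    "C2 beaconing": [3],
    "AMSI/CLR bypass": [7],
    "Encoded PowerShell / Mimikatz via Script Block": [4104],
    "RMM tool binary drops": [11],
    "RMM tool DNS resolution": [22],
    "C2 named pipe communication": [17, 18],
    "Malicious service persistence": [7045],
}


def scenarios_by_event_id(mapping=SCENARIO_EVENT_IDS):
    """Invert the mapping: event ID -> scenarios that rely on it."""
    index = defaultdict(list)
    for scenario, event_ids in mapping.items():
        for eid in event_ids:
            index[eid].append(scenario)
    return dict(index)


def blocked_scenarios(available_event_ids, mapping=SCENARIO_EVENT_IDS):
    """Return scenarios that need at least one event ID the SIEM lacks."""
    available = set(available_event_ids)
    return [s for s, eids in mapping.items() if not set(eids) <= available]
```

For example, if named pipe events (EID 17/18) are not collected, `blocked_scenarios` reports only the C2 named pipe scenario, which is the kind of finding that would belong in gaps/.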

Primary Threat: Fawkes C2 Agent

Fawkes is a Golang-based Mythic C2 agent with 59 commands:

| Category | Commands | ATT&CK Techniques |
|---|---|---|
| Process Injection | vanilla-injection, apc-injection, threadless-inject, poolparty, opus | T1055.001-005 |
| Credential Access | steal-token, make-token, keylog | T1134.001, T1056.001 |
| Persistence | persist (registry, startup, schtask, service, crontab) | T1547.001, T1053.005 |
| Defense Evasion | autopatch, start-clr, timestomp, binary-inflation | T1562.001, T1027 |
| Discovery | ps, whoami, net-enum, arp, ifconfig, av-detect | T1057, T1033, T1087 |
| Lateral Movement | socks5, wmi | T1090, T1047 |
| C2 | sleep, domain-fronting, tls-cert-pin | T1071.001 |

Full TTP mapping: threat-intel/fawkes/fawkes-ttp-mapping.md

Project Structure

ai-detection-engineering/
├── CLAUDE.md                    # AI agent instructions (role, workflow, guardrails)
├── PROMPTS.md                   # Starter prompts for the agent
├── QUICKSTART.md                # New user walkthrough
├── docker-compose.yml           # Lab infrastructure (Elastic, Splunk, Cribl, Simulator)
├── setup.sh                     # One-command interactive setup
├── Makefile                     # Quick commands (make setup, make agent, etc.)
├── simulator/                   # Log generator (13 attack + 10 baseline scenarios)
├── autonomous/                  # Patronus multi-agent pipeline
│   ├── orchestration/           # Agent runner, state machine, budget, learnings
│   └── detection-requests/      # Detection lifecycle tracking (YAML per technique)
├── .github/workflows/           # CI/CD — daily agent runs + PR security gate
├── cribl/                       # Cribl Stream MCP server + config
├── pipeline/                    # Deployment & automation scripts
├── detections/                  # Detection-as-Code (Sigma rules by MITRE tactic)
│   └── <tactic>/compiled/       # Transpiled KQL/SPL
├── tests/                       # True positive & true negative test cases
├── templates/                   # Sigma rule template + authoring lessons
├── threat-intel/                # Threat intelligence inputs
│   ├── fawkes/                  # Fawkes C2 → ATT&CK TTP mapping
│   └── analysis/                # Scattered Spider / UNC3944 intel
├── coverage/                    # ATT&CK coverage matrix & detection backlog
├── gaps/                        # Data source and detection gaps
├── tuning/                      # Exclusion lists & tuning changelog
└── mcp-config.example.json      # MCP server config template

Autonomous Agent Pipeline (Patronus)

Five specialized AI agents run the detection lifecycle end-to-end, triggered by GitHub Actions on a daily schedule or manually:

| Agent | Role | Trigger |
|---|---|---|
| Intel | Ingests threat reports, creates detection requests | Daily |
| Red Team | Generates attack + benign scenarios per technique | On intel merge |
| Blue Team | Authors Sigma rules, validates, deploys to SIEMs | On intel/red-team merge |
| Quality | Health scoring, daily monitoring reports | Daily |
| Security | PR gate: secrets scanning, rule quality checks | Every PR to main |

Each agent creates a feature branch, commits its work, pushes, and opens a PR for human review. See STATUS.md for current pipeline state.
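The trigger column can be expressed as a lookup from trigger to agents. This is only a sketch of the relationships in the table: the trigger labels (`daily`, `intel_merge`, `red_team_merge`, `pull_request_to_main`) are hypothetical names chosen here, not GitHub Actions event names taken from the repo's workflows.

```python
# Agent -> triggers, following the table above.
# Trigger labels are hypothetical and exist only for this sketch.
AGENT_TRIGGERS = {
    "Intel": {"daily"},
    "Red Team": {"intel_merge"},
    "Blue Team": {"intel_merge", "red_team_merge"},
    "Quality": {"daily"},
    "Security": {"pull_request_to_main"},
}


def agents_for(trigger):
    """Return the agents that run on `trigger`, in table order."""
    return [agent for agent, triggers in AGENT_TRIGGERS.items() if trigger in triggers]


print(agents_for("daily"))        # ['Intel', 'Quality']
print(agents_for("intel_merge"))  # ['Red Team', 'Blue Team']
```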

Claude LLM Integration

Agents invoke the Claude Code CLI (claude -p) for reasoning tasks at key decision points:

| Agent | Claude Task | Model | Fallback |
|---|---|---|---|
| Blue Team | Author Sigma rules from attack/benign event data | Opus | Deterministic field extraction |
| Red Team | Generate attack scenarios dynamically | Sonnet | Hardcoded scenario generators |
| Quality | Analyze fleet health, recommend tuning actions | Sonnet | Fixed threshold scoring |
| Intel | Extract MITRE techniques from raw report text + web search | Sonnet | Regex table parsing |

  • Locally: Uses your Claude Pro subscription (OAuth session in ~/.claude/)
  • In CI: Falls back to deterministic Python logic automatically (no credentials needed)
  • Security: Invocations use --allowed-tools "Bash(curl:*)" for web search or --tools "" for pure reasoning
  • Validation: Blue-team validates against Elasticsearch when available, falls back to local JSON in CI
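The "Claude first, deterministic fallback" pattern described above might look roughly like the sketch below. It is not the repo's agent runner: `ask_claude` and `extract_technique_ids` are hypothetical names, the tool-restriction flags are left out for brevity, and the regex is only a stand-in for the Intel agent's "regex table parsing" fallback.

```python
import re
import shutil
import subprocess


def extract_technique_ids(text):
    """Deterministic fallback: pull ATT&CK IDs such as T1055 or T1055.001 from text."""
    seen = []
    for match in re.findall(r"T\d{4}(?:\.\d{3})?", text):
        if match not in seen:  # keep first-seen order, drop duplicates
            seen.append(match)
    return seen


def ask_claude(prompt, fallback, cli="claude", timeout=120):
    """Return (source, result).

    Tries `<cli> -p <prompt>` when the CLI is on PATH; on a missing CLI,
    non-zero exit, or timeout, returns the result of fallback(prompt) instead.
    """
    if shutil.which(cli) is not None:
        try:
            result = subprocess.run(
                [cli, "-p", prompt],
                capture_output=True, text=True, timeout=timeout, check=True,
            )
            return "claude", result.stdout.strip()
        except (subprocess.SubprocessError, OSError):
            pass  # fall through to the deterministic path
    return "fallback", fallback(prompt)
```

In CI, where no CLI session exists, a call like `ask_claude(report_text, extract_technique_ids)` would take the fallback branch and still return a usable list of technique IDs.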

Current Detection Coverage

29 detection rules exist: 11 deployed to SIEM, 12 validated, 2 authored, and 4 needing rework. Together they cover 13 of 21 Fawkes techniques (62%). Full matrix: coverage/attack-matrix.md
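The coverage figures can be checked with a few lines of arithmetic. The counts below are the ones quoted above; the helper names are hypothetical, and the 75% calculation assumes the denominator stays at 21 techniques.

```python
import math

# Rule counts by status, as quoted in this section.
status_counts = {"deployed": 11, "validated": 12, "authored": 2, "needs_rework": 4}
total_rules = sum(status_counts.values())


def coverage_percent(covered, total):
    """Coverage as a whole-number percentage."""
    return round(100 * covered / total)


def techniques_needed(target_fraction, total):
    """Smallest number of covered techniques that reaches the target fraction."""
    return math.ceil(target_fraction * total)


print(total_rules)                  # 29
print(coverage_percent(13, 21))     # 62
print(techniques_needed(0.75, 21))  # 16
```

Under that assumption, reaching 75% coverage means covering 16 techniques, three more than the current 13.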

| Phase | Status | Key Deliverable |
|---|---|---|
| Phase 1 | COMPLETED | Fixed stuck detections, compiled all outputs |
| Phase 2 | COMPLETED | Elasticsearch-based SIEM validation |
| Phase 3 | COMPLETED | Raw logs → Cribl Stream → normalized → SIEM data pipeline |
| Phase 4 | NOT STARTED | Agent intelligence upgrades (EQL, thresholds) |
| Phase 5 | NOT STARTED | Coverage expansion to 75%+ Fawkes |
| Phase 6 | NOT STARTED | Operational maturity (dashboards, SLAs) |
| Phase 7 | NOT STARTED | Advanced capabilities (Agent SDK, live C2) |

MCP Configuration

The AI agent uses MCP (Model Context Protocol) for direct Elasticsearch access and optionally GitHub/GitLab for PR workflows:

# Copy the template to the project root
cp mcp-config.example.json .mcp.json

# Edit .mcp.json and optionally add your GitHub/GitLab PAT

The Elasticsearch MCP server runs as a Docker container on the blue-team-lab network.

Cribl Stream (Phase 3 — Completed 2026-03-14)

When running with --cribl, Cribl Stream provides a full end-to-end data pipeline: raw vendor events → Cribl Stream normalization → indexed in SIEM(s). Along the way it handles:

  • CIM normalization: ECS fields mapped to Splunk CIM aliases
  • Log reduction: Drop noisy baseline events before indexing
  • Routing: Attack events to both SIEMs, baseline to Elastic only
  • Attack enrichment: MITRE technique tags added to events
  • Structured data source gap tracking (YAML): gaps documented in gaps/data-source-gaps.md

Configure Cribl: ./pipeline/configure-cribl.sh
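The normalization, reduction, and routing behaviour listed above can be sketched in plain Python to make the rules concrete. This is an illustration only, not the Cribl configuration in this repo: the `sim_type` and `noisy` fields are hypothetical, the alias table is a small illustrative subset, and events are assumed to use flattened dotted ECS keys.

```python
# Illustrative ECS -> Splunk CIM aliases; the real mapping lives in the Cribl config.
ECS_TO_CIM = {
    "process.name": "process_name",
    "process.command_line": "process",
    "user.name": "user",
    "source.ip": "src_ip",
    "destination.ip": "dest_ip",
}


def add_cim_aliases(event):
    """Return a copy of `event` with CIM alias fields added next to the ECS fields."""
    out = dict(event)
    for ecs_field, cim_field in ECS_TO_CIM.items():
        if ecs_field in event:
            out[cim_field] = event[ecs_field]
    return out


def route(event):
    """Return the destinations for one event; an empty list means it is dropped.

    Attack events go to both SIEMs, noisy baseline events are dropped before
    indexing, and the remaining baseline events go to Elastic only.
    """
    if event.get("sim_type") == "attack":  # hypothetical field
        return ["elastic", "splunk"]
    if event.get("noisy", False):          # hypothetical field
        return []
    return ["elastic"]
```

Keeping the ECS fields and adding aliases, rather than renaming, is one way to let the same event satisfy both the Elastic and Splunk versions of a rule.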

Prerequisites

  • Docker Desktop (macOS/Windows) or Docker Engine + Compose (Linux)
  • Git
  • ~8 GB RAM, ~20 GB free disk
  • Claude Pro subscription (includes Claude Code)
  • Python + pip (optional — for sigma-cli transpilation)

Using as a Template

This project is designed to be forked/cloned and customized:

  1. Fork or clone this repo
  2. Run ./setup.sh to start the lab
  3. Launch Claude Code and paste a prompt from PROMPTS.md
  4. Watch the agent discover data, review threat intel, and build detections
  5. Customize: swap Fawkes for your own threat model, add data sources, etc.

Customizing the Threat Model

To target a different adversary:

  1. Replace threat-intel/fawkes/ with your own TTP mapping
  2. Update coverage/detection-backlog.md with your priority techniques
  3. Update CLAUDE.md to reference your threat actor
  4. Modify simulator/simulator.py to generate matching attack telemetry

Useful Commands

# Lab management
docker compose ps                           # Check service status
docker compose logs -f log-simulator        # Watch simulated events
docker compose down                         # Stop everything
docker compose down -v                      # Full reset (delete all data)

# Elasticsearch
curl -u elastic:changeme http://localhost:9200/_cluster/health
curl -u elastic:changeme http://localhost:9200/sim-attack/_count

# Splunk
curl -sk https://localhost:8089/services/server/health -u admin:BlueTeamLab1!

# Sigma transpilation
sigma convert -t lucene -p ecs_windows detections/<tactic>/<rule>.yml
sigma convert -t splunk --without-pipeline detections/<tactic>/<rule>.yml
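
The Elasticsearch checks above can also be run from Python using only the standard library, which may be handy inside pipeline scripts. A minimal sketch, assuming the lab's default URL and credentials from the Credentials table; `es_request` and `doc_count` are hypothetical helper names.

```python
import base64
import json
import urllib.request

ES_URL = "http://localhost:9200"  # lab default from the Credentials table


def es_request(path, user="elastic", password="changeme", base=ES_URL):
    """Build a GET request with HTTP Basic auth for an Elasticsearch path."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"{base}{path}")
    req.add_header("Authorization", f"Basic {token}")
    return req


def doc_count(index, timeout=5):
    """Return the document count for `index` via the _count API (needs a running lab)."""
    with urllib.request.urlopen(es_request(f"/{index}/_count"), timeout=timeout) as resp:
        return json.load(resp)["count"]
```

With the lab running, `doc_count("sim-attack")` returns the same number as the second curl command above.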

License

This project is licensed under MIT.

Third-party components (Elasticsearch, Splunk, Cribl, etc.) have their own licenses. See THIRD-PARTY-LICENSES.md for details. No third-party binaries are redistributed — Docker pulls official images at runtime.
