Honeypots collect threat intelligence by emulating vulnerable services, but real-world deployments face a hard constraint: limited resources mean only a fraction of possible services can be exposed at any given time. Choosing which services to expose is typically a static, manual decision — one that grows stale as attacker tactics shift.
This repository implements an LLM-based autonomous agent that turns honeypot exposure into a dynamic, inference-driven process. Rather than relying on fixed configurations, the agent continuously analyzes IDS alerts, infers where attackers are in a multi-stage exploitation chain (aligned with MITRE ATT&CK), and reconfigures the honeynet to expose the services most likely to sustain engagement — all under a strict deployment budget.
The key idea: treat adaptive honeypot management as a sequential decision problem under partial observability, where an LLM reasons over noisy security telemetry to track attacker intent and allocate deception resources accordingly.
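The loop can be sketched in a few lines. Note this is an illustrative toy, not the repository's code: the function names (`infer_attack_stage`, `choose_exposure`), the keyword scoring, and the service catalog are all assumptions standing in for the LLM's reasoning.

```python
# Illustrative sketch of the adaptive-exposure loop: observe alerts,
# infer the attack stage, then pick services under a deployment budget.
# All names and scoring rules here are hypothetical.
from dataclasses import dataclass

@dataclass
class Alert:
    signature: str
    dest_port: int

def infer_attack_stage(alerts):
    """Map alert signatures to a coarse ATT&CK-style stage (toy keyword rules;
    the real agent delegates this inference to an LLM)."""
    sigs = " ".join(a.signature.lower() for a in alerts)
    if "exploit" in sigs or "exec" in sigs:
        return "exploitation"
    if "scan" in sigs:
        return "reconnaissance"
    return "unknown"

def choose_exposure(stage, catalog, budget):
    """Greedily expose up to `budget` services scored for the inferred stage."""
    ranked = sorted(catalog, key=lambda s: s["score"].get(stage, 0), reverse=True)
    return [s["name"] for s in ranked[:budget]]

alerts = [Alert("ET EXPLOIT Apache Struts Possible OGNL Java Exec In URI", 8080)]
catalog = [
    {"name": "struts2", "score": {"exploitation": 3}},
    {"name": "gitlab",  "score": {"exploitation": 2}},
    {"name": "xdebug",  "score": {"reconnaissance": 1}},
]
stage = infer_attack_stage(alerts)
print(stage)                               # exploitation
print(choose_exposure(stage, catalog, 2))  # ['struts2', 'gitlab']
```

In the actual system, the keyword heuristic is replaced by LLM reasoning over the full alert context, but the observe–infer–reconfigure cycle is the same.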
The approach is evaluated in a discrete-time simulation with scripted attackers executing proof-of-concept exploits against real CVEs (GitLab, Apache Struts, Docker API, Xdebug), multiple attacker persistence models, and several LLM backends.
📄 Paper: Towards Agentic Honeynet Configuration — F. Mirra, M. Boffa, D. Giordano, M. Mellia (Politecnico di Torino), I. Drago (Università di Torino)
- Docker & Docker Compose
- Python 3.9+
- API Configuration: Create a `.env` file in the `MultiAgent/` directory and add your LLM API keys (e.g., OpenAI).
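For example, assuming an OpenAI backend (the variable name is the standard OpenAI one; adjust for your provider):

```shell
# MultiAgent/.env — placeholder value, replace with your real key
OPENAI_API_KEY=sk-your-key-here
```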
Deploy the firewall (IDS/IPS) and the attacker simulation environment.
```shell
# Launch the Attacker Container
cd Benchmark/attackerContainer
docker-compose up -d

# Launch the Firewall/IDS Container
cd ../firewallContainer
docker-compose up -d
```

Populate the internal network with vulnerable services and decoys.
```shell
cd ../deploy
bash all_exploitables.sh
```

Evaluate inference accuracy and engagement efficiency using the automated benchmarking suite:
`graph.ipynb`
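A minimal sketch of the kinds of metrics such an evaluation reports. The function names and exact formulas here are assumptions for illustration, not the repository's code:

```python
def inference_accuracy(predicted_stages, true_stages):
    """Fraction of reasoning cycles where the inferred attack stage
    matched the scripted attacker's ground-truth stage."""
    correct = sum(p == t for p, t in zip(predicted_stages, true_stages))
    return correct / len(true_stages)

def engagement_efficiency(engaged_steps, total_steps):
    """Fraction of simulation steps in which the attacker remained engaged
    with an exposed service."""
    return engaged_steps / total_steps

print(inference_accuracy(["recon", "exploit"], ["recon", "lateral"]))  # 0.5
print(engagement_efficiency(30, 40))                                   # 0.75
```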
```
├── MultiAgent/                # Core AI reasoning engine
│   └── src/
│       ├── nodes/             # Agents for network analysis, exploitation inference,
│       │                      #   and exposure management
│       └── benchmark/         # Scripts for automated simulations and performance reporting
│
├── Benchmark/                 # Containerized lab environment
│   ├── attackerContainer/     # Automated scripts simulating real-world RCE exploits
│   ├── firewallContainer/     # Suricata-based monitoring and routing
│   ├── vulnerableContainers/  # Target services and deception decoys
│   └── deploy/                # Orchestration scripts for network setup
```
AI Prompts — `MultiAgent/src/nodes/prompts.py`
Defines the reasoning logic for the Attack Inference and Exposure Management nodes.
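An illustrative shape for such a reasoning prompt. The wording below is invented for this sketch and does not reproduce the repository's actual prompts:

```python
# Hypothetical prompt skeleton for an Attack Inference node.
INFERENCE_PROMPT = """\
You are a security analyst monitoring a honeynet.
Given the aggregated Suricata alerts below, identify the attacker's current
stage in the MITRE ATT&CK kill chain and the services most likely to keep
the attacker engaged.

Alerts:
{alerts}

Respond with: stage, rationale, recommended services (max {budget}).
"""

print(INFERENCE_PROMPT.format(alerts="[...]", budget=3))
```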
Attack PoCs — `Benchmark/attackerContainer/scripts/`
Automated exploit scripts targeting the following vulnerabilities:
- GitLab Pre-Auth Remote Command Execution (CVE-2021-22205)
- Struts2 S2-057 Remote Code Execution (CVE-2018-11776)
- Docker Remote API Unauthorized Access → Remote Code Execution
- PHP XDebug Remote Debugging Code Execution
IDS Alerts — `Benchmark/firewallContainer/log/suricata/eve.json`
Suricata logs consumed by the agent as aggregated JSON alerts to trigger reasoning cycles. Example:
```json
{
  "timestamp": "2026-02-08T18:21:57.697544+0100",
  "event_type": "alert",
  "src_ip": "192.168.100.2",
  "dest_ip": "172.20.0.3",
  "dest_port": 8080,
  "proto": "TCP",
  "alert": {
    "signature": "ET EXPLOIT Apache Struts Possible OGNL Java Exec In URI",
    "category": "Attempted User Privilege Gain",
    "severity": 1
  }
}
```

Vulnerable Assets — `Benchmark/vulnerableContainers/vulnerable/`
Emulated real-world targets derived from VulnHub.
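Since `eve.json` is newline-delimited JSON, aggregating alerts for the agent can be sketched as follows. The field names follow the EVE example above, but the aggregation logic itself is an assumption, not the repository's actual pipeline:

```python
import json

def aggregate_alerts(eve_path):
    """Count Suricata EVE alert events by signature.
    Non-alert events (flows, DNS, etc.) are skipped."""
    counts = {}
    with open(eve_path) as f:
        for line in f:  # eve.json holds one JSON event per line
            event = json.loads(line)
            if event.get("event_type") != "alert":
                continue
            sig = event["alert"]["signature"]
            counts[sig] = counts.get(sig, 0) + 1
    return counts
```

A summary like this (signature plus hit count) is a compact observation to feed an LLM reasoning cycle, rather than the raw event stream.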