A curated list of tools, frameworks, papers, and resources for AI/ML security testing, adversarial machine learning, LLM red-teaming, and agentic AI safety.
Contributions welcome! See CONTRIBUTING.md for guidelines.
- Frameworks
- LLM Red-Teaming Tools
- Adversarial ML Libraries
- Agentic AI Security
- Guardrails & Runtime Protection
- Compliance & Governance
- Vulnerability Databases
- Standards & Guidelines
- Research Papers
- Courses & Training
- Conferences & Events
Comprehensive security testing frameworks that cover multiple attack categories.
| Tool | Stars | Coverage | License |
|---|---|---|---|
| Tessera | 42 OWASP tests, 5 categories (MOD/APP/INF/DAT/AGT), full Agentic AI Top 10 | Apache 2.0 | |
| Garak | LLM vulnerability probing | Apache 2.0 | |
| Counterfit | Adversarial ML attack automation | MIT | |
| AIShield | ML model security | Apache 2.0 |
Tools specifically designed for testing Large Language Models.
| Tool | Stars | Focus | License |
|---|---|---|---|
| Tessera | 14 APP tests + 10 AGT tests, 3-phase methodology | Apache 2.0 | |
| Garak | LLM vulnerability probing and scanning | Apache 2.0 | |
| PyRIT | Python Risk Identification Toolkit for GenAI (Microsoft) | MIT | |
| Agentic Radar | Agentic workflow security scanner | Apache 2.0 | |
| ClawMoat | Runtime security scanner for AI agents | MIT | |
| LLMFuzzer | Fuzzing framework for LLMs | MIT | |
| Rebuff | Prompt injection detection | Apache 2.0 |
Libraries for adversarial attacks and defenses on ML models.
| Tool | Stars | Focus | License |
|---|---|---|---|
| IBM ART | Adversarial attacks, defenses, certifications | MIT | |
| Foolbox | Adversarial perturbations | MIT | |
| CleverHans | Adversarial examples for ML | MIT | |
| TextAttack | NLP adversarial attacks | MIT | |
| AugLy | Data augmentation for robustness testing | MIT |
Tools and resources specific to AI agent security — the OWASP Top 10 for Agentic Applications (ASI 2026).
| ASI Risk | Description | Test Tools |
|---|---|---|
| ASI-01 | Agent Goal Hijacking | Tessera AGT-03 |
| ASI-02 | Tool Misuse | Tessera AGT-02 |
| ASI-03 | Identity & Privilege Abuse | Tessera AGT-05 |
| ASI-04 | Agentic Supply Chain | Tessera AGT-01 |
| ASI-05 | Unexpected Code Execution | Tessera AGT-06 |
| ASI-06 | Memory & Context Poisoning | Tessera AGT-04 |
| ASI-07 | Insecure Inter-Agent Comms | Tessera AGT-07 |
| ASI-08 | Cascading Failures | Tessera AGT-08 |
| ASI-09 | Human-Agent Trust Exploitation | Tessera AGT-09 |
| ASI-10 | Rogue Agents | Tessera AGT-10 |
Tools that protect AI systems at runtime.
| Tool | Stars | Focus | License |
|---|---|---|---|
| LLM Guard | Input/output guardrails for LLMs | MIT | |
| NeMo Guardrails | Programmable guardrails for LLM apps | Apache 2.0 | |
| Guardrails AI | Input/output validation for LLMs | Apache 2.0 | |
| Lakera Guard | — | Prompt injection detection API | SaaS |
| Detoxify | Toxicity detection | Apache 2.0 |
Frameworks and tools for AI regulatory compliance.
| Resource | Type | Coverage |
|---|---|---|
Tessera --format compliance |
Tool | EU AI Act (42-test mapping), NIST AI RMF, SOC 2, ISO 27001 |
| EU AI Act | Regulation | EU regulation on AI systems (deadline: Aug 2, 2026) |
| NIST AI RMF | Framework | US AI risk management framework |
| ISO/IEC 42001 | Standard | AI management system standard |
| Fairlearn | Tool | ML fairness assessment |
| AI Verify | Tool | AI governance testing framework (Singapore) |
| Resource | Description |
|---|---|
| OWASP Top 10 for LLM Applications | Top 10 risks for LLM-based applications |
| OWASP Top 10 for Agentic Applications (ASI 2026) | Top 10 risks for AI agent systems |
| MITRE ATLAS | Adversarial Threat Landscape for AI Systems |
| AI Incident Database | Database of AI-related incidents and failures |
| AVID | AI Vulnerability Database |
| Standard | Organization | Focus |
|---|---|---|
| OWASP AI Testing Guide | OWASP | AI security testing methodology |
| NIST AI 100-2 | NIST | Adversarial ML taxonomy |
| ISO/IEC 27090 | ISO | Cybersecurity for AI |
| EU AI Act | European Union | AI regulation |
| Singapore AI Verify | IMDA | AI governance framework |
- A Survey of Adversarial Machine Learning in Cybersecurity — comprehensive overview of adversarial ML
- Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection — foundational prompt injection research
- Jailbroken: How Does LLM Safety Training Fail? — analysis of LLM jailbreak techniques
- The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
- Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
- Ignore This Title and HackAPrompt: Exposing Systemic Weaknesses of LLMs
- Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
| Course | Provider | Topic |
|---|---|---|
| AI Red Teaming | Microsoft | Red teaming AI systems |
| Adversarial Machine Learning | Academic | Adversarial ML fundamentals |
| LLM Security | Community | LLM-specific security |
| Damn Vulnerable LLM Agent | WithSecure | Hands-on LLM agent security |
| Event | Focus |
|---|---|
| OWASP Global AppSec | Application security (AI track) |
| DEF CON AI Village | AI security research |
| NeurIPS ML Safety Workshop | ML safety and robustness |
| IEEE SaTML | Security and trustworthy ML |
Contributions welcome! Please submit a PR with:
- Tool name, link, and brief description
- Star badge if it's a GitHub project
- Correct category placement
