Built-in Rules
Firmis ships with 287 built-in detection rules across 21 threat categories, covering prompt injection, credential harvesting, supply chain attacks, and more.
Summary
Section titled “Summary”| Severity | Count |
|---|---|
| 🔴 Critical | 82 |
| 🟠 High | 143 |
| 🟡 Medium | 59 |
| 🟢 Low | 3 |
Access Control
Section titled “Access Control”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
ac-002 | Authentication Bypass Patterns | 🔴 Critical | 60% | All |
ac-003 | JWT None Algorithm or Weak Signing | 🔴 Critical | 60% | All |
ac-001 | API Key or Token in URL Query Parameter | 🟠 High | 55% | All |
Rule Details
Section titled “Rule Details”ac-002 — Authentication Bypass Patterns
Section titled “ac-002 — Authentication Bypass Patterns”Severity: 🔴 Critical | Category: Access Control | Confidence threshold: 60% | Platforms: All
Detects hardcoded boolean flags and query parameters used to bypass authentication checks in agent code or configurations
Remediation:
Authentication bypass flags are critical vulnerabilities that remove access controls. Remove all hardcoded is_admin, skip_auth, and bypass_auth flags from agent code. Authentication decisions must be made by the identity provider, not boolean flags that can be trivially modified. Use role-based access control (RBAC) instead.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0043
ac-003 — JWT None Algorithm or Weak Signing
Section titled “ac-003 — JWT None Algorithm or Weak Signing”Severity: 🔴 Critical | Category: Access Control | Confidence threshold: 60% | Platforms: All
Detects JWT configurations using the ‘none’ algorithm or weak symmetric secrets, enabling token forgery attacks
Remediation:
JWT ‘none’ algorithm allows forging tokens without a valid signature. Always use RS256 or ES256 (asymmetric) for production systems. Never disable JWT verification. Reject tokens with ‘none’ algorithm explicitly. Use cryptographically random secrets of at least 256 bits for HS256.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0043
ac-001 — API Key or Token in URL Query Parameter
Section titled “ac-001 — API Key or Token in URL Query Parameter”Severity: 🟠 High | Category: Access Control | Confidence threshold: 55% | Platforms: All
Detects API keys, tokens, and secrets passed as URL query parameters instead of headers, exposing credentials in logs and browser history
Remediation:
API keys and tokens in URL query parameters are logged by web servers, proxies, CDNs, and browser history in plaintext. Use HTTP Authorization headers or request body parameters instead. Never embed secrets in URLs.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0043
Agent Memory Poisoning
Section titled “Agent Memory Poisoning”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
kc-006 | SpAIware Persistent Memory Injection | 🔴 Critical | 50% | All |
mem-003 | Agent Config File Modification | 🔴 Critical | 50% | All |
aci-002 | Agent Memory Injection via External Write | 🟠 High | 55% | All |
mem-001 | Agent Memory File Write | 🟠 High | 60% | All |
mem-002 | Session/Conversation File Access | 🟠 High | 60% | All |
mem-005 | Copilot Instructions Manipulation | 🟠 High | 60% | All |
mem-006 | OpenAI Agents Memory Manipulation | 🟠 High | 60% | All |
mem-007 | Aider Agent Config Manipulation | 🟠 High | 60% | All |
mem-008 | Memory Injection via Instruction-Like Content (MINJA) | 🟠 High | 55% | All |
mem-004 | Time-Delayed Execution | 🟡 Medium | 60% | All |
mem-009 | Inter-Session Message Without Provenance | 🟡 Medium | 60% | openclaw |
Rule Details
Section titled “Rule Details”kc-006 — SpAIware Persistent Memory Injection
Section titled “kc-006 — SpAIware Persistent Memory Injection”Severity: 🔴 Critical | Category: Agent Memory Poisoning | Confidence threshold: 50% | Platforms: All
Detects memory tools that auto-execute stored instructions without user approval
Remediation:
Memory tools that auto-execute stored instructions enable persistent injection (SpAIware). Attackers inject instructions that survive across sessions. All memory write operations must require explicit user approval.
mem-003 — Agent Config File Modification
Section titled “mem-003 — Agent Config File Modification”Severity: 🔴 Critical | Category: Agent Memory Poisoning | Confidence threshold: 50% | Platforms: All
Modifies agent platform config files (.clawdbot/, .openclaw/, .claude/)
Remediation:
Skills must not modify agent platform configuration files. This could inject malicious MCP servers or change security settings.
aci-002 — Agent Memory Injection via External Write
Section titled “aci-002 — Agent Memory Injection via External Write”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 55% | Platforms: All
Detects write operations targeting agent memory files (MEMORY.md, memory/ directories, long-term memory stores, conversation history) from external or untrusted input sources. Attackers inject persistent malicious instructions that survive across sessions.
Remediation:
Agent memory stores must validate the source of all write operations. Implement write-ahead logging for memory modifications. Never allow external inputs to directly modify agent memory without owner verification.
References:
- Agents of Chaos (arXiv:2602.20021) — CS10: Non-owner injected constitutional rules into agent memory
- OWASP LLM05 (Supply Chain Vulnerabilities)
mem-001 — Agent Memory File Write
Section titled “mem-001 — Agent Memory File Write”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: All
Writes to agent persistent memory files (MEMORY.md, .memories/) — potential memory poisoning
Remediation:
Skills should not modify agent memory files. This could be used to inject persistent malicious instructions that survive across sessions.
mem-002 — Session/Conversation File Access
Section titled “mem-002 — Session/Conversation File Access”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: All
Reads agent session or conversation log files — potential data exfiltration
Remediation:
Skills should not read agent session or conversation files. This may be an attempt to exfiltrate conversation data.
mem-005 — Copilot Instructions Manipulation
Section titled “mem-005 — Copilot Instructions Manipulation”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: All
Writes to .github/copilot-instructions.md — persistent Copilot behavior injection
Remediation:
Skills should not modify GitHub Copilot instruction files. This could inject persistent malicious behavior into Copilot-assisted development.
mem-006 — OpenAI Agents Memory Manipulation
Section titled “mem-006 — OpenAI Agents Memory Manipulation”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: All
Writes to AGENTS.md or .codex/ — OpenAI Codex/Agents persistent memory injection
Remediation:
Skills should not modify OpenAI agent memory files. This could inject persistent malicious instructions.
mem-007 — Aider Agent Config Manipulation
Section titled “mem-007 — Aider Agent Config Manipulation”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: All
Writes to .aider/ config or .aider.conf.yml — Aider AI agent manipulation
Remediation:
Skills should not modify Aider AI agent configuration. This could inject malicious instructions or change security settings.
mem-008 — Memory Injection via Instruction-Like Content (MINJA)
Section titled “mem-008 — Memory Injection via Instruction-Like Content (MINJA)”Severity: 🟠 High | Category: Agent Memory Poisoning | Confidence threshold: 55% | Platforms: All
Detects instruction-like content injected into agent memory files — MINJA attack (NeurIPS 2025) achieves 95%+ success rate via query-only interaction
Remediation:
Memory injection (MINJA, NeurIPS 2025) poisons agent persistent memory with instruction-like content that overrides the agent’s behavior on future queries. Memory files should contain only factual data, never behavioral directives. Sanitize memory content by stripping instruction patterns before persisting.
References:
mem-004 — Time-Delayed Execution
Section titled “mem-004 — Time-Delayed Execution”Severity: 🟡 Medium | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: All
Uses time-delayed execution patterns — may be evading real-time analysis
Remediation:
Long time delays in AI agent skills are suspicious. Legitimate skills should execute promptly, not schedule deferred actions.
mem-009 — Inter-Session Message Without Provenance
Section titled “mem-009 — Inter-Session Message Without Provenance”Severity: 🟡 Medium | Category: Agent Memory Poisoning | Confidence threshold: 60% | Platforms: openclaw
Detects sessions_send patterns where messages lack provenance markers — GHSA-w5c7
Remediation:
Inter-session messages must carry explicit provenance metadata. Add inputProvenance: { kind: “inter_session”, sessionId: … } to all messages delivered via sessions_send. Without this, a compromised session can inject instructions.
References:
- GHSA-w5c7-9qqw-6645
credential-extraction
Section titled “credential-extraction”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
cex-001 | Browser Cookie Extraction | 🔴 Critical | 50% | All |
cex-002 | Password Manager Access | 🟠 High | 50% | All |
cex-003 | Credential File Enumeration | 🟠 High | 50% | All |
Rule Details
Section titled “Rule Details”cex-001 — Browser Cookie Extraction
Section titled “cex-001 — Browser Cookie Extraction”Severity: 🔴 Critical | Category: credential-extraction | Confidence threshold: 50% | Platforms: All
Extracts cookies or session data from browser profiles — credential theft from another app
Remediation:
Extracting credentials from browser cookie stores accesses another application’s authentication material. This is a credential theft vector if triggered by prompt injection. Require explicit operator approval.
cex-002 — Password Manager Access
Section titled “cex-002 — Password Manager Access”Severity: 🟠 High | Category: credential-extraction | Confidence threshold: 50% | Platforms: All
Accesses password managers or OS keychains to extract stored credentials
Remediation:
Accessing password managers from an agent context could leak credentials if the agent is compromised.
cex-003 — Credential File Enumeration
Section titled “cex-003 — Credential File Enumeration”Severity: 🟠 High | Category: credential-extraction | Confidence threshold: 50% | Platforms: All
Reads or enumerates credential storage files from other applications
Remediation:
Enumerating credential files could expose stored secrets to the agent’s context where they become exfiltrable.
Credential Harvesting
Section titled “Credential Harvesting”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
cred-002 | SSH Private Key Access | 🔴 Critical | 75% | All |
cred-005 | Browser Cookie/Credential Access | 🔴 Critical | 85% | All |
cred-006 | Keychain/Credential Manager Access | 🔴 Critical | 80% | All |
cred-015 | Container Environment Variable Theft | 🔴 Critical | 55% | All |
cred-018 | Python Subprocess Credential Theft | 🔴 Critical | 70% | All |
cred-020 | Service Role Keys in MCP Config | 🔴 Critical | 50% | mcp, claude, cursor |
cred-001 | AWS Credentials Access | 🟠 High | 80% | All |
cred-003 | GCP Service Account Key | 🟠 High | 80% | All |
cred-007 | Git Credentials Access | 🟠 High | 75% | All |
cred-008 | NPM Token Access | 🟠 High | 80% | All |
cred-009 | Docker Credentials Access | 🟠 High | 80% | All |
cred-010 | Kubernetes Credentials Access | 🟠 High | 80% | All |
cred-011 | API Key in Config | 🟠 High | 50% | All |
cred-012 | Azure CLI Credentials Access | 🟠 High | 70% | All |
cred-013 | AWS SSO Token Cache Access | 🟠 High | 70% | All |
cred-014 | Vault Token File Access | 🟠 High | 70% | All |
cred-016 | Python Pathlib Credential Access | 🟠 High | 70% | All |
cred-017 | Python Open Credential File | 🟠 High | 70% | All |
cred-019 | API Base URL Override for Key Exfiltration | 🟠 High | 55% | All |
kc-011 | Environment Variable Serialization to File | 🟠 High | 55% | All |
kc-012 | Credential Staging to Temp File | 🟠 High | 55% | All |
adv-004 | Credential Path via path.join or homedir | 🟡 Medium | 55% | All |
cred-004 | Environment Variable Harvesting | 🟡 Medium | 70% | All |
Rule Details
Section titled “Rule Details”cred-002 — SSH Private Key Access
Section titled “cred-002 — SSH Private Key Access”Severity: 🔴 Critical | Category: Credential Harvesting | Confidence threshold: 75% | Platforms: All
Detects access to SSH private keys
Remediation:
SSH keys should never be accessed by AI agents. Use SSH agent forwarding or API-based access.
cred-005 — Browser Cookie/Credential Access
Section titled “cred-005 — Browser Cookie/Credential Access”Severity: 🔴 Critical | Category: Credential Harvesting | Confidence threshold: 85% | Platforms: All
Detects access to browser credential stores
Remediation:
Never access browser credential stores. This is highly suspicious behavior.
cred-006 — Keychain/Credential Manager Access
Section titled “cred-006 — Keychain/Credential Manager Access”Severity: 🔴 Critical | Category: Credential Harvesting | Confidence threshold: 80% | Platforms: All
Detects access to OS credential managers
Remediation:
Do not access OS credential managers directly. Request credentials through secure channels.
cred-015 — Container Environment Variable Theft
Section titled “cred-015 — Container Environment Variable Theft”Severity: 🔴 Critical | Category: Credential Harvesting | Confidence threshold: 55% | Platforms: All
Detects reading /proc/1/environ to steal container credentials
Remediation:
Reading /proc/*/environ exposes all environment variables including secrets injected by container orchestrators. Use the runtime’s secret management instead.
cred-018 — Python Subprocess Credential Theft
Section titled “cred-018 — Python Subprocess Credential Theft”Severity: 🔴 Critical | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects Python subprocess calls targeting credential stores
Remediation:
Do not use subprocess to access credential stores. Use official SDKs with proper authentication.
cred-020 — Service Role Keys in MCP Config
Section titled “cred-020 — Service Role Keys in MCP Config”Severity: 🔴 Critical | Category: Credential Harvesting | Confidence threshold: 50% | Platforms: mcp, claude, cursor
Detects Supabase service_role keys or admin-level secrets passed directly in MCP server configurations
Remediation:
Service role keys and admin secrets must never be passed directly in MCP server configurations. These keys bypass Row Level Security and grant full database access. Use environment variables with restricted scopes and anon keys for client-side MCP servers.
cred-001 — AWS Credentials Access
Section titled “cred-001 — AWS Credentials Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 80% | Platforms: All
Detects access to AWS credential files
Remediation:
Remove direct access to AWS credentials. Use environment variables or IAM roles instead.
cred-003 — GCP Service Account Key
Section titled “cred-003 — GCP Service Account Key”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 80% | Platforms: All
Detects access to Google Cloud service account keys
Remediation:
Use Workload Identity or Application Default Credentials instead of service account keys.
cred-007 — Git Credentials Access
Section titled “cred-007 — Git Credentials Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 75% | Platforms: All
Detects access to Git credential storage
Remediation:
Use Git credential helpers or SSH keys instead of accessing credential files directly.
cred-008 — NPM Token Access
Section titled “cred-008 — NPM Token Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 80% | Platforms: All
Detects access to NPM authentication tokens
Remediation:
Use npm login or CI/CD secret management instead of embedding tokens.
cred-009 — Docker Credentials Access
Section titled “cred-009 — Docker Credentials Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 80% | Platforms: All
Detects access to Docker authentication
Remediation:
Use Docker credential helpers instead of storing credentials in config.json.
cred-010 — Kubernetes Credentials Access
Section titled “cred-010 — Kubernetes Credentials Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 80% | Platforms: All
Detects access to Kubernetes configs
Remediation:
Use RBAC and service accounts instead of accessing kubeconfig directly.
cred-011 — API Key in Config
Section titled “cred-011 — API Key in Config”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 50% | Platforms: All
Detects API keys and tokens hardcoded in configuration files
Remediation:
Never hardcode API keys or tokens. Use environment variables, secrets managers, or credential vaults.
cred-012 — Azure CLI Credentials Access
Section titled “cred-012 — Azure CLI Credentials Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects access to Azure CLI credential files
Remediation:
Remove direct access to Azure CLI credentials. Use managed identities or service principals with proper RBAC.
cred-013 — AWS SSO Token Cache Access
Section titled “cred-013 — AWS SSO Token Cache Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects access to AWS SSO cached tokens
Remediation:
Remove direct access to AWS SSO token cache. Use the AWS SDK with proper credential providers.
cred-014 — Vault Token File Access
Section titled “cred-014 — Vault Token File Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects access to HashiCorp Vault token files
Remediation:
Remove direct access to Vault token files. Use AppRole or Kubernetes auth methods for automated credential retrieval.
cred-016 — Python Pathlib Credential Access
Section titled “cred-016 — Python Pathlib Credential Access”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects Python pathlib-based access to credential files using the / operator
Remediation:
Do not construct paths to credential files using Python pathlib or os.path. Use environment variables or credential providers.
cred-017 — Python Open Credential File
Section titled “cred-017 — Python Open Credential File”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects Python open() calls targeting credential files
Remediation:
Do not open credential files directly. Use credential providers or environment variables.
cred-019 — API Base URL Override for Key Exfiltration
Section titled “cred-019 — API Base URL Override for Key Exfiltration”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 55% | Platforms: All
Detects ANTHROPIC_BASE_URL / OPENAI_BASE_URL overrides that redirect API calls (with auth keys) to attacker-controlled endpoints — CVE-2026-21852
Remediation:
Overriding API base URLs redirects all API traffic (including auth headers with API keys) to a potentially malicious endpoint. Only use official API endpoints. CVE-2026-21852 demonstrated this attack vector for Anthropic API key theft.
References:
- CVE-2026-21852
kc-011 — Environment Variable Serialization to File
Section titled “kc-011 — Environment Variable Serialization to File”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 55% | Platforms: All
Detects full process.env serialization written to log or temp files
Remediation:
Serializing the entire process.env to a file exposes all environment variables including API keys, database credentials, and tokens. Only log specific, non-sensitive configuration values.
kc-012 — Credential Staging to Temp File
Section titled “kc-012 — Credential Staging to Temp File”Severity: 🟠 High | Category: Credential Harvesting | Confidence threshold: 55% | Platforms: All
Detects encoding and writing of credentials to temporary files for later exfiltration
Remediation:
Encoding credentials and writing them to temp files is a staging technique for later exfiltration. Credentials should never be written to disk in any form.
adv-004 — Credential Path via path.join or homedir
Section titled “adv-004 — Credential Path via path.join or homedir”Severity: 🟡 Medium | Category: Credential Harvesting | Confidence threshold: 55% | Platforms: All
Detects credential file access constructed via path.join(homedir(), ‘.ssh’) pattern — evades static file-access rules
Remediation:
Constructing credential paths via path.join(homedir(), ‘.ssh’) evades static file path detection rules. This is functionally identical to accessing ~/.ssh/id_rsa. AI agents should never access credential directories regardless of path construction method.
cred-004 — Environment Variable Harvesting
Section titled “cred-004 — Environment Variable Harvesting”Severity: 🟡 Medium | Category: Credential Harvesting | Confidence threshold: 70% | Platforms: All
Detects bulk enumeration or targeted access to sensitive environment variables
Remediation:
Only access specific, required environment variables. Never serialize the entire environment.
cross-agent-propagation
Section titled “cross-agent-propagation”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
kc-007 | Cross-Repo Agent Propagation | 🔴 Critical | 50% | All |
mat-002 | Missing Authority Verification | 🔴 Critical | 60% | All |
mat-001 | Cross-Agent Trust Without Verification | 🟠 High | 55% | All |
Rule Details
Section titled “Rule Details”kc-007 — Cross-Repo Agent Propagation
Section titled “kc-007 — Cross-Repo Agent Propagation”Severity: 🔴 Critical | Category: cross-agent-propagation | Confidence threshold: 50% | Platforms: All
Detects skills or tools that modify agent config files across repositories and push changes
Remediation:
Skills that modify CLAUDE.md or copilot-instructions.md and push to remote repos can spread malicious instructions across projects (AgentHopper attack). Never allow automated modification of agent instruction files.
mat-002 — Missing Authority Verification
Section titled “mat-002 — Missing Authority Verification”Severity: 🔴 Critical | Category: cross-agent-propagation | Confidence threshold: 60% | Platforms: All
Agent configurations with no owner or authority verification, allowing any caller to invoke tools or access agent state
Remediation:
Every agent must verify the identity and authority of input sources. Implement role-based access control (RBAC) for all agent interactions. Define and enforce an owner/authority hierarchy in agent configuration. MCP servers must require authentication tokens for all tool invocations.
References:
- Agents of Chaos (arXiv:2602.20021) — CS2: Agent returned confidential data to non-owner
- Agents of Chaos — CS8: Attacker impersonated owner with username change
- OWASP LLM01 (Prompt Injection)
- MITRE ATLAS AML.T0051
mat-001 — Cross-Agent Trust Without Verification
Section titled “mat-001 — Cross-Agent Trust Without Verification”Severity: 🟠 High | Category: cross-agent-propagation | Confidence threshold: 55% | Platforms: All
Multi-agent configs where agents can modify each other”s state without mutual authentication or identity verification
Remediation:
Multi-agent systems must implement mutual authentication between agents. Never allow one agent to modify another agent”s state, memory, or configuration. Use signed messages for inter-agent communication and verify sender identity.
References:
- Agents of Chaos (arXiv:2602.20021) — CS10: Corrupted agent removed server members
- Agents of Chaos — CS11: Agent broadcast false accusations to 52+ agents
- MITRE ATLAS AML.T0048
Data Exfiltration
Section titled “Data Exfiltration”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
adv-001 | Process Environment in HTTP Body | 🔴 Critical | 50% | All |
exfil-011 | Cloud Metadata Service Access (IMDS/SSRF) | 🔴 Critical | 50% | All |
adv-006 | Base64-Decoded Network Hostname | 🟠 High | 55% | All |
adv-015 | Suspicious MCP Server Environment Variables | 🟠 High | 50% | All |
exfil-001 | Suspicious External HTTP Request | 🟠 High | 70% | All |
exfil-003 | File Upload to External Service | 🟠 High | 75% | All |
exfil-004 | DNS Exfiltration Pattern | 🟠 High | 80% | All |
exfil-006 | Screenshot Capture | 🟠 High | 80% | All |
exfil-008 | Archive Creation Before Upload | 🟠 High | 75% | All |
exfil-012 | WebSocket Exfiltration | 🟠 High | 70% | All |
kc-008 | DNS Exfiltration via Encoded Subdomain | 🟠 High | 55% | All |
kc-009 | Render-Based Data Exfiltration | 🟠 High | 55% | All |
kc-010 | Clipboard Content Exfiltration | 🟠 High | 55% | All |
adv-012 | String Concatenation URL Construction | 🟡 Medium | 60% | All |
exfil-002 | Base64 Encoded Data Transmission | 🟡 Medium | 55% | All |
exfil-005 | Clipboard Data Access | 🟡 Medium | 70% | All |
exfil-007 | Bulk File Read Pattern | 🟡 Medium | 65% | All |
exfil-009 | Webhook Data Transmission | 🟡 Medium | 70% | All |
exfil-010 | Email Data Transmission | 🟡 Medium | 70% | All |
Rule Details
Section titled “Rule Details”adv-001 — Process Environment in HTTP Body
Section titled “adv-001 — Process Environment in HTTP Body”Severity: 🔴 Critical | Category: Data Exfiltration | Confidence threshold: 50% | Platforms: All
Detects process.env passed directly in fetch/request body — common exfiltration of all environment variables
Remediation:
Passing the entire process.env in an HTTP request body exfiltrates all environment variables including secrets, tokens, and API keys. Access only specific required variables and never transmit the entire environment.
exfil-011 — Cloud Metadata Service Access (IMDS/SSRF)
Section titled “exfil-011 — Cloud Metadata Service Access (IMDS/SSRF)”Severity: 🔴 Critical | Category: Data Exfiltration | Confidence threshold: 50% | Platforms: All
Detects access to cloud instance metadata services for credential theft
Remediation:
Cloud metadata service access from agent code is extremely suspicious. This is the primary vector for SSRF-to-credential-theft in cloud environments. Agents should never access instance metadata endpoints directly.
adv-006 — Base64-Decoded Network Hostname
Section titled “adv-006 — Base64-Decoded Network Hostname”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 55% | Platforms: All
Detects hostnames decoded from base64 at runtime — obfuscated network destination
Remediation:
Decoding hostnames from base64 at runtime hides the actual network destination from static analysis. This is used to evade URL pattern matching rules. All network destinations should be clearly visible in source code.
adv-015 — Suspicious MCP Server Environment Variables
Section titled “adv-015 — Suspicious MCP Server Environment Variables”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 50% | Platforms: All
Detects MCP server config with environment variables that enable data forwarding or credential inclusion
Remediation:
MCP server environment variables like INCLUDE_ENV=true and FORWARD_HEADERS=true instruct the server to include all environment variables or authentication headers in outbound requests. This silently exfiltrates secrets to upstream endpoints. Only pass specific, required environment variables to MCP servers.
exfil-001 — Suspicious External HTTP Request
Section titled “exfil-001 — Suspicious External HTTP Request”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 70% | Platforms: All
Detects HTTP requests to suspicious TLDs or tunneling services
Remediation:
Review all external HTTP requests. Ensure they go to legitimate, expected endpoints.
exfil-003 — File Upload to External Service
Section titled “exfil-003 — File Upload to External Service”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 75% | Platforms: All
Detects file uploads to external services
Remediation:
Review file uploads to external services. Ensure sensitive data is not being exfiltrated.
exfil-004 — DNS Exfiltration Pattern
Section titled “exfil-004 — DNS Exfiltration Pattern”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 80% | Platforms: All
Detects potential DNS-based data exfiltration
Remediation:
DNS queries with dynamic subdomains may indicate data exfiltration. Review DNS usage.
exfil-006 — Screenshot Capture
Section titled “exfil-006 — Screenshot Capture”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 80% | Platforms: All
Detects screenshot capture API calls or library imports
Remediation:
Screenshot capture is highly sensitive. Ensure this is explicitly requested by the user.
exfil-008 — Archive Creation Before Upload
Section titled “exfil-008 — Archive Creation Before Upload”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 75% | Platforms: All
Detects creating archives before network transmission
Remediation:
Creating archives before upload may indicate bulk data exfiltration. Review carefully.
exfil-012 — WebSocket Exfiltration
Section titled “exfil-012 — WebSocket Exfiltration”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 70% | Platforms: All
Detects WebSocket connections that may exfiltrate data to external servers
Remediation:
WebSocket connections can maintain persistent channels for data exfiltration. Verify the destination server is trusted and the data being sent is appropriate.
kc-008 — DNS Exfiltration via Encoded Subdomain
Section titled “kc-008 — DNS Exfiltration via Encoded Subdomain”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 55% | Platforms: All
Detects DNS lookup tools where subdomain contains dynamically encoded data
Remediation:
DNS tools that encode data into subdomains can exfiltrate sensitive information through DNS queries that bypass network security controls. CVE-2025-55284 demonstrated this in Claude Code.
kc-009 — Render-Based Data Exfiltration
Section titled “kc-009 — Render-Based Data Exfiltration”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 55% | Platforms: All
Detects analytics pixels or render outputs that encode sensitive data in URLs
Remediation:
Mermaid diagrams, markdown images, and HTML renders can exfiltrate data by encoding it into external URLs. The attacker server receives the data when the image loads.
kc-010 — Clipboard Content Exfiltration
Section titled “kc-010 — Clipboard Content Exfiltration”Severity: 🟠 High | Category: Data Exfiltration | Confidence threshold: 55% | Platforms: All
Detects clipboard access followed by outbound transmission
Remediation:
Clipboard access combined with outbound network requests enables exfiltration of copied passwords, tokens, and sensitive data. Clipboard access should require explicit user consent.
adv-012 — String Concatenation URL Construction
Section titled “adv-012 — String Concatenation URL Construction”Severity: 🟡 Medium | Category: Data Exfiltration | Confidence threshold: 60% | Platforms: All
Detects URL construction via string concatenation to evade URL pattern matching
Remediation:
Building URLs via string concatenation (e.g., ‘htt’ + ‘ps://’) is an evasion technique to prevent static scanners from detecting the full URL. Legitimate code should use complete URL strings or well-known URL construction APIs.
exfil-002 — Base64 Encoded Data Transmission
Section titled “exfil-002 — Base64 Encoded Data Transmission”Severity: 🟡 Medium | Category: Data Exfiltration | Confidence threshold: 55% | Platforms: All
Detects base64 encoding combined with network transmission in the same file
Remediation:
Base64 encoding before transmission may indicate data obfuscation. Review the data being sent.
exfil-005 — Clipboard Data Access
Section titled “exfil-005 — Clipboard Data Access”Severity: 🟡 Medium | Category: Data Exfiltration | Confidence threshold: 70% | Platforms: All
Detects access to clipboard contents
Remediation:
Clipboard access should be minimized. Review why clipboard data is being accessed.
exfil-007 — Bulk File Read Pattern
Section titled “exfil-007 — Bulk File Read Pattern”Severity: 🟡 Medium | Category: Data Exfiltration | Confidence threshold: 65% | Platforms: All
Detects reading multiple files in rapid succession
Remediation:
Bulk file reading should be scoped to specific directories. Review the access pattern.
exfil-009 — Webhook Data Transmission
Section titled “exfil-009 — Webhook Data Transmission”Severity: 🟡 Medium | Category: Data Exfiltration | Confidence threshold: 70% | Platforms: All
Detects data transmission via webhooks
Remediation:
Webhook data transmission should only send expected, non-sensitive data.
exfil-010 — Email Data Transmission
Section titled “exfil-010 — Email Data Transmission”Severity: 🟡 Medium | Category: Data Exfiltration | Confidence threshold: 70% | Platforms: All
Detects sending data via email
Remediation:
Email transmission should be explicitly requested. Review what data is being sent.
File System Abuse
Section titled “File System Abuse”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
fs-003 | System Account File Access | 🔴 Critical | 55% | All |
fs-005 | Kernel Memory Access | 🔴 Critical | 50% | All |
fs-008 | Temp Directory Code Execution | 🔴 Critical | 60% | All |
fs-010 | Recursive Directory Deletion | 🔴 Critical | 55% | All |
fs-001 | /proc Filesystem Enumeration | 🟠 High | 60% | All |
fs-002 | System Log Manipulation | 🟠 High | 55% | All |
fs-004 | Symlink Attack | 🟠 High | 65% | All |
fs-007 | Symlink Attack to Sensitive Files | 🟠 High | 55% | All |
fs-009 | Audit Log Manipulation | 🟠 High | 55% | All |
fs-011 | Config Include Path Traversal | 🟠 High | 60% | openclaw |
fs-012 | Local File Path in Media URL Parameter | 🟠 High | 60% | openclaw |
fs-006 | Insecure File Permissions | 🟡 Medium | 65% | All |
Rule Details
Section titled “Rule Details”fs-003 — System Account File Access
Section titled “fs-003 — System Account File Access”Severity: 🔴 Critical | Category: File System Abuse | Confidence threshold: 55% | Platforms: All
Detects reads of system authentication and authorization files
Remediation:
Reading system account files (/etc/passwd, /etc/shadow, /etc/sudoers, /etc/group) is a strong indicator of credential harvesting or privilege escalation preparation. /etc/shadow contains password hashes and must never be accessed by an AI agent. Remove all access to these files. Use dedicated APIs for any legitimate user lookup needs.
fs-005 — Kernel Memory Access
Section titled “fs-005 — Kernel Memory Access”Severity: 🔴 Critical | Category: File System Abuse | Confidence threshold: 50% | Platforms: All
Detects access to kernel memory devices and raw memory operations
Remediation:
Access to kernel memory devices (/dev/mem, /dev/kmem, /dev/port) is an extreme security violation that enables arbitrary memory reads, rootkit installation, and kernel-level compromise. mmap with PROT_EXEC is a code injection technique. This code must be removed immediately. No AI agent should ever touch kernel memory.
fs-008 — Temp Directory Code Execution
Section titled “fs-008 — Temp Directory Code Execution”Severity: 🔴 Critical | Category: File System Abuse | Confidence threshold: 60% | Platforms: All
Detects patterns of writing executable code to /tmp and then executing it — a classic malware staging technique
Remediation:
Writing code to /tmp and executing it is a standard malware staging technique. /tmp is world-writable and persists across processes, making it ideal for staging payloads. AI agents must never write executable content to temporary directories. Use secure temporary file handling with mode 600 and never execute temp files.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
fs-010 — Recursive Directory Deletion
Section titled “fs-010 — Recursive Directory Deletion”Severity: 🔴 Critical | Category: File System Abuse | Confidence threshold: 55% | Platforms: All
Detects recursive deletion commands targeting system or application directories, which can cause irreversible data destruction
Remediation:
Recursive deletion of system or application directories is destructive and irreversible. AI agents must never delete directories recursively without strict path validation. Implement path allowlists for deletion operations. Never allow deletion of paths matching /, ~, $HOME, or well-known system directories.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
fs-001 — /proc Filesystem Enumeration
Section titled “fs-001 — /proc Filesystem Enumeration”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 60% | Platforms: All
Detects access to /proc filesystem entries used for reconnaissance and credential theft
Remediation:
Access to /proc filesystem entries is a strong indicator of reconnaissance activity. AI agents should never read /proc entries outside of explicitly approved diagnostic tools. For container environments, /proc/1/environ access is a known credential theft vector. Remove all /proc reads and use legitimate APIs for any required system information.
fs-002 — System Log Manipulation
Section titled “fs-002 — System Log Manipulation”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 55% | Platforms: All
Detects reads, writes, or destruction of system log files to cover tracks
Remediation:
System log access or modification is a serious indicator of anti-forensic activity. AI agents must never read, write, truncate, or delete system log files. Disabling syslog or auditd services to evade detection is a critical security event. Remove all log manipulation code and review why the agent requires log access.
fs-004 — Symlink Attack
Section titled “fs-004 — Symlink Attack”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 65% | Platforms: All
Detects creation of symbolic links pointing to sensitive system paths
Remediation:
Symlink creation targeting sensitive paths (/etc, /root, ~/.ssh, ~/.aws) is a common privilege escalation and path traversal technique. AI agents should never create symlinks without explicit, scoped authorization. Remove symlink creation code and audit the intent behind any file redirection logic.
fs-007 — Symlink Attack to Sensitive Files
Section titled “fs-007 — Symlink Attack to Sensitive Files”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 55% | Platforms: All
Detects creation of symbolic links pointing to sensitive system files or directories, enabling path traversal and unauthorized access
Remediation:
Symlinks to credential files (/etc/shadow, ~/.ssh/id_rsa, ~/.aws/credentials) enable path traversal attacks where a process reading an “innocent” path is redirected to a sensitive file. Remove all symlinks to sensitive paths. Ensure tmp directories are on separate filesystems to prevent symlink races.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
fs-009 — Audit Log Manipulation
Section titled “fs-009 — Audit Log Manipulation”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 55% | Platforms: All
Detects truncation, clearing, or deletion of audit and application log files to destroy forensic evidence
Remediation:
Log manipulation is a critical anti-forensic action. Audit logs are the primary mechanism for detecting and reconstructing security incidents. AI agents must never truncate, delete, or disable logging systems. Implement log integrity controls (append-only, remote syslog) to prevent tampering.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
fs-011 — Config Include Path Traversal
Section titled “fs-011 — Config Include Path Traversal”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 60% | Platforms: openclaw
Detects $include directives referencing absolute paths or directory traversal — CVE-2026-32061
Remediation:
Validate that all $include paths resolve within the config root after symlink resolution. Reject absolute paths and sequences containing ’../’.
References:
- CVE-2026-32061
fs-012 — Local File Path in Media URL Parameter
Section titled “fs-012 — Local File Path in Media URL Parameter”Severity: 🟠 High | Category: File System Abuse | Confidence threshold: 60% | Platforms: openclaw
Detects media URL parameters set to local filesystem paths — CVE-2026-26321
Remediation:
Media URL parameters must be validated against approved schemes (https:// only). Local filesystem paths must never be accepted as media sources.
References:
- CVE-2026-26321
fs-006 — Insecure File Permissions
Section titled “fs-006 — Insecure File Permissions”Severity: 🟡 Medium | Category: File System Abuse | Confidence threshold: 65% | Platforms: All
Detects creation of files or directories with world-writable or overly permissive modes
Remediation:
Overly permissive file modes (777, 666) allow any user on the system to read or modify files, undermining access control and enabling privilege escalation. umask(0) is particularly dangerous as it makes all subsequently created files world-accessible. Use the principle of least privilege: apply only the minimum permissions required. Prefer 640 for files and 750 for directories. Never use 777 in production code.
Insecure Configuration
Section titled “Insecure Configuration”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
aci-001 | Agent Identity File Tampering | 🔴 Critical | 60% | All |
ic-002 | SSL/TLS Verification Disabled | 🔴 Critical | 60% | All |
ic-004 | Claude Code RCE via Malicious Hooks | 🔴 Critical | 50% | claude |
kc-005 | MCP Config File Injection | 🔴 Critical | 50% | All |
adv-007 | Wildcard Permission in Skill Definition | 🟠 High | 50% | All |
ic-003 | Default or Hardcoded Credentials in Config Files | 🟠 High | 55% | All |
ic-005 | Cursor Auto-Execute on Folder Open | 🟠 High | 50% | cursor, codex |
ic-006 | Unauthenticated Local WebSocket Endpoint | 🟠 High | 55% | openclaw, mcp, claude |
ic-001 | Debug Mode Enabled in Production Config | 🟡 Medium | 50% | All |
Rule Details
Section titled “Rule Details”aci-001 — Agent Identity File Tampering
Section titled “aci-001 — Agent Identity File Tampering”Severity: 🔴 Critical | Category: Insecure Configuration | Confidence threshold: 60% | Platforms: All
Detects write or modify operations targeting agent identity and configuration files (SOUL.md, IDENTITY.md, AGENTS.md, BOOTSTRAP.md, USER.md). Attackers overwrite these files to perform identity spoofing or inject constitutional rules.
Remediation:
Agent identity and configuration files (SOUL.md, IDENTITY.md, etc.) must be read-only. Never grant write access to these files via tool definitions or external input channels. Use file system permissions (chmod 444) and validate file integrity with checksums.
References:
- Agents of Chaos (arXiv:2602.20021) — CS8: Identity spoofing via IDENTITY.md overwrite
- Agents of Chaos — CS10: Constitutional injection via memory file modification
ic-002 — SSL/TLS Verification Disabled
Section titled “ic-002 — SSL/TLS Verification Disabled”Severity: 🔴 Critical | Category: Insecure Configuration | Confidence threshold: 60% | Platforms: All
Detects configurations that disable SSL/TLS certificate verification, enabling man-in-the-middle attacks on agent network connections
Remediation:
Disabling SSL/TLS verification allows man-in-the-middle attacks where an attacker intercepts and modifies all HTTPS traffic without detection. This is never acceptable in production code. Remove all verify=False, rejectUnauthorized:false, and InsecureSkipVerify:true configurations. Use a proper CA bundle for self-signed certs.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0049
ic-004 — Claude Code RCE via Malicious Hooks
Section titled “ic-004 — Claude Code RCE via Malicious Hooks”Severity: 🔴 Critical | Category: Insecure Configuration | Confidence threshold: 50% | Platforms: claude
Detects malicious shell commands in .claude/settings.json hooks — CVE-2025-59536 (CVSS 8.7). Attackers commit poisoned settings that spawn reverse shells or exfiltrate data when Claude Code executes hooks.
Remediation:
Malicious .claude/settings.json hooks can execute arbitrary commands when Claude Code runs (CVE-2025-59536). Never commit .claude/settings.json to shared repos. Audit all hook commands for suspicious patterns: curl/wget piping to shell, base64 decoding, reverse shells, or backgrounded processes.
References:
- CVE-2025-59536
kc-005 — MCP Config File Injection
Section titled “kc-005 — MCP Config File Injection”Severity: 🔴 Critical | Category: Insecure Configuration | Confidence threshold: 50% | Platforms: All
Detects tools that write to .mcp.json to add new MCP servers — potential supply chain injection
Remediation:
Writing to .mcp.json programmatically can inject attacker-controlled MCP servers into the agent toolchain. MCP server configuration should only be modified by the user directly, never by tools or scripts.
adv-007 — Wildcard Permission in Skill Definition
Section titled “adv-007 — Wildcard Permission in Skill Definition”Severity: 🟠 High | Category: Insecure Configuration | Confidence threshold: 50% | Platforms: All
Detects wildcard permissions (shell:, filesystem:, network:*) in skill or tool definitions
Remediation:
Wildcard permissions grant unrestricted access. A weather tool should not need shell:* or filesystem:*. Request only the minimal permissions required for the tool’s stated purpose (e.g., network:read for a weather API tool).
ic-003 — Default or Hardcoded Credentials in Config Files
Section titled “ic-003 — Default or Hardcoded Credentials in Config Files”Severity: 🟠 High | Category: Insecure Configuration | Confidence threshold: 55% | Platforms: All
Detects default, well-known, or hardcoded credentials in configuration files that should use secrets management instead
Remediation:
Hardcoded credentials in configuration files are a critical security risk. They are committed to version control, visible to all team members, and cannot be rotated without code changes. Use environment variables, secrets managers (Vault, AWS Secrets Manager, Azure Key Vault), or .env files (gitignored). Rotate all credentials that may have been exposed in version history.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0043
ic-005 — Cursor Auto-Execute on Folder Open
Section titled “ic-005 — Cursor Auto-Execute on Folder Open”Severity: 🟠 High | Category: Insecure Configuration | Confidence threshold: 50% | Platforms: cursor, codex
Detects .vscode/tasks.json with runOn:folderOpen that auto-executes shell commands when a project is opened in Cursor/VS Code
Remediation:
Tasks with runOn:folderOpen execute automatically when a project is opened. Attackers commit malicious .vscode/tasks.json files that run arbitrary commands without user interaction. Remove runOn:folderOpen from untrusted projects and review all task commands.
References:
- CVE-2025-59944
ic-006 — Unauthenticated Local WebSocket Endpoint
Section titled “ic-006 — Unauthenticated Local WebSocket Endpoint”Severity: 🟠 High | Category: Insecure Configuration | Confidence threshold: 55% | Platforms: openclaw, mcp, claude
Detects local WebSocket/HTTP server configs bound to loopback without auth — GHSA-qpjj
Remediation:
Loopback-only binding is insufficient. Any website can initiate WebSocket connections to localhost. Require a shared secret token on every WebSocket upgrade request.
ic-001 — Debug Mode Enabled in Production Config
Section titled “ic-001 — Debug Mode Enabled in Production Config”Severity: 🟡 Medium | Category: Insecure Configuration | Confidence threshold: 50% | Platforms: All
Detects debug mode flags enabled in application or agent configurations, which expose stack traces, internal state, and disable security controls
Remediation:
Debug mode exposes detailed error messages, stack traces, and internal state that attackers can use to understand application structure and find vulnerabilities. In production: set DEBUG=false, NODE_ENV=production, and disable verbose error pages. Use structured logging to capture diagnostic information without exposing it to end users.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
Known Malicious Patterns
Section titled “Known Malicious Patterns”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
mal-infra-001 | Known Malicious C2/Exfiltration Infrastructure | 🔴 Critical | 30% | All |
mal-infra-002 | Known Malicious GitHub Resources | 🔴 Critical | 30% | All |
mal-sandworm-001 | SANDWORM MCP Config Injection | 🔴 Critical | 40% | All |
mal-skill-001 | Known Malicious Skill Name (Programmatic Campaign) | 🔴 Critical | 30% | openclaw |
mal-skill-002 | Known Malicious Skill (Unicode Contraband / DAN Jailbreaks) | 🔴 Critical | 30% | openclaw |
mal-skill-003 | Known Malicious Skill (Credential Harvesting) | 🔴 Critical | 30% | openclaw |
mal-skill-004 | ClawHavoc Campaign Skills | 🔴 Critical | 30% | openclaw |
mal-skill-005 | ClawHavoc YouTube Imitation Skills | 🔴 Critical | 30% | openclaw |
mal-typo-001 | ClawHub Typosquatting Pattern | 🔴 Critical | 30% | All |
yara-001 | Obfuscated Base64 Payload | 🔴 Critical | 40% | All |
yara-002 | Reverse Shell Pattern | 🔴 Critical | 40% | All |
yara-003 | Credential Stealer Signature | 🔴 Critical | 40% | All |
yara-006 | RAT/Backdoor Pattern | 🔴 Critical | 40% | All |
mal-author-001 | Known Malicious Author | 🟠 High | 30% | openclaw |
mal-updater-001 | Fake Auto-Updater Skill | 🟠 High | 40% | openclaw |
yara-005 | Coin Miner Signature | 🟠 High | 40% | All |
Rule Details
Section titled “Rule Details”mal-infra-001 — Known Malicious C2/Exfiltration Infrastructure
Section titled “mal-infra-001 — Known Malicious C2/Exfiltration Infrastructure”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: All
Code references known malicious command-and-control servers or exfiltration endpoints
Remediation:
This code communicates with known malicious infrastructure. Remove the skill and investigate potential data exfiltration.
mal-infra-002 — Known Malicious GitHub Resources
Section titled “mal-infra-002 — Known Malicious GitHub Resources”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: All
References to GitHub repositories known to host malware payloads
Remediation:
This references a known malware distribution point. Remove the skill and scan your system for compromise indicators.
mal-sandworm-001 — SANDWORM MCP Config Injection
Section titled “mal-sandworm-001 — SANDWORM MCP Config Injection”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: All
Detects MCP server injection patterns used by the SANDWORM_MODE worm to persist in Claude, Cursor, and Continue IDE configs
Remediation:
The SANDWORM worm injects malicious MCP servers into IDE configs (~/.claude/, ~/.cursor/, ~/.continue/) to maintain persistence. If you see unexpected MCP server entries, remove them and audit your npm packages for postinstall scripts that modify IDE configs.
References:
mal-skill-001 — Known Malicious Skill Name (Programmatic Campaign)
Section titled “mal-skill-001 — Known Malicious Skill Name (Programmatic Campaign)”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: openclaw
Skill matches a known malicious skill from the zaycv/Aslaep123 campaigns: programmatic malware distribution via ClawHub
Remediation:
Remove this skill immediately. It is a confirmed malicious package from a known attacker campaign. Report to ClawHub/OpenClaw security team.
mal-skill-002 — Known Malicious Skill (Unicode Contraband / DAN Jailbreaks)
Section titled “mal-skill-002 — Known Malicious Skill (Unicode Contraband / DAN Jailbreaks)”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: openclaw
Skill matches known malicious packages using Unicode contraband and DAN jailbreak techniques
Remediation:
Remove this skill immediately. Uses Unicode contraband to hide malicious instructions and DAN jailbreaks to bypass safety.
mal-skill-003 — Known Malicious Skill (Credential Harvesting)
Section titled “mal-skill-003 — Known Malicious Skill (Credential Harvesting)”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: openclaw
Skill matches known packages that harvest credentials, credit cards, or session data
Remediation:
Remove this skill immediately. It is a confirmed credential-harvesting or data-theft package.
mal-skill-004 — ClawHavoc Campaign Skills
Section titled “mal-skill-004 — ClawHavoc Campaign Skills”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: openclaw
Skill matches known ClawHavoc campaign: reverse shells, direct exfiltration, and YouTube imitation skills
Remediation:
Remove this skill immediately. Part of the ClawHavoc malware campaign with reverse shell and exfiltration capabilities.
mal-skill-005 — ClawHavoc YouTube Imitation Skills
Section titled “mal-skill-005 — ClawHavoc YouTube Imitation Skills”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: openclaw
Skill impersonates YouTube utilities to deliver malware
Remediation:
Remove this skill. It impersonates a YouTube utility to deliver malicious payloads.
mal-typo-001 — ClawHub Typosquatting Pattern
Section titled “mal-typo-001 — ClawHub Typosquatting Pattern”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: All
Detects typosquatted variations of ‘clawhub’ used in malware campaigns
Remediation:
This is a typosquatted version of ClawHub, a known malware distribution technique. Remove the skill and verify your package sources.
yara-001 — Obfuscated Base64 Payload
Section titled “yara-001 — Obfuscated Base64 Payload”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: All
Detects base64 decode combined with dynamic code execution — multi-layer obfuscation
Remediation:
No remediation guidance available.
yara-002 — Reverse Shell Pattern
Section titled “yara-002 — Reverse Shell Pattern”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: All
Detects classic reverse shell byte patterns across languages
Remediation:
No remediation guidance available.
yara-003 — Credential Stealer Signature
Section titled “yara-003 — Credential Stealer Signature”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: All
Detects high-risk credential file access combined with data exfiltration to suspicious targets
Remediation:
No remediation guidance available.
yara-006 — RAT/Backdoor Pattern
Section titled “yara-006 — RAT/Backdoor Pattern”Severity: 🔴 Critical | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: All
Detects remote access trojan and backdoor communication patterns
Remediation:
No remediation guidance available.
mal-author-001 — Known Malicious Author
Section titled “mal-author-001 — Known Malicious Author”Severity: 🟠 High | Category: Known Malicious Patterns | Confidence threshold: 30% | Platforms: openclaw
Content authored by a known malicious actor who has published 40+ confirmed malicious skills
Remediation:
Skills by this author should be treated as malicious. Remove immediately and audit your system for compromise.
mal-updater-001 — Fake Auto-Updater Skill
Section titled “mal-updater-001 — Fake Auto-Updater Skill”Severity: 🟠 High | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: openclaw
Detects skills masquerading as auto-updaters, a common malware delivery mechanism
Remediation:
Legitimate AI skills do not auto-update themselves. This is likely a malware delivery mechanism. Remove immediately.
yara-005 — Coin Miner Signature
Section titled “yara-005 — Coin Miner Signature”Severity: 🟠 High | Category: Known Malicious Patterns | Confidence threshold: 40% | Platforms: All
Detects cryptocurrency mining code and configuration patterns
Remediation:
No remediation guidance available.
Malware Distribution
Section titled “Malware Distribution”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
adv-013 | Remote Code Fetch and Execute | 🔴 Critical | 50% | All |
malware-002 | Password-Protected Archive Extraction | 🔴 Critical | 50% | All |
malware-003 | Base64-Encoded Command Execution | 🔴 Critical | 50% | All |
malware-004 | Remote Script Piping | 🔴 Critical | 40% | All |
malware-001 | Remote Archive Download | 🟠 High | 60% | All |
malware-005 | System Service Manipulation | 🟠 High | 60% | All |
malware-006 | Fake Prerequisite Installation Instructions | 🟡 Medium | 60% | All |
Rule Details
Section titled “Rule Details”adv-013 — Remote Code Fetch and Execute
Section titled “adv-013 — Remote Code Fetch and Execute”Severity: 🔴 Critical | Category: Malware Distribution | Confidence threshold: 50% | Platforms: All
Detects fetching remote content and executing it via eval/Function — delayed payload delivery
Remediation:
Fetching content from a remote URL and executing it via new Function() or eval() is a classic delayed payload delivery mechanism. The code appears clean at scan time but loads malicious payloads from attacker-controlled servers at runtime.
malware-002 — Password-Protected Archive Extraction
Section titled “malware-002 — Password-Protected Archive Extraction”Severity: 🔴 Critical | Category: Malware Distribution | Confidence threshold: 50% | Platforms: All
Extracts password-protected archives — used to evade static analysis
Remediation:
Password-protected archives are commonly used to evade antivirus and static analysis. This is highly suspicious in an AI agent context.
malware-003 — Base64-Encoded Command Execution
Section titled “malware-003 — Base64-Encoded Command Execution”Severity: 🔴 Critical | Category: Malware Distribution | Confidence threshold: 50% | Platforms: All
Executes base64-encoded commands — used to obfuscate malicious payloads
Remediation:
Base64-encoded execution is a classic obfuscation technique. Decode and review the payload before allowing this skill.
malware-004 — Remote Script Piping
Section titled “malware-004 — Remote Script Piping”Severity: 🔴 Critical | Category: Malware Distribution | Confidence threshold: 40% | Platforms: All
Pipes remote content directly to shell execution — classic malware delivery
Remediation:
Never pipe remote content directly to a shell interpreter. Download, verify, then execute separately.
malware-001 — Remote Archive Download
Section titled “malware-001 — Remote Archive Download”Severity: 🟠 High | Category: Malware Distribution | Confidence threshold: 60% | Platforms: All
Downloads archive files from GitHub releases or remote URLs — common malware delivery vector
Remediation:
Downloading archives from remote URLs is a common malware delivery technique. Verify the source and use package managers instead.
malware-005 — System Service Manipulation
Section titled “malware-005 — System Service Manipulation”Severity: 🟠 High | Category: Malware Distribution | Confidence threshold: 60% | Platforms: All
Modifies system services or daemons — potential persistence mechanism
Remediation:
AI agent skills should not manipulate system services. This may indicate a persistence mechanism.
malware-006 — Fake Prerequisite Installation Instructions
Section titled “malware-006 — Fake Prerequisite Installation Instructions”Severity: 🟡 Medium | Category: Malware Distribution | Confidence threshold: 60% | Platforms: All
Skill documentation instructs users to run suspicious installation commands
Remediation:
Review installation instructions carefully. Legitimate skills should not require manual downloads from unknown sources.
Network Abuse
Section titled “Network Abuse”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
na-007 | Reverse Shell Patterns | 🔴 Critical | 60% | All |
net-001 | Bind Shell | 🔴 Critical | 60% | All |
na-006 | DNS Exfiltration via Long Subdomain Queries | 🟠 High | 55% | All |
na-008 | Cryptocurrency Mining Endpoints | 🟠 High | 60% | All |
na-011 | MCP SSRF — Internal Network Access via Tool Parameters | 🟠 High | 55% | mcp, claude, cursor |
na-012 | Unrestricted gatewayUrl Override (SSRF) | 🟠 High | 65% | openclaw, mcp |
na-013 | Browser CDP Relay Without Auth | 🟠 High | 60% | openclaw |
na-014 | Dangerous URL Scheme in Browser Navigation | 🟠 High | 65% | openclaw |
net-002 | Raw Socket Creation | 🟠 High | 65% | All |
net-003 | SSH Tunneling | 🟠 High | 60% | All |
net-005 | DNS Covert Channel | 🟠 High | 60% | All |
na-009 | Tor Network and Anonymizing Proxy Connections | 🟡 Medium | 55% | All |
net-004 | Proxy and Tor Usage | 🟡 Medium | 65% | All |
na-010 | Non-Standard Port Usage for HTTP/HTTPS | 🟢 Low | 45% | All |
Rule Details
Section titled “Rule Details”na-007 — Reverse Shell Patterns
Section titled “na-007 — Reverse Shell Patterns”Severity: 🔴 Critical | Category: Network Abuse | Confidence threshold: 60% | Platforms: All
Detects reverse shell one-liners that connect back to an attacker-controlled host, providing interactive shell access
Remediation:
Reverse shells provide attackers with interactive command execution on compromised systems. These are unambiguous attack payloads — no legitimate use case exists for reverse shell one-liners in agent code. Remove immediately and investigate the source of this code.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
net-001 — Bind Shell
Section titled “net-001 — Bind Shell”Severity: 🔴 Critical | Category: Network Abuse | Confidence threshold: 60% | Platforms: All
Detects server-side bind shell patterns that open a listening port for incoming attacker connections
Remediation:
Bind shells open a network listener that an attacker can connect to directly. AI agents should never create raw TCP listeners. Remove all socket.bind/listen and net.createServer patterns unless they are part of a documented, sandboxed service with explicit user consent.
na-006 — DNS Exfiltration via Long Subdomain Queries
Section titled “na-006 — DNS Exfiltration via Long Subdomain Queries”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 55% | Platforms: All
Detects patterns of DNS exfiltration where data is encoded into unusually long subdomain labels to bypass network monitoring
Remediation:
DNS exfiltration encodes stolen data as subdomains of attacker-controlled domains. Each DNS query carries a fragment of exfiltrated content that bypasses HTTP/HTTPS monitoring. Implement DNS monitoring and block queries with unusually long labels. AI agents must not construct or resolve dynamically-encoded DNS queries.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM02 Insecure Output
- https://atlas.mitre.org/techniques/AML.T0049
na-008 — Cryptocurrency Mining Endpoints
Section titled “na-008 — Cryptocurrency Mining Endpoints”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 60% | Platforms: All
Detects connections to known cryptocurrency mining pool endpoints and mining-related protocol patterns
Remediation:
Cryptocurrency mining in agent environments consumes unauthorized compute resources and may indicate a broader supply-chain compromise. Remove all mining software, pool connections, and mining algorithm references. Investigate how this code was introduced.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM05 Supply Chain
- https://atlas.mitre.org/techniques/AML.T0049
na-011 — MCP SSRF — Internal Network Access via Tool Parameters
Section titled “na-011 — MCP SSRF — Internal Network Access via Tool Parameters”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 55% | Platforms: mcp, claude, cursor
Detects SSRF patterns in MCP tool parameters where URLs point to internal/localhost/metadata ranges. 36.7% of MCP servers are vulnerable.
Remediation:
MCP tool parameters must not accept URLs pointing to internal networks, localhost, or cloud metadata endpoints. Implement URL validation and allowlisting on the server side. Block private IP ranges (10.x, 172.16-31.x, 192.168.x), localhost, and metadata endpoints (169.254.169.254).
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
na-012 — Unrestricted gatewayUrl Override (SSRF)
Section titled “na-012 — Unrestricted gatewayUrl Override (SSRF)”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 65% | Platforms: openclaw, mcp
Detects gatewayUrl parameters pointing to private/internal addresses or cloud metadata — CVE-2026-26322
Remediation:
The gatewayUrl parameter must be validated against an explicit allowlist. Block all private IP ranges, localhost, and cloud metadata IPs.
References:
- CVE-2026-26322
na-013 — Browser CDP Relay Without Auth
Section titled “na-013 — Browser CDP Relay Without Auth”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 60% | Platforms: openclaw
Detects /cdp WebSocket endpoints that may lack token validation — GHSA-mr32
Remediation:
CDP relay endpoints must require a shared secret token on every WebSocket upgrade and validate the Origin header. Without both controls, any website can steal session data.
na-014 — Dangerous URL Scheme in Browser Navigation
Section titled “na-014 — Dangerous URL Scheme in Browser Navigation”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 65% | Platforms: openclaw
Detects file://, javascript:, or data: URL schemes in browser navigation — GHSA-45cg
Remediation:
Browser navigation guards must reject all URL schemes except http:// and https://. Use a deny-by-default approach for URL scheme validation.
net-002 — Raw Socket Creation
Section titled “net-002 — Raw Socket Creation”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 65% | Platforms: All
Detects creation of raw network sockets that bypass normal OS protocol stacks, enabling packet crafting and sniffing
Remediation:
Raw sockets allow crafting arbitrary network packets and capturing all traffic on an interface. This capability is not required by legitimate AI agents. Remove raw socket usage and use higher-level network APIs instead.
net-003 — SSH Tunneling
Section titled “net-003 — SSH Tunneling”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 60% | Platforms: All
Detects SSH-based tunneling and port-forwarding patterns used to bypass firewalls or exfiltrate data covertly
Remediation:
SSH tunneling can be used to bypass network controls, exfiltrate data, or grant reverse access to internal systems. AI agents should not establish SSH port-forwards or tunnels. Remove these patterns entirely.
net-005 — DNS Covert Channel
Section titled “net-005 — DNS Covert Channel”Severity: 🟠 High | Category: Network Abuse | Confidence threshold: 60% | Platforms: All
Detects DNS-over-HTTPS used as a covert communication channel and DNS tunneling tools that encode data in DNS queries
Remediation:
DNS covert channels encode data in DNS query subdomains or use DoH endpoints to bypass firewalls while exfiltrating data or maintaining C2 communication. AI agents should not use DNS-over-HTTPS programmatically or invoke DNS tunneling tools. Remove all such patterns and use standard HTTPS APIs instead.
na-009 — Tor Network and Anonymizing Proxy Connections
Section titled “na-009 — Tor Network and Anonymizing Proxy Connections”Severity: 🟡 Medium | Category: Network Abuse | Confidence threshold: 55% | Platforms: All
Detects .onion domain connections and Tor/proxy configurations used to anonymize malicious network activity
Remediation:
Tor and .onion connections are used to anonymize communication with C2 servers and exfiltrate data beyond network monitoring. AI agents must use direct, auditable connections only. Remove all Tor proxy configurations and .onion references.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM02 Insecure Output
- https://atlas.mitre.org/techniques/AML.T0049
net-004 — Proxy and Tor Usage
Section titled “net-004 — Proxy and Tor Usage”Severity: 🟡 Medium | Category: Network Abuse | Confidence threshold: 65% | Platforms: All
Detects use of SOCKS proxies, proxy chaining tools, and the Tor network to anonymize or reroute network traffic
Remediation:
Proxy and Tor usage in agent code can be used to anonymize malicious activity or bypass network monitoring. AI agents should use direct connections only. Remove SOCKS proxy configuration and Tor-related dependencies.
na-010 — Non-Standard Port Usage for HTTP/HTTPS
Section titled “na-010 — Non-Standard Port Usage for HTTP/HTTPS”Severity: 🟢 Low | Category: Network Abuse | Confidence threshold: 45% | Platforms: All
Detects HTTP or HTTPS traffic on non-standard ports, commonly used to bypass firewall rules and evade traffic inspection
Remediation:
Non-standard ports are frequently used to evade port-based firewall rules and network monitoring configured for standard ports (80, 443). Review all network connections using non-standard ports to ensure they are documented and authorized.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM02 Insecure Output
permission-bypass
Section titled “permission-bypass”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
kc-001 | Agent Config Permission Bypass | 🔴 Critical | 50% | All |
kc-002 | Agent Instruction File Rewrite | 🔴 Critical | 50% | All |
kc-003 | MCP Wildcard Permission Grant | 🔴 Critical | 50% | All |
pbp-001 | YOLO Mode / No-Approval Execution | 🔴 Critical | 50% | All |
pbp-003 | Sandbox Escape | 🔴 Critical | 50% | All |
pbp-002 | Full Disk Access Request | 🟠 High | 50% | All |
Rule Details
Section titled “Rule Details”kc-001 — Agent Config Permission Bypass
Section titled “kc-001 — Agent Config Permission Bypass”Severity: 🔴 Critical | Category: permission-bypass | Confidence threshold: 50% | Platforms: All
Detects writes to agent configuration files that disable human approval or enable autonomous mode
Remediation:
Tools that write agent configuration files to disable human approval enable autonomous execution without oversight. CVE-2025-53773 demonstrated this attack. Never allow tools to modify approval settings programmatically.
kc-002 — Agent Instruction File Rewrite
Section titled “kc-002 — Agent Instruction File Rewrite”Severity: 🔴 Critical | Category: permission-bypass | Confidence threshold: 50% | Platforms: All
Detects tools that write to CLAUDE.md, copilot-instructions.md, or other agent instruction files
Remediation:
Writing to agent instruction files (CLAUDE.md, copilot-instructions.md) modifies how AI agents behave. This enables persistent injection of malicious instructions that persist across sessions.
kc-003 — MCP Wildcard Permission Grant
Section titled “kc-003 — MCP Wildcard Permission Grant”Severity: 🔴 Critical | Category: permission-bypass | Confidence threshold: 50% | Platforms: All
Detects MCP server configurations with wildcard permissions granting unrestricted access
Remediation:
MCP server configurations with wildcard permissions grant unrestricted access to system resources. Always use principle of least privilege with specific allowed paths, commands, and hosts.
pbp-001 — YOLO Mode / No-Approval Execution
Section titled “pbp-001 — YOLO Mode / No-Approval Execution”Severity: 🔴 Critical | Category: permission-bypass | Confidence threshold: 50% | Platforms: All
Skill uses —yolo, —full-auto, or similar flags that disable safety confirmations
Remediation:
Disabling safety confirmations removes the human-in-the-loop barrier. If the agent is compromised via prompt injection, bypassed permissions allow arbitrary execution without operator approval.
pbp-003 — Sandbox Escape
Section titled “pbp-003 — Sandbox Escape”Severity: 🔴 Critical | Category: permission-bypass | Confidence threshold: 50% | Platforms: All
Skill explicitly escapes or disables sandboxing
Remediation:
Escaping the sandbox removes containment boundaries. A compromised agent with host-level access can affect the entire system.
pbp-002 — Full Disk Access Request
Section titled “pbp-002 — Full Disk Access Request”Severity: 🟠 High | Category: permission-bypass | Confidence threshold: 50% | Platforms: All
Skill requires or requests macOS Full Disk Access — overly broad system permission
Remediation:
Full Disk Access grants read access to all files on disk, far exceeding what most tools need. Request only the minimum permissions required for the tool’s function.
Permission Overgrant
Section titled “Permission Overgrant”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
perm-002 | Maximum Blast Radius Permission Combo | 🔴 Critical | 70% | openclaw |
po-005 | Agent Filesystem Write to Sensitive Directories | 🔴 Critical | 60% | All |
perm-001 | Wildcard Permission | 🟠 High | 50% | openclaw |
po-004 | MCP Server Wildcard Tool Permissions | 🟠 High | 55% | mcp, claude, openclaw |
po-007 | Allow-All Network Policy | 🟠 High | 55% | All |
perm-003 | Dangerous Tool Declarations | 🟡 Medium | 50% | openclaw |
po-006 | Overly Broad CORS Configuration | 🟡 Medium | 55% | All |
Rule Details
Section titled “Rule Details”perm-002 — Maximum Blast Radius Permission Combo
Section titled “perm-002 — Maximum Blast Radius Permission Combo”Severity: 🔴 Critical | Category: Permission Overgrant | Confidence threshold: 70% | Platforms: openclaw
Skill requests shell + network + filesystem permissions in a permissions block — maximum attack surface
Remediation:
Skills with shell + network + filesystem access can exfiltrate any data. This combination should be carefully reviewed.
po-005 — Agent Filesystem Write to Sensitive Directories
Section titled “po-005 — Agent Filesystem Write to Sensitive Directories”Severity: 🔴 Critical | Category: Permission Overgrant | Confidence threshold: 60% | Platforms: All
Detects agent configurations or code requesting write access to sensitive system directories like /etc, /root, or ~/.ssh
Remediation:
Agents must not write to system directories (/etc, /root, /boot, ~/.ssh). Confine filesystem write permissions to the application’s own data directory. Use explicit path allowlists, never path-prefix grants to system locations.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0049
perm-001 — Wildcard Permission
Section titled “perm-001 — Wildcard Permission”Severity: 🟠 High | Category: Permission Overgrant | Confidence threshold: 50% | Platforms: openclaw
Skill requests wildcard permissions granting unrestricted access
Remediation:
Avoid wildcard permissions. Request only the specific permissions needed (e.g., shell:read, filesystem:home).
po-004 — MCP Server Wildcard Tool Permissions
Section titled “po-004 — MCP Server Wildcard Tool Permissions”Severity: 🟠 High | Category: Permission Overgrant | Confidence threshold: 55% | Platforms: mcp, claude, openclaw
Detects MCP server configurations that request wildcard or all-tools permissions, granting unrestricted tool access
Remediation:
MCP servers must declare the minimum set of tools required. Wildcard tool permissions grant agents access to every registered tool, including dangerous ones. Enumerate the specific tools needed explicitly.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0043
po-007 — Allow-All Network Policy
Section titled “po-007 — Allow-All Network Policy”Severity: 🟠 High | Category: Permission Overgrant | Confidence threshold: 55% | Platforms: All
Detects network policies or firewall rules that permit all inbound or outbound traffic, removing network isolation
Remediation:
Allow-all network policies remove critical isolation for agent environments. Define explicit allowlists for permitted endpoints and ports. Apply zero-trust network principles: deny by default, allow by exception.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0049
perm-003 — Dangerous Tool Declarations
Section titled “perm-003 — Dangerous Tool Declarations”Severity: 🟡 Medium | Category: Permission Overgrant | Confidence threshold: 50% | Platforms: openclaw
Skill declares tools that provide excessive system access
Remediation:
Minimize tool access in skill declarations. Use the most restrictive tools that accomplish the task.
po-006 — Overly Broad CORS Configuration
Section titled “po-006 — Overly Broad CORS Configuration”Severity: 🟡 Medium | Category: Permission Overgrant | Confidence threshold: 55% | Platforms: All
Detects CORS policies that allow all origins, enabling cross-origin attacks on agent APIs
Remediation:
CORS wildcard (Access-Control-Allow-Origin: *) allows any website to make cross-origin requests to your agent API, enabling data theft and CSRF attacks. Restrict allowed origins to an explicit allowlist of trusted domains.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
Privilege Escalation
Section titled “Privilege Escalation”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
pe-011 | Sudo/Root Escalation in Agent Config | 🔴 Critical | 60% | All |
pe-013 | Docker Privileged Container or Capability Addition | 🔴 Critical | 60% | All |
pe-014 | AWS IAM Wildcard Permission Policy | 🔴 Critical | 65% | All |
pe-015 | Setuid/Setgid Bit Setting | 🔴 Critical | 65% | All |
privesc-001 | Sudo/Root Command Execution | 🔴 Critical | 85% | All |
privesc-002 | Process Injection Patterns | 🔴 Critical | 90% | All |
privesc-004 | Setuid/Capability Manipulation | 🔴 Critical | 85% | All |
privesc-007 | Kernel Module Loading | 🔴 Critical | 90% | All |
privesc-009 | Container Escape Patterns | 🔴 Critical | 85% | All |
pe-012 | World-Writable File Permission Setting | 🟠 High | 55% | All |
pe-016 | Crontab and Systemd Persistence Installation | 🟠 High | 60% | All |
privesc-003 | Shell Escape Sequences | 🟠 High | 80% | All |
privesc-005 | Cron/Scheduled Task Manipulation | 🟠 High | 80% | All |
privesc-006 | Service/Daemon Manipulation | 🟠 High | 80% | All |
privesc-010 | Debugger Attachment | 🟠 High | 80% | All |
pe-017 | safeBins Trusted Directory in User-Writable Path | 🟡 Medium | 55% | openclaw |
pe-018 | Unvalidated PID Kill Without Ownership Check | 🟡 Medium | 60% | openclaw |
privesc-008 | Environment Path Manipulation | 🟡 Medium | 70% | All |
Rule Details
Section titled “Rule Details”pe-011 — Sudo/Root Escalation in Agent Config
Section titled “pe-011 — Sudo/Root Escalation in Agent Config”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 60% | Platforms: All
Detects sudo, root, or administrator escalation commands embedded in agent configurations or scripts
Remediation:
AI agents must not execute commands with elevated privileges. Remove sudo, doas, runas, and run-as-root patterns from agent configurations. Use least-privilege service accounts and grant only the minimum permissions needed.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0043
pe-013 — Docker Privileged Container or Capability Addition
Section titled “pe-013 — Docker Privileged Container or Capability Addition”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 60% | Platforms: All
Detects Docker —privileged flag or —cap-add usage that grants host-level capabilities to containers
Remediation:
Privileged containers have unrestricted access to the host kernel. Remove —privileged and high-privilege —cap-add flags. Use specific minimal capabilities only when absolutely required and document the justification.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
pe-014 — AWS IAM Wildcard Permission Policy
Section titled “pe-014 — AWS IAM Wildcard Permission Policy”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 65% | Platforms: All
Detects IAM policy statements granting wildcard Action or Resource permissions, violating least-privilege
Remediation:
IAM policies with Action: '' or combined Action: '' + Resource: ’*’ grant full AWS account access. Apply least-privilege: enumerate only the specific actions and resources required. Use IAM Access Analyzer to validate policies.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM08 Excessive Permissions
- https://atlas.mitre.org/techniques/AML.T0049
pe-015 — Setuid/Setgid Bit Setting
Section titled “pe-015 — Setuid/Setgid Bit Setting”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 65% | Platforms: All
Detects chmod commands that set the setuid or setgid bit, enabling privilege escalation via SUID binaries
Remediation:
SUID/SGID binaries run with the owner’s privileges regardless of who executes them. AI agents must never set the setuid or setgid bit. Remove all chmod +s and numeric setuid/setgid modes. Audit any binary with these bits already set.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
privesc-001 — Sudo/Root Command Execution
Section titled “privesc-001 — Sudo/Root Command Execution”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 85% | Platforms: All
Detects attempts to execute commands with elevated privileges
Remediation:
AI agents should never execute commands with elevated privileges. Remove sudo/su usage.
privesc-002 — Process Injection Patterns
Section titled “privesc-002 — Process Injection Patterns”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 90% | Platforms: All
Detects process injection or DLL injection patterns
Remediation:
Process injection is a serious security concern. This should never be in an AI agent.
privesc-004 — Setuid/Capability Manipulation
Section titled “privesc-004 — Setuid/Capability Manipulation”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 85% | Platforms: All
Detects attempts to modify file permissions or capabilities
Remediation:
File permission and capability manipulation can lead to privilege escalation. Remove these.
privesc-007 — Kernel Module Loading
Section titled “privesc-007 — Kernel Module Loading”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 90% | Platforms: All
Detects attempts to load kernel modules
Remediation:
Kernel module manipulation is extremely dangerous. This should never be in an AI agent.
privesc-009 — Container Escape Patterns
Section titled “privesc-009 — Container Escape Patterns”Severity: 🔴 Critical | Category: Privilege Escalation | Confidence threshold: 85% | Platforms: All
Detects attempts to escape container environments
Remediation:
Container escape attempts are critical security issues. Remove these patterns.
pe-012 — World-Writable File Permission Setting
Section titled “pe-012 — World-Writable File Permission Setting”Severity: 🟠 High | Category: Privilege Escalation | Confidence threshold: 55% | Platforms: All
Detects chmod commands that set world-writable or world-executable bits, enabling privilege escalation via file replacement
Remediation:
Overly permissive file modes (777, 666) and root ownership changes are privilege escalation enablers. Use the principle of least privilege: 640 for files, 750 for directories. Never grant world-write permissions on files that could be executed.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0049
pe-016 — Crontab and Systemd Persistence Installation
Section titled “pe-016 — Crontab and Systemd Persistence Installation”Severity: 🟠 High | Category: Privilege Escalation | Confidence threshold: 60% | Platforms: All
Detects crontab modifications and systemd unit installations used to establish persistent backdoor execution
Remediation:
Crontab and systemd service installation are common persistence mechanisms. AI agents must not modify scheduled tasks or install services without explicit user authorization. Remove crontab edits and systemctl enable calls. Review any @reboot entries — these survive reboots and are hard to detect.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM06 Excessive Agency
- https://atlas.mitre.org/techniques/AML.T0043
privesc-003 — Shell Escape Sequences
Section titled “privesc-003 — Shell Escape Sequences”Severity: 🟠 High | Category: Privilege Escalation | Confidence threshold: 80% | Platforms: All
Detects attempts to escape restricted shells
Remediation:
Remove shell escape patterns. These attempt to break out of restricted environments.
privesc-005 — Cron/Scheduled Task Manipulation
Section titled “privesc-005 — Cron/Scheduled Task Manipulation”Severity: 🟠 High | Category: Privilege Escalation | Confidence threshold: 80% | Platforms: All
Detects modification of scheduled tasks
Remediation:
Scheduled task modification should not be performed by AI agents without explicit permission.
privesc-006 — Service/Daemon Manipulation
Section titled “privesc-006 — Service/Daemon Manipulation”Severity: 🟠 High | Category: Privilege Escalation | Confidence threshold: 80% | Platforms: All
Detects attempts to modify system services
Remediation:
System service manipulation requires careful review. Ensure this is intended behavior.
privesc-010 — Debugger Attachment
Section titled “privesc-010 — Debugger Attachment”Severity: 🟠 High | Category: Privilege Escalation | Confidence threshold: 80% | Platforms: All
Detects attempts to attach debuggers to processes
Remediation:
Debugger attachment can be used for privilege escalation. Review this carefully.
pe-017 — safeBins Trusted Directory in User-Writable Path
Section titled “pe-017 — safeBins Trusted Directory in User-Writable Path”Severity: 🟡 Medium | Category: Privilege Escalation | Confidence threshold: 55% | Platforms: openclaw
Detects exec-allowlist configs trusting user-writable package manager directories — GHSA-5gj7
Remediation:
safeBins must only trust immutable system directories (/bin, /usr/bin, /sbin). Package manager paths like /opt/homebrew/bin are writable and must not be in the default trusted set.
pe-018 — Unvalidated PID Kill Without Ownership Check
Section titled “pe-018 — Unvalidated PID Kill Without Ownership Check”Severity: 🟡 Medium | Category: Privilege Escalation | Confidence threshold: 60% | Platforms: openclaw
Detects process termination via pattern matching without verifying process ownership — CVE-2026-27486
Remediation:
Before sending SIGKILL, validate that the process is a direct child (ppid == process.pid). Never use pkill/killall as the sole process selector on shared systems.
References:
- CVE-2026-27486
privesc-008 — Environment Path Manipulation
Section titled “privesc-008 — Environment Path Manipulation”Severity: 🟡 Medium | Category: Privilege Escalation | Confidence threshold: 70% | Platforms: All
Detects PATH or library path manipulation
Remediation:
PATH manipulation can lead to binary hijacking. Review environment variable changes.
Prompt Injection
Section titled “Prompt Injection”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
adv-009 | Hidden Prompt Injection in HTML Content | 🔴 Critical | 50% | All |
adv-011 | Agent Backstory with External Data Exfiltration | 🟠 High | 55% | All |
prompt-001 | Instruction Override in Tool Description | 🟠 High | 75% | All |
prompt-002 | System Prompt Extraction | 🟠 High | 80% | All |
prompt-004 | Hidden Instructions in Unicode | 🟠 High | 85% | All |
prompt-009 | Recursive Prompt Injection | 🟠 High | 80% | All |
prompt-012 | Non-Latin Override Instructions | 🟠 High | 60% | All |
prompt-013 | Unicode Tag Characters | 🟠 High | 70% | All |
prompt-003 | Role Manipulation | 🟡 Medium | 70% | All |
prompt-005 | Delimiter Injection | 🟡 Medium | 70% | All |
prompt-006 | Encoded Instruction Injection | 🟡 Medium | 75% | All |
prompt-007 | Context Manipulation | 🟡 Medium | 75% | All |
prompt-010 | Markdown/HTML Injection | 🟡 Medium | 70% | All |
prompt-011 | Homoglyph Mixed-Script Attack | 🟡 Medium | 50% | All |
prompt-014 | Hypothetical Framing Prompt Injection | 🟡 Medium | 60% | All |
prompt-015 | Unsafe Markdown HTML Rendering (XSS via innerHTML) | 🟡 Medium | 60% | openclaw, claude, mcp |
prompt-016 | External Metadata Injected into System Prompt | 🟡 Medium | 60% | openclaw |
prompt-008 | Output Format Manipulation | 🟢 Low | 65% | All |
Rule Details
Section titled “Rule Details”adv-009 — Hidden Prompt Injection in HTML Content
Section titled “adv-009 — Hidden Prompt Injection in HTML Content”Severity: 🔴 Critical | Category: Prompt Injection | Confidence threshold: 50% | Platforms: All
Detects hidden prompt injection payloads in HTML using display:none or invisible text targeting AI agents
Remediation:
Hidden HTML elements containing instructions like [SYSTEM OVERRIDE] are indirect prompt injection attacks. Attackers embed invisible text in web pages that AI agents process, causing the agent to execute unauthorized actions. All tool output from web fetching should be sanitized and hidden content stripped before processing.
adv-011 — Agent Backstory with External Data Exfiltration
Section titled “adv-011 — Agent Backstory with External Data Exfiltration”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 55% | Platforms: All
Detects CrewAI/agent backstory text that directs the agent to send data to external services
Remediation:
Agent backstory text that instructs sending data to external URLs is a social engineering attack disguised as agent configuration. Backstories should describe the agent’s persona, not contain operational directives to exfiltrate data.
prompt-001 — Instruction Override in Tool Description
Section titled “prompt-001 — Instruction Override in Tool Description”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 75% | Platforms: All
Detects prompt injection patterns in tool/skill descriptions
Remediation:
Remove instruction override patterns from descriptions. These are prompt injection attempts.
prompt-002 — System Prompt Extraction
Section titled “prompt-002 — System Prompt Extraction”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 80% | Platforms: All
Detects attempts to extract system prompts
Remediation:
Remove prompt extraction attempts. These try to reveal confidential instructions.
prompt-004 — Hidden Instructions in Unicode
Section titled “prompt-004 — Hidden Instructions in Unicode”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 85% | Platforms: All
Detects hidden instructions using Unicode tricks
Remediation:
Remove invisible Unicode characters. These may hide malicious instructions.
prompt-009 — Recursive Prompt Injection
Section titled “prompt-009 — Recursive Prompt Injection”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 80% | Platforms: All
Detects prompts designed to inject into future contexts
Remediation:
Remove recursive injection patterns. These attempt to persist malicious instructions.
prompt-012 — Non-Latin Override Instructions
Section titled “prompt-012 — Non-Latin Override Instructions”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 60% | Platforms: All
Detects override keywords combined with non-Latin script characters
Remediation:
Remove override instructions combined with non-Latin text. This is a multi-lingual injection technique to bypass Latin-only filters.
prompt-013 — Unicode Tag Characters
Section titled “prompt-013 — Unicode Tag Characters”Severity: 🟠 High | Category: Prompt Injection | Confidence threshold: 70% | Platforms: All
Detects Unicode tag characters (U+E0001-U+E007F) used to hide invisible markup
Remediation:
Remove Unicode tag characters. These are invisible characters that can hide malicious instructions.
prompt-003 — Role Manipulation
Section titled “prompt-003 — Role Manipulation”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 70% | Platforms: All
Detects attempts to change AI behavior through role play
Remediation:
Remove role manipulation patterns. These attempt to bypass AI safety measures.
prompt-005 — Delimiter Injection
Section titled “prompt-005 — Delimiter Injection”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 70% | Platforms: All
Detects attempts to break out of delimiters
Remediation:
Remove fake delimiters that attempt to inject system-level instructions.
prompt-006 — Encoded Instruction Injection
Section titled “prompt-006 — Encoded Instruction Injection”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 75% | Platforms: All
Detects encoded or obfuscated instructions
Remediation:
Remove encoded instructions. These attempt to bypass content filtering.
prompt-007 — Context Manipulation
Section titled “prompt-007 — Context Manipulation”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 75% | Platforms: All
Detects false authorization claims and fake privilege escalation in tool descriptions
Remediation:
Remove context manipulation attempts. These try to mislead the AI about user intent.
prompt-010 — Markdown/HTML Injection
Section titled “prompt-010 — Markdown/HTML Injection”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 70% | Platforms: All
Detects attempts to inject via markdown or HTML
Remediation:
Sanitize markdown and HTML content. These may execute malicious code.
prompt-011 — Homoglyph Mixed-Script Attack
Section titled “prompt-011 — Homoglyph Mixed-Script Attack”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 50% | Platforms: All
Detects Cyrillic/Greek/Armenian characters mixed with Latin text (homoglyph attacks)
Remediation:
Remove mixed-script text. Homoglyph attacks use visually identical characters from different scripts to bypass filters.
prompt-014 — Hypothetical Framing Prompt Injection
Section titled “prompt-014 — Hypothetical Framing Prompt Injection”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 60% | Platforms: All
Detects hypothetical/imaginative framing used to bypass safety guardrails by asking the AI to imagine having access to restricted resources
Remediation:
Hypothetical framing is a prompt injection technique where the attacker asks the AI to imagine having elevated access. The AI may then act on the hypothetical scenario as if it were real. Tool descriptions should never contain hypothetical prompts or imaginative framing of access to restricted resources.
prompt-015 — Unsafe Markdown HTML Rendering (XSS via innerHTML)
Section titled “prompt-015 — Unsafe Markdown HTML Rendering (XSS via innerHTML)”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 60% | Platforms: openclaw, claude, mcp
Detects markdown parsers rendering directly to innerHTML without sanitization — GHSA-r294
Remediation:
Never render user-controlled markdown directly to innerHTML without HTML sanitization. Use DOMPurify or override the HTML token renderer in marked.js.
References:
- GHSA-r294-2894-92j3
prompt-016 — External Metadata Injected into System Prompt
Section titled “prompt-016 — External Metadata Injected into System Prompt”Severity: 🟡 Medium | Category: Prompt Injection | Confidence threshold: 60% | Platforms: openclaw
Detects Slack/channel metadata interpolated into system prompts — CVE-2026-24764
Remediation:
External metadata from third-party platforms must never be interpolated into system prompts. This creates a prompt injection channel for anyone with channel edit permissions.
References:
- CVE-2026-24764
prompt-008 — Output Format Manipulation
Section titled “prompt-008 — Output Format Manipulation”Severity: 🟢 Low | Category: Prompt Injection | Confidence threshold: 65% | Platforms: All
Detects attempts to control AI output format maliciously
Remediation:
Review output format instructions. Some may attempt to suppress safety warnings.
Secret Detection
Section titled “Secret Detection”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
sec-007 | Stripe Live Secret Key | 🔴 Critical | 55% | All |
sec-009 | Stripe Restricted Key | 🔴 Critical | 55% | All |
sec-010 | Square Application Secret | 🔴 Critical | 60% | All |
sec-011 | PayPal / Braintree Credentials | 🔴 Critical | 60% | All |
sec-035 | HashiCorp Vault Token | 🔴 Critical | 60% | All |
sec-037 | Cloudflare API Token and Key | 🔴 Critical | 60% | All |
sec-038 | Base64-Encoded Private Key | 🔴 Critical | 70% | All |
sec-056 | Supabase Service Role Key (Inline) | 🔴 Critical | 55% | All |
sec-001 | Azure Storage Account Key | 🟠 High | 70% | All |
sec-002 | Azure SAS Token | 🟠 High | 65% | All |
sec-003 | Azure Active Directory Client Secret | 🟠 High | 60% | All |
sec-004 | Azure Subscription Key (Cognitive Services / API Management) | 🟠 High | 65% | All |
sec-005 | Alibaba Cloud Access Key | 🟠 High | 70% | All |
sec-006 | IBM Cloud API Key | 🟠 High | 70% | All |
sec-008 | Stripe Live Publishable Key | 🟠 High | 60% | All |
sec-012 | Twilio Account SID and Auth Token | 🟠 High | 65% | All |
sec-013 | SendGrid API Key | 🟠 High | 60% | All |
sec-014 | Mailgun API Key | 🟠 High | 65% | All |
sec-016 | Postmark Server Token | 🟠 High | 65% | All |
sec-017 | Heroku API Key | 🟠 High | 65% | All |
sec-018 | DigitalOcean Personal Access Token | 🟠 High | 60% | All |
sec-019 | Terraform Cloud Token | 🟠 High | 65% | All |
sec-021 | CircleCI API Token | 🟠 High | 65% | All |
sec-022 | Travis CI API Token | 🟠 High | 65% | All |
sec-024 | Vercel API Token | 🟠 High | 60% | All |
sec-025 | Discord Bot Token | 🟠 High | 65% | All |
sec-027 | Twitch API Credentials | 🟠 High | 65% | All |
sec-028 | Telegram Bot Token | 🟠 High | 65% | All |
sec-029 | Facebook / Meta App Secret | 🟠 High | 60% | All |
sec-030 | Firebase API Key | 🟠 High | 60% | All |
sec-031 | Algolia Admin API Key | 🟠 High | 65% | All |
sec-034 | Datadog API and Application Keys | 🟠 High | 65% | All |
sec-036 | Consul ACL Token | 🟠 High | 65% | All |
sec-039 | Hardcoded JWT Token | 🟠 High | 55% | All |
sec-041 | Generic Secret in URL Query Parameter | 🟠 High | 70% | All |
sec-045 | Shopify Access Token | 🟠 High | 65% | All |
sec-046 | Okta API Token | 🟠 High | 65% | All |
sec-048 | Elastic Cloud API Key | 🟠 High | 65% | All |
sec-052 | Pinecone API Key | 🟠 High | 65% | All |
sec-053 | Cohere API Key | 🟠 High | 65% | All |
sec-054 | Hugging Face Token | 🟠 High | 60% | All |
sec-055 | Replicate API Token | 🟠 High | 65% | All |
sec-015 | Mailchimp API Key | 🟡 Medium | 65% | All |
sec-020 | Sentry DSN | 🟡 Medium | 70% | All |
sec-023 | Codecov Upload Token | 🟡 Medium | 65% | All |
sec-026 | Discord Webhook URL | 🟡 Medium | 70% | All |
sec-032 | Segment Write Key | 🟡 Medium | 65% | All |
sec-033 | Mixpanel Token and Secret | 🟡 Medium | 65% | All |
sec-040 | Generic API Key Assignment | 🟡 Medium | 75% | All |
sec-042 | High-Entropy Hex String Assigned to Secret Variable | 🟡 Medium | 75% | All |
sec-043 | PagerDuty Integration Key | 🟡 Medium | 65% | All |
sec-044 | Zendesk API Token | 🟡 Medium | 65% | All |
sec-047 | Atlassian API Token | 🟡 Medium | 65% | All |
sec-049 | Airtable API Key | 🟡 Medium | 65% | All |
sec-050 | Linear API Key | 🟡 Medium | 65% | All |
sec-051 | Notion Integration Token | 🟡 Medium | 65% | All |
sec-057 | Pusher Application Secret | 🟡 Medium | 65% | All |
sec-058 | Amplitude API Key and Secret | 🟡 Medium | 65% | All |
sec-059 | Mapbox Access Token | 🟡 Medium | 65% | All |
sec-060 | Intercom Access Token | 🟡 Medium | 65% | All |
Rule Details
Section titled “Rule Details”sec-007 — Stripe Live Secret Key
Section titled “sec-007 — Stripe Live Secret Key”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 55% | Platforms: All
Detects Stripe live-mode secret API keys which allow full account access
Remediation:
This is a critical incident. A Stripe live secret key can create charges, access customer data, and perform refunds. Immediately:
- Roll the key in the Stripe dashboard (Developers > API keys)
- Audit recent API calls for unauthorized activity
- Store keys exclusively in environment variables or a secrets manager
sec-009 — Stripe Restricted Key
Section titled “sec-009 — Stripe Restricted Key”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 55% | Platforms: All
Detects Stripe restricted API keys
Remediation:
Roll the restricted key immediately in the Stripe dashboard (Developers > API keys). Even restricted keys can perform sensitive operations within their scope. Store all Stripe keys in a secrets manager, never in source code.
sec-010 — Square Application Secret
Section titled “sec-010 — Square Application Secret”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Square OAuth application secrets and access tokens
Remediation:
Revoke the exposed Square credential immediately in the Square Developer dashboard under OAuth > Applications. Square application secrets can be used to impersonate your application. Rotate and store exclusively in a secrets manager or environment variables.
sec-011 — PayPal / Braintree Credentials
Section titled “sec-011 — PayPal / Braintree Credentials”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects PayPal REST API client secrets and Braintree tokens
Remediation:
Revoke PayPal/Braintree credentials immediately in their respective dashboards. These credentials can process financial transactions. Use environment variables or a secrets manager and enforce secret scanning on your repositories.
sec-035 — HashiCorp Vault Token
Section titled “sec-035 — HashiCorp Vault Token”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Vault service tokens and batch tokens used for secrets access
Remediation:
Revoke the Vault token immediately using vault token revoke <token> or
via the Vault UI. Vault tokens can access any secret in their policy scope.
Use short-TTL tokens, AppRole authentication, or Kubernetes auth instead of
static tokens.
sec-037 — Cloudflare API Token and Key
Section titled “sec-037 — Cloudflare API Token and Key”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Cloudflare API tokens and global API keys
Remediation:
Revoke the Cloudflare token in My Profile > API Tokens. Cloudflare global API keys have access to your entire account including DNS, WAF, and Workers. Use scoped API tokens (not the global key) and grant only the permissions required for the specific use case.
sec-038 — Base64-Encoded Private Key
Section titled “sec-038 — Base64-Encoded Private Key”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects base64-encoded PEM private keys used to obfuscate credentials
Remediation:
A base64-encoded private key is just as sensitive as the raw PEM key. Remove it from code immediately, rotate the key pair, and use a secrets manager or environment variable to provide keys at runtime. Encoding is not encryption and provides no security benefit.
sec-056 — Supabase Service Role Key (Inline)
Section titled “sec-056 — Supabase Service Role Key (Inline)”Severity: 🔴 Critical | Category: Secret Detection | Confidence threshold: 55% | Platforms: All
Detects full Supabase service role JWT tokens hardcoded inline
Remediation:
The Supabase service role key bypasses Row Level Security on all tables. Rotate it immediately in the Supabase dashboard under Settings > API. Never expose it in client-side code or commit it to version control. Use the anon key for client-side access and apply strict RLS policies.
sec-001 — Azure Storage Account Key
Section titled “sec-001 — Azure Storage Account Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects Azure Storage account access keys embedded in code or config
Remediation:
Never hardcode Azure Storage keys. Use managed identities, Azure Key Vault, or environment variables instead:
- Assign the Storage Blob Data Contributor role to your managed identity
- Reference secrets via Key Vault references in App Service configuration
- Rotate the exposed key immediately in the Azure portal
sec-002 — Azure SAS Token
Section titled “sec-002 — Azure SAS Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Azure Shared Access Signature tokens which grant time-limited storage access
Remediation:
SAS tokens provide direct access to Azure resources. If exposed:
- Revoke the SAS token by regenerating the storage account key it was derived from
- Use short-lived SAS tokens generated server-side on demand
- Prefer managed identities over SAS tokens for service-to-service access
sec-003 — Azure Active Directory Client Secret
Section titled “sec-003 — Azure Active Directory Client Secret”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Azure AD application client secrets used for service principal authentication
Remediation:
Rotate the Azure AD client secret immediately in the Azure portal under App Registrations > Certificates & secrets. Switch to certificate-based authentication or managed identities to avoid secret rotation entirely.
sec-004 — Azure Subscription Key (Cognitive Services / API Management)
Section titled “sec-004 — Azure Subscription Key (Cognitive Services / API Management)”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Azure API Management or Cognitive Services subscription keys
Remediation:
Regenerate the exposed subscription key in Azure API Management or Cognitive Services. Use Azure Key Vault to store and retrieve keys at runtime rather than embedding them in source or config files.
sec-005 — Alibaba Cloud Access Key
Section titled “sec-005 — Alibaba Cloud Access Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects Alibaba Cloud (Aliyun) access key IDs and secrets
Remediation:
Revoke the exposed Alibaba Cloud access key in the RAM console immediately. Use RAM roles with STS temporary credentials or instance RAM roles instead of long-lived access key pairs.
sec-006 — IBM Cloud API Key
Section titled “sec-006 — IBM Cloud API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects IBM Cloud IAM API keys
Remediation:
Revoke the IBM Cloud API key in the IAM console (Manage > Access > API keys). Use service IDs with IAM policies scoped to the minimum required permissions and generate keys via the API rather than storing static keys.
sec-008 — Stripe Live Publishable Key
Section titled “sec-008 — Stripe Live Publishable Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Stripe live-mode publishable keys which can be used to initiate payments
Remediation:
While publishable keys are designed for client-side use, they should not appear in server-side secrets files or VCS history. If the corresponding secret key is also exposed, treat this as a critical incident. Roll both keys in the Stripe dashboard.
sec-012 — Twilio Account SID and Auth Token
Section titled “sec-012 — Twilio Account SID and Auth Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Twilio account credentials which enable SMS/voice actions and billing
Remediation:
Rotate the Twilio auth token immediately in the Twilio Console under Account > General Settings. Auth tokens can send SMS/calls charged to your account. Use API Keys (more limited scope) instead of auth tokens where possible.
sec-013 — SendGrid API Key
Section titled “sec-013 — SendGrid API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects SendGrid email API keys which allow sending emails on your behalf
Remediation:
Delete the exposed SendGrid API key immediately in Settings > API Keys. Create a replacement key with the minimum required permissions (e.g., Mail Send only). Store in environment variables or a secrets manager.
sec-014 — Mailgun API Key
Section titled “sec-014 — Mailgun API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Mailgun email service API keys
Remediation:
Rotate the Mailgun API key in the Mailgun Control Panel under Settings > API Keys. Use domain-level sending keys rather than the primary account API key to limit the blast radius of a leak.
sec-016 — Postmark Server Token
Section titled “sec-016 — Postmark Server Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Postmark transactional email server API tokens
Remediation:
Regenerate the Postmark server token in the Postmark app under Servers > API Tokens. Use separate tokens per environment and store in a secrets manager.
sec-017 — Heroku API Key
Section titled “sec-017 — Heroku API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Heroku platform API keys which allow full account management
Remediation:
Revoke the Heroku API key in Account Settings > API Key. Heroku API keys grant full control over all your apps and pipelines. Use OAuth tokens with limited scopes for CI/CD automation instead.
sec-018 — DigitalOcean Personal Access Token
Section titled “sec-018 — DigitalOcean Personal Access Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects DigitalOcean API tokens for infrastructure management
Remediation:
Delete the token in DigitalOcean API Settings and generate a new one with read-only or scoped access. DigitalOcean tokens with write access can create/destroy Droplets and databases.
sec-019 — Terraform Cloud Token
Section titled “sec-019 — Terraform Cloud Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Terraform Cloud / Enterprise API tokens
Remediation:
Revoke the token in Terraform Cloud under User/Organization Settings > Tokens. Terraform tokens can apply infrastructure changes. Use short-lived tokens and machine users (team tokens) for CI pipelines rather than personal tokens.
sec-021 — CircleCI API Token
Section titled “sec-021 — CircleCI API Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects CircleCI personal or project API tokens
Remediation:
Delete the CircleCI personal API token in User Settings > Personal API Tokens. CircleCI tokens with project access can trigger pipelines and read secrets. Use project-scoped tokens and rotate them regularly.
sec-022 — Travis CI API Token
Section titled “sec-022 — Travis CI API Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Travis CI authentication tokens
Remediation:
Regenerate the Travis CI token in Profile > Settings > API Authentication. Ensure Travis CI environment variables containing secrets are marked as hidden and not displayed in build logs.
sec-024 — Vercel API Token
Section titled “sec-024 — Vercel API Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Vercel deployment and management API tokens
Remediation:
Delete the Vercel token in Account Settings > Tokens. Vercel tokens can deploy code and manage projects. Use team-scoped tokens with the minimum required access level for CI/CD pipelines.
sec-025 — Discord Bot Token
Section titled “sec-025 — Discord Bot Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Discord bot tokens which allow full bot account control
Remediation:
Reset the bot token immediately in the Discord Developer Portal under Applications > Bot > Reset Token. Anyone with the token can act as your bot, join servers, and send messages. Rotate and store in environment variables only.
sec-027 — Twitch API Credentials
Section titled “sec-027 — Twitch API Credentials”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Twitch application client secrets and OAuth tokens
Remediation:
Revoke the Twitch application secret in the Twitch Developer Console. OAuth tokens should be treated as passwords and stored only in secure server-side secret stores.
sec-028 — Telegram Bot Token
Section titled “sec-028 — Telegram Bot Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Telegram bot API tokens issued by BotFather
Remediation:
Request a new token from Telegram’s BotFather using /revoke. Anyone with the bot token can read all messages sent to the bot and send messages as it. Never commit bot tokens to version control.
sec-029 — Facebook / Meta App Secret
Section titled “sec-029 — Facebook / Meta App Secret”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Facebook/Meta application secrets and access tokens
Remediation:
Rotate the Facebook app secret in the Meta Developer Console under App Settings > Basic. App secrets can generate user access tokens and make server-side API calls. Treat them as passwords and never expose them in client-side code.
sec-030 — Firebase API Key
Section titled “sec-030 — Firebase API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Firebase project configuration API keys and service account credentials
Remediation:
Firebase Web API keys are intended for client-side use but should be restricted in the Google Cloud Console to specific HTTP referrers or IP addresses. For server-side access, use Firebase Admin SDK with a service account and store the private key in a secrets manager. Restrict Firebase security rules to prevent unauthorized database/storage access.
sec-031 — Algolia Admin API Key
Section titled “sec-031 — Algolia Admin API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Algolia admin API keys which provide full index management access
Remediation:
Rotate the Algolia admin key in API Keys settings. The admin key can add, delete, and modify all records and indices. Use search-only or restricted API keys for client-side use, and never expose admin keys in frontend code.
sec-034 — Datadog API and Application Keys
Section titled “sec-034 — Datadog API and Application Keys”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Datadog API keys and application keys for monitoring access
Remediation:
Revoke and regenerate keys in Datadog Organization Settings > API Keys. Datadog application keys have broad read/write access to metrics, logs, and monitors. Use scoped API keys and rotate them on a schedule.
sec-036 — Consul ACL Token
Section titled “sec-036 — Consul ACL Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects HashiCorp Consul access control list tokens
Remediation:
Revoke the Consul ACL token via the Consul API or UI. Consul tokens control access to service discovery and KV store. Rotate bootstrap tokens immediately and use scoped service tokens for application access.
sec-039 — Hardcoded JWT Token
Section titled “sec-039 — Hardcoded JWT Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 55% | Platforms: All
Detects live JWT tokens hardcoded in source code
Remediation:
JWTs contain identity and authorization claims. A hardcoded JWT is valid until it expires or the signing key is rotated. Identify the issuer from the decoded payload, revoke or invalidate the token if possible, and rotate the JWT signing secret/key immediately.
sec-041 — Generic Secret in URL Query Parameter
Section titled “sec-041 — Generic Secret in URL Query Parameter”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects secrets embedded directly in URL query strings
Remediation:
Never pass secrets as URL query parameters. They are logged by web servers, proxies, and browsers. Use HTTP Authorization headers or POST body instead. Rotate any exposed secrets immediately.
sec-045 — Shopify Access Token
Section titled “sec-045 — Shopify Access Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Shopify private app and OAuth access tokens
Remediation:
Revoke the Shopify token in Partners > Apps or in the store admin under Apps > App and sales channel settings. Shopify tokens can read/write orders, customers, and inventory. Rotate immediately if exposed.
sec-046 — Okta API Token
Section titled “sec-046 — Okta API Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Okta identity platform API tokens
Remediation:
Revoke the Okta API token in Security > API > Tokens. Okta tokens with admin privileges can manage users and applications. Use OAuth 2.0 service apps instead of SSWS tokens for non-human access.
sec-048 — Elastic Cloud API Key
Section titled “sec-048 — Elastic Cloud API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Elasticsearch/Elastic Cloud API keys and credentials
Remediation:
Invalidate the API key via the Elasticsearch API: DELETE /_security/api_key with the key ID. Create replacement keys with minimal index privileges and source IP restrictions where possible.
sec-052 — Pinecone API Key
Section titled “sec-052 — Pinecone API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Pinecone vector database API keys
Remediation:
Revoke the Pinecone API key in the Pinecone console under API Keys. Pinecone keys can upsert, query, and delete vector embeddings. Rotate immediately and store replacements in environment variables or a secrets manager.
sec-053 — Cohere API Key
Section titled “sec-053 — Cohere API Key”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Cohere AI API keys for NLP model access
Remediation:
Revoke the Cohere API key in the Cohere dashboard under API Keys. API keys can be used to invoke paid LLM endpoints. Create a new key and store it exclusively in environment variables or a secrets manager.
sec-054 — Hugging Face Token
Section titled “sec-054 — Hugging Face Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 60% | Platforms: All
Detects Hugging Face user access tokens for model hub and inference API
Remediation:
Revoke the token in Hugging Face Account Settings > Access Tokens. Tokens with write access can modify model repositories and datasets. Use read-only tokens for inference workloads and store in a secrets manager.
sec-055 — Replicate API Token
Section titled “sec-055 — Replicate API Token”Severity: 🟠 High | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Replicate AI inference platform API tokens
Remediation:
Revoke the Replicate API token in Account Settings > API Tokens. Tokens can be used to run paid model predictions. Create a replacement and store in environment variables or a secrets manager.
sec-015 — Mailchimp API Key
Section titled “sec-015 — Mailchimp API Key”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Mailchimp marketing API keys
Remediation:
Revoke the Mailchimp API key in Account > Extras > API Keys. Create a new key with read-only access where possible and store in environment variables.
sec-020 — Sentry DSN
Section titled “sec-020 — Sentry DSN”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects Sentry Data Source Names which expose project identifiers and can ingest events
Remediation:
Sentry DSNs are semi-public (client-side use is expected) but should not appear in server-side secret stores or allow event submission from untrusted sources. Enable rate limiting and trusted domain filtering in Sentry project settings. For server-side Sentry auth tokens, treat as high severity.
sec-023 — Codecov Upload Token
Section titled “sec-023 — Codecov Upload Token”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Codecov coverage upload tokens
Remediation:
Regenerate the Codecov token in repository settings. Codecov tokens can be used to upload falsified coverage reports; always store them as CI secrets.
sec-026 — Discord Webhook URL
Section titled “sec-026 — Discord Webhook URL”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 70% | Platforms: All
Detects Discord webhook URLs which allow posting messages to channels
Remediation:
Delete the webhook in Discord channel settings and recreate it. Discord webhooks can be used to spam channels or phish users. Never hardcode webhook URLs in client-side code or public repositories.
sec-032 — Segment Write Key
Section titled “sec-032 — Segment Write Key”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Segment analytics write keys
Remediation:
While write keys are designed for client-side use, server-side write keys should be stored in environment variables. Rotate in the Segment workspace Settings > Sources if you suspect server-side keys were leaked.
sec-033 — Mixpanel Token and Secret
Section titled “sec-033 — Mixpanel Token and Secret”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Mixpanel project tokens and API secrets
Remediation:
Mixpanel project tokens are semi-public for ingestion but API secrets must be kept server-side. Rotate the API secret in Project Settings if exposed. Restrict data export access via Mixpanel service accounts.
sec-040 — Generic API Key Assignment
Section titled “sec-040 — Generic API Key Assignment”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 75% | Platforms: All
Detects high-entropy strings assigned to variables named key, token, or secret
Remediation:
Replace hardcoded credentials with environment variable references. Rotate any exposed keys/tokens. Use a secrets manager such as HashiCorp Vault, AWS Secrets Manager, or your cloud provider’s equivalent.
sec-042 — High-Entropy Hex String Assigned to Secret Variable
Section titled “sec-042 — High-Entropy Hex String Assigned to Secret Variable”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 75% | Platforms: All
Detects 32+ character hex strings assigned to secret-sounding variable names
Remediation:
Even if these appear to be test values, they may be real secrets committed by mistake. Rotate any values that may have been used in production and move them to environment variables or a secrets manager.
sec-043 — PagerDuty Integration Key
Section titled “sec-043 — PagerDuty Integration Key”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects PagerDuty service integration and API keys
Remediation:
Revoke the PagerDuty key in Integrations > API Access Keys. Leaked integration keys can trigger or silence incidents. Generate minimal-permission API keys and store them in a secrets manager.
sec-044 — Zendesk API Token
Section titled “sec-044 — Zendesk API Token”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Zendesk support platform API tokens
Remediation:
Revoke the Zendesk API token in Settings > Apps and Integrations > Zendesk API. Zendesk tokens can access ticket data and customer PII. Rotate and store securely in a secrets manager.
sec-047 — Atlassian API Token
Section titled “sec-047 — Atlassian API Token”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Atlassian (Jira/Confluence) API tokens
Remediation:
Revoke the Atlassian API token in Account Settings > Security > API tokens. These tokens authenticate as your user account. Generate tokens with the minimum required permissions and store them in a secrets manager.
sec-049 — Airtable API Key
Section titled “sec-049 — Airtable API Key”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Airtable personal access tokens and legacy API keys
Remediation:
Revoke the Airtable personal access token in Account > Developer hub > PATs. Create replacement tokens scoped to specific bases and operations. Airtable keys can read and modify all base data in scope.
sec-050 — Linear API Key
Section titled “sec-050 — Linear API Key”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Linear project management API keys
Remediation:
Revoke the Linear API key in Settings > API > Personal API Keys. Create a replacement key and store it in a secrets manager or CI/CD secrets.
sec-051 — Notion Integration Token
Section titled “sec-051 — Notion Integration Token”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Notion internal integration tokens
Remediation:
Revoke the Notion integration token in Settings & Members > Integrations. Create a replacement token and limit its access to only the required pages and databases.
sec-057 — Pusher Application Secret
Section titled “sec-057 — Pusher Application Secret”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Pusher real-time API application secrets
Remediation:
Rotate the Pusher app secret in the Pusher dashboard under App Keys. The app secret is used to sign webhook payloads and authenticate server-side publishing. Store in environment variables only.
sec-058 — Amplitude API Key and Secret
Section titled “sec-058 — Amplitude API Key and Secret”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Amplitude analytics API keys and secret keys
Remediation:
Rotate keys in Amplitude under Settings > Projects. The secret key is required for server-side event ingestion and export APIs. Store in a secrets manager and use the API key for client-side tracking only.
sec-059 — Mapbox Access Token
Section titled “sec-059 — Mapbox Access Token”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Mapbox public and secret access tokens
Remediation:
Rotate the Mapbox token in Account > Access Tokens. Secret tokens should never appear in client-side code. Public tokens should be URL-restricted in Mapbox account settings to prevent unauthorized tile requests.
sec-060 — Intercom Access Token
Section titled “sec-060 — Intercom Access Token”Severity: 🟡 Medium | Category: Secret Detection | Confidence threshold: 65% | Platforms: All
Detects Intercom customer messaging platform access tokens
Remediation:
Revoke the Intercom access token in Settings > Developers > Access tokens. Intercom tokens can read customer conversations and user data. Store in a secrets manager and scope to the minimum required permissions.
Supply Chain
Section titled “Supply Chain”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
adv-014 | Self-Replacing Code via writeFileSync | 🔴 Critical | 50% | All |
supply-001 | Known Malicious NPM Package | 🔴 Critical | 40% | All |
supply-005 | Known Malicious Python Package | 🔴 Critical | 40% | All |
supply-006 | Known Malicious NPM Package (Extended) | 🔴 Critical | 40% | All |
supply-007 | Known Malicious Python Package (Extended) | 🔴 Critical | 40% | All |
supply-009 | SANDWORM_MODE NPM Worm Packages | 🔴 Critical | 40% | All |
supply-010 | SANDWORM Git Hook Persistence | 🔴 Critical | 50% | All |
supply-011 | Vulnerable mcp-remote Package (CVE-2025-6514) | 🔴 Critical | 50% | mcp, claude, cursor |
yara-004 | Package.json Hijacking | 🔴 Critical | 40% | All |
supply-002 | NPM Typosquatting Pattern | 🟠 High | 50% | All |
supply-004 | Dangerous Postinstall Script | 🟠 High | 50% | All |
supply-008 | Common Typosquatting Heuristics | 🟠 High | 50% | All |
supply-003 | Overly Permissive Version Range | 🟡 Medium | 50% | All |
Rule Details
Section titled “Rule Details”adv-014 — Self-Replacing Code via writeFileSync
Section titled “adv-014 — Self-Replacing Code via writeFileSync”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 50% | Platforms: All
Detects code that overwrites its own source files with remote content — rug pull attack
Remediation:
Code that writes to its own source files (especially index.js in __dirname) is a rug pull attack. Version 1.0 is clean, but the auto-update mechanism replaces the source with whatever a remote server returns. Pin dependencies and use lockfiles to prevent silent code replacement.
supply-001 — Known Malicious NPM Package
Section titled “supply-001 — Known Malicious NPM Package”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 40% | Platforms: All
Dependency on an npm package known to be malicious or compromised
Remediation:
This dependency has a known security incident. Check if you’re using a patched version or find an alternative package.
supply-005 — Known Malicious Python Package
Section titled “supply-005 — Known Malicious Python Package”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 40% | Platforms: All
Dependency on a Python package known to be malicious
Remediation:
This Python package is known to be malicious. Remove it immediately and audit your system.
supply-006 — Known Malicious NPM Package (Extended)
Section titled “supply-006 — Known Malicious NPM Package (Extended)”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 40% | Platforms: All
Dependency on an npm package known to be malicious or compromised (extended list)
Remediation:
This package is known to be malicious or compromised. Remove it immediately and use the legitimate version.
supply-007 — Known Malicious Python Package (Extended)
Section titled “supply-007 — Known Malicious Python Package (Extended)”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 40% | Platforms: All
Dependency on a Python package known to be malicious (extended list)
Remediation:
This Python package is known to be malicious. Remove it immediately and audit your system.
supply-009 — SANDWORM_MODE NPM Worm Packages
Section titled “supply-009 — SANDWORM_MODE NPM Worm Packages”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 40% | Platforms: All
Detects typosquatted npm packages from the SANDWORM_MODE worm campaign (Feb 2026) targeting AI coding tools
Remediation:
This package is part of the SANDWORM_MODE npm worm campaign (Feb 2026) that targets AI coding tools. It performs multi-stage attacks: credential harvest, MCP injection, git hook persistence, and self-propagation via npm publish. Remove immediately and audit your system.
References:
supply-010 — SANDWORM Git Hook Persistence
Section titled “supply-010 — SANDWORM Git Hook Persistence”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 50% | Platforms: All
Detects git template directory manipulation used by the SANDWORM_MODE worm for persistence across new git repos
Remediation:
Modifying global git template directories or hooks paths is a persistence technique. The SANDWORM worm uses this to inject malicious hooks into every new git repo. Inspect and restore your git config: git config —global —unset init.templateDir
supply-011 — Vulnerable mcp-remote Package (CVE-2025-6514)
Section titled “supply-011 — Vulnerable mcp-remote Package (CVE-2025-6514)”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 50% | Platforms: mcp, claude, cursor
Detects mcp-remote versions 0.0.5-0.1.15 with critical RCE vulnerability (CVSS 9.6)
Remediation:
mcp-remote versions 0.0.5 through 0.1.15 have a critical RCE vulnerability (CVE-2025-6514, CVSS 9.6) allowing arbitrary OS command execution. Upgrade immediately to >= 0.1.16.
References:
- CVE-2025-6514
yara-004 — Package.json Hijacking
Section titled “yara-004 — Package.json Hijacking”Severity: 🔴 Critical | Category: Supply Chain | Confidence threshold: 40% | Platforms: All
Detects preinstall/postinstall scripts with encoded or obfuscated payloads
Remediation:
No remediation guidance available.
supply-002 — NPM Typosquatting Pattern
Section titled “supply-002 — NPM Typosquatting Pattern”Severity: 🟠 High | Category: Supply Chain | Confidence threshold: 50% | Platforms: All
Dependency name appears to be a typosquat of a popular package
Remediation:
Verify the package name is correct. Typosquatting is a common supply chain attack vector.
supply-004 — Dangerous Postinstall Script
Section titled “supply-004 — Dangerous Postinstall Script”Severity: 🟠 High | Category: Supply Chain | Confidence threshold: 50% | Platforms: All
Package runs scripts during installation that download or execute external code
Remediation:
Inspect install scripts before running. Use —ignore-scripts flag with npm install for untrusted packages.
supply-008 — Common Typosquatting Heuristics
Section titled “supply-008 — Common Typosquatting Heuristics”Severity: 🟠 High | Category: Supply Chain | Confidence threshold: 50% | Platforms: All
Detects common typosquatting patterns of popular packages
Remediation:
Verify the package name is correct. This appears to be a typosquat of a popular package.
supply-003 — Overly Permissive Version Range
Section titled “supply-003 — Overly Permissive Version Range”Severity: 🟡 Medium | Category: Supply Chain | Confidence threshold: 50% | Platforms: All
Dependencies use wildcard or overly permissive version ranges
Remediation:
Use exact versions or semver ranges with upper bounds (e.g., ^1.2.3 or ~1.2.3). Never use * or latest in production.
Suspicious Behavior
Section titled “Suspicious Behavior”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
sus-007 | Keylogging Patterns | 🔴 Critical | 85% | All |
sus-009 | Data Wiping Patterns | 🔴 Critical | 85% | All |
sus-010 | Reverse Shell Patterns | 🔴 Critical | 90% | All |
aaa-001 | Scheduled Task Injection | 🟠 High | 55% | All |
aaa-002 | Unrestricted Resource Consumption | 🟠 High | 50% | All |
adv-002 | Function Constructor Code Execution | 🟠 High | 55% | All |
adv-005 | Dynamic Module Require via String Concatenation | 🟠 High | 60% | All |
sus-003 | Anti-Debugging Techniques | 🟠 High | 80% | All |
sus-005 | Persistence Mechanisms | 🟠 High | 80% | All |
sus-006 | Cryptocurrency Mining Indicators | 🟠 High | 80% | All |
sus-008 | Camera/Microphone Access | 🟠 High | 80% | All |
sus-013 | Self-Modification | 🟠 High | 80% | All |
sus-016 | Python Dangerous Execution with Dynamic Input | 🟠 High | 65% | crewai, autogpt, mcp |
sus-001 | Obfuscated Code Detection | 🟡 Medium | 70% | All |
sus-002 | Dynamic Code Execution | 🟡 Medium | 70% | All |
sus-004 | Network Reconnaissance | 🟡 Medium | 75% | All |
sus-011 | Timestomping | 🟡 Medium | 75% | All |
sus-012 | Unusual File Locations | 🟡 Medium | 70% | All |
sus-014 | Abnormal Process Spawning | 🟡 Medium | 70% | All |
sus-015 | Encoding Without Clear Purpose | 🟢 Low | 60% | All |
Rule Details
Section titled “Rule Details”sus-007 — Keylogging Patterns
Section titled “sus-007 — Keylogging Patterns”Severity: 🔴 Critical | Category: Suspicious Behavior | Confidence threshold: 85% | Platforms: All
Detects keylogging or input capture patterns
Remediation:
Keylogging is highly malicious. This should never be present in an AI agent.
sus-009 — Data Wiping Patterns
Section titled “sus-009 — Data Wiping Patterns”Severity: 🔴 Critical | Category: Suspicious Behavior | Confidence threshold: 85% | Platforms: All
Detects patterns that could wipe data
Remediation:
Data wiping commands are extremely dangerous. These should never be in an AI agent.
sus-010 — Reverse Shell Patterns
Section titled “sus-010 — Reverse Shell Patterns”Severity: 🔴 Critical | Category: Suspicious Behavior | Confidence threshold: 90% | Platforms: All
Detects reverse shell creation patterns
Remediation:
Reverse shells are highly malicious. This is a critical security threat.
aaa-001 — Scheduled Task Injection
Section titled “aaa-001 — Scheduled Task Injection”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 55% | Platforms: All
Detects cron jobs, heartbeat configs, or scheduled tasks that can be created or modified by agent tools, enabling persistent autonomous loops
Remediation:
Scheduled tasks and heartbeat configurations must not be modifiable by agent tools or external inputs. Implement rate limits and maximum execution counts for recurring tasks. Require owner approval for any new scheduled task registration.
References:
- Agents of Chaos (arXiv:2602.20021) — CS4: Heartbeat/cron injection enabled 9-day infinite resource loop
- MITRE ATLAS AML.T0040
aaa-002 — Unrestricted Resource Consumption
Section titled “aaa-002 — Unrestricted Resource Consumption”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 50% | Platforms: All
Detects agent configurations missing rate limits, token limits, or execution timeouts, enabling denial-of-service and runaway cost attacks
Remediation:
All agent tool invocations must have explicit rate limits, token budgets, and execution timeouts. Implement circuit breakers for agent-to-agent relay patterns. Set maximum iteration counts for loops and recursive tool calls.
References:
- Agents of Chaos (arXiv:2602.20021) — CS4: Mutual relay loop lasting ~1 hour
- Agents of Chaos — CS5: Mass email flooding
- OWASP LLM04 (Denial of Service)
adv-002 — Function Constructor Code Execution
Section titled “adv-002 — Function Constructor Code Execution”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 55% | Platforms: All
Detects new Function() used to execute dynamically constructed code — indirect eval that bypasses eval() detection
Remediation:
new Function() is equivalent to eval() and executes arbitrary code. This is commonly used to evade static scanners that only detect direct eval() calls. Never construct functions from untrusted or dynamic strings.
adv-005 — Dynamic Module Require via String Concatenation
Section titled “adv-005 — Dynamic Module Require via String Concatenation”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 60% | Platforms: All
Detects require() with string concatenation to hide the actual module being loaded
Remediation:
Using string concatenation inside require() (e.g., require(‘node:’ + ‘http’)) hides the actual module being imported from static analysis. This is a common evasion technique to bypass module import detection rules.
sus-003 — Anti-Debugging Techniques
Section titled “sus-003 — Anti-Debugging Techniques”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 80% | Platforms: All
Detects attempts to detect or evade debugging
Remediation:
Anti-debugging techniques indicate the code may be trying to hide malicious behavior.
sus-005 — Persistence Mechanisms
Section titled “sus-005 — Persistence Mechanisms”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 80% | Platforms: All
Detects attempts to establish persistence
Remediation:
Persistence mechanisms should not be created by AI agents. Remove these patterns.
sus-006 — Cryptocurrency Mining Indicators
Section titled “sus-006 — Cryptocurrency Mining Indicators”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 80% | Platforms: All
Detects potential cryptocurrency mining code
Remediation:
Cryptocurrency mining should never be present in AI agent code.
sus-008 — Camera/Microphone Access
Section titled “sus-008 — Camera/Microphone Access”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 80% | Platforms: All
Detects attempts to access camera or microphone
Remediation:
Camera and microphone access requires explicit user consent. Review this carefully.
sus-013 — Self-Modification
Section titled “sus-013 — Self-Modification”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 80% | Platforms: All
Detects code that modifies itself
Remediation:
Self-modifying code is suspicious and may be used to hide malicious payloads.
sus-016 — Python Dangerous Execution with Dynamic Input
Section titled “sus-016 — Python Dangerous Execution with Dynamic Input”Severity: 🟠 High | Category: Suspicious Behavior | Confidence threshold: 65% | Platforms: crewai, autogpt, mcp
Detects dangerous Python execution with user-controlled or dynamic input
Remediation:
Avoid these functions in AI agent code. Use safe alternatives like ast.literal_eval() and yaml.safe_load().
sus-001 — Obfuscated Code Detection
Section titled “sus-001 — Obfuscated Code Detection”Severity: 🟡 Medium | Category: Suspicious Behavior | Confidence threshold: 70% | Platforms: All
Detects heavily obfuscated or encoded code
Remediation:
Heavily obfuscated code is suspicious. Deobfuscate and review the actual behavior.
sus-002 — Dynamic Code Execution
Section titled “sus-002 — Dynamic Code Execution”Severity: 🟡 Medium | Category: Suspicious Behavior | Confidence threshold: 70% | Platforms: All
Detects dynamic code execution with user-controlled or variable input
Remediation:
Dynamic code execution can hide malicious behavior. Review the executed code carefully.
sus-004 — Network Reconnaissance
Section titled “sus-004 — Network Reconnaissance”Severity: 🟡 Medium | Category: Suspicious Behavior | Confidence threshold: 75% | Platforms: All
Detects network scanning or reconnaissance patterns
Remediation:
Network reconnaissance should not be performed by AI agents without explicit permission.
sus-011 — Timestomping
Section titled “sus-011 — Timestomping”Severity: 🟡 Medium | Category: Suspicious Behavior | Confidence threshold: 75% | Platforms: All
Detects file timestamp manipulation
Remediation:
Timestamp manipulation is often used to hide malicious activity. Review carefully.
sus-012 — Unusual File Locations
Section titled “sus-012 — Unusual File Locations”Severity: 🟡 Medium | Category: Suspicious Behavior | Confidence threshold: 70% | Platforms: All
Detects operations in unusual file locations
Remediation:
Hidden files in unusual locations may indicate attempts to hide malicious activity.
sus-014 — Abnormal Process Spawning
Section titled “sus-014 — Abnormal Process Spawning”Severity: 🟡 Medium | Category: Suspicious Behavior | Confidence threshold: 70% | Platforms: All
Detects suspicious process creation patterns
Remediation:
Detached background processes may indicate persistence attempts. Review carefully.
sus-015 — Encoding Without Clear Purpose
Section titled “sus-015 — Encoding Without Clear Purpose”Severity: 🟢 Low | Category: Suspicious Behavior | Confidence threshold: 60% | Platforms: All
Detects unnecessary encoding or weak encryption
Remediation:
Unnecessary encoding or weak encryption may be used to obfuscate malicious code.
third-party-content
Section titled “third-party-content”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
tpc-001 | Email Content Ingestion | 🟠 High | 50% | All |
tpc-002 | Chat Message Ingestion | 🟠 High | 50% | All |
tpc-003 | Social Media Content Ingestion | 🟠 High | 50% | All |
tpc-005 | GitHub Content Ingestion | 🟠 High | 50% | All |
tpc-006 | Registry/Marketplace Content Ingestion | 🟠 High | 50% | All |
tpc-004 | Web Content Fetch | 🟡 Medium | 50% | All |
Rule Details
Section titled “Rule Details”tpc-001 — Email Content Ingestion
Section titled “tpc-001 — Email Content Ingestion”Severity: 🟠 High | Category: third-party-content | Confidence threshold: 50% | Platforms: All
Skill reads email content (IMAP, Gmail, Outlook) — untrusted third-party content enters agent context
Remediation:
Email content is untrusted third-party content. Agents processing email bodies are exposed to indirect prompt injection. Consider: content sanitization, untrusted-content wrapping, or operator approval before acting on email content.
tpc-002 — Chat Message Ingestion
Section titled “tpc-002 — Chat Message Ingestion”Severity: 🟠 High | Category: third-party-content | Confidence threshold: 50% | Platforms: All
Skill reads chat messages (Discord, Slack, WhatsApp, iMessage, Telegram) — untrusted content
Remediation:
Chat messages are untrusted third-party content. Any contact can send crafted messages containing prompt injection payloads.
tpc-003 — Social Media Content Ingestion
Section titled “tpc-003 — Social Media Content Ingestion”Severity: 🟠 High | Category: third-party-content | Confidence threshold: 50% | Platforms: All
Skill reads social media content (Twitter/X, Reddit, HN) — untrusted user-generated content
Remediation:
Social media content is untrusted. Posts, comments, and threads can contain prompt injection payloads targeting the agent.
tpc-005 — GitHub Content Ingestion
Section titled “tpc-005 — GitHub Content Ingestion”Severity: 🟠 High | Category: third-party-content | Confidence threshold: 50% | Platforms: All
Skill reads GitHub issues, PRs, or comments — user-generated content enters agent context
Remediation:
GitHub issues and PR bodies are user-generated content. Attackers can craft issues with embedded prompt injection instructions.
tpc-006 — Registry/Marketplace Content Ingestion
Section titled “tpc-006 — Registry/Marketplace Content Ingestion”Severity: 🟠 High | Category: third-party-content | Confidence threshold: 50% | Platforms: All
Skill installs or reads content from skill registries or marketplaces — untrusted code
Remediation:
Skills from registries are untrusted third-party code. Install with verification. Community content can contain prompt injection.
tpc-004 — Web Content Fetch
Section titled “tpc-004 — Web Content Fetch”Severity: 🟡 Medium | Category: third-party-content | Confidence threshold: 50% | Platforms: All
Skill fetches and processes arbitrary web page content — untrusted external content
Remediation:
Web content from arbitrary URLs is untrusted. A malicious page can contain prompt injection targeting agents that process its content.
Tool Poisoning
Section titled “Tool Poisoning”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
tp-001 | Hidden Instructions in Tool Descriptions | 🔴 Critical | 50% | All |
tp-004 | MCP Server Config Injection | 🔴 Critical | 50% | All |
tp-006 | Homoglyph Characters in Tool Names | 🔴 Critical | 70% | All |
adv-003 | Hidden Directive in HTML Comment | 🟠 High | 55% | All |
adv-008 | Tool Shadowing of AI Agent Built-ins | 🟠 High | 55% | All |
adv-010 | Cross-Tool Instruction in Tool Description | 🟠 High | 55% | All |
tp-002 | Prompt Override in Tool Description | 🟠 High | 55% | All |
tp-003 | Tool Shadowing via Known Trusted Names | 🟠 High | 55% | All |
tp-007 | Base64-Encoded Payload in Tool Description | 🟠 High | 65% | All |
tp-008 | Tool Name Shadows Common System Commands | 🟠 High | 60% | mcp, claude, codex, cursor |
tp-009 | Hidden Markdown or HTML Directives in Tool Descriptions | 🟠 High | 60% | All |
tp-011 | Cursor MCPoison — MCP Config in Git Repository | 🟠 High | 50% | cursor, codex, mcp |
tp-005 | Suspicious Sensitive Parameters in Tool Definitions | 🟡 Medium | 60% | All |
tp-010 | Tool Description Length Anomaly | 🟡 Medium | 50% | All |
tp-012 | MCP Sampling Attack Vector | 🟡 Medium | 55% | mcp, claude, cursor |
Rule Details
Section titled “Rule Details”tp-001 — Hidden Instructions in Tool Descriptions
Section titled “tp-001 — Hidden Instructions in Tool Descriptions”Severity: 🔴 Critical | Category: Tool Poisoning | Confidence threshold: 50% | Platforms: All
Detects invisible Unicode characters and HTML comments used to hide malicious instructions inside tool or function descriptions
Remediation:
Remove all invisible Unicode characters and HTML comments from tool descriptions. These are used by attackers to smuggle hidden instructions that are processed by AI agents but invisible to human reviewers. Audit any tool description that was fetched from an external or untrusted source.
tp-004 — MCP Server Config Injection
Section titled “tp-004 — MCP Server Config Injection”Severity: 🔴 Critical | Category: Tool Poisoning | Confidence threshold: 50% | Platforms: All
Detects code that writes to MCP configuration files or dynamically adds server entries, which can silently register malicious tools
Remediation:
Code must not write to MCP configuration files at runtime. MCP server registration is an administrative action that should only happen through official, user-approved configuration channels. Dynamic modification of MCP configs is a primary attack vector for silently registering malicious tool servers. Remove any code that constructs or writes mcpServers entries programmatically.
tp-006 — Homoglyph Characters in Tool Names
Section titled “tp-006 — Homoglyph Characters in Tool Names”Severity: 🔴 Critical | Category: Tool Poisoning | Confidence threshold: 70% | Platforms: All
Detects visually deceptive Unicode characters mixed with Latin text in tool names — homoglyph attacks that impersonate legitimate tools
Remediation:
Tool names containing mixed-script homoglyphs are a visual deception attack. An attacker registers a tool whose name looks identical to a trusted tool but uses different Unicode codepoints. Validate that all tool names contain only standard ASCII characters (U+0020-U+007E). Reject any tool with non-ASCII identifiers.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM02 Insecure Output
- https://atlas.mitre.org/techniques/AML.T0043
adv-003 — Hidden Directive in HTML Comment
Section titled “adv-003 — Hidden Directive in HTML Comment”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 55% | Platforms: All
Detects HTML comments containing data collection directives like endpoint URLs or include-env flags
Remediation:
HTML comments containing data collection directives are a tool poisoning vector. Attackers embed hidden configuration that instructs the tool to exfiltrate data. Tool descriptions and skill files should not contain HTML comments with directives.
adv-008 — Tool Shadowing of AI Agent Built-ins
Section titled “adv-008 — Tool Shadowing of AI Agent Built-ins”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 55% | Platforms: All
Detects MCP tools registered with names matching AI agent built-in tools (Read, Bash, Edit, Write) — lower confidence to unverified for deep scan triage
Remediation:
A tool registered with the exact name of an AI agent built-in (Read, Bash, Edit, Write) is a tool shadowing attack. The malicious tool intercepts operations intended for the legitimate built-in and can exfiltrate all data passed through it. Tool names must be unique and must not collide with agent built-in names.
adv-010 — Cross-Tool Instruction in Tool Description
Section titled “adv-010 — Cross-Tool Instruction in Tool Description”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 55% | Platforms: All
Detects tool descriptions that instruct the AI to pipe output to another tool — cross-server data exfiltration chain
Remediation:
Tool descriptions should describe the tool’s purpose, not instruct the AI to chain calls to other tools. This is a cross-server orchestration attack where a legitimate tool’s description directs the AI to pipe sensitive data to a second tool controlled by the attacker.
tp-002 — Prompt Override in Tool Description
Section titled “tp-002 — Prompt Override in Tool Description”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 55% | Platforms: All
Detects prompt injection language embedded in tool descriptions or metadata that attempts to override AI instructions
Remediation:
Remove all prompt injection language from tool descriptions and metadata. Tool descriptions should only describe the tool’s legitimate purpose and parameters. Any text attempting to override AI instructions is a tool poisoning attack. Validate all tool descriptions fetched from external MCP servers before use.
tp-003 — Tool Shadowing via Known Trusted Names
Section titled “tp-003 — Tool Shadowing via Known Trusted Names”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 55% | Platforms: All
Detects tool registrations that use the names of well-known trusted tools to hijack AI behavior
Remediation:
A tool is being registered under a name that matches a well-known trusted tool. This is a classic tool shadowing attack: a malicious MCP server registers a tool with an identical name to intercept calls intended for the legitimate tool. Audit the source of this tool registration and verify the server’s identity before use.
tp-007 — Base64-Encoded Payload in Tool Description
Section titled “tp-007 — Base64-Encoded Payload in Tool Description”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 65% | Platforms: All
Detects base64-encoded content with decode operations or data URIs in tool descriptions, which may hide malicious instructions
Remediation:
Tool descriptions must contain only human-readable text describing the tool’s legitimate purpose. Base64-encoded content in descriptions is used to smuggle hidden instructions that are decoded and executed by the AI agent. Remove all encoded payloads and fetch tool descriptions only from trusted sources.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM01 Prompt Injection
- https://atlas.mitre.org/techniques/AML.T0043
tp-008 — Tool Name Shadows Common System Commands
Section titled “tp-008 — Tool Name Shadows Common System Commands”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 60% | Platforms: mcp, claude, codex, cursor
Detects tool registrations using names of common system commands (ls, cat, curl, wget, bash) to intercept agent shell operations
Remediation:
A tool with the same name as a system command is a tool shadowing attack. The malicious tool intercepts calls intended for the legitimate system command. Tool names must be unique, namespaced (e.g., vendor-toolname), and must not collide with system command names or other registered tools.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM05 Supply Chain
- https://atlas.mitre.org/techniques/AML.T0043
tp-009 — Hidden Markdown or HTML Directives in Tool Descriptions
Section titled “tp-009 — Hidden Markdown or HTML Directives in Tool Descriptions”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 60% | Platforms: All
Detects dangerous HTML elements, suspicious HTML comments, and malicious markdown links in tool descriptions
Remediation:
Tool descriptions must be plain text only. HTML, Markdown with active links, and CSS styles embedded in descriptions are used to hide instructions from human reviewers while remaining visible to AI agents parsing the raw text. Strip all HTML/Markdown formatting from tool descriptions before display.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM01 Prompt Injection
- https://atlas.mitre.org/techniques/AML.T0043
tp-011 — Cursor MCPoison — MCP Config in Git Repository
Section titled “tp-011 — Cursor MCPoison — MCP Config in Git Repository”Severity: 🟠 High | Category: Tool Poisoning | Confidence threshold: 50% | Platforms: cursor, codex, mcp
Detects .cursor/mcp.json or .vscode/mcp.json files committed to a git repository — CVE-2025-54136. Attackers commit benign configs then silently modify them to backdoor.
Remediation:
MCP configuration files (.cursor/mcp.json, .vscode/mcp.json) should not be committed to repositories. CVE-2025-54136 demonstrated that attackers commit benign configs, then silently modify server entries to backdoor the development environment. Add these files to .gitignore and use user-level MCP configuration instead.
References:
- CVE-2025-54136
tp-005 — Suspicious Sensitive Parameters in Tool Definitions
Section titled “tp-005 — Suspicious Sensitive Parameters in Tool Definitions”Severity: 🟡 Medium | Category: Tool Poisoning | Confidence threshold: 60% | Platforms: All
Detects tool parameter definitions that request sensitive credentials, keys, or secrets from the user
Remediation:
Tool parameter definitions must not request passwords, tokens, API keys, or private keys. Legitimate tools access credentials through secure environment variables or secrets managers, never by asking the user (or the AI agent) to supply them as tool arguments. A tool that requires credentials as parameters is likely a credential-harvesting attack.
tp-010 — Tool Description Length Anomaly
Section titled “tp-010 — Tool Description Length Anomaly”Severity: 🟡 Medium | Category: Tool Poisoning | Confidence threshold: 50% | Platforms: All
Detects abnormally long tool descriptions (>5000 characters) which strongly suggest hidden content or embedded instructions
Remediation:
Legitimate tool descriptions are concise (typically under 500 characters). Descriptions over 5000 characters almost always indicate hidden content: invisible text, encoded payloads, or injected instructions. Cap tool description length at 1000 characters and reject over-length descriptions.
References:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/ LLM01 Prompt Injection
- https://atlas.mitre.org/techniques/AML.T0043
tp-012 — MCP Sampling Attack Vector
Section titled “tp-012 — MCP Sampling Attack Vector”Severity: 🟡 Medium | Category: Tool Poisoning | Confidence threshold: 55% | Platforms: mcp, claude, cursor
Detects MCP servers declaring sampling capability, which enables reverse prompt injection by allowing the server to request the AI generate content
Remediation:
MCP servers with sampling capability can request the AI to generate content, creating a reverse injection channel. The server crafts prompts that manipulate the AI into executing actions the user did not intend. Only grant sampling capability to fully trusted MCP servers. Audit what the server sends via sampling requests.
References:
unsupervised-execution
Section titled “unsupervised-execution”| ID | Name | Severity | Confidence | Platforms |
|---|---|---|---|---|
kc-004 | Autonomous Agent Spawn Without Oversight | 🟠 High | 55% | All |
uex-001 | Background Agent Spawning | 🟠 High | 50% | All |
uex-002 | Persistent Daemon / Service | 🟠 High | 50% | All |
uex-003 | Multi-Agent Orchestration | 🟡 Medium | 50% | All |
Rule Details
Section titled “Rule Details”kc-004 — Autonomous Agent Spawn Without Oversight
Section titled “kc-004 — Autonomous Agent Spawn Without Oversight”Severity: 🟠 High | Category: unsupervised-execution | Confidence threshold: 55% | Platforms: All
Detects spawning of detached/autonomous agent processes that bypass approval mechanisms
Remediation:
Spawning autonomous agent processes with approval bypass enables uncontrolled execution. Always require human-in-the-loop approval for agent tool invocations.
uex-001 — Background Agent Spawning
Section titled “uex-001 — Background Agent Spawning”Severity: 🟠 High | Category: unsupervised-execution | Confidence threshold: 50% | Platforms: All
Spawns agents or processes in the background without per-action human oversight
Remediation:
Background agents run without per-action human oversight. If compromised, they can execute arbitrary actions until manually stopped. Require periodic check-ins or approval gates for long-running background agents.
uex-002 — Persistent Daemon / Service
Section titled “uex-002 — Persistent Daemon / Service”Severity: 🟠 High | Category: unsupervised-execution | Confidence threshold: 50% | Platforms: All
Installs a persistent daemon or system service that runs autonomously
Remediation:
Persistent daemons run indefinitely without human oversight. Combined with agent capabilities, a compromised daemon can continuously process and act on untrusted input.
uex-003 — Multi-Agent Orchestration
Section titled “uex-003 — Multi-Agent Orchestration”Severity: 🟡 Medium | Category: unsupervised-execution | Confidence threshold: 50% | Platforms: All
Spawns multiple parallel agents with tool access — amplified blast radius
Remediation:
Multi-agent orchestration amplifies the blast radius of any single compromise. If one agent is compromised, it can influence outputs consumed by other agents.