# Prompt Defense Audit Action

CI GitHub Marketplace License: MIT

Scan system prompts for missing defenses against 12 attack vectors. Runs in your CI pipeline — blocks PRs with weak prompts before they reach production.

Pure regex. Zero LLM cost. < 5ms per file.

## Quick Start

```yaml
- uses: ppcvote/prompt-defense-audit-action@v1
  with:
    path: "prompts/**/*.txt"
    min-grade: B
```

That's it. The action scans all matching files, posts a PR comment with results, and fails the check if any file scores below the minimum grade.

## PR Comment

The action posts a summary table on your PR:

> 🛡️ **Prompt Defense Audit**
>
> 1 of 3 file(s) below minimum grade B

| File | Grade | Score | Defended | Missing defenses |
| --- | --- | --- | --- | --- |
| prompts/support.txt | 🔴 F | 17/100 | 2/12 | role-escape, data-leakage, indirect-injection +7 more |
| prompts/chatbot.txt | 🔵 B | 75/100 | 9/12 | unicode-attack, context-overflow, abuse-prevention |
| prompts/admin.txt | 🟢 A | 92/100 | 11/12 | unicode-attack |

The comment is updated on each push (no spam).

## Inputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| `path` | Yes | | Glob pattern for prompt files (e.g. `prompts/**/*.txt`) |
| `min-grade` | No | `C` | Minimum passing grade: A, B, C, D, or F |
| `github-token` | No | `${{ github.token }}` | Token for posting PR comments |
| `comment` | No | `true` | Post results as PR comment |
| `sarif` | No | | Output SARIF file path (for GitHub Code Scanning) |
| `fail-on-missing` | No | `false` | Fail if no files match the glob pattern |

## Outputs

| Output | Description |
| --- | --- |
| `total-files` | Number of files scanned |
| `passed` | Number of files that passed |
| `failed` | Number of files that failed |
| `lowest-grade` | Lowest grade across all files |
| `report` | Full results as JSON string |

## Examples

### Basic — fail on grade D or F

```yaml
name: Prompt Security
on: [pull_request]

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ppcvote/prompt-defense-audit-action@v1
        with:
          path: "**/*.prompt.txt"
          min-grade: C
```

### Strict — require grade A for production prompts

```yaml
- uses: ppcvote/prompt-defense-audit-action@v1
  with:
    path: "config/prompts/**/*.txt"
    min-grade: A
```

### With SARIF upload (GitHub Code Scanning)

```yaml
- uses: ppcvote/prompt-defense-audit-action@v1
  with:
    path: "prompts/**/*.txt"
    min-grade: B
    sarif: results.sarif

- uses: github/codeql-action/upload-sarif@v3
  if: always()
  with:
    sarif_file: results.sarif
```

### Use outputs in subsequent steps

```yaml
- uses: ppcvote/prompt-defense-audit-action@v1
  id: audit
  with:
    path: "prompts/**/*.txt"

- run: echo "Lowest grade: ${{ steps.audit.outputs.lowest-grade }}"
```

### Comment-only (don't fail the build)

```yaml
- uses: ppcvote/prompt-defense-audit-action@v1
  continue-on-error: true
  with:
    path: "prompts/**/*.txt"
    min-grade: A
```

## 12 Attack Vectors

Based on OWASP LLM Top 10:

| # | Vector | What it checks |
| --- | --- | --- |
| 1 | Role Escape | Role definition + boundary enforcement |
| 2 | Instruction Override | Refusal clauses + meta-instruction protection |
| 3 | Data Leakage | System prompt / training data disclosure prevention |
| 4 | Output Manipulation | Output format restrictions |
| 5 | Multi-language Bypass | Language-specific defense |
| 6 | Unicode Attacks | Homoglyph / zero-width character detection |
| 7 | Context Overflow | Input length limits |
| 8 | Indirect Injection | External data validation |
| 9 | Social Engineering | Emotional manipulation resistance |
| 10 | Output Weaponization | Harmful content generation prevention |
| 11 | Abuse Prevention | Rate limiting / auth awareness |
| 12 | Input Validation | XSS / SQL injection / sanitization |
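To give a feel for what a deterministic, regex-only check can detect: each vector reduces to "does the prompt contain defensive language addressing this attack?". The following is a minimal illustrative sketch in Python — the patterns and the `missing_defenses` helper are assumptions for exposition, not the action's actual rules.

```python
import re

# Illustrative patterns only (assumptions, not the action's real rules):
# a prompt "defends" against a vector if it contains language
# addressing that attack class.
DEFENSE_PATTERNS = {
    "role-escape": re.compile(
        r"\b(stay in (your )?role|never (change|abandon) (your )?role)\b", re.I
    ),
    "data-leakage": re.compile(
        r"\b(never|do not) (reveal|disclose|share) (this|the|your) (system )?prompt\b",
        re.I,
    ),
}

def missing_defenses(prompt_text):
    """Return the vector names for which no defensive language was found."""
    return [
        name
        for name, pattern in DEFENSE_PATTERNS.items()
        if not pattern.search(prompt_text)
    ]
```

Because every check is a plain regex search, the scan is deterministic and runs in microseconds — which is what makes the "zero LLM cost, < 5ms per file" claim possible.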

## How It Works

1. Globs for prompt files matching your pattern
2. Scans each file with `prompt-defense-audit` (deterministic regex, no LLM)
3. Posts a markdown summary table as a PR comment
4. Optionally outputs SARIF for GitHub Code Scanning integration
5. Fails the workflow if any file is below the minimum grade

## Comparison

| Tool | Layer | Needs LLM? | Cost | Speed |
| --- | --- | --- | --- | --- |
| This action | Static prompt text analysis | No | Free | < 5ms/file |
| Promptfoo Code Scan | Code data-flow analysis | Yes (AI agent) | API costs | Minutes |
| Garak Action | Behavioral endpoint testing | Yes (target LLM) | API costs | Minutes |

These three layers are complementary: static prompt → code flow → behavioral testing.

## License

MIT — Ultra Lab
