rivian/ai-sast

AI-SAST - AI-Powered Static Application Security Testing

License: Apache 2.0 · Python 3.8+ · GitHub Actions

AI-SAST is an AI-driven static application security testing tool. Run it in your repository on your runners: PR scan on pull requests, full scan on demand. Your code never runs on ai-sast infrastructure.

Features

  • Vertex AI (Gemini) (default), AWS Bedrock (Claude Opus), or Ollama backends
  • PR scan: Scans changed files on pull requests; posts findings as PR comments
  • Full scan: Manual run to scan the entire repository
  • Your runners: Workflow runs in your repo (e.g. self-hosted); your code stays on your infrastructure
  • Multi-language support (Python, JS/TS, Java, Go, Rust, C/C++, and more)
  • CVSS scoring, configurable exclusions, optional feedback loop

Architecture

AI-SAST Architecture

  1. Trigger: Pull request or manual "Run workflow"
  2. Scan: Code is analyzed by Vertex AI (Gemini), AWS Bedrock (Claude), or Ollama
  3. Validator (if configured): A second LLM (e.g. Bedrock/Claude) validates findings as true or false positive; only validated true positives are posted in the PR.
  4. Results: PR comments and HTML/text report artifacts

📖 docs/ARCHITECTURE.md

Integrate in your repository

  1. Fork this repository so the workflow uses your fork.

  2. Copy the workflow file into the repo you want to scan as .github/workflows/ai-sast.yml.

  3. Set the repository variable (Settings → Secrets and variables → Actions → Variables):
    AI_SAST_REPO = your fork (e.g. your-username/ai-sast).

  4. Add repository secrets (Settings → Secrets and variables → Actions → Secrets):
    GOOGLE_CLOUD_PROJECT, GOOGLE_CREDENTIALS

The workflow checks out your fork at runtime. One file runs PR scan (on pull requests when base branch is main), full scan (manual "Run workflow"), and feedback collection (when someone edits a PR comment and checks a true/false positive box).

Optional: set AI_SAST_BASE_BRANCH (default main) or AI_SAST_REF (default main), and add runs-on: self-hosted to the workflow to use your own runners.
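As a rough illustration of the pieces above, a caller workflow could look like the sketch below. This is not the shipped file (copy the real ai-sast.yml from this repo instead); the job layout and step names here are assumptions.

```yaml
# Illustrative sketch only -- use the workflow file shipped in this repo.
name: AI-SAST
on:
  pull_request:
    branches: [main]        # PR scan trigger (AI_SAST_BASE_BRANCH)
  workflow_dispatch:        # full scan via manual "Run workflow"

jobs:
  scan:
    runs-on: ubuntu-latest  # or: self-hosted, to keep code on your infra
    steps:
      - uses: actions/checkout@v4          # the repo being scanned
      - uses: actions/checkout@v4          # your fork of ai-sast
        with:
          repository: ${{ vars.AI_SAST_REPO }}  # e.g. your-username/ai-sast
          ref: main
          path: ai-sast
      # ...then run the scanner with your secrets, e.g.:
      #   GOOGLE_CLOUD_PROJECT: ${{ secrets.GOOGLE_CLOUD_PROJECT }}
      #   GOOGLE_CREDENTIALS:   ${{ secrets.GOOGLE_CREDENTIALS }}
```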

📚 Full guide: docs/INTEGRATION.md

Example PR comment

When the PR scan finds issues, it posts a comment like this:

### 🤖 AI-SAST Security Scan
**2** potential issue(s) found.

> 💡 **Help us improve!** Use the checkboxes below to mark each finding as a true positive (✅) or false positive (❌).

| Severity | Count |
|---|---|
| 🔥 High | 2 |

---

<!-- vuln-id: abc12345 -->
- [ ] ✅ True Positive
- [ ] ❌ False Positive

**ID**: `abc12345`
**Severity**: High
**Issue**: SQL Injection vulnerability in user query
**Location**: `src/api/users.py:42`
**CVSS Vector**: `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N`

<details><summary>📋 Click to see details, risk, and remediation</summary>

**Risk:** Attacker could manipulate SQL queries...

**Validator proof:** User input is concatenated into the query without sanitization; a malicious payload could execute arbitrary SQL.

**Remediation:** Use parameterized queries...

</details>

Feedback loop

Developers can mark findings as true positive (✅) or false positive (❌) directly in the PR comment. That feedback is stored in a database (SQLite by default, or Databricks) and included in the prompt sent to Vertex AI on future scans, so the model can improve accuracy (e.g. avoid repeating false positives and pay attention to patterns similar to confirmed vulnerabilities).

How it works:

  1. PR scan posts a comment with checkboxes next to each finding.
  2. Developer checks one box per finding (True Positive or False Positive).
  3. collect-feedback workflow runs when the comment is edited and stores feedback in the database.
  4. Future scans load that feedback and add it to the Vertex AI (Gemini) prompt as context, so the AI gets more accurate over time.

Feedback collection is included in the same workflow file (see Integrate in your repository). No extra configuration needed for SQLite.
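The comment-edit trigger in step 3 corresponds to a standard GitHub Actions event. A minimal sketch of that trigger (the exact job layout in the shipped ai-sast.yml may differ):

```yaml
# Sketch: feedback collection fires when a PR comment is edited,
# i.e. when a reviewer checks a true/false-positive box.
on:
  issue_comment:
    types: [edited]
```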

📖 Full guide: docs/FEEDBACK-LOOP.md

Environment variables

All configuration is driven by environment variables. The table below lists supported variables, their description, and default value (if any). Set them as repository Secrets or Variables in GitHub (Settings → Secrets and variables → Actions), or in the workflow env block.

| Variable | Description | Default |
|---|---|---|
| **Workflow & scan behavior** | | |
| `AI_SAST_REPO` | Your fork of this repo (e.g. `your-username/ai-sast`). Required: fork first, then set this variable. | — |
| `AI_SAST_BASE_BRANCH` | Branch that triggers PR scan; PRs targeting this branch are scanned. | `main` |
| `AI_SAST_SEVERITY` | Comma-separated severities to include in PR comments (e.g. `critical,high,medium`). The validator runs only on findings in these severities. | `critical,high` |
| `AI_SAST_UPDATE_SAME_PR_COMMENT` | When `true`, each PR scan run updates the same PR comment with the latest results instead of posting a new comment (reduces noise on multi-commit PRs). When `false` or unset, a new comment is posted every time. | `false` |
| `AI_SAST_EXCLUDE_PATHS` | Comma-separated path keywords to exclude from scanning (e.g. `test,vendor,mock`). | — |
| `AI_SAST_PR_SCAN_BATCH_SIZE` | Number of files to send in one Vertex/Bedrock call. PR scan always uses batching (no option to turn it off). | `10` |
| `AI_SAST_PR_SCAN_BATCH_MAX_BYTES` | Max total bytes of added code per batch (avoids oversized requests). | `2097152` (2 MB) |
| `AI_SAST_PR_SCAN_MAX_FILES` | Max number of files to scan per PR (DoS protection). Set to `0` for no limit. | `100` |
| `AI_SAST_PR_SCAN_MAX_FILE_SIZE` | Max size in bytes of added lines per file (DoS protection). Files exceeding this are skipped. Set to `0` for no limit. | `500000` (500 KB) |
| `AI_SAST_PR_SCAN_MAX_TOTAL_SIZE` | Max total bytes of added lines per PR (DoS protection). Once exceeded, remaining files are skipped. Set to `0` for no limit. | `5242880` (5 MB) |
| `AI_SAST_CUSTOM_PROMPT` | Extra instructions appended to the scan prompt (e.g. focus on certain vuln types). | — |
| `AI_SAST_STORE_FINDINGS` | When `true`, store scan findings (and validator results) in the database. | `false` |
| `AI_SAST_DB_PATH` | Path to SQLite database for feedback and optional scan storage. | `~/.ai-sast/scans.db` |
| **Initial scan LLM** | | |
| `AI_SAST_LLM` | LLM for the initial security scan: `vertex`, `bedrock`, or `ollama`. | `vertex` |
| **Validator LLM** | | |
| `AI_SAST_VALIDATOR_LLM` | LLM to validate findings (true/false positive): `vertex`, `bedrock`, or `ollama`. Only validated true positives are posted in the PR. If unset or on error, all findings are posted. | `bedrock` |
| `AI_SAST_VALIDATOR_BEDROCK_MODEL_ID` | Bedrock model used when the validator is `bedrock`. | `anthropic.claude-3-5-sonnet-20241022-v2:0` |
| `AI_SAST_VALIDATOR_GEMINI_MODEL` | Gemini model used when the validator is `vertex`. | same as `GEMINI_MODEL` |
| **Vertex AI (Google)** | | |
| `GOOGLE_CLOUD_PROJECT` | Google Cloud project ID (required for Vertex). | — |
| `GOOGLE_CREDENTIALS` | Service account JSON (secret); used by the workflow for auth. | — |
| `GOOGLE_LOCATION` | Vertex AI region (e.g. `us-central1`). | `us-central1` |
| `GEMINI_MODEL` | Gemini model for the initial scan when `AI_SAST_LLM=vertex`. | `gemini-2.5-pro` |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to service account key file (alternative to `GOOGLE_TOKEN`). | — |
| `GOOGLE_TOKEN` | Raw service account JSON or access token (alternative to a file path). | — |
| **AWS Bedrock** | | |
| `AWS_REGION` | AWS region for Bedrock (e.g. `us-east-1`). | `us-east-1` |
| `BEDROCK_MODEL_ID` | Claude model for the initial scan when `AI_SAST_LLM=bedrock`. | `anthropic.claude-opus-4-5-20251101-v1:0` |
| **Ollama (local)** | | |
| `OLLAMA_BASE_URL` | Ollama API base URL. | `http://localhost:11434` |
| `OLLAMA_MODEL` | Model name for scan or validator when using Ollama. | `qwen2.5-coder:14b` |
| **Jira (historical context)** | | |
| `JIRA_URL` | Jira server URL (e.g. `https://your.atlassian.net`). | — |
| `JIRA_USERNAME` | Jira user email. | — |
| `JIRA_API_TOKEN` | Jira API token (e.g. from id.atlassian.com). | — |
| `JIRA_JQL_QUERY` | JQL query to fetch vulnerability tickets for context. | — |
| **Feedback backend (Databricks)** | | |
| `AI_SAST_FEEDBACK_BACKEND` | Set to `databricks` to use Databricks instead of SQLite for feedback. | (SQLite) |
| `AI_SAST_DATABRICKS_HOST` | Databricks workspace hostname. | — |
| `AI_SAST_DATABRICKS_HTTP_PATH` | SQL warehouse HTTP path. | — |
| `AI_SAST_DATABRICKS_TOKEN` | Databricks personal access token. | — |
| `AI_SAST_DATABRICKS_CATALOG` | Unity Catalog name. | — |
| `AI_SAST_DATABRICKS_SCHEMA` | Schema name. | — |
| `AI_SAST_DATABRICKS_TABLE` | Table name for feedback. | — |
| **Webhook (notifications)** | | |
| `AI_SAST_WEBHOOK_URL` | Webhook endpoint URL for scan notifications. | — |
| `AI_SAST_WEBHOOK_SECRET` | Optional secret for HMAC signature. | — |
| `AI_SAST_WEBHOOK_TYPE` | Webhook format: `slack`, `teams`, `discord`, or `generic`. | `generic` |
| **Vector / logging** | | |
| `AI_SAST_VECTOR_URL` | Vector/log aggregator endpoint URL. | — |
| `AI_SAST_VECTOR_TOKEN` | Authentication token for the vector endpoint. | — |

Secrets (e.g. GOOGLE_CREDENTIALS, JIRA_API_TOKEN, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) have no default and must be set in the repo.
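For example, non-secret settings can be pinned in the workflow's env block. The values below are purely illustrative; adjust them to your setup.

```yaml
# Illustrative env block for the caller workflow (values are examples).
env:
  AI_SAST_LLM: vertex
  AI_SAST_SEVERITY: critical,high
  AI_SAST_EXCLUDE_PATHS: test,vendor,mock
  AI_SAST_UPDATE_SAME_PR_COMMENT: "true"
  GOOGLE_LOCATION: us-central1
  GEMINI_MODEL: gemini-2.5-pro
```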

Troubleshooting

  • Auth errors: Service account needs "Vertex AI User" role; GOOGLE_CREDENTIALS must be the full JSON key.
  • No PR comment: Ensure the PR targets the branch set by AI_SAST_BASE_BRANCH (default main).
  • Feedback not triggering: The feedback job runs from your default branch (e.g. main). Make sure ai-sast.yml is merged to that branch; if it only exists on a feature branch, checking boxes in the PR comment won't trigger the workflow.
  • Setup: Fork this repo, then set repository variable AI_SAST_REPO to your fork (e.g. your-username/ai-sast).

Support


Made with ❤️ by the AI-SAST community

About

AI-powered SAST accelerator built to speed up secure development.
