Bandit — Vendor Risk Intelligence Suite

Open-source CLI for vendor privacy risk assessment. Free forever. Provider-agnostic.

License: MIT | Rubric: v1.0.0 | Providers: Claude · GPT-4o · Gemini · Ollama


What Bandit does

Bandit is a Python CLI that assesses vendor privacy policies against a published, enforcement-grounded rubric. Point it at a company name, domain, or URL — it finds the policy, extracts evidence with an LLM, scores it deterministically across 8 dimensions, and saves an HTML report with findings for each team.

The AI extracts evidence. Bandit scores it. Because scoring is deterministic rather than model-generated, GPT-4o and Claude produce the same score whenever they extract the same evidence from a policy.


Installation

git clone https://github.com/conorrusso/bandit.git
cd bandit
pip install -e .

Set your API key (or pass --api-key):

export ANTHROPIC_API_KEY=sk-ant-...

Quick start

Install

git clone https://github.com/conorrusso/bandit.git
cd bandit
pip install -e .

With Google Drive support:

pip install -e ".[drive]"

Configure

bandit setup

5 questions. Takes about 2 minutes. Configures Bandit for your industry, location, and regulatory profile. Run once, update anytime.

For Google Drive integration:

bandit setup --drive

Run your first assessment

# Public privacy policy only
bandit assess "Salesforce"

# With local documents
bandit assess "Salesforce" --docs ./vendor-docs/Salesforce/

# With Google Drive
bandit assess "Salesforce" --drive

Usage

bandit assess <vendor>           Run a full privacy risk assessment
bandit assess <vendor> -v        Verbose — see fetched pages and signals
bandit assess <vendor> --json    Output raw JSON
bandit batch <vendors.txt>       Assess a full vendor list
bandit profile <vendor>          Show vendor function profile and doc requirements
bandit profile --show            List all cached vendor profiles
bandit rubric                    Show the scoring rubric summary
bandit rubric --dim D5           Show criteria for one dimension
bandit setup                     Configure your industry and regulatory profile
bandit setup --stack             Set up your internal tech stack
bandit setup --notify            Configure IT notification contact
bandit setup --show              Show current profile
bandit setup --reset             Start setup over
bandit vendor add <vendor>       Run 12-question intake wizard for a new vendor
bandit vendor show <vendor>      View vendor profile and assessment history
bandit vendor edit <vendor>      Update intake answers
bandit vendor list               List all vendors with risk tier and next due date
bandit vendor list --due         Vendors due for reassessment only
bandit vendor list --risk HIGH   Filter by risk tier
bandit legal <vendor>            Standalone contract gap analysis
bandit dashboard                 Portfolio risk overview
bandit schedule                  Reassessment schedule
bandit schedule --due            Vendors due for reassessment
bandit register                  Export TPRM register (CSV / JSON / HTML)
bandit notify --all              Send all queued IT notifications
bandit sync                      Sync with Drive — discovers, links, pulls docs

Input formats

bandit assess accepts three input formats:

Format        Example                   Behaviour
Company name  "Salesforce"              Full discovery: DDG search → domain probe → AI reasoning
Bare domain   hubspot.com               Skips domain resolution, probes privacy paths directly
Full URL      https://acme.com/privacy  Skips all discovery, fetches directly
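
The three input formats above could be told apart roughly as follows. This is a minimal sketch, not Bandit's actual code: the function name classify_input and the domain regex are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper, not part of Bandit's published API.
# A bare domain is assumed to be dot-separated labels ending in a TLD.
_DOMAIN_RE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def classify_input(value: str) -> str:
    """Return 'url', 'domain', or 'name' for a vendor argument."""
    value = value.strip()
    parsed = urlparse(value)
    # A full URL has an http(s) scheme and a host part.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # A bare domain has no scheme, no path, and no spaces.
    if _DOMAIN_RE.match(value):
        return "domain"
    # Anything else is treated as a company name.
    return "name"
```

Under these assumptions, "Salesforce" is classified as a name, hubspot.com as a domain, and https://acme.com/privacy as a URL.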

Batch assessment

Create a text file with one vendor per line (names, domains, or URLs — any format):

Salesforce
hubspot.com
https://notion.so/privacy
# This line is ignored
Anecdotes AI

Run:

bandit batch vendors.txt

HTML reports are saved to ./reports/ after every run. The batch command prints a summary table when all vendors are done.
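
The vendor file format described above can be read with a few lines of Python. This is a sketch under two assumptions: lines starting with # are comments (as the sample file suggests) and blank lines are skipped. The function name parse_vendor_list is hypothetical.

```python
def parse_vendor_list(text: str) -> list[str]:
    """Return vendor entries from a vendors.txt-style string."""
    vendors = []
    for line in text.splitlines():
        entry = line.strip()
        # Assumed behaviour: skip blank lines and '#' comment lines.
        if not entry or entry.startswith("#"):
            continue
        vendors.append(entry)
    return vendors


sample = """Salesforce
hubspot.com
https://notion.so/privacy
# This line is ignored
Anecdotes AI
"""
vendors = parse_vendor_list(sample)
```

Each returned entry can be a name, a domain, or a URL, so it can be passed on unchanged to whatever handles a single vendor argument.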


Commands

bandit assess

Run a full privacy risk assessment for one vendor.

bandit assess "Salesforce"                           # company name
bandit assess salesforce.com                         # domain
bandit assess https://salesforce.com/privacy         # full URL
bandit assess "Acme Corp" --verbose                  # show all stages
bandit assess "Acme Corp" --json > acme.json         # raw JSON output
bandit assess "Acme Corp" --no-report                # skip HTML report
bandit assess "Acme Corp" --model claude-opus-4-6    # override model
bandit assess "Acme Corp" --docs ./vendor-docs/acme/ # with documents
bandit assess "Acme Corp" --drive                    # from Google Drive

Flag           Description
-v, --verbose  Show discovery stages and signal extraction detail
--json         Print raw JSON to stdout (report still saved)
--no-report    Skip saving the HTML report
--model MODEL  Override the LLM model
--api-key KEY  Provide API key directly (default: env var)
--docs PATH    Folder containing vendor documents (DPA, MSA, SOC 2, etc.)
--drive        Fetch vendor documents from configured Google Drive folder

bandit setup

Run the interactive setup wizard. 5 core questions + up to 3 conditional. Infers regulatory frameworks automatically and adjusts dimension weights. Saves to bandit.config.yml.

bandit setup            # Run wizard (~2 minutes)
bandit setup --stack    # Collect internal tools by category (used in vendor intake)
bandit setup --notify   # Configure IT notification contact and method
bandit setup --show     # Show current config
bandit setup --reset    # Start over
bandit setup --advanced # Advanced config (coming soon)

The wizard asks:

  1. Organisation type — sets industry-specific defaults
  2. Locations — where you and your customers operate
  3. Sensitive data types — PHI, PCI, children's data, biometric, HR, special categories
  4. Required certifications — SOC 2, HIPAA BAA, GDPR DPA, PCI AOC, etc.
  5. Risk approach — Strict / Standard / Pragmatic (sets cadence and escalation)

Conditional (asked only when relevant):

  • Infrastructure location (if EU/EEA in locations) — affects D4 cross-border transfer weight
  • BAA required (if PHI selected) — HIPAA Business Associate Agreement requirement
  • PCI merchant level (if payment card selected) — Level 1/2/3

Bandit then infers: applicable frameworks (GDPR, HIPAA, CCPA, etc.), dimension weights, reassessment cadence, and escalation triggers — all shown for review before saving.
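
The conditional questions listed above amount to a small rule table. The sketch below illustrates that logic only; the answer keys ("EU", "PHI", "payment_card") and question identifiers are hypothetical and are not taken from bandit.config.yml.

```python
def conditional_questions(locations: set[str], data_types: set[str]) -> list[str]:
    """Return the follow-up questions triggered by the core answers."""
    questions = []
    # Asked only if EU/EEA appears in the selected locations.
    if locations & {"EU", "EEA"}:
        questions.append("infrastructure_location")
    # Asked only if PHI was selected as a sensitive data type.
    if "PHI" in data_types:
        questions.append("baa_required")
    # Asked only if payment card data was selected.
    if "payment_card" in data_types:
        questions.append("pci_merchant_level")
    return questions
```

With no EU/EEA location, no PHI, and no payment card data, the wizard would ask only the 5 core questions.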

If no config exists, Bandit prompts you to run setup before the assessment starts. You can set up inline, skip it, or quit.

bandit profile

Show the vendor function profile for any vendor — what category it belongs to, weight modifiers that apply, and documents expected.

bandit profile "Salesforce"    # Detect and show profile
bandit profile --show          # List all cached profiles
bandit profile --unknown       # Show vendors with unknown classification

Bandit auto-detects vendor functions (HR, payments, AI/ML, healthcare, etc.) using a curated library of 330+ vendors. If detection confidence is below 0.6, you'll be prompted to confirm during bandit assess.

bandit batch

Assess a list of vendors from a text file. See "Batch assessment" above for the vendors.txt format.

bandit rubric

Show the scoring rubric. Use --dim D1 through --dim D8 to see criteria for a specific dimension.


The HTML report

Every bandit assess run saves an HTML report to ./reports/<vendor>-<date>.html.

The report includes:

  • Score summary — risk tier (LOW / MEDIUM / HIGH), weighted average, per-dimension scores
  • Assessment scope notice — shows which document types were assessed. When only the public privacy policy was assessed, D8 (DPA Completeness) is marked "Requires DPA" and D2, D4, D5 are marked "Partially assessed" until a DPA is provided with --docs or --drive
  • Per-dimension detail — evidence found (confirmed signals), gaps identified (missing signals), red flags triggered
  • Evidence confidence — each dimension shows whether evidence is Confirmed (genuine score), Partially assessed (DPA would complete), or Requires DPA (cannot score from public policy)
  • Profile header — active profile shown with industry, frameworks, and modified dimension weights
  • Escalation banner — HIGH risk vendors with auto-escalation triggers show a prominent banner with specific reasons
  • Vendor follow-up questions — 2–3 questions per gap, ready to send
  • Contract recommendations — specific DPA/MSA redline language for scores ≤ 3
  • Team summary — GRC decision, Legal contract checklist, Security posture (D5/D6/D8)
  • Vendor email template — consolidated questions formatted as a ready-to-send email

Use --no-report to skip saving. Use --json to print raw JSON to stdout (report is still saved).


Assessment scope

Bandit assesses public privacy policies and uploaded documents.

Different documents reveal different information:

Document              Dimensions unlocked
Public policy only    D1, D3, D6, D7 fully · D2, D4, D5 partially · D8 not assessed
+ DPA                 D8 fully · D2/D4/D5 complete
+ BAA (healthcare)    D5 HIPAA timeline
+ SOC 2 Type II       D2, D5, D7, D8 supplemented
+ AI Policy           D6 deep assessment
+ Sub-processor list  D2 complete

Document sources

Bandit supports public privacy policies, local document folders, and Google Drive, unlocking full scoring across all 8 dimensions.

What documents unlock

Document              Dimensions unlocked
Public policy only    D1 D3 D6 D7 fully · D2 D4 D5 partial · D8 not scored
+ DPA                 D8 fully · D2 D4 D5 complete
+ BAA                 D5 HIPAA timeline · D8 HIPAA provisions
+ SOC 2 Type II       Evidence for D2 D5 D7 D8
+ AI Policy           D6 deep assessment
+ Sub-processor list  D2 complete
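
One way to read the table above is as a function from the documents supplied to a per-dimension coverage status. The sketch below models only the rows that change a status (public policy, DPA, sub-processor list); the BAA, SOC 2, and AI Policy rows add evidence and are left out. The document keys and status labels are assumptions for illustration, not Bandit's internal names.

```python
FULL, PARTIAL, NOT_SCORED = "full", "partial", "not scored"


def coverage(documents: set[str]) -> dict[str, str]:
    """Return the coverage status of D1..D8 for a set of document types."""
    # Baseline: public privacy policy only.
    status = {d: FULL for d in ("D1", "D3", "D6", "D7")}
    status.update({d: PARTIAL for d in ("D2", "D4", "D5")})
    status["D8"] = NOT_SCORED
    # A DPA completes D2, D4, D5 and unlocks D8.
    if "dpa" in documents:
        for d in ("D2", "D4", "D5", "D8"):
            status[d] = FULL
    # A sub-processor list completes D2 on its own.
    if "subprocessor_list" in documents:
        status["D2"] = FULL
    return status
```

For example, coverage({"subprocessor_list"}) completes D2 but still leaves D4 and D5 partial and D8 unscored.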

Local folder

# Single vendor
bandit assess "Salesforce" --docs ./vendor-docs/Salesforce/

# Batch with auto-matching
bandit batch vendors.txt --docs-root ./vendor-docs/

Folder structure:

vendor-docs/
├── Salesforce/
│   ├── dpa.pdf
│   ├── msa.pdf
│   └── soc2-2025.pdf
└── HubSpot/
    └── dpa.pdf

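File names don't matter in normal use: Bandit auto-detects document types. If a file is classified as UNKNOWN, renaming it to include the type (for example "dpa") can help. Supported formats: PDF · DOCX · HTML · TXT · MD · JSON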
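
To check which files in a vendor folder fall within the supported formats before running an assessment, a short script like the one below may help. It is a sketch only: the helper names are hypothetical, and whether Bandit itself descends into subfolders is not stated here, so this version looks at the top level only.

```python
from pathlib import Path

# Extensions taken from the supported formats listed above.
SUPPORTED = {".pdf", ".docx", ".html", ".txt", ".md", ".json"}


def is_supported(filename: str) -> bool:
    """True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED


def list_vendor_documents(folder: str) -> list[Path]:
    """Return supported files directly inside a vendor folder, sorted by name."""
    root = Path(folder)
    return sorted(p for p in root.iterdir() if p.is_file() and is_supported(p.name))
```

Calling list_vendor_documents("./vendor-docs/Salesforce/") on the folder structure shown above would return dpa.pdf, msa.pdf, and soc2-2025.pdf.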

Google Drive

# One-time setup
bandit setup --drive

# Assess with Drive documents
bandit assess "Salesforce" --drive

# Batch with Drive
bandit batch vendors.txt --drive

First time with existing Drive folders:

bandit setup --drive           # configure credentials
bandit sync                    # discover, link, and pull docs
bandit assess "Vendor" --drive # run with Drive docs
bandit dashboard               # view portfolio

Full setup guide: docs/google-drive-setup.md

Local vs Drive

               Local folder                        Google Drive
Setup          None (just create folders)          One-time OAuth setup
Best for       Quick assessments, offline, Ollama  Teams, scheduled batches
Air-gapped     Yes (with Ollama)                   No
Auto-discover  Manual folder path                  Automatic by vendor name
Save reports   Local ./reports/                    Back to Drive vendor folder
Team access    No                                  Yes

The 8 dimensions

ID  Dimension                 Weight  Regulatory basis
D1  Data minimization         ×1.0    GDPR Art. 5(1)(c)
D2  Sub-processor management  ×1.0    GDPR Art. 28(2),(4)
D3  Data subject rights       ×1.0    GDPR Arts. 12–23
D4  Transfer mechanisms       ×1.0    GDPR Arts. 44–50
D5  Breach notification       ×1.0    GDPR Art. 33
D6  AI/ML data usage          ×1.5    EU AI Act 2024 · FTC
D7  Retention & deletion      ×1.0    GDPR Art. 5(1)(e)
D8  DPA completeness          ×1.5    GDPR Art. 28(3)(a)–(h)

D6 and D8 are weighted 1.5× — AI/ML is the fastest-moving regulatory area and DPA quality sets the enforceability ceiling for D2, D5, and D7.

Weights are adjusted automatically when you run bandit setup. A healthcare company gets D5 weighted higher; an EU company gets D4 weighted higher.

Each dimension is scored 1–5. Risk tier:

Tier    Weighted average
HIGH    < 2.5
MEDIUM  2.5 – 3.5
LOW     > 3.5
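
As a worked example of the weights and tier thresholds above, the sketch below computes a weighted average and maps it to a tier. It assumes the average is the weighted sum divided by the sum of weights, using the default weights only; how Bandit handles unscored dimensions or profile-adjusted weights is not covered. Function names and the sample scores are made up for illustration.

```python
# Default weights from the dimension table (D6 and D8 at 1.5).
DEFAULT_WEIGHTS = {
    "D1": 1.0, "D2": 1.0, "D3": 1.0, "D4": 1.0,
    "D5": 1.0, "D6": 1.5, "D7": 1.0, "D8": 1.5,
}


def weighted_average(scores: dict[str, int], weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of 1-5 scores divided by the total weight (assumed formula)."""
    total_weight = sum(weights[d] for d in scores)
    return sum(scores[d] * weights[d] for d in scores) / total_weight


def risk_tier(average: float) -> str:
    """Map a weighted average to the tiers in the table above."""
    if average < 2.5:
        return "HIGH"
    if average <= 3.5:
        return "MEDIUM"
    return "LOW"


# Hypothetical vendor: weak on AI/ML usage (D6) and DPA completeness (D8).
scores = {"D1": 4, "D2": 3, "D3": 4, "D4": 3, "D5": 3, "D6": 2, "D7": 4, "D8": 2}
average = weighted_average(scores)  # 27.0 / 9.0 = 3.0
tier = risk_tier(average)           # "MEDIUM"
```

The unweighted mean of the same scores is 25 / 8 = 3.125, so the 1.5× weights on D6 and D8 pull this vendor's result down to 3.0.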

See core/scoring/RUBRIC.md for full criteria, enforcement precedents, and red-flag phrase registry.


Providers

Provider             Model                      Notes
Anthropic (default)  claude-haiku-4-5-20251001  Fast and cost-effective
Anthropic            claude-opus-4-6            Best quality for nuanced analysis
OpenAI               gpt-4o                     Excellent, widely used in enterprise
Google               gemini-1.5-pro             Strong quality at lower cost
Mistral              mistral-large-latest       European-hosted option
Ollama               llama3.1, mistral          Fully local. No API key. Free.

Override the model with --model:

bandit assess "Acme Corp" --model claude-opus-4-6
bandit assess "Acme Corp" --model gpt-4o

Discovery pipeline

When given a company name or domain, Bandit locates the privacy policy through a staged fallback:

  1. DDG search — queries DuckDuckGo for {vendor} privacy policy
  2. TLD probe — tries common privacy paths (/privacy, /privacy-policy, etc.)
  3. AI reasoning — asks the LLM to infer the most likely domain
  4. Homepage scrape — fetches the homepage and looks for policy links
  5. Manual flag — logs to ~/.bandit/manual-review.json if all stages fail

Discovered domain→URL mappings are cached in ~/.bandit/domain-cache.json.
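
The staged fallback with caching described above follows a common pattern: try each stage in order, stop at the first hit, and remember the result. The sketch below shows that pattern with stub stages. It is not Bandit's implementation: the function name, the stage signature, and the use of the vendor string as the cache key are all assumptions.

```python
from typing import Callable, Optional

# A stage takes the vendor input and returns a policy URL or None.
Stage = Callable[[str], Optional[str]]


def discover_policy_url(
    vendor: str,
    stages: list[tuple[str, Stage]],
    cache: dict[str, str],
) -> Optional[str]:
    """Run discovery stages in order and cache the first URL found."""
    if vendor in cache:
        return cache[vendor]
    for _name, stage in stages:
        url = stage(vendor)
        if url:
            cache[vendor] = url
            return url
    # All stages failed: the caller would log the vendor for manual review.
    return None


# Stub stages standing in for search, path probing, and so on.
calls = []
stages = [
    ("search", lambda v: calls.append("search")),  # append returns None: a miss
    ("probe", lambda v: f"https://{v}/privacy"),
    ("ai", lambda v: calls.append("ai") or "unused"),
]
cache: dict[str, str] = {}
found = discover_policy_url("acme.com", stages, cache)
```

Here the search stub misses, the probe stub succeeds, and the third stage is never called, which is the point of ordering cheap stages before expensive ones.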


Troubleshooting

ANTHROPIC_API_KEY not set — Export the key or pass --api-key sk-ant-....

Could not locate a privacy policy — Try passing the URL directly: bandit assess https://vendor.com/privacy. The vendor is logged to ~/.bandit/manual-review.json.

Sparse policy text — Bandit automatically retries via Jina Reader if the direct fetch returns too little text (JS-rendered pages). If both fail, the policy may require authentication.

Wrong policy found — Pass the full URL to bypass discovery entirely.
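
The sparse-text retry mentioned above can be pictured as a fetch with one fallback. In the sketch below the fetcher is injected so the logic can be tested offline. The 500-character threshold and the function names are assumptions; the r.jina.ai prefix is how Jina Reader is publicly addressed, but how Bandit calls it internally is not documented here.

```python
from typing import Callable, Optional

JINA_READER_PREFIX = "https://r.jina.ai/"
MIN_CHARS = 500  # assumed threshold for "too little text"


def fetch_policy_text(url: str, fetch: Callable[[str], str]) -> Optional[str]:
    """Fetch a policy directly, retrying through a reader proxy if the text is sparse."""
    text = fetch(url)
    if len(text.strip()) >= MIN_CHARS:
        return text
    # Direct fetch was sparse (for example a JS-rendered page): retry via the reader.
    fallback = fetch(JINA_READER_PREFIX + url)
    if len(fallback.strip()) >= MIN_CHARS:
        return fallback
    # Both attempts failed; the policy may require authentication.
    return None


# Offline stub: the direct page is nearly empty, the reader copy is not.
pages = {
    "https://vendor.example/privacy": "Loading...",
    "https://r.jina.ai/https://vendor.example/privacy": "x" * 600,
}
result = fetch_policy_text("https://vendor.example/privacy", lambda u: pages.get(u, ""))
```

In real use the fetch argument would be an HTTP client call; keeping it as a parameter avoids tying the retry logic to any one library.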

Document issues

All dimensions still show "Requires DPA" after --docs
Check the manifest output: the DPA may have failed to extract (scanned PDF) or been classified as UNKNOWN. Run with --verbose to see the full manifest. Rename the file to include "dpa" if classification failed.

Scanned PDF not readable
Bandit requires PDFs with a text layer. Request a native PDF from your vendor. OCR support for scanned PDFs is coming in v1.2.

Google Drive: "Vendor folder not found"
The subfolder name must match the vendor name. Check spelling and case. Example: folder "Salesforce" matches bandit assess "Salesforce".

Google Drive: token expired or "403 Insufficient Permission"
Delete the token and re-authenticate:

rm ~/.bandit/google-token.json
bandit assess "YourVendor" --drive

Running bandit setup --drive alone does not always fix this — the token must be deleted so a fresh browser consent is triggered with the correct permissions.


Roadmap

v1.0 — Live

Privacy Bandit · CLI · HTML reports · Setup profiles · Evidence confidence · Provider-agnostic

v1.1 — Live

Local folder document sources · Google Drive integration · PDF/DOCX parsing · 47 document types · Full D8 scoring · Signal source attribution · Multi-document assessment

v1.2 — Live

Legal Bandit · Full GDPR Art. 28(3) DPA checklist · MSA commercial data protection terms · SCC version check · Contract-based score updates · Legal redline brief HTML report

v1.3 — Live

Vendor Intelligence · 12-question intake wizard · Tech stack integration · Assessment history per vendor · Intake context injected into assessments · Weight modifiers from intake data · IT notification queue · Google Drive profile sync

v1.4 — Live

Portfolio dashboard · Reassessment schedule · TPRM register export (CSV / JSON / HTML) · IT notification sending · Unified data resolver

v2.0 — Planned

Full vendor onboarding workflow · Submission portal · Approval workflow · Vendor self-service · API


The crew

Each Bandit is a specialised agent with its own tool belt:

Agent           Status   Scope
Privacy Bandit  Live     All 8 dimensions from privacy policy + DPA
Legal Bandit    Live     D2, D5, D7, D8 from MSA/DPA · GDPR Art. 28 checklist · redline brief
AI Bandit       Planned  D6 focused, EU AI Act compliance
Audit Bandit    Planned  D2, D5, D8 from SOC 2 / ISO 27001 reports
Data Bandit     Planned  Data flow and transfer mapping

Contributing

See CONTRIBUTING.md. The most impactful contributions:

  • Rubric improvements — edit core/scoring/rubric.py (no AI expertise needed)
  • New red-flag patterns — add enforcement-backed phrases to the signal registry
  • Provider adapters — add a new LLM provider to core/llm/
  • New agent implementations — build out a Bandit agent class in core/agents/

License

MIT — see LICENSE.
