
πŸ›‘οΈ LLM Sentinel [LLM Red Teaming Platform]

Enterprise-Grade Automated Security Testing for Large Language Models

Python 3.10+ License: MIT Streamlit FastAPI DeepTeam LangChain


📌 Project Description

The Challenge

As organizations increasingly adopt Large Language Models (LLMs) for production applications, ensuring their security against adversarial attacks, jailbreaks, and prompt injections has become critical. Traditional security testing approaches are inadequate for evaluating LLM-specific vulnerabilities.

Our Solution

The LLM Red Teaming Platform is a comprehensive, production-ready security testing framework designed specifically for Large Language Models. It automates the discovery of vulnerabilities through adversarial red teaming techniques, enabling security researchers, AI engineers, and organizations to identify weaknesses before they can be exploited in production.

Who Is This For?

  • Security Researchers conducting AI safety assessments
  • AI/ML Engineers building production LLM applications
  • Enterprise Organizations ensuring compliance and security
  • Red Team Professionals specializing in AI system testing

Key Value Proposition

  • Comprehensive Coverage: Tests 7+ vulnerability categories with 12+ attack methods
  • Multi-Provider Support: Works with any LLM via a unified interface
  • Production-Ready: Enterprise-grade with persistence, authentication, and reporting
  • Extensible Framework: Modular architecture for custom attacks and integrations
  • Automated Workflows: Reduces manual testing effort by 80%+

πŸ—οΈ Architecture Overview

System Design

Architecture Diagram

Flow Diagram

LLM Flow

Sequence Diagram


βš™οΈ Tech Stack

Backend

Component | Technology | Purpose
Core Language | Python 3.10+ | Application runtime
Web Framework | FastAPI 0.115+ | REST API and async request handling
UI Framework | Streamlit 1.32+ | Interactive dashboard
Red Teaming | DeepTeam 3.8+ | Adversarial testing framework
LLM Integration | LangChain 1.2+ | Universal LLM provider interface

Frontend

Component | Technology | Purpose
Templates | Jinja2 3.1+ | Server-side HTML rendering
Styling | Custom CSS | Modern, responsive UI design
Charts | Plotly 6.0+ | Interactive data visualization
Static Assets | FastAPI StaticFiles | CSS/JS/image serving

Database

Component | Technology | Purpose
Primary Database | SQLite | Embedded relational database
ORM | SQLAlchemy 2.0+ | Database abstraction and migrations
Models | SQLAlchemy ORM | Scans, TestCases, Configurations
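The persistence layer pairs SQLAlchemy with an embedded SQLite database. As a minimal sketch of how scan results can be stored and aggregated, the example below uses the standard-library sqlite3 driver directly; the table and column names are illustrative, not the project's actual ORM models.

```python
import sqlite3

# In-memory database for illustration; the app uses a file-backed SQLite DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE scans (
           id INTEGER PRIMARY KEY,
           model TEXT NOT NULL,
           vulnerability TEXT NOT NULL,
           passed INTEGER NOT NULL  -- 1 = model resisted the attack
       )"""
)
rows = [
    ("gpt-4o", "jailbreak", 1),
    ("gpt-4o", "prompt_leaking", 0),
    ("llama-3", "jailbreak", 0),
]
conn.executemany(
    "INSERT INTO scans (model, vulnerability, passed) VALUES (?, ?, ?)", rows
)

# Aggregate pass rate per model, as a results dashboard might.
for model, rate in conn.execute(
    "SELECT model, AVG(passed) FROM scans GROUP BY model ORDER BY model"
):
    print(model, rate)
```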

Cloud & DevOps

Component | Technology | Purpose
Cloud Platform | Microsoft Azure | Application hosting and infrastructure
Compute | Azure App Service / AKS | Web application deployment
Secrets | Environment Variables | API key management

AI/ML Providers

Provider | Integration | Models Supported
OpenAI | langchain-openai | GPT-4, GPT-3.5, GPT-4o
Azure OpenAI | langchain-openai | Azure-hosted OpenAI models
Anthropic | langchain-anthropic | Claude 3, Claude 2
Google | langchain-google-genai | Gemini Pro, Gemini Ultra
Groq | langchain-groq | Llama 3, Mixtral
AWS Bedrock | langchain-aws | Bedrock models
HuggingFace | langchain-huggingface | Open-source models
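A factory behind a registry is the usual way to expose many providers through one interface (the repo's core/llm_factory.py plays this role). The sketch below is a hypothetical, dependency-free illustration of that pattern: in the real project the constructors would return LangChain chat models (e.g. from langchain-openai), while here a plain dataclass stands in so the example is self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ChatModel:
    """Stand-in for a LangChain chat model instance."""
    provider: str
    model: str


_REGISTRY: Dict[str, Callable[[str], ChatModel]] = {}


def register(provider: str):
    """Decorator that maps a provider name to its constructor."""
    def deco(fn: Callable[[str], ChatModel]) -> Callable[[str], ChatModel]:
        _REGISTRY[provider] = fn
        return fn
    return deco


@register("openai")
def _openai(model: str) -> ChatModel:
    return ChatModel("openai", model)


@register("groq")
def _groq(model: str) -> ChatModel:
    return ChatModel("groq", model)


def create_llm(provider: str, model: str) -> ChatModel:
    """Single entry point the scan engine would call."""
    try:
        return _REGISTRY[provider](model)
    except KeyError:
        raise ValueError(f"Unsupported provider: {provider!r}") from None


llm = create_llm("groq", "llama-3.1-8b-instant")
```

Adding a new provider then means registering one constructor, without touching the engine code that calls create_llm.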

Authentication & Security

Component | Technology | Purpose
Password Hashing | bcrypt 4.2+ | Secure credential storage
Session Management | Starlette SessionMiddleware | Stateful authentication
Secrets Management | python-dotenv | Environment configuration
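The project lists bcrypt for credential storage. As a dependency-free sketch of the same salted hash-and-verify pattern, the example below substitutes the standard library's PBKDF2 (bcrypt.hashpw/bcrypt.checkpw play the equivalent roles in the real module):

```python
import hashlib
import hmac
import os


def hash_password(password: str) -> bytes:
    """Derive a salted hash; a fresh random salt defeats rainbow tables."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt + digest  # store the salt alongside the hash


def verify_password(password: str, stored: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)


stored = hash_password("s3cret")
```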

Monitoring & Logging

Component | Technology | Purpose
Logging | Python logging module | Structured application logs
Log Output | File-based logging | Audit trail and debugging

Utilities

Component | Technology | Purpose
PDF Generation | fpdf2 2.8+ | Security report generation
Data Processing | Pandas 2.2+ | Results analysis and aggregation
Validation | Pydantic 2.10+ | Type-safe configuration management
Environment Config | pydantic-settings 2.7+ | Settings validation

✨ Features

Core Features

✅ Automated Red Teaming

  • One-click security scans against any LLM
  • Configurable attack intensity (1-20 attacks per vulnerability)
  • Support for batch scanning multiple models

✅ Multi-Provider LLM Support

  • OpenAI (GPT-4, GPT-3.5, GPT-4o)
  • Azure OpenAI (all deployments)
  • Anthropic Claude (2.x, 3.x)
  • Google Gemini (Pro, Ultra)
  • Groq (Llama 3, Mixtral)
  • AWS Bedrock
  • HuggingFace models

✅ Comprehensive Vulnerability Testing

  • Robustness: Input overreliance, misinformation
  • Indirect Injection: Cross-prompt leaking
  • Jailbreak: System prompt bypassing
  • Shell Injection: Code execution attempts
  • Prompt Leaking: System prompt extraction
  • Goal Hijacking: Task redirection
  • Inter-Agent Security: Multi-agent vulnerabilities

✅ Attack Library

  • 100+ pre-built adversarial prompts
  • Categorized by attack type and severity
  • Custom attack builder interface

Advanced Features

🔬 Attack Enhancement Methods

  • Jailbreak Strategies: DAN, Evil Confidant, STAN, role-playing
  • Encoding Attacks: ROT13, Base64, Caesar cipher
  • Prompt Probing: Iterative refinement
  • Gray Box Testing: Partial knowledge exploitation
  • Multilingual Attacks: Non-English prompts
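The encoding-based enhancers above re-encode a payload to probe whether a model's safety filters still recognize obfuscated instructions. A minimal stdlib sketch of the ROT13, Base64, and Caesar transforms (the payload here is a harmless placeholder, and this is not the project's actual enhancer code):

```python
import base64
import codecs

payload = "Ignore previous instructions"


def rot13(text: str) -> str:
    # ROT13 is its own inverse: encode twice to get the original back.
    return codecs.encode(text, "rot13")


def b64(text: str) -> str:
    return base64.b64encode(text.encode()).decode()


def caesar(text: str, shift: int = 3) -> str:
    # Shift letters, leave punctuation and spaces untouched.
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


variants = {
    "rot13": rot13(payload),
    "base64": b64(payload),
    "caesar": caesar(payload),
}
```

Each variant decodes back to the same payload, which is exactly what makes the obfuscation cheap to generate and test at scale.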

🎯 Custom Attack Builder

  • Visual interface for creating custom prompts
  • Template-based attack creation
  • Real-time testing against target models
  • Save and reuse custom attacks

📊 Advanced Analytics

  • Interactive dashboards with Plotly charts
  • Vulnerability distribution analysis
  • Time-series trend tracking
  • Attack success rate metrics
  • Model comparison views
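The success-rate metric behind these views reduces to a per-category aggregation over test cases. A small sketch, with illustrative field names rather than the project's actual schema:

```python
from collections import defaultdict

# Each test case records its vulnerability category and whether the
# adversarial prompt broke through the model's defenses.
test_cases = [
    {"vulnerability": "jailbreak", "attack_succeeded": True},
    {"vulnerability": "jailbreak", "attack_succeeded": False},
    {"vulnerability": "prompt_leaking", "attack_succeeded": False},
    {"vulnerability": "prompt_leaking", "attack_succeeded": False},
]


def success_rates(cases):
    """Fraction of attacks that succeeded, per vulnerability category."""
    totals, hits = defaultdict(int), defaultdict(int)
    for case in cases:
        totals[case["vulnerability"]] += 1
        hits[case["vulnerability"]] += case["attack_succeeded"]
    return {v: hits[v] / totals[v] for v in totals}


rates = success_rates(test_cases)
```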

📑 Professional Reporting

  • Auto-generated PDF security reports
  • Executive summary with risk scores
  • Detailed test case breakdowns
  • Remediation recommendations
  • Compliance-ready documentation

📂 Project Structure

Red_Teaming/
│
├── app.py                      # Streamlit UI entry point
├── web_app.py                  # FastAPI web application
├── migrate_db.py               # Database migration script
├── requirements.txt            # Python dependencies
├── sample.json                 # Sample configuration
├── .env.example                # Environment template
│
├── auth/                       # Authentication module
│   ├── __init__.py
│   └── authentication.py       # Login logic, password hashing
│
├── config/                     # Configuration management
│   ├── __init__.py
│   ├── settings.py             # Environment-based settings
│   └── providers.py            # LLM provider configurations
│
├── core/                       # Core business logic
│   ├── __init__.py
│   ├── red_team_engine.py      # Main orchestration engine
│   ├── llm_factory.py          # LLM instance factory
│   ├── attack_registry.py      # Vulnerability & attack registry
│   ├── attack_library.py       # Pre-built attack prompts
│   ├── jailbreak_strategies.py # Jailbreak method implementations
│   └── custom_red_team_engine.py # Custom attack execution
│
├── database/                   # Data persistence layer
│   ├── __init__.py
│   ├── db_manager.py           # Database operations & queries
│   └── models.py               # SQLAlchemy ORM models
│
├── reports/                    # Report generation
│   ├── __init__.py
│   └── pdf_generator.py        # PDF report creation
│
├── ui/                         # Streamlit UI components
│   ├── __init__.py
│   ├── components/             # Reusable UI widgets
│   │   ├── __init__.py
│   │   ├── charts.py           # Plotly visualization components
│   │   ├── model_selector.py   # LLM selection widget
│   │   └── sidebar.py          # Navigation sidebar
│   └── pages/                  # Application pages
│       ├── __init__.py
│       ├── dashboard.py        # Main dashboard view
│       ├── configure.py        # Provider configuration
│       ├── attack_lab.py       # Attack testing interface
│       ├── results.py          # Scan results display
│       └── reports_page.py     # Report management
│
├── templates/                  # Jinja2 HTML templates (FastAPI)
│   ├── base.html               # Base template with layout
│   ├── index.html              # Landing page
│   ├── dashboard.html          # Dashboard view
│   ├── config.html             # Configuration page
│   ├── attack.html             # Attack execution page
│   ├── custom_attack.html      # Custom attack builder
│   ├── results.html            # Results display
│   └── reports.html            # Reports page
│
├── static/                     # Static assets (CSS, JS, images)
│   ├── css/
│   ├── js/
│   └── images/
│
├── utils/                      # Utility functions
│   ├── __init__.py
│   ├── logger.py               # Logging configuration
│   └── helpers.py              # Helper functions
│
├── docs/                       # Documentation
│   ├── README.md               # Documentation index
│   ├── ARCHITECTURE.md         # Architecture details
│   ├── CONTRIBUTING.md         # Contribution guidelines
│   ├── DEMO.md                 # Demo walkthrough
│   ├── REQUIREMENTS.md         # Detailed requirements
│   ├── SECURITY.md             # Security policies
│   └── CHANGELOG.md            # Version history
│
├── logs/                       # Application logs (gitignored)
├── reports/                    # Generated PDF reports (gitignored)
└── __pycache__/                # Python bytecode (gitignored)

Key Directory Explanations

Directory | Purpose
auth/ | Handles user authentication, password hashing, and session management
config/ | Centralized configuration management and provider definitions
core/ | Core red teaming logic including engine, factory, and attack registry
database/ | SQLAlchemy models and database operation wrappers
reports/ | PDF generation for security assessment reports
ui/ | Streamlit-based user interface components and pages
templates/ | Jinja2 templates for the FastAPI web interface
static/ | CSS, JavaScript, and image assets for the web UI
utils/ | Shared utility functions and helpers
docs/ | Comprehensive project documentation

🖥️ Local Setup Guide

Prerequisites

Ensure you have the following installed:

  • Python 3.10 or higher (Python 3.11 recommended)
  • pip (Python package manager)
  • Git (for cloning repository)
  • API keys for at least one LLM provider (see provider list below)

Environment Variables

Create a .env file in the project root:

cp .env.example .env
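Then fill in API keys for the providers you plan to test. The variable names below are illustrative; check .env.example for the exact names the project expects.

```shell
# Illustrative .env entries -- consult .env.example for the real variable names.
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
GROQ_API_KEY=...
```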

Installation Steps

  1. Clone the repository:

git clone <repository-url>
cd Red_Teaming

  2. Create a virtual environment:

python -m venv venv

  3. Activate the virtual environment:

Windows:

.\venv\Scripts\activate

macOS/Linux:

source venv/bin/activate

  4. Install dependencies:

pip install --upgrade pip
pip install -r requirements.txt

  5. Initialize the database:

python migrate_db.py

Running the Application

Option 1: Streamlit UI (Interactive Dashboard)

streamlit run app.py

The application will open at http://localhost:8501

Option 2: FastAPI Web App (REST API + HTML)

python web_app.py

Or with Uvicorn directly:

uvicorn web_app:app --reload --host 0.0.0.0 --port 8000

The application will be available at http://localhost:8000

Running Tests

Currently, the project uses manual testing workflows. To validate your setup:

  1. Test provider connectivity:

    • Navigate to the Configuration page
    • Click "Test Connection" for each configured provider
  2. Run a sample scan:

    • Go to Attack Lab
    • Select models and vulnerability types
    • Execute a small scan (5 attacks)
  3. Verify results:

    • Check the Results page for scan output
    • Generate a PDF report

📸 Screenshots

Dashboard

Dashboard Overview Dashboard Analytics Dashboard

Configuration

Configuration Setup Configuration Options

Attack Lab

Attack Lab Setup Attack Lab Execution

Custom Attack

Custom Attack Configuration Custom Attack Results

PDF Report

Report


📚 Documentation

Comprehensive documentation is available in the docs/ folder:

  • 📝 Architecture - System design and technical details
  • 🤝 Contributing - How to contribute to the project
  • 🔒 Security - Security best practices and considerations
  • 🎬 Demo Guide - Creating demos and videos
  • 📋 Changelog - Version history and release notes
  • 📦 Requirements - System and dependency requirements

Supported Models

Provider | Models | Notes
OpenAI | GPT-4, GPT-4o, GPT-3.5-turbo | Best JSON support
Groq | Llama 3.x, Mixtral, Gemma | Fast, free tier available
Anthropic | Claude 3 Opus, Sonnet, Haiku | High-quality responses
Azure OpenAI | Same as OpenAI | Enterprise deployment
Google | Gemini Pro, Gemini Pro Vision | Multimodal support
Ollama | Any local model | Privacy-focused

πŸ™ Acknowledgments

Frameworks & Libraries

Security Research

  • OWASP LLM Top 10 - Vulnerability classification
  • NIST AI RMF - Risk management framework
  • Security researchers and the open-source community

⭐ Star History

If you find this project useful, please consider giving it a star! ⭐





Built with ❤️ for AI Security
