🛡️ Web Application Firewall (WAF) Enhanced with AI

Autonomous Dynamic Learning with Generative Models

A novel approach combining traditional ML with generative AI for real-time threat detection

📖 Abstract • 🚀 Quick Start • 🏗️ Architecture • 📊 Results

📖 Abstract

The evolution of web application defense mechanisms has led to Web Application Firewalls (WAFs) that use machine learning models for threat detection. This paper presents a novel approach that combines a traditional machine learning technique (Naive Bayes) with a generative model (ChatGPT) to classify threats to web applications dynamically.

Our solution uses ChatGPT to label attacks the local classifier has not seen before, then retrains the classifier on those labels. As the system learns new attack patterns, it depends less and less on the generative model.

✨ Key Features

🤖 Hybrid AI System - Combines Naive Bayes + ChatGPT for optimal detection
🔄 Autonomous Learning - Continuously retrains from new attack patterns
⚡ Real-time Detection - Instant classification of known and novel attacks
🎯 Zero-day Protection - Detects previously unknown attack vectors
📈 Progressive Independence - Reduces reliance on ChatGPT over time
🛡️ Multi-attack Support - XSS, SQL Injection, Path Traversal, and more

🎥 Demo Videos

Model Learning Process

Autonomous Operation

Both demos are available as videos on YouTube.

🚀 Quick Start

Prerequisites

Python 3.7+
OpenAI API key
Python packages: flask, requests, colorama, scikit-learn, openai (installed in the next step)

Installation

# Clone the repository
git clone https://github.com/daletoniris/Web-Application-Firewall-Purple-AI-Paper.git
cd Web-Application-Firewall-Purple-AI-Paper

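# Install dependencies (openai is pinned below 1.0 because the ChatGPT
# integration shown in this README uses the legacy openai.ChatCompletion interface)
pip install flask requests colorama scikit-learn "openai<1.0"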

Configuration

Set your OpenAI API key in each of these scripts:

WAF_TRAIN_GPT.py
WAF_POST_GPT_NAIVES.py

openai.api_key = "your-api-key-here"
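
Hard-coding a key in source files makes it easy to commit it by accident. As a hedged alternative, the sketch below reads the key from an environment variable instead. The variable name `OPENAI_API_KEY` and the helper `load_api_key` are assumptions for illustration; the repository's scripts are not known to work this way.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Both the variable name and this helper are illustrative assumptions.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before starting the WAF scripts")
    return key


# In WAF_TRAIN_GPT.py / WAF_POST_GPT_NAIVES.py this could replace the literal:
#     openai.api_key = load_api_key()
```

With this in place, the key is exported once in the shell (`export OPENAI_API_KEY=...`) and never appears in the source tree.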

🏃 How to Run

Step 1: Start the Web Server

python server.py

The server will be available at http://localhost:5051

Step 2: Simulate Attacks

In a new terminal:

python ATTACK.py

This sends random attacks (XSS, SQL Injection, etc.) every 5 seconds.
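
The attack loop can be pictured roughly as follows. This is a minimal sketch, not the contents of ATTACK.py: the payload table reuses the two payloads from the example output in this README, while the form field names (`username`, `password`), the helper names, and the use of the standard library instead of `requests` are assumptions.

```python
import random
import time
import urllib.error
import urllib.parse
import urllib.request

# Payloads copied from the example output in this README; the real script may use more.
PAYLOADS = {
    "SQL Injection": "1' OR '1'='1",
    "XSS": '<script>alert("XSS")</script>',
}


def pick_attack(rng=random):
    """Choose one (attack_type, payload) pair at random."""
    attack_type = rng.choice(sorted(PAYLOADS))
    return attack_type, PAYLOADS[attack_type]


def build_request(payload, url="http://localhost:5051/login"):
    """Build a form-encoded POST request; the field names are hypothetical."""
    body = urllib.parse.urlencode({"username": payload, "password": "x"}).encode()
    return urllib.request.Request(url, data=body, method="POST")


def run(interval=5, rounds=None):
    """Send one random attack every `interval` seconds (forever if rounds is None)."""
    sent = 0
    while rounds is None or sent < rounds:
        attack_type, payload = pick_attack()
        try:
            with urllib.request.urlopen(build_request(payload), timeout=5) as resp:
                status = resp.status
        except urllib.error.HTTPError as exc:
            status = exc.code  # the server answered with an error status
        except OSError as exc:
            status = f"unreachable ({exc})"  # server not running, timeout, etc.
        print(f"Attack ({attack_type}) sent: {payload} | Response Code: {status}")
        sent += 1
        time.sleep(interval)
```

Calling `run()` starts the endless loop; `run(rounds=3)` sends three attacks and stops, which is convenient for a quick smoke test.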

Step 3: Monitor with AI

Start monitoring and classifying logs with ChatGPT:

python WAF_TRAIN_GPT.py

Step 4: Train Naive Bayes Model

Train the local classifier:

python WAF_POST_GPT_NAIVES.py

The model now classifies logs locally and consults ChatGPT only when it is uncertain (see Architecture).

🏗️ Architecture

System Workflow

┌──────────────┐              ┌──────────────────┐
│  Web Server  │ ──► Logs ──► │   Naive Bayes    │
│  (server.py) │              │   Classifier     │
└──────────────┘              └────────┬─────────┘
                                       │
                    ┌──────────────────┴──────────────────┐
                    │                                     │
              ✅ Confident                           ❓ Uncertain
                    │                                     │
                    │                                     ▼
                    │                            ┌─────────────────┐
                    │                            │     ChatGPT     │
                    │                            │  Classification │
                    │                            └────────┬────────┘
                    │                                     │
                    └──────────────────┬──────────────────┘
                                       │
                                       ▼
                              ┌──────────────────┐
                              │  Retrain Model   │
                              │  (Feedback Loop) │
                              └──────────────────┘
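
The workflow in the diagram can be sketched as a small routing function. This is an illustration under assumptions, not the repository's code: the 0.8 confidence threshold, the function names, and the shape of the two classifier callables are all hypothetical.

```python
from typing import Callable, List, Tuple

# (log line, label) pairs collected for the next retraining pass.
TrainingSet = List[Tuple[str, str]]


def route(
    log_line: str,
    local_classify: Callable[[str], Tuple[str, float]],
    remote_classify: Callable[[str], str],
    training_set: TrainingSet,
    threshold: float = 0.8,  # assumed cut-off; tune on real data
) -> Tuple[str, str]:
    """Classify one log line and return (label, source).

    local_classify returns (label, confidence) from the Naive Bayes model;
    remote_classify asks the generative model and returns a label.
    """
    label, confidence = local_classify(log_line)
    source = "local"
    if confidence < threshold:
        # Uncertain: fall back to the generative model.
        label = remote_classify(log_line)
        source = "remote"
    # Both branches feed the retraining step, as in the diagram.
    training_set.append((log_line, label))
    return label, source
```

Passing the classifiers in as callables keeps the routing logic independent of both scikit-learn and the OpenAI client, so it can be exercised with stubs.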

Components

| Component | Description |
| --- | --- |
| server.py | Simulates web application and logs incoming requests |
| ATTACK.py | Sends random simulated attacks to the server |
| WAF_TRAIN_GPT.py | Classifies logs using ChatGPT and stores learned patterns |
| WAF_POST_GPT_NAIVES.py | Trains and uses Naive Bayes model for local classification |

🎯 Supported Attack Types

✅ XSS (Cross-site Scripting)
✅ SQL Injection
✅ Path Traversal
✅ Command Injection
✅ Remote File Inclusion (RFI)
✅ LDAP Injection
✅ Code Injection

📊 Results

Performance Improvements

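Accuracy: The Naive Bayes model's accuracy improved after retraining on ChatGPT-labeled logs
Real-time Detection: Known patterns are classified locally with near-instant latency; novel attack vectors are detected after one ChatGPT round trip
Continuous Learning: Each newly labeled log line feeds the retraining loop, so detection rates improve over time
Autonomy: The system consults ChatGPT less often as the local model learns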

Example Output

ATTACK.py:

⚔️ Attacker started. Sending attacks every 5 seconds...
✖ Attack (SQL Injection) sent: 1' OR '1'='1 | Response Code: 200
✖ Attack (XSS) sent: <script>alert("XSS")</script> | Response Code: 200

WAF_TRAIN_GPT.py:

➤ Processing new log line: INFO:werkzeug:127.0.0.1 - - [19/Nov/2024:15:10:35] "POST /login HTTP/1.1" 200 -
🔍 ChatGPT classified the line as: SQL Injection
✔ Memory saved successfully.

WAF_POST_GPT_NAIVES.py:

➤ Processing new log line: INFO:werkzeug:127.0.0.1 - - [19/Nov/2024:15:12:40] "POST /login HTTP/1.1" 200 -
✔ Classified by the model as: XSS
✔ Memory saved successfully.
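
The "Memory saved" messages above suggest that each classified line is persisted for later training. One way this could work is sketched here. The JSON layout (a mapping from log line to label), the file path passed in, and all three helper names are assumptions, not the format the scripts actually use.

```python
import json
import os
import tempfile


def load_memory(path):
    """Return the stored {log_line: label} mapping, or {} if no file exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_memory(path, memory):
    """Write to a temporary file, then rename it over the target.

    The rename is atomic, so a crash cannot leave a half-written memory file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(memory, fh, indent=2)
    os.replace(tmp_path, path)


def remember(path, log_line, label):
    """Record one classified line and persist the updated memory."""
    memory = load_memory(path)
    memory[log_line] = label
    save_memory(path, memory)
    return memory
```

A call such as `remember("waf_memory.json", line, "SQL Injection")` would then run after every classification.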

🔬 Technical Details

Naive Bayes Implementation

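from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# logs: list of raw log-line strings; labels: the attack type for each line
# (for example, the classifications collected in Step 3).

# Vectorize logs: TF-IDF weights over at most the 1,000 most frequent tokens
vectorizer = TfidfVectorizer(max_features=1000)
X = vectorizer.fit_transform(logs)
y = labels

# Train model
model = MultinomialNB().fit(X, y)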
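
To connect the training snippet to the confident/uncertain split in the architecture, the following sketch shows how a fitted model can report a confidence score through `predict_proba`. The sample log lines, their labels, and the `classify` helper are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny illustrative training set; real data would come from labeled server logs.
logs = [
    "GET /search?q=<script>alert(1)</script> HTTP/1.1",
    "GET /item?id=1' OR '1'='1 HTTP/1.1",
    "GET /index.html HTTP/1.1",
    "GET /about.html HTTP/1.1",
]
labels = ["XSS", "SQL Injection", "No Attack", "No Attack"]

vectorizer = TfidfVectorizer(max_features=1000)
model = MultinomialNB().fit(vectorizer.fit_transform(logs), labels)


def classify(log_line):
    """Return (label, confidence) for one log line."""
    # Reuse the fitted vocabulary: transform, not fit_transform.
    features = vectorizer.transform([log_line])
    probabilities = model.predict_proba(features)[0]
    best = probabilities.argmax()
    return str(model.classes_[best]), float(probabilities[best])


label, confidence = classify("GET /search?q=<script>alert(2)</script> HTTP/1.1")
print(label, round(confidence, 2))
```

The confidence is the highest class probability. With so little training data the probabilities are poorly calibrated, so any threshold applied to them should be chosen on real logs.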

ChatGPT Integration

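import openai

# Uses the legacy openai<1.0 interface; openai.ChatCompletion was removed in openai 1.0.
def consult_gpt4(log_line):
    """Ask GPT-4 to classify a single log line and return its reply as text."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Classify this log line as 'XSS', 'SQL Injection', 'No Attack', or another type of attack."},
            {"role": "user", "content": f"Log line: {log_line}"}
        ]
    )
    # The reply is free text, so strip surrounding whitespace before using it as a label.
    return response['choices'][0]['message']['content'].strip()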
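
A chat model returns free text, so a reply may carry extra words or different casing (for example "This looks like SQL injection."). A hypothetical post-processing helper such as the one below could map replies onto a fixed label set before they are stored as training labels. The label list mirrors the supported attack types in this README plus the 'No Attack' class from the system prompt; the helper name and the "Unknown" fallback are assumptions.

```python
# Labels mirror the "Supported Attack Types" list, plus the "No Attack" class
# named in the system prompt.
KNOWN_LABELS = [
    "SQL Injection",
    "Path Traversal",
    "Command Injection",
    "Remote File Inclusion",
    "LDAP Injection",
    "Code Injection",
    "No Attack",
    "XSS",
]


def normalize_label(reply, fallback="Unknown"):
    """Map a free-text model reply to the first known label it mentions.

    Matching is case-insensitive; replies that mention no known label
    fall back to `fallback` so they can be reviewed instead of trained on.
    """
    text = reply.strip().lower()
    for label in KNOWN_LABELS:
        if label.lower() in text:
            return label
    return fallback
```

Keeping the label set closed matters for the Naive Bayes model: two spellings of the same attack type would otherwise be learned as two separate classes.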

🚧 Challenges & Future Work

Current Challenges

⏱️ Latency: ChatGPT API calls introduce some delay
📊 Data Quality: Performance depends on training data quality
📈 Scalability: Managing growing training data efficiently

Future Improvements

Optimize ChatGPT interactions
Explore alternative ML models
Improve scalability
Enhanced pattern recognition
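
As one possible direction for the "Optimize ChatGPT interactions" item, repeated inputs could be answered from a cache instead of a new API call. The sketch below is an illustration under assumptions: `make_cached_classifier` and the whitespace normalization are hypothetical, and `classify_remote` stands in for whatever function performs the API call.

```python
from functools import lru_cache


def make_cached_classifier(classify_remote, maxsize=1024):
    """Wrap a remote classifier so identical inputs are only sent once."""

    @lru_cache(maxsize=maxsize)
    def _cached(normalized_line):
        return classify_remote(normalized_line)

    def classify(log_line):
        # Collapse whitespace so trivially different lines share a cache entry.
        return _cached(" ".join(log_line.split()))

    # Expose hit/miss statistics for monitoring.
    classify.cache_info = _cached.cache_info
    return classify
```

Because raw log lines carry timestamps, a practical cache key would probably need the timestamp stripped first; otherwise identical requests would never hit the cache.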

📄 License

This work is licensed under the Apache License 2.0.

✅ Use: Personal, educational, or commercial purposes
✅ Modify: Adapt and build upon the material
✅ Distribute: Share copies, keeping the license text and copyright notices intact

⚠️ Ethical Use Only: Intended for lawful purposes including educational research, penetration testing, and cybersecurity defense.

👤 Author

Daniel Dieser - Independent Robotics Researcher & AI Developer

GitHub: @daletoniris
Organizations: @initiasur, @NiperiaLab

🛡️ Protecting Web Applications with AI-Powered Defense

⭐ Star this repo if you find it useful
