daletoniris/ekoparty-ml-security

Ekoparty ML Security — Machine Learning for Threat Detection

Training materials and scripts from the AI Village at Ekoparty. Covers supervised, unsupervised, and active learning approaches to network threat detection using KNN, anomaly detection, and LLM-assisted labeling.

Overview

A progressive collection of ML scripts that teach threat detection from simple to advanced:

  1. Unsupervised — Detect anomalies without labels
  2. Supervised KNN — Classify with labeled training data
  3. Active Learning — Human + AI in the loop for uncertain samples
  4. Real-time Detection — Live network traffic classification
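
As a rough illustration of the first step (detecting anomalies without labels), the sketch below flags HTTP requests whose length is far from the median, using a modified z-score based on the median absolute deviation. This is only one simple technique and is not necessarily what unsupervised_simple.py does; the sample requests, the function names, and the 3.5 threshold are assumptions chosen for the example.

```python
from statistics import median


def robust_z_scores(values):
    """Modified z-score of each value, based on the median absolute deviation."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All typical values are identical; this sketch then reports no outliers.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(requests, threshold=3.5):
    """Return the requests whose length deviates strongly from the median."""
    scores = robust_z_scores(len(r) for r in requests)
    return [r for r, s in zip(requests, scores) if abs(s) > threshold]


# Hypothetical sample traffic: five ordinary requests and one oversized one.
sample_requests = [
    "GET /index.html",
    "GET /about.html",
    "GET /contact.html",
    "GET /products?id=7",
    "GET /login.html",
    "GET /search?q=" + "A" * 300 + "' OR 1=1--",
]

suspicious = flag_anomalies(sample_requests)
```

Only the oversized request ends up in `suspicious`. Request length is a deliberately crude feature; a real detector would combine several features.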

Scripts

| Script | Approach | Description |
| --- | --- | --- |
| unsupervised_simple.py | Unsupervised | Basic anomaly detection on HTTP requests |
| supervised_model.py | Supervised | Train a classifier on labeled traffic data |
| supervised_model_knn.py | Supervised KNN | K-Nearest Neighbors for traffic classification |
| analisis_trafico_knn.py | Supervised KNN | Network traffic analysis with KNN pipeline |
| analisis_secuencial_knn.py | Sequential KNN | Sequential analysis of traffic patterns |
| demo_aprendizaje_activo.py | Active Learning | LLM-assisted active learning demo |
| aprendizaje_activo_llm_venice.py | Active Learning | Active learning with Venice AI for labeling |
| aprendizaje_activo_venice_trafico.py | Active Learning | Traffic classification with LLM feedback loop |
| analisis_trafico_knn_activo_venice.py | Active + Real-time | Full pipeline: KNN + active learning + live capture |
| detector_escaneos_tiempo_real.py | Real-time | Live port scan detection with trained model |

Key Concepts

Active Learning with LLM

When the KNN model is uncertain about a sample (its confidence margin is low), it queries an LLM for a label. That label is added to the training data and the KNN model is retrained, so accuracy is expected to improve as more uncertain samples are resolved.

Traffic → KNN → Confident? → YES → Classify
                            → NO  → Query LLM → Label → Retrain KNN
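
The flow above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: the confidence margin is taken to be the gap between the two highest class probabilities, the 0.5 threshold is arbitrary, and `ask_oracle` is a hypothetical rule-based stand-in for the LLM call so the example runs offline.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def confidence_margin(model, x):
    """Gap between the two highest class probabilities for one sample."""
    proba = np.sort(model.predict_proba(x.reshape(1, -1))[0])
    return proba[-1] - proba[-2] if proba.size > 1 else 1.0


def ask_oracle(x):
    """Hypothetical stand-in for the LLM query; labels by a fixed rule."""
    return int(x.sum() >= 1.0)


def classify_with_feedback(model, X_train, y_train, x, threshold=0.5):
    """Classify x; on a low margin, fetch a label and retrain the model.

    Returns (label, X_train, y_train, queried).
    """
    if confidence_margin(model, x) >= threshold:
        label = int(model.predict(x.reshape(1, -1))[0])
        return label, X_train, y_train, False
    label = ask_oracle(x)
    X_train = np.vstack([X_train, x])
    y_train = np.append(y_train, label)
    model.fit(X_train, y_train)  # for KNN, retraining means re-indexing the data
    return label, X_train, y_train, True


# Toy two-feature data set: class 0 near the origin, class 1 near (1, 1).
X_train = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
                    [1.0, 1.0], [0.9, 1.0], [1.0, 0.8]])
y_train = np.array([0, 0, 0, 1, 1, 1])
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
```

A sample deep inside one cluster, such as `np.array([0.05, 0.05])`, is classified directly. A sample between the clusters, such as `np.array([0.5, 0.5])`, has neighbors from both classes, so it is sent to the oracle and appended to the training set.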

Feature Extraction

Network packets are converted to feature vectors built from protocol type, port numbers, packet length, TTL, TCP flags, and timing patterns.
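
A minimal sketch of that conversion is shown below. It deliberately avoids scapy so it runs without packet capture: `PacketRecord` is a hypothetical, already-parsed packet, and the protocol encoding, the choice of four TCP flags, and the use of inter-arrival time as the timing feature are all assumptions made for illustration.

```python
from dataclasses import dataclass

PROTOCOLS = {"tcp": 0, "udp": 1, "icmp": 2}  # assumed numeric encoding
TCP_FLAGS = ("SYN", "ACK", "FIN", "RST")     # assumed subset of flags


@dataclass
class PacketRecord:
    """Hypothetical parsed packet; not a scapy type."""
    timestamp: float
    protocol: str
    src_port: int
    dst_port: int
    length: int
    ttl: int
    flags: frozenset = frozenset()


def to_features(pkt, prev_timestamp=None):
    """Turn one packet into a flat numeric feature vector.

    Layout: protocol, src_port, dst_port, length, ttl,
    one 0/1 entry per flag in TCP_FLAGS, seconds since the previous packet.
    """
    delta = 0.0 if prev_timestamp is None else pkt.timestamp - prev_timestamp
    flag_bits = [1.0 if name in pkt.flags else 0.0 for name in TCP_FLAGS]
    return [
        float(PROTOCOLS.get(pkt.protocol, -1)),  # -1 marks an unknown protocol
        float(pkt.src_port),
        float(pkt.dst_port),
        float(pkt.length),
        float(pkt.ttl),
        *flag_bits,
        delta,
    ]


# Example: a TCP SYN to port 22 arriving 0.5 s after the previous packet.
syn = PacketRecord(10.5, "tcp", 44321, 22, 60, 64, frozenset({"SYN"}))
vector = to_features(syn, prev_timestamp=10.0)
```

Since KNN is distance-based, features on very different scales (ports versus flag bits) would normally be standardized before training, for example with scikit-learn's `StandardScaler`.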

Setup

pip install scikit-learn numpy pandas joblib scapy

# For active learning with LLM (optional)
# Create .env.venice with your API credentials

Usage

# Start with unsupervised anomaly detection
python unsupervised_simple.py

# Train supervised KNN model
python supervised_model_knn.py

# Run active learning demo
python demo_aprendizaje_activo.py

# Real-time scan detection (requires root for packet capture)
sudo python detector_escaneos_tiempo_real.py

Context

Developed for the AI Village at Ekoparty, the largest security conference in Latin America. These scripts were used in hands-on workshops teaching ML-based threat detection.

License

MIT
