Note
This project is a proof-of-concept stub that is still under construction. Use it with care: the code is untested and things can break.
A proof-of-concept network intrusion detection system using Kafka for streaming network logs between cloud endpoints and a centralized monitoring server, with machine learning-based anomaly detection.
- Client Endpoints: Capture network logs and stream to Kafka
- Kafka: Message broker for log streaming
- IDS Server: Consumes logs and analyzes using ML models (One-Class SVM & Isolation Forest)
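As a rough sketch of the server-side consumer described above, the snippet below reads one client's topic with pykafka and decodes each record. The JSON payload shape and field names (`client_id`, `bytes_sent`, `bytes_recv`) are assumptions, not the actual `ids/server.py` implementation:

```python
import json


def parse_log_message(raw: bytes) -> dict:
    """Decode one JSON-encoded network-log record from a Kafka message.

    Field names are assumed; the real client payload may differ."""
    record = json.loads(raw.decode("utf-8"))
    return {
        "client_id": record["client_id"],
        "bytes_sent": record["bytes_sent"],
        "bytes_recv": record["bytes_recv"],
    }


def consume_logs(kafka_hosts: str = "localhost:9092",
                 topic_name: str = "client-1") -> None:
    """Consume one client's dedicated topic (requires a running broker)."""
    from pykafka import KafkaClient  # named in the prerequisites
    client = KafkaClient(hosts=kafka_hosts)
    consumer = client.topics[topic_name.encode()].get_simple_consumer(
        consumer_timeout_ms=5000  # stop iterating if no new messages arrive
    )
    for message in consumer:
        if message is not None:
            print(parse_log_message(message.value))
```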
- Python 3.8+
- Apache Kafka
- Required Python packages:
  pykafka, scikit-learn, numpy, pandas
```bash
chmod +x server_setup.sh
./server_setup.sh
```
This single script will:
- Download and configure Kafka in KRaft mode (no Zookeeper)
- Initialize SQLite database for persistent storage
- Install Python dependencies
- Start the IDS server with web dashboard at http://localhost:8501
```bash
chmod +x client/setup_client.sh
./client/setup_client.sh
```
The client will:
- Register with the IDS server
- Get assigned a dedicated Kafka topic
- Stream network logs every 15 seconds (delta values)
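The client loop described above can be sketched as follows. The use of `psutil` for counter capture and the payload field names are assumptions for illustration, not the actual `network_logger.py` implementation:

```python
import json
import time


def snapshot() -> dict:
    """Read cumulative network counters (psutil assumed as the source)."""
    import psutil
    io = psutil.net_io_counters()
    return {"bytes_sent": io.bytes_sent, "bytes_recv": io.bytes_recv}


def delta(prev: dict, curr: dict) -> dict:
    """Delta values: change since the previous sample, not cumulative totals."""
    return {k: curr[k] - prev[k] for k in prev}


def stream(kafka_hosts: str = "localhost:9092", topic_name: str = "client-1",
           interval: int = 15) -> None:
    """Publish a delta sample to the client's dedicated topic every interval."""
    from pykafka import KafkaClient
    topic = KafkaClient(hosts=kafka_hosts).topics[topic_name.encode()]
    prev = snapshot()
    with topic.get_sync_producer() as producer:
        while True:
            time.sleep(interval)
            curr = snapshot()
            producer.produce(json.dumps(delta(prev, curr)).encode())
            prev = curr
```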
```bash
chmod +x stop_kafka.sh
./stop_kafka.sh
```
If you prefer manual setup:
```bash
pip install -r requirements.txt
```
```bash
# Download and extract Kafka 3.6.1
# Configure KRaft mode
# Start Kafka server
# See server_setup.sh for detailed steps
```
```bash
python3 -c "
import sqlite3
conn = sqlite3.connect('ids_data.db')
# See server_setup.sh for complete schema
"
```
```bash
streamlit run ids/server.py
```
Access the dashboard at http://localhost:8501 and click "Start" to begin monitoring.
```bash
python client/network_logger.py --kafka-host localhost:9092 --interval 15
```
The client will register with the server and stream delta values every 15 seconds.
```
.
├── client/
│   ├── network_logger.py    # Network log capture and Kafka producer
│   └── setup_client.sh      # Automated client setup script
├── ids/
│   ├── server.py            # Unified IDS server with web dashboard
│   ├── db.py                # Database operations module
│   ├── models.py            # ML model implementations
│   ├── ids_server.py        # CLI-only version (legacy)
│   └── dashboard.py         # Dashboard-only version (legacy)
├── server_setup.sh          # Complete server setup (Kafka + DB + Dashboard)
├── stop_kafka.sh            # Stop Kafka server
├── ids_data.db              # SQLite database (created by setup)
├── requirements.txt
└── README.md
```
The SQLite database (ids_data.db) stores:
- clients: Registered clients and their topics
- network_logs: All analyzed network traffic logs with anomaly flags (only logs that have passed through the ML model)
- anomalies: Quick access table for detected anomalies
- model_status: ML model training status per client
- client_statistics: Daily aggregated statistics per client
- Only analyzed traffic is stored (after ML model processes it)
- Automatic daily statistics aggregation
- Indexed queries for fast retrieval
- Context manager for safe database operations
- Singleton pattern for database instance
- Client Registration: Clients automatically register and get dedicated Kafka topics
- Delta Monitoring: Tracks network traffic changes every 15 seconds (not cumulative)
- Real-time Dashboard: Live monitoring of all registered clients
- Anomaly Detection: ML-based detection (One-Class SVM) per client
- Historical View: Track previous anomalies with detailed metrics
- Traffic Statistics: Interface-wise traffic distribution and anomaly rates
- Multi-Client Support: Monitor multiple endpoints simultaneously
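The per-client One-Class SVM detection could look roughly like this with scikit-learn. The synthetic stand-in data and the two delta features (bytes sent/received per interval) are illustrative assumptions, not the project's actual feature set:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Stand-in for a window of "normal" 15-second traffic deltas for one client:
# columns are (bytes_sent, bytes_recv) per interval.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[1000.0, 2000.0], scale=[100.0, 200.0], size=(200, 2))

# nu bounds the fraction of training points treated as outliers.
model = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(normal)

# predict() returns +1 for inliers and -1 for anomalies.
samples = np.array([[1020.0, 1980.0],      # typical traffic delta
                    [50000.0, 90000.0]])   # sudden burst
labels = model.predict(samples)
print(labels)
```

In a real deployment one model would be trained per registered client, since "normal" traffic differs between endpoints.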
- Implement IPTABLES integration for automatic threat blocking
- Add real-time alerting system
- Implement model training pipeline with labeled datasets
- Add configuration file support
- Implement logging and monitoring dashboard
- Add unit tests and integration tests