An intelligent system for discovering IPv6 web servers using Machine Learning and Metaheuristic algorithms.
The IPv6 address space holds 2^128 addresses, so exhaustive scanning is infeasible. This project uses ML and optimization algorithms to learn address allocation patterns and predict which addresses are likely to be active.
ipv6-crawler/
├── config.yaml                      # Configuration
├── requirements.txt                 # Dependencies
├── main.py                          # Main entry point
├── src/
│   ├── __init__.py
│   ├── seed_collector.py            # Initial address collection
│   ├── feature_extractor.py         # Feature extraction
│   ├── ml_model.py                  # Machine learning model
│   ├── address_generator.py         # Address generation (classic)
│   ├── metaheuristic_generator.py   # Metaheuristic algorithms
│   ├── prober.py                    # Network scanner
│   ├── fingerprinter.py             # Infrastructure identification
│   ├── feedback_loop.py             # Feedback and model improvement
│   └── database.py                  # Data management
├── data/
│   ├── seeds/                       # Initial seed addresses
│   ├── models/                      # Saved models
│   └── results/                     # Results
└── logs/                            # Logs
- Prefix-based: Generate addresses in known active prefixes
- Mutation-based: Mutate active addresses (increment, decrement, nearby)
- Pattern Learning: Learn from Interface ID patterns
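The first two strategies above can be sketched with the standard `ipaddress` module. This is a minimal illustration, not the code in `address_generator.py`; the function names `prefix_based` and `mutation_based` and the `radius` parameter are hypothetical.

```python
import ipaddress
import random


def prefix_based(prefix: str, count: int, rng: random.Random) -> list[ipaddress.IPv6Address]:
    """Sample `count` random addresses inside a known-active prefix."""
    net = ipaddress.IPv6Network(prefix)
    base = int(net.network_address)
    return [
        ipaddress.IPv6Address(base + rng.randrange(net.num_addresses))
        for _ in range(count)
    ]


def mutation_based(seed: str, radius: int = 2) -> list[ipaddress.IPv6Address]:
    """Return the numeric neighbours of an active address within `radius`."""
    value = int(ipaddress.IPv6Address(seed))
    neighbours = []
    for delta in range(-radius, radius + 1):
        candidate = value + delta
        # Skip the seed itself and anything outside the 128-bit range.
        if delta != 0 and 0 <= candidate < 2 ** 128:
            neighbours.append(ipaddress.IPv6Address(candidate))
    return neighbours


rng = random.Random(0)
print(prefix_based("2001:db8:abcd::/64", 3, rng))
print(mutation_based("2001:db8::10", radius=1))
```

The examples use the `2001:db8::/32` documentation prefix, so nothing here points at real hosts.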
| Algorithm | Description | Advantage |
|---|---|---|
| Genetic Algorithm (GA) | Crossover and mutation of addresses | Combinatorial search space exploration |
| Ant Colony (ACO) | Pheromone-based path finding | Learning from successful paths |
| Cuckoo Search (CS) | Lévy flight for large jumps | Exploration/exploitation balance |
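As an illustration of the GA row, the sketch below treats each address as a 128-bit integer and applies single-point crossover and nibble mutation. It is an assumption about how such operators could look, not the implementation in `metaheuristic_generator.py`; the names `crossover` and `mutate` are hypothetical.

```python
import ipaddress
import random

FULL_MASK = (1 << 128) - 1


def crossover(a: int, b: int, point: int) -> int:
    """Single-point crossover: top `point` bits from `a`, the rest from `b`."""
    low_mask = (1 << (128 - point)) - 1
    high_mask = FULL_MASK ^ low_mask
    return (a & high_mask) | (b & low_mask)


def mutate(addr: int, rng: random.Random) -> int:
    """Replace one random nibble of the interface ID (the low 64 bits)."""
    shift = 4 * rng.randrange(16)           # pick one of the 16 low nibbles
    cleared = addr & (FULL_MASK ^ (0xF << shift))
    return cleared | (rng.randrange(16) << shift)


parent_a = int(ipaddress.IPv6Address("2001:db8:1::1"))
parent_b = int(ipaddress.IPv6Address("2001:db8:2::abcd"))

# Keep the /64 prefix of parent_a and the interface ID of parent_b.
child = crossover(parent_a, parent_b, point=64)
print(ipaddress.IPv6Address(child))
print(ipaddress.IPv6Address(mutate(child, random.Random(0))))
```

Cutting at bit 64 keeps the network prefix of one parent intact, so offspring stay inside a prefix already known to be routed; other cut points would trade that guarantee for wider exploration.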
Intelligent combination of all methods with dynamic resource allocation based on each algorithm's success rate.
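One way to read "dynamic resource allocation" is to split each round's candidate budget in proportion to every algorithm's observed hit rate. The sketch below assumes that reading; the function `allocate_budget`, the smoothing, and the `floor` share reserved for exploration are illustrative choices, not the project's documented behaviour.

```python
def allocate_budget(
    stats: dict[str, tuple[int, int]], budget: int, floor: float = 0.1
) -> dict[str, int]:
    """Split `budget` candidates across algorithms by success rate.

    `stats` maps an algorithm name to (hits, attempts). A `floor` fraction of
    the budget is shared equally so no algorithm is starved entirely.
    """
    # Laplace smoothing keeps untested algorithms from getting a zero rate.
    rates = {name: (hits + 1) / (tries + 2) for name, (hits, tries) in stats.items()}
    total = sum(rates.values())
    n = len(rates)
    weights = {
        name: floor / n + (1 - floor) * rate / total for name, rate in rates.items()
    }
    alloc = {name: int(budget * weight) for name, weight in weights.items()}
    # Hand the rounding remainder to the best-performing algorithm.
    best = max(rates, key=rates.get)
    alloc[best] += budget - sum(alloc.values())
    return alloc


round_stats = {"ga": (30, 100), "aco": (10, 100), "cs": (5, 100)}
print(allocate_budget(round_stats, budget=1000))
```

The equal-share floor is a guard against locking onto one algorithm too early: a method that performs poorly on the first prefixes may still do well on later ones.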
- Python 3.10 or higher
- pip (Python package manager)
- Git
1. Clone the repository

git clone https://github.com/AmirDanesh/ipv6-intelligent-crawler.git
cd ipv6-intelligent-crawler

2. Create and activate virtual environment

Windows (PowerShell):
python -m venv venv
.\venv\Scripts\Activate.ps1

Windows (CMD):
python -m venv venv
venv\Scripts\activate.bat

Linux/macOS:
python3 -m venv venv
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Run the crawler

# Full crawling pipeline
python main.py
# With custom config
python main.py --config custom_config.yaml
# Quick test run
python main.py --quick-test

| Option | Description |
|---|---|
| --config | Path to configuration file (default: config.yaml) |
| --quick-test | Run a quick test with minimal addresses |
| --collect-only | Only collect seed addresses |
| --probe-only | Only probe existing addresses |
| --help | Show all available options |

python -c "from src.ml_model import IPv6ActivePredictor; print('✅ Installation successful!')"

- Seed Collection: Gather initial IPv6 addresses from various sources
- Feature Extraction: Convert addresses to feature vectors
- Model Training: Learn addressing patterns (Ensemble: RF + XGBoost + GB)
- Address Generation: Predict using GA + ACO + Cuckoo Search
- ML Filtering: Select best candidates using prediction model
- Probing: Verify address activity
- Fingerprinting: Identify server characteristics
- Feedback Loop: Update algorithm weights based on success rates
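The feature extraction step in the pipeline above can be pictured as follows. This is a minimal sketch assuming a nibble-level encoding plus two summary statistics; the actual features computed in `feature_extractor.py` are not documented here, and `extract_features` is a hypothetical name.

```python
import ipaddress
import math
from collections import Counter


def extract_features(address: str) -> list[float]:
    """Encode an IPv6 address as 32 nibble values plus two IID statistics."""
    # The exploded form always has 32 hex digits once the colons are removed.
    hex_digits = ipaddress.IPv6Address(address).exploded.replace(":", "")
    nibbles = [int(ch, 16) for ch in hex_digits]

    iid = nibbles[16:]                      # interface ID = low 64 bits
    counts = Counter(iid)
    # Shannon entropy of the IID nibbles: low for ::1, high for random IIDs.
    entropy = sum(-(c / 16) * math.log2(c / 16) for c in counts.values())
    zero_fraction = iid.count(0) / 16

    return [float(n) for n in nibbles] + [entropy, zero_fraction]


features = extract_features("2001:db8::1")
print(len(features), features[-2:])
```

A vector like this can be fed to tree-based models directly; low entropy and a high zero fraction tend to indicate manually assigned, low-byte addresses, which is the kind of pattern the training step is meant to pick up.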
Edit config.yaml to customize:
- Scanning parameters
- ML model settings
- Metaheuristic algorithm parameters
- Probe timeouts and concurrency
- Ensemble ML Model: Random Forest + XGBoost + Gradient Boosting
- Adaptive Algorithm Selection: Automatically favors better-performing algorithms
- Closed-loop Learning: Continuously improves from probe results
- Efficient Probing: Concurrent scanning with rate limiting
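Concurrent probing with rate limiting can be approximated with `asyncio`, using a semaphore to cap in-flight connections and a short sleep to pace launches. The sketch below is an assumption about the general shape, not the code in `prober.py`; `probe_all`, `tcp_probe`, and their default limits are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def tcp_probe(addr: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to addr:port succeeds within timeout."""
    try:
        _, writer = await asyncio.wait_for(asyncio.open_connection(addr, port), timeout)
        writer.close()
        await writer.wait_closed()
        return True
    except (OSError, asyncio.TimeoutError):
        return False


async def probe_all(
    addresses: Iterable[str],
    probe: Callable[[str], Awaitable[bool]] = tcp_probe,
    max_concurrent: int = 100,
    rate_per_sec: float = 500.0,
) -> dict[str, bool]:
    """Probe every address, capping concurrency and the launch rate."""
    semaphore = asyncio.Semaphore(max_concurrent)
    interval = 1.0 / rate_per_sec

    async def guarded(addr: str) -> tuple[str, bool]:
        async with semaphore:               # at most max_concurrent in flight
            return addr, await probe(addr)

    tasks = []
    for addr in addresses:
        tasks.append(asyncio.create_task(guarded(addr)))
        await asyncio.sleep(interval)       # simple pacing between launches
    return dict(await asyncio.gather(*tasks))
```

A call such as `asyncio.run(probe_all(candidates))` returns a mapping from address to reachability. Pacing by sleeping between launches is the simplest form of rate limiting; a token bucket would handle bursts more precisely. Only probe networks you are authorized to scan.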
MIT License