Optifiner is a self-evolving code framework that automatically improves codebases through multi-agent AI-driven optimization. It spawns parallel AI agents that propose and test code improvements, keeping only changes that measurably improve performance against benchmark metrics.
- Multi-Agent Evolution: Deploy 10+ parallel AI agents that autonomously improve your code
- Benchmark-Driven: All improvements are validated against your custom evaluation metrics
- Git-Integrated: Every improvement is tracked, version-controlled, and reversible
- Real-Time Visualization: Monitor evolution progress through an interactive web dashboard
- Multi-Model Support: Works with Claude, GPT-4, Gemini, and other LLMs
- Generational Optimization: Runs multiple generations with automatic convergence detection
- Production-Ready: Docker support, scalable architecture, comprehensive observability
- Performance Optimization: Automatically refactor slow code for better throughput/latency
- Algorithm Improvement: Evolve sorting, pathfinding, and scheduling algorithms
- Game AI Enhancement: Improve NPC behavior and game mechanics
- Competitive Programming: Auto-optimize solutions for algorithmic contests
- ML Model Tuning: Refine hyperparameters and training code
- Python 3.10+
- Node.js 18+
- Docker & Docker Compose (for full stack)
- API keys for at least one LLM provider:
  - Anthropic (Claude)
  - Google (Gemini)
  - OpenAI (GPT)
```bash
# Clone the repository
git clone https://github.com/yourusername/optifiner.git
cd optifiner

# Install worker dependencies
cd services/worker
pip install -r requirements.txt

# Install web UI dependencies
cd ../../apps/web
npm install
```

Your codebase needs a benchmark script that outputs JSON with your metric:
```python
# optifiner_benchmark.py
import json
import subprocess
import time

def evaluate():
    """Run the benchmark and return a score."""
    start = time.time()
    result = subprocess.run(['python', 'main.py'], capture_output=True)
    elapsed = time.time() - start
    if result.returncode == 0:
        return 100.0 / (elapsed + 1)  # Faster = higher score
    return 0.0

if __name__ == '__main__':
    score = evaluate()
    # Output JSON (supports both higher-is-better and lower-is-better metrics)
    print(json.dumps({
        "score": score,
        "metric_name": "throughput",  # e.g. "FPS", "throughput", "cycles", "latency_ms"
        "test_gate": True,            # Set False if tests fail
        "higher_is_better": True,     # True for FPS/throughput, False for cycles/latency
    }))
```

Metric direction: The system auto-detects whether higher or lower is better based on metric names like "cycles", "latency", "ms" (lower is better) vs "FPS", "throughput" (higher is better). You can also explicitly set `"higher_is_better": false` for lower-is-better metrics.
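For a lower-is-better metric, the same JSON contract applies with the direction flag flipped. A hypothetical latency evaluator might look like this (`run_benchmark` is a stand-in for your real workload, not part of Optifiner):

```python
import json
import time

def run_benchmark():
    """Placeholder workload; replace with your real benchmark command."""
    start = time.time()
    sum(i * i for i in range(100_000))  # stand-in computation
    return (time.time() - start) * 1000  # elapsed milliseconds

if __name__ == '__main__':
    latency_ms = run_benchmark()
    print(json.dumps({
        "score": latency_ms,
        "metric_name": "latency_ms",
        "test_gate": True,
        "higher_is_better": False,  # explicit: lower latency wins
    }))
```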
```bash
cd services/worker

# Single generation with 5 agents
python cli.py /path/to/your/repo \
  --evaluator /path/to/evaluate.py \
  --agents 5 \
  --generations 1 \
  --model-provider google \
  --model-name gemini-2.5-flash

# Multiple generations with parallel execution
python cli.py /path/to/your/repo \
  --evaluator /path/to/evaluate.py \
  --agents 10 \
  --parallel 4 \
  --generations 5 \
  --output results.json
```
```bash
# Results are committed to git
git log --oneline

# View detailed evolution metrics
cat results.json | jq '.'
```
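For scripted checks, the results file can also be summarized from Python. A minimal sketch; the `"baseline_score"`/`"generations"`/`"best_score"` schema here is an assumption for illustration — inspect your own `results.json` for the actual layout:

```python
import json

def summarize(results):
    """Return a per-generation summary string from a results dict.

    NOTE: the field names used here are hypothetical, not Optifiner's
    documented output format.
    """
    lines = []
    baseline = results.get("baseline_score")
    if baseline is not None:
        lines.append(f"baseline: {baseline:.3f}")
    for i, gen in enumerate(results.get("generations", []), start=1):
        lines.append(f"gen {i}: best {gen['best_score']:.3f}")
    return "\n".join(lines)

# usage: print(summarize(json.load(open("results.json"))))
```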
```bash
cd apps/web

# Development
npm run dev       # Runs on http://localhost:5173

# Production build
npm run build
npm run preview
```

Optifiner consists of three main components:
```
┌──────────────────────────────────────────────────────────────┐
│                     Web Dashboard (React)                    │
│         Real-time visualization of evolution progress        │
└──────────────────────────┬───────────────────────────────────┘
                           │ WebSocket
                           ▼
┌──────────────────────────────────────────────────────────────┐
│                    API Backend (FastAPI)                     │
│               Project management & coordination              │
└──────────────────────────┬───────────────────────────────────┘
                           │
        ┌──────────────────┴──────────────────┐
        ▼                                     ▼
 ┌─────────────┐                     ┌─────────────────┐
 │    Redis    │                     │   PostgreSQL    │
 │  Task Queue │                     │   History DB    │
 └─────────────┘                     └─────────────────┘
        ▲
        │ Celery Tasks
        │
┌───────┴──────────────────────────────────────────────────────┐
│                 LangGraph Evolution Worker                   │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │   Agent Pool (Analyzer, Refactorer, Optimizer, etc.)   │  │
│  │                                                        │  │
│  │   Each Agent:                                          │  │
│  │    • Analyzes code with LLM                            │  │
│  │    • Proposes improvements                             │  │
│  │    • Edits files in sandbox workspace                  │  │
│  │    • Runs evaluator benchmarks                         │  │
│  │    • Commits improvements if score improves            │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  Tools: read_file, write_file, edit_file, grep, eval...      │
└──────────────────────────────────────────────────────────────┘
```
- Getting Started - Detailed setup and configuration guide
- Architecture - System design, component details, and workflows
- Agent Types - Description of each agent type and its capabilities
- API Reference - CLI commands, endpoints, and configuration options
- Examples - Real-world example projects and use cases
- Deployment - Production deployment with Docker Compose
```bash
# LLM Provider (google, anthropic, or openai)
MODEL_PROVIDER=google
MODEL_NAME=gemini-2.5-flash
GOOGLE_API_KEY=your-key-here

# Evolution parameters
AGENTS=10              # Number of parallel agents
GENERATIONS=5          # Number of evolution generations
MAX_ITERATIONS=15      # Max tool calls per agent
PARALLEL=4             # Parallel execution workers

# Workspace
WORKSPACE_ROOT=/tmp/optifiner-workspace

# Database (for full stack)
DATABASE_URL=postgresql://user:pass@localhost/optifiner
REDIS_URL=redis://localhost:6379
```

| Provider | Models |
|---|---|
| Anthropic | claude-opus-4-20250514, claude-sonnet-4-5-20250514, claude-haiku-4-20250514 |
| Google | gemini-2.5-flash, gemini-3-flash-preview |
| OpenAI | gpt-4o, gpt-4-turbo |
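Worker code might read these environment variables along the following lines. This is a hedged sketch: the variable names match the `.env` example, but the defaults and the `load_config` helper itself are illustrative, not Optifiner's actual implementation:

```python
import os

def load_config(env=os.environ):
    """Read evolution settings from the environment with illustrative defaults."""
    return {
        "model_provider": env.get("MODEL_PROVIDER", "google"),
        "model_name": env.get("MODEL_NAME", "gemini-2.5-flash"),
        "agents": int(env.get("AGENTS", "10")),
        "generations": int(env.get("GENERATIONS", "5")),
        "max_iterations": int(env.get("MAX_ITERATIONS", "15")),
        "parallel": int(env.get("PARALLEL", "4")),
        "workspace_root": env.get("WORKSPACE_ROOT", "/tmp/optifiner-workspace"),
    }
```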
```
1. Initialize Evolution
   ├─ Create workspace (isolated copy of repo)
   └─ Get baseline score from evaluator

2. Generation 1 (10 agents in parallel)
   ├─ Agent 1 (analyzer): Identifies bottlenecks
   │    └─ Proposes refactoring → Tests → Score improves! ✓
   ├─ Agent 2 (optimizer): Tweaks parameters
   │    └─ Proposes changes → Tests → No improvement ✗
   ├─ Agent 3 (feature): Adds caching
   │    └─ Proposes changes → Tests → Score improves! ✓
   └─ ...more agents...

3. Generation 2
   ├─ Builds on successful changes from Gen 1
   └─ Proposes additional improvements

4. Results
   ├─ All improvements committed to git
   ├─ Fitness curve plotted
   └─ Summary report generated
```
```
optifiner/
├── apps/
│   ├── web/          # React frontend UI
│   └── api/          # FastAPI backend
├── services/
│   └── worker/       # LangGraph evolution agent
├── packages/
│   └── shared/       # Shared utilities
├── examples/         # Example projects
├── infra/            # Docker & deployment
├── docs/             # Documentation
└── scripts/          # Utility scripts
```
```bash
# Install all dependencies
npm install -ws

# Run linter
npm run lint -ws

# Run tests
npm run test -ws

# Build everything
npm run build -ws
```

```bash
# Build all images
docker-compose build

# Start all services
docker-compose up

# View logs
docker-compose logs -f worker
```

We welcome contributions! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Commit your changes (`git commit -am 'Add your feature'`)
- Push to the branch (`git push origin feature/your-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with LangGraph for agent orchestration
- Powered by leading LLM providers: Anthropic, Google, and OpenAI
- UI inspired by modern DevOps dashboards
- Issues: Report bugs on GitHub Issues
- Discussions: Join our GitHub Discussions
- Documentation: See the docs folder for detailed guides
- ✅ Core evolution engine working
- ✅ Multi-agent orchestration with LangGraph
- ✅ React web dashboard
- 🚧 Full API backend (in progress)
- 🚧 Distributed task queue (in progress)
- 🚧 Production deployment guide

Start evolving your code today! 🧬