A multi-agent code optimization system built with LangGraph that analyzes and optimizes code across multiple languages using AI agents.
- Multi-Phase Optimization: Two-phase architecture with parallel summarization and sequential optimization
- Provider Agnostic: Support for Ollama (local), OpenAI, Anthropic, Gemini, and custom providers with easy switching
- Multi-Language: Optimize Python, JavaScript, TypeScript, Java, and more
- Extensible Architecture: Clean base classes for easy agent and provider development
- Parallel Processing: LangGraph-based workflow with parallel summarization agents
- Execution Tracking: Comprehensive metrics and artifact management with RunManager
- Beautiful Logging: Beautilog integration for simultaneous terminal and file logging
- Comprehensive Analysis:
- Performance optimization (algorithmic efficiency, memory usage, execution time)
- Code quality improvements (readability, maintainability, best practices)
- Security analysis (vulnerabilities, input validation, secure coding)
Three specialized agents run in parallel to analyze different aspects of code:
- Environment Summary Agent - Analyzes dependencies, imports, and environment setup
- Behavior Summary Agent - Understands code behavior, logic flow, and patterns
- Component Summary Agent - Identifies structure, functions, classes, and components
- Static Analysis Tools - Collects hotspots, client usage, and dependency signals
- Analyzer Agent - Reviews summaries + static signals to identify optimization opportunities
- Optimization Agent - Applies optimizations based on analysis and generates improved code
- Code Correctness Agent - Verifies that applied changes preserve intended behavior
┌─────────────────────────────────────────────┐
│ Input Code │
└────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ PHASE 1: SUMMARIZATION (Parallel) │
├─────────────────────────────────────────────┤
│ Environment Summary Agent │
│ Behavior Summary Agent │
│ Component Summary Agent │
│ (run simultaneously) │
└────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ Combine Summaries │
└────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ PHASE 2: STATIC ANALYSIS │
├─────────────────────────────────────────────┤
│ Static Analysis Tools │
└────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ PHASE 3: OPTIMIZATION (Sequential) │
├─────────────────────────────────────────────┤
│ Analyzer Agent │
│ ↓ │
│ Optimization Agent │
│ ↓ │
│ Update Repository │
└────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ PHASE 4: CORRECTNESS CHECK │
├─────────────────────────────────────────────┤
│ Code Correctness Agent │
└────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ Optimized Code + Reports + Artifacts │
└─────────────────────────────────────────────┘
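The phase structure above can be sketched in plain Python. This is an illustrative stub, not the project's LangGraph implementation: the summarizer functions and the `run_pipeline` name are placeholders that only demonstrate the parallel fan-out followed by the sequential phases.

```python
# Minimal sketch of the workflow: Phase 1 runs in parallel,
# the remaining phases run sequentially. All bodies are stubs.
from concurrent.futures import ThreadPoolExecutor

def environment_summary(code: str) -> str:
    return f"env summary ({len(code)} chars)"

def behavior_summary(code: str) -> str:
    return f"behavior summary ({len(code)} chars)"

def component_summary(code: str) -> str:
    return f"component summary ({len(code)} chars)"

def run_pipeline(code: str) -> dict:
    # Phase 1: three summarizers run simultaneously
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(fn, code) for fn in
                   (environment_summary, behavior_summary, component_summary)]
        summaries = [f.result() for f in futures]
    combined = "\n".join(summaries)           # Combine Summaries
    analysis = f"analysis of: {combined}"     # Phases 2-3: static analysis + analyzer
    optimized = f"optimized via: {analysis}"  # Optimization Agent
    verified = True                           # Phase 4: correctness check (stub)
    return {"summaries": summaries, "optimized": optimized, "verified": verified}

result = run_pipeline("def f(): pass")
```

Threads are enough here because each summarizer is I/O-bound (an LLM call); the real system expresses the same fan-out/fan-in as a LangGraph graph.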
agentic-code-optimization/
├── agents/ # Agent framework
│ ├── base.py # BaseAgent abstract class
│ ├── __init__.py
│ ├── checkers/ # Validation/checker agents
│ │ └── code_correctness/ # Correctness check agent
│ └── summarizers/ # Specialized summarizer agents
│ ├── environment.py # Environment Summary Agent
│ ├── behavior.py # Behavior Summary Agent
│ ├── component.py # Component Summary Agent
│ └── __init__.py
├── providers/ # LLM provider implementations
│ ├── base.py # BaseProvider abstract class
│ ├── registry.py # ProviderRegistry (factory pattern)
│ ├── ollama.py # Ollama local provider
│ ├── openai.py # OpenAI provider
│ ├── anthropic.py # Anthropic Claude provider
│ └── __init__.py
├── config/ # Configuration system
│ ├── base.py # SubSectionParser ABC
│ ├── parser.py # ConfigParser singleton
│ ├── providers.py # Provider configurations
│ └── __init__.py
├── tools/ # Code analysis tools
│ ├── environment.py # Dependency/environment analysis
│ ├── behavior.py # Logic/pattern analysis
│ ├── component.py # Structure analysis
│ └── __init__.py
├── utils/ # Utilities (NEW)
│ ├── metrics.py # ExecutionMetrics, Trace, ObservabilityManager
│ ├── runs.py # RunManager for artifact management
│ └── __init__.py
├── evaluate.py # Main evaluation script
├── evaluate_code_correctles.py # Full workflow + correctness check
├── config.ini # Configuration file
├── .env.example # Environment variables template
├── CLAUDE.md # Claude Code guidance
├── README.md
├── requirements.txt
└── .gitignore
- Python 3.11+ (required for built-in tomllib support)
# Clone repository
git clone <repository-url>
cd agentic-code-optimization
# Create virtual environment (Python 3.11+ required)
python3.11 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
- Copy environment file:
cp .env.example .env
- Configure API Keys - Edit .env with your credentials:
# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_ORGANIZATION_ID=org-...
# Anthropic Claude
ANTHROPIC_API_KEY=sk-ant-...
# Gemini
GEMINI_API_KEY=...
# Ollama (local, no API key needed)
OLLAMA_BASE_URL=http://localhost:11434
- Configure Provider - Edit config.ini:
[agents]
max_iterations = 30
default_provider = ollama
temperature = 0.7
[ollama]
base_url = http://localhost:11434
model = devstral-2:123b
[openai]
api_key = ${OPENAI_API_KEY}
model = gpt-5
[anthropic]
api_key = ${ANTHROPIC_API_KEY}
model = claude-3-5-sonnet-20241022
[gemini]
api_key = ${GEMINI_API_KEY}
model = gemini-1.5-pro
- (Optional) Use Ollama locally:
# Install Ollama from https://ollama.ai
ollama pull devstral-2:123b
# Run on current project
python evaluate.py
# Run on specific repository
python evaluate.py /path/to/repo
This runs: summary → static analysis → analysis → optimize → correctness.
venv/bin/python evaluate_code_correctles.py /path/to/repo
Python API:
from agents.summarizers import EnvironmentSummarizer
# Create agent
agent = EnvironmentSummarizer()
# Execute (synchronous)
result = agent.run("/path/to/code")
# Access metrics
print(f"LLM Iterations: {agent.iteration_count}")
print(f"Tools Used: {agent.tools_used_count}")
print(f"Tool Names: {agent.tools_used_names}")
# Get LangGraph output
output = agent.get_langgraph_output()
print(output)
When you execute python evaluate.py, the system creates a timestamped run directory with all artifacts:
runs/
└── EnvironmentSummarizer_20250112_120000/
├── config.ini # Copy of configuration used
├── input.txt # Execution parameters
├── response.txt # Agent response
├── metrics.json # Execution metrics
├── state.json # Agent state snapshot
└── summary.md # Human-readable summary
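A hypothetical sketch of how such a run directory could be produced. The helper names here are illustrative only (the actual RunManager API may differ), and a temporary directory stands in for the project's runs/ folder:

```python
import json
import tempfile
import time
from pathlib import Path

# Hypothetical helpers mirroring the run-directory layout above;
# the real RunManager API may differ.
def create_run_dir(base: Path, agent_name: str) -> Path:
    # Timestamped directory, e.g. runs/EnvironmentSummarizer_20250112_120000/
    stamp = time.strftime("%Y%m%d_%H%M%S")
    run_dir = base / f"{agent_name}_{stamp}"
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir

def save_metrics(run_dir: Path, metrics: dict) -> None:
    # Write execution metrics as pretty-printed JSON
    (run_dir / "metrics.json").write_text(json.dumps(metrics, indent=2))

base = Path(tempfile.mkdtemp())  # stands in for the project's runs/ directory
run_dir = create_run_dir(base, "EnvironmentSummarizer")
save_metrics(run_dir, {"llm_iterations": 3, "tools_used": 5})
```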
- LLM Iterations: Count of LLM calls (not tool executions)
- Tools Used: Total number of tool executions
- Unique Tools: List of distinct tools used
- Execution Time: Total runtime in seconds
- Provider: Which LLM provider was used
The system uses Beautilog for beautiful terminal + file logging:
- Console: Colored output for easy readability
- Files: logs/agent.log and logs/evaluate.log
- Levels:
  - INFO: Major steps (agent start/end, LLM calls, tool execution)
  - DEBUG: Detailed state inspection (state dicts, messages, results)
  - ERROR: Failures with full traceback
# View logs
tail -f logs/agent.log
tail -f logs/evaluate.log
# OpenAI Configuration
OPENAI_API_KEY=sk-your-key-here
OPENAI_ORGANIZATION_ID=org-your-id-here
# Anthropic Claude Configuration
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Ollama Configuration (local, no keys needed)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=devstral-2:123b
# Logging
LOG_LEVEL=INFO
# Application
DEFAULT_PROVIDER=ollama
MAX_ITERATIONS=30
TIMEOUT=60
VERBOSE=true
[agents]
max_iterations = 30
default_provider = ollama
temperature = 0.7
[ollama]
base_url = http://localhost:11434
model = devstral-2:123b
temperature = 0.7
max_tokens = 8192
timeout = 60
[openai]
api_key = ${OPENAI_API_KEY}
model = gpt-5
temperature = 0.7
max_tokens = 4096
timeout = 60
[anthropic]
api_key = ${ANTHROPIC_API_KEY}
model = claude-3-5-sonnet-20241022
temperature = 0.7
max_tokens = 4096
timeout = 60
# Format code
black agents/ config/ providers/ tools/ utils/ evaluate.py
# Lint
ruff check agents/ config/ providers/ tools/ utils/ evaluate.py
# Type check
mypy agents/ config/ providers/ tools/ utils/ evaluate.py
# All checks
black agents/ config/ providers/ tools/ utils/ evaluate.py && \
ruff check agents/ config/ providers/ tools/ utils/ evaluate.py && \
mypy agents/ config/ providers/ tools/ utils/ evaluate.py
# Run all tests
pytest tests/
# Run specific test
pytest tests/test_base_agent.py::TestAgent -v
# With coverage
pytest tests/ --cov=agents --cov=providers --cov=config --cov=utils
Agents use a declarative pattern with class attributes:
from agents.base import BaseAgent
from langchain_core.tools import tool

# Define tools
@tool
def analyze_complexity(code: str) -> str:
    """Analyze code complexity."""
    # Implementation
    return "analysis result"

# Define agent
class MyAnalysisAgent(BaseAgent):
    prompt = """You are an expert code analyzer.
    Analyze code structure, complexity, and quality."""
    tools = [analyze_complexity]
    return_state_field = "my_analysis"

    # Optional overrides
    max_iterations = 8
    temperature = 0.3
    provider_name = "anthropic"

# Use the agent
agent = MyAnalysisAgent()
result = agent.run("code to analyze")
print(result)

# Access metrics
print(f"LLM calls: {agent.iteration_count}")
print(f"Tools used: {agent.tools_used_count}")
Key Agent Attributes:
- prompt - System prompt (required)
- tools - List of tools available to agent (required)
- return_state_field - State field to store results (required)
- max_iterations - Max agentic loop iterations (default: from config)
- temperature - LLM temperature (default: from config)
- provider_name - Which provider to use (default: from config)
Important Notes:
- Agent run() method is synchronous (not async)
- Iteration count tracks only LLM calls, not tool executions
- Tool usage is tracked in tools_used_count and tools_used_names
- Every execution creates a run directory with artifacts
from providers.base import BaseProvider, ProviderResponse
from providers.registry import ProviderRegistry
from config.base import SubSectionParser
from dataclasses import dataclass

# 1. Define config
@dataclass
class CustomConfig(SubSectionParser):
    SECTION = "custom"
    api_url: str
    api_key: str
    model: str
    temperature: float = 0.7
    max_tokens: int = 4096

# 2. Implement provider
class CustomProvider(BaseProvider):
    def __init__(self, config: CustomConfig):
        self.config = config

    def generate(self, system_prompt: str, user_prompt: str, **kwargs) -> ProviderResponse:
        # Call your API
        response = self._call_api(system_prompt, user_prompt)
        return ProviderResponse(
            content=response["output"],
            model=self.config.model,
            usage={"tokens": response.get("tokens", 0)},
        )

    def validate_connection(self) -> bool:
        try:
            # Test API connection
            return True
        except Exception:
            return False

    def get_provider_name(self) -> str:
        return "custom"

# 3. Register provider
ProviderRegistry.register("custom", CustomProvider, CustomConfig)

# 4. Add to config.ini
# [custom]
# api_url = https://api.example.com
# api_key = your-key
# model = your-model
Define agents using class attributes instead of method overrides:
class MyAgent(BaseAgent):
    prompt = "..."              # System prompt
    tools = [...]               # Available tools
    return_state_field = "..."  # Result field name
Dynamically create and manage providers:
from providers.registry import ProviderRegistry
# Create from config
provider = ProviderRegistry.create("ollama")
# Register custom provider
ProviderRegistry.register("custom", CustomProvider)
# Validate connection
is_valid = ProviderRegistry.validate_provider("openai")
Dataclass-based configuration with INI mapping:
from config.parser import ConfigParser
from config.providers import OllamaConfig
ConfigParser.load()
config = ConfigParser.get(OllamaConfig)
Manage execution artifacts and run directories:
from utils import RunManager
run_manager = RunManager()
run_dir = run_manager.create_run_dir(repo_path, agent.name)
run_manager.save_config(config_path)
run_manager.save_response(result)
run_manager.save_metrics(metrics)
run_manager.save_state(agent)
Continuous refinement pattern: Think → Tool Use → Observe
1. Think: LLM processes context with available tools
2. Tool Use: LLM calls tools and gets results
3. Observe: Tool results fed back to LLM
4. Repeat: Until no tool calls or max_iterations reached
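The loop above can be sketched in plain Python. Everything here is a stand-in stub for illustration: `call_llm` and the `TOOLS` registry are not the project's API, which lives in BaseAgent and LangGraph.

```python
# Minimal sketch of the Think -> Tool Use -> Observe loop.
# `call_llm` and TOOLS are placeholder stubs, not the project's API.

def call_llm(messages):
    # Stub LLM: request one tool call on the first pass, then finish.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "word_count", "args": {"text": messages[0]["content"]}}
    return {"content": "done"}

TOOLS = {"word_count": lambda text: len(text.split())}

def run_agent(prompt: str, max_iterations: int = 30) -> str:
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        reply = call_llm(messages)                          # 1. Think
        if "tool" not in reply:                             # no tool call -> stop
            return reply["content"]
        result = TOOLS[reply["tool"]](**reply["args"])      # 2. Tool Use
        messages.append({"role": "tool", "content": str(result)})  # 3. Observe
    return "max_iterations reached"                         # 4. Repeat until cap

answer = run_agent("optimize this code")
```

Note that each pass through the loop counts as one LLM iteration, which matches the system's convention of tracking LLM calls separately from tool executions.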
- Python
- JavaScript / TypeScript
- Java
- C / C++
- Go
- Rust
- And more...
- Parallel Summarization: Three agents run concurrently in Phase 1 for faster analysis
- Provider Flexibility: Choose between local (Ollama) for privacy or cloud providers for higher quality
- Configurable Timeouts: Adjust timeout settings based on your LLM provider and code complexity
- Synchronous Execution: Cleaner logging and debugging with synchronous run() method
- Local Processing: Use Ollama for complete local code analysis without sending data to external services
- Provider Abstraction: Easily switch providers based on your security requirements
- No Code Storage: By default, code is not persisted unless explicitly configured
- Input Validation: All inputs are validated before processing
- Environment Variables: Sensitive API keys stored in .env, never in code or config
- Web UI for code optimization
- Database backend for code versioning
- Batch processing for multiple files
- Custom optimization rules engine
- Integration with popular IDEs (VS Code, PyCharm)
- Pre-commit hooks for automatic optimization
- Performance benchmarking framework
- Multi-model ensemble optimization
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Please see CLAUDE.md for development guidelines and architecture details.
This project is licensed under the MIT License - see the LICENSE file for details.
Recently Implemented:
- ✅ Agent framework with LangGraph integration
- ✅ Synchronous execution with comprehensive logging
- ✅ Provider abstraction (Ollama, OpenAI, Anthropic, Gemini)
- ✅ Configuration system (INI-based with .env support)
- ✅ Tool binding and execution with metrics tracking
- ✅ RunManager for execution artifact management
- ✅ Beautilog integration for terminal + file logging
- ✅ Consolidated utils module (metrics + runs management)
- ✅ Execution metrics (iteration_count, tools_used_count, tools_used_names)
- ✅ Run directory structure with automatic artifact storage
Specialized Agents:
- ✅ EnvironmentSummarizer
- ✅ BehaviorSummarizer
- ✅ ComponentSummarizer
Code Analysis Tools:
- ✅ Environment analysis (dependencies, imports)
- ✅ Behavior analysis (logic, patterns, execution flow)
- ✅ Component analysis (structure, functions, classes)
Next Phase:
- Analyzer Agent
- Optimization Agent
- Multi-agent orchestration workflows
- Comprehensive test suite