Archmap

Automated Architecture Diagram Generation from Source Code

Archmap is an AI-powered system that automatically analyzes Git repositories and generates enterprise-grade architecture diagrams. The system combines advanced static code analysis with large language models to provide deep insights into codebase structure, dependencies, and component relationships.

Overview

Archmap employs a multi-analyzer architecture that examines code from multiple perspectives:

  • AST Analysis - Parse and analyze abstract syntax trees to extract classes, functions, and relationships
  • Dependency Analysis - Build module dependency graphs, detect cycles, calculate centrality metrics
  • Call Graph Analysis - Map function invocation patterns, identify entry points and hotspots
  • Metrics Analysis - Calculate cyclomatic complexity, maintainability indices, and Halstead metrics
  • Module Analysis - Examine package structure, identify features, measure cohesion

The system produces professional Mermaid flowchart diagrams with layered architecture visualization, smart component grouping, and configurable styling themes.
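
As a rough illustration of layered Mermaid output, the sketch below groups components into one subgraph per layer and emits labelled edges. The function name `to_mermaid` and the dictionary keys (`id`, `name`, `layer`, `source`, `target`, `label`) are assumptions made for this example, not Archmap's actual schema or formatter.

```python
from collections import defaultdict


def to_mermaid(components: list[dict], relationships: list[dict]) -> str:
    """Render components grouped by layer as a Mermaid flowchart (hypothetical schema)."""
    by_layer: dict[str, list[dict]] = defaultdict(list)
    for component in components:
        by_layer[component["layer"]].append(component)

    lines = ["flowchart LR"]
    for layer, members in by_layer.items():
        # One subgraph per architectural layer
        lines.append(f'    subgraph {layer}["{layer.title()} Layer"]')
        for member in members:
            lines.append(f'        {member["id"]}["{member["name"]}"]')
        lines.append("    end")
    for rel in relationships:
        # Solid arrow with an edge label, e.g. comp1 -->|calls| comp2
        lines.append(f'    {rel["source"]} -->|{rel["label"]}| {rel["target"]}')
    return "\n".join(lines)


diagram = to_mermaid(
    [
        {"id": "comp1", "name": "Agents", "layer": "application"},
        {"id": "comp2", "name": "Polymarket API", "layer": "external"},
    ],
    [{"source": "comp1", "target": "comp2", "label": "calls"}],
)
print(diagram)
```

The real formatter also applies styling classes and node shapes, which this sketch leaves out.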

Features

Comprehensive Code Analysis

The system runs five specialized analyzers in parallel and synthesizes their results into enriched context for LLM-based architectural understanding. Pairing static analysis with an LLM lets the system name components, layers, and relationships that static analysis alone reports only as raw graphs and metrics.
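
One way such concurrent execution could look is sketched below with `asyncio.gather`. `StubAnalyzer` and `run_all` are hypothetical stand-ins invented for this example; the analyzer names are illustrative and the real orchestrator may be organized differently.

```python
import asyncio
from typing import Any


class StubAnalyzer:
    """Hypothetical stand-in for one of the five analyzers."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def analyze(self, code_samples: dict[str, str]) -> dict[str, Any]:
        await asyncio.sleep(0)  # yield control, as real I/O-bound work would
        return {"files_seen": len(code_samples)}


async def run_all(analyzers: list[StubAnalyzer], code_samples: dict[str, str]) -> dict[str, Any]:
    # Schedule every analyzer concurrently and wait for all of them to finish
    results = await asyncio.gather(*(a.analyze(code_samples) for a in analyzers))
    # Key each result by analyzer name to form the synthesized context
    return {a.name: r for a, r in zip(analyzers, results)}


context = asyncio.run(
    run_all(
        [StubAnalyzer(n) for n in ("ast", "dependencies", "call_graph", "metrics", "modules")],
        {"app/main.py": "print('hi')"},
    )
)
print(context)
```

Note that `asyncio.gather` interleaves coroutines on one thread; CPU-heavy analysis would need a process or thread pool to run truly in parallel.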

Scale-Aware Sampling

Intelligent code sampling adjusts based on repository size:

  • Standard repositories (< 500 files): 10 samples
  • Medium repositories (500-999 files): 20 samples
  • Large repositories (1000-4999 files): 30 samples
  • Very large repositories (5000+ files): 50 samples

Strategic sampling prioritizes core modules, entry points, and architecturally significant files.
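
The tiers above can be expressed as a small lookup. This is a sketch only: the function name `sample_count` is hypothetical, and the treatment of the exact boundaries (1000 files falls in the 30-sample tier, 5000 in the 50-sample tier) is an assumption.

```python
def sample_count(total_files: int) -> int:
    """Return how many files to sample for a repository of the given size."""
    if total_files < 500:
        return 10
    if total_files < 1000:  # assumed boundary: 500-999
        return 20
    if total_files < 5000:  # assumed boundary: 1000-4999
        return 30
    return 50


for size in (37, 750, 1000, 6953):
    print(size, sample_count(size))
```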

Extensible Architecture

Modular formatter design allows easy addition of new output formats. Current implementation supports Mermaid diagrams with GitHub/GitLab/Notion compatibility. PlantUML and Lucid formatters are planned.

Installation

Prerequisites

  • Python 3.13 or higher
  • Git
  • OpenAI API key or OpenRouter API key

Setup

git clone https://github.com/alexnicita/archmap.git
cd archmap

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# Configure your API keys in .env

uvicorn app.main:app --host 0.0.0.0 --port 8000

Usage

API Request

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/facebook/react",
    "output_format": "mermaid"
  }'

Python Client

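import requests

# Submit a repository for analysis. Large repositories can take a while,
# so set a generous timeout (adjust as needed).
response = requests.post(
    "http://localhost:8000/analyze",
    json={
        "repo_url": "https://github.com/facebook/react",
        "output_format": "mermaid"
    },
    timeout=600,
)
response.raise_for_status()  # Fail early on HTTP errors

result = response.json()
diagram = result["diagram"]["content"]  # Mermaid source as a string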

Examples

React Framework

Analysis of the React codebase (6,953 files) with 50 strategically sampled files:

  • 14 components identified (Scheduler, Reconciler, Renderers, Compilers, Plugins)
  • 9 relationships mapped (data flow, dependencies, synchronous calls)
  • 3 architectural layers (Application, Domain, Infrastructure)
flowchart LR
    %%{init: {'theme':'base', 'themeVariables': {'fontSize':'16px'}}}%%

    subgraph application["Application Layer"]
        comp1["`Scheduler
[Engine]`"]
        comp2["`React Renderer
[Renderer]`"]
        comp3["`React Native Renderer
[Renderer]`"]
        comp4["`ReactTestRenderer
[Testing]`"]
    end

    subgraph domain["Domain Layer"]
        comp5["`React Reconciler
[Core]`"]
        comp6["`SchedulerPriorities
[Core]`"]
        comp7["`BadMapPolyfill
[Core]`"]
        comp8["`SchedulerFeatureFlags
[Core]`"]
        comp9["`Custom Components
[Core]`"]
    end

    subgraph infrastructure["Infrastructure Layer"]
        comp10["`Babel Plugin for React Compiler
[Compiler]`"]
        comp11["`ESLint Plugin for React Hooks
[Plugin]`"]
        comp12(["`Scripts Module
[CLI]`"])
        comp13["`Error Codes
[Support]`"]
        comp14["`Packages Module
[Support]`"]
    end

    comp12 ==>|orchestrates| comp10
    comp2 -->|calls| comp5
    comp1 -->|uses| comp6
    comp1 -->|reads| comp8
    comp13 -->|uses| comp5
    comp11 -->|Ensures that custom | comp9
    comp5 -.->|depends| comp1
    comp10 -.->|depends| comp2
    comp3 -.->|depends| comp2

    classDef presentation fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
    classDef application fill:#e0e0e0,stroke:#333,stroke-width:2px,color:#000
    classDef domain fill:#bdbdbd,stroke:#333,stroke-width:3px,color:#000
    classDef infrastructure fill:#9e9e9e,stroke:#333,stroke-width:2px,color:#000
    classDef external fill:#757575,stroke:#333,stroke-width:2px,color:#fff
    classDef database fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000

    class comp5 domain
    class comp1 application
    class comp2 application
    class comp10 infrastructure
    class comp11 infrastructure
    class comp3 application
    class comp6 domain
    class comp12 infrastructure
    class comp4 application
    class comp7 domain
    class comp8 domain
    class comp9 domain
    class comp13 infrastructure
    class comp14 infrastructure

Polymarket Agents

Analysis of Polymarket agents repository (37 files):

  • 8 components (Agents, Connectors, Utils, GammaMarketClient, Documentation, Test Suite, Scripts, API Integration)
  • 6 relationships (synchronous calls, dependencies, external integrations)
  • 4 layers (Presentation, Application, Infrastructure, External Services)
flowchart LR
    %%{init: {'theme':'base', 'themeVariables': {'fontSize':'16px'}}}%%

    subgraph presentation["Presentation Layer"]
        comp1["`Documentation
[Service]`"]
    end

    subgraph application["Application Layer"]
        comp2["`Agents
[Core]`"]
        comp3["`Connectors
[Core]`"]
        comp4["`Utils
[Module]`"]
        comp5["`GammaMarketClient
[Core]`"]
    end

    subgraph infrastructure["Infrastructure Layer"]
        comp6["`Test Suite
[Service]`"]
        comp7["`Scripts
[Service]`"]
    end

    subgraph external["External Services"]
        comp8("`Polymarket API Integration
[External]`")
    end

    comp2 -->|calls| comp3
    comp2 -->|Agents interact with| comp8
    comp3 -->|calls| comp5
    comp6 -->|Test Suite validates| comp2
    comp2 -.->|depends| comp4

    classDef presentation fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
    classDef application fill:#e0e0e0,stroke:#333,stroke-width:2px,color:#000
    classDef domain fill:#bdbdbd,stroke:#333,stroke-width:3px,color:#000
    classDef infrastructure fill:#9e9e9e,stroke:#333,stroke-width:2px,color:#000
    classDef external fill:#757575,stroke:#333,stroke-width:2px,color:#fff
    classDef database fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000

    class comp2 application
    class comp3 application
    class comp6 infrastructure
    class comp4 application
    class comp8 external
    class comp1 presentation
    class comp7 infrastructure
    class comp5 application

Architecture

System Design

┌─────────────┐
│   FastAPI   │ REST API server
│   Backend   │ Request validation
└──────┬──────┘
       │
┌──────▼──────────────────────────────────┐
│     Analysis Orchestrator               │
│  ┌────────┐ ┌────────┐ ┌────────┐     │
│  │  AST   │ │  Deps  │ │  Call  │     │
│  │Analyzer│ │Analyzer│ │ Graph  │ ... │
│  └────────┘ └────────┘ └────────┘     │
└──────┬──────────────────────────────────┘
       │ Synthesized Context
┌──────▼──────┐     ┌─────────────┐
│LLM Analyzer │────▶│  OpenAI/    │
│             │     │ OpenRouter  │
└──────┬──────┘     └─────────────┘
       │
┌──────▼──────────┐
│    Formatters   │
│  ┌──────────┐   │
│  │ Mermaid  │   │
│  └──────────┘   │
└─────────────────┘

Analyzer Components

ASTAnalyzer

Parses Python source code using the ast standard library module. Extracts structural information including class definitions, function signatures, import statements, and inheritance hierarchies. Builds a comprehensive map of method invocations across the codebase.
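
A minimal sketch of this kind of extraction with the standard `ast` module follows. The function name `extract_structure` and the shape of the returned dictionary are assumptions for illustration; the real ASTAnalyzer collects more detail.

```python
import ast


def extract_structure(source: str) -> dict[str, list]:
    """Collect classes (with base names), functions, and imports from source."""
    tree = ast.parse(source)
    classes, functions, imports = [], [], []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # Base classes give the inheritance hierarchy
            bases = [ast.unparse(base) for base in node.bases]
            classes.append((node.name, bases))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            imports.append(node.module or ".")
    return {"classes": classes, "functions": functions, "imports": imports}


sample = (
    "import os\n"
    "from pathlib import Path\n\n"
    "class Base:\n    pass\n\n"
    "class Child(Base):\n"
    "    def run(self):\n"
    "        return os.getcwd()\n"
)
info = extract_structure(sample)
print(info)
```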

DependencyAnalyzer

Constructs module dependency graphs using NetworkX. Performs cycle detection to identify circular dependencies that may indicate architectural issues. Calculates centrality metrics to identify core modules. Applies topological sorting to infer architectural layers.
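
The cycle detection and layer inference described here can be illustrated without NetworkX. The sketch below swaps in the standard library's `graphlib.TopologicalSorter` as a dependency-free stand-in, and uses a plain in-degree count in place of a centrality metric. The function names and the sample graph are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def infer_layers(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group modules into layers; layer 0 depends on nothing.

    `deps` maps each module to the modules it imports.
    Raises graphlib.CycleError if the graph has a circular dependency.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError on circular dependencies
    layers = []
    while sorter.is_active():
        # Modules whose dependencies have all been placed in earlier layers
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


def most_depended_on(deps: dict[str, set[str]]) -> str:
    """Return the module imported by the most other modules (in-degree)."""
    counts: dict[str, int] = {}
    for imported in deps.values():
        for name in imported:
            counts[name] = counts.get(name, 0) + 1
    return max(counts, key=counts.get)


graph = {"api": {"service"}, "service": {"core", "utils"}, "utils": {"core"}}
layers = infer_layers(graph)
central = most_depended_on(graph)
print(layers, central)
```

With NetworkX, comparable building blocks would be a `DiGraph` together with its cycle, centrality, and topological-sort helpers.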

CallGraphAnalyzer

Maps function and method call relationships throughout the codebase. Identifies entry points (functions with no callers that initiate execution flow). Detects hotspots (frequently called functions that may be performance bottlenecks). Traces call chains to understand execution paths. Identifies unreachable code.
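
To make the entry-point and hotspot definitions concrete, here is a simplified single-file sketch built on `ast`. It only counts direct calls by bare name (`foo()`), ignores attribute calls such as `obj.foo()`, and the function name `call_summary` is an assumption for this example.

```python
import ast
from collections import Counter


def call_summary(source: str) -> dict[str, list[str]]:
    """Find entry points (never called) and hotspots (called most often)."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    calls: Counter = Counter()
    for node in ast.walk(tree):
        # Count only calls to functions defined in this source
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in defined:
                calls[node.func.id] += 1
    entry_points = sorted(defined - set(calls))
    hotspots = [name for name, _ in calls.most_common(2)]
    return {"entry_points": entry_points, "hotspots": hotspots}


sample = """
def load():
    return 1

def parse():
    return load() + load()

def main():
    load()
    parse()
"""
summary = call_summary(sample)
print(summary)
```

A repository-wide analyzer would also need to resolve imports and method calls across files, which this sketch does not attempt.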

MetricsAnalyzer

Calculates software quality metrics using the Radon library. Computes cyclomatic complexity (McCabe metric) to assess code complexity. Calculates the maintainability index from Halstead volume, cyclomatic complexity, and lines of code. Generates detailed reports on code quality distribution across the codebase.
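
For intuition about cyclomatic complexity, the sketch below approximates it as one plus the number of decision points found in the syntax tree. This is a simplified stand-in written with the standard `ast` module, not Radon's implementation, and its counts may differ from Radon's in edge cases.

```python
import ast

# Node types that each add one decision point (a simplification of McCabe's rules)
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.comprehension)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths
            complexity += len(node.values) - 1
    return complexity


sample = """
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for i in range(n):
        if i % 2:
            return "odd-found"
    return "done"
"""
score = cyclomatic_complexity(sample)
print(score)
```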

ModuleAnalyzer

Examines package structure and organization. Analyzes directory hierarchies to understand module relationships. Identifies feature modules based on naming conventions and structure. Calculates cohesion metrics to assess module organization quality. Determines relative module sizes and importance.

LLM Integration

The orchestrator synthesizes results from all static analyzers and provides enriched context to the LLM:

{
  "code_structure": {
    "total_classes": 150,
    "total_functions": 800,
    "inheritance_relationships": 45,
    "method_call_count": 1200
  },
  "dependencies": {
    "module_count": 50,
    "dependency_count": 180,
    "has_cycles": false,
    "central_modules": ["core", "scheduler", "renderer"],
    "layers_detected": 4
  },
  "call_patterns": {
    "entry_points": 12,
    "hotspots": ["reconcile", "schedule", "render"],
    "longest_chain_length": 8
  },
  "quality_metrics": {
    "files_analyzed": 45,
    "average_maintainability": 72.3,
    "high_complexity_files": 3
  },
  "module_organization": {
    "total_packages": 18,
    "features_identified": 6
  }
}

This context lets the LLM ground its architectural description in repository-wide facts (dependency structure, call hotspots, quality metrics) rather than in the sampled files alone.
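
How the synthesized context might be handed to the model is sketched below. The function `build_prompt`, the instruction wording, and the section headings are all hypothetical; the actual prompt used by the LLM analyzer is not shown in this document.

```python
import json


def build_prompt(context: dict, code_samples: dict[str, str]) -> str:
    """Combine analyzer output and sampled files into a single LLM prompt."""
    sections = [
        "You are a software architect. Identify components, layers, and relationships.",
        "## Static analysis summary",
        json.dumps(context, indent=2, sort_keys=True),
        "## Sampled files",
    ]
    for path, source in code_samples.items():
        sections.append(f"### {path}\n{source}")
    return "\n\n".join(sections)


prompt = build_prompt(
    {"dependencies": {"module_count": 50, "has_cycles": False}},
    {"app/main.py": "from fastapi import FastAPI"},
)
print(prompt)
```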

API Reference

POST /analyze

Analyze a repository and generate architecture diagram.

Request:

{
  "repo_url": "https://github.com/user/repo",
  "output_format": "mermaid",
  "branch": "main"
}

Response:

{
  "success": true,
  "repository": {
    "url": "https://github.com/user/repo",
    "branch": "main",
    "total_files": 1000,
    "analyzed_files": 30,
    "languages": {"py": 500, "js": 300}
  },
  "analysis": {
    "architecture_summary": "...",
    "components": [...],
    "relationships": [...]
  },
  "diagram": {
    "format": "mermaid",
    "content": "flowchart LR..."
  }
}
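
A client might persist the returned diagram as follows. This is a sketch under assumptions: the helper `save_diagram` and the output file name `architecture.mmd` are invented for the example, and the behaviour when `success` is false is a guess, since the error response shape is not documented here.

```python
from pathlib import Path


def save_diagram(result: dict, out_dir: Path) -> Path:
    """Write the diagram from an /analyze response to <out_dir>/architecture.mmd."""
    if not result.get("success"):
        # Assumption: failed analyses report success = false
        raise ValueError("analysis did not succeed")
    diagram = result["diagram"]
    if diagram["format"] != "mermaid":
        raise ValueError(f"unexpected format: {diagram['format']}")
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "architecture.mmd"
    target.write_text(diagram["content"], encoding="utf-8")
    return target
```

Calling `save_diagram(response.json(), Path("docs"))` would then leave a `.mmd` file that can be pasted into any Mermaid-compatible renderer.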

GET /formats

List supported output formats.

Response:

{
  "formats": [
    {
      "name": "mermaid",
      "description": "Mermaid diagram format (works in GitHub, GitLab, Notion)",
      "status": "active"
    },
    {
      "name": "lucid",
      "description": "Lucid Standard Import JSON (for Lucidchart)",
      "status": "coming_soon"
    },
    {
      "name": "plantuml",
      "description": "PlantUML format (enterprise standard)",
      "status": "coming_soon"
    }
  ]
}

GET /health

Health check endpoint.

Response:

{
  "status": "healthy"
}

Configuration

Create a .env file in the backend directory:

# LLM Provider Configuration
LLM_PROVIDER=openai  # or "openrouter"

# OpenAI Configuration
OPENAI_API_KEY=sk-your-openai-api-key-here

# OpenRouter Configuration (if using OpenRouter)
OPENROUTER_API_KEY=sk-or-your-openrouter-api-key-here

# Model Selection
DEFAULT_MODEL=gpt-4o
MAX_TOKENS=4000

# Application Configuration
APP_NAME=archmap
APP_VERSION=1.0.0

See .env.example for a complete template.
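
The sketch below shows one way these variables could be read and validated at startup using only the standard library. The `Settings` dataclass, the `load_settings` function, and the fallback defaults (taken from the sample values above) are assumptions; the project's own configuration code under `core/` may differ.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    provider: str
    api_key: str
    model: str
    max_tokens: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read LLM settings from environment variables and validate them."""
    provider = env.get("LLM_PROVIDER", "openai")
    key_names = {"openai": "OPENAI_API_KEY", "openrouter": "OPENROUTER_API_KEY"}
    if provider not in key_names:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    # Only the key for the selected provider is required
    api_key = env.get(key_names[provider], "")
    if not api_key:
        raise ValueError(f"{key_names[provider]} must be set for provider {provider!r}")
    return Settings(
        provider=provider,
        api_key=api_key,
        model=env.get("DEFAULT_MODEL", "gpt-4o"),
        max_tokens=int(env.get("MAX_TOKENS", "4000")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.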

Development

Project Structure

archmap/
├── backend/
│   ├── app/
│   │   ├── analyzers/          # Analysis engines
│   │   │   ├── base_analyzer.py
│   │   │   ├── ast_analyzer.py
│   │   │   ├── dependency_analyzer.py
│   │   │   ├── callgraph_analyzer.py
│   │   │   ├── metrics_analyzer.py
│   │   │   ├── module_analyzer.py
│   │   │   ├── llm_analyzer.py
│   │   │   └── analysis_orchestrator.py
│   │   ├── formatters/         # Output formatters
│   │   │   ├── base_formatter.py
│   │   │   ├── mermaid_formatter.py
│   │   │   ├── plantuml_formatter.py
│   │   │   └── lucid_formatter.py
│   │   ├── scanners/           # Repository scanners
│   │   ├── models/             # Pydantic schemas
│   │   ├── core/               # Configuration
│   │   └── main.py             # FastAPI application
│   ├── requirements.txt
│   └── .env.example
└── tests/
    ├── examples/               # Example diagrams
    │   ├── react.mmd
    │   └── polymarket_agents.mmd
    └── test_enhanced_system.py

Running Tests

# Start the backend server
cd backend
source venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 8000

# In another terminal, from the repository root, run the test suite
python tests/test_enhanced_system.py

Extending the System

Adding a New Analyzer

  1. Create a new analyzer class inheriting from BaseAnalyzer:
from .base_analyzer import BaseAnalyzer
from pathlib import Path
from typing import Any

class CustomAnalyzer(BaseAnalyzer):
    async def analyze(self, repo_path: Path, code_samples: dict[str, str]) -> dict[str, Any]:
        # Implement your analysis logic
        results = {}
        # ... analysis code ...
        return results
    
    def get_analysis_type(self) -> str:
        return "custom_analysis"
  2. Register the analyzer in AnalysisOrchestrator:
self.analyzers = {
    "ast": ASTAnalyzer(),
    "dependencies": DependencyAnalyzer(),
    # ... existing analyzers ...
    "custom": CustomAnalyzer(),
}

Adding a New Formatter

  1. Create a formatter class inheriting from BaseFormatter:
from .base_formatter import BaseFormatter, DetailLevel

class CustomFormatter(BaseFormatter):
    def format(self, analysis: dict) -> str:
        # Implement formatting logic
        components = self._filter_components_by_detail(analysis["components"])
        # Placeholder: one line per component; replace with real formatting code
        formatted_output = "\n".join(str(component) for component in components)
        return formatted_output

    def get_format_name(self) -> str:
        return "custom"
  2. Register in FormatterFactory:
if output_format == OutputFormat.CUSTOM:
    return CustomFormatter(detail_level=detail_level)

Roadmap

Near Term

  • PlantUML formatter implementation for enterprise environments
  • Lucid JSON formatter for Lucidchart integration
  • Enhanced multi-language support (JavaScript, TypeScript, Go, Rust, Java)
  • Diagram detail level controls (minimal, standard, detailed, comprehensive)

Medium Term

  • Interactive diagram editor with real-time updates
  • Diagram comparison tools for visualizing architectural changes
  • CI/CD pipeline integration for automated documentation
  • Support for monorepo analysis with multiple service detection

Long Term

  • VSCode extension for in-editor diagram generation
  • Diagram template library for common architectural patterns
  • Custom analyzer plugin system for domain-specific analysis
  • Real-time collaborative diagram editing

Contributing

We welcome contributions from the community. Areas where contributions would be particularly valuable:

  • Multi-language support - Extending analyzers to support JavaScript, TypeScript, Go, Rust, and other languages
  • New formatters - Implementing PlantUML, Lucid, or other diagram format generators
  • Enhanced analysis - Adding new analyzer types for specific architectural patterns or quality metrics
  • Documentation - Improving documentation, adding examples, creating tutorials
  • Testing - Expanding test coverage, adding test cases for edge cases

To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Please ensure your code follows the existing style and includes appropriate tests.

License

MIT License - see LICENSE file for details

Author

Created by Alex Nicita

Acknowledgments

This project builds upon excellent open source tools:

  • FastAPI - Modern Python web framework
  • Mermaid - Diagram generation from text
  • NetworkX - Graph analysis library
  • Radon - Code metrics calculation
  • OpenAI - Language model integration
  • OpenRouter - Multi-model LLM access
