Archmap

Automated Architecture Diagram Generation from Source Code

Archmap is an AI-powered system that automatically analyzes Git repositories and generates enterprise-grade architecture diagrams. The system combines advanced static code analysis with large language models to provide deep insights into codebase structure, dependencies, and component relationships.

Overview

Archmap employs a multi-analyzer architecture that examines code from multiple perspectives:

AST Analysis - Parse and analyze abstract syntax trees to extract classes, functions, and relationships
Dependency Analysis - Build module dependency graphs, detect cycles, calculate centrality metrics
Call Graph Analysis - Map function invocation patterns, identify entry points and hotspots
Metrics Analysis - Calculate cyclomatic complexity, maintainability indices, and Halstead metrics
Module Analysis - Examine package structure, identify features, measure cohesion

The system produces professional Mermaid flowchart diagrams with layered architecture visualization, smart component grouping, and configurable styling themes.

Features

Comprehensive Code Analysis

The system runs five specialized analyzers in parallel and synthesizes their results into enriched context for LLM-based architectural understanding. The LLM therefore receives measured structure (dependency graphs, call patterns, quality metrics, module layout) alongside the sampled code, rather than code samples alone.
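
As a rough illustration of the parallel step, the sketch below runs a set of analyzers concurrently with asyncio.gather and collects their results by name. The function name run_analyzers and the error-handling policy are assumptions made for this example; only the analyze(repo_path, code_samples) signature comes from the BaseAnalyzer example later in this README.

```python
import asyncio
from pathlib import Path
from typing import Any


async def run_analyzers(
    analyzers: dict[str, Any],
    repo_path: Path,
    code_samples: dict[str, str],
) -> dict[str, Any]:
    """Run every analyzer concurrently and collect results by analyzer name."""
    names = list(analyzers)
    results = await asyncio.gather(
        *(analyzers[name].analyze(repo_path, code_samples) for name in names),
        return_exceptions=True,  # assumption: one failing analyzer should not sink the rest
    )
    context: dict[str, Any] = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            context[name] = {"error": str(result)}
        else:
            context[name] = result
    return context
```

Returning errors as data keeps the remaining analyzers' output usable; a stricter orchestrator could re-raise instead.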

Scale-Aware Sampling

Intelligent code sampling adjusts based on repository size:

Standard repositories (< 500 files): 10 samples
Medium repositories (500-1000 files): 20 samples
Large repositories (1000-5000 files): 30 samples
Very large repositories (5000+ files): 50 samples

Strategic sampling prioritizes core modules, entry points, and architecturally significant files.
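
The tiers above can be sketched as a small lookup function. The name sample_size_for is hypothetical, and because the listed ranges share their boundary values (1000 and 5000), this sketch assumes a boundary value falls into the larger tier.

```python
def sample_size_for(file_count: int) -> int:
    """Return the number of files to sample for a repository of the given size."""
    if file_count < 0:
        raise ValueError("file_count must be non-negative")
    if file_count < 500:
        return 10   # standard repositories
    if file_count < 1000:
        return 20   # medium repositories
    if file_count < 5000:
        return 30   # large repositories (assumption: exactly 1000 lands here)
    return 50       # very large repositories (5000+)
```

With these thresholds, the 37-file Polymarket agents repository gets 10 samples and the 6,953-file React repository gets 50, matching the examples below.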

Extensible Architecture

Modular formatter design allows easy addition of new output formats. Current implementation supports Mermaid diagrams with GitHub/GitLab/Notion compatibility. PlantUML and Lucid formatters are planned.
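
To give a feel for what a formatter does, here is a minimal, self-contained sketch that turns components and relationships into Mermaid flowchart text grouped by layer. It is not the project's MermaidFormatter; the function name render_mermaid and the dictionary keys (id, name, layer, source, target, label) are assumptions for illustration.

```python
from typing import Any


def render_mermaid(
    components: list[dict[str, Any]],
    relationships: list[dict[str, Any]],
) -> str:
    """Render components grouped by layer, followed by labeled edges."""
    lines = ["flowchart LR"]

    # Group components by layer, preserving first-seen order.
    layers: dict[str, list[dict[str, Any]]] = {}
    for comp in components:
        layers.setdefault(comp["layer"], []).append(comp)

    for layer, members in layers.items():
        lines.append(f'    subgraph {layer}["{layer.title()} Layer"]')
        for comp in members:
            lines.append(f'        {comp["id"]}["{comp["name"]}"]')
        lines.append("    end")

    for rel in relationships:
        lines.append(f'    {rel["source"]} -->|{rel["label"]}| {rel["target"]}')

    return "\n".join(lines)
```

A real formatter would also need to escape quotes in names and apply the styling classes shown in the examples below.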

Installation

Prerequisites

Python 3.13 or higher
Git
OpenAI API key or OpenRouter API key

Setup

git clone https://github.com/alexnicita/archmap.git
cd archmap

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# Configure your API keys in .env

uvicorn app.main:app --host 0.0.0.0 --port 8000

Usage

API Request

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/facebook/react",
    "output_format": "mermaid"
  }'

Python Client

import requests

response = requests.post(
    "http://localhost:8000/analyze",
    json={
        "repo_url": "https://github.com/facebook/react",
        "output_format": "mermaid"
    }
)

result = response.json()
diagram = result["diagram"]["content"]

Examples

React Framework

Analysis of the React codebase (6,953 files) with 50 strategically sampled files:

14 components identified (Scheduler, Reconciler, Renderers, Compilers, Plugins)
9 relationships mapped (data flow, dependencies, synchronous calls)
3 architectural layers (Application, Domain, Infrastructure)

flowchart LR
    %%{init: {'theme':'base', 'themeVariables': {'fontSize':'16px'}}}%%

    subgraph application["Application Layer"]
        comp1["`Scheduler
[Engine]`"]
        comp2["`React Renderer
[Renderer]`"]
        comp3["`React Native Renderer
[Renderer]`"]
        comp4["`ReactTestRenderer
[Testing]`"]
    end

    subgraph domain["Domain Layer"]
        comp5["`React Reconciler
[Core]`"]
        comp6["`SchedulerPriorities
[Core]`"]
        comp7["`BadMapPolyfill
[Core]`"]
        comp8["`SchedulerFeatureFlags
[Core]`"]
        comp9["`Custom Components
[Core]`"]
    end

    subgraph infrastructure["Infrastructure Layer"]
        comp10["`Babel Plugin for React Compiler
[Compiler]`"]
        comp11["`ESLint Plugin for React Hooks
[Plugin]`"]
        comp12(["`Scripts Module
[CLI]`"])
        comp13["`Error Codes
[Support]`"]
        comp14["`Packages Module
[Support]`"]
    end

    comp12 ==>|orchestrates| comp10
    comp2 -->|calls| comp5
    comp1 -->|uses| comp6
    comp1 -->|reads| comp8
    comp13 -->|uses| comp5
    comp11 -->|Ensures that custom | comp9
    comp5 -.->|depends| comp1
    comp10 -.->|depends| comp2
    comp3 -.->|depends| comp2

    classDef presentation fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
    classDef application fill:#e0e0e0,stroke:#333,stroke-width:2px,color:#000
    classDef domain fill:#bdbdbd,stroke:#333,stroke-width:3px,color:#000
    classDef infrastructure fill:#9e9e9e,stroke:#333,stroke-width:2px,color:#000
    classDef external fill:#757575,stroke:#333,stroke-width:2px,color:#fff
    classDef database fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000

    class comp5 domain
    class comp1 application
    class comp2 application
    class comp10 infrastructure
    class comp11 infrastructure
    class comp3 application
    class comp6 domain
    class comp12 infrastructure
    class comp4 application
    class comp7 domain
    class comp8 domain
    class comp9 domain
    class comp13 infrastructure
    class comp14 infrastructure

Polymarket Agents

Analysis of the Polymarket Agents repository (37 files):

8 components (Agents, Connectors, Utils, GammaMarketClient, Documentation, Test Suite, Scripts, API Integration)
6 relationships (synchronous calls, dependencies, external integrations)
4 layers (Presentation, Application, Infrastructure, External Services)

flowchart LR
    %%{init: {'theme':'base', 'themeVariables': {'fontSize':'16px'}}}%%

    subgraph presentation["Presentation Layer"]
        comp1["`Documentation
[Service]`"]
    end

    subgraph application["Application Layer"]
        comp2["`Agents
[Core]`"]
        comp3["`Connectors
[Core]`"]
        comp4["`Utils
[Module]`"]
        comp5["`GammaMarketClient
[Core]`"]
    end

    subgraph infrastructure["Infrastructure Layer"]
        comp6["`Test Suite
[Service]`"]
        comp7["`Scripts
[Service]`"]
    end

    subgraph external["External Services"]
        comp8("`Polymarket API Integration
[External]`")
    end

    comp2 -->|calls| comp3
    comp2 -->|Agents interact with| comp8
    comp3 -->|calls| comp5
    comp6 -->|Test Suite validates| comp2
    comp2 -.->|depends| comp4

    classDef presentation fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
    classDef application fill:#e0e0e0,stroke:#333,stroke-width:2px,color:#000
    classDef domain fill:#bdbdbd,stroke:#333,stroke-width:3px,color:#000
    classDef infrastructure fill:#9e9e9e,stroke:#333,stroke-width:2px,color:#000
    classDef external fill:#757575,stroke:#333,stroke-width:2px,color:#fff
    classDef database fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#000

    class comp2 application
    class comp3 application
    class comp6 infrastructure
    class comp4 application
    class comp8 external
    class comp1 presentation
    class comp7 infrastructure
    class comp5 application

Architecture

System Design

┌─────────────┐
│   FastAPI   │ REST API server
│   Backend   │ Request validation
└──────┬──────┘
       │
┌──────▼──────────────────────────────────┐
│     Analysis Orchestrator               │
│  ┌────────┐ ┌────────┐ ┌────────┐     │
│  │  AST   │ │  Deps  │ │  Call  │     │
│  │Analyzer│ │Analyzer│ │ Graph  │ ... │
│  └────────┘ └────────┘ └────────┘     │
└──────┬──────────────────────────────────┘
       │ Synthesized Context
┌──────▼──────┐     ┌─────────────┐
│LLM Analyzer │────▶│  OpenAI/    │
│             │     │ OpenRouter  │
└──────┬──────┘     └─────────────┘
       │
┌──────▼──────────┐
│    Formatters   │
│  ┌──────────┐   │
│  │ Mermaid  │   │
│  └──────────┘   │
└─────────────────┘

Analyzer Components

ASTAnalyzer

Parses Python source code using the ast standard library module. Extracts structural information including class definitions, function signatures, import statements, and inheritance hierarchies. Builds a comprehensive map of method invocations across the codebase.
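
A minimal sketch of this kind of extraction, using only the standard ast module. The function name summarize_module and the shape of the returned dictionary are assumptions for this example, not the analyzer's actual output.

```python
import ast


def summarize_module(source: str) -> dict[str, list[str]]:
    """Collect class names, function names, imports, and base classes."""
    tree = ast.parse(source)
    summary: dict[str, list[str]] = {
        "classes": [],
        "functions": [],
        "imports": [],
        "bases": [],
    }
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            summary["classes"].append(node.name)
            for base in node.bases:
                # ast.unparse turns the base expression back into source text
                summary["bases"].append(f"{node.name} -> {ast.unparse(base)}")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary["functions"].append(node.name)
        elif isinstance(node, ast.Import):
            summary["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            summary["imports"].append(node.module or ".")
    return summary
```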

DependencyAnalyzer

Constructs module dependency graphs using NetworkX. Performs cycle detection to identify circular dependencies that may indicate architectural issues. Calculates centrality metrics to identify core modules. Applies topological sorting to infer architectural layers.
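
The layering idea can be sketched without NetworkX. The example below deliberately swaps in the standard library's graphlib so it has no third-party dependency: topological sorting groups modules into layers, a CycleError signals a circular dependency, and a simple count of dependents stands in for a centrality metric. The function names and the input shape (module name mapped to the set of modules it depends on) are assumptions.

```python
from collections import Counter
from graphlib import CycleError, TopologicalSorter


def infer_layers(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group modules into layers; layer 0 depends on nothing."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph has a circular dependency
    layers: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


def has_cycles(deps: dict[str, set[str]]) -> bool:
    """Return True if any circular dependency exists."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError:
        return True
    return False


def most_depended_on(deps: dict[str, set[str]], top: int = 3) -> list[str]:
    """Rank modules by how many other modules depend on them."""
    counts = Counter(dep for targets in deps.values() for dep in targets)
    return [name for name, _ in counts.most_common(top)]
```

Counting dependents is only a crude proxy; NetworkX offers richer centrality measures if the dependency is acceptable.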

CallGraphAnalyzer

Maps function and method call relationships throughout the codebase. Identifies entry points (functions with no callers that initiate execution flow). Detects hotspots (frequently called functions that may be performance bottlenecks). Traces call chains to understand execution paths. Identifies unreachable code.
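
A simplified sketch of entry-point and hotspot detection for a single module. It only follows direct calls by bare name (for example helper(), not obj.method()), and the function names here are assumptions rather than the analyzer's API.

```python
import ast
from collections import Counter


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each function name to the bare names it calls."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            calls: set[str] = set()
            # Note: calls inside nested functions are also attributed to the outer one.
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    calls.add(child.func.id)
            graph[node.name] = calls
    return graph


def entry_points(graph: dict[str, set[str]]) -> list[str]:
    """Functions that no other function in the graph calls."""
    called = set().union(*graph.values())
    return sorted(name for name in graph if name not in called)


def hotspots(graph: dict[str, set[str]], top: int = 3) -> list[str]:
    """Functions with the most distinct callers."""
    counts = Counter(
        callee for callees in graph.values() for callee in callees if callee in graph
    )
    return [name for name, _ in counts.most_common(top)]
```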

MetricsAnalyzer

Calculates software quality metrics using the Radon library. Computes cyclomatic complexity (McCabe metric) to assess code complexity. Calculates a maintainability index from Halstead volume, cyclomatic complexity, and lines of code. Generates detailed reports on code quality distribution across the codebase.
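
To make the cyclomatic complexity idea concrete, here is a simplified standard-library approximation. It is not Radon and will not match Radon's numbers in every case; it starts at 1 and adds one for each branching construct it recognizes. The function name is hypothetical.

```python
import ast

# Constructs that each add one independent path through the code.
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
)


def approximate_complexity(source: str) -> int:
    """Approximate McCabe complexity for a piece of Python source."""
    tree = ast.parse(source)
    complexity = 1  # a straight-line program has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit paths
            complexity += len(node.values) - 1
    return complexity
```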

ModuleAnalyzer

Examines package structure and organization. Analyzes directory hierarchies to understand module relationships. Identifies feature modules based on naming conventions and structure. Calculates cohesion metrics to assess module organization quality. Determines relative module sizes and importance.
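
One part of this, discovering packages and their relative sizes, can be sketched with pathlib. The function name find_packages and the choice to count .py files as the size measure are assumptions; the cohesion calculation is not shown because the README does not define it.

```python
from pathlib import Path


def find_packages(repo_path: Path) -> dict[str, int]:
    """Map each package (a directory with __init__.py) to its number of .py files."""
    packages: dict[str, int] = {}
    for init_file in sorted(repo_path.rglob("__init__.py")):
        package_dir = init_file.parent
        name = package_dir.relative_to(repo_path).as_posix()
        # Count only files directly inside the package, not in subpackages.
        packages[name] = sum(1 for _ in package_dir.glob("*.py"))
    return packages
```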

LLM Integration

The orchestrator synthesizes results from all static analyzers and provides enriched context to the LLM:

{
  "code_structure": {
    "total_classes": 150,
    "total_functions": 800,
    "inheritance_relationships": 45,
    "method_call_count": 1200
  },
  "dependencies": {
    "module_count": 50,
    "dependency_count": 180,
    "has_cycles": false,
    "central_modules": ["core", "scheduler", "renderer"],
    "layers_detected": 4
  },
  "call_patterns": {
    "entry_points": 12,
    "hotspots": ["reconcile", "schedule", "render"],
    "longest_chain_length": 8
  },
  "quality_metrics": {
    "files_analyzed": 45,
    "average_maintainability": 72.3,
    "high_complexity_files": 3
  },
  "module_organization": {
    "total_packages": 18,
    "features_identified": 6
  }
}

This context lets the LLM ground its architectural descriptions in measured structure (module counts, central modules, hotspots, detected layers) rather than in code samples alone.
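
How the context reaches the model is not specified here. One plausible approach, shown purely as a sketch, is to serialize the context as JSON and place it ahead of the sampled files in the prompt. The function name build_prompt and the prompt wording are assumptions.

```python
import json
from typing import Any


def build_prompt(context: dict[str, Any], code_samples: dict[str, str]) -> str:
    """Combine the static-analysis context and sampled files into one prompt."""
    sections = [
        "Describe the architecture of this repository.",
        "Static analysis summary (JSON):",
        json.dumps(context, indent=2, sort_keys=True),
    ]
    for path, source in code_samples.items():
        sections.append(f"File: {path}\n{source}")
    return "\n\n".join(sections)
```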

API Reference

POST /analyze

Analyze a repository and generate an architecture diagram.

Request:

{
  "repo_url": "https://github.com/user/repo",
  "output_format": "mermaid",
  "branch": "main"
}

Response:

{
  "success": true,
  "repository": {
    "url": "https://github.com/user/repo",
    "branch": "main",
    "total_files": 1000,
    "analyzed_files": 30,
    "languages": {"py": 500, "js": 300}
  },
  "analysis": {
    "architecture_summary": "...",
    "components": [...],
    "relationships": [...]
  },
  "diagram": {
    "format": "mermaid",
    "content": "flowchart LR..."
  }
}
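
For clients that prefer not to depend on requests, the same call can be prepared with the standard library. This sketch only builds the request object; the helper name build_analyze_request and the default base URL (taken from the Setup section's port 8000) are assumptions.

```python
import json
import urllib.request


def build_analyze_request(
    repo_url: str,
    branch: str = "main",
    output_format: str = "mermaid",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build a POST /analyze request with the documented JSON body."""
    payload = {
        "repo_url": repo_url,
        "output_format": output_format,
        "branch": branch,
    }
    return urllib.request.Request(
        f"{base_url}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen with a generous timeout would send it; analysis of a large repository may take a while.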

GET /formats

List supported output formats.

Response:

{
  "formats": [
    {
      "name": "mermaid",
      "description": "Mermaid diagram format (works in GitHub, GitLab, Notion)",
      "status": "active"
    },
    {
      "name": "lucid",
      "description": "Lucid Standard Import JSON (for Lucidchart)",
      "status": "coming_soon"
    },
    {
      "name": "plantuml",
      "description": "PlantUML format (enterprise standard)",
      "status": "coming_soon"
    }
  ]
}

GET /health

Health check endpoint.

Response:

{
  "status": "healthy"
}

Configuration

Create a .env file in the backend directory:

# LLM Provider Configuration ("openai" or "openrouter")
LLM_PROVIDER=openai

# OpenAI Configuration
OPENAI_API_KEY=sk-your-openai-api-key-here

# OpenRouter Configuration (if using OpenRouter)
OPENROUTER_API_KEY=sk-or-your-openrouter-api-key-here

# Model Selection
DEFAULT_MODEL=gpt-4o
MAX_TOKENS=4000

# Application Configuration
APP_NAME=archmap
APP_VERSION=1.0.0

See .env.example for a complete template.
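
The provider switch implies that only one of the two API keys is needed at a time. A minimal sketch of that selection logic, assuming the variable names above and assuming "openai" as the fallback when LLM_PROVIDER is unset (the function name resolve_api_key is hypothetical):

```python
import os
from collections.abc import Mapping

# Which environment variable holds the key for each provider.
_KEY_NAMES = {
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def resolve_api_key(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (provider, api_key) or raise if the configuration is incomplete."""
    provider = env.get("LLM_PROVIDER", "openai").strip().lower()
    if provider not in _KEY_NAMES:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    key_name = _KEY_NAMES[provider]
    api_key = env.get(key_name, "").strip()
    if not api_key:
        raise ValueError(f"{key_name} is not set")
    return provider, api_key
```

Failing fast at startup gives a clearer error than a rejected request to the LLM provider later on.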

Development

Project Structure

archmap/
├── backend/
│   ├── app/
│   │   ├── analyzers/          # Analysis engines
│   │   │   ├── base_analyzer.py
│   │   │   ├── ast_analyzer.py
│   │   │   ├── dependency_analyzer.py
│   │   │   ├── callgraph_analyzer.py
│   │   │   ├── metrics_analyzer.py
│   │   │   ├── module_analyzer.py
│   │   │   ├── llm_analyzer.py
│   │   │   └── analysis_orchestrator.py
│   │   ├── formatters/         # Output formatters
│   │   │   ├── base_formatter.py
│   │   │   ├── mermaid_formatter.py
│   │   │   ├── plantuml_formatter.py
│   │   │   └── lucid_formatter.py
│   │   ├── scanners/           # Repository scanners
│   │   ├── models/             # Pydantic schemas
│   │   ├── core/               # Configuration
│   │   └── main.py             # FastAPI application
│   ├── requirements.txt
│   └── .env.example
└── tests/
    ├── examples/               # Example diagrams
    │   ├── react.mmd
    │   └── polymarket_agents.mmd
    └── test_enhanced_system.py

Running Tests

# Start the backend server
cd backend
source venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 8000

# In another terminal, from the repository root, run the test suite
python tests/test_enhanced_system.py

Extending the System

Adding a New Analyzer

Create a new analyzer class inheriting from BaseAnalyzer:

from .base_analyzer import BaseAnalyzer
from pathlib import Path
from typing import Any

class CustomAnalyzer(BaseAnalyzer):
    async def analyze(self, repo_path: Path, code_samples: dict[str, str]) -> dict[str, Any]:
        # Implement your analysis logic
        results = {}
        # ... analysis code ...
        return results
    
    def get_analysis_type(self) -> str:
        return "custom_analysis"

Register the analyzer in AnalysisOrchestrator:

self.analyzers = {
    "ast": ASTAnalyzer(),
    "dependencies": DependencyAnalyzer(),
    # ... existing analyzers ...
    "custom": CustomAnalyzer(),
}

Adding a New Formatter

Create a formatter class inheriting from BaseFormatter:

from .base_formatter import BaseFormatter, DetailLevel

class CustomFormatter(BaseFormatter):
    def format(self, analysis: dict) -> str:
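        # Implement formatting logic
        components = self._filter_components_by_detail(analysis["components"])
        # Placeholder rendering (one line per component); replace with real formatting code
        formatted_output = "\n".join(str(component) for component in components)
        return formatted_output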
    
    def get_format_name(self) -> str:
        return "custom"

Register in FormatterFactory:

if output_format == OutputFormat.CUSTOM:
    return CustomFormatter(detail_level=detail_level)

Roadmap

Near Term

PlantUML formatter implementation for enterprise environments
Lucid JSON formatter for Lucidchart integration
Enhanced multi-language support (JavaScript, TypeScript, Go, Rust, Java)
Diagram detail level controls (minimal, standard, detailed, comprehensive)

Medium Term

Interactive diagram editor with real-time updates
Diagram comparison tools for visualizing architectural changes
CI/CD pipeline integration for automated documentation
Support for monorepo analysis with multiple service detection

Long Term

VSCode extension for in-editor diagram generation
Diagram template library for common architectural patterns
Custom analyzer plugin system for domain-specific analysis
Real-time collaborative diagram editing

Contributing

We welcome contributions from the community. Areas where contributions would be particularly valuable:

Multi-language support - Extending analyzers to support JavaScript, TypeScript, Go, Rust, and other languages
New formatters - Implementing PlantUML, Lucid, or other diagram format generators
Enhanced analysis - Adding new analyzer types for specific architectural patterns or quality metrics
Documentation - Improving documentation, adding examples, creating tutorials
Testing - Expanding test coverage, adding test cases for edge cases

To contribute:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Please ensure your code follows the existing style and includes appropriate tests.

License

MIT License - see LICENSE file for details

Author

Created by Alex Nicita

Acknowledgments

This project builds upon excellent open source tools:

FastAPI - Modern Python web framework
Mermaid - Diagram generation from text
NetworkX - Graph analysis library
Radon - Code metrics calculation
OpenAI - Language model integration
OpenRouter - Multi-model LLM access
