What if AI agents could truly run in parallel, recover from failures automatically, and scale across machines seamlessly?
ElixirChain brings the battle-tested principles of actor-model concurrency, fault tolerance, and distributed computing to AI agent development. Built on Elixir and the BEAM VM, the same foundation that powers systems handling millions of concurrent connections, ElixirChain treats each AI agent as a supervised, isolated process. Designed around Google Gemini's massive 2M token context window, ElixirChain simplifies agent architecture while enabling sophisticated multimodal interactions.
Traditional AI frameworks face fundamental concurrency and reliability challenges:
- Threading Complexity: Managing hundreds of concurrent agents requires careful thread management
- Failure Propagation: One agent failure can impact others sharing the same process space
- Scaling Challenges: Distributing agents across machines requires significant infrastructure
- State Management: Coordinating shared state between agents becomes increasingly complex
```elixir
# Each agent is an isolated, supervised process
agents =
  Enum.map(1..100, fn i ->
    {:ok, agent} =
      ElixirChain.start_agent(%{
        name: "agent_#{i}",
        system_prompt: "You are assistant #{i}",
        tools: [:web_search, :calculator]
      })

    agent
  end)

# Agents run in parallel, restart on failure, scale across nodes
```

Key Benefits:
- Actor Model Concurrency: Each agent is a lightweight process with isolated state
- Supervised Fault Tolerance: Agents restart automatically without affecting others
- BEAM VM Efficiency: Proven virtual machine optimized for massive concurrency
- Distribution Ready: Built-in clustering and remote process communication
- Hot Code Reloading: Update agent behavior without stopping the system
- Process Isolation: Agents cannot interfere with each other's memory
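The isolation claim can be demonstrated in plain Elixir, without any ElixirChain API: a crash in a spawned worker process is delivered to the caller as a message, never as an exception that takes the caller down.

```elixir
parent = self()

{pid, ref} = spawn_monitor(fn ->
  send(parent, {:result, 1 + 1})
  raise "simulated agent crash"
end)

# The result arrives even though the worker crashes right after sending it...
receive do
  {:result, n} -> IO.puts("got result: #{n}")
end

# ...and the crash arrives as an ordinary :DOWN message, fully contained.
receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> IO.puts("worker crashed in isolation")
end
```

This message-passing boundary is what supervision trees build on: a supervisor is just a process that receives these exit signals and restarts children.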
```
Application Supervisor
├── Agent Registry (1000+ agents)
│   ├── Agent 1 (GenServer) ──► Conversation State
│   ├── Agent 2 (GenServer) ──► Tool Execution
│   └── Agent N (GenServer) ──► Memory Management
├── LLM Provider Pool
├── Tool Registry
└── Memory Supervisors
```
Every agent is:
- Isolated: Crashes don't propagate
- Supervised: Automatic restart with state recovery
- Concurrent: True parallelism across all CPU cores
- Scalable: Add nodes, not complexity
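As a sketch of what "supervised" means in practice, here is a minimal stand-in agent running under a `one_for_one` supervisor. The module, message shapes, and registered names are illustrative, not the ElixirChain API.

```elixir
defmodule DemoAgent do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: opts[:name])

  @impl true
  def init(opts), do: {:ok, %{name: opts[:name], history: []}}

  @impl true
  def handle_call({:chat, msg}, _from, state) do
    {:reply, {:ok, "echo: " <> msg}, %{state | history: [msg | state.history]}}
  end
end

# one_for_one: only the crashed child is restarted; its siblings keep running
children = [
  Supervisor.child_spec({DemoAgent, [name: :agent_1]}, id: :agent_1),
  Supervisor.child_spec({DemoAgent, [name: :agent_2]}, id: :agent_2)
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

{:ok, reply} = GenServer.call(:agent_1, {:chat, "hello"})
```

Killing `:agent_1` here would leave `:agent_2` untouched while the supervisor transparently restarts the crashed child under the same name.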
- Process-Based Agents: Each agent is a supervised GenServer process
- Automatic Recovery: Supervision trees handle crashes gracefully
- Hot Code Swapping: Update agent behavior without restarts
- Resource Isolation: Memory and compute boundaries per agent
- Large Context Windows: Leverage Gemini's 2M token capacity for complex reasoning
- Multiple Types: Conversation, semantic, episodic, working memory
- Pluggable Backends: ETS, Mnesia, PostgreSQL, Redis, vector databases
- Large Context Leverage: Use Gemini's 2M tokens to reduce memory complexity
- Distributed Storage: Memory that spans across nodes
- Multimodal Memory: Store and retrieve text, images, and documents
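To make the pluggable-backend idea concrete, a minimal ETS-backed conversation memory could look like the sketch below. The module and function names are hypothetical; the real backend contract may differ.

```elixir
defmodule EtsMemory do
  # :ordered_set keeps entries sorted by key, so a monotonically
  # increasing key preserves insertion order.
  def new(name), do: :ets.new(name, [:ordered_set, :public])

  def append(table, role, content) do
    :ets.insert(table, {System.unique_integer([:monotonic]), %{role: role, content: content}})
    :ok
  end

  def history(table) do
    table |> :ets.tab2list() |> Enum.map(fn {_key, msg} -> msg end)
  end
end

mem = EtsMemory.new(:agent_memory)
EtsMemory.append(mem, :user, "What is OTP?")
EtsMemory.append(mem, :assistant, "Erlang's Open Telecom Platform.")
EtsMemory.history(mem)  # both messages, in insertion order
```

ETS lives in memory and dies with its owner process, which is why durable backends (PostgreSQL, Redis) matter for state recovery after restarts.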
```elixir
defmodule MyCustomTool do
  use ElixirChain.Tool

  def execute(%{"query" => query}, _context) do
    # Tool logic here
    result = "search results for #{query}"
    {:ok, result}
  end
end

# Tools run in parallel with automatic timeout/retry
ElixirChain.add_tool(agent, MyCustomTool)
```

```elixir
research_chain =
  ElixirChain.Chain.new()
  |> add_step({:llm, :gemini, "Generate search queries for: {{topic}}"})
  |> add_step({:parallel, [
    {:tool, :web_search, %{query: "{{query1}}"}},
    {:tool, :web_search, %{query: "{{query2}}"}}
  ]})
  |> add_step({:llm, :gemini, "Synthesize comprehensive report: {{results}}"})

{:ok, result} = ElixirChain.Chain.run(research_chain, %{topic: "AI trends"})
```

- Multi-Node Clustering: Agents communicate across machines seamlessly
- Load Balancing: Automatic distribution of agent workloads
- State Synchronization: Consistent memory across the cluster
- Network Partition Tolerance: Graceful handling of node failures
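Under the hood, this kind of clustering rests on plain distributed Erlang. A rough sketch, assuming two nodes started with `--sname` and a shared cookie (the node and process names here are examples, not part of ElixirChain):

```elixir
# Join the cluster; returns true once the nodes are connected.
Node.connect(:"elixirchain@host2")

# Call a GenServer registered as :researcher on the remote node.
# The {name, node} tuple is standard OTP addressing.
reply = GenServer.call({:researcher, :"elixirchain@host2"}, {:chat, "status?"})

# Or run an arbitrary function on the remote node via :erpc (OTP 23+).
remote_process_count = length(:erpc.call(:"elixirchain@host2", Process, :list, []))
```

Because remote calls use the same message-passing primitives as local ones, agent code generally does not need to know which node its peers live on.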
```elixir
def deps do
  [
    {:elixir_chain, "~> 0.1.0"}
  ]
end
```

```elixir
# Start an intelligent research assistant
{:ok, agent} = ElixirChain.start_agent(%{
  name: "research_assistant",
  system_prompt: "You are a brilliant research assistant with access to web search and calculations.",
  tools: [:web_search, :calculator, :file_reader],
  llm_provider: :gemini,  # 2M token context window
  memory_type: :semantic
})

# Chat naturally
{:ok, response} = ElixirChain.chat(agent,
  "Research the latest developments in quantum computing and calculate the market growth rate")

# Stream responses for long-form content
stream = ElixirChain.chat_stream(agent, "Write a comprehensive report on AI trends")

for chunk <- stream do
  IO.write(chunk)
end
```

```elixir
# Create specialized agents
{:ok, researcher} = ElixirChain.start_agent(%{name: "researcher", tools: [:web_search]})
{:ok, writer} = ElixirChain.start_agent(%{name: "writer", tools: [:text_processor]})
{:ok, reviewer} = ElixirChain.start_agent(%{name: "reviewer", tools: []})

# Coordinate complex workflows
workflow_result = ElixirChain.Coordination.delegate([
  {researcher, "Research quantum computing trends"},
  {writer, "Create a technical summary from: {{research}}"},
  {reviewer, "Review and improve: {{summary}}"}
])
```

- Erlang/OTP 26+ - The foundation of reliability
- Elixir 1.15+ - Modern language features
- PostgreSQL 13+ - Vector storage with pgvector
- Redis 6+ - High-performance caching
```shell
# One command to rule them all
make ensure    # Installs Elixir, Erlang, PostgreSQL, Redis via mise
make setup     # Complete project setup (dependencies + database)
make test      # Verify everything works
make console   # Start interactive development environment
```

```shell
# Development workflow
make console      # Interactive shell (iex -S mix)
make test         # Run comprehensive test suite
make test-watch   # Continuous testing during development
make lint         # Code quality with Credo
make format       # Consistent code formatting
make dialyzer     # Static type analysis
make check-all    # Run all quality checks

# Database operations
make db-setup     # Initialize database with schema
make db-reset     # Fresh database reset
make db-migrate   # Apply schema migrations
make db-console   # Direct PostgreSQL access

# Service management
mise run services-start    # Start PostgreSQL and Redis
mise run services-stop     # Stop background services
mise run services-status   # Check service health
```

The BEAM virtual machine powers some of the world's most reliable systems:
- WhatsApp: 2+ billion users with 99.999% uptime
- Discord: Millions of concurrent voice/text channels
- Pinterest: Handling billions of requests per day
- Bet365: Real-time sports betting with zero downtime
- Klarna: Financial transactions requiring absolute reliability
ElixirChain aims to achieve:
- High Concurrency: Support thousands of simultaneous agents
- Fault Isolation: Individual agent failures don't cascade
- Rapid Recovery: Automatic restart with state preservation
- Horizontal Scaling: Add nodes to increase capacity
- Hot Updates: Deploy changes without downtime
- Security: Input validation, permission systems, secure tool execution
- Observability: Telemetry integration, distributed tracing, health checks
- Deployment: Docker containers, Kubernetes StatefulSets, zero-downtime updates
- Scaling: Horizontal scaling across multiple nodes
- Reliability: Supervision trees, circuit breakers, graceful degradation
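As an illustration of the circuit-breaker idea, here is a minimal GenServer that opens after three consecutive failures and then rejects calls outright. The module name and threshold are illustrative, not ElixirChain's actual middleware.

```elixir
defmodule Breaker do
  use GenServer

  @threshold 3

  def start_link(_opts), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  # Wrap any zero-arity function returning {:ok, _} or {:error, _}.
  def call(fun), do: GenServer.call(__MODULE__, {:call, fun})

  @impl true
  def init(:ok), do: {:ok, %{failures: 0}}

  @impl true
  def handle_call({:call, _fun}, _from, %{failures: f} = state) when f >= @threshold do
    # Open circuit: fail fast without invoking the downstream call.
    {:reply, {:error, :circuit_open}, state}
  end

  def handle_call({:call, fun}, _from, state) do
    case fun.() do
      {:ok, _} = ok -> {:reply, ok, %{state | failures: 0}}
      {:error, _} = err -> {:reply, err, %{state | failures: state.failures + 1}}
    end
  end
end
```

A production breaker would also half-open after a cooldown to probe recovery; this sketch only shows the fail-fast state transition.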
```elixir
# Built-in observability
:observer.start()                    # Visual process monitoring
ElixirChain.Metrics.agent_count()    # Current active agents
ElixirChain.Health.cluster_status()  # Distributed health check
```

This project is currently in the design and architecture phase. The comprehensive technical design document (elixir_chain_design_doc.md) contains the complete blueprint, but no Elixir implementation exists yet.
Phase 1: Core Framework (4-6 weeks)
- [x] Technical design complete
- [ ] Basic agent GenServer implementation
- [ ] LLM provider abstractions (OpenAI, Anthropic)
- [ ] Simple memory management (ETS-based)
- [ ] Tool system with basic tools
- [ ] Chain execution engine

Phase 2: Production Features (4-6 weeks)
- [ ] Persistent memory backends
- [ ] Vector similarity search
- [ ] Streaming responses with GenStage
- [ ] Middleware system (logging, metrics, caching)
- [ ] Rate limiting and circuit breakers

Phase 3: Advanced Features (6-8 weeks)
- [ ] Distributed multi-node support
- [ ] Advanced memory compression
- [ ] Web UI for agent management
- [ ] Plugin system for extensions
- [ ] Performance optimization

Phase 4: Ecosystem (4-6 weeks)
- [ ] Integration with vector databases
- [ ] Pre-built agent templates
- [ ] Deployment tooling (Docker, Kubernetes)
- [ ] Monitoring and observability
- [ ] Security hardening
- Technical Design Document - Complete architecture and implementation details
- Development Guide - Claude Code integration and development conventions
- API Reference - Complete API documentation (when released)
- Examples - Real-world usage patterns and tutorials
ElixirChain is designed to become the definitive AI agent framework for production systems. We welcome contributions that align with our core philosophy:
- Concurrency First: Leverage BEAM's process model
- Fault Tolerance: Let it crash, but recover gracefully
- Distribution Ready: Design for multiple nodes from day one
- Developer Experience: Make complex things simple
- Read the technical design document
- Check out the development setup
- Run `make setup` to get your environment ready
- Look for issues tagged `good-first-issue`
ElixirChain is released under the MIT License - see the LICENSE file for details.
ElixirChain represents a different approach to AI agent architecture: one that prioritizes reliability, concurrency, and operational simplicity. By leveraging decades of research in actor systems and fault-tolerant computing, we aim to make AI agents as robust and scalable as the telecommunication systems that inspired the BEAM VM.
AI agents deserve the same reliability guarantees as mission-critical systems.
Star this repo if you're interested in exploring actor-model approaches to AI agents!
Watch for updates as we develop this experimental framework!
Contribute to help explore new paradigms in agent architecture!