
University Sim

Multi-agent university simulation benchmark where LLM-powered agents (professors, students, staff, IT, management, applicants) interact within configurable university archetypes. Compares LLM providers (Claude, Qwen, Kimi, Gemini) on coordination quality, research output, and emergent behavior.

What Makes This Different

This isn't just another multi-agent chat. It combines three powerful ideas:

  • Concordia-style Game Master (Google DeepMind) -- centralized conflict resolution where agents observe, plan, act, and a GM adjudicates outcomes via LLM
  • OpenClaw-inspired Agent Coordination -- inter-agent P2P/broadcast messaging + shared task board with dependency chains, enabling bottom-up emergent coordination
  • Moltbook Molt Dynamics Metrics (arXiv:2603.03555) -- emergence score, role specialization index, core-periphery analysis, and communication reciprocity tracking

Architecture

Agents (observe/plan/act)
    |
    v
SimulationEngine ── MessageRouter (P2P + broadcast)
    |                TaskBoard (post/claim/complete with deps)
    v
GameMaster (LLM-based conflict resolution)
    |
    v
MetricsTracker ── MoltDynamicsMetrics (emergence scoring)
    |
    v
WorkspaceManager (per-agent persistence)
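The Game Master step in the diagram can be sketched as a single adjudication call: agents propose actions for the step, and one LLM call resolves conflicts into an authoritative outcome. This is a minimal illustration of the pattern only; names like `gm_adjudicate` and `ProposedAction` are invented here and are not the repo's actual API.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ProposedAction:
    agent_id: str
    action: str

async def gm_adjudicate(llm_call, proposals):
    """Concordia-style adjudication: agents propose, one LLM call resolves
    conflicts and returns the authoritative outcome for the step."""
    prompt = "Resolve these simultaneous actions:\n" + "\n".join(
        f"- {p.agent_id}: {p.action}" for p in proposals
    )
    return await llm_call(prompt)  # GM verdict, applied to shared state

async def demo():
    async def fake_llm(prompt):  # stand-in for the real LLM gateway
        return f"outcome for {prompt.count('-')} actions"
    proposals = [ProposedAction("prof_1", "book lab"),
                 ProposedAction("stud_2", "book lab")]
    return await gm_adjudicate(fake_llm, proposals)

print(asyncio.run(demo()))  # outcome for 2 actions
```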

Time compression: 6 real-world months are compressed into 6 simulation days (36 steps/day, 216 steps total).
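Under those figures, each simulation step covers roughly 20 hours of modeled real time. A quick sanity check (assuming ~30-day months, which the repo may define differently):

```python
REAL_DAYS = 6 * 30      # ~6 months of real-world time being modeled
SIM_DAYS = 6            # compressed into 6 simulation days
STEPS_PER_DAY = 36
TOTAL_STEPS = SIM_DAYS * STEPS_PER_DAY           # 216 steps, as documented
REAL_HOURS_PER_STEP = REAL_DAYS * 24 / TOTAL_STEPS

print(TOTAL_STEPS, round(REAL_HOURS_PER_STEP, 1))  # 216 20.0
```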

Features

Simulation Engine

  • 6 agent roles: Professor, Student, Staff, IT Staff, Management, Applicant
  • Generative Agents memory system (stream + retrieval + reflection)
  • Academic calendar with phases: Semester Start, Mid-Semester, Application Period, Finals, Summer Research
  • Parallelized agent observe/plan/act via asyncio
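The parallel observe/plan/act loop can be sketched with `asyncio.gather`: each agent runs its own pipeline, and all agents advance concurrently within one step. The class and method names below are illustrative, not the repo's actual interfaces.

```python
import asyncio

class Agent:
    def __init__(self, name):
        self.name = name

    async def observe(self, world):
        return f"{self.name} sees {world}"

    async def plan(self, obs):
        return f"plan from: {obs}"

    async def act(self, plan):
        return f"{self.name} acts on [{plan}]"

async def step(agent, world):
    # one agent's observe -> plan -> act pipeline
    obs = await agent.observe(world)
    plan = await agent.plan(obs)
    return await agent.act(plan)

async def run_step(agents, world):
    # all agents advance concurrently within a single simulation step
    return await asyncio.gather(*(step(a, world) for a in agents))

results = asyncio.run(run_step([Agent("prof_1"), Agent("stud_1")], "lecture hall"))
print(len(results))  # 2
```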

Always-On Mode

  • Continuous 24/7 simulation: All 4 universities run as parallel async loops that restart automatically
  • Cross-cycle learning: Agents restore memory and state from previous runs via Supabase persistence
  • Error isolation: One university crashing doesn't affect the others -- it auto-retries after a configurable interval
  • API control: Start/stop/pause/resume via REST endpoints (/api/always-on/*)
  • Auto-start: Set ALWAYS_ON=true env var to start on API boot, or trigger via API endpoint
  • Configurable concurrency: Semaphore-based throttling (ALWAYS_ON_MAX_CONCURRENT, default 2)
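The semaphore-based throttling above can be sketched as follows: all four university loops run concurrently, but at most `max_concurrent` of them execute a cycle at any moment. This is a simplified model of the mechanism, not the repo's runner code.

```python
import asyncio

async def university_loop(name, sem, cycles=2):
    for _ in range(cycles):
        async with sem:          # at most max_concurrent cycles run at once
            await asyncio.sleep(0)  # stand-in for one simulation cycle
        # in the real runner, a crash here is caught per-loop so the
        # other universities keep running (error isolation)
    return name

async def main(max_concurrent=2):
    sem = asyncio.Semaphore(max_concurrent)
    names = ["tech_school", "business_school", "legacy_school", "commonwealth"]
    return await asyncio.gather(*(university_loop(n, sem) for n in names))

print(asyncio.run(main()))
```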

Coordination Systems (OpenClaw-inspired)

  • Inter-Agent Messaging: P2P direct messages + broadcast filtered by role/department
  • Shared Task Board: Agents post tasks with priority and dependencies; others claim and complete them
  • Cross-role collaboration tracking: Measures how different roles work together organically
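The shared task board's post/claim/complete cycle with dependency chains can be sketched as below. This is a minimal illustration; the actual `TaskBoard` in `src/core/` will differ.

```python
class TaskBoard:
    def __init__(self):
        self.tasks = {}

    def post(self, task_id, deps=(), priority=1):
        self.tasks[task_id] = {"deps": set(deps), "priority": priority,
                               "claimed_by": None, "done": False}

    def claimable(self):
        # a task becomes claimable only once all its dependencies are done
        return [tid for tid, t in self.tasks.items()
                if t["claimed_by"] is None and not t["done"]
                and all(self.tasks[d]["done"] for d in t["deps"])]

    def claim(self, task_id, agent_id):
        self.tasks[task_id]["claimed_by"] = agent_id

    def complete(self, task_id):
        self.tasks[task_id]["done"] = True

board = TaskBoard()
board.post("collect_data")
board.post("write_paper", deps=["collect_data"])
print(board.claimable())  # ['collect_data'] -- write_paper is blocked by its dependency
board.claim("collect_data", "stud_1")
board.complete("collect_data")
print(board.claimable())  # ['write_paper']
```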

Metrics & Analytics (Moltbook-inspired)

  • Emergence Score (0-100): Composite of specialization, reciprocity, task completion, cross-role collaboration, and activity distribution
  • Role Specialization Index: Shannon entropy of action distributions per agent
  • Core-Periphery Analysis: Identifies hub agents vs peripheral ones, Gini coefficient
  • Communication Dynamics: Response rates, reciprocity, information bridges
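A Shannon-entropy specialization index can be sketched as below: an agent that always performs the same action scores near 1, while a uniform action distribution scores 0. The exact normalization used by `MoltDynamicsMetrics` may differ; this shows the idea only.

```python
import math
from collections import Counter

def specialization_index(actions):
    """1 - normalized Shannon entropy of an agent's action distribution.
    1.0 = fully specialized (one action), 0.0 = uniform across actions."""
    counts = Counter(actions)
    n = sum(counts.values())
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 1.0
    return 1.0 - entropy / max_entropy

print(specialization_index(["teach"] * 9 + ["grade"]))            # ~0.53, skewed toward one action
print(specialization_index(["teach", "grade", "email", "meet"]))  # 0.0, perfectly uniform
```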

Live Dashboard (Next.js)

  • 6-panel real-time dashboard: Live Feed, Metrics Chart, Emergence Panel, Message Feed, Task Board, Agent Roster
  • D3 force-directed interaction graph with core/periphery highlighting
  • WebSocket streaming from FastAPI backend
  • Supabase persistence for simulation events

Benchmarking

  • Compare LLM providers (Claude, Qwen, Kimi, Gemini) on identical scenarios
  • 4 university archetypes with different success metrics
  • Parallel trial execution with semaphore-based throttling
  • Statistical analysis with per-provider breakdowns

University Archetypes

| Config | Type | Success Metric |
|---|---|---|
| tech_school | MIT-like | research_output |
| business_school | HBS-like | network_quality |
| legacy_school | Oxford-like | prestige_maintenance |
| commonwealth | Public university | accessibility |

Quick Start

# Clone
git clone https://github.com/lonexreb/university-sim.git
cd university-sim

# Python backend
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Set API keys (at least one provider)
cp .env.example .env
# Edit .env with your keys

# Run tests
python -m pytest tests/ -q

# Run a simulation (36 steps = 1 simulated day)
python scripts/run_simulation.py --university tech_school --provider qwen --steps 36

# Run full benchmark
python scripts/run_benchmark.py --providers qwen kimi gemini --trials 3

Web Dashboard

# Terminal 1: FastAPI backend
source .venv/bin/activate
python scripts/run_api.py

# Terminal 2: Next.js frontend
cd web
npm install
npm run dev

Open http://localhost:3000 to access the dashboard.

Always-On Mode

# Option 1: Auto-start on boot
ALWAYS_ON=true python scripts/run_api.py

# Option 2: Start via API
python scripts/run_api.py
curl -X POST localhost:8000/api/always-on/start
curl localhost:8000/api/always-on/status
curl -X POST localhost:8000/api/always-on/stop

# Option 3: Start with custom provider/interval
curl -X POST "localhost:8000/api/always-on/start?provider=qwen&interval=120"

# Pause/resume a single university
curl -X POST localhost:8000/api/always-on/tech_school/pause
curl -X POST localhost:8000/api/always-on/tech_school/resume

Environment variables for always-on mode:

| Variable | Default | Description |
|---|---|---|
| ALWAYS_ON | false | Auto-start on API boot |
| ALWAYS_ON_PROVIDER | gemini | LLM provider (cheapest default) |
| ALWAYS_ON_INTERVAL | 60 | Seconds between cycles |
| ALWAYS_ON_MAX_CONCURRENT | 2 | Max universities running in parallel |
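Reading these variables with their documented defaults might look like the sketch below; the variable names and defaults come from the table above, but the parsing helper itself is illustrative, not the repo's code.

```python
import os

def always_on_config(env=None):
    # defaults mirror the documented table; parsing is a sketch, not the repo's code
    env = os.environ if env is None else env
    return {
        "enabled": env.get("ALWAYS_ON", "false").lower() == "true",
        "provider": env.get("ALWAYS_ON_PROVIDER", "gemini"),
        "interval": int(env.get("ALWAYS_ON_INTERVAL", "60")),
        "max_concurrent": int(env.get("ALWAYS_ON_MAX_CONCURRENT", "2")),
    }

print(always_on_config({}))  # all defaults: disabled, gemini, 60s between cycles, 2 concurrent
```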

Project Structure

src/
  agents/         # BaseAgent + 6 roles + MemorySystem + WorkspaceManager + factory
  core/           # SimulationEngine, GameMaster, CampusEnvironment, MessageRouter, TaskBoard, EventBus
  llm/            # LLMGateway (LiteLLM multi-provider), TokenTracker
  metrics/        # Research, Network, Prestige, Accessibility, Coordination, MoltDynamics
  benchmarking/   # Parallel ExperimentRunner, StatisticalAnalysis, ReportGenerator
  api/            # FastAPI backend with Supabase, WebSocket broadcasting, always-on runner
web/              # Next.js 16 + React 19 + Tailwind 4 + D3 + Recharts dashboard
config/
  universities/   # 4 archetype YAML configs
  simulation.yaml # Global settings
scripts/          # CLI runners
tests/            # 104 tests

LLM Providers

| Provider | Model | Key Env Var |
|---|---|---|
| qwen | openai/qwen-plus | DASHSCOPE_API_KEY |
| kimi | openai/kimi-k2-0711-preview | MOONSHOT_API_KEY |
| gemini | gemini/gemini-2.5-flash | GOOGLE_API_KEY |
| claude | anthropic/claude-sonnet-4-5-20250929 | ANTHROPIC_API_KEY |

Tech Stack

Backend: Python 3.10+, LiteLLM, NetworkX, NumPy, FastAPI, Supabase, asyncio, tenacity

Frontend: Next.js 16, React 19, Tailwind CSS 4, Recharts, D3.js, WebSocket

Inspirations & References

  • Concordia -- Game Master pattern for multi-agent simulation (Google DeepMind)
  • Generative Agents -- Memory stream + retrieval + reflection (Park et al., 2023)
  • OpenClaw -- Agent Teams RFC, inter-agent messaging, per-agent workspaces
  • Molt Dynamics -- Emergent social phenomena in autonomous AI agent populations (Moltbook research)

License

MIT
