Capture the Flag 🚩

A faction-based social deduction game in which autonomous agents driven by Claude Sonnet 4 compete through strategic night actions, deduction, and flag capture.

✅ What's Working

Core Systems (92 Tests Passing)

Action System (28 tests)
- 8-point action allocation (snoop, raid, kill, defend)
- Success probability calculations with power multipliers
- Action queue with danger-based ordering
- Flag tracking and transfers
Resolution System (19 tests)
- Sequential action resolution
- Stochastic success/failure based on probabilities
- Flag transfers on kill/raid
- Evidence generation integration
Evidence System (8 tests)
- Physical trait-based evidence
- Ambiguity maximization
- Crime scene vs night activity evidence
- Player trait matching
Game Integration (9 tests)
- Flag tracker initialization
- Night/day phase execution
- Victory detection
- Multi-turn progression
LLM Action Decisions (15 tests, 11 non-stochastic)
- Real Claude Sonnet 4 integration
- Mock agent for fast testing
- 8-point validation
- Retry logic with fallback
- Structured tool-based output
Game Orchestration (17 tests)
- Complete game loop
- Faction assignment (3-6 factions for 3-26 players)
- Night phase: LLM decisions → resolution → evidence
- Day phase: revelations and flag tracking
- Victory checking (all flags or last faction standing)
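
The Evidence System bullets above mention trait-based evidence and ambiguity maximization. As a rough illustration only, the sketch below picks the culprit's trait that the largest number of players share, so the resulting clue implicates as many suspects as possible. The function names, the trait dictionary shape, and the selection rule are all assumptions made for this example; they are not taken from `rills/evidence.py`.

```python
from collections import Counter


def most_ambiguous_trait(culprit, traits_by_player):
    """Pick the culprit's trait that the most players share (hypothetical rule).

    traits_by_player maps player name -> {trait name: value}, for example
    {"Alice": {"height": "tall", "clothes": "dark"}}.
    """
    # Count how many players carry each (trait name, value) pair.
    counts = Counter(
        (name, value)
        for traits in traits_by_player.values()
        for name, value in traits.items()
    )
    culprit_traits = traits_by_player[culprit].items()
    # max() returns the first maximal item, so ties go to the trait listed first.
    return max(culprit_traits, key=lambda item: counts[item])


def matching_players(trait, traits_by_player):
    """Return, in sorted order, every player whose traits match the clue."""
    name, value = trait
    return sorted(p for p, t in traits_by_player.items() if t.get(name) == value)
```

A clue chosen this way always matches the culprit, and `matching_players` gives the suspect pool that the other players would have to reason about.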

Demo Game

Run a complete game:

# Play with real AI (Claude Sonnet 4)
python play.py

# With mock AI (fast, deterministic)
python play.py --mock --players 6

# Custom settings
python play.py --players 12 --max-turns 30

# Quiet mode
python play.py --quiet

📊 System Architecture

Game Orchestrator (game_orchestrator.py)
    ↓
Night Phase Loop
    → ActionDecisionAgent (llm_actions.py)
        → Claude Sonnet 4 with structured output
        → Returns validated 8-point allocations
    → ActionQueue (actions.py)
        → Sorts by danger level
    → ActionResolver (resolution.py)
        → Stochastic execution
        → Flag transfers
    → EvidenceGenerator (evidence.py)
        → Trait-based clues
    ↓
Day Phase
    → Reveal deaths
    → Show evidence
    → Discussion
    ↓
Victory Check
    → All flags held by one faction?
    → Only one faction has survivors?
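
The queue and resolver stages in the diagram above can be sketched as follows. This is a minimal sketch under stated assumptions: the `Action` dataclass, the `DANGER` ranking, and the choice to resolve the most dangerous actions first are hypothetical, and the success probability is passed in as a callable because the real power-multiplier formula is not documented here.

```python
import random
from dataclasses import dataclass

# Assumed danger ranking (higher resolves first). The real ordering used by
# rills/actions.py may differ.
DANGER = {"kill": 3, "raid": 2, "snoop": 1, "defend": 0}


@dataclass
class Action:
    """Hypothetical record of one queued night action."""
    actor: str
    kind: str
    target: str
    points: int


def order_by_danger(actions):
    """Sort actions from most to least dangerous."""
    # sorted() is stable, so actions of equal danger keep submission order.
    return sorted(actions, key=lambda a: DANGER[a.kind], reverse=True)


def resolve(actions, success_probability, rng):
    """Resolve actions sequentially and return (action, succeeded) pairs."""
    results = []
    for action in order_by_danger(actions):
        p = success_probability(action)
        # rng.random() is in [0, 1), so p=1.0 always succeeds and p=0.0 never does.
        results.append((action, rng.random() < p))
    return results
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is one way stochastic resolution can be exercised in deterministic tests.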

🎮 Game Rules

Objective: Capture all flags or eliminate every other faction

Night Phase:

Each player allocates 8 action points among four actions:
- Snoop (3x power): Learn target's faction and flag status
- Raid (1x power): Steal flag from target
- Kill (1x power): Eliminate target and take their flags
- Defend: Protect yourself from attacks
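
To make the 8-point budget concrete, here is a small validation sketch. The `Allocation` class is hypothetical, and it assumes a player must spend exactly 8 points with no negative values; the actual rule in `rills/actions.py` (for example, whether unspent points are allowed) may differ.

```python
from dataclasses import dataclass

TOTAL_POINTS = 8  # from the rules above: each player allocates 8 action points


@dataclass(frozen=True)
class Allocation:
    """Hypothetical container for one player's night allocation."""
    snoop: int = 0
    raid: int = 0
    kill: int = 0
    defend: int = 0

    def validate(self) -> None:
        """Raise ValueError unless the points are non-negative and sum to 8."""
        points = (self.snoop, self.raid, self.kill, self.defend)
        if any(p < 0 for p in points):
            raise ValueError("action points cannot be negative")
        if sum(points) != TOTAL_POINTS:
            raise ValueError(f"expected {TOTAL_POINTS} points, got {sum(points)}")
```

For instance, `Allocation(snoop=3, kill=2, defend=3).validate()` passes silently, while `Allocation(snoop=9).validate()` raises `ValueError`.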

Day Phase:

Deaths revealed with faction
Evidence shown to all players
Discussion period (future: voting)

Victory:

Hold all flags, OR
Be the last faction with survivors
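
The two victory conditions can be expressed as one check. The sketch below is illustrative only: `check_victory` and its argument shapes are assumptions, and it presumes that flags always sit with living players because kills transfer them.

```python
from typing import Optional


def check_victory(
    flag_holders: dict[str, str],
    player_factions: dict[str, str],
    alive: set[str],
) -> Optional[str]:
    """Return the winning faction, or None if the game continues.

    flag_holders maps flag name -> player holding it, and player_factions
    maps player -> faction. Both shapes are assumptions for this sketch.
    """
    # Last faction standing: every living player belongs to one faction.
    surviving = {player_factions[p] for p in alive}
    if len(surviving) == 1:
        return next(iter(surviving))

    # All flags: every flag is held by members of a single faction.
    holding = {player_factions[h] for h in flag_holders.values()}
    if len(holding) == 1:
        return next(iter(holding))
    return None
```

With the setup from the game example (one flag each with Bob, Charlie, and Eve, and all six players alive), the function returns `None` because three factions still hold flags and have survivors.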

🧪 Testing

# Run all non-stochastic tests (92 tests)
pytest tests/test_actions.py \
       tests/test_resolution.py \
       tests/test_evidence.py \
       tests/test_game_integration.py \
       tests/test_llm_actions.py::TestMockActionDecisions \
       tests/test_llm_actions.py::TestActionDecisionValidation \
       tests/test_llm_actions.py::TestActionDecisionRetries \
       tests/test_game_orchestrator.py

# Run specific test suite
pytest tests/test_game_orchestrator.py -v

# Run real LLM tests (slow, requires API key)
pytest tests/test_llm_actions.py::TestLLMActionDecisions -v

📁 Key Files

Core Systems:

rills/actions.py - Action allocation, queue, flag tracking
rills/resolution.py - Action execution and resolution
rills/evidence.py - Physical evidence generation
rills/llm_actions.py - LLM-based action decisions

Orchestration:

rills/game_orchestrator.py - Complete game coordinator
rills/phases/capture_flag_night.py - Night phase handler
rills/phases/capture_flag_day.py - Day phase handler

Demo:

play.py - Playable game with CLI

Tests:

tests/test_actions.py - Action system tests
tests/test_resolution.py - Resolution tests
tests/test_evidence.py - Evidence tests
tests/test_game_integration.py - Integration tests
tests/test_llm_actions.py - LLM decision tests
tests/test_game_orchestrator.py - Full game tests

🚀 Next Steps

Potential enhancements:

Location System - Strategic positioning during day phase
Advanced LLM Context - Track faction reveals and alliances
Voting/Discussion - Democratic elimination during day
Special Abilities - Unique powers per player
Web Interface - Real-time game visualization
Tournament Mode - Multiple games with ELO rankings

🎯 Development Approach

Built using Test-Driven Development:

Write comprehensive tests first
Implement minimal code to pass
Refactor for clarity
Validate with real LLM API

All 92 non-stochastic tests pass consistently. Real LLM tests validate strategic decision-making.

📝 Game Example

=== Starting Capture the Flag Game ===
Players: 6
Factions: Red Faction, Blue Faction, Green Faction

📋 Initial Setup:
  Red Faction: Alice, Bob
  Blue Faction: Charlie, David
  Green Faction: Eve, Frank

🚩 Initial Flags:
  Red Faction flag → Bob
  Blue Faction flag → Charlie
  Green Faction flag → Eve

--- Turn 1 ---
🌙 Night 0 begins...
Night 0 ends. 0 player(s) died.

☀️ Day 1 begins...
## Morning Revelations
Everyone survived the night.

## Evidence Found
🔍 **video** - Security camera captured a tall person
🔍 **fabric** - Fabric fibers from dark clothes found at scene
[... evidence continues ...]

## Discussion Period
Players discuss the night's events...

Built with ❤️ using TDD and Claude Sonnet 4
