Decision simulation engine. Make uncertainty explicit, reproducible, and falsifiable.
DECIDE is a Monte Carlo simulation engine for decisions under uncertainty.
You declare your assumptions as probability distributions. The engine samples them 20,000 times, evaluates your outcome expressions, computes quantiles and sensitivity rankings, and tells you which assumptions drive your results — and by how much.
The agent layer (powered by Claude) can take a natural language question, structure it into a decision spec, run the simulation, and explain what matters.
Most decisions are made with:
- Single-point estimates ("it'll cost $150k")
- Hidden assumptions
- Optimistic projections
- No uncertainty modeling
- No way to know which assumptions actually matter
DECIDE forces you to be explicit. Every assumption has a distribution. Every outcome traces back to the engine, not a language model. Results are seeded and reproducible.
Two layers, strictly separated:
Deterministic Engine (no LLM):
- Vectorized Monte Carlo simulation (NumPy, seeded)
- Python DSL via operator overloading — no `eval()`, no YAML
- Quantile summary: P10 / P50 / P90
- Sensitivity ranking via Spearman correlation
- Constraint violation rates
- Linting and validation
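The engine's internals aren't shown here, but the statistics it reports are standard. A minimal sketch of the quantile summary, Spearman-based sensitivity ranking, and constraint violation rate (illustrative only — `summarize` and `sensitivity` are hypothetical names, not the DECIDE API):

```python
import numpy as np
from scipy.stats import spearmanr

def summarize(samples):
    """P10 / P50 / P90 quantile summary for one outcome."""
    p10, p50, p90 = np.quantile(samples, [0.10, 0.50, 0.90])
    return {"p10": p10, "p50": p50, "p90": p90}

def sensitivity(assumptions, outcome):
    """Rank assumptions by |Spearman rho| against an outcome."""
    rhos = {name: spearmanr(draws, outcome)[0] for name, draws in assumptions.items()}
    return sorted(rhos.items(), key=lambda kv: abs(kv[1]), reverse=True)

rng = np.random.default_rng(42)
n = 20_000
market_size = rng.uniform(10_000, 50_000, n)
price = rng.normal(49, 10, n)
cost = rng.uniform(100_000, 300_000, n)
revenue = market_size * price

print(summarize(revenue))
print(sensitivity({"market_size": market_size, "price": price}, revenue))
# Constraint violation rate: fraction of runs where `revenue > cost` fails
print(np.mean(~(revenue > cost)))
```

Spearman (rank) correlation is used rather than Pearson so that monotone but nonlinear relationships between an assumption and an outcome still register as strong drivers.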
Agent Layer (Claude Agent SDK):
- Takes natural language goals and generates specs
- Iterates on specs (lint, fix, re-lint)
- Interprets and narrates results
- Guides wizard flows
- Reasons about scenario comparisons
The agent cannot compute simulation results — it has no tool for that. Every number in the output comes from NumPy.
Requires Python 3.13+ and uv.
```bash
git clone <repo>
cd dec
uv sync
```

For the agent commands, set your Anthropic API key:

```bash
export ANTHROPIC_API_KEY=your_key_here
```

```python
from decide import Decision, uniform, triangular, normal

d = Decision("Expand to new market")

# Assumptions -- explicit distributions
market_size = d.assume("market_size", uniform(10_000, 50_000), unit="users")
conversion = d.assume("conversion_rate", triangular(0.01, 0.03, 0.08))
price = d.assume("price", normal(mean=49, std=10))
cost = d.assume("fixed_cost", uniform(100_000, 300_000), unit="usd")

# Derived -- computed from assumptions
revenue = d.derive("revenue", market_size * conversion * price)

# Outcomes -- what we care about
d.outcome("net_profit", revenue - cost)

# Constraints -- conditions that must hold
d.constraint("profitable", revenue > cost, severity="critical")

# Run
results = d.run(n=20_000, seed=42)
d.report(results)
```

Run it directly:
```bash
python spec.py
```

Or via CLI:

```bash
decide run spec.py --n 20000 --seed 42
```

```bash
# Interactive session
decide agent

# Goal-directed session
decide agent --goal "Should I hire 2 engineers or outsource to an agency?"

# Agent narrates an existing results file
decide agent explain results.json
```

```bash
decide run spec.py [--n 20000] [--seed 42] [--out results.json]
decide report results.json [--format terminal|markdown]
decide lint spec.py
decide diff a.py b.py [--n 20000] [--seed 42]
decide calibrate spec.py actuals.json
```

```bash
decide agent                          # interactive session
decide agent --goal "your question"   # goal-directed session
decide agent explain results.json     # agent narrates results
```

```python
from decide import fixed, uniform, normal, triangular

fixed(42)
uniform(low=10_000, high=50_000)
normal(mean=100, std=15)
triangular(low=1, mode=3, high=8)
```

```python
d = Decision("name")

# Register an assumption with a distribution
var = d.assume("name", distribution, unit="usd", notes="context")

# Name an intermediate expression
derived = d.derive("name", expr)

# Register an outcome to track
d.outcome("name", expr)

# Register a constraint
d.constraint("name", boolean_expr, severity="critical")  # or "warning"

# Run simulation
results = d.run(n=20_000, seed=42)

# Print terminal report
d.report(results)
```

Expressions are built using standard Python operators on variables:
`+`, `-`, `*`, `/`, `**`, `>`, `<`, `>=`, `<=`
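DECIDE's actual node classes aren't public in this document, but the operator-overloading pattern it relies on can be sketched in a few lines (a hypothetical `Node` class, not the library's internals): each operator returns a new graph node instead of computing a value, and evaluation happens later over vectorized samples.

```python
import numpy as np

class Node:
    """Expression-graph node: operators build the graph, nothing is computed yet."""
    def __init__(self, fn):
        self.fn = fn  # maps a dict of sample arrays -> np.ndarray

    def __add__(self, other): return Node(lambda s: self.fn(s) + _ev(other, s))
    def __sub__(self, other): return Node(lambda s: self.fn(s) - _ev(other, s))
    def __mul__(self, other): return Node(lambda s: self.fn(s) * _ev(other, s))
    def __gt__(self, other):  return Node(lambda s: self.fn(s) > _ev(other, s))

def _ev(x, s):
    # Evaluate a child node, or pass plain constants through unchanged
    return x.fn(s) if isinstance(x, Node) else x

def var(name):
    return Node(lambda s: s[name])

# Build `revenue - cost` as a graph, then evaluate it over sample arrays:
profit = var("revenue") - var("cost")
samples = {"revenue": np.array([100.0, 200.0]), "cost": np.array([50.0, 250.0])}
print(profit.fn(samples))        # vectorized arithmetic
print((profit > 0).fn(samples))  # boolean constraint array
```

Because the graph is built by Python itself, no expression parser or `eval()` is ever needed; a full implementation would also define the reflected and remaining comparison operators.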
Collections are just Python:
```python
tasks = []
for i in range(12):
    cp = d.assume(f"task_{i}_cp", triangular(1, 3, 8))
    aai = d.assume(f"task_{i}_aai", uniform(0.3, 0.9))
    tasks.append((cp, aai))

total_effort = d.derive("total_effort", sum(cp * (1 - aai) for cp, aai in tasks))
```

Three included examples:
| File | Decision |
|---|---|
| `examples/hire_vs_outsource.py` | Hire engineers vs. agency cost and output |
| `examples/market_expansion.py` | Market entry revenue and profit |
| `examples/build_vs_buy.py` | Build in-house vs. buy a vendor solution |

Run any of them:

```bash
decide run examples/hire_vs_outsource.py
decide diff examples/hire_vs_outsource.py examples/market_expansion.py
```

User: "Should I hire 2 engineers or outsource to an agency?"
Agent:
1. Asks clarifying questions (budget, timeline, velocity expectations)
2. Generates a decision spec with distributions for each assumption
3. Calls `lint_spec` -- fixes any warnings
4. Shows the assumptions table, asks for adjustments
5. Calls `run_simulation` (n=20000, seed=42)
6. Calls `read_results`
7. Reasons about sensitivity ranking
8. Explains: "Hiring has P50 cost of $180k vs $220k outsourcing.
The key driver is ramp-up time -- it accounts for 64% of variance.
If ramp-up exceeds 6 weeks, outsourcing wins."
The agent never fabricates numbers. Every figure traces to a run_simulation call.
The agent interacts with the engine exclusively through these tools:
| Tool | What it does |
|---|---|
| `generate_spec` | Write a Python spec from a goal description |
| `lint_spec` | Validate and check a spec file |
| `run_simulation` | Execute Monte Carlo simulation |
| `read_results` | Read structured results JSON |
| `diff_specs` | Compare two scenarios |
| `edit_spec` | Modify assumptions in a spec |
| `explain_results` | Narrate results in plain language |
| `calibrate` | Compare predictions vs. actuals |
- Code-driven, not config-driven. Specs are Python scripts, not YAML. The computation graph is built by operator overloading — no expression parsing, no `eval()`.
- Deterministic. Every simulation is seeded. Same seed, same result, always.
- Agent reasons. Engine computes. The LLM never produces numbers. The boundary is enforced by tools.
- Falsifiable. Assumptions are explicit and inspectable. Results can be compared against actuals when outcomes are known.
- Diffable. Specs are `.py` files. Version them, review them, diff them.
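The determinism guarantee above reduces to NumPy's seeded `Generator`. A quick way to see the property in isolation (illustrative sketch, not DECIDE code — `simulate` is a hypothetical stand-in for a spec run):

```python
import numpy as np

def simulate(seed: int, n: int = 20_000) -> np.ndarray:
    rng = np.random.default_rng(seed)      # one seeded generator per run
    market = rng.uniform(10_000, 50_000, n)
    price = rng.normal(49, 10, n)
    return market * price

# Same seed -> bit-identical arrays; different seed -> different draws.
a, b = simulate(seed=42), simulate(seed=42)
c = simulate(seed=7)
print(np.array_equal(a, b), np.array_equal(a, c))
```

Keeping one generator per run (rather than global `np.random` state) is what makes results independent of import order and of any other code that draws random numbers.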
- Python 3.13+
- uv for package management
- NumPy for vectorized simulation
- SciPy for Spearman correlation (sensitivity)
- Pydantic for results schema
- Typer + Rich for CLI
- Anthropic Claude Agent SDK for agent layer
- Conditional distributions
- Multi-phase / time-series simulation
- Sobol sensitivity indices
- Quasi-Monte Carlo (LHS) sampling
- Web UI
- Template library (estimation, risk, portfolio, hiring)
- Calibration with historical data
- PDF report export
- MCP server (expose DECIDE tools to any MCP client)
- Agent memory (remember past decisions and their outcomes)