nitindatta/decide

DECIDE

Decision simulation engine. Make uncertainty explicit, reproducible, and falsifiable.


What It Is

DECIDE is a Monte Carlo simulation engine for decisions under uncertainty.

You declare your assumptions as probability distributions. The engine samples them 20,000 times, evaluates your outcome expressions, computes quantiles and sensitivity rankings, and tells you which assumptions drive your results — and by how much.

The agent layer (powered by Claude) can take a natural language question, structure it into a decision spec, run the simulation, and explain what matters.


The Problem It Solves

Most decisions are made with:

  • Single-point estimates ("it'll cost $150k")
  • Hidden assumptions
  • Optimistic projections
  • No uncertainty modeling
  • No way to know which assumptions actually matter

DECIDE forces you to be explicit. Every assumption has a distribution. Every outcome traces back to the engine, not a language model. Results are seeded and reproducible.


Architecture

Two layers, strictly separated:

Deterministic Engine (no LLM):

  • Vectorized Monte Carlo simulation (NumPy, seeded)
  • Python DSL via operator overloading — no eval(), no YAML
  • Quantile summary: P10 / P50 / P90
  • Sensitivity ranking via Spearman correlation
  • Constraint violation rates
  • Linting and validation
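Taken together, these pieces fit in a handful of NumPy/SciPy calls. A conceptual sketch of the engine loop, using the assumption names from the Quick Start; this is illustrative only, not the actual implementation:

```python
import numpy as np
from scipy.stats import spearmanr

# Seeded generator: same seed, same samples, always
rng = np.random.default_rng(42)
n = 20_000

# Sample each assumption as a length-n vector
market_size = rng.uniform(10_000, 50_000, n)
conversion  = rng.triangular(0.01, 0.03, 0.08, n)
price       = rng.normal(49, 10, n)
cost        = rng.uniform(100_000, 300_000, n)

# Evaluate outcome expressions, vectorized
revenue    = market_size * conversion * price
net_profit = revenue - cost

# Quantile summary
p10, p50, p90 = np.quantile(net_profit, [0.10, 0.50, 0.90])

# Sensitivity ranking: |Spearman rank correlation| of each assumption vs. the outcome
assumptions = {"market_size": market_size, "conversion": conversion,
               "price": price, "fixed_cost": cost}
sensitivity = {name: abs(spearmanr(s, net_profit)[0]) for name, s in assumptions.items()}

# Constraint violation rate: fraction of draws where the constraint fails
violation_rate = float(np.mean(~(revenue > cost)))
```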

Agent Layer (Claude Agent SDK):

  • Takes natural language goals and generates specs
  • Iterates on specs (lint, fix, re-lint)
  • Interprets and narrates results
  • Guides wizard flows
  • Reasons about scenario comparisons

The agent cannot compute simulation results — it has no tool for that. Every number in the output comes from NumPy.


Installation

Requires Python 3.13+ and uv.

git clone <repo>
cd decide
uv sync

For the agent commands, set your Anthropic API key:

export ANTHROPIC_API_KEY=your_key_here

Quick Start

Write a spec

from decide import Decision, uniform, triangular, normal

d = Decision("Expand to new market")

# Assumptions -- explicit distributions
market_size = d.assume("market_size", uniform(10_000, 50_000), unit="users")
conversion  = d.assume("conversion_rate", triangular(0.01, 0.03, 0.08))
price       = d.assume("price", normal(mean=49, std=10))
cost        = d.assume("fixed_cost", uniform(100_000, 300_000), unit="usd")

# Derived -- computed from assumptions
revenue = d.derive("revenue", market_size * conversion * price)

# Outcomes -- what we care about
d.outcome("net_profit", revenue - cost)

# Constraints -- conditions that must hold
d.constraint("profitable", revenue > cost, severity="critical")

# Run
results = d.run(n=20_000, seed=42)
d.report(results)

Run it directly:

python spec.py

Or via CLI:

decide run spec.py --n 20000 --seed 42

Use the agent

# Interactive session
decide agent

# Goal-directed session
decide agent --goal "Should I hire 2 engineers or outsource to an agency?"

# Agent narrates an existing results file
decide agent explain results.json

CLI Reference

Engine commands (no LLM required)

decide run spec.py [--n 20000] [--seed 42] [--out results.json]
decide report results.json [--format terminal|markdown]
decide lint spec.py
decide diff a.py b.py [--n 20000] [--seed 42]
decide calibrate spec.py actuals.json
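One way to think about `decide calibrate`: did each realized actual land inside the predicted P10-P90 band? A minimal sketch with hypothetical data shapes (the real results.json / actuals.json schemas may differ):

```python
# Hypothetical shapes -- the actual file formats may differ.
predicted = {"net_profit": {"p10": -50_000.0, "p50": 120_000.0, "p90": 310_000.0}}
actuals   = {"net_profit": 95_000.0}

def coverage(predicted, actuals):
    """Fraction of outcomes whose realized value fell inside the predicted P10-P90 band."""
    hits = [predicted[name]["p10"] <= value <= predicted[name]["p90"]
            for name, value in actuals.items()]
    return sum(hits) / len(hits)

coverage(predicted, actuals)  # 1.0: the actual landed inside the band
```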

Agent commands (requires ANTHROPIC_API_KEY)

decide agent                              # interactive session
decide agent --goal "your question"       # goal-directed session
decide agent explain results.json         # agent narrates results

DSL Reference

Distributions

from decide import fixed, uniform, normal, triangular

fixed(42)
uniform(low=10_000, high=50_000)
normal(mean=100, std=15)
triangular(low=1, mode=3, high=8)
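Each of these maps naturally onto a NumPy `Generator` call; a plausible correspondence (illustrative, not DECIDE's actual source):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5

fixed_samples      = np.full(n, 42.0)                # fixed(42): a constant
uniform_samples    = rng.uniform(10_000, 50_000, n)  # uniform(low, high)
normal_samples     = rng.normal(100, 15, n)          # normal(mean, std)
triangular_samples = rng.triangular(1, 3, 8, n)      # triangular(low, mode, high)
```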

Decision API

d = Decision("name")

# Register an assumption with a distribution
var = d.assume("name", distribution, unit="usd", notes="context")

# Name an intermediate expression
derived = d.derive("name", expr)

# Register an outcome to track
d.outcome("name", expr)

# Register a constraint
d.constraint("name", boolean_expr, severity="critical")  # or "warning"

# Run simulation
results = d.run(n=20_000, seed=42)

# Print terminal report
d.report(results)

Expressions are built using standard Python operators on variables: +, -, *, /, **, >, <, >=, <=
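The mechanism behind this is ordinary operator overloading: each operator builds a node in a computation graph instead of evaluating immediately. A toy version of the idea (not DECIDE's actual classes):

```python
import numpy as np

class Expr:
    """Toy expression node: operators record the computation instead of running it."""
    def __init__(self, fn):
        self.fn = fn  # maps a dict of sample arrays to an ndarray

    def __add__(self, other): return Expr(lambda s: self.fn(s) + _eval(other, s))
    def __sub__(self, other): return Expr(lambda s: self.fn(s) - _eval(other, s))
    def __mul__(self, other): return Expr(lambda s: self.fn(s) * _eval(other, s))
    def __gt__(self, other):  return Expr(lambda s: self.fn(s) > _eval(other, s))

def _eval(x, s):
    return x.fn(s) if isinstance(x, Expr) else x  # allow plain numbers too

def var(name):
    return Expr(lambda s: s[name])

# Build the graph once -- no string parsing, no eval()
profit     = var("revenue") - var("cost")
profitable = var("revenue") > var("cost")

# Evaluate later against vectorized samples
samples = {"revenue": np.array([100.0, 300.0]), "cost": np.array([200.0, 200.0])}
profit.fn(samples)      # array([-100., 100.])
profitable.fn(samples)  # array([False, True])
```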

Collections

Collections are just Python:

tasks = []
for i in range(12):
    cp  = d.assume(f"task_{i}_cp",  triangular(1, 3, 8))
    aai = d.assume(f"task_{i}_aai", uniform(0.3, 0.9))
    tasks.append((cp, aai))

total_effort = d.derive("total_effort", sum(cp * (1 - aai) for cp, aai in tasks))

Example Specs

Three included examples:

| File | Decision |
| --- | --- |
| examples/hire_vs_outsource.py | Hire engineers vs. outsource to an agency: cost and output |
| examples/market_expansion.py | Market entry revenue and profit |
| examples/build_vs_buy.py | Build in-house vs. buy a vendor solution |

Run any of them:

decide run examples/hire_vs_outsource.py
decide diff examples/hire_vs_outsource.py examples/market_expansion.py
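Conceptually, `decide diff` runs both specs under the same seed and compares outcome quantiles. A toy illustration with made-up cost distributions (not the actual diff output):

```python
import numpy as np

def summarize(samples):
    p10, p50, p90 = np.quantile(samples, [0.10, 0.50, 0.90])
    return {"p10": p10, "p50": p50, "p90": p90}

# Same seed for both scenarios, so the comparison is apples to apples
hire      = summarize(np.random.default_rng(42).normal(180_000, 30_000, 20_000))
outsource = summarize(np.random.default_rng(42).normal(220_000, 50_000, 20_000))

delta_p50 = outsource["p50"] - hire["p50"]  # positive: outsourcing costs more at the median
```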

Agent Workflow Example

User: "Should I hire 2 engineers or outsource to an agency?"

Agent:
1. Asks clarifying questions (budget, timeline, velocity expectations)
2. Generates a decision spec with distributions for each assumption
3. Calls lint_spec -- fixes any warnings
4. Shows the assumptions table, asks for adjustments
5. Calls run_simulation (n=20000, seed=42)
6. Calls read_results
7. Reasons about sensitivity ranking
8. Explains: "Hiring has P50 cost of $180k vs $220k outsourcing.
   The key driver is ramp-up time -- it accounts for 64% of variance.
   If ramp-up exceeds 6 weeks, outsourcing wins."

The agent never fabricates numbers. Every figure traces to a run_simulation call.


Agent Tools

The agent interacts with the engine exclusively through these tools:

| Tool | What it does |
| --- | --- |
| generate_spec | Write a Python spec from a goal description |
| lint_spec | Validate and check a spec file |
| run_simulation | Execute the Monte Carlo simulation |
| read_results | Read structured results JSON |
| diff_specs | Compare two scenarios |
| edit_spec | Modify assumptions in a spec |
| explain_results | Narrate results in plain language |
| calibrate | Compare predictions vs. actuals |

Design Principles

  • Code-driven, not config-driven. Specs are Python scripts, not YAML. The computation graph is built by operator overloading — no expression parsing, no eval().
  • Deterministic. Every simulation is seeded. Same seed, same result, always.
  • Agent reasons. Engine computes. The LLM never produces numbers. The boundary is enforced by tools.
  • Falsifiable. Assumptions are explicit and inspectable. Results can be compared against actuals when outcomes are known.
  • Diffable. Specs are .py files. Version them, review them, diff them.
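The determinism claim is mechanically checkable: NumPy's seeded `Generator` yields bit-identical draws for the same seed. A self-contained check (not DECIDE's test suite):

```python
import numpy as np

def draws(seed, n=20_000):
    return np.random.default_rng(seed).uniform(0, 1, n)

same      = np.array_equal(draws(42), draws(42))  # True: same seed, same samples
different = np.array_equal(draws(42), draws(43))  # False: different seeds diverge
```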

Tech Stack

  • Python 3.13+
  • uv for package management
  • NumPy for vectorized simulation
  • SciPy for Spearman correlation (sensitivity)
  • Pydantic for results schema
  • Typer + Rich for CLI
  • Anthropic Claude Agent SDK for agent layer

v1 Roadmap

  • Conditional distributions
  • Multi-phase / time-series simulation
  • Sobol sensitivity indices
  • Quasi-Monte Carlo (LHS) sampling
  • Web UI
  • Template library (estimation, risk, portfolio, hiring)
  • Calibration with historical data
  • PDF report export
  • MCP server (expose DECIDE tools to any MCP client)
  • Agent memory (remember past decisions and their outcomes)
