

Unfault

Unlost: Bridging code authorship and ownership.


Before: You wrote the code, you owned the consequences.

Now: AI writes the code, you still own the consequences.

Agents have changed the writing part. You're still the one accountable for it. You are the one who has to explain it or defend it. Unlost keeps that context close to you.

I built Unlost because something was slipping away from me with coding agents. Unlost tries to keep the agent's authorship tied to my ownership as an engineer.

When do you need to "Get Unlost"?

A colleague asks why it's built this way. You know it works. You're less sure you can explain it. The reasoning was in the chat. unlost trace or unlost brief give you the answer, not from memory, from the record.

Six months later, someone needs to change it. Maybe it's you. The agent doesn't remember. The chat is gone. The commit message says "feat: add retry logic." unlost trace gives you the decision chain. unlost challenge tells you whether the original call still holds before you undo it.

Production is down. You're reading code under pressure that you didn't write. You don't know if the retry logic was intentional or a guess. unlost trace reconstructs the chain. unlost brief tells you what bites.

The PR is the handoff. To your team, and to your future self. The diff shows what changed. It doesn't show what was tried and rejected, what constraint you were navigating, what's still open. unlost pr-comment posts that note from session memory, not the diff. Ownership transfers with the code.


Install

curl -fsSL https://unlost.unfault.dev/install.sh | bash

Or download the binary manually from releases.

Quick Start

1. Hook into your agent

Claude Code (Global, zero per-repo config):

unlost config agent claude --global

OpenCode (Global):

unlost config agent opencode --global

Or per-project: unlost config agent opencode --path .

GitHub Copilot CLI (Per-project):

unlost config agent copilot --path .

This writes .github/hooks/unlost.json and installs a Copilot skill at .github/copilot/skills/unlost/.

2. (Optional) Configure extraction LLM

By default, unlost uses whatever LLM your agent is configured with. You can override this for better results (e.g., using a smaller/faster model for extraction):

# Use Claude
unlost config llm anthropic --model claude-3-5-sonnet-20241022

# Or OpenAI
unlost config llm openai --model gpt-4o-mini

Commands Overview

Understanding what was built

# Staff engineer's debrief on any file or module
unlost brief
unlost brief src/governor.rs

# What happened recently in this file?
unlost recall src/http_proxy.rs

# Reconstruct the decision chain that led to the current state
unlost trace src/governor.rs
unlost trace "why is the connection timeout 30 seconds?"
  • unlost brief: Scans all recorded memory and git commits. Scores by importance, not recency. Answers: what is this, what are the non-obvious choices, where do I start.
  • unlost trace: Builds a chronological causal chain seeded by semantic similarity. Answers: why is the code the way it is, not just what happened recently.
  • unlost recall: Narrates the recent story for a file or concept. Useful for catching up after time away.

Before you commit to a direction

# Argue with a past decision before you reverse it
unlost challenge "was using fastembed the right call?"

# Think through options using what this repo already knows
unlost explore "should we keep lancedb or move to sqlite+fts?"
  • unlost challenge: Surfaces recorded rationale, failure modes, and alternatives. Gives you a verdict grounded in history before you make a call.
  • unlost explore: Forward-looking planning. Labels what comes from memory [memory] vs. external knowledge [outside] so you know what you're actually standing on.

Reflecting on how you worked

# How did you and the agent collaborate? What should change?
unlost reflect
unlost reflect --mode tune
unlost reflect --mode both --since 7d
  • unlost reflect: Reads per-turn evaluation telemetry collected silently during sessions and generates a structured narrative via LLM — no raw transcript required. Three modes: coach (your collaboration habits), tune (agent drift, loops, hallucination), both. Every output opens with a scannable NEXT ACTIONS block. The tune and both modes include a SKILL ASSESSMENT that audits your installed agent skills against observed turn data and suggests behavioural gaps to fill.

Handing it off

# Post a PR comment from session history: intent, tradeoffs, risks
unlost pr-comment 42
unlost pr-comment https://github.com/owner/repo/pull/42
  • unlost pr-comment: Posts a "staff engineer" style note on the PR. Not a diff summary. A note from someone who was in the room: what changed functionally, what we were navigating, what's left open, what to re-read in three months.

The /unlost-walkthrough agent skill does the same thing interactively: step through what changed, in order, with reasons, so you can review with confidence rather than just approve.


How it works

  1. Capture: After each agent exchange → extracts a structured capsule (intent, decision, rationale, symbols) plus a TurnEval — per-turn coaching and agent-tuning scores computed locally with zero LLM calls.
  2. Store: Capsules stay local, embedded with fastembed, indexed in LanceDB. Nothing leaves your machine.
  3. Guide: Before each prompt → checks for friction (loops, drift, misalignment). Injects a correction if something is off.
  4. Recall: Capsules are queryable anytime, by file, symbol, question, or concept.
  5. Reflect: unlost reflect reads the TurnEval timeline and generates a structured coaching/diagnostics narrative — developer habits, agent patterns, skill gaps — without touching raw transcript text.
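The capture step above can be pictured with a minimal capsule shape. This is an illustrative sketch only: the field names and types here are assumptions, not unlost's actual schema.

```rust
// Illustrative capsule shape; unlost's real struct may differ.
#[derive(Debug)]
struct IntentCapsule {
    intent: String,           // what the exchange was trying to do
    decision: String,         // what was decided
    rationale: String,        // why, in the participants' own words
    symbols: Vec<String>,     // code symbols touched in the exchange
    embedding: Vec<f32>,      // 384-dim BGE-small vector in the real store
    git_head: Option<String>, // SHA provenance at capture time
}

fn example_capsule() -> IntentCapsule {
    IntentCapsule {
        intent: "add retry logic to the HTTP proxy".into(),
        decision: "exponential backoff, capped".into(),
        rationale: "upstream flaps under load; linear retry amplified it".into(),
        symbols: vec!["http_proxy::retry".into()],
        embedding: vec![0.0; 384],
        git_head: None,
    }
}
```

Everything downstream (trace, brief, recall) is a query over a store of these records.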

Privacy First

Everything unlost stores stays on your machine:

  • Capsules — Stored locally in ~/.local/share/unlost/workspaces/
  • Embeddings — Generated locally with fastembed
  • Query history — Never leaves your disk

The only network call unlost makes is to the LLM provider you configure for extraction. That LLM sees only the exchange text (no tool outputs), and it produces a capsule that never goes back upstream.

Under the Hood (Technical Details)

Trajectory Sensing

  • Three-state FSM — Stable → Watch → Intervene controller with hysteresis, per-basin cooldowns, and a one-shot rule preventing repeat intervention types
  • Weighted multi-channel basin scoring — Loop, Spec, and Drift intensities computed as calibrated weighted sums of 10 independent symptom channels
  • EMA smoothing — All 10 channels smoothed with exponential moving average (α=0.3) to suppress single-turn noise spikes
  • Sliding window persistence — State only escalates after 3 consecutive turns above the 0.75 intensity threshold
  • Coffee Pause soft decay — >30-minute gaps decay intensity to 0.3× and reset state; injects a resumption brief on return
  • Grounding stall detection — User-mentioned file paths tracked with exponential time decay; stall streak increments when the agent ignores them
  • Jaccard-like logic churn — Word-set distance between consecutive agent decisions; detects rapid plan changes without progress
  • Symbol repetition / novelty collapse — Fraction of current capsule symbols seen in the last 8 capsules; complement is novelty score
  • Stubbornness boost — Extra intensity when alignment debt is high but decision churn is low (agent acknowledges errors but keeps the same plan)
  • Blind Acceptance risk — Detects fluent long responses followed by passive short user replies; flags over-trust risk
  • Summary intent damping — Multiplies intensity by 0.6 on turns the agent is legitimately consolidating, preventing false positives
  • Stratified intervention policy — Ambient hint / Structural note / Emergency hard-stop tiered by intensity level
  • Hydration packet — For Loop interventions, injects the 3 most relevant recent capsules scored by recency, symbol overlap, emotion, effort, and failure mode
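The FSM, EMA smoothing, persistence window, and logic-churn bullets above can be sketched together. The α=0.3, 0.75 threshold, and 3-turn persistence values are from the text; the lower relax threshold and the exact escalation bookkeeping are assumptions for illustration, not unlost's actual controller.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, Debug, PartialEq)]
enum TrajectoryState { Stable, Watch, Intervene }

const ALPHA: f32 = 0.3;        // EMA smoothing factor (from the text)
const ESCALATE_AT: f32 = 0.75; // intensity threshold (from the text)
const PERSISTENCE: u32 = 3;    // consecutive turns required (from the text)
const RELAX_AT: f32 = 0.4;     // hypothetical lower bound for hysteresis

struct Governor {
    state: TrajectoryState,
    intensity: f32, // EMA-smoothed basin intensity
    streak: u32,    // consecutive turns above ESCALATE_AT
}

impl Governor {
    fn new() -> Self {
        Governor { state: TrajectoryState::Stable, intensity: 0.0, streak: 0 }
    }

    fn observe(&mut self, raw: f32) -> TrajectoryState {
        // EMA smoothing suppresses single-turn noise spikes
        self.intensity = ALPHA * raw + (1.0 - ALPHA) * self.intensity;
        if self.intensity > ESCALATE_AT {
            self.streak += 1;
            if self.streak >= PERSISTENCE {
                self.state = match self.state {
                    TrajectoryState::Stable => TrajectoryState::Watch,
                    _ => TrajectoryState::Intervene,
                };
                self.streak = 0; // require persistence again for the next step
            }
        } else {
            self.streak = 0;
            // Hysteresis: relax only once intensity drops well below the
            // escalation threshold, so the state doesn't flap at the boundary.
            if self.intensity < RELAX_AT {
                self.state = match self.state {
                    TrajectoryState::Intervene => TrajectoryState::Watch,
                    _ => TrajectoryState::Stable,
                };
            }
        }
        self.state
    }
}

// Jaccard-like word-set distance between consecutive agent decisions:
// 1.0 means the plan changed completely, 0.0 means it did not move.
fn logic_churn(prev: &str, curr: &str) -> f32 {
    let a: HashSet<&str> = prev.split_whitespace().collect();
    let b: HashSet<&str> = curr.split_whitespace().collect();
    let union = a.union(&b).count();
    if union == 0 { return 0.0; }
    let inter = a.intersection(&b).count();
    1.0 - inter as f32 / union as f32
}
```

With these constants, a run of maximum-intensity turns needs six observations before the smoothed intensity clears 0.75 for three consecutive turns, which is the point of EMA plus persistence: one bad turn never escalates on its own.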

Emotion & NLP

  • Multi-label emotion classification — RoBERTa-base fine-tuned on GoEmotions (28 labels → 8 buckets), running locally via ONNX Runtime
  • Heuristic emotion override — Pattern-based frustration and doubt detection that corrects misclassifications from the neural model
  • Affective modulation — Joy halves trajectory intensity; persistent anger triggers a de-escalation override regardless of basin state
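The modulation rules above reduce to a small function. The 0.5 factor for joy is from the text; the anger-streak threshold of 3 and the override value of 0.0 are assumptions for illustration.

```rust
#[derive(PartialEq)]
enum EmotionBucket { Joy, Anger, Other }

// Sketch of affective modulation: joy halves trajectory intensity (from the
// text); persistent anger forces de-escalation regardless of basin state.
// The streak threshold and override value are hypothetical.
fn modulate_intensity(intensity: f32, emotion: &EmotionBucket, anger_streak: u32) -> f32 {
    match emotion {
        EmotionBucket::Joy => intensity * 0.5,
        EmotionBucket::Anger if anger_streak >= 3 => 0.0,
        _ => intensity,
    }
}
```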

Retrieval & Memory

  • HyPE — At indexing time, the LLM generates 2–3 questions each capsule answers; at retrieval time, each command frames your query to match those questions — question-to-question match, not keyword-to-document. (Ma et al., 2025)
  • Trajectory-encoded embeddings — Each capsule is embedded with its category, failure mode, symbols, and the prior decision from the same work thread; causally related capsules cluster together across sessions
  • BGE-small-en-v1.5 dense embeddings — 384-dimensional vectors, generated fully locally via fastembed + ONNX Runtime
  • ANN vector search — LanceDB nearest_to with an auto-tuned approximate nearest-neighbour index on the embedding column
  • LabelList index — Scalar index on the symbols array column enabling fast array_contains fan-out queries
  • Causal chain algorithm — ANN seed → symbol fan-out via LabelList index → similarity threshold pruning → chronological sort; powers trace
  • Cross-session recurrence scoring — Capsules scored for brief by failure mode, explicit rationale/decision, and symbols recurring across multiple sessions (no recency bias)
  • Recency-weighted fingerprint deduplication — recall collapses near-duplicates by content fingerprint and caps older sessions at 3 results, with a 30-minute recency bypass
  • Checkpoint summarization — Background process compresses windows of capsules into narrative checkpoints; recall and brief use a fast path when the delta since last checkpoint is small
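The causal chain algorithm above (ANN seed → symbol fan-out → threshold pruning → chronological sort) can be sketched over an in-memory store. Brute-force cosine search stands in for LanceDB's ANN index and a symbol filter stands in for the LabelList fan-out; the threshold value and capsule shape are assumptions.

```rust
#[derive(Clone)]
struct Capsule {
    ts: u64,              // capture timestamp
    symbols: Vec<String>, // code symbols touched
    embedding: Vec<f32>,  // dense vector (384-dim in the real store)
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn trace_chain(query: &[f32], store: &[Capsule], threshold: f32) -> Vec<Capsule> {
    // 1. Seed: nearest capsule to the query (brute-force stand-in for ANN)
    let seed = match store.iter().max_by(|a, b| {
        cosine(query, &a.embedding)
            .partial_cmp(&cosine(query, &b.embedding))
            .unwrap()
    }) {
        Some(s) => s,
        None => return Vec::new(),
    };
    // 2. Fan-out: capsules sharing a symbol with the seed (LabelList stand-in)
    // 3. Prune: drop anything below the similarity threshold
    let mut chain: Vec<Capsule> = store
        .iter()
        .filter(|c| c.symbols.iter().any(|s| seed.symbols.contains(s)))
        .filter(|c| cosine(query, &c.embedding) >= threshold)
        .cloned()
        .collect();
    // 4. Chronological sort turns a similarity cluster into a decision chain
    chain.sort_by_key(|c| c.ts);
    chain
}
```

The chronological sort is the distinctive step: the output reads as "this decision, then this one", not as a relevance-ranked list.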

TurnEval: Per-Turn Evaluation

  • Zero-LLM coaching scores — Each capsule carries clarity, context_freshness, verification_rigor, decision_progress, scope_discipline, and cost_acceleration — all heuristic, computed at flush time from capsule content and usage metadata
  • Agent-tuning channels — Persisted governor SymptomChannels (repetition, novelty_collapse, semantic_stall, alignment_debt, path_hallucination, logic_churn, fluency, …) previously discarded after friction decisions; now stored per capsule
  • Behavioral flags — Derived thresholds: retry_loop, session_heavy, session_too_long, unverified_claim, scope_shift, blind_acceptance, cost_spike, etc.
  • Outcome backfill — At checkpoint time, outcome_hint (progressed/stalled/regressed/unclear) is retroactively set via deterministic lookahead heuristics and written back via LanceDB UPDATE
  • Reflect-time LLM — unlost reflect feeds the per-turn timeline + session aggregates to the LLM; the LLM narrates from structured telemetry only, no raw transcript crosses the wire
  • Skill gap guidance — Observed flag patterns are matched to a behavioural gap catalogue; the reflect output includes a "Look for skills that…" list grounded in actual session data
  • Reindex backfill — unlost reindex automatically populates TurnEval for all capsule history; post-v0.13 capsules restore full data from JSONL, pre-v0.13 get coach dimensions computed from content
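The behavioral-flags bullet above is the kind of zero-LLM derivation that runs at flush time. The flag names come from the text; the signal fields and every threshold value here are illustrative assumptions, not unlost's actual calibration.

```rust
// Hypothetical per-turn signals; real channel names and units may differ.
struct TurnSignals {
    repetition: f32,       // symbol repetition channel, 0..1
    tokens: u32,           // cost of this turn
    prev_tokens: u32,      // cost of the previous turn
    verified: bool,        // did the agent check before claiming success?
    agent_reply_len: usize,
    user_reply_len: usize,
}

#[derive(Debug, PartialEq)]
struct BehavioralFlags {
    retry_loop: bool,
    cost_spike: bool,
    unverified_claim: bool,
    blind_acceptance: bool,
}

// All thresholds below are made up for illustration.
fn derive_flags(s: &TurnSignals) -> BehavioralFlags {
    BehavioralFlags {
        retry_loop: s.repetition > 0.8,
        cost_spike: s.prev_tokens > 0 && s.tokens > 3 * s.prev_tokens,
        unverified_claim: !s.verified,
        // fluent long answer met by a passive short reply
        blind_acceptance: s.agent_reply_len > 2000 && s.user_reply_len < 20,
    }
}
```

Because every flag is a pure function of persisted signals, the whole timeline can be recomputed during reindex without an LLM call.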

Storage & Infrastructure

  • Apache Arrow / LanceDB columnar store — Capsules stored as Arrow RecordBatches with three indexes (ANN, LabelList, scalar timestamp); append-only with schema evolution
  • Code graph analysis — unfault-core + petgraph builds a live static graph for centrality scoring, dependency/impact traversal, and symbol validation backing Drift detection
  • LLM structured extraction — JSON Schema extraction via rig-core + schemars; produces typed IntentCapsule structs from raw agent exchanges
  • Hybrid extraction mode — Heuristics identify "pivotal" turns before invoking the LLM, reducing extraction cost by skipping routine turns
  • SHA-256 job deduplication — Flush jobs hashed by content; identical jobs within a 45-second window are suppressed
  • Git grounding & SHA provenance — Git HEAD and commit SHAs stored on every capsule; git commits ingested as first-class capsules, deduplicated by hash
  • Changelog ingestion — CHANGELOG.md versions parsed and stored as versioned capsules, surfaced with ref=version:vX.Y.Z citations in LLM prompts
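The job-deduplication bullet above reduces to a content hash plus a time window. The 45-second window is from the text; this sketch uses the standard library's DefaultHasher as a stand-in for SHA-256 purely to stay dependency-free, and the API shape is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

// Sketch of windowed job deduplication. The real implementation hashes job
// content with SHA-256; DefaultHasher is a stand-in for illustration.
struct Deduper {
    seen: HashMap<u64, Instant>,
    window: Duration,
}

impl Deduper {
    fn new(window: Duration) -> Self {
        Deduper { seen: HashMap::new(), window }
    }

    /// Returns true if the job should run, false if an identical job
    /// already ran inside the suppression window.
    fn should_run(&mut self, job: &str, now: Instant) -> bool {
        let mut h = DefaultHasher::new();
        job.hash(&mut h);
        let key = h.finish();
        match self.seen.get(&key) {
            Some(&t) if now.duration_since(t) < self.window => false,
            _ => {
                self.seen.insert(key, now);
                true
            }
        }
    }
}
```

Hashing by content rather than by job ID is what lets two independently scheduled flushes of the same exchange collapse into one.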

Dev

cargo test
cargo build

License

MIT. See LICENSE for details.

Docs

  • agents/README.md - Agent integrations
