AI gets code authorship.
Context ownership stays with you.

Agents write the code. You carry the consequences. Unlost keeps the context that used to live in your head - and now lives in the agent - close to you.

curl -fsSL https://unlost.unfault.dev/install.sh | bash
For Windows, please download the binary from our releases page.
Privacy First

Capsules and indexes stay on your machine.

Open-Source

Built in the open under an MIT license.

Non-Blocking

Journaling returns immediately; heavy work runs async.

Any agent. One shared memory.

Claude Code
OpenCode
GitHub Copilot

When it matters

It used to be that writing the code meant understanding it. That's changed. These are the moments where the gap shows.

A colleague asks why

unlost brief / unlost trace

You know it works. You're less sure you can explain it. The reasoning was in the chat, which is gone.

Six months later

unlost trace / unlost challenge

Someone needs to change it. Maybe it's you. The agent doesn't remember. The commit message says "feat: add retry logic."

Production is down

unlost trace / unlost brief

You're reading code you didn't write, under pressure. You don't know if the retry logic was intentional or a guess.

The PR is the handoff

unlost pr-comment

To your team, and to your future self. The diff shows what changed. It doesn't show what was tried and rejected, what constraint you were navigating, what's still open.

How It Works

01

Record

Unlost sits alongside your agent process. It captures the raw stream of thought - intent, reasoning, and decisions - without blocking your flow.

# Silent observation
Recording session ses_3a79... [active]
02

Extract & Structure

We don't just log text. Unlost distills messy conversations into Capsules: atomic units of memory containing the Why, the What, and the How.

# Extracted Capsule
{intent: "Refactor auth", decision: "Use PASETO tokens", rationale: "..."}
03

Ground

Every capsule is verified against the live code graph. We link decisions to specific files and symbols, creating a navigable map of your project's evolution.

# Graph Links
Linked: src/auth.rs (80% relevance) | Symbol: verify_token

One Memory. Many Lenses.

Once context is grounded, you can query it from any angle.

Look Back

Recover the context you lost. Whether it's a high-level staff engineer debrief or a specific decision trail, you get the why behind the code.

trace brief

Challenge Present

Don't just approve; verify. Pressure-test decisions against the graph and monitor the agent's struggle in real-time.

challenge metrics

Explore Future

Draft with memory. Use your established constraints and history to weigh new trade-offs before writing a single line of code.

explore

Long Memory & the Story Arc

Capsules are not just a log. They encode a causal history - the sequence of decisions, constraints, and failures that explain why the code looks the way it does today.

The problem with standard RAG

Most retrieval systems return a ranked bag of hits. Ask "why is the timeout 30s?" and you get the five most similar capsules - but no sense of how you arrived there. Causality is lost. Sessions are mixed. Old decisions look the same as new ones.

Engineers don't think in bags of results. They think in chains: "because we switched to HTTP/2, we needed longer timeouts, which surfaced a bug in the keepalive logic, which we worked around by…" That's a causal chain, and it's what unlost trace reconstructs.

How the causal chain is built

1
Richer embeddings

Every capsule is embedded with its category, failure mode, top symbols, and the prior decision from the same work thread. This encodes trajectory into the vector; capsules from the same causal chain cluster together in embedding space, even across different agent sessions.
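As a rough sketch of what "encoding trajectory into the vector" can look like, here is the kind of text one might compose before handing it to an embedder. The dict shape and field names are illustrative, not Unlost's actual schema:

```python
# Illustrative sketch: fold category, failure mode, symbols, and the
# prior decision into one string, so the work-thread trajectory ends
# up encoded in the resulting embedding.

def embedding_text(capsule, prior_decision=None):
    parts = [
        f"category: {capsule['category']}",
        f"decision: {capsule['decision']}",
    ]
    if capsule.get("failure_mode"):
        parts.append(f"failure: {capsule['failure_mode']}")
    if capsule.get("symbols"):
        parts.append("symbols: " + ", ".join(capsule["symbols"][:5]))
    if prior_decision:
        # Chaining in the prior decision is what makes capsules from the
        # same causal chain cluster together, even across agent sessions.
        parts.append(f"prior: {prior_decision}")
    return " | ".join(parts)

capsule = {
    "category": "networking",
    "decision": "increase upstream timeout to 30s",
    "failure_mode": "retry spiral on keepalive",
    "symbols": ["proxy_request", "verify_token"],
}
text = embedding_text(capsule, prior_decision="switched to HTTP/2 for upstream")
```

Two capsules that share a prior decision now share a chunk of embedded text, which pulls their vectors together regardless of which session produced them.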

2
HyPE: question-first retrieval

When a capsule is extracted, the LLM also generates 2–3 questions the capsule answers: "Why is the connection timeout 30 seconds?", "How does the proxy route upstream requests?". At retrieval time, each command frames your input as a question matching its intent before embedding - recall asks "What happened with X?", challenge asks "Was the decision about X the right call?", explore asks "What are the alternatives for X?", and so on. This turns retrieval into a question-to-question match against the stored questions, dramatically improving precision with zero extra LLM cost.

Based on: Ma et al., "HyPE: Hypothetical Prompt Embeddings for Improved Retrieval" (2025). papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335
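The mechanics can be shown with a toy example. A bag-of-words cosine stands in for the real embedding model, and the capsule names are made up; the point is only that the query is matched against stored questions, not raw capsule text:

```python
# Toy question-to-question matching (HyPE-style). A bag-of-words
# cosine replaces the real embedding model for illustration.
import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# At indexing time, each capsule stores the questions it answers.
index = {
    "capsule_timeout": embed("why is the connection timeout 30 seconds"),
    "capsule_routing": embed("how does the proxy route upstream requests"),
}

# At retrieval time, the command frames the input as a question first.
query = embed("why is the timeout 30s")
best = max(index, key=lambda k: cosine(query, index[k]))  # -> "capsule_timeout"
```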

3
Seed → fan-out → threshold

A vector ANN search finds the closest seed capsules. For each seed, unlost fans out to all capsules that share symbols, traversing the existing LabelList index. Capsules below the similarity threshold are dropped to prevent the chain from sprawling. The survivors are sorted chronologically: oldest first, newest last.
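A minimal sketch of that walk, assuming capsules are dicts with a timestamp, a symbol list, and a precomputed similarity to the query (all names and the threshold value are illustrative):

```python
# Sketch of seed -> fan-out -> threshold -> chronological sort.

def build_chain(seeds, all_capsules, threshold=0.35):
    seed_symbols = {s for c in seeds for s in c["symbols"]}
    # Fan out: any capsule sharing a symbol with a seed is a candidate.
    candidates = [c for c in all_capsules if seed_symbols & set(c["symbols"])]
    # Threshold: drop weakly related capsules so the chain doesn't sprawl.
    survivors = [c for c in candidates if c["similarity"] >= threshold]
    # Chronological order: oldest first, newest last.
    return sorted(survivors, key=lambda c: c["ts"])

capsules = [
    {"ts": 3, "symbols": ["proxy_request"], "similarity": 0.9},
    {"ts": 1, "symbols": ["proxy_request"], "similarity": 0.7},
    {"ts": 2, "symbols": ["unrelated_fn"], "similarity": 0.8},
    {"ts": 4, "symbols": ["proxy_request"], "similarity": 0.1},
]
chain = build_chain([capsules[0]], capsules)
# -> the ts=1 and ts=3 capsules, in that order
```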

Sessions don't slice neatly

In practice, engineers reuse the same agent session across multiple pieces of work. Session IDs are a poor proxy for "same work thread." Unlost sidesteps this by using semantic continuity instead: the prior-decision embedding means capsules that are conceptually related cluster together regardless of session boundaries. The chain reflects intent, not session structure.

# same chain, different sessions, months apart
2026-01-14 [ses_a] switched to HTTP/2 for upstream
2026-01-21 [ses_a] retry spiral on keepalive, increased timeout
2026-02-03 [ses_b] timeout 30s hardcoded in proxy_request
2026-02-18 [ses_b] ← you are here

Install

Get Unlost

curl -fsSL https://unlost.unfault.dev/install.sh | bash
For Windows, please download the binary from our releases page.

Agent Integration

Hook unlost into your agent. This installs the Unlost agent skill and configures the necessary hooks.

# Claude Code - one command, works everywhere
unlost config agent claudecode --global

# OpenCode - opt-in per repository
cd your-project
unlost config agent opencode --path .

# GitHub Copilot CLI - opt-in per repository
cd your-project
unlost config agent copilot --path .

Configure Your LLM

Unlost requires an LLM for capsule extraction and summarization.

unlost config llm anthropic --model claude-sonnet-4-5-20250929
unlost config llm openai --model gpt-4o-mini
Recommendation: Use a fast, cheap model like GPT-4o-mini or Claude 3.5 Haiku.

That's it. Next time you start your agent, unlost will automatically work alongside it. Happy coding!

Commands

unlost brief

Memory

A staff engineer's debrief on any codebase: what matters, what bites, where to start. Scans all recorded memory (conversations + git commits) and scores by importance, not recency.

unlost brief
unlost brief src/governor.rs
unlost brief TrajectoryController

unlost recall

Memory

Recall the story so far: a proactive overview of a specific file or directory.

unlost recall src/http_proxy.rs
unlost recall src/

unlost trace

Memory

Reconstruct the causal chain of decisions that led to the current state of a file, symbol, or concept.

unlost trace src/governor.rs
unlost trace "why is the timeout 30s?"

unlost challenge

Memory

Pressure-test a past decision or technology choice using workspace memory and the live code graph.

unlost challenge "lancedb"

unlost explore

Memory

Explore future paths grounded in your workspace memory.

unlost explore "should we keep lancedb or move to sqlite+fts?"

unlost metrics

Monitor

Show workspace trajectory metrics and friction hotspots.

unlost metrics

unlost replay

Monitor

Replay and backfill agent transcripts.

unlost replay opencode --git-grounding --no-llm

unlost inspect

Monitor

Inspect stored capsules for this workspace.

unlost inspect

unlost reindex

Manage

Rebuild LanceDB index from capsules.jsonl.

unlost reindex

unlost clear

Manage

Delete all generated data for the current workspace.

unlost clear

unlost where

Manage

Show where the workspace's files are stored.

unlost where

unlost config

Manage

Manage configuration (LLM provider, agent integrations, etc.).

unlost config

The Cognitive Mirror

The unlost metrics command generates a Cognitive Mirror: a diagnostic report that reveals the structural health of your collaboration with AI agents.

Emotional Dynamics

We analyze conversation patterns to detect emotional signals that indicate when interactions may be deteriorating:

  • Valence: Positive vs negative sentiment (-1.0 to +1.0)
  • Intensity: Strength of the emotional response (0.0 to 1.0)

The purpose is early detection of friction before it escalates. When emotion signals cross thresholds, Unlost can intervene with guidance.
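A minimal sketch of a threshold check on the two signals above; the cutoff values are illustrative, not Unlost's tuned defaults:

```python
# Flag a turn when sentiment is both strongly negative and intense.
# v_floor / i_floor are hypothetical thresholds for illustration.

def friction_signal(valence, intensity, v_floor=-0.4, i_floor=0.6):
    return valence <= v_floor and intensity >= i_floor

friction_signal(-0.7, 0.8)  # negative AND intense -> intervene
friction_signal(-0.7, 0.2)  # negative but mild   -> stay silent
```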

The Three Basins of Friction

Unlost models interaction dynamics across three distinct "Basins of Friction" to provide proactive regulation:

Loop Basin (The Stall)

Detects repetitive failures, symbol stalls, and logic churn. Triggered by high EMA-smoothed repetition and low novelty.
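An EMA-smoothed repetition signal can be sketched as follows; the alpha and trigger threshold are illustrative, not the controller's actual tuning:

```python
# Exponential moving average over a per-turn repetition signal:
# recent turns weigh more, but one repeat alone can't trip the basin.

def ema_update(prev, sample, alpha=0.3):
    return alpha * sample + (1 - alpha) * prev

ema = 0.0
# 1.0 = the turn repeated earlier output, 0.0 = novel output.
for repeated in [0, 1, 1, 1, 1]:
    ema = ema_update(ema, repeated)

stalling = ema > 0.6  # sustained repetition + low novelty -> Loop Basin
```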

Spec Basin (Misunderstanding)

Detects alignment debt (user corrections) and instruction staticness (verbatim repeats).

Drift Basin (Grounding Failure)

Detects grounding stalls (ignoring user file mentions) and symbol hallucinations. Validated against a live codebase graph via unfault-core.

Metrics at a Glance

friction rate

Measured in warnings per 1M tokens. This is your primary "Babysitting Tax" indicator. A rate under 5 is healthy exploration; over 10 indicates a session that is likely stalling or drifting.
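The normalization itself is simple; a sketch, with the 5/10 thresholds taken from the description above:

```python
# Friction rate: proactive warnings per 1M tokens processed.

def friction_rate(warnings, tokens):
    return warnings * 1_000_000 / tokens

rate = friction_rate(warnings=12, tokens=3_000_000)  # 4.0 -> healthy
```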

avg interval

The average number of tokens processed between proactive interventions. Short intervals suggest you are fighting the agent's mental model turn-by-turn.

Friction vs Context Size

Unlost buckets friction by input token count to identify your agent's Context Inflection Point.

"For most current models, we observe a stability collapse between 8k and 12k tokens. The friction rate typically doubles past this point as the 'lost in the middle' phenomenon degrades grounding."

Typical Inflection Pattern
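Bucketing can be sketched like this; the bucket edges are illustrative, chosen only to straddle the 8k-12k band mentioned above:

```python
# Bucket friction events by input-token count to locate the
# context inflection point. Edges are hypothetical examples.
from bisect import bisect_right

EDGES = [4_000, 8_000, 12_000, 16_000]  # bucket upper bounds in tokens

def bucket(input_tokens):
    return bisect_right(EDGES, input_tokens)

counts = [0] * (len(EDGES) + 1)
for tokens, had_friction in [(3_000, False), (9_000, True), (13_000, True)]:
    if had_friction:
        counts[bucket(tokens)] += 1
# counts now holds friction events per context-size band
```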

Under the Hood

Every sensor, retrieval strategy, and storage primitive that makes Unlost work.

Recording: The Silent Observer

Unlost is designed to be invisible until you need it. The recording architecture prioritizes your flow above all else:

Process Isolation

The Unlost daemon runs as a separate sidecar process. Your agent (Claude, OpenCode) talks to it via a lightweight shim that fires and forgets. If Unlost crashes or hangs, your agent keeps working.

Async Processing

Heavy lifting (LLM extraction, embedding generation, graph analysis) happens asynchronously in the background. The shim returns control to the agent immediately.

Content-Addressed Deduplication

Flush jobs are hashed by content. If an agent loops or retries the same output, Unlost silently suppresses the duplicates (within a 45s window) to keep your history clean.
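A minimal sketch of content-addressed dedup with a 45-second window; the hash choice and in-memory dict are illustrative, not Unlost's actual implementation:

```python
# Suppress flushes whose content hash was accepted within the window.
import hashlib

WINDOW_SECS = 45
_seen = {}  # content hash -> time of last accepted flush

def should_flush(content, now):
    digest = hashlib.sha256(content.encode()).hexdigest()
    last = _seen.get(digest)
    if last is not None and now - last < WINDOW_SECS:
        return False  # same content inside the window: suppress silently
    _seen[digest] = now
    return True
```

A looping agent re-emitting the same output is collapsed to one entry, while genuinely new content always gets through.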

Grounding: The Code Graph

Drift happens when an agent hallucinates symbols that don't exist. Unlost prevents this by maintaining a live graph of your codebase.

unfault-core + petgraph

We use unfault-core to parse your code and build a dependency graph in milliseconds. It handles symbol resolution, identifying which functions call which, and calculating centrality scores.

Symbol Verification

Every time an agent mentions a file or function, we verify it against the graph. If it exists, we link the capsule to that node. If it doesn't, we flag it as potential drift.

Git Provenance

We capture the git HEAD SHA at the start and end of every turn. This allows us to time-travel: we know exactly what the code looked like when a decision was made, even if it has changed since.

Storage & Retrieval

Apache Arrow / LanceDB

Capsules are stored locally as Arrow RecordBatches with three indexes: ANN (vector search), LabelList (tag filtering), and Scalar (time). It's fast, private, and zero-config.

HyPE Embeddings

We use Hypothetical Prompt Embeddings (HyPE). At indexing time, we generate questions the capsule answers. At retrieval time, we frame your query as a question. Matching questions-to-questions is significantly more accurate than matching queries-to-documents.

Research-First Discipline

Unlost isn't just a collection of heuristics. It is built on a research-first discipline where every sensor and basin is validated against real-world "Marathon" datasets.

Precision-First

We only intervene when the trajectory signal is unambiguous. We favor silence over noise to protect your flow.

Temporal Awareness

The controller respects "Coffee Pauses." It decays state across breaks to avoid misattributing human pauses to agent stalls.

Academic Alignment

Our basins are aligned with EASE '25 research on emotional strain and with a Scientific Reports (Nature) study on how conversational fluency can mask inaccuracy.

Codebase Grounding

We use unfault-core for sub-second symbol graph validation, ensuring drift detection is backed by factual codebase state.

Scientific References

Cristina Martinez Montes and Ranim Khojah. 2025. Emotional Strain and Frustration in LLM Interactions in Software Engineering. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE '25). DOI: 10.1145/3756681.3756951

Zhu, Y., Wu, Y., & Miller, J. 2024. Conversational presentation mode increases credibility judgements during information search with ChatGPT. Scientific Reports (Nature). DOI: 10.1038/s41598-024-67829-6

Ma, L., et al. 2025. HyPE: Hypothetical Prompt Embeddings for Improved Retrieval. SSRN preprint. papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335 Demonstrates that pre-generating questions at indexing time and matching query-to-question rather than query-to-document significantly improves retrieval precision.

Who's behind this?

I built Unlost because the intimacy I had with code started slipping once agents were doing the writing. I still wanted to feel close to what was being built - to understand the tradeoffs, to own the decisions, not just approve the diff. Unlost is my attempt to keep that feeling alive in a world where I'm not the one typing every line.

If something feels rough, open an issue. I read them. github.com/unfault/unlost/issues

- Sylvain