Agents write the code. You carry the consequences. Unlost keeps the context that used to live in your head - and now lives in the agent - close to you.
curl -fsSL https://unlost.unfault.dev/install.sh | bash
Capsules and indexes stay on your machine.
Built in the open under an MIT license.
Journaling returns immediately; heavy work runs async.
Any agent. One shared memory.
It used to be that writing the code meant understanding it. That's changed. These are the moments where the gap shows.
unlost brief / unlost trace
You know it works. You're less sure you can explain it. The reasoning was in the chat, which is gone.
unlost trace / unlost challenge
Someone needs to change it. Maybe it's you. The agent doesn't remember. The commit message says "feat: add retry logic."
unlost trace / unlost brief
You're reading code under pressure that you didn't write. You don't know if the retry logic was intentional or a guess.
unlost pr-comment
To your team, and to your future self. The diff shows what changed. It doesn't show what was tried and rejected, what constraint you were navigating, what's still open.
Unlost sits alongside your agent process. It captures the raw stream of thought—intent, reasoning, and decisions—without blocking your flow.
We don't just log text. Unlost distills messy conversations into Capsules: atomic units of memory containing the Why, the What, and the How.
Every capsule is verified against the live code graph. We link decisions to specific files and symbols, creating a navigable map of your project's evolution.
Once context is grounded, you can query it from any angle.
Recover the context you lost. Whether it's a high-level staff engineer debrief or a specific decision trail, you get the why behind the code.
Don't just approve; verify. Pressure-test decisions against the graph and monitor the agent's struggle in real time.
Draft with memory. Use your established constraints and history to weigh new trade-offs before writing a single line of code.
Capsules are not just a log. They encode a causal history - the sequence of decisions, constraints, and failures that explain why the code looks the way it does today.
Most retrieval systems return a ranked bag of hits. Ask "why is the timeout 30s?" and you get the five most similar capsules - but no sense of how you arrived there. Causality is lost. Sessions are mixed. Old decisions look the same as new ones.
Engineers don't think in bags of results. They think in chains: "because we switched to HTTP/2, we needed longer timeouts, which surfaced a bug in the keepalive logic, which we worked around by…" That's a causal chain, and it's what unlost trace reconstructs.
Every capsule is embedded with its category, failure mode, top symbols, and the prior decision from the same work thread. This encodes trajectory into the vector; capsules from the same causal chain cluster together in embedding space, even across different agent sessions.
When a capsule is extracted, the LLM also generates 2–3 questions the capsule answers: "Why is the connection timeout 30 seconds?", "How does the proxy route upstream requests?". At retrieval time, each command frames your input as a question matching its intent before embedding - recall asks "What happened with X?", challenge asks "Was the decision about X the right call?", explore asks "What are the alternatives for X?", and so on. This turns retrieval into a question-to-question match against the stored questions, dramatically improving precision with zero extra LLM cost.
Based on: Ma et al., "HyPE: Hypothetical Prompt Embeddings for Improved Retrieval" (2025). papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335
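A toy version of the question-to-question match might look like the following. The `embed` function is a bag-of-words stand-in for a real embedding model, and the command templates mirror the framings described above but are illustrative:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag of lowercase words.
    return Counter(text.lower().replace("?", "").replace(",", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Each command frames the raw input as a question matching its intent.
TEMPLATES = {
    "recall": "What happened with {q}?",
    "challenge": "Was the decision about {q} the right call?",
    "explore": "What are the alternatives for {q}?",
}

def retrieve(command: str, user_input: str, stored_questions: list[str]) -> str:
    """Return the stored capsule question closest to the framed query."""
    framed = TEMPLATES[command].format(q=user_input)
    qv = embed(framed)
    return max(stored_questions, key=lambda s: cosine(qv, embed(s)))
```

The stored questions were generated once at indexing time, so the query-side framing adds no extra LLM calls.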
A vector ANN search finds the closest seed capsules. For each seed, unlost fans out to all capsules that share symbols, traversing the existing LabelList index. Capsules below the similarity threshold are dropped to prevent the chain from sprawling. The survivors are sorted chronologically: oldest first, newest last.
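The seed-and-fan-out pass can be sketched in miniature. Capsule dicts, similarity scores, and the threshold value here are toy data; the real implementation traverses LanceDB's ANN and LabelList indexes:

```python
def trace(seeds: list[dict], capsules: list[dict],
          min_similarity: float = 0.35) -> list[dict]:
    """Reconstruct a causal chain from ANN seeds plus symbol fan-out."""
    chain = {}
    for seed in seeds:
        chain[seed["id"]] = seed
        for c in capsules:
            # Fan out to every capsule sharing a symbol with the seed.
            if set(c["symbols"]) & set(seed["symbols"]):
                # Drop weak matches so the chain doesn't sprawl.
                if c["similarity"] >= min_similarity:
                    chain[c["id"]] = c
    # Oldest first, so the chain reads as a narrative.
    return sorted(chain.values(), key=lambda c: c["ts"])
```

Sorting by timestamp rather than by score is what turns a bag of hits into a chain: the output reads "first this, then that", not "most similar first".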
In practice, engineers reuse the same agent session across multiple pieces of work. Session IDs are a poor proxy for "same work thread." Unlost sidesteps this by using semantic continuity instead: the prior-decision embedding means capsules that are conceptually related cluster together regardless of session boundaries. The chain reflects intent, not session structure.
curl -fsSL https://unlost.unfault.dev/install.sh | bash
For Windows, please download the binary from our releases page.
Hook unlost into your agent. This installs the Unlost agent skill and configures the necessary hooks.
# Claude Code - one command, works everywhere
unlost config agent claudecode --global
# OpenCode - opt-in per repository
cd your-project
unlost config agent opencode --path .
# GitHub Copilot CLI - opt-in per repository
cd your-project
unlost config agent copilot --path .
Unlost requires an LLM for capsule extraction and summarization.
unlost config llm anthropic --model claude-sonnet-4-5-20250929
unlost config llm openai --model gpt-4o-mini
Recommendation: Use a fast, cheap model like GPT-4o-mini or Claude 3.5 Haiku.
That's it. Next time you start your agent, unlost will automatically work alongside it. Happy coding!
A staff engineer's debrief on any codebase: what matters, what bites, where to start. Scans all recorded memory (conversations + git commits) and scores by importance, not recency.
unlost brief
unlost brief src/governor.rs
unlost brief TrajectoryController
Recall the story so far: a proactive overview of a specific file or directory.
unlost recall src/http_proxy.rs
unlost recall src/
Reconstruct the causal chain of decisions that led to the current state of a file, symbol, or concept.
unlost trace src/governor.rs
unlost trace "why is the timeout 30s?"
Pressure-test a past decision or technology choice using workspace memory and the live code graph.
unlost challenge "lancedb"
Explore future paths grounded in your workspace memory.
unlost explore "should we keep lancedb or move to sqlite+fts?"
Show workspace trajectory metrics and friction hotspots.
unlost metrics
Replay and backfill agent transcripts.
unlost replay opencode --git-grounding --no-llm
Inspect stored capsules for this workspace.
unlost inspect
Rebuild LanceDB index from capsules.jsonl.
unlost reindex
Delete all generated data for the current workspace.
unlost clear
Show where the workspace's files are stored.
unlost where
Manage configuration (LLM provider, agent integrations, etc.).
unlost config
The unlost metrics command generates a Cognitive Mirror: a diagnostic report that reveals the structural health of your collaboration with AI agents.
We analyze conversation patterns to detect emotional signals that indicate when interactions may be deteriorating:
The purpose is early detection of friction before it escalates. When emotion signals cross thresholds, Unlost can intervene with guidance.
Unlost models interaction dynamics across three distinct "Basins of Friction" to provide proactive regulation:
Detects repetitive failures, symbol stalls, and logic churn. Triggered by high EMA-smoothed repetition and low novelty.
Detects alignment debt (user corrections) and instruction staticness (verbatim repeats).
Detects grounding stalls (ignoring user file mentions) and symbol hallucinations. Validated against a live codebase graph via unfault-core.
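In miniature, an EMA-smoothed trigger of the kind the first basin describes looks like this. The alpha and thresholds are made-up values, not Unlost's tuned constants:

```python
def ema_update(prev: float, sample: float, alpha: float = 0.2) -> float:
    """Exponential moving average: recent samples count, noise decays."""
    return alpha * sample + (1 - alpha) * prev

def basin_triggered(repetition_ema: float, novelty: float) -> bool:
    # Fires only on high smoothed repetition AND low novelty, so a
    # single repeated line doesn't trip the basin.
    return repetition_ema > 0.7 and novelty < 0.2
```

Smoothing is what keeps the detector quiet: one noisy turn moves the EMA only a fifth of the way, so intervention requires a sustained pattern.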
Measured in warnings per 1M tokens. This is your primary "Babysitting Tax" indicator. A rate under 5 is healthy exploration; over 10 indicates a session that is likely stalling or drifting.
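The normalization is simple arithmetic; the band labels below are illustrative names for the ranges stated above:

```python
def friction_rate(warnings: int, tokens: int) -> float:
    """Warnings normalized per 1M tokens (the 'Babysitting Tax')."""
    return warnings / tokens * 1_000_000

def classify(rate: float) -> str:
    # Bands from the text: under 5 healthy, over 10 stalling/drifting.
    if rate < 5:
        return "healthy"
    if rate > 10:
        return "stalling"
    return "borderline"
```

For example, 3 warnings across a 400k-token session normalizes to 7.5 per 1M tokens: not yet stalling, but worth watching.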
The average number of tokens processed between proactive interventions. Short intervals suggest you are fighting the agent's mental model turn-by-turn.
Unlost buckets friction by input token count to identify your agent's Context Inflection Point.
"For most current models, we observe a stability collapse between 8k and 12k tokens. The friction rate typically doubles past this point as the 'lost in the middle' phenomenon degrades grounding."
Every sensor, retrieval strategy, and storage primitive that makes Unlost work.
Unlost is designed to be invisible until you need it. The recording architecture prioritizes your flow above all else:
The Unlost daemon runs as a separate sidecar process. Your agent (Claude, OpenCode) talks to it via a lightweight, fire-and-forget shim. If Unlost crashes or hangs, your agent keeps working.
Heavy lifting (LLM extraction, embedding generation, graph analysis) happens asynchronously in the background. The shim returns control to the agent immediately.
Flush jobs are hashed by content. If an agent loops or retries the same output, Unlost silently suppresses the duplicates (within a 45s window) to keep your history clean.
Drift happens when an agent hallucinates symbols that don't exist. Unlost prevents this by maintaining a live graph of your codebase.
We use unfault-core to parse your code and build a dependency graph in milliseconds. It handles symbol resolution, identifying which functions call which, and calculating centrality scores.
Every time an agent mentions a file or function, we verify it against the graph. If it exists, we link the capsule to that node. If it doesn't, we flag it as potential drift.
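Conceptually, the verification step is a set-membership check against the graph's known symbols. This sketch omits the graph itself and the real resolution logic:

```python
def check_mentions(mentions: list[str],
                   known_symbols: set[str]) -> tuple[list[str], list[str]]:
    """Split agent-mentioned symbols into grounded links and drift flags."""
    grounded = [m for m in mentions if m in known_symbols]
    drift = [m for m in mentions if m not in known_symbols]
    return grounded, drift
```

Grounded mentions become capsule-to-node links; drift flags feed the friction metrics instead of polluting the memory.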
We capture the git HEAD SHA at the start and end of every turn. This allows us to time-travel: we know exactly what the code looked like when a decision was made, even if it has changed since.
Capsules are stored locally as Arrow RecordBatches with three indexes: ANN (vector search), LabelList (tag filtering), and Scalar (time). It's fast, private, and zero-config.
We use Hypothetical Prompt Embeddings (HyPE). At indexing time, we generate questions the capsule answers. At retrieval time, we frame your query as a question. Matching questions-to-questions is significantly more accurate than matching queries-to-documents.
Unlost isn't just a collection of heuristics. It is built on a research-first discipline where every sensor and basin is validated against real-world "Marathon" datasets.
We only intervene when the trajectory signal is unambiguous. We favor silence over noise to protect your flow.
The controller respects "Coffee Pauses." It decays state across breaks to avoid misattributing human pauses to agent stalls.
Our basins are aligned with EASE'25 research on emotional strain, and Nature Scientific Reports on how conversational fluency can mask inaccuracy.
We use unfault-core for sub-second symbol graph validation, ensuring drift detection is backed by factual codebase state.
Cristina Martinez Montes and Ranim Khojah. 2025. Emotional Strain and Frustration in LLM Interactions in Software Engineering. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE '25). DOI: 10.1145/3756681.3756951
Zhu, Y., Wu, Y., & Miller, J. 2024. Conversational presentation mode increases credibility judgements during information search with ChatGPT. Scientific Reports (Nature). DOI: 10.1038/s41598-024-67829-6
Ma, L., et al. 2025. HyPE: Hypothetical Prompt Embeddings for Improved Retrieval. SSRN preprint: papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335. Demonstrates that pre-generating questions at indexing time and matching query-to-question rather than query-to-document significantly improves retrieval precision.
I built Unlost because the intimacy I had with code started slipping once agents were doing the writing. I still wanted to feel close to what was being built - to understand the tradeoffs, to own the decisions, not just approve the diff. Unlost is my attempt to keep that feeling alive in a world where I'm not the one typing every line.
If something feels rough, open an issue. I read them. github.com/unfault/unlost/issues
- Sylvain