Ghost is a persistent memory system for AI agents. Single binary, SQLite-backed, zero server dependencies.
```
┌─────────────────────────────────────────────────────┐
│                      Consumers                       │
│  CLI (ghost put/get/search)   │     MCP Server       │
│  Library (memory.go)          │  (stdio transport)   │
└───────────────┬───────────────┴──────────┬──────────┘
                │                          │
        ┌───────▼────────┐        ┌────────▼───────────┐
        │  internal/cli  │        │ internal/mcpserver │
        │  (Cobra cmds)  │        │    (MCP tools)     │
        └───────┬────────┘        └────────┬───────────┘
                │                          │
                └────────────┬─────────────┘
                             │
                    ┌────────▼──────────┐
                    │  internal/store   │
                    │  Store interface  │
                    │    SQLiteStore    │
                    └──┬──────┬─────┬───┘
                       │      │     │
           ┌───────────▼┐  ┌──▼──┐ ┌▼───────────────┐
           │  chunker   │  │model│ │   embedding    │
           │  (text→    │  │     │ │(Ollama/OpenAI) │
           │  chunks)   │  │     │ │                │
           └────────────┘  └─────┘ └────────────────┘
```
| Package | Purpose | Size |
|---|---|---|
| `cmd/ghost` | Entrypoint — delegates to `cli.RootCmd.Execute()` | 13 LOC |
| `internal/cli` | Cobra commands for all CLI subcommands | ~1900 src / ~2400 test |
| `internal/store` | `Store` interface + `SQLiteStore` implementation | ~4400 src / ~3700 test |
| `internal/model` | Core data types: `Memory`, `Chunk`, `FileRef` | 68 LOC |
| `internal/chunker` | Markdown-aware text splitting (~400 char targets) | 193 LOC |
| `internal/embedding` | Pluggable vector embeddings (local/Ollama/OpenAI) | ~320 LOC |
| `internal/ingest` | Markdown file parser (H2 → sections → memories) | 154 LOC |
| `internal/mcpserver` | MCP server over stdio (10 tools: put, get, search, context, expand, consolidate, curate, reflect, edge, edge_candidates) | ~520 LOC |
| `memory.go` | Public library API — re-exports from internal packages | 102 LOC |
The core unit. Indexed by (namespace, key) with automatic versioning.
```
Memory {
    ID           string    // ULID
    NS           string    // agent namespace (e.g. "agent:pikamini")
    Key          string    // unique within namespace, descriptive
    Content      string    // the actual memory text
    Kind         string    // semantic | episodic | procedural
    Tier         string    // sensory | stm | ltm | dormant (lifecycle stage)
    Pinned       bool      // always loaded in context, exempt from decay
    Priority     string    // low | normal | high | critical
    Importance   float64   // 0.0–1.0 continuous score
    Version      int       // auto-incremented on update
    Supersedes   string    // ID of previous version
    Tags         []string  // first-class filtering (identity, lore, project:ghost, chat:123)
    AccessCount  int       // incremented on every retrieval
    UtilityCount int       // incremented when memory was actually useful
    EstTokens    int       // rough token estimate (len/4 + 20)
    TTL / ExpiresAt        // optional expiration
}
```
Text segments (~400 chars) used for search indexing. Each memory is split into chunks when it is stored.
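For illustration, a minimal sketch of paragraph-based splitting toward a ~400-character target; the real `internal/chunker` is markdown-aware and its exact boundaries will differ, and the names here are hypothetical:

```go
// chunkText is a hypothetical sketch: split on blank-line paragraph
// boundaries, then greedily merge paragraphs until ~400 characters.
// The real internal/chunker is markdown-aware and may split differently.
package chunker

import "strings"

const targetSize = 400

func chunkText(content string) []string {
	paras := strings.Split(content, "\n\n")
	var chunks []string
	var cur strings.Builder
	for _, p := range paras {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		// Flush the current chunk once adding this paragraph would exceed the target.
		if cur.Len() > 0 && cur.Len()+len(p) > targetSize {
			chunks = append(chunks, cur.String())
			cur.Reset()
		}
		if cur.Len() > 0 {
			cur.WriteString("\n\n")
		}
		cur.WriteString(p)
	}
	if cur.Len() > 0 {
		chunks = append(chunks, cur.String())
	}
	return chunks
}
```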
Links a memory to a file path with a relationship type (modified, created, deleted, read).
Weighted, typed associations between memories for graph-based retrieval (spreading activation). Edges are first-class objects with their own lifecycle metadata.
```
Edge {
    FromID         string      // source memory ID
    ToID           string      // target memory ID
    Rel            string      // relates_to | contradicts | depends_on | refines | contains | merged_into | caused_by | prevents | implies
    Weight         float64     // 0.0–1.0, strength of association
    AccessCount    int         // incremented on co-retrieval
    LastAccessedAt *time.Time  // for future decay calculation
}
```
Default weights by relation type:
- Core: contradicts=0.9, refines=0.8, depends_on=0.7, contains=0.6, relates_to=0.5, merged_into=0.0 (audit trail only)
- Reasoning: implies=0.8, caused_by=0.75, prevents=0.7 (for cognitive-memory traversal; not auto-linked, require explicit creation)
Auto-linking: On put, Ghost computes embedding similarity against existing memories in the same namespace. Memories above the threshold (default 0.80, configurable via GHOST_EDGE_THRESHOLD) automatically get relates_to edges.
Re-linking: When a memory is versioned (new ID), all edges referencing the old ID are updated to point to the new ID.
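A sketch of the auto-link check, assuming embeddings are plain `[]float32` vectors compared by cosine similarity; the helper names are illustrative, not Ghost's actual identifiers:

```go
// Hypothetical sketch of the auto-linking check on put: cosine similarity
// between the new memory's embedding and each existing memory in the same
// namespace; pairs above the threshold get a relates_to edge whose weight
// is the similarity score.
package autolink

import "math"

func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// shouldLink reports whether a relates_to edge should be created and with
// what weight, given the default 0.80 threshold (GHOST_EDGE_THRESHOLD).
func shouldLink(newEmb, existingEmb []float32, threshold float64) (float64, bool) {
	sim := cosine(newEmb, existingEmb)
	return sim, sim >= threshold
}
```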
Semantic relationships between memories: relates_to, contradicts, depends_on, refines, contains, merged_into. Reasoning relations: caused_by, prevents, implies (populated via the `ghost infer-edges` CLI — LLM-assisted, offline). These are migrated to the `memory_edges` table on startup.
Uses modernc.org/sqlite with WAL mode. Single file at ~/.ghost/memory.db (configurable via --db flag or $GHOST_DB).
| Table | Purpose |
|---|---|
| `memories` | Core records with all metadata columns |
| `chunks` | Text segments with optional embedding vectors |
| `chunks_fts` | FTS5 virtual table for full-text search |
| `memory_edges` | Weighted directed edges for DAG retrieval |
| `memory_links` | Legacy directed edges (migrated to memory_edges) |
| `memory_files` | File path references |
| `reflect_rules` | Condition→action rules for the reflect engine |
Schema evolution uses ALTER TABLE ADD COLUMN with safe defaults — executed on every startup, idempotent. No migration framework; the migrate() method in sqlite.go handles everything.
FTS5 is kept in sync with chunks via AFTER INSERT/DELETE/UPDATE triggers.
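A sketch of the idempotent-migration pattern, with illustrative column names rather than Ghost's actual schema: each `ALTER TABLE ADD COLUMN` runs on every startup, and a "duplicate column" error is treated as already applied:

```go
// Hypothetical sketch of the idempotent startup migration: every
// ALTER TABLE ADD COLUMN runs on each start; "duplicate column name"
// errors mean the column already exists and are ignored.
package store

import (
	"database/sql"
	"strings"
)

func migrate(db *sql.DB) error {
	stmts := []string{
		// Illustrative columns only; the real schema lives in sqlite.go.
		`ALTER TABLE memories ADD COLUMN pinned INTEGER NOT NULL DEFAULT 0`,
		`ALTER TABLE memories ADD COLUMN importance REAL NOT NULL DEFAULT 0.5`,
	}
	for _, s := range stmts {
		if _, err := db.Exec(s); err != nil {
			if strings.Contains(err.Error(), "duplicate column name") {
				continue // already applied on a previous start
			}
			return err
		}
	}
	return nil
}
```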
Namespaces represent agent identity — each namespace is one agent's isolated memory space.
| Namespace | Purpose |
|---|---|
| `agent:<name>` | Per-agent memory space (e.g. `agent:pikamini`, `agent:coder`) |
Memories are isolated by namespace — no cross-namespace visibility.
Tags replace the old namespace hierarchy for categorization. Use tags to classify what a memory is about:
| Tag | Purpose |
|---|---|
| `identity` | Core agent persona (name, personality, appearance) |
| `lore` | Background knowledge (relationships, fun facts) |
| `chat:<id>` | Per-conversation context |
| `project:<name>` | Project knowledge |
| `learning` | Accumulated insights |
| `convention` | Coding/writing rules |
| `capability` | What the agent can do |
| `user:<name>` | Per-user preferences |
- Namespace = agent identity. One DB can host multiple agents. Each is isolated.
- Tags = flexible metadata. Categories are tags, not namespace segments.
- Pinned = chronic accessibility. Replaces the old `identity` tier. Based on cognitive science — self-knowledge is chronically accessible LTM, not a separate memory store.
- No enforced schema. These are conventions, not constraints.
Three methods fused via Reciprocal Rank Fusion (RRF, k=60):
- FTS5 — Full-text search with stop-word filtering and priority/recency scoring
- LIKE fallback — Per-term substring matching across content, keys, and chunks
- Vector embeddings — Cosine similarity via all-MiniLM-L6-v2 (enabled by default, pure Go)
RRF merges ranked results: score = Σ 1/(60 + rank_i) across methods.
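A sketch of the RRF merge with k=60, assuming each method contributes an ordered list of memory IDs; the function name is illustrative:

```go
// Hypothetical sketch of Reciprocal Rank Fusion: each retrieval method
// contributes 1/(k+rank) for every memory it returned; scores are summed
// across methods and the fused list is sorted by total score.
package search

import "sort"

const rrfK = 60.0

func fuseRRF(rankings ...[]string) []string {
	scores := map[string]float64{}
	for _, ranked := range rankings {
		for i, id := range ranked {
			rank := float64(i + 1) // ranks are 1-based
			scores[id] += 1.0 / (rrfK + rank)
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool { return scores[ids[a]] > scores[ids[b]] })
	return ids
}
```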
Default tier exclusion: dormant and sensory tiers are excluded from search results by default. Use IncludeAll: true to search all tiers.
Temporal intent detection: Queries containing time-related keywords (yesterday, recent, latest, etc.) trigger temporal-aware ranking — FTS weight drops from 0.5→0.2, recency weight rises from 0.3→0.7, and episodic-kind memories get a +0.3 boost in RRF fusion.
Three-phase greedy packing within a token budget:
- Phase 1 (Pinned): Load all `pinned = true` memories, ordered by importance. Fills up to `budget/3`.
- Phase 2 (Search): Query-relevant memories scored by the composite metric, filterable by tags.
- Phase 3 (Edge Expansion): Implements spreading activation (Collins & Loftus, 1975). For each seed from Phase 2, follow the top-K outgoing edges (sorted by weight). Neighbors enter the candidate pool with propagated scores: `propagated = seed_score × edge_weight × damping`. Memories that appear as both direct hits and edge neighbors get an additive boost (capped at `MaxBoostFactor × direct_score`). `contradicts` edges bypass the normal cap — contradicting memories get a minimum score of 80% of the seed's score, ensuring conflicts are always surfaced. Parent boosting: when a seed is a child of a `contains` parent, the parent is pulled into the candidate pool with at least the child's score — ensuring summaries appear even when the query matches children but not the summary. Containment suppression: before packing, children of `contains` parents in the candidate pool are suppressed to avoid redundancy. The final pool is re-sorted and greedy-packed into the remaining budget.
Edge expansion defaults (`EdgeExpansionConfig`):

| Parameter | Default | Purpose |
|---|---|---|
| `Enabled` | true | Enable/disable edge expansion |
| `Damping` | 0.3 | Caps propagated score relative to seed |
| `MaxEdgesPerSeed` | 5 | Limits fanout per seed memory |
| `MinEdgeWeight` | 0.1 | Edges below this are not traversed |
| `MaxExpansionTotal` | 50 | Total cap on new candidates from edges |
| `MaxBoostFactor` | 0.5 | Cap on additive boost for direct+edge hits |
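A sketch of the Phase 3 propagation step under these defaults; the `edge` type and `expand` signature are illustrative, not the actual `internal/store` API:

```go
// Hypothetical sketch of edge expansion (spreading activation): for each
// Phase 2 seed, follow up to MaxEdgesPerSeed outgoing edges (strongest
// first) and add neighbors with propagated = seedScore * edgeWeight * damping.
package contextpack

import "sort"

type edge struct {
	ToID   string
	Weight float64
}

func expand(seeds map[string]float64, outgoing map[string][]edge,
	damping float64, maxEdgesPerSeed, maxTotal int, minWeight float64) map[string]float64 {

	candidates := map[string]float64{}
	added := 0
	for seedID, seedScore := range seeds {
		if added >= maxTotal {
			break
		}
		edges := append([]edge(nil), outgoing[seedID]...)
		sort.Slice(edges, func(i, j int) bool { return edges[i].Weight > edges[j].Weight })
		if len(edges) > maxEdgesPerSeed {
			edges = edges[:maxEdgesPerSeed]
		}
		for _, e := range edges {
			if added >= maxTotal {
				break
			}
			if e.Weight < minWeight {
				continue // weak edges are not traversed
			}
			propagated := seedScore * e.Weight * damping
			if propagated > candidates[e.ToID] {
				candidates[e.ToID] = propagated
			}
			added++
		}
	}
	return candidates
}
```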
Composite scoring (4 additive factors + multiplicative tier modifier):
Additive weights vary by memory kind to match cognitive retrieval patterns:
| Factor | Semantic | Episodic | Procedural |
|---|---|---|---|
| Relevance | 0.45 | 0.30 | 0.35 |
| Recency | 0.10 | 0.40 | 0.05 |
| Importance | 0.30 | 0.15 | 0.15 |
| Access freq | 0.15 | 0.15 | 0.45 |
The composite score is then multiplied by a tier modifier so that tier transitions have meaningful impact on ranking (not just a small additive boost):
Tier multipliers: ltm=1.0, stm=0.8, dormant=0.15, sensory=0.1
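A sketch of the composite calculation, combining the kind-specific weights above with the tier multiplier; factor inputs are assumed to be normalized to 0–1, and the names are illustrative:

```go
// Hypothetical sketch of composite scoring: four additive factors weighted
// by memory kind, then multiplied by the tier modifier.
package scoring

var kindWeights = map[string][4]float64{
	//             relevance, recency, importance, access frequency
	"semantic":   {0.45, 0.10, 0.30, 0.15},
	"episodic":   {0.30, 0.40, 0.15, 0.15},
	"procedural": {0.35, 0.05, 0.15, 0.45},
}

var tierMultiplier = map[string]float64{
	"ltm": 1.0, "stm": 0.8, "dormant": 0.15, "sensory": 0.1,
}

// composite assumes each factor is already normalized to [0,1].
func composite(kind, tier string, relevance, recency, importance, accessFreq float64) float64 {
	w, ok := kindWeights[kind]
	if !ok {
		w = kindWeights["semantic"]
	}
	score := w[0]*relevance + w[1]*recency + w[2]*importance + w[3]*accessFreq
	m, ok := tierMultiplier[tier]
	if !ok {
		m = 1.0
	}
	return score * m
}
```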
Per-memory token cap: MaxMemoryTokens (default 400) limits how many tokens any single memory can contribute to the output. Memories exceeding the cap are automatically excerpted, preventing large memories from dominating the budget and leaving room for more diverse results. Set to -1 to disable.
Memories that don't fully fit the remaining budget get excerpted (truncated with "...") if at least 25 tokens remain.
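A sketch of the cap using the rough `len/4 + 20` token estimate from the `Memory` fields; the character-based cut-off is an assumption for illustration:

```go
// Hypothetical sketch of per-memory token capping: estimate tokens as
// len(content)/4 + 20, and excerpt content that exceeds maxTokens.
package budget

func estTokens(content string) int {
	return len(content)/4 + 20
}

// capMemory truncates content to roughly maxTokens worth of characters,
// appending "..." when an excerpt is taken. maxTokens = -1 disables the cap.
func capMemory(content string, maxTokens int) string {
	if maxTokens < 0 || estTokens(content) <= maxTokens {
		return content
	}
	// Invert the estimate: maxTokens ~= chars/4 + 20  =>  chars ~= (maxTokens-20)*4.
	chars := (maxTokens - 20) * 4
	if chars < 0 {
		chars = 0
	}
	if chars >= len(content) {
		return content
	}
	return content[:chars] + "..."
}
```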
Rule-based lifecycle management. No LLM calls — purely deterministic.
Each rule has a condition (AND-joined fields) and an action:
Pinned memories (pinned = true) are exempt from all lifecycle rules — they skip rule evaluation entirely.
| Built-in Rule | Condition | Action |
|---|---|---|
| `sys-promote-sensory` | sensory, >1h old, >1 access | PROMOTE to STM |
| `sys-decay-sensory` | sensory, >4h old | DELETE |
| `sys-decay-unaccessed` | STM, >48h old, <10 accesses | DECAY importance ×0.95 (min 0.1) |
| `sys-promote-to-ltm` | STM, >72h old, >50 accesses | PROMOTE to LTM |
| `sys-demote-stale-ltm` | LTM, >168h since last access | DEMOTE to dormant |
| `sys-prune-low-utility` | >20 accesses, utility ratio <0.05 | DEMOTE to dormant |
| `sys-merge-similar` | STM, embedding similarity >0.9 | LINK (create relates_to edges, non-destructive) |
| `sys-dedup-all` | Any tier, embedding similarity >0.92 | DEDUP (archive duplicates to dormant, keep canonical) |
Rules are evaluated in two passes:
- Per-memory pass — evaluated in priority order (first-match-wins). Sensory rules run at higher priority to quickly promote attended memories or discard unattended ones.
- Similarity pass — rules with `cond_similarity_gt` run pairwise embedding comparison. The default strategy is `link_only` (non-destructive): it creates `relates_to` edges between similar memories without deleting any content. The reflect response includes `linked_clusters` showing which groups were linked. The agent can then use `ghost clusters` to discover all clusters and `ghost consolidate` to create summaries when ready. The legacy `keep_highest_importance` strategy (destructive merge) is still available for custom rules.
Custom rules can be added via ghost rule set.
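A sketch of the per-memory pass (first-match-wins, pinned exempt); the `rule` shape here is illustrative, not the `reflect_rules` schema:

```go
// Hypothetical sketch of the per-memory reflect pass: rules are sorted by
// priority and evaluated first-match-wins; pinned memories are skipped
// entirely.
package reflectengine

import "sort"

type memory struct {
	Pinned bool
	Tier   string
}

type rule struct {
	Priority int
	Matches  func(m memory) bool // AND-joined condition fields
	Action   string              // PROMOTE | DEMOTE | DECAY | DELETE | ...
}

func applyRules(mems []memory, rules []rule) map[int]string {
	sort.Slice(rules, func(i, j int) bool { return rules[i].Priority > rules[j].Priority })
	actions := map[int]string{} // memory index -> chosen action
	for i, m := range mems {
		if m.Pinned {
			continue // pinned memories are exempt from lifecycle rules
		}
		for _, r := range rules {
			if r.Matches(m) {
				actions[i] = r.Action
				break // first-match-wins
			}
		}
	}
	return actions
}
```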
When similar memories accumulate, the agent-driven consolidation flow is:
1. ghost reflect → links similar memories, returns linked_clusters
2. ghost clusters -n ... → shows all connected groups of similar memories
3. agent reviews clusters → decides which to consolidate, writes summary text
4. ghost consolidate → creates summary parent with contains edges
5. ghost context → returns summary, suppresses children (parent boosting)
All original memories are preserved — summaries are derived views. This follows the LCM (Lossless Context Management) pattern: compaction creates materialized summaries while originals remain searchable.
Edges have their own lifecycle managed alongside memory nodes:
- Co-retrieval strengthening (Hebbian learning — "fire together, wire together"): When two connected memories appear together in a `ghost_context` response, their shared edge is strengthened: `weight += 0.05 × (1 - weight)` (diminishing returns approaching 1.0), `access_count++`, `last_accessed_at` updated.
- Auto-linking on put: When a memory is stored, Ghost computes embedding similarity against existing memories in the same namespace. Memories above the threshold (default 0.80, configurable via `GHOST_EDGE_THRESHOLD`) automatically get `relates_to` edges with weight = similarity score.
- Edge re-linking: When a memory is versioned (new ID supersedes old), all edges referencing the old ID are updated to point to the new ID within the same transaction.
- Edge decay: During reflect, edges not accessed in 30+ days with <3 accesses have their weight multiplied by 0.9.
- Edge pruning: Edges with weight < 0.05 are automatically deleted during reflect.
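A sketch of the co-retrieval update, using the `Edge` fields shown earlier; the function name is illustrative:

```go
// Hypothetical sketch of Hebbian edge strengthening on co-retrieval:
// weight moves toward 1.0 with diminishing returns, access metadata updates.
package edges

import "time"

type Edge struct {
	FromID         string
	ToID           string
	Rel            string
	Weight         float64
	AccessCount    int
	LastAccessedAt *time.Time
}

func strengthenOnCoRetrieval(e *Edge, now time.Time) {
	e.Weight += 0.05 * (1 - e.Weight) // diminishing returns approaching 1.0
	e.AccessCount++
	e.LastAccessedAt = &now
}
```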
Exposes 10 tools over stdio transport using github.com/modelcontextprotocol/go-sdk:
- `ghost_put` — Store/update a memory (auto-links similar via edges)
- `ghost_get` — Retrieve a specific memory by namespace and key
- `ghost_search` — Full-text search with ranking
- `ghost_context` — Budget-aware context assembly with edge expansion (includes `compaction_suggested` signal; supports `min_score`/`min_spread` noise filters)
- `ghost_expand` — List consolidation nodes (no key) or drill into a summary to get its children (with key)
- `ghost_consolidate` — Create a summary memory with `contains` edges to source memories in one operation
- `ghost_curate` — Instance-level lifecycle actions on individual memories (promote, demote, boost, diminish, archive, delete, pin, unpin)
- `ghost_edge` — Create, remove, or list weighted edges between memories for DAG-based retrieval
- `ghost_reflect` — Run lifecycle rules across all memories (promote, decay, prune, merge similar, edge decay)
- `ghost_edge_candidates` — Return `relates_to` pairs that do not yet have a typed reasoning edge, so the calling agent (itself an LLM) can classify them and commit edges via `ghost_edge`. Ghost does zero LLM work — the hot path AND this hygiene path stay LLM-free. For fully-automated batch inference without an agent in the loop, use the `ghost infer-edges` CLI instead, which does spawn an LLM.
Started via ghost mcp-serve subcommand. See Claude Code Setup or Integration Guide for usage patterns.
Claude Code hooks can automate memory capture without relying on the agent remembering to call ghost_put. Both hooks use type: "command" shell scripts that pipe the session transcript to claude -p (headless mode) for analysis, then execute the resulting ghost put CLI commands.
- PreCompact hook (async) — fires before context compression on long sessions. Reads the last ~100 transcript lines, extracts learnings, stores them via the `ghost` CLI. Zero latency cost.
- Stop hook (async) — fires after each agent turn. Same pattern but reads more transcript (the last ~200 lines) as the final chance to capture learnings. Zero latency cost.
See the hooks section in the Claude Code Setup guide.
memory.go at the package root provides a public Go API by re-exporting internal types:
```go
import memory "github.com/rcliao/ghost"

store, _ := memory.NewSQLiteStore("path/to/db")
store.Put(ctx, memory.PutParams{NS: "project:x", Key: "fact", Content: "..."})
results, _ := store.Search(ctx, memory.SearchParams{Query: "something"})
```

The public `Store` interface is a subset of the internal one — core CRUD, search, context, reflect, and GC.
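A self-contained version of the snippet above, with imports and error handling added; the signatures and field names come from the snippet, while the output formatting is just illustrative:

```go
package main

import (
	"context"
	"fmt"
	"log"

	memory "github.com/rcliao/ghost"
)

func main() {
	store, err := memory.NewSQLiteStore("path/to/db")
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()

	// Store (or version) a memory under (namespace, key).
	store.Put(ctx, memory.PutParams{NS: "project:x", Key: "fact", Content: "..."})

	// Full-text + vector search, fused with RRF.
	results, err := store.Search(ctx, memory.SearchParams{Query: "something"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", results)
}
```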
| Command | Description |
|---|---|
| `put` | Store or update a memory |
| `get` | Retrieve by namespace + key |
| `list` | List memories (filterable by ns, kind, tags) |
| `rm` | Soft-delete (or hard-delete) |
| `search` | Full-text search |
| `context` | Assemble relevant memories within token budget |
| `peek` | Lightweight index of memory state |
| `history` | Full version history of a key |
| `edge` | Create, remove, or list weighted edges between memories |
| `consolidate` | Create a summary memory with contains edges to source memories |
| `clusters` | Discover groups of similar memories connected by edges |
| `link` | Create/remove semantic relationships (legacy) |
| `files` | Manage file references |
| `tags` | List, rename, or remove tags |
| `ns` | Namespace operations (list, rm) |
| `reflect` | Run lifecycle rules |
| `gc` | Garbage collect expired/stale memories |
| `stats` | Database statistics |
| `export` / `import` | JSON export/import |
| `ingest` | Parse markdown files into memories |
| `mcp` | Start MCP server on stdio |
| Dependency | Purpose |
|---|---|
| `modernc.org/sqlite` | Pure Go SQLite (no CGo) |
| `github.com/spf13/cobra` | CLI framework |
| `github.com/oklog/ulid/v2` | Time-sortable unique IDs |
| `github.com/modelcontextprotocol/go-sdk` | MCP server protocol |
| `github.com/knights-analytics/hugot` | Local sentence embeddings (all-MiniLM-L6-v2 via GoMLX, pure Go) |
- **No LLM calls inside the library.** Ghost is a storage and retrieval layer. Intelligence lives in the calling agent.
- **Namespace over kind.** The 3-kind taxonomy classifies what type of knowledge. Namespaces organize who or what project it belongs to.
- **Explicit over implicit.** Importance, utility, and tier transitions are set by the caller or by deterministic rules — never silently inferred.
- **Budget-aware retrieval.** Context assembly respects token budgets with greedy packing and excerpting.
- **Backwards compatible storage.** Schema changes use `ALTER TABLE ADD COLUMN` with safe defaults. Existing data always remains readable.
Ghost supports pluggable embedding providers via GHOST_EMBED_PROVIDER:
| Provider | Value | Description |
|---|---|---|
| Local (default) | `local` or unset | all-MiniLM-L6-v2 via hugot/GoMLX. Pure Go, no CGo. Model (~86MB) downloaded on first use to `~/.ghost/models/`. ~80ms per embedding. |
| Ollama | `ollama` | Local Ollama instance. Configurable model via `GHOST_EMBED_MODEL`. |
| OpenAI | `openai` | OpenAI-compatible API. Requires `OPENAI_API_KEY`. |
| Disabled | `none` | No vector embeddings. Search uses FTS5 + LIKE only. |
Priority order:
1. `--db` flag
2. `$GHOST_DB` env var
3. `$AGENT_MEMORY_DB` (legacy fallback)
4. `~/.ghost/memory.db`
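A sketch of that resolution order; the function name is hypothetical:

```go
// Hypothetical sketch of the DB path resolution order: --db flag,
// $GHOST_DB, legacy $AGENT_MEMORY_DB, then ~/.ghost/memory.db.
package config

import (
	"os"
	"path/filepath"
)

func resolveDBPath(dbFlag string) string {
	if dbFlag != "" {
		return dbFlag
	}
	if p := os.Getenv("GHOST_DB"); p != "" {
		return p
	}
	if p := os.Getenv("AGENT_MEMORY_DB"); p != "" {
		return p
	}
	home, _ := os.UserHomeDir()
	return filepath.Join(home, ".ghost", "memory.db")
}
```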