This document gives AI agents (and humans) the patterns, style, and constraints of this codebase so changes stay consistent.
- llmem (binary `llmem`): persistent memory service for AI agents. Store, search, and retrieve memories with similarity matching (BM25 throughout; optional lexical boost for query–doc). Porter stemming. Built for Cursor via MCP; also exposes a REST API.
- Tech: Go 1.24+, single `main` package. HTTP: Gin. MCP: `modelcontextprotocol/go-sdk`. Storage: SQLite (`modernc.org/sqlite`). Tokenization: `rekram1-node/tokenizer`; stemming: in-house Porter in `stemmer.go`.
- One main concern per file: `memory.go` (core store), `memory_add.go`, `memory_get.go`, `memory_search.go`, `memory_relevant.go`, `memory_context.go`, etc. BM25 helpers in `memory_bm25.go`; storage interface and SQLite in `storage.go`; API and MCP in `api.go` and `mcp_server.go`.
- Options structs for construction: `MemoryStoreOptions`, `TgptSummarizerOptions`. Use them instead of long argument lists.
- Internal entry points when the same behavior is needed from two callers: e.g. `addInternal(..., addOpts)` used by `Add` and `Import` so Import can preserve ID and timestamps without changing the public `Add` API.
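The options-struct and internal-entry-point patterns above can be sketched as follows. This is a minimal illustration, not the repository's code: `StoreOptions`, `Store`, `chunk`, and the fields of `addOpts` are hypothetical stand-ins, and locking, persistence, and advancing the ID counter on import are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// StoreOptions is a hypothetical stand-in for MemoryStoreOptions.
type StoreOptions struct {
	SimilarityDelta float64 // zero means "use the default"
}

// addOpts carries internal-only overrides; the zero value means "generate everything".
type addOpts struct {
	id        string
	createdAt time.Time
}

type chunk struct {
	ID        string
	Text      string
	CreatedAt time.Time
}

type Store struct {
	delta  float64
	nextID int
	chunks map[string]chunk
}

// NewStore takes one options struct instead of a long argument list.
func NewStore(opts StoreOptions) *Store {
	delta := opts.SimilarityDelta
	if delta == 0 {
		delta = 0.35
	}
	return &Store{delta: delta, chunks: map[string]chunk{}}
}

// Add is the public API: callers cannot choose the ID or the timestamp.
func (s *Store) Add(text string) (chunk, error) {
	return s.addInternal(text, addOpts{})
}

// Import preserves the ID and creation time of an exported record.
func (s *Store) Import(c chunk) (chunk, error) {
	return s.addInternal(c.Text, addOpts{id: c.ID, createdAt: c.CreatedAt})
}

// addInternal holds the shared behavior for both callers.
func (s *Store) addInternal(text string, o addOpts) (chunk, error) {
	text = strings.TrimSpace(text)
	if text == "" {
		return chunk{}, errors.New("empty text")
	}
	id := o.id
	if id == "" {
		s.nextID++
		id = fmt.Sprintf("m%d", s.nextID)
	}
	created := o.createdAt
	if created.IsZero() {
		created = time.Now()
	}
	c := chunk{ID: id, Text: text, CreatedAt: created}
	s.chunks[id] = c
	return c, nil
}

func main() {
	s := NewStore(StoreOptions{})
	a, _ := s.Add("first note")
	old := time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC)
	b, _ := s.Import(chunk{ID: "m42", Text: "imported", CreatedAt: old})
	fmt.Println(a.ID, b.ID, b.CreatedAt.Equal(old)) // m1 m42 true
}
```

Keeping the overrides in an unexported struct means the public `Add` signature stays stable when import needs another preserved field.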
- All store reads/writes go through `MemoryStore.mu` (RWMutex). Writes use `s.mu.Lock()`, reads use `s.mu.RLock()`. Methods that assume the lock is held are named with a `Locked` suffix (e.g. `bm25MemSimilarityLocked`, `neighborsLocked`, `bm25ScoreLocked`).
- Persistence: `toStorageData()` does a deep copy of maps/slices so the snapshot is safe after releasing the lock. When adding new map/slice fields to `storedChunk`, they must be copied in `toStorageData()` too.
- Async writes (e.g. access tracking) use `s.wg.Add(1)` before the goroutine and `defer s.wg.Done()` inside. `Close()` calls `s.wg.Wait()` before closing storage so in-flight writes finish.
- SQLite: single connection (`SetMaxOpenConns(1)`), WAL mode, `busy_timeout` to avoid SQLITE_BUSY.
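A rough sketch of the `Locked` naming convention and the WaitGroup-guarded async write described above. The `store` type, its fields, and the in-memory `saved` map (standing in for the storage backend) are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

type store struct {
	mu     sync.RWMutex
	wg     sync.WaitGroup
	access map[string]int

	saveMu sync.Mutex
	saved  map[string]int // hypothetical stand-in for the storage backend
}

func newStore() *store {
	return &store{access: map[string]int{}, saved: map[string]int{}}
}

// accessCountLocked assumes the caller already holds s.mu (read or write).
func (s *store) accessCountLocked(id string) int {
	return s.access[id]
}

// AccessCount is the public read path: take the read lock, then delegate.
func (s *store) AccessCount(id string) int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.accessCountLocked(id)
}

// Touch bumps the counter under the write lock, then persists asynchronously.
func (s *store) Touch(id string) {
	s.mu.Lock()
	s.access[id]++
	n := s.access[id] // copy the value while the lock is held
	s.mu.Unlock()

	s.wg.Add(1) // registered before the goroutine starts, so Close cannot miss it
	go func() {
		defer s.wg.Done()
		s.saveMu.Lock()
		defer s.saveMu.Unlock()
		if n > s.saved[id] { // goroutines may finish out of order; keep the highest count
			s.saved[id] = n
		}
	}()
}

// Close waits for in-flight writes; a real store would close its storage afterwards.
func (s *store) Close() {
	s.wg.Wait()
}

func main() {
	s := newStore()
	for i := 0; i < 3; i++ {
		s.Touch("m1")
	}
	s.Close()
	fmt.Println(s.AccessCount("m1"), s.saved["m1"]) // 3 3
}
```

Calling `wg.Add(1)` inside the goroutine instead would let `Close` return before the write was ever registered.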
- Sentinel errors in `memory.go`: `ErrNotFound`, `ErrEmptyID`, `ErrIncompatibleTypes`, `ErrTooFewIDs`. Handlers use `errors.Is(err, ErrNotFound)` to map to HTTP 404; others typically 400/500.
- REST: request/response types per endpoint (e.g. `createMemoryRequest`, `createMemoryResponse`). Bind with `c.ShouldBindJSON(&req)`. Return errors as `errorResponse{Error: err.Error()}`. Always trim and validate IDs and query params (e.g. `strings.TrimSpace(c.Param("id"))`).
- Ensure JSON arrays are never null when it matters: e.g. `if related == nil { related = []RelatedMemory{} }` before sending.
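The error mapping and the nil-slice guard above can be illustrated with the standard library alone. This is a sketch under assumptions: the error messages, the JSON field names, the `relatedResponse` type, and the choice of 400 for `ErrEmptyID` are illustrative, and the real handlers use Gin rather than plain functions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// Sentinel errors named in the text; the messages are made up for this sketch.
var (
	ErrNotFound = errors.New("memory not found")
	ErrEmptyID  = errors.New("empty id")
)

type errorResponse struct {
	Error string `json:"error"`
}

type RelatedMemory struct {
	ID string `json:"id"`
}

// relatedResponse is a hypothetical response shape.
type relatedResponse struct {
	Related []RelatedMemory `json:"related"`
}

// statusFor maps errors to status codes; errors.Is also matches wrapped errors.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	case errors.Is(err, ErrEmptyID):
		return http.StatusBadRequest // assumed: validation errors map to 400
	default:
		return http.StatusInternalServerError
	}
}

// encodeRelated swaps a nil slice for an empty one so clients receive [] and not null.
func encodeRelated(related []RelatedMemory) string {
	if related == nil {
		related = []RelatedMemory{}
	}
	b, _ := json.Marshal(relatedResponse{Related: related})
	return string(b)
}

func main() {
	wrapped := fmt.Errorf("get m7: %w", ErrNotFound)
	fmt.Println(statusFor(wrapped)) // 404
	fmt.Println(encodeRelated(nil)) // {"related":[]}
	b, _ := json.Marshal(errorResponse{Error: wrapped.Error()})
	fmt.Println(string(b)) // {"error":"get m7: memory not found"}
}
```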
- Doc–doc edges (who is similar to whom): BM25 in `memory_bm25.go`. `bm25MemSimilarityLocked(a, b)` = max(bm25(a→b), bm25(b→a)), normalized as `score/(1+score)` to [0,1). Used by Add, Update, RebuildEdges. Threshold: `DefaultSimilarityDelta` (0.35).
- Query–doc (Search / FindRelevant): shared core in `scoreChunksLocked` (`memory.go`): BM25 via `bm25ScoreLocked` + optional lexical boost via `lexicalOverlapScoreLocked` / `idfLocked`, with corpus-aware synonym expansion. Search sort: sim → accessCount → ID. FindRelevant: lower threshold (0.2); sort sim → BM25 raw → accessCount → recency.
- Context restoration (`memory_context`): six priority hashtags at start of text: `#self`, `#goal`, `#relationship`, `#status`, `#principle`, `#thought`. Sorted by `effectiveTime = max(UpdatedAt, CreatedAt)`; configurable `per_category`.
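The normalization and the Search ordering above, sketched with the raw BM25 scores passed in as plain numbers. The function and type names are hypothetical, and the sort directions (similarity and access count descending, ID ascending by string comparison) are assumptions, since the text only gives the key order.

```go
package main

import (
	"fmt"
	"sort"
)

const defaultSimilarityDelta = 0.35 // mirrors DefaultSimilarityDelta in the text

// normalize squashes a non-negative raw score into [0, 1) via score/(1+score).
func normalize(raw float64) float64 {
	if raw <= 0 {
		return 0
	}
	return raw / (1 + raw)
}

// symmetricSimilarity keeps the stronger of the two directed scores, then normalizes.
func symmetricSimilarity(aToB, bToA float64) float64 {
	best := aToB
	if bToA > best {
		best = bToA
	}
	return normalize(best)
}

type hit struct {
	ID          string
	Sim         float64
	AccessCount int
}

// sortSearchHits orders by sim, then accessCount (both descending, assumed), then ID.
func sortSearchHits(hits []hit) {
	sort.SliceStable(hits, func(i, j int) bool {
		a, b := hits[i], hits[j]
		if a.Sim != b.Sim {
			return a.Sim > b.Sim
		}
		if a.AccessCount != b.AccessCount {
			return a.AccessCount > b.AccessCount
		}
		return a.ID < b.ID
	})
}

func main() {
	fmt.Println(normalize(1), symmetricSimilarity(0.2, 3))               // 0.5 0.75
	fmt.Println(symmetricSimilarity(0.5, 0.1) >= defaultSimilarityDelta) // false
	hits := []hit{{"m2", 0.5, 1}, {"m1", 0.5, 1}, {"m3", 0.5, 4}, {"m4", 0.9, 0}}
	sortSearchHits(hits)
	for _, h := range hits {
		fmt.Print(h.ID, " ") // m4 m3 m1 m2
	}
	fmt.Println()
}
```

Under this normalization a threshold of 0.35 corresponds to a raw score of about 0.54, since 0.35 / (1 - 0.35) ≈ 0.538.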
- `Storage` interface: `Init`, `LoadAll`, `Save`, `Delete`, `Close`. `storedChunkData` is the serializable shape; it includes vectors, edges, tokens, access fields. SQL schema has legacy columns (`label_vector_json`, `label_norm`); code writes harmless defaults and ignores them on load. Do not remove columns; keep LoadAll/Save in sync with any new fields.
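For orientation, an interface with the five method names above might look like the sketch below. Only the method names come from the text; every parameter and return type, the trimmed `chunkData` record, and both implementations are guesses for illustration, so check `storage.go` for the real signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkData is a trimmed, hypothetical stand-in for storedChunkData.
type chunkData struct {
	ID     string
	Text   string
	Tokens []string
}

// Storage lists the five methods named in the text; the signatures are assumed.
type Storage interface {
	Init() error
	LoadAll() ([]chunkData, error)
	Save(c chunkData) error
	Delete(id string) error
	Close() error
}

// nullStorage discards everything, which is convenient in tests.
type nullStorage struct{}

func (nullStorage) Init() error                   { return nil }
func (nullStorage) LoadAll() ([]chunkData, error) { return nil, nil }
func (nullStorage) Save(chunkData) error          { return nil }
func (nullStorage) Delete(string) error           { return nil }
func (nullStorage) Close() error                  { return nil }

// memStorage keeps records in a map; a SQLite backend would sit behind the same interface.
type memStorage struct {
	rows map[string]chunkData
}

func (m *memStorage) Init() error {
	m.rows = map[string]chunkData{}
	return nil
}

func (m *memStorage) LoadAll() ([]chunkData, error) {
	out := make([]chunkData, 0, len(m.rows))
	for _, c := range m.rows {
		out = append(out, c)
	}
	return out, nil
}

func (m *memStorage) Save(c chunkData) error {
	if c.ID == "" {
		return errors.New("empty id")
	}
	c.Tokens = append([]string(nil), c.Tokens...) // copy so the caller can reuse its slice
	m.rows[c.ID] = c
	return nil
}

func (m *memStorage) Delete(id string) error {
	delete(m.rows, id)
	return nil
}

func (m *memStorage) Close() error { return nil }

// Compile-time checks that both types satisfy the interface.
var (
	_ Storage = nullStorage{}
	_ Storage = (*memStorage)(nil)
)

func main() {
	var s Storage = &memStorage{}
	_ = s.Init()
	_ = s.Save(chunkData{ID: "m1", Text: "hello", Tokens: []string{"hello"}})
	all, _ := s.LoadAll()
	fmt.Println(len(all), all[0].ID) // 1 m1
	_ = s.Delete("m1")
	all, _ = s.LoadAll()
	fmt.Println(len(all)) // 0
}
```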
- `make test` and `make test-race` must pass before considering a change done. Tests use `NullStorage` and a `fakeSummarizer` where appropriate. Helpers like `mustNewMemoryStore(t, opts)` for setup. Prefer focused tests and table-driven style where it clarifies cases.
- Do not restart the llmem service unless strictly necessary: restarting breaks the MCP client connection in Cursor and makes the memory tools appear broken.
- Prefer concise, readable code; comment when the “why” or concurrency/data flow is non-obvious.
- Constants for magic numbers: `MaxMemoryBytes`, `DefaultSimilarityDelta`, `bm25K1`, `bm25B`. Config via `envOr` / `envOrDuration` (e.g. `LLMEM_PORT`, `LLMEM_DB`, `LLMEM_AUTO_CONSOLIDATE_INTERVAL`).
- Similarity threshold: `MemoryStore.similarityDelta`; set via `MemoryStoreOptions.SimilarityDelta`.
- IDs: generated with `newID()` (atomic counter, format `m1`, `m2`, …). `parseChunkID` for numeric part; when importing, advance `nextID` so future IDs don’t collide.
- Scopes: `normalizeScopes` for trim/dedup; `matchesScope` / `scopesCompatible` for filtering and edge creation. Global = nil/empty scopes.
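The helpers named above could be approximated as follows. The function names match the text, but the bodies are assumptions rather than the repository's implementation: `observeImportedID` is a hypothetical name for the "advance `nextID` on import" step, and dropping empty scopes while keeping first-seen order is a guess at what trim/dedup means here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"sync/atomic"
)

// envOr returns the environment value for key, or def when it is unset or empty.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

var nextID atomic.Uint64

// newID returns m1, m2, ... from an atomic counter.
func newID() string {
	return "m" + strconv.FormatUint(nextID.Add(1), 10)
}

// parseChunkID extracts N from "m<N>"; ok is false for any other shape.
func parseChunkID(id string) (n uint64, ok bool) {
	rest, found := strings.CutPrefix(id, "m")
	if !found {
		return 0, false
	}
	n, err := strconv.ParseUint(rest, 10, 64)
	if err != nil || n == 0 {
		return 0, false
	}
	return n, true
}

// observeImportedID (hypothetical name) raises the counter to at least the
// imported ID, so later generated IDs cannot collide with it.
func observeImportedID(id string) {
	n, ok := parseChunkID(id)
	if !ok {
		return
	}
	for {
		cur := nextID.Load()
		if cur >= n || nextID.CompareAndSwap(cur, n) {
			return
		}
	}
}

// normalizeScopes trims, drops empties, and de-duplicates in first-seen order.
// A nil result stands for the global scope.
func normalizeScopes(in []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, s := range in {
		s = strings.TrimSpace(s)
		if s == "" || seen[s] {
			continue
		}
		seen[s] = true
		out = append(out, s)
	}
	return out
}

func main() {
	fmt.Println(envOr("LLMEM_PORT", "9980")) // the 9980 default is an assumption
	observeImportedID("m41")
	fmt.Println(newID()) // m42
	fmt.Println(normalizeScopes([]string{" work ", "home", "work", ""}))
}
```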
Things to avoid:
- Adding new map/slice fields to `storedChunk` without updating `toStorageData()` (risk of races with async persistence).
- Using `Get` inside consolidation when you only need to read chunk data (it increments access count); read under RLock from `s.chunks` instead.
- Holding `s.mu` across I/O or external calls (storage, summarizer); snapshot under lock, then release before calling out.
- Adding a “Versioning” section to the README (explicitly removed by the maintainer).
- Restarting the service from an agent workflow when you can avoid it (MCP in Cursor depends on a long-lived process).
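The "snapshot under lock, then release before calling out" rule, together with the deep-copy requirement, can be sketched like this. The `store` and `storedChunk` shapes are simplified assumptions, and `summarize` merely stands in for a slow external call such as the summarizer or the storage backend.

```go
package main

import (
	"fmt"
	"sync"
)

// storedChunk is a simplified, hypothetical shape with one map and one slice field.
type storedChunk struct {
	Text   string
	Edges  map[string]float64
	Tokens []string
}

type store struct {
	mu     sync.RWMutex
	chunks map[string]*storedChunk
}

// snapshot deep-copies one chunk under RLock and does not touch access counters.
func (s *store) snapshot(id string) (storedChunk, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	c, ok := s.chunks[id]
	if !ok {
		return storedChunk{}, false
	}
	cp := storedChunk{
		Text:   c.Text,
		Edges:  make(map[string]float64, len(c.Edges)),
		Tokens: append([]string(nil), c.Tokens...),
	}
	for k, v := range c.Edges {
		cp.Edges[k] = v
	}
	return cp, true
}

// summarize stands in for a slow external call.
func summarize(c storedChunk) string {
	return fmt.Sprintf("%d tokens, %d edges", len(c.Tokens), len(c.Edges))
}

// Describe copies under the lock, then calls out without holding s.mu.
func (s *store) Describe(id string) (string, bool) {
	c, ok := s.snapshot(id) // the read lock is released when snapshot returns
	if !ok {
		return "", false
	}
	return summarize(c), true
}

func main() {
	s := &store{chunks: map[string]*storedChunk{
		"m1": {Text: "note", Edges: map[string]float64{"m2": 0.4}, Tokens: []string{"note"}},
	}}
	snap, _ := s.snapshot("m1")

	// A later write to the live chunk must not leak into the snapshot.
	s.mu.Lock()
	s.chunks["m1"].Edges["m3"] = 0.9
	s.chunks["m1"].Tokens[0] = "changed"
	s.mu.Unlock()

	fmt.Println(len(snap.Edges), snap.Tokens[0]) // 1 note
	fmt.Println(s.Describe("m1"))                // 1 tokens, 2 edges true
}
```

A new map or slice field that is assigned rather than copied in `snapshot` would share memory with the live chunk, which is the race the first bullet warns about.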
The llmem service exposes 15 MCP tools (e.g. `memory_add`, `memory_search`, `memory_context`). In Cursor, the tool names are prefixed by the MCP server identifier, but the exact prefix varies between sessions and Cursor versions:

- `user-llmem-memory_*` (e.g. `user-llmem-memory_stats`)
- `mcp_llmem_memory_*` (e.g. `mcp_llmem_memory_stats`)

If tools return "Tool not found", try the other prefix before falling back to the REST API.
REST fallback: the same operations are always available via HTTP at `localhost:9980`. Quote the URLs so the shell does not interpret `?` and `&`:

    curl "http://localhost:9980/v1/memories/context?per_category=2"
    curl "http://localhost:9980/v1/memories/relevant?message=your+query&limit=5"
    curl "http://localhost:9980/v1/stats"
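When calling the REST fallback from Go instead of curl, the query string is best built with `net/url` so that spaces and `&` in the message are escaped. A small sketch, assuming the base address and the `/v1/memories/relevant` path shown above; `relevantURL` is a hypothetical helper, and the resulting string can be handed to `http.Get`.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

const baseURL = "http://localhost:9980" // address taken from the curl examples

// relevantURL builds the /v1/memories/relevant request URL with an escaped query.
func relevantURL(message string, limit int) string {
	q := url.Values{}
	q.Set("message", message)
	q.Set("limit", strconv.Itoa(limit))
	return baseURL + "/v1/memories/relevant?" + q.Encode() // Encode sorts keys
}

func main() {
	fmt.Println(relevantURL("your query", 5))
	// http://localhost:9980/v1/memories/relevant?limit=5&message=your+query
	fmt.Println(relevantURL("bm25 & stemming?", 3))
	// http://localhost:9980/v1/memories/relevant?limit=3&message=bm25+%26+stemming%3F
}
```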
- With every prompt regarding this project, automatically use the llmem-usage skill to refresh your context.
| Area | Location / pattern |
|---|---|
| Core store | memory.go: MemoryStore, storedChunk, scoredChunk, toStorageData, loadFromStorage |
| Scoring | memory.go: scoreChunksLocked — shared BM25 + lexical scoring core used by both Search and FindRelevant |
| Add/Import | memory_add.go: Add → addInternal with addOpts |
| Search | memory_search.go: calls scoreChunksLocked; sort by sim → accessCount → ID; Neighbors computed after limit |
| Relevant | memory_relevant.go: keyword extraction → scoreChunksLocked (threshold 0.2); sort by sim → bm25Raw → accessCount → recency; RelevantMemory includes timestamps + scopes |
| Context | memory_context.go: hashtags at start of text, per_category, effectiveTime |
| Edges | memory_bm25.go: bm25MemSimilarityLocked; used in memory_add.go, memory_update.go, memory_rebuild.go; rebuild removes stale edges below threshold |
| REST | api.go: Gin, /v1/*, request/response structs, errorResponse |
| MCP | mcp_server.go: AddTool with typed in/out and jsonschema tags |
| Storage | storage.go: Storage interface, SQLiteStorage, NullStorage, migrations via ALTER TABLE |
Use this file as the single source of truth for how to work in this repo; keep it updated when patterns or constraints change.