Atagia

Open-source memory engine for AI assistants. Selects memories by applicability, not similarity.

Named after autophagy, the cellular process of recycling what no longer serves.

Why

AI assistants forget everything between sessions. The standard fix -- retrieve memories by embedding similarity -- creates its own problems: coding preferences bleed into emotional support conversations, outdated beliefs override current context, and compressed summaries get treated as established facts. Similarity tells you what sounds related, not what actually helps.

Atagia scores each candidate memory across multiple dimensions (task fit, mode fit, temporal validity, epistemic quality, risk relevance) and selects based on applicability to the current situation. The result is memory that adapts to what the assistant is doing right now, not just what the user said before.

Key capabilities

Applicability-based memory selection -- memories are scored on whether they help the current task, not just whether they sound similar
Belief revision with version history -- 8 conflict resolution strategies; beliefs are never silently overwritten
Scoped personalization per assistant mode -- a coding assistant and a companion retrieve different memories from the same user, governed by policy manifests
Consequence chain learning -- records action-outcome-tendency chains so the assistant can learn from prior advice results
Interaction contract learning -- observes how the user prefers to collaborate (depth, directness, pushback tolerance) and adapts per mode
Natural memory capture -- picks up facts from normal conversation without requiring "remember this" commands
Consent-gated memory -- sensitive information is stored only after user confirmation, with per-user category tracking
Temporal grounding -- resolves relative dates ("last Saturday", "three weeks ago") against source timestamps into actual calendar dates
Adaptive context caching -- deterministic staleness scoring serves cached results on follow-up turns when context has not significantly changed
Two-level text chunking -- rule-based + AI-assisted splitting handles voice transcriptions and long pastes before extraction
All storage in SQLite -- no external vector DB required; sqlite-vec available for optional embedding recall

Current status

Atagia is functional and under active development. The core pipeline is implemented and tested with 539 unit and integration tests.

What works today:

Memory extraction from conversations with LLM-based applicability scoring
Natural memory capture from casual conversation (no protocol phrases required)
Consent-gated memory storage with per-user category thresholds
Hybrid retrieval: FTS5 with reciprocal rank fusion and progressive multi-query expansion
Three-level memory hierarchy (L0 verbatim, L1 belief, L2 summary) with mirror retrieval
Temporal grounding for relative dates against source message timestamps
Belief revision with 8 strategies and full version history
Consequence chain learning and interaction contract observation
Adaptive context cache with deterministic staleness scoring
Two-level text chunking for long messages (voice transcriptions, pastes)
Query-aware context selection with diversity reranking
Library mode, REST API, and MCP server
LoCoMo benchmark harness with ablation support and replay probes

What is in progress or planned:

Vector embeddings: candidate search now supports optional sqlite-vec semantic recall on top of FTS5. This is aimed at multi-hop and sparse-keyword questions across larger personal corpora.
Benchmark coverage: the LoCoMo harness runs end-to-end. A ground truth audit of the LoCoMo conv-26 subset found factual errors and ambiguous items in the dataset (details). A separate benchmark (Atagia-bench) is in design to cover behaviors LoCoMo does not test: verbatim recall, consent boundaries, privacy scoping, and abstention.
Neo4j graph layer: planned for relationship traversal where flat retrieval is insufficient. Will ship only if benchmark evidence justifies the complexity.

Quick start

As a Python library

pip install -e .

from atagia import Atagia

async with Atagia(
    db_path="memory.db",
    llm_provider="anthropic",      # or "openai", "openrouter"
    llm_api_key="sk-ant-...",
) as engine:
    # Create resources
    await engine.create_user("user_1")
    await engine.create_conversation("user_1", "conv_1", assistant_mode_id="coding_debug")

    # Get memory-enriched context for your own LLM call
    context = await engine.get_context(
        user_id="user_1",
        conversation_id="conv_1",
        message="What did we decide about the migration?",
        mode="coding_debug",
    )
    # context.system_prompt       -> inject into your LLM
    # context.memories            -> scored memories that were selected
    # context.contract            -> how this user prefers to collaborate
    # context.detected_needs      -> signals detected in the query
    # context.from_cache          -> whether this was served from cache
    # context.staleness           -> 0.0 (fresh) to 1.0 (stale)

    # Or let Atagia handle the LLM call too
    result = await engine.chat(
        user_id="user_1",
        conversation_id="conv_1",
        message="Why is the test failing?",
        mode="coding_debug",
    )
    print(result.response_text)

As a sidecar client for your own webchat

Use connect_atagia when your app should work the same way whether Atagia is imported in-process or running as a local/remote HTTP service.

Same-process local mode:

from atagia.client import connect_atagia

client = await connect_atagia(
    transport="local",
    db_path="memory.db",
    llm_provider="anthropic",
    llm_api_key="sk-ant-...",
)

context = await client.get_context(
    user_id="user_1",
    conversation_id="conv_1",
    message="What did we decide about the migration?",
    mode="coding_debug",
)

response_text = await my_llm_call(
    system_prompt=context.system_prompt,
    user_text="What did we decide about the migration?",
)

await client.add_response(
    user_id="user_1",
    conversation_id="conv_1",
    text=response_text,
)
await client.close()

HTTP service mode:

from atagia.client import connect_atagia

client = await connect_atagia(
    transport="http",
    base_url="http://localhost:8100",
    api_key="your-service-api-key",
)

result = await client.chat(
    user_id="user_1",
    conversation_id="conv_1",
    message="Why is the test failing?",
)
print(result.response_text)

Auto mode uses HTTP when base_url or ATAGIA_BASE_URL is present, otherwise it uses local mode. It also reads ATAGIA_SERVICE_API_KEY for HTTP and ATAGIA_DB_PATH or ATAGIA_SQLITE_PATH for local mode:

client = await connect_atagia(transport="auto")

MCP remains the right transport for Claude Desktop, Cursor, and other tool clients. For ordinary backend, desktop app, or webchat integrations, use the Python client facade instead.

Offline ingestion

For loading long conversations without triggering retrieval on every turn:

async with Atagia(db_path="memory.db") as engine:
    await engine.create_user("user_1")
    await engine.create_conversation("user_1", "conv_1")

    # Ingest messages (extraction runs in background workers)
    await engine.ingest_message("user_1", "conv_1", "user", "I was born in Barcelona in 1990.")
    await engine.ingest_message("user_1", "conv_1", "assistant", "Tell me more about growing up there.")

    # Persist an assistant response in conversation history
    await engine.add_response("user_1", "conv_1", "That sounds like a great place to grow up.")

    # Wait for all background extraction jobs to finish
    await engine.flush(timeout_seconds=60.0)

Messages accept an optional occurred_at ISO timestamp for historical data where the message happened at a different time than the ingestion. This lets the extraction pipeline resolve temporal references like "yesterday" or "last year" correctly.

Constructor parameters

Parameter	Default	Description
`db_path`	`"atagia.db"`	SQLite database path
`redis_url`	`None`	Optional Redis URL for cache and queues
`manifests_dir`	Built-in	Directory with assistant mode JSON manifests
`llm_provider`	From env	`"anthropic"`, `"openai"`, or `"openrouter"`
`llm_api_key`	From env	API key for the LLM provider
`llm_model`	From env	Model name for extraction, scoring, and chat
`embedding_backend`	`"none"`	`"none"` or `"sqlite_vec"`
`embedding_provider_name`	From env	Optional override for which provider handles embeddings
`embedding_model`	`None`	Embedding model name (required when backend is `sqlite_vec`)
`context_cache_enabled`	`True`	Enable adaptive context caching
`chunking_enabled`	`True`	Enable intelligent chunking for long messages
`skip_belief_revision`	`False`	Disable belief revision (for benchmarks/ablation)
`skip_compaction`	`False`	Disable compaction (for benchmarks/ablation)

SQLite is the only required storage dependency. An LLM API (Anthropic, OpenAI, or OpenRouter) is required for memory extraction, scoring, and chat. Redis accelerates queues and caching but is optional -- the engine works without it using in-process queues.

For a dual-provider setup, a common production shape is Anthropic for chat/extraction/scoring plus sqlite-vec embeddings on OpenAI or OpenRouter. Example env:

ATAGIA_LLM_PROVIDER=anthropic
ATAGIA_LLM_API_KEY=your-anthropic-key
ATAGIA_OPENAI_API_KEY=your-openai-key
ATAGIA_EMBEDDING_BACKEND=sqlite_vec
ATAGIA_EMBEDDING_PROVIDER=openai
ATAGIA_EMBEDDING_MODEL=text-embedding-3-small
ATAGIA_EMBEDDING_DIMENSION=1536

As an MCP server (Claude Desktop, Cursor, Windsurf)

pip install "atagia[mcp]"

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "atagia-memory": {
      "command": "/path/to/.venv/bin/atagia-mcp",
      "env": {
        "ATAGIA_DB_PATH": "/path/to/memory.db",
        "ATAGIA_LLM_PROVIDER": "anthropic",
        "ATAGIA_LLM_API_KEY": "sk-ant-..."
      }
    }
  }
}

Five tools are exposed: atagia_get_context, atagia_add_memory, atagia_search_memories, atagia_list_memories, atagia_delete_memory.

As a REST API

git clone https://github.com/jordicor/atagia.git
cd atagia
pip install -e ".[dev]"
cp .env.example .env   # configure LLM provider and keys
uvicorn atagia.app:create_app --factory --reload

Service mode requires ATAGIA_SERVICE_MODE=true and ATAGIA_SERVICE_API_KEY in .env.

Core routes

Method	Path	Description
POST	`/v1/users`	Create a user
POST	`/v1/conversations`	Create a conversation
POST	`/v1/workspaces`	Create a workspace
POST	`/v1/chat/{conversation_id}/reply`	Send a message and get a response
POST	`/v1/conversations/{conversation_id}/context`	Get sidecar context for a host-managed LLM call
POST	`/v1/conversations/{conversation_id}/responses`	Persist a host-generated assistant response
POST	`/v1/conversations/{conversation_id}/messages`	Ingest a user or assistant message without retrieval
POST	`/v1/flush`	Wait for pending background work
POST	`/v1/memory/feedback`	Submit memory feedback (used, useful, irrelevant, intrusive, stale)
GET	`/v1/memory/objects/{memory_id}`	Inspect a memory object
GET	`/v1/users/{user_id}/contract`	View the user's interaction contract
GET	`/v1/users/{user_id}/state`	View the user's current state

Admin routes (require `ATAGIA_ADMIN_API_KEY`)

Method	Path	Description
POST	`/v1/admin/rebuild/conversation/{id}`	Rebuild memories for a conversation
POST	`/v1/admin/rebuild/user/{id}`	Rebuild all memories for a user
POST	`/v1/admin/compact/conversation/{id}`	Compact a conversation
POST	`/v1/admin/compact/workspace/{id}`	Compact a workspace
POST	`/v1/admin/reindex`	Rebuild FTS indexes
POST	`/v1/admin/lifecycle/run`	Run memory lifecycle (decay, archival)
POST	`/v1/admin/metrics/compute`	Compute retrieval quality metrics
GET	`/v1/admin/metrics/latest`	Latest metric values
GET	`/v1/admin/metrics/{name}/history`	Metric history over time
GET	`/v1/admin/metrics/retrieval-summary`	Retrieval performance summary
GET	`/v1/admin/retrieval-events/{id}`	Inspect a retrieval event
GET	`/v1/admin/consequence-chains/{user_id}`	List consequence chains for a user
POST	`/v1/admin/replay/event/{id}`	Replay a retrieval event with ablation
POST	`/v1/admin/replay/conversation/{id}`	Replay a full conversation
POST	`/v1/admin/grounding/{event_id}`	Analyze grounding for a retrieval event
POST	`/v1/admin/export/conversation/{id}`	Export a conversation

How it works

Four memory layers

Layer	What it stores	How it updates
Evidence	Verbatim spans, extracted events, citations, timestamps	Append-only. What actually happened.
Belief	Revisable interpretations derived from evidence	Versioned. Never silently overwritten.
Interaction contract	How the user prefers to collaborate: depth, directness, pushback tolerance, pace	Learned from observation. Scoped per mode.
State	Current context: urgency, focus, frustration	Continuously updated. Transient.

Applicability scoring

Each candidate memory is scored across multiple dimensions:

final_score = 0.65 * llm_applicability
            + 0.15 * retrieval_score
            + 0.10 * vitality_boost
            + 0.10 * confirmation_boost
            - privacy_penalty
            - contradiction_penalty

The LLM evaluates applicability by considering task fit, mode fit, temporal validity, epistemic quality, and risk relevance. Semantic similarity contributes to candidate generation but does not govern final selection.

Scoped personalization

Memory is not a flat global profile. Each assistant mode defines its own retrieval policy:

coding_debug: prefers evidence, tight scope, low personalization
research_deep_dive: broad scope, high depth, tolerates uncertainty
companion: high emotional sensitivity, prefers interaction contracts
brainstorm: wide association, loose scope filtering
biographical_interview: maximizes evidence recall, strict privacy
general_qa: balanced defaults

Policies control which scopes are allowed, what memory types are preferred, privacy ceilings, context budgets, and retrieval parameters. Custom modes are defined as JSON manifests.

Belief revision

When new evidence conflicts with an existing belief, the system chooses among eight actions:

Every revision preserves the full history. A belief like "user prefers detailed answers" does not get silently replaced. Instead, it becomes "depth preference is mode-dependent: concise for debugging, deep for research."

Consequence chains

When a user reports an outcome of prior advice, the system records the chain:

action: "Suggested large refactor"
  -> outcome: "User came back with regressions"
  -> tendency: "This workspace is fragile to sweeping changes"

These chains surface during retrieval when follow-up failure or loop signals are detected.

Adaptive context cache

Atagia caches retrieval results and serves them on follow-up turns when the context has not significantly changed. A deterministic staleness scorer evaluates message count, elapsed time, topic continuity, and interaction pace to decide whether to serve from cache or refresh.

Cache entries are bound to the active policy hash (manifest changes force misses), validated against the current workspace, and invalidated on any mutation (new messages, memory edits, rebuilds, workspace changes). MCP and benchmark paths always use fresh retrieval.

Intelligent chunking

Long messages (voice transcriptions, copy-pasted conversations) are automatically split before memory extraction:

Level 0 (rule-based, zero cost): splits at natural boundaries like timestamps, speaker turns, section breaks, and bracketed annotations. Segments below 500 chars are merged with neighbors.
Level 1 (AI-assisted): for segments still exceeding 16K tokens after Level 0, inline markers are inserted every ~1,000 tokens and an LLM identifies semantic cut points. Falls back to deterministic size-bounded splitting if the AI output is unusable, with full logging of the fallback event.

Each chunk is extracted independently with a cross-chunk context accumulator that prevents duplicate extraction and helps resolve coreferences across chunks.

Architecture

               Library mode                    Service mode
            +----------------+              +------------------+
            |  Your app      |              |  Any HTTP client  |
            |  imports       |              |  calls            |
            |  Atagia()      |              |  REST API         |
            +-------+--------+              +--------+---------+
                    |                                |
                    +------------ + ----------------+
                                  |
                     +------------v-----------+
                     |   Context Cache        |
                     |   Staleness scoring    |
                     +------------+-----------+
                                  |
                     +------------v-----------+
                     |   RetrievalPipeline    |
                     |   Need detection       |
                     |   FTS5 candidate search|
                     |   Applicability score  |
                     |   Context compose      |
                     +------------+-----------+
                                  |
              +-------------------+-------------------+
              |                   |                   |
     +--------v------+   +-------v-------+   +-------v------+
     |   SQLite      |   |   Workers     |   |   Redis      |
     |   FTS5        |   |   Extract     |   |   (optional) |
     |   sqlite-vec  |   |   Chunk       |   |   Queues     |
     |   Canonical   |   |   Revise      |   |   Cache      |
     +--------------+    |   Compact     |   +--------------+
                         |   Evaluate    |
                         +---------------+

SQLite is the canonical data store. An LLM API is required for memory extraction, applicability scoring, belief revision, and chat. Redis accelerates queues and caching but is optional.

Stack

Component	Technology
Language	Python 3.12+
API	FastAPI
Primary storage	SQLite + FTS5
LLM providers	Anthropic, OpenAI, OpenRouter
Optional cache/queues	Redis
Optional semantic recall	sqlite-vec

Running tests

pip install -e ".[dev]"
python -m pytest tests/ -v

For MCP server tests:

pip install -e ".[dev,mcp]"
python -m pytest tests/ -v

Benchmarking

Atagia includes a LoCoMo benchmark harness with a dataset downloader, LLM judge scorer, ablation presets, and CLI summary output.

python -m benchmarks.locomo.download
python -m benchmarks.locomo \
  --data-path benchmarks/data/locomo10.json \
  --provider anthropic \
  --model claude-sonnet-4-6 \
  --max-questions 25

Available ablation presets: similarity_only, no_contract, no_scope, no_need_detection, no_revision, no_compaction.

Roadmap

Phase	Focus	Status
1	Core memory system: extraction, retrieval, scoring, contracts, lifecycle, API	Done
2	Belief revision, consequence chains, compaction, evaluation, replay	Done
2.5	Library mode, MCP server	Done
3	Benchmark harness, adaptive context cache, text chunking, temporal timestamps	Done
3.5	Retrieval quality: RRF hybrid scoring, temporal grounding, contextual indexing, memory hierarchy, natural memory capture, consent gating	Done
4	Embeddings activation and tuning: sqlite-vec backfill, validation, and semantic recall operations	In progress
5	Neo4j graph layer for relationship-aware retrieval	Planned

Research

Atagia is backed by cross-domain research into memory systems spanning microbiology, neuroscience, physics, and traditional knowledge frameworks:

Beyond Similarity: Applicability-Governed Memory -- thesis paper with testable hypotheses and evaluation strategy
Beyond Human Memory -- cross-domain exploration from cellular autophagy to Aboriginal Dreamtime
Technical Design -- full engineering specification

License

Apache 2.0

Links

Website: atagia.org
Author: Jordi Cor (Acerting Art Inc. / OjoCentauri)

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
benchmarks		benchmarks
manifests		manifests
migrations		migrations
src/atagia		src/atagia
tests		tests
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

Atagia

Why

Key capabilities

Current status

Quick start

As a Python library

As a sidecar client for your own webchat

Offline ingestion

Constructor parameters

As an MCP server (Claude Desktop, Cursor, Windsurf)

As a REST API

Core routes

Admin routes (require ATAGIA_ADMIN_API_KEY)

How it works

Four memory layers

Applicability scoring

Scoped personalization

Belief revision

Consequence chains

Adaptive context cache

Intelligent chunking

Architecture

Stack

Running tests

Benchmarking

Roadmap

Research

License

Links

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Admin routes (require `ATAGIA_ADMIN_API_KEY`)

Packages