Memory infrastructure for AI agents
Give your AI a brain that remembers. Persistent, queryable memory for stateless LLMs.
from memorylayer import sync_client

# Connect and use memory
with sync_client() as memory:
    # Remember user preferences
    memory.remember("User prefers light mode")

    # Recall with semantic search
    result = memory.recall("What are the user's preferences?")

The Problem with AI Today
Current LLM-based agents suffer from critical memory limitations that undermine their usefulness.
LLMs Forget Everything
Between sessions, your AI starts from scratch. No memory of past conversations, decisions, or learned patterns.
Context Windows Are Limited
Even the best models typically handle only 128-256K tokens. Long histories get truncated or forgotten.
No Learning Persistence
Current LLMs do not remember user preferences or past decisions, and they do not evolve their understanding over time.
Expensive Re-computation
Without persistent memory, you must reprocess the same data repeatedly, leading to high costs and inefficiency.
What is memorylayer.ai?
memorylayer.ai is API-first memory infrastructure for LLM-powered agents. It provides the missing memory layer, organizing information into the same kinds of memory humans use.
Episodic
Specific events and interactions. "User asked about Python logging on Jan 15."
Semantic
Facts, concepts, and relationships. "User prefers TypeScript over JavaScript."
Procedural
How to do things. "To deploy, run npm run deploy."
Working
Current task context. "Currently debugging auth.py line 42."
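For intuition, the four memory types can be sketched as a simple tagged store. This is a toy illustration only, using the examples above; it is not the memorylayer API or storage model.

```python
from dataclasses import dataclass

# The four memory types. A toy sketch for intuition, not the real system.
MEMORY_TYPES = ("episodic", "semantic", "procedural", "working")

@dataclass
class Memory:
    content: str
    type: str

class ToyStore:
    def __init__(self):
        self.memories: list[Memory] = []

    def remember(self, content: str, type: str) -> None:
        if type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {type}")
        self.memories.append(Memory(content, type))

    def recall(self, type: str) -> list[str]:
        return [m.content for m in self.memories if m.type == type]

store = ToyStore()
store.remember("User asked about Python logging on Jan 15", "episodic")
store.remember("User prefers TypeScript over JavaScript", "semantic")
store.remember("To deploy, run npm run deploy", "procedural")
store.remember("Currently debugging auth.py line 42", "working")
print(store.recall("semantic"))  # ['User prefers TypeScript over JavaScript']
```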
Architecture
Everything you need for AI memory
Built with production-grade features that scale from local development to enterprise deployment.
Claude Code Plugin
Protect your context window with the official Claude Code plugin. Automatically captures memory before compaction. (We use it too!)
Adaptive Learning
Memory importance changes over time in response to feedback and usage patterns. No more stale, irrelevant information clogging up your context.
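One common way to realize adaptive importance is time-based decay plus reinforcement on access. The constants below are invented for illustration and do not describe memorylayer's actual scoring.

```python
# Toy importance model: exponential decay with reinforcement on access.
# The half-life and boost are assumed constants, not memorylayer internals.
HALF_LIFE_DAYS = 30.0

def decayed_importance(importance: float, age_days: float) -> float:
    """Importance halves for every HALF_LIFE_DAYS a memory goes unused."""
    return importance * 0.5 ** (age_days / HALF_LIFE_DAYS)

def reinforce(importance: float, boost: float = 0.1) -> float:
    """Each access nudges importance back up, capped at 1.0."""
    return min(1.0, importance + boost)

# A memory stored at importance 0.8, untouched for two half-lives:
stale = decayed_importance(0.8, age_days=60)  # 0.8 * 0.25 = 0.2
fresh = reinforce(stale)                      # accessed again, back to 0.3
```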
Dynamic Retrieval
Dynamically combines vector search, graph search, and agentic search in a hybrid strategy to prioritize the most relevant memories.
Relationship Graph
60+ typed relationship edges across 11 categories enable multi-hop causal queries that vector similarity alone cannot answer.
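A toy illustration of why typed edges matter: a multi-hop causal query is a graph traversal, which plain vector similarity cannot express. The edge types and nodes below are invented for the sketch.

```python
from collections import deque

# Typed edges as (source, relation, target). Relations are illustrative only.
edges = [
    ("config change", "caused", "service restart"),
    ("service restart", "caused", "cache flush"),
    ("cache flush", "caused", "latency spike"),
    ("latency spike", "observed_in", "dashboard"),
]

def causal_chain(graph, start, relation="caused"):
    """Follow edges of one relation type from `start`, breadth-first."""
    adj = {}
    for src, rel, dst in graph:
        if rel == relation:
            adj.setdefault(src, []).append(dst)
    chain, queue, seen = [], deque([start]), {start}
    while queue:
        node = queue.popleft()
        chain.append(node)
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return chain

# Three hops from root cause to symptom:
print(causal_chain(edges, "config change"))
# -> ['config change', 'service restart', 'cache flush', 'latency spike']
```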
MCP Server
First-class Model Context Protocol integration for Claude Desktop, Cursor, and other MCP-compatible tools.
Semantic Tiering
Memories are progressively summarized into different detail levels. Retrieve the right amount of information for each query without wasting context.
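The idea can be sketched as picking the most detailed summary that fits a token budget. The tier names, token counts, and texts below are assumptions for illustration, not memorylayer's actual tiers.

```python
# Tiers from most compressed to most detailed; token counts are illustrative.
tiers = [
    ("gist",    8,   "User prefers light UIs."),
    ("summary", 40,  "User prefers light mode and a clean, minimal design."),
    ("full",    400, "Full transcript of the conversation about UI prefs."),
]

def select_tier(budget_tokens: int) -> str:
    """Return the richest tier whose size fits the remaining context budget."""
    best = tiers[0][2]  # always fall back to the gist
    for name, tokens, text in tiers:
        if tokens <= budget_tokens:
            best = text
    return best

print(select_tier(50))  # a 50-token budget fits the summary, not the full text
```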
Context Sandbox
Process hundreds of memories server-side without consuming your context window. A persistent Python sandbox lets agents explore, filter, and transform memory data programmatically — driven by the agent or fully autonomous.
Recursive Reasoning
Inspired by RLM, the server iteratively executes code and LLM queries over sandbox data and memories. Run it autonomously server-side, or let your agent orchestrate each step via MCP.
Smart Extraction
Every stored memory automatically has facts extracted, typed associations built to related memories, duplicates merged against existing knowledge, and a type assigned — no manual tagging required.
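As a toy sketch of the deduplication step only: real systems compare embeddings for semantic duplicates, while this illustration matches on normalized text, and the class name is invented.

```python
def normalize(text: str) -> str:
    """Collapse case and whitespace so trivially different phrasings match."""
    return " ".join(text.lower().split())

class DedupStore:
    """Toy store that drops exact (normalized) duplicates on write."""
    def __init__(self):
        self._by_key: dict[str, str] = {}

    def add(self, content: str) -> bool:
        key = normalize(content)
        if key in self._by_key:
            return False  # already known; skip the write
        self._by_key[key] = content
        return True

store = DedupStore()
print(store.add("User prefers  Light Mode"))  # True: new knowledge
print(store.add("user prefers light mode"))   # False: duplicate, dropped
```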
Enterprise Ready
Scale up for production with smart hot/warm/cold data tiering, vector-graph compression, smarter vector search, custom ontologies, RBAC, audit trails, and more.
Advanced Sandbox
Enterprise sandbox with state checkpointing, stronger isolation, extended tool libraries, and resource limits for production-grade server-side reasoning.
Multimodal Support
Unified handling of text, images, audio, video, documents, and PDFs.
Start with three simple operations
API complexity scales with your requirements; basic usage is straightforward.
Remember
Store memories with automatic classification. memorylayer extracts facts, preferences, and decisions from raw content.
memory.remember(
    content="User prefers light mode",
    type="semantic",
    importance=0.8
)

Recall
Search with intelligent retrieval. Use fast RAG mode or deep LLM mode with query rewriting and context resolution.
result = memory.recall(
    query="user preferences",
)

Reflect
Synthesize insights from accumulated knowledge. Generate summaries, detect contradictions, and identify patterns.
reflection = memory.reflect(
    query="Summarize all recent coding decisions",
)

Get started in minutes
Two steps: start the server, then use the SDK
Start the Server
# Install the server (with local embeddings)
pip install memorylayer-server[local]
# Start the server (uses SQLite by default)
memorylayer serve
# Server running at http://localhost:61001
# Data stored in your home directory
# Configure embedding & LLM providers (see docs)

Use the SDK
# pip install memorylayer-client
from memorylayer import MemoryLayerClient, MemoryType

# Connect to your local server
async with MemoryLayerClient(
    base_url="http://localhost:61001",
    workspace_id="my-project"
) as ml:
    # Store a memory
    memory = await ml.remember(
        content="User prefers light mode with clean design",
        type=MemoryType.SEMANTIC,
        importance=0.8,
        tags=["preferences", "ui"]
    )

    # Recall memories
    memories = await ml.recall(
        query="what are the user's UI preferences?"
    )

    # Synthesize insights
    reflection = await ml.reflect(
        query="summarize all user preferences"
    )
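The SDK snippet above uses `async with`, so as a standalone script it needs an event loop. The standard pattern is `asyncio.run`, shown here with a stand-in client class so the sketch stays self-contained and runnable without a server.

```python
import asyncio

class FakeClient:
    """Stand-in for MemoryLayerClient so this sketch runs without a server."""
    def __init__(self, base_url, workspace_id):
        self.base_url = base_url
        self.workspace_id = workspace_id

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def remember(self, content, **kwargs):
        # The real client would POST to the server; here we just echo.
        return {"content": content, **kwargs}

async def main():
    async with FakeClient("http://localhost:61001", "my-project") as ml:
        return await ml.remember("User prefers light mode", importance=0.8)

result = asyncio.run(main())
print(result["content"])  # User prefers light mode
```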
Get started with the full-featured, Apache 2.0-licensed open source core.
Built for every AI application
memorylayer powers memory for any AI agent that needs to remember, learn, and evolve.
Conversational Agents
Build chatbots that remember user context, preferences, and past conversations across sessions.
- Remember user preferences
- Maintain conversation context
- Personalized responses
Claude Code Assistant
Protect your context window from compaction. Automatically extract and store key learnings during long coding sessions.
- Pre-compact memory capture
- Session-start briefings
- Context protection
Research Agents
Power research tools that accumulate knowledge, track sources, and build understanding over time.
- Accumulate findings
- Cross-reference sources
- Build knowledge graphs
Personal AI Assistants
Develop assistants that truly know their users - preferences, habits, goals, and communication style.
- Learn user patterns
- Anticipate needs
- Evolve over time
Domain-Specific Agents
Deploy agents with custom taxonomy and ontology tailored to your industry. One size doesn't fit all.
- Custom memory schemas
- Industry-specific relationships
- Specialized knowledge graphs
Process Plant Intelligence
Power safety systems and digital twins for oil, gas, energy, chemical, and pharmaceutical operations.
- Process interconnection mapping
- Safety system awareness
- Digital twin memory
Choose your deployment
Run in the cloud or self-host with full data control. Same API, same features.
Self-Hosted Apache 2.0
Open source and free to use. No external database server required — just SQLite. Full data control, fully offline-capable, and easy to install directly or as a Docker container.
- Effortless deployment using SQLite database
- Can work completely offline
- Full data ownership
- Deploy anywhere in minutes
Enterprise Cloud (Managed SaaS)
Fully managed SaaS. Zero infrastructure to maintain. Scales automatically.
- Scales effortlessly with usage - no capacity planning needed
- Automatic backups and redundancy
- Enterprise features for RBAC, Auditing, and more
- Secure multi-tenant design keeps your data isolated and protected
Enterprise (On-prem)
Get all of the features of our managed service but under your control. Deploy and manage your own instance of our platform on your own infrastructure.
- Enterprise features for RBAC, Auditing, and more
- Leverage your existing infrastructure and security practices
- Ideal for regulated industries with strict compliance requirements
The same client SDK is used for open source and enterprise deployments.