Memory infrastructure for AI agents
Give your AI a brain that remembers. Persistent, queryable memory for stateless LLMs.
from memorylayer import sync_client

# Connect and use memory
with sync_client() as memory:
    # Remember user preferences
    memory.remember("User prefers light mode")

    # Recall with semantic search
    result = memory.recall("What are the user's preferences?")

The Problem with AI Today
Current LLM-based agents suffer from critical memory limitations that undermine their usefulness.
LLMs Forget Everything
Between sessions, your AI starts from scratch. No memory of past conversations, decisions, or learned patterns.
Context Windows Are Limited
Even the best models typically handle only 128-256K tokens. Long histories get truncated or forgotten.
No Learning Persistence
Current LLMs do not remember user preferences or past decisions, and they do not evolve their understanding over time.
Expensive Re-computation
Without persistent memory, you must reprocess the same data repeatedly, leading to high costs and inefficiency.
What is memorylayer.ai?
memorylayer.ai is API-first memory infrastructure for LLM-powered agents. It provides the missing memory layer, organizing information into the same kinds of memory humans use.
Episodic
Specific events and interactions. "User asked about Python logging on Jan 15."
Semantic
Facts, concepts, and relationships. "User prefers TypeScript over JavaScript."
Procedural
How to do things. "To deploy, run npm run deploy."
Working
Current task context. "Currently debugging auth.py line 42."
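For intuition, the four memory types can be sketched as a simple tagged store. This is a toy illustration only, using the examples above; it is not the memorylayer API or storage model.

```python
from dataclasses import dataclass

# The four memory types. A toy sketch for intuition, not the real system.
MEMORY_TYPES = ("episodic", "semantic", "procedural", "working")

@dataclass
class Memory:
    content: str
    type: str

class ToyStore:
    def __init__(self):
        self.memories: list[Memory] = []

    def remember(self, content: str, type: str) -> None:
        if type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {type}")
        self.memories.append(Memory(content, type))

    def recall(self, type: str) -> list[str]:
        return [m.content for m in self.memories if m.type == type]

store = ToyStore()
store.remember("User asked about Python logging on Jan 15", "episodic")
store.remember("User prefers TypeScript over JavaScript", "semantic")
store.remember("To deploy, run npm run deploy", "procedural")
store.remember("Currently debugging auth.py line 42", "working")
print(store.recall("semantic"))  # ['User prefers TypeScript over JavaScript']
```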
Architecture
Everything you need for AI memory
Built with production-grade features that scale from local development to enterprise deployment.
Claude Code Plugin
Protect your context window with the official Claude Code plugin. Automatically captures memory before compaction. (We use it too!)
Adaptive Learning
Memory importance changes over time in response to feedback and usage patterns. No more stale, irrelevant information clogging up your context.
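One common way to realize adaptive importance is time-based decay plus reinforcement on access. The constants below are invented for illustration and do not describe memorylayer's actual scoring.

```python
# Toy importance model: exponential decay with reinforcement on access.
# The half-life and boost are assumed constants, not memorylayer internals.
HALF_LIFE_DAYS = 30.0

def decayed_importance(importance: float, age_days: float) -> float:
    """Importance halves for every HALF_LIFE_DAYS a memory goes unused."""
    return importance * 0.5 ** (age_days / HALF_LIFE_DAYS)

def reinforce(importance: float, boost: float = 0.1) -> float:
    """Each access nudges importance back up, capped at 1.0."""
    return min(1.0, importance + boost)

# A memory stored at importance 0.8, untouched for two half-lives:
stale = decayed_importance(0.8, age_days=60)  # 0.8 * 0.25 = 0.2
fresh = reinforce(stale)                      # accessed again, back to 0.3
```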
Dynamic Retrieval
Dynamically combines vector search, graph search, and agentic search in a hybrid strategy to prioritize the most relevant memories.
Relationship Graph
60+ typed relationship edges across 11 categories enable multi-hop causal queries that vector similarity alone cannot answer.
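A toy illustration of why typed edges matter: a multi-hop causal query is a graph traversal, which plain vector similarity cannot express. The edge types and nodes below are invented for the sketch.

```python
from collections import deque

# Typed edges as (source, relation, target). Relations are illustrative only.
edges = [
    ("config change", "caused", "service restart"),
    ("service restart", "caused", "cache flush"),
    ("cache flush", "caused", "latency spike"),
    ("latency spike", "observed_in", "dashboard"),
]

def causal_chain(graph, start, relation="caused"):
    """Follow edges of one relation type from `start`, breadth-first."""
    adj = {}
    for src, rel, dst in graph:
        if rel == relation:
            adj.setdefault(src, []).append(dst)
    chain, queue, seen = [], deque([start]), {start}
    while queue:
        node = queue.popleft()
        chain.append(node)
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return chain

# Three hops from root cause to symptom:
print(causal_chain(edges, "config change"))
# -> ['config change', 'service restart', 'cache flush', 'latency spike']
```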
MCP Server
First-class Model Context Protocol integration for Claude Desktop, Cursor, and other MCP-compatible tools.
Semantic Tiering
Memories are progressively summarized into different detail levels. Retrieve the right amount of information for each query without wasting context.
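The idea can be sketched as picking the most detailed summary that fits a token budget. The tier names, token counts, and texts below are assumptions for illustration, not memorylayer's actual tiers.

```python
# Tiers from most compressed to most detailed; token counts are illustrative.
tiers = [
    ("gist",    8,   "User prefers light UIs."),
    ("summary", 40,  "User prefers light mode and a clean, minimal design."),
    ("full",    400, "Full transcript of the conversation about UI prefs."),
]

def select_tier(budget_tokens: int) -> str:
    """Return the richest tier whose size fits the remaining context budget."""
    best = tiers[0][2]  # always fall back to the gist
    for name, tokens, text in tiers:
        if tokens <= budget_tokens:
            best = text
    return best

print(select_tier(50))  # a 50-token budget fits the summary, not the full text
```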
Context Sandbox
Process hundreds of memories server-side without consuming your context window. A persistent Python sandbox lets agents explore, filter, and transform memory data programmatically — driven by the agent or fully autonomous.
Recursive Reasoning
Inspired by RLM, the server iteratively executes code and LLM queries over sandbox data and memories. Run it autonomously server-side, or let your agent orchestrate each step via MCP.
Smart Extraction
Every stored memory automatically has facts extracted, typed associations built to related memories, duplicates merged against existing knowledge, and a type assigned — no manual tagging required.
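As a toy sketch of the deduplication step only: real systems compare embeddings for semantic duplicates, while this illustration matches on normalized text, and the class name is invented.

```python
def normalize(text: str) -> str:
    """Collapse case and whitespace so trivially different phrasings match."""
    return " ".join(text.lower().split())

class DedupStore:
    """Toy store that drops exact (normalized) duplicates on write."""
    def __init__(self):
        self._by_key: dict[str, str] = {}

    def add(self, content: str) -> bool:
        key = normalize(content)
        if key in self._by_key:
            return False  # already known; skip the write
        self._by_key[key] = content
        return True

store = DedupStore()
print(store.add("User prefers  Light Mode"))  # True: new knowledge
print(store.add("user prefers light mode"))   # False: duplicate, dropped
```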
Enterprise Ready
Scale up for production with smart hot/warm/cold data tiering, vector-graph compression, smarter vector search, custom ontologies, RBAC, audit trails, and more.
Advanced Sandbox
Enterprise sandbox with state checkpointing, stronger isolation, extended tool libraries, and resource limits for production-grade server-side reasoning.
Multimodal Support
Unified handling of text, images, audio, video, documents, and PDFs.
Start with three simple operations
API complexity scales with your requirements; basic usage is straightforward.
Remember
Store memories with automatic classification. memorylayer extracts facts, preferences, and decisions from raw content.
memory.remember(
    content="User prefers light mode",
    type="semantic",
    importance=0.8
)

Recall
Search with intelligent retrieval. Use fast RAG mode or deep LLM mode with query rewriting and context resolution.
result = memory.recall(
    query="user preferences",
)

Reflect
Synthesize insights from accumulated knowledge. Generate summaries, detect contradictions, and identify patterns.
reflection = memory.reflect(
    query="Summarize all recent coding decisions",
)

Get started in minutes
Two steps: start the server, then use the SDK
Start the Server
# Install the server (with local embeddings)
pip install memorylayer-server[local]
# Start the server (uses SQLite by default)
memorylayer serve
# Server running at http://localhost:61001
# Data stored in your home directory
# Configure embedding & LLM providers (see docs)

Use the SDK
# pip install memorylayer-client
from memorylayer import MemoryLayerClient, MemoryType

# Connect to your local server
async with MemoryLayerClient(
    base_url="http://localhost:61001",
    workspace_id="my-project"
) as ml:
    # Store a memory
    memory = await ml.remember(
        content="User prefers light mode with clean design",
        type=MemoryType.SEMANTIC,
        importance=0.8,
        tags=["preferences", "ui"]
    )

    # Recall memories
    memories = await ml.recall(
        query="what are the user's UI preferences?"
    )

    # Synthesize insights
    reflection = await ml.reflect(
        query="summarize all user preferences"
    )
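The SDK snippet above uses `async with`, so as a standalone script it needs an event loop. The standard pattern is `asyncio.run`, shown here with a stand-in client class so the sketch stays self-contained and runnable without a server.

```python
import asyncio

class FakeClient:
    """Stand-in for MemoryLayerClient so this sketch runs without a server."""
    def __init__(self, base_url, workspace_id):
        self.base_url = base_url
        self.workspace_id = workspace_id

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def remember(self, content, **kwargs):
        # The real client would POST to the server; here we just echo.
        return {"content": content, **kwargs}

async def main():
    async with FakeClient("http://localhost:61001", "my-project") as ml:
        return await ml.remember("User prefers light mode", importance=0.8)

result = asyncio.run(main())
print(result["content"])  # User prefers light mode
```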
Get started with the full-featured, Apache 2.0-licensed open source core.
Built for every AI application
memorylayer powers memory for any AI agent that needs to remember, learn, and evolve.
Conversational Agents
Build chatbots that remember user context, preferences, and past conversations across sessions.
- Remember user preferences
- Maintain conversation context
- Personalized responses
Claude Code Assistant
Protect your context window from compaction. Automatically extract and store key learnings during long coding sessions.
- Pre-compact memory capture
- Session-start briefings
- Context protection
Research Agents
Power research tools that accumulate knowledge, track sources, and build understanding over time.
- Accumulate findings
- Cross-reference sources
- Build knowledge graphs
Personal AI Assistants
Develop assistants that truly know their users - preferences, habits, goals, and communication style.
- Learn user patterns
- Anticipate needs
- Evolve over time
Domain-Specific Agents
Deploy agents with custom taxonomy and ontology tailored to your industry. One size doesn't fit all.
- Custom memory schemas
- Industry-specific relationships
- Specialized knowledge graphs
Process Plant Intelligence
Power safety systems and digital twins for oil, gas, energy, chemical, and pharmaceutical operations.
- Process interconnection mapping
- Safety system awareness
- Digital twin memory
Choose your deployment
Run in the cloud or self-host with full data control. Same API, same features.
Self-Hosted Apache 2.0
Open source and free to use. No external database server required — just SQLite. Full data control, fully offline-capable, and easy to install directly or as a Docker container.
- Effortless deployment using SQLite database
- Can work completely offline
- Full data ownership
- Deploy anywhere in minutes
Enterprise Cloud (Managed SaaS)
Fully managed SaaS. Zero infrastructure to maintain. Scales automatically.
- Scales effortlessly with usage - no capacity planning needed
- Automatic backups and redundancy
- Enterprise features for RBAC, Auditing, and more
- Secure multi-tenant design keeps your data isolated and protected
Enterprise (On-prem)
Get all of the features of our managed service but under your control. Deploy and manage your own instance of our platform on your own infrastructure.
- Enterprise features for RBAC, Auditing, and more
- Leverage your existing infrastructure and security practices
- Ideal for regulated industries with strict compliance requirements
The same client SDK is used for open source and enterprise deployments.