,_,
(O,O)
( )
" "
· · · W E A V · · ·
Weaving context graphs for AI.
An in-memory context graph database built for AI.
Quickstart · Architecture · Query Language · Auth · API Reference · SDKs · Benchmarks
Weav is a Redis-like, in-memory context graph database purpose-built for AI and LLM workloads. It combines graph topology, vector search, temporal tracking, and token-aware retrieval into a single system — so your AI applications can retrieve exactly the right context, within budget, in microseconds.
The problem: LLMs need context. RAG gives you chunks. But chunks lack structure — relationships, provenance, temporal validity, and relevance decay all get lost. You end up stitching together a vector DB, a graph DB, and a lot of glue code.
Weav's answer: One database that natively understands all of it.
| Capability | Description |
|---|---|
| Context Graph | Directed, labeled, weighted graph with property storage |
| Vector Search | HNSW index (via usearch) with cosine, euclidean, and dot product metrics |
| Bi-Temporal | Track both real-world validity and transaction time for point-in-time queries |
| Token Budgeting | Greedy knapsack packing — fit the most relevant context into your LLM's token window |
| Flow Scoring | Relevance propagation from seed nodes through the graph topology |
| Entity Dedup | Exact key, fuzzy name (Jaro-Winkler), and vector similarity deduplication |
| Provenance | Track source, confidence, and extraction method for every piece of knowledge |
| Decay Functions | Linear, exponential, and gaussian relevance decay over time |
| MCP Server | Model Context Protocol integration — connect directly from Claude, Cursor, etc. |
| Multi-Protocol | HTTP REST, RESP3 (Redis protocol), and gRPC — all on one server |
| Auth & ACL | Redis-ACL-inspired auth with command categories, graph-level permissions, API keys |
| Persistence | WAL with CRC32 checksums + periodic snapshots with full recovery |
# Clone and build
git clone https://github.com/SiluPanda/weav.git
cd weav
cargo build --release
# Start the server
./target/release/weav-server
# Default ports:
# RESP3 → :6380
# gRPC → :6381
# HTTP → :6382
The LLM provider integration (AWS SDK, Actix) is opt-in to keep default builds lean:
# Default build — everything except LLM providers (~414 crates)
cargo build --release
# Full build — including LLM extraction (~495 crates)
cargo build --release -p weav-server --features full
# Minimal — HTTP-only graph database
cargo build --release -p weav-server --no-default-features
# Interactive REPL
./target/release/weav-cli
# Single command
./target/release/weav-cli -c 'PING'
# Connect with authentication
./target/release/weav-cli -u admin -a supersecret
weav> GRAPH CREATE "knowledge"
OK
weav> NODE ADD TO "knowledge" LABEL "concept" PROPERTIES {"name": "Transformers", "content": "Self-attention mechanism for sequence modeling"} EMBEDDING [0.1, 0.2, 0.3]
(integer) 0
weav> NODE ADD TO "knowledge" LABEL "concept" PROPERTIES {"name": "BERT", "content": "Bidirectional encoder from transformers"} EMBEDDING [0.12, 0.22, 0.28]
(integer) 1
weav> EDGE ADD TO "knowledge" FROM 1 TO 0 LABEL "derived_from" WEIGHT 0.95
(integer) 0
weav> CONTEXT "attention mechanisms" FROM "knowledge" BUDGET 4096 TOKENS
pip install httpx # dependency
from weav import WeavClient
client = WeavClient(host="localhost", port=6382)
# or with authentication:
# client = WeavClient(host="localhost", port=6382, api_key="wk_live_abc123")
# client = WeavClient(host="localhost", port=6382, username="admin", password="secret")
# Create a graph
client.create_graph("research")
# Add nodes with embeddings
node_id = client.add_node("research",
    label="paper",
    properties={"title": "Attention Is All You Need", "year": 2017},
    embedding=[0.1, 0.2, 0.3, ...]  # your embedding vector
)
# Query context with token budget
result = client.context("research",
    query="transformer architectures",
    budget=4096,
    include_provenance=True
)
# Ready for your LLM
prompt = result.to_prompt()
messages = result.to_messages()
import { WeavClient, contextToPrompt } from "@weav/client";
const client = new WeavClient({ host: "localhost", port: 6382 });
// or with authentication:
// const client = new WeavClient({ host: "localhost", port: 6382, apiKey: "wk_live_abc123" });
// const client = new WeavClient({ host: "localhost", port: 6382, username: "admin", password: "secret" });
await client.createGraph("research");
const nodeId = await client.addNode("research", {
  label: "paper",
  properties: { title: "Attention Is All You Need", year: 2017 },
  embedding: [0.1, 0.2, 0.3],
});
const result = await client.context({
  graph: "research",
  query: "transformer architectures",
  budget: 4096,
});
const prompt = contextToPrompt(result);
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ HTTP REST │ │ RESP3 TCP │ │ gRPC │
│ :6382 │ │ :6380 │ │ :6381 │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
└───────────────┼────────────────┘
│
┌──────────────────────────────────────────────┐
│ AUTH LAYER (opt-in) │
│ Bearer/Basic │ AUTH cmd │ gRPC metadata │
│ → ACL Store → Category + Graph ACL check → │
└──────────────────────┬───────────────────────┘
│
┌──────────────────────────────────────────────┐
│ ENGINE │
│ ┌────────────────────────────────────────┐ │
│ │ Query Pipeline │ │
│ │ Parse → Plan → Execute → Budget → Out│ │
│ └────────────────────────────────────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌─────────────┐ │
│ │ Graph 0 │ │ Graph 1 │ │ Graph N │ │
│ │┌────────┐│ │┌────────┐│ │┌───────────┐│ │
│ ││Adjacen.││ ││Adjacen.││ ││ Adjacency ││ │
│ │├────────┤│ │├────────┤│ │├───────────┤│ │
│ ││Propert.││ ││Propert.││ ││Properties ││ │
│ │├────────┤│ │├────────┤│ │├───────────┤│ │
│ ││ Vector ││ ││ Vector ││ ││ Vector ││ │
│ ││ (HNSW) ││ ││ (HNSW) ││ ││ (HNSW) ││ │
│ │└────────┘│ │└────────┘│ │└───────────┘│ │
│ └──────────┘ └──────────┘ └─────────────┘ │
└──────────────────────┬───────────────────────┘
│
┌──────────────────────────────────────────────┐
│ PERSISTENCE │
│ ┌──────────────┐ ┌────────────────────┐ │
│ │ WAL │ │ Snapshots │ │
│ │ (CRC32) │ │ (bincode) │ │
│ └──────────────┘ └────────────────────┘ │
└──────────────────────────────────────────────┘
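The persistence layer pairs a write-ahead log with CRC32 checksums. The sketch below shows how a checksum lets recovery stop at a torn or corrupted record. The record layout (length, CRC32, payload) and the function names are assumptions made for illustration, not Weav's on-disk format.

```python
import struct
import zlib

# Hypothetical record layout (not Weav's actual format):
# [payload length: u32 LE][CRC32 of payload: u32 LE][payload bytes]
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame one log entry with its length and CRC32 checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes) -> list:
    """Return every intact record, stopping at the first torn or corrupt one."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, checksum = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != checksum:
            break  # truncated tail or corrupted bytes: stop replay here
        records.append(payload)
        offset = start + length
    return records


log = encode_record(b"NODE ADD 0") + encode_record(b"EDGE ADD 0")
torn = log[:-3]  # simulate a crash in the middle of writing the second record
print(replay(log))
print(replay(torn))
```

Stopping at the first bad record, rather than skipping it, keeps the replayed state a strict prefix of what was written, which is the usual expectation for crash recovery.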
| Crate | Purpose |
|---|---|
| weav-core | Foundation types, config, errors, shard infrastructure, message bus |
| weav-graph | Adjacency store, property store, traversal (BFS, flow scoring, Dijkstra, PPR), entity dedup |
| weav-vector | HNSW vector index (usearch), token counting (tiktoken-rs) |
| weav-extract | Ingestion pipeline: document parsing (PDF/DOCX/CSV/text), chunking, LLM extraction (opt-in) |
| weav-query | Query parser (38 commands), planner, executor, token budget enforcement |
| weav-auth | Authentication (Argon2id), API keys (SHA-256), ACL store, command classification |
| weav-persist | Write-ahead log, snapshot engine, crash recovery |
| weav-proto | RESP3 codec, gRPC protobuf definitions, command mapping |
| weav-mcp | Model Context Protocol server (8 tools, stdio + HTTP transports) |
| weav-server | Engine coordinator, HTTP/RESP3/gRPC servers (axum, tonic) |
| weav-cli | Interactive REPL client with history (rustyline) |
| benchmarks | Criterion benchmarks at 100K scale |
- Compact String Interning: Labels and property keys stored as u16 IDs, not heap strings
- Column-Oriented Properties: Sparse property sets without wasting memory on nulls
- SmallVec<8> Adjacency: Most nodes have few edges; avoid heap allocation for the common case
- Roaring Bitmaps: Efficient set operations for node filtering and membership tests
- Greedy Knapsack Budget: Packs the highest value-density chunks (relevance / tokens) first
- Zero-Copy Ready: rkyv support for future hot-path serialization
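The greedy knapsack idea from the list above can be sketched in a few lines. This is an illustration of the technique only: the `Chunk` type, the `pack_budget` name, and the choice to skip an oversized chunk and keep scanning are assumptions, not Weav's implementation.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    node_id: int
    relevance: float
    tokens: int  # assumed to be > 0


def pack_budget(chunks, budget):
    """Greedily pick chunks by relevance per token until the budget is spent."""
    ranked = sorted(chunks, key=lambda c: c.relevance / c.tokens, reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        if used + chunk.tokens <= budget:  # skip anything that would overflow
            picked.append(chunk)
            used += chunk.tokens
    return picked


chunks = [
    Chunk(node_id=0, relevance=0.9, tokens=300),
    Chunk(node_id=1, relevance=0.8, tokens=100),
    Chunk(node_id=2, relevance=0.5, tokens=250),
    Chunk(node_id=3, relevance=0.2, tokens=50),
]
selected = pack_budget(chunks, budget=400)
print([c.node_id for c in selected])  # [1, 3, 2]
```

Node 0 has the highest raw relevance but is left out: once the two densest chunks are packed, it no longer fits, while node 2 fills the remaining 250 tokens exactly. Greedy packing is fast but not always optimal, which is the usual trade-off for this heuristic.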
Weav uses a Redis-style command language optimized for context retrieval.
GRAPH CREATE "<name>"
GRAPH DROP "<name>"
GRAPH LIST
GRAPH INFO "<name>"
NODE ADD TO "<graph>" LABEL "<label>" PROPERTIES {json} [EMBEDDING [f32, ...]]
NODE GET "<graph>" <id>
NODE GET "<graph>" BY ENTITY_KEY "<key>"
NODE UPDATE "<graph>" <id> PROPERTIES {json} [EMBEDDING [f32, ...]]
NODE DELETE "<graph>" <id>
EDGE ADD TO "<graph>" FROM <source> TO <target> LABEL "<label>" [WEIGHT <f32>]
EDGE GET "<graph>" <id>
EDGE DELETE "<graph>" <id>
EDGE INVALIDATE "<graph>" <id>
BULK NODES TO "<graph>" DATA [{node}, {node}, ...]
BULK EDGES TO "<graph>" DATA [{edge}, {edge}, ...]
The star of the show — retrieve structured, budget-aware context for your LLM:
CONTEXT "<query>" FROM "<graph>" BUDGET <n> TOKENS
[SEEDS [node_id, ...]]
[MAX DEPTH <u8>]
[DIRECTION IN|OUT|BOTH]
[EDGE_FILTER {json}]
[DECAY linear|exponential|gaussian]
[TEMPORAL AT <timestamp>]
[LIMIT <u32>]
[SORT BY relevance|recency|confidence ASC|DESC]
How the context pipeline works:
Query Text ──→ Vector Search ──→ Seed Nodes
│
Explicit Seeds ─────────────────────┤
▼
Graph Traversal
(BFS to max_depth)
│
▼
Flow Scoring
(relevance propagation)
│
▼
Temporal Filtering
(bi-temporal validity)
│
▼
Conflict Detection
(label-group dedup)
│
▼
Token Budget Enforcement
(greedy knapsack)
│
▼
Sorted ContextChunks[]
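The traversal and flow-scoring stages of the pipeline above can be approximated as a depth-limited BFS that attenuates relevance at every hop. The propagation rule used here (parent score × edge weight × a `damping` constant, keeping the highest score that reaches a node) is an assumption chosen for illustration; the source does not specify Weav's scoring formula.

```python
from collections import deque


def flow_scores(graph, seeds, max_depth, damping=0.5):
    """Propagate relevance outward from seed nodes.

    graph: node -> list of (neighbor, edge_weight)
    seeds: node -> initial relevance (e.g. vector-search similarity)
    Assumed rule: neighbor gets parent_score * edge_weight * damping,
    and each node keeps the highest score that reaches it.
    """
    scores = dict(seeds)
    frontier = deque((node, 0) for node in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor, weight in graph.get(node, []):
            candidate = scores[node] * weight * damping
            if candidate > scores.get(neighbor, 0.0):
                scores[neighbor] = candidate
                frontier.append((neighbor, depth + 1))
    return scores


graph = {
    0: [(1, 0.95), (2, 0.5)],
    1: [(3, 0.8)],
    3: [(4, 1.0)],
}
scores = flow_scores(graph, seeds={0: 1.0}, max_depth=2)
for node, score in sorted(scores.items()):
    print(node, round(score, 3))
```

Node 4 sits three hops from the seed, so with `max_depth=2` it never receives a score and would not be considered for the token budget.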
PING
INFO
STATS ["<graph>"]
SNAPSHOT
AUTH <password> # Redis-compat single-password auth
AUTH <username> <password> # Username + password auth
ACL SETUSER <user> [>password] [on|off] [+@cat|-@cat] [~pattern:perm]
ACL DELUSER <username>
ACL LIST
ACL GETUSER <username>
ACL WHOAMI
ACL SAVE # Persist ACL to file
ACL LOAD # Reload ACL from file
Command categories: +@connection, +@read, +@write, +@admin, +@all
Graph patterns: ~*:readwrite, ~app:*:read, ~shared:admin
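To make the `~pattern:perm` syntax concrete, here is a small sketch of how such rules could be parsed and checked with glob matching. The permission ordering (read < readwrite < admin) and the helper names are assumptions for illustration; Weav's actual matching rules may differ.

```python
from fnmatch import fnmatchcase

# Assumed ordering: each level implies the ones before it.
LEVELS = {"read": 1, "readwrite": 2, "admin": 3}


def parse_rule(rule):
    """Split '~app:*:read' into ('app:*', 'read')."""
    if not rule.startswith("~"):
        raise ValueError(f"graph rule must start with '~': {rule!r}")
    # The permission is whatever follows the last colon, so graph names
    # that contain colons (such as 'app:*') stay intact.
    pattern, _, permission = rule[1:].rpartition(":")
    if not pattern or permission not in LEVELS:
        raise ValueError(f"malformed graph rule: {rule!r}")
    return pattern, permission


def allowed(rules, graph, needed):
    """True if any rule matching the graph grants at least `needed`."""
    for rule in rules:
        pattern, permission = parse_rule(rule)
        if fnmatchcase(graph, pattern) and LEVELS[permission] >= LEVELS[needed]:
            return True
    return False


rules = ["~app:*:readwrite", "~shared:read"]
print(allowed(rules, "app:orders", "readwrite"))  # True
print(allowed(rules, "shared", "readwrite"))      # False
print(allowed(rules, "private", "read"))          # False
```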
Weav includes a Redis-ACL-inspired auth system that works across all three protocols. Auth is disabled by default — zero config change needed for existing deployments.
| Layer | Mechanism |
|---|---|
| HTTP | Authorization: Bearer <api_key> or Authorization: Basic <base64> header |
| RESP3 | AUTH [username] password command (per-connection identity) |
| gRPC | authorization metadata key |
Command categories control what types of operations a user can perform:
| Category | Commands |
|---|---|
| connection | PING, INFO, AUTH |
| read | NODE.GET, EDGE.GET, GRAPH.INFO, GRAPH.LIST, STATS, CONTEXT, CONFIG.GET, ACL WHOAMI |
| write | NODE.ADD, NODE.UPDATE, NODE.DELETE, EDGE.ADD, EDGE.DELETE, EDGE.INVALIDATE, BULK.INSERT.* |
| admin | GRAPH.CREATE, GRAPH.DROP, SNAPSHOT, CONFIG.SET, ACL SETUSER/DELUSER/LIST/GETUSER/SAVE/LOAD |
Graph-level ACL controls which graphs a user can access, using glob patterns:
[[auth.users]]
username = "app_writer"
password = "writepass"
categories = ["+@read", "+@write"]
graph_patterns = [
    { pattern = "app:*", permission = "readwrite" },
    { pattern = "shared", permission = "read" },
]
Users can be assigned API keys (prefixed wk_) for Bearer token auth. The server stores only SHA-256 hashes; raw keys are never persisted.
[[auth.users]]
username = "service_account"
categories = ["+@read"]
api_keys = ["wk_live_abc123def456"]
- Auth is OFF by default: pass no config and everything works as before
- require_auth = false (the default when auth is enabled) allows mixed authenticated/unauthenticated connections during migration
- All SDK auth parameters are optional: existing client code is unchanged
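Storing only SHA-256 hashes of API keys means a leaked key store does not reveal usable keys. The sketch below shows the general pattern using Python's standard library; the `KeyStore` class and the key-generation helper are hypothetical and only the `wk_` prefix and the use of SHA-256 come from the text above.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def new_api_key(prefix: str = "wk_live_") -> str:
    """Generate a random key. Only the wk_ prefix is taken from the docs."""
    return prefix + secrets.token_hex(16)


def digest(raw_key: str) -> str:
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


class KeyStore:
    """Hypothetical store that keeps SHA-256 digests, never raw keys."""

    def __init__(self) -> None:
        self._owners = {}  # digest -> username

    def register(self, username: str, raw_key: str) -> None:
        self._owners[digest(raw_key)] = username

    def authenticate(self, presented: str) -> Optional[str]:
        candidate = digest(presented)
        for stored, username in self._owners.items():
            # Constant-time comparison avoids leaking how many characters matched.
            if hmac.compare_digest(stored, candidate):
                return username
        return None


store = KeyStore()
key = new_api_key()
store.register("service_account", key)
print(store.authenticate(key))
print(store.authenticate("wk_live_not_a_real_key"))
```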
Base URL: http://localhost:6382
All responses follow { "success": bool, "data"?: T, "error"?: string }.
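Because every response shares this envelope, a client can unwrap it in one place. The following is a minimal sketch: the `EnvelopeError` name and the sample payloads are made up for illustration and are not taken from the Weav SDKs.

```python
import json
from typing import Any


class EnvelopeError(Exception):
    """Hypothetical exception raised when "success" is false."""


def unwrap(body: str) -> Any:
    """Return the "data" field of a successful response, or raise on failure."""
    envelope = json.loads(body)
    if envelope.get("success"):
        return envelope.get("data")
    raise EnvelopeError(envelope.get("error", "unknown error"))


print(unwrap('{"success": true, "data": {"node_id": 7}}'))
try:
    unwrap('{"success": false, "error": "example failure"}')
except EnvelopeError as exc:
    print(f"request failed: {exc}")
```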
| Method | Endpoint | Body | Response |
|---|---|---|---|
| POST | /v1/graphs | { "name": "..." } | { "name": "..." } |
| GET | /v1/graphs | - | [{ "name", "node_count", "edge_count" }] |
| GET | /v1/graphs/{name} | - | { "name", "node_count", "edge_count" } |
| DELETE | /v1/graphs/{name} | - | "dropped" |

| Method | Endpoint | Body | Response |
|---|---|---|---|
| POST | /v1/graphs/{g}/nodes | { "label", "properties?", "embedding?", "entity_key?" } | { "node_id": u64 } |
| GET | /v1/graphs/{g}/nodes/{id} | - | { "node_id", "label", "properties" } |
| PUT | /v1/graphs/{g}/nodes/{id} | { "properties?", "embedding?" } | "updated" |
| DELETE | /v1/graphs/{g}/nodes/{id} | - | "deleted" |
| POST | /v1/graphs/{g}/nodes/bulk | { "nodes": [...] } | { "node_ids": [u64] } |

| Method | Endpoint | Body | Response |
|---|---|---|---|
| POST | /v1/graphs/{g}/edges | { "source", "target", "label", "weight?", "properties?" } | { "edge_id": u64 } |
| GET | /v1/graphs/{g}/edges/{id} | - | Edge details |
| DELETE | /v1/graphs/{g}/edges/{id} | - | "deleted" |
| POST | /v1/graphs/{g}/edges/{id}/invalidate | - | "invalidated" |
| POST | /v1/graphs/{g}/edges/bulk | { "edges": [...] } | { "edge_ids": [u64] } |
| Method | Endpoint | Body |
|---|---|---|
| POST | /v1/context | { "graph", "query?", "embedding?", "seed_nodes?", "budget?", "max_depth?", "include_provenance?", "decay?", "temporal_at?", "limit?", "sort_field?", "sort_direction?", "edge_labels?", "direction?" } |
Returns ContextResult with chunks, token counts, and query timing.
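For callers who prefer raw HTTP over the SDKs, a context request can be assembled with the standard library alone. This sketch builds the request without sending it; the helper name is hypothetical, the field names come from the table above, and the Bearer header follows the auth section.

```python
import json
import urllib.request


def build_context_request(base_url, api_key, **fields):
    """Build (but do not send) a POST /v1/context request, dropping unset fields."""
    body = {name: value for name, value in fields.items() if value is not None}
    if "graph" not in body:
        raise ValueError("'graph' is the only field not marked optional")
    return urllib.request.Request(
        url=f"{base_url}/v1/context",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_context_request(
    "http://localhost:6382",
    "wk_live_abc123",
    graph="research",
    query="transformer architectures",
    budget=4096,
    max_depth=2,
    include_provenance=True,
    temporal_at=None,  # left out of the JSON body
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
# Sending it requires a running server, for example:
# with urllib.request.urlopen(request) as response:
#     payload = json.load(response)
```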
Decay parameter (object, not string):
{
  "decay": {
    "decay_type": "exponential",
    "half_life_ms": 3600000,
    "max_age_ms": null,
    "cutoff_ms": null
  }
}
Supported types: exponential, linear, step, none.
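As a rough guide to what these decay types mean, the sketch below computes a multiplier for a given age. The formulas, and the pairing of each type with one parameter (`half_life_ms` for exponential, `max_age_ms` for linear, `cutoff_ms` for step), are assumptions based on the field names, not a description of Weav's internals.

```python
def decay_factor(decay: dict, age_ms: float) -> float:
    """Multiplier in [0, 1] applied to a relevance score for an item of this age."""
    kind = decay.get("decay_type", "none")
    if kind == "none":
        return 1.0
    if kind == "exponential":
        # Halves the score every half_life_ms.
        return 0.5 ** (age_ms / decay["half_life_ms"])
    if kind == "linear":
        # Falls in a straight line to zero at max_age_ms.
        return max(0.0, 1.0 - age_ms / decay["max_age_ms"])
    if kind == "step":
        # Full weight up to cutoff_ms, nothing afterwards.
        return 1.0 if age_ms <= decay["cutoff_ms"] else 0.0
    raise ValueError(f"unsupported decay_type: {kind!r}")


hour = 3_600_000
exponential = {"decay_type": "exponential", "half_life_ms": hour}
print(decay_factor(exponential, 0))         # 1.0
print(decay_factor(exponential, hour))      # 0.5
print(decay_factor(exponential, 2 * hour))  # 0.25
```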
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /v1/info | Server info |
| POST | /v1/snapshot | Trigger snapshot |
| GET | /metrics | Prometheus metrics |
Connect on port 6380 with any Redis client or weav-cli. Commands are sent as RESP3 arrays.
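For reference, a RESP array of bulk strings has a simple wire format: `*<count>`, then `$<length>` and the bytes for each element, all terminated by CRLF. The sketch below encodes a command that way. How Weav expects a command line to be split into array elements (for example, whether quotes around graph names are kept) is not stated here, so the one-token-per-element split is an assumption.

```python
def encode_command(*parts: str) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        raw = part.encode("utf-8")
        # Bulk string: byte length, CRLF, payload, CRLF.
        out.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(out)


print(encode_command("PING"))
print(encode_command("GRAPH", "CREATE", "knowledge"))
# Sending the bytes needs a running server, for example:
# import socket
# with socket.create_connection(("localhost", 6380)) as conn:
#     conn.sendall(encode_command("PING"))
```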
Connect on port 6381. Proto definitions in weav-proto/proto/weav.proto. Supports 22 RPC methods including ContextQueryStream for streaming results.
from weav import WeavClient, AsyncWeavClient
# Sync client
client = WeavClient(host="localhost", port=6382)
# Async client
client = AsyncWeavClient(host="localhost", port=6382)
# With authentication
client = WeavClient(host="localhost", port=6382, api_key="wk_live_abc123")
client = WeavClient(host="localhost", port=6382, username="admin", password="secret")
# LLM integrations
from weav import WeavLangChain, WeavLlamaIndex
Context result helpers:
result = client.context("my_graph", query="...", budget=4096)
result.to_prompt() # Formatted string for system prompt injection
result.to_messages() # OpenAI-compatible message list
Full parameter support:
result = client.context("my_graph",
    query="transformer architectures",
    budget=4096,
    decay={"type": "exponential", "half_life_ms": 3600000},
    edge_labels=["derived_from", "related_to"],
    temporal_at=1700000000000,
    direction="outgoing",
    limit=50,
    sort_field="relevance",
    sort_direction="desc",
    include_provenance=True,
    seed_nodes=["node_key_1"],
    embedding=[0.1, 0.2, 0.3],
)
import { WeavClient, contextToPrompt, contextToMessages } from "@weav/client";
const client = new WeavClient({ host: "localhost", port: 6382 });
// With authentication:
// new WeavClient({ host: "localhost", port: 6382, apiKey: "wk_live_abc123" });
// new WeavClient({ host: "localhost", port: 6382, username: "admin", password: "secret" });
const result = await client.context({
  graph: "my_graph",
  query: "...",
  budget: 4096,
  decay: { type: "exponential", halfLifeMs: 3600000 },
  edgeLabels: ["related_to", "derived_from"],
  temporalAt: Date.now(),
  direction: "outgoing",
  limit: 50,
  sortField: "relevance",
  sortDirection: "desc",
});
contextToPrompt(result); // Formatted prompt string
contextToMessages(result); // OpenAI-compatible messages
All response fields use camelCase (nodeId, relevanceScore, tokenCount).
Weav is configured via TOML file or environment variables.
# weav.toml
[server]
bind_address = "0.0.0.0"
port = 6380 # RESP3
grpc_port = 6381 # gRPC
http_port = 6382 # HTTP REST
max_connections = 10000
tcp_keepalive_secs = 300
read_timeout_ms = 30000
[engine]
num_shards = 8 # Defaults to CPU count
default_vector_dimensions = 1536
max_vector_dimensions = 4096
default_hnsw_m = 16
default_hnsw_ef_construction = 200
default_hnsw_ef_search = 50
default_conflict_policy = "LastWriteWins"
enable_temporal = true
enable_provenance = true
token_counter = "CharDiv4" # or "TiktokenCl100k", "TiktokenO200k"
[persistence]
enabled = false
data_dir = "./weav-data"
wal_enabled = true
wal_sync_mode = "EverySecond" # or "Always", "Never"
snapshot_interval_secs = 3600
max_wal_size_mb = 256
[memory]
max_memory_mb = 0 # 0 = unlimited
eviction_policy = "NoEviction"
arena_size_mb = 64
[auth]
enabled = false # Set true to enable auth
require_auth = false # Set true to reject unauthenticated connections
# default_password = "secret" # Redis-compat: AUTH <password> only
# acl_file = "./weav-data/acl.conf"
# [[auth.users]]
# username = "admin"
# password = "supersecret"
# categories = ["+@all"]
#
# [[auth.users]]
# username = "reader"
# password = "readonly123"
# categories = ["+@read", "+@connection"]
# graph_patterns = [{ pattern = "*", permission = "read" }]
# api_keys = ["wk_live_abc123def456"]

| Variable | Description |
|---|---|
| WEAV_SERVER_PORT | RESP3 listen port |
| WEAV_SERVER_HTTP_PORT | HTTP REST listen port |
| WEAV_SERVER_GRPC_PORT | gRPC listen port |
| WEAV_SERVER_BIND_ADDRESS | Bind address |
| WEAV_ENGINE_NUM_SHARDS | Number of shards |
| WEAV_PERSISTENCE_ENABLED | Enable persistence (true/false) |
| WEAV_PERSISTENCE_DATA_DIR | Persistence directory path |
| WEAV_MEMORY_MAX_MEMORY_MB | Memory limit in MB |
| WEAV_AUTH_ENABLED | Enable authentication (true/false) |
| WEAV_AUTH_REQUIRE_AUTH | Require auth for all connections (true/false) |
| WEAV_AUTH_DEFAULT_PASSWORD | Default password for Redis-compat AUTH <password> |
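The variable names follow a `WEAV_<SECTION>_<KEY>` pattern that lines up with the TOML sections. The sketch below shows one way such overrides could be layered on top of file settings. That environment variables take precedence over the file is an assumption, as are the helper names; only a subset of the variables is mapped.

```python
import copy


def parse_bool(text: str) -> bool:
    return text.strip().lower() == "true"


# Variable names are from the table above; sections and keys mirror weav.toml.
ENV_MAP = {
    "WEAV_SERVER_PORT": ("server", "port", int),
    "WEAV_SERVER_HTTP_PORT": ("server", "http_port", int),
    "WEAV_SERVER_GRPC_PORT": ("server", "grpc_port", int),
    "WEAV_SERVER_BIND_ADDRESS": ("server", "bind_address", str),
    "WEAV_ENGINE_NUM_SHARDS": ("engine", "num_shards", int),
    "WEAV_PERSISTENCE_ENABLED": ("persistence", "enabled", parse_bool),
    "WEAV_PERSISTENCE_DATA_DIR": ("persistence", "data_dir", str),
    "WEAV_AUTH_ENABLED": ("auth", "enabled", parse_bool),
}


def apply_env(config: dict, environ: dict) -> dict:
    """Return a copy of config with any recognized WEAV_* variables applied."""
    merged = copy.deepcopy(config)
    for name, (section, key, convert) in ENV_MAP.items():
        if name in environ:
            merged.setdefault(section, {})[key] = convert(environ[name])
    return merged


file_config = {"server": {"port": 6380, "http_port": 6382}, "auth": {"enabled": False}}
environ = {"WEAV_SERVER_HTTP_PORT": "8080", "WEAV_AUTH_ENABLED": "true"}
effective = apply_env(file_config, environ)
print(effective["server"]["http_port"], effective["auth"]["enabled"])  # 8080 True
```

In real use the second argument would be `os.environ`; a plain dict is used here so the example runs without touching the process environment.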
Node
├── node_id: u64
├── label: LabelId (interned u16)
├── properties: Map<PropertyKeyId, Value>
├── embedding: Option<Vec<f32>>
├── entity_key: Option<String>
└── temporal: BiTemporal
Edge
├── edge_id: u64
├── source: NodeId → target: NodeId
├── label: LabelId (interned u16)
├── weight: f32
├── properties: Map<PropertyKeyId, Value>
├── provenance: Option<Provenance>
└── temporal: BiTemporal
BiTemporal
├── valid_from / valid_until ← real-world validity window
└── tx_from / tx_until ← database transaction time
Provenance
├── source: String ← "gpt-4-turbo", "user-input", "sec-filing-10k"
├── confidence: f32 ← 0.0 to 1.0
├── extraction_method ← LlmExtracted | NlpPipeline | UserProvided | Derived | Imported
├── source_document_id: Option
└── source_chunk_offset: Option
Value (dynamic type system)
├── Null | Bool | Int | Float
├── String | Bytes | Timestamp
├── Vector(Vec<f32>)
├── List(Vec<Value>)
└── Map(Vec<(String, Value)>)
ContextChunk
├── node_id: u64
├── content: String ← concatenated text properties
├── label: String
├── relevance_score: f32 ← flow scoring result
├── depth: u8 ← hops from seed
├── token_count: u32
├── provenance: Option<Provenance>
├── relationships: Vec<RelationshipSummary>
└── temporal: Option<BiTemporal>
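The BiTemporal record above supports point-in-time questions of the form "what was true at time X, according to what the database knew at time Y". A minimal sketch of that check follows. The half-open intervals and the use of `None` for an open-ended bound are assumptions, and the field types are simplified to integer timestamps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BiTemporal:
    valid_from: int
    valid_until: Optional[int]  # None = still valid (assumed encoding)
    tx_from: int
    tx_until: Optional[int]     # None = current version (assumed encoding)


def covers(start: int, end: Optional[int], instant: int) -> bool:
    """Half-open interval test: start <= instant < end."""
    return start <= instant and (end is None or instant < end)


def visible(t: BiTemporal, valid_at: int, known_at: int) -> bool:
    """Was the fact true at valid_at, as recorded in the database at known_at?"""
    return (covers(t.valid_from, t.valid_until, valid_at)
            and covers(t.tx_from, t.tx_until, known_at))


# A fact valid from t=1000 to t=2000 that was only recorded at t=1500.
fact = BiTemporal(valid_from=1000, valid_until=2000, tx_from=1500, tx_until=None)
print(visible(fact, valid_at=1200, known_at=1600))  # True
print(visible(fact, valid_at=1200, known_at=1400))  # False: not yet recorded
print(visible(fact, valid_at=2500, known_at=1600))  # False: no longer valid
```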
Run with:
cargo bench

| Benchmark | Scale | Description |
|---|---|---|
| vector_search_100k_128d_k10 | 100K vectors, 128 dims | Top-10 nearest neighbor search |
| bfs_100kn_depth3 | 100K nodes, avg degree 5 | BFS traversal to depth 3 |
| flow_score_100kn_depth3 | 100K nodes, avg degree 5 | Relevance flow scoring |
| node_adjacency_10k | 10K insertions | Adjacency insert throughput |
Benchmarks produce HTML reports via criterion.
# Run all tests (default features)
cargo test --workspace
# Run all tests including LLM provider tests
cargo test --workspace --features weav-server/full,weav-extract/llm-providers
# Run tests for a specific crate
cargo test -p weav-core
cargo test -p weav-graph
cargo test -p weav-server
# Python SDK tests
cd sdk/python && pip install -e ".[dev]" && pytest
# Node SDK tests
cd sdk/node && npm test
1,340 Rust tests across all crates, 1,414 total including SDKs, all passing.
| Crate | Tests |
|---|---|
| weav-core | 134 |
| weav-graph | 343 |
| weav-vector | 32 |
| weav-extract | 32 |
| weav-query | 227 |
| weav-auth | 47 |
| weav-persist | 47 |
| weav-proto | 61 |
| weav-server | 378 (282 unit + 28 integration + 68 E2E) |
| weav-cli | 39 |
| Python SDK | 49 |
| Node SDK | 25 |
weav/
├── weav-core/ Core types, config, errors, shard, message bus
├── weav-graph/ Adjacency store, property store, traversal, dedup
├── weav-vector/ HNSW vector index, token counter
├── weav-extract/ Ingestion: document parsing, chunking, LLM extraction (opt-in)
├── weav-query/ Parser (38 commands), planner, executor, budget enforcer
├── weav-auth/ Authentication (Argon2id), API keys, ACL store
├── weav-persist/ WAL, snapshots, recovery manager
├── weav-proto/ RESP3 codec, gRPC proto, command mapping
├── weav-mcp/ MCP server (Model Context Protocol for LLM tools)
├── weav-server/ Engine, HTTP/RESP3/gRPC servers, binary
├── weav-cli/ Interactive REPL client
├── benchmarks/ Criterion benchmarks (100K scale)
├── sdk/
│ ├── python/ Python HTTP client + LLM integrations
│ └── node/ TypeScript HTTP client
└── Cargo.toml Workspace root
Weav is built on battle-tested Rust crates:
| Category | Crates |
|---|---|
| Async | tokio, tokio-util |
| HTTP | axum, tower |
| gRPC | tonic, prost |
| Vector Search | usearch (HNSW) |
| Tokenization | tiktoken-rs (cl100k, o200k) |
| LLM Integration | llm (opt-in via extract-llm feature) |
| Serialization | serde, bincode, rkyv |
| Data Structures | roaring, smallvec, compact_str |
| Hashing | xxhash-rust, crc32fast, sha2 |
| Auth | argon2 (Argon2id), rand, glob-match |
| String Matching | strsim (Jaro-Winkler) |
| CLI | clap, rustyline |
| Memory | bumpalo (arena allocator) |
| Concurrency | crossbeam, parking_lot |