This page provides a high-level introduction to TrustGraph, its purpose as a context backend for reliable AI systems, and its overall architecture. TrustGraph is designed to bridge the gap between raw data and agentic reasoning by providing a structured, multi-model infrastructure for context management.
For detailed explanations of core terminology such as Collections, Triples, and Context Cores, see Key Concepts. For information about the modular package structure and dependencies, see Package Structure.
TrustGraph is a context development platform and graph-native infrastructure designed to store, enrich, and retrieve structured knowledge at any scale. It addresses the fundamental problem that LLMs alone tend to hallucinate and diverge from ground truth by providing a multi-model database system that maintains factual information and makes it available through various retrieval mechanisms. README.md14-18
The system operates as an AI-native backend similar to traditional database systems like Supabase, but specifically architected for context graph operations. TrustGraph ingests data from multiple sources, automatically structures it using knowledge graph extraction, embeds it for semantic search, and makes it queryable through specialized Retrieval-Augmented Generation (RAG) pipelines and agentic workflows. README.md20-36
Sources: README.md14-18 README.md20-36 README.md97-103
TrustGraph implements a message-driven, microservices architecture where all components communicate asynchronously through Apache Pulsar. This design enables horizontal scalability, fault isolation, and flexible deployment configurations. README.md61
The following diagram illustrates the relationship between the client interfaces, the core processing services, and the underlying storage layer, mapping system concepts to specific service implementations.
Sources: README.md20-62 README.md112-129
All inter-service communication flows through Apache Pulsar, which provides guaranteed message delivery, persistence, and multi-tenancy. Services implement the AsyncProcessor or FlowProcessor base classes, which define subscription topics and message handlers. README.md61
Each service subscribes to specific Pulsar topics defined in its ConsumerSpec configuration. For example, the chunker services subscribe to document input topics and publish to embedding service topics. This pub/sub model decouples services and enables multiple instances of each service to share the workload.
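The competing-consumers pattern described above can be sketched with a toy in-memory broker. This is an illustration of the pub/sub decoupling, not TrustGraph's actual `AsyncProcessor`/`FlowProcessor` code; the topic names and the `None` shutdown sentinel are assumptions made for the sketch.

```python
import asyncio
from collections import defaultdict

class ToyBroker:
    """Minimal stand-in for a Pulsar broker: named topics backed by queues.
    Consumers pulling from the same topic act as competing consumers,
    mimicking a Pulsar shared subscription."""
    def __init__(self):
        self.topics = defaultdict(asyncio.Queue)

    async def publish(self, topic, msg):
        await self.topics[topic].put(msg)

    async def consume(self, topic):
        return await self.topics[topic].get()

async def chunker(broker, worker_id):
    # Pull documents from the input topic, split them, and publish the
    # pieces to the downstream embedding topic. A None message is a
    # shutdown sentinel in this sketch.
    while True:
        doc = await broker.consume("document-load")
        if doc is None:
            return
        for piece in doc.split(". "):
            await broker.publish("chunk-embed", piece)

async def main():
    broker = ToyBroker()
    # Two chunker instances share the "document-load" topic's workload.
    workers = [asyncio.create_task(chunker(broker, i)) for i in range(2)]
    await broker.publish("document-load", "First sentence. Second sentence")
    for _ in workers:  # one sentinel per worker instance
        await broker.publish("document-load", None)
    await asyncio.gather(*workers)
    out = []
    while not broker.topics["chunk-embed"].empty():
        out.append(broker.topics["chunk-embed"].get_nowait())
    return sorted(out)

chunks = asyncio.run(main())
print(chunks)  # ['First sentence', 'Second sentence']
```

Because the publisher never addresses a worker directly, adding a third chunker instance requires no changes to the producer side, which is the scalability property the Pulsar design provides.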
Sources: README.md61
TrustGraph employs a multi-model storage strategy where different data types are stored in specialized databases optimized for their access patterns. README.md21-24
Collection names follow specific conventions for multi-tenancy and dimensionality handling:
- `d_{user}_{collection}_{dim}` for documents
- `t_{user}_{collection}_{dim}` for graph entities

Stored records also carry `user` and `collection` metadata properties.

Sources: README.md21-24 README.md58-60
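The naming convention can be captured in a small helper. The function and the `"document"`/`"entity"` kind labels are illustrative, not part of TrustGraph's API:

```python
def collection_name(kind: str, user: str, collection: str, dim: int) -> str:
    """Build a store collection name following the documented convention:
    a one-letter prefix, then user, collection, and embedding dimension."""
    prefixes = {"document": "d", "entity": "t"}
    return f"{prefixes[kind]}_{user}_{collection}_{dim}"

print(collection_name("document", "alice", "research", 384))  # d_alice_research_384
print(collection_name("entity", "alice", "research", 384))    # t_alice_research_384
```

Encoding the embedding dimension into the name lets collections with different vector sizes coexist for the same user and logical collection.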
TrustGraph implements three complementary RAG strategies. README.md28-31
Sources: README.md28-31
The agent implementation follows the ReAct (Reasoning and Acting) pattern, where the agent iteratively reasons about a problem and takes actions using available tools. README.md33-36
Sources: README.md33-36 README.md127-129
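The ReAct loop can be sketched as follows. In TrustGraph the reasoning step is an LLM call and the tools are real services; here both are scripted stand-ins, and all names (`react_agent`, `lookup_capital`) are hypothetical:

```python
def lookup_capital(country: str) -> str:
    """Toy tool standing in for a real retrieval service."""
    return {"France": "Paris"}.get(country, "unknown")

TOOLS = {"lookup_capital": lookup_capital}

def react_agent(question: str, max_steps: int = 3):
    # ReAct pattern: iterate Thought -> Action -> Observation until the
    # observation answers the question or the step budget runs out.
    trace = []
    for _ in range(max_steps):
        thought = f"I should look up the country mentioned in: {question!r}"
        action, arg = "lookup_capital", "France"   # scripted; normally LLM-chosen
        observation = TOOLS[action](arg)
        trace.append((thought, action, observation))
        if observation != "unknown":
            return observation, trace
    return None, trace

answer, trace = react_agent("What is the capital of France?")
print(answer)  # Paris
```

The trace of interleaved thoughts, actions, and observations is what distinguishes ReAct from a single retrieve-then-answer pass: each observation can inform the next reasoning step.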
TrustGraph provides a unified interface for all major LLMs, supporting both cloud APIs and self-hosted inference stacks. README.md40-42
| Provider Type | Examples | Use Case |
|---|---|---|
| Cloud APIs | Anthropic, Gemini, Mistral, OpenAI | Production deployments with managed infrastructure |
| Self-Hosted | vLLM, TGI, Ollama, LM Studio, Llamafiles | On-premise deployments, data sovereignty requirements |
Sources: README.md40-42 README.md62
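A unified interface over heterogeneous providers is typically built with the adapter pattern. The sketch below shows the shape of such an abstraction; the class names and the `complete` method are assumptions, not TrustGraph's actual API, and the `EchoClient` merely stands in for a real provider adapter:

```python
from abc import ABC, abstractmethod

class LLMClient(ABC):
    """Uniform completion interface; one concrete adapter per provider."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class EchoClient(LLMClient):
    """Stand-in adapter; a real one would call e.g. an OpenAI or Ollama endpoint."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def get_client(provider: str) -> LLMClient:
    # Registry maps a provider name from configuration to its adapter class,
    # so the rest of the pipeline is unaware of which backend is in use.
    registry = {"echo": EchoClient}
    return registry[provider]()

print(get_client("echo").complete("hello"))  # echo: hello
```

Because callers depend only on `LLMClient`, switching between a cloud API and a self-hosted model becomes a configuration change rather than a code change.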
TrustGraph can be deployed locally or in the cloud using the provided configuration tools. README.md64-73
The configuration builder generates deployment artifacts (deploy.zip containing docker-compose.yaml or resources.yaml) and an INSTALLATION.md file. README.md70-72
Sources: README.md64-73
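Assuming the Docker Compose target, a typical workflow with the generated artifacts might look like the following. These commands are illustrative; the generated INSTALLATION.md is the authoritative guide for a given configuration:

```shell
unzip deploy.zip
# Docker Compose target:
docker compose up -d
# or, for a Kubernetes target (resources.yaml):
kubectl apply -f resources.yaml
```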