<10ms: Embedding Inference + Retrieval
125K+: Package Installs
100%: Offline Indexing + Querying
Trusted by 1000+ teams
How It Works
01. Push your data source (docs, knowledge bases, or live data) via our SDK or portal.
02. We index, sync, and distribute a compact index wherever your agent runs: browser, edge, device, or cloud.
03. Your agent retrieves context locally, in under 10ms. No hops. No lag. No infra to manage.
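The local retrieval in step 03 can be sketched as a query over a compact in-memory index scored by cosine similarity. This is a minimal TypeScript illustration only: the `IndexedDoc` shape, the `retrieve` helper, and the toy vectors are assumptions for the sketch, not the actual Moss index format or API.

```typescript
// Illustrative sketch: a compact local index queried by cosine similarity.
// In practice, vectors would come from an embedding model; here they are toy values.
type IndexedDoc = { id: string; vector: number[] }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

// Return the ids of the top-k documents closest to the query vector.
function retrieve(index: IndexedDoc[], query: number[], k: number): string[] {
  return index
    .map(doc => ({ id: doc.id, score: cosine(doc.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(r => r.id)
}

const index: IndexedDoc[] = [
  { id: 'shipping', vector: [0.9, 0.1, 0.0] },
  { id: 'returns',  vector: [0.1, 0.9, 0.0] },
  { id: 'billing',  vector: [0.0, 0.1, 0.9] },
]

console.log(retrieve(index, [0.8, 0.2, 0.0], 1)) // [ 'shipping' ]
```

Because the index lives in the same process as the agent, a lookup like this is a pure in-memory scan with no network hop, which is what makes sub-10ms retrieval feasible.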
Integration
Install Moss and start querying in just a few lines of code
import { MossClient, DocumentInfo } from '@inferedge/moss'

const client = new MossClient(
  process.env.MOSS_PROJECT_ID!,
  process.env.MOSS_PROJECT_KEY!
)

const documents: DocumentInfo[] = [
  {
    id: 'doc1',
    text: 'How do I track my order? Log into your account.',
    metadata: { category: 'shipping' }
  }
]

Benchmarks
Benchmarked on 100k documents. Latency includes embedding inference and end-to-end cloud roundtrip. View benchmark script
Use Cases
If you're building voice AI, copilots, or multimodal apps where retrieval is on your critical path, Moss is built for you.
Sub-10ms context retrieval for real-time conversation. Your agent recalls, reasons, and responds without the pause.
FAQ
Everything you need to know about Moss and real-time semantic search for AI agents.