<10ms: Embedding Inference + Retrieval
150K+: Package Installs
100%: Offline Indexing + Querying
Trusted by 1000+ teams
How It Works
01. Push your data source (docs, knowledge bases, or live data) via our SDK or portal.
02. We index, sync, and distribute a compact index wherever your agent runs: browser, edge, device, or cloud.
03. Your agent retrieves context locally, in under 10ms. No hops. No lag. No infra to manage.
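The local retrieval step above can be pictured as a nearest-neighbour lookup over an in-memory index. The sketch below is a minimal illustration of that idea, not the Moss implementation: the tiny 3-dimensional vectors stand in for real embeddings, and a production index would use a compact approximate-nearest-neighbour structure rather than a brute-force scan.

```python
import math

# Toy in-memory index. In practice the vectors come from an embedding
# model and the index ships alongside your agent; these hand-written
# 3-dim vectors are illustrative placeholders, not real embeddings.
INDEX = {
    "shipping policy": [0.9, 0.1, 0.0],
    "refund policy":   [0.1, 0.9, 0.0],
    "account setup":   [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=1):
    # Brute-force scan over the local index: no network hop, just math
    # in the agent's own process.
    ranked = sorted(INDEX.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve([0.8, 0.2, 0.1]))  # closest to "shipping policy"
```

Because the whole lookup runs in-process against local data, latency is bounded by a memory scan rather than a network round trip.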
Integration
Install Moss and start querying in just a few lines of code
from inferedge_moss import MossClient
client = MossClient(PROJECT_ID, PROJECT_KEY)
docs = [{"text": "How do I track my order?"}]
await client.add_docs("my-index", docs)

Benchmarks
Benchmarked on 100k documents. Latency includes embedding inference and end-to-end cloud roundtrip. View benchmark script
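As a rough, scaled-down illustration of how per-query latency can be timed (hypothetical code, not the benchmark script linked above; random vectors stand in for embedded documents, and the corpus is shrunk so it runs quickly in pure Python):

```python
import random
import time

random.seed(0)

# Hypothetical stand-in for a compact local index: 10k random 32-dim
# vectors. A real benchmark would use embedded documents and a much
# larger corpus; this only shows how a single query is timed.
DIM, N = 32, 10_000
index = [[random.random() for _ in range(DIM)] for _ in range(N)]
query = [random.random() for _ in range(DIM)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Time one brute-force nearest-neighbour query end to end.
start = time.perf_counter()
best = max(range(N), key=lambda i: dot(query, index[i]))
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"brute-force scan over {N} docs: {elapsed_ms:.1f} ms")
```

A full benchmark would repeat this over many queries and report percentiles, and would include embedding inference time, as the figures above do.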
Integrations
TypeScript, Python, voice, agents. Install and go.

Use Cases
If you're building voice AI, copilots, or multimodal apps where retrieval is on your critical path, Moss is built for you.
Sub-10ms context retrieval for real-time conversation. Your agent recalls, reasons, and responds without the pause.
FAQ
Everything you need to know about Moss and real-time semantic search for AI agents.