Self-hosted inference for search & document processing
# Configure
module "sie" {
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
> terraform apply
> helm install sie oci://ghcr.io/superlinked/charts/sie-cluster

# Use
> pip install sie-sdk
client.encode("BAAI/bge-m3", Item(text="indemnification"), options={"lora": "legal"})
# Configure
module "sie" {
  source = "superlinked/sie/google"
  region = "us-central1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
> terraform apply
> helm install sie oci://ghcr.io/superlinked/charts/sie-cluster

# Use
> pip install sie-sdk
client.encode("BAAI/bge-m3", Item(text="indemnification"), options={"lora": "legal"})
# Run
> docker run -p 8080:8080 ghcr.io/superlinked/sie-server

# Use
> pip install sie-sdk
client.encode("BAAI/bge-m3", Item(text="indemnification"), options={"lora": "legal"})
Works with your favorite tools
Browse integrations
"Haystack orchestrates multi-modal pipelines with any combination of models and SIE is the simplest way to self-host them all, including SOTA OCR."
"Chroma makes context engineering simple. SIE adds instruction-following rerankers and relationship extractors for even more precise retrieval."
"Weaviate's Query Agent unlocks natural language search and with SIE you can pre-process your query and data for even better quality."
"Modern search systems compose the best indexing, scoring, filtering and ranking models. With SIE you can host them all in one cluster."
Benefits of self-hosted inference
Pay for your own GPUs instead of per-token API pricing. Improve GPU utilization and stability compared with custom TEI/Infinity deployments.
Boost accuracy with the latest task-specific open-source models. Embeddings, rerankers, extraction — including multi-modal and multi-vector.
Data never leaves your AWS/GCP account. You pick the models and configurations. SOC 2 Type 2 certified. Apache 2.0 licensed.
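The per-token vs. GPU-hour trade-off above comes down to volume. A minimal break-even sketch in Python, where the API price and the GPU hourly rate are illustrative placeholders (not published SIE, cloud-provider, or API-vendor prices):

```python
# Back-of-the-envelope break-even: per-token API pricing vs. a dedicated GPU.
# Both prices below are ASSUMED for illustration only.

def breakeven_tokens_per_hour(api_price_per_million: float,
                              gpu_price_per_hour: float) -> float:
    """Tokens per hour at which owning the GPU costs the same as the API."""
    return gpu_price_per_hour / api_price_per_million * 1_000_000

# Assumed: $0.02 per 1M embedding tokens (API) vs. $0.80/hour for a spot GPU.
tokens = breakeven_tokens_per_hour(api_price_per_million=0.02,
                                   gpu_price_per_hour=0.80)
print(f"Break-even at {tokens:,.0f} tokens/hour")  # roughly 40,000,000/hour
```

Above that hourly volume the GPU wins; below it, per-token pricing does — which is why pooling inference across teams and workloads in one cluster improves the economics.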
Learn from our example apps
Browse examples
SIE: Superlinked Inference Engine
Run all your search & document-processing inference in one centralized cluster, across teams and workloads.
Build your apps
> pip install sie-sdk
> npm install @superlinked/sie-sdk
and 5+ framework integrations
Manage models & configurations via SDK
client.list_models()
Deploy the cluster
> helm install sie oci://ghcr.io/superlinked/charts/sie-cluster
Observe with cloud-native tools such as Grafana and
> sie-top
Create the infrastructure
module "sie" {
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}
Deploy
> terraform apply
Plan your self-deployment
How SIE fits in your stack
See where SIE sits in a typical retrieval pipeline alongside vector databases, orchestration frameworks, and your application layer.
Cost Comparison
Compare across models, GPU types, and cloud providers.
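A comparison like this reduces to one formula: dollars per million tokens given a GPU's hourly price and its embedding throughput. A sketch, where every price and throughput figure is an assumed placeholder rather than a benchmark:

```python
# $ per 1M tokens = (hourly GPU price) / (tokens processed per hour) * 1M.
# All ($/hour, tokens/s) pairs below are ILLUSTRATIVE assumptions.

def cost_per_million_tokens(gpu_price_per_hour: float,
                            tokens_per_second: float) -> float:
    tokens_per_hour = tokens_per_second * 3600
    return gpu_price_per_hour / tokens_per_hour * 1_000_000

gpus = {
    "a100-40gb (on-demand)": (4.10, 60_000),
    "l4-spot":               (0.30, 15_000),
}
for name, (price, tps) in gpus.items():
    print(f"{name}: ${cost_per_million_tokens(price, tps):.3f} per 1M tokens")
```

Swapping in your own measured throughput per model and your cloud's actual GPU prices turns this into the real comparison.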