Vector Engine
The vector engine powers semantic search — nearest-neighbor retrieval over high-dimensional embeddings. It uses a custom HNSW index with multiple quantization levels and hardware-accelerated distance math.
When to Use
- Semantic search over embeddings (text, images, audio)
- RAG pipelines for AI agents
- Recommendation systems
- Similarity matching and deduplication
Key Features
- HNSW index — Multi-layer proximity graph. Construction at full precision (FP32/FP16) for structural integrity; traversal on quantized payloads for cache residency.
- Quantization — SQ8 (~4x memory reduction), PQ (~4-8x), IVF-PQ (~16 bytes/vector for 100M+ datasets).
- Adaptive pre-filtering — Roaring Bitmap-based filtering. Automatic strategy selection: pre-filter (selective), post-filter (broad), or brute-force.
- Distance metrics — L2, cosine, inner product, Manhattan, Chebyshev, Hamming, Jaccard, Pearson.
- Cross-engine fusion — Combine with graph (GraphRAG), full-text (hybrid BM25+vector), or spatial filtering.
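The adaptive pre-filtering item above can be illustrated with a small sketch. This is not the engine's implementation: a plain Python set stands in for a Roaring Bitmap, the function name `choose_strategy` is hypothetical, and both thresholds are made-up values chosen only for illustration. The mapping of selective filters to pre-filtering and broad filters to post-filtering comes from the feature list; using brute force when very few IDs match is an assumption.

```python
def choose_strategy(matching_ids, total_vectors,
                    brute_force_max=1_000, post_filter_min=0.5):
    """Pick a search strategy from filter selectivity.

    matching_ids: IDs that pass the metadata filter (a set here; the
    engine is described as using a Roaring Bitmap).
    brute_force_max and post_filter_min are hypothetical thresholds.
    """
    if total_vectors <= 0:
        raise ValueError("total_vectors must be positive")

    matches = len(matching_ids)
    selectivity = matches / total_vectors  # fraction of vectors that pass

    if matches <= brute_force_max:
        # Assumption: with very few matches, scanning them directly is cheapest.
        return "brute-force"
    if selectivity >= post_filter_min:
        # Broad filter: search the whole index, then drop non-matching hits.
        return "post-filter"
    # Selective filter: restrict the search to matching IDs up front.
    return "pre-filter"


print(choose_strategy(set(range(50_000)), 1_000_000))
```

In practice the cut-over points would be tuned against real latency measurements rather than fixed constants.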
SQL Usage
-- Create a collection with a vector index
CREATE COLLECTION articles;
CREATE VECTOR INDEX idx_embed ON articles METRIC cosine DIM 384;
-- Insert with embedding
INSERT INTO articles { title: 'Understanding Transformers', embedding: [0.12, -0.34, 0.56, ...] };
-- Nearest neighbor search
SEARCH articles USING VECTOR(embedding, ARRAY[0.1, 0.3, -0.2, ...], 10);
-- Filtered vector search
SELECT title, vector_distance(embedding, ARRAY[0.1, 0.3, ...]) AS score
FROM articles
WHERE category = 'machine-learning'
  AND id IN (SEARCH articles USING VECTOR(embedding, ARRAY[0.1, 0.3, ...], 10));
-- Hybrid BM25 + vector (RRF fusion)
SELECT title, rrf_score(
    vector_distance(embedding, $query_vec),
    bm25_score(body, 'transformer attention')
) AS score
FROM articles
LIMIT 10;
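For readers unfamiliar with RRF, the following Python sketch shows standard reciprocal rank fusion over two ranked lists. It is a generic illustration, not the engine's `rrf_score`: the function name `rrf_fuse` and the sample document IDs are hypothetical, and `k = 60` is the constant commonly used for RRF, which the engine may or may not share. The source does not say how `rrf_score` turns its two score arguments into ranks, so the sketch starts from lists that are already ranked.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list has rank 1.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_ranking = ["a", "b", "c"]  # best first, by vector distance
bm25_ranking = ["b", "d", "a"]    # best first, by BM25 score

fused = rrf_fuse([vector_ranking, bm25_ranking])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Because RRF uses only rank positions, it needs no normalization between distance values and BM25 scores, which live on unrelated scales.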
Quantization Selection
| Index type | Memory per vector (384 dims) | Recall | Best for |
| HNSW (FP32) | ~1.5 KB | ~99% | < 1M vectors, max accuracy |
| HNSW + SQ8 | ~384 B | ~98% | 1-10M vectors |
| HNSW + PQ | ~96 B | ~95% | 10-50M vectors |
| IVF-PQ | ~16 B | ~85-95% | 50M+ vectors |
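The per-vector figures in the table follow from simple arithmetic, and a short Python sketch makes it easy to project them onto a dataset size. The FP32 and SQ8 values are derived from the dimension count; the PQ and IVF-PQ values are copied from the table as given. The helper name `total_gib` and the dataset sizes are illustrative assumptions, and the totals cover raw vector payload only, not graph links or metadata.

```python
DIM = 384

BYTES_PER_VECTOR = {
    "HNSW (FP32)": DIM * 4,  # 4 bytes per float32 component = 1536 B (~1.5 KB)
    "HNSW + SQ8": DIM * 1,   # 1 byte per component = 384 B
    "HNSW + PQ": 96,         # value taken from the table
    "IVF-PQ": 16,            # value taken from the table
}


def total_gib(index_type, num_vectors):
    """Raw vector payload in GiB, excluding graph links and metadata."""
    return BYTES_PER_VECTOR[index_type] * num_vectors / 2**30


# Illustrative dataset sizes, one per table row.
examples = [
    ("HNSW (FP32)", 1_000_000),
    ("HNSW + SQ8", 10_000_000),
    ("HNSW + PQ", 50_000_000),
    ("IVF-PQ", 100_000_000),
]
for name, n in examples:
    print(f"{name:12s} {n:>11,d} vectors -> {total_gib(name, n):.2f} GiB")
```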
How It Works
Vectors are indexed in the HNSW graph at full precision. During search, the engine traverses quantized copies for speed, then re-ranks the top candidates against the full-precision vectors. When a query includes metadata filters, the engine builds a Roaring Bitmap of matching IDs and picks a strategy from the filter's selectivity: pre-filtering for selective filters, post-filtering for broad ones, or a brute-force scan.
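The two-phase search described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the engine's code: a linear scan over quantized copies stands in for HNSW graph traversal, the SQ8 encoding uses a fixed value range of [-1, 1], and the names `sq8_encode`, `sq8_decode`, `search`, and the `oversample` parameter are all hypothetical.

```python
import math


def sq8_encode(vec, lo=-1.0, hi=1.0):
    """Map each float in [lo, hi] to a one-byte code in 0..255."""
    scale = (hi - lo) / 255.0
    return bytes(min(255, max(0, round((x - lo) / scale))) for x in vec)


def sq8_decode(codes, lo=-1.0, hi=1.0):
    """Approximate the original floats from their one-byte codes."""
    scale = (hi - lo) / 255.0
    return [lo + c * scale for c in codes]


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, vectors, k, oversample=3):
    # Quantized copies are built per call for brevity; an index would store them.
    quantized = {vid: sq8_encode(v) for vid, v in vectors.items()}

    # Phase 1: rank by approximate distance on quantized copies.
    # (A linear scan stands in for HNSW traversal.)
    shortlist = sorted(
        quantized, key=lambda vid: l2(query, sq8_decode(quantized[vid]))
    )[: k * oversample]

    # Phase 2: re-rank the shortlist against full-precision vectors.
    return sorted(shortlist, key=lambda vid: l2(query, vectors[vid]))[:k]


vectors = {
    "a": [0.10, 0.30, -0.20],
    "b": [0.90, -0.50, 0.40],
    "c": [0.12, 0.28, -0.18],
    "d": [-0.70, 0.60, 0.10],
}
print(search([0.1, 0.3, -0.2], vectors, k=2))
```

Fetching more candidates than `k` in the first phase gives the exact re-ranking step room to correct ordering errors introduced by quantization.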