High-performance Rust data structures and compression algorithms with memory safety guarantees.
- High Performance: Zero-copy operations, SIMD optimizations (AVX2, AVX-512), cache-friendly layouts, SIMD cursor primitives for Block-Max WAND
- Memory Safety: 99.8% unsafe block documentation coverage, all production unsafe blocks annotated with
// SAFETY:comments - Secure Memory Management: Production-ready memory pools with thread safety and RAII
- Blob Storage: 8 specialized stores with trie-based indexing and compression
- Succinct Data Structures: 12 rank/select variants, Rank9 (Vigna 2008), Elias-Fano / Partitioned / DP-Optimal Partitioned Elias-Fano with cursor
advance_to_index, HybridPostingList (auto-select encoding), AMD-safe PDEP withhas_fast_bmi2 - BM25 Scoring: FieldnormEncoder (Lucene SmallFloat, 1-byte fieldnorms) + Bm25BatchScorer (AVX2 SIMD batch, prefetch)
- Specialized Containers: 13+ containers (VecTrbSet/Map, MinimalSso, SortedUintVec, LruMap, etc.)
- Hash Maps: Golden ratio optimized, string-optimized, cache-optimized implementations
- Advanced Tries: Double-Array (DoubleArrayTrie, XOR transitions), LOUDS, Critical-Bit (BMI2), Patricia tries with rank/select, NestTrieDawg, lazy prefix/fuzzy iterators
- Compression: PA-Zip, Huffman O0/O1/O2, FSE, rANS, ZSTD integration
- C FFI Support: Complete C API (
--features ffi)
[dependencies]
zipora = "3.1.4"
# With C FFI bindings
zipora = { version = "3.1.4", features = ["ffi"] }
# AVX-512 (nightly only)
zipora = { version = "3.1.4", features = ["avx512"] }use zipora::*;
// High-performance vector
let mut vec = FastVec::new();
vec.push(42).unwrap();
// Zero-copy strings with SIMD hashing
let s = FastStr::from_string("hello world");
println!("Hash: {:x}", s.hash_fast());
// Intelligent rank/select with automatic optimization
let mut bv = BitVector::new();
for i in 0..1000 { bv.push(i % 7 == 0).unwrap(); }
let adaptive_rs = AdaptiveRankSelect::new(bv).unwrap();
let rank = adaptive_rs.rank1(500);
// Unified Trie - Strategy-based configuration
use zipora::fsa::{ZiporaTrie, ZiporaTrieConfig, Trie};
let mut trie = ZiporaTrie::new();
trie.insert(b"hello").unwrap();
assert!(trie.contains(b"hello"));
// Unified Hash Map - Strategy-based configuration
use zipora::hash_map::{ZiporaHashMap, ZiporaHashMapConfig};
let mut map = ZiporaHashMap::new();
map.insert("key", "value").unwrap();
// Blob storage with compression
let config = ZipOffsetBlobStoreConfig::performance_optimized();
let mut builder = ZipOffsetBlobStoreBuilder::with_config(config).unwrap();
builder.add_record(b"Compressed data").unwrap();
let store = builder.finish().unwrap();
// Entropy coding
let encoder = HuffmanEncoder::new(b"sample data").unwrap();
let compressed = encoder.encode(b"sample data").unwrap();
// String utilities
use zipora::string::{join_str, hex_encode, hex_decode, words, decimal_strcmp};
let joined = join_str(", ", &["hello", "world"]);
assert_eq!(joined, "hello, world");- Containers - Specialized containers (FastVec, ValVec32, IntVec, LruMap, etc.)
- Hash Maps - ZiporaHashMap, GoldHashMap with strategy-based configuration
- Blob Storage - 8 blob store variants with trie indexing and compression
- Memory Management - SecureMemoryPool, MmapVec, five-level pools
- Algorithms - Radix sort, suffix arrays, set operations, cache-oblivious algorithms, SIMD popcount, SIMD galloping search, SIMD block filter
- Compression - PA-Zip, Huffman, FSE, rANS, real-time compression
- String Processing - SIMD string operations, pattern matching
- Concurrency - Pipeline processing, work-stealing, parallel trie building
- Error Handling - Error classification, automatic recovery strategies
- Configuration - Rich configuration APIs, presets, validation
- SIMD Framework - 6-tier SIMD with AVX2/BMI2/POPCNT support
- I/O & Serialization - Stream processing, endian handling, varint encoding
- C FFI - C API for interoperability
- Search Engine Guide - End-to-end search engine architecture with Zipora
- Performance Benchmarks - Verified benchmarks across all components
- Porting Status - Feature implementation status
| Feature | Default | Description |
|---|---|---|
simd |
Yes | SIMD optimizations (AVX2, SSE4.2) |
mmap |
Yes | Memory-mapped file support |
zstd |
Yes | ZSTD compression |
serde |
Yes | Serialization support (serde, serde_json, bincode) |
lz4 |
Yes | LZ4 compression |
async |
Yes | Async runtime (tokio) for concurrency, pipeline, real-time compression |
ffi |
No | C FFI bindings |
avx512 |
No | AVX-512 (nightly only) |
nightly |
No | Nightly-only optimizations |
# Build (default features)
cargo build --release
# Build with all features including FFI
cargo build --release --all-features
# Test
cargo test --lib
# Sanity check (all feature combinations, debug + release)
make sanity
# Benchmark (release only)
cargo bench
# Lint
cargo clippy --all-targets --all-features -- -D warningsSee Performance Benchmarks for detailed results across all components (Trie, BitVector, popcount, rank/select, containers, entropy coding, LRU cache, BM25 scoring).
Highlights: DoubleArrayTrie 20.6 ns/lookup, SIMD popcount 5.2 Gwords/s, bulk bitwise 41x faster, BM25 SIMD 13.5x faster, LRU hot-get 26x faster.
Minimal dependency footprint by design:
- Core:
bytemuck,thiserror,log,ahash,rayon,libc,once_cell,raw-cpuid - Default:
memmap2(mmap),zstd,lz4_flex,serde/serde_json/bincode,tokio(async) - Optional:
cbindgen(ffi) - Removed:
crossbeam-utils,parking_lot,uuid,num_cpus,async-trait,futures(all replaced with std or eliminated)
See Search Engine Guide for the complete guide with code examples covering all 11 components: term dictionaries (DoubleArrayTrie + lazy prefix/fuzzy iterators), posting lists (HybridPostingList + Elias-Fano cursors), SIMD query primitives (simd_gallop_to, simd_block_filter, advance_to_index), BM25 scoring (FieldnormEncoder + Bm25BatchScorer), document storage, compression, multi-threaded indexing, and component selection guide.
Business Source License 1.0 - See LICENSE for details.