| title | Performance Optimization Guide | |
|---|---|---|
| status | CURRENT | |
| last_updated | 2026-02-02 | |
| version | 1.0 | |
| related |
|
Baseline (v1.0):
- Event ingestion: 469,000 events/sec
- Query latency (p99): 11.9μs
- Concurrent writes (8 threads): 7.98ms
Target (v1.2 - Phase 1.5):
- Event ingestion: 1M+ events/sec (+113%)
- Query latency (p99): <5μs (-58%)
- Concurrent writes (8 threads): <4ms (-50%)
DashMap Instead of Mutex:
// ✅ CURRENT: Lock-free with internal sharding
use dashmap::DashMap;
pub struct EventIndex {
entity_index: Arc<DashMap<String, Vec<IndexEntry>>>,
}
impl EventIndex {
pub fn index_event(&self, ...) {
self.entity_index
.entry(entity_id.to_string())
.or_insert_with(Vec::new)
.push(entry); // No locks!
}
}Impact: 3x faster concurrent writes
Public Fields Instead of Getters:
// ✅ CURRENT: Direct field access
pub struct Event {
pub id: Uuid, // Direct access: ~1ns
pub event_type: String,
// ...
}
let id = event.id; // Zero overheadImpact: 10x faster field access (10ns → 1ns)
Separate Fast/Validated Constructors:
// Fast path (no validation)
pub fn new(...) -> Self {
Self { id: Uuid::new_v4(), ... } // ~50ns
}
// Validated path (when needed)
pub fn new_validated(...) -> Result<Self> {
Self::validate_event_type(&event_type)?; // ~100ns
// ...
}Impact: 2x faster event construction
// Zero-copy deserialization with SIMD
use simd_json;
let event: Event = simd_json::from_slice(&mut bytes)?;Target: +40% deserialization speed
// Concurrent async operations
stream::iter(events)
.map(|event| storage.write(event))
.buffered(100) // 100 concurrent writes
.try_collect::<()>()
.await?;Target: +700% throughput
// Already implemented in use cases
pub async fn execute(&self, requests: Vec<IngestEventRequest>) -> Result<...> {
let events = requests.into_iter().map(Event::from).collect();
self.repository.save_batch(&events).await?;
}Target: +1300% for large batches
# Single benchmark
cargo bench --bench performance_benchmarks -- ingestion_throughput/1000
# All benchmarks
cargo bench --bench performance_benchmarksingestion_throughput: Events/secondquery_performance: Query latencyconcurrent_writes: Multi-threaded write performance
- Lock-free data structures (DashMap)
- Zero-cost field access (public fields)
- No validation in hot path
- Batch processing support in use cases
- simd-json integration
- Async I/O batching
- SIMD for filtering operations
Detailed benchmarks: Run cargo bench and see /target/criterion/report/index.html
Implementation guide: See Phase 1.5 Progress