title: Performance Optimization Guide
status: CURRENT
last_updated: 2026-02-02
version: 1.0
related: ./CLEAN_ARCHITECTURE.md

Performance Optimization Guide

Current Performance

Baseline (v1.0):

  • Event ingestion: 469,000 events/sec
  • Query latency (p99): 11.9μs
  • Concurrent writes (8 threads): 7.98ms

Target (v1.2 - Phase 1.5):

  • Event ingestion: 1M+ events/sec (+113%)
  • Query latency (p99): <5μs (-58%)
  • Concurrent writes (8 threads): <4ms (-50%)
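
The percentages in the target list are relative changes against the v1.0 baseline. A small helper makes the arithmetic explicit; the name percent_change is illustrative only and does not come from the codebase:

```rust
/// Relative change from `baseline` to `target`, in percent.
/// Positive means an increase, negative means a reduction.
/// `percent_change` is a hypothetical helper used only for this illustration.
pub fn percent_change(baseline: f64, target: f64) -> f64 {
    (target - baseline) / baseline * 100.0
}
```

With the figures above, percent_change(469_000.0, 1_000_000.0) is roughly +113, percent_change(11.9, 5.0) is roughly -58, and percent_change(7.98, 4.0) is roughly -50.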

Key Optimizations

1. Low-Contention Concurrent Map (✅ IMPLEMENTED)

DashMap Instead of Mutex:

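// ✅ CURRENT: DashMap shards the map internally; each shard has its own lock
use std::sync::Arc;

use dashmap::DashMap;

pub struct EventIndex {
    entity_index: Arc<DashMap<String, Vec<IndexEntry>>>,
}

impl EventIndex {
    // Parameter list abbreviated; `entity_id` and `entry` are among the arguments
    pub fn index_event(&self, ...) {
        self.entity_index
            .entry(entity_id.to_string())
            .or_insert_with(Vec::new)
            .push(entry);  // Locks only the shard that owns this key, not the whole map
    }
}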

Impact: 3x faster concurrent writes
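
DashMap partitions its map into shards, each guarded by its own lock, so writers touching different shards do not block each other. The std-only sketch below illustrates that idea. It is not DashMap's actual implementation: ShardedIndex, shard_for, and offsets_for are hypothetical names, and a plain u64 offset stands in for IndexEntry.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// Hypothetical illustration type; a `u64` offset stands in for `IndexEntry`.
pub struct ShardedIndex {
    shards: Vec<RwLock<HashMap<String, Vec<u64>>>>,
}

impl ShardedIndex {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks the shard that owns `key` by hashing it.
    fn shard_for(&self, key: &str) -> &RwLock<HashMap<String, Vec<u64>>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % self.shards.len();
        &self.shards[idx]
    }

    /// Takes a write lock on one shard only; other shards stay available.
    pub fn index_event(&self, entity_id: &str, offset: u64) {
        let mut shard = self.shard_for(entity_id).write().unwrap();
        shard.entry(entity_id.to_string()).or_default().push(offset);
    }

    /// Returns a copy of the offsets recorded for `entity_id`.
    pub fn offsets_for(&self, entity_id: &str) -> Vec<u64> {
        let shard = self.shard_for(entity_id).read().unwrap();
        shard.get(entity_id).cloned().unwrap_or_default()
    }
}
```

With a single Mutex around one HashMap, every writer would queue on the same lock; here two writers contend only when their keys hash to the same shard.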

2. Zero-Cost Field Access (✅ IMPLEMENTED)

Public Fields Instead of Getters:

// ✅ CURRENT: Direct field access
pub struct Event {
    pub id: Uuid,           // Direct access: ~1ns
    pub event_type: String,
    // ...
}

let id = event.id;  // Zero overhead

Impact: ~10x faster field access than the getter-based version (10ns → 1ns)

3. No Validation in Hot Path (✅ IMPLEMENTED)

Separate Fast/Validated Constructors:

// Fast path (no validation)
pub fn new(...) -> Self {
    Self { id: Uuid::new_v4(), ... }  // ~50ns
}

// Validated path (when needed)
pub fn new_validated(...) -> Result<Self> {
    Self::validate_event_type(&event_type)?;  // ~100ns
    // ...
}

Impact: 2x faster event construction
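
A self-contained sketch of the fast/validated constructor split follows. Several details are assumptions made for illustration: a u64 id stands in for Uuid so the example needs only std, EventError is a hypothetical error type, and the validation rule (non-empty, lowercase ASCII letters plus '.' and '_') is invented rather than taken from the codebase.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum EventError {
    EmptyEventType,
    InvalidEventType(String),
}

#[derive(Debug)]
pub struct Event {
    pub id: u64, // stand-in for Uuid to keep the sketch std-only
    pub event_type: String,
}

impl Event {
    /// Fast path: trusts the caller and performs no checks.
    pub fn new(id: u64, event_type: String) -> Self {
        Self { id, event_type }
    }

    /// Validated path: checks the input first, then reuses the fast path.
    pub fn new_validated(id: u64, event_type: String) -> Result<Self, EventError> {
        Self::validate_event_type(&event_type)?;
        Ok(Self::new(id, event_type))
    }

    /// Assumed rule: non-empty, lowercase ASCII letters, '.' or '_'.
    fn validate_event_type(event_type: &str) -> Result<(), EventError> {
        if event_type.is_empty() {
            return Err(EventError::EmptyEventType);
        }
        let valid = event_type
            .chars()
            .all(|c| c.is_ascii_lowercase() || c == '.' || c == '_');
        if !valid {
            return Err(EventError::InvalidEventType(event_type.to_string()));
        }
        Ok(())
    }
}
```

The trade-off is that the fast path can construct an event the validated path would reject, so it is best reserved for inputs that were already checked at a system boundary.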

4. Planned Optimizations

simd-json (⏳ PLANNED)

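// SIMD-accelerated deserialization (uses simd-json's serde support)
// `bytes` must be a mutable buffer such as Vec<u8>: simd-json parses in place and overwrites it
let event: Event = simd_json::from_slice(&mut bytes)?;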

Target: +40% deserialization speed

Async I/O Batching (⏳ PLANNED)

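// Concurrent async operations
// Assumes storage.write returns a future resolving to Result<(), E>
use futures::stream::{self, StreamExt, TryStreamExt};

stream::iter(events)
    .map(|event| storage.write(event))
    .buffered(100)  // Up to 100 writes in flight; results are yielded in input order
    .try_collect::<()>()
    .await?;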

Target: +700% throughput

Batch Processing (✅ IMPLEMENTED, further optimization planned)

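// Already implemented in use cases
pub async fn execute(&self, requests: Vec<IngestEventRequest>) -> Result<...> {
    // The type annotation tells `collect` which collection to build
    let events: Vec<Event> = requests.into_iter().map(Event::from).collect();
    self.repository.save_batch(&events).await?;
    // ... build and return the Ok(...) response
}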

Target: +1300% for large batches
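
Batching helps because the fixed cost of each repository call is paid once per batch instead of once per event. The sketch below shows one way to split a large input into fixed-size batches. It is synchronous and in-memory to stay std-only; InMemoryRepository, save_in_chunks, and the String event type are hypothetical stand-ins, not the project's repository API.

```rust
/// Hypothetical in-memory repository that counts write calls.
pub struct InMemoryRepository {
    pub stored: Vec<String>,
    pub write_calls: usize,
}

impl InMemoryRepository {
    pub fn new() -> Self {
        Self { stored: Vec::new(), write_calls: 0 }
    }

    /// One call stores the whole slice, so per-call overhead is paid once.
    pub fn save_batch(&mut self, events: &[String]) {
        self.write_calls += 1;
        self.stored.extend_from_slice(events);
    }
}

/// Writes `events` in batches of at most `chunk_size` items.
pub fn save_in_chunks(repo: &mut InMemoryRepository, events: &[String], chunk_size: usize) {
    assert!(chunk_size > 0, "chunk_size must be positive");
    for chunk in events.chunks(chunk_size) {
        repo.save_batch(chunk);
    }
}
```

Writing 1,000 events with a chunk size of 256 results in 4 repository calls rather than 1,000. The right chunk size depends on the storage backend and should be found by benchmarking.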


Performance Testing

Run Benchmarks

# Single benchmark
cargo bench --bench performance_benchmarks -- ingestion_throughput/1000

# All benchmarks
cargo bench --bench performance_benchmarks

Key Metrics

  • ingestion_throughput: Events/second
  • query_performance: Query latency
  • concurrent_writes: Multi-threaded write performance
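
For reference, events/second is just the number of events divided by the elapsed wall-clock time. A minimal sketch of that conversion, with events_per_second as a hypothetical helper name that is not part of the benchmark suite:

```rust
use std::time::Duration;

/// Events per second for `events` processed in `elapsed`.
/// Returns `None` when no time has elapsed, to avoid dividing by zero.
pub fn events_per_second(events: u64, elapsed: Duration) -> Option<f64> {
    if elapsed.is_zero() {
        return None;
    }
    Some(events as f64 / elapsed.as_secs_f64())
}
```

For example, 1,000 events ingested in 2 ms corresponds to about 500,000 events/second.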

Optimization Checklist

  • ✅ Low-contention concurrent map (DashMap)
  • ✅ Zero-cost field access (public fields)
  • ✅ No validation in hot path
  • ✅ Batch processing support in use cases
  • ⏳ simd-json integration
  • ⏳ Async I/O batching
  • ⏳ SIMD for filtering operations

Detailed benchmarks: Run cargo bench, then open target/criterion/report/index.html

Implementation guide: See Phase 1.5 Progress