I am a performance-obsessed software engineer with 9+ years of experience building high-scale transactional systems. Currently architecting microservices at Tata Consultancy Services and diving deep into Systems Programming with Go.
I specialize in:
- High-Throughput Systems: Optimizing SQL and concurrency models to handle 1M+ daily events.
- Distributed Architecture: Designing fault-tolerant services using microservices, Kafka, and GCP.
- Low-Level Engineering: Building standard library-only tools to understand OS internals.
| Core | Infrastructure | Data & Streaming | Observability |
|---|---|---|---|
| Go, SQL, Concurrency | Docker, GCP, Cloud Run | Kafka, PostgreSQL, Redis Stack | Load testing with `hey` |
Microservices, Event-Driven Architecture, & Performance Tuning
A cloud-native ecosystem designed to decouple business logic into independent services (Session, Trend, Users).
- The Architecture: Guarded by a custom Reverse Proxy Gateway built with Go's `net/http` and `httputil`.
- The Challenge: GORM reflection was causing high latency on write-heavy paths.
- The Fix: Migrated to Raw SQL, reducing P99 latency from ~100ms to 42ms.
- Key Tech: Go, Docker, PostgreSQL, Cloud Run, `sync.Mutex` Rate Limiter.
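The gateway pattern above can be sketched with nothing but the standard library: `httputil.NewSingleHostReverseProxy` for the proxying, plus a `sync.Mutex`-guarded token bucket for rate limiting. This is an illustrative sketch, not the project's actual code; in a real deployment a background goroutine would periodically refill the bucket, and the target would be the Session/Trend/Users services rather than a throwaway test server.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"sync"
)

// tokenBucket is a deliberately simple sync.Mutex-guarded limiter
// (illustrative only -- a real one would refill on a ticker).
type tokenBucket struct {
	mu     sync.Mutex
	tokens int
}

func (b *tokenBucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.tokens == 0 {
		return false
	}
	b.tokens--
	return true
}

// newGateway guards a single-host reverse proxy with the limiter.
func newGateway(target *url.URL, bucket *tokenBucket) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(target)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !bucket.allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		proxy.ServeHTTP(w, r)
	})
}

func main() {
	// Throwaway backend so the example is self-contained; a real gateway
	// would point at the downstream microservices instead.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer backend.Close()

	target, err := url.Parse(backend.URL)
	if err != nil {
		log.Fatal(err)
	}
	gateway := httptest.NewServer(newGateway(target, &tokenBucket{tokens: 1}))
	defer gateway.Close()

	for i := 0; i < 2; i++ {
		resp, err := http.Get(gateway.URL)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(resp.StatusCode) // 200 on the first hit, 429 once the bucket is empty
		resp.Body.Close()
	}
}
```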
AI Infrastructure, Vector Search, & Latency Optimization
A cost-control firewall designed to intercept LLM traffic and serve responses from memory using vector similarity.
- The Architecture: A "Dual-Path" retrieval engine that checks for Exact Matches (O(1)) and Semantic Matches (HNSW Index) before calling OpenAI.
- The Challenge: Production LLM queries were averaging ~3,000ms latency and incurring high token costs for repetitive questions.
- The Fix: Implemented Redis Vector Search with Cosine Similarity, reducing P99 latency to <50ms (a ~60x speedup) for cached hits.
- Key Tech: Go (Goroutines), Redis Stack (RediSearch), OpenAI Embeddings, Docker.
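The dual-path idea can be sketched in plain Go. This hypothetical version swaps RediSearch's HNSW index for a brute-force cosine scan over an in-memory slice (fine for a sketch, not at production scale) and uses toy 3-dimensional vectors in place of OpenAI embeddings.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

type entry struct {
	prompt    string
	embedding []float64
	response  string
}

type cache struct {
	exact     map[string]string // path 1: O(1) exact-match lookup
	entries   []entry           // path 2: semantic scan (HNSW in the real system)
	threshold float64           // minimum similarity to count as a hit
}

// lookup tries the exact path first, then falls back to the semantic path;
// only on a double miss would the caller go to OpenAI.
func (c *cache) lookup(prompt string, embedding []float64) (string, bool) {
	if resp, ok := c.exact[prompt]; ok {
		return resp, true
	}
	best, bestScore := "", c.threshold
	for _, e := range c.entries {
		if s := cosine(embedding, e.embedding); s >= bestScore {
			best, bestScore = e.response, s
		}
	}
	return best, best != ""
}

func main() {
	c := &cache{
		exact:     map[string]string{"what is go?": "Go is a statically typed language."},
		entries:   []entry{{"what is go?", []float64{1, 0, 0}, "Go is a statically typed language."}},
		threshold: 0.95,
	}
	// Path 1: exact match, no embedding needed.
	resp, hit := c.lookup("what is go?", nil)
	fmt.Println(hit, resp)
	// Path 2: a paraphrase whose (toy) embedding lands near the cached one.
	resp, hit = c.lookup("tell me about golang", []float64{0.99, 0.1, 0})
	fmt.Println(hit, resp)
}
```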
Systems Programming, OS Internals, & Kernel Interfaces
A POSIX-compliant shell built from scratch to master low-level system calls (`fork`, `exec`, `wait`).
- Key Features: Implements non-blocking pipelines (`|`) and I/O redirection (`>`, `>>`) by manually manipulating File Descriptors.
- Why I built it: To understand the abstraction layer between the Go runtime and the Linux Kernel.
Results from `productivity-planner` load testing (via `hey`)
| Metric | Result | Context |
|---|---|---|
| Throughput | 2,150 RPS | Single Cloud Run instance (2 vCPU) |
| P99 Latency | 42 ms | Optimized Raw SQL Write Path |
| Error Rate | 0.00% | Sustained load over 5m duration |


