ML engineer and technical lead with 10+ years building production ML systems at scale. Deployed LLMs, semantic search, and foundation models serving 50M+ monthly queries and driving $100M+ in measurable revenue. Deep hands-on experience across the full ML lifecycle: model training, fine-tuning, optimization, serving, and monitoring.
- Foundation model: Domain-specific model encoding behavioral sequences for prediction and personalization applications across a $150B+ retail ecosystem
- Semantic search at scale: End-to-end system serving 50M+ monthly queries at P99 latency under 200 ms; +5% conversion lift over heuristic baselines, $108M+ in attributable revenue
- FoodieLlama: Fine-tuned and quantized small language model (SLM) for multi-intent query understanding, with safety evaluation, red-teaming, and deployment via Triton + vLLM (OpenAI-compatible API)
- RAG pipeline: LLM-driven generation system using Azure OpenAI for structured output from natural language prompts
- Neural taxonomy classifier: Hybrid LSTM/CNN system classifying into 500K+ categories, deployed via TensorFlow Serving
LLMs & NLP: Fine-tuning, PEFT/LoRA, RAG systems, semantic search, embedding models, Transformers, vLLM, Triton Inference Server
ML Systems: PyTorch, TensorFlow, DeepSpeed, distributed training, model serving at scale, A/B testing, pipeline automation
Infrastructure: Python, SQL, Docker, AWS, REST/gRPC, Elasticsearch, Redis, Git, CI/CD


