8+ years of software engineering: 5 years building backends and full-stack systems in Python, Java, and Node.js, then a hard pivot into AI/ML. Once I saw what LLMs could do, I went all in, no looking back.
Now 3 years deep in production LLM systems. I build multi-agent pipelines that retrieve with precision, evaluate themselves, fail gracefully, and don't bankrupt the company on API calls. If a system can't show you why it gave that answer, it shouldn't be in production.
- 13K+ users on SciWeave — multi-agent RAG across 250M+ papers, handling 10K+ monthly queries with cited answers in <6 seconds
- 10x cost reduction ($90 → $9/month) via hybrid DeBERTa + LLM classification across 275 intent labels, semantic caching & tiered routing
- 60% latency reduction on multi-agent pipelines with parallel execution, 5-layer caching & dual-provider failover
🔗 SciWeave · 🔍 RepoScout · 📫 [email protected] · LinkedIn
🔍 RepoScout — AI-Powered Open Source Intelligence Engine
5-stage agentic pipeline across 85K+ Python packages · hybrid Mistral + OpenAI model selection tuned per stage · autonomous tool calling with up to 8 reasoning iterations · 85K+ semantic embeddings on Qdrant Cloud · 2.1M+ dependency signals in Supabase · SSE streaming with conversational follow-ups.
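A minimal sketch of the SSE streaming shape described above. This is not RepoScout's actual code; the generator, event names, and payloads are illustrative, and a real endpoint (e.g. a FastAPI `StreamingResponse` with `media_type="text/event-stream"`) would return this generator directly.

```python
import json
from typing import Iterator


def sse_format(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame: a named event plus a JSON payload."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_answer(tokens: Iterator[str]) -> Iterator[str]:
    """Yield SSE frames as an agent produces tokens, then a final 'done' frame."""
    for tok in tokens:
        yield sse_format("token", {"text": tok})
    yield sse_format("done", {"finished": True})


# Example: three tokens become three 'token' frames plus one 'done' frame.
frames = list(stream_answer(iter(["Hel", "lo", "!"])))
print(frames[0])    # first frame: event 'token' carrying {"text": "Hel"}
print(len(frames))  # 4
```

Framing each chunk as a named event lets the client distinguish incremental text from the terminal signal, which is what makes conversation follow-ups on the same stream tractable.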
- Advanced RAG (Self-RAG, Hierarchical, Adaptive) with 13+ DSPy modules across a 4-phase parallel pipeline
- Multi-agent orchestration with dual-provider failover — 60% latency reduction, 30% fewer LLM calls
- NL-to-SQL pipelines with hallucination guardrails
- Hybrid retrieval: BM25 + dense embeddings + cross-encoder reranking
- Qdrant, FAISS, Chroma, Pinecone, Elasticsearch, Supabase, PostgreSQL
- Multimodal document systems: layout analysis, figure extraction, table parsing, vision models
- 4-tier query routing — 15% retrieval precision improvement
- Hybrid DeBERTa + LLM classification for 275 intent labels (83% accuracy at 95% confidence)
- 5-layer caching, semantic caching, tiered routing
- 10x cost reduction ($90 → $9/month)
- RAG evaluation on QASA benchmark using RAGAS & LLM-as-judge — 6.3% context recall gain, 0% faithfulness loss
- Analyzed 40K+ queries across personas to drive complexity-aware routing
- Tool-use flows, function calling, structured outputs, custom MCP servers
- MCP servers for K8s tunneling, SQL safety, resource lifecycle management
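One common way to combine the sparse and dense rankings from the hybrid-retrieval bullet above, before a cross-encoder reranks the survivors, is Reciprocal Rank Fusion. A minimal sketch; RRF is an illustrative fusion choice here, not necessarily what these systems use:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked doc-id lists (best first) with Reciprocal Rank Fusion.

    rankings: e.g. one list from BM25, one from a dense-embedding search.
    k=60 is the conventional RRF constant; each doc scores sum(1 / (k + rank)).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Example: BM25 and dense search disagree; fusion rewards agreement.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused[0])  # doc_a: ranked 1st by BM25 and 2nd by dense, so it fuses highest
```

Rank-based fusion sidesteps the problem that BM25 scores and cosine similarities live on incomparable scales, and the fused top-N is a natural candidate set to hand a cross-encoder.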
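The semantic-caching line above refers to caching LLM answers keyed by embedding similarity rather than exact query strings, so paraphrases hit the cache. A toy sketch under that assumption; the similarity threshold and the word-count "embedding" are illustrative only, and a real system would use a sentence-embedding model and a vector index:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    """Cache answers by query embedding; a hit needs similarity >= threshold."""

    def __init__(self, embed_fn, threshold=0.9):
        self.embed_fn = embed_fn
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, query):
        q = self.embed_fn(query)
        best, best_sim = None, 0.0
        for emb, answer in self.entries:
            sim = cosine(q, emb)
            if sim >= self.threshold and sim > best_sim:
                best, best_sim = answer, sim
        return best  # None on a miss

    def put(self, query, answer):
        self.entries.append((self.embed_fn(query), answer))


# Illustrative embedding: word counts over a tiny vocabulary.
VOCAB = ["what", "is", "rag", "cost", "latency"]

def toy_embed(text):
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


cache = SemanticCache(toy_embed, threshold=0.8)
cache.put("What is RAG", "Retrieval-augmented generation grounds answers in retrieved context.")
print(cache.get("what is rag") is not None)  # True: the paraphrase hits the cache
```

Every cache hit is an LLM call avoided, which is where a layer like this contributes to the cost reductions claimed above.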
- 8+ years across monoliths and microservices
- REST, GraphQL, async pipelines, distributed workflows
- AWS (Lambda, S3, SQS, DynamoDB, SageMaker, Bedrock), Docker, Kubernetes
- MLflow, Terraform, Vercel
- React, Next.js, TypeScript, shadcn/ui
- Full-stack AI app (RepoScout) built end-to-end
- 🤖 Multi-agent systems — orchestration patterns, handoff protocols, memory architectures
- 🔧 Tool-use & function calling — making agents actually do things reliably
- 🕸️ Graph RAG & knowledge graphs — structured reasoning over unstructured data
- 🧩 Claude Agent SDK & MCP — building with the next generation of agent infrastructure
- 🎯 Google Certified TensorFlow Developer — scored 100%
- 🎓 President's Honor List — Post Graduate Certificate, Seneca College, Toronto
- 🥇 Gold Medalist — B.E. Computer Engineering, VNSGU, India
Building AI systems that work in production, not just in notebooks.



