|
Backend engineer and AI infrastructure enthusiast focused on building highly scalable and intelligent systems. I specialize in Cloud Native Architectures, ML Engineering, and MLOps to drive robust and efficient AI-powered applications.
-
Programming Languages: Python, Go, Java, and more.
-
Cloud-Native & Distributed Systems: Kubernetes (CRI, CNI, CSI, Scheduler, Operator), Service Mesh (Istio, Linkerd), Containers (cgroups, namespaces, UnionFS), Ray, Spark, Distributed System Design, etc.
-
AI/ML Engineering & Platforms: RecSys, RAG, Text2SQL, NLP, MLOps, PyTorch, DeepSpeed, Triton.
-
End-to-End ML Platform Engineering: Architecting and building production-grade, Kubernetes-native ML platforms that integrate cloud-native infrastructure (service mesh, observability, security) with complete ML lifecycle automation, from distributed data processing and elastic training to model versioning and intelligent deployment strategies.
-
AI Application Development: Building and optimizing high-performance applications for search, retrieval, and recommendation, with production implementations of RAG, Text2SQL, and hybrid search.
-
LLM Inference & Performance Tuning: Digging into runtime optimization techniques (PagedAttention, FlashAttention, 3D parallelism, etc.) and advanced attention mechanisms to maximize throughput and reduce latency.
-
Advanced Retrieval & Recommendation: Advancing the application of RAG, generative recommendation systems, and Text2SQL to solve real-world problems.
-
AI Agent Architectures: Designing and developing autonomous agents and multi-agent systems for complex, multi-step task automation.
|