anviit/anviit

Anvit More - ML Systems · GPU Kernels · Applied RL

ML engineer focused on the systems layer: how models run fast, how decisions get made under uncertainty, how inference holds up in production.


What I build

GPU kernel engineering — custom Triton kernels for LLM primitives, benchmarked against PyTorch baselines on real hardware.

| Kernel | Speedup | Peak throughput |
| --- | --- | --- |
| Fused Bias + GELU | 14.65× | 172 GB/s |
| FlashAttention (T=2048) | 2.52× | 11.4 GB/s |
| Fused AdamW (50M params) | 3.45× | 177 GB/s |
| Inference attention (B=2) | 3.94× | 95 GB/s |
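
The README does not say how the throughput column is derived. A common convention for memory-bound kernels is effective bandwidth: the bytes the kernel must read and write, divided by the measured time. The sketch below assumes that convention; the helper names and the tensor shape are illustrative and not taken from the repository.

```python
def effective_bandwidth_gbps(bytes_moved: int, elapsed_ms: float) -> float:
    """Effective bandwidth in GB/s (1 GB = 1e9 bytes) for one kernel launch."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return bytes_moved / (elapsed_ms * 1e-3) / 1e9


def bias_gelu_bytes(rows: int, cols: int, bytes_per_element: int = 2) -> int:
    """Minimum traffic for a fused bias + GELU: read x, read bias, write y."""
    x_bytes = rows * cols * bytes_per_element
    bias_bytes = cols * bytes_per_element
    y_bytes = rows * cols * bytes_per_element
    return x_bytes + bias_bytes + y_bytes


def speedup(baseline_ms: float, kernel_ms: float) -> float:
    """How many times faster the custom kernel is than the baseline."""
    return baseline_ms / kernel_ms


# Hypothetical fp16 problem size and timing, for illustration only.
# In practice elapsed_ms would come from a timer such as
# triton.testing.do_bench, which reports milliseconds.
moved = bias_gelu_bytes(rows=4096, cols=4096)
print(f"{effective_bandwidth_gbps(moved, elapsed_ms=0.5):.1f} GB/s")
```

Under this convention, a fused kernel scores higher mainly because it avoids writing and re-reading the intermediate tensor between the bias add and the activation.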

triton-llm-kernels — RMSNorm, LayerNorm, FlashAttention, fused AdamW, inference attention. Every kernel validated against fp32 reference, benchmarked with triton.testing.do_bench.
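
Validating a low-precision kernel against an fp32 reference usually comes down to a tolerance check. The sketch below shows the pattern in plain Python so it runs without a GPU. `bias_gelu_fp16` is a hypothetical stand-in for a kernel, half precision is emulated with the `struct` module's `"e"` format, and the tolerances are assumptions rather than the repository's values. With PyTorch tensors, `torch.testing.assert_close` plays the role of `allclose` here.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def gelu_reference(x: float) -> float:
    """Exact (erf-based) GELU evaluated in full precision."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def bias_gelu_fp16(xs, bias):
    """Stand-in for a kernel: fp16 inputs, high-precision math, fp16 output."""
    return [
        to_fp16(gelu_reference(to_fp16(x) + to_fp16(b)))
        for x, b in zip(xs, bias)
    ]


def allclose(actual, expected, atol=1e-3, rtol=1e-3):
    """Element-wise check: |actual - expected| <= atol + rtol * |expected|."""
    return all(
        abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected)
    )


# Inputs on a grid that fp16 represents exactly, so only the output rounds.
xs = [i / 8 - 4 for i in range(65)]
bias = [0.5] * len(xs)
reference = [gelu_reference(x + b) for x, b in zip(xs, bias)]
print(allclose(bias_gelu_fp16(xs, bias), reference))
```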


Production LLM inference — async serving stack on a 6 GB GPU, built from first principles.

| Metric | Value |
| --- | --- |
| TTFT P50 | 28 ms |
| Decode speed | 39.4 tok/s (~85% of memory bandwidth) |
| Cache hit latency P50 | 2 ms |
| Cache hit rate | 81% |
| Success rate @ concurrency=10 | 100% |
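
A figure like "~85% of memory bandwidth" is typically read against a simple roofline: at batch size 1, each decoded token has to stream roughly all model weights from GPU memory once, so bandwidth divided by weight size gives a ceiling on tokens per second. The sketch below assumes that model and ignores KV-cache and activation traffic. The numbers in it are placeholders, not the model size or GPU bandwidth behind the table.

```python
def decode_ceiling_tok_s(weight_bytes: float, bandwidth_bytes_s: float) -> float:
    """Upper bound on batch-1 decode speed if each token re-reads all weights."""
    return bandwidth_bytes_s / weight_bytes


def bandwidth_utilization(
    measured_tok_s: float, weight_bytes: float, bandwidth_bytes_s: float
) -> float:
    """Fraction of the memory-bound ceiling that a measured speed reaches."""
    return measured_tok_s / decode_ceiling_tok_s(weight_bytes, bandwidth_bytes_s)


# Placeholder inputs: 4 GB of weights on a 200 GB/s device, 40 tok/s measured.
ceiling = decode_ceiling_tok_s(weight_bytes=4e9, bandwidth_bytes_s=200e9)
utilization = bandwidth_utilization(40.0, weight_bytes=4e9, bandwidth_bytes_s=200e9)
print(f"ceiling={ceiling:.1f} tok/s, utilization={utilization:.0%}")
```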

llm-inference-serving — FastAPI gateway → Redis cache → FP16 PyTorch → RTX 4050L. Fused Triton attention kernel, asyncio-locked GPU access, fire-and-forget cache writes.
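
The request path described above (cache lookup, lock-serialized GPU access, fire-and-forget cache write) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the repository's code: the `InferenceGateway` class is hypothetical, a dict stands in for Redis, and `generate_fn` stands in for the model call.

```python
import asyncio
import hashlib


class InferenceGateway:
    """Hypothetical gateway: cache lookup, locked GPU call, background write."""

    def __init__(self, generate_fn):
        self._generate_fn = generate_fn  # blocking stand-in for the model
        self._cache = {}                 # stand-in for Redis
        self._gpu_lock = asyncio.Lock()  # one generation on the GPU at a time
        self._pending = set()            # keep references to background tasks

    @staticmethod
    def _key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    async def _cache_set(self, key: str, value: str) -> None:
        await asyncio.sleep(0)  # simulated network hop to the cache
        self._cache[key] = value

    async def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # cache hit: the GPU is never touched
        async with self._gpu_lock:
            loop = asyncio.get_running_loop()
            # Run the blocking model call off the event loop thread.
            result = await loop.run_in_executor(None, self._generate_fn, prompt)
        # Fire-and-forget: respond now, let the cache write finish later.
        task = asyncio.create_task(self._cache_set(key, result))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)
        return result

    async def drain(self) -> None:
        """Wait for outstanding cache writes (useful at shutdown or in tests)."""
        if self._pending:
            await asyncio.gather(*self._pending)


async def main() -> None:
    gateway = InferenceGateway(lambda prompt: prompt.upper())
    print(await gateway.complete("hello"))  # miss: goes through the lock
    await gateway.drain()
    print(await gateway.complete("hello"))  # hit: served from the cache


asyncio.run(main())
```

Holding references to the background tasks matters: the event loop keeps only weak references to tasks, so an unreferenced fire-and-forget task can be garbage-collected before it finishes.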


Reinforcement learning for real-time decisions — physics-informed simulation + PPO agent for F1 race strategy. The same architecture applies to ADAS planning, EV energy management, and hybrid powertrain arbitration.

| Agent | E[Position] | E[Points] |
| --- | --- | --- |
| Rule-based baseline (1-stop M→H) | 3.09 | 15.8 |
| PPO agent | 1.00 | 25.0 |

+58% points vs baseline. Monte Carlo planner runs at 870 rollouts/second on a single CPU core.
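
Expected position and expected points are averages over simulated finishing positions, and the +58% figure is the relative gain in expected points. The sketch below assumes the standard F1 top-10 points scale with no fastest-lap bonus. The README does not state its scoring, though a mean position of 1.00 mapping to 25.0 points is consistent with that scale. The function names are illustrative.

```python
# Standard F1 points for positions 1..10 (assumed scoring, no bonus point).
F1_POINTS = (25, 18, 15, 12, 10, 8, 6, 4, 2, 1)


def points_for(position: int) -> int:
    """Points for a finishing position; zero outside the top ten."""
    if 1 <= position <= len(F1_POINTS):
        return F1_POINTS[position - 1]
    return 0


def summarize(positions):
    """Return (E[Position], E[Points]) over simulated finishing positions."""
    n = len(positions)
    expected_position = sum(positions) / n
    expected_points = sum(points_for(p) for p in positions) / n
    return expected_position, expected_points


def relative_gain(new: float, baseline: float) -> float:
    """Fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


print(summarize([1, 1, 1, 1]))
print(f"{relative_gain(25.0, 15.8):+.0%}")
```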

autonomous-strategy-engine — physics-informed tyre/fuel/weather models, 10k–100k MC rollouts, PPO on 8-dim sensor observation, 27 passing tests.
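
A Monte Carlo planner of this kind scores each candidate strategy by averaging many noisy simulated races, then picks the best average. The toy sketch below compares one-stop pit laps under a linear tyre-degradation model. Every parameter (lap time, degradation rate, pit loss, noise) is an invented placeholder, and the repository's tyre, fuel, and weather models are not reproduced here.

```python
import random


def simulate_race(pit_lap, rng, total_laps=50, base_lap=90.0,
                  deg_per_lap=0.08, pit_loss=22.0, noise=0.3):
    """Total race time in seconds for a single stop at the end of `pit_lap`."""
    total = 0.0
    tyre_age = 0
    for lap in range(1, total_laps + 1):
        # Lap time grows linearly with tyre age, plus Gaussian noise.
        total += base_lap + deg_per_lap * tyre_age + rng.gauss(0.0, noise)
        tyre_age += 1
        if lap == pit_lap:
            total += pit_loss  # time lost in the pit lane
            tyre_age = 0       # fresh tyres for the next stint
    return total


def plan_pit_lap(candidates, rollouts=200, seed=0):
    """Return (best pit lap, mean race time per candidate) from MC rollouts."""
    rng = random.Random(seed)
    mean_time = {}
    for pit_lap in candidates:
        times = [simulate_race(pit_lap, rng) for _ in range(rollouts)]
        mean_time[pit_lap] = sum(times) / rollouts
    best = min(mean_time, key=mean_time.get)
    return best, mean_time


best, means = plan_pit_lap(candidates=(15, 25, 35))
print(best, {lap: round(t, 1) for lap, t in means.items()})
```

With purely linear degradation and one tyre compound, splitting the race into two equal stints minimizes expected time. Averaging over rollouts becomes useful once the model includes effects with no closed form, such as weather changes or compound differences.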


Stack

Python · Triton · PyTorch · CUDA · FastAPI · Redis · Docker · Stable-Baselines3 · NumPy · scikit-learn

Production experience: LoRA/QLoRA fine-tuning · Whisper ASR · RAG (FAISS, Pinecone) · Gemini Vision · medical NLP


Background

MSc Data Science — University of Edinburgh (2024)
Currently: ML & AI Engineer @ Plus91 Technology, Pune
Target: ML Systems / LLM Inference / Automotive AI — open to relocating to Germany, Switzerland, Poland, Norway, or Finland

📧 [email protected] · LinkedIn
