- Ph.D. in Electrical Engineering from Stanford University.
- I'm interested in LLM Inference & Serving, with a focus on Quantization and Parallelism (e.g., Parallel Decoding, Speculative Decoding).
- Currently focused on:
  - CUDA Kernel Optimization
  - Model Deployment & Serving Infrastructure (Paged KV Cache, Continuous Batching)
  - Post-training (RLHF, Distillation, Flow-matching)
- How to reach me: [email protected]
- Pronouns: She/Her