Hi, I'm Kingsley Kim. I'm a current student at UVA, mostly interested in kernel optimization and LLM research. I've worked under Chen-Yu Wei, mainly on RL and LLM post-training, as well as on training more efficient Process Reward Models.
Outside of school, I worked at Palantir in Summer 2025 as a Forward Deployed Engineer on a manufacturing team. I was also an 8VC Fellow during that time and met so many great people.
Things I've worked on:
- Published an ICLR Workshop paper on training data-efficient Process Reward Models for LLM post-training
- A more applied paper on Transformers on Time Series, AAAI 2024 Workshop
Code:
- To my knowledge, the first open-source H100 CUDA kernels for the Gated Delta Net architecture: repo. Check the blog for more details
- A minimal-dependency CUDA inference engine for H100s (WIP) repo
- A minimal repo to learn FP8 and BF16 MoE optimizations (WIP, worklog soon) repo
- RL applied to control problems
- Research on mini-VLMs for 'social navigation' in robots repo
I've also written a few posts on what I've learned from CUDA kernel engineering, and some notes on textbooks I've read:
Contact: