Kingsley Kim

Hi, I'm Kingsley Kim. I'm a current student at UVA, mostly interested in kernel optimization and LLM research. I've worked under Chen-Yu Wei, mainly on RL and LLM post-training as well as training more efficient Process Reward Models.

Outside of school, I worked at Palantir in Summer 2025 as a Forward Deployed Engineer on a manufacturing team. I was also an 8VC Fellow during that time, and met so many great people.

Things I've worked on:

Published an ICLR Workshop paper on training data-efficient Process Reward Models for LLM post-training
A more applied paper on Transformers on Time Series, AAAI 2024 Workshop

Code:

To my knowledge, the first open-source H100 CUDA kernels for the Gated Delta Net architecture repo, check blog for more details
A minimal-dependency CUDA inference engine for H100s (WIP) repo
A minimal repo to learn FP8 and BF16 MoE optimizations (WIP, worklog soon) repo
RL applied to control problems
Research on mini-VLMs for 'social navigation' in robots repo

I've also written a few posts on what I've learned from CUDA kernel engineering, and some notes on textbooks I've read:

Contact: