lkevinzc

🎯

Learning

zclzc lkevinzc

@google-deepmind

164 followers · 163 following

Pinned

sail-sg/oat (Public)

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python · 647 stars · 63 forks

axon-rl/gem (Public)

A Gym for Agentic LLMs

Python · 476 stars · 31 forks

sail-sg/understand-r1-zero (Public)

Understanding R1-Zero-Like Training: A Critical Perspective

Python · 1.2k stars · 57 forks

mosecorg/mosec (Public)

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

Python · 897 stars · 72 forks

spiral-rl/spiral (Public)

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python · 183 stars · 21 forks

sail-sg/Precision-RL (Public)

Defeating the Training-Inference Mismatch via FP16

Python · 187 stars · 17 forks