A toy project for fine-tuning an LLM on math data (stacks, arXiv, math.SE, MathOverflow, mathlib4, trl-lib/DeepMath-103K, etc.),
with the final goal of doing RL against a Lean 4 theorem verifier.
Input examples: 123+456=, 123*456=
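
A minimal sketch of how prompts in this format could be generated for a quick sanity check. The function name `make_example`, the 3-digit operand range, and the idea that the target completion is the plain decimal result are assumptions for illustration, not part of this project's code.

```python
import random


def make_example(rng: random.Random, op: str, digits: int = 3) -> tuple[str, str]:
    """Return a (prompt, answer) pair such as ("123+456=", "579").

    Hypothetical helper: the operand width and answer format are assumptions.
    """
    if op not in ("+", "*"):
        raise ValueError(f"unsupported operator: {op!r}")
    # Draw two operands with exactly `digits` decimal digits each.
    low, high = 10 ** (digits - 1), 10**digits - 1
    a, b = rng.randint(low, high), rng.randint(low, high)
    result = a + b if op == "+" else a * b
    return f"{a}{op}{b}=", str(result)


# Fixed seed so the printed samples are reproducible across runs.
rng = random.Random(0)
for op in ("+", "*"):
    prompt, answer = make_example(rng, op)
    print(prompt, answer)
```

Each printed line is a prompt followed by the completion the model would be expected to produce.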
mamba create --name leanlm
# activate the new environment so the following installs land in it
mamba activate leanlm
mamba install "python==3.12.*"
pip install uv
# install packages with `uv pip install ...`
# export environment
./mamba_export