nanoRLHF

A nano implementation of Reinforcement Learning from Human Feedback (RLHF), in the style of Andrej Karpathy.

example
legal
LICENSE
README.md
github-marketplace-developer-agreement.md
