A Framework for LLM-based Multi-Agent Reinforced Training and Inference
RL training environments with verifiable rewards for coding agents. Works with TRL, Unsloth, verl, OpenRLHF.
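None of the listed repos prescribe this exact interface, but the idea behind "verifiable rewards" is concrete enough to sketch: instead of a learned reward model, the environment scores a coding agent's output by actually executing it against unit tests. Below is a minimal Python sketch, assuming pytest is installed; the function name, the binary scoring rule, and the `solution.py`/`test_solution.py` layout are illustrative assumptions, not the API of any repo above.

```python
import subprocess
import tempfile
from pathlib import Path

def verifiable_reward(solution_code: str, test_code: str, timeout: int = 10) -> float:
    """Binary reward: 1.0 if the generated solution passes the tests, else 0.0.

    Generic sketch, not the reward API of any specific framework.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Write the model's candidate solution and the hidden tests to disk.
        Path(tmp, "solution.py").write_text(solution_code)
        # The tests are expected to import the candidate, e.g. `from solution import f`.
        Path(tmp, "test_solution.py").write_text(test_code)
        try:
            result = subprocess.run(
                ["python", "-m", "pytest", "-q", "test_solution.py"],
                cwd=tmp,
                capture_output=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # treat a hanging solution as a failure
        return 1.0 if result.returncode == 0 else 0.0
```

A function of this shape can typically be wired into the reward hook of trainers such as TRL or verl, though each framework defines its own callback signature.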
A list of uv environment templates for LLM development.
RLHF Annotation Studio: a web-based tool for collecting human preference data to train LLMs via Reinforcement Learning from Human Feedback (RLHF). Compare responses side by side, capture preferences, and export JSONL for reward model training.
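To make the JSONL export concrete, here is a small Python sketch of writing and reading preference pairs. The field names `prompt`/`chosen`/`rejected` are an assumption following the common convention used by reward-model trainers such as TRL; the Studio's actual export schema may differ.

```python
import json

# One preference pair per line. The "prompt"/"chosen"/"rejected" keys are the
# usual reward-modeling convention, assumed here rather than taken from the tool.
records = [
    {
        "prompt": "Explain what RLHF is in one sentence.",
        "chosen": "RLHF fine-tunes a language model with reinforcement learning, "
                  "using a reward model trained on human preference comparisons.",
        "rejected": "RLHF is a type of database.",
    },
]

with open("preferences.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")

# Read it back, e.g. to build a reward-model training dataset.
with open("preferences.jsonl", encoding="utf-8") as f:
    pairs = [json.loads(line) for line in f]
print(pairs[0]["chosen"])
```

JSONL keeps each comparison self-contained on its own line, which makes the file easy to stream, shard, and append to as annotation proceeds.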
🌐 Streamline LLM development with ready-to-use environment templates for efficient setup and deployment.