Comprehensive examples for training and evaluating vision-language models with reinforcement learning.
- Training Guide - How to train models on various tasks
- Evaluation Guide - How to evaluate trained models
examples/
├── algorithms/       # RL algorithm scripts (GRPO, RLOO, DAPO, etc.)
├── tasks/            # Task-specific training scripts
├── eval/             # Evaluation scripts
├── format_prompt/    # Prompt templates
├── reward_function/  # Reward functions
└── config.yaml       # Base configuration
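
The layout above lists a `reward_function/` directory without describing what a reward function looks like. As a rough illustration only, a rule-based reward for a question-answering task might be sketched as follows. The function name `compute_reward`, its signature, the `<answer>` tag convention, and the 0.1 partial score are all assumptions made for this sketch, not the interface the scripts in this repository actually use.

```python
import re

# Hypothetical convention: the model wraps its final answer in <answer>...</answer>.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def compute_reward(response: str, ground_truth: str) -> float:
    """Score one model response against a reference answer (illustrative only)."""
    match = ANSWER_PATTERN.search(response)
    if match is None:
        # No tagged answer found: the response fails the format check.
        return 0.0
    predicted = match.group(1).strip().lower()
    if predicted == ground_truth.strip().lower():
        # Well-formed and correct.
        return 1.0
    # Well-formed but wrong: small partial credit for following the format.
    return 0.1
```

Under these assumptions, a response is scored first on whether it follows the expected format and then on whether the extracted answer matches the reference, which keeps the two signals easy to inspect separately.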