slime_example

Slime Framework Usage Documentation

Directory Contents

multiturn_llm_reward.py defines a custom reward model and a custom rollout data generation function for the Slime framework. It is used to generate rollout data and to compute reward values.

Usage:

Make sure the Slime framework and its dependencies are installed.
Locate the "# Change Here !!!!" comment in run_qwen2.5_3B.sh and make sure the path it marks points to the correct file.
Point the reward model path in the configuration to multiturn_llm_reward.py, and make sure the function name and its parameters are correct.
Run run_qwen2.5_3B.sh to start training and evaluation.
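Before launching a training run, it can help to confirm that a dotted path such as multiturn_llm_reward.compute_score actually resolves to a callable. The helper below is a minimal sketch using only the Python standard library; resolve_dotted_path is a hypothetical name and is not part of Slime, and Slime itself may load these paths differently.

```python
import importlib


def resolve_dotted_path(path: str):
    """Resolve 'module.function' into the function object (hypothetical helper)."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.function', got {path!r}")
    module = importlib.import_module(module_name)
    func = getattr(module, attr)  # raises AttributeError if the name is wrong
    if not callable(func):
        raise TypeError(f"{path} is not callable")
    return func


# Example (run from the directory that contains multiturn_llm_reward.py):
#   resolve_dotted_path("multiturn_llm_reward.generate")
#   resolve_dotted_path("multiturn_llm_reward.compute_score")
```

A misspelled module raises ModuleNotFoundError and a misspelled function raises AttributeError, so a typo surfaces before any training job starts.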

To implement multi-turn rollouts with tool calling in Slime, you only need to implement two things for your task: a custom data generation function and a reward model. They correspond to these two configuration items in the startup script:

CUSTOM_ARGS=(
   --custom-generate-function-path multiturn_llm_reward.generate
   --custom-rm-path multiturn_llm_reward.compute_score
)
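To illustrate the shape of these two functions, here is a self-contained sketch of a multi-turn loop with one tool call and a matching reward function. Everything in it is an assumption made for illustration: the Sample dataclass, the async signatures, the tag format, fake_model, and calculator_tool are hypothetical stand-ins, not Slime's actual types or the contents of multiturn_llm_reward.py. Check the Slime source for the interface it really expects.

```python
import asyncio
import re
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Hypothetical stand-in for the framework's rollout sample type."""
    prompt: str
    response: str = ""
    label: str = ""
    turns: list = field(default_factory=list)


def calculator_tool(expression: str) -> str:
    """Toy tool: add two integers written as 'a+b'."""
    a, b = expression.split("+")
    return str(int(a) + int(b))


async def fake_model(context: str) -> str:
    """Stub for the policy model; a real rollout would query the inference engine."""
    if "<tool_result>" in context:
        result = re.findall(r"<tool_result>(.*?)</tool_result>", context)[-1]
        return f"<answer>{result}</answer>"
    return "<tool_call>2+3</tool_call>"


async def generate(args, sample: Sample, sampling_params: dict) -> Sample:
    """Multi-turn loop: call the model, run any requested tool, feed the result back."""
    context = sample.prompt
    for _ in range(sampling_params.get("max_turns", 4)):
        reply = await fake_model(context)
        sample.turns.append(reply)
        context += reply
        call = re.search(r"<tool_call>(.*?)</tool_call>", reply)
        if call is None:
            break  # no tool requested, so the model has given its final answer
        context += f"<tool_result>{calculator_tool(call.group(1))}</tool_result>"
    sample.response = context[len(sample.prompt):]
    return sample


async def compute_score(args, sample: Sample, **kwargs) -> float:
    """Reward 1.0 when the final answer matches the label, otherwise 0.0."""
    answer = re.search(r"<answer>(.*?)</answer>", sample.response)
    return 1.0 if answer and answer.group(1).strip() == sample.label else 0.0


# Usage: one rollout followed by scoring.
sample = asyncio.run(generate(None, Sample(prompt="What is 2+3? ", label="5"), {"max_turns": 4}))
score = asyncio.run(compute_score(None, sample))
print(len(sample.turns), score)  # prints "2 1.0"
```

The turn limit keeps a rollout from looping forever when the model keeps requesting tools, and the reward reads only the final answer so that intermediate tool calls do not affect the score.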

