Implementation of the WARP (Weight Averaged Rewarded Policies) algorithm (https://arxiv.org/pdf/2406.16768)
Clone the repository and change into the algorithm directory:
git clone https://github.com/LuLim14/Alignment_project.git
cd ./Alignment_project/warp_algorithm
Install the dependencies from 'requirements.txt':
pip install -U -r requirements.txt
Run the algorithm, replacing each bracketed placeholder with your own value:
python main.py --use_wandb '[False|True]' --path_to_checkpoints_reward_model [path to reward_model checkpoints directory] --checkpoint_theta_dir [path to train_checkpoints_theta directory] --checkpoint_final_dir [path to checkpoints final directory] --checkpoint_ema_dir [path to checkpoints ema directory]
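The command above takes four directory arguments, which makes it easy to mistype. As a minimal sketch, the helper below assembles the same argument list from a single root directory. The function name `build_warp_command` and the subdirectory names are hypothetical and not part of the repository. Only the flag names are taken from the command above.

```python
import shlex
import sys
from pathlib import Path


def build_warp_command(root, use_wandb=False):
    """Return the argv list for main.py, with all checkpoint paths under `root`."""
    root = Path(root)
    # Flag names come from the README command; the subdirectory names
    # are assumptions chosen for illustration only.
    dirs = {
        "--path_to_checkpoints_reward_model": root / "reward_model",
        "--checkpoint_theta_dir": root / "train_checkpoints_theta",
        "--checkpoint_final_dir": root / "checkpoints_final",
        "--checkpoint_ema_dir": root / "checkpoints_ema",
    }
    # --use_wandb is passed as the literal string "True" or "False".
    cmd = [sys.executable, "main.py", "--use_wandb", str(use_wandb)]
    for flag, path in dirs.items():
        cmd += [flag, str(path)]
    return cmd


if __name__ == "__main__":
    cmd = build_warp_command("./checkpoints", use_wandb=False)
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To launch it from inside warp_algorithm/, one could run:
    # import subprocess; subprocess.run(cmd, check=True)
```

The sketch only builds and prints the command. It does not create any directories, since the README does not say which of them must already exist.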