ulab-uiuc/R1-Ranker
IRanker: Towards Ranking Foundation Model


🌐 Project Page | 📜 arXiv

📌 Preliminaries

Environment Setup

conda create -n iranker python=3.9
conda activate iranker
# install torch [or you can skip this step and let vllm install the correct version for you]
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
# install vllm
pip3 install vllm==0.6.3  # 0.5.4, 0.4.2, and 0.3.1 also work
pip3 install ray

# verl
pip install -e .

# flash attention 2
pip3 install flash-attn --no-build-isolation
# quality of life
pip install wandb IPython matplotlib

📊 Dataset Preparation

This section outlines the steps to generate the datasets used for DRanker and IRanker training and evaluation.

DRanker Dataset

To generate the DRanker dataset, run the following command:

python examples/data_preprocess/direct_data_generation.py

The processed dataset will be saved to: data/direct_ranking

IRanker Dataset

To generate the IRanker dataset, execute this script:

python examples/data_preprocess/iterative_data_generation.py

The processed dataset will be saved to: data/iterative_ranking

Raw Dataset

The original raw dataset is available for download from Hugging Face:

Dataset Repository: ulab-ai/Ranking-bench

⭐Experiments

Training

To train IRanker (Iterative Deletion Ranker) or DRanker (Direct Ranker), use the provided training scripts:

# Set required environment variables
export N_GPUS=2
export BASE_MODEL=/path/to/base/model
export DATA_DIR=data/iterative_ranking  # for IRanker
# or
export DATA_DIR=data/direct_ranking      # for DRanker
export ROLLOUT_TP_SIZE=2
export EXPERIMENT_NAME=my_experiment

# Run training
bash scripts/train_iranker.sh  # for IRanker
# or
bash scripts/train_dranker.sh  # for DRanker

Checkpoints will be saved to checkpoints/Ranking-FM/<experiment_name>/.
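The two trainers correspond to two ranking strategies: DRanker produces the full ranking in a single pass, while IRanker repeatedly deletes the least relevant candidate and takes the reverse deletion order as the ranking. A minimal illustrative sketch of the two strategies (the `score` function stands in for an LLM-based scorer; this is not the repository's actual code):

```python
# Illustrative sketch only: `score` is a stand-in for an LLM-based scorer;
# the repository's real training/inference pipeline differs.

def direct_rank(candidates, score):
    """Direct ranking: score every candidate once, best first."""
    return sorted(candidates, key=score, reverse=True)

def iterative_deletion_rank(candidates, score):
    """Iterative deletion: repeatedly remove the worst remaining candidate.

    The final ranking is the reverse of the deletion order,
    so the last survivor is ranked first.
    """
    remaining = list(candidates)
    deleted = []
    while remaining:
        worst = min(remaining, key=score)  # pick the least relevant candidate
        remaining.remove(worst)
        deleted.append(worst)
    return deleted[::-1]  # best first

# Toy usage with string length as a dummy relevance score:
# direct_rank(["a", "bb", "ccc"], len) → ["ccc", "bb", "a"]
```

Iterative deletion trades more scorer calls for a simpler per-step decision (identify one worst item from a shrinking pool), which is the intuition behind the iterative dataset format above.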

🔍 Evaluation

Running Evaluation

To evaluate a model on a specific dataset, use the following command:

python eval/eval.py --dataset <dataset_name> --model_path <path_to_model>

Parameters

  • --dataset: Specifies the dataset to evaluate on
  • --model_path: Path to the trained model you want to evaluate

Supported Datasets

The evaluation script supports the following datasets:

Passage Ranking

  • Passage-5
  • Passage-7
  • Passage-9

Router Tasks

  • Router-Performance
  • Router-Balance
  • Router-Cost

Recommendation Systems

  • Rec-Movie
  • Rec-Music
  • Rec-Game
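The README does not state which metric `eval/eval.py` reports; as a reference point, NDCG@k is a standard metric for ranking tasks like those above (its use here is an assumption, not confirmed by this repository). A self-contained sketch:

```python
import math

# Assumption: NDCG@k as an example ranking metric; the repository's
# evaluation script may report different metrics.

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked items."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the predicted order, normalized by the ideal order."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# A perfectly ordered list scores 1.0:
# ndcg_at_k([3, 2, 1], 3) → 1.0
```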

Citation

@inproceedings{feng2024graphrouter,
  title={Graphrouter: A graph-based router for llm selections},
  author={Feng, Tao and Shen, Yanzhen and You, Jiaxuan},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2024}
}

About

"R1-Ranker: Teaching LLM Rankers to Reason", Tao Feng, Zhigang Hua, Zijie Lei, Yan Xie, Shuang Yang, Bo Long, Jiaxuan You
