This repository contains the code for our paper *ALLIES: Prompting Large Language Model with Beam Search*.
dataset/nq-test.jsonl
dataset/tqa-test.jsonl
dataset/webq-test.jsonl
We release the preprocessed data and trained checkpoints in Azure Blob. The file list is as follows:
INFO: nq/de-checkpoint-10000/passage_embedding.pb; Content Length: 60.13 GiB
INFO: nq/de-checkpoint-10000/passage_embedding2id.pb; Content Length: 160.33 MiB
INFO: webq/de-checkpoint-400/passage_embedding.pb; Content Length: 60.13 GiB
INFO: webq/de-checkpoint-400/passage_embedding2id.pb; Content Length: 160.33 MiB
INFO: tq/de-checkpoint-10000/passage_embedding.pb; Content Length: 60.13 GiB
INFO: tq/de-checkpoint-10000/passage_embedding2id.pb; Content Length: 160.33 MiB

To download the files, please refer to HOW_TO_DOWNLOAD.
python main.py --dataset $dataset --task answer_without_retrieval --apikey $ID
python main.py --dataset $dataset --task answer_with_retrieval --topK $retrieval_num --apikey $ID
python main.py --dataset $dataset --task genread --apikey $ID
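The three baseline tasks differ only in how the prompt is built. A minimal sketch of their prompting patterns, where `llm` and `retrieve` are hypothetical stand-ins for the actual API call and dense retriever in `main.py`:

```python
def llm(prompt):
    # Placeholder for the actual LLM API call made via --apikey.
    return f"LLM({prompt!r})"

def retrieve(question, top_k):
    # Placeholder for dense retrieval over the released passage embeddings.
    return [f"passage {i} for {question!r}" for i in range(top_k)]

def answer_without_retrieval(q):
    # Closed-book: answer directly from the model's parametric knowledge.
    return llm(f"Answer the question: {q}")

def answer_with_retrieval(q, top_k=5):
    # Retrieve-then-read: condition the answer on $topK retrieved passages.
    docs = "\n".join(retrieve(q, top_k))
    return llm(f"Context:\n{docs}\nAnswer the question: {q}")

def genread(q):
    # GENREAD: generate a background document first, then read from it.
    doc = llm(f"Generate a background document for: {q}")
    return llm(f"Context:\n{doc}\nAnswer the question: {q}")
```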
## GENREAD
python main.py --dataset $dataset --task ALLIES --retrieval_type generate --beam_size $beam_size --beam_Depth $beam_depth --ask_question_num $ask_question_num --apikey $ID
## Retrieval
python main.py --dataset $dataset --task ALLIES --topK $retrieval_num --retrieval_type retrieve --beam_size $beam_size --beam_Depth $beam_depth --ask_question_num $ask_question_num --apikey $ID
- $dataset: The dataset to evaluate on (nq, tqa, or webq)
- $ID: The API key
- $beam_size: Beam size (number of states kept after each round)
- $beam_depth: Beam depth (number of expansion rounds)
- $ask_question_num: Number of follow-up questions asked when expanding each state
- $retrieval_num: Number of documents to retrieve per query
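To illustrate how these parameters interact, here is a minimal sketch of ALLIES-style beam search over query reformulations. It is not the repository's implementation: `generate_questions` and `answer_and_score` are hypothetical stand-ins for the LLM calls in `main.py`.

```python
def generate_questions(state, n):
    # Placeholder: the real system asks the LLM for n complementary questions.
    return [f"{state} / follow-up {i}" for i in range(n)]

def answer_and_score(state):
    # Placeholder: the real system answers with retrieved or generated
    # evidence and lets the LLM score its confidence. Here: longer = better.
    return f"answer({state})", float(len(state))

def allies_beam_search(question, beam_size=2, beam_depth=2, ask_question_num=3):
    beam = [question]                           # states are augmented queries
    best = max((answer_and_score(s) for s in beam), key=lambda p: p[1])
    for _ in range(beam_depth):                 # $beam_depth expansion rounds
        candidates = []
        for state in beam:
            # Each state spawns $ask_question_num children.
            candidates.extend(generate_questions(state, ask_question_num))
        scored = [(s, *answer_and_score(s)) for s in candidates]
        scored.sort(key=lambda t: t[2], reverse=True)
        beam = [s for s, _, _ in scored[:beam_size]]   # keep top $beam_size
        if scored and scored[0][2] > best[1]:
            best = (scored[0][1], scored[0][2])
    return best[0]
```

Larger `$beam_size` and `$beam_depth` explore more reformulations at the cost of more LLM calls, since each round issues `beam_size * ask_question_num` expansions.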
