rerank

BERT-base Passage Reranking

Code for training a BERT-base passage reranking model. Our code is built on the Reranker framework, so you need to copy the code in Reranker/src/ into './' to reproduce the results.

Training

Data preprocessing

Build the train and dev datasets:

usage: build_train.py [-h] [--tokenizer_name TOKENIZER_NAME] [--truncate TRUNCATE] [--qrel_file QREL_FILE] [--query_file QUERY_FILE] [--corpus_file CORPUS_FILE] [--retrieval_file RETRIEVAL_FILE] [--ranking_file RANKING_FILE]

optional arguments:
  -h, --help            show this help message and exit
  --tokenizer_name TOKENIZER_NAME
                        bert tokenizer name
  --truncate TRUNCATE   bert tokenizer max sequence length
  --qrel_file QREL_FILE
                        qrels train file
  --query_file QUERY_FILE
                        query train file
  --corpus_file CORPUS_FILE
                        corpus file
  --retrieval_file RETRIEVAL_FILE
                        train query retrieval result
  --ranking_file RANKING_FILE
                        ranking train save file

usage: build_dev.py [-h] [--tokenizer_name TOKENIZER_NAME] [--truncate TRUNCATE] [--qrel_file QREL_FILE] [--query_file QUERY_FILE] [--corpus_file CORPUS_FILE] [--topk TOPK] [--retrieval_file RETRIEVAL_FILE] [--ranking_file RANKING_FILE] [--label_file LABEL_FILE]

optional arguments:
  -h, --help            show this help message and exit
  --tokenizer_name TOKENIZER_NAME
                        bert tokenizer name
  --truncate TRUNCATE   bert tokenizer max sequence length
  --qrel_file QREL_FILE
                        qrels train file
  --query_file QUERY_FILE
                        query train file
  --corpus_file CORPUS_FILE
                        corpus file
  --topk TOPK           select topk result
  --retrieval_file RETRIEVAL_FILE
                        train query retrieval result
  --ranking_file RANKING_FILE
                        ranking train save file
  --label_file LABEL_FILE
                        ranking train save file
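
As a rough illustration of what building the training set involves, the sketch below groups each query's retrieved passages into positives (listed in the qrels) and negatives (retrieved but not judged relevant). This is an assumption about the general approach, not the actual logic of build_train.py; the function name `group_candidates`, the in-memory tuple inputs, and the `qid`/`pos`/`neg` output keys are all hypothetical, and tokenization and truncation are left out.

```python
from collections import defaultdict


def group_candidates(qrels, retrieval):
    """Group retrieved passages per query into positives and negatives.

    qrels:     iterable of (qid, pid) pairs judged relevant (hypothetical format)
    retrieval: iterable of (qid, pid) pairs in retrieval rank order (hypothetical format)
    """
    positives = defaultdict(set)
    for qid, pid in qrels:
        positives[qid].add(pid)

    examples = {}
    for qid, pid in retrieval:
        relevant = positives.get(qid, set())
        example = examples.setdefault(
            qid, {"qid": qid, "pos": sorted(relevant), "neg": []}
        )
        # Retrieved passages that are not judged relevant serve as negatives.
        if pid not in relevant:
            example["neg"].append(pid)

    # Queries without any positive passage cannot be used for training here.
    return [ex for ex in examples.values() if ex["pos"]]
```

Using first-stage retrieval results as negatives gives the reranker hard examples that resemble what it will see at inference time; whether the scripts sample or cap the negatives is not stated here.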

Training

sh run_train.sh

Inference

Compute the ranking score of each query-passage pair (here, we rerank only the top 1000 retrieved passages for each query):

sh inference.sh
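
Once each query-passage pair has a score, reranking amounts to sorting every query's candidates by that score. The sketch below shows this step in isolation; the function name `rerank` and the `(qid, pid, score)` tuple format are hypothetical and do not reflect the output format of inference.sh.

```python
from collections import defaultdict


def rerank(scored_pairs, topk=1000):
    """Sort each query's candidate passages by model score, highest first.

    scored_pairs: iterable of (qid, pid, score) tuples (hypothetical format)
    topk:         number of passages to keep per query
    """
    per_query = defaultdict(list)
    for qid, pid, score in scored_pairs:
        per_query[qid].append((pid, score))

    ranked = {}
    for qid, candidates in per_query.items():
        candidates.sort(key=lambda item: item[1], reverse=True)
        ranked[qid] = [pid for pid, _ in candidates[:topk]]
    return ranked
```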

Evaluation

usage: evaluate.py [-h] [--topk_path TOPK_PATH] [--qrel_path QREL_PATH] [--topk TOPK]

optional arguments:
  -h, --help            show this help message and exit
  --topk_path TOPK_PATH
  --qrel_path QREL_PATH
  --topk TOPK
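
The metric computed by evaluate.py is not stated above. Assuming it is a rank-based measure with a cutoff such as MRR@k, which is commonly reported for passage reranking, a minimal sketch looks like the following. The function name `mrr_at_k` and the dictionary inputs are hypothetical.

```python
def mrr_at_k(ranked, qrels, k=10):
    """Mean reciprocal rank of the first relevant passage within the top k.

    ranked: dict mapping qid -> list of pids, best first (hypothetical format)
    qrels:  dict mapping qid -> set of relevant pids (hypothetical format)
    """
    total = 0.0
    for qid, relevant in qrels.items():
        for rank, pid in enumerate(ranked.get(qid, [])[:k], start=1):
            if pid in relevant:
                total += 1.0 / rank
                break  # only the first relevant passage counts
    # Queries with no relevant passage in the top k contribute zero.
    return total / len(qrels) if qrels else 0.0
```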
