Hi,
While working on the TREC DL document task, I noticed that the code in msmarco_data.py writes "msmarco-test2019-queries.tsv" as the dev-query file:
    if args.data_type == 0:
        write_query_rel(
            args,
            pid2offset,
            "msmarco-doctrain-queries.tsv",
            "msmarco-doctrain-qrels.tsv",
            "train-query",
            "train-qrel.tsv")
        write_query_rel(
            args,
            pid2offset,
            "msmarco-test2019-queries.tsv",
            "2019qrels-docs.txt",
            "dev-query",
            "dev-qrel.tsv")
If I want to reproduce your work, is it okay to use "msmarco-docdev-queries.tsv" as the dev set for selecting the best checkpoint?
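To make the question concrete, here is a rough sketch of the switch I have in mind. The helper `dev_files` and its `use_official_dev` flag are hypothetical names of my own, not part of the ANCE code, and I am assuming "msmarco-docdev-qrels.tsv" is the qrels file that goes with the docdev queries. The returned pair would replace the two file names in the second `write_query_rel` call.

```python
# Hypothetical helper, not part of the ANCE repository.
def dev_files(use_official_dev):
    """Return (queries_file, qrels_file) to write out as the dev split."""
    if use_official_dev:
        # Assumption: the MS MARCO document dev queries and their qrels.
        return "msmarco-docdev-queries.tsv", "msmarco-docdev-qrels.tsv"
    # What the current code does: the TREC DL 2019 test queries and qrels.
    return "msmarco-test2019-queries.tsv", "2019qrels-docs.txt"


queries_file, qrels_file = dev_files(use_official_dev=True)
print(queries_file, qrels_file)
```

With this, checkpoint selection would not touch the TREC DL 2019 test queries, which is why I am asking whether it still matches your setup.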
(Code reference: ANCE/data/msmarco_data.py, line 190 at commit 936ec3e.)