Our task is to apply BERT to the Facebook Children’s Book Test (CBT) dataset. In the dataset, each question is constructed by taking 20 consecutive sentences from a book as context and leaving the 21st sentence as the query. A word in the query is selected and masked, and the model must select which word from the context (of the chosen type) should fill this placeholder in the query. Our goal is to measure how well language models can exploit wider linguistic context and ultimately to build a model capable of answering a query based on a short story.
The project proposal can be found here.
The progress report can be found here.
The final presentation video can be found here. The final presentation slides (PDF) can be found here.
The final documentation can be found here.