A BERT-Based Machine Reading Comprehension Baseline
This repository maintains a machine reading comprehension baseline based on BERT. The implementations follow the baseline system descriptions in the following two papers.
If you find this code useful, please consider citing the following papers.
@article{sun2019probing,
title={Probing Prior Knowledge Needed in Challenging Chinese Machine Reading Comprehension},
author={Sun, Kai and Yu, Dian and Yu, Dong and Cardie, Claire},
journal={CoRR},
volume={cs.CL/1904.09679v2},
url={https://arxiv.org/abs/1904.09679v2},
year={2019}
}
@article{pan2019improving,
title={Improving Question Answering with External Knowledge},
author={Pan, Xiaoman and Sun, Kai and Yu, Dian and Ji, Heng and Yu, Dong},
journal={CoRR},
volume={cs.CL/1902.00993v1},
url={https://arxiv.org/abs/1902.00993v1},
year={2019}
}
Below, we demonstrate how to use this baseline with a demo on DREAM, a dialogue-based three-choice machine reading comprehension task.
Download and unzip a pre-trained language model from https://github.com/google-research/bert, and set the environment variable for BERT with export BERT_BASE_DIR=/PATH/TO/BERT/DIR.
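For example, a minimal sketch using the uncased English BERT-Base checkpoint (any of the released checkpoints works the same way; the download URL below is the one listed in the BERT repository):

```bash
# Example with uncased BERT-Base; substitute the checkpoint you actually want to use.
wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
unzip uncased_L-12_H-768_A-12.zip
export BERT_BASE_DIR=$PWD/uncased_L-12_H-768_A-12
```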
Copy the data folder from the DREAM repo to bert/.
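For instance, assuming the DREAM data lives in the data folder of the nlpdata/dream repository:

```bash
# Assumes the DREAM data is kept in the data/ folder of the dataset repository.
git clone https://github.com/nlpdata/dream.git
cp -r dream/data bert/
```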
In bert, execute python convert_tf_checkpoint_to_pytorch.py --tf_checkpoint_path=$BERT_BASE_DIR/bert_model.ckpt --bert_config_file=$BERT_BASE_DIR/bert_config.json --pytorch_dump_path=$BERT_BASE_DIR/pytorch_model.bin
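Fine-tuning on DREAM is then launched from the same directory. The exact command is not reproduced here; the sketch below assumes a run_classifier.py entry point with the flag names used by the standard BERT fine-tuning scripts, so treat the script name, task name, and hyperparameter values as illustrative assumptions rather than the repository's documented settings:

```bash
# Hypothetical fine-tuning invocation (flag names follow the common BERT
# run_classifier.py interface; check the repository's own script for the exact ones).
python run_classifier.py \
  --task_name dream \
  --do_train \
  --do_eval \
  --data_dir data \
  --vocab_file $BERT_BASE_DIR/vocab.txt \
  --bert_config_file $BERT_BASE_DIR/bert_config.json \
  --init_checkpoint $BERT_BASE_DIR/pytorch_model.bin \
  --max_seq_length 512 \
  --train_batch_size 24 \
  --learning_rate 2e-5 \
  --num_train_epochs 8 \
  --output_dir dream_finetuned
```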
The resulting fine-tuned model, predictions, and evaluation results are stored in bert/dream_finetuned.
Results on DREAM:
We run the experiment five times with different random seeds and report the best development set performance and the corresponding test set performance.