Recommended: Create a conda environment with conda create -n myenv python=3.7
Conversion and Utilities
The repository contains conversion scripts for converting different datasets into the SQuAD 1.1 format.
vpe2squad.py: Convert VP ellipsis dataset into SQuAD format
conll2squad.py: Convert coreference data from CoNLL-2012 to SQuAD format
First convert .conll files to .jsonlines using this
Set ONTONOTES_DIR (ontonotes folder path) and set2fmt (filename to convert to SQuAD format)
Run the script
sluice2squad.py: Convert sluice ellipsis dataset into SQuAD format
wikicoref2conll.py: Convert WikiCoref dataset into CoNLL-2012 format
squad2conll.py: Convert the prediction files produced by bert/run_squad.py into CoNLL-2012 format for evaluation
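For orientation, the target layout shared by the conversion scripts above can be sketched as follows. This is a minimal illustration of the public SQuAD 1.1 JSON structure, not the repository's own code: the helper names `make_squad_example` and `make_squad_dataset`, the example sentence, and the way the question is phrased are all assumptions made for the sketch.

```python
import json


def make_squad_example(title, context, question, answer_text, qa_id):
    """Wrap one question/answer pair in the SQuAD 1.1 article layout.

    Hypothetical helper: the repository's scripts may build records differently.
    """
    # SQuAD 1.1 stores the answer as a character offset into the context.
    answer_start = context.find(answer_text)
    if answer_start < 0:
        raise ValueError("answer_text must occur verbatim in context")
    return {
        "title": title,
        "paragraphs": [
            {
                "context": context,
                "qas": [
                    {
                        "id": qa_id,
                        "question": question,
                        "answers": [
                            {"text": answer_text, "answer_start": answer_start}
                        ],
                    }
                ],
            }
        ],
    }


def make_squad_dataset(articles):
    """Wrap a list of articles in the top-level SQuAD 1.1 envelope."""
    return {"version": "1.1", "data": list(articles)}


# Illustrative sluice-style example; the question wording is an assumption.
example = make_squad_example(
    title="sluice-demo",
    context="She bought something, but I don't know what.",
    question="what",
    answer_text="She bought something",
    qa_id="demo-0",
)
dataset = make_squad_dataset([example])
print(json.dumps(dataset, indent=2))
```

Each answer carries both its text and its character offset, so a converter has to locate the antecedent span inside the context string rather than store token indices.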
Miscellaneous
annotate_qwords.py: Adds <ref> and </ref> tags around interrogative (question) words in SQuAD files
evaluate-v1.1.py: Standard SQuAD v1.1 evaluation script (for evaluating ellipsis)
For coreference resolution, use the standard CoNLL-2012 script after converting the predictions into the CoNLL-2012 format using squad2conll.py.
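As a rough guide to what the SQuAD v1.1 evaluation reports, the sketch below reimplements its two metrics, exact match and token-level F1, for a single prediction against a single reference. It follows the normalization order of the standard script as commonly distributed (lowercase, strip punctuation, drop English articles, collapse whitespace), but it is a simplified illustration and the function names are chosen here; use evaluate-v1.1.py itself for reported numbers.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation, drop articles, and collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """True when the normalized strings are identical."""
    return normalize_answer(prediction) == normalize_answer(truth)


def f1_score(prediction, truth):
    """Token-overlap F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = Counter(pred_tokens) & Counter(truth_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The cat.", "cat"))
print(f1_score("bought a book", "she bought a book"))
```

When a question has several reference answers, the standard script scores the prediction against each one and keeps the maximum; this sketch omits that step.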
Training Details
Each model folder contains pre-processing, configuration, training, and evaluation scripts for Sluice Ellipsis. To run on another dataset, replace the data paths accordingly.