Source code for the paper "Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs", published at EMNLP 2019.
- Python 3.5+
- PyTorch 1.1.0
- tqdm
- matplotlib
- recordtype
- orderedyamlload
pip3 install -r requirements.txt
Results are reproducible when using a seed equal to 0 and the following setup: GK210GL Tesla K80 GPU, Ubuntu 16.04.
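Fixing a seed usually means seeding every random number generator that training touches. The following is a minimal sketch, not the code used in this repository: `set_seed` is a hypothetical helper name, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed=0):
    """Seed the common RNGs (hypothetical helper, not taken from this repository)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Prefer deterministic cuDNN kernels over the fastest ones
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed, nothing to seed


# Seeding twice with the same value replays the same random sequence
set_seed(0)
first = random.random()
set_seed(0)
second = random.random()
print(first == second)
```

Even with fixed seeds, some GPU operations can behave differently across hardware and driver versions, which is likely why the GPU and OS are listed alongside the seed.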
Download the datasets
$ mkdir data && cd data
$ wget https://biocreative.bioinformatics.udel.edu/media/store/files/2016/CDR_Data.zip && unzip CDR_Data.zip && mv CDR_Data CDR
$ wget https://bitbucket.org/alexwuhkucs/gda-extraction/get/fd4a7409365e.zip && unzip fd4a7409365e.zip && mv alexwuhkucs-gda-extraction-fd4a7409365e GDA
$ cd ..
Download the GENIA Tagger and Sentence Splitter:
$ cd data_processing
$ mkdir common && cd common
$ wget https://www.nactem.ac.uk/y-matsu/geniass/geniass-1.00.tar.gz && tar xvzf geniass-1.00.tar.gz
$ cd geniass/ && make && cd ..
$ git clone https://github.com/bornabesic/genia-tagger-py.git
$ cd genia-tagger-py && make
$ cd ../../
In order to process the datasets, they should first be transformed into the PubTator format. Then run the processing scripts as follows:
$ sh process_cdr.sh
$ sh process_gda.sh
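For readers unfamiliar with the PubTator format mentioned above: each document has a `PMID|t|title` line, a `PMID|a|abstract` line, tab-separated mention lines, and tab-separated relation lines. The sketch below parses that layout. The function names and the toy record (including its ids `C1` and `D1`) are made up for illustration and are not part of the processing scripts in this repository.

```python
def new_doc():
    return {"title": "", "abstract": "", "entities": [], "relations": []}


def parse_pubtator(text):
    """Parse PubTator-formatted text into a dict keyed by document id."""
    docs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate documents
        head = line.split("|", 2)
        if len(head) == 3 and head[1] in ("t", "a") and "\t" not in head[0]:
            # Title / abstract line: PMID|t|text or PMID|a|text
            doc = docs.setdefault(head[0], new_doc())
            doc["title" if head[1] == "t" else "abstract"] = head[2]
            continue
        fields = line.split("\t")
        doc = docs.setdefault(fields[0], new_doc())
        if len(fields) == 4:
            # Relation line: PMID, relation type, first id, second id
            doc["relations"].append(tuple(fields[1:]))
        elif len(fields) >= 6:
            # Mention line: PMID, start, end, text, entity type, concept id
            doc["entities"].append({
                "start": int(fields[1]), "end": int(fields[2]),
                "text": fields[3], "type": fields[4], "id": fields[5],
            })
    return docs


# Toy record for illustration only (ids C1 and D1 are invented)
SAMPLE = (
    "1|t|Drug A induces disease B.\n"
    "1|a|A toy abstract.\n"
    "1\t0\t6\tDrug A\tChemical\tC1\n"
    "1\t15\t24\tdisease B\tDisease\tD1\n"
    "1\tCID\tC1\tD1\n"
)

docs = parse_pubtator(SAMPLE)
print(docs["1"]["relations"])
```

Mention offsets are character positions, so slicing the text with `start` and `end` should give back the mention string.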
To get the data statistics, run:
$ python3 statistics.py --data ../data/CDR/processed/train.data
This will additionally generate the gold-annotation file, with the suffix .gold, in the same folder.
Run the main script for training and testing as follows. Select gpu -1 for CPU mode.
CDR dataset: Train the model on the training set and evaluate on the dev set, in order to identify the best training epoch.
For testing, retrain the model on the union of the training and dev sets (train+dev_filter.data) up to the best epoch, then evaluate on the test set.
GDA dataset: Simply train the model on the training set and evaluate on the dev set. Test the saved model on the test set.
To enable early stopping, use the --early_stop option.
If early stopping is not triggered during training, training runs until the maximum epoch (specified in the config file).
To train up to a specific epoch instead, use the --epoch epochNumber option without early stopping.
When early stopping is enabled, the --epoch option sets the maximum epoch.
For example, in the CDR dataset:
$ cd src/
$ python3 eog.py --config ../configs/parameters_cdr.yaml --train --gpu 0 --early_stop # using early stopping
$ python3 eog.py --config ../configs/parameters_cdr.yaml --train --gpu 0 --epoch 15 # train until the 15th epoch *without* early stopping
$ python3 eog.py --config ../configs/parameters_cdr.yaml --train --gpu 0 --epoch 15 --early_stop # set both early stop and max epoch
$ python3 eog.py --config ../configs/parameters_cdr.yaml --test --gpu 0
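The interaction between --early_stop and --epoch described above can be sketched as a small loop. This is an illustration under assumptions, not the repository's training code: the function name is hypothetical, the patience-based criterion is assumed (the actual stopping rule is not stated here), and a list of dev scores stands in for real training.

```python
def train_with_early_stopping(dev_scores, max_epoch, early_stop=True, patience=3):
    """Return (last_epoch_run, best_epoch).

    dev_scores[i] stands in for the dev F1 after epoch i + 1, so it needs
    at least max_epoch entries. `patience` is an assumed criterion: stop
    after that many consecutive epochs without improvement.
    """
    best_score = float("-inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch in range(1, max_epoch + 1):
        score = dev_scores[epoch - 1]
        if score > best_score:
            best_score, best_epoch, bad_epochs = score, epoch, 0
        else:
            bad_epochs += 1
        if early_stop and bad_epochs >= patience:
            return epoch, best_epoch  # early stopping triggered
    return max_epoch, best_epoch  # reached the maximum epoch


scores = [50.0, 55.0, 60.0, 59.0, 58.0, 57.0, 61.0, 62.0]  # made-up dev scores
print(train_with_early_stopping(scores, max_epoch=8))                    # early stopping
print(train_with_early_stopping(scores, max_epoch=8, early_stop=False))  # fixed epochs
print(train_with_early_stopping(scores, max_epoch=5))                    # both limits
```

In the third call the maximum epoch is reached before the patience runs out, which mirrors the case where early stopping is enabled but never triggered.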
All necessary parameters can be stored in the yaml files inside the configs directory. The following parameters can also be given directly on the command line:
usage: eog.py [-h] --config CONFIG [--train] [--test] [--gpu GPU]
              [--walks WALKS] [--window WINDOW] [--edges [EDGES [EDGES ...]]]
              [--types TYPES] [--context CONTEXT] [--dist DIST] [--example]
              [--seed SEED] [--early_stop] [--epoch EPOCH]

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       Yaml parameter file
  --train               Training mode - model is saved
  --test                Testing mode - needs a model to load
  --gpu GPU             GPU number
  --walks WALKS         Number of walk iterations
  --window WINDOW       Window for training (empty processes the whole
                        document, 1 processes 1 sentence at a time, etc)
  --edges [EDGES [EDGES ...]]
                        Edge types
  --types TYPES         Include node types (Boolean)
  --context CONTEXT     Include MM context (Boolean)
  --dist DIST           Include distance (Boolean)
  --example             Show example
  --seed SEED           Fixed random seed number
  --early_stop          Use early stopping
  --epoch EPOCH         Maximum training epoch
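One common way to combine a config file with command-line options is to let any option that was actually given override the value from the file. The sketch below illustrates that idea with a subset of the options listed above. It is an assumption about the pattern, not the repository's loader: `build_parser` and `merge_config` are hypothetical names, the config values are invented, and a plain dict stands in for the parsed yaml file.

```python
import argparse


def build_parser():
    """Parser with a few of the options above; unset options default to None."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--walks", type=int)
    parser.add_argument("--window", type=int)
    parser.add_argument("--seed", type=int)
    parser.add_argument("--epoch", type=int)
    parser.add_argument("--early_stop", action="store_true")
    return parser


def merge_config(config, args):
    """Return a copy of config where options given on the CLI take precedence."""
    merged = dict(config)
    for key, value in vars(args).items():
        # None (or False for flags) means the option was not given
        if value is not None and value is not False:
            merged[key] = value
    return merged


# Stand-in for a parsed yaml file; these values are invented for illustration
config = {"walks": 3, "window": None, "seed": 0, "epoch": 50, "early_stop": False}

args = build_parser().parse_args(["--walks", "4", "--early_stop"])
params = merge_config(config, args)
print(params)
```

The identity checks (`is not None`, `is not False`) matter here: an explicit `--seed 0` is still treated as given, whereas a plain truthiness test would silently drop it.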
To evaluate, first generate the gold-annotation (.gold) file, then run the evaluation script as follows:
$ cd evaluation/
$ python3 evaluate.py --pred path_to_predictions_file --gold ../data/CDR/processed/test.gold --label 1:CDR:2
$ python3 evaluate.py --pred path_to_predictions_file --gold ../data/GDA/processed/test.gold --label 1:GDA:2
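As a reminder of what the reported F1 measures, the sketch below computes precision, recall and F1 by comparing a set of predicted relations with a set of gold relations. It is not the logic of evaluate.py: the function name and the tuple layout (document id, first argument, second argument, label) are assumptions for illustration, and the file formats read by the script are not reproduced here.

```python
def precision_recall_f1(pred, gold):
    """Micro precision, recall and F1 over two collections of relation tuples."""
    pred, gold = set(pred), set(gold)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented toy data: (doc_id, arg1, arg2, label)
gold = [
    ("1", "C1", "D1", "1:CDR:2"),
    ("1", "C1", "D2", "1:CDR:2"),
    ("2", "C3", "D1", "1:CDR:2"),
    ("2", "C4", "D5", "1:CDR:2"),
]
pred = [
    ("1", "C1", "D1", "1:CDR:2"),  # correct
    ("2", "C3", "D1", "1:CDR:2"),  # correct
    ("2", "C9", "D9", "1:CDR:2"),  # spurious
]

p, r, f1 = precision_recall_f1(pred, gold)
print(round(p, 4), round(r, 4), round(f1, 4))
```

Intra- and inter-sentence scores could be obtained the same way by first partitioning the tuples according to whether the two arguments occur in the same sentence.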
Below are the results in terms of F1-score for the CDR and GDA datasets.
The CDR results are the same as those reported in the paper and can be reproduced using the provided source code in a similar environment.
For the GDA dataset, some inter-sentence pairs were missed due to a sentence-splitting error.
The updated GDA results are given below. Intra-sentence and overall performance are similar to those reported in the paper, but inter-sentence performance differs.
CDR, F1 (%)
Model       | L      | Dev Overall | Dev Intra | Dev Inter | Test Overall | Test Intra | Test Inter
EoG         | L = 8  | 63.57       | 68.25     | 46.68     | 63.62        | 68.25      | 50.94
EoG (Full)  | L = 2  | 58.90       | 66.49     | 40.20     | 57.66        | 66.52      | 39.41
EoG (NoInf) | -      | 50.07       | 57.58     | 33.93     | 49.24        | 60.12      | 30.64
EoG (Sent)  | L = 4  | 57.56       | 65.47     | -         | 55.22        | 65.29      | -
GDA, F1 (%)
Model       | L      | Dev Overall | Dev Intra | Dev Inter | Test Overall | Test Intra | Test Inter
EoG         | L = 16 | 78.61       | 83.00     | 46.92     | 80.18        | 84.74      | 45.66
EoG (Full)  | L = 4  | 77.89       | 82.26     | 54.26     | 79.94        | 84.60      | 54.77
EoG (NoInf) | -      | 71.60       | 77.19     | 45.34     | 73.73        | 79.22      | 47.06
EoG (Sent)  | L = 16 | 72.17       | 78.10     | -         | 73.05        | 78.83      | -
If you found this code useful and plan to use it, please cite the following paper =)
@inproceedings{christopoulou2019connecting,
title = "Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs",
author = "Christopoulou, Fenia and Miwa, Makoto and Ananiadou, Sophia",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
year = "2019",
publisher = "Association for Computational Linguistics",
pages = "4927--4938"
}