In our paper, we conduct an extensive comparison of the components of QA-based metrics for factual consistency evaluation in summarization. Our optimized metric builds on QAEval with question consistency filtering and an improved answer overlap metric, leading to a 14% average improvement over previous QA-based metrics on the SummaC factual consistency benchmark.
git clone https://github.com/salesforce/QAFactEval
cd QAFactEval
pip install -e .
For use in scripts
Download the required pretrained models using download_models.sh.
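From the repository root, this is typically run as bash download_models.sh (assuming a Unix-like shell); the pretrained models are large, so the download may take some time.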
See run.py for an example of using the QAFactEval metric:
from qafacteval import QAFactEval

kwargs = {"cuda_device": 0, "use_lerc_quip": True,
          "verbose": True, "generation_batch_size": 32,
          "answering_batch_size": 32, "lerc_batch_size": 8}

model_folder = ""  # path to models downloaded with download_models.sh
metric = QAFactEval(
    lerc_quip_path=f"{model_folder}/quip-512-mocha",
    generation_model_path=f"{model_folder}/generation/model.tar.gz",
    answering_model_dir=f"{model_folder}/answering",
    lerc_model_path=f"{model_folder}/lerc/model.tar.gz",
    lerc_pretrained_model_path=f"{model_folder}/lerc/pretraining.tar.gz",
    **kwargs
)

results = metric.score_batch_qafacteval(["This is a source document"], [["This is a summary."]], return_qa_pairs=True)
score = results[0][0]['qa-eval']['lerc_quip']
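The second argument to score_batch_qafacteval is a list of summary lists, one list per source document, so several candidate summaries of the same source can be scored in one call. A minimal sketch, assuming the metric object constructed above and the same return layout as the example (results indexed by source, then by summary); the sentences are illustrative only:

# Hypothetical example: two candidate summaries of a single source document.
sources = ["The cat sat on the mat while the dog slept nearby."]
summaries = [["A cat sat on a mat.", "The dog chased the cat."]]

results = metric.score_batch_qafacteval(sources, summaries, return_qa_pairs=True)
# results[0] holds one entry per summary of the first (and only) source.
for i, summary_result in enumerate(results[0]):
    print(f"summary {i}: lerc_quip = {summary_result['qa-eval']['lerc_quip']}")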
Citation
When referencing this repository, please cite this paper:
@misc{fabbri-etal-2022-qafacteval,
    title={QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization},
    author={Alexander R. Fabbri and Chien-Sheng Wu and Wenhao Liu and Caiming Xiong},
    year={2022},
    eprint={2112.08542},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2112.08542},
}
License
This repository is released under the BSD-3 License.