The NarrativeQA Reading Comprehension Challenge Dataset
This repository contains the NarrativeQA dataset. It includes the list of
documents with Wikipedia summaries, links to full stories, and questions and
answers.
documents.csv - contains document_id, set, kind, story_url, story_file_size,
wiki_url, wiki_title, story_word_count, story_start, story_end. The word count
is approximate after some basic cleanup and tokenization.
third_party/wikipedia/summaries.csv - contains document_id, set, summary,
summary_tokenized. The summaries are from Wikipedia.
download_stories.sh - script to download the stories.
compare.sh - compares each downloaded story's file size with the document size we recorded.
(At the time of publication, all stories but one differed in file size by less
than 3.5%, likely due to punctuation encoding.)
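The files above can be combined by their shared document_id column. The following is a minimal Python sketch, not part of the repository: the function names, the sample rows, and the 20-byte offset are made up for illustration, and it assumes that story_file_size is a byte count and that the size check is a simple relative difference against the 3.5% figure quoted above. compare.sh itself may compute the difference differently.

```python
import csv
import io

# Made-up rows for illustration only. Real values come from documents.csv and
# third_party/wikipedia/summaries.csv; only a subset of columns is shown here.
DOCUMENTS_CSV = """document_id,set,kind,story_file_size
doc-a,train,gutenberg,1000
doc-b,test,movie,2000
"""
SUMMARIES_CSV = """document_id,set,summary
doc-a,train,A hypothetical summary.
"""


def read_rows(handle):
    """Return the rows of a CSV file as a list of dicts keyed by the header."""
    return list(csv.DictReader(handle))


def attach_summaries(documents, summaries):
    """Add a 'summary' field to each document, matched on document_id."""
    by_id = {row["document_id"]: row["summary"] for row in summaries}
    return [{**doc, "summary": by_id.get(doc["document_id"])} for doc in documents]


def relative_size_difference(recorded_bytes, downloaded_bytes):
    """Relative difference between the recorded and the downloaded size."""
    if recorded_bytes <= 0:
        raise ValueError("recorded size must be positive")
    return abs(downloaded_bytes - recorded_bytes) / recorded_bytes


def within_tolerance(recorded_bytes, downloaded_bytes, tolerance=0.035):
    """True if the sizes differ by less than the tolerance (3.5% by default)."""
    return relative_size_difference(recorded_bytes, downloaded_bytes) < tolerance


documents = read_rows(io.StringIO(DOCUMENTS_CSV))
summaries = read_rows(io.StringIO(SUMMARIES_CSV))
joined = attach_summaries(documents, summaries)

for row in joined:
    recorded = int(row["story_file_size"])
    downloaded = recorded + 20  # pretend the downloaded file is 20 bytes larger
    print(row["document_id"], row["summary"], within_tolerance(recorded, downloaded))
```

To run this against the real files, replace the `io.StringIO(...)` handles with `open("documents.csv", newline="", encoding="utf-8")` and the corresponding call for the summaries file.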
Bibtex
@article{narrativeqa,
author = {Tom\'a\v s Ko\v cisk\'y and Jonathan Schwarz and Phil Blunsom and
Chris Dyer and Karl Moritz Hermann and G\'abor Melis and
Edward Grefenstette},
title = {The {NarrativeQA} Reading Comprehension Challenge},
journal = {Transactions of the Association for Computational Linguistics},
url = {https://TBD},
volume = {TBD},
year = {2018},
pages = {TBD},
}
Dataset Metadata
The following table is necessary for this dataset to be indexed by search
engines such as Google Dataset Search.
| property | value |
|---|---|
| name | The NarrativeQA Reading Comprehension Challenge Dataset |
| alternateName | NarrativeQA |
| url | https://github.com/deepmind/narrativeqa |
| sameAs | https://github.com/deepmind/narrativeqa |
| description | This repository contains the NarrativeQA dataset. It includes the list of documents with Wikipedia summaries, links to full stories, and questions and answers. |
| provider | name: DeepMind; sameAs: https://en.wikipedia.org/wiki/DeepMind |
| license | name: Apache License, Version 2.0; url: https://www.apache.org/licenses/LICENSE-2.0.html |
| citation | https://identifiers.org/arxiv:1712.07040 |
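The property names in the metadata table match properties of the schema.org Dataset type. As a rough illustration only, the same values could be written as JSON-LD, one of the encodings schema.org supports. The `@context` and `@type` entries below are assumptions added for this sketch and do not appear in the table, and the repository itself may embed the metadata in a different markup.

```python
import json

# Values copied from the metadata table; the "@context" and "@type" keys are
# assumptions added for this sketch.
dataset_metadata = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "The NarrativeQA Reading Comprehension Challenge Dataset",
    "alternateName": "NarrativeQA",
    "url": "https://github.com/deepmind/narrativeqa",
    "sameAs": "https://github.com/deepmind/narrativeqa",
    "description": (
        "This repository contains the NarrativeQA dataset. It includes the "
        "list of documents with Wikipedia summaries, links to full stories, "
        "and questions and answers."
    ),
    "provider": {
        "@type": "Organization",
        "name": "DeepMind",
        "sameAs": "https://en.wikipedia.org/wiki/DeepMind",
    },
    "license": {
        "@type": "CreativeWork",
        "name": "Apache License, Version 2.0",
        "url": "https://www.apache.org/licenses/LICENSE-2.0.html",
    },
    "citation": "https://identifiers.org/arxiv:1712.07040",
}

# Serialize to a JSON-LD string.
encoded = json.dumps(dataset_metadata, indent=2)
print(encoded)
```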