ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

Dataset

The dataset is available on Hugging Face: 🤗ChartQAPro Dataset

Evaluation Results

✅ Evaluation Instructions

To evaluate your model on ChartQAPro, follow the steps below:

1. Format Your Predictions

Save your model's predictions in a .json file containing a list of dictionaries.
Each dictionary should include the following keys (the first three are copied from the original Hugging Face dataset):

"Answer": the ground truth answer
"Question Type": the type of the question (e.g., Factoid, MCQ, etc.)
"Year": the year flag(s) from the dataset (e.g., ["YES"]), used when evaluating year-based answers
"prediction": your model’s predicted answer

📝 Example Format

[
  {
    "Answer": ["2016"],
    "Question Type": "Factoid",
    "Year": ["YES"],
    "prediction": "2016"
  },
  ...
]
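As a minimal sketch of how such a file could be produced, the snippet below pairs dataset rows with model outputs and writes them in the format above. The helper names `build_records` and `save_predictions` are hypothetical and not part of this repository. It assumes each dataset row is a dictionary that already carries the "Answer", "Question Type", and "Year" fields, and that predictions come in the same order as the rows.

```python
import json
from pathlib import Path


def build_records(samples, predictions):
    """Pair dataset rows with model outputs (hypothetical helper).

    Assumes `samples` is a sequence of dicts with the "Answer",
    "Question Type", and "Year" fields, aligned with `predictions`.
    """
    if len(samples) != len(predictions):
        raise ValueError("samples and predictions must have the same length")
    records = []
    for sample, prediction in zip(samples, predictions):
        records.append({
            # The first three keys are copied unchanged from the dataset row.
            "Answer": sample["Answer"],
            "Question Type": sample["Question Type"],
            "Year": sample["Year"],
            # The model output is stored as given (a string in the example).
            "prediction": prediction,
        })
    return records


def save_predictions(records, path):
    """Write the list of dictionaries to a .json file."""
    Path(path).write_text(
        json.dumps(records, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
```

For example, `save_predictions(build_records(rows, outputs), "predictions.json")` would write a file that can be passed to the evaluation script, where `rows` and `outputs` are your own dataset rows and model answers.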

2. Install Required Dependencies

pip install anls pandas

3. Run the Evaluation Script

python evaluate_predictions.py --predictions-file path/to/your/predictions.json

This will print your model’s performance across different question types and the overall score, following the official evaluation metrics used in the paper. 📊
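Before running the script, it can help to confirm that the predictions file has the expected shape. The following is a small pre-flight check, not part of the official evaluation script: `check_predictions` and `check_file` are hypothetical helpers that only verify the four required keys and count entries per question type. They do not compute any score.

```python
import json
from collections import Counter

REQUIRED_KEYS = ("Answer", "Question Type", "Year", "prediction")


def check_predictions(records):
    """Return (problems, counts) for a loaded predictions list.

    `problems` is a list of human-readable messages; `counts` maps each
    "Question Type" value to the number of well-formed entries.
    """
    problems = []
    counts = Counter()
    if not isinstance(records, list):
        return ["top-level JSON value must be a list"], counts
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"entry {i}: expected a dictionary")
            continue
        missing = [key for key in REQUIRED_KEYS if key not in rec]
        if missing:
            problems.append(f"entry {i}: missing keys {missing}")
            continue
        counts[rec["Question Type"]] += 1
    return problems, counts


def check_file(path):
    """Load a predictions .json file and run the check on it."""
    with open(path, encoding="utf-8") as f:
        return check_predictions(json.load(f))
```

An empty `problems` list suggests the file is structurally ready for `evaluate_predictions.py`; the per-type counts give a quick view of which question types the file covers.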

💬 Contact

If you have any questions about this work, please contact Ahmed Masry using the following email addresses: amasry17@ku.edu.tr, ahmed.elmasry24653@gmail.com, or masry20@yorku.ca.

📚 Citation

If you use ChartQAPro in your research, please cite:

@misc{masry2025chartqaprodiversechallengingbenchmark,
      title={ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering}, 
      author={Ahmed Masry and Mohammed Saidul Islam and Mahir Ahmed and Aayush Bajaj and Firoz Kabir and Aaryaman Kartha and Md Tahmid Rahman Laskar and Mizanur Rahman and Shadikur Rahman and Mehrad Shahmohammadi and Megh Thakkar and Md Rizwan Parvez and Enamul Hoque and Shafiq Joty},
      year={2025},
      eprint={2504.05506},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.05506}, 
}
