InfoChartQA is a benchmark for evaluating multimodal large language models (MLLMs) on infographic charts enriched with pictorial visual elements like pictograms and icons. It features 5,948 pairs of infographic and plain charts that share the same underlying data but differ in visual style, enabling controlled comparisons. The dataset contains a total of 58,857 questions, including 50,920 text-based and 7,937 visual-element-based questions designed to probe model understanding of both content and complex visual design. Our analysis of 20 MLLMs reveals significant performance drops on infographic charts, highlighting key challenges and new research directions.
You can find our dataset on Hugging Face: InfoChartQA Dataset
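For reference, the entries can be loaded with the Hugging Face datasets library. A minimal sketch, assuming a placeholder repo id (substitute the actual id from the dataset link above):

```python
# Minimal loading sketch. The repo id below is a placeholder, not the real
# one -- replace it with the id from the Hugging Face link above.
from datasets import load_dataset

ds = load_dataset("your-org/InfoChartQA")  # hypothetical repo id
split = next(iter(ds))   # first available split name (an assumption)
item = ds[split][0]      # one question entry, following the schema below
print(item)
```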
Each question entry is arranged as follows. Note that visual questions may have extra input figures, which are cropped from the original figure; their bounding boxes are given in "extra_input_figure_bboxes".
{
  "question_id": id of the question,
  "question_type_name": name of the question type, e.g., "extreme" questions,
  "question_type_id": id of the question type (used only for evaluation); e.g., 72 means "extreme" questions,
  "figure_id": id of the figure,
  "question": question text,
  "answer": ground-truth answer,
  "instructions": instructions,
  "url": url of the input image,
  "extra_input_figure_ids": ids of the extra input figures,
  "extra_input_figure_bboxes": bboxes of the extra input figures, in [x, y, w, h] format without normalization,
  "difficulty": difficulty level,
  "chart_type": chart type
}
Each question is built as follows, where item is an entry of the dataset (see the sketch after this list):
- input_image: item["url"] (may need to be downloaded for models that do not support URL input)
- extra_input_images: input_image cropped using item["extra_input_figure_bboxes"]
- input_text: item["question"] + item["instructions"] (if any)
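As a concrete illustration of the steps above, here is a minimal sketch; the helper name build_question_input and the use of requests and PIL are our assumptions, not the official evaluation code:

```python
# Illustrative sketch only -- not the official pipeline. Assumes the image at
# item["url"] is fetched with requests and opened with PIL, and that bboxes
# are unnormalized [x, y, w, h] pixel coordinates, as described above.
from io import BytesIO

import requests
from PIL import Image


def build_question_input(item):
    # Download the input image (needed for models without URL input support).
    response = requests.get(item["url"], timeout=30)
    input_image = Image.open(BytesIO(response.content))

    # Crop each extra input figure from the original figure; PIL's crop takes
    # (left, upper, right, lower), so convert [x, y, w, h] accordingly.
    extra_input_images = [
        input_image.crop((x, y, x + w, y + h))
        for x, y, w, h in item.get("extra_input_figure_bboxes") or []
    ]

    # Concatenate the question text with its instructions, if any.
    input_text = item["question"]
    if item.get("instructions"):
        input_text += "\n" + item["instructions"]

    return input_image, extra_input_images, input_text
```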
For detailed evaluation instructions and usage, please refer to the Evaluation section.
- InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts
Tianchi Xie, Minzhi Lin, Mengchen Liu, Yilin Ye, Changjian Chen, Shixia Liu
If you use our work or are inspired by it, please cite:
@misc{lin2025infochartqa,
  title={InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts},
  author={Tianchi Xie and Minzhi Lin and Mengchen Liu and Yilin Ye and Changjian Chen and Shixia Liu},
  year={2025},
  eprint={2505.19028},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2505.19028},
}
Our original data contributions (all data except the charts) are distributed under the CC BY-SA 4.0 license. The copyright of the charts belongs to the original authors.
- OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics
  Paper | Code | Dataset
- ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation
  Paper | Code | Dataset
If you have any questions about this work, please contact us at linmz21@mails.tsinghua.edu.cn.

