The official code implementation of the ACL 2023 Findings paper: Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs
We are still working on preparing the extracted visual features for download. In the meantime, you can extract visual features from the images yourself with the following instructions.
```shell
conda env create -f feature_extractor.yml
conda activate chart_feature_extractor
cd feature_extraction

# extract features for the ChartQA dataset (pick one split)
python chartqa_proposal.py --data_root /path/to/your/chartvqa --split train/val/test
```
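Each split is extracted with a separate invocation, so a small driver loop can cover all three. A minimal sketch (the data root is the placeholder path from above; swapping `print` for `subprocess.run` would actually execute each command):

```python
# Sketch: build one extraction command per split, mirroring the
# chartqa_proposal.py invocation above. data_root is a placeholder path.
data_root = "/path/to/your/chartvqa"
commands = [
    ["python", "chartqa_proposal.py", "--data_root", data_root, "--split", split]
    for split in ("train", "val", "test")
]
for cmd in commands:
    # swap print for subprocess.run(cmd, check=True) to run for real
    print(" ".join(cmd))
```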
Pre-training
After extracting the data, set `pretrain_datadir` in `ChartT5/src/chart_pretrain_data.py` to `/path/to/extracted_data/pretrain`.
Then run the following command to start the pre-training:
```shell
cd ChartT5
bash scripts/Chartpretrain_VLT5.sh 2
```
Downstream Task Fine-tuning
Chart VQA
After extracting the data, set `chartqa_root` in `ChartT5/src/chartqa_data.py` to `/path/to/extracted_data/chart_qa`. Also set `src_folder` in `ChartT5/scripts/ChartQA_VLT5.sh` to the same path.
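Since `chartqa_root` and `src_folder` must point at the same extracted directory, a quick pre-flight check can catch a mis-set path before a long fine-tuning run fails. A hypothetical sanity check (the path is the placeholder from above; `check_data_dir` is not part of the repository):

```python
from pathlib import Path

# Hypothetical pre-flight check: both chartqa_root (in chartqa_data.py) and
# src_folder (in ChartQA_VLT5.sh) should point at this extracted directory.
chartqa_root = Path("/path/to/extracted_data/chart_qa")  # placeholder path

def check_data_dir(root: Path) -> bool:
    """Return True if the extracted ChartQA directory exists."""
    if not root.is_dir():
        print(f"missing: {root} -- run feature extraction first")
        return False
    print(f"ok: {root}")
    return True

check_data_dir(chartqa_root)
```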
Then run the following command to start the fine-tuning:
```shell
cd ChartT5
bash scripts/ChartQA_VLT5.sh 2
```
Citation
Please cite our paper if you use our model in your work:

```bibtex
@inproceedings{zhou2023chartt5,
  title     = {Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs},
  author    = {Zhou, Mingyang and Fung, Yi R. and Chen, Long and Thomas, Christopher and Ji, Heng and Chang, Shih-Fu},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2023},
  year      = {2023}
}
```
Acknowledgements
Our code is mainly based on VLT5. We thank the authors for open-sourcing their code and checkpoints. Portions of the code also use resources from ChartQA.
License
MIT