This repository contains the code for training and running inference with ChartCoder.
[2025.1.16] We have released our data generation code `data_generator`, built on Multi-modal-Self-instruct. Please follow their instructions and our code to generate the <chart, code> data pairs.
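For orientation, a <chart, code> pair couples a rendered chart image with the plotting script that produced it. The sketch below is illustrative only; the file layout and naming are hypothetical, and the actual format is defined by `data_generator`:

```python
# Minimal sketch of what a <chart, code> pair looks like; the directory
# layout and file names here are placeholders, not data_generator's format.
import os

chart_code = """
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["A", "B", "C"], [3, 7, 5], color="steelblue")
ax.set_title("Example Bar Chart")
ax.set_ylabel("Value")
fig.savefig("pair_0/chart.png", dpi=150)
"""

os.makedirs("pair_0", exist_ok=True)
with open("pair_0/chart.py", "w") as f:
    f.write(chart_code)          # save the code half of the pair
exec(chart_code)                 # render the image half of the pair
```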
- Clone this repo
```bash
git clone https://github.com/thunlp/ChartCoder.git
```
- Create environment
```bash
cd ChartCoder
conda create -n chartcoder python=3.10 -y
conda activate chartcoder
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
```
- Additional packages required for training
```bash
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
```
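To confirm the environment is set up correctly, a quick sanity check (assuming the training extras above were installed; both imports are the standard package names):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import flash_attn; print(flash_attn.__version__)"
```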
| Model | Download Link |
|---|---|
| MLP Connector | projector |
| ChartCoder | ChartCoder |
The MLP Connector provides our pre-trained MLP projector weights, which you can use directly for SFT.
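Since the code builds on LLaVA-NeXT, the projector weights are typically passed to the SFT launch command via a flag like `--pretrain_mm_mlp_adapter`. The excerpt below is a hypothetical sketch with placeholder paths; the actual flag and script arguments are in `scripts/train/finetune_siglip_a4.sh`:

```bash
# Hypothetical excerpt; see scripts/train/finetune_siglip_a4.sh for the real arguments
deepspeed llava/train/train_mem.py \
    --pretrain_mm_mlp_adapter /path/to/projector/mm_projector.bin \
    ...
```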
| Data | Download Link |
|---|---|
| Chart2Code-160k | Chart2Code-160k TBD |
The whole training process consists of two stages. To train ChartCoder, first download `siglip-so400m-patch14-384` and `deepseek-coder-6.7b-instruct`.
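For example, both base checkpoints can be fetched with `huggingface-cli` (the repo IDs below are the standard Hugging Face ones; adjust the local directories to your setup):

```bash
huggingface-cli download google/siglip-so400m-patch14-384 --local-dir ./siglip-so400m-patch14-384
huggingface-cli download deepseek-ai/deepseek-coder-6.7b-instruct --local-dir ./deepseek-coder-6.7b-instruct
```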
For Pre-training, run
```bash
bash scripts/train/pretrain_siglip.sh
```
For SFT, run
```bash
bash scripts/train/finetune_siglip_a4.sh
```
Please change the model paths to your local paths. See the corresponding `.sh` file for details.
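The variables you will typically need to point at your local copies look like the following; the names here are illustrative, and the actual ones are defined in the scripts under `scripts/train`:

```bash
# Hypothetical excerpt of the kind of paths to adjust before launching training
LLM_PATH=/path/to/deepseek-coder-6.7b-instruct
VISION_TOWER=/path/to/siglip-so400m-patch14-384
DATA_PATH=/path/to/chart2code_160k.json
OUTPUT_DIR=./checkpoints/chartcoder
```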
We also provide other training scripts, such as variants using CLIP (suffixed `_clip`) and multi-machine training (suffixed `_m`). See `scripts/train` for further information.
Please see `inference.py` for details.
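As a rough sketch only: since the code builds on LLaVA-NeXT, inference typically follows the LLaVA-style loading pattern below. The module paths, function signatures, and prompt format are assumptions carried over from LLaVA-NeXT, not the exact ChartCoder API; `inference.py` is the authoritative reference.

```python
# Hypothetical LLaVA-NeXT-style inference sketch; see inference.py for the
# actual entry point and prompt template used by ChartCoder.
import torch
from PIL import Image
from llava.model.builder import load_pretrained_model
from llava.mm_utils import process_images, tokenizer_image_token

model_path = "/path/to/ChartCoder"  # placeholder local checkpoint path
tokenizer, model, image_processor, _ = load_pretrained_model(
    model_path, None, "chartcoder"
)

# Preprocess the input chart image
image = Image.open("chart.png").convert("RGB")
image_tensor = process_images([image], image_processor, model.config)

# Build the prompt; the exact template may differ in inference.py
prompt = "<image>\nGenerate the matplotlib code that reproduces this chart."
input_ids = tokenizer_image_token(prompt, tokenizer, return_tensors="pt")

with torch.inference_mode():
    output_ids = model.generate(
        input_ids.unsqueeze(0).to(model.device),
        images=image_tensor.to(model.device, dtype=torch.float16),
        max_new_tokens=2048,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```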
Please refer to our paper for detailed performance on the ChartMimic, Plot2Code, and ChartX benchmarks. Thanks to these benchmarks for their contributions to the chart-to-code field.
If you find this work useful, consider giving this repository a star ⭐️ and citing 📝 our paper as follows:
```bibtex
@misc{zhao2025chartcoderadvancingmultimodallarge,
      title={ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation},
      author={Xuanle Zhao and Xianzhen Luo and Qi Shi and Chi Chen and Shuo Wang and Wanxiang Che and Zhiyuan Liu and Maosong Sun},
      year={2025},
      eprint={2501.06598},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2501.06598},
}
```
The code is based on LLaVA-NeXT. Thanks for this great work and for open-sourcing it!