maestro is a tool designed to streamline and accelerate the fine-tuning process for multimodal models. It provides ready-to-use recipes for fine-tuning popular vision-language models (VLMs) such as Florence-2, PaliGemma 2, and Qwen2.5-VL on downstream vision-language tasks.
To get started with maestro, install the dependencies specific to the model you wish to fine-tune. For example, to pull in the Qwen2.5-VL dependencies:

```bash
pip install "maestro[qwen_2_5_vl]"
```
Note: Some models have clashing dependencies, so we recommend creating a separate Python environment for each model to avoid version conflicts.
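One way to keep models isolated (a minimal sketch, assuming a Unix-like shell and Python's built-in `venv` module; the environment name is purely illustrative) is a dedicated virtual environment per model:

```bash
# Create and activate an environment reserved for Qwen2.5-VL fine-tuning
python -m venv .venv-qwen_2_5_vl
source .venv-qwen_2_5_vl/bin/activate

# Install maestro with the Qwen2.5-VL extra inside this environment only
pip install "maestro[qwen_2_5_vl]"
```

With the dependencies installed, you can launch a fine-tuning run directly from the CLI: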
```bash
maestro qwen_2_5_vl train \
  --dataset "dataset/location" \
  --epochs 10 \
  --batch-size 4 \
  --optimization_strategy "qlora" \
  --metrics "edit_distance"
```
If you prefer to configure and launch training from Python, the same run can be expressed with the training API:

```python
from maestro.trainer.models.qwen_2_5_vl.core import train

# Same parameters as the CLI example above
config = {
    "dataset": "dataset/location",
    "epochs": 10,
    "batch_size": 4,
    "optimization_strategy": "qlora",
    "metrics": ["edit_distance"]
}

train(config)
```
We would love your help in making this repository even better! We are especially looking for contributors with experience in fine-tuning vision-language models (VLMs). If you notice any bugs or have suggestions for improvement, feel free to open an issue or submit a pull request.