[Bug] Why does LoRA fine-tuning take longer than full-parameter fine-tuning? #254

Open
WangRongsheng opened this issue Mar 10, 2025 · 0 comments

Environment

machine: 4*A800 (80GiB)

Training script:

export WANDB_BASE_URL="https://api.wandb.ai"
export WANDB_MODE=online
torchrun --nnodes 1 --nproc_per_node 4 --master_port 29903 \
    fastvideo/train.py \
    --seed 1024 \
    --pretrained_model_name_or_path /sds_wangby/models/hunyuan_diffusers \
    --model_type hunyuan_hf \
    --cache_dir data/.cache \
    --data_json_path /sds_wangby/models/cjy/med_vid/code-wrs/dataset/Image-Vid-Finetune-HunYuan/videos2caption.json \
    --validation_prompt_dir /sds_wangby/models/cjy/med_vid/code-wrs/dataset/Image-Vid-Finetune-HunYuan/validation \
    --gradient_checkpointing \
    --train_batch_size 16 \
    --num_latent_t 24 \
    --sp_size 4 \
    --train_sp_batch_size 1 \
    --dataloader_num_workers 4 \
    --gradient_accumulation_steps 16 \
    --max_train_steps 8000 \
    --learning_rate 8e-5 \
    --mixed_precision bf16 \
    --checkpointing_steps 500 \
    --validation_steps 100 \
    --validation_sampling_steps 50 \
    --checkpoints_total_limit 3 \
    --allow_tf32 \
    --ema_start_step 0 \
    --cfg 0.0 \
    --ema_decay 0.999 \
    --log_validation \
    --output_dir data/outputs/Finetune-Hunyuan-lora \
    --tracker_project_name Finetune-Hunyuan-lora \
    --num_frames 93 \
    --validation_guidance_scale "1.0" \
    --shift 7 \
    --use_lora \
    --lora_rank 32 \
    --lora_alpha 32 

Describe the bug

We find that LoRA fine-tuning takes longer than full-parameter fine-tuning, and training does not speed up at all when I tweak --train_batch_size or --gradient_accumulation_steps.

When I adjust --train_batch_size or --gradient_accumulation_steps, GPU memory usage stays the same; when I increase --train_sp_batch_size, memory usage goes up, but the training time becomes longer.

[image attachment]
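One thing worth noting about the timing question: LoRA freezes the base weights, but every forward and backward pass still runs through the full base model, so the per-step compute stays close to full fine-tuning (the savings are mainly gradient and optimizer-state memory), and the extra adapter matmuls plus --gradient_checkpointing recomputation can even add overhead. The sketch below is a minimal, self-contained PyTorch toy, not the FastVideo/Hunyuan training path; the sizes (DIM, DEPTH, RANK) are made up. It times one optimizer step with and without a hand-rolled LoRA-style adapter so the two modes can be compared directly on the same hardware.

```python
import time
import torch
import torch.nn as nn

# Toy stand-in for a stack of transformer-ish blocks; all sizes are arbitrary.
DIM, DEPTH, RANK, STEPS = 2048, 8, 32, 20
device = "cuda" if torch.cuda.is_available() else "cpu"


class LoRALinear(nn.Module):
    """A frozen base Linear plus a trainable low-rank A/B adapter (LoRA-style)."""

    def __init__(self, dim: int, rank: int):
        super().__init__()
        self.base = nn.Linear(dim, dim, bias=False)
        self.base.weight.requires_grad_(False)   # base weights are frozen
        self.lora_a = nn.Parameter(torch.randn(rank, dim) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(dim, rank))

    def forward(self, x):
        # The full-rank base matmul still runs; LoRA only adds two small matmuls.
        return self.base(x) + (x @ self.lora_a.t()) @ self.lora_b.t()


def make_model(use_lora: bool) -> nn.Module:
    layers = [
        LoRALinear(DIM, RANK) if use_lora else nn.Linear(DIM, DIM, bias=False)
        for _ in range(DEPTH)
    ]
    return nn.Sequential(*layers).to(device)


def time_steps(model: nn.Module):
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.AdamW(params, lr=1e-4)
    x = torch.randn(16, DIM, device=device)
    for i in range(STEPS + 1):
        if i == 1:                                # iteration 0 is warm-up
            if device == "cuda":
                torch.cuda.synchronize()
            start = time.perf_counter()
        loss = model(x).pow(2).mean()
        loss.backward()
        opt.step()
        opt.zero_grad(set_to_none=True)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / STEPS, sum(p.numel() for p in params)


for use_lora in (False, True):
    sec_per_step, n_trainable = time_steps(make_model(use_lora))
    print(f"use_lora={use_lora}: {n_trainable / 1e6:.2f}M trainable params, "
          f"{sec_per_step * 1e3:.1f} ms/step")
```

On typical hardware this usually shows the LoRA step taking about the same wall time as the full step, even though it trains far fewer parameters; the benefit shows up as lower gradient and AdamW-state memory, which is what lets you fit a larger micro-batch, rather than as a free per-step speedup.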

Reproduction

None
