Environment
machine: 4*A800 (80GiB)
train scripts:
Describe the bug
We find that LoRA fine-tuning takes longer than full-parameter fine-tuning, and tweaking --train_batch_size as well as --gradient_accumulation_steps does not speed up training at all. When we adjust --train_batch_size and --gradient_accumulation_steps, memory usage stays the same; when we adjust --train_sp_batch_size, memory usage increases but training takes even longer.
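For context on why memory can stay flat while step time does not improve, below is a minimal, hypothetical sketch (not this repo's training script) using the PEFT library: LoRA freezes the base weights and trains only small adapter matrices, so it shrinks gradient and optimizer-state memory, but every step still runs the full forward and backward pass through the frozen base model plus the adapters, so per-step compute is not reduced. The model and layer names here are made up for illustration.

```python
# Minimal sketch, assuming the `peft` package; not the training script from this repo.
# It shows that LoRA leaves the base weights frozen and adds only small trainable
# adapters: gradient/optimizer memory shrinks, but the forward/backward still
# covers the whole base model, so step time is not expected to beat full fine-tuning.
import torch.nn as nn
from peft import LoraConfig, get_peft_model


class TinyBlock(nn.Module):
    """Hypothetical stand-in for one block of the real model."""

    def __init__(self):
        super().__init__()
        self.proj_in = nn.Linear(1024, 4096)
        self.act = nn.GELU()
        self.proj_out = nn.Linear(4096, 1024)

    def forward(self, x):
        return self.proj_out(self.act(self.proj_in(x)))


lora_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["proj_in", "proj_out"])
model = get_peft_model(TinyBlock(), lora_cfg)

# Only the adapter matrices require gradients; the base Linear weights are frozen.
model.print_trainable_parameters()
# e.g. trainable params: 163,840 || all params: 8,557,568 || trainable%: ~1.9
```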
Reproduction
None