Out-of-memory error for CogVideoX LoRA finetuning #273
Comments
We've had many folks report successful training of CogVideoX in under 16–24 GB, so I don't believe the issue is on our end. To help debug, I would first try training on a single video and overfitting it, just to make sure the training loop runs at all.
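A minimal sketch of such an overfitting run, assembled from the flags in the script below. The folder name `data_single_video` is hypothetical, `--num_processes 1` is a standard `accelerate launch` flag, and the exact dataset layout and flag set depend on the finetrainers version, so treat this as a starting point rather than the repo's prescribed command:

```bash
#!/bin/bash
# Sketch: overfit a single video to confirm the training loop runs.
# Assumes the flag names from the reporter's script below and a data root that
# already contains exactly one clip plus its caption in the layout the repo expects.

GPU_IDS="0"
DATA_ROOT="data_single_video"   # hypothetical folder with one video only

accelerate launch \
  --config_file accelerate_configs/deepspeed.yaml \
  --gpu_ids "$GPU_IDS" \
  --num_processes 1 \
  train.py \
  --model_name cogvideox \
  --data_root "$DATA_ROOT" \
  --dataloader_num_workers 0 \
  --training_type lora \
  --optimizer adamw \
  --tracker_name finetrainers-cog
```

Setting `--dataloader_num_workers 0` also keeps host-side memory out of the picture while debugging.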
```bash
#!/bin/bash

GPU_IDS="0"
DATA_ROOT="data"

# Model arguments
model_cmd="--model_name cogvideox

# Dataset arguments
dataset_cmd="--data_root $DATA_ROOT

# Dataloader arguments
dataloader_cmd="--dataloader_num_workers 4"

# Training arguments
training_cmd="--training_type lora

# Optimizer arguments
optimizer_cmd="--optimizer adamw

# Miscellaneous arguments
miscellaneous_cmd="--tracker_name finetrainers-cog

cmd="accelerate launch --config_file accelerate_configs/deepspeed.yaml --gpu_ids $GPU_IDS train.py

echo "Running command: $cmd"
```

Deepspeed.yaml

Is there something I should change in these? I am using a single video only.
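Regarding the attached Deepspeed.yaml: since the file itself is not reproduced above, here is a generic sketch of what an `accelerate` DeepSpeed config for a single-GPU ZeRO-2 run with optimizer offload typically looks like. The concrete values (`zero_stage`, offload devices, `mixed_precision`) are assumptions to compare against the attached file, not the repo's shipped config:

```bash
#!/bin/bash
# Sketch: write a generic accelerate DeepSpeed config (ZeRO-2, optimizer offload to CPU).
# This is the standard `accelerate config` file format, not the exact file from the repo;
# all values are assumptions to compare against the reporter's Deepspeed.yaml.
cat > accelerate_configs/deepspeed.yaml <<'EOF'
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: cpu
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
use_cpu: false
EOF
```

If the attached file uses ZeRO-3 or no optimizer offload, that alone can change peak VRAM substantially.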
I am using PyTorch version 2.6.0+cu124.
Even after using two GPUs with 32 GB of VRAM each, I am still getting an out-of-memory error, although it is supposed to run on a single 24 GB VRAM GPU. If anyone has run LoRA finetuning under 24 GB of VRAM, please help. I am attaching a screenshot of the error. I have already tried all the optimizations mentioned in the repo.
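Before changing the training setup, one generic thing worth ruling out is allocator fragmentation. Assuming PyTorch 2.6.0 as stated above, the expandable-segments setting is a standard PyTorch knob (not a finetrainers-specific fix) that sometimes avoids an OOM when peak usage sits close to the 24 GB limit; the script name in the last comment is hypothetical:

```bash
#!/bin/bash
# Generic PyTorch OOM triage, not specific to finetrainers.
# expandable_segments reduces CUDA allocator fragmentation (PyTorch >= 2.1) and can
# help when peak memory is only slightly above the available VRAM.
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

# Confirm no other process is holding VRAM on the target GPUs before launching.
nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv

# Then relaunch the same training command, e.g.:
# bash train_cogvideox_lora.sh   # hypothetical name for the reporter's script
```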