Question about finetuning with different resolution and frame nums #220

zqh0253 · 2025-01-15T02:50:46Z

Thank you for providing such an excellent codebase. I'm curious whether this repository supports fine-tuning a model (e.g., CogVideoX) with resolutions and frame counts that differ from the default setting, such as 256 x 256 x 6 (H x W x T). If so, what is the best practice for doing this?

neph1 · 2025-01-15T08:20:55Z

I've done some experimentation with this using LTX-V. Training on multiple resolutions adds flexibility to the LoRA, i.e. results get better when using different resolutions during inference (which is expected, I guess).
LTX doesn't seem to like lower frame rates. I've tried training it at as low as 12 fps, but I seem to get smeared results as it interpolates to 24 fps. It's possible to offset this somewhat by raising the fps during inference (40 fps, 60 fps, or more). I'd be happy to be corrected if anyone has better results.
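
For reference, preparing a lower-frame-rate clip like the 12 fps case above can be sketched as fixed-rate frame subsampling. This is only a minimal sketch, assuming a 24 fps source clip; `subsample_indices` and `clip_duration_seconds` are hypothetical helpers, not part of this repository or of LTX-V, and they say nothing about how the trainer itself loads frames.

```python
def subsample_indices(num_source_frames: int, source_fps: float,
                      target_fps: float, num_frames: int) -> list[int]:
    """Pick `num_frames` source-frame indices that approximate `target_fps`.

    Hypothetical helper for illustration only.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if target_fps <= 0 or target_fps > source_fps:
        raise ValueError("target_fps must be in (0, source_fps]")

    # Number of source frames that elapse per sampled frame (2.0 for 24 -> 12 fps).
    step = source_fps / target_fps
    indices = [int(i * step) for i in range(num_frames)]

    if indices[-1] >= num_source_frames:
        raise ValueError("source clip is too short for the requested sample")
    return indices


def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Wall-clock duration covered by `num_frames` frames at `fps`."""
    return num_frames / fps


if __name__ == "__main__":
    # Assumed example: a 48-frame, 24 fps source reduced to a 6-frame, 12 fps sample.
    print(subsample_indices(48, 24.0, 12.0, 6))
    print(clip_duration_seconds(6, 12.0))
```

The same 6 frames cover 0.5 s at 12 fps but only 0.25 s at 24 fps, which is one way to see why motion per frame changes when the training frame rate differs from the rate the model expects.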
