I tried to fine-tune LTX (I2V) on my custom dataset, but I found it difficult to get training to converge. Specifically, I set the resolution and frame count to 512 * 512 * 9. With only a few dozen videos, training converges within a few hundred iterations. But with 500 or more videos, it struggles to converge even after 20,000 iterations. Has anyone encountered a similar problem before?
From my experiments, I would suggest using multi-resolution data to avoid model collapse. LTX was trained on a variety of frame counts, heights, and widths, so focusing it on one specific resolution might require more training steps to get right.
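To make the multi-resolution suggestion concrete, here is a minimal sketch of resolution bucketing in plain Python. It is not the trainer's actual data pipeline: the bucket list and the helper names `nearest_bucket` and `make_batches` are hypothetical, and the constraints that height and width are multiples of 32 and that frame counts have the form 8k + 1 are assumptions you should check against the model's documentation. Each clip is assigned to the bucket with the closest aspect ratio that does not need more frames than the clip has, and every batch draws from a single bucket so its tensors share one shape.

```python
import random
from collections import defaultdict

# Hypothetical buckets as (frames, height, width). Assumption: height and
# width are multiples of 32 and frame counts have the form 8k + 1.
BUCKETS = [(9, 512, 512), (17, 384, 640), (25, 320, 512), (9, 640, 384)]


def nearest_bucket(frames, height, width, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to the clip's.

    Only buckets that need no more frames than the clip has are considered.
    Ties on aspect ratio are broken in favour of the longer bucket.
    """
    usable = [b for b in buckets if b[0] <= frames]
    if not usable:
        raise ValueError(f"clip with {frames} frames is shorter than every bucket")
    aspect = width / height
    return min(usable, key=lambda b: (abs(b[2] / b[1] - aspect), -b[0]))


def make_batches(clips, batch_size, seed=0):
    """Group clip ids into batches that each come from a single bucket.

    `clips` maps a clip id to its native (frames, height, width).
    Returns a shuffled list of (bucket, [clip ids]) pairs.
    """
    rng = random.Random(seed)
    by_bucket = defaultdict(list)
    for clip_id, (frames, height, width) in clips.items():
        by_bucket[nearest_bucket(frames, height, width)].append(clip_id)

    batches = []
    for bucket, ids in by_bucket.items():
        rng.shuffle(ids)
        for start in range(0, len(ids), batch_size):
            batches.append((bucket, ids[start:start + batch_size]))
    rng.shuffle(batches)  # interleave buckets so training sees mixed shapes
    return batches


if __name__ == "__main__":
    clips = {
        "square": (30, 720, 720),
        "landscape": (20, 360, 640),
        "portrait": (40, 1920, 1080),
    }
    for bucket, ids in make_batches(clips, batch_size=2):
        print(bucket, ids)
```

The data loader would then resize or crop each clip to its bucket's height and width and trim it to the bucket's frame count before batching.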
Also, the current implementation of LTX Video does not account for first-frame conditioning (an essential part of the training algorithm, as mentioned in the LTX paper). I've added that in #245, but that PR is not yet ready.