DALLE trained on FashionGen Dataset RESULTS 💯 #443
Comments
Hi, can you offer the Colab link and checkpoints?
You'll find the trained DALL-E weights here: https://drive.google.com/uc?id=1kEHTTZH2YbbHZjY6fTWuPb5_D-7nQ866
@alexriedel1
Yes, right, the text sequence length is 120. Is this a problem for you?
I also used the default tokenizer in this project, which uses the bpe_simple_vocab_16e6 byte pair encoder: https://github.com/lucidrains/DALLE-pytorch/blob/main/dalle_pytorch/tokenizer.py. It uses a text token size of 49408 by default. I increased the text sequence length to 120 because the FashionGen dataset uses quite long text descriptions for its images.
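A minimal sketch of that tokenization setup, assuming the `tokenizer` instance exported by `dalle_pytorch.tokenizer` and its `tokenize` helper; the sample caption and variable names are illustrative only:

```python
from dalle_pytorch.tokenizer import tokenizer  # default BPE tokenizer (bpe_simple_vocab_16e6)

# vocab size of the default tokenizer (49408 text tokens)
num_text_tokens = tokenizer.vocab_size

# FashionGen captions are long, so tokenize with a context length of 120,
# truncating anything longer instead of raising an error
caption = "Long-sleeve cotton shirt with button closure at front."  # illustrative caption
tokens = tokenizer.tokenize(caption, context_length = 120, truncate_text = True)
print(tokens.shape)  # (1, 120), zero-padded past the caption's tokens
```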
Thanks a lot!
Hi, do you still have access to the FashionGen dataset? I can't seem to find a working link for it.
DALLE on FashionGen
Text-to-image generation and re-ranking by CLIP
Best 16 of 48 generations ranked by CLIP (a re-ranking sketch follows below)
Generations from the training set (including their ground truths)
Generations based on custom prompts (without their ground truths)
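A sketch of the CLIP re-ranking step described above (generate 48 candidates for a prompt, keep the best 16), assuming OpenAI's clip package; the prompt is illustrative, the generations are stubbed with a placeholder tensor, and CLIP's full image preprocessing (normalization stats) is omitted for brevity:

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = 'cuda'
clip_model, _ = clip.load('ViT-B/32', device = device)

# placeholder for 48 DALLE generations in [0, 1] for a single prompt
images = torch.rand(48, 3, 256, 256)
prompt = 'red wool sweater with ribbed cuffs'  # illustrative prompt

# CLIP's ViT-B/32 expects 224x224 inputs; real use should also apply CLIP's
# normalization statistics before encoding
images_224 = torch.nn.functional.interpolate(images, size = 224, mode = 'bilinear')
text_tokens = clip.tokenize([prompt]).to(device)

with torch.no_grad():
    img_feats = clip_model.encode_image(images_224.to(device))
    txt_feats = clip_model.encode_text(text_tokens)

img_feats = img_feats / img_feats.norm(dim = -1, keepdim = True)
txt_feats = txt_feats / txt_feats.norm(dim = -1, keepdim = True)
scores = (img_feats @ txt_feats.T).squeeze(-1)  # cosine similarity per generation

best16 = images[scores.topk(16).indices.cpu()]  # keep the 16 best of 48 generations
```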
Model specifications
VAE
Trained VQGAN for 1 epoch on Fashion-Gen dataset
Embeddings: 1024
Batch size: 5
DALLE
Trained DALLE for 1 epoch on Fashion-Gen dataset
dim = 312
text_seq_len = 80
depth = 36
heads = 12
dim_head = 64
reversible = 0
attn_types = ('full', 'axial_row', 'axial_col', 'conv_like')
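A sketch of how these hyperparameters map onto the dalle_pytorch API; the checkpoint paths are placeholders, num_text_tokens = 49408 comes from the default tokenizer discussed above, and text_seq_len is shown as listed here (the comment thread above mentions 120 for the released weights):

```python
from dalle_pytorch import DALLE, VQGanVAE

# load the VQGAN trained for 1 epoch on Fashion-Gen (paths are placeholders)
vae = VQGanVAE('path/to/vqgan.ckpt', 'path/to/vqgan_config.yaml')

dalle = DALLE(
    vae = vae,
    dim = 312,
    num_text_tokens = 49408,   # default BPE tokenizer vocab size
    text_seq_len = 80,
    depth = 36,
    heads = 12,
    dim_head = 64,
    reversible = False,
    attn_types = ('full', 'axial_row', 'axial_col', 'conv_like')
)
```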
Optimization
Optimizer: Adam
Learning rate: 4.5e-4
Gradient Clipping: 0.5
Batch size: 7
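And a minimal training step under these optimization settings, assuming the `dalle` model from the sketch above and a hypothetical dataloader yielding batches of 7 tokenized captions and image tensors; gradient clipping uses the 0.5 max-norm listed here:

```python
import torch

opt = torch.optim.Adam(dalle.parameters(), lr = 4.5e-4)

for text, images in dataloader:  # hypothetical loader of (token, image) batches of size 7
    loss = dalle(text, images, return_loss = True)
    opt.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(dalle.parameters(), 0.5)  # clip gradients at 0.5
    opt.step()
```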