Wow, this is great! I just added your video to the readme. You're right, the clamping is unnecessary. It originally served to avoid a cryptic CUDA runtime error. Later I implemented a more precise solution: limiting the BART decoder's output to 2**14 tokens to match the VQGAN codebook. I'm not sure why there's a mismatch in vocabulary counts. Also, I didn't realize those were shared weights. There's probably a simpler solution here. Great video!
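For anyone following along, here's a minimal sketch of the two approaches being compared (greedy argmax for brevity; the function names are illustrative, not the repo's actual API):

```python
import torch

IMAGE_VOCAB_SIZE = 2 ** 14  # 16384 codes in the VQGAN codebook

def clamp_after(logits: torch.Tensor) -> torch.Tensor:
    # Original workaround: pick a token over the full decoder vocabulary,
    # then clamp out-of-range ids into the valid VQGAN code range.
    return torch.argmax(logits, dim=-1).clamp(0, IMAGE_VOCAB_SIZE - 1)

def restrict_before(logits: torch.Tensor) -> torch.Tensor:
    # More precise fix: drop the logits beyond 2**14 before choosing,
    # so an invalid code can never be produced in the first place.
    return torch.argmax(logits[..., :IMAGE_VOCAB_SIZE], dim=-1)
```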
I checked whether the embedding weights in the BART decoder are the same as the embedding weights in the VQGAN detokenizer. It turns out they are actually different: the BART decoder in Dalle Mega embeds to 2048 dimensions, while the VQGAN embeds to 256.
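A quick way to run this check yourself, as a sketch (the parameter names are illustrative; per the numbers above, the shapes would differ as roughly (16384, 2048) vs. (16384, 256)):

```python
import torch

def embeddings_are_shared(decoder_embed: torch.nn.Embedding,
                          vqgan_embed: torch.nn.Embedding) -> bool:
    a, b = decoder_embed.weight, vqgan_embed.weight
    # Different embedding dimensions (2048 vs. 256 here) already rule
    # out weight sharing without comparing any values.
    if a.shape != b.shape:
        return False
    # Same underlying storage means truly shared parameters; equal values
    # would only mean duplicated (tied-by-copy) weights.
    return a.data_ptr() == b.data_ptr() or torch.equal(a, b)
```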
Hi @kuprel!
First of all awesome work, you made my job that much easier. :)
I created a YouTube video where I do a deep dive/walk-through of this repo.
Maybe someone finds it useful:
https://youtu.be/x_8uHX5KngE
Hopefully it's OK to share it here in the form of an issue; do let me know!