
GPT-2 currently exhausts all available GPU memory on an 8 GB GPU #673

Open
BradLarson opened this issue Sep 29, 2020 · 2 comments
Comments

@BradLarson
Contributor

While testing PR #671, we noticed that the GPT-2 model now exhausts all available memory on 8 GB GPUs (for example, a GTX 1080) under both the eager-mode and X10 runtimes. It did not do this previously, so at some point the model's GPU memory usage has grown to the point where it can no longer train on these GPUs.

We should investigate why this happened and see if memory usage for this model can be brought back down.
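
For reference, a minimal sketch of the kind of training loop involved and the knobs that bound its memory use is below. `GPT2Model`, `makeBatches`, `batch.tokens`, and `batch.labels` are hypothetical placeholders rather than the actual swift-models API; the points of interest are the reduced batch size / sequence length and the per-step `LazyTensorBarrier()` under X10, which keeps the lazy trace from holding tensors live across batches.

```swift
import TensorFlow

// NOTE: `GPT2Model` and `makeBatches` are hypothetical placeholders, not the
// actual swift-models API; this is only a sketch of where memory could be bounded.
let batchSize = 2          // smaller batches shrink activation memory
let sequenceLength = 512   // shorter sequences do as well

var model = GPT2Model()                               // placeholder model
var optimizer = Adam(for: model, learningRate: 2e-5)  // S4TF Adam optimizer

for batch in makeBatches(batchSize: batchSize, sequenceLength: sequenceLength) {
    let (loss, grads) = valueWithGradient(at: model) { model -> Tensor<Float> in
        softmaxCrossEntropy(logits: model(batch.tokens), labels: batch.labels)
    }
    optimizer.update(&model, along: grads)
    print("loss: \(loss)")
    // Under the X10 runtime, a barrier after each step materializes the trace
    // so tensors from earlier batches are not kept alive across steps.
    LazyTensorBarrier()
}
```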

@xihui-wu
Contributor

xihui-wu commented Oct 5, 2020

Testing on a 16 GB GPU VM, I see that over roughly the last two-thirds of the epochs (10 epochs in total), a peak memory usage of 9187 MB occurs once per epoch, around the last training batch.
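
To pinpoint where in an epoch the spike happens, one option is to sample GPU memory after each batch. A minimal sketch, assuming a single-GPU VM with `nvidia-smi` available at `/usr/bin/nvidia-smi`; the helper would be called after each optimizer step:

```swift
import Foundation

// Query current GPU memory use in MiB via nvidia-smi (single GPU assumed);
// returns nil if the query fails or the output cannot be parsed.
func gpuMemoryUsedMiB() -> Int? {
    let task = Process()
    task.executableURL = URL(fileURLWithPath: "/usr/bin/nvidia-smi")  // assumed path
    task.arguments = ["--query-gpu=memory.used", "--format=csv,noheader,nounits"]
    let pipe = Pipe()
    task.standardOutput = pipe
    do { try task.run() } catch { return nil }
    task.waitUntilExit()
    let data = pipe.fileHandleForReading.readDataToEndOfFile()
    let text = String(data: data, encoding: .utf8) ?? ""
    return Int(text.trimmingCharacters(in: .whitespacesAndNewlines))
}

// Inside the training loop, after each step:
// if let used = gpuMemoryUsedMiB(), used > peak {
//     peak = used
//     print("batch \(batchIndex): new peak \(peak) MiB")
// }
```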

@xihui-wu
Contributor

I just verified again on a new 16 GB GPU DLVM instance created today; the issue persists.
