Some performance improvements #6

Merged
merged 4 commits into from
Feb 18, 2023

JCBrouwer
Owner

Some progress on #4

  • Refactor optimal_textures function into OptimalTextures module (this makes it easier to use PyTorch's JIT and compile APIs)
  • Enable batch support
  • Update defaults and add some performance-related arguments to the CLI:
  --no_tf32             Disable tf32 format (probably slower).
  --cudnn_benchmark     Enable CUDNN benchmarking (probably slower unless doing a high number of iterations).
  --compile             Use PyTorch 2.0 compile function to optimize the model.
  --script              Use PyTorch JIT script function to optimize the model.
  --device DEVICE       Which device to run on.
  --memory_format {contiguous,channels_last}
                        Which memory format to use for optimization.
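
For readers wondering how flags like these usually map onto PyTorch settings, here is a rough sketch. It is not the code from this PR: `build_parser` and `apply_performance_options` are hypothetical names, the defaults shown are placeholders (the PR's actual defaults aren't listed above), and preferring `--script` over `--compile` when both are passed is just an assumption.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags listed above; defaults are assumptions.
    parser = argparse.ArgumentParser(description="Performance options (illustrative sketch)")
    parser.add_argument("--no_tf32", action="store_true", help="Disable tf32 format.")
    parser.add_argument("--cudnn_benchmark", action="store_true", help="Enable CUDNN benchmarking.")
    parser.add_argument("--compile", action="store_true", help="Optimize the model with torch.compile.")
    parser.add_argument("--script", action="store_true", help="Optimize the model with torch.jit.script.")
    parser.add_argument("--device", type=str, default="cpu", help="Which device to run on.")
    parser.add_argument(
        "--memory_format",
        choices=["contiguous", "channels_last"],
        default="contiguous",
        help="Which memory format to use for optimization.",
    )
    return parser


def apply_performance_options(model, args):
    # Imported lazily so the parser above can be used without PyTorch installed.
    import torch

    # TF32 affects matmuls and cuDNN convolutions on GPUs that support it.
    torch.backends.cuda.matmul.allow_tf32 = not args.no_tf32
    torch.backends.cudnn.allow_tf32 = not args.no_tf32

    # cuDNN autotuning costs time up front, so it tends to pay off only on long runs.
    torch.backends.cudnn.benchmark = args.cudnn_benchmark

    model = model.to(args.device)
    if args.memory_format == "channels_last":
        # Only 4D parameters (e.g. conv weights) are actually converted.
        model = model.to(memory_format=torch.channels_last)

    # Assumption: scripting takes precedence if both flags are given.
    if args.script:
        model = torch.jit.script(model)
    elif args.compile:
        model = torch.compile(model)  # requires PyTorch 2.0 or newer
    return model
```

Having the optimization live in an `nn.Module` is what makes the last step straightforward, since both `torch.jit.script` and `torch.compile` accept a module directly.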

On my 1080 Ti this version is almost twice as fast at a batch size of 1. GPU utilization is definitely still low, but upping the batch size helps a lot, and I think this is a good springboard for realizing more gains.

@JCBrouwer JCBrouwer merged commit 693556f into main Feb 18, 2023