better device memory allocation strategy #744

Open
w23 opened this issue Nov 27, 2024 · 1 comment
Comments

w23 (Owner) commented Nov 27, 2024

The problem: big allocations (say, larger than half of the default devmem allocation size, currently 64M/2 = 32M) lead to too many device memory objects, hitting the hardcoded limit. Example: UHD window resolution with the RT pipeline leads to allocating many large "g-buffer" textures and hitting the max devmem assert.
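For a concrete sense of scale, a back-of-the-envelope calculation (assuming RGBA16F g-buffer targets; the actual formats used by the renderer are an assumption here):

```c
#include <stdio.h>

int main(void) {
	// Illustrative arithmetic only; the actual g-buffer formats may differ.
	// One UHD RGBA16F render target:
	const unsigned long long bytes = 3840ULL * 2160ULL * 4 /* channels */ * 2 /* bytes per FP16 channel */;
	printf("one UHD RGBA16F target: %llu bytes (~%.1f MiB)\n", bytes, bytes / (1024.0 * 1024.0));
	// => 66355200 bytes, ~63.3 MiB: nearly a whole 64M devmem block per image,
	// so a dozen such targets consumes a dozen slots out of a fixed-size table.
	return 0;
}
```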

A better devmem allocation strategy could have the following properties:

  • Dynamic devmem array, not limited by a hardcoded max count
  • Special handling for big allocations, e.g.:
    • allocate a dedicated devmem object for each big alloc (e.g. > 32M), as sketched below
    • allocation classes: allocate 128M/256M devmem objects specifically for big allocs
    • lazy allocation: first collect how much memory will be needed, and only then allocate devmem when it is actually required.
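A minimal sketch of the first two points combined: a growable block array plus a dedicated, exactly-sized block for anything over the threshold. All names (`devmem_t`, `devmemAllocate`, etc.) are hypothetical, not the actual ref_vk API:

```c
#include <stdlib.h>
#include <vulkan/vulkan.h>

#define DEFAULT_DEVMEM_SIZE (64u * 1024u * 1024u)
#define BIG_ALLOC_THRESHOLD (32u * 1024u * 1024u)

typedef struct {
	VkDeviceMemory memory;
	VkDeviceSize size, used;
	uint32_t type_index;
} devmem_t;

// Growable array instead of a fixed-size one with a hardcoded max count.
static devmem_t *g_blocks;
static int g_blocks_count;

static devmem_t *allocateBlock(VkDevice device, VkDeviceSize size, uint32_t type_index) {
	g_blocks = realloc(g_blocks, sizeof(devmem_t) * (g_blocks_count + 1));
	devmem_t *const block = g_blocks + g_blocks_count++;
	const VkMemoryAllocateInfo mai = {
		.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
		.allocationSize = size,
		.memoryTypeIndex = type_index,
	};
	vkAllocateMemory(device, &mai, NULL, &block->memory);
	block->size = size;
	block->used = 0;
	block->type_index = type_index;
	return block;
}

// Big allocations get a dedicated, exactly-sized block; small ones are
// suballocated from shared DEFAULT_DEVMEM_SIZE blocks as before.
static devmem_t *devmemAllocate(VkDevice device, VkDeviceSize size, uint32_t type_index) {
	if (size > BIG_ALLOC_THRESHOLD)
		return allocateBlock(device, size, type_index);
	// ... find an existing shared block with enough free space, or:
	return allocateBlock(device, DEFAULT_DEVMEM_SIZE, type_index);
}
```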

Related: #502, it would be good to have devmem allocation stats: how many objects there are, how large they are, etc.
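For the stats side of #502, even a trivial dump over the block array would answer that question. A sketch continuing the hypothetical `devmem_t` above; a real version would presumably print through the engine console rather than stdout:

```c
#include <stdio.h>

// Answers "how many and how large" in one pass over the block array.
static void devmemDumpStats(void) {
	VkDeviceSize total = 0, used = 0;
	for (int i = 0; i < g_blocks_count; ++i) {
		total += g_blocks[i].size;
		used += g_blocks[i].used;
		printf("devmem[%d]: size=%llu used=%llu type=%u\n", i,
			(unsigned long long)g_blocks[i].size,
			(unsigned long long)g_blocks[i].used,
			g_blocks[i].type_index);
	}
	printf("devmem: %d blocks, %llu/%llu bytes used\n",
		g_blocks_count, (unsigned long long)used, (unsigned long long)total);
}
```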

w23 added the enhancement, performance, dev-tools, potential bug, workaround_known, refactoring, and skill issue labels Nov 27, 2024
w23 (Owner, Author) commented Dec 12, 2024

Related issues:

What we could do is:

Split allocations into three classes (see the sketch after this list)

  • Static stuff allocated at render start time: various buffers, geometry, uniforms, acceleration structures*, etc. This is allocated once and never changes afterwards. (* -- AS could also be semi-dynamic, but that's out of scope for now)
  • Long-life heap: e.g. textures. These are usually mass-allocated and deallocated at map change events, with only a few exceptions.
  • Dynamic heap: G-buffer stuff. This depends on the RT pipeline configuration, max frame size, etc. Its contents might also be aliasable based on the pipeline structure. It can also change between frames, e.g. when resizing the window.
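A hypothetical tag for these classes, attached to each allocation request (names are illustrative only):

```c
typedef enum {
	DevMemClass_Static,   // buffers, geometry, uniforms, AS: allocated once at init
	DevMemClass_LongLife, // textures: mass-allocated/freed at map change
	DevMemClass_Dynamic,  // G-buffer images: tied to RT pipeline config and frame size
} devmem_class_t;
```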

Implement different allocation strategies for them

  • Static stuff can be allocated lazily. I.e. in R_VkInit() we collect all requirements, build a list of things to allocate, collect all necessary GPU memory heaps, etc. At the end of the init function we know the needed sizes exactly and can allocate devmem objects with exactly those sizes (see the sketch after this list).
  • Long heap: keep the current strategy of filling up existing devmem objects and allocating new ones when needed. Could even do compaction/defragmentation at map boundaries, if we see that fragmentation is a problem.
  • Dynamic heap is similar to static, in the sense that on RT pipeline reload or swapchain parameter changes we can collect all the requirements (sizes, aliasing, etc.) and allocate devmems exactly. This also helps in the sense that it doesn't interfere with other allocations and doesn't introduce extra fragmentation.
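A minimal sketch of the lazy collect-then-allocate pass for the static class, restricted to a single memory type for brevity; all names here are hypothetical, not the actual ref_vk API:

```c
#include <vulkan/vulkan.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))
#define MAX_STATIC_REQUESTS 256

typedef struct {
	VkMemoryRequirements reqs;
	VkDeviceSize offset; // filled in by devmemCommitStatic()
} devmem_request_t;

static devmem_request_t g_requests[MAX_STATIC_REQUESTS];
static int g_requests_count;

// Pass 1: during R_VkInit() callers only register their requirements.
static devmem_request_t *devmemRequestStatic(const VkMemoryRequirements *reqs) {
	devmem_request_t *const req = g_requests + g_requests_count++;
	req->reqs = *reqs;
	return req;
}

// Pass 2: at the end of init the total is known exactly, so a single
// exactly-sized devmem object can back all static resources.
static VkDeviceSize devmemCommitStatic(void) {
	VkDeviceSize total = 0;
	for (int i = 0; i < g_requests_count; ++i) {
		total = ALIGN_UP(total, g_requests[i].reqs.alignment);
		g_requests[i].offset = total;
		total += g_requests[i].reqs.size;
	}
	// ... vkAllocateMemory(total) once, then vkBindBufferMemory() /
	// vkBindImageMemory() at each stored offset.
	return total;
}
```

The same collect-then-commit shape would cover the dynamic heap too: rerun both passes on RT pipeline reload or swapchain parameter change, since those are the only events that invalidate its requirements.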

w23 added a commit that referenced this issue Jan 30, 2025
Some maps fail to load. Possibly drivers have changed and require a bigger
buffer.

Related: #744, #502, etc