POC: Pooling allocator for head blocks for reduced memory fragmentation (experiment) #9777
+134
−74
What this PR does / why we need it:
We have observed that, particularly as the number of active streams increases, Loki becomes more and more susceptible to memory fragmentation. Go's GC is not a compacting GC, so keeping entire memory spans alive for one or two entries that are waiting on a chunk's max time can be quite expensive.
This PR replaces individual strings with offset+length data referencing a chunk of memory (4096 bytes, or larger if the line length exceeds this), thereby removing the strings being kept for block creation (these can then be freed by the Go GC). This PR adds new allocations on iteration and loading (e.g. WAL) that otherwise would not exist, thereby creating some GC pressure. This is because I have not replaced strings with `[]byte` in Loki; that would be a big PR.
You can end up with spans (grouped by size) kept alive by entries from the 1,000 low rate strings.
With a bit of size variation you can end up with substantial Go memory fragmentation by expanding the spans of different sizes.
If this patch were expanded to include removal of string allocation, the number of objects tracked by the Go GC could be shrunk substantially.
In our observations this fragmentation leads to memory usage that is usually almost twice what it should be.
Another side effect of the current architecture is that it puts a lot of stress on the GC for all the strings. This PR does not attempt to remove Loki's usage of string (e.g. replace it with byte spans). That would be a PR in excess of this experiment.
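To make the offset+length idea concrete, here is a minimal sketch of a pooling store for entry lines. It is not the code in this PR: `linePool`, `lineRef`, `Put` and `Get` are hypothetical names chosen for illustration, and only the 4096-byte chunk size (with a larger chunk for oversized lines) comes from the description above.

```go
package main

import "fmt"

// chunkSize is the default chunk size mentioned in the description.
const chunkSize = 4096

// lineRef (hypothetical) locates one line inside the pool by
// chunk index, byte offset and byte length.
type lineRef struct {
	chunk  int
	offset int
	length int
}

// linePool (hypothetical) packs many lines into a few large chunks so the
// GC tracks a handful of byte slices instead of one string per entry.
type linePool struct {
	chunks [][]byte
}

// Put copies line into the current chunk, starting a new chunk when the
// line does not fit. A line longer than chunkSize gets its own chunk.
func (p *linePool) Put(line string) lineRef {
	n := len(p.chunks)
	if n == 0 || cap(p.chunks[n-1])-len(p.chunks[n-1]) < len(line) {
		size := chunkSize
		if len(line) > size {
			size = len(line)
		}
		p.chunks = append(p.chunks, make([]byte, 0, size))
		n++
	}
	c := p.chunks[n-1]
	ref := lineRef{chunk: n - 1, offset: len(c), length: len(line)}
	// The append stays within capacity, so the chunk is never reallocated.
	p.chunks[n-1] = append(c, line...)
	return ref
}

// Get materializes the line as a new string. This copy is the extra
// allocation on iteration and loading that the description mentions.
func (p *linePool) Get(r lineRef) string {
	c := p.chunks[r.chunk]
	return string(c[r.offset : r.offset+r.length])
}

func main() {
	p := &linePool{}
	a := p.Put("level=info msg=started")
	b := p.Put("level=warn msg=slow")
	fmt.Println(a, p.Get(a))
	fmt.Println(b, p.Get(b))
	fmt.Println("chunks allocated:", len(p.chunks))
}
```

In this sketch the original string passed to `Put` is no longer referenced once it has been copied, so it can be collected independently of the pooled data. The trade-off is the copy made in `Get`, which would go away only if callers consumed `[]byte` directly.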
Special notes for your reviewer:
Comments wanted.
To date, PRs to the Loki project have gone to the stale bot without any follow-up. Prove Loki is not deaf to the community, as it so often appears. Comment.
This is not intended to merge as-is; it's intended to be a PoC.
If anyone is facing this today, you can decrease the effect of this fragmentation by shrinking the block size; this causes fewer individual entry allocations to be created. The major cause of fragmentation in Loki tends to be the strings for entries, so by reducing the number of these in flight you in turn decrease Loki's susceptibility to fragmentation.
You can also force ordered writes; this helps with both performance and memory fragmentation if your source allows for it.
Test Results
In testing (results taken after 6 hours of burn-in):
Small Server: 76MB -> 32MB RAM
Large Server: 3GB -> 2GB RAM
Fragmentation is much reduced. I expect it would be a lot better without any strings.