Repository information cache map keeps growing until out of memory #3252

Closed
raestio opened this issue Mar 6, 2025 · 4 comments
Labels
type: regression A regression from a previous release

Comments


raestio commented Mar 6, 2025

Hi,

it looks like there is a bug in RepositoryFactorySupport in combination with QuerydslPredicateExecutor causing the repository information cache map to grow indefinitely, which eventually leads to OOM:

Heap usage in version 3.4.3 (same for 3.4.x):
Image

Heap usage in version 3.3.9:
Image

Steps to reproduce and observed behaviour

I'm attaching a sample application below which is based on Spring Data REST, Spring Data JPA and Querydsl:
spring-repository-cache-map-issue.zip

  1. Each HTTP resource request (GET /someItems) produces a new entry in the repository information cache (repositoryInformationCache).
  2. The number of entries in the map therefore equals the number of executed HTTP requests.
  3. The cacheKey is always unique because its compositionHash is calculated from QuerydslJpaPredicateExecutor, which does not override the hashCode method.
    Image
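
The growth described in the steps above can be reproduced in isolation. The following is a minimal sketch, not Spring Data code: `IdentityHashedFragment`, `DemoCacheKey` and `CacheGrowthDemo` are hypothetical names, and a plain `ConcurrentHashMap` stands in for the strongly referencing cache. It only assumes that the key's hash is derived from an object that inherits `Object.hashCode()`.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a fragment implementation that does not override
// hashCode(), so every instance reports its own identity hash code.
class IdentityHashedFragment {
}

// Hypothetical cache key: repository interface name plus a hash taken from the fragment.
final class DemoCacheKey {

    private final String repositoryInterface;
    private final int compositionHash;

    DemoCacheKey(String repositoryInterface, Object fragment) {
        this.repositoryInterface = repositoryInterface;
        // Identity hash code; it typically differs for every new instance.
        this.compositionHash = fragment.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof DemoCacheKey)) {
            return false;
        }
        DemoCacheKey that = (DemoCacheKey) other;
        return compositionHash == that.compositionHash
                && repositoryInterface.equals(that.repositoryInterface);
    }

    @Override
    public int hashCode() {
        return Objects.hash(repositoryInterface, compositionHash);
    }
}

class CacheGrowthDemo {

    // Simulates the given number of requests against a strongly referencing
    // cache and returns how many entries the cache holds afterwards.
    static int entriesAfter(int requests) {
        Map<DemoCacheKey, String> cache = new ConcurrentHashMap<>();
        for (int i = 0; i < requests; i++) {
            // A fresh fragment instance per request yields a fresh key per request.
            DemoCacheKey key = new DemoCacheKey("SomeItemRepository", new IdentityHashedFragment());
            cache.computeIfAbsent(key, k -> "repository information");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // Identity hash codes are not guaranteed to be unique, so the printed
        // count can in rare cases be slightly below the number of requests.
        System.out.println(entriesAfter(1000));
    }
}
```

Because almost every lookup misses, the map holds roughly one entry per simulated request and nothing is ever evicted.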

Version 3.3.9 creates the same unique keys, but there the cache map was declared as:

new ConcurrentReferenceHashMap<>(16, ReferenceType.WEAK)

(changed in #3067 - b21b2e8)
so if I understand correctly, the values were always garbage collected because of the WEAK reference type.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Mar 6, 2025
Member

mp911de commented Mar 6, 2025

Looks like a regression. It also seems that we do not benefit from proper caching, as the cache keys do not yield many cache hits.

@mp911de mp911de added type: regression A regression from a previous release and removed status: waiting-for-triage An issue we've not yet triaged labels Mar 6, 2025
@mp911de mp911de self-assigned this Mar 6, 2025
Member

mp911de commented Mar 6, 2025

Two things contribute to this problem:

  1. Changing the cache map from a map with weak references to one with strong references.
  2. The actual cause: during RepositoryInformation creation, we instantiate repository fragments (such as the Querydsl one), and that leads to a different hash code every time.

@mp911de mp911de added this to the 3.4.4 (2024.1.4) milestone Mar 6, 2025
mp911de added a commit that referenced this issue Mar 6, 2025
We now use a refined strategy to cache RepositoryInformation and RepositoryComposition.

Previously, RepositoryComposition wasn't cached at all. Store modules that contributed a Querydsl (or other) fragment based on the interface declaration returned a new RepositoryComposition (and thus a different hashCode) each time RepositoryInformation was obtained, which led to memory leaks caused by HashMap caching.

We now use Fragment's hashCode for the cache key, so RepositoryComposition is created only once for a given repository interface and input-fragments arrangement.

Closes #3252
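
To illustrate the idea behind the commit message above, here is a minimal sketch of a cache keyed by stable, value-based hashes. It is not the actual Spring Data change: `FragmentDescriptor` and `StableKeyCacheDemo` are hypothetical names, and the assumption is only that equal fragment arrangements for the same repository interface produce equal keys.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical fragment descriptor with value-based equals/hashCode, so two
// descriptors for the same fragment arrangement are interchangeable as keys.
final class FragmentDescriptor {

    private final String signatureInterface;
    private final String implementationType;

    FragmentDescriptor(String signatureInterface, String implementationType) {
        this.signatureInterface = signatureInterface;
        this.implementationType = implementationType;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof FragmentDescriptor)) {
            return false;
        }
        FragmentDescriptor that = (FragmentDescriptor) other;
        return signatureInterface.equals(that.signatureInterface)
                && implementationType.equals(that.implementationType);
    }

    @Override
    public int hashCode() {
        return Objects.hash(signatureInterface, implementationType);
    }
}

class StableKeyCacheDemo {

    private final Map<List<Object>, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger creations = new AtomicInteger();

    // Returns the cached value for this interface and fragment arrangement,
    // creating it only on the first lookup.
    String compositionFor(String repositoryInterface, List<FragmentDescriptor> fragments) {
        List<Object> key = List.<Object>of(repositoryInterface, List.copyOf(fragments));
        return cache.computeIfAbsent(key, k -> {
            creations.incrementAndGet();
            return "composition for " + repositoryInterface;
        });
    }

    // Simulates repeated requests and returns {cache size, number of creations}.
    static int[] simulate(int requests) {
        StableKeyCacheDemo demo = new StableKeyCacheDemo();
        for (int i = 0; i < requests; i++) {
            // A new descriptor instance per request, equal by value to the earlier ones.
            FragmentDescriptor fragment =
                    new FragmentDescriptor("QuerydslPredicateExecutor", "QuerydslJpaPredicateExecutor");
            demo.compositionFor("SomeItemRepository", List.of(fragment));
        }
        return new int[] { demo.cache.size(), demo.creations.get() };
    }

    public static void main(String[] args) {
        int[] result = simulate(1000);
        System.out.println(result[0] + " " + result[1]); // prints "1 1"
    }
}
```

With a value-based key, the map stays at a single entry however many requests arrive, and every lookup after the first is a cache hit.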
@mp911de mp911de closed this as completed in 5f96594 Mar 6, 2025
Member

mp911de commented Mar 6, 2025

That's fixed now. A new snapshot build of 3.4.4-SNAPSHOT will be available from repo.spring.io in about 30 minutes. Please verify the fix. Also, feel free to test widely against this change, as it might introduce undesired side effects.

Author

raestio commented Mar 10, 2025

I've just tested it against 3.4.4-SNAPSHOT. Looks OK, thank you!
