Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HHH-14694 Use stable proxy names to avoid managing proxy state #9831

Merged
merged 4 commits into from
Mar 27, 2025

Conversation

beikov
Copy link
Member

@beikov beikov commented Mar 3, 2025


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.


https://hibernate.atlassian.net/browse/HHH-14694
https://hibernate.atlassian.net/browse/HHH-19230

@bstansberry
Copy link
Contributor

@scottmarlow @Sanne FYI re this, with respect to the discussion we had at wildfly/wildfly#18748 (comment). That PR evolved to not include the change we were discussing there but if something like that comes up again this may be relevant.

@beikov
Copy link
Member Author

beikov commented Mar 4, 2025

Hey @bstansberry, since the type cache is somewhat class loader specific (because a Class can only be garbage collected once the class loader is unreachable), I would argue that Wildfly should tie the BytecodeProvider instance lifecycle to the deployment and its associated class loader.

@scottmarlow
Copy link
Member

I'm fine with updating WildFly to tie the BytecodeProvider instance lifecycle to the deployment classloader but I'm not yet sure of what we need to do to avoid the leak reported by https://issues.redhat.com/browse/WFLY-20430 which may be similar to what this change addresses or not.

Perhaps it might help to tie the BytecodeProvider instance lifecycle to the deployment classloader but I'm not sure why that would help avoid leaking the classes as mentioned in https://issues.redhat.com/browse/WFLY-20430?focusedId=26681534&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-26681534

@beikov
Copy link
Member Author

beikov commented Mar 4, 2025

I'm not sure what you are using there exactly for you test runs @scottmarlow, but storing the BytecodeProvider in a static field will most certainly cause a memory leak. It needs to be tied to a deployment or its classloader.

@scottmarlow
Copy link
Member

I'm not sure what you are using there exactly for you test runs @scottmarlow, but storing the BytecodeProvider in a static field will most certainly cause a memory leak. It needs to be tied to a deployment or its classloader.

Thanks, I'll experiment with doing that.

@scottmarlow
Copy link
Member

I have a question about the #9831 change and whether it would work with WildFly in the following situation:

  1. myapplication.ear is deployed
  2. myapplication.ear is undeployed
  3. myapplication.ear is updated to add additional entity class fields + methods
  4. myapplication.ear is deployed

My question is whether reusing the one copy of the HibernateProxy class between application redeployments would mean an incorrect HibernateProxy class would be used after redeployment that is missing certain added methods?

My other question if the answer is that redeployment of applications with use of HibernateProxy would not work if there are any class changes is how can we get to zero leaks of the HibernateProxy classes? Would that be a deeper Byte-Buddy change that should be made in Byte-Buddy?

Also, for reference:

https://issues.redhat.com/browse/WFLY-20430?focusedId=26726972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-26726972 has a comment from testing with this pull request back ported to the 6.6 branch and also the #9836 change.

From the testing I did, #9836 does help but I still see what looks like a permgen (class) leak with the #9831 change but it is not a HibernateProxy class leak.

@beikov
Copy link
Member Author

beikov commented Mar 10, 2025

My question is whether reusing the one copy of the HibernateProxy class between application redeployments would mean an incorrect HibernateProxy class would be used after redeployment that is missing certain added methods?

This shouldn't happen, unless Wildfly reuses the same class loader, which I doubt. Since these caches are bound to the class loader, this should not be an issue. The change of this PR i.e. #9831 are only effective if the BytecodeProvider is "shared" for a deployment/app and will allow reuse of proxy classes for classes defined on the same class loader throughout different persistence units of the same deployment.

In short, on WF, if you scope BytecodeProvider to the classloader/deployment, this will improve the startup time and reduce the amount of classes defined for deployments/apps that have multiple PUs that refer to the same class.
Apps that do not mess with the BytecodeProvider should not see a difference in behavior, because a new instance is created for every SessionFactory through ServiceRegistry.

My other question if the answer is that redeployment of applications with use of HibernateProxy would not work if there are any class changes is how can we get to zero leaks of the HibernateProxy classes? Would that be a deeper Byte-Buddy change that should be made in Byte-Buddy?

This is only a problem if you reuse the class loader and then also the BytecodeProvider, which WF doesn't do AFAIU.

From the testing I did, #9836 does help but I still see what looks like a permgen (class) leak with the #9831 change but it is not a HibernateProxy class leak.

Sounds like a different problem. You mentioned Instantiator and AccessOptimizer to be the culprit in the chat. Let's continue that discussion in the chat.

@beikov beikov force-pushed the HHH-14694 branch 2 times, most recently from 1a39ac3 to c5aaabf Compare March 11, 2025 19:15
Copy link
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks... suprisingly simpler. Thanks.

We'll probably need @Sanne to have a look though :)

Comment on lines 64 to 68
// Don't clear the state anymore, since the cache is not static anymore since HHH-16058 was fixed
// final BytecodeProvider bytecodeProvider =
// metadata.getMetadataBuildingOptions().getServiceRegistry()
// .getService( BytecodeProvider.class );
// addSessionFactoryObservers( new SessionFactoryObserverForBytecodeEnhancer( bytecodeProvider ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several things:

  1. Maybe remove the code instead of commenting it out?
  2. We use this observer in Quarkus: https://github.com/quarkusio/quarkus/blob/47ecd0a245b26e85a2b01536f182f3292e95b2e8/extensions/hibernate-orm/runtime/src/main/java/io/quarkus/hibernate/orm/runtime/boot/FastBootEntityManagerFactoryBuilder.java#L194-L198 . I think we should remove it there too, but let me know if you disagree.
  3. Maybe you should remove or at least deprecate this observer, if it's no longer useful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the comments, the reason these caches are cleared are to reduce memory consumption, which is in principle fine I guess, but this interferes with multi-PU use cases. Ideally, you would only clear the cache once all PUs are built, but not sure how one would know about this scenario. Another concern I have is the fact that at runtime, there is no BytecodeProvider i.e. Quarkus uses either https://github.com/quarkusio/quarkus/blob/main/extensions/hibernate-orm/runtime/src/main/java/io/quarkus/hibernate/orm/runtime/service/bytecodeprovider/RuntimeBytecodeProvider.java or the Hibernate ORM built-in "none" one: https://github.com/quarkusio/quarkus/blob/main/extensions/hibernate-orm/runtime/src/main/java/io/quarkus/hibernate/orm/runtime/customized/DisabledBytecodeProviderInitiator.java#L20

Both of which don't have state, so any clearing would be pointless.

Am I missing something and this code is actually called during Quarkus "deployment" time where an actual bytecode provider is still setup?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Quarkus Bytecode enhancement happens at build time.

So you're right:

  • I would expect there is no (real) BytecodeProvider at static/runtime init.
  • This observer is probably useless in Quarkus, and most likely has been for years.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been discussing this now with @beikov and I kind agree that it's responsibility of the integrating runtime to clear such caches; they are in a better position to control such things and they have visibility over lifecycle of multiple PUs.

Careful though I'm not sure if it's completely ineffective in Quarkus: we do also use the enhancer instance to generate enhanced proxies during static-init, it's not used "just" for entity enhancement.
It's possible that this is no longer needed in Quarkus but it was necessary - I remember implementing this hook because it was useful for keeping the native-image captured heap small.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. while I'm leaning towards simply removing this observer, let's remember that it's possible for any runtime to inject a custom enhancer implementation (which could be as simple as wrapping our implementation with some minimal overrides), and that implementation could simply do a no-op on this event being fired.
So this isn't strictly necessary to allow e.g. WildFly to do better, and in fact I believe runtimes like WildFly should probably override the instance anyway, as there are other reasons as well to do this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, regardless of this, in Quarkus at least, and at static init, the bytecode provider is still scoped per PU. So the reset can certainly happen per PU?

Doing that might trash the cache and lead to multiple proxies being generated for the same class if used in multiple PUs. It would IMO be best if you can tie the BytecodeProvider to the JPA subsystem as a whole instead, as that will also improve the startup/deployment performance of multi PU apps.

Meh. I just looked, and generation of proxies at static init is actually an edge case, and it should probably be an error case (cc @Sanne ); most proxies should be pre-generated at build time. See https://github.com/quarkusio/quarkus/blob/6b17de4ce46ba227f4b781d16fe2324b2bb283c5/extensions/hibernate-orm/runtime/src/main/java/io/quarkus/hibernate/orm/runtime/proxies/ProxyDefinitions.java#L138-L155 .
So if we do something more in Quarkus, I'd rather put work in eliminating such edge cases and remove proxy generation at static init completely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we do something more in Quarkus, I'd rather put work in eliminating such edge cases and remove proxy generation at static init completely.

Agree

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we do something more in Quarkus, I'd rather put work in eliminating such edge cases and remove proxy generation at static init completely.

Agree

FWIW, created quarkusio/quarkus#46847

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the naming of this resetCaches method is a bit misleading. It's actually not a "cache", but rather the information about which generated proxy class can be used for what managed type. Clearing this state will essentially put more pressure on the metaspace, because new proxy classes will be generated if the same managed type is used by another PU.

The more I think about this, the more I want to remove this TypeCache stuff. Why can't we simply not have the cache and implement this through well known naming?

E.g. for entity A always use the name AProxy6_6_11_SNAPSHOT i.e. encode the "proxy kind" and Hibernate version into the name, since that is everything that matters. That way, a "cache lookup" would simply be a ClassLoader#loadClass() and if that fails, we define the class instead. Also, it would be impossible to have a memory leak in the first place.

Would the described naming change make sense for ORM 6.6 + 7.0? Or is there a specific version in mind for that kind of change (e.g. like a future version)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the answer to my just asked ^ question, my next question is whether this pr might be an alternative for 6.6 if the proxy naming change is not expected to be back ported?

Copy link
Member

@Sanne Sanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, nice cleanup! Thanks

Comment on lines +244 to +248
new NamingStrategy.SuffixingRandom.BaseNameResolver.ForFixedValue( clazz.getName() )
) )
.subclass( superClass )
.implement( ReflectionOptimizer.AccessOptimizer.class )

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note

Invoking
SuffixingRandom.SuffixingRandom
should be avoided because it has been deprecated.
@beikov beikov merged commit bf154c4 into hibernate:main Mar 27, 2025
22 of 25 checks passed
@beikov beikov deleted the HHH-14694 branch March 27, 2025 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants