added fallback using reflection for backward-compatibility #1573
Conversation
```diff
@@ -45,7 +45,7 @@ jobs:
     matrix:
       os: [ubuntu-24.04]
       java-version: [11]
-      spark-version: [{short: '3.4', full: '3.4.3'}, {short: '3.5', full: '3.5.5'}]
+      spark-version: [{short: '3.4', full: '3.4.3'}, {short: '3.5', full: '3.5.4'}, {short: '3.5', full: '3.5.5'}]
```
Since we advanced the minimum version to 3.5.5, I am not sure if we should add back 3.5.4 tests.
SQL tests can be done here, but tests in Comet are not done with 3.5.4 anymore.
If we decide to support 3.5.4, I would vote for doing it by adding a maven profile.
Adding a maven profile using spark 3.5.4 will break compilation for the 3.5 shim -
So to support both 3.5.4 and 3.5.5 in compilation we will need either separate shims for each patch version,
or the 3.5 shim to support all versions through reflection.
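The single-shim reflection idea can be sketched as follows. This is a minimal, self-contained illustration in Java (the project's shims are Scala, but the JVM reflection mechanism is the same); `PartitionedFileUtilStub` and its overloads are hypothetical stand-ins, not Spark's actual classes or signatures:

```java
import java.lang.reflect.Method;

public class ReflectiveShim {
    // Hypothetical stand-in for a Spark utility whose method signature changed
    // between patch versions; only the older two-argument overload exists here.
    static class PartitionedFileUtilStub {
        public static String getPartitionedFile(String file, String partitionValues) {
            return file + "/" + partitionValues;
        }
    }

    // Try the newer (hypothetical) three-argument signature first,
    // then fall back to the older two-argument one.
    public static String callGetPartitionedFile(String file, String path, String partitionValues)
            throws Exception {
        Class<?> cls = PartitionedFileUtilStub.class;
        try {
            Method m = cls.getMethod("getPartitionedFile", String.class, String.class, String.class);
            return (String) m.invoke(null, file, path, partitionValues);
        } catch (NoSuchMethodException e) {
            // Newer signature absent at runtime: use the older one.
            Method m = cls.getMethod("getPartitionedFile", String.class, String.class);
            return (String) m.invoke(null, file, partitionValues);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(callGetPartitionedFile("f", "p", "v")); // falls back, prints "f/v"
    }
}
```

Because the lookup happens via `Class.getMethod` at runtime, the same compiled shim binary can run against either patch version.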
I would say we should create separate shims for Spark 3.5.4 where necessary. I know that right now we only have profiles at the 3.4, 3.5, and 4.0 level, but I have observed a couple of API changes between patch versions.
A Comet release for 3.5 does not currently mean that it works with all 3.5.x versions. The minimum version will be 3.5.5 in the next release. Either we drop 3.5.4 support, or we create a new profile.
I went for the reflection approach after my post-merge discussion with @wForget in #1565 -
since creating different shims for 3.5.4 also means a different release for 3.5.4, which might cause confusion for users.
Since I did the upgrade to 3.5.5 separately, dropping support for 3.5.4 simply means discarding this PR.
We don't need to add 3.5.4 CI, just make sure it can be compiled successfully. Or how about deleting it after verification in this PR?
But it can't be compiled successfully - for the same reason 3.5.5 couldn't be compiled when the 3.5 shims were compiled against 3.5.4...
```scala
try {
  PartitionedFileUtil.getPartitionedFile(f, f.getPath, p.values)
} catch {
  case _: NoSuchMethodError =>
```
How about directly checking the Spark version? Like:
```scala
if (org.apache.spark.SPARK_VERSION_SHORT >= "3.5.5") {
  ... // for spark 3.5.5+
} else {
  ... // for spark 3.5.4
}
```
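One caveat with this approach, worth noting: comparing version strings with `>=` is lexicographic. That works while patch numbers stay single-digit, but would misorder a hypothetical 3.5.10. A self-contained sketch (in Java, standing in for the Scala string comparison, which behaves the same way):

```java
public class VersionCompare {
    public static void main(String[] args) {
        // Lexicographic comparison is fine for single-digit patch numbers...
        System.out.println("3.5.5".compareTo("3.5.4") > 0);   // prints true

        // ...but '1' < '5' at the patch position, so "3.5.10" sorts
        // BELOW "3.5.5" even though 10 > 5 numerically.
        System.out.println("3.5.10".compareTo("3.5.5") >= 0); // prints false
    }
}
```

A numeric comparison of split version components would avoid this if multi-digit patch releases ever need to be handled.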
And keeping the current code paths? (i.e. compiled code for 3.5.5, else reflection)
That is more explicit, but I'm not sure it brings any benefit.
The advantage of the try-catch approach is that if this param goes away in 3.5.6 (like it did in 4.0.0-preview1/2, while it's back in main...), the code will remain both backward- and forward-compatible.
I prefer to define reflection for each version. Some similar implementations:
@kazuyukitanimura WDYT?
Reflection is ok with me. I just wanted to make sure we dropped 3.5.4 support.
If we decide to support 3.5.4 as a community, I think we should test with 3.5.4 (downgrade in pom) all the way.
I prefer maintaining compatibility with more Spark patch versions to increase its adoptability. If it is known to not work with Spark <=3.5.4, we'd better explicitly document it in Supported Spark Versions.
I agree that we should support multiple 3.5.x versions. Some users do not have control over the minor Spark version they use. There isn't any technical reason why we cannot maintain diff files for multiple Spark versions and test multiple versions in CI.
I would prefer to see reflection used for both 3.5.0-3.5.4 and 3.5.5 rather than catching a runtime exception, as @wForget suggests.
Added a commit to always invoke the two methods differing between 3.5.x versions using reflection.
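The "always invoke via reflection, resolved per version" pattern suggested above can be sketched like this. A self-contained Java illustration with hypothetical names (`Util.makeFile` and its overloads are stand-ins for the two Spark methods whose signatures differ across 3.5.x; the real shim code is Scala and targets Spark's classes):

```java
import java.lang.reflect.Method;

public class VersionedDispatch {
    // Hypothetical stand-in: two overloads mimic a signature change across patch versions.
    static class Util {
        public static String makeFile(String file, String values) {
            return "old:" + file + ":" + values;
        }
        public static String makeFile(String file, String path, String values) {
            return "new:" + file + ":" + values;
        }
    }

    // Stand-in for org.apache.spark.SPARK_VERSION_SHORT.
    static final String SPARK_VERSION = "3.5.5";

    // Resolve the reflective Method once, keyed on the detected version,
    // rather than catching NoSuchMethodError at every call site.
    static Method resolve() throws NoSuchMethodException {
        if (SPARK_VERSION.compareTo("3.5.5") >= 0) {
            // 3.5.5+ signature (hypothetical: takes an extra path argument).
            return Util.class.getMethod("makeFile", String.class, String.class, String.class);
        } else {
            // pre-3.5.5 signature.
            return Util.class.getMethod("makeFile", String.class, String.class);
        }
    }

    public static void main(String[] args) throws Exception {
        Method m = resolve();
        Object result = (m.getParameterCount() == 3)
            ? m.invoke(null, "f", "p", "v")
            : m.invoke(null, "f", "v");
        System.out.println(result); // prints "new:f:v" with SPARK_VERSION = "3.5.5"
    }
}
```

Resolving the `Method` once per version keeps each call site free of exception-driven control flow, which is the explicitness the reviewers asked for.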
> Reflection is ok with me. I just wanted to make sure we dropped 3.5.4 support. If we decide to support 3.5.4 as a community, I think we should test with 3.5.4 (downgrade in pom) all the way.
@kazuyukitanimura I filed #1579 to continue this discussion.
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@             Coverage Diff              @@
##               main    #1573      +/-  ##
============================================
+ Coverage     56.12%   58.53%   +2.40%
- Complexity      976      985       +9
============================================
  Files           119      122       +3
  Lines         11743    12237     +494
  Branches       2251     2279      +28
============================================
+ Hits           6591     7163     +572
+ Misses         4012     3944      -68
+ Partials       1140     1130      -10
============================================
```

View full report in Codecov by Sentry.
LGTM pending CI. Thanks @YanivKunda. I filed follow-on #1579 to clarify our policy for supported Spark versions, continue the discussion about which versions we should test in CI, and update the documentation to list the specific versions we support.
I see the tests passed for both 3.5.4 and 3.5.5, which cover the two branches in my change.
Yes, I think this is sufficient for now. I created a follow-on PR to explain that 3.5.4 is not fully tested in CI.
Which issue does this PR close?

Closes #1572

Rationale for this change

Users on older 3.5.x versions prior to Spark 3.5.5 should be able to use the latest Comet version.

What changes are included in this PR?

Modifying the 3.5 `ShimCometScanExec` shim to fall back to reflection with the previous method signatures if the latest one is not found.

How are these changes tested?

Ran `mvn test` with the `spark-3.5` profile. Added 3.5.4 to the matrix of tested Spark versions in `spark_sql_test.yml`.