Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to cross-compile Scio-core for Scala 3 #4638

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jchyb
Copy link

@jchyb jchyb commented Dec 15, 2022

This PR is meant to explore the possibility of making scio cross compile for Scala-3. For now the scio-core and a large part of scio-test do compile, with some of that compiled scio-test failing (more on that later).

Missing dependencies

There are some missing dependencies, most notably from the twitter stack:

  • com.twitter %% algebird-core
  • com.twitter %% chill
  • com.twitter %% chill-algebird
    None of them are too problematic, requiring only the use of CrossVersion.for3Use2_13. This does mean that there are some additional dependencies of which the 2.13 version has to be resolved, as otherwise we would get classpath conflicts (same thing will have to be done for any downstream projects).

Missing shapeless 2 for Scala 3 is not a major issue, as it only leads to 2 changes:

  • Having to reimplement the Low Priority functionality, which while not yet done, seems very simple to port for Scala 3
  • Implicits not being cached in a dedicated macro (not sure if and how to reimplement this yet, but to me it seems as not a critical issue, unless I misunderstood)

Various Magnolify modules are used in scio-test and scio-avro. Magnolify appears to already have a similar draft PR, where they are waiting for a solution of a few pending issues, including some we independently need solved for here as well. For now magnolify modules and code that depends on it was moved to a scala-2 only module.

Macros

Some macros were rewritten, with the missing ones having a suitable comment near them. Most of the problematic (for now) macros stem from the use of Magnolia.
While Magnolia does have a Scala 3 version, it has a fundamentally different API and implementation from the Scala 2 Magnolia, for now using Mirrors instead of macros. This means that not all previously supported types will work, including AnyVal, or any other non case class types. The good news is that an alternative implementation of Magnolia was being investigated (softwaremill/magnolia#321), where it could be possible to support all those types. Absolute worst case scenario - we can fork that.

Another macro that remains unported for now is the To.safe method. It looks doable, but we would have to construct schemas manually, based on the passed types instead of using eval like before. It should be possible, but I did not have the time to take care of this myself yet, especially considering it is relatively non critical to compiling and running the core module and tests.

Any macro annotations are being commented-out / ignored for now, until they will be added to the language.

Testing

Because of the previously mentioned missing magnolify dependencies, some test suites that relied on them were moved to a separate Scala-2 only module. Additionally, there were some individual tests that did not compile either, mostly because of the aforementioned lack of support of Scala 3 Magnolia for non case class types - those tests were simply commented out for now.

I haven't had the time to properly setup the environment and analyze the test outputs, but for now many more tests fail on Scala 3 than on Scala 2, with the common error message being something like:

java.lang.IllegalArgumentException: com.spotify.scio.util.Functions$$anon$8, @ProcessElement processElement(Object, OutputReceiver), @ProcessElement processElement(Object, OutputReceiver), parameter of type OutputReceiver at index 1: OutputReceiver should be parameterized by U
[info]   at org.apache.beam.sdk.transforms.reflect.DoFnSignatures$ErrorReporter.throwIllegalArgument(DoFnSignatures.java:2403)
[info]   at org.apache.beam.sdk.transforms.reflect.DoFnSignatures$ErrorReporter.checkArgument(DoFnSignatures.java:2409)
[info]   at org.apache.beam.sdk.transforms.reflect.DoFnSignatures.analyzeExtraParameter(DoFnSignatures.java:1412)
[info]   at org.apache.beam.sdk.transforms.reflect.DoFnSignatures.analyzeProcessElementMethod(DoFnSignatures.java:1236)
[info]   at org.apache.beam.sdk.transforms.reflect.DoFnSignatures.parseSignature(DoFnSignatures.java:644)
[info]   at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
[info]   at org.apache.beam.sdk.transforms.reflect.DoFnSignatures.getSignature(DoFnSignatures.java:300)
[info]   at org.apache.beam.sdk.transforms.ParDo.validate(ParDo.java:614)
[info]   at org.apache.beam.sdk.transforms.ParDo.of(ParDo.java:403)
[info]   at com.spotify.scio.values.SCollection.parDo(SCollection.scala:231)

* Adjust the build.sbt, add an additional option for checking S3 macros
* Remove macro methods from classes and move them to separate modules
adding them as mix-ins (e.g. the fallback method from Coder and
safe from To)
* adjust the Magnolia usage to accomodate the new Scala 3 API
* some macro were reimplemented, other are still unimplemented (like
To.safe)
* references to static annotation macro were removed for now, until the
experimental support for them will be added
* redesign the usage of ClassTag in ScioContext for Scala 3 (now we
generate ClassTag[Nothing] ourselves, if necessary)
* some Scala 3 macros were moved to scio-core, for typechecking reasons
* other minor changes, like added type annotations
This was done to avoid hacks in build.sbt, and to keep the scio-avro
dependency for Scala 2 scio-test, while removing it for Scala 3.
scio-avro was not yet migrated to SCala 3, as it has some macro
dependencies, that are still waiting for the macro static annotation
support.
* Since the module depends on scio-avro (not yet being compiled for
Scala 3), some test suites that depend on it were moved to a scala-2
only module.
* Some individual tests that were failing to compile were commented out
for now - most of them test magnolia-based Coder derivation, which is
only implemented for case classes for now
@RustedBones
Copy link
Contributor

Thanks for the draft PR @jchyb !
Will see if we can work soon on Magnolia for scala 3 so we can go further with this

@turb
Copy link
Contributor

turb commented Nov 7, 2024

Hello! It seems now Magnolia supports Scala3. Could not this work be resurrected?

@RustedBones
Copy link
Contributor

As scio depends on magnolify, we need to cross-build this lib 1st.
I started that work a while ago (see draft PRs.

I found some issue with magnolia around serialization. This is an issue for distributed computing as we rely heavily on typeclass being serializable. Attempted to fix here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants