Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linux_android_with_fallback: avoid dlsym #598

Closed
wants to merge 1 commit into from

Conversation

tamird
Copy link
Contributor

@tamird tamird commented Feb 3, 2025

See the commit message.

Fixes #600.

@tamird
Copy link
Contributor Author

tamird commented Feb 3, 2025

r? @newpavlov

@tamird tamird force-pushed the dlsym-fallback-static branch 3 times, most recently from 4f11144 to 792ac28 Compare February 3, 2025 21:28
@tamird
Copy link
Contributor Author

tamird commented Feb 3, 2025

@newpavlov I've added a test now to check that musl works when /dev/urandom isn't available.

@newpavlov
Copy link
Member

newpavlov commented Feb 3, 2025

I don't think it's a good approach to solving this issue. We probably should write a MUSL-specific version of the init function which links statically to libc::getrandom and checks at runtime whether it's supported.

As for getrandom_test_linux_fallback_no_dev_urandom, we probably do not need it. If neither libc::getrandom, nor /dev/urandom work, then we simply can not do anything.

@tamird
Copy link
Contributor Author

tamird commented Feb 3, 2025

I don't think it's a good approach to solving this issue. We probably should write a MUSL-specific version of the init function which links statically to libc::getrandom and checks at runtime whether it's supported.

That's exactly what this does.

As for getrandom_test_linux_fallback_no_dev_urandom, we probably do not need it. If neither libc::getrandom, nor /dev/urandom work, then we simply can not do anything.

getrandom_test_linux_fallback_no_dev_urandom exercises the case where libc::getrandom does work but /dev/urandom does not. There is currently no coverage for this case, which is why this bug exists.

@tamird
Copy link
Contributor Author

tamird commented Feb 3, 2025

@newpavlov thoughts?

@newpavlov
Copy link
Member

I created #602 as a smaller alternative.

As for testing, I will do it in a different PR slightly differently.

@tamird tamird force-pushed the dlsym-fallback-static branch from 792ac28 to 92753dd Compare February 5, 2025 15:18
@tamird
Copy link
Contributor Author

tamird commented Feb 5, 2025

I've backed out the test.

I've changed the approach here - now the check for getrandom is done entirely at compile time by attempting to link a trivial program that calls getrandom.

The build script needs a bit of refinement - the check is not necessary on all platforms, but I want to get CI results and also to let you have a look.

@tamird tamird changed the title linux_android_with_fallback: avoid dlsym on musl linux_android_with_fallback: avoid dlsym Feb 5, 2025
@tamird tamird force-pushed the dlsym-fallback-static branch 2 times, most recently from d1f2012 to 49929b7 Compare February 5, 2025 15:38
The dlsym-based getrandom detection added in commit 869a4f0
("linux_android: use libc::getrandom") assumes dynamic linking, which
means it never works properly on musl where static linking is the norm.

Replace this with build-time feature checking by attempting to link to
getrandom in build.rs.
@tamird tamird force-pushed the dlsym-fallback-static branch from 49929b7 to 8c94108 Compare February 5, 2025 15:41
@newpavlov
Copy link
Member

Trying to detect getrandom availability in a build script not only complex, but also a bad approach in general IMO. A binary can be easily built on a machine with getrandom present, but executed on a machine without it or users may cross-compile.

@tamird
Copy link
Contributor Author

tamird commented Feb 5, 2025

A binary can be easily built on a machine with getrandom present, but executed on a machine without it

The only target we handle that may or may not have getrandom is android, and there we are always cross-compiling using the NDK where the target API level is specified. So I think we do not need to handle the case where getrandom is available at build time but not present in the deployment target.

or users may cross-compile.

Cross-compilation already works correctly with this approach. We are using the user's native compiler for the target.

@newpavlov
Copy link
Member

Fixed in #602

@newpavlov newpavlov closed this Feb 12, 2025
@tamird
Copy link
Contributor Author

tamird commented Feb 12, 2025

Please reopen this. I'm going to rebase it - it is a separate proposal to the fix in #602. This is about removing dlsym entirely.

@newpavlov
Copy link
Member

newpavlov commented Feb 12, 2025

I would prefer to keep using dlsym on non-MUSL targets. Alternatively, we could use raw syscalls similarly to std, but then we would lose on potential vDSO optimizations.

@tamird
Copy link
Contributor Author

tamird commented Feb 12, 2025

Let's continue the conversation over code, please. I'd appreciate some reasoning. Why is dlsym preferable?

@newpavlov
Copy link
Member

newpavlov commented Feb 12, 2025

It's much simpler than your approach based on build scripts. It does not add the cc build dependency to the most popular build target. It allows to launch binaries compiled on a "modern" Linux hosts on older Linux distributions and vice versa.

@tamird
Copy link
Contributor Author

tamird commented Feb 12, 2025

It allows to launch binaries compiled on a "modern" Linux hosts on older Linux distributions and vice versa.

Remember that this code is used only on Android and MUSL. For Android, you always use the NDK, so this doesn't apply. In MUSL, support for getrandom was added 7 years ago. Who is building with a 7 year old MUSL?

The benefits of this approach are that it's making the resulting code faster (no indirect call), the implementation in linux_android_with_fallback.rs much simpler, and inlines the fallback code on systems where getrandom is statically not available.

I don't understand why the build dependency on cc is a meaningful downside.

@newpavlov
Copy link
Member

newpavlov commented Feb 12, 2025

You are forgetting about GNU targets which are much more popular than MUSL and Android targets combined. With the current approach I can compile my project on RHEL 7 and get a binary which properly uses the getrandom syscall on modern distributions, while with your approach my application will be locked to the file fallback. It also applies to the reverse scenario (there are some quirks around glibc symbol versioning, but it's not relevant to this crate), with your approach I would get a binary which links to the getrandom symbol and would fail with a linking error on an old distribution.

making the resulting code faster (no indirect call),

As I wrote here, this is a vanishingly small difference in our case. And if you care about it or if you find dlsym problematic for some reason, you can switch to the linux_getrandom opt-in backend.

I don't understand why the build dependency on cc is a meaningful downside.

getrandom is a pretty fundamental dependency and some users are quite sensitive to additional dependencies even if they are also "fundamental", e.g. see the recent debacle around use of zerocopy in rand and troubles we had with lock file bloat. cc also can be problematic because it requires installation of a C compiler which is not present by default on many distributions.

@tamird
Copy link
Contributor Author

tamird commented Feb 12, 2025

You are forgetting about GNU targets which are much more popular than MUSL and Android targets combined. With the current approach I can compile my project on RHEL 7 and get a binary which properly uses the getrandom syscall on modern distributions, while with your approach my application will be locked to the file fallback. It also applies to the reverse scenario (there are some quirks around glibc symbol versioning, but it's not relevant to this crate), with your approach I would get a binary which links to the getrandom symbol and would fail with a linking error on an old distribution.

Ah! I missed the fact that this implementation is used on GNU (on some architectures).

I don't understand why the build dependency on cc is a meaningful downside.

getrandom is a pretty fundamental dependency and many users are quite sensitive to additional dependencies even if they are also "fundamental", e.g. see the recent debacle around use of zerocopy in rand and troubles we had with lock file bloat. cc is also can be problematic because it requires installation of a C compiler which is not present by default on many distributions.

Indeed, this makes more sense when we talk about GNU targets (Android is always cross-compiled). I agree there's no way to support GNU targets without dlsym or weak linking, which isn't possible on nightly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

linux_android_with_fallback never uses getrandom on musl
2 participants