standalone installer rustc-1.70.0 (precompiled binary) segfaults in elf_machine_runtime_setup #112286
Comments
@juippis is the one with the problem; I hope he can provide additional details as needed. |
This sounds like it could be the same problem as #112275 |
Funnily enough I also first noticed the issue when attempting to build librsvg. Let me know what info you need, or if you want to merge these issues into one and me to provide the same info. |
What architecture is the crash happening on? I see PowerPC mentioned in the issue description, but also that you can't reproduce on PowerPC. I tried reproducing with this:
Then
And I get
(no segfault) What's the version of your dynamic linker? Mine in the Gentoo Docker image is:
The distributed x86_64 linux rustc contains optimizations from LTO, PGO, and BOLT, and this latest release contains an LLVM version bump. So if I had to guess, the cause of this crash is a bug in those optimizations that only surfaces in uncommon but valid If by building from source you mean just running |
x86-64 for juippis' report. Gyakovlev is just saying he can't hit it on ppc64 but he's forwarding the bug. Gentoo's build from source uses x.py. (Sorry if a bit terse, writing on mobile!) |
Yep, I'm on x86_64. I retraced your steps,
and rust installed with this method works! I do see the blake2b sum doesn't match with what gets installed via our rust-bin ebuild. I wonder if the manifest changed, or if the ebuild is pulling a wrong distfile? EDIT: oh, The -bin version installed by an ebuild is still broken on an identical container (lxc copy). |
Of course team Gentoo is here :)
It successfully rebuilds a bunch of my packages too (emlop, librsvg, rustic), except for libopenraw, which seems to be a language conformance/compiler strictness issue. |
Could you try |
Sure:
:( |
so our installer for -bin package also just uses install.sh. |
Issue: rust-lang/rust#112286 Signed-off-by: Georgy Yakovlev <[email protected]>
How exactly is rustc being stripped? I don't get a crash from this:
|
Nope, the executable is not the problem - see my previous comment about trying to strip with llvm-strip, which clearly shows that libLLVM is being borked. By default portage typically strips everything, including shared libraries.
|
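The workaround discussed in this thread (strip everything except the bundled LLVM shared library) can be sketched as below. This is only an illustration in Python, not how portage or any ebuild implements it: `partition_for_strip`, `SKIP_PATTERNS`, and the example file names are hypothetical, and the glob pattern is an assumption about how the library is named.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical skip list: the thread only identifies the bundled libLLVM
# shared library as being broken by stripping.
SKIP_PATTERNS = ("libLLVM*.so*",)


def partition_for_strip(paths, skip_patterns=SKIP_PATTERNS):
    """Split paths into (to_strip, to_skip) by matching the file name only."""
    to_strip, to_skip = [], []
    for path in paths:
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in skip_patterns):
            to_skip.append(path)
        else:
            to_strip.append(path)
    return to_strip, to_skip


if __name__ == "__main__":
    # Illustrative file names, not taken from a real installation.
    files = [
        "usr/bin/rustc",
        "usr/lib/librustc_driver-0123456789abcdef.so",
        "usr/lib/libLLVM-16-rust-1.70.0-stable.so",
    ]
    strip, skip = partition_for_strip(files)
    print("strip:", strip)
    print("skip: ", skip)
```

Matching on the file name rather than the full path keeps the check independent of the install prefix, which differs between the standalone installer and a distro package.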
I modified current -rust-bin-1.70.0 ebuild to not strip anything at all. |
hmm, there's simply no
# Whether to build LLVM as a dynamically linked library (as opposed to statically linked).
# Under the hood, this passes `--shared` to llvm-config.
# NOTE: To avoid performing LTO multiple times, we suggest setting this to `true` when `thin-lto` is enabled.
#link-shared = llvm.thin-lto
so if the bundled LLVM is built with LTO, the solib is also installed as a separate file and borked by stripping. |
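The commented-out default quoted above (`#link-shared = llvm.thin-lto`) reads as "link-shared follows thin-lto unless set explicitly". A minimal sketch of that reading, assuming `thin-lto` itself defaults to false; `resolve_link_shared` is a hypothetical helper and not part of the Rust bootstrap code:

```python
def resolve_link_shared(llvm_section):
    """Return the effective link-shared value for an [llvm] config table.

    Assumption: an unset link-shared falls back to the value of thin-lto,
    and an unset thin-lto is treated as False.
    """
    thin_lto = bool(llvm_section.get("thin-lto", False))
    return bool(llvm_section.get("link-shared", thin_lto))


if __name__ == "__main__":
    # With ThinLTO enabled and link-shared unset, LLVM ends up as a separate
    # shared library, i.e. a file that a distro strip step will also touch.
    print(resolve_link_shared({"thin-lto": True}))
    print(resolve_link_shared({}))
```

Under this reading, an explicit `link-shared = false` still wins over the ThinLTO-derived default.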
just for the record, here's how it's normally stripped:
|
I've added |
I did a local build with
And can confirm, that the build that appears under Then I disabled BOLT with this patch
diff --git a/src/ci/stage-build.py b/src/ci/stage-build.py
index 91bd137085e..9c9960e25e3 100644
--- a/src/ci/stage-build.py
+++ b/src/ci/stage-build.py
@@ -156,7 +156,7 @@ class LinuxPipeline(Pipeline):
         ))
 
     def supports_bolt(self) -> bool:
-        return True
+        return False
 
     def executable_extension(self) -> str:
         return ""
Did a |
@gyakovlev I'm afraid that did not work; -r1 crashes again. Just because stripping libLLVM breaks it does not mean other libs are fine (or something is wrong with the dostrip expression). |
I had to remove two features in make.conf which prevented emerging, but well did back in Sep 22. As I recall it was force-mirror (a make.conf feature anyway) that made the librsvg build fail. Googling led me to a Rust bug report that made it look like a Rust problem (17.1 hardened w/ amd USE), and then to this bug report, https://bugs.gentoo.org/907492, so I thought it must be Rust. I couldn't have been more wrong. Unrelated but notable: |
Issue: rust-lang/rust#112286 Signed-off-by: Joonas Niilola <[email protected]>
I bumped it again to |
we are getting reports that source rust-1.70.0 built and linked to system copy of LLVM (which is stripped) also segfaults in a similar way, but with no BOLT involvement in that case. arm64 and riscv included, not only x86_64. |
I am on x86_64 on gentoo as well, and I don't see segfaults w/ rustc
Though I am using mold linker instead
|
I'm on x86_64 on Gentoo and using system copy of LLVM and mold linker and don't see any segfaults
My use flags:
|
I noticed a similar issue on ubuntu 22.04 when installed in a container. When I install 1.69 everything still works fine. |
See this issue for problems with the official build rust-lang/rust#112286 Change-Id: I902f061ec3398dc7c2df6fb37f9284f58cf73d7d
Is the BOLT optimization so important that it needs to be kept without a solution being found for this? Rolling new binaries without it for 1.70.0 would unblock us in Gentoo (and it affects other distros too). We're currently stuck on 1.69.0. |
@saethlin mentioned the possibility this is just us encountering llvm/llvm-project#56738 again. Nominating at lqd's recommendation. P-high may be uh, too high, but BOLT has been guilty of causing a lot of Rust miscompilations before, so I am assuming that it is a high priority to figure out what is actually going on... and maybe it will be a lower priority after that.
|
@thesamesam Can you please clarify why this is blocking Gentoo? Does "just don't strip libLLVM.so" not work as a workaround for some reason? |
@gyakovlev Are there any more details or some way we can reproduce the other reports you mentioned in #112286 (comment)? That's the only comment in this thread that doesn't make sense to me. |
FWIW, I asked about this on the LLVM Discord and it seems that it's a known issue that is marked as "wontfix" for the moment (apparently BOLT produces a spec-compliant header, which cannot be parsed by binutils |
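For anyone who wants to compare a library before and after stripping, here is a minimal sketch that decodes the ELF64 file header using only the Python standard library. It assumes a 64-bit little-endian ELF file (as on x86_64) and does not attempt to identify which part of the BOLT output binutils fails to parse; `parse_elf64_header` and `read_elf64_header` are hypothetical helper names.

```python
import struct
from collections import namedtuple

# Layout of the ELF64 file header (Elf64_Ehdr), little-endian, 64 bytes.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

Elf64Header = namedtuple(
    "Elf64Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def parse_elf64_header(data):
    """Decode the first 64 bytes of a little-endian ELF64 file."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("not enough data for an ELF64 header")
    header = Elf64Header._make(ELF64_HEADER.unpack_from(data))
    if header.e_ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the data encoding
    # (1 = little-endian); anything else is outside this sketch's scope.
    if header.e_ident[4] != 2 or header.e_ident[5] != 1:
        raise ValueError("only little-endian ELF64 is handled here")
    return header


def read_elf64_header(path):
    """Read and decode the ELF64 header of the file at path."""
    with open(path, "rb") as handle:
        return parse_elf64_header(handle.read(ELF64_HEADER.size))
```

Running `read_elf64_header` on the unstripped and the stripped copy of the same library and comparing fields such as `e_phoff`, `e_phnum`, `e_shoff`, and `e_shnum` shows how the header changed between the two.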
To actually answer your question: |
Thank you for explaining! Given #112286 (comment) and the bits below w/ llvm-strip, I feel as if it's likely to keep biting people, but that's up to you folks & I appreciate the help.
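Okay, so I spoke to some people internally and it looks like it was a combination of:
- Typo in the initial don't-strip-this-single-file
- Someone reporting a problem with that
- Adding the total strip-disabling and people getting confused about when the issue happens? (possibly not having picked up the previous fix?)
- The fact that llvm-strip seemingly didn't/doesn't help making it sound like another problem existed.
So, my plan now is:
- wait for someone who previously had an issue on ppc64;
- just totally disable stripping for the time being so we can get this out;
- make sure any issues get reported to us and then upstream if appropriate (rather than comments on IRC etc where it's easy to get confused about what someone's env is);
- re-evaluate disabling stripping for certain files instead if nothing comes up from the previous point.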
Thank you again folks for the help and I'll let you know what happens.
Note that we have reports of |
If you still have issues with Rust 1.70, please file a Gentoo bug with all of the details.

I've written up what the situation is on the Rust bug [0], but pasting it inline:

"""
Okay, so I spoke to some people internally and it looks like it was a combination of:
- Typo in the initial don't-strip-this-single-file
- Someone reporting a problem with that
- Adding the total strip-disabling and people getting confused about when the issue happens? (possibly not having picked up the previous fix?)
- The fact that llvm-strip seemingly didn't/doesn't help making it sound like another problem existed.

So, my plan now is:
- wait for someone who previously had an issue on ppc64;
- just totally disable stripping for the time being so we can get this out;
- make sure any issues get reported to us and then upstream if appropriate (rather than comments on IRC etc where it's easy to get confused about what someone's env is);
- re-evaluate disabling stripping for certain files instead if nothing comes up from the previous point.
"""

That feels to me to be a reasonable/plausible timeline of events. matoro's since done that PPC64 testing and not hit any problems; ionen's been using 1.70 for a while; I've rebuilt all rust pkgs (and used some of them) on rust{,-bin}-1.70 machines without incident.

Thank you to matoro and ionen for helping me muddle my way through here.

So, all that said, let's unmask it and handle any new issues (although I'm not expecting any) as-and-when/if they come in.

[0] rust-lang/rust#112286 (comment)

Bug: rust-lang/rust#112286
Signed-off-by: Sam James <[email protected]>
note that BOLT is only enabled on |
This was discussed in today's t-compiler meeting, starting in this zulip topic. While the issue is real and quite an uncommon thing to happen, the fact that there is in-progress work to fix it upstream made suggesting to use the workaround acceptable to the team, for the short-term at least. That also matches gentoo's plan above to disable stripping for now. This topic could be revisited in the future if issues that are impossible to work around are discovered, or if the issue is not fixed upstream as expected. Removing nomination. |
This reverts commit 1f1028e. Reason for revert: rust-lang/rust#112286 Change-Id: I58487807caaa6bd014896f28c956a7f0593ceaa5
Visited during the compiler team's |
This issue is labeled E-needs-mcve but the comments above suggest that it's being worked on. Does it still need a MCVE? |
The links above to reviews.llvm.org are now broken, because the site doesn't exist anymore. It's entirely unclear to me if this is still being worked on, but I also don't know if an MCVE would help. Unfortunately that's a question for the LLVM issue tracker, based on the above probably llvm/llvm-project#56738 |
This also appears to affect GNU Guix, as seen while building librsvg 2.56.4 with Rust 1.75 built from source with stripped LLVM:
We are using LLVM 15 for the build of Rust 1.75, and the Bolt project isn't enabled for it (so there is no |
In Gentoo we ship prebuilt Rust (from the standalone installers https://forge.rust-lang.org/infra/other-installation-methods.html#standalone-installers) as an alternative for users who do not wish to build from source, and for bootstrap purposes. However, one of the developers is observing segfaults from the precompiled binary. If they build 1.70.0 from source, it all works fine though.
Full backtrace at the bottom.
I can't reproduce on my powerpc64le-unknown-linux-gnu with standalone installer at all.
Meta
rustc --version --verbose:
Backtrace