standalone installer rustc-1.70.0 (precompiled binary) segfaults in elf_machine_runtime_setup #112286

Open
gyakovlev opened this issue Jun 4, 2023 · 37 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another.
@gyakovlev

In Gentoo we ship prebuilt Rust (from the standalone installers, https://forge.rust-lang.org/infra/other-installation-methods.html#standalone-installers) as an alternative for users who do not wish to build from source, and for bootstrap purposes. However, one of our developers is observing segfaults from the precompiled binary. If they build 1.70.0 from source, it all works fine.

Full backtrace at the bottom.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fd94b5 in elf_machine_runtime_setup (profile=0, lazy=<optimized out>, scope=0x7ffff7fab4a8, l=0x7ffff7fab120)
    at ../sysdeps/x86_64/dl-machine.h:88
88	      *(ElfW(Addr) *) (got + 1) = (ElfW(Addr)) l;

I can't reproduce this at all on my powerpc64le-unknown-linux-gnu machine with the standalone installer.

Meta

rustc --version --verbose:

rustc 1.70.0 (90c541806 2023-05-31)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: powerpc64le-unknown-linux-gnu
release: 1.70.0
LLVM version: 16.0.2
Backtrace

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fd94b5 in elf_machine_runtime_setup (profile=0, lazy=<optimized out>, scope=0x7ffff7fab4a8, l=0x7ffff7fab120)
    at ../sysdeps/x86_64/dl-machine.h:88
88	      *(ElfW(Addr) *) (got + 1) = (ElfW(Addr)) l;



Thread 1 (process 242817 "rustc"):
#0  0x00007ffff7fd94b5 in elf_machine_runtime_setup (profile=0, lazy=<optimized out>, scope=0x7ffff7fab4a8, l=0x7ffff7fab120) at ../sysdeps/x86_64/dl-machine.h:88
        cpu_features = <optimized out>
        got = 0x7fffeff24e48
        got = <optimized out>
        cpu_features = <optimized out>
#1  _dl_relocate_object (l=l@entry=0x7ffff7fab120, scope=<optimized out>, reloc_mode=1, consider_profiling=<optimized out>, consider_profiling@entry=0) at dl-reloc.c:301
        edr_lazy = <optimized out>
        textrels = 0x0
        errstring = 0x0
        lazy = <optimized out>
        skip_ifunc = 0
        consider_symbind = <optimized out>
#2  0x00007ffff7fe8941 in dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:2318
        l = 0x7ffff7fab120
        lnp = <optimized out>
        i = <optimized out>
        main_map = <optimized out>
        file_size = 140737353912872
        file = <optimized out>
        i = <optimized out>
        rtld_is_main = <optimized out>
        tcbp = <optimized out>
        state = {audit_list = {audit_strings = {0x34000000340 <error: Cannot access memory at address 0x34000000340> <repeats 12 times>, 0x0, 0x100 <error: Cannot access memory at address 0x100>, 0x0, 0x0}, length = 0, current_index = 0, current_tail = 0x0, fname = '\000' <repeats 80 times>, "W\366\036\021\000\000\000\000 \344\377\377\377\177\000\000\330\343\377\377\377\177\000\000\260\332\377\367\377\177\000\000\000\260\374\367\377\177\000\000\300\002\000\000\000\000\000\000W\022\376\367\377\177", '\000' <repeats 26 times>, "\031\347\377\377\377\177\000\000\002", '\000' <repeats 23 times>, "\020\000\000\000\000\000\000\000"...}, library_path = 0x0, library_path_source = 0x0, preloadlist = 0x0, preloadarg = 0x0, glibc_hwcaps_prepend = 0x0, glibc_hwcaps_mask = 0x0, mode = rtld_mode_normal, mode_trace_program = false, any_debug = false, version_info = false}
        ld_so_name = <optimized out>
        __PRETTY_FUNCTION__ = "dl_main"
        has_interp = <optimized out>
        first_preload = <optimized out>
        r = <optimized out>
        rtld_ehdr = <optimized out>
        rtld_phdr = <optimized out>
        cnt = <optimized out>
        need_security_init = <optimized out>
        count_modids = <optimized out>
        preloads = <optimized out>
        npreloads = <optimized out>
        preload_file = "/etc/ld.so.preload"
        rtld_multiple_ref = <optimized out>
        was_tls_init_tp_called = <optimized out>
        consider_profiling = 0
        start = <optimized out>
#3  0x00007ffff7fe507f in _dl_sysdep_start (start_argptr=start_argptr@entry=0x7fffffffe520, dl_main=dl_main@entry=0x7ffff7fe6b10 <dl_main>) at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140
        dl_main_args = {phdr = 0x555555554040, phnum = 12, user_entry = 93824992337013}
#4  0x00007ffff7fe6814 in _dl_start_final (arg=0x7fffffffe520) at rtld.c:498
        start_addr = <optimized out>
        start_addr = <optimized out>
        rtld_total_time = <optimized out>
#5  _dl_start (arg=0x7fffffffe520) at rtld.c:585
No locals.
#6  0x00007ffff7fe5668 in _start () from /lib64/ld-linux-x86-64.so.2
        TD_SLEEP = TD_SLEEP
        TD_CREATE = TD_CREATE
        _URC_FATAL_PHASE1_ERROR = _URC_FATAL_PHASE1_ERROR
        TD_CATCHSIG = TD_CATCHSIG
        cet_permissive = cet_permissive
        TD_LOCK_TRY = TD_LOCK_TRY
        RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT
        PREFERRED_FEATURE_INDEX_1 = PREFERRED_FEATURE_INDEX_1
        PREFERRED_FEATURE_INDEX_MAX = PREFERRED_FEATURE_INDEX_MAX
        arch_kind_unknown = arch_kind_unknown
        TD_SWITCHFROM = TD_SWITCHFROM
        cache_extension_tag_generator = cache_extension_tag_generator
        cache_extension_tag_glibc_hwcaps = cache_extension_tag_glibc_hwcaps
        _URC_INSTALL_CONTEXT = _URC_INSTALL_CONTEXT
        TD_DEATH = TD_DEATH
        RT_CONSISTENT = RT_CONSISTENT
        LA_ACT_CONSISTENT = LA_ACT_CONSISTENT
        rtld_mode_verify = rtld_mode_verify
        TD_MAX_EVENT_NUM = TD_TIMEOUT
        RT_DELETE = RT_DELETE
        RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT
        relocate_time = 0
        DL_LOOKUP_ADD_DEPENDENCY = DL_LOOKUP_ADD_DEPENDENCY
        TD_READY = TD_READY
        _bitindex_arch_Slow_SSE4_2 = _bitindex_arch_Slow_SSE4_2
        rtld_mode_help = rtld_mode_help
        cpuid_register_index_eax = cpuid_register_index_eax
        CPUID_INDEX_1 = CPUID_INDEX_1
        CPUID_INDEX_7 = CPUID_INDEX_7
        CPUID_INDEX_80000001 = CPUID_INDEX_80000001
        CPUID_INDEX_D_ECX_1 = CPUID_INDEX_D_ECX_1
        CPUID_INDEX_80000007 = CPUID_INDEX_80000007
        CPUID_INDEX_80000008 = CPUID_INDEX_80000008
        CPUID_INDEX_7_ECX_1 = CPUID_INDEX_7_ECX_1
        CPUID_INDEX_19 = CPUID_INDEX_19
        CPUID_INDEX_14_ECX_0 = CPUID_INDEX_14_ECX_0
        CPUID_INDEX_MAX = CPUID_INDEX_MAX
        dso_sort_algorithm_original = dso_sort_algorithm_original
        TD_CONCURRENCY = TD_CONCURRENCY
        lt_executable = lt_executable
        cpuid_register_index_ebx = cpuid_register_index_ebx
        _bitindex_arch_I686 = _bitindex_arch_I686
        cache_extension_count = cache_extension_count
        PTHREAD_MUTEX_TIMED_NP = PTHREAD_MUTEX_TIMED_NP
        PTHREAD_MUTEX_RECURSIVE_NP = PTHREAD_MUTEX_RECURSIVE_NP
        PTHREAD_MUTEX_ERRORCHECK_NP = PTHREAD_MUTEX_ERRORCHECK_NP
        PTHREAD_MUTEX_ADAPTIVE_NP = PTHREAD_MUTEX_ADAPTIVE_NP
        PTHREAD_MUTEX_NORMAL = PTHREAD_MUTEX_TIMED_NP
        PTHREAD_MUTEX_RECURSIVE = PTHREAD_MUTEX_RECURSIVE_NP
        PTHREAD_MUTEX_ERRORCHECK = PTHREAD_MUTEX_ERRORCHECK_NP
        PTHREAD_MUTEX_DEFAULT = PTHREAD_MUTEX_TIMED_NP
        PTHREAD_MUTEX_FAST_NP = PTHREAD_MUTEX_TIMED_NP
        TD_REAP = TD_REAP
        DL_LOOKUP_RETURN_NEWEST = DL_LOOKUP_RETURN_NEWEST
        _URC_HANDLER_FOUND = _URC_HANDLER_FOUND
        cpuid_register_index_ecx = cpuid_register_index_ecx
        _bitindex_arch_Avoid_Short_Distance_REP_MOVSB = _bitindex_arch_Avoid_Short_Distance_REP_MOVSB
        _bitindex_arch_Prefer_FSRM = _bitindex_arch_Prefer_FSRM
        lc_property_unknown = lc_property_unknown
        _bitindex_arch_Fast_Unaligned_Load = _bitindex_arch_Fast_Unaligned_Load
        lt_library = lt_library
        cpuid_register_index_edx = cpuid_register_index_edx
        rtld_mode_list_diagnostics = rtld_mode_list_diagnostics
        start_time = 3138663582286386
        _URC_NO_REASON = _URC_NO_REASON
        _bitindex_arch_Prefer_PMINUB_for_stringop = _bitindex_arch_Prefer_PMINUB_for_stringop
        arch_kind_other = arch_kind_other
        rtld_mode_list = rtld_mode_list
        _dl_rtld_libname = {name = 0x5555555542e0 "/lib64/ld-linux-x86-64.so.2", next = 0x7ffff7ffe220 <newname>, dont_free = 0}
        TD_IDLE = TD_IDLE
        unknown = unknown
        _URC_FATAL_PHASE2_ERROR = _URC_FATAL_PHASE2_ERROR
        cet_elf_property = cet_elf_property
        RT_ADD = RT_ADD
        _bitindex_arch_Fast_Rep_String = _bitindex_arch_Fast_Rep_String
        _bitindex_arch_MathVec_Prefer_No_AVX512 = _bitindex_arch_MathVec_Prefer_No_AVX512
        _bitindex_arch_Fast_Copy_Backward = _bitindex_arch_Fast_Copy_Backward
        _bitindex_arch_AVX_Fast_Unaligned_Load = _bitindex_arch_AVX_Fast_Unaligned_Load
        existing = existing
        _URC_NORMAL_STOP = _URC_NORMAL_STOP
        lc_property_none = lc_property_none
        nonexisting = nonexisting
        load_time = 2211122
        TD_PREEMPT = TD_PREEMPT
        TD_TIMEOUT = TD_TIMEOUT
        TD_ALL_EVENTS = TD_ALL_EVENTS
        _URC_END_OF_STACK = _URC_END_OF_STACK
        _bitindex_arch_Prefer_No_AVX512 = _bitindex_arch_Prefer_No_AVX512
        arch_kind_intel = arch_kind_intel
        rtld_mode_trace = rtld_mode_trace
        rtld_mode_list_tunables = rtld_mode_list_tunables
        _bitindex_arch_Prefer_ERMS = _bitindex_arch_Prefer_ERMS
        cet_always_on = cet_always_on
        dso_sort_algorithm_dfs = dso_sort_algorithm_dfs
        LA_ACT_DELETE = LA_ACT_DELETE
        TD_SWITCHTO = TD_SWITCHTO
        _bitindex_arch_Slow_BSF = _bitindex_arch_Slow_BSF
        arch_kind_zhaoxin = arch_kind_zhaoxin
        cet_always_off = cet_always_off
        arch_kind_amd = arch_kind_amd
        _URC_FOREIGN_EXCEPTION_CAUGHT = _URC_FOREIGN_EXCEPTION_CAUGHT
        _bitindex_arch_Prefer_No_VZEROUPPER = _bitindex_arch_Prefer_No_VZEROUPPER
        TD_EVENT_NONE = TD_ALL_EVENTS
        TD_EVENTS_ENABLE = TD_EVENTS_ENABLE
        rtld_mode_normal = rtld_mode_normal
        TD_MIN_EVENT_NUM = TD_READY
        lc_property_valid = lc_property_valid
        LA_ACT_ADD = LA_ACT_ADD
        TD_PRI_INHERIT = TD_PRI_INHERIT
        _bitindex_arch_Fast_Unaligned_Copy = _bitindex_arch_Fast_Unaligned_Copy
        _URC_CONTINUE_UNWIND = _URC_CONTINUE_UNWIND
        DL_LOOKUP_FOR_RELOCATE = DL_LOOKUP_FOR_RELOCATE
        lt_loaded = lt_loaded
        DL_LOOKUP_GSCOPE_LOCK = DL_LOOKUP_GSCOPE_LOCK
        _dl_rtld_libname2 = {name = 0x0, next = 0x0, dont_free = 0}
        RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT
        _bitindex_arch_I586 = _bitindex_arch_I586
        __GI__dl_argv = 0x7fffffffe528
        _rtld_global_ro = {_dl_debug_mask = 0, _dl_platform = 0x7fffffffe729 "x86_64", _dl_platformlen = 6, _dl_pagesize = 4096, _dl_minsigstacksize = 3376, _dl_inhibit_cache = 0, _dl_initial_searchlist = {r_list = 0x7ffff7faca68, r_nlist = 12}, _dl_clktck = 100, _dl_verbose = 0, _dl_debug_fd = 2, _dl_lazy = 1, _dl_bind_not = 0, _dl_dynamic_weak = 0, _dl_fpu_control = 895, _dl_hwcap = 2, _dl_auxv = 0x7fffffffe5c0, _dl_x86_cpu_features = {basic = {kind = arch_kind_amd, max_cpuid = 16, family = 25, model = 33, stepping = 0}, features = {{{cpuid_array = {10620688, 153094144, 2130194955, 395049983}, cpuid = {eax = 10620688, ebx = 153094144, ecx = 2130194955, edx = 395049983}}, {active_array = {0, 0, 2128097795, 394821904}, active = {eax = 0, ebx = 0, ecx = 2128097795, edx = 394821904}}}, {{cpuid_array = {0, 563910569, 4195996, 16}, cpuid = {eax = 0, ebx = 563910569, ecx = 4195996, edx = 16}}, {active_array = {0, 562823976, 4195864, 16}, active = {eax = 0, ebx = 562823976, ecx = 4195864, edx = 16}}}, {{cpuid_array = {10620688, 536870912, 1975662591, 802421759}, cpuid = {eax = 10620688, ebx = 536870912, ecx = 1975662591, edx = 802421759}}, {active_array = {0, 0, 353, 134217728}, active = {eax = 0, ebx = 0, ecx = 353, edx = 134217728}}}, {{cpuid_array = {15, 840, 6144, 0}, cpuid = {eax = 15, ebx = 840, ecx = 6144, edx = 0}}, {active_array = {7, 0, 0, 0}, active = {eax = 7, ebx = 0, ecx = 0, edx = 0}}}, {{cpuid_array = {0, 59, 0, 26521}, cpuid = {eax = 0, ebx = 59, ecx = 0, edx = 26521}}, {active_array = {0, 0, 0, 0}, active = {eax = 0, ebx = 0, ecx = 0, edx = 0}}}, {{cpuid_array = {12336, 287241815, 20511, 65536}, cpuid = {eax = 12336, ebx = 287241815, ecx = 20511, edx = 65536}}, {active_array = {0, 512, 0, 0}, active = {eax = 0, ebx = 512, ecx = 0, edx = 0}}}, {{cpuid_array = {0, 0, 0, 0}, cpuid = {eax = 0, ebx = 0, ecx = 0, edx = 0}}, {active_array = {0, 0, 0, 0}, active = {eax = 0, ebx = 0, ecx = 0, edx = 0}}}, {{cpuid_array = {0, 0, 0, 0}, cpuid = {eax = 0, ebx = 0, 
ecx = 0, edx = 0}}, {active_array = {0, 0, 0, 0}, active = {eax = 0, ebx = 0, ecx = 0, edx = 0}}}, {{cpuid_array = {0, 0, 0, 0}, cpuid = {eax = 0, ebx = 0, ecx = 0, edx = 0}}, {active_array = {0, 0, 0, 0}, active = {eax = 0, ebx = 0, ecx = 0, edx = 0}}}}, preferred = {704}, isa_1 = 7, xsave_state_size = 896, xsave_state_full_size = 2560, data_cache_size = 32768, shared_cache_size = 33554432, non_temporal_threshold = 25165824, rep_movsb_threshold = 2112, rep_movsb_stop_threshold = 524288, rep_stosb_threshold = 2048, level1_icache_size = 32768, level1_icache_linesize = 64, level1_dcache_size = 32768, level1_dcache_assoc = 8, level1_dcache_linesize = 64, level2_cache_size = 524288, level2_cache_assoc = 8, level2_cache_linesize = 64, level3_cache_size = 33554432, level3_cache_assoc = 16, level3_cache_linesize = 64, level4_cache_size = 18446744073709551615}, _dl_x86_hwcap_flags = {"sse2\000\000\000\000", "x86_64\000\000", "avx512_1"}, _dl_x86_platforms = {"i586\000\000\000\000", "i686\000\000\000\000", "haswell\000", "xeon_phi"}, _dl_inhibit_rpath = 0x0, _dl_origin_path = 0x0, _dl_tls_static_size = 8064, _dl_tls_static_align = 64, _dl_tls_static_surplus = 1664, _dl_profile = 0x0, _dl_profile_output = 0x7ffff7ff1c60 "/var/tmp", _dl_init_all_dirs = 0x7ffff7fac300, _dl_sysinfo_dso = 0x7ffff7fc9000, _dl_sysinfo_map = 0x7ffff7ffe890, _dl_vdso_clock_gettime64 = 0x7ffff7fc9950 <clock_gettime>, _dl_vdso_gettimeofday = 0x7ffff7fc9770 <gettimeofday>, _dl_vdso_time = 0x7ffff7fc9920 <time>, _dl_vdso_getcpu = 0x7ffff7fc9c10 <getcpu>, _dl_vdso_clock_getres_time64 = 0x7ffff7fc9bb0 <clock_getres>, _dl_hwcap2 = 2, _dl_dso_sort_algo = dso_sort_algorithm_dfs, _dl_debug_printf = 0x7ffff7fd83f0 <_dl_debug_printf>, _dl_mcount = 0x7ffff7fd8f10 <__GI__dl_mcount>, _dl_lookup_symbol_x = 0x7ffff7fd5120 <_dl_lookup_symbol_x>, _dl_open = 0x7ffff7fd6d30 <_dl_open>, _dl_close = 0x7ffff7fcd580 <_dl_close>, _dl_catch_error = 0x7ffff7fcc5d0 <_dl_catch_error>, _dl_error_free = 0x7ffff7fce760 
<_dl_error_free>, _dl_tls_get_addr_soft = 0x7ffff7fdd3f0 <_dl_tls_get_addr_soft>, _dl_libc_freeres = 0x7ffff7fe3f60 <__rtld_libc_freeres>, _dl_find_object = 0x7ffff7fcf0e0 <__GI__dl_find_object>, _dl_dlfcn_hook = 0x0, _dl_audit = 0x0, _dl_naudit = 0}
        _dl_argc = 2
        __rtld_tls_init_tp_called = true
        __pointer_chk_guard_local = 7955556426144204470
        _rtld_global = {_dl_ns = {{_ns_loaded = 0x7ffff7ffe2c0, _ns_nloaded = 13, _ns_main_searchlist = 0x7ffff7ffe598, _ns_global_scope_alloc = 0, _ns_global_scope_pending_adds = 0, libc_map = 0x7ffff7fc4ad0, _ns_unique_sym_table = {lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, entries = 0x0, size = 0, n_elements = 0, free = 0x0}, _ns_debug = {base = {r_version = 0, r_map = 0x0, r_brk = 0, r_state = RT_CONSISTENT, r_ldbase = 0}, r_next = 0x0}}, {_ns_loaded = 0x0, _ns_nloaded = 0, _ns_main_searchlist = 0x0, _ns_global_scope_alloc = 0, _ns_global_scope_pending_adds = 0, libc_map = 0x0, _ns_unique_sym_table = {lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, entries = 0x0, size = 0, n_elements = 0, free = 0x0}, _ns_debug = {base = {r_version = 0, r_map = 0x0, r_brk = 0, r_state = RT_CONSISTENT, r_ldbase = 0}, r_next = 0x0}} <repeats 15 times>}, _dl_nns = 1, _dl_load_lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, _dl_load_write_lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, _dl_load_tls_lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, 
_dl_load_adds = 13, _dl_initfirst = 0x0, _dl_profile_map = 0x0, _dl_num_relocations = 128, _dl_num_cache_relocations = 7, _dl_all_dirs = 0x7ffff7fac300, _dl_rtld_map = {l_addr = 140737353920512, l_name = 0x5555555542e0 "/lib64/ld-linux-x86-64.so.2", l_ld = 0x7ffff7ffce60, l_next = 0x7ffff7fac3d0, l_prev = 0x7ffff7fabbd0, l_real = 0x7ffff7ffdab0 <_rtld_global+2736>, l_ns = 0, l_libname = 0x7ffff7ffe260 <_dl_rtld_libname>, l_info = {0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffff7ffce80, 0x7ffff7ffce90, 0x7ffff7ffcec0, 0x7ffff7ffced0, 0x7ffff7ffcee0, 0x7ffff7ffcea0, 0x7ffff7ffceb0, 0x0, 0x0, 0x7ffff7ffce60, 0x0 <repeats 15 times>, 0x7ffff7ffcf10, 0x0, 0x0, 0x0, 0x0, 0x7ffff7ffcf50, 0x7ffff7ffcf40, 0x7ffff7ffcf60, 0x0, 0x0, 0x7ffff7ffcf00, 0x7ffff7ffcef0, 0x7ffff7ffcf20, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffff7ffcf30, 0x0 <repeats 25 times>, 0x7ffff7ffce70}, l_phdr = 0x7ffff7fcb040, l_entry = 0, l_phnum = 10, l_ldnum = 0, l_searchlist = {r_list = 0x0, r_nlist = 0}, l_symbolic_searchlist = {r_list = 0x0, r_nlist = 0}, l_loader = 0x0, l_versions = 0x7ffff7f6bd30, l_nversions = 8, l_nbuckets = 71, l_gnu_bitmask_idxbits = 3, l_gnu_shift = 8, l_gnu_bitmask = 0x7ffff7fcb280, {l_gnu_buckets = 0x7ffff7fcb2a0, l_chain = 0x7ffff7fcb2a0}, {l_gnu_chain_zero = 0x7ffff7fcb3b8, l_buckets = 0x7ffff7fcb3b8}, l_direct_opencount = 0, l_type = lt_library, l_dt_relr_ref = 0, l_relocated = 1, l_init_called = 0, l_global = 1, l_reserved = 0, l_main_map = 0, l_visited = 1, l_map_used = 0, l_map_done = 0, l_phdr_allocated = 0, l_soname_added = 0, l_faked = 0, l_need_tls_init = 0, l_auditing = 0, l_audit_any_plt = 0, l_removed = 0, l_contiguous = 0, l_free_initfini = 0, l_ld_readonly = 0, l_find_object_processed = 0, l_nodelete_active = false, l_nodelete_pending = false, l_property = lc_property_unknown, l_x86_feature_1_and = 0, l_x86_isa_1_needed = 0, l_1_needed = 0, l_rpath_dirs = {dirs = 0x0, malloced = 0}, l_reloc_result = 0x0, l_versyms = 0x7ffff7fcbab2, l_origin = 0x0, l_map_start = 
140737353920512, l_map_end = 140737354130104, l_text_end = 140737354075249, l_scope_mem = {0x0, 0x0, 0x0, 0x0}, l_scope_max = 0, l_scope = 0x0, l_local_scope = {0x0, 0x0}, l_file_id = {dev = 0, ino = 0}, l_runpath_dirs = {dirs = 0x0, malloced = 0}, l_initfini = 0x0, l_reldeps = 0x0, l_reldepsmax = 0, l_used = 1, l_feature_1 = 0, l_flags_1 = 0, l_flags = 0, l_idx = 0, l_mach = {plt = 0, gotplt = 0, tlsdesc_table = 0x0}, l_lookup_cache = {sym = 0x0, type_class = 0, value = 0x0, ret = 0x0}, l_tls_initimage = 0x0, l_tls_initimage_size = 0, l_tls_blocksize = 0, l_tls_align = 0, l_tls_firstbyte_offset = 0, l_tls_offset = 0, l_tls_modid = 0, l_tls_dtor_count = 0, l_relro_addr = 198976, l_relro_size = 5824, l_serial = 0}, _dl_rtld_auditstate = {{cookie = 0, bindflags = 0} <repeats 16 times>}, _dl_x86_feature_1 = 0, _dl_x86_feature_control = {ibt = cet_elf_property, shstk = cet_elf_property}, _dl_stack_flags = 6, _dl_tls_dtv_gaps = false, _dl_tls_max_dtv_idx = 5, _dl_tls_dtv_slotinfo_list = 0x7ffff7f6bfd0, _dl_tls_static_nelem = 5, _dl_tls_static_used = 3992, _dl_tls_static_optional = 512, _dl_initial_dtv = 0x7ffff7f69fe0, _dl_tls_generation = 0, _dl_scope_free_list = 0x0, _dl_stack_used = {next = 0x7ffff7ffe0a0 <_rtld_global+4256>, prev = 0x7ffff7ffe0a0 <_rtld_global+4256>}, _dl_stack_user = {next = 0x7ffff7f69900, prev = 0x7ffff7f69900}, _dl_stack_cache = {next = 0x7ffff7ffe0c0 <_rtld_global+4288>, prev = 0x7ffff7ffe0c0 <_rtld_global+4288>}, _dl_stack_cache_actsize = 0, _dl_in_flight_stack = 0, _dl_stack_cache_lock = 0}
#7  0x0000000000000002 in ?? ()
No symbol table info available.
#8  0x00007fffffffe730 in ?? ()
No symbol table info available.
#9  0x00007fffffffe73f in ?? ()
No symbol table info available.
#10 0x0000000000000000 in ?? ()
No symbol table info available.

@gyakovlev gyakovlev added the C-bug Category: This is a bug. label Jun 4, 2023
@gyakovlev
Author

@juippis is the one with the problem; I hope he can provide additional details as needed.

@saethlin
Member

saethlin commented Jun 4, 2023

This sounds like it could be the same problem as #112275

@Mark-Simulacrum Mark-Simulacrum added the regression-from-stable-to-stable Performance or correctness regression from one stable version to another. label Jun 4, 2023
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 4, 2023
@Mark-Simulacrum Mark-Simulacrum added this to the 1.70.0 milestone Jun 4, 2023
@juippis

juippis commented Jun 5, 2023

Funnily enough I also first noticed the issue when attempting to build librsvg. Let me know what info you need, or if you want to merge these issues into one and me to provide the same info.

@saethlin
Member

saethlin commented Jun 5, 2023

What architecture is the crash happening on? I see PowerPC mentioned in the issue description, but also that you can't reproduce on PowerPC.

I tried reproducing with this:

$ docker run -it --rm gentoo/stage3

Then

cd
emaint -a sync
curl -LO https://static.rust-lang.org/dist/rust-1.70.0-x86_64-unknown-linux-gnu.tar.gz
tar xf rust-1.70.0-x86_64-unknown-linux-gnu.tar.gz
cd rust-1.70.0-x86_64-unknown-linux-gnu
bash install.sh
rustc -vV

And I get

rustc 1.70.0 (90c541806 2023-05-31)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: x86_64-unknown-linux-gnu
release: 1.70.0
LLVM version: 16.0.2

(no segfault)

What's the version of your dynamic linker? Mine in the Gentoo Docker image is:

/lib64/ld-linux-x86-64.so.2 --version
ld.so (Gentoo 2.36-r8 p10) stable release version 2.36.
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

The distributed x86_64 linux rustc contains optimizations from LTO, PGO, and BOLT, and this latest release contains an LLVM version bump. So if I had to guess, the cause of this crash is a bug in those optimizations that only surfaces in uncommon but valid ld.so setup/configuration, or something else to do with the system.

If by building from source you mean just running x.py as opposed to the CI scripts and Docker images we use to build dist artifacts, that would also support this theory. x.py doesn't know about PGO and BOLT.

@thesamesam

thesamesam commented Jun 5, 2023

x86-64 for juippis' report. Gyakovlev is just saying he can't hit it on ppc64 but he's forwarding the bug.

Gentoo's build from source uses x.py.

(Sorry if a bit terse, writing on mobile!)

@juippis

juippis commented Jun 6, 2023

Yep, I'm on x86_64. I retraced your steps,

curl -LO https://static.rust-lang.org/dist/rust-1.70.0-x86_64-unknown-linux-gnu.tar.gz
tar xf rust-1.70.0-x86_64-unknown-linux-gnu.tar.gz
cd rust-1.70.0-x86_64-unknown-linux-gnu
bash install.sh

and rust installed with this method works! I do see the blake2b sum doesn't match with what gets installed via our rust-bin ebuild. I wonder if the manifest changed, or if the ebuild is pulling a wrong distfile?

EDIT: oh, .tar.gz vs. .tar.xz.

The -bin version installed by an ebuild is still broken on an identical container (lxc copy).

@gyakovlev

@hhoffstaette

hhoffstaette commented Jun 7, 2023

Of course team Gentoo is here :)
I think I found something. As the unmolested upstream package works fine, I suspected it is probably something in the Gentoo ebuild, and lo!

root>FEATURES="nostrip" emerge -v1 \=virtual/rust-1.70.0 \=rust-bin-1.70.0
<portage does its thing>
root>/usr/bin/rustc -vV
rustc 1.70.0 (90c541806 2023-05-31)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: x86_64-unknown-linux-gnu
release: 1.70.0
LLVM version: 16.0.2

It successfully rebuilds a bunch of my packages too (emlop, librsvg, rustic), except for libopenraw, which seems to be a language conformance/compiler strictness issue.
So there! Maybe another binutils bug and strip eats more than it should, or the LTO-PGO-BOLT stuff is so fragile that stripping breaks it. Probably BOLT.. ¯ \_(ツ)_/¯

@thesamesam

Could you try STRIP=llvm-strip?

@hhoffstaette

hhoffstaette commented Jun 7, 2023

Could you try STRIP=llvm-strip?

Sure:

root>/usr/bin/rustc -vV
/usr/bin/rustc: error while loading shared libraries: libLLVM-16-rust-1.70.0-stable.so: ELF load command address/offset not page-aligned

:(

@gyakovlev
Author

So our installer for the -bin package also just uses install.sh,
which means it is the stripping of binaries that breaks Rust.
Ideally this should not happen, of course, but at least I can disable stripping in this version. Thanks for finding it!
I guess the rustup version is not affected, as it does not strip toolchains on installation.

gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Jun 7, 2023

Issue: rust-lang/rust#112286
Signed-off-by: Georgy Yakovlev <[email protected]>
@saethlin
Member

saethlin commented Jun 7, 2023

How exactly is rustc being stripped?

I don't get a crash from this:

emaint -a sync
ACCEPT_KEYWORDS=~amd64 emerge -v1 =virtual/rust-1.70.0 =rust-bin-1.70.0
strip /opt/rust-bin-1.70.0/bin/rustc-bin-1.70.0
/opt/rust-bin-1.70.0/bin/rustc-bin-1.70.0 -vV

@hhoffstaette

How exactly is rustc being stripped?

I don't get a crash from this:

emaint -a sync
ACCEPT_KEYWORDS=~amd64 emerge -v1 =virtual/rust-1.70.0 =rust-bin-1.70.0
strip /opt/rust-bin-1.70.0/bin/rustc-bin-1.70.0
/opt/rust-bin-1.70.0/bin/rustc-bin-1.70.0 -vV

Nope, the executable is not the problem - see my previous comment about trying to strip with llvm-strip, which clearly shows that libLLVM is being borked. By default portage typically strips everything, including shared libraries.
To reproduce:

root>/usr/bin/rustc -V
rustc 1.70.0 (90c541806 2023-05-31)
root>strip /usr/lib/rust/lib-bin-1.70.0/libLLVM-16-rust-1.70.0-stable.so
strip: /usr/lib/rust/lib-bin-1.70.0/stC8zyTI: section `.eh_frame_hdr' can't be allocated in segment 4
LOAD: .text .text.cold .eh_frame .gcc_except_table .rodata .rodata.cold .eh_frame_hdr
root>/usr/bin/rustc -V                                                  
zsh: segmentation fault  /usr/bin/rustc -V
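
For anyone who wants to inspect a stripped library without running rustc, here is a minimal Python sketch. As far as I understand glibc's loader, the "address/offset not page-aligned" error from the llvm-strip attempt means a PT_LOAD program header has a file offset and a virtual address that disagree modulo the page size. The function name `misaligned_load_segments` and the 4096-byte default page size are my own assumptions, it only handles little-endian ELF64 files, and it would not necessarily flag the GNU strip case above, which produced a segfault rather than this loader error.

```python
import struct

PT_LOAD = 1
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # 64-byte ELF64 file header
ELF64_PHDR = struct.Struct("<IIQQQQQQ")          # 56-byte ELF64 program header


def misaligned_load_segments(data, page_size=4096):
    """Return (index, p_offset, p_vaddr) for every PT_LOAD entry whose file
    offset and virtual address are not congruent modulo page_size.

    page_size is an assumption (4096 is typical on x86_64 Linux).
    """
    if len(data) < ELF64_EHDR.size:
        raise ValueError("file too short for an ELF64 header")
    ident = data[:16]
    # ident[4] == 2 means 64-bit, ident[5] == 1 means little-endian
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")

    fields = ELF64_EHDR.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]

    bad = []
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, *_rest = ELF64_PHDR.unpack_from(
            data, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD and (p_vaddr - p_offset) % page_size != 0:
            bad.append((i, p_offset, p_vaddr))
    return bad
```

Called on the bytes of a library, for example `misaligned_load_segments(open(path, "rb").read())`, an empty list only means every PT_LOAD entry passes this one check; it says nothing about other damage.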

@saethlin saethlin added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Jun 8, 2023
@gyakovlev
Author

I modified the current rust-bin-1.70.0 ebuild to not strip anything at all.
But if it’s just the LLVM shared library, I can selectively skip stripping only that one instead. It’s still not right though.

@gyakovlev
Author

Hmm, there's simply no libLLVM-16-rust-1.70.0-stable.so file installed on ppc64 at all, which probably explains why it's NOT broken on that arch.
It looks like the prebuilt installer is linked differently on x86_64, probably due to those optimizations too.
From config.toml.example:

# Whether to build LLVM as a dynamically linked library (as opposed to statically linked).
# Under the hood, this passes `--shared` to llvm-config.
# NOTE: To avoid performing LTO multiple times, we suggest setting this to `true` when `thin-lto` is enabled.
#link-shared = llvm.thin-lto

So if the bundled LLVM is built with LTO, the shared library is also installed as a separate file and gets borked by stripping.
Arches that do not build LTO / shared LLVM do not get borked.
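
A quick way to tell whether an installed toolchain falls into the affected group is to look for a separate libLLVM shared object in its lib directory. A minimal Python sketch follows; the helper name `shared_llvm_libs` and the glob pattern are assumptions for illustration, and the directory to pass in depends on how the toolchain was installed.

```python
from pathlib import Path


def shared_llvm_libs(lib_dir):
    """Return the sorted names of libLLVM shared objects directly inside lib_dir.

    The "libLLVM*.so*" pattern is an assumption based on the file name seen
    in this thread (libLLVM-16-rust-1.70.0-stable.so).
    """
    return sorted(
        p.name for p in Path(lib_dir).glob("libLLVM*.so*") if p.is_file()
    )
```

An empty result would be consistent with LLVM being linked statically, as appears to be the case for the ppc64 installer.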

@gyakovlev
Author

just for the record, here's how it's normally stripped:

strip: powerpc64le-unknown-linux-gnu-strip --strip-unneeded -N __gentoo_check_ldflags__ -R .comment -R .GCC.command.line -R .note.gnu.gold-version

@gyakovlev
Author

gentoo/gentoo@ad740d0

I've added a rust-bin-1.70.0-r1 ebuild.
I excluded only libLLVM from stripping; the rest should be stripped as usual.

@saethlin
Member

saethlin commented Jun 8, 2023

I did a local build with

DEPLOY=1 bash src/ci/docker/run.sh dist-x86_64-linux

And I can confirm that the build that appears under obj/build/x86_64-unknown-linux-gnu/stage2 has a bin/rustc that works until you strip lib/libLLVM-16-rust-1.72.0-nightly.so. Then it segfaults.

Then I disabled BOLT with this patch

diff --git a/src/ci/stage-build.py b/src/ci/stage-build.py
index 91bd137085e..9c9960e25e3 100644
--- a/src/ci/stage-build.py
+++ b/src/ci/stage-build.py
@@ -156,7 +156,7 @@ class LinuxPipeline(Pipeline):
             ))
 
     def supports_bolt(self) -> bool:
-        return True
+        return False
 
     def executable_extension(self) -> str:
         return ""

Did a rm -r obj/, ran the build again, and got a toolchain that still seems to work with its libLLVM stripped.

@hhoffstaette

hhoffstaette commented Jun 8, 2023

gentoo/gentoo@ad740d0

I've added rust-bin-1.70.0-r1 ebuild. I excluded only libLLVM from stripping, rest should be stripped as usual.

@gyakovlev I'm afraid that did not work; -r1 crashes again. Just because stripping libLLVM breaks it does not mean other libs are fine (or something is wrong with the dostrip expression).
Edit: something is wrong with the expression, libLLVM is still stripped.
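
One way to confirm whether the install phase touched a given library is to compare its digest with the copy from the unpacked upstream tarball. A minimal Python sketch, assuming both files are available locally; the function names are hypothetical, and a differing digest only shows that the file was modified, not which tool modified it.

```python
import hashlib


def blake2b_of(path, chunk_size=1 << 20):
    """Return the hex BLAKE2b digest of a file, read in chunks."""
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def was_modified(pristine_path, installed_path):
    """True if the installed file differs from the pristine upstream copy."""
    return blake2b_of(pristine_path) != blake2b_of(installed_path)
```

Here `pristine_path` would point at libLLVM inside the extracted installer tarball and `installed_path` at the copy the ebuild placed on the system.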

@alphaaurigae

Funnily enough I also first noticed the issue when attempting to build librsvg. Let me know what info you need, or if you want to merge these issues into one and me to provide the same info.

I had to remove two features from make.conf that prevented emerging, although they worked well back in Sep 22.

As I recall, it was force-mirror (a make.conf feature anyway) that made the librsvg build fail. While googling, I ended up at Rust bug reports (17.1 hardened w/ amd USE) and then somehow at this bug report, https://bugs.gentoo.org/907492, which made it look like a Rust problem, so I thought it must be Rust. I couldn't have been more wrong.

Unrelated but notable:
collision-protect prevented cpio for linux-firmware from emerging
joro on irc pointed out " app-alternatives were recently introduced and migration won't work with collision-protect unless you manually unmerge the corresponding package in advance which may break your system"

gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Jun 8, 2023
@gyakovlev
Author

I bumped it again to rust-bin-1.70.0-r2 and disabled strip completely again for all libs/bins.

@gyakovlev
Author

We are getting reports that rust-1.70.0 built from source and linked against the system copy of LLVM (which is stripped) also segfaults in a similar way, with no BOLT involvement in that case. This includes arm64 and riscv, not only x86_64.
I'll post details once I gather more info; unfortunately, I still don't have any means to reproduce it myself.

@ambasta

ambasta commented Jun 9, 2023

I am on x86_64 on gentoo as well, and I don't see segfaults w/ rustc

$ rustc -vV
rustc 1.70.0 (90c541806 2023-05-31) (gentoo)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: x86_64-unknown-linux-gnu
release: 1.70.0
LLVM version: 16.0.5

Though I am using mold linker instead

dev-lang/rust-1.70.0::gentoo was built with the following:
USE="lto rust-analyzer rust-src rustfmt system-bootstrap system-llvm -clippy -debug -dist -doc (-llvm-libunwind) (-miri) -nightly (-parallel-compiler) -profiler -test -verify-sig -wasm" CPU_FLAGS_X86="sse2" LLVM_TARGETS="AMDGPU (X86) -AArch64 -ARM -AVR -BPF -Hexagon -Lanai -LoongArch -MSP430 -Mips -NVPTX -PowerPC -RISCV -Sparc -SystemZ -VE -WebAssembly -XCore"
FEATURES="nodoc pid-sandbox ipc-sandbox strict binpkg-dostrip distlocks network-sandbox news config-protect-if-modified parallel-fetch userfetch warn-on-large-env binpkg-docompress merge-sync sandbox protect-owned unmerge-orphans strict-keepdir usersandbox fixlafiles sfperms unknown-features-warn unmerge-logs usersync assume-digests preserve-libs binpkg-logs binpkg-multi-instance buildpkg-live userpriv xattr ebuild-locks qa-unresolved-soname-deps multilib-strict noinfo"

@sdlarsen

sdlarsen commented Jun 9, 2023

I'm on x86_64 on Gentoo and using system copy of LLVM and mold linker and don't see any segfaults

$ rustc -vV
rustc 1.70.0 (90c541806 2023-05-31) (gentoo)
binary: rustc
commit-hash: 90c541806f23a127002de5b4038be731ba1458ca
commit-date: 2023-05-31
host: x86_64-unknown-linux-gnu
release: 1.70.0
LLVM version: 16.0.5

My use flags:

$ equery u rust | rg '\+'  
+clippy
+cpu_flags_x86_sse2
+lto
+rust-analyzer
+rust-src
+rustfmt
+system-llvm

@kain88-de

I noticed a similar issue on Ubuntu 22.04 when installed in a container. When I install 1.69 everything still works fine.

CheckmkCI pushed a commit to Checkmk/checkmk that referenced this issue Jun 9, 2023
See this issue for problems with the official build

 rust-lang/rust#112286

Change-Id: I902f061ec3398dc7c2df6fb37f9284f58cf73d7d
@apiraino apiraino added the E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example label Jun 21, 2023
@thesamesam

thesamesam commented Jul 9, 2023

Is the BOLT optimization so important that it needs to be kept without a solution being found for this? Rolling new binaries without it for 1.70.0 would unblock us in Gentoo (and it affects other distros too). We're currently stuck on 1.69.0.

@workingjubilee workingjubilee added I-compiler-nominated Nominated for discussion during a compiler team meeting. P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 10, 2023
@workingjubilee
Member

workingjubilee commented Jul 10, 2023

@saethlin mentioned the possibility this is just us encountering llvm/llvm-project#56738 again. Nominating at lqd's recommendation. P-high may be uh, too high, but BOLT has been guilty of causing a lot of Rust miscompilations before, so I am assuming that it is a high priority to figure out what is actually going on... and maybe it will be a lower priority after that.

  • This appears to currently only affect Linux.
  • This appears to be an especial concern for Gentoo Linux, which builds everything from source.
  • This affects multiple architectures: x86-64, AArch64, PowerPC64, and RISCV are confirmed.
  • It appears to be triggered by use of both BOLT and strip on rustc and/or LLVM.
  • However, there may also be other coincident problems, as binaries without BOLT (on either rustc or LLVM?), but where LLVM is stripped, are also reportedly exhibiting a similar segfault.
  • It seems quite likely that one of the two tools, strip and BOLT, is obliterating data the other wants, or that they are not operating in mindfulness of each other possibly being used, and thus are screwing up their modifications of binaries when used in tandem.
  • It seems likely there are undiscovered issues and that this warrants deeper investigation to confirm whether there are problems with either tool in isolation.
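When comparing reports like these, it helps to know whether a given libLLVM was actually stripped. The sketch below is one way to check, assuming a 64-bit little-endian ELF file (as on x86_64, AArch64, ppc64le and riscv64 Linux): it walks the section headers and looks for a section of type SHT_SYMTAB, which a full `strip` removes. The function name `has_symtab` is made up for illustration. Files with more sections than fit in `e_shnum` (extended numbering) are not handled, and `strip --strip-debug` keeps `.symtab`, so this only detects a full strip.

```python
import struct

# Layouts from the ELF-64 object file format (little-endian, no padding).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes
ELF64_SECTION = struct.Struct("<IIQQQQIIQQ")       # 64 bytes
SHT_SYMTAB = 2  # section type of the full symbol table (.symtab)


def has_symtab(path):
    """Return True if the ELF file at `path` still has a SHT_SYMTAB section."""
    with open(path, "rb") as f:
        raw = f.read(ELF64_HEADER.size)
        if len(raw) < ELF64_HEADER.size:
            raise ValueError("file too short to be an ELF object")
        fields = ELF64_HEADER.unpack(raw)
        ident = fields[0]
        # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian.
        if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
            raise ValueError("not a 64-bit little-endian ELF file")
        shoff, shentsize, shnum = fields[6], fields[11], fields[12]
        for index in range(shnum):
            f.seek(shoff + index * shentsize)
            entry = f.read(ELF64_SECTION.size)
            if len(entry) < ELF64_SECTION.size:
                raise ValueError("truncated section header table")
            if ELF64_SECTION.unpack(entry)[1] == SHT_SYMTAB:
                return True
    return False
```

For example, `has_symtab("/path/to/libLLVM.so")` (path supplied by the reporter) returning `False` would indicate a fully stripped library, which is the configuration the segfault reports have in common.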

@nikic
Contributor

nikic commented Jul 10, 2023

@thesamesam Can you please clarify why this is blocking Gentoo? Does "just don't strip libLLVM.so" not work as a workaround for some reason?

@saethlin
Member

@gyakovlev Are there any more details or some way we can reproduce the other reports you mentioned in #112286 (comment)? That's the only comment in this thread that doesn't make sense to me.

@Kobzol
Contributor

Kobzol commented Jul 10, 2023

FWIW, I asked about this on the LLVM Discord and it seems that it's a known issue that is marked as "wontfix" for the moment (apparently BOLT produces a spec-compliant header, which cannot be parsed by binutils strip). This issue should be hopefully resolved once BOLT changes its implementation of binary rewriting (https://reviews.llvm.org/D144560).

@workingjubilee
Member

Is the BOLT optimization so important that it needs to be kept without a solution being found for this? Rolling new binaries without it for 1.70.0 would unblock us in Gentoo (and it affects other distros too). We're currently stuck on 1.69.0.

To actually answer your question:
BOLT has been responsible for about -5% on cycle count across our perf suite when compiling artifacts, even better memory use in most cases, and a bit off instruction count, which in the general case has led to, yes, better wall time. It's very hard to come by that kind of improvement, which is why there's still some chatter and attempts to isolate the exact problems here before we jump to disabling BOLT.

@thesamesam

Is the BOLT optimization so important that it needs to be kept without a solution being found for this? Rolling new binaries without it for 1.70.0 would unblock us in Gentoo (and it affects other distros too). We're currently stuck on 1.69.0.

To actually answer your question: BOLT has been responsible for about -5% on cycle count across our perf suite when compiling artifacts, even better memory use in most cases, and a bit off instruction count, which in the general case has led to, yes, better wall time. It's very hard to come by that kind of improvement, which is why there's still some chatter and attempts to isolate the exact problems here before we jump to disabling BOLT.

Thank you for explaining! Given #112286 (comment) and the bits below w/ llvm-strip, I feel as if it's likely to keep biting people, but that's up to you folks & I appreciate the help.

@thesamesam Can you please clarify why this is blocking Gentoo? Does "just don't strip libLLVM.so" not work as a workaround for some reason?

Okay, so I spoke to some people internally and it looks like it was a combination of:

  • Typo in the initial don't-strip-this-single-file
  • Someone reporting a problem with that
  • Adding the total strip-disabling and people getting confused about when the issue happens? (possibly not having picked up the previous fix?)
  • The fact that llvm-strip seemingly didn't/doesn't help making it sound like another problem existed.

So, my plan now is:

  • wait for someone who previously had an issue on ppc64;
  • just totally disable stripping for the time being so we can get this out;
  • make sure any issues get reported to us and then upstream if appropriate (rather than comments on IRC etc where it's easy to get confused about what someone's env is);
  • re-evaluate disabling stripping for certain files instead if nothing comes up from the previous point.

Thank you again folks for the help and I'll let you know what happens.

FWIW, I asked about this on the LLVM Discord and it seems that it's a known issue that is marked as "wontfix" for the moment (apparently BOLT produces a spec-compliant header, which cannot be parsed by binutils strip). This issue should be hopefully resolved once BOLT changes its implementation of binary rewriting (reviews.llvm.org/D144560).

Note that we have reports of llvm-strip failing too at:

gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Jul 11, 2023
If you still have issues with Rust 1.70, please file a Gentoo bug with all of
the details.

I've written up what the situation is on the Rust bug [0], but pasting it inline:
"""
Okay, so I spoke to some people internally and it looks like it was a combination of:

    Typo in the initial don't-strip-this-single-file
    Someone reporting a problem with that
    Adding the total strip-disabling and people getting confused about when the issue happens? (possibly not having picked up the previous fix?)
    The fact that llvm-strip seemingly didn't/doesn't help making it sound like another problem existed.

So, my plan now is:

    wait for someone who previously had an issue on ppc64;
    just totally disable stripping for the time being so we can get this out;
    make sure any issues get reported to us and then upstream if appropriate (rather than comments on IRC etc where it's easy to get confused about what someone's env is);
    re-evaluate disabling stripping for certain files instead if nothing comes up from the previous point.
"""

That feels to me to be a reasonable/plausible timeline of events.

matoro's since done that PPC64 testing and not hit any problems; ionen's been
using 1.70 for a while; I've rebuilt all rust pkgs (and used some of them) on
rust{,-bin}-1.70 machines without incident.

Thank you to matoro and ionen for helping me muddle my way through here.

So, all that said, let's unmask it and handle any new issues (although I'm
not expecting any) as-and-when/if they come in.

[0] rust-lang/rust#112286 (comment)

Bug: rust-lang/rust#112286
Signed-off-by: Sam James <[email protected]>
@lqd
Member

lqd commented Jul 11, 2023

  • wait for someone who previously had an issue on ppc64

note that BOLT is only enabled on x86_64-unknown-linux-gnu

@lqd
Member

lqd commented Jul 13, 2023

This was discussed in today's t-compiler meeting, starting in this zulip topic.

While the issue is real and quite an uncommon thing to happen, the in-progress work to fix it upstream made recommending the workaround acceptable to the team, for the short term at least. That also matches Gentoo's plan above to disable stripping for now.

This topic could be revisited in the future if issues that are impossible to work around are discovered, or if the issue is not fixed upstream as expected.

Removing nomination.

@lqd lqd removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Jul 13, 2023
CheckmkCI pushed a commit to Checkmk/checkmk that referenced this issue Aug 30, 2023
This reverts commit 1f1028e.

Reason for revert: rust-lang/rust#112286

Change-Id: I58487807caaa6bd014896f28c956a7f0593ceaa5
@wesleywiser wesleywiser added P-medium Medium priority and removed P-high High priority labels Oct 6, 2023
@wesleywiser
Member

Visited during the compiler team's P-high review. We think that P-medium is a more appropriate classification at this time as the underlying issue is not in the compiler or the language itself but in some combination of BOLT and binutils. Work appears to be progressing upstream on fixing the issue from the BOLT side, but we don't think disabling BOLT is worth it to resolve this issue currently.

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@kpreid
Contributor

kpreid commented Dec 25, 2023

This issue is labeled E-needs-mcve but the comments above suggest that it's being worked on. Does it still need a MCVE?

@saethlin
Member

The links above to reviews.llvm.org are now broken, because the site doesn't exist anymore. It's entirely unclear to me if this is still being worked on, but I also don't know if an MCVE would help. Unfortunately that's a question for the LLVM issue tracker; based on the above, probably llvm/llvm-project#56738.

@Apteryks
Contributor

Apteryks commented Apr 12, 2024

This also appears to affect GNU Guix, as seen while building librsvg 2.56.4 with Rust 1.75 built from source with stripped LLVM:

[...]
test reference::tests::svg1_1_paths_data_19_f_svg ... ok
test reference::tests::svg1_1_paths_data_20_f_svg ... ok
test reference::tests::svg1_1_paths_data_14_t_svg ... ok
test reference::tests::svg1_1_paths_data_13_t_svg ... ok
test reference::tests::svg1_1_paths_data_15_t_svg ... ok
test reference::tests::svg1_1_filters_displace_02_f_svg ... ok
test reference::tests::svg1_1_paths_data_09_t_svg ... ok
test reference::tests::svg1_1_pservers_grad_02_b_svg ... ok
error: test failed, to rerun pass `--test src`

Caused by:
  process didn't exit successfully: `/tmp/guix-build-librsvg-2.56.4.drv-0/librsvg-2.56.4/target/release/deps/src-d06bb04cfecb27d8 --include-ignored` (signal: 11, SIGSEGV: invalid memory reference)
make[3]: *** [Makefile:1541: check-local] Error 101
make[3]: Leaving directory '/tmp/guix-build-librsvg-2.56.4.drv-0/librsvg-2.56.4'

We are using LLVM 15 for the build of Rust 1.75, and the Bolt project isn't enabled for it (so there is no llvm-bolt binary available).
