
BOLT optimizations fail on Linux aarch64 #128884

@zanieb

Bug report

Bug description:

When running the --pgo test suite with a BOLT-instrumented binary, the interpreter crashes with an assertion failure:

./python -m test --pgo --rerun --verbose3 --timeout=
python: ../cpython-ro-srcdir/Python/generated_cases.c.h:1074: _PyEval_EvalFrameDefault: Assertion `tp->tp_alloc == PyType_GenericAlloc' failed.
Aborted (core dumped)
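A reproducer smaller than the full --pgo run may help isolate the crash. The sketch below assumes the failing assertion belongs to the specialized CALL_ALLOC_AND_ENTER_INIT instruction; the report does not name the instruction, so this is an inference from the surrounding code, which reads `cls->_spec_cache.init`. The class and function names are hypothetical, and whether this loop actually trips the assertion on the BOLT-instrumented aarch64 build is untested.

```python
# Hypothetical reproducer: repeatedly instantiate a plain heap type whose
# __init__ is a Python function, so the specializing interpreter has the
# chance to specialize the call site.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def make_points(n):
    """Create n Point instances from one hot call site and return a checksum."""
    total = 0
    for i in range(n):
        p = Point(i, i + 1)  # the call site expected to be specialized
        total += p.y - p.x   # each iteration adds exactly 1
    return total


if __name__ == "__main__":
    print(make_points(10_000))
```

Running this file with the instrumented `./python` on a build with C assertions enabled (for example `--with-assertions`) would show whether the abort reproduces outside the test suite.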

I find this surprising, since we include _PyEval_EvalFrameDefault in the BOLT skip functions, but I am not familiar with the details.

The following patch worked around that error:

diff --git a/Python/generated_cases.c.h b/Python/generated_cases.c.h
index 810beb61d0d..d24add0afab 100644
--- a/Python/generated_cases.c.h
+++ b/Python/generated_cases.c.h
@@ -1071,7 +1071,7 @@
             DEOPT_IF(FT_ATOMIC_LOAD_UINT32_RELAXED(tp->tp_version_tag) != type_version, CALL);
             assert(tp->tp_new == PyBaseObject_Type.tp_new);
             assert(tp->tp_flags & Py_TPFLAGS_HEAPTYPE);
-            assert(tp->tp_alloc == PyType_GenericAlloc);
+            assert(tp->tp_alloc == PyBaseObject_Type.tp_alloc);
             PyHeapTypeObject *cls = (PyHeapTypeObject *)callable_o;
             PyFunctionObject *init_func = (PyFunctionObject *)FT_ATOMIC_LOAD_PTR_ACQUIRE(cls->_spec_cache.init);
             PyCodeObject *code = (PyCodeObject *)init_func->func_code;

With that patch applied, the profiling test run succeeded, but BOLT crashed during the apply step:

# Run bolt against the merged data to produce an optimized binary.
for bin in python; do \
  /usr/lib/llvm-19/bin/llvm-bolt "${bin}.prebolt" -o "${bin}.bolt" -data="${bin}.fdata" -update-debug-sections -skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,sre_ucs2_match/1,sre_ucs4_match/1 -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -icf=1 -inline-all -split-eh -reorder-functions-use-hot-size -peepholes=none -jump-tables=aggressive -inline-ap -indirect-call-promotion=all -dyno-stats -use-gnu-stack -frame-opt=hot ; \
  mv "${bin}.bolt" "${bin}"; \
done
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: <unknown>
BOLT-INFO: first alloc address is 0x400000
BOLT-INFO: enabling relocation mode
BOLT-INFO: pre-processing profile using branch profile reader
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 8500 out of 12058 functions in the binary (70.5%) have non-empty execution profile
BOLT-INFO: 41 functions with profile could not be optimized
BOLT-INFO: profile for 1 objects was ignored
BOLT-INFO: removed 1 empty block
BOLT-INFO: ICF folded 678 out of 12439 functions in 5 passes. 0 functions had jump tables.
BOLT-INFO: Removing all identical functions will save 46.23 KB of code space. Folded functions were called 3909549484 times based on profile.
BOLT-INFO: ICP Total indirect calls = 1808544446, 153 callsites cover 99% of all indirect calls
 #0 0x0000aacc1be768cc (/usr/lib/llvm-19/bin/llvm-bolt+0x1ae68cc)
 #1 0x0000aacc1be74b80 (/usr/lib/llvm-19/bin/llvm-bolt+0x1ae4b80)
 #2 0x0000aacc1be77174 (/usr/lib/llvm-19/bin/llvm-bolt+0x1ae7174)
 #3 0x0000ff03feee37e0 (linux-vdso.so.1+0x7e0)
 #4 0x0000aacc1c397200 (/usr/lib/llvm-19/bin/llvm-bolt+0x2007200)
 #5 0x0000aacc1c39aa1c (/usr/lib/llvm-19/bin/llvm-bolt+0x200aa1c)
 #6 0x0000aacc1c39a9e4 (/usr/lib/llvm-19/bin/llvm-bolt+0x200a9e4)
 #7 0x0000aacc1c39a9e4 (/usr/lib/llvm-19/bin/llvm-bolt+0x200a9e4)
 #8 0x0000aacc1bf1ebc4 (/usr/lib/llvm-19/bin/llvm-bolt+0x1b8ebc4)
 #9 0x0000aacc1bf21328 (/usr/lib/llvm-19/bin/llvm-bolt+0x1b91328)
#10 0x0000aacc1becfe3c (/usr/lib/llvm-19/bin/llvm-bolt+0x1b3fe3c)
#11 0x0000aacc1aadf2f0 (/usr/lib/llvm-19/bin/llvm-bolt+0x74f2f0)
#12 0x0000ff03fe8684c4 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#13 0x0000ff03fe868598 call_init ./csu/../csu/libc-start.c:128:20
#14 0x0000ff03fe868598 __libc_start_main ./csu/../csu/libc-start.c:347:5
#15 0x0000aacc1aadd4f0 (/usr/lib/llvm-19/bin/llvm-bolt+0x74d4f0)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /usr/lib/llvm-19/bin/llvm-bolt python.prebolt -o python.bolt -data=python.fdata -update-debug-sections -skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,sre_ucs2_match/1,sre_ucs4_match/1 -reorder-blocks=ext-tsp -reorder-functions=cdsort -split-functions -icf=1 -inline-all -split-eh -reorder-functions-use-hot-size -peepholes=none -jump-tables=aggressive -inline-ap -indirect-call-promotion=all -dyno-stats -use-gnu-stack -frame-opt=hot
Segmentation fault (core dumped)
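Most frames in the backtrace are unsymbolized, so an LLVM bug report would benefit from resolving them first. As a rough sketch, the helper below pulls the module path and offset out of each `module+0xOFFSET` frame; the function name, regular expression, and sample string are illustrative and not part of any BOLT tooling.

```python
import re

# Matches unsymbolized frames such as
#   "#4 0x0000aacc1c397200 (/usr/lib/llvm-19/bin/llvm-bolt+0x2007200)"
FRAME_RE = re.compile(
    r"#(\d+)\s+0x[0-9a-fA-F]+\s+\(([^()\s]+)\+0x([0-9a-fA-F]+)\)"
)


def extract_frames(log_text, module_suffix="llvm-bolt"):
    """Return (frame_index, module_path, offset) for frames in the given module."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        index, module, offset = match.groups()
        if module.endswith(module_suffix):
            frames.append((int(index), module, int(offset, 16)))
    return frames


# A few frames copied from the crash log above.
SAMPLE = (
    "#3 0x0000ff03feee37e0 (linux-vdso.so.1+0x7e0) "
    "#4 0x0000aacc1c397200 (/usr/lib/llvm-19/bin/llvm-bolt+0x2007200) "
    "#5 0x0000aacc1c39aa1c (/usr/lib/llvm-19/bin/llvm-bolt+0x200aa1c)"
)

if __name__ == "__main__":
    for index, module, offset in extract_frames(SAMPLE):
        print(f"#{index} {module} {offset:#x}")
```

The resulting offsets could then be handed to a symbolizer such as `llvm-symbolizer --obj=<module> <offset>`, assuming debug information matching the installed llvm-bolt binary is available and that the printed offsets are relative to the module image.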

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux
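One possible way to narrow down which BOLT pass triggers the segmentation fault is to rerun llvm-bolt with one optimization flag removed at a time. The sketch below only generates the candidate command lines from the flags in the invocation above; it does not run BOLT. The function name is hypothetical, and some of the resulting combinations may not be meaningful (for example, `-reorder-functions-use-hot-size` without `-reorder-functions`).

```python
import shlex

# Fixed part of the invocation, copied from the report.
BASE_CMD = [
    "/usr/lib/llvm-19/bin/llvm-bolt", "python.prebolt", "-o", "python.bolt",
    "-data=python.fdata", "-update-debug-sections",
    "-skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,"
    "sre_ucs2_match/1,sre_ucs4_match/1",
]

# Optimization flags from the same invocation, in their original order.
OPT_FLAGS = [
    "-reorder-blocks=ext-tsp", "-reorder-functions=cdsort", "-split-functions",
    "-icf=1", "-inline-all", "-split-eh", "-reorder-functions-use-hot-size",
    "-peepholes=none", "-jump-tables=aggressive", "-inline-ap",
    "-indirect-call-promotion=all", "-dyno-stats", "-use-gnu-stack",
    "-frame-opt=hot",
]


def ablation_commands(base=BASE_CMD, flags=OPT_FLAGS):
    """Map each flag to the full command line with only that flag removed."""
    return {
        dropped: list(base) + [flag for flag in flags if flag != dropped]
        for dropped in flags
    }


if __name__ == "__main__":
    for dropped, cmd in ablation_commands().items():
        print(f"# without {dropped}")
        print(shlex.join(cmd))
```

Since the last log line before the crash is the ICP summary, the variant without `-indirect-call-promotion=all` may be a reasonable one to try first, though the log alone does not establish that pass as the cause.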

Labels: build, type-bug
