Skip to content

Conversation

@brandtbucher
Copy link
Member

@brandtbucherbrandtbucher commented Nov 6, 2025

Example flamegraph from one of the tests:

Screenshot 2025-10-26 at 7 24 34 PM

📚 Documentation preview 📚: https://cpython-previews--141108.org.readthedocs.build/

… in the sampling profiler - Introduce a new field in the GC state to store the frame that initiated garbage collection. - Update RemoteUnwinder to include options for including "<native>" and "<GC>" frames in the stack trace. - Modify the sampling profiler to accept parameters for controlling the inclusion of native and GC frames. - Enhance the stack collector to properly format and append these frames during profiling. - Add tests to verify the correct behavior of the profiler with respect to native and GC frames, including options to exclude them.
@brandtbucherbrandtbucher self-assigned this Nov 6, 2025
@brandtbucherbrandtbucher added the type-feature A feature request or enhancement label Nov 6, 2025
@brandtbucherbrandtbucher added sprint interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Nov 6, 2025
@brandtbucherbrandtbucher added the stdlib Standard Library Python modules in the Lib/ directory label Nov 6, 2025
@pablogsal
Copy link
Member

Seems that there is either some form of a race of somehow the windows test don't trigger the GC:

Profile Stats: nsamples sample% tottime (ms) cumul% cumtime (s) filename:lineno(function) 1510/36080 63.0 151.000 1505.8 3.608 tmpiafjtexk:10(slow_fibonacci) 0/26017 0.0 0.000 1085.9 2.602 ~:0(<native>) 0/2395 0.0 0.000 100.0 0.240 _sync_coordinator.py:227(main) 0/2395 0.0 0.000 100.0 0.240 _sync_coordinator.py:244(<module>) 0/2395 0.0 0.000 100.0 0.240 runpy.py:88(_run_code) 0/2395 0.0 0.000 100.0 0.240 runpy.py:198(_run_module_as_main) 1/2374 0.0 0.100 99.1 0.237 _sync_coordinator.py:186(_execute_script) 0/2370 0.0 0.000 98.9 0.237 tmpiafjtexk:50(<module>) 0/2287 0.0 0.000 95.5 0.229 tmpiafjtexk:44(main_loop) 658/658 27.5 65.800 27.5 0.066 tmpiafjtexk:5(slow_fibonacci) 91/91 3.8 9.100 3.8 0.009 tmpiafjtexk:7(slow_fibonacci) 0/83 0.0 0.000 3.5 0.008 tmpiafjtexk:43(main_loop) 39/39 1.6 3.900 1.6 0.004 tmpiafjtexk:16(cpu_intensive_work) 35/35 1.5 3.500 1.5 0.004 tmpiafjtexk:17(cpu_intensive_work) 29/29 1.2 2.900 1.2 0.003 tmpiafjtexk:8(slow_fibonacci) Legend: nsamples: Direct/Cumulative samples (direct executing / on call stack) sample%: Percentage of total samples this function was directly executing tottime: Estimated total time spent directly in this function cumul%: Percentage of total samples when this function was on the call stack cumtime: Estimated cumulative time (including time in called functions) filename:lineno(function): Function location and name Summary of Interesting Functions: Functions with Highest Direct/Cumulative Ratio (Hot Spots): 1.000 direct/cumulative ratio, 3.1% direct samples: tmpiafjtexk:(cpu_intensive_work) 0.062 direct/cumulative ratio, 95.5% direct samples: tmpiafjtexk:(slow_fibonacci) 0.000 direct/cumulative ratio, 0.0% direct samples: _sync_coordinator.py:(_execute_script) Functions with Highest Call Frequency (Indirect Calls): 34570 indirect calls, 1538.3% total stack presence: tmpiafjtexk:(slow_fibonacci) 26017 indirect calls, 1085.9% total stack presence: ~:(<native>) 2395 indirect calls, 100.0% total stack presence: _sync_coordinator.py:(main) Functions with Highest Call Magnification (Cumulative/Direct): 2374.0x call magnification, 2373 indirect calls from 1 direct: _sync_coordinator.py:(_execute_script) 16.1x call magnification, 34570 indirect calls from 2288 direct: tmpiafjtexk:(slow_fibonacci) '8;6u 

@pablogsal
Copy link
Member

Another posibility is that the machines are too slow and we don't even get to run under the gc somehow?

@pablogsal
Copy link
Member

Maybe slow_fibonacci is too slow? 😆

@pablogsal
Copy link
Member

pablogsal commented Nov 8, 2025

I am thinking that <native> it's useful but perhaps it's a bit noisy if you are not hunting for it? Should we default it to False?

Another idea is that maybe there is a C function in the stack maybe in another PR we can fetch the C function name and use that as the code?

@pablogsal
Copy link
Member

I have pushed some new tests and fixes hopefully this does the trick

@brandtbucher
Copy link
MemberAuthor

The flakiness of these sorts of tests is... annoying. Quitting for the night.

@pablogsal
Copy link
Member

The flakiness of these sorts of tests is... annoying. Quitting for the night.

I feel you. Unfortunately it's very hard to write correct code here as its fundamentally a race condition between the function being profiled and the profiler. Specially in slow machines it's a pain.

I recommend doing one thing and one thing only per test

@pablogsal
Copy link
Member

@brandtbucher a suggestion if you struggled with CI it's to just add the GC switch in this PR and figure out native mode layer as that is currently less useful and it's giving us trouble.

@brandtbucher
Copy link
MemberAuthor

I think it's an ASan-specific thing (I can reproduce locally). I'll figure out what's going on later.

@brandtbucher
Copy link
MemberAuthor

I thought I was being clever when I also added support for native frames at the very top of the stack in a recent commit, but that only works on debug builds (where we clear the stack pointer upon resuming a Python frame). 🤦🏼‍♂️

Reverting, this version only finds native frames in the middle of the stack now.

@pablogsal
Copy link
Member

I thought I was being clever when I also added support for native frames at the very top of the stack in a recent commit, but that only works on debug builds (where we clear the stack pointer upon resuming a Python frame). 🤦🏼‍♂️

Reverting, this version only finds native frames in the middle of the stack now.

Haha nice!

I assume this means that you prefer to go with GC + native in this PR then, no?

@brandtbucher
Copy link
MemberAuthor

Yeah, I’m happy with the current state. We can beef up the native features later if they’re worth the performance hit.

Copy link
Member

@pablogsalpablogsal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT! Amazing work 💪

Left some small comments

@pablogsal
Copy link
Member

Fixed some merge conflicts

@pablogsalpablogsal enabled auto-merge (squash) November 17, 2025 13:32
@pablogsalpablogsal merged commit 336366f into python:mainNov 17, 2025
46 checks passed
@github-project-automationgithub-project-automationbot moved this from In Progress to Done in Sprint 2024Nov 17, 2025
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 FreeBSD14 3.x (tier-3) has failed when building commit 336366f.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1232/builds/7240) and take a look at the build logs.
  4. Check if the failure is related to this commit (336366f) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1232/builds/7240

Failed tests:

  • test_profiling

Failed subtests:

  • test_process_pool_executor_pickle - test.test_profiling.test_sampling_profiler.TestProcessPoolExecutorSupport.test_process_pool_executor_pickle

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last): File "/home/buildbot/buildarea/3.x.opsec-fbsd14/build/Lib/test/test_profiling/test_sampling_profiler.py", line 3354, in test_process_pool_executor_pickleself.assertIn("Results: [2, 4, 6]", stdout) ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^AssertionError: 'Results: [2, 4, 6]' not found in ''

@pablogsal
Copy link
Member

This is not in this PR I will take a look

@pablogsal
Copy link
Member

Fixing in #141688 (comment)

StanFromIreland pushed a commit to StanFromIreland/cpython that referenced this pull request Dec 6, 2025
…filer (python#141108) - Introduce a new field in the GC state to store the frame that initiated garbage collection. - Update RemoteUnwinder to include options for including "<native>" and "<GC>" frames in the stack trace. - Modify the sampling profiler to accept parameters for controlling the inclusion of native and GC frames. - Enhance the stack collector to properly format and append these frames during profiling. - Add tests to verify the correct behavior of the profiler with respect to native and GC frames, including options to exclude them. Co-authored-by: Pablo Galindo Salgado <pablogsal@gmail.com>
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

interpreter-core(Objects, Python, Grammar, and Parser dirs)sprintstdlibStandard Library Python modules in the Lib/ directorytype-featureA feature request or enhancement

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants

@brandtbucher@pablogsal@bedevere-bot