Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-138122: Add blocking mode for accurate stack traces in Tachyon#142998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
pablogsal commented Dec 20, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
bb510c0 to 8c21dc9Comparepablogsal commented Dec 20, 2025
@ivonastojanovic can you take a look? |
de4448b to c270764CompareNon-blocking sampling reads process memory while the target continues running, which can produce torn stacks when generators or coroutines rapidly switch between yield points. Blocking mode uses atomic process suspension (task_suspend on macOS, NtSuspendProcess on Windows, PTRACE_SEIZE on Linux) to stop the target during each sample, ensuring consistent snapshots. Use blocking mode with longer intervals (1ms+) to avoid impacting the target too much. The default non-blocking mode remains best for most cases since it has zero overhead. Also fix a frame cache bug: the cache was including the last_profiled_frame itself when extending with cached data, but this frame was executing in the previous sample and its line number may have changed. For example, if function A was sampled at line 6, then execution continued to line 10 and called B→C, the next sample would incorrectly report A at line 6 (from cache) instead of line 10. The fix uses start_idx + 1 to only trust frames ABOVE last_profiled_frame — these caller frames are frozen at their call sites and cannot change until their callees return. Signed-off-by: Pablo Galindo <[email protected]>
| // Extend frame_info with frames from start_idx onwards | ||
| PyObject*slice=PyList_GetSlice(entry->frame_list, start_idx, num_frames); | ||
| // Extend frame_info with frames ABOVE start_idx (not including it). |
pablogsalDec 20, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes a frame cache bug: the cache was including the last_profiled_frame
itself when extending with cached data, but this frame was executing in
the previous sample and its line number may have changed. For example,
if function A was sampled at line 6, then execution continued to line 10
and called B→C, the next sample would incorrectly report A at line 6
(from cache) instead of line 10. The fix uses start_idx + 1 to only trust
frames ABOVE last_profiled_frame: these caller frames are frozen at their
call sites and cannot change until their callees return.
ivonastojanovic left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just a couple of nits
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
bedevere-bot commented Dec 23, 2025
🤖 New build scheduled with the buildbot fleet by @pablogsal for commit 5438b66 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F142998%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
81c8eb8 into python:mainUh oh!
There was an error while loading. Please reload this page.
Non-blocking sampling reads process memory while the target continues
running, which can produce torn stacks when generators or coroutines
rapidly switch between yield points. Blocking mode uses atomic process
suspension (task_suspend on macOS, NtSuspendProcess on Windows,
PTRACE_SEIZE on Linux) to stop the target during each sample, ensuring
consistent snapshots.
Use blocking mode with longer intervals (1ms+) to avoid impacting the
target too much. The default non-blocking mode remains best for most
cases since it has zero overhead.
📚 Documentation preview 📚: https://cpython-previews--142998.org.readthedocs.build/