GH-118036: Fix a bug with CALL_STAT_INC #117933

gvanrossum · 2024-04-16T15:30:07Z

We were under-counting calls in `_PyEvalFramePushAndInit` because the `CALL_STAT_INC` macro was redefined to a no-op for the Tier 2 interpreter. The fix is a little convoluted: I had wanted to move the code around, but that would require moving something else around, and in the end I figured it was easier to tweak the macros (@markshannon might disagree, though). This ought to result in ~37% more "Frames pushed" reported under "Call stats". The new count is the correct one (I presume).
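
For readers unfamiliar with how a macro redefinition can silently drop counts, here is a minimal sketch of the failure mode. It is not the code from ceval.c: the counter `frames_pushed` and the helpers `push_frame_early`, `push_frame_late`, and `demo_undercount` are hypothetical stand-ins, and the only assumption carried over from the description above is that the affected function is compiled after the macro has been redefined to a no-op.

```c
/* Hypothetical stand-in for a pystats counter. */
static long frames_pushed = 0;

/* Counting version of the macro, as a stats build might define it. */
#define CALL_STAT_INC(counter) ((counter)++)

/* Defined BEFORE the redefinition: this increment is recorded. */
static void push_frame_early(void) {
    CALL_STAT_INC(frames_pushed);
}

/* Mimic the Tier 2 section: the macro is silenced from here on. */
#undef CALL_STAT_INC
#define CALL_STAT_INC(counter) ((void)0)

/* Defined AFTER the redefinition: this increment expands to nothing. */
static void push_frame_late(void) {
    CALL_STAT_INC(frames_pushed);
}

/* Two frames are pushed, but only one of them is counted. */
static long demo_undercount(void) {
    frames_pushed = 0;
    push_frame_early();
    push_frame_late();
    return frames_pushed;
}
```

Calling `demo_undercount()` returns 1 rather than 2, because the preprocessor expands each use of the macro with whatever definition is active at that point in the file.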

@mdboom can you review? This is one commit from my experiment about removing Tier 2 entirely (gh-117908).

To see the effect, look at these pystat diffs.

Issue: Call stats are incorrect for tier 2 and maybe for tier 1 as well #118036

mdboom

I understand how this is broken and why this change fixes it, so I'm marking as approved, but I agree it's "weird" / more convoluted than it needs to be. Moving the order of functions in ceval.c would result in less weirdness (probably just moving _PyEval_EvalFrameDefault to the bottom since that's where all the macro updates happen) but that would create a lot of churn.

Alternatively, what if you rename REAL_CALL_STAT_INC to CALL_STAT_INC_ALWAYS and then just use CALL_STAT_INC_ALWAYS directly from _PyEvalFramePushAndInit (which I think is the only call site impacted by this change)? That would get rid of the weird dance of #undef and restoring just CALL_STAT_INC (though admittedly it would just replace it with another subtlety in _PyEvalFramePushAndInit).
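
A rough sketch of what this alternative could look like, under stated assumptions: `CALL_STAT_INC_ALWAYS` is the name proposed above, while the counter `frames_pushed` and the functions `push_frame`, `silenced_site`, and `demo_always` are hypothetical and exist only for illustration. The idea is that the always-on macro is never touched by a later `#undef`, so a call site that uses it keeps counting.

```c
/* Hypothetical stand-in for a pystats counter. */
static long frames_pushed = 0;

/* Always-on macro: never undefined or redefined anywhere in the file. */
#define CALL_STAT_INC_ALWAYS(counter) ((counter)++)

/* The short name starts out as an alias for the always-on macro. */
#define CALL_STAT_INC(counter) CALL_STAT_INC_ALWAYS(counter)

/* A later section silences only the short name. */
#undef CALL_STAT_INC
#define CALL_STAT_INC(counter) ((void)0)

/* This call site uses the always-on macro directly, so it still counts. */
static void push_frame(void) {
    CALL_STAT_INC_ALWAYS(frames_pushed);
}

/* This call site uses the silenced short name, so it does not count. */
static void silenced_site(void) {
    CALL_STAT_INC(frames_pushed);
}

/* Only the push_frame() call contributes to the total. */
static long demo_always(void) {
    frames_pushed = 0;
    push_frame();
    silenced_site();
    return frames_pushed;
}
```

The trade-off is the one noted above: the `#undef` dance goes away, but the reader now has to know why one call site uses a differently named macro.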

gvanrossum · 2024-04-16T21:47:26Z

Here's another version, where I moved the interpreter loop (and a small assortment of related stuff) to the end of the file.

I ran pystats on a single loop of the Richards benchmark and found that most stats are approximately (not exactly) the same, except that "Frames pushed" is about 10% larger, which indicates that the fix works. The diff is much more annoying to review, but I promise I just moved the stuff from the original lines 606 - 1120 to the bottom of the file, except that I had to move #include "ceval_macros" to nearly the top (but after LLTRACE is defined).

gvanrossum · 2024-04-17T04:32:40Z

Benchmark says speed and memory unchanged.

markshannon · 2024-04-17T15:45:24Z

I think the correct fix for CALL_STAT_INC is to not #undef it at all.
It doesn't matter whether the call happens in tier 1 or tier 2. A call is a call.

In executor_cases.c.h CALL_STAT_INC occurs only once in _PUSH_FRAME and that should be counted.
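
As a sketch of this approach, assume a single definition of the macro that stays in effect for the whole file. The names `frames_pushed`, `tier1_call`, `tier2_push_frame`, and `demo_single_definition` are hypothetical; `tier2_push_frame` only stands in for the `_PUSH_FRAME` case mentioned above and is not the real generated code.

```c
/* Hypothetical stand-in for a pystats counter. */
static long frames_pushed = 0;

/* One definition, never undefined: every expansion increments. */
#define CALL_STAT_INC(counter) ((counter)++)

/* Stand-in for a call made by the Tier 1 interpreter. */
static void tier1_call(void) {
    CALL_STAT_INC(frames_pushed);
}

/* Stand-in for the Tier 2 _PUSH_FRAME case. */
static void tier2_push_frame(void) {
    CALL_STAT_INC(frames_pushed);
}

/* Every call is counted, regardless of which tier performed it. */
static long demo_single_definition(int tier1_calls, int tier2_calls) {
    frames_pushed = 0;
    for (int i = 0; i < tier1_calls; i++) {
        tier1_call();
    }
    for (int i = 0; i < tier2_calls; i++) {
        tier2_push_frame();
    }
    return frames_pushed;
}
```

With three Tier 1 calls and two Tier 2 calls, `demo_single_definition(3, 2)` returns 5, which matches the "a call is a call" rule.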

gvanrossum · 2024-04-17T17:16:03Z

> I think the correct fix for CALL_STAT_INC is to not #undef it at all.

Okay, I'll try that next.

gvanrossum · 2024-04-17T18:14:46Z

The proof will be in the pudding. I'll fire off two benchmark runs, with pystats, one plain, one with Tier 2. (The JIT pystats ought to be similar but I don't want to wait.)

gvanrossum · 2024-04-17T20:24:05Z

The benchmark using Tier 1 only shows 36.8% more frames pushed, which is very close to what I measured with the first version of this PR, so I think this fixes the issue. Everything else I looked at is basically unchanged, suggesting I'm not breaking anything.

Still waiting for the Tier 2 benchmark, will update when I see those numbers.

gvanrossum · 2024-04-17T23:58:34Z

The benchmark with Tier 2 poses a bit of a mystery, at least in the pystats diff.

Go to Call stats and open the details box.

- Frames pushed is 40% higher
- Calls to Python functions inlined is 30% higher (it was not higher in the Tier 1 pystats diff)

@markshannon, @mdboom, @brandtbucher: could there be some kind of double counting going on? Or is this an expected result? The only change from main now is that we don't undefine CALL_STAT_INC, which means that wherever that macro is used in Tier 2 it actually updates the count.

markshannon · 2024-04-18T10:28:05Z

I think the change is correct, even though the stats are probably still wrong.
I've created an issue for this (#118036).

gvanrossum added the skip issue and skip news labels Apr 16, 2024

gvanrossum requested a review from mdboom April 16, 2024 15:30

gvanrossum assigned mdboom Apr 16, 2024

gvanrossum requested a review from markshannon as a code owner April 16, 2024 15:30

bedevere-appbot added the awaiting core review label Apr 16, 2024

mdboom reviewed Apr 16, 2024

mdboom approved these changes Apr 16, 2024

gvanrossum added 2 commits April 16, 2024 14:22

Revert "Fix a bug with CALL_STAT_INC"
779f7f4
Let's try a different fix instead.

Move _PyEval_EvalFrameDefault to the end of the ceval.c
34584c3
This should fix the issue with CALL_STAT_INC in a cleaner way (even if the diff is much larger).

gvanrossum added 2 commits April 17, 2024 10:17

Revert "Move _PyEval_EvalFrameDefault to the end of the ceval.c"
681ca31
I'm going to try yet another approach.

Just don't redefine CALL_STAT_INC
5e20f0f

markshannon changed the title ~~Fix a bug with CALL_STAT_INC~~ GH-118036: Fix a bug with CALL_STAT_INC Apr 18, 2024

markshannon removed the skip issue label Apr 18, 2024

bedevere-appbot mentioned this pull request Apr 18, 2024
Call stats are incorrect for tier 2 and maybe for tier 1 as well #118036
Closed

markshannon approved these changes Apr 18, 2024

bedevere-appbot added awaiting merge and removed awaiting core review labels Apr 18, 2024

gvanrossum merged commit 40f4d64 into python:main Apr 18, 2024

bedevere-appbot removed the awaiting merge label Apr 18, 2024

gvanrossum deleted the call-stat-inc branch April 18, 2024 14:59
