Skip to content

Conversation

@gvanrossum
Copy link
Member

@gvanrossumgvanrossum commented Sep 6, 2023

  • Add cache entries to bytecodes.c and update them (but don't use them yet)
  • Make tests pass
  • Use cache entries for branch prediction
  • Add new tests
  • Initialize cache entries to 0x5555 (0b_0101_0101_0101_0101)
  • Buildbots
  • Benchmark

@gvanrossumgvanrossum changed the title Branch prediction for Tier 2 interpretergh-109039: Branch prediction for Tier 2 interpreterSep 6, 2023
#if ENABLE_SPECIALIZATION
next_instr->cache = (next_instr->cache << 1) | flag;
#endif
JUMPBY(oparg * flag);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need also a SKIP_OVER the cache?

I'm guessing that could cause the assert(frame->prev_instr == instr); in _Py_call_instrumentation_jump to fail for the instrumented jumps.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That skip over the cache is already generated -- see the corresponding code in generated_cases.c.h.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the "next_instr += 1;"? In all other cases there is an explicit SKIP_OVER(INLINE_CACHE_ENTRIES_...) in bytecodes.c.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those SKIP_OVER() calls are always followed by a DISPATCH() call (or maybe a goto).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Can we make the code generator emit SKIP_OVER(X) instead of next_instr += x;?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, though IIRC Mark at some point objected to emitting macros. So I'd rather keep the status quo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markshannon What is the reason not to emit macros?

A reason to emit them is so that they are implemented in one place, so if their implementation changes you only change there. Do we want to change the code generator (and to remember that we need to) every time the implementation of a macro like SKIP_OVER changes?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I don't expect SKIP_OVER() to ever change. In hand-written code the macro expresses the intent better. But in generated code it just obscures what happens. I had to go to some lengths to change PEEK() and POKE() calls in the generated code to using stack_pointer[x] instead; I don't want to go back. If you still disagree, try engaging @markshannon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you still disagree,

More like trying to understand than disagreeing.

try engaging @markshannon.

Yes I directed my previous comment to him.

This is needed so branch prediction can work.
Alas, this goes untested (how to test it?). In INSTRUMENTED_POP_JUMP_IF_NOT_NONE, rename flag to nflag to emphasise that its sense is reversed (this is the only op that jumps if the flag is false, because there's no Py_IsNotNone() function). (Alternatively, we could have changed the sense of the flag, but that would have been more work.)
@gvanrossumgvanrossum marked this pull request as ready for review September 8, 2023 00:44
@gvanrossum
Copy link
MemberAuthor

Benchmark is running, will post results here. I think I've addressed all actionable review comments. Please review.

@gvanrossum
Copy link
MemberAuthor

gvanrossum commented Sep 8, 2023

Benchmark is neutral(*), which is a good thing (it means adding a cache entry to the branch instructions didn't slow anything down).


(*) Or possibly some benchmarks crashed. I'm going to run buildbots to be sure.

@gvanrossumgvanrossum added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 8, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit 1850988 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Sep 8, 2023
family_member_names.update(family.members)
for instr in self.instrs.values():
if (
instr.name not in family_member_names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to exclude family members from this table?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's tradition -- the table only contains the data for family heads and is always consulted after looking up the deoptimized opcode in _PyOpcode_Deopt.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exclude family members from this table?

That's tradition

At first glance I thought this was some reference to a hypothetical family in the midst of conflict. 🙂

@gvanrossum
Copy link
MemberAuthor

Hm. Many buildbots fail on test_sys_settrace, but I can't (yet) reproduce it. Must be about build flags.

This time by disabling the optimizer.
@AlexWaygood
Copy link
Member

AlexWaygood commented Sep 8, 2023

Hm. Many buildbots fail on test_sys_settrace, but I can't (yet) reproduce it. Must be about build flags.

Unlikely that it's to do with this PR, as it's already happening on main -- see #109052 and #109143

@gvanrossum
Copy link
MemberAuthor

Unlikely that it's to do with this PR, as it's already happening on main -- see #109052 and #109143

Thanks, I'll not worry about it then.

Buildbots are beginning to lose their value for me. :-(

@gvanrossum
Copy link
MemberAuthor

I'll fix the conflict, then merge this.

@gvanrossumgvanrossum enabled auto-merge (squash) September 11, 2023 17:36
@gvanrossum
Copy link
MemberAuthor

(Sorry, several reviewers got a review request because I changed the bytecode magic number. Please ignore.)

@gvanrossumgvanrossum merged commit bcce5e2 into python:mainSep 11, 2023
@gvanrossumgvanrossum deleted the count-branches branch September 11, 2023 18:21
@markshannon
Copy link
Member

This introduced a regression in branch and jump monitoring, as the target is off by one.
The line numbers in test_monitoring should be unchanged from 3.12.

@gvanrossum
Copy link
MemberAuthor

This introduced a regression in branch and jump monitoring, as the target is off by one. The line numbers in test_monitoring should be unchanged from 3.12.

Okay, can you give me a hint on what went wrong? I haven't been following how the instrumentation works in detail, and I have no idea which bits are being tested by the tests I modified, or what I should fix. (If it's involved, please open a new issue and CC me.)

@markshannon
Copy link
Member

It is fixed in #111486, so nothing to worry about. Just putting it here for the record.

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

@gvanrossum@bedevere-bot@AlexWaygood@markshannon@iritkatriel@ericsnowcurrently