Skip to content

Conversation

@markshannon
Copy link
Member

@markshannonmarkshannon commented Apr 30, 2024

Simplifies and unifies the behavior of
_GUARD_NOT_EXHAUSTED_RANGE
_GUARD_NOT_EXHAUSTED_LIST
_GUARD_NOT_EXHAUSTED_TUPLE
_FOR_ITER_TIER_TWO

Such that all leave just the iterator on the stack and they exit to the POP_TOP immediately after the associated END_FOR.

This fixes a bug in the tier 2 handling of _FOR_ITER_TIER_TWO where errors were treated as occurring at the jump target, not an the original instructions.

@markshannonmarkshannon changed the title 118095: Unify the behavior of tier 2 FOR_ITER branch micro-opsGH-118095: Unify the behavior of tier 2 FOR_ITER branch micro-opsApr 30, 2024
@markshannonmarkshannonforce-pushed the unify-tier-2-for-iter branch from 25a889a to bb7efd4CompareMay 1, 2024 11:01
@markshannon
Copy link
MemberAuthor

The stats are a bit confusing.

The number of traces executed goes up, but the number of uops executed goes down, as we would expect.
However, there is a large increase in the number of tier 1 FOR_ITER_TUPLE and FOR_ITER_LIST instructions executed.

Looking at the number of instructions executed for the various tier 1 and tier 2 FOR_ITER variants, what's happening becomes clearer:

Specialized

FOR_ITER_TUPLE +118M
FOR_ITER_LIST +236M
FOR_ITER_RANGE +22M

_ITER_CHECK_TUPLE + 165M
_ITER_CHECK_LIST + 87M
_ITER_CHECK_RANGE + 65M

Unspecialized

FOR_ITER -16M
_FOR_ITER_TIER_TWO -1081M

Specialization is being improved: we are executing more specialized T1 and T2 variants and much fewer unspecialized
FOR_ITER and _FOR_ITER_TIER_TWOs.

@brandtbucher
Copy link
Member

The test hangs for tier two seem to be in test_capi.test_misc.TestPendingCalls. I have a hunch the culprit is GH-117442... so that should probably be fixed first?

Copy link
Member

@gvanrossumgvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seem to be some unrelated cleanups -- maybe minimize those or extract them to another PR? Otherwise LGTM.

#defineMAX_EXECUTORS_SIZE 256

#ifdefPy_DEBUG
staticintbase_opcode(PyCodeObject*code, intoffset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
staticintbase_opcode(PyCodeObject*code, intoffset)
staticint
base_opcode(PyCodeObject*code, intoffset)

Comment on lines +170 to +175
if (_Py_uop_sym_is_not_null(sym)){
sym_set_bottom(sym);
return false;
}
sym_set_flag(sym, IS_NULL);
return!_Py_uop_sym_is_bottom(sym);
returntrue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this refactoring matter? If so, why not do the same for set_non_null below?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a refactoring.
Calling _Py_uop_sym_set_null on a non-NULL symbol would fail an assertion in _Py_uop_sym_is_bottom

And yes, it should be applied to set_non_null as well.

Comment on lines 767 to 773
#ifdefPy_DEBUG
uint32_tnext_inst=target+1+INLINE_CACHE_ENTRIES_FOR_ITER+ (oparg>255);
uint32_tjump_target=next_inst+oparg;
assert(base_opcode(code, jump_target) ==END_FOR||
base_opcode(code, jump_target) ==INSTRUMENTED_END_FOR);
assert(base_opcode(code, jump_target+1) ==POP_TOP);
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to include the case also in {... } to limit the scope of the two variables declared in debug mode.

@markshannonmarkshannon marked this pull request as ready for review May 2, 2024 15:14
@markshannonmarkshannon merged commit 72867c9 into python:mainMay 2, 2024
@markshannonmarkshannon deleted the unify-tier-2-for-iter branch May 2, 2024 15:18
SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024
…ps (pythonGH-118420) * Target _FOR_ITER_TIER_TWO at POP_TOP following the matching END_FOR * Modify _GUARD_NOT_EXHAUSTED_RANGE, _GUARD_NOT_EXHAUSTED_LIST and _GUARD_NOT_EXHAUSTED_TUPLE so that they also target the POP_TOP following the matching END_FOR
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

@markshannon@brandtbucher@gvanrossum