GH-98831: Implement basic cache effects #99313

gvanrossum · 2022-11-10T06:16:54Z

I apologize for the mess that generate_cases.py has become. I promise I will clean it up in the next PR.

This PR is a big step forwards though -- it supports cache effects and implements those for the BINARY_OP family (with one exception -- the "hemi-super-instruction" BINARY_OP_INPLACE_ADD_UNICODE). Check the generated code for the effects.

PS. Merge conflicts for Python/bytecodes.c are quite painful, it seems there are several cooks in this kitchen. :-)

Issue: Generate the interpreter #98831

gvanrossum · 2022-11-11T03:48:42Z

PS. I didn't implement a family that actually uses the cache (the 'counter' doesn't count, it's special since it is written, which our DSL doesn't support). But I figured I'd stop here -- keeping these PRs open for a long time is hard work due to merge conflicts.

gvanrossum · 2022-11-13T21:40:38Z

PS. I didn't implement a family that actually uses the cache (the 'counter' doesn't count, it's special since it is written to, which our DSL doesn't support). But I figured I'd stop here -- keeping these PRs open for a long time is hard work due to merge conflicts.

Working on the refactor I now know for sure there are some bugs in that part. (EDIT: Fixed in GH-99408 but not here.)

markshannon

Maybe leave checking of families to another PR, and just implement stack effects in this PR?

Python/bytecodes.c

Tools/cases_generator/generate_cases.py

bedevere-bot · 2022-11-15T10:28:11Z

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

gvanrossum · 2022-11-15T15:59:51Z

I have made the requested changes; please review again

(Well, I've answered everything and would like to merge this as-is so Brandt can continue on GH-99399.)

bedevere-bot · 2022-11-15T15:59:55Z

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

brandtbucher

Looks good, thanks!

I've sprinkled a bunch of random notes and questions throughout. Feel free to fix now, later, or never. :)

brandtbucher · 2022-11-16T01:30:04Z

Tools/cases_generator/parser.py

-def outputs(self):
-    return self.header.outputs
+def outputs(self) -> list[StackEffect]:
+    # This is always true


What's always true?

isinstance(x, StackEffect). It's gone in the next refactor.

brandtbucher · 2022-11-16T01:35:06Z

Tools/cases_generator/generate_cases.py

+for ceffect in cache:
+    if ceffect.name != "unused":
+        bits = ceffect.size * 16
+        f.write(f"{indent} PyObject *{ceffect.name} = read{bits}(next_instr + {cache_offset});\n")


Not that it matters yet, but these are almost always fixed-width integer types, not objects (though we'll eventually want handling for objects too):
Suggested change
-f.write(f"{indent}PyObject *{ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
+f.write(f"{indent}uint{bits}_t {ceffect.name} = read{bits}(next_instr + {cache_offset});\n")
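A minimal sketch of the emission step under discussion, not the PR's actual generator code. `CacheEffect` and `emit_cache_reads` are hypothetical names introduced for illustration, and the sketch assumes that `size` counts 16-bit code units and that `unused` entries still advance the offset. It emits one C declaration per named cache entry, using the fixed-width integer type from the suggestion.

```python
from dataclasses import dataclass


@dataclass
class CacheEffect:
    """Hypothetical stand-in for the generator's cache effect type."""
    name: str
    size: int  # assumed to be the number of 16-bit code units


def emit_cache_reads(cache: list[CacheEffect], indent: str = "    ") -> list[str]:
    """Return one C declaration per named cache entry."""
    lines = []
    cache_offset = 0
    for ceffect in cache:
        if ceffect.name != "unused":
            bits = ceffect.size * 16
            lines.append(
                f"{indent}uint{bits}_t {ceffect.name} = "
                f"read{bits}(next_instr + {cache_offset});"
            )
        # Assumption: skipped entries still occupy space in the cache.
        cache_offset += ceffect.size
    return lines
```

With a one-unit `unused` entry followed by a two-unit entry, only the second produces a declaration, and it reads from offset 1.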

brandtbucher · 2022-11-16T01:40:22Z

Python/bytecodes.c

+ };
+
+
+inst(BINARY_OP_MULTIPLY_INT, (left, right, unused/1 -- prod)) {


Just my preference, but I sort of prefer a name like _ to a name like unused for our syntax here. I guess it just feels more "special", and doesn't distract:
Suggested change
-inst(BINARY_OP_MULTIPLY_INT, (left, right, unused/1 -- prod)) {
+inst(BINARY_OP_MULTIPLY_INT, (left, right, _/1 -- prod)) {

brandtbucher · 2022-11-16T01:44:40Z

Python/bytecodes.c

 }

-inst(BINARY_OP_MULTIPLY_INT, (left, right -- prod)) {
+family(binary_op, INLINE_CACHE_ENTRIES_BINARY_OP) = {


Honestly, I don't think we really need the INLINE_CACHE_ENTRIES_WHATEVER stuff (or the asserts it produces) in this file anymore (they were originally added to simplify the JUMPBY(...) moves, but those are going to be generated now).
I feel like it sort of just complicates parsing and code generation for no real benefit... plus, we actually already have asserts to this effect in specialize.c where we ended up re-using these constants in some places.
Suggested change
-family(binary_op, INLINE_CACHE_ENTRIES_BINARY_OP) = {
+family(binary_op) = {

brandtbucher · 2022-11-16T01:45:32Z

Python/bytecodes.c

 static PyObject *value, *value1, *value2, *left, *right, *res, *sum, *prod, *sub;
-static PyObject *container, *start, *stop, *v;
+static PyObject *container, *start, *stop, *v, *lhs, *rhs;


brandtbucher · 2022-11-16T01:58:37Z

Tools/cases_generator/generate_cases.py

+        instr, predictions, indent, f,
+        cache_size=find_cache_size(instr, families)
+    )
+    effects_table[instr.name] = len(instr.inputs), len(instr.outputs), cache_offset


Hm. This is a bit weird because it treats caches and stack items the same, which doesn't make much sense. It seems to me we'd want something that captures just stack effect and cache size:
Suggested change
-effects_table[instr.name] = len(instr.inputs), len(instr.outputs), cache_offset
+stack_pre = sum(isinstance(item, StackEffect) for item in instr.inputs)
+stack_post = sum(isinstance(item, StackEffect) for item in instr.inputs)
+effects_table[instr.name] = stack_post - stack_pre, cache_offset
(This will probably get more complicated as stack effects start getting more complicated...)
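A rough sketch of the separation proposed here, with stack items and cache entries counted independently. `StackEffect` and `CacheEffect` below are hypothetical stand-ins, and `summarize_effects` is an illustrative name. The suggestion as quoted counts `instr.inputs` on both lines; this sketch assumes `stack_post` was meant to count the outputs.

```python
from dataclasses import dataclass


@dataclass
class StackEffect:
    """Hypothetical stand-in for a stack input or output."""
    name: str


@dataclass
class CacheEffect:
    """Hypothetical stand-in for an inline cache entry."""
    name: str
    size: int  # assumed to be the number of 16-bit code units


def summarize_effects(inputs: list, outputs: list) -> tuple[int, int]:
    """Return (net stack effect, total cache size) for one instruction."""
    stack_pre = sum(isinstance(item, StackEffect) for item in inputs)
    # Assumption: the post-count is taken over the outputs.
    stack_post = sum(isinstance(item, StackEffect) for item in outputs)
    cache_size = sum(item.size for item in inputs if isinstance(item, CacheEffect))
    return stack_post - stack_pre, cache_size
```

For an instruction shaped like `(left, right, unused/1 -- prod)` this yields a net stack effect of -1 and a cache size of 1, whereas the original tuple would have reported three inputs.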

brandtbucher · 2022-11-16T02:03:21Z

Tools/cases_generator/parser.py

+raise self.make_syntax_error(
+    f"Input {name!r} at pos {i} repeated in output at different pos {j}")


Why is this bad? Seems useful for things like copies and swaps (unless the intention is to have the author assign the output to a new name anyways)?
Maybe it complicates refcounting somehow? Either way, might be worth a comment.

Good question. I thought this was in Mark's DSL spec but I can't find it; I probably just misread something. Intuitively, this rules out cases like
inst(FOO, (left, right -- right)){DECREF(left)}
which requires shifting right down by one unit, and that seems a bit unexpected (could be caused by a typo?). But you're right, it doesn't cause any complications in the code generator, we'll just generate code like
{PyObject *left = PEEK(2), *right = PEEK(1); DECREF(left); STACK_SHRINK(1); POKE(1, right)}
I'll get rid of this check in the next refactor (it's in the wrong place anyway).
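The generated C quoted above can be traced with a small Python model of the value stack. This is only an illustration: `run_foo` is a hypothetical name, the stack is a plain list with its top at the end, and the refcount change from `DECREF(left)` is not modeled.

```python
def run_foo(stack: list[str]) -> list[str]:
    """Model the stack changes of inst(FOO, (left, right -- right))."""
    stack = list(stack)  # work on a copy
    left = stack[-2]     # PEEK(2)
    right = stack[-1]    # PEEK(1)
    del left             # stands in for DECREF(left); no refcounting here
    stack.pop()          # STACK_SHRINK(1)
    stack[-1] = right    # POKE(1, right)
    return stack
```

Starting from `["x", "a", "b"]`, the result is `["x", "b"]`: `right` has moved down by one slot, which is the shift described in the reply.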

brandtbucher · 2022-11-16T02:05:33Z

Tools/cases_generator/parser.py

+break

-def stack_effect(self) -> tuple[list[str], list[str]]:
+def stack_effect(self) -> tuple[list[Effect], list[Effect]]:


I didn't look too closely at anything below this line (not a parsing expert). I'm sure it works fine, though. :)

brandtbucher · 2022-11-16T02:07:11Z

Tools/cases_generator/parser.py

+while self.expect(lx.COMMA):
+    if tkn := self.expect(lx.IDENTIFIER):
+        members.append(tkn.text)
+    else:
+        break


I find this control flow easier to follow:
Suggested change
-while self.expect(lx.COMMA):
-    if tkn := self.expect(lx.IDENTIFIER):
-        members.append(tkn.text)
-    else:
-        break
+while self.expect(lx.COMMA) and (tkn := self.expect(lx.IDENTIFIER)):
+    members.append(tkn.text)

I'm not sure I agree. Your rewrite is more compact but makes it easy to overlook that this code accepts a trailing comma. The longer form makes you pause and notice that.
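To make the trailing-comma point concrete, here is a small self-contained sketch. `MiniParser` is a hypothetical stand-in for the real parser, and it assumes `expect()` consumes a token only when the kind matches. Under that assumption both loop shapes consume a trailing comma and collect the same members; the difference is only in how visible that behaviour is.

```python
class MiniParser:
    """Hypothetical token-list parser; expect() consumes only on a match."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, kind):
        if self.pos < len(self.tokens) and self.tokens[self.pos][0] == kind:
            tkn = self.tokens[self.pos]
            self.pos += 1
            return tkn
        return None


def members_nested(p):
    """Longer form: the else branch makes the trailing-comma exit explicit."""
    members = []
    while p.expect("COMMA"):
        if tkn := p.expect("IDENTIFIER"):
            members.append(tkn[1])
        else:
            break
    return members


def members_compact(p):
    """Compact form: the comma is consumed before the identifier test fails."""
    members = []
    while p.expect("COMMA") and (tkn := p.expect("IDENTIFIER")):
        members.append(tkn[1])
    return members


# Token stream for ", a, b," (note the trailing comma).
tokens = [("COMMA", ","), ("IDENTIFIER", "a"),
          ("COMMA", ","), ("IDENTIFIER", "b"), ("COMMA", ",")]
nested, compact = MiniParser(tokens), MiniParser(tokens)
nested_members = members_nested(nested)
compact_members = members_compact(compact)
```

Both parsers end at position 5, past the final comma, so neither form rejects it.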

brandtbucher · 2022-11-16T02:08:51Z

Tools/cases_generator/parser.py

 if (tkn := self.expect(lx.IDENTIFIER)):
-    if self.expect(lx.LBRACKET):
-        if arg := self.expect(lx.IDENTIFIER):
-            if self.expect(lx.RBRACKET):
-                return f"{tkn.text}[{arg.text}]"
-            if self.expect(lx.TIMES):
-                if num := self.expect(lx.NUMBER):
-                    if self.expect(lx.RBRACKET):
-                        return f"{tkn.text}[{arg.text}*{num.text}]"
-        raise self.make_syntax_error("Expected argument in brackets", tkn)
-
-    return tkn.text
-if self.expect(lx.CONDOP):
-    while self.expect(lx.CONDOP):
-        pass
-    return "??"
-return None
+    if self.expect(lx.DIVIDE):
+        if num := self.expect(lx.NUMBER):


I noticed that this file has pretty aggressive if nesting. Out of curiosity, any reason why you prefer not to combine many ifs into one test? Maybe it fits your mental model of the parser better?

The latter, mostly. It makes it easier to add an else clause later. Also, I like to test only one condition per line. The parser does need a bit of cleanup, but it's clean enough for now.

gvanrossum added 2 commits November 9, 2022 18:38

Support simple cache effects
9f15c4b
Had to refactor the parser a bit for this.

More BINARY_OP instructions
6189043

bedevere-bot mentioned this pull request Nov 10, 2022
Generate the interpreter #98831
Closed

bedevere-bot added the awaiting core review label Nov 10, 2022

gvanrossum added the skip news label Nov 10, 2022

gvanrossum and others added 20 commits November 10, 2022 07:27

Merge remote-tracking branch 'origin/main' into cache-effects
f5e1aed

Tweak dummy definitions in bytecodes.c after merge
a8d608d

pythongh-99300: Use Py_NewRef() in Objects/ directory (python#99332)
4ee85e7
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in C files of the Objects/ directory.

pythongh-99300: Use Py_NewRef() in Objects/dictobject.c (python#99333)
873da31
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in Objects/dictobject.c.

pythongh-90110: Update the C-analyzer Tool (pythongh-99307)
e0ab5b8

pythongh-99277: remove older version of get_write_buffer_limits (py…
882fdec
…thon#99280) Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>

pythonGH-99298: Don't perform jumps before error handling (pythonGH-9…
1aa0124
…9299)

pythonGH-98831: Remove all remaining DISPATCH() calls from bytecodes.c (
d094e42
python#99271) Also mark those opcodes that have no stack effect as such. Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>

Remaining BINARY_OP family members
d3d907a

Uniformly skip 'unused' effects
0339a67

Remove superfluous asserts; fix one 'is not'
f3e7dd6

Make BINARY_OP result unused
e3ff6ac

Fix parser for family()
3db443a

Check family consistency
c58a85a

Add first family (binary_op)
756a41b

Add assert() to double-check cache struct size
48400ac

Merge commit '00ee6d506e' into cache-effects
433243a

Merge commit '694cdb24a6' into cache-effects
3d51484

Merge remote-tracking branch 'origin/main' into cache-effects
4d42a0a

gvanrossum marked this pull request as ready for review November 11, 2022 01:20

gvanrossum requested review from brandtbucher and markshannon November 11, 2022 02:34

brandtbucher mentioned this pull request Nov 11, 2022
GH-98686: Get rid of BINARY_OP_GENERIC and COMPARE_OP_GENERIC #99399
Merged

gvanrossum mentioned this pull request Nov 12, 2022
GH-98831: Refactor generate_cases.py #99408
Closed

markshannon requested changes Nov 15, 2022

bedevere-bot added awaiting changes and removed awaiting core review labels Nov 15, 2022

bedevere-bot added awaiting change review and removed awaiting changes labels Nov 15, 2022

bedevere-bot requested a review from markshannon November 15, 2022 15:59

brandtbucher approved these changes Nov 16, 2022

bedevere-bot added awaiting merge and removed awaiting change review labels Nov 16, 2022

gvanrossum merged commit e37744f into python:main Nov 16, 2022

bedevere-bot removed the awaiting merge label Nov 16, 2022

gvanrossum deleted the cache-effects branch December 8, 2022 22:42
