@JelleZijlstra
Member

JelleZijlstra commented Sep 28, 2024

This is what Larry wants, and so it shall be. It's a bit of a hack,
but it's localized and not too bad.

@JelleZijlstra
Member Author

cc @larryhastings @carljm

@larryhastings
Contributor

larryhastings commented Sep 29, 2024

Please add three tests that use an annotation of format, which is defined in a closure, class scope, and module scope respectively.
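For readers following along, here is a rough sketch of what "an annotation of format" looks like when the name is defined in each of the three scopes. These are not the tests added in the PR; the function and class names are made up for illustration, and the sketch only reads `__annotations__` rather than exercising the compiler-generated annotate function directly.

```python
# Illustrative only: these are NOT the tests from the PR, and every name
# below except `format` is hypothetical.
format = "module-level"  # deliberately shadows the builtin at module scope


def module_case(x: format):
    """The annotation refers to the module-level name."""


class ClassCase:
    format = "class-level"
    x: format  # the annotation refers to the class-level name


def make_closure_case():
    format = "closure-level"

    def inner(x: format):
        """The annotation refers to the enclosing function's local."""

    return inner


closure_case = make_closure_case()

print(module_case.__annotations__)
print(ClassCase.__annotations__)
print(closure_case.__annotations__)
```

In each case the annotation should resolve to the user's `format`, not to anything internal to the annotation machinery.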

@JelleZijlstra
Member Author

@larryhastings done.

@larryhastings
Contributor

It just hit me--do you mind adding a fourth that fails because format is not defined? I mean, let's cover all our bases here. Let no one accuse us of not doing a thorough job!

@JelleZijlstra
Member Author

Done. Because format() is a builtin, getting a NameError is a little bit involved.
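To see why this is "a little bit involved": a bare `format` normally falls through to the builtins namespace, so it resolves even when nothing defines it. The sketch below shows one way to force a NameError, by evaluating the name against an empty `__builtins__` mapping. This is an assumption about how one could do it, not necessarily the approach the PR's test takes.

```python
import builtins

# The bare name `format` normally resolves through the builtins namespace.
default_namespace = {}
resolved = eval("format", default_namespace)  # eval() adds __builtins__ itself

# Supplying an empty __builtins__ mapping removes that fallback, so the
# lookup has nowhere left to find the name.
bare_namespace = {"__builtins__": {}}
try:
    eval("format", bare_namespace)
except NameError as exc:
    lookup_error = exc
else:
    lookup_error = None

print(resolved is builtins.format)
print(type(lookup_error).__name__)
```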

Contributor

larryhastings left a comment

Really just needless munging on the comments, and a question. If you checked it in as-is it'd be fine.

if (size == -1) {
    return ERROR;
}
PyObject *new_names = PyTuple_New(size);

Contributor

This isn't a critique of your approach, but--I'm surprised you needed to go to all this effort. Why was it necessary to make a new tuple, write the new value for index 0, copy over the other values, and release the reference to the old tuple? I'm assuming the reference count of co_localsplusnames is currently 1; I would have asserted that, then overwritten the first entry. I grant you your approach is more conceptually hygienic, but in practice I assume the quick-and-dirty approach would work equally well.

What am I missing?
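For comparison, the copy-based approach described above has a rough Python-level analogue: build a new names tuple with a different entry at index 0 and attach it to a new code object. This is only a sketch; the PR does this in C inside the compiler, and `annotate_sketch` and `_format` are made-up names, not what CPython uses.

```python
import types


def annotate_sketch(_format):
    """Stand-in for a compiler-generated function; `_format` is hypothetical."""
    return _format


old_code = annotate_sketch.__code__
old_names = old_code.co_varnames

# Build a brand-new tuple: new value at index 0, remaining entries copied over.
new_names = ("format",) + old_names[1:]

# code.replace() returns a new code object; the original is left untouched.
new_code = old_code.replace(co_varnames=new_names)
renamed = types.FunctionType(new_code, annotate_sketch.__globals__)

print(old_code.co_varnames, renamed.__code__.co_varnames)
```

Because nothing is mutated, the old code object and its names tuple remain valid for anyone still holding them, which is the property the copy-based approach is paying for.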

Member Author

While it is possible to mutate tuples in C code, it feels riskier. For example, maybe we'll make changes in the future that rely on tuples being immutable.

Contributor

I assure you, this is a long-standing CPython idiom. We've relied on "if there's only one reference to an object, and you own it, you may modify the object however you like" for decades now.

For fun I made a survey of CPython, literally examining every instance of PyTuple_SET_ITEM. (I didn't try the other spellings.) I found a bunch of sites where we do this. In nearly every instance the code is structured as follows:

if there's only one reference to the tuple (which we own)
    modify the tuple in place
else
    create a new tuple

(I'll append the list of such sites at the bottom of this comment.)

Clearly these existing sites are optimizations; instead of destroying the old tuple and creating a fresh one, they're just reusing the existing tuple. They have a harder time of it because generally the tuple has been shown to the interpreter. In our case, we have a freshly compiled code object that hasn't been shown to the interpreter. So there's no chance anyone else has taken any references yet.

If we did change CPython so this was no longer viable, the developer making that change would have to fix all the sites I listed below, which they would probably find the same way I did--looking for all places where people set things in tuples. I don't think modifying the tuple directly would trip up such a future developer.

So, yeah, I really do think it'd be safe to modify the tuple in-place. Just to be totally safe, I'd check the reference count was 1 and raise if it wasn't. (It'd only happen if someone was hacking on compile.c or something, at which point they would deal with it. This would never raise in the wild.)

I don't actually mind you doing it the hard way--we can ship it like this. It just seems needless. We have a longstanding idiom that lets us skip the laborious approach you took. But I'm not gonna fight you about it.


Places where CPython modifies tuples in-place:

compile.c does it a couple times in its internal cache objects. Never exposed to the user (I think).

zip_next in bltinmodule.c, uses _PyObject_IsUniquelyReferenced.

odictiter_iternext in odictobject.c, uses (Py_REFCNT(result) == 1).

enum_next_long in enumobject.c, uses if (Py_REFCNT(result) == 1).

dictiter_iternextitem in dictobject.c, uses _Py_IsOwnedByCurrentThread.

dictreviter_iter_lock_held in dictobject.c, uses Py_REFCNT(result) == 1.

intern_constants in codeobject.c, doesn't check ownership, this is in con->consts and I assume that's internal.

Five places in itertoolsmodule.c: pairwise_next, combinations_next, cwr_next, permutations_next, zip_longest_next, all use Py_REFCNT(result) == 1.

p.s. you should see the if-only-one-reference-modify-the-object shenanigans in the Unicode object!
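The "one reference, so modify in place, else create a new tuple" pattern from the comment above can be modeled in a few lines. This is a toy model, not CPython code: real tuples cannot be mutated from Python, so `ToyTuple`, its explicit `refcount` field, and `set_item_reusing` are all hypothetical stand-ins for what the C sites do with Py_REFCNT and PyTuple_SET_ITEM.

```python
class ToyTuple:
    """Toy stand-in for a C-level tuple with an explicit reference count."""

    def __init__(self, items):
        self.items = list(items)
        self.refcount = 1  # the creator owns the only reference


def set_item_reusing(tup, index, value):
    """Return a tuple whose items[index] is value, reusing `tup` when safe."""
    if tup.refcount == 1:
        # Sole owner: nobody else can observe the change, so mutate in place.
        tup.items[index] = value
        return tup
    # Shared: leave the original alone and build a fresh tuple instead.
    fresh = ToyTuple(tup.items)
    fresh.items[index] = value
    tup.refcount -= 1  # give up our reference to the old tuple
    return fresh


owned = ToyTuple(["a", "b"])
reused = set_item_reusing(owned, 0, "x")

shared = ToyTuple(["a", "b"])
shared.refcount = 2  # pretend someone else also holds a reference
copied = set_item_reusing(shared, 0, "x")

print(reused is owned, copied is shared)
```

The in-place branch is purely an optimization; the copy branch is what the PR does unconditionally.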

Member Author

See #127058 where @markshannon proposes to deprecate existing tuple-mutation shenanigans. That strengthens my conviction that we shouldn't introduce a new tuple mutation here.

@bedevere-app

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@JelleZijlstra
Member Author

I have made the requested changes; please review again

@bedevere-app

Thanks for making the requested changes!

@larryhastings: please review the changes made to this pull request.

@JelleZijlstra
Member Author

@larryhastings would you mind taking another look here? It appears I can't merge while your review remains unresolved.

JelleZijlstra merged commit 3480124 into python:main Dec 30, 2024
42 of 43 checks passed
JelleZijlstra deleted the coformatname branch December 30, 2024 16:19
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Jan 8, 2025