Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-107137: Add _PyTuple_NewNoTrack() internal C API#107183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
vstinner commented Jul 24, 2023 • edited by bedevere-bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-bot
Uh oh!
There was an error while loading. Please reload this page.
Fix a crash in PySequence_Tuple() when the tuple which is being initialized was accessed via GC introspection, like the gc.get_referrers() function. Now the function only tracks the tuple by the GC once it's fully initialized. Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to the internal C API: similar to PyTuple_New() and _PyTuple_Resize(), but don't track the tuple by the GC. Modify PySequence_Tuple() to use these functions.
vstinner commented Jul 24, 2023
vstinner commented Jul 24, 2023
gc.get_referrers() documentation has a warning added in 2003 by issue #39117:
|
vstinner commented Jul 24, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
@pablogsal: in 2021, you wrote:
Are you still against the GC bug in PySequence_Tuple()? Note: I'm only proposing to add functions to the internal C API. Make such API public would deserves its own discussion. Also, I proposed a different API to fix the issue, an internal "tuple builder" API: PR #107139. |
vstinner commented Jul 24, 2023
I mark this PR as a draft since in 2021, the issue #59313 was closed as "not a bug" by @pablogsalwho wrote:
|
gvanrossum commented Jul 25, 2023
This leaves me worried. The issue where the crash due to incomplete tuples (gh-59313) was closed, basically telling people that this is an acceptable risk, and the risk is due to the existence of cg.get_referrers() (see also gh-39117). It seems that using An alternative way to deal with this is to not call Introducing a new internal API, I feel we have two choices here:
I don't feel that just fixing (This discussion should probably go into an issue, but the issues are all closed.) |
vstinner commented Jul 25, 2023
Correct
It sounds like a backward incompatible changes. Such subtle change reminds me when instances of heap types should now call Py_DECREF(Py_TYPE(self)) in their tp_dealloc function, implement tp_traverse and tp_clear, visit the type in tp_traverse, and clear the type in tp_clear :-( I'm not saying that so many changes are needed. My worry is that such incompatible change can be silently ignored, and "suddenly" people will get new GC problems that they didn't have before since their tuples were tracked by the GC. Not tracking tuples by the GC sounds worse then the risk of race condition of code calling gc.get_referrers().
That's why I have a bigger plan: provide a new API to replace PyTuple_New() + PyTuple_SET_ITEM(): a "tuple builder" API, issue #107137. The migration plan doesn't imply to remove immediately the old API. It can be slow. Moreover, at this stage, I only propose an internal C API and so how it goes in Python internals first, how the API is used, and see if it does help fixing the race conditions for us at least.
Sure, that's also an option. But even if we decide to not provide a solution to fix 3rd party C extensions, does it mean that we should not fix known bugs in Python internals if there is a simple solution for that?
I created issue #107137. These days, most discussions happen on pull requests, issues are just created to get "an issue number" for the Changelog and What's New entries :-) I don't understand well the rationale against fixing PySequence_Tuple(). Do you mean that my internal API is too complicated to use? |
gvanrossum commented Jul 25, 2023
APIs, even internal ones, are forever. I would like you to ask other core devs to chime in here rather than it just being you and me. After all the problem has existed for decades, so why hurry now? |
vstinner commented Jul 25, 2023
The difference is that we don't provide any backward compatibility warranty on the internal C API. We have greater freedom to change it (rename functions, add parameters, remove functions, etc.)
Sure, I would like to hear more voices. I pinged multiple core devs on the issue, on my first PR and this second PR. I only created this issue very recently, other people seem to be busy (the EuroPython is just over ;-)). Why makes you think that there is a hurry? |
gvanrossum commented Jul 25, 2023
Because you have a pattern of making changes, requesting a review (or not!), and merging before you actually get a review. See e.g. gh-106871. |
davidhewitt commented Jul 30, 2023
From PyO3's perspective, if you want us to replace PyO3 can then use either |
vstinner commented Aug 14, 2023
Well. I created this PR to get reviews. 3 weeks later: no review. I pinged 3 core devs (including you). I tried to explain my problem with reviews a few times:
How can I get a review on this internal C API to fix a GC crash? :-) |
gvanrossum commented Aug 14, 2023
Not true. I wrote two separate review comments, but stating concerns about API bloat (even internal) and pointing out that the issue was closed as "won't fix" for a good (IMO) reason. What more do you want? We have one core dev (me) still against and nobody else in favor. In that case the status quo wins. |
vstinner commented Aug 14, 2023
Since issue #59313 got closed as "wont fix" two years ago, the C API was discussed and a working group is trying to make it better and address C API Problems. I'm proposing a concrete solution to one of these problems. I wrote a concrete PR to discuss the problem and my specific solution. As explained in the issue related to this PR, issue #107137, the problem is that the C API allows to create incomplete and inconsistent Python objects: capi-workgroup/problems#56 We can incrementally fix the C API to reduce this problem and design a migration path to a safer API which doesn't allow that. Fixing the problem in the internal C API is a first step, and IMO it's worth it even if the work stops here (since it does fix a concrete crash).
Well, it would be nice to hear other voices. It seems like @davidhewitt is interested by a fix for this C API issue for PyO3. |
gvanrossum commented Aug 14, 2023
It might also make sense to just wait until the core dev sprint in Brno, where the future of the C API will be discussed. |
nascheme commented Aug 14, 2023
FYI, I don't think the problem is just with |
gvanrossum commented Aug 15, 2023
But for tuple objects,
This I'm not so sure about -- IIUC the bug reported in gh-59313 shows how to get the GC to run (note it was closed because it was deemed an acceptable risk, not because it was fixed). FWIW This PR should reference that issue in its title, not gh-107137 (which is about adding a different tuple building API). |
davidhewitt commented Aug 15, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
While I'm not aware of crashes in the wild caused by this in PyO3, the issue in #59313 probably does affect Rust code using PyO3. We allow tuples to be built from Rust iterators. Users have the flexibility to run arbitrary Python code while the uninitialized tuple is under construction. We could work around this by first collecting the iterator into a Rust vector of If the fix to avoid the crash and keep current performance "just" exposing a new API which is
I think that's not feasible because stable API extensions built before this change will start producing tuples not tracked by GC, which sounds bad to me. |
markshannon commented Aug 15, 2023
As @vstinner said in capi-workgroup/problems#56 creating incomplete tuples is unsafe. Rather than an add another unsafe API, would it not be better to promote the safe ones and deprecate the unsafe ones? From smaller tuples, For larger tuples, the safe thing to do is build a list, then call |
erlend-aasland commented Aug 15, 2023
I'd be in favour of making |
corona10 commented Aug 15, 2023
Looks good to me also, because I am not sure how people want to use APIs based on elements of size. |
vstinner commented Aug 16, 2023
This change is related to issue #107137 which proposes adding a new safe API. But for the implementation, I need to add internal unsafe functions, right. |
vstinner commented Aug 16, 2023
Right, initializing tuple times to None would prevent crashes on repr(tuple) when devs play with GC introspection functions. But my concern is more general than that. For me, PyTuple_SET_ITEM() is too different from all other C API functions: it doesn't clear the reference to the old item, since it makes the assumption that the function must only be called on newly created tuple with uninitialized items.
I would prefer to move towards an API where it's impossible to create unitialized/incomplete objects and so a function like PyTuple_SET_ITEM() would make no sense. Reference counting would be more regular and the API would be less error-prone.
On the implementation side, because PyTuple_New() API exists, most Other APIs like PyTuple_Pack() or proposed "tuple builder" API doesn't have such problem: if we manage to remove |
vstinner commented Aug 22, 2023
By the way, this change also fix |
vstinner commented Aug 26, 2023
I close my issue, see: #107139 (comment) |
vstinner commented Aug 26, 2023
Ooops, wrong window, I wanted to close the issue #107137 |
vstinner commented Sep 13, 2023
Apparently, I failed to convinced that _PyTuple_NewNoTrack() is the good fix for the issue, and some people consider that this crash is not important enough to be fixed, so I close just my PR. |
markshannon commented Sep 13, 2023
Could you open an issue for the crash? |
vstinner commented Sep 13, 2023
There was an issue closed by@pablogsal and @rhettingerwas also against it. I asked @pablogsal if he can reconsider his decision but he didn't replied. So I give up. If you feel ready to reconsider this design choice, please go ahead and open a new issue :-) I'm done on this topic for now ;-) |
markshannon commented Sep 13, 2023
I think we should fix this. It is a crash in legal Python code. I don't think we need a new API, we need to fix |
vstinner commented Sep 13, 2023
My PR only added an internal C API. Is it now wrong to add internal C API functions? |
markshannon commented Sep 14, 2023
No, but I don't think this is the way to go. |
markshannon commented Sep 14, 2023
And If the loss of efficiency in refcount handling is an obstacle to adoption, we could add variants that consume the references. |
markshannon commented Apr 11, 2024
See #117747 |
Fix a crash in PySequence_Tuple() when the tuple which is being initialized was accessed via GC introspection, like the gc.get_referrers() function. Now the function only tracks the tuple by the GC once it's fully initialized.
Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to the internal C API: similar to PyTuple_New() and _PyTuple_Resize(), but don't track the tuple by the GC.
Modify PySequence_Tuple() to use these functions.