Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-107137: Add _PyTupleBuilder API to the internal C API#107139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
vstinner commented Jul 23, 2023 • edited by bedevere-bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-bot
Uh oh!
There was an error while loading. Please reload this page.
vstinner commented Jul 23, 2023
I chose to use unsigned cc @serhiy-storchaka@methane@pablogsal@erlend-aasland@corona10 |
vstinner commented Jul 23, 2023
HPy API to build a tuples and lists: https://docs.hpyproject.org/en/latest/api-reference/builder.html HPy API seems to only support creating tuple of a size known in advance: it's not possible to extend or shrink the tuple. |
vstinner commented Jul 23, 2023
I'm considering to add a "Set" function later, and/or maybe a "GetUnsafe" function. But I prefer to start with a minimum API :-) |
gvanrossum commented Jul 23, 2023
-1. This feels unnecessarily elaborate. |
vstinner commented Jul 23, 2023
This API is a fix for this old issue: #59313 "Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash" which was closed as "not a bug" in 2021 by @pablogsal. |
vstinner commented Jul 23, 2023
Do you mean that capi-workgroup/problems#56 is not an issue, or that this API is too complicated to use? |
gvanrossum commented Jul 23, 2023
I believe we should review the problem before jumping to a solution. And if this is the solution, I'm not sure that it's worth fixing the problem. So please hold your horses here. |
vstinner commented Jul 23, 2023
The root issue is that PyTuple_New() tracks directly the tuple by the GC. Moreover, _PyTuple_Resize() tracks also the tuple by the GC. My PR adds _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() to avoid this issue. The _PyTupleBuilder is built on top of it to wrap the memory allocations. |
Add _PyTupleBuilder structure and functions: * _PyTupleBuilder_Init() * _PyTupleBuilder_Alloc() * _PyTupleBuilder_Append() * _PyTupleBuilder_AppendUnsafe() * _PyTupleBuilder_Finish() * _PyTupleBuilder_Dealloc() The builder tracks the size of the tuple and resize it in _PyTupleBuilder_Finish() if needed. Don't allocate empty tuple. Allocate an array of 16 objects on the stack to avoid allocating small tuple. _PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the number of _PyTuple_Resize() calls. Do no track the temporary internal tuple by the GC before _PyTupleBuilder_Finish() creates the final complete and consistent tuple object. Use _PyTupleBuilder API in itertools batched_traverse(), PySequence_Tuple() and initialize_structseq_dict(). Add also helper functions: * _PyTuple_ResizeNoTrack() * _PyTuple_NewNoTrack()
vstinner commented Jul 23, 2023
PR rebased to fix a merge conflict. |
gvanrossum commented Jul 24, 2023
All I am asking is that you hold off for now. |
vstinner commented Jul 24, 2023
I marked this PR as a draft. |
corona10 commented Jul 24, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Guido, even if there is a better way to solve this issue, adding the internal private API doesn't look harmful. |
gvanrossum commented Jul 24, 2023
@corona10 I worry that APIs are forever, even internal ones. So I recommend that we have a discussion somewhere so we're all on the same page about the problem we're trying to solve and then we can how to solve it, rather than jumping the gun. (I don't require unanimity, just more people having thought about it and come to roughly the same conclusion than just Victor.) |
corona10 commented Jul 24, 2023
Thank you for the answer. CPython seems to be on the brink of change in many topics. |
vstinner commented Jul 24, 2023
In the main Python branch, you can still crash PySequence_Tuple() because it creates a temporary tuple with PyTuple_New() which is immediately tracked by the GC: importgcTAG=object() defmonitor(): lst= [xforxingc.get_referrers(TAG) ifisinstance(x, tuple)] t=lst[0] # this *is* the result tupleprint(t) # full of nulls !returnt# Keep it alive for some timedefmy_iter(): yieldTAG# 'tag' gets stored in the result tuplet=monitor() forxinrange(10): yieldx# SystemError when the tuple needs to be resizedtuple(my_iter())code from: #59313 (comment) Program in gdb: The crash occurs in The problem is that there is a second strong reference to the tuple (created by |
vstinner commented Jul 24, 2023
This PR fix this bug. With this PR,
|
gvanrossum commented Jul 24, 2023
@vstinner Please stop pushing. We've lived with this forever. It doesn't have to be fixed today. |
vstinner commented Jul 24, 2023
I extracted the non-controversial part of this PR, only _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack(), in a new PR fixing the PySequence_Tuple() crash: PR #107183. |
Uh oh!
There was an error while loading. Please reload this page.
rhettinger commented Jul 28, 2023
Can tuples be made robust enough to survive being called while partially filled? The If not, tools like |
gvanrossum commented Jul 28, 2023
Raymond's idea of filling with |
vstinner commented Aug 26, 2023
It seems like issues solved by the proposed _PyTupleBuilder API have different solutions discussed at PR #107183. This API doesn't seem to be the preferred option, so I prefer to close my PR and investigate other options first. |
vstinner commented Oct 30, 2023
Follow-up: I created issue #111489 to make |
Add _PyTupleBuilder structure and functions:
The builder tracks the size of the tuple and resize it in _PyTupleBuilder_Finish() if needed. Don't allocate empty tuple. Allocate an array of 16 objects on the stack to avoid allocating small tuple. _PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the number of _PyTuple_Resize() calls.
Do no track the temporary internal tuple by the GC before _PyTupleBuilder_Finish() creates the final complete and consistent tuple object.
Use _PyTupleBuilder API in itertools batched_traverse(), PySequence_Tuple() and initialize_structseq_dict().
Add also helper functions: