So one or more objects "owned" by another interpreter (AKA `OTHER`) could still hold a reference to one of these untracked objects (AKA `UNTRACKED`). The bug we're fixing demonstrates that's a possibility.
Could it make it harder to break cycles? Could it lead to memory leaks?
How will the `UNTRACKED` object impact GC for any `OTHER` object holding a reference to it?
What happens if one of these `UNTRACKED` objects is involved in a cycle with one or more `OTHER` objects? Can we still clean them all up?
What happens if several of these `UNTRACKED` objects are involved in a cycle, but one or more `OTHER` objects holds a reference to one of them? Can we still clean them all up?
If an object is not tracked by the GC and is part of a reference cycle, the GC is unable to break that cycle, so yes, memory leaks. Previously, Python was crashing. Well, the situation is less bad :-)
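To illustrate the kind of leak meant here, a minimal pure-Python sketch follows. It is not the code from this PR: Python code cannot untrack an object, so the sketch assumes `gc.freeze()` is an acceptable stand-in, since frozen objects are ignored by collections in a roughly similar way. The `Node` class and the function name are hypothetical.

```python
import gc
import weakref


class Node:
    """Plain object used to build a two-element reference cycle."""


def cycle_survives_while_frozen():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a <-> b form a reference cycle
    probe = weakref.ref(a)       # lets us observe whether `a` was freed

    gc.freeze()                  # stand-in for "the GC no longer looks at these objects"
    try:
        del a, b                 # only the cycle keeps the objects alive now
        gc.collect()             # the collector skips frozen objects
        alive_while_ignored = probe() is not None
    finally:
        gc.unfreeze()            # hand the objects back to the collector

    gc.collect()                 # the cycle is visible again and can be broken
    alive_after_unfreeze = probe() is not None
    return alive_while_ignored, alive_after_unfreeze
```

While the collector ignores the objects, the cycle should stay alive even after a full collection; once they are visible to it again, the cycle can be broken. An object untracked at interpreter shutdown never gets that second chance.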
This change doesn't introduce the memory leak. An object cannot be tracked in two GC lists at the same time: _PyObject_GC_TRACK() has an assertion for that. The leak was already there.
If an object is created in interpreter 1, it's tracked by the GC of interpreter 1. If it's copied to interpreter 2 and interpreter 1 is destroyed, the GC of interpreter 2 is not going to automatically track the object. Moreover, interpreter 1 cannot guess whether another interpreter is using the object or not.
IMO untracking all objects is the least bad solution.
IMO the only way to ensure that no memory is leaked is to prevent sharing objects between interpreters, and rely on existing mechanisms (GC and finalizers) to release memory. So continue to convert static types to heap types, continue to update C extensions to multi-phase initialization, continue moving globals into module state and per-interpreter structures, etc.
🙂
In Python 3.8, when the GC state was shared, it seems like any interpreter could run a GC collection. Collections couldn't happen in parallel, thanks to the "collecting" flag. The GC is not re-entrant and simply does nothing (exits) if it's already collecting.
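The guard described above can be sketched as a toy model. This is not CPython's implementation: `ToyCollector`, its `runs` counter, and the `finalizers` argument are hypothetical names used only to show a "collecting" flag turning a nested collection into a no-op.

```python
class ToyCollector:
    """Toy model of a non-re-entrant collector; not CPython's actual code."""

    def __init__(self):
        self.collecting = False
        self.runs = 0

    def collect(self, finalizers=()):
        if self.collecting:
            return 0                 # already collecting: do nothing and exit
        self.collecting = True
        try:
            self.runs += 1
            for finalizer in finalizers:
                finalizer()          # a finalizer may try to call collect() again
            return len(finalizers)
        finally:
            self.collecting = False  # always clear the flag, even on error


collector = ToyCollector()
nested_results = []
# The finalizer re-enters collect() while a collection is in progress.
collector.collect([lambda: nested_results.append(collector.collect())])
```

In this model the nested call returns immediately without doing any work, so only the outer collection is counted.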
I cannot say whether Python 3.8 was able to break the reference cycles that Python 3.9 and newer can no longer break: those where an object is created in one interpreter and then "migrates" to another interpreter.