[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) #123065
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
Allow interned strings to be mortal, and fix related issues (#120520)
- Add an InternalDocs file describing how interning should work and how to use it.
- Add internal functions to explicitly request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`
- Switch uses of `PyUnicode_InternInPlace` to those.
- Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization; otherwise you may end up immortalizing a copy instead of the interned string.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI.
- Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
- Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.
- Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
- Intern `_Py_STR` singletons at startup.
- Beef up the tests. Cover internal details (marked with `@cpython_only`).
- Add lots of assertions.
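
For readers less familiar with interning, the following is a minimal Python-level sketch of what the interned table does. It uses only `sys.intern`; the helper `make_name` and the string values are made up for illustration. It assumes `sys.intern` takes the mortal path described above, so the interned entry can go away once the last reference is dropped. The sketch does not try to observe that release.

```python
import sys


def make_name(parts):
    # Build the string at runtime so the compiler cannot pre-intern it.
    return "_".join(parts)


a = make_name(["gh113993", "demo", "name"])
b = make_name(["gh113993", "demo", "name"])

# Equal strings built separately are distinct objects in CPython.
distinct_before = a is not b

# sys.intern() returns the canonical object for a given value, so both
# calls hand back the same object.
ia = sys.intern(a)
ib = sys.intern(b)
same_after = ia is ib

print(distinct_before, same_after)
```
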
Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (#121364)
- Switch `PyUnicode_InternInPlace` to `_PyUnicode_InternMortal`, clarify docs.
- Document immortality in some functions that take `const char *`. These are `PyUnicode_InternFromString`, `PyDict_SetItemString`, `PyObject_SetAttrString`, `PyObject_DelAttrString`, and the `PyModule_Add` convenience functions. Always point out a non-immortalizing alternative.
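
The difference between the two routes can be poked at from Python with `ctypes`. This is a rough sketch, not part of the PR: the binding name `intern_from_string` and the string values are illustrative, and it assumes a CPython build where `ctypes.pythonapi` is available and where `sys.intern` follows the mortal path. Refcounts are printed without expected values because how immortality shows up in `sys.getrefcount()` is version-specific.

```python
import ctypes
import sys

# Bind the C API function through ctypes. It takes a const char * and
# returns a new reference to the interned str object.
intern_from_string = ctypes.pythonapi.PyUnicode_InternFromString
intern_from_string.argtypes = [ctypes.c_char_p]
intern_from_string.restype = ctypes.py_object

# Route 1: const char * API, described above as immortalizing.
via_c_api = intern_from_string(b"gh113993_via_c_api")

# Route 2: create the str object first, then intern it (assumed mortal).
via_sys_intern = sys.intern("_".join(["gh113993", "via", "sys", "intern"]))

# Both routes share one interned table: interning an equal string
# returns the object that route 1 created.
lookup = sys.intern("_".join(["gh113993", "via", "c", "api"]))
shares_table = lookup is via_c_api

print(sys.getrefcount(via_c_api), sys.getrefcount(via_sys_intern))
print(shares_table)
```
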
Immortalize names in code objects to avoid crash (gh-121863, #121903)
Intern latin-1 one-byte strings at startup (gh-122291, #122303)
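
A small sketch of what the latin-1 change means at the Python level, assuming CPython (where one-character latin-1 strings are shared singletons). The character chosen is arbitrary. With those singletons interned, interning such a string is expected to return the singleton itself rather than some other equal object.

```python
import sys

# Build a one-character latin-1 string at runtime.
ch = chr(0xE9)

# CPython hands out the same object for every one-character latin-1 string.
same_singleton = ch is chr(0xE9)

# Interning returns that exact singleton.
interned_is_singleton = sys.intern(ch) is ch

print(same_singleton, interned_is_singleton)
```
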
There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
📚 Documentation preview 📚: https://cpython-previews--123065.org.readthedocs.build/
Issue: #113993