gh-140328: Use interned versions of string constants if they're already present #140688
albertedwardson commented Oct 27, 2025 • edited
5e0b073 to ba291d3
albertedwardson commented Oct 29, 2025
Will investigate why the free-threaded (ft) build fails.
ba291d3 to 88fe228
88fe228 to d9eaaf6
        PyTuple_SET_ITEM(tuple, i, interned);
        Py_DECREF(v);
        _constants_tuple_modified(modified);
    } else
Same here.
It adds extra indentation (that I'd like to avoid) to the next code block, like so:

        } else {
    #endif
            if (should_intern_string(v)) {
                PyObject *w = v;
                _PyUnicode_InternMortal(interp, &v);
                if (w != v) {
                    PyTuple_SET_ITEM(tuple, i, v);
                    set_modified(modified);
                }
            }
        }
    }
OK, let's wait and see what others say about this case.
colesbury commented Jan 12, 2026
I'm not sure this PR makes sense. This is a bunch of extra complexity and I don't see the point of it. I think there are two options that make sense to me:
albertedwardson commented Jan 12, 2026
IMO this is mainly about correctness and consistency of interning, not expanding its scope. Once a string is interned, keeping duplicate equal constants around feels unnecessary.
Could you please clarify the original reason for limiting interning to “identifier-like” strings? I believe there were some practical concerns.
colesbury commented Jan 12, 2026
It's not stated explicitly anywhere and was introduced almost 30 years ago. I suspect it's because:
In the GIL-enabled build, interned strings are freed when they are no longer referenced, so (2) is no longer true. The free-threaded build doesn't use this logic, so it's not really relevant here either.
Closes #140328
`intern_constants()` previously only interned string constants when `should_intern_string()` returned true, leaving equal strings duplicated even if an interned instance already existed. This change first checks the global and per-interpreter interned string tables and reuses an existing interned string when available. This applies only to string constants that come from code objects; dynamically created strings are unaffected and are still created as new objects.

This makes string interning behavior more consistent: once a string is interned, subsequent equal constants can reuse the canonical object. It may slightly reduce memory usage and improves interaction with `sys.intern()`.

The change adds a small lookup cost when loading code objects, which may affect `eval()`, `exec()`, module loading and similar operations.

I haven't run `pyperformance` yet, but I plan to provide a benchmark run once I have access to an appropriate runner. Given the limited scope of the change, I expect both the performance impact and memory savings to be relatively small but measurable.
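
For reviewers who want to observe the behavior from Python, here is a minimal sketch. The helper names (`compiled_constant_is_interned`, `dynamic_copy_is_distinct`) are made up for illustration and are not part of this PR. The first helper reports whether a freshly compiled constant is the same object as an already interned equal string; its result depends on the CPython version and build, so no expected value is given. The second checks the part that should hold everywhere: a string built at run time is a new object until it is passed through `sys.intern()`.

```python
import sys


def compiled_constant_is_interned(text: str) -> bool:
    """Return True if a constant compiled from a literal equal to `text`
    is the very object that sys.intern() returns for `text`.

    The result is implementation- and version-dependent; it is the
    behavior this PR aims to make consistent.
    """
    interned = sys.intern(text)
    # repr(text) yields source for a string literal equal to `text`.
    code = compile(repr(text), "<sketch>", "eval")
    return any(const is interned for const in code.co_consts)


def dynamic_copy_is_distinct(text: str) -> bool:
    """Return True if an equal string built at run time is a separate
    object that sys.intern() maps back to the canonical one.

    Assumes `text` has more than one character; single-character
    strings are cached by CPython and would defeat the check.
    """
    interned = sys.intern(text)
    copy = "".join(list(text))  # built at run time, not a code constant
    return (
        copy == interned
        and copy is not interned
        and sys.intern(copy) is interned
    )


if __name__ == "__main__":
    sample = "hello, world!"  # not identifier-like
    print("compiled constant reuses interned object:",
          compiled_constant_is_interned(sample))
    print("dynamic copy is distinct until interned:",
          dynamic_copy_is_distinct(sample))
```

Running the script on builds with and without the change gives a quick way to compare the first line of output for strings that are not identifier-like.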