Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-111089: Add PyUnicode_AsUTF8NoNUL() function#111688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
vstinner commented Nov 3, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
vstinner commented Nov 3, 2023
I'm not sure about For example, if the string is used to open a file, the function doesn't reject Maybe the name should be more explicitly about its purpose, such as: |
AlexWaygood commented Nov 3, 2023
the clinic.py changes look fine to me; I have no opinion on the C changes :) |
erlend-aasland commented Nov 3, 2023
Clinic changes are ok with me. I'll leave the C API discussion to the C API WG :) |
vstinner commented Nov 3, 2023
If this change is merged, it would be interesting to go through PyUnicode_AsUTF8() usage in the Python code base and check if switching PyUnicode_AsUTF8Safe() would be worth it. See #111656 (comment) list for example. |
vstinner commented Nov 3, 2023
The working group doesn't exist yet, it's still a draft PEP: https://discuss.python.org/t/pep-731-c-api-working-group-charter/36117 |
vstinner commented Nov 3, 2023
@encukou@gpshead@Yhg1s@serhiy-storchaka: Would you mind to review this change? |
serhiy-storchaka commented Nov 3, 2023
Are there strong objections against simply making |
Tools/clinic/clinic.py Outdated
| goto exit; | ||
| }}}} | ||
| {paramname} = PyUnicode_AsUTF8({argname}); | ||
| {paramname} = PyUnicode_AsUTF8Safe({argname}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be different code for limited_capi is true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. I wanted to add PyUnicode_AsUTF8Safe() to the limited C API in a separated PR. Maybe it's more convenient to do it in the PR, since there is a lot of code generated by Argument Clinic impacted by these changes.
vstinner commented Nov 3, 2023
I suppose that it's too late to change it, same rationale than changing PyUnicode_AsUTF8(). Moreover, |
613243e to 96b2b8bComparevstinner commented Nov 4, 2023
I rebased the PR on the main branch to get the test_asyncio fix. I also squashed commits. |
encukou commented Nov 6, 2023
Wait until the WG can discuss this. I agree that I wouldn't mind naming it |
vstinner commented Nov 6, 2023
If "Safe" is too generic, what about "AsUTF8NoNul"? I propose "Nul" instead of "Null" which looks like "NULL pointer". Or maybe "AsUTF8NoNullChars"? "Z" looks too short to me, it's not easy to guess its intent. |
vstinner commented Nov 7, 2023
Alternative short name: |
Revert PyUnicode_AsUTF8() change: it no longer rejects embedded null characters: the PyUnicode_AsUTF8Safe() function should be used instead.
96b2b8b to 193c18bComparevstinner commented Nov 7, 2023
@serhiy-storchaka@erlend-aasland: I renamed the function to PyUnicode_AsUTF8NoNUL(). Are you fine with this name? |
erlend-aasland commented Nov 7, 2023
I didn't follow the discussion. My first impression is that I don't understand what that API is supposed to do, based on the name only. |
serhiy-storchaka commented Nov 7, 2023
I am fine with added the embedded null character check in If people not fine with this, ask them with what are they fine. |
vstinner commented Nov 7, 2023
Yhg1s commented Nov 7, 2023
I don't know why we need a new public function. For a new function, if you feel like it needs a strlen check, that's fine. I don't think it's a sensible way to deal with C strings (everyone who works with C strings should know about the importance of NUL), but for new APIs I don't really care either way. However:
|
vstinner commented Nov 7, 2023
I close my issue: #111089 (comment) |
Revert PyUnicode_AsUTF8() change: it no longer rejects embedded null characters: the PyUnicode_AsUTF8NoNUL() function should be used instead.
📚 Documentation preview 📚: https://cpython-previews--111688.org.readthedocs.build/