Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34k
gh-123378: fix some corner cases of start and end values in PyUnicodeErrorObject#123380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-123378: fix some corner cases of start and end values in PyUnicodeErrorObject#123380
Uh oh!
There was an error while loading. Please reload this page.
Conversation
picnixz commented Aug 27, 2024 • edited by bedevere-app bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-app bot
Uh oh!
There was an error while loading. Please reload this page.
PyUnicode{Encode,Decode}Error_GetStartstart values on PyUnicodeError objectsstart values on PyUnicodeError objectsstart values on PyUnicodeErrorObjectUh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
picnixz commented Aug 29, 2024
Thank you Victor for your review! |
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
vstinner left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@serhiy-storchaka: Do you want to double check the fix?
encukou left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit sad that we can't agree to fix the setters, but, if we're fixing the C getters, this is the way to go.
A few nitpicks:
Uh oh!
There was an error while loading. Please reload this page.
| static inline int | ||
| unicode_error_set_end_impl(PyObject *exc, Py_ssize_t end) | ||
| { | ||
| ((PyUnicodeErrorObject *)exc)->end = end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're here, could you add a type check before each cast?
assert(PyObject_TypeCheck(exc, (PyTypeObject*)&PyExc_UnicodeError)); Ideally, public API would raise TypeError on wrong types, but that change might be too big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make a follow-up PR for adding assertions on the getters as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be great! Thank you for volunteering :)
Misc/NEWS.d/next/C_API/2024-08-27-09-07-56.gh-issue-123378.JJ6n_u.rst Outdated Show resolvedHide resolved
Uh oh!
There was an error while loading. Please reload this page.
| static inline int | ||
| unicode_error_set_end_impl(PyObject *exc, Py_ssize_t end) | ||
| { | ||
| ((PyUnicodeErrorObject *)exc)->end = end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be great! Thank you for volunteering :)
encukou commented Dec 3, 2024
Feel free to leave all the asserts to a follow-up PR. |
bc0f2e9 into python:mainUh oh!
There was an error while loading. Please reload this page.
encukou commented Dec 4, 2024
Thank you! |
picnixz commented Dec 5, 2024
Question: should we backport this to 3.12 and 3.13? it's both a feature and a bugfix in some sense but I feel it's more a bug fix even though it could change the behaviour of some code (hopefully, not too much...). The WDYT? By the way, I spotted some typos here and there. I'll correct them as part of a follow-up PR. |
encukou commented Dec 6, 2024
If I understand correctly, it could only return -1 with an empty string, which isn't something one would typically get encoding errors for. Also, at this level you get a wrong value rather than a crash. |
picnixz commented Dec 6, 2024
Yup. (to be clear, it's the size that is being stored in In this case, subsequent calls usually lead to crashes because the C API assumes non-relative offsets in general.
Thanks, it's essentially to know whether the patches for codecs handlers will be backported or not (without the backport of this PR, the handlers would still be broken I think). |
…re clamped (pythonGH-123380) Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
startvalues onPyUnicodeErrorObject#123378