Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 33.9k
GH-101291: Low level opt-in API for pylong#101685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
markshannon commented Feb 8, 2023 • edited by bedevere-bot
Loading Uh oh!
There was an error while loading. Please reload this page.
edited by bedevere-bot
Uh oh!
There was an error while loading. Please reload this page.
arhadthedev left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A minor typo.
Include/cpython/longintrepr.h Outdated
| /* Return 1 if the argument fits into a (N-1) bit integer, | ||
| * where N is the number of bits in a intptr_t, and the value | ||
| * can be extracted efficiently by _PyLong_GetSingleDigitValue | ||
| * NOTE: Some values that could into (N-1) bit integer will return 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| *NOTE: Somevaluesthatcouldinto (N-1) bitintegerwillreturn0. | |
| *NOTE: Somevaluesthatcouldfitinto (N-1) bitintegerwillreturn0. |
encukou commented Feb 8, 2023
The What does |
mdboom commented Feb 8, 2023
I worked on some Python/C calling benchmarks last year. We could test those against a Cython hacked to use this API. |
scoder commented Feb 14, 2023
So, what we currently do in Cython is that we generate a switch statement for The Note that we generate independent unpacking functions for different C integer/float sizes, so we'd need an inline function for int, long, ssize_t and the respective unsigned types as well, in order to cut down the switch statement at compile time. I certainly agree that the single digit case is an important special case. However, we currently do a lot more in the cases where the target C integer (or float) size is substantially larger than the PyLong digit size. Switching to the proposed API would get us half the way, and we'd end up with using the official public API, this new inofficial API, and the bare struct access, depending on the compile time environment/config and runtime value. The advantage then seems to be that users could still benefit from fast single-digit access if they chose to opt out of the direct struct access. If CPython doesn't want to go all the way of providing different inline conversion functions for different C target type sizes (as Cython does), then separating out the single-digit case still sounds like a reasonable option to provide to users, I think. |
markshannon commented Feb 14, 2023
The concept of a "digit" is something we want to get away from in the API. What matters is whether the Python int small enough to fit into a machine int ( Given the highly skewed distribution of ints, false negatives to should be OK for larger numbers. |
scoder commented Feb 15, 2023
In the API or in the implementation? I'm asking because the digit's were never really part of the API but an implementation detail that CPython and Cython were both making good use of. As long as they remain a part of the implementation, I don't see why Cython shouldn't make use of them. |
markshannon commented Feb 17, 2023
Both. But it needs to be in the API first, or we won't be able to do so in the implementation.
Because there is no advantage in doing so. It doesn't improve performance, it just makes maintenance harder. |
markshannon commented Feb 17, 2023
I'm now inclined to make this part of the full API. With some way to opt in to the faster version. Something like this: /* Probably need a better name for this macro :) */#ifdefPy_GIVE_ME_THE_FAST_VERSION_I_DONT_MIND_RECOMPILING_EVERY_RELEASEstaticinlineintPyLong_IsSingleDigit(PyLongObject*op){assert(PyLong_Check(op)); Py_ssize_tsigned_size=op->long_value.ob_size; return-1 <= signed_size&&signed_size <= 1} #elsePyAPI_FUNC(int) PyLong_IsSingleDigit(PyLongObject*op); #endif@encukou does this sound like a reasonable approach? |
encukou commented Feb 20, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
What do you mean by "every release"?
|
scoder commented Feb 20, 2023
Exactly. And that's the level of cross-version compatibility that CPython currently provides to its users and what Cython currently provides to its users (with the default C configuration). In theory, you could deprecate any non-stable API function tomorrow and remove it in CPython 3.13 or 3.14. Depending on which function you chose, that could end up being a blow to the existing ecosystem, but it wouldn't really be different from removing any other publicly exposed C function that is used outside of CPython itself. Personally, I don't see much of an advantage of an additional private-but-exposed API. If it's exposed, and fills a purpose (which it probably does, because it exists), why can't others use it? As long as CPython sticks to adding and removing functions only in minor (3.x.0) releases, that's just like any other kind of functionality that a specific CPython 3.x release series supports. And keeping the C-API stable for the lifetime of a minor release series seems to be widely agreed on. For a broader compatibility range, the current CPython offer is the limited API and the stable ABI, but that's a different beast and we're not discussing that here. |
markshannon commented Feb 20, 2023
Every minor release. |
markshannon commented Feb 20, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
It does need to be guarded by a macro, though. Otherwise you can't use it in a stable ABI build. |
encukou commented Feb 20, 2023
You'll want something like: #if !defined(Py_LIMITED_API) ||Py_LIMITED_API+0 >= 0x030C0000PyAPI_FUNC(int) PyLong_IsSingleDigit(PyLongObject*op); #endif#ifndefPy_LIMITED_API// (Instead of the #ifndef guard, you can put this in a Include/cpython/*.h file that's guarded as a whole)staticinlineint_PyLong_IsSingleDigit(PyLongObject*op){assert(PyLong_Check(op)); Py_ssize_tsigned_size=op->long_value.ob_size; return-1 <= signed_size&&signed_size <= 1} #definePyLong_IsSingleDigit _PyLong_IsSingleDigit #endifand then in the #undef PyLong_IsSingleDigit ... PyLong_IsSingleDigit(PyLongObject*op){return_PyLong_IsSingleDigit(PyLongObject*op)} |
encukou commented Feb 20, 2023
Note that |
markshannon commented Feb 20, 2023
Sure. However, if we provide the function regardless of whether or not |
encukou commented Feb 20, 2023 via email
There's a single flag that both limits the API you can use, and gives you a stable ABI build. (you = an extension author) You're right that it's not the clearest naming. But for users, one without the other doesn't really make much sense. On February 20, 2023 6:18:50 PM GMT+01:00, Mark Shannon ***@***.***> wrote: Sure. However, if we provide the function regardless of whether or not `Py_LIMITED_API` is defined, then `Py_LIMITED_API` doesn't limit the API. Rather, it seems that defining `Py_LIMITED_API` changes the ABI not the API. This seems confusing. -- Reply to this email directly or view it on GitHub: #101685 (comment) You are receiving this because you were mentioned. Message ID: ***@***.***> -- Sent from my Android device with K-9 Mail. Please excuse my brevity. |
markshannon commented Feb 20, 2023
An unstable ABI with a stable API makes a lot of sense, to me at least. The The reason the API counts of stable, even though the ABI might not be, is that the unstable form can be chosen at build time without any change to the code, much as we do for |
encukou commented Feb 20, 2023 via email
Stable API with unstable ABI is the default. The C API is covered by Python's general backwards compatibility policy (min. 2 years of deprecation warnings for breaking changes). Py_LIMITED_API will expose only the API that compiles to stable ABI. Nowadays it does more than simply provide a subset of the API -- it selects slow/stable implementations, as in my suggestion above -- but that's not really reason enough to go renaming the flag. On February 20, 2023 6:38:03 PM GMT+01:00, Mark Shannon ***@***.***> wrote: > But for users, one without the other doesn't really make much sense. An unstable ABI with a stable API makes a lot of sense, to me at least. The `inline` form of `PyLong_IsSingleDigit` depends on the layout of the `int` object, which will change in 3.12, so its ABI is unstable. The non-inline form is a published symbol and is stable. The reason the API counts of stable, even though the ABI might not be, is that the unstable form can be chosen at build time without any change to the code, much as we do for `Py_DEBUG`. -- Reply to this email directly or view it on GitHub: #101685 (comment) You are receiving this because you were mentioned. Message ID: ***@***.***> -- Sent from my Android device with K-9 Mail. Please excuse my brevity. |
encukou commented Feb 21, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Now, I don't think the functions with undefined behavior should be in the stable ABI. These also hardcode the fact that an efficient “digit” fits in |
zooba commented Feb 21, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
I don't see what this PR offers other than confusion for someone trying to read our API. For starters, I don't think anyone who isn't used to CPython internals will assume the correct meaning for "digit" (and even most of us will need to double check whether it means a single element from our internal storage or... a digit). For small numbers, If we want to skip the type check and just crash on a non-longobject, then we could add that API. But the size check isn't optional, so it may as well be encapsulated with extracting the value, as it currently is. I don't see why it should get into the limited API. The point of that API set isn't to grow over time - it's to remain stable, so that extensions can build once and be compatible with many versions. Every time we add a function to it, we hurt that. In some cases, it's worth it, but I don't see why I'm sympathetic to extension authors who want to extract values larger than I have very little sympathy for extension authors who need to check whether the value fits into a |
markshannon commented Feb 21, 2023
Almost all the API has undefined behavior if you don't fulfill the pre-conditions.
A large integer has to be made up of digits. So the existence of "digits" isn't really an implementation detail. Feel free free to suggest better names, but remember that "small" int is already in use for a very restricted set of values around 0. If we don't provide functions like these, then Cython, NumPy, MyPyC, etc will just use the C struct.
The functions take |
encukou commented Feb 21, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Yes, but I believe that new additions to the limited API should be “nicer” than the status quo.
How would you use this API?
Those don't use the limited API. And it seems everyone who can afford to call a DLL function would be fine with |
markshannon commented Feb 21, 2023
Which is probably true, and why I proposed I thought you were saying that these functions should be part of the limited API. If we are to have CPython perform well and have Cython, etc be available during the alpha and beta phases, we need a low-level API. |
markshannon commented Feb 21, 2023
It is considerably faster because With the internals of longs hidden behind these functions, we can improve the layout and will only need a single memory read. faster-cpython/ideas#548 |
encukou commented Feb 21, 2023
At least the inlining should be possible for
Sorry, that was a misunderstanding :( |
markshannon commented Feb 22, 2023
"Perfect" is not the term I would use 🙂 What we need is an API that includes low-level API functions, but not structs and internal functions. |
markshannon commented May 11, 2023
That's comparing the inlined version to the non-inlined version. Not the same thing as comparing a single expensive call to two cheap ones. Two successive functions calls don't need any more register spilling than a single call. |
zooba commented May 11, 2023
Because that's the change they want: for the fast path of |
wjakob commented May 11, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
@zooba I would still prefer if those are two separate functions. Suppose I am trying to fetch a 64 bit unsigned integer from Python on a 64 bit machine. Then the full-blown Of course most values are likely to still be compact. So what I want to be able to do is to first try the fast path for compact values. If the number is compact and positive, the cast succeeds, and it fails if it is negative. In the rare case that the number is not compact, I can fall back to If you bake together those functions in the CPython side, it removes that flexibility. |
zooba commented May 11, 2023
It just means you're in the same place as you would be for numbers that are bigger than Besides, I gave up on my preferred API for this because nobody seemed to like it, and have simplified my discussion point down to an existing API that other people mentioned. If I get over the argument-burnout to the point where I would actually code something up, you'd see. But everyone just wants to argue past everything I say, and I can't tell if it's me (in which case I'll bow out... again) or if it's something else (in which case I change my messaging... again... and things still don't go the way I'd like to see them go). Best of luck! |
markshannon commented May 12, 2023
Why is
The point is that the ints in range Given all that, we should choose the |
wjakob commented May 12, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Put yourself into the shoes of the consumer of this API (binding project tools like pybind11, Cython, etc.). They must expose a C++ function taking a standard integer type ( So what these tools will ideally do is to first try the fast path (if My example was simply to point out why @zooba's suggestion of gluing All of this is completely unrelated to the question of small/compact integers, which are naturally much smaller than this threshold. |
zooba commented May 12, 2023
Would my proposal from #101685 (comment) be sufficiently general? (Slightly tweaked from the first time I posted it) Bearing in mind that this is all in a header file and is always inlined, when Usage looks like: It only depends on |
scoder commented May 12, 2023 via email
Slightly tweaked from the first time I posted it That's more or less what Cython does. Reads ok to me. I posted a link to the implementation a bit further up. Obviously, it would be great to have the same feature available in the stable API. It's currently very difficult and slow to extract a 128 bit C integer from a PyLong there. |
markshannon commented May 13, 2023
@zooba |
encukou commented May 15, 2023
For unstable API, let's expose what CPython uses. |
zooba commented May 15, 2023
It doesn't have to be a perfect calculation, though. We're only guaranteeing a fast path when it's fast, so if |
encukou commented May 15, 2023
Ah, I missed that you used |
encukou commented May 15, 2023
See faster-cpython#16 for what I have in mind. |
encukou commented May 18, 2023
I've added some tests to faster-cpython#16. Feel free to cherry-pick/adapt anything from there if it helps. |
markshannon commented May 19, 2023
Thanks @encukou for doing this properly |
markshannon commented May 19, 2023
@zooba Any objections to me merging this? |
scoder commented May 21, 2023
Looks good to me, FWIW. |
markshannon commented May 21, 2023
OK, it looks like we have some sort of consensus that this is worth having. |
rwgk commented May 22, 2023
This PR breaks building pybind11. See below. Is this expected? What do you recommend we do? https://github.com/pybind/pybind11/tree/d72ffb448c58b4ffb08b5ad629bc788646e2d59e |
encukou commented May 22, 2023 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
Ugh. Looks like 3.12 adds a bunch of functions that try to be const-correct, but fail to compile if you tell the compiler to actually enforce const-correctness ( |
Uh oh!
There was an error while loading. Please reload this page.
scoder commented May 22, 2023
Since it's probably difficult to change the existing cast/check macros to respect Also note that it's only asserts that fails here, so they could be removed. But it seems sensible to have them, especially since these are user facing functions. |
This is a stop-gap solution, until we decide on future directions for the C-API.
Adds some inline functions to allow fast access to the value of
PyLongObjectobjects, without explicitly probing the internals.These new functions will be guarded by
PY_UNSTABLE_PYLONG_ABI.In the future we may:
PY_UNSTABLE_PYLONG_ABIwith a more general ABI selector flag.@encukou Is this OK as a stopgap?
@scoder Is this usable by Cython?