- Notifications
You must be signed in to change notification settings - Fork 78
use getptr a lot less#618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
MilesCranmer commented Jun 4, 2025
Fixes some local segfaults I was getting. Nice! What is left before merging? Let me know if I can help. Eager to see if this fixes some PySR segfaults people have been reporting. |
cjdoris commented Jun 4, 2025
Fix the segfault on Mac 😞 - I've not got a Mac to hand right now so debugging will be painful... |
MilesCranmer commented Jun 4, 2025
@cjdoris you should try https://github.com/mxschmitt/action-tmate. It's pretty useful for when you don't have the OS that is failing |
MilesCranmer commented Jun 5, 2025
I don't see the segfault btw. It just looks like a notebook error? _________________________pytest/test_nb.ipynb::Cell0_________________________NotebookcellexecutionfailedCell0: Timeoutof2000secondsexceededwhileexecutingcell. Failedtointerruptkernelin5seconds, sofailingwithouttraceback. Input: # NBVAL_IGNORE_OUTPUTimportnumpyasnpfromjuliacallimportMainasjl |
cjdoris commented Jun 5, 2025
Yeah a more recent CI had the same issue. I suspect some bug is causing undefined behaviour which might cause segfaults or silently kill the kernel or ... |
cjdoris commented Jun 5, 2025
Oh goodie it's non-deterministic. The Mac test just passed but Linux failed! But Mac did fail with Python 3.13.2 (was previously testing on latest Python 3.13.3). At least I might be able to reproduce this locally on Linux... |
cjdoris commented Jun 5, 2025
I have completely reverted this branch (except for some trivial details in the CI workflow definition) and still get the errors, so these should be fixed first. The error seems to only occur on Mac on Python 3.13. |
cjdoris commented Jun 5, 2025
Argh Heisenbug! I add tmate to be able to observe it and it goes away. Does anyone have a Mac they can debug this locally with? I just want to run the tests under gdb or something. |
MilesCranmer commented Jun 5, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
I have a mac but don't see the issue myself - just tested. Perhaps it was an upstream dependency which is now fixed? |
MilesCranmer commented Jun 5, 2025
Ah sorry let me try Python 3.13, i was using 3.12. |
MilesCranmer commented Jun 5, 2025
Actually it does look like the macos test is failing? So you could ssh in now: https://github.com/JuliaPy/PythonCall.jl/actions/runs/15477354381/job/43576415671?pr=618 |
MilesCranmer commented Jun 5, 2025
I can reproduce the segfault: I see it on 3.13.0-3.13.2, but not on 3.13.3. So perhaps it was a Python bug that is now fixed? |
cjdoris commented Jun 6, 2025
OK the current failures are due to JuliaLang/julia#58171 so I guess we have another reproducer for that. |
dpinol commented Jun 6, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
I've run several long stress tests on linux with julia 1.11.5 & python 3.13.3/3.13.4, which used to consistently crash before this PR. |
MilesCranmer commented Jun 14, 2025
@vchuravy do you think the |
MilesCranmer commented Jun 30, 2025
@cjdoris do you think we could merge this? Even without fixing JuliaLang/julia#58171 I think it's still useful to have these fixes. In fact I do wonder if that issue isn't a Julia bug (since we've never seen it anywhere besides the PythonCall.jl instance). Maybe it's due to how Python multithreading changed in 1.13 or something... |
vchuravy commented Jun 30, 2025
I would say it's a high likelyhood that it is a missunderstanding between Python, PythonCall.jl and Julia's memory semantics. So not at all clear that it is in the Julia source itself. |
752534b into mainUh oh!
There was an error while loading. Please reload this page.
cjdoris commented Jul 1, 2025
OK I undid the debugging changes and committed the original PR with a few extra |
cjdoris commented Jul 1, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
What changed? |
MilesCranmer commented Jul 1, 2025 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
I assumed there were some internal changes to enable the GIL free version: https://peps.python.org/pep-0703/. But actually I don't actually know if the regular GIL version had changes to enable this. |
Defines an
unsafe_convertrule forPytoC.PyPtrso we can passPydirectly to C functions.Also defines
incref(::Py)so we don't have to operate on pointers.Generalises our faked C functions in
extras.jlto also allow::Pyarguments in place of::PyPtr.Fixes#617 - and more generally removes a lot of places where the Julia GC could have freed a
Pywhile we still have a pointer to the Python object, causing a read-after-free error.