Ensure pool connection is released in acquire when cancelled #548
Conversation
aaliddell commented Mar 28, 2020 • edited
Closes MagicStack#547

When `wait_for` is cancelled, there is a chance that the waited task has already completed, leaving the connection looking like it is in use. This fix ensures that the connection is returned to the pool in this situation.

For context, see: https://bugs.python.org/issue37658 and MagicStack#467
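To make the race concrete, here is a minimal sketch of acquiring under a timeout without leaking a connection. This is not the PR's actual code: `safe_acquire` is a hypothetical helper, and `pool.acquire()` / `pool.release(conn)` are assumed generic coroutine methods rather than asyncpg internals.

```python
import asyncio

async def safe_acquire(pool, timeout):
    # Hypothetical helper; pool.acquire()/pool.release(conn) are assumed
    # generic coroutine methods, not asyncpg internals.
    acquire_task = asyncio.ensure_future(pool.acquire())
    try:
        return await asyncio.wait_for(acquire_task, timeout)
    except asyncio.CancelledError:
        # The cancellation may have landed after acquire() already
        # finished; hand the connection back so it is not left looking
        # permanently "in use".
        if (acquire_task.done() and not acquire_task.cancelled()
                and acquire_task.exception() is None):
            await pool.release(acquire_task.result())
        raise
```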
Added issue link and comment describing why this workaround is necessary. Also moved the function to be local to the call site, as it is not used elsewhere.
When `wait_for` is called with `timeout=None`, it runs without a timeout, as desired.
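As a quick illustration of that behaviour (the coroutine name `fetch_conn` is made up for the example), `timeout=None` simply awaits the wrapped coroutine with no timeout applied:

```python
import asyncio

async def fetch_conn():
    # Hypothetical stand-in for an acquire operation.
    await asyncio.sleep(0.1)
    return "conn"

async def main():
    # With timeout=None, wait_for applies no timeout at all.
    print(await asyncio.wait_for(fetch_conn(), timeout=None))

asyncio.run(main())
```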
aaliddell commented May 5, 2020
There is something odd going on with the Windows 64 Python 3.8 tests, with a different single test failing each time. As far as I can see, these should not be affected by any of the changes in this PR, but they do not show up on master. Is this bad luck with a flaky test, or something you've seen before?
elprans commented May 5, 2020
Looks like a Python bug to me. The proactor event loop tries to …
aaliddell commented Jun 18, 2020
May I prod on getting this looked at again? This bug crops up often during CI/testing where pools are created and destroyed frequently, leading to the tests hanging whilst waiting for the pool to close.
elprans commented Jun 19, 2020
I'll take a look at this soon. Thanks.
simzen85 commented Aug 11, 2020
@elprans may I know the status of this PR? Our team seems to have the same issue and wants to try the fix. Thanks
elprans commented Aug 11, 2020
I'm not a fan of the piecemeal approach here. This seems like a bug in asyncio's `wait_for`.
aaliddell commented Aug 11, 2020
The issue lies with `wait_for`. The gist of it is that the cancellation can race the pool acquire: the acquire task has already completed and the connection has already been marked as in use, but the caller of `wait_for` gets a `CancelledError` instead of the connection, so it is never released. This may be solvable by fixing `wait_for` itself; an untested solution might look like the sketch below. But this solution may have unintended consequences, since there are now situations where `wait_for` hands back a result even though it was cancelled.
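The author's untested snippet did not survive extraction. As a rough stand-in, here is a wrapper-style sketch of the same idea (the name `wait_for_keeping_result` and the wrapper approach are hypothetical; the original was presumably a patch to `wait_for` itself): if the cancellation raced the inner task's completion, surface the task's result rather than dropping it.

```python
import asyncio

async def wait_for_keeping_result(fut, timeout):
    # Hypothetical wrapper. If the cancellation raced the inner task's
    # completion, return the completed result instead of dropping it.
    fut = asyncio.ensure_future(fut)
    try:
        return await asyncio.wait_for(fut, timeout)
    except asyncio.CancelledError:
        if fut.done() and not fut.cancelled() and fut.exception() is None:
            return fut.result()
        raise
```

The caveat above applies: a caller that cancels this wrapper must now be prepared to receive a value back instead of a `CancelledError`.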
aaliddell commented Aug 11, 2020
Actually, on second thought: the stem of this issue with `wait_for` is … Instead, the tasks in …
elprans commented Aug 15, 2020
Not quite. …
elprans commented Aug 15, 2020
Upon closer examination, it actually does seem like a bug in `wait_for`:

```python
import asyncio
import asyncpg

async def main():
    fut = asyncio.get_event_loop().create_future()
    fut.set_result(True)
    waiter = asyncio.ensure_future(asyncio.wait_for(fut, timeout=1))
    waiter.cancel()
    result = await waiter
    print(result)

asyncio.run(main())
```

The above should print `True`, but raises `CancelledError` instead.
aaliddell commented Aug 15, 2020 • edited
Yep, I've got a fixed `wait_for` implementation that does what I proposed, and in doing so it turns out it's broken in another way too: if the timeout fires and attempts to cancel the task, and the task absorbs the cancellation, the task's return value is silently dropped, which will also give this same resource-leak issue. I'll push it to my cpython repo fork shortly and let you know, if you want to take it? If you've done an upstream cpython change before, it may be easier for you to submit, rather than me figuring out how the Python change process works. Otherwise I'm happy enough to try myself.
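A sketch of that second failure mode as it behaves on the Python versions under discussion here (circa 3.8); the coroutine names are made up for the example:

```python
import asyncio

async def stubborn_task():
    # A badly behaved task that absorbs cancellation and still returns.
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        pass
    return "valuable result"

async def main():
    try:
        print(await asyncio.wait_for(stubborn_task(), timeout=0.1))
    except asyncio.TimeoutError:
        # On affected versions this branch runs: the task completed with
        # a result, but wait_for raises TimeoutError and the result is
        # silently lost.
        print("TimeoutError: return value silently dropped")

asyncio.run(main())
```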
elprans commented Aug 15, 2020
This isn't a bug: a task that absorbs a cancellation request is arguably misbehaving in the first place.
aaliddell commented Aug 15, 2020
So there are two sides to that bit: either `wait_for` returns the task's result when the task absorbs the cancellation and completes anyway, or it raises `TimeoutError` and silently drops that result.

To me, the former behaviour seems clearer to a user, even if badly behaved tasks are an edge case. Finally, the current behaviour of `wait_for` is actually already inconsistent (one such asymmetry is sketched below): …
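The list of inconsistencies did not survive extraction, but one asymmetry present in 3.8-era `wait_for` fits the point being made: with `timeout=0`, an already-completed future's result is returned directly, whereas the cancellation path demonstrated earlier drops it.

```python
import asyncio

async def main():
    fut = asyncio.get_running_loop().create_future()
    fut.set_result(True)
    # The timeout<=0 fast path returns a completed future's result
    # immediately, unlike the cancellation path, which loses it.
    print(await asyncio.wait_for(fut, timeout=0))  # prints: True

asyncio.run(main())
```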
elprans commented Aug 15, 2020
Yeah, that's my fault, actually. I missed that case when I was fixing …
`asyncio.wait_for()` currently has a bug where it raises a `CancelledError` even when the wrapped awaitable has completed. The upstream fix is in python/cpython#21894. This adds a workaround until the aforementioned PR is merged, backported, and released.

Co-authored-by: Adam Liddell <[email protected]>

Fixes: #467
Fixes: #547
Related: #468
Supersedes: #548
elprans commented Aug 16, 2020
@aaliddell I opened python/cpython#21894 for the upstream fix, and #608 here adds a clean workaround. Thank you for helping to diagnose this!
aaliddell commented Aug 16, 2020
Nice! 👍
elprans commented Aug 27, 2020
Fixed in #608.
Closes #547 and possibly #467

When `wait_for` is cancelled, there is a chance that the waited task has already completed, leaving the connection looking like it is in use. This fix ensures that the connection is returned to the pool in this situation.

A similar bug was fixed in #468, also with `wait_for`. The Python issue was https://bugs.python.org/issue37658, although this is arguably not a solvable bug in Python itself, as it can't know how to undo a completed task.