Skip to content

Conversation

@jburgess777
Copy link
Contributor

The paramiko library has a known problem where checking the exit status can cause a deadlock if the command has written a lot of output to the stdout (or stderr) channel. We hit this problem when using testinfra to return data from log files during a test.

https://docs.paramiko.org/en/stable/api/channel.html#paramiko.channel.Channel.recv_exit_status

We can work around this, for the stdout case, by reading the data before checking the exit status.

I do not attempt to fix the stderr case. Fixing this would likely require a more complex fix, which reads from both stdout and stderr until they are closed, like https://stackoverflow.com/a/78765054

Here is an example of running the test case on the original code. I enabled pytest-timeouts with a 10 second timeout so you can see it stuck in recv_exit_status

$ tox -- -v -k test_command_deadlock test ========================================================================================= test session starts =========================================================================================== platform linux -- Python 3.10.12, pytest-8.3.3, pluggy-1.5.0 -- /home/jburgess/git/pytest-testinfra/.tox/py/bin/python cachedir: .tox/py/.pytest_cache rootdir: /home/jburgess/git/pytest-testinfra configfile: setup.cfg plugins: testinfra-0.1.dev789, cov-5.0.0, xdist-3.6.1, timeouts-1.2.1 setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s collected 177 items / 171 deselected / 6 selected tes=t/test_backends.py::test_command_deadlock[ssh://debian_bookworm] PASSED [ 16%] test/test_backends.py::test_command_deadlock[safe-ssh://debian_bookworm] PASSED [ 33%] test/test_backends.py::test_command_deadlock[docker://debian_bookworm] PASSED [ 50%] test/test_backends.py::test_command_deadlock[paramiko://debian_bookworm] FAILED [ 66%] test/test_backends.py::test_command_deadlock[ansible://debian_bookworm] PASSED [ 83%] test/test_backends.py::test_command_deadlock[ansible://debian_bookworm?force_ansible=True] PASSED [100%] ================================================================================================ FAILURES ================================================================================================ ___________________________________________________________________________ test_command_deadlock[paramiko://debian_bookworm] ____________________________________________________________________________ host = <testinfra.host.Host paramiko://debian_bookworm> @pytest.mark.testinfra_hosts(*HOSTS) @pytest.mark.execution_timeout(10) def test_command_deadlock(host): # Test for deadlock when exceeding Paramiko transport buffer (2MB) # https://docs.paramiko.org/en/latest/api/channel.html#paramiko.channel.Channel.recv_exit_status size = 3 * 1024 * 1024 > output = host.check_output(f"python3 -c 'print(\"a\" *{size})'") test/test_backends.py:651: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ testinfra/host.py:98: in run return self.backend.run(command, *args, **kwargs) testinfra/backend/paramiko.py:150: in run rc, stdout, stderr = self._exec_command(cmd) testinfra/backend/paramiko.py:141: in _exec_command rc = chan.recv_exit_status() .tox/py/lib/python3.10/site-packages/paramiko/channel.py:400: in recv_exit_status self.status_event.wait() /usr/lib/python3.10/threading.py:607: in wait signaled = self._cond.wait(timeout) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

With the fix, the test passes

========================================================================================== test session starts =========================================================================================== platform linux -- Python 3.10.12, pytest-8.3.3, pluggy-1.5.0 -- /home/jburgess/git/pytest-testinfra/.tox/py/bin/python cachedir: .tox/py/.pytest_cache rootdir: /home/jburgess/git/pytest-testinfra configfile: setup.cfg plugins: testinfra-0.1.dev789, cov-5.0.0, xdist-3.6.1, timeouts-1.2.1 setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s collected 177 items / 171 deselected / 6 selected test/test_backends.py::test_command_deadlock[ssh://debian_bookworm] PASSED [ 16%] test/test_backends.py::test_command_deadlock[safe-ssh://debian_bookworm] PASSED [ 33%] test/test_backends.py::test_command_deadlock[docker://debian_bookworm] PASSED [ 50%] test/test_backends.py::test_command_deadlock[paramiko://debian_bookworm] PASSED [ 66%] test/test_backends.py::test_command_deadlock[ansible://debian_bookworm] PASSED [ 83%] test/test_backends.py::test_command_deadlock[ansible://debian_bookworm?force_ansible=True] PASSED [100%] =================================================================================== 6 passed, 171 deselected in 4.24s ==================================================================================== py: OK ✔ in 5.43 seconds 

I didn't see any regressions in the rest of the test suite.

The paramiko library has a known problem where checking the exit status can cause a deadlock if the command has written a lot of output to the stdout (or stderr) channel: https://docs.paramiko.org/en/stable/api/channel.html#paramiko.channel.Channel.recv_exit_status We can work around this, for the stdout case, by reading the data before checking the exit status.
@philpepphilpep merged commit e1eb9c1 into pytest-dev:mainOct 21, 2024
5 of 7 checks passed
@philpep
Copy link
Contributor

Merged, thanks for digging into this!

Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

@jburgess777@philpep