Uh oh!
There was an error while loading. Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork 34.2k
test: fix flaky test-tls-wrap-timeout#7857
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Uh oh!
There was an error while loading. Please reload this page.
Conversation
Trott commented Jul 23, 2016
The fix for this test is in io.js 1.3 and absent in io.js 1.2. So I modified the test a bit so that it would work on those versions to confirm that the test still passes in 1.3 and fails in 1.2. That form of the test can be found at https://gist.github.com/Trott/e4110c64314b9666185c3018dbb33b5c |
Trott commented Jul 25, 2016
indutny commented Jul 25, 2016
Hm... doesn't really LGTM. The test was about server timeout, not client's. |
indutny commented Jul 25, 2016
Sorry! |
Trott commented Jul 25, 2016
@indutny I don't understand. Both this version of the test and the current one check the same object ( The old version made sure that It was introduced in 9b6b0555. Prior to that, the timeout was firing (incorrectly) because This new proposed version removes the race condition by checking that Maybe it would be better to also check that |
Trott commented Jul 25, 2016
(Failure that this PR is supposed to address happened again on FreeBSD in CI: https://ci.nodejs.org/job/node-test-commit-freebsd/3427/nodes=freebsd10-64/console ) |
Trott commented Jul 26, 2016
@indutny I put back the timer (now |
Trott commented Jul 27, 2016
New version of test is failing on some platforms. More work to be done... |
Trott commented Jul 27, 2016 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
OK, let's try again: https://ci.nodejs.org/job/node-test-pull-request/3437/ EDIT: OK, that's looking much better so far. |
Trott commented Jul 29, 2016
Previous CI failures look CI infra related and not an issue with this test, but uh, let's run it again anyway: https://ci.nodejs.org/job/node-test-pull-request/3450/ |
Trott commented Jul 29, 2016
Ci again again: https://ci.nodejs.org/job/node-test-pull-request/3452/ |
jasnell commented Jul 29, 2016 • edited
Loading Uh oh!
There was an error while loading. Please reload this page.
edited
Uh oh!
There was an error while loading. Please reload this page.
|
Trott commented Jul 29, 2016
@jasnell All the CI failures are build related. None are related to this test. But I do love green, so let's do CI again again again: https://ci.nodejs.org/job/node-test-pull-request/3455/ |
Trott commented Jul 29, 2016
I am eager to get this fix in for this test because the version currently on master is failing a fair amount on FreeBSD. Here's another one from this morning: |
jasnell commented Jul 29, 2016
LGTM with green-ish CI then ;-) |
Trott commented Jul 29, 2016
Last CI is green! ✅ |
Trott commented Jul 30, 2016
Another failure on FreeBSD with current master. Definitely eager to land this fix. https://ci.nodejs.org/job/node-test-commit-freebsd/3506/nodes=freebsd10-64/console |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding here
assert.notStrictEqual(socket._idleTimeout,-1);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@santigimeno Good idea. Added!
santigimeno commented Jul 30, 2016
I like that with this change we don't rely on timers. I'm not too sold on relying on implementation details (i.e: |
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: nodejs#7650
Trott commented Jul 31, 2016
@santigimeno Yeah, I'd welcome a better way. For what it's worth, there is at least one other test where we do something similar ( |
santigimeno commented Jul 31, 2016
@Trott FWIW I gave this a try and found a way that, at least, seems less flaky than the current one: santigimeno@3df3f47. I run a couple of stress tests and didn't fail:
whereas current master did fail:
The problem is that still relies on timers and the number of tests run doesn't seem representative enough to consider it non-flaky. |
santigimeno commented Jul 31, 2016
Also, I looked to the original issue: nodejs/node-v0.x-archive#9242, and by the looks of the sample code pasted there, it seems that the timer associated with the socket is unref-ed as it doesn't keep the loop running even though the timer is already running. This test would have been so much easier to implement if the timer were ref-ed... |
Trott commented Jul 31, 2016
@santigimeno I'm inclined to think that using |
Trott commented Jul 31, 2016
@indutny@santigimeno@jasnell Maybe something like d76bb07a19d890d95e71329da504187f3e4462f2 is sufficiently minimal a change to have everyone feel better about it? (Haven't actually stress tested or run it on the old versions and so on, but you get the idea of the approach.) |
indutny commented Jul 31, 2016
Gosh, looks like I was thinking about different test all the time. LGTM. Sorry! |
santigimeno commented Jul 31, 2016
@Trott, after your explanation, I'm fine with this PR, and I think I prefer this to the new one. LGTM |
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: nodejs#7650 PR-URL: nodejs#7857 Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: Fedor Indutny <[email protected]> Reviewed-By: jasnell - James M Snell <[email protected]>
Trott commented Jul 31, 2016
Landed in bc464a8. Thanks, everyone! |
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: #7650 PR-URL: #7857 Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: Fedor Indutny <[email protected]> Reviewed-By: jasnell - James M Snell <[email protected]>
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: nodejs#7650 PR-URL: nodejs#7857 Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: Fedor Indutny <[email protected]> Reviewed-By: jasnell - James M Snell <[email protected]>
Competing timers were causing a race condition and thus the test was flaky. Instead, we check an object property on process exit. Fixes: #7650 Backport-PR-URL: #12567 PR-URL: #7857 Reviewed-By: Santiago Gimeno <[email protected]> Reviewed-By: Fedor Indutny <[email protected]> Reviewed-By: jasnell - James M Snell <[email protected]>
Checklist
make -j4 test(UNIX), orvcbuild test nosign(Windows) passesAffected core subsystem(s)
test tls
Description of change
Competing timers were causing a race condition and thus the test was
flaky. Instead, we check an object property on process exit.
Fixes: #7650
/cc @indutny @nodejs/testing @SomeKittens