Skip to content

Conversation

@ShogunPanda
Copy link
Contributor

Fixes#43623.

@nodejs-github-botnodejs-github-bot added needs-ci PRs that need a full CI run. test Issues and PRs related to the tests. labels Jul 1, 2022
@ShogunPandaShogunPanda added the fast-track PRs that do not need to wait for 48 hours to land. label Jul 1, 2022
@github-actions
Copy link
Contributor

Fast-track has been requested by @ShogunPanda. Please 👍 to approve.

@tniessen
Copy link
Member

@MoLow FYI, it looks like this is the same approach as #43633, except modifying two more occurrences in the file.

@tniessentniessen added the request-ci Add this label to start a Jenkins CI on a PR. label Jul 1, 2022
@github-actionsgithub-actionsbot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 1, 2022
@nodejs-github-bot

This comment was marked as outdated.

@F3n67u
Copy link
Contributor

Fixes #43623.

@ShogunPanda Could you help me understand why this pr will fix the test failure?

@tniessen
Copy link
Member

tniessen commented Jul 1, 2022

The test is still failing in CI.

If #43522 is the culprit, maybe revert (#43636) is the way to go? Looking at its CI history, it probably should not have landed: the test failed a bunch of times there, too.

@tniessen
Copy link
Member

Unrelated side note: the commit message should start with http: fix ..., not http: fixed ....

@ShogunPanda
Copy link
ContributorAuthor

We have lot of flakiness in tests lately. Please do not revert the other PR, I'll work in the next few hours to understand why it fails.

@ShogunPanda
Copy link
ContributorAuthor

I found the issue. The problem was not in using the agent but in the fact that the server.close was called before the client could receive the response. As the request was completed closeIdleConnections would have closed it.
It should be fixed now.

@nodejs-github-bot
Copy link
Collaborator

@F3n67uF3n67u removed the fast-track PRs that do not need to wait for 48 hours to land. label Jul 1, 2022
@F3n67u
Copy link
Contributor

test-gc-http-client-timeout fail with Cannot read properties of null in Server.closeIdleConnections method, see #43638 (comment). This method is introduced by you, could you help to take a look? @ShogunPanda

@F3n67u
Copy link
Contributor

F3n67u commented Jul 1, 2022

test-gc-http-client-timeout fail with Cannot read properties of null in Server.closeIdleConnections method, see #43638 (comment). This method is introduced by you, could you help to take a look? @ShogunPanda

@ShogunPanda After some investigation, I found this is also caused by #43522. 😹
Can we reconsider reverting #43522?

@ShogunPanda
Copy link
ContributorAuthor

My opinion is that after the changes we detect tests that were misbehaving on handling the socket.
I would suggest to rather fix those tests instead of simply revert the PR.

@F3n67u
Copy link
Contributor

F3n67u commented Jul 1, 2022

My opinion is that after the changes we detect tests that were misbehaving on handling the socket. I would suggest to rather fix those tests instead of simply revert the PR.

This pr causes two tests to fail, this will affect all pr. I would prefer to keep CI green.

When you fixed the test, you could land #43522 again. I don't see any downside to that. could you help me to understand why we would better not revert?

@ShogunPanda
Copy link
ContributorAuthor

Because apparently CI was not detecting those tests to fail (not even locally) while somehow it is now. So I think we are in a better spot at the moment.

@ShogunPanda
Copy link
ContributorAuthor

Moreover, can you share how you detect that the PR is involved? So I can investigate how to fix it

@F3n67u
Copy link
Contributor

Because apparently CI was not detecting those tests to fail (not even locally) while somehow it is now. So I think we are in a better spot at the moment.

Ok. I will support you.

@F3n67u
Copy link
Contributor

F3n67u commented Jul 1, 2022

Moreover, can you share how you detect that the PR is involved? So I can investigate how to fix it

test-stream-finished

This test failure is root caused by @MoLow in #43623 (comment)

test-gc-http-client-timeout

The stack tell me error is happened at Server.closeIdleConnections in Server.close. I then find it is #43522 introduce Server.closeIdleConnections in Server.close.

done/collected/total: 93/0/804done/collected/total: 322/93/804done/collected/total: 415/322/804done/collected/total: 664/349/804done/collected/total: 737/415/804done/collected/total: 804/664/804done/collected/total: 804/737/804done/collected/total: 804/804/804node:_http_server:498 connections[i].socket.destroy(); ^TypeError: Cannot read properties of null (reading 'destroy') at Server.closeIdleConnections (node:_http_server:498:27) at Server.close (node:_http_server:481:8) at Immediate.status (/home/iojs/build/workspace/node-test-binary-armv7l/test/parallel/test-gc-http-client-timeout.js:66:14) at process.processImmediate (node:internal/timers:471:21)

@ShogunPanda
Copy link
ContributorAuthor

Oh, you are on ARM. On MacOS I get a different error. But I suspect it is the same. Working on it.

@ShogunPanda
Copy link
ContributorAuthor

I just have noticed that the test-gc-http-client-timeout has been created out of test-gc-http-client which was moved to sequential tests because it was not reliable. Shall we do the same?

@tniessentniessen requested a review from mcollinaJuly 2, 2022 14:41
@tniessentniessen added the fast-track PRs that do not need to wait for 48 hours to land. label Jul 2, 2022
@github-actions
Copy link
Contributor

Fast-track has been requested by @tniessen. Please 👍 to approve.

@F3n67u
Copy link
Contributor

Great. How can you be sure the flaky test is addressed? Do we need to run Jenkins ci more times to make sure the flaky test is fixed?

We have a special stress test Jenkins job for these situations, but it appears to be broken, presumably because the main branch was renamed recently: https://ci.nodejs.org/view/Stress/job/node-stress-single-test/349/console

wow. this tool is great. I don't know that before.

Copy link
Member

@mcollinamcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@tniessentniessen added the commit-queue Add this label to land a pull request using GitHub Actions. label Jul 2, 2022
@nodejs-github-botnodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels Jul 2, 2022
@nodejs-github-bot
Copy link
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/43641 ✔ Done loading data for nodejs/node/pull/43641 ----------------------------------- PR info ------------------------------------ Title http: fix failing test (#43641) Author Paolo Insogna (@ShogunPanda) Branch ShogunPanda:fix-failing-test -> nodejs:main Labels test, fast-track, needs-ci Commits 2 - http: fix failing test - http: moved test to sequential Committers 1 - Paolo Insogna PR-URL: https://github.com/nodejs/node/pull/43641 Reviewed-By: Matteo Collina Reviewed-By: Filip Skokan ------------------------------ Generated metadata ------------------------------ PR-URL: https://github.com/nodejs/node/pull/43641 Reviewed-By: Matteo Collina Reviewed-By: Filip Skokan -------------------------------------------------------------------------------- ℹ This PR was created on Fri, 01 Jul 2022 09:29:46 GMT ✔ Approvals: 2 ✔ - Matteo Collina (@mcollina) (TSC): https://github.com/nodejs/node/pull/43641#pullrequestreview-1026765760 ✔ - Filip Skokan (@panva): https://github.com/nodejs/node/pull/43641#pullrequestreview-1026768532 ℹ This PR is being fast-tracked ✔ Last GitHub CI successful ℹ Last Full PR CI on 2022-07-02T04:04:05Z: https://ci.nodejs.org/job/node-test-pull-request/45045/ - Querying data for job/node-test-pull-request/45045/ ✔ Last Jenkins CI successful -------------------------------------------------------------------------------- ✔ No git cherry-pick in progress ✔ No git am in progress ✔ No git rebase in progress -------------------------------------------------------------------------------- - Bringing origin/main up to date... From https://github.com/nodejs/node * branch main -> FETCH_HEAD ✔ origin/main is now up-to-date - Downloading patch for 43641 From https://github.com/nodejs/node * branch refs/pull/43641/merge -> FETCH_HEAD ✔ Fetched commits as 201460873d07..64cf037bbed1 -------------------------------------------------------------------------------- [main f57c58aa05] http: fix failing test Author: Paolo Insogna Date: Fri Jul 1 13:39:05 2022 +0200 1 file changed, 3 insertions(+), 1 deletion(-) [main 344ac7b8a0] http: moved test to sequential Author: Paolo Insogna Date: Fri Jul 1 16:29:04 2022 +0200 1 file changed, 0 insertions(+), 0 deletions(-) rename test/{parallel => sequential}/test-gc-http-client-timeout.js (100%) ✔ Patches applied There are 2 commits in the PR. Attempting autorebase. Rebasing (2/4) 

Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
http: fix failing test

PR-URL: #43641
Reviewed-By: Matteo Collina [email protected]
Reviewed-By: Filip Skokan [email protected]

[detached HEAD 0cc5cfe164] http: fix failing test
Author: Paolo Insogna [email protected]
Date: Fri Jul 1 13:39:05 2022 +0200
1 file changed, 3 insertions(+), 1 deletion(-)
Rebasing (3/4)
Rebasing (4/4)

Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
http: moved test to sequential

PR-URL: #43641
Reviewed-By: Matteo Collina [email protected]
Reviewed-By: Filip Skokan [email protected]

[detached HEAD 91de97a73b] http: moved test to sequential
Author: Paolo Insogna [email protected]
Date: Fri Jul 1 16:29:04 2022 +0200
1 file changed, 0 insertions(+), 0 deletions(-)
rename test/{parallel => sequential}/test-gc-http-client-timeout.js (100%)

Successfully rebased and updated refs/heads/main.

ℹ Use --fixupAll option, squash the PR manually or land the PR from the command line.

https://github.com/nodejs/node/actions/runs/2602070703

@panvapanva added commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. commit-queue Add this label to land a pull request using GitHub Actions. and removed commit-queue-failed An error occurred while landing this pull request using GitHub Actions. labels Jul 2, 2022
@nodejs-github-botnodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Jul 2, 2022
@nodejs-github-botnodejs-github-bot merged commit c753f27 into nodejs:mainJul 2, 2022
@nodejs-github-bot
Copy link
Collaborator

Landed in c753f27

@ShogunPandaShogunPanda deleted the fix-failing-test branch July 2, 2022 16:45
@F3n67u
Copy link
Contributor

F3n67u commented Jul 3, 2022

Great. How can you be sure the flaky test is addressed? Do we need to run Jenkins ci more times to make sure the flaky test is fixed?

We have a special stress test Jenkins job for these situations, but it appears to be broken, presumably because the main branch was renamed recently: https://ci.nodejs.org/view/Stress/job/node-stress-single-test/349/console

@tniessen Could we file an issue to nodejs/build or nodejs/node to record this bug?

I tried to fix it to the main branch in nodejs/build repo, but I cannot figure out where the Jenkins file of node-stress-single-test is(I think it resides in some private repo of building wg). Nonetheless, I create nodejs/build#2986 and nodejs/build#2987 to update a lot of master branch usage to main.

@F3n67u
Copy link
Contributor

@ShogunPanda
Copy link
ContributorAuthor

@F3n67u I'll soon file a request to @nodejs/build to get access to machine and investigate it. On macOS it does not happen

@F3n67u
Copy link
Contributor

@F3n67u I'll soon file a request to @nodejs/build to get access to machine and investigate it. On macOS it does not happen

Great. thank you.

targos pushed a commit that referenced this pull request Jul 12, 2022
PR-URL: #43641 Reviewed-By: Matteo Collina <[email protected]> Reviewed-By: Filip Skokan <[email protected]>
@targostargos mentioned this pull request Jul 12, 2022
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commit-queue-squashAdd this label to instruct the Commit Queue to squash all the PR commits into the first one.fast-trackPRs that do not need to wait for 48 hours to land.needs-ciPRs that need a full CI run.testIssues and PRs related to the tests.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate flaky test-stream-finished

7 participants

@ShogunPanda@tniessen@nodejs-github-bot@F3n67u@mcollina@panva@targos