Skip to content

Conversation

@mscdex
Copy link
Contributor

@mscdexmscdex commented Jun 4, 2017

I recently noticed that with the introduction of Async Hooks also came the (perhaps unintentional) reverting of 804d57d. I actually tried a lot of different solutions, but almost all of them resulted in regressions in other nextTick() benchmarks (e.g. a perf boost in no args case, but regression in args case) which I still don't quite understand. One thing I did notice is that the ideal solution of just using a plain object and embedding the symbol properties as computed properties causes TurboFan to kick in and that causes a significant slowdown for nextTick().

I was able to improve the existing non-ES6-based constructor solution, but it turns out using an ES6 class improves performance even more (even surpassing the plain object-based solutions that performed better in whichever benchmarks), so that is what I decided on.

Results:

 improvement confidence p.value process/next-tick-breadth-args.js millions=2 27.75 % *** 1.271176e-20 process/next-tick-breadth.js millions=2 7.71 % *** 4.155765e-13 process/next-tick-depth-args.js millions=12 47.78 % *** 4.150674e-52 process/next-tick-depth.js millions=12 47.32 % *** 7.742778e-31 

CI: https://ci.nodejs.org/job/node-test-pull-request/8464/

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)
  • process

@mscdexmscdex added performance Issues and PRs related to the performance of Node.js. process Issues and PRs related to the process subsystem. labels Jun 4, 2017
Copy link
Contributor

@refackrefack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (just some nits)

Copy link
Contributor

@refackrefackJun 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering... It this a style choice or a DEOPT?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style choice, I think we tend to use braces for multi-line conditionals from what I've generally seen, this is just one "offender" I ran across while working in this file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% with so much side effects in a constructor...
If it's not a performance hit I'd rather see these four lines in a static method or factory
If it is a performance hit, I think it should be documented:

// These side effects were folded into the c'tor for performance 

(or something to that effect)

Copy link
ContributorAuthor

@mscdexmscdexJun 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't move these other statements back into nextTick() without causing a performance regression from what I saw. I believe it's because of Crankshaft's inlining limits, but I did not want to introduce regressions at this moment (we will have to deal with it once TurboFan is everywhere).

Unfortunately internalNextTick() exceeds these limits, so there's not much I can do there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so probably even a comment will "irritate" Crankshaft
What can you do 🤷‍♂️

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully once V8 (or perhaps they already do in later versions) is able to better optimize computed properties in TurboFan, we can use plain objects again without losing performance.

Copy link
Contributor

@Fishrock123Fishrock123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm w comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify the comment rather than removing it?

Copy link
ContributorAuthor

@mscdexmscdexJun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing it matches internalNextTick() and it doesn't seem like a particularly helpful comment as process._exiting is pretty self-explanatory to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the code says what but I'm wondering if it may be helpful to say why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW Crankshaft can decide not to inline a function because it has a comment in it 😞 so if we come up with a better comment it probably should go in line 231.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, it makes little difference as the comment was already there and that will soon be an obsolete optimization as I understand.

Copy link
Member

@gibfahngibfahnJun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by a 'why comment.'

AIUI it's a comment explaining why we're doing this (as opposed to explaining what this code does).

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gibfahn I understand that part. What I meant was I don't understand what this is in reference to. Is it the process._exiting line? The changes I'm making? Something else? What is the suggested 'why comment' ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the suggested 'why comment' ?

(AIUI) This is the comment, it's a "why comment", it's helpful to preserve it.

// on the way out, don't bother. it won't get fired anyway.

FWIW I'm sure putting it above the function is fine.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fishrock123@gibfahn I've added something above both functions now, let me know if that is what you had in mind or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fishrock123@gibfahn I've added something above both functions now, let me know if that is what you had in mind or not.

Clearer than the original, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is var over const?

Copy link
ContributorAuthor

@mscdexmscdexJun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was triggering a deopt.

Copy link
Contributor

@refackrefackJun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// we bail fast `if (process._exiting)` since the loop has effectively stopped// and there will never be a `nextTick`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this increase? You are getting ridiculous statistical confidence. By some quick calculation, this should only improve the confidence by a factor of 2.5. Compensating for that and the previous version should also give you confidence well beyond ***.

Copy link
ContributorAuthor

@mscdexmscdexJun 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always choose something large enough to allow for any re-opts to occur and for results to stabilize. With 2, I was getting results that varied quite a bit in comparison because it was finishing so quickly (at least on my machine).

@mscdexmscdexforce-pushed the process-nexttick-perf branch 2 times, most recently from ddc2508 to 447162bCompareJune 5, 2017 20:26
@mscdexmscdex mentioned this pull request Jun 7, 2017
3 tasks
@mscdex
Copy link
ContributorAuthor

mscdex commented Jun 7, 2017

FWIW with V8 5.9 in master I now get these results with this PR as-is:

 improvement confidence p.value process/next-tick-breadth-args.js millions=2 6.17 % *** 2.725997e-04 process/next-tick-breadth.js millions=2 6.68 % ** 1.568356e-03 process/next-tick-depth-args.js millions=12 46.71 % *** 2.458707e-81 process/next-tick-depth.js millions=12 43.66 % *** 3.338435e-66 

I'm not sure how breadth-args was so much higher previously, I would've expected results more like I am seeing now in 5.9, where the args and non-args are much closer together.

@mscdexmscdex added the wip Issues and PRs that are still a work in progress. label Jun 8, 2017
@mscdexmscdexforce-pushed the process-nexttick-perf branch from 447162b to d68f8baCompareJune 8, 2017 19:16
@mscdex
Copy link
ContributorAuthor

mscdex commented Jun 8, 2017

Ok, I've made a couple more adjustments:

  • Moved the non-assignment code out of the TickObject class constructor as the difference in code location no longer makes any difference with TurboFan in V8 5.9
  • Switched from an array to a linked list for storing TickObject instances, this increases performance more significantly than the original changes alone. I think it may be possible to make some further optimizations to avoid use of the tickInfo counters shared with C++ land (at least the kIndex counter), however I've decided to leave it alone for now.

Here are the results as of these latest changes:

 improvement confidence p.value process/next-tick-breadth-args.js millions=4 135.31 % *** 3.011128e-73 process/next-tick-breadth.js millions=4 361.40 % *** 5.097232e-76 process/next-tick-depth-args.js millions=12 44.08 % *** 1.276617e-66 process/next-tick-depth.js millions=12 43.85 % *** 1.333610e-49 

CI: https://ci.nodejs.org/job/node-test-pull-request/8561/

@mscdexmscdex removed the wip Issues and PRs that are still a work in progress. label Jun 8, 2017
@mscdex
Copy link
ContributorAuthor

@refack@Fishrock123@jasnell@evanlucas Please review these recent changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is faster than spread parameters even with V8 5.9?

Copy link
ContributorAuthor

@mscdexmscdexJun 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how spread could be used here. Do you mean rest parameters? If so, I tried that and while it's a little faster in one benchmark, it's slower to varying degrees in all others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I meant. Sorry, could never get all the terminologies right.

@refack
Copy link
Contributor

So I'm just writing down what I saw (writing down mostly to organize my thoughts):

  1. the linked list queue
  2. inlined setupInit X 2
  3. switch for args X 2
  4. the const microTasksTickObject

Anything else of significance?

@mscdex
Copy link
ContributorAuthor

@refack That's mostly it, unless you want to also count the switch to ES6 class for TickObject.

@refack
Copy link
Contributor

Thanks for the confirmation ✔️

@mscdexmscdexforce-pushed the process-nexttick-perf branch from d68f8ba to 2126cadCompareJune 16, 2017 07:24
@mscdex
Copy link
ContributorAuthor

mscdex commented Jun 16, 2017

Rebased. If there are no objections in the next 24 hours I will assume all who approved the original changes also approve of these latest changes that I previously outlined.

PR-URL: nodejs#13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
@mscdexmscdexforce-pushed the process-nexttick-perf branch from 2126cad to 460ee75CompareJune 17, 2017 14:39
@mscdexmscdex merged commit 460ee75 into nodejs:masterJun 17, 2017
@mscdexmscdex deleted the process-nexttick-perf branch June 17, 2017 14:41
addaleax pushed a commit that referenced this pull request Jun 18, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
addaleax pushed a commit that referenced this pull request Jun 20, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
@addaleaxaddaleax mentioned this pull request Jun 21, 2017
addaleax pushed a commit that referenced this pull request Jun 21, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
addaleax pushed a commit that referenced this pull request Jun 24, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
rvagg pushed a commit that referenced this pull request Jun 29, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
addaleax pushed a commit that referenced this pull request Jul 11, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
addaleax pushed a commit that referenced this pull request Jul 18, 2017
PR-URL: #13446 Reviewed-By: Refael Ackermann <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Jeremiah Senkpiel <[email protected]> Reviewed-By: Evan Lucas <[email protected]>
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performanceIssues and PRs related to the performance of Node.js.processIssues and PRs related to the process subsystem.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants

@mscdex@refack@jasnell@AndreasMadsen@evanlucas@Fishrock123@TimothyGu@gibfahn