Skip to content

Conversation

@AndreasMadsen
Copy link
Member

@AndreasMadsenAndreasMadsen commented Jan 25, 2018

As talked about in #18112 (comment) this shows more clearly the variance of each comparison. This should also help us prevent over-running the benchmarks. If you see an accuracy of ±0.1% then properly you could spend fewer iterations running that ;)

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

benchmark

example output:

 confidence improvement accuracy (*) (**) (***) fs/bench-readdir.js n=10000 -7.20 % ±9.94% ±13.23% ±17.22% fs/bench-readdirSync.js n=10000 -0.46 % ±6.97% ±9.27% ±12.08% fs/bench-realpath.js pathType="relative" n=10000 1.62 % ±4.25% ±5.65% ±7.35% fs/bench-realpath.js pathType="resolved" n=10000 -1.68 % ±3.89% ±5.17% ±6.74% fs/bench-realpathSync.js pathType="relative" n=10000 0.72 % ±5.87% ±7.81% ±10.18% fs/bench-realpathSync.js pathType="resolved" n=10000 0.16 % ±1.61% ±2.15% ±2.80% fs/bench-stat.js statType="fstat" n=200000 -2.39 % ±4.50% ±6.00% ±7.83% fs/bench-stat.js statType="lstat" n=200000 -2.94 % ±5.34% ±7.11% ±9.25% fs/bench-stat.js statType="stat" n=200000 2.30 % ±4.29% ±5.72% ±7.45% fs/bench-statSync.js statSyncType="fstatSync" n=1000000 -1.15 % ±5.69% ±7.57% ±9.85% fs/bench-statSync.js statSyncType="lstatSync" n=1000000 0.47 % ±2.15% ±2.90% ±3.84% fs/bench-statSync.js statSyncType="statSync" n=1000000 0.62 % ±2.99% ±3.98% ±5.19% fs/read-stream-throughput.js size=1024 filesize=1048576000 encodingType="asc" 1.08 % ±2.96% ±3.94% ±5.13% fs/read-stream-throughput.js size=1024 filesize=1048576000 encodingType="buf" -1.86 % ±3.85% ±5.13% ±6.68% fs/read-stream-throughput.js size=1024 filesize=1048576000 encodingType="utf" -1.09 % ±2.29% ±3.04% ±3.96% fs/read-stream-throughput.js size=1048576 filesize=1048576000 encodingType="asc" -1.70 % ±4.55% ±6.07% ±7.95% fs/read-stream-throughput.js size=1048576 filesize=1048576000 encodingType="buf" -0.51 % ±3.54% ±4.71% ±6.14% fs/read-stream-throughput.js size=1048576 filesize=1048576000 encodingType="utf" 0.21 % ±10.19% ±13.56% ±17.65% fs/read-stream-throughput.js size=4096 filesize=1048576000 encodingType="asc" -1.84 % ±3.14% ±4.17% ±5.43% fs/read-stream-throughput.js size=4096 filesize=1048576000 encodingType="buf" -2.14 % ±4.09% ±5.44% ±7.08% fs/read-stream-throughput.js size=4096 filesize=1048576000 encodingType="utf" 1.17 % ±3.53% ±4.70% ±6.11% fs/read-stream-throughput.js size=65535 filesize=1048576000 encodingType="asc" 1.25 % ±7.25% ±9.65% ±12.56% fs/read-stream-throughput.js size=65535 filesize=1048576000 encodingType="buf" * 6.19 % ±5.13% ±6.86% ±9.01% fs/read-stream-throughput.js size=65535 filesize=1048576000 encodingType="utf" 2.12 % ±6.49% ±8.64% ±11.25% fs/readfile.js concurrent=1 len=1024 dur=5 -0.33 % ±3.64% ±4.84% ±6.31% fs/readfile.js concurrent=1 len=16777216 dur=5 -3.03 % ±5.35% ±7.12% ±9.27% fs/readfile.js concurrent=10 len=1024 dur=5 -0.98 % ±2.13% ±2.83% ±3.68% fs/readfile.js concurrent=10 len=16777216 dur=5 0.14 % ±1.72% ±2.29% ±2.98% fs/readFileSync.js n=600000 -4.79 % ±7.59% ±10.10% ±13.17% fs/write-stream-throughput.js size=1024 encodingType="asc" dur=5 -1.60 % ±8.77% ±11.66% ±15.18% fs/write-stream-throughput.js size=1024 encodingType="buf" dur=5 0.39 % ±12.28% ±16.34% ±21.27% fs/write-stream-throughput.js size=1024 encodingType="utf" dur=5 -0.77 % ±5.77% ±7.69% ±10.03% fs/write-stream-throughput.js size=1048576 encodingType="asc" dur=5 0.18 % ±0.99% ±1.31% ±1.71% fs/write-stream-throughput.js size=1048576 encodingType="buf" dur=5 * 23.42 % ±23.37% ±31.12% ±40.58% fs/write-stream-throughput.js size=1048576 encodingType="utf" dur=5 2.73 % ±7.07% ±9.45% ±12.41% fs/write-stream-throughput.js size=2 encodingType="asc" dur=5 0.74 % ±11.24% ±14.96% ±19.47% fs/write-stream-throughput.js size=2 encodingType="buf" dur=5 -2.84 % ±5.64% ±7.52% ±9.81% fs/write-stream-throughput.js size=2 encodingType="utf" dur=5 * -15.35 % ±14.18% ±18.87% ±24.58% fs/write-stream-throughput.js size=65535 encodingType="asc" dur=5 -3.07 % ±10.67% ±14.21% ±18.53% fs/write-stream-throughput.js size=65535 encodingType="buf" dur=5 * -12.87 % ±11.94% ±15.88% ±20.67% fs/write-stream-throughput.js size=65535 encodingType="utf" dur=5 3.28 % ±9.81% ±13.08% ±17.06% Be aware that when doing many comparisons the risk of a false-positive result increases. In this case there are 41 comparisons, you can thus expect the following amount of false-positive results: 2.05 false positives, when considering a 5% risk acceptance (*, **, ***), 0.41 false positives, when considering a 1% risk acceptance (**, ***), 0.04 false positives, when considering a 0.1% risk acceptance (***) 

@nodejs-github-botnodejs-github-bot added the benchmark Issues and PRs related to the benchmark subsystem. label Jan 25, 2018
@AndreasMadsen
Copy link
MemberAuthor

/cc @joyeecheung
/cc @jasnell@Matteo - who have been my "I don't understand statistic" test subjects.

Copy link
Member

@joyeecheungjoyeecheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a R expert, but LGTM if benchmark CI is happy.

@joyeecheung
Copy link
Member

@joyeecheung
Copy link
Member

 confidence improvement accuracy (*) (**) (***) arrays/var-int.js n=25 type="Array" 4.97 % ±13.05% ±17.96% ±24.65% arrays/var-int.js n=25 type="Buffer" -0.85 % ±12.22% ±16.76% ±22.88% arrays/var-int.js n=25 type="Float32Array" -0.31 % ±0.43% ±0.58% ±0.80% arrays/var-int.js n=25 type="Float64Array" 0.34 % ±0.50% ±0.69% ±0.95% arrays/var-int.js n=25 type="Int16Array" 3.92 % ±9.76% ±14.00% ±20.56% arrays/var-int.js n=25 type="Int32Array" -0.21 % ±0.48% ±0.66% ±0.93% arrays/var-int.js n=25 type="Int8Array" -5.21 % ±9.54% ±13.70% ±20.15% arrays/var-int.js n=25 type="Uint16Array" -0.14 % ±12.71% ±17.42% ±23.73% arrays/var-int.js n=25 type="Uint32Array" -0.79 % ±1.49% ±2.13% ±3.09% arrays/var-int.js n=25 type="Uint8Array" 4.23 % ±10.50% ±15.08% ±22.17% arrays/zero-float.js n=25 type="Array" -1.47 % ±4.95% ±7.10% ±10.41% arrays/zero-float.js n=25 type="Buffer" 3.84 % ±9.00% ±12.92% ±18.98% arrays/zero-float.js n=25 type="Float32Array" 0.30 % ±0.53% ±0.73% ±0.99% arrays/zero-float.js n=25 type="Float64Array" -0.07 % ±1.77% ±2.50% ±3.59% arrays/zero-float.js n=25 type="Int16Array" 2.44 % ±7.92% ±11.26% ±16.31% arrays/zero-float.js n=25 type="Int32Array" -0.04 % ±0.63% ±0.88% ±1.24% arrays/zero-float.js n=25 type="Int8Array" 0.10 % ±2.76% ±3.78% ±5.16% arrays/zero-float.js n=25 type="Uint16Array" -0.72 % ±1.92% ±2.74% ±3.98% arrays/zero-float.js n=25 type="Uint32Array" 0.67 % ±0.92% ±1.27% ±1.76% arrays/zero-float.js n=25 type="Uint8Array" -7.27 % ±11.99% ±17.15% ±25.07% arrays/zero-int.js n=25 type="Array" 2.16 % ±8.68% ±11.95% ±16.41% arrays/zero-int.js n=25 type="Buffer" 3.86 % ±8.99% ±12.91% ±18.97% arrays/zero-int.js n=25 type="Float32Array" -0.26 % ±0.57% ±0.78% ±1.07% arrays/zero-int.js n=25 type="Float64Array" -0.42 % ±0.65% ±0.90% ±1.25% arrays/zero-int.js n=25 type="Int16Array" 4.87 % ±9.30% ±13.36% ±19.64% arrays/zero-int.js n=25 type="Int32Array" 0.16 % ±1.78% ±2.51% ±3.61% arrays/zero-int.js n=25 type="Int8Array" 4.88 % ±9.88% ±14.18% ±20.82% arrays/zero-int.js n=25 type="Uint16Array" 3.03 % ±9.31% ±13.27% ±19.29% arrays/zero-int.js n=25 type="Uint32Array" 1.28 % ±2.26% ±3.24% ±4.74% arrays/zero-int.js n=25 type="Uint8Array" 4.17 % ±9.60% ±13.79% ±20.27% Be aware that when doing many comparisions the risk of a false-positive result increases. In this case there are 30 comparisions, you can thus expect the following amount of false-positive results: 1.50 false positives, when considering a 5% risk acceptance (*, **, ***), 0.30 false positives, when considering a 1% risk acceptance (**, ***), 0.03 false positives, when considering a 0.1% risk acceptance (***) Notifying upstream projects of job completion Finished: SUCCESS 

The accuracy there probably means those benchmarks are just not that reliable in nature...

@gareth-ellis
Copy link
Member

gareth-ellis commented Jan 27, 2018

@joyeecheung / @AndreasMadsen

LGTM, i seem to be having issues with my email notifications at the moment.
I've also noticed in the output that its not particularly easy to confirm what has been done - i should be able to update the script to make it very clear what change we make to the build under test, which should make this a lot clearer.

As there is the extra warning about false positives in the benchmark output, I think we can be sure that the change was in this build - but i think i can make it clearer in the future.!

Note, the job is actually running https://github.com/nodejs/benchmarking/blob/core-benchmark/experimental/benchmarks/community-benchmark/run.sh

I'll get this into master, as that's going to lead to even more confusion (I made some changes to try and reduce output from the build, but I need to also get it to take stderr away, as we have a lot of warnings that make the rest of the output trickier to understand.)

@joyeecheungjoyeecheung added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Jan 29, 2018
@joyeecheung
Copy link
Member

Landed in 368517c, thanks!

@joyeecheungjoyeecheung removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Jan 29, 2018
joyeecheung pushed a commit that referenced this pull request Jan 29, 2018
PR-URL: #18373 Reviewed-By: Joyee Cheung <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Ruben Bridgewater <[email protected]>
@evanlucasevanlucas mentioned this pull request Jan 30, 2018
evanlucas pushed a commit that referenced this pull request Jan 30, 2018
PR-URL: #18373 Reviewed-By: Joyee Cheung <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Ruben Bridgewater <[email protected]>
MylesBorins pushed a commit that referenced this pull request Feb 27, 2018
PR-URL: #18373 Reviewed-By: Joyee Cheung <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Ruben Bridgewater <[email protected]>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
PR-URL: nodejs#18373 Reviewed-By: Joyee Cheung <[email protected]> Reviewed-By: James M Snell <[email protected]> Reviewed-By: Ruben Bridgewater <[email protected]>
Sign up for freeto join this conversation on GitHub. Already have an account? Sign in to comment

Labels

benchmarkIssues and PRs related to the benchmark subsystem.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

@AndreasMadsen@joyeecheung@gareth-ellis@jasnell@BridgeAR@MylesBorins@nodejs-github-bot