Improve traverse and parse performance #900

petersondrew · 2016-03-09T20:44:03Z

Make use of Object.keys rather than for..in + hasOwnProperty, as the
latter cannot be optimized by Node and results in deoptimized function
calls in the hot path.
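
To illustrate the kind of change described, here is a minimal sketch of the two iteration styles side by side. The function names `traverseForIn` and `traverseKeys`, and the `visit` callback, are hypothetical stand-ins and are not taken from the r.js source; the real `traverse` may differ in its details.

```javascript
// Hypothetical sketch, not the r.js implementation.

// Style being replaced: for..in plus a hasOwnProperty check per key.
function traverseForIn(node, visit) {
    if (!node || typeof node !== 'object') { return; }
    visit(node);
    for (var key in node) {
        if (Object.prototype.hasOwnProperty.call(node, key)) {
            traverseForIn(node[key], visit);
        }
    }
}

// Replacement style: Object.keys returns only own enumerable keys,
// so no per-key hasOwnProperty call is needed inside the loop.
function traverseKeys(node, visit) {
    if (!node || typeof node !== 'object') { return; }
    visit(node);
    var keys = Object.keys(node);
    for (var i = 0; i < keys.length; i++) {
        traverseKeys(node[keys[i]], visit);
    }
}
```

Both versions visit the same own enumerable properties in the same order, so the change should be behavior-preserving for plain AST-like objects.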

Before:

   ticks  total  nonlib   name
    882    4.0%    4.0%  LazyCompile: ~traverse r.js:26797:22
    248    1.1%    1.1%  Stub: CEntryStub
    207    0.9%    0.9%  LazyCompile: *hasOwnProperty native v8natives.js:111:30
    194    0.9%    0.9%  LazyCompile: *ToName native runtime.js:466:16
    153    0.7%    0.7%  LazyCompile: ~parse.recurse r.js:26980:30

After:

   ticks  total  nonlib   name
     56    0.5%    0.5%  LazyCompile: *traverse r.js:26797:22

This has a significant impact on CPU time and wall-clock performance. The test suites pass as far as I can tell; however, the Rhino test suites hung on the sourcemap tests (they did on master as well). I'm not sure whether it's required for this change, but I have submitted a CLA just in case.


jrburke · 2016-03-09T21:22:26Z

This is fantastic, thank you! Tested locally and updated the snapshot in dist/r.js with the changes. Noticeable speedup for the tests, looks like it is close to twice as fast to run the node tests with the change.

CLA is much appreciated, thank you. If you filled out the Dojo one, the Dojo and jQuery foundations have merged and I'm mid-process for switching over the CLA links. While it should be fine if you used the Dojo form, it is probably best if you also do the jQuery one: https://contribute.jquery.org/CLA/ which applies to other projects like jQuery, lodash, so gives a lot of contribution possibilities for the future.

Thank you so much for the analysis and the fix, a great improvement.

petersondrew · 2016-03-09T21:27:50Z

You're very welcome! I submitted a jQuery CLA as well to cover the bases.

jrburke · 2016-03-17T02:07:22Z

@petersondrew I am interested to know more about how you went about identifying the bottleneck. I am still new to those kinds of tools for Node. No worries though if you do not have time, just idle curiosity. I expect if I just did some more Google research I would find the answer.

petersondrew · 2016-03-17T04:52:16Z

Sure, no problem 😄
I noticed the requirejs step in my build script seemed to be taking an inordinate amount of time, especially considering uglify was disabled. To profile the build script I used the --prof flag:
node.exe --prof .\node_modules\grunt-cli\bin\grunt
That creates a tick file in the same directory, named something like isolate-0x*-v8.log. This tick file can then be processed by node to produce output similar to what I posted above using the following:
node --prof-process isolate.log > profile.txt
That file will break down the number of ticks spent in different parts of the code.

If you notice in the first snippet I posted, most of the ticks were spent in LazyCompile: ~traverse. The tilde before the function name is significant here: it means the function was not optimized, so every call ran the code produced by V8's non-optimizing compiler (Node really has two compilers). An asterisk, as in *traverse in the second snippet, marks a function that was optimized.
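
As a rough aid for reading that output, the sketch below scans rows shaped like the ones in the snippets above and lists the unoptimized (`~`) functions above a tick threshold. The helper names `parseProfileRows` and `unoptimizedHotspots` are hypothetical, and the row layout is assumed from the snippets in this thread; output from other Node versions may be laid out differently.

```javascript
// Hypothetical helpers; the row format is assumed from the snippets in this thread.

// Parse "ticks  total%  nonlib%  name" rows, skipping headers and other lines.
function parseProfileRows(text) {
    var rowPattern = /^\s*(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+(.+)$/;
    var rows = [];
    text.split(/\r?\n/).forEach(function (line) {
        var match = rowPattern.exec(line);
        if (!match) { return; }
        var name = match[4].trim();
        // "~" marks an unoptimized function, "*" an optimized one.
        var marker = /^LazyCompile: ([~*])/.exec(name);
        var state = 'n/a';
        if (marker) {
            state = marker[1] === '*' ? 'optimized' : 'unoptimized';
        }
        rows.push({ ticks: parseInt(match[1], 10), name: name, state: state });
    });
    return rows;
}

// Keep only unoptimized functions with at least minTicks ticks.
function unoptimizedHotspots(text, minTicks) {
    return parseProfileRows(text).filter(function (row) {
        return row.state === 'unoptimized' && row.ticks >= minTicks;
    });
}
```

A possible use would be `unoptimizedHotspots(require('fs').readFileSync('profile.txt', 'utf8'), 100)`, which gives a short list of candidates to investigate further.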

Now, it's not necessarily a given that an unoptimized function will kill your performance, even in the hot path. However, this particular function created a double whammy: if we dig in further, we can see that (possibly due to the recursion) Node repeatedly optimizes and then de-optimizes it whenever it finds that its assumptions were incorrect:
node --trace-deopt --trace-opt .\node_modules\grunt-cli\bin\grunt > trace.txt
That trace file will contain a log of each function optimization and de-optimization, along with the reason. The reason can sometimes provide enough insight to let you modify the function so that V8 can optimize it successfully. See this excellent wiki for some good examples.

Let me know if you have any other questions.

jrburke · 2016-03-17T04:54:54Z

@petersondrew, this is awesome, thank you very much! Very educational.

Improve traverse and parse performance

991d88a

Make use of Object.keys rather than for..in + hasOwnProperty, as the latter cannot be optimized by Node and results in deoptimized function calls in the hot path.

jrburke added this to the 2.1.23 milestone Mar 9, 2016

jrburke added a commit that referenced this pull request Mar 9, 2016

Merge pull request #900 from petersondrew/parse-perf

f736a69

Improve traverse and parse performance

jrburke merged commit f736a69 into requirejs:master Mar 9, 2016

jrburke mentioned this pull request Mar 9, 2016

Dramatic increase in build time in r.js version 2.1.16, as compared to 2.1.15 #850

Closed

petersondrew deleted the parse-perf branch March 10, 2016 15:55

jrburke modified the milestones: 2.1.23, 2.2.0 Mar 15, 2016

samreid mentioned this pull request Dec 20, 2016

Update requirejs for a speed boost in builds phetsims/chipper#532

Closed
