<html>

    <head>

      <base href="https://bugs.webkit.org/" />

    </head>

    <body>

      <p>

        <div>

            <b><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Web Inspector: Iterating over a Set/Map is too slow"

   href="https://bugs.webkit.org/show_bug.cgi?id=152691#c5">Comment # 5</a>

              on <a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Web Inspector: Iterating over a Set/Map is too slow"

   href="https://bugs.webkit.org/show_bug.cgi?id=152691">bug 152691</a>

              from <span class="vcard"><a class="email" href="mailto:utatane.tea&#64;gmail.com" title="Yusuke Suzuki &lt;utatane.tea&#64;gmail.com&gt;"> <span class="fn">Yusuke Suzuki</span></a>

</span></b>

        <pre>(In reply to <a href="show_bug.cgi?id=152691#c3">comment #3</a>)

<span class="quote">&gt; (In reply to <a href="show_bug.cgi?id=152691#c1">comment #1</a>)

&gt; &gt; I guess we can improve this by making Set.prototype.forEach written in JS.

&gt; &gt; Currently, it is implemented in C++, it avoid inlining the given callback.

&gt; 

&gt; But if you have many calls to Set.prototype.forEach, we're not going

&gt; to inline the call.

&gt; 

&gt; Is there a fast way to get an array of the elements in the set?

&gt; That way we could just iterate over that array?</span >

I created the small patch and measure the bottlenecks with Linux perf.

The test script, <a href="https://gist.github.com/Constellation/b747d897066ebbc567c8">https://gist.github.com/Constellation/b747d897066ebbc567c8</a>

And I executed the code with `jsc lodash.min.js benchmark.js above-code.js`.

And collect the perf result with `perf record jsc lodash.min.js benchmark.js above-code.js`.

Set forEach x 13,138 ops/sec ±0.21% (66 runs sampled)

And the perf report shows the following.

Samples: 23K of event 'cycles', Event count (approx.): 21446074385                                                                                                                                                                  

 34.04%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::Interpreter::execute(JSC::CallFrameClosure&amp;)                                                                                                                             

 20.48%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] vmEntryToJavaScript                                                                                                                                                           

  9.80%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::JITCode::execute(JSC::VM*, JSC::ProtoCallFrame*)                                                                                                                         

  7.95%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::setProtoFuncForEach(JSC::ExecState*)                                                                                                                                     

  5.65%  jsc  perf-22854.map                      [.] 0x00007f5d2c204a6f                                                                                                                                                            

  2.76%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::JSLock::currentThreadIsHoldingLock()                                                                                                                                     

  0.69%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::FTL::JITCode::executableAddressAtOffset(unsigned long)                                                                                                                   

  0.68%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] pthread_self&#64;plt                                                                                                                                                              

  0.65%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::Interpreter::startSampling()                                                                                                                                             

  0.61%  jsc  libpthread-2.19.so                  [.] pthread_self                                                                                                                                                                  

  0.60%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::Interpreter::stopSampling()                                                                                                                                              

  0.60%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] _ZN3JSC11Interpreter12stopSamplingEv&#64;plt                                                                                                                                      

  0.54%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] _ZN3JSC7JITCode7executeEPNS_2VMEPNS_14ProtoCallFrameE&#64;plt                                                                                                                     

  0.52%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] _ZN3JSC6JSLock26currentThreadIsHoldingLockEv&#64;plt                                                                                                                              

  0.38%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] _ZN3JSC11Interpreter7executeERNS_16CallFrameClosureE&#64;plt                                                                                                                      

  0.29%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] _ZN3JSC11Interpreter13startSamplingEv&#64;plt                                                                                                                                     

  0.25%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::Lexer&lt;unsigned char&gt;::lex(JSC::JSToken*, unsigned int, bool)

We can see that C++ -&gt; JS call becomes large performance overhead.

So, I've created the small change, adding builtin's Set.prototype.forEach. It results the following,

Set forEach x 20,827 ops/sec ±3.69% (40 runs sampled)

Samples: 23K of event 'cycles', Event count (approx.): 21255691651                                                                                                                                                                  

 62.91%  jsc  perf-22890.map                      [.] 0x00007fd117c0a3b9                                                                                                                                                            

 24.89%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::privateFuncSetIteratorNext(JSC::ExecState*)                                                                                                                              

  0.29%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::CodeBlock::updateAllPredictionsAndCountLiveness(unsigned int&amp;, unsigned int&amp;)                                                                                            

  0.24%  jsc  [vdso]                              [.] 0x00000000000008e8                                                                                                                                                            

  0.22%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::CodeBlock::predictedMachineCodeSize()                                                                                                                                    

  0.16%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] WTF::MetaAllocator::currentStatistics()                                                                                                                                       

  0.15%  jsc  libjavascriptcoregtk-4.0.so.18.3.1  [.] JSC::Lexer&lt;unsigned char&gt;::lex(JSC::JSToken*, unsigned int, bool)

The one of the bottlenecks was C++ -&gt; JS calls. It can be eliminated and now privateFuncSetIteratorNext (SetIterator's next op) becomes the main contribution.

It's still slow. But at least, dropping C++ -&gt; JS calls improve the performance; 1.6x.

$ WebKitBuild/set-map-old/Release/bin/jsc tmp/lodash.min.js tmp/benchmark.js tmp/hello.js

Linked List x 212,160 ops/sec ±0.39% (66 runs sampled)

Array x 376,348 ops/sec ±0.21% (66 runs sampled)

Array forEach x 20,054 ops/sec ±4.26% (49 runs sampled)

Set forEach x 13,294 ops/sec ±0.20% (67 runs sampled)

Fastest is Array

$ WebKitBuild/set-map/Release/bin/jsc tmp/lodash.min.js tmp/benchmark.js tmp/hello.js

Linked List x 210,151 ops/sec ±0.65% (66 runs sampled)

Array x 370,408 ops/sec ±0.51% (66 runs sampled)

Array forEach x 20,345 ops/sec ±4.35% (46 runs sampled)

Set forEach x 21,007 ops/sec ±3.51% (40 runs sampled)

Fastest is Array</pre>

        </div>

      </p>

      <hr>

      <span>You are receiving this mail because:</span>

      <ul>

          <li>You are the assignee for the bug.</li>

      </ul>

    </body>

</html>