[webkit-reviews] review requested: [Bug 69690] DFG does not have flow-sensitive intraprocedural control flow analysis : [Attachment 110283] the patch - more perf improvements
bugzilla-daemon at webkit.org
Sat Oct 8 15:30:24 PDT 2011
Filip Pizlo <fpizlo at apple.com> has asked for review:
Bug 69690: DFG does not have flow-sensitive intraprocedural control flow
analysis
https://bugs.webkit.org/show_bug.cgi?id=69690
Attachment 110283: the patch - more perf improvements
https://bugs.webkit.org/attachment.cgi?id=110283&action=review
------- Additional Comments from Filip Pizlo <fpizlo at apple.com>
- Did more tuning to the CFA itself. It now calls clobberStructures() less
frequently. clobberStructures() is an expensive, but necessary, method. Likely
any future work on improving CFA efficiency will focus on that method - how
it's called, and what it does internally.
- Added better debugging support. Every graph dump will now print the CFA state
in a concise and clear way.
- Fixed a bug where clobberStructures() was not being called for
possibly-side-effecting operations like LogicalNot on questionable objects, and
CompareXYZ on questionable objects.
- Did some investigation into what is going on in SunSpider. It turns out that
date-format-xparb, among others, is still failing speculation like crazy. A
lot of it is due to GetByVal not having a string property case. The
exponential backoff in recompilation is doing its best, but because these
benchmarks are so short-running, any increase in DFG runtime shows up directly
in the ones that fail speculation systematically.
- Ran performance numbers on a separate machine, to find that the speed-up is
smaller on my Mac Pro than it is on my MacBook Pro. But it's a speed-up on
both machines.
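The backoff point in the SunSpider item above can be illustrated with a toy model. This is not JSC's actual recompilation policy: the function name, the initial threshold, and the doubling factor are all assumptions made for illustration. It only shows why a doubling threshold still leaves several recompilations to pay for when speculation keeps failing.

```python
def recompiles_under_backoff(total_failures, initial_threshold=1, factor=2):
    """Toy model (hypothetical, not JSC code): recompile whenever the number
    of speculation failures since the last compile reaches the threshold,
    then multiply the threshold by `factor`."""
    threshold = initial_threshold
    failures_since_compile = 0
    recompiles = 0
    for _ in range(total_failures):
        failures_since_compile += 1
        if failures_since_compile >= threshold:
            recompiles += 1
            failures_since_compile = 0
            threshold *= factor  # exponential backoff
    return recompiles


if __name__ == "__main__":
    for failures in (10, 100, 1000):
        print(failures, "failures ->", recompiles_under_backoff(failures), "recompiles")
```

In this model the number of recompilations grows only logarithmically with the number of failures, but each one still costs a full DFG compile, which is hard to amortize in a benchmark that runs for a few milliseconds.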
Current performance situation:
- Overall 0.5% speed-up in geomean of preferred means.
- Slight reproducible sub-1% slow-down on SunSpider
- Either a 2.4% or a 0.5% win on V8 depending on machine.
- 1% win on Kraken.
Recommendation: we should land this, since: it's already a net win, the losses
on SunSpider are due to preexisting conditions for which we are currently
uninsured, and this should only become a bigger win as we add more machinery to
take advantage of it. (Like, we're not doing anything to take advantage of the
always-double propagation that this thing is already doing, and I haven't yet
implemented the is-constant propagation, which should be a win on function
checks among other things.) Not to mention that classically, CFA is most
profitable when combined with inlining.
Latest numbers on MacBook Pro:
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark
execution times with 95% confidence intervals in milliseconds.
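As a reading aid for the tables, here is a small sketch of how the "definitely" and "might be" annotations appear to relate to the 95% confidence intervals. This is inferred from the reported numbers, not taken from the benchmark harness: the function name is hypothetical, and both the non-overlap rule and the 1.01 cutoff for "might be" are assumptions that happen to match the rows checked.

```python
def describe(base_mean, base_ci, new_mean, new_ci, might_threshold=1.01):
    """Hypothetical reconstruction of the report's annotations.

    Assumption: "definitely" means the two confidence intervals do not
    overlap; "might be" means they overlap but the ratio of the means is
    at least `might_threshold`.
    """
    base_lo, base_hi = base_mean - base_ci, base_mean + base_ci
    new_lo, new_hi = new_mean - new_ci, new_mean + new_ci
    overlap = base_lo <= new_hi and new_lo <= base_hi

    if new_mean < base_mean:
        ratio, direction = base_mean / new_mean, "faster"
    else:
        ratio, direction = new_mean / base_mean, "slower"

    if not overlap:
        return f"definitely {ratio:.4f}x {direction}"
    if ratio >= might_threshold:
        return f"might be {ratio:.4f}x {direction}"
    return ""  # difference too small to call either way


if __name__ == "__main__":
    # date-format-xparb and access-binary-trees, MacBook Pro numbers.
    print(describe(9.5529, 0.1560, 9.9465, 0.1131))
    print(describe(1.7302, 0.0287, 1.6781, 0.0239))
```

Under these assumptions the sketch reproduces the annotations on the rows tried, but the harness itself is the authority on what the markers mean.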
                                     TipOfTree             CFA
SunSpider:
   3d-cube                           7.2512+-0.1027        7.2506+-0.1160
   3d-morph                          7.6656+-0.0813     ?  7.7124+-0.0978     ?
   3d-raytrace                       7.3843+-0.0984     ?  7.4021+-0.0967     ?
   access-binary-trees               1.7302+-0.0287        1.6781+-0.0239       might be 1.0310x faster
   access-fannkuch                   6.3504+-0.0672        6.3159+-0.0640
   access-nbody                      3.4521+-0.0469     ?  3.4932+-0.0531     ? might be 1.0119x slower
   access-nsieve                     2.5600+-0.0356     ?  2.5885+-0.0470     ? might be 1.0111x slower
   bitops-3bit-bits-in-byte          1.7154+-0.0262     ?  1.7434+-0.0319     ? might be 1.0163x slower
   bitops-bits-in-byte               2.7388+-0.0391     ?  2.7447+-0.0359     ?
   bitops-bitwise-and                3.3050+-0.0546     ?  3.3606+-0.0589     ? might be 1.0168x slower
   bitops-nsieve-bits                5.4385+-0.0653        5.4370+-0.0711
   controlflow-recursive             2.0667+-0.0212     ?  2.0812+-0.0208     ?
   crypto-aes                        6.6382+-0.1104     ?  6.6709+-0.1017     ?
   crypto-md5                        2.8355+-0.0466        2.8343+-0.0453
   crypto-sha1                       2.4700+-0.0373     ?  2.5155+-0.0443     ? might be 1.0184x slower
   date-format-tofte                 9.9355+-0.0996     ?  9.9652+-0.1371     ?
   date-format-xparb                 9.5529+-0.1560     !  9.9465+-0.1131     ! definitely 1.0412x slower
   math-cordic                       6.2958+-0.0680     ?  6.3863+-0.0523     ? might be 1.0144x slower
   math-partial-sums                 7.5041+-0.0695     ?  7.5846+-0.0878     ? might be 1.0107x slower
   math-spectral-norm                2.8198+-0.0398     ?  2.8726+-0.0465     ? might be 1.0187x slower
   regexp-dna                        10.6842+-0.0893       10.6180+-0.1092
   string-base64                     5.2352+-0.0858     ?  5.2951+-0.0554     ? might be 1.0114x slower
   string-fasta                      6.3476+-0.0765     ?  6.4334+-0.0947     ? might be 1.0135x slower
   string-tagcloud                   11.0721+-0.1259    ?  11.1451+-0.1129    ?
   string-unpack-code                20.9296+-0.2131    ?  21.3736+-0.3461    ? might be 1.0212x slower
   string-validate-input             6.3714+-0.1157        6.3281+-0.0869
   <arithmetic> *                    6.1673+-0.0147     !  6.2222+-0.0173     ! definitely 1.0089x slower
   <geometric>                       5.0674+-0.0157     !  5.1036+-0.0164     ! definitely 1.0072x slower
   <harmonic>                        4.1647+-0.0213     ?  4.1880+-0.0207     ?
                                     TipOfTree             CFA
V8:
   crypto                            71.9306+-0.2217       71.6923+-0.2342
   deltablue                         224.0349+-1.1196   ^  220.8591+-0.8745   ^ definitely 1.0144x faster
   earley-boyer                      90.8330+-0.2471    ?  91.0550+-0.6016    ?
   raytrace                          58.6492+-0.1942    ^  57.5860+-0.1636    ^ definitely 1.0185x faster
   regexp                            103.2842+-0.5738   ?  103.6217+-0.3030   ?
   richards                          204.7647+-0.4575   ^  178.8069+-0.4249   ^ definitely 1.1452x faster
   splay                             93.9206+-0.3154       93.8991+-0.3847
   <arithmetic>                      121.0596+-0.1853   ^  116.7886+-0.1891   ^ definitely 1.0366x faster
   <geometric> *                     107.9179+-0.1498   ^  105.3890+-0.1611   ^ definitely 1.0240x faster
   <harmonic>                        97.7053+-0.1498    ^  96.2554+-0.1560    ^ definitely 1.0151x faster
                                     TipOfTree             CFA
Kraken:
   ai-astar                          491.8722+-1.8657      491.6102+-2.1346
   audio-beat-detection              191.3896+-0.6444   !  193.1956+-0.5294   ! definitely 1.0094x slower
   audio-dft                         265.5784+-1.5956   ^  261.3193+-2.1896   ^ definitely 1.0163x faster
   audio-fft                         124.9334+-0.1853   ?  125.4338+-0.4450   ?
   audio-oscillator                  251.5153+-1.0876      250.4084+-1.0322
   imaging-darkroom                  412.2875+-0.8770   ?  412.7251+-0.8786   ?
   imaging-desaturate                229.8157+-0.4044   ^  216.2189+-0.3764   ^ definitely 1.0629x faster
   imaging-gaussian-blur             579.9277+-0.8575   ^  567.2564+-0.9921   ^ definitely 1.0223x faster
   json-parse-financial              53.8428+-0.2035    ?  54.0464+-0.2093    ?
   json-stringify-tinderbox          67.6931+-0.2539    ?  67.7358+-0.2734    ?
   stanford-crypto-aes               130.2188+-0.9518      129.4420+-0.8691
   stanford-crypto-ccm               99.4847+-0.3260    !  101.1548+-0.4164   ! definitely 1.0168x slower
   stanford-crypto-pbkdf2            187.6499+-0.8004   ?  188.8759+-1.0071   ?
   stanford-crypto-sha256-iterative  69.9878+-0.1863       69.6543+-0.1668
   <arithmetic> *                    225.4426+-0.3118   ^  223.5055+-0.3132   ^ definitely 1.0087x faster
   <geometric>                       175.7162+-0.2682   ^  174.8010+-0.2755   ^ definitely 1.0052x faster
   <harmonic>                        136.9772+-0.2368      136.7825+-0.2282
                                     TipOfTree             CFA
All benchmarks:
   <arithmetic>                      88.5950+-0.1068    ^  87.4122+-0.1004    ^ definitely 1.0135x faster
   <geometric>                       22.9797+-0.0474       22.9534+-0.0477
   <harmonic>                        7.3245+-0.0367     ?  7.3629+-0.0354     ?
                                     TipOfTree             CFA
Geomean of preferred means:
   <scaled-result>                   53.1381+-0.0625    ^  52.7237+-0.0651    ^ definitely 1.0079x faster
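For readers checking the summary line, here is a quick sketch of how the "geomean of preferred means" can be recomputed from the MacBook Pro numbers. It assumes the preferred mean for each suite is the one marked with `*` (arithmetic for SunSpider and Kraken, geometric for V8); the variable and function names are made up for this example.

```python
from statistics import geometric_mean

# Starred ("preferred") means from the MacBook Pro report, in milliseconds.
# Assumption: the "*" marks which mean is preferred for each suite.
preferred = {
    "TipOfTree": {"SunSpider": 6.1673, "V8": 107.9179, "Kraken": 225.4426},
    "CFA":       {"SunSpider": 6.2222, "V8": 105.3890, "Kraken": 223.5055},
}


def scaled_result(means_by_suite):
    """Geometric mean of the per-suite preferred means."""
    return geometric_mean(list(means_by_suite.values()))


tip = scaled_result(preferred["TipOfTree"])
cfa = scaled_result(preferred["CFA"])
speedup = tip / cfa

if __name__ == "__main__":
    print(f"TipOfTree {tip:.2f} ms, CFA {cfa:.2f} ms, ratio {speedup:.4f}")
```

Recomputed this way, both values agree with the reported scaled results to about two decimal places, and the ratio comes out close to the reported 1.0079x.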
Latest numbers on Mac Pro:
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/fromMiniMe/OpenSource/WebKitBuild/Release/jsc
Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark
execution times with 95% confidence intervals in milliseconds.
                                     TipOfTree             CFA
SunSpider:
   3d-cube                           7.9004+-0.0265     !  7.9902+-0.0335     ! definitely 1.0114x slower
   3d-morph                          8.4615+-0.1076     ?  8.4887+-0.1116     ?
   3d-raytrace                       8.0821+-0.1095        8.0724+-0.0618
   access-binary-trees               1.7927+-0.0033        1.7806+-0.0161
   access-fannkuch                   7.6813+-0.0148     !  7.9859+-0.0254     ! definitely 1.0397x slower
   access-nbody                      4.1641+-0.0068     !  4.1778+-0.0037     ! definitely 1.0033x slower
   access-nsieve                     3.1422+-0.0032     !  3.1584+-0.0066     ! definitely 1.0052x slower
   bitops-3bit-bits-in-byte          1.7320+-0.0103     ?  1.7441+-0.0089     ?
   bitops-bits-in-byte               5.2152+-0.0453     ?  5.2206+-0.0533     ?
   bitops-bitwise-and                3.4118+-0.0640        3.3993+-0.0644
   bitops-nsieve-bits                5.6763+-0.0392        5.6438+-0.0351
   controlflow-recursive             2.2564+-0.0244        2.2509+-0.0028
   crypto-aes                        6.9132+-0.0689     !  7.0664+-0.0630     ! definitely 1.0222x slower
   crypto-md5                        3.0181+-0.0343     ?  3.0585+-0.0331     ? might be 1.0134x slower
   crypto-sha1                       2.7090+-0.0109     !  2.7557+-0.0108     ! definitely 1.0172x slower
   date-format-tofte                 10.6874+-0.0850    ?  10.7886+-0.1017    ?
   date-format-xparb                 10.7804+-0.2653    ?  11.2082+-0.1793    ? might be 1.0397x slower
   math-cordic                       7.2695+-0.0506        7.2238+-0.0219
   math-partial-sums                 10.4781+-0.0501       10.4767+-0.0339
   math-spectral-norm                3.2288+-0.0168     !  3.2505+-0.0036     ! definitely 1.0067x slower
   regexp-dna                        12.4863+-0.1218    ?  12.5624+-0.1253    ?
   string-base64                     5.4727+-0.0648     ?  5.5304+-0.0697     ? might be 1.0105x slower
   string-fasta                      7.3632+-0.0576        7.3506+-0.0232
   string-tagcloud                   12.8958+-0.0586       12.8081+-0.0688
   string-unpack-code                23.9726+-0.1714    ?  24.2456+-0.1831    ? might be 1.0114x slower
   string-validate-input             6.8011+-0.1062        6.7479+-0.0643
   <arithmetic> *                    7.0612+-0.0244     !  7.1149+-0.0199     ! definitely 1.0076x slower
   <geometric>                       5.7812+-0.0155     !  5.8176+-0.0134     ! definitely 1.0063x slower
   <harmonic>                        4.6961+-0.0111     ?  4.7189+-0.0120     ?
                                     TipOfTree             CFA
V8:
   crypto                            80.3328+-0.2637       79.9557+-0.1580
   deltablue                         252.4219+-1.5655   ?  253.0487+-1.3899   ?
   earley-boyer                      110.3867+-0.3100   !  111.4466+-0.2599   ! definitely 1.0096x slower
   raytrace                          66.3575+-0.3105    ^  64.6593+-0.1899    ^ definitely 1.0263x faster
   regexp                            123.6966+-0.5882      123.6552+-0.5325
   richards                          221.5944+-0.9748   ^  215.4157+-1.0238   ^ definitely 1.0287x faster
   splay                             122.9378+-0.7662   ?  124.5312+-0.8843   ? might be 1.0130x slower
   <arithmetic>                      139.6754+-0.4149      138.9589+-0.3738
   <geometric> *                     125.8663+-0.2769   ^  125.2510+-0.2812   ^ definitely 1.0049x faster
   <harmonic>                        114.2437+-0.1976   ^  113.5287+-0.2205   ^ definitely 1.0063x faster
                                     TipOfTree             CFA
Kraken:
   ai-astar                          832.8773+-0.4054      825.4636+-11.7787
   audio-beat-detection              214.1596+-1.7669      213.3784+-0.9437
   audio-dft                         271.3054+-4.2497   ^  260.8641+-3.0615   ^ definitely 1.0400x faster
   audio-fft                         135.4391+-0.1603   !  139.2097+-1.1166   ! definitely 1.0278x slower
   audio-oscillator                  293.9338+-3.0579      292.1754+-1.9142
   imaging-darkroom                  486.8431+-4.6328      484.2544+-3.7257
   imaging-desaturate                244.8629+-0.2524   ^  237.8543+-0.0918   ^ definitely 1.0295x faster
   imaging-gaussian-blur             641.8301+-0.2648   ^  611.2323+-0.2040   ^ definitely 1.0501x faster
   json-parse-financial              68.4385+-0.1365    ?  68.6702+-0.2375    ?
   json-stringify-tinderbox          79.7739+-0.3035    ?  80.1401+-0.2522    ?
   stanford-crypto-aes               152.6437+-1.2898      151.5830+-1.4340
   stanford-crypto-ccm               116.1246+-0.7775   ?  117.1616+-0.5738   ?
   stanford-crypto-pbkdf2            235.4855+-1.7872      232.6827+-1.6888     might be 1.0120x faster
   stanford-crypto-sha256-iterative  85.6205+-0.2548    ?  86.0403+-0.3846    ?
   <arithmetic> *                    275.6670+-0.6405   ^  271.4793+-0.9585   ^ definitely 1.0154x faster
   <geometric>                       208.1917+-0.4363   ^  206.5505+-0.5262   ^ definitely 1.0079x faster
   <harmonic>                        162.2625+-0.2764      162.1470+-0.4291
                                     TipOfTree             CFA
All benchmarks:
   <arithmetic>                      106.8225+-0.2224   ^  105.4981+-0.2833   ^ definitely 1.0126x faster
   <geometric>                       26.6011+-0.0546    ?  26.6112+-0.0438    ?
   <harmonic>                        8.2687+-0.0192     ?  8.3073+-0.0206     ?
                                     TipOfTree             CFA
Geomean of preferred means:
   <scaled-result>                   62.5735+-0.1251    ^  62.3100+-0.0910    ^ definitely 1.0042x faster