[webkit-reviews] review requested: [Bug 68316] DFG JIT does not have full block-local CSE : [Attachment 107778] the patch - reduced worst case performance a bit
bugzilla-daemon at webkit.org
Sat Sep 17 16:31:30 PDT 2011
Filip Pizlo <fpizlo at apple.com> has asked for review:
Bug 68316: DFG JIT does not have full block-local CSE
https://bugs.webkit.org/show_bug.cgi?id=68316
Attachment 107778: the patch - reduced worst case performance a bit
https://bugs.webkit.org/attachment.cgi?id=107778&action=review
------- Additional Comments from Filip Pizlo <fpizlo at apple.com>
This improves the CSE algorithm by ensuring that pure CSE only searches for
aliases in the range:
From: Beginning of the basic block, or the maximum index of the children,
whichever is larger.
To: Last occurrence of a node with the same opcode.
It searches this range backwards. This optimization does not apply to
non-pure operations, because restricting the range would miss side effects.
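
The bounded backward search described above can be sketched roughly as follows. This is a minimal illustration in Python, not the DFG code from the patch: the Node class, the pure_cse function, and the opcode strings are all hypothetical names, nodes are assumed to live in a flat list where children are referenced by index, and every opcode in the example is assumed to be pure.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op: str                 # opcode name (hypothetical, for illustration only)
    children: tuple = ()    # indices of earlier nodes used as operands


def pure_cse(nodes, block_start=0):
    """Map each redundant pure node index to the earlier node it duplicates."""
    last_seen = {}      # opcode -> index of its last surviving occurrence
    replacement = {}    # redundant node index -> equivalent earlier index
    for i in range(block_start, len(nodes)):
        node = nodes[i]
        # Rewrite operands through replacements already discovered.
        node.children = tuple(replacement.get(c, c) for c in node.children)
        upper = last_seen.get(node.op)
        if upper is not None:
            # Lower bound: block start or the largest child index,
            # whichever is larger. A match cannot precede its operands.
            lower = max([block_start, *node.children])
            # Walk the range backwards, from the last node with this opcode.
            for j in range(upper, lower - 1, -1):
                other = nodes[j]
                if other.op == node.op and other.children == node.children:
                    replacement[i] = j
                    break
        if i not in replacement:
            last_seen[node.op] = i
    return replacement


# a + b is computed twice, then the two results are multiplied.
example = [
    Node("GetLocalA"),
    Node("GetLocalB"),
    Node("Add", (0, 1)),
    Node("Add", (0, 1)),
    Node("Mul", (2, 3)),
]
result = pure_cse(example)
```

Here the second Add is mapped to the first, and the Mul ends up with both operands pointing at node 2. The search for the second Add only visits indices 2 down to 1, instead of scanning the whole block.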
Current numbers with latest changes.
Benchmark report for SunSpider, V8, and Kraken.
VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PhantomCSE" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc
Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark.
Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting
benchmark execution times with 95% confidence intervals in milliseconds.
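
For readers checking the verdict column in the tables below, here is a rough sketch of how the annotations appear to be derived. The rule is inferred from the reported numbers, not taken from the benchmark script: "definitely" is assumed to mean that the two 95% confidence intervals do not overlap, and "might be" is assumed to mean that they overlap but the ratio of the means is at least 1.01. The compare function and its threshold parameter are hypothetical.

```python
def compare(base_mean, base_ci, new_mean, new_ci, threshold=1.01):
    """Return a verdict string for a timing change (lower time is better)."""
    ratio = max(base_mean, new_mean) / min(base_mean, new_mean)
    direction = "faster" if new_mean < base_mean else "slower"
    # Each interval is mean +- half-width. Assumption: the report says
    # "definitely" only when the two intervals are disjoint.
    overlap = (base_mean - base_ci <= new_mean + new_ci
               and new_mean - new_ci <= base_mean + base_ci)
    if not overlap:
        return f"definitely {ratio:.4f}x {direction}"
    # Assumption: overlapping intervals are flagged only past a 1% change.
    if ratio >= threshold:
        return f"might be {ratio:.4f}x {direction}"
    return ""


# access-fannkuch row from the SunSpider table
fannkuch = compare(11.7960, 0.1183, 11.5626, 0.0988)
# access-binary-trees row from the SunSpider table
binary_trees = compare(2.2740, 0.0315, 2.3107, 0.0482)
```

With the figures from this report, the first call yields "definitely 1.0202x faster" and the second yields "might be 1.0161x slower", matching the corresponding rows.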
                                        TipOfTree               PhantomCSE
SunSpider:
   3d-cube                             7.7572+-0.0872   ?      7.7909+-0.1244   ?
   3d-morph                            7.5184+-0.0785   ?      7.5558+-0.0963   ?
   3d-raytrace                         7.7239+-0.1032   ?      7.7468+-0.1156   ?
   access-binary-trees                 2.2740+-0.0315   ?      2.3107+-0.0482   ? might be 1.0161x slower
   access-fannkuch                    11.7960+-0.1183   ^     11.5626+-0.0988   ^ definitely 1.0202x faster
   access-nbody                        4.1633+-0.0454   ?      4.2540+-0.0832   ? might be 1.0218x slower
   access-nsieve                       2.6018+-0.0398   ?      2.6032+-0.0310   ?
   bitops-3bit-bits-in-byte            1.6668+-0.0267   ?      1.7083+-0.0400   ? might be 1.0249x slower
   bitops-bits-in-byte                 2.7696+-0.0319   ?      2.8233+-0.0574   ? might be 1.0194x slower
   bitops-bitwise-and                  3.5817+-0.0621   ?      3.6520+-0.0629   ? might be 1.0196x slower
   bitops-nsieve-bits                  5.2870+-0.0667   ?      5.2977+-0.0682   ?
   controlflow-recursive               2.0383+-0.0401   ?      2.1312+-0.0930   ? might be 1.0456x slower
   crypto-aes                          7.0860+-0.2158   ?      7.1663+-0.1855   ? might be 1.0113x slower
   crypto-md5                          2.8093+-0.0448   ?      2.8269+-0.0811   ?
   crypto-sha1                         2.2280+-0.0402   ?      2.2414+-0.0501   ?
   date-format-tofte                  10.2269+-0.1716         10.2042+-0.1426
   date-format-xparb                   8.7081+-0.1255   ?      8.8777+-0.1652   ? might be 1.0195x slower
   math-cordic                         6.1956+-0.0690          6.1295+-0.0601     might be 1.0108x faster
   math-partial-sums                   7.4034+-0.0937          7.3807+-0.0997
   math-spectral-norm                  2.5948+-0.0337   ?      2.6117+-0.0482   ?
   regexp-dna                         10.8661+-0.1062   ?     10.9373+-0.1211   ?
   string-base64                       5.8322+-0.1108          5.8219+-0.1171
   string-fasta                        6.9385+-0.1083   ?      7.0151+-0.1314   ? might be 1.0110x slower
   string-tagcloud                    12.1121+-0.1935   ?     12.3550+-0.2062   ? might be 1.0201x slower
   string-unpack-code                 18.8462+-0.3793         18.5825+-0.2283     might be 1.0142x faster
   string-validate-input               6.6845+-0.1124   ?      6.7245+-0.1514   ?

   <arithmetic>                        6.4504+-0.0253   ?      6.4735+-0.0239   ?
   <geometric>                         5.3125+-0.0181   !      5.3515+-0.0169   ! definitely 1.0073x slower
   <harmonic>                          4.3312+-0.0168   !      4.3786+-0.0162   ! definitely 1.0109x slower

                                        TipOfTree               PhantomCSE
V8:
   crypto                             82.7124+-0.2753   ?     82.9466+-0.3673   ?
   deltablue                         239.2213+-0.8353   ?    239.8840+-1.0250   ?
   earley-boyer                       94.9496+-0.1586   ?     95.3712+-0.4045   ?
   raytrace                           69.0085+-0.2725         68.9221+-0.4887
   regexp                            107.3699+-0.7440        107.0229+-0.3693
   richards                          217.7471+-0.5430   !    219.5234+-0.5972   ! definitely 1.0082x slower
   splay                              98.7746+-0.3300         98.7596+-0.2519

   <arithmetic>                      129.9691+-0.2003   ?    130.3471+-0.2048   ?
   <geometric>                       116.9542+-0.1749   ?    117.1789+-0.1988   ?
   <harmonic>                        107.1496+-0.1678   ?    107.2769+-0.2259   ?

                                        TipOfTree               PhantomCSE
Kraken:
   ai-astar                          636.9671+-4.2157   ^    629.5812+-1.9895   ^ definitely 1.0117x faster
   audio-beat-detection              467.7131+-1.0341   !    470.1425+-1.0160   ! definitely 1.0052x slower
   audio-dft                         425.3115+-3.7983        423.8902+-2.1948
   audio-fft                         364.0817+-0.6708        363.9339+-1.0152
   audio-oscillator                  312.5008+-0.3736   ?    312.5853+-0.5099   ?
   imaging-darkroom                  413.9106+-0.8494   ^    410.8664+-0.5848   ^ definitely 1.0074x faster
   imaging-desaturate                207.8968+-0.4679   !    218.0565+-0.4687   ! definitely 1.0489x slower
   imaging-gaussian-blur            1081.8128+-1.7363   ^    589.7422+-1.1866   ^ definitely 1.8344x faster
   json-parse-financial               49.2932+-0.1803   ?     49.7524+-0.2908   ?
   json-stringify-tinderbox           69.6689+-1.5041   ^     67.6637+-0.3880   ^ definitely 1.0296x faster
   stanford-crypto-aes               144.3012+-0.4594        144.2128+-0.4487
   stanford-crypto-ccm               111.9439+-0.3347        111.7710+-0.3905
   stanford-crypto-pbkdf2            396.0157+-1.5465   ?    397.0790+-1.5456   ?
   stanford-crypto-sha256-iterative  149.2708+-0.4787   !    150.4953+-0.5297   ! definitely 1.0082x slower

   <arithmetic>                      345.0492+-0.4615   ^    309.9837+-0.3411   ^ definitely 1.1131x faster
   <geometric>                       252.9574+-0.3953   ^    242.5861+-0.2959   ^ definitely 1.0428x faster
   <harmonic>                        175.1667+-0.5736   ^    173.5501+-0.3501   ^ definitely 1.0093x faster

                                        TipOfTree               PhantomCSE
All benchmarks:
   <arithmetic>                      125.7060+-0.1391   ^    115.3301+-0.1168   ^ definitely 1.0900x faster
   <geometric>                        26.6087+-0.0437   ^     26.3931+-0.0411   ^ definitely 1.0082x faster
   <harmonic>                          7.6444+-0.0289   !      7.7252+-0.0277   ! definitely 1.0106x slower
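
The three summary rows weight the benchmarks differently, which is why the "All benchmarks" verdicts can disagree: the arithmetic mean is dominated by long-running tests such as imaging-gaussian-blur, while the harmonic mean is dominated by the short SunSpider tests. The sketch below computes the three means from the per-benchmark V8 times in the TipOfTree column. It is only an illustration: the means function is hypothetical, and the benchmark script may compute its summaries per sample rather than from the per-benchmark means, so the geometric and harmonic values are expected to agree with the report only approximately.

```python
import math


def means(times):
    """Return (arithmetic, geometric, harmonic) means of positive times."""
    n = len(times)
    arithmetic = sum(times) / n
    geometric = math.exp(sum(math.log(t) for t in times) / n)
    harmonic = n / sum(1.0 / t for t in times)
    return arithmetic, geometric, harmonic


# Per-benchmark V8 times (ms) for TipOfTree, copied from the table above.
v8_tip_of_tree = [82.7124, 239.2213, 94.9496, 69.0085,
                  107.3699, 217.7471, 98.7746]
arithmetic, geometric, harmonic = means(v8_tip_of_tree)
```

For these inputs the arithmetic mean comes out at about 129.969, in line with the reported <arithmetic> value for V8, and the harmonic mean is the smallest of the three.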