[webkit-reviews] review requested: [Bug 68316] DFG JIT does not have full block-local CSE : [Attachment 107778] the patch - reduced worst case performance a bit

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sat Sep 17 16:31:30 PDT 2011


Filip Pizlo <fpizlo at apple.com> has asked for review:
Bug 68316: DFG JIT does not have full block-local CSE
https://bugs.webkit.org/show_bug.cgi?id=68316

Attachment 107778: the patch - reduced worst case performance a bit
https://bugs.webkit.org/attachment.cgi?id=107778&action=review

------- Additional Comments from Filip Pizlo <fpizlo at apple.com>
This improves the CSE algorithm by ensuring that pure CSE only searches for
aliases in the range:

From: Beginning of the basic block, or the maximum index of the children,
whichever is bigger.
To: Last occurrence of a node with the same opcode.

It searches this range backwards.  This optimization does not apply for
non-pure operations because it would miss side effects.
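
The bounded backward search described above can be sketched as follows. This
is a minimal illustration in Python, not the code in the patch: the Node
class, find_pure_alias(), block_local_cse(), and the opcode names are all
hypothetical, and every node is assumed to be pure. Recording only surviving
nodes as the "last occurrence" is a choice made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical IR node: an opcode plus the indices of its child nodes.
    opcode: str
    children: tuple = ()


def find_pure_alias(nodes, index, block_start, last_seen):
    """Search backwards for an earlier node equivalent to nodes[index]."""
    node = nodes[index]
    upper = last_seen.get(node.opcode)  # last node with the same opcode
    if upper is None:
        return None
    # An alias cannot precede its own children, so stop at the largest
    # child index or at the start of the block, whichever is bigger.
    lower = max([block_start, *node.children])
    for i in range(upper, lower - 1, -1):
        candidate = nodes[i]
        if candidate.opcode == node.opcode and candidate.children == node.children:
            return i
    return None


def block_local_cse(nodes, block_start=0):
    """Return a map from each redundant node index to the node it aliases."""
    last_seen = {}    # opcode -> index of the last surviving node
    replacement = {}  # eliminated index -> alias index
    for index in range(block_start, len(nodes)):
        node = nodes[index]
        # Point children at their aliases so equal expressions compare equal.
        node.children = tuple(replacement.get(c, c) for c in node.children)
        alias = find_pure_alias(nodes, index, block_start, last_seen)
        if alias is not None:
            replacement[index] = alias
        else:
            last_seen[node.opcode] = index
    return replacement


block = [
    Node("ArgA"), Node("ArgB"),
    Node("Add", (0, 1)), Node("Mul", (2, 0)),
    Node("Add", (0, 1)), Node("Mul", (4, 0)),
]
result = block_local_cse(block)
```

In this example the second Add is found by scanning only indices 2 and 1, and
the second Mul becomes redundant once its child is rewritten to the first Add.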

Current numbers with latest changes.

Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"PhantomCSE" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc

Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark
execution times with 95% confidence intervals in milliseconds.
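
As a reading aid for the verdict column, here is a small Python sketch of how
a row could be classified. The report does not state the harness's actual
rule; the sketch assumes that "definitely" means the two 95% confidence
intervals do not overlap and "might be" means they do, which is consistent
with rows such as imaging-gaussian-blur and access-binary-trees. The
compare() function is hypothetical, and the sketch does not model the
harness leaving the verdict blank for very small differences.

```python
def compare(base_mean, base_ci, new_mean, new_ci):
    """Classify a new measurement against a baseline (times in ms)."""
    if new_mean < base_mean:
        direction, ratio = "faster", base_mean / new_mean
    else:
        direction, ratio = "slower", new_mean / base_mean
    # Assumption: overlapping intervals mean the difference is uncertain.
    overlap = (base_mean - base_ci <= new_mean + new_ci
               and new_mean - new_ci <= base_mean + base_ci)
    certainty = "might be" if overlap else "definitely"
    return direction, certainty, ratio


# imaging-gaussian-blur: TipOfTree vs. PhantomCSE
blur = compare(1081.8128, 1.7363, 589.7422, 1.1866)
# access-binary-trees: TipOfTree vs. PhantomCSE
trees = compare(2.2740, 0.0315, 2.3107, 0.0482)
```

The ratio is always the larger mean divided by the smaller one, so it reads
as "N times faster" or "N times slower" depending on the direction.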

                                            TipOfTree            PhantomCSE

SunSpider:
   3d-cube                             7.7572+-0.0872  ?     7.7909+-0.1244  ?
   3d-morph                            7.5184+-0.0785  ?     7.5558+-0.0963  ?
   3d-raytrace                         7.7239+-0.1032  ?     7.7468+-0.1156  ?
   access-binary-trees                 2.2740+-0.0315  ?     2.3107+-0.0482  ? might be 1.0161x slower
   access-fannkuch                    11.7960+-0.1183  ^    11.5626+-0.0988  ^ definitely 1.0202x faster
   access-nbody                        4.1633+-0.0454  ?     4.2540+-0.0832  ? might be 1.0218x slower
   access-nsieve                       2.6018+-0.0398  ?     2.6032+-0.0310  ?
   bitops-3bit-bits-in-byte            1.6668+-0.0267  ?     1.7083+-0.0400  ? might be 1.0249x slower
   bitops-bits-in-byte                 2.7696+-0.0319  ?     2.8233+-0.0574  ? might be 1.0194x slower
   bitops-bitwise-and                  3.5817+-0.0621  ?     3.6520+-0.0629  ? might be 1.0196x slower
   bitops-nsieve-bits                  5.2870+-0.0667  ?     5.2977+-0.0682  ?
   controlflow-recursive               2.0383+-0.0401  ?     2.1312+-0.0930  ? might be 1.0456x slower
   crypto-aes                          7.0860+-0.2158  ?     7.1663+-0.1855  ? might be 1.0113x slower
   crypto-md5                          2.8093+-0.0448  ?     2.8269+-0.0811  ?
   crypto-sha1                         2.2280+-0.0402  ?     2.2414+-0.0501  ?
   date-format-tofte                  10.2269+-0.1716       10.2042+-0.1426
   date-format-xparb                   8.7081+-0.1255  ?     8.8777+-0.1652  ? might be 1.0195x slower
   math-cordic                         6.1956+-0.0690        6.1295+-0.0601    might be 1.0108x faster
   math-partial-sums                   7.4034+-0.0937        7.3807+-0.0997
   math-spectral-norm                  2.5948+-0.0337  ?     2.6117+-0.0482  ?
   regexp-dna                         10.8661+-0.1062  ?    10.9373+-0.1211  ?
   string-base64                       5.8322+-0.1108        5.8219+-0.1171
   string-fasta                        6.9385+-0.1083  ?     7.0151+-0.1314  ? might be 1.0110x slower
   string-tagcloud                    12.1121+-0.1935  ?    12.3550+-0.2062  ? might be 1.0201x slower
   string-unpack-code                 18.8462+-0.3793       18.5825+-0.2283    might be 1.0142x faster
   string-validate-input               6.6845+-0.1124  ?     6.7245+-0.1514  ?

   <arithmetic>                        6.4504+-0.0253  ?     6.4735+-0.0239  ?
   <geometric>                         5.3125+-0.0181  !     5.3515+-0.0169  ! definitely 1.0073x slower
   <harmonic>                          4.3312+-0.0168  !     4.3786+-0.0162  ! definitely 1.0109x slower

                                            TipOfTree            PhantomCSE

V8:
   crypto                             82.7124+-0.2753  ?    82.9466+-0.3673  ?
   deltablue                         239.2213+-0.8353  ?   239.8840+-1.0250  ?
   earley-boyer                       94.9496+-0.1586  ?    95.3712+-0.4045  ?
   raytrace                           69.0085+-0.2725       68.9221+-0.4887
   regexp                            107.3699+-0.7440      107.0229+-0.3693
   richards                          217.7471+-0.5430  !   219.5234+-0.5972  ! definitely 1.0082x slower
   splay                              98.7746+-0.3300       98.7596+-0.2519

   <arithmetic>                      129.9691+-0.2003  ?   130.3471+-0.2048  ?
   <geometric>                       116.9542+-0.1749  ?   117.1789+-0.1988  ?
   <harmonic>                        107.1496+-0.1678  ?   107.2769+-0.2259  ?

                                            TipOfTree            PhantomCSE

Kraken:
   ai-astar                          636.9671+-4.2157  ^   629.5812+-1.9895  ^ definitely 1.0117x faster
   audio-beat-detection              467.7131+-1.0341  !   470.1425+-1.0160  ! definitely 1.0052x slower
   audio-dft                         425.3115+-3.7983      423.8902+-2.1948
   audio-fft                         364.0817+-0.6708      363.9339+-1.0152
   audio-oscillator                  312.5008+-0.3736  ?   312.5853+-0.5099  ?
   imaging-darkroom                  413.9106+-0.8494  ^   410.8664+-0.5848  ^ definitely 1.0074x faster
   imaging-desaturate                207.8968+-0.4679  !   218.0565+-0.4687  ! definitely 1.0489x slower
   imaging-gaussian-blur            1081.8128+-1.7363  ^   589.7422+-1.1866  ^ definitely 1.8344x faster
   json-parse-financial               49.2932+-0.1803  ?    49.7524+-0.2908  ?
   json-stringify-tinderbox           69.6689+-1.5041  ^    67.6637+-0.3880  ^ definitely 1.0296x faster
   stanford-crypto-aes               144.3012+-0.4594      144.2128+-0.4487
   stanford-crypto-ccm               111.9439+-0.3347      111.7710+-0.3905
   stanford-crypto-pbkdf2            396.0157+-1.5465  ?   397.0790+-1.5456  ?
   stanford-crypto-sha256-iterative  149.2708+-0.4787  !   150.4953+-0.5297  ! definitely 1.0082x slower

   <arithmetic>                      345.0492+-0.4615  ^   309.9837+-0.3411  ^ definitely 1.1131x faster
   <geometric>                       252.9574+-0.3953  ^   242.5861+-0.2959  ^ definitely 1.0428x faster
   <harmonic>                        175.1667+-0.5736  ^   173.5501+-0.3501  ^ definitely 1.0093x faster

                                            TipOfTree            PhantomCSE

All benchmarks:
   <arithmetic>                      125.7060+-0.1391  ^   115.3301+-0.1168  ^ definitely 1.0900x faster
   <geometric>                        26.6087+-0.0437  ^    26.3931+-0.0411  ^ definitely 1.0082x faster
   <harmonic>                          7.6444+-0.0289  !     7.7252+-0.0277  ! definitely 1.0106x slower

