[webkit-reviews] review requested: [Bug 69690] DFG does not have flow-sensitive intraprocedural control flow analysis : [Attachment 110283] the patch - more perf improvements

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sat Oct 8 15:30:24 PDT 2011


Filip Pizlo <fpizlo at apple.com> has asked  for review:
Bug 69690: DFG does not have flow-sensitive intraprocedural control flow
analysis
https://bugs.webkit.org/show_bug.cgi?id=69690

Attachment 110283: the patch - more perf improvements
https://bugs.webkit.org/attachment.cgi?id=110283&action=review

------- Additional Comments from Filip Pizlo <fpizlo at apple.com>
- Did more tuning to the CFA itself. It now calls clobberStructures() less
frequently. clobberStructures() is an expensive, but necessary, method. Likely
any future work on improving CFA efficiency will focus on that method - how
it's called, and what it does internally.

- Added better debugging support. Every graph dump will now print the CFA state
in a concise and clear way.

- Fixed a bug where clobberStructures() was not being called for
possibly-side-effecting operations like LogicalNot on questionable objects, and
CompareXYZ on questionable objects.

- Did some investigation into what is going on in SunSpider. It turns out that
date-format-xparb, among others, is still failing speculation like crazy. A
lot of it is due to GetByVal not having a string property case. The
exponential backoff in recompilation is doing its best, but the benchmark is
so short-running that any increase in DFG runtime will hurt it, as it will any
short-running benchmark that fails speculation systematically.

- Ran performance numbers on a separate machine, to find that the speed-up is
smaller on my Mac Pro than it is on my MacBook Pro.  But it's a speed-up on
both machines.
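
To make the clobberStructures() notes above concrete, here is a toy model of
what "clobbering" means in a flow-sensitive analysis. It is only a sketch under
stated assumptions: AbstractState, clobber_structures, execute and the
operand_proven_safe flag are hypothetical names invented for illustration, not
the DFG's actual classes or API, and the op names merely stand in for
"LogicalNot" and "CompareXYZ".

```python
# Toy model only: none of these names are taken from JavaScriptCore.
from dataclasses import dataclass, field

TOP = None  # abstract value meaning "could be any structure"


@dataclass
class AbstractState:
    # Maps a local variable name to the set of structures it is proven to
    # have, or TOP when nothing is known.
    structures: dict = field(default_factory=dict)
    clobber_count: int = 0

    def clobber_structures(self):
        # Unknown code may have run, so every structure fact is discarded.
        # The walk touches every tracked variable, which is why calling it
        # less often is cheaper.
        for name in self.structures:
            self.structures[name] = TOP
        self.clobber_count += 1


# Hypothetical stand-ins for operations that may have side effects when the
# operand's type has not been proven.
POSSIBLY_SIDE_EFFECTING = {"LogicalNot", "CompareLess", "CompareEq"}


def execute(state, op, operand_proven_safe):
    # Clobber only when the operation could have side effects on this operand.
    if op in POSSIBLY_SIDE_EFFECTING and not operand_proven_safe:
        state.clobber_structures()


state = AbstractState({"o": {"S1"}, "p": {"S2"}})
execute(state, "LogicalNot", operand_proven_safe=True)   # facts survive
execute(state, "LogicalNot", operand_proven_safe=False)  # facts are dropped
```

In this model the bug described above corresponds to execute() forgetting the
second case, which would leave stale structure facts in the state.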

Current performance situation:

- Overall 0.5% speed-up in geomean of preferred means.

- Slight reproducible sub-1% slow-down on SunSpider.

- Either a 2.4% or a 0.5% win on V8 depending on machine.

- 1% win on Kraken.

Recommendation: we should land this, since: it's already a net win, the losses
on SunSpider are due to preexisting conditions for which we are currently
uninsured, and this should only become a bigger win as we add more machinery to
take advantage of it. (Like, we're not doing anything to take advantage of the
always-double propagation that this thing is already doing, and I haven't yet
implemented the is-constant propagation, which should be a win on function
checks among other things.) Not to mention that classically, CFA is most
profitable when combined with inlining.

Latest numbers on MacBook Pro:


Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc
"CFA" at /Volumes/Data/pizlo/OpenSource/WebKitBuild/Release/jsc

Collected 30 samples per benchmark/VM, with 10 VM invocations per benchmark.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark
execution times with 95% confidence intervals in milliseconds.
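
One way to read the "definitely" and "?" annotations in the report is as an
interval-overlap check on the two "mean+-half-width" values. The sketch below
assumes exactly that rule (it is a guess that agrees with rows such as
date-format-xparb, not the harness's documented logic); compare() is a
hypothetical helper, not part of the benchmark scripts.

```python
def compare(base_mean, base_ci, new_mean, new_ci):
    """Classify two timings given as mean and confidence half-width (ms).

    Lower times are better. Returns (certainty, direction, ratio).
    Assumption: a difference is "definitely" real only when the two
    confidence intervals do not overlap.
    """
    slower = new_mean > base_mean
    direction = "slower" if slower else "faster"
    # Express the ratio as a value >= 1, matching the report's "1.0412x" style.
    ratio = new_mean / base_mean if slower else base_mean / new_mean

    base_lo, base_hi = base_mean - base_ci, base_mean + base_ci
    new_lo, new_hi = new_mean - new_ci, new_mean + new_ci
    overlap = base_lo <= new_hi and new_lo <= base_hi

    certainty = "inconclusive" if overlap else "definitely"
    return certainty, direction, round(ratio, 4)


# date-format-xparb on the MacBook Pro: 9.5529+-0.1560 vs 9.9465+-0.1131
print(compare(9.5529, 0.1560, 9.9465, 0.1131))
```

Under this reading, a row whose intervals overlap cannot be called a win or a
loss regardless of how the two means compare.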

                                             TipOfTree                       CFA

SunSpider:
   3d-cube                              7.2512+-0.1027            7.2506+-0.1160
   3d-morph                             7.6656+-0.0813    ?       7.7124+-0.0978    ?
   3d-raytrace                          7.3843+-0.0984    ?       7.4021+-0.0967    ?
   access-binary-trees                  1.7302+-0.0287            1.6781+-0.0239      might be 1.0310x faster
   access-fannkuch                      6.3504+-0.0672            6.3159+-0.0640
   access-nbody                         3.4521+-0.0469    ?       3.4932+-0.0531    ? might be 1.0119x slower
   access-nsieve                        2.5600+-0.0356    ?       2.5885+-0.0470    ? might be 1.0111x slower
   bitops-3bit-bits-in-byte             1.7154+-0.0262    ?       1.7434+-0.0319    ? might be 1.0163x slower
   bitops-bits-in-byte                  2.7388+-0.0391    ?       2.7447+-0.0359    ?
   bitops-bitwise-and                   3.3050+-0.0546    ?       3.3606+-0.0589    ? might be 1.0168x slower
   bitops-nsieve-bits                   5.4385+-0.0653            5.4370+-0.0711
   controlflow-recursive                2.0667+-0.0212    ?       2.0812+-0.0208    ?
   crypto-aes                           6.6382+-0.1104    ?       6.6709+-0.1017    ?
   crypto-md5                           2.8355+-0.0466            2.8343+-0.0453
   crypto-sha1                          2.4700+-0.0373    ?       2.5155+-0.0443    ? might be 1.0184x slower
   date-format-tofte                    9.9355+-0.0996    ?       9.9652+-0.1371    ?
   date-format-xparb                    9.5529+-0.1560    !       9.9465+-0.1131    ! definitely 1.0412x slower
   math-cordic                          6.2958+-0.0680    ?       6.3863+-0.0523    ? might be 1.0144x slower
   math-partial-sums                    7.5041+-0.0695    ?       7.5846+-0.0878    ? might be 1.0107x slower
   math-spectral-norm                   2.8198+-0.0398    ?       2.8726+-0.0465    ? might be 1.0187x slower
   regexp-dna                          10.6842+-0.0893           10.6180+-0.1092
   string-base64                        5.2352+-0.0858    ?       5.2951+-0.0554    ? might be 1.0114x slower
   string-fasta                         6.3476+-0.0765    ?       6.4334+-0.0947    ? might be 1.0135x slower
   string-tagcloud                     11.0721+-0.1259    ?      11.1451+-0.1129    ?
   string-unpack-code                  20.9296+-0.2131    ?      21.3736+-0.3461    ? might be 1.0212x slower
   string-validate-input                6.3714+-0.1157            6.3281+-0.0869

   <arithmetic> *                       6.1673+-0.0147    !       6.2222+-0.0173    ! definitely 1.0089x slower
   <geometric>                          5.0674+-0.0157    !       5.1036+-0.0164    ! definitely 1.0072x slower
   <harmonic>                           4.1647+-0.0213    ?       4.1880+-0.0207    ?

                                             TipOfTree                       CFA

V8:
   crypto                              71.9306+-0.2217           71.6923+-0.2342
   deltablue                          224.0349+-1.1196    ^     220.8591+-0.8745    ^ definitely 1.0144x faster
   earley-boyer                        90.8330+-0.2471    ?      91.0550+-0.6016    ?
   raytrace                            58.6492+-0.1942    ^      57.5860+-0.1636    ^ definitely 1.0185x faster
   regexp                             103.2842+-0.5738    ?     103.6217+-0.3030    ?
   richards                           204.7647+-0.4575    ^     178.8069+-0.4249    ^ definitely 1.1452x faster
   splay                               93.9206+-0.3154           93.8991+-0.3847

   <arithmetic>                       121.0596+-0.1853    ^     116.7886+-0.1891    ^ definitely 1.0366x faster
   <geometric> *                      107.9179+-0.1498    ^     105.3890+-0.1611    ^ definitely 1.0240x faster
   <harmonic>                          97.7053+-0.1498    ^      96.2554+-0.1560    ^ definitely 1.0151x faster

                                             TipOfTree                       CFA

Kraken:
   ai-astar                           491.8722+-1.8657          491.6102+-2.1346
   audio-beat-detection               191.3896+-0.6444    !     193.1956+-0.5294    ! definitely 1.0094x slower
   audio-dft                          265.5784+-1.5956    ^     261.3193+-2.1896    ^ definitely 1.0163x faster
   audio-fft                          124.9334+-0.1853    ?     125.4338+-0.4450    ?
   audio-oscillator                   251.5153+-1.0876          250.4084+-1.0322
   imaging-darkroom                   412.2875+-0.8770    ?     412.7251+-0.8786    ?
   imaging-desaturate                 229.8157+-0.4044    ^     216.2189+-0.3764    ^ definitely 1.0629x faster
   imaging-gaussian-blur              579.9277+-0.8575    ^     567.2564+-0.9921    ^ definitely 1.0223x faster
   json-parse-financial                53.8428+-0.2035    ?      54.0464+-0.2093    ?
   json-stringify-tinderbox            67.6931+-0.2539    ?      67.7358+-0.2734    ?
   stanford-crypto-aes                130.2188+-0.9518          129.4420+-0.8691
   stanford-crypto-ccm                 99.4847+-0.3260    !     101.1548+-0.4164    ! definitely 1.0168x slower
   stanford-crypto-pbkdf2             187.6499+-0.8004    ?     188.8759+-1.0071    ?
   stanford-crypto-sha256-iterative    69.9878+-0.1863           69.6543+-0.1668

   <arithmetic> *                     225.4426+-0.3118    ^     223.5055+-0.3132    ^ definitely 1.0087x faster
   <geometric>                        175.7162+-0.2682    ^     174.8010+-0.2755    ^ definitely 1.0052x faster
   <harmonic>                         136.9772+-0.2368          136.7825+-0.2282

                                             TipOfTree                       CFA

All benchmarks:
   <arithmetic>                        88.5950+-0.1068    ^      87.4122+-0.1004    ^ definitely 1.0135x faster
   <geometric>                         22.9797+-0.0474           22.9534+-0.0477
   <harmonic>                           7.3245+-0.0367    ?       7.3629+-0.0354    ?

                                             TipOfTree                       CFA

Geomean of preferred means:
   <scaled-result>                     53.1381+-0.0625    ^      52.7237+-0.0651    ^ definitely 1.0079x faster
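
The "geomean of preferred means" line appears to be the geometric mean of the
three starred per-suite means (SunSpider arithmetic, V8 geometric, Kraken
arithmetic). That reading is an inference from the name and from the numbers
agreeing, not something the report states. A minimal check using the MacBook
Pro means:

```python
import math


def geomean(values):
    """Geometric mean computed in log space; values must be positive."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Starred (preferred) means from the MacBook Pro run, in milliseconds.
tip_of_tree = {"SunSpider": 6.1673, "V8": 107.9179, "Kraken": 225.4426}
cfa = {"SunSpider": 6.2222, "V8": 105.3890, "Kraken": 223.5055}

tip_score = geomean(tip_of_tree.values())
cfa_score = geomean(cfa.values())
speedup = tip_score / cfa_score

print(f"TipOfTree {tip_score:.4f}  CFA {cfa_score:.4f}  speed-up {speedup:.4f}x")
```

The results land within rounding distance of the reported 53.1381 and 52.7237.
Small differences are expected if the harness combines the means per sample
rather than from the rounded figures used here.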

Latest numbers on Mac Pro:


Benchmark report for SunSpider, V8, and Kraken.

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc

"CFA" at /Volumes/Data/fromMiniMe/OpenSource/WebKitBuild/Release/jsc

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark.
Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark
execution times with 95% confidence intervals in milliseconds.

                                             TipOfTree                       CFA

SunSpider:
   3d-cube                              7.9004+-0.0265    !       7.9902+-0.0335    ! definitely 1.0114x slower
   3d-morph                             8.4615+-0.1076    ?       8.4887+-0.1116    ?
   3d-raytrace                          8.0821+-0.1095            8.0724+-0.0618
   access-binary-trees                  1.7927+-0.0033            1.7806+-0.0161
   access-fannkuch                      7.6813+-0.0148    !       7.9859+-0.0254    ! definitely 1.0397x slower
   access-nbody                         4.1641+-0.0068    !       4.1778+-0.0037    ! definitely 1.0033x slower
   access-nsieve                        3.1422+-0.0032    !       3.1584+-0.0066    ! definitely 1.0052x slower
   bitops-3bit-bits-in-byte             1.7320+-0.0103    ?       1.7441+-0.0089    ?
   bitops-bits-in-byte                  5.2152+-0.0453    ?       5.2206+-0.0533    ?
   bitops-bitwise-and                   3.4118+-0.0640            3.3993+-0.0644
   bitops-nsieve-bits                   5.6763+-0.0392            5.6438+-0.0351
   controlflow-recursive                2.2564+-0.0244            2.2509+-0.0028
   crypto-aes                           6.9132+-0.0689    !       7.0664+-0.0630    ! definitely 1.0222x slower
   crypto-md5                           3.0181+-0.0343    ?       3.0585+-0.0331    ? might be 1.0134x slower
   crypto-sha1                          2.7090+-0.0109    !       2.7557+-0.0108    ! definitely 1.0172x slower
   date-format-tofte                   10.6874+-0.0850    ?      10.7886+-0.1017    ?
   date-format-xparb                   10.7804+-0.2653    ?      11.2082+-0.1793    ? might be 1.0397x slower
   math-cordic                          7.2695+-0.0506            7.2238+-0.0219
   math-partial-sums                   10.4781+-0.0501           10.4767+-0.0339
   math-spectral-norm                   3.2288+-0.0168    !       3.2505+-0.0036    ! definitely 1.0067x slower
   regexp-dna                          12.4863+-0.1218    ?      12.5624+-0.1253    ?
   string-base64                        5.4727+-0.0648    ?       5.5304+-0.0697    ? might be 1.0105x slower
   string-fasta                         7.3632+-0.0576            7.3506+-0.0232
   string-tagcloud                     12.8958+-0.0586           12.8081+-0.0688
   string-unpack-code                  23.9726+-0.1714    ?      24.2456+-0.1831    ? might be 1.0114x slower
   string-validate-input                6.8011+-0.1062            6.7479+-0.0643

   <arithmetic> *                       7.0612+-0.0244    !       7.1149+-0.0199    ! definitely 1.0076x slower
   <geometric>                          5.7812+-0.0155    !       5.8176+-0.0134    ! definitely 1.0063x slower
   <harmonic>                           4.6961+-0.0111    ?       4.7189+-0.0120    ?

                                             TipOfTree                       CFA

V8:
   crypto                              80.3328+-0.2637           79.9557+-0.1580
   deltablue                          252.4219+-1.5655    ?     253.0487+-1.3899    ?
   earley-boyer                       110.3867+-0.3100    !     111.4466+-0.2599    ! definitely 1.0096x slower
   raytrace                            66.3575+-0.3105    ^      64.6593+-0.1899    ^ definitely 1.0263x faster
   regexp                             123.6966+-0.5882          123.6552+-0.5325
   richards                           221.5944+-0.9748    ^     215.4157+-1.0238    ^ definitely 1.0287x faster
   splay                              122.9378+-0.7662    ?     124.5312+-0.8843    ? might be 1.0130x slower

   <arithmetic>                       139.6754+-0.4149          138.9589+-0.3738
   <geometric> *                      125.8663+-0.2769    ^     125.2510+-0.2812    ^ definitely 1.0049x faster
   <harmonic>                         114.2437+-0.1976    ^     113.5287+-0.2205    ^ definitely 1.0063x faster

                                             TipOfTree                       CFA

Kraken:
   ai-astar                           832.8773+-0.4054         825.4636+-11.7787
   audio-beat-detection               214.1596+-1.7669          213.3784+-0.9437
   audio-dft                          271.3054+-4.2497    ^     260.8641+-3.0615    ^ definitely 1.0400x faster
   audio-fft                          135.4391+-0.1603    !     139.2097+-1.1166    ! definitely 1.0278x slower
   audio-oscillator                   293.9338+-3.0579          292.1754+-1.9142
   imaging-darkroom                   486.8431+-4.6328          484.2544+-3.7257
   imaging-desaturate                 244.8629+-0.2524    ^     237.8543+-0.0918    ^ definitely 1.0295x faster
   imaging-gaussian-blur              641.8301+-0.2648    ^     611.2323+-0.2040    ^ definitely 1.0501x faster
   json-parse-financial                68.4385+-0.1365    ?      68.6702+-0.2375    ?
   json-stringify-tinderbox            79.7739+-0.3035    ?      80.1401+-0.2522    ?
   stanford-crypto-aes                152.6437+-1.2898          151.5830+-1.4340
   stanford-crypto-ccm                116.1246+-0.7775    ?     117.1616+-0.5738    ?
   stanford-crypto-pbkdf2             235.4855+-1.7872          232.6827+-1.6888      might be 1.0120x faster
   stanford-crypto-sha256-iterative    85.6205+-0.2548    ?      86.0403+-0.3846    ?

   <arithmetic> *                     275.6670+-0.6405    ^     271.4793+-0.9585    ^ definitely 1.0154x faster
   <geometric>                        208.1917+-0.4363    ^     206.5505+-0.5262    ^ definitely 1.0079x faster
   <harmonic>                         162.2625+-0.2764          162.1470+-0.4291

                                             TipOfTree                       CFA

All benchmarks:
   <arithmetic>                       106.8225+-0.2224    ^     105.4981+-0.2833    ^ definitely 1.0126x faster
   <geometric>                         26.6011+-0.0546    ?      26.6112+-0.0438    ?
   <harmonic>                           8.2687+-0.0192    ?       8.3073+-0.0206    ?

                                             TipOfTree                       CFA

Geomean of preferred means:
   <scaled-result>                     62.5735+-0.1251    ^      62.3100+-0.0910    ^ definitely 1.0042x faster

