[webkit-dev] architecture specific optimizations

x yz lastguy at yahoo.com
Sun Feb 15 11:36:09 PST 2009


Hi,
I'm working on a port to MIPS32, where SSE2 cannot be used. I believe that nothing outside the <assembler> folder should contain architecture-specific or assembly-related code. Since the last release, the <JIT> changes do seem to be moving in that direction.
There should be no references such as X86:: outside the <assembler> folder.
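To illustrate the idea, here is a minimal sketch of keeping the architecture checks in one place. All names in it (the assembler namespace, buildArch, hasSSE2, archName, doubleMathBackend) are hypothetical and are not taken from the WebKit tree; only the predefined compiler macros are real. It assumes a C++11 compiler.

```cpp
// Hypothetical sketch; none of these names exist in WebKit.
// Only this "assembler" layer looks at compiler-defined architecture macros.
namespace assembler {

enum class Arch { X86_SSE2, X86, MIPS, Generic };

// Determine the target at build time from macros predefined by GCC/Clang
// (__SSE2__, __x86_64__, __i386__, __mips__) or MSVC (_M_X64, _M_IX86).
constexpr Arch buildArch()
{
#if defined(__SSE2__) || defined(__x86_64__) || defined(_M_X64)
    return Arch::X86_SSE2;   // SSE2 is part of the x86-64 baseline
#elif defined(__i386__) || defined(_M_IX86)
    return Arch::X86;        // 32-bit x86 built without -msse2
#elif defined(__mips__)
    return Arch::MIPS;       // e.g. a MIPS32 port, no SSE2 at all
#else
    return Arch::Generic;
#endif
}

constexpr bool hasSSE2() { return buildArch() == Arch::X86_SSE2; }

constexpr const char* archName(Arch a)
{
    return a == Arch::X86_SSE2 ? "x86 with SSE2"
         : a == Arch::X86      ? "x86 without SSE2"
         : a == Arch::MIPS     ? "MIPS"
                               : "generic";
}

} // namespace assembler

// Code outside the assembler layer asks a neutral question and never
// mentions X86:: or any architecture macro itself.
constexpr const char* doubleMathBackend()
{
    return assembler::hasSSE2() ? "sse2" : "portable";
}
```

With something like this, a MIPS32 build would simply take the "portable" path, and enabling -msse2 on x86 would switch the backend without touching any code outside the assembler layer.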
rgds
joe 


--- On Fri, 2/13/09, Osztrogonac Csaba <oszi at inf.u-szeged.hu> wrote:

> From: Osztrogonac Csaba <oszi at inf.u-szeged.hu>
> Subject: [webkit-dev] architecture specific optimizations
> To: webkit-dev at lists.webkit.org
> Date: Friday, February 13, 2009, 6:47 PM
> Hi all,
> 
> We are interested in SFX speed optimizations, and we have
> experimented with some architecture-specific optimizations.
> 
> If we enable gcc to generate SSE2 instructions with the -msse2
> option, SunSpider shows a 4.8% progression with the JIT and a 2.4%
> progression with the interpreter (results attached). (-msse2 is
> the default option on the Mac platform, but not on the qt-linux
> platform.)
> 
> Nowadays the share of SSE2-capable CPUs is increasing
> (e.g. all x86-64 CPUs have SSE2). I think we should take
> advantage of the different architectures. Do you have any
> ideas? E.g. different builds for different architectures,
> determining the platform capabilities at build time, etc.
> 
> br,
> Ossy
> 
> 
> TEST                   COMPARISON            FROM                 TO             DETAILS
>
> =============================================================================
>
> ** TOTAL **:           1.024x as fast    2396.1ms +/- 0.3%  2341.0ms +/- 0.5%     significant
>
> =============================================================================
>
>   3d:                  1.060x as fast     381.9ms +/- 0.7%   360.3ms +/- 0.7%     significant
>     cube:              1.081x as fast     125.7ms +/- 1.5%   116.3ms +/- 1.6%     significant
>     morph:             1.069x as fast     144.1ms +/- 0.8%   134.8ms +/- 0.7%     significant
>     raytrace:          1.027x as fast     112.1ms +/- 0.6%   109.2ms +/- 0.9%     significant
>
>   access:              ??                 341.8ms +/- 0.3%   342.8ms +/- 1.0%     not conclusive: might be *1.003x as slow*
>     binary-trees:      *1.041x as slow*    29.4ms +/- 1.3%    30.6ms +/- 2.3%     significant
>     fannkuch:          *1.015x as slow*   130.4ms +/- 0.4%   132.3ms +/- 0.3%     significant
>     nbody:             1.027x as fast     146.5ms +/- 0.3%   142.7ms +/- 2.3%     significant
>     nsieve:            *1.048x as slow*    35.5ms +/- 1.1%    37.2ms +/- 0.8%     significant
>
>   bitops:              1.016x as fast     222.0ms +/- 0.3%   218.5ms +/- 0.3%     significant
>     3bit-bits-in-byte: -                   38.5ms +/- 1.0%    38.2ms +/- 0.8%
>     bits-in-byte:      -                   50.9ms +/- 0.4%    50.7ms +/- 1.0%
>     bitwise-and:       -                   46.0ms +/- 0.0%    46.0ms +/- 1.0%
>     nsieve-bits:       1.036x as fast      86.6ms +/- 0.4%    83.6ms +/- 0.6%     significant
>
>   controlflow:         ??                  25.5ms +/- 1.5%    25.8ms +/- 1.8%     not conclusive: might be *1.012x as slow*
>     recursive:         ??                  25.5ms +/- 1.5%    25.8ms +/- 1.8%     not conclusive: might be *1.012x as slow*
>
>   crypto:              1.043x as fast     158.0ms +/- 0.6%   151.5ms +/- 0.4%     significant
>     aes:               1.016x as fast      57.5ms +/- 1.2%    56.6ms +/- 0.9%     significant
>     md5:               1.060x as fast      51.0ms +/- 0.7%    48.1ms +/- 0.5%     significant
>     sha1:              1.058x as fast      49.5ms +/- 0.8%    46.8ms +/- 0.6%     significant
>
>   date:                -                  168.0ms +/- 1.6%   166.7ms +/- 1.5%
>     format-tofte:      1.026x as fast      67.8ms +/- 1.1%    66.1ms +/- 1.3%     significant
>     format-xparb:      ??                 100.2ms +/- 2.1%   100.6ms +/- 2.2%     not conclusive: might be *1.004x as slow*
>
>   math:                1.072x as fast     304.9ms +/- 0.3%   284.3ms +/- 0.7%     significant
>     cordic:            1.112x as fast     111.6ms +/- 0.6%   100.4ms +/- 1.2%     significant
>     partial-sums:      1.048x as fast     128.1ms +/- 0.3%   122.2ms +/- 0.7%     significant
>     spectral-norm:     1.057x as fast      65.2ms +/- 0.5%    61.7ms +/- 0.8%     significant
>
>   regexp:              -                  300.3ms +/- 0.4%   299.6ms +/- 0.3%
>     dna:               -                  300.3ms +/- 0.4%   299.6ms +/- 0.3%
>
>   string:              -                  493.7ms +/- 0.9%   491.5ms +/- 1.0%
>     base64:            1.029x as fast      52.6ms +/- 2.0%    51.1ms +/- 1.5%     significant
>     fasta:             1.031x as fast      83.7ms +/- 1.7%    81.2ms +/- 1.5%     significant
>     tagcloud:          ??                 154.7ms +/- 1.1%   156.0ms +/- 0.9%     not conclusive: might be *1.008x as slow*
>     unpack-code:       ??                 124.2ms +/- 1.7%   125.7ms +/- 1.7%     not conclusive: might be *1.012x as slow*
>     validate-input:    -                   78.5ms +/- 1.9%    77.5ms +/- 1.1%
> 
> 
> TEST                   COMPARISON            FROM                 TO             DETAILS
>
> =============================================================================
>
> ** TOTAL **:           1.048x as fast    1391.0ms +/- 0.4%  1327.4ms +/- 0.4%     significant
>
> =============================================================================
>
>   3d:                  1.081x as fast     273.3ms +/- 0.5%   252.8ms +/- 0.5%     significant
>     cube:              1.093x as fast      94.9ms +/- 1.0%    86.8ms +/- 1.0%     significant
>     morph:             1.084x as fast     106.8ms +/- 0.6%    98.5ms +/- 0.6%     significant
>     raytrace:          1.061x as fast      71.6ms +/- 0.7%    67.5ms +/- 1.0%     significant
>
>   access:              1.111x as fast     154.7ms +/- 0.9%   139.2ms +/- 0.9%     significant
>     binary-trees:      1.113x as fast      15.7ms +/- 5.7%    14.1ms +/- 4.4%     significant
>     fannkuch:          ??                  18.4ms +/- 2.0%    18.7ms +/- 3.6%     not conclusive: might be *1.016x as slow*
>     nbody:             1.149x as fast     110.5ms +/- 1.4%    96.2ms +/- 0.3%     significant
>     nsieve:            ??                  10.1ms +/- 2.2%    10.2ms +/- 3.0%     not conclusive: might be *1.010x as slow*
>
>   bitops:              1.038x as fast      51.5ms +/- 0.7%    49.6ms +/- 0.7%     significant
>     3bit-bits-in-byte: -                    4.4ms +/- 8.4%     4.1ms +/- 5.5%
>     bits-in-byte:      -                    9.3ms +/- 3.7%     9.2ms +/- 3.3%
>     bitwise-and:       ??                  12.2ms +/- 2.5%    12.3ms +/- 2.8%     not conclusive: might be *1.008x as slow*
>     nsieve-bits:       1.067x as fast      25.6ms +/- 1.4%    24.0ms +/- 0.0%     significant
>
>   controlflow:         ??                   5.1ms +/- 4.4%     5.2ms +/- 5.8%     not conclusive: might be *1.020x as slow*
>     recursive:         ??                   5.1ms +/- 4.4%     5.2ms +/- 5.8%     not conclusive: might be *1.020x as slow*
>
>   crypto:              1.057x as fast      77.4ms +/- 1.0%    73.2ms +/- 1.0%     significant
>     aes:               -                   23.3ms +/- 1.5%    23.2ms +/- 1.3%
>     md5:               1.077x as fast      28.0ms +/- 1.2%    26.0ms +/- 1.3%     significant
>     sha1:              1.088x as fast      26.1ms +/- 1.6%    24.0ms +/- 1.4%     significant
>
>   date:                -                  144.8ms +/- 2.3%   143.9ms +/- 1.6%
>     format-tofte:      1.032x as fast      52.2ms +/- 1.8%    50.6ms +/- 1.4%     significant
>     format-xparb:      ??                  92.6ms +/- 2.8%    93.3ms +/- 2.3%     not conclusive: might be *1.008x as slow*
>
>   math:                1.109x as fast     203.5ms +/- 0.4%   183.5ms +/- 0.7%     significant
>     cordic:            1.186x as fast      67.0ms +/- 0.7%    56.5ms +/- 0.7%     significant
>     partial-sums:      1.056x as fast      99.7ms +/- 0.5%    94.4ms +/- 1.3%     significant
>     spectral-norm:     1.129x as fast      36.8ms +/- 0.8%    32.6ms +/- 1.5%     significant
>
>   regexp:              ??                  50.4ms +/- 0.7%    50.7ms +/- 1.0%     not conclusive: might be *1.006x as slow*
>     dna:               ??                  50.4ms +/- 0.7%    50.7ms +/- 1.0%     not conclusive: might be *1.006x as slow*
>
>   string:              -                  430.3ms +/- 0.9%   429.3ms +/- 0.9%
>     base64:            1.057x as fast      38.9ms +/- 1.8%    36.8ms +/- 3.0%     significant
>     fasta:             -                   64.6ms +/- 3.0%    63.8ms +/- 1.9%
>     tagcloud:          -                  154.3ms +/- 0.7%   153.6ms +/- 0.6%
>     unpack-code:       ??                 103.2ms +/- 2.0%   103.5ms +/- 1.4%     not conclusive: might be *1.003x as slow*
>     validate-input:    ??                  69.3ms +/- 2.6%    71.6ms +/- 3.5%     not conclusive: might be *1.033x as slow*
