[webkit-dev] architecture specific optimizations
x yz
lastguy at yahoo.com
Sun Feb 15 11:36:09 PST 2009
Hi,
I'm working on a port to MIPS32, where SSE2 cannot be used. I believe nothing outside <assembler> should contain architecture- or assembly-related code. I see that since the last release the <JIT> changes have indeed moved in that direction.
That is, no references such as X86:: outside the <assembler> folder.
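A minimal sketch of that separation, not WebKit code: addDoubles and HAVE_SSE2_SKETCH are hypothetical names made up for illustration. It assumes only that GCC defines __SSE2__ when -msse2 is in effect and that <emmintrin.h> provides the SSE2 intrinsics. The architecture-specific path is chosen at build time in one place, and a portable loop covers targets such as MIPS32:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature switch: GCC predefines __SSE2__ when SSE2 code
// generation is enabled (for example with -msse2, or on x86-64).
#if defined(__SSE2__)
#include <emmintrin.h>
#define HAVE_SSE2_SKETCH 1
#else
#define HAVE_SSE2_SKETCH 0
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
// Callers never see which architecture-specific path was compiled in.
void addDoubles(const double* a, const double* b, double* out, std::size_t n)
{
    std::size_t i = 0;
#if HAVE_SSE2_SKETCH
    // SSE2 path: process two doubles per iteration with unaligned loads.
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#endif
    // Portable path: handles the tail on SSE2 builds and the whole
    // array on builds without SSE2 (e.g. MIPS32).
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Code calling addDoubles compiles unchanged on either kind of target; only this one file knows about SSE2.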
rgds
joe
--- On Fri, 2/13/09, Osztrogonac Csaba <oszi at inf.u-szeged.hu> wrote:
> From: Osztrogonac Csaba <oszi at inf.u-szeged.hu>
> Subject: [webkit-dev] architecture specific optimizations
> To: webkit-dev at lists.webkit.org
> Date: Friday, February 13, 2009, 6:47 PM
> Hi all,
>
> We are interested in SFX speed optimizations, and we have
> experimented with some architecture-specific optimization.
>
> If we enable gcc to generate SSE2 instructions with the -msse2
> option, SunSpider shows a 4.8% progression with the JIT and a 2.4%
> progression with the interpreter (results attached). (-msse2 is the
> default option on the Mac platform, but not on the qt-linux platform.)
>
> Nowadays the share of SSE2-capable CPUs is increasing
> (e.g. all x86-64 CPUs have SSE2). I think
> we should take advantage of different architectures. Do
> you have any ideas? E.g. different builds for different architectures,
> determining the platform capabilities at build time, etc.
>
> br,
> Ossy
>
>
> TEST                    COMPARISON        FROM                TO                  DETAILS
>
> =============================================================================
>
> ** TOTAL **:            1.024x as fast    2396.1ms +/- 0.3%   2341.0ms +/- 0.5%   significant
>
> =============================================================================
>
>   3d:                   1.060x as fast     381.9ms +/- 0.7%    360.3ms +/- 0.7%   significant
>     cube:               1.081x as fast     125.7ms +/- 1.5%    116.3ms +/- 1.6%   significant
>     morph:              1.069x as fast     144.1ms +/- 0.8%    134.8ms +/- 0.7%   significant
>     raytrace:           1.027x as fast     112.1ms +/- 0.6%    109.2ms +/- 0.9%   significant
>
>   access:               ??                 341.8ms +/- 0.3%    342.8ms +/- 1.0%   not conclusive: might be *1.003x as slow*
>     binary-trees:       *1.041x as slow*    29.4ms +/- 1.3%     30.6ms +/- 2.3%   significant
>     fannkuch:           *1.015x as slow*   130.4ms +/- 0.4%    132.3ms +/- 0.3%   significant
>     nbody:              1.027x as fast     146.5ms +/- 0.3%    142.7ms +/- 2.3%   significant
>     nsieve:             *1.048x as slow*    35.5ms +/- 1.1%     37.2ms +/- 0.8%   significant
>
>   bitops:               1.016x as fast     222.0ms +/- 0.3%    218.5ms +/- 0.3%   significant
>     3bit-bits-in-byte:  -                   38.5ms +/- 1.0%     38.2ms +/- 0.8%
>     bits-in-byte:       -                   50.9ms +/- 0.4%     50.7ms +/- 1.0%
>     bitwise-and:        -                   46.0ms +/- 0.0%     46.0ms +/- 1.0%
>     nsieve-bits:        1.036x as fast      86.6ms +/- 0.4%     83.6ms +/- 0.6%   significant
>
>   controlflow:          ??                  25.5ms +/- 1.5%     25.8ms +/- 1.8%   not conclusive: might be *1.012x as slow*
>     recursive:          ??                  25.5ms +/- 1.5%     25.8ms +/- 1.8%   not conclusive: might be *1.012x as slow*
>
>   crypto:               1.043x as fast     158.0ms +/- 0.6%    151.5ms +/- 0.4%   significant
>     aes:                1.016x as fast      57.5ms +/- 1.2%     56.6ms +/- 0.9%   significant
>     md5:                1.060x as fast      51.0ms +/- 0.7%     48.1ms +/- 0.5%   significant
>     sha1:               1.058x as fast      49.5ms +/- 0.8%     46.8ms +/- 0.6%   significant
>
>   date:                 -                  168.0ms +/- 1.6%    166.7ms +/- 1.5%
>     format-tofte:       1.026x as fast      67.8ms +/- 1.1%     66.1ms +/- 1.3%   significant
>     format-xparb:       ??                 100.2ms +/- 2.1%    100.6ms +/- 2.2%   not conclusive: might be *1.004x as slow*
>
>   math:                 1.072x as fast     304.9ms +/- 0.3%    284.3ms +/- 0.7%   significant
>     cordic:             1.112x as fast     111.6ms +/- 0.6%    100.4ms +/- 1.2%   significant
>     partial-sums:       1.048x as fast     128.1ms +/- 0.3%    122.2ms +/- 0.7%   significant
>     spectral-norm:      1.057x as fast      65.2ms +/- 0.5%     61.7ms +/- 0.8%   significant
>
>   regexp:               -                  300.3ms +/- 0.4%    299.6ms +/- 0.3%
>     dna:                -                  300.3ms +/- 0.4%    299.6ms +/- 0.3%
>
>   string:               -                  493.7ms +/- 0.9%    491.5ms +/- 1.0%
>     base64:             1.029x as fast      52.6ms +/- 2.0%     51.1ms +/- 1.5%   significant
>     fasta:              1.031x as fast      83.7ms +/- 1.7%     81.2ms +/- 1.5%   significant
>     tagcloud:           ??                 154.7ms +/- 1.1%    156.0ms +/- 0.9%   not conclusive: might be *1.008x as slow*
>     unpack-code:        ??                 124.2ms +/- 1.7%    125.7ms +/- 1.7%   not conclusive: might be *1.012x as slow*
>     validate-input:     -                   78.5ms +/- 1.9%     77.5ms +/- 1.1%
>
>
> TEST                    COMPARISON        FROM                TO                  DETAILS
>
> =============================================================================
>
> ** TOTAL **:            1.048x as fast    1391.0ms +/- 0.4%   1327.4ms +/- 0.4%   significant
>
> =============================================================================
>
>   3d:                   1.081x as fast     273.3ms +/- 0.5%    252.8ms +/- 0.5%   significant
>     cube:               1.093x as fast      94.9ms +/- 1.0%     86.8ms +/- 1.0%   significant
>     morph:              1.084x as fast     106.8ms +/- 0.6%     98.5ms +/- 0.6%   significant
>     raytrace:           1.061x as fast      71.6ms +/- 0.7%     67.5ms +/- 1.0%   significant
>
>   access:               1.111x as fast     154.7ms +/- 0.9%    139.2ms +/- 0.9%   significant
>     binary-trees:       1.113x as fast      15.7ms +/- 5.7%     14.1ms +/- 4.4%   significant
>     fannkuch:           ??                  18.4ms +/- 2.0%     18.7ms +/- 3.6%   not conclusive: might be *1.016x as slow*
>     nbody:              1.149x as fast     110.5ms +/- 1.4%     96.2ms +/- 0.3%   significant
>     nsieve:             ??                  10.1ms +/- 2.2%     10.2ms +/- 3.0%   not conclusive: might be *1.010x as slow*
>
>   bitops:               1.038x as fast      51.5ms +/- 0.7%     49.6ms +/- 0.7%   significant
>     3bit-bits-in-byte:  -                    4.4ms +/- 8.4%      4.1ms +/- 5.5%
>     bits-in-byte:       -                    9.3ms +/- 3.7%      9.2ms +/- 3.3%
>     bitwise-and:        ??                  12.2ms +/- 2.5%     12.3ms +/- 2.8%   not conclusive: might be *1.008x as slow*
>     nsieve-bits:        1.067x as fast      25.6ms +/- 1.4%     24.0ms +/- 0.0%   significant
>
>   controlflow:          ??                   5.1ms +/- 4.4%      5.2ms +/- 5.8%   not conclusive: might be *1.020x as slow*
>     recursive:          ??                   5.1ms +/- 4.4%      5.2ms +/- 5.8%   not conclusive: might be *1.020x as slow*
>
>   crypto:               1.057x as fast      77.4ms +/- 1.0%     73.2ms +/- 1.0%   significant
>     aes:                -                   23.3ms +/- 1.5%     23.2ms +/- 1.3%
>     md5:                1.077x as fast      28.0ms +/- 1.2%     26.0ms +/- 1.3%   significant
>     sha1:               1.088x as fast      26.1ms +/- 1.6%     24.0ms +/- 1.4%   significant
>
>   date:                 -                  144.8ms +/- 2.3%    143.9ms +/- 1.6%
>     format-tofte:       1.032x as fast      52.2ms +/- 1.8%     50.6ms +/- 1.4%   significant
>     format-xparb:       ??                  92.6ms +/- 2.8%     93.3ms +/- 2.3%   not conclusive: might be *1.008x as slow*
>
>   math:                 1.109x as fast     203.5ms +/- 0.4%    183.5ms +/- 0.7%   significant
>     cordic:             1.186x as fast      67.0ms +/- 0.7%     56.5ms +/- 0.7%   significant
>     partial-sums:       1.056x as fast      99.7ms +/- 0.5%     94.4ms +/- 1.3%   significant
>     spectral-norm:      1.129x as fast      36.8ms +/- 0.8%     32.6ms +/- 1.5%   significant
>
>   regexp:               ??                  50.4ms +/- 0.7%     50.7ms +/- 1.0%   not conclusive: might be *1.006x as slow*
>     dna:                ??                  50.4ms +/- 0.7%     50.7ms +/- 1.0%   not conclusive: might be *1.006x as slow*
>
>   string:               -                  430.3ms +/- 0.9%    429.3ms +/- 0.9%
>     base64:             1.057x as fast      38.9ms +/- 1.8%     36.8ms +/- 3.0%   significant
>     fasta:              -                   64.6ms +/- 3.0%     63.8ms +/- 1.9%
>     tagcloud:           -                  154.3ms +/- 0.7%    153.6ms +/- 0.6%
>     unpack-code:        ??                 103.2ms +/- 2.0%    103.5ms +/- 1.4%   not conclusive: might be *1.003x as slow*
>     validate-input:     ??                  69.3ms +/- 2.6%     71.6ms +/- 3.5%   not conclusive: might be *1.033x as slow*
> _______________________________________________
> webkit-dev mailing list
> webkit-dev at lists.webkit.org
> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev