[Webkit-unassigned] [Bug 77950] SSE optimization for vsvesq and vmaxmgv

Wed Feb 15 19:01:21 PST 2012

https://bugs.webkit.org/show_bug.cgi?id=77950

--- Comment #7 from xingnan.wang at intel.com  2012-02-15 19:01:22 PST ---
(From update of attachment 126033)
View in context: https://bugs.webkit.org/attachment.cgi?id=126033&action=review

>> Source/WebCore/ChangeLog:8
>> +        Achieved the performance of 3.7x on vsvesq and 4.1x on vmaxmgv.
> 
> Out of curiosity is this speed up measured using the typical 128-length vectors?  Or did you use longer vectors so that the initial setup is in the noise?

Yes, the results above were measured with 128-length vectors. I also measured other lengths; for example, with 1280-length vectors the speedup is up to 3.88x on vsvesq and 5.1x on vmaxmgv.
I think the 128-length result is the most valuable for us now, so I presented only that one.

>> Source/WebCore/platform/audio/VectorMath.cpp:500
>> +            mMin = _mm_min_ps(mMin, source);
> 
> I think it would be more in line with the original code to compute the absolute value before computing the max.  Something like
> 
> source = _mm_and_ps(source, mask);
> mMax = _mm_max_ps(mMax, source);
> 
> where mask is #x7fffffff (4 copies, one for each float).  I think this would be as fast as the current code.
> 
> If this is done, then lines 505-507 can be deleted.

Good suggestion, thanks.
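
A minimal sketch of the mask-then-max approach suggested above. This is illustrative only, not the code in the patch: the function name maxMagnitudeSSE, the use of unaligned loads, the SSE2 intrinsics used to build the mask, and the scalar tail handling are all assumptions.

```cpp
#include <emmintrin.h> // SSE2, assumed here for _mm_set1_epi32 / _mm_castsi128_ps
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not from the patch): returns max(|source[i]|) over n floats.
float maxMagnitudeSSE(const float* source, std::size_t n)
{
    // 0x7fffffff in every lane clears the IEEE 754 sign bit, i.e. takes the absolute value.
    const __m128 mask = _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff));
    __m128 mMax = _mm_setzero_ps();

    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(source + i); // unaligned load keeps the sketch simple
        v = _mm_and_ps(v, mask);             // |v| in all four lanes
        mMax = _mm_max_ps(mMax, v);          // running per-lane maximum
    }

    // Combine the four per-lane maxima; this runs once, outside the loop.
    float lanes[4];
    _mm_storeu_ps(lanes, mMax);
    float result = std::max(std::max(lanes[0], lanes[1]), std::max(lanes[2], lanes[3]));

    // Scalar tail for lengths that are not a multiple of 4.
    for (; i < n; ++i)
        result = std::max(result, std::fabs(source[i]));
    return result;
}
```

Because every lane is already non-negative after the mask, only a running maximum is needed, so no separate running minimum has to be tracked.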

>> Source/WebCore/platform/audio/VectorMath.cpp:514
>> +
> 
> What is the reason for the temp variable?  Is this an optimization so that we're not reading and writing to the same max variable?

I used the temp variable only for readability, without considering optimization. Since it's not in the loop, I don't think optimizing it is really necessary. If the optimization is truly needed, some SSE function could be applied here.
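
For reference, a sketch of one way the final reduction could be done with SSE intrinsics instead of scalar temporaries. The helper name horizontalMax is hypothetical and this is not part of the patch; it only shows one possible shuffle-based reduction.

```cpp
#include <xmmintrin.h> // SSE

// Hypothetical helper (not from the patch): returns the largest of the four lanes of v.
inline float horizontalMax(__m128 v)
{
    // v = [a, b, c, d]; movehl copies the upper half down: [c, d, c, d].
    __m128 high = _mm_movehl_ps(v, v);
    // Lane 0 = max(a, c), lane 1 = max(b, d).
    __m128 pairMax = _mm_max_ps(v, high);
    // Broadcast lane 1 so it can be compared against lane 0.
    __m128 lane1 = _mm_shuffle_ps(pairMax, pairMax, _MM_SHUFFLE(1, 1, 1, 1));
    // Lane 0 now holds the maximum of all four inputs.
    __m128 all = _mm_max_ss(pairMax, lane1);
    return _mm_cvtss_f32(all);
}
```

Since this runs once per call rather than once per iteration, any gain over the temp-variable version is likely small for 128-length vectors.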
