[Webkit-unassigned] [Bug 77950] SSE optimization for vsvesq and vmaxmgv
bugzilla-daemon at webkit.org
Wed Feb 15 19:01:21 PST 2012
https://bugs.webkit.org/show_bug.cgi?id=77950
--- Comment #7 from xingnan.wang at intel.com 2012-02-15 19:01:22 PST ---
(From update of attachment 126033)
View in context: https://bugs.webkit.org/attachment.cgi?id=126033&action=review
>> Source/WebCore/ChangeLog:8
>> + Achieved the performance of 3.7x on vsvesq and 4.1x on vmaxmgv.
>
> Out of curiosity is this speed up measured using the typical 128-length vectors? Or did you use longer vectors so that the initial setup is in the noise?
Yes, the result above was measured with 128-length vectors. I also measured other lengths; for example, with 1280-length vectors the speedup is up to 3.88x and 5.1x, respectively.
I think the 128-length result is the most valuable for us now, so I presented only that one.
>> Source/WebCore/platform/audio/VectorMath.cpp:500
>> + mMin = _mm_min_ps(mMin, source);
>
> I think it would be more in line with the original code to compute the absolute value before computing the max. Something like
>
> source = _mm_and_ps(source, mask);
> mMax = _mm_max_ps(mMax, source);
>
> where mask is #x7fffffff (4 copies, one for each float). I think this would be as fast as the current code.
>
> If this is done, then lines 505-507 can be deleted.
Good suggestion, thanks.
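
For reference, a minimal sketch of the mask-then-max approach suggested above. This is not the code from the patch: the function name maxMagnitude, the unaligned loads, and the scalar fallback are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper (not the WebKit vmaxmgv): returns max |source[i]|.
float maxMagnitude(const float* source, size_t n)
{
    float maxValue = 0.0f;
    size_t i = 0;
#if defined(__SSE2__)
    // 0x7fffffff in every lane clears the sign bit, i.e. computes fabs().
    const __m128 mask = _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff));
    __m128 mMax = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(source + i);
        v = _mm_and_ps(v, mask);     // absolute value first
        mMax = _mm_max_ps(mMax, v);  // then a running per-lane max
    }
    // Reduce the four lanes to one scalar outside the loop.
    float lanes[4];
    _mm_storeu_ps(lanes, mMax);
    for (int k = 0; k < 4; ++k)
        maxValue = std::max(maxValue, lanes[k]);
#endif
    // Scalar tail (or the whole input when SSE2 is unavailable).
    for (; i < n; ++i)
        maxValue = std::max(maxValue, std::fabs(source[i]));
    return maxValue;
}
```

Because the absolute value is taken before the max, no separate running minimum is needed, which is why the later min/max merge step can go away.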
>> Source/WebCore/platform/audio/VectorMath.cpp:514
>> +
>
> What is the reason for the temp variable? Is this an optimization so that we're not reading and writing to the same max variable?
I used the temp variable only for readability, without considering optimization: the code is outside the loop, so I don't think optimizing it is really necessary. If the optimization is truly needed, some SSE function could be applied here.
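
As a rough illustration of what applying an SSE function here might look like, the sketch below reduces four lanes to their maximum with shuffles instead of scalar compares. The name horizontalMax and its array parameter are hypothetical and are not taken from the patch; since this step runs once per call, any gain is likely to be small.

```cpp
#include <algorithm>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// Hypothetical helper: maximum of four floats (one SSE register's worth).
float horizontalMax(const float lanes[4])
{
#if defined(__SSE__)
    __m128 v = _mm_loadu_ps(lanes);
    // Swap neighbours: shuf = {v1, v0, v3, v2}.
    __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
    // m = {max(v0,v1), max(v0,v1), max(v2,v3), max(v2,v3)}.
    __m128 m = _mm_max_ps(v, shuf);
    // Move the high pair of m into the low lanes of shuf.
    shuf = _mm_movehl_ps(shuf, m);
    // Lane 0 now holds max(max(v0,v1), max(v2,v3)).
    m = _mm_max_ss(m, shuf);
    return _mm_cvtss_f32(m);
#else
    // Scalar fallback for targets without SSE.
    return std::max(std::max(lanes[0], lanes[1]),
                    std::max(lanes[2], lanes[3]));
#endif
}
```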
--
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.