[Webkit-unassigned] [Bug 75528] Optimize the multiply-add in Biquad.cpp::process

Tue Mar 20 10:00:27 PDT 2012

https://bugs.webkit.org/show_bug.cgi?id=75528

--- Comment #46 from Raymond Toy <rtoy at chromium.org>  2012-03-20 10:00:27 PST ---
(From update of attachment 132760)
View in context: https://bugs.webkit.org/attachment.cgi?id=132760&action=review

Code looks good; just a few more minor items to look at.

> Source/WebCore/platform/audio/Biquad.cpp:109
> +    __m128d mm4 = _mm_set_pd(0, y1); // mm4 = (0, y1)

It looks like we only care about the the low part of mm4.  Maybe add comment that we only use the low part of mm4.

Same goes for mma1 in line 113.

> Source/WebCore/platform/audio/Biquad.cpp:122
> +        mm4 = _mm_mul_sd(mm4, mma1); // mm4 = (0, -y1 * a1)

I think the comment should be mm4=-y1*a1 since we're doing mul_sd.  We don't care what's in the high part of mm4.

> Source/WebCore/platform/audio/Biquad.cpp:125
> +        mm6 = mm5; // mm6 = (x1 * b1 - y2 * a2, x2 * b2 + x * b0)

See note on for line 130.  Shuffle mm5 to the right parts of mm6 now instead of later?

> Source/WebCore/platform/audio/Biquad.cpp:126
> +        mm7 = _mm_load_ss(sourceP); // Load next x value, mm7 = (0, 0, 0, x)

We load mm7 from memory here, but use it in the very next instruction.  Could this be moved further up to lessen any pipeline issues?  We're not using the value of mm7 until line 127.

> Source/WebCore/platform/audio/Biquad.cpp:128
> +        mm5 = _mm_add_sd(mm4, mm5); // mm5 = (x1 * b1 - y2 * a2, x2 * b2 + x * b0 - y1 * a1)

Since we're doing mm_add_sd, we don't care what's in the high part of mm5.  Maybe the comment should be mm5 = x2*b2 + x*b0-y1*a1.

> Source/WebCore/platform/audio/Biquad.cpp:130
> +        mm5 = _mm_add_sd(mm5, mm6); // mm5 = (x1 * b1 - y2 * a2, x * b0 + x1 * b1 + x2 * b2 - y1 * a1 - y2 * a2) = (*, y)

Same comment here as for line 128.  We don't care what's in the high part of mm5.

Also, in line 129, we swap the high and low parts of mm6.  Isn't it possible to do that in line 125 where we assign mm5 to mm6?  Just shuffle mm5 to the right parts of mm6?  That's one less instruction, but I don't know what the pipeline affects will be.

> Source/WebCore/platform/audio/Biquad.cpp:131
> +        mm7 = _mm_cvtsd_ss(mm7, mm5); // mm7 = (0, 0, 0, y)

Actually, I think the comment is wrong. (Sorry about that.)  cvtsd_ss doesn't modify the other parts of mm7.  Plus we're only using the low part, maybe just say mm7 = static_cast<float>(y).

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.