[Webkit-unassigned] [Bug 75528] Optimize the multiply-add in Biquad.cpp::process

Mon Mar 19 12:13:27 PDT 2012

https://bugs.webkit.org/show_bug.cgi?id=75528

--- Comment #43 from Raymond Toy <rtoy at chromium.org>  2012-03-19 12:13:27 PST ---
(From update of attachment 132532)
View in context: https://bugs.webkit.org/attachment.cgi?id=132532&action=review

This looks much nicer than the assembly code, and it's faster too!  Code looks good, but adding some comments would help to understand what's going on.

> Source/WebCore/platform/audio/Biquad.cpp:113
> +    __m128d mma1 = _mm_set_sd(-a1);

mma1 = (0, -a1)

> Source/WebCore/platform/audio/Biquad.cpp:117
> +        mm6 = mm1;

Add comment mm6 = (y2, x)

> Source/WebCore/platform/audio/Biquad.cpp:118
> +        mm1 = _mm_shuffle_pd(mm1, mm4, 0); // y2 = y1

Add comment that mm1 = (y1, x)

> Source/WebCore/platform/audio/Biquad.cpp:119
> +        mm5 = _mm_mul_pd(mm2, mm0); // (x1*b1 x2*b2)

Insert comma:  (x1 * b1, x2 * b2)
Webkit style includes a space around arithmetic operators.  Do the same for next line.

> Source/WebCore/platform/audio/Biquad.cpp:121
> +        mm0 = _mm_shuffle_pd(mm0, mm1, 1); // (x1 = x, x2 = x1)

Comment is confusing.  Maybe say mm0 = (x, x1)

> Source/WebCore/platform/audio/Biquad.cpp:122
> +        mm4 = _mm_mul_sd(mm4, mma1); // -y1*a1

Be more explicit about contents of mm4.

mm4 = (0, -y1 * a1)

> Source/WebCore/platform/audio/Biquad.cpp:123
> +        mm5 = _mm_add_pd(mm5, mm6); // (x1*b1-y2*a2 x2*b2+x*b0)

Insert comma, add space around operators:

(x1 * b1 - y2 * a2, x2 * b2 + x * b0)

> Source/WebCore/platform/audio/Biquad.cpp:124
> +        n--;

Why decrement n here?  Is this a pipeline issue?  If not, can we put the decrement back into while (n--)?

> Source/WebCore/platform/audio/Biquad.cpp:125
> +        mm6 = mm5;

mm6 = (x1 * b1 - y2 * a2, x2 * b2 + x * b0)

> Source/WebCore/platform/audio/Biquad.cpp:126
> +        mm7 = _mm_load_ss(sourceP);

mm7 = (0, 0, 0, *sourceP)

And we're loading the next x value.

> Source/WebCore/platform/audio/Biquad.cpp:127
> +        mm1 = _mm_cvtss_sd(mm1, mm7); // x = *sourceP

Maybe say exactly what is in mm1?  mm1 = (y1, x)

> Source/WebCore/platform/audio/Biquad.cpp:128
> +        mm5 = _mm_add_pd(mm4, mm5); // (x1*b1-y2*a2 x2*b2+x*b0-y1*a1) 

Same comment as for line 123, 119.

Any advantage in doing mm_add_pd?  It looks like we don't care about the high part of mm5.  Is that right?

> Source/WebCore/platform/audio/Biquad.cpp:129
> +        mm6 = _mm_shuffle_pd(mm6, mm6, 1);

mm6 = (x2 * b2 + x * b0, x1 * b1 - y2 * a2)

> Source/WebCore/platform/audio/Biquad.cpp:130
> +        mm5 = _mm_add_pd(mm5, mm6); // y

mm5 = (x1*b1-y2*a2+x2*b2+x*b0, y = x2*b2+x*b0-y1*a1+x1*b1-y2*a2), and we ignore high part of mm5, so maybe just mm_add_sd?

> Source/WebCore/platform/audio/Biquad.cpp:131
> +        mm7 = _mm_cvtsd_ss(mm7, mm5);

mm7 = (0, 0, 0, y)

> Source/WebCore/platform/audio/Biquad.cpp:133
> +        mm4 = mm5; // y1 = y

mm4 = (x1*b1-y2*a2+x2*b2+x*b0, y)

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.