[Webkit-unassigned] [Bug 75528] Optimize the multiply-add in Biquad.cpp::process
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Mon Mar 19 12:13:27 PDT 2012
https://bugs.webkit.org/show_bug.cgi?id=75528
--- Comment #43 from Raymond Toy <rtoy at chromium.org> 2012-03-19 12:13:27 PST ---
(From update of attachment 132532)
View in context: https://bugs.webkit.org/attachment.cgi?id=132532&action=review
This looks much nicer than the assembly code, and it's faster too! Code looks good, but adding some comments would help to understand what's going on.
> Source/WebCore/platform/audio/Biquad.cpp:113
> + __m128d mma1 = _mm_set_sd(-a1);
mma1 = (0, -a1)
> Source/WebCore/platform/audio/Biquad.cpp:117
> + mm6 = mm1;
Add comment mm6 = (y2, x)
> Source/WebCore/platform/audio/Biquad.cpp:118
> + mm1 = _mm_shuffle_pd(mm1, mm4, 0); // y2 = y1
Add comment that mm1 = (y1, x)
> Source/WebCore/platform/audio/Biquad.cpp:119
> + mm5 = _mm_mul_pd(mm2, mm0); // (x1*b1 x2*b2)
Insert comma: (x1 * b1, x2 * b2)
Webkit style includes a space around arithmetic operators. Do the same for next line.
> Source/WebCore/platform/audio/Biquad.cpp:121
> + mm0 = _mm_shuffle_pd(mm0, mm1, 1); // (x1 = x, x2 = x1)
Comment is confusing. Maybe say mm0 = (x, x1)
> Source/WebCore/platform/audio/Biquad.cpp:122
> + mm4 = _mm_mul_sd(mm4, mma1); // -y1*a1
Be more explicit about contents of mm4.
mm4 = (0, -y1 * a1)
> Source/WebCore/platform/audio/Biquad.cpp:123
> + mm5 = _mm_add_pd(mm5, mm6); // (x1*b1-y2*a2 x2*b2+x*b0)
Insert comma, add space around operators:
(x1 * b1 - y2 * a2, x2 * b2 + x * b0)
> Source/WebCore/platform/audio/Biquad.cpp:124
> + n--;
Why decrement n here? Is this a pipeline issue? If not, can we put the decrement back into while (n--)?
> Source/WebCore/platform/audio/Biquad.cpp:125
> + mm6 = mm5;
mm6 = (x1 * b1 - y2 * a2, x2 * b2 + x * b0)
> Source/WebCore/platform/audio/Biquad.cpp:126
> + mm7 = _mm_load_ss(sourceP);
mm7 = (0, 0, 0, *sourceP)
And we're loading the next x value.
> Source/WebCore/platform/audio/Biquad.cpp:127
> + mm1 = _mm_cvtss_sd(mm1, mm7); // x = *sourceP
Maybe say exactly what is in mm1? mm1 = (y1, x)
> Source/WebCore/platform/audio/Biquad.cpp:128
> + mm5 = _mm_add_pd(mm4, mm5); // (x1*b1-y2*a2 x2*b2+x*b0-y1*a1)
Same comment as for line 123, 119.
Any advantage in doing mm_add_pd? It looks like we don't care about the high part of mm5. Is that right?
> Source/WebCore/platform/audio/Biquad.cpp:129
> + mm6 = _mm_shuffle_pd(mm6, mm6, 1);
mm6 = (x2 * b2 + x * b0, x1 * b1 - y2 * a2)
> Source/WebCore/platform/audio/Biquad.cpp:130
> + mm5 = _mm_add_pd(mm5, mm6); // y
mm5 = (x1*b1-y2*a2+x2*b2+x*b0, y = x2*b2+x*b0-y1*a1+x1*b1-y2*a2), and we ignore high part of mm5, so maybe just mm_add_sd?
> Source/WebCore/platform/audio/Biquad.cpp:131
> + mm7 = _mm_cvtsd_ss(mm7, mm5);
mm7 = (0, 0, 0, y)
> Source/WebCore/platform/audio/Biquad.cpp:133
> + mm4 = mm5; // y1 = y
mm4 = (x1*b1-y2*a2+x2*b2+x*b0, y)
--
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
More information about the webkit-unassigned
mailing list