[Webkit-unassigned] [Bug 73789] Need SSE optimization for SincResampler::Process()
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Mon Dec 5 10:16:33 PST 2011
https://bugs.webkit.org/show_bug.cgi?id=73789
Benjamin Poulain <benjamin at webkit.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #117871|review? |review-
Flag| |
--- Comment #14 from Benjamin Poulain <benjamin at webkit.org> 2011-12-05 10:16:33 PST ---
(From update of attachment 117871)
View in context: https://bugs.webkit.org/attachment.cgi?id=117871&action=review
> Source/WebCore/ChangeLog:7
> + Implement the SSE optimization in SincResampler::process()
> + https://bugs.webkit.org/show_bug.cgi?id=73789
> +
> + Reviewed by NOBODY (OOPS!).
> +
You should add a little description, mentioning the performance gain of the change.
> Source/WebCore/ChangeLog:10
> + * platform/audio/SincResampler.cpp:
> + (WebCore::SincResampler::process):
> 2011-12-05 Roland Steiner <rolandsteiner at chromium.org>
Missing space at the end of the ChangeLog.
> Source/WebCore/platform/audio/SincResampler.cpp:264
> + while ((reinterpret_cast<size_t>(inputP) & 0x0F) && n) {
This should be uintptr_t instead of size_t.
> Source/WebCore/platform/audio/SincResampler.cpp:270
> + int group = n / 4;
Instead of tracking this, you could find the end, and then loop until the end:
while (mInput < end) {
...
}
The advantage is you remove the computation of group-- from a tight loop.
> Source/WebCore/platform/audio/SincResampler.cpp:282
> + sums1 = _mm_setzero_ps();
> + sums2 = _mm_setzero_ps();
You can move those to the lines where you declare sums1 and sums2.
> Source/WebCore/platform/audio/SincResampler.cpp:284
> + k1Aligned = !(reinterpret_cast<size_t>(k1) & 0x0F);
> + k2Aligned = !(reinterpret_cast<size_t>(k2) & 0x0F);
This should be uintptr_t instead of size_t.
> Source/WebCore/platform/audio/SincResampler.cpp:298
> + mul1 = _mm_mul_ps(mInput, mK1);
> + mul2 = _mm_mul_ps(mInput, mK2);
> + sums1 = _mm_add_ps(sums1, mul1);
> + sums2 = _mm_add_ps(sums2, mul2);
> +
> + inputP += 4;
> + k1 += 4;
> + k2 += 4;
This code is copied three time. It could be moved to a macro CONVOLVE_4_SAMPLE_SSE.
You can try an inline function but I think that will fail with MSVC compiler due to a bug when passing vectors as arguments.
--
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
More information about the webkit-unassigned
mailing list