[Webkit-unassigned] [Bug 75528] Optimize the multiply-add in Biquad.cpp::process
bugzilla-daemon at webkit.org
Thu Mar 15 19:17:59 PDT 2012
https://bugs.webkit.org/show_bug.cgi?id=75528
--- Comment #40 from xingnan.wang at intel.com 2012-03-15 19:17:59 PST ---
(In reply to comment #38)
> Created an attachment (id=132095)
--> (https://bugs.webkit.org/attachment.cgi?id=132095&action=review) [details]
> Updated patch to handle 32 and 64-bit builds
>
> Here is an updated patch. This passes all of the webaudio tests in debug mode. (Release mode test in progress.)
>
> This basically replaces all uses of the edx and ecx registers with their 64-bit counterparts, rdx and rcx.
>
> However, the complexity of maintaining this assembly code is getting rather large, so I'm inclined not to do this. It would be much better if intrinsics could be used, or, even better, do what the FIXME comment says and unroll the while loop with carefully scheduled arithmetic (in C) if that can achieve the desired performance gains.
Hi Ray, thank you very much for fixing this bug. I agree that we should not let the code here get too large.
I have tried several ways of unrolling the loop with intrinsics and with plain C, but none of them matched the performance gain of the assembly version, because it was not easy to pipeline and reorder the instructions the way the hand-written assembly does.
I can still try to find a better way to optimize the code with intrinsics.
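
For readers following the thread, here is a minimal sketch of what unrolling the biquad loop by two in plain C++ could look like. It is not the WebKit code and not any of the patches on this bug. The struct BiquadState and the functions processReference and processUnrolled are hypothetical names, and the filter is assumed to be a direct form I biquad with normalized coefficients (a0 == 1). The sketch only shows that unrolling removes the per-sample state shuffling. It makes no claim about matching the speed of the assembly version.

```cpp
#include <cstddef>

// Hypothetical holder for coefficients and filter history (not WebKit's Biquad class).
struct BiquadState {
    double b0, b1, b2, a1, a2; // normalized coefficients, a0 assumed to be 1
    double x1, x2, y1, y2;     // previous two inputs and outputs
};

// Straightforward loop: one sample per iteration, history shifted every time.
void processReference(BiquadState& s, const float* src, float* dst, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        double x = src[i];
        double y = s.b0 * x + s.b1 * s.x1 + s.b2 * s.x2 - s.a1 * s.y1 - s.a2 * s.y2;
        dst[i] = static_cast<float>(y);
        s.x2 = s.x1;
        s.x1 = x;
        s.y2 = s.y1;
        s.y1 = y;
    }
}

// Unrolled by two: the second sample reuses the first sample's values directly,
// so the history is rewritten once per pair instead of once per sample.
void processUnrolled(BiquadState& s, const float* src, float* dst, std::size_t n)
{
    const double b0 = s.b0, b1 = s.b1, b2 = s.b2, a1 = s.a1, a2 = s.a2;
    double x1 = s.x1, x2 = s.x2, y1 = s.y1, y2 = s.y2;

    std::size_t i = 0;
    for (; i + 1 < n; i += 2) {
        double xa = src[i];
        double xb = src[i + 1];
        double ya = b0 * xa + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
        double yb = b0 * xb + b1 * xa + b2 * x1 - a1 * ya - a2 * y1;
        dst[i] = static_cast<float>(ya);
        dst[i + 1] = static_cast<float>(yb);
        x2 = xa;
        x1 = xb;
        y2 = ya;
        y1 = yb;
    }

    // Handle the last sample when n is odd.
    if (i < n) {
        double x = src[i];
        double y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
        dst[i] = static_cast<float>(y);
        x2 = x1;
        x1 = x;
        y2 = y1;
        y1 = y;
    }

    // Write the local history back so the next call continues seamlessly.
    s.x1 = x1;
    s.x2 = x2;
    s.y1 = y1;
    s.y2 = y2;
}
```

The dependency of yb on ya is still there, which is the feedback path that makes this loop hard to pipeline. The unrolled form mainly gives the compiler more independent multiplies to schedule around that dependency.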