[Webkit-unassigned] [Bug 234920] ImageBitmap has poor performance on iOS

bugzilla-daemon at webkit.org
Mon Jan 10 09:29:31 PST 2022


https://bugs.webkit.org/show_bug.cgi?id=234920

--- Comment #6 from Simon Taylor <simontaylor1 at ntlworld.com> ---
(In reply to Kenneth Russell from comment #4)
> It looks like the performance is acceptable on macOS; is that correct?

I had only really tested on iOS as we primarily target mobile platforms.

macOS looks pretty slow too - my 2017 Intel MBP gives around 50 FPS in the test case, with 20ms of "upload" time per frame; my new M1 Pro MBP maintains 60 FPS with around 10ms of total upload time. Significant actual CPU work is observed on all platforms.

(In reply to Kimmo Kinnunen from comment #5)
> ImageBitmap seems the correct abstraction for this use-case, but the
> use-case does not seem to make sense to me just based on this explanation.
> [...]
> Everything of course should be optimal but in practice not everything will
> be made optimal. It would be useful to understand what is the source of the
> requirement. 

In more detail, the actual use case is doing some image processing on video frames (in WebAssembly, after a read-back to the CPU), and later rendering the processed frame along with data from that processing. Effectively it's a pipeline where we want to kick off processing of a new frame but keep the old one around for rendering.

Right now we use a single shared WebGL context and a pool of textures. That way we can upload the current frame to a fresh texture from the pool, process it with our shaders / wasm, then flip the rendering over to the new texture once processing is finished. That allows the renderer to keep hold of the previously processed frame until processing is ready on the new one. Bug 203148 is a bit of a problem in that case; right now we just ensure enough time has passed that we expect a new frame to be available, and hope for the best.
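To make that concrete, the flow is roughly the sketch below - "pool", "startProcessing" and "renderer" are placeholder names for our own code, not real APIs:

    // Rough sketch of the current single-context pipeline.
    function onVideoFrame(gl, video) {
      const tex = pool.pop() || gl.createTexture();   // fresh texture from the pool
      gl.bindTexture(gl.TEXTURE_2D, tex);
      gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
      startProcessing(tex).then(() => {
        // The renderer keeps showing the old frame until the new one is ready.
        if (renderer.currentTexture) pool.push(renderer.currentTexture);
        renderer.currentTexture = tex;
      });
    }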

I don't have a need for "arbitrary" numbers of contexts, but using a separate one for the processing would have some advantages:

- Our code uses the WebGL API directly, but our users want to make use of WebGL engines (Three.js, Babylon.js, PlayCanvas, etc.) for their rendering. Engines often maintain a cache of underlying WebGL state to avoid unnecessarily resetting bits that are unchanged. They don't always provide public APIs to invalidate or query those caches, so integrating other WebGL code into a shared context is not really well-supported. One possible solution would be using "gl.get*" to query any WebGL state we might alter and restoring it after our code runs (see the sketch after this list), but that has performance implications on some implementations. The other would be to wrap our low-level code into "Program" abstractions for each engine, which is a lot of work and maintenance burden for a small team.

- On browsers that support OffscreenCanvas, our processing context can run on a worker. The frame is only needed by one context at a time, so it can be transferred to the worker and back again for the renderer. Converting video -> ImageBitmap can only happen on the main thread, so ImageBitmap seems the correct "intermediate" representation.

- With MediaStreamTrackProcessor + OffscreenCanvas, the video frames can be delivered to the worker directly.
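On the state-cache point above, this is the kind of "gl.get*" save / restore I mean. A minimal sketch only - it covers just a few bindings, real code would need to snapshot everything it touches, and each getParameter call is a potential stall on some implementations:

    // Wrap "foreign" WebGL work so the engine's state cache stays valid.
    function withSavedState(gl, fn) {
      const program = gl.getParameter(gl.CURRENT_PROGRAM);
      const texture = gl.getParameter(gl.TEXTURE_BINDING_2D);
      const fbo     = gl.getParameter(gl.FRAMEBUFFER_BINDING);
      const vp      = gl.getParameter(gl.VIEWPORT);
      try {
        fn();                               // our processing passes
      } finally {
        gl.useProgram(program);
        gl.bindTexture(gl.TEXTURE_2D, texture);
        gl.bindFramebuffer(gl.FRAMEBUFFER, fbo);
        gl.viewport(vp[0], vp[1], vp[2], vp[3]);
      }
    }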

> Currently in WebKit ImageBitmap is not implemented to be an optimisation
> across multiple Context2D and WebGL elements.

Without wishing to sound rude - what's the intention of the current implementation then? Is it more about supporting the various color space conversion options and less about performance?

I guess it's natural to see APIs how you want them to be, but to me it feels like the intention of ImageBitmap is to keep hold of potentially-large, uncompressed images so they can be easily consumed in various places. It's up to the developer not to over-use them and to close() them when finished, and in return they should be quick to consume. The availability of a "bitmaprenderer" context for canvas and transferToImageBitmap() for OffscreenCanvas both indicate efficient transfers are one of the main use cases.

For me, createImageBitmap means "please do any prep work / decoding / etc to get this source ready for efficient use in other web APIs - and off the main thread please, just let me know when it's ready".
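In code, the "fast path" I'd hope for is something like the following - all standard APIs, though video / worker / canvas here stand in for whatever the app actually has:

    // Decode / convert once, up front (inside an async function)...
    const bitmap = await createImageBitmap(video, {
      premultiplyAlpha: 'none',
      colorSpaceConversion: 'none',
    });

    // ...then either transfer it to a worker for processing
    // (zero-copy in principle):
    worker.postMessage({ frame: bitmap }, [bitmap]);

    // ...or hand it to a bitmaprenderer context, again without a copy:
    canvas.getContext('bitmaprenderer').transferFromImageBitmap(bitmap);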

I've recently discovered that WebGL2 readPixels into a PIXEL_PACK_BUFFER followed by getBufferSubData (ideally after a sync object has signalled, to avoid blocking on the GPU) has a really nice and efficient implementation in current versions of Safari (great work guys!).

That effectively gives an RGBA ArrayBuffer that can be uploaded pretty efficiently to other WebGL contexts with texImage2D, and I'd guess is pretty efficient in Canvas2D contexts with putImageData too. It can also be transferred efficiently to / from workers, so it basically fulfils most of the hopes I had for ImageBitmap.
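Concretely, that path looks roughly like this (w / h are the frame dimensions; everything here is standard WebGL2):

    // Kick off an async readback: with a PIXEL_PACK_BUFFER bound, readPixels
    // takes a byte offset instead of an ArrayBufferView and returns immediately.
    const buf = gl.createBuffer();
    gl.bindBuffer(gl.PIXEL_PACK_BUFFER, buf);
    gl.bufferData(gl.PIXEL_PACK_BUFFER, w * h * 4, gl.STREAM_READ);
    gl.readPixels(0, 0, w, h, gl.RGBA, gl.UNSIGNED_BYTE, 0);
    const sync = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
    gl.flush();

    // Later (e.g. a subsequent rAF), poll the fence before mapping:
    const status = gl.clientWaitSync(sync, 0, 0);
    if (status === gl.ALREADY_SIGNALED || status === gl.CONDITION_SATISFIED) {
      gl.deleteSync(sync);
      const pixels = new Uint8Array(w * h * 4);
      gl.bindBuffer(gl.PIXEL_PACK_BUFFER, buf);
      gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, pixels);
      gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
      // "pixels" can now be transferred to a worker, or re-uploaded elsewhere:
      // gl2.texImage2D(gl2.TEXTURE_2D, 0, gl2.RGBA, w, h, 0,
      //                gl2.RGBA, gl2.UNSIGNED_BYTE, pixels);
    }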

In browsers with GPU-based Canvas 2D it makes sense that ImageBitmap would map to some sort of GPU texture handle. As Safari uses CPU-based Canvas 2D, a CPU-side blob of pixels seems reasonable too. Right now, though, consuming an ImageBitmap is significantly more costly than a JS-side ArrayBuffer of pixels, which felt pretty unexpected to me.
