[webkit-dev] Enable REQUEST_ANIMATION_FRAME on all ports? (was Re: ENABLE flag cleanup strawman proposal)

James Robinson jamesr at google.com
Tue Sep 27 20:57:34 PDT 2011

On Tue, Sep 27, 2011 at 10:34 AM, Chris Marrin <cmarrin at apple.com> wrote:

> On Sep 26, 2011, at 9:48 PM, James Robinson wrote:
> >
> > What happened at frame 9?  Instead of advancing by 15 milliseconds worth,
> the bird jumped forward by twice the normal amount.  Why?  We ran the rAF
> callback twice between frames 8 and 9 - once at 135ms and once at 150ms.
>  What's actually going on here is we're accumulating a small amount of drift
> on every frame (1.66666... milliseconds, to be precise) between when the
> display is refreshing and when the callbacks are being invoked.  This has to
> catch up sometime so we end up with a beat pattern every (16 2/3) / abs(16
> 2/3 - 15) = 10 frames.  The same thing happens with a perfect 16ms timer
> every 25 frames, or with a perfect 17ms timer every 50 frames.  Even a very
> close timer will produce these regular beat patterns and as it turns out the
> human eye is incredibly good at picking out and getting annoyed by these
> effects in an otherwise smooth animation.
> I generally agree with your analysis, but I believe your example is
> misleading. "Skipping a frame" would only cause the bird to jump by 30 units
> rather than 15 if you were simply adding 15 units to its position on every
> call to rAF. But that would make the rate of movement of the bird change
> based on the rate at which rAF is called, and that would be poor design. If
> an implementation decided to call rAF at 30ms intervals (due to system load,
> for instance) then the bird would appear to move half as fast, which isn't
> what you want.
> Assuming you're basing the position on the time at which the animation
> started, then the bird's apparent rate will not change depending on the rate
> at which rAF is firing.
> With that said, I agree with you that there will still be a visual glitch
> in the current implementation. But what's actually happening is that the
> timestamp we're sending to rAF is wrong. We're sending current time.
> Depending on when rAF fires relative to the display refresh, the timestamp
> might be as much as 16ms behind the time the frame is actually seen. If
> you're basing motion on this timestamp, there will be an occasion when one
> frame will have a timestamp that is very close to the display time and the
> next will have a timestamp that is 15ms or so behind. That's why the glitch
> is happening.
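
For concreteness, the drift arithmetic from my earlier example can be reproduced in a few lines. This is just an illustrative sketch (the 60Hz interval of 16 2/3 ms and the 15 ms timer come from the example; none of these names correspond to anything in WebKit):

```javascript
// Display refreshes every 16 2/3 ms (60Hz); the rAF timer fires every 15 ms.
const displayInterval = 50 / 3;
const timerInterval = 15;

// The callbacks drift by |16 2/3 - 15| ms per frame, so they realign with
// the display every displayInterval / drift frames - the beat period.
const beatFrames = Math.round(
  displayInterval / Math.abs(displayInterval - timerInterval)
); // 10 frames, matching the example

// Count the timer callbacks that land between frame 8 (133 1/3 ms) and
// frame 9 (150 ms): the timer fires at 135 ms and again at 150 ms.
const frame8 = 8 * displayInterval;
const frame9 = 9 * displayInterval;
let callbacks = 0;
for (let t = 0; t <= frame9; t += timerInterval) {
  if (t > frame8 && t <= frame9) callbacks++;
}
console.log(beatFrames, callbacks); // two callbacks between two frames
```

Plugging in a 16 ms or 17 ms timer instead gives beat periods of 25 and 50 frames, as in the quoted text.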

I'm assuming in this example that the script changes the position of the
bird to match the timestamp parameter passed in. You are correct in saying
that changing the timestamp parameter to reflect the next display time would
get rid of the visual glitch in this example.  In that case the behavior
between frames 8 and 10 would be:

time (millis) : action
120: rAF fired with timestamp 133 1/3
133 1/3: frame 8 produced
135: rAF fired with timestamp 150
150: rAF fired with timestamp 150
150: frame 9 produced
165: rAF fired with timestamp 166 2/3
166 2/3: frame 10 produced

The problem here is that in the real world, frames aren't infinitely cheap
to produce and so attempting to run the rAF callback twice between frames 8
and 9 is just as likely to produce a rendering glitch as the problem in the
original example - even though the timestamp is correct.  In order to keep
the animation running smoothly here it's necessary to keep the timestamp and
the scheduling in sync with the actual display rate.
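
For reference, this is roughly what "basing the position on the timestamp" looks like in script. It's only a sketch - the 15-units-per-frame speed comes from the bird example, and the names are mine, not part of any API:

```javascript
// Speed from the example: 15 units per 16 2/3 ms frame = 0.9 units per ms.
const unitsPerMs = 0.9;

// Position is a pure function of the timestamp, so the bird's apparent
// speed does not depend on how often the callback happens to fire.
function birdPosition(timestampMs, startMs) {
  return (timestampMs - startMs) * unitsPerMs;
}

// In a page this would be driven by requestAnimationFrame, e.g.:
//   function step(timestamp) {
//     bird.style.left = birdPosition(timestamp, start) + 'px';
//     requestAnimationFrame(step);
//   }

// Whether the callback runs every frame or every other frame, a given
// timestamp always maps to the same position.
console.log(birdPosition(150, 0)); // 135 units at frame 9's display time
```

Note this only fixes the apparent speed; it does nothing about the scheduling problem above - running the callback twice per displayed frame still wastes a frame's worth of work.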

> So I don't believe this has anything to do with Timers per se, but with the
> wrong timestamp we happen to be sending to rAF. We knew this would happen
> and we chose this method because it gave us a nice simple first
> implementation. I still believe it's a fine implementation and it is
> platform independent, so it allows all the ports to support rAF.

> >
> > For this reason, you really need a precise time source that is tied in to
> the actual display's refresh rate.  Not all displays are exactly 60Hz - at
> smaller form factors 50 or even 55Hz displays are not completely unheard of.
>  Additionally the normal clock APIs aren't always precise enough to stay in
> sync with the actual display - particularly on windows it's really hard to
> find a clock that doesn't drift around all over the place.
> >
> > The above analysis assumes that all calls are infinitely fast and there's
> no real contention for system resources.  In practice, though, this is
> rarely the case.  It's not uncommon that the system will temporarily get
> overloaded and have to make tradeoffs between maintaining a high framerate
> and remaining responsive to user input.  In Chromium, we have some logic to
> ensure that we load balance between handling input events and painting to
> ensure that processing one type doesn't completely starve the other.  In a
> multi-process environment, such as WebKit2 or Chromium, there needs to be
> coordination between the two processes in the non-composited path in order
> to paint a bitmap and get it onscreen.  If this logic is all operating
> completely independently from the rAF scheduling then it's very easy to end
> up triggering callbacks at a time when the browser can't produce a frame
> anyway, or painting without invoking the rAF callbacks even if they should
> be invoked.  A related issue is what to do when the rAF callbacks themselves
> cause us to be unable to hit our target framerate - for example by
> invalidating some portion of the page that is very expensive to repaint.  In
> that case, the ideal behavior is to throttle down the rAF callback rate to
> what we can sustain, which requires some feedback from the rest of the
> graphics stack.
> I think the issue of supplying rAF with accurate timestamps is independent
> of whatever feedback mechanism an implementation uses to do the throttling.
> I'm sure those heuristics will improve over time. But the first step is to
> supply rAF with an accurate timestamp. I've opened
> https://bugs.webkit.org/show_bug.cgi?id=68911 for this. My intention is to
> create a call, similar to scheduleAnimation() but which simply asks platform
> specific code for a time estimate of when the next frame will be visible.
> That can not only be used as the timestamp sent to rAF, but as the basis for
> when the next call to rAF is made. That should avoid any excessive calls to
> rAF.
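
If I understand the proposal, the idea is something like the following sketch (all names hypothetical - this isn't the actual WebCore interface, and a real implementation would get the estimate from platform-specific code rather than assuming a fixed 60Hz interval):

```javascript
// Hypothetical sketch: derive both the rAF timestamp and the next wakeup
// from a single estimate of when the next frame will be visible.
const refreshIntervalMs = 50 / 3; // assume a 60Hz display

// Next time a frame will be visible, at or after `nowMs`.
function estimateNextDisplayTime(nowMs) {
  return Math.ceil(nowMs / refreshIntervalMs) * refreshIntervalMs;
}

function scheduleRaf(nowMs) {
  const displayTime = estimateNextDisplayTime(nowMs);
  return {
    timestamp: displayTime,     // passed to the rAF callbacks
    delay: displayTime - nowMs, // how long to wait before firing them
  };
}

// At 135 ms the next frame is at 150 ms, so the callbacks get timestamp
// 150 and fire once - no double callback between frames 8 and 9.
```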

That sounds like a good start, but I don't really think it will be
sufficient.  How will the WebKit layer know when the next frame will be
visible?  There are many considerations in frame scheduling in addition to
the screen's display rate.  I think you'll need to end up duplicating all of
the WebKit-specific frame scheduling logic into the WebCore implementation,
or just be wrong most of the time.

> For Mac, I plan to look into adding a displayLink thread which will
> maintain a timestamp value tied to refresh. I didn't try using a displayLink
> at first because I initially thought I'd use it to actually drive the firing
> of the callback, which would have been complicated and require a lot of
> communication between the threads. Just having the displayLink maintain a
> timestamp means I just need to provide thread safe access to that value.
> Hopefully that will keep overhead low but will achieve the synchronization
> goal.

Getting the refresh interval is only the first step.  In the contention case
(which is the really interesting one) just knowing the refresh time of the
display does not give you enough insight into when to pump the animation in
order to make the next frame.

> >
> > Architecturally I think that WebCore is the wrong place to address these
> issues.  WebCore is responsible for generating repaint invalidations and
> passing them out to the WebKit layer via ChromeClient, and it's responsible
> for painting content when the WebKit layer asks it to.  Otherwise, all of
> the frame scheduling logic that would be relevant to rAF lives outside of
> WebCore in the port-specific layers.  Determining a valid clock source for a
> given graphics stack and deciding when to produce new frames are also highly
> port-specific.
> >
> > Note that I don't think that using a timer is necessarily evil in all
> cases.  With some rendering architectures or graphics libraries, it may not
> be possible to produce a better solution.  We still use a timer in chromium
> in our non-composited path, although it is integrated with our frame
> scheduling and back pressure logic.  Additionally a timer is quite easy to
> code up and works "pretty well" most of the time (although you can be sure
> that your pickier users will complain).  There are also some benefits to
> providing this API even without great scheduling - for example a port can
> throttle the rAF callbacks for non-visible content or tabs without the
> backwards compat issues doing the same thing for setTimeout() would have,
> leading to dramatically lower power and resource consumption in some cases.
> >
> > I still think it's dangerous to provide this as a default for all ports
> to fall back on, because I worry that if it's there ports will use it
> without considering the issues I mention above.
> I don't think you need to worry. The current REQUEST_ANIMATION_FRAME_TIMER
> implementation does what it was intended to do - provide a platform
> independent implementation of requestAnimationFrame. It provides callbacks
> at an even rate and avoids excessive CPU consumption. I don't think the
> occasional animation glitch is a major flaw. It's just an issue that needs
> to be addressed on a platform specific basis to improve animation quality.

Sure, but the only way to achieve that improvement is to replace the
WebCore timer-based system completely.

- James

> -----
> ~Chris
> cmarrin at apple.com
