[webkit-dev] Media elements meeting notes

Eric Uhrhane ericu at google.com
Tue Apr 13 14:52:00 PDT 2010


Rough, but prompt, here they are.

     Eric

What do people want to get out of the session?

Darin Adler: Use webkit loader to get data into media elements--currently all
the media elements just get a URL and work around it themselves.

Sam Weinig: What new things can we do with the media element?  Integrate canvas
with audio APIs, editing APIs?

Maciej Stachowiak: There are proposals for captioning and other accessibility
stuff with media elements.  What are our plans?

Eric Seidel: Who's working on what?
             How does media interact with WebKit2?

Simon Fraser [smfr]: Talk about fullscreen. [Maciej: Should the nonstandard
webkit extension for fullscreen be proposed as a standard?]

Alex Roessler: What do we do about streaming?

Maciej: How does the code fit together?  What are the layers like?

Eric Carlson: Media layout tests.

Which ports are active in media?  Apple, Chromium, Qt, Gtk.

Architecture:

HTMLMediaElement, exposed to DOM.  Is either VideoElement or AudioElement.
Implements DOM API.  When the element has to load a file, it creates an
instance of a MediaPlayer object.

MediaPlayer: has API to do all the things an HTMLMediaElement might be asked to
do: load, seek, set volume.  It is HTMLMediaElement's interface to the platform
media engine.  However, MediaPlayer doesn't actually talk to the platform media
engine.  When asked to load a file, it gets URL, mime type, optional codecs
param.  It queries the installed media engine(s) to find any that can handle
this type/codec.  If it finds one, it creates an instance of it, a MediaPrivate.

When HTMLMediaElement is asked to start playing, it calls to MediaPlayer, which
calls to MediaPrivate, which calls to media engine.  There's a similar
backchannel that traces back up that path to report state information, ask about
the environment, HW accel, etc.
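
A rough sketch of the chain described above, for readers who don't know the
code.  This is not the actual WebCore code: EngineEntry, LoggingPrivate and all
method signatures are made up for illustration, and MediaPrivate is just the
name used in these notes.  It only shows the shape: MediaPlayer asks each
registered engine whether it handles the type/codec, instantiates the first one
that does, and forwards later calls to it.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-engine object the notes call MediaPrivate.
class MediaPrivate {
public:
    virtual ~MediaPrivate() = default;
    virtual void load(const std::string& url) = 0;
    virtual void play() = 0;
};

// One registered engine: a type/codec support check plus a factory.
// (Illustrative only; not a real WebKit structure.)
struct EngineEntry {
    std::string name;
    std::function<bool(const std::string& mimeType, const std::string& codecs)> supports;
    std::function<std::unique_ptr<MediaPrivate>()> create;
};

// Sketch of MediaPlayer: picks an engine, then forwards calls to its MediaPrivate.
class MediaPlayer {
public:
    explicit MediaPlayer(std::vector<EngineEntry> engines)
        : m_engines(std::move(engines)) {}

    // Returns false when no installed engine claims the type/codec pair.
    bool load(const std::string& url, const std::string& mimeType,
              const std::string& codecs = "") {
        for (const auto& engine : m_engines) {
            if (engine.supports(mimeType, codecs)) {
                m_private = engine.create();
                m_engineName = engine.name;
                m_private->load(url);
                return true;
            }
        }
        return false;
    }

    // HTMLMediaElement -> MediaPlayer -> MediaPrivate -> engine.
    void play() {
        if (m_private)
            m_private->play();
    }

    const std::string& engineName() const { return m_engineName; }

private:
    std::vector<EngineEntry> m_engines;
    std::unique_ptr<MediaPrivate> m_private;
    std::string m_engineName;
};

// Toy engine that only records what it was asked to do.
class LoggingPrivate : public MediaPrivate {
public:
    explicit LoggingPrivate(std::vector<std::string>& log) : m_log(log) {}
    void load(const std::string& url) override { m_log.push_back("load " + url); }
    void play() override { m_log.push_back("play"); }

private:
    std::vector<std::string>& m_log;
};
```

The backchannel (state changes, new-frame notifications) would run the other
way through the same objects and is left out here.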

Q: What here actually puts pixels on the screen?

A: The video element has a RenderMedia, which knows about the element, and can
get through there to the MediaPlayer.  In the non-accel case, when
PlatformEngine tells MediaPrivate that it's got a new frame of video, that calls
up to MediaPlayer to RenderMedia.  When the paint message comes, same chain in
the other direction.

HWAccel case: different on each platform.  Ecarlson gives some details of how
magic happens on one platform.

Q: Where does image decoding work?

A: YUV->RGB conversion?
Implementation detail, different per platform/hardware.  All outside of WebCore
itself.  The paint call gets delegated down the chain to the point that it can
be passed to platform-specific code.

New topic: Captions and other accessibility features.

There are a number of proposals in front of the WG, from the accessibility task
force, to add features to make media more accessible.  Option one: tracks API to
give scripts access to track-level info about movie.  Pick between tracks,
enable/disable tracks, enumerate CC tracks, add UI for them.  Option two: in
markup, refer to external caption file.  Media engine then has to load+display
in sync.  WebKit already has something like option one, with webkit-prefixed
properties exposed.  It would be very straightforward to implement the WG
proposal.

New topic: Fullscreen.

HTML says you should not expose fullscreen to script for security reasons.  And
yet we implement this on Mac+Win already, due to numerous requests.  We add a
couple of attributes by which a script can ask if a movie supports fullscreen
[maybe not supported for audio only], if it's currently in fullscreen, and can
move it in+out of fullscreen.  Security concern: trick the user into thinking
that they're looking at the desktop and get them to reveal private info.

How we deal with this:
  Require a user gesture to get in+out of fullscreen.
  Animate in+out in obvious fashion.
  Only the video element goes fullscreen; no custom controls can be drawn over
  it.
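
A minimal sketch of the gating rules above as a single check.  Every name here
(FullscreenRequest, decideFullscreen, the enum values) is hypothetical and not
taken from WebKit, and the audio-only rule is only the "maybe" from the notes.
The animated transition is a UI matter and isn't modeled.

```cpp
// Hypothetical description of a script's fullscreen request.
struct FullscreenRequest {
    bool fromUserGesture;  // made while handling a click or key press?
    bool elementIsVideo;   // only the video element itself may go fullscreen
    bool hasVideoTrack;    // assumption: audio-only media is refused
};

enum class FullscreenDecision {
    Allow,
    DenyNoGesture,
    DenyNotVideo,
    DenyNoVideoTrack
};

// Applies the checks in order and reports the first one that fails.
inline FullscreenDecision decideFullscreen(const FullscreenRequest& r) {
    if (!r.fromUserGesture)
        return FullscreenDecision::DenyNoGesture;
    if (!r.elementIsVideo)
        return FullscreenDecision::DenyNotVideo;
    if (!r.hasVideoTrack)
        return FullscreenDecision::DenyNoVideoTrack;
    return FullscreenDecision::Allow;
}
```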

What do other ports think of this?

  Gtk: We're interested, but the current implementation is hard for us to use.
  It would be easier if we could access VSYNC directly.

  Ecarlson: Check out the windows port; we do things a little bit differently.

Andrew Scherkus: What do we do about the page in the background?  What if it
pops up an alert?  Do we pause?  Block popups?
We'll have to check, but it probably just comes to the front.  Maciej: perhaps
spec it as if it's a separate full-screen window in front of the browser.
Darin: We did [foo] years ago, and it was super annoying; let's not do that
again.

Andrew: What about keyboard trapping concerns?
Ecarlson: We block all key events.

Maciej: Would be cool to have any element able to go fullscreen.  But the
security issues are much bigger outside of video.  Spoofing attacks are
mitigated with the current video implementation with its keyboard/mouse block
and animated transitions.

Darin: Current HTML spec explicitly says not to do what we're doing.  How do we
fix this?
Maciej: I think we can explain how our limited implementation isn't sufficiently
dangerous to be a problem.  It would be nice to get that out of the spec.

Maciej: The point of the user-interaction requirement isn't to make everything
safe, it's to make sure that the user was there when the transition happened.
Eric Uhrhane: And to prevent all ads from forcing themselves to fullscreen.

New topic: Streaming.

Currently the only way to tell if you're streaming a file is if its duration is
infinite.
Darin: Our internal interface is adequate for streaming; the plumbing's all
there.

Alex: What's the current state?
Ecarlson: We currently can't use the WebKit loader for streaming.
Darin: None of the current engines use the loader, so it's not blocking them,
but if they want to use it, it's got to have streaming.
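
The duration check mentioned at the start of this topic might look roughly like
the following.  The function name is made up, and it assumes the duration is
reported in seconds as a double with positive infinity meaning a live stream.

```cpp
#include <cmath>

// Assumption: a live stream reports +infinity as its duration, while NaN
// (duration not yet known) and finite values mean "not a stream".
inline bool looksLikeStream(double durationSeconds) {
    return std::isinf(durationSeconds) && durationSeconds > 0;
}
```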

New topic: Loader.

Things that would be improved by unifying loading between media elements and the
rest of webkit:
  Unified resource security model.
  Make application cache work for media.
  Make WebCore cache work for media, better cache balancing by unifying caching
  with that of the platform MediaEngine cache.
  Make Inspector work better for media.
  Make authentication work [cookies, referrer, http-auth].
  Easier to add new schemes [data://, file URNs], since it would work with all
  protocols supported by the loader.

We have another list of requirements for the loader before it could be used.
Requirements and drawbacks of unification:
  Need to support non-web protocols such as rtsp.
  May need 2 code paths.
  Need to add byte-range and discontinuous cache support.
  Need to support files > 2GB.
  May be difficult to hook our loader into platform media engine.
  Some media engines' loaders are optimized for media playback heuristics; we'd
  need to match that behavior.
  Andrew: Threading limitations on resource loaders?
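
To make the byte-range and > 2GB items concrete, here is a small sketch of
building an HTTP Range header value with 64-bit offsets.  The helper name is
hypothetical; the "bytes=first-last" syntax is standard HTTP, and the point is
only that offsets must not be held in a 32-bit signed integer.

```cpp
#include <cstdint>
#include <string>

// Builds the value of an HTTP Range header for an inclusive byte span.
// 64-bit offsets keep this correct for files larger than 2GB.
inline std::string byteRangeHeader(std::uint64_t first, std::uint64_t last) {
    return "bytes=" + std::to_string(first) + "-" + std::to_string(last);
}
```

A seek into the middle of a large file would then issue a request for something
like byteRangeHeader(3000000000, 3000065535), and the cache would need to
remember that it holds only that span.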

Michael Nordman said something here about using ResourceHandle for this.  He can
flesh that out, as I didn't follow the discussion.

New topic: Media Layout Tests.

Ecarlson: A pain, due to platform-specific and engine-specific behaviors.  You
have to generate all combinations of results every time you change anything.
It would be handy to have a trybot that would do result generation for new tests
and changes for you.

??: Often you can test without dumping render tree if there are other mechanisms
for introspection to verify results.  Some of these can be platform-independent.

Sam: What about using an ugly platform-independent theme to be used only for
testing?
Ecarlson: What we need to test is what we need to ship.

Darin: Perhaps push out from the middle: Make extremely platform-specific tests
and totally platform-independent tests, and don't have so many compromises in
the middle.

Ecarlson: The problem is that sometimes you really need to use setTimeout.  If
you hit play, you need to make sure that time actually advances.

Andrew+Ecarlson are pretty much using the trybot technique, but doing it by
breaking the build and then fixing it every time.  It'd be really nice to have a
bot that would generate a nicely-packaged set of results that's easy to check in.

New topic: Media Elements In WebKit2

Sam: There's no sandbox.  It's no different.
Maciej: When there's a sandbox, we'll have to accommodate that.
Darin: Today, some codecs already run in a separate process.  Let's make sure
they all do.
Maciej: Don't need to sandbox all of them.  Only if they need capabilities that
the web process doesn't normally need do we need to move them out.  For example,
Core Animation can already draw across processes.

Andrew: It was really hard to get audio to work across processes in real time.
Plan ahead for this.  One process decodes untrusted audio, the other accesses
system sound device.  XP lets sandboxed code touch the audio device, Vista
doesn't, so a solution has to deal with this sandbox specification granularity.

Darin: Our experience sandboxing plugins can help inform any needed sandboxing
of media engines.

