[webkit-dev] Implementing the <device> element

Wed Feb 9 13:10:30 PST 2011

Hi,

On 7 February 2011 18:56, Adam Bergkvist <adam.bergkvist at ericsson.com>
 wrote:

> Hi,
>
> On 2011-02-04 19:21, Leandro Graciá Gil wrote:
> > This is good news! Especially for the situations where WebCore can't
> > directly access the hardware. One existing case of this we should keep
> > in mind are the sandboxed environments, where both the probing and the
> > connections must be requested to somewhere outside the sandbox. Usually
> > this will require to communicate with another process, and in this case
> > asynchronous messages are preferred to avoid delays and to make
> > inter-process communication simpler.
> >
> > However I have to disagree in one point. The specification doesn't say
> > anywhere that we should always present a dialog, only that the device
> > element represents a device selector.
>
> The UI has been a central part of the discussions that lead up to the
> device element proposal. The reason for that is that it should enforce
> security.
>
> See the thread "UI for enabling webcam use from untrusted content":
> http://lists.w3.org/Archives/Public/public-device-apis/2009Dec/0149.html
> especially
> http://lists.w3.org/Archives/Public/public-device-apis/2009Dec/0194.html
>
> You're right that we shouldn't limit ourselves by referring to the device
> selector as a dialog. I've renamed ChromeClient::runDeviceDialog to
> runDeviceSelector and the corresponding client interface accordingly.
>

Thanks. I also think that this name, and more importantly this point of
view, is more appropriate.

>
> > Consider this completely valid use
> > case: instead of asking repeatedly the user to select a device, a UA
> > might decide to create some kind of internal device settings
> > configuration panel to select a set of default devices. Later when
> > visiting a page and clicking the device element the UA will
> > automatically use the preferred devices from its internal settings if
> > they are available and the page is trusted. Where is the dialog here?
> >
>
> Couldn't you just see the internal device settings configuration panel,
> you mention, as the device selector that produces a device list that's
> reused several times? In that case you would skip the dialog and simply
> apply the predefined selections (similar to the case where a "remember
> my choices" check box would be available in the device dialog).
>

Yes, of course. The reason I proposed that use case was to show that we
should not force any dialog, but ask for selection in the way that the UA
decides. This was just an example of what I mentioned in the previous
paragraph and I'm glad that we agree on this.

>
> > I agree that the device should perform selection, as the spec says.
> > However as I've already explained I don't think we need for example a
> > selection dialog for all use cases. Considering that we don't explicitly
> > need a dialog to perform the selection, the only reason to bring lists
> > of available devices back to WebCore is to send them again to a client,
> > probably the same one we asked for probing. Also, if we consider the
> > possibility of sandboxed environments then the device connection
> > operation cannot be a synchronous operation as commented before.
> >
>
> As mentioned above, I see a point in sending the list of available
> devices to WebCore to determine if a new Stream (or other device
> handler) should be created since this behavior should be consistent
> across platforms, regardless if the device is of type "media" or
> "fishtank".
>

So, does that mean that platform-independent WebCore code is going to
determine if a new device handler should be created? I think that some UAs
may like to, for example, keep track of trusted pages for specific types
of devices to determine if such a handler should be created. And that is not
likely to be implemented in the same way by all the different platforms even
if they intend to perform a set of basic common security steps. I completely
agree with you that the behaviour should be consistent across platforms,
especially the security and privacy aspects, but I don't think that forcing
the common code to use available device lists is the way to do it. If we
want a consistent behaviour we should ask for it in the specification and
leave the implementation specific details open, not the other way around.

>
> The device "connection operation" is not handled by the device element. The
> device element is used to simply select devices (similar to how you select
> a file with <input type=file> and get a File object which is just a handle
> to the actual file). The "connection" takes place when you use the device,
> e.g., when you play a Stream in a video element; and that will happen
> asynchronously.
>

I think that we're not talking about exactly the same thing when we say
"connection", and I should probably have been more specific. I'm not implying
that data should flow from the moment we select the device (and we establish
a "connection" in our model). The data should start flowing asynchronously
when there is something to consume it, for example a video element, a file
(StreamRecorder) or a peer connection. If there is no consumer there is no
need to stream or record any device data. The connection state that our
model refers to is about selecting a device and reserving it to be used.

One important point that a selection-based design misses is that potentially
many devices, if not all, will require exclusive access. Let me give an
example. We have two different pages with device elements. First, a user
selects some device (e.g. a microphone) in the first page, but the page
makes no use of it for the moment and hence the real connection is not
performed. Minutes later, without closing the first page, the same user
selects the same device in the second page's device element (with only one
microphone and an average user this could happen very easily), but this page
also makes no use of it at that moment. What happens if, for example, the
first page starts using the device and causes the connections on the second
page to fail? This is almost certainly not the behaviour that the user
expects or wants.

One approach to solve this problem could be to keep a list of busy
devices so that they can't be selected twice if mutual exclusivity is
required. That would work, but devices need a way to get out of that list.
To solve that you need to keep an eye on when a connection fails or closes,
and that means you are creating the concept of a session: you
(may) reserve the device from the time of its selection to the time of its
disconnection. Since we're going to create the concept implicitly anyway,
wouldn't it be better to bring this connection concept out explicitly,
instead of purely selecting the devices? That's exactly the approach we
propose in our design.
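
To make the session idea concrete, here is a minimal sketch of a busy-device
registry. All names (DeviceReservations, reserve, release) are hypothetical
and are not taken from our patch; the id value 0 meaning "no reservation" is
also just an assumption of this example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical registry of exclusive device reservations. It only
// illustrates the "session" concept: a device is held from selection
// until disconnection or failure.
class DeviceReservations {
public:
    using ConnectionId = unsigned;
    static constexpr ConnectionId invalidId = 0;

    // Reserves the device for one connection. Returns invalidId if another
    // connection already holds it, so a second selection fails up front
    // instead of failing later when the first page starts using the device.
    ConnectionId reserve(const std::string& deviceName)
    {
        assert(!deviceName.empty());
        if (m_ownerByDevice.count(deviceName))
            return invalidId;
        ConnectionId id = m_nextId++;
        m_ownerByDevice.emplace(deviceName, id);
        return id;
    }

    // Ends the session and frees the device. Returns false if the id does
    // not hold any device.
    bool release(ConnectionId id)
    {
        for (auto it = m_ownerByDevice.begin(); it != m_ownerByDevice.end(); ++it) {
            if (it->second == id) {
                m_ownerByDevice.erase(it);
                return true;
            }
        }
        return false;
    }

    bool isBusy(const std::string& deviceName) const
    {
        return m_ownerByDevice.count(deviceName) > 0;
    }

private:
    std::unordered_map<std::string, ConnectionId> m_ownerByDevice;
    ConnectionId m_nextId = 1;
};
```

In the two-page example above, the first page's selection would obtain an id
from reserve("microphone"), the second page's selection would get invalidId
until the first page's connection is released.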

> > Reviewing the design with all these factors leaves us the following
> > scheme:
> > - Request device selection asynchronously to the client (not necessarily
> > using a dialog).
> > - Retrieve the available device list.
> > - Forward the list to a client (probably the same that a moment ago
> > probed the devices) to connect them. Do it asynchronously to keep
> > compatibility with sandboxed environments.
> > - Receive the connection request result and some device specific data.
> >
> > So, the available device lists are being sent back to the device element
> > not for making any specific use of them, but for forwarding them to a
> > connection client in an asynchronous model.
> >
> > Wouldn't it be simpler if we refactor the process in this way?
> > - Request device selection asynchronously to a client.
> > - Receive the connection request result and some device specific data.
> >
> > This is exactly what our model proposes. The same goal can be performed
> > by handling connections to devices instead to actual device lists,
> > especially when we're likely to give back the list to the same client
> > that provided it to us. It also avoids any list handling code outside
> > the clients and to implement an intermediate selection/connection
> > element state.
> >
> > To make our proposal clearer, we have uploaded a patch with most of our
> > WebCore implementation. It can be found here:
> > https://bugs.webkit.org/show_bug.cgi?id=53777
> > This patch is not intended to be reviewed (it's too big for that) but
> > to serve as an implementation example of our proposed model. Our
> > original plan was to upload it in small, easy to review pieces. This
> > patch would be intended to be the second of them, after introducing the
> > compilation guards: https://bugs.webkit.org/show_bug.cgi?id=53776
> >
> > I have also created a small diagram to explain how our implementation
> > works:
> > https://docs.google.com/a/google.com/drawings/edit?id=1jSW-6MJd8mp2qPvwnvZnBVzll6UtBz3r1viZgTE4XVA&hl=en&authkey=CPLpy5oJ
> >
> > The basic idea is that we have a device controller that handles all the
> > message forwarding to the platform specific clients in an asynchronous,
> > easy to extend way. Please note that the platform specific clients are
> > not necessarily required to be implemented by WebKit, but they are not
> > designed to be part of the common inter-platform code.
> >
> > Our implementation already implements most details of the device element
> > and the stream / stream recorder objects in the specification by making
> > use of simple messages to the clients and their replies. It's also
> > designed to be easily extended to future media types if the
> > specification requires so.
> >
> > Please try to analyze and understand our proposed model. I'm doing the
> > same with yours, so we can discuss the details and reach a good design
> > in common that suits all of us.
> >
>
> I've looked at your patch in http://webkit.org/b/53777 and have some
> initial comments.
>
> Source/WebCore/html/HTMLDeviceElement.cpp
> 121: return Stream::create(this, m_connection.streamUrl());
> "data" is getter property on the device element and shouldn't create a
> new Stream on every call.
>

You're right on this. Thanks for pointing it out, I've already fixed it.
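
One way to satisfy this is to create the Stream lazily and cache it. The
sketch below uses hypothetical stand-in classes (Stream, DeviceElement,
setStreamUrl) and std::shared_ptr instead of WebKit's RefPtr; it is not the
code from the patch, only an illustration of the pattern.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the Stream object; only the caching pattern
// in DeviceElement::data() is the point of this example.
struct Stream {
    explicit Stream(const std::string& streamUrl)
        : url(streamUrl)
    {
        assert(!url.empty());
    }
    std::string url;
};

class DeviceElement {
public:
    // Called when a new selection succeeds. The cached Stream is dropped
    // so that the next data() call reflects the new selection.
    void setStreamUrl(const std::string& url)
    {
        m_streamUrl = url;
        m_stream.reset();
    }

    // Getter for the "data" property: returns the same Stream object on
    // every call until the selection changes, and null if nothing has
    // been selected yet.
    std::shared_ptr<Stream> data()
    {
        if (m_streamUrl.empty())
            return nullptr;
        if (!m_stream)
            m_stream = std::make_shared<Stream>(m_streamUrl);
        return m_stream;
    }

private:
    std::string m_streamUrl;
    std::shared_ptr<Stream> m_stream;
};
```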

> Source/WebCore/dom/Stream.h
> 58: RefPtr<HTMLDeviceElement> m_deviceElement;
> The Stream is tightly coupled to the device element that created it and
> this doesn't work well with a remote Stream created by ConnectionPeer.
> Your StreamRecorder is also dependent on the device element.
>

Again, you're completely right on this. The design we proposed was too
focused on the local device connection case, to the point that it identified
device connections with the device element itself. Thanks to your
suggestion, we have now revised our design to remove these unnecessary
couplings and dependencies, whilst also supporting remote devices.

I'm glad to present this new and updated design:
https://bugs.webkit.org/attachment.cgi?id=81854  (thanks John for preparing
this!)
I have added its new reference implementation in the bug page (
https://bugs.webkit.org/show_bug.cgi?id=53777 ).

For the moment this new diagram shows the flow for a successful local device
connection and disconnection. We are currently preparing new design diagrams
to explain all the main message flows, including the details of the
interaction with the ConnectionPeer API. I will upload them in the next few
days and post links in this thread.

Here are some of the major changes in our design:

   - We no longer identify the connection (we have discussed our concept of
   connection above) with the device element, but we transform the
   DeviceController to a DeviceConnectionController that is the actual holder
   and manager of all the DeviceConnection objects.

   - We have extended our DeviceConnection object to handle transparently
   both local and remote streams.

   - The DeviceConnectionController becomes a black box creating, destroying
   and managing all the device connections on request. From this point of view,
   the device element is just a client of it creating and using local device
   connections. In the same way, the ConnectionPeer API is the client that
   provides the remote device connections.

   - We have changed the way that the connection listeners work. Now there
   can be many different objects listening to the status updates (connection,
   disconnection, failure, etc.) of a device connection, either local or
   remote. This way every single object (device elements, streams, stream
   recorders...) can properly and clearly handle what to do when the connection
   they are dependent on changes.

   - We have introduced connection ids as the way to communicate with the
   connection controller, and now even requests to connect a local device have
   a connection id associated with them. The connection information and
   descriptors are handled transparently to any client of the connection
   controller and they are automatically cleared when they are no longer needed.

   - Our design talks about remote devices, not only remote streams.
   Regarding the device design, we could easily extend the ConnectionPeer API
   in the future to send data for other non-media types.

   - Since both the device element and the ConnectionPeer API are clients of
   a device connection management service, it would be extremely easy to
   replace the HTML device element with a generic local device client in case
   the HTML element eventually disappears in favor of a device API (something
   that is currently being considered as a real possibility). Most of the
   actual code could be reused in that case.
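
As a rough illustration of the points above (connection ids, several
listeners per connection, automatic cleanup), here is a minimal sketch. The
method names and signatures are hypothetical and simplified; they are not
the interface of the patch, and std::function stands in for whatever
listener interface the real code uses.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

enum class ConnectionStatus { Connecting, Connected, Disconnected, Failed };

// Hypothetical, simplified controller: clients only ever see connection ids.
class DeviceConnectionController {
public:
    using ConnectionId = unsigned;
    using Listener = std::function<void(ConnectionId, ConnectionStatus)>;

    // Every request, local or remote, gets an id before any reply arrives.
    ConnectionId requestConnection()
    {
        ConnectionId id = m_nextId++;
        m_connections[id].status = ConnectionStatus::Connecting;
        return id;
    }

    // Any number of objects (device element, stream, stream recorder...)
    // may listen to the same connection.
    void addListener(ConnectionId id, Listener listener)
    {
        assert(listener);
        auto it = m_connections.find(id);
        if (it != m_connections.end())
            it->second.listeners.push_back(std::move(listener));
    }

    // Called when the platform client replies asynchronously.
    void didChangeStatus(ConnectionId id, ConnectionStatus status)
    {
        auto it = m_connections.find(id);
        if (it == m_connections.end())
            return;
        it->second.status = status;
        // Copy the listeners so they may safely call back into the controller.
        std::vector<Listener> listeners = it->second.listeners;
        // Connection data is cleared once the connection has ended.
        if (status == ConnectionStatus::Disconnected || status == ConnectionStatus::Failed)
            m_connections.erase(it);
        for (const auto& listener : listeners)
            listener(id, status);
    }

    bool hasConnection(ConnectionId id) const { return m_connections.count(id) > 0; }

private:
    struct Connection {
        ConnectionStatus status = ConnectionStatus::Connecting;
        std::vector<Listener> listeners;
    };
    std::map<ConnectionId, Connection> m_connections;
    ConnectionId m_nextId = 1;
};
```

A device element and the Stream it exposes would both register a listener
for the same id, so each can react on its own when the connection fails or
is closed.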

We are also preparing a basic skeleton for the ConnectionPeer API
implementation to show how it can integrate with the proposed device design,
providing remote streams and sending the local ones on request. We expect to
upload a new example code patch in the next few days.

We think that this can be a good proposal for the design, but it can be
better with the feedback of those who are following this thread and may have
ideas to contribute. Please don't hesitate to make suggestions or ask
questions. I'll do my best to explain any details of what we propose.

> > What I mean is that we should put only generic code in WebCore so that
> > it can be used by all platforms. If we code in WebCore some details on
> > how the selection or connection should be performed it may be the case
> > that some platforms won't use part of the WebCore code to avoid them. We
> > should also keep in mind that even if now the spec only focuses in media
> > devices, it is quite possible that in the future new types of devices
> > may be added with different contexts, availability or connection
> > requests, or even security requirements. The more generic we stay about
> > selecting and connecting them, the easier it will be later to adapt to
> > any future changes in the specification.
> >
>
> Even though every platform will have to implement their own device selector
> UI, it is quite likely that several platforms may share the same media
> backend and could thus share the same probing and stream handling code.
> For example, we've based our implementation on GStreamer which can be used
> as a component in the GTK, EFL and WinCairo ports.
>

Yes, you're right on this. I pointed out in my last email that I don't think
that there is any problem in having some platform specific code in WebCore
as long as any platform can choose to use it or to easily delegate this task
to its own WebKit implementation. Sorry if my proposed class diagram
mentioned communicating with WebKit as it's not necessarily the case. We
have already fixed this for the new diagrams.

>
> BR
> Adam
>

Thanks,
Leandro