[Webkit-unassigned] [Bug 42367] Speech input plumbing in webkit

Fri Jul 16 11:40:01 PDT 2010

https://bugs.webkit.org/show_bug.cgi?id=42367

--- Comment #8 from Satish Sampath <satish at chromium.org>  2010-07-16 11:40:01 PST ---
(In reply to comment #7)

Thanks for the comments. Some questions and replies are below.

> WebKit/chromium/public/WebViewClient.h:340
>  +       virtual WebKit::WebSpeechInputClient* speechInputClient() { return 0; }
> instead of introducing a new interface here, just add the methods
> from WebSpeechInputClient to WebViewClient.
> 
> Then rename WebSpeechInputClientListener to WebSpeechInputListener.

I see that other features such as Geolocation have an interface (WebKit::WebGeolocationService) that is implemented in the Chromium render process as a separate dispatcher-like class, which seemed clean. Just double-checking with you: are we moving away from that model?

> WebKit/chromium/public/WebSpeechInputClientListener.h:49
>  +      virtual void setRecognitionResult(const WebString&) { WEBKIT_ASSERT_NOT_REACHED(); }
> didCompleteRecognition?

This method can be called multiple times if partial results become available while the user keeps speaking, so the current name may be more suitable. Is it OK to leave it as is?
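
To make the partial-results case concrete, here is a minimal sketch (not the code in the patch). It assumes the embedder calls setRecognitionResult() once per partial result and that each call carries the full text recognized so far. ResultListenerSketch, InputElementSketch and deliverResults are hypothetical names, and std::string stands in for WebString.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for WebSpeechInputClientListener (std::string replaces WebString).
// The default body mirrors the WEBKIT_ASSERT_NOT_REACHED() style used in the patch.
class ResultListenerSketch {
public:
    virtual ~ResultListenerSketch() = default;
    virtual void setRecognitionResult(const std::string&) { assert(false && "not reached"); }
};

// Hypothetical input element that always shows the most recent result.
class InputElementSketch : public ResultListenerSketch {
public:
    void setRecognitionResult(const std::string& text) override
    {
        m_value = text; // Assumption: each call carries the full text so far.
        ++m_updateCount;
    }

    const std::string& value() const { return m_value; }
    int updateCount() const { return m_updateCount; }

private:
    std::string m_value;
    int m_updateCount = 0;
};

// Simulates the embedder delivering results while the user keeps speaking.
inline void deliverResults(ResultListenerSketch& listener, const std::vector<std::string>& results)
{
    for (const std::string& result : results)
        listener.setRecognitionResult(result);
}
```

With this shape, a name like didCompleteRecognition would be misleading for every call except the last one.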

> WebKit/chromium/public/WebSpeechInputClient.h:45
>  +      virtual bool attachListener(WebSpeechInputClientListener*) { WEBKIT_ASSERT_NOT_REACHED(); }
> why do we need multiple listeners?

We don't need multiple listeners; there should be a 1:1 mapping between the client and the listener. But since the client is fetched from the embedder when WebViewImpl is created, I was not sure how the listener could be attached before that. Any suggestions for making this better?

> WebKit/chromium/public/WebSpeechInputClient.h:57
>  +      virtual void stopRecording() { WEBKIT_ASSERT_NOT_REACHED(); }
> why do we need the recordingComplete (didCompleteRecording) event if
> there is an explicit stopRecording method?  are there other ways that
> a recording could stop such that WebKit would require the notification
> that recording stopped?

stopRecording() is an optional call. Without it, the browser's speech recording 'endpointer' detects silence in the input and stops recording automatically once the user stops speaking. stopRecording() exists so that users can explicitly click the speech/mic button again if they are not familiar with how this works, and for better feature discoverability/usability.
recordingComplete() is issued in both of these cases. The speech element in WebKit explicitly needs to know when recording stops so that it can update the UI and indicate that recognition is in progress (which can take a while if done via a speech recognition server).
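
The flow described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the embedder and the speech element are collapsed into one hypothetical class (SpeechElementSketch), endpointerDetectedSilence() and finalResultReceived() are made-up hooks, and partial results are ignored.

```cpp
#include <cassert>

// Hypothetical model of the UI states the speech element moves through.
enum class SpeechState { Idle, Recording, Recognizing };

class SpeechElementSketch {
public:
    SpeechState state() const { return m_state; }

    // The user clicks the speech button while idle.
    void startRecognition()
    {
        assert(m_state == SpeechState::Idle);
        m_state = SpeechState::Recording;
    }

    // Optional: the user clicks the button again to stop early.
    // Assumption: the embedder answers by issuing recordingComplete().
    void stopRecording()
    {
        if (m_state == SpeechState::Recording)
            recordingComplete();
    }

    // Hypothetical hook: the endpointer detected silence in the input.
    void endpointerDetectedSilence()
    {
        if (m_state == SpeechState::Recording)
            recordingComplete();
    }

    // Hypothetical hook: the final result arrived, so the element is idle again.
    void finalResultReceived() { m_state = SpeechState::Idle; }

private:
    // Recording ended either way; the UI now shows "recognition in progress".
    void recordingComplete()
    {
        assert(m_state == SpeechState::Recording);
        m_state = SpeechState::Recognizing;
    }

    SpeechState m_state = SpeechState::Idle;
};
```

Both paths end in the same Recognizing state, which is why the element needs the notification even though an explicit stopRecording() exists.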

> WebKit/chromium/public/WebSpeechInputClient.h:57
>  +      virtual void stopRecording() { WEBKIT_ASSERT_NOT_REACHED(); }
> should there be some way to start recording again?

startRecognition() does that. This method is named stopRecording() rather than stopRecognition() because the audio recorded so far is still recognized and the result is returned to the input element.

> WebKit/chromium/public/WebSpeechInputClientListener.h:44
>  +      virtual void recordingComplete() { WEBKIT_ASSERT_NOT_REACHED(); }
> since these methods are implemented by WebKit, they should be pure virtual.

Marcus (above) asked me to change from pure virtual to this model, as he felt this was the new hotness. I'm happy to move back to pure virtual if you think that is the way to go.
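
For reference, the practical difference between the two styles can be sketched as below. The class names are hypothetical and assert() stands in for WEBKIT_ASSERT_NOT_REACHED(); this is not the code in the patch.

```cpp
#include <cassert>
#include <type_traits>

// Style used in the patch: a default body that asserts if it is ever reached.
class DefaultBodyListenerSketch {
public:
    virtual ~DefaultBodyListenerSketch() = default;
    virtual void recordingComplete() { assert(false && "not reached"); }
};

// Alternative: pure virtual, so every implementer must override the method.
class PureVirtualListenerSketch {
public:
    virtual ~PureVirtualListenerSketch() = default;
    virtual void recordingComplete() = 0;
};

// Implementers that forget to override recordingComplete().
class ForgetfulDefault : public DefaultBodyListenerSketch { };
class ForgetfulPure : public PureVirtualListenerSketch { };

// With default bodies the omission compiles and only asserts at run time...
static_assert(!std::is_abstract<ForgetfulDefault>::value, "can be instantiated");
// ...whereas with pure virtuals the class stays abstract and cannot be instantiated.
static_assert(std::is_abstract<ForgetfulPure>::value, "cannot be instantiated");
```

Since WebKit implements the listener itself, the pure virtual form turns a missing override into a compile error instead of a run-time assertion.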

> what if there are multiple pages independently listening for speech
> input?  given these interfaces, it seems like one of the pages could
> cancel the speech recognition process, hampering the efforts of the
> other page.  or, is that dealt with at a lower level within WebCore?

As I understand it, each WebViewImpl manages only a single page, but each page can have multiple speech-enabled input elements. The code so far handles the multiple-input-element case by allowing only one input element to record at a time. Speech input code in the browser layer should handle the multiple-page case, which is not implemented yet.
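
The one-element-at-a-time rule could look roughly like the sketch below. It is an assumption that a second request is refused rather than cancelling the first; PageSpeechControllerSketch, didFinish() and the integer element ids are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Sentinel meaning "no element is recording".
constexpr int kNoElement = -1;

// Hypothetical per-page controller; int ids stand in for input elements.
class PageSpeechControllerSketch {
public:
    // Returns true if recording started, false if another element is already recording.
    bool startRecognition(int elementId)
    {
        assert(elementId != kNoElement);
        if (m_activeElement != kNoElement)
            return false;
        m_activeElement = elementId;
        return true;
    }

    // Frees the slot once the active element has finished.
    void didFinish(int elementId)
    {
        if (m_activeElement == elementId)
            m_activeElement = kNoElement;
    }

    int activeElement() const { return m_activeElement; }

private:
    int m_activeElement = kNoElement;
};
```

This only covers elements within one page; arbitration between pages would still have to happen in the browser layer.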
