[Webkit-unassigned] [Bug 39485] New: Beginnings of a HTML Speech Input Element

Fri May 21 07:15:39 PDT 2010

https://bugs.webkit.org/show_bug.cgi?id=39485

           Summary: Beginnings of a HTML Speech Input Element
           Product: WebKit
           Version: 528+ (Nightly build)
          Platform: PC
               URL: http://docs.google.com/Doc?docid=0AaYxrITemjbxZGNmZzc5cHpfM2Ryajc5Zmhx&hl=en
        OS/Version: Mac OS X 10.5
            Status: UNCONFIRMED
          Severity: Normal
          Priority: P2
         Component: Forms
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: satish at chromium.org
                CC: jorlow at chromium.org

This is a parent bug for all patches related to adding a new input tag type for speech recognition. The element will render like a button (with an embedded status indicator) that the user clicks to start or stop speech recognition. After speech recognition completes, the element's onchange handler is fired with the recognized text as the event's value.

We discussed this proposal with some browser vendors earlier and received good feedback, so we would like to move towards implementing it as a conditionally compiled feature in WebKit (off by default) and gather more web developer input before making a formal proposal to the W3C.

The speech input element itself will appear as a clickable push button with an embedded status indicator/icon. The indicator/icon will be themable, and UAs/platforms can style it to match their current themes.

Backwards compatibility:
1. UAs that don't recognize this new input type will render it as a text input element, and any speech-specific API calls made from JavaScript code will throw an exception due to missing properties/methods.
2. Once the initial implementation is ready, we intend to enable this API in Chrome behind a run-time flag, which will let web developers turn on the feature on their own machines and experiment with it to give useful feedback.

We intend to add this API to WebKit as a series of small steps:

1. Add a bare-bones 'speech' type to the existing HTML input element in WebKit and associated
    rendering code to render it like a push button. This gives a properly rendering control with
    no associated actions.
    a. Add speech input element styles to the UA style sheet
    b. Make HTMLInputElement recognize 'speech' as a valid input type
    c. A new renderer based on RenderButton to draw the speech input element
    d. Platform- and UA-specific themed rendering (RenderThemeXxxxx files) for the
       embedded status indicator
2. Add a speech service to talk to the UA in a platform-specific manner, modeled after
    an existing service such as Geolocation
    a. A new SpeechService and SpeechServiceClient under WebCore/platform
    b. A Chromium-specific extension of the above under WebCore/platform/chromium
    c. A Chromium-specific bridge class under WebKit/chromium to let multiple speech
        input elements in a single page talk to the same provider in the UA
3. Hook up the speech input element with the speech service and the UA
    a. Handle click event and fire onchange in speech control renderer
    b. Render the various states of the speech control via the theme layer
4. Implement UA-specific code (outside WebKit) for handling speech recognition.
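
To make steps 2 and 3 more concrete, here is a rough, standalone C++ sketch of how several speech input elements in one page might share a single provider and fire onchange when a result arrives. It is not WebKit code: only the name SpeechServiceClient comes from the plan above, while SpeechServiceBridge, SpeechInputElement, and every method name are hypothetical stand-ins chosen for illustration. The rule that only one element may record at a time is also an assumption, not something the proposal states.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical client interface; each speech input element would implement it.
class SpeechServiceClient {
public:
    virtual ~SpeechServiceClient() = default;
    virtual void didReceiveResult(const std::string& text) = 0;
};

// Hypothetical per-page bridge: all elements talk to this one object, which
// stands in for the single provider in the UA.
class SpeechServiceBridge {
public:
    // Registers a client and returns a handle identifying it.
    int attach(SpeechServiceClient* client) {
        assert(client);
        int id = m_nextId++;
        m_clients[id] = client;
        return id;
    }
    void detach(int id) {
        m_clients.erase(id);
        if (m_active == id)
            m_active = 0;
    }
    // Assumption: only one element records at a time. Returns false if
    // another element is already recording or the handle is unknown.
    bool startRecognition(int id) {
        if (m_active || !m_clients.count(id))
            return false;
        m_active = id;
        return true;
    }
    void stopRecognition(int id) {
        if (m_active == id)
            m_active = 0;
    }
    // Would be called by the UA-side recognizer (step 4) with the final text.
    void deliverResult(const std::string& text) {
        if (!m_active)
            return;
        auto it = m_clients.find(m_active);
        m_active = 0;
        if (it != m_clients.end())
            it->second->didReceiveResult(text);
    }

private:
    std::unordered_map<int, SpeechServiceClient*> m_clients;
    int m_nextId = 1;
    int m_active = 0; // 0 means no element is recording
};

// Stand-in for the element plus its renderer (steps 3a and 3b).
class SpeechInputElement : public SpeechServiceClient {
public:
    explicit SpeechInputElement(SpeechServiceBridge& bridge)
        : m_bridge(bridge), m_id(bridge.attach(this)) {}
    ~SpeechInputElement() override { m_bridge.detach(m_id); }

    // A click toggles recognition, mirroring the start/stop button behavior.
    void click() {
        if (m_recording) {
            m_bridge.stopRecognition(m_id);
            m_recording = false;
        } else {
            m_recording = m_bridge.startRecognition(m_id);
        }
    }
    void setOnChange(std::function<void(const std::string&)> handler) {
        m_onChange = std::move(handler);
    }
    // Stores the recognized text and fires the onchange handler.
    void didReceiveResult(const std::string& text) override {
        m_recording = false;
        m_value = text;
        if (m_onChange)
            m_onChange(text);
    }
    const std::string& value() const { return m_value; }
    bool isRecording() const { return m_recording; }

private:
    SpeechServiceBridge& m_bridge;
    int m_id;
    bool m_recording = false;
    std::string m_value;
    std::function<void(const std::string&)> m_onChange;
};
```

In this sketch a page would create one SpeechServiceBridge, construct each element with a reference to it, and call click() on an element to start recording. The UA side would then call deliverResult(), which routes the text to whichever element is currently recording. A real implementation would live in the WebCore and WebKit/chromium layers listed above and would report state changes to the theme layer as well.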

An informal spec of the new API, along with some sample apps and use cases can be found at http://docs.google.com/Doc?docid=0AaYxrITemjbxZGNmZzc5cHpfM2Ryajc5Zmhx&hl=en.
