- Scope
- Deliverables
- Related Groups
- Participation
- Communication
- Decision Policy
- Patent Policy
- Additional Information
- About this Charter
HTML Speech Incubator Group Charter
The mission of the HTML Speech Incubator Group, part of the Incubator Activity, is to determine the feasibility of integrating speech technology in HTML5 in a way that leverages the capabilities of both speech and HTML (e.g., DOM) to provide a high-quality, browser-independent speech/multimodal experience while avoiding unnecessary standards fragmentation or overlap.
End date | 30 November 2011 |
---|---|
Confidentiality | Proceedings are public |
Initial Chairs | Dan Burnett, Voxeo; Mike Bodell, Microsoft; Dave Burke, Google |
Initiating Members | |
Usual Meeting Schedule | Primary communication: Email; Teleconferences: As needed; Face-to-face: As needed |
Scope
The scope of work is:
- Development of requirements
- Creation of change requests to HTML5 for the purpose of enabling scenarios that take advantage of speech recognition and synthesis
- Creation of change requests to SRGS, SSML, SISR, VoiceXML 3, or other languages and specifications where appropriate for the purposes of consistency
Technical aspects of the discussion explicitly allowed include, but are not limited to:
- how automatic speech recognition (ASR) will be made available
- audio output control, including text-to-speech processing (TTS)
- extent of support for speech from a variety of sources
- requirements, if any, on audio capture and playback APIs
- whether speech should be a standalone feature or built on lower level audio capture and playback APIs
- extent of support for non-text field input, e.g., radio buttons, checkboxes, comboboxes
- extent of support for user interactions with non-speech components, e.g., maps
- appropriate chrome for visual feedback with speech or for interaction with speech
- security and privacy considerations around making speech available in Web browsers
- extent of support for text-equivalent input (text instead of speech)
- extent of support for different language models (context-free grammars, statistical language models, etc.)
- extent of support for asynchronous/background ASR, for example ongoing background ASR or explicit re-recognition
- extent of support for non-immediate recognition, for example non-real-time ASR, offline transcription, or recognition of recorded audio
- how much visual interaction is required, for example, whether recognition can only be performed against visible items
- whether recognition must be tied to specific items or whether it can be associated with the entire web page, e.g. "help"
- extent of chrome around ambiguity (confidences/n-best; in particular, the behavior of the browser when ambiguity occurs: should the browser provide the behavior, should the page, or both?)
- format of the returned results
- level of author control of recognizer selection (e.g. which recognizer)
- level of author control of recognizer configuration (e.g. parameters)
- level of author control of protocol between browser and speech resource
- extent of support for simultaneous speech and recording
- extent of support for incrementality (e.g. live partial results)
Out of Scope
Technical aspects of the discussion explicitly out of scope are:
- Speaker authentication (identification and verification)
- Dialog representation
- Gesture recognition
- Complex call control
- Browser-level ASR (browser is listening)
- Compound document mixing the execution of VXML with the execution of HTML
Deliverables
There will be one deliverable, the XG Report, which may include
- Requirements
- Use cases
- Change requests to HTML5 and, as appropriate, other specifications, e.g., capture API, CSS, Audio XG, EMMA, SRGS, VoiceXML 3
Related Groups
While there are no dependencies of the XG's work on that of any group other than the HTML Working Group, the XG may review or discuss the work of other related groups as needed to produce a high-quality work product. A partial list of such groups follows.
W3C Groups
- HTML WG
- The XG will ensure that its recommendations are consistent with the current direction of the HTML Working Group.
- Voice Browser Working Group
- We may produce change requests to VoiceXML 3 and other relevant specifications where appropriate for the purposes of consistency.
- Multimodal Interaction Working Group
- We may produce change requests to MMI specifications where appropriate for the purposes of consistency.
- Audio XG
- We may produce change requests to Audio XG specifications where appropriate for the purposes of consistency.
- Device APIs and Policy Working Group
- We may produce change requests to Device APIs and Policy Working Group specifications where appropriate for the purposes of consistency.
Participation
Primary participation is by email. However, conference calls and/or face-to-face meetings may be called as needed. Participants are expected to attend any such teleconferences and face-to-face meetings.
Teleconferences are expected for the first few weeks to set general directions.
Communication
All technical work in this group will be conducted on the public mailing list public-xg-htmlspeech@w3.org (archive) and at any scheduled teleconferences and face-to-face meetings. The group's Member-only list, member-xg-htmlspeech@w3.org (archive), will be used only as necessary for administrative tasks.
Information about the group (deliverables, participants, face-to-face meetings, teleconferences, etc.) is available from the HTML Speech Incubator Group home page.
Decision Policy
As explained in the Process Document (section 3.3), this group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions, the Chair should record a decision (possibly after a formal vote) and any objections, and move on.
Patent Policy
This Incubator Group provides an opportunity to share perspectives on the topic addressed by this charter. W3C reminds Incubator Group participants of their obligation to comply with patent disclosure obligations as set out in Section 6 of the W3C Patent Policy. While the Incubator Group does not produce Recommendation-track documents, when Incubator Group participants review Recommendation-track specifications from Working Groups, the patent disclosure obligations do apply.
Incubator Groups have as a goal to produce work that can be implemented on a Royalty Free basis, as defined in the W3C Patent Policy.
For more information about disclosure obligations for this group, please see the W3C Patent Policy Implementation.
About this Charter
This charter for the HTML Speech Incubator Group has been created according to the Incubator Group Procedures documentation. In the event of a conflict between this document or the provisions of any charter and the W3C Process, the W3C Process shall take precedence.
Charter update history:
- On 2 September 2011, this charter was extended until 30 November 2011.
Daniel C. Burnett, Michael Bodell, Dave Burke
Copyright © 2010 W3C® (MIT, ERCIM, Keio), All Rights Reserved.
$Date: 2011/09/02 08:47:35 $