SpeechObjects Specification V1.0
W3C Note 14 November 2000
- This Version:
- https://www.w3.org/TR/2000/NOTE-speechobjects-20001114
- Latest version:
- https://www.w3.org/TR/speechobjects
- Editors:
- Daniel C. Burnett, Nuance Communications <burnett@nuance.com>
Copyright © 2000 Nuance Communications, Inc. All rights reserved.
Abstract
This document describes SpeechObjects, a core set of reusable dialog components, callable through a dialog markup language such as VoiceXML, that perform specific dialog tasks such as obtaining a date or a credit card number. The major goal of SpeechObjects is to complement the capabilities of the dialog markup language and to leverage best practices and reusable component technology in the development of speech applications.
Status of this document
This document is a submission to the World Wide Web Consortium from Nuance Communications, Inc. (see Submission Request, W3C Staff Comment). For a full list of all acknowledged Submissions, please see Acknowledged Submissions to W3C.
This document is a Note made available by W3C for discussion only. This work does not imply endorsement by, or the consensus of the W3C membership, nor that W3C has, is, or will be allocating any resources to the issues addressed by the Note. This document is a work in progress and may be updated, replaced, or rendered obsolete by other documents at any time.
A list of current W3C technical documents can be found at the Technical Reports page.
Table of Contents
- Introduction
- Background
- Concepts
- SpeechObject specifications
- Parameters and return values common to all SpeechObjects
- Dialog
- Yes/No
- Quantity
- Simple Digit String
- Date
- Time
- Menu
- US Currency
- North American Telephone Number
- Alpha Digit String
- Confirm and Correct
- Browsable Selection
- Browsable Action
- US Zip Code
- Credit Card Info
- Sectioned Digit String
- Browsable List
- Appendix
1. Introduction
SpeechObjects are reusable software components that encapsulate discrete pieces of conversational dialog. SpeechObjects are based on an open architecture that can be deployed on any of the major server and IVR (interactive voice response) platforms. This paper describes a specification based on Nuance's Java implementation of SpeechObjects.
Simply stated, a SpeechObject is a reusable software component that implements a dialog flow and is packaged with the audio prompts and recognition grammars that support that dialog. A Java call to a SpeechObject can be as simple as
// Initialize the SpeechObject
SODate date = new SODate();
// Invoke the SpeechObject
SODate.Result result = date.invoke(sc, dc, cs);
// Look at the results
int month = result.getMonth();
int day = result.getDayOfMonth();
int year = result.getYear();
In this document we will present both the configuration parameters (JavaBean properties) and the return values for each of the SpeechObjects.
2. Background
2.1 Architectural model
The Java SpeechObjects architecture was designed to be portable and extensible, as well as easy to use. To this end SpeechObjects are all based on a primary interface, SpeechObject. This simple interface defines:
- A single method, invoke, that applications call to run a SpeechObject
- An inner class, Result, used to return the recognition results obtained during the dialog executed by the SpeechObject
From the SpeechObject interface and a set of supporting interfaces, SpeechObject developers can build objects of any complexity that can be run with a single method call. The invoke method for any given SpeechObject executes the entire dialog for that SpeechObject. A simple invoke method might just play a standard prompt, wait for speech, and return the results after recognition completes. A more complicated invoke method could include multiple dialog states, smart prompts, intelligent error handling for both user and system errors, context-sensitive help, and any other features built in by the SpeechObject developer.
To call a SpeechObject from your application, however, doesn't require you to know anything about how the invoke method is implemented. You only need to provide the correct arguments and know what information you want to extract from the results.
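The interfaces just described can be sketched in Java as follows. The class and method names mirror those used in this document; the stub implementations and the toy SOYesNo are illustrative assumptions, not the Nuance implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class SpeechObjectSketch {
    interface SpeechChannel { }      // recognition services (stub)
    static class DialogContext { }   // data shared between SpeechObjects (stub)
    static class CallState { }       // per-call counters and state (stub)

    // Result: the key/value pairs accumulated during the dialog.
    public static class Result {
        private final Map<String, Object> slots = new HashMap<>();
        public void put(String key, Object value) { slots.put(key, value); }
        public Object get(String key) { return slots.get(key); }
    }

    // The single-method interface that every SpeechObject implements.
    interface SpeechObject {
        Result invoke(SpeechChannel sc, DialogContext dc, CallState cs);
    }

    // A toy SpeechObject whose "dialog" simply fills one slot; a real one
    // would play prompts and run recognition inside invoke.
    static class SOYesNo implements SpeechObject {
        public Result invoke(SpeechChannel sc, DialogContext dc, CallState cs) {
            Result r = new Result();
            r.put("YesNo", "yes");
            return r;
        }
    }

    public static String demo() {
        Result r = new SOYesNo().invoke(null, new DialogContext(), new CallState());
        return (String) r.get("YesNo");
    }
}
```

However complex the dialog inside invoke becomes, the caller's view stays exactly this small.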
The SpeechChannel is the object that provides recognition functionality to a SpeechObject. When an application is launched, the environment allocates a SpeechChannel for each supported port. This SpeechChannel is passed to the application for each incoming call and persists until the application terminates. The SpeechObjects that make up the application use the SpeechChannel to interact with the caller: requesting recognition services, playing prompts, setting configuration parameters, and so on. Telephony control is an optional component of the SpeechChannel.
2.2 Goals and principles of Design
In short, SpeechObjects
- are components
- Components are software objects that are complex enough to provide useful functionality, but small enough to be broadly reusable. A well-designed component exposes the properties that allow users to tailor it for specific needs. As such, components help make application developers more efficient and significantly reduce the time to market while improving overall application quality.
- are uniform
- Both the simple Yes/No SpeechObject and the more complex Confirm and Correct SpeechObject implement the same SpeechObject interface. This characteristic allows them to be treated identically by someone who is using them to implement a higher-level dialog. For example, a "flowchart" SpeechObject might contain a set of SpeechObjects, and invoke them in some order determined by its property settings. The container SpeechObject doesn't need to know any of the specifics of the SpeechObjects it contains; it only cares about the attributes they have in common, such as the way they are invoked and the format of the results they return. Because this kind of uniformity encourages reuse and simplifies the programming model, all SpeechObjects are required to implement a core set of methods defined in the SpeechObject interface.
- are interoperable
- Not only do all SpeechObjects implement the same interface, but as mentioned earlier they rely on another
abstract interface, called a SpeechChannel, to
execute their dialogs. The SpeechChannel provides the interface to a
speech recognition subsystem, and is presented as an abstract
interface so that any speech recognition vendor can implement it.
This allows SpeechObjects written by any developer to run on top of
a SpeechChannel supporting any recognition engine, provided that
both providers adhere to the SpeechObject standard. Additionally,
SpeechObjects written by one vendor can refer to, invoke, extend, or
encapsulate SpeechObjects provided by another vendor. SpeechObjects
are also callable from the VoiceXML dialog markup language. The
Dialog Markup Language, presently being specified by the World Wide
Web Consortium, is based on VoiceXML and is therefore expected to be
able to call SpeechObjects as well.
Here is an example of how to call a SpeechObject from within VoiceXML:
<object name="appt_date"
        classid="speechobject://nuance.so.SODate">
  <param name="initialPrompt"
         type="nuance.common.prompts.URLPrompt">
    <param name="URL" type="java.net.URL"
           expr="../prompts/when.wav"/>
  </param>
</object>
- are object-oriented
- In addition to the fact that SpeechObjects are themselves objects, SpeechObjects support an object-oriented development method. Application development is SpeechObject-centric, encouraging speech application developers to develop and package audio prompts, recognition grammars, and any other required resources together with the SpeechObject. Because these dialog elements are interrelated and interdependent, keeping them together results in better dialogs. The object-based nature of SpeechObjects makes them more interoperable with other software development environments and easier to package, deploy and install.
- vary in power and complexity
- Any unit of dialog of any complexity can be packaged as a
SpeechObject. A common dialog example is a SpeechObject that knows
how to elicit a date from a caller. In an ideal scenario, this is an
almost trivial dialog: the application prompts for a date and the
caller responds. The SpeechObjects paradigm, however, allows much
more sophisticated functionality to be built in, including
error-handling, disambiguation, and reprompting that allow recovery
from speaker or recognizer errors. This is provided without making
the SpeechObject more complex to use, and turns even a "simple"
dialog, such as eliciting a date, into an extremely valuable
object.
At the other end of the complexity spectrum, domain-specific dialogs can also be packaged as SpeechObjects. For example, an application that works with a caller to perfect a travel itinerary can be packaged as a single SpeechObject. An application developer could use this SpeechObject as a turnkey application or embed it as a subsystem in a larger application, eliminating days or weeks of development that would have been spent on the travel dialog.
Nuance's Enterprise SpeechObjects demonstrate this flexibility, providing SpeechObjects for vertical domains such as Customer Relationship Management (CRM) and order entry.
- are modular
- The travel itinerary SpeechObject described above would probably not be written from scratch; it would use other SpeechObjects as necessary. For example, to ask the caller for the departure date, it would delegate that part of the interaction to a specialized "date" SpeechObject, and so on. In this way more complex SpeechObjects can be constructed using simpler SpeechObjects. This ability to leverage smaller components when building larger components is an important contributor to the power of component-based technology.
- exist in a universal namespace
- Java SpeechObjects adhere to the Java convention of naming classes in a globally unique way. For example, Nuance's SpeechObjects are part of the nuance.so Java package. Other vendors should follow the same convention, guaranteeing that SpeechObjects from different vendors have globally unique names and can coexist. The SpeechObject architecture uses a similar convention for naming audio prompt files to avoid naming collisions.
- leverage existing standards
- SpeechObjects make use of existing standards wherever possible. For example, the universal commands described in section 3.1 were proposed by AVIOS' Telephone Speech Standards Committee (TSSC).
2.3 Implementation Platform and Requirements
This section briefly describes the process of invoking a SpeechObject as a motivation for the runtime requirements and then presents the SpeechChannel.
2.3.1 Invocation process
To use a SpeechObject from your application, you simply instantiate it and call its invoke method.
The invoke method executes the dialog defined by the SpeechObject, and returns an instance of the Result class used by that SpeechObject. This result provides your application with the data that was accumulated during the dialog.
The invoke method takes several arguments:
- A SpeechChannel object, allocated to your application for the duration of a call. The SpeechChannel provides the bridge between your application and the Nuance recognition client, giving you access to recognition functionality and telephony control features. The SpeechChannel is described below in more detail.
- A DialogContext object that can be used to pass information between the SpeechObjects that make up a dialog.
- A CallState object that maintains information about the current call, such as the number of times the user has asked for help or the number of errors that have occurred, and the overall application state. The CallState object references an additional object, an AppState, that stores application state information that applies across all handled calls.
2.3.2 Runtime environment requirements
In order for the invocation described above to work, a platform must implement a SpeechChannel and provide a launcher that creates the DialogContext and CallState and invokes the object.
2.3.3 SpeechChannel
The SpeechChannel is an integral part of a SpeechObjects application. Application developers use SpeechObjects to implement the dialog flow, and SpeechObject developers use SpeechChannel methods to implement the recognition functionality of the dialog.
This section describes the abstract SpeechChannel architecture in more detail.
SpeechChannel interfaces
Functionality provided by the SpeechChannel is actually separated into five interfaces: the main speech channel interface that provides recognition functions, and four separate interfaces that define the functionality for:
- Prompt playback, that is playing out standard audio prompts - either pre-recorded or to a text-to-speech engine, and managing the prompt queue.
- Dynamic grammars, that is, the ability to create and modify grammars at runtime. This is typically used to build caller-specific grammars, for example, for personal address books. It can also be used to allow SpeechObjects to construct grammars on the fly.
- Speaker verification, that is, the ability to verify that a speaker is who they claim to be by analyzing their voice.
- Telephony features. This interface is optional, allowing SpeechChannel implementations to more easily be created for non-telephony platforms or for specialized telephony requirements.
The SpeechChannel is the primary object and provides access to the corresponding implementation of the other interfaces. SpeechObjects work with the single SpeechChannel object passed to them and can access the other interfaces when needed:
These interfaces can be implemented in the same class, or in separate classes, as appropriate for the platform. In either case, the SpeechChannel interface defines methods that return each of the other interfaces. For example, if a SpeechObject wanted to access dynamic grammar functionality, it would call the SpeechChannel getDynamicGrammarControl method and use the returned object to make dynamic grammar requests.
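The accessor pattern can be sketched as follows. Only getDynamicGrammarControl appears in the text; the other accessor and interface names are assumed analogues, and the all-in-one class is just one of the two layouts the text permits.

```java
public class SpeechChannelSketch {
    interface PromptPlayback { }          // prompt queue management
    interface DynamicGrammarControl { }   // runtime grammar creation
    interface SpeakerVerification { }     // voice-based identity checks
    interface TelephonyControl { }        // optional telephony features

    // The primary interface returns each of the supporting interfaces.
    interface SpeechChannel {
        PromptPlayback getPromptPlayback();
        DynamicGrammarControl getDynamicGrammarControl();
        SpeakerVerification getSpeakerVerification();
        TelephonyControl getTelephonyControl();  // may be absent (null)
    }

    // As the text notes, a single class may implement all of the interfaces.
    static class AllInOneChannel implements SpeechChannel, PromptPlayback,
            DynamicGrammarControl, SpeakerVerification {
        public PromptPlayback getPromptPlayback() { return this; }
        public DynamicGrammarControl getDynamicGrammarControl() { return this; }
        public SpeakerVerification getSpeakerVerification() { return this; }
        public TelephonyControl getTelephonyControl() { return null; }  // non-telephony platform
    }

    public static boolean demo() {
        SpeechChannel sc = new AllInOneChannel();
        return sc.getDynamicGrammarControl() != null
                && sc.getTelephonyControl() == null;
    }
}
```

A SpeechObject needing dynamic grammars simply calls getDynamicGrammarControl on whatever SpeechChannel it was handed, never caring which layout the platform chose.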
3. Concepts
This section briefly describes some concepts underlying SpeechObjects that may provide a context for understanding the specific parameters and return values described in Section 4.
3.1 Universal commands
A universal command is an utterance that the speaker can say at any point during any dialog. The framework includes a grammar allowing recognition of a small set of universal commands, and provides default handling when these utterances are recognized. The universal commands currently defined by the SpeechObjects framework are:
- Help
- This universal allows the speaker to get context-sensitive help at any point in the dialog simply by saying "help." The default behavior is to play the help prompt defined by the currently executing SpeechObject and reexecute the dialog loop.
- Operator
- This universal redirects the caller to a live agent if your application is so enabled. The "0" touch-tone is also interpreted as a request to transfer to the operator.
- Repeat
- This allows the speaker to ask the dialog to replay the last set of prompts.
- Go to the top
- This returns the speaker to the beginning of the application dialog.
Through method calls the application can substitute its own handling for any of the supported universals, including disabling them.
As mentioned earlier, these universal commands are based on standards proposed by the Telephone Speech Standards Committee (TSSC).
3.2 Error handling and recovery
All SpeechObjects provide default handling, including prompts and logic (and grammar adjustments, if necessary), for all of the common recognizer error conditions: rejection, no speech timeout, too much speech, spoke too early, recognizer too slow, and unexpected key.
The default error handlers for each of these error types play an error prompt and then attempt to reexecute the dialog. The error prompt is generated by combining an application-wide error prompt that is specific to the type of error with a generic prompt provided by the current SpeechObject. For example, if a no-speech timeout occurs while a Yes/No SpeechObject dialog is executing, the framework concatenates the application-wide error prompt "I'm sorry, I didn't hear you" and the Object error prompt "Please say 'yes' or 'no'."
The default error handling mechanism continually reexecutes the dialog until a valid response is generated or the error threshold for the object is reached. When the threshold is reached, an exception is thrown and you can implement whatever error handling behavior you prefer, such as transferring the caller to a live agent. You can also override the default error handlers for any of the defined error types through class method calls.
All prompts can be overridden through configuration parameters.
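The default loop can be sketched as follows; the prompt text comes from the example above, while the method shape and the MAX_ERROR_COUNT constant are illustrative assumptions (the recognizer is reduced to a list of success/failure outcomes).

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorLoopSketch {
    static final int MAX_ERROR_COUNT = 3;  // assumed MaxErrorCount setting

    // Reexecute the dialog until recognition succeeds or the error threshold
    // is reached; each failure plays the concatenated error prompt.
    public static List<String> run(boolean[] recognitionSucceeds) {
        List<String> played = new ArrayList<>();
        int errors = 0;
        for (boolean ok : recognitionSucceeds) {
            if (ok) return played;  // valid response: done
            errors++;
            // application-wide error prompt + the object's generic prompt
            played.add("I'm sorry, I didn't hear you. Please say 'yes' or 'no'.");
            if (errors >= MAX_ERROR_COUNT)
                throw new RuntimeException("error threshold reached");
        }
        return played;
    }
}
```

The thrown exception is where an application would install its own fallback, such as transferring the caller to a live agent.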
3.3 Playables
All prompts used by SpeechObjects are encapsulated using classes that implement the core interface Playable, which defines the protocol for objects that can be played over an audio channel. The framework defines a set of classes that implement Playable that provide additional prompt behavior, including:
- Concatenated prompts let you easily assemble a prompt from multiple elements, for example, from multiple audio files. The resulting prompt object can be manipulated and added to the prompt queue just as if it were a single prompt, instead of having to consider each element individually.
- Escalating prompts let you define a sequence of prompts; each time you play the prompt, the next prompt in the sequence is played. This is especially useful for help or error prompts. If the prompt needs to be repeated because the user asks for help again or the dialog generates repeated errors, the prompt can provide additional information on later repetitions.
- Context-sensitive prompts encapsulate a set of prompts and select the correct prompt to play based on a given context.
- Random prompts let you add variety to a dialog by defining a set of prompts, one of which is selected at random each time the prompt is played. You can also assign relative probabilities to each prompt in the set.
- Text-to-speech prompts let you generate synthesized prompts from a text string via a third-party text-to-speech engine.
- Silence prompts provide an easy way to generate a prompt to be used to insert a pause into a dialog prompt sequence. You can use this prompt class to automatically generate a period of silence rather than having to record your own "silence" prompt. You define the length of the pause.
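As one illustration, an escalating prompt could be sketched as below. Playable is reduced here to a method returning text, and all names are assumptions rather than the framework's actual classes.

```java
public class EscalatingPromptSketch {
    interface Playable { String play(); }  // simplified: "playing" returns the text

    // Each play() yields the next prompt in the sequence, then sticks at the
    // last, most detailed prompt.
    public static class EscalatingPrompt implements Playable {
        private final String[] steps;
        private int next = 0;
        public EscalatingPrompt(String... steps) { this.steps = steps; }
        public String play() {
            String p = steps[next];
            if (next < steps.length - 1) next++;
            return p;
        }
    }

    public static EscalatingPrompt helpPrompt() {
        return new EscalatingPrompt(
                "Please say a date.",
                "You can say things like 'tomorrow' or 'December twelfth'.",
                "Say the month, the day, then the year.");
    }
}
```

Repeated help requests thus get progressively more detailed guidance without any extra logic in the SpeechObject itself.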
3.4 Grammars
SpeechObjects are designed to let you define grammars in a variety of ways, based on the requirements for each dialog and the need to customize the grammar at runtime. Most SpeechObjects use the Nuance dynamic grammar mechanism, meaning that the grammar for a given SpeechObject is compiled and loaded onto the recognition server when the SpeechObject is constructed. This allows SpeechObjects to be reused much more easily, as you don't have to compile the grammar for each SpeechObject into a recognition package before using it.
The SpeechObjects framework grammar classes let you build your grammars in a variety of ways:
- From a text file
- This format allows grammars to easily be updated as the application is developed and tuned. You don't need to recompile your application when you change the grammar, as the grammar is dynamically generated from the file at runtime.
- From a grammar you define programmatically
- This is harder to tune but is useful when a grammar's contents are determined by criteria only available at runtime. The Alpha Digit String SpeechObject, for example, generates its grammar at runtime based on how the object is configured, including the number of letters/digits in the string and where pauses or delimiter phrases such as "dash" might occur.
You can also initialize a grammar from a file and subsequently update it programmatically.
The framework also allows compound grammars. Compound grammars let you define a single grammar object comprising multiple grammars to be used in parallel. For example, in a corporate dialing application you might use a compound grammar containing a set of employee names and a set of employee extensions, to allow the speaker to dial either by name or number. The framework uses compound grammars to combine each SpeechObject's grammar with the grammar defining the set of universal commands.
3.5 Results
The Result class is a subclass of a utility class KVSet, which defines an object used to encapsulate a set of key/value pairs. This structure is analogous to natural language slots and the values they are filled in with during recognition. Because the value stored in a KVSet can be any type of object, SpeechObjects have the flexibility to populate Result objects with any set of values that are appropriate. For example:
- SOYesNo.Result contains a single String value, "yes" or "no"
- SODate.Result contains a Java Calendar object
- MyFlightInfo.Result contains a set of values, including String values providing the codes of the origin and destination airports, an integer value representing the flight time, a Calendar object representing the flight date, and an integer representing the flight number.
The value at any given key can also be another KVSet, providing the ability to nest result structures if appropriate.
Each Result class defined by the SpeechObjects includes convenience methods allowing easy access to the specific information it encapsulates.
Result subclasses share another characteristic: they can be played over the current audio output device. They implement the Playable interface, which allows objects to be appended to the prompt queue and then played by a SpeechChannel or other object that supports audio playback.
This lets you easily play the recognized information, for example, for confirmation dialogs or during testing.
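A minimal sketch of this structure, using the MyFlightInfo example above; KVSet is reduced to a map wrapper, and the accessor and key names are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

public class KVSetSketch {
    // KVSet: a set of key/value pairs; values may themselves be KVSets.
    public static class KVSet {
        private final Map<String, Object> kv = new HashMap<>();
        public void set(String key, Object value) { kv.put(key, value); }
        public Object get(String key) { return kv.get(key); }
    }

    // A Result-style subclass with convenience accessors over the raw keys.
    public static class FlightInfoResult extends KVSet {
        public String getOriginAirport() { return (String) get("origin"); }
        public int getFlightNumber() { return (Integer) get("flightNumber"); }
        public KVSet getReturnLeg() { return (KVSet) get("returnLeg"); }  // nested KVSet
    }

    public static FlightInfoResult demo() {
        FlightInfoResult r = new FlightInfoResult();
        r.set("origin", "SFO");
        r.set("flightNumber", 212);
        KVSet leg = new KVSet();
        leg.set("origin", "BOS");
        r.set("returnLeg", leg);  // nesting one result structure inside another
        return r;
    }
}
```

The convenience accessors hide the raw keys, so callers of a SpeechObject never need to know how its Result is keyed internally.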
3.6 Redo Objects
Many SpeechObjects are used in conjunction with ConfirmAndCorrect, which confirms all of the information obtained by those SpeechObjects and, upon a negative confirmation, identifies which information needs to be corrected (e.g., "Would you like to change the date, the time, or the telephone number?"). The SpeechObjects corresponding to the piece ("the date") or pieces ("the date and the time") of information that need correction are then re-invoked (e.g., re-invoke the Date object), prompting the user for the information again.
To promote better dialog, rather than simply re-invoking the same SpeechObject again during this error-correcting phase, a SpeechObject may offer a "RedoObject" which should be used to re-obtain the desired information. This RedoObject may simply ask for the information in a different manner, by changing prompts as appropriate ("Please say the date again."). Alternatively, the "RedoObject" may actually employ a different dialog strategy, perhaps breaking up the task into a set of smaller tasks in order to facilitate recognition of complex items. RedoObjects typically share the same SOKey ('object instance name') as their original SpeechObject in order to share n-best information from the original SpeechObject. SpeechObjects that do not employ a RedoObject may return "null" to indicate that the same instance should be used during this error-correction phase of a confirmation dialog.
3.7 Identifiable interface
Many of the SpeechObjects implement the Identifiable interface, which enables them to be used in the Confirm and Correct SpeechObject. The Identification phase of the Confirm and Correct process makes use of
- IdentifyPrompt - a Playable representing the Object in a spoken list, and
- IdentifyExpression - a grammar expression covering the ways in which the user might refer to the Object.
Here's a sample of how the prompt might be used by the system (with the prompt highlighted):
Which would you like to change - the departure city or the arrival city?
The corresponding grammar expression for a derived ArrivalCity SpeechObject might then accept phrases like "the arrival city", "the destination", or "destination city".
3.8 Inherited parameters and/or return values
Many SpeechObjects inherit parameters, return values, and behavior from other SpeechObjects. These relationships are helpful in understanding what parameters might possibly be common (in syntax and behavior) across a large number of Objects. A simplified inheritance diagram for all of the SpeechObjects in this document is shown below.
4. SpeechObject specifications
Although SpeechObjects as implemented in Java have method calls for setting and getting various values, the specification below is restricted to listing only JavaBean properties of the SpeechObjects, i.e. properties for which there are both "get" and "set" methods. While this restriction limits configuration to discrete parameters which may be changed but not added to (1), it also results in a cleaner interface for the users of the Objects - these properties may be edited in a GUI, set and retrieved in a scripting environment, etc.
Parameter type and return type descriptions can be found in the appendix.
Parameters and return values common to all SpeechObjects
Configuration parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| RedoObject | SpeechObject | New object to call in case the caller negatively confirms the result from the original object in a confirmation scenario |
| SOKey | String | Name for this instance's family (i.e., the object itself plus any redo objects for this object) |
Return values:
| Return value | Type | Description |
| --- | --- | --- |
| getNextResult | | Next Result in n-best list, or null if no more |
| requiredAdditionalInteraction | boolean | Boolean indicating whether or not additional interaction between the SO and the caller was required in obtaining this result. Typically, this means that the SO has already done any needed disambiguation |
| isAutoConfirmed | boolean | Boolean indicating whether or not this Result has already been confirmed |
Dialog
Description:
This SpeechObject does not implement a specific dialog -- it simply provides the framework for a dialog. The default behavior is:
- Append the SpeechObject's initial prompt to the current prompt buffer and play the buffer
- Wait for speech and send it to the recognizer for recognition, using the top-level grammar currently set by the SpeechObject
- If recognition was successful, pass the result to the SpeechObject's result processing methods and return the final result
- If recognition was not successful, perform the necessary error handling and attempt the dialog again
Configuration parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| Filter | | Used to examine n-best SpeechObject.Results and filter out invalid results |
| Grammar | | The grammar used for recognition |
| HelpPrompt | | This prompt is played if the user requests help |
| InitialPrompt | | Unless an error occurs or the user requests help, this is the prompt that is played before recognition |
| MaxErrorCount | int | The maximum number of errors (rejections, timeouts, or unexpected DTMF keypresses) permitted before the SpeechObject gives up |
| MaxHelpCount | int | The maximum number of help requests permitted before the SpeechObject gives up |
| NoResultFoundPrompt | | This prompt is played after a valid recognition but when none of the candidates in the n-best list are successfully processed into a SpeechObject.Result (e.g., if the entries fail to pass this Object's Filter) |
| NoSpeechTimeoutPrompt | | This prompt is played when a recognition error code of "no speech timeout" is returned by the recognizer |
| RecognitionErrorPrompt | | This prompt is played by default when a recognition error occurs unless a more specific error prompt is defined |
| RecognizerTooSlowTimeoutPrompt | | This prompt is played when a recognition error code of "recognizer too slow timeout" is returned by the recognizer |
| RejectedPrompt | | This prompt is played when a recognition error code of "rejected" is returned by the recognizer |
| ReturnAllPossibleResults | boolean | If true, this Object returns an entire n-best list of SpeechObject.Results. Otherwise, it will return only the first valid result it interprets and processes |
| SpeechTooEarlyPrompt | | This prompt is played when a recognition error code of "speech too early" is returned by the recognizer |
| TooMuchSpeechTimeoutPrompt | | This prompt is played when a recognition error code of "too much speech timeout" is returned by the recognizer |
| UnexpectedKeyPrompt | | This prompt is played when a recognition error code of "unexpected_key" is returned by the recognizer |
Return results:
| Return value | Type | Description |
| --- | --- | --- |
| toString | String | A String representation of this Object's recognized result |
Yes/No
Description:
This SpeechObject expects an answer to a yes-or-no question.
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| StrictGrammar | boolean | If true, loads and uses a limited (strict) grammar to maximize performance |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| YesNo | String | String indicating yes or no |
| saidYes | boolean | True if the user said yes, false otherwise |
| saidNo | boolean | True if the user said no, false otherwise |
Quantity
Description:
This SpeechObject recognizes quantities of items. By default, this SpeechObject recognizes 1-4 digit (0-9,999) quantities and has an absolute range of 1-8 digits (0-99,999,999). A developer can (and should) configure this SpeechObject to recognize quantities only within a certain range by setting the minDigits and maxDigits properties, as appropriate for a specific domain and application. The Quantity SpeechObject does not itself perform any confirmation or validity checking. The range of numbers that the speaker is allowed to say is limited by limiting the grammar used for recognition to that range. If the speaker says a number that is out of the current range, the utterance is rejected by the recognizer.
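The relationship between the digit-count properties and the numeric range can be sketched as follows (the helper names are illustrative; the real object constrains the recognition grammar rather than checking numbers after the fact).

```java
public class QuantityRangeSketch {
    // Largest quantity expressible in maxDigits digits: 10^n - 1.
    public static long maxFor(int maxDigits) {
        long v = 1;
        for (int i = 0; i < maxDigits; i++) v *= 10;
        return v - 1;
    }

    // Smallest minDigits-digit quantity: 10^(n-1); one digit allows 0.
    public static long minFor(int minDigits) {
        if (minDigits <= 1) return 0;
        long v = 1;
        for (int i = 0; i < minDigits - 1; i++) v *= 10;
        return v;
    }
}
```

With the defaults above, maxFor(4) gives 9,999 and the absolute limit maxFor(8) gives 99,999,999.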
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| MaxDigits | int | Maximum allowed number of digits for the quantity that will be recognized (e.g., 4 => '9999') |
| MinDigits | int | Minimum allowed number of digits for the quantity that will be recognized (e.g., 2 => '10') |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| Quantity | int | The quantity recognized |
Simple Digit String
Description:
This SpeechObject can be configured to recognize a string of digits of a fixed length. When NumberDigits is set, the SpeechObject automatically creates a grammar for recognizing that number of digits (without natural numbers). The Simple Digit String Speech Object does not itself perform any confirmation or validity checking. If there are specific constraints on what constitutes a valid number string for the controlling application, using the result filter mechanism to filter out inconsistent hypotheses is highly recommended.
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| NumberDigits | int | Number of digits to be recognized |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| DigitString | String | The recognized digit string |
?
Date
Description:
This SpeechObject prompts for and interprets a date. The date may be specified in one of many formats, including a day of the week, a day of the month, a relative date (today, tomorrow, yesterday, next Tuesday), and so forth. More complex expressions that specify the date in multiple ways are also allowed (tomorrow, December 12th); the consistency of such dates is checked, and if the date is inconsistent, the user is reprompted for the date with an appropriate error message. Invalid dates, such as April 31, are similarly disallowed, causing reprompting for a date.
The SpeechObject makes an effort to interpret the date intelligently:
If the day of week is given, such as Thursday, the SpeechObject interprets the date as if it were the upcoming Thursday. For example, if today is Monday, February 15, 1999, and the caller said "Thursday", the SpeechObject would interpret this as Thursday, February 18, 1999.
If the day of month is given, such as the 27th, and this day is later than the current day (for example, February 15), this SpeechObject assumes the date is in the same month as the current date. For example, the 27th would be interpreted as Saturday, February 27, 1999.
If the day of month is given, such as the 5th, and this day is a number less than the current date (for example, February 15), the SpeechObject assumes the day is for the next month. In this example, the 5th would be interpreted as Friday, March 5, 1999. If the next month is January, then the SpeechObject assumes the date is in the following year as well.
If the month is before the current month, the Date SpeechObject assumes the caller intends this date in the following year. For example, if the caller said January 3, this would be interpreted as January 3, 2000. If the caller says "today", the SpeechObject determines the current date unless specified by the developer.
When the caller says only a month, the SpeechObject will follow up by prompting the caller to specify the day of the month. This is implemented by invoking a default SODayOfMonth SpeechObject (the DayOfMonthSO property), which may be overridden.
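The inference rules above can be sketched with java.time (the class and method names here are illustrative, not the SpeechObjects API, which works with java.util.Calendar values):

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

public class DateInference {
    // Rule 1: a bare day of week means its next occurrence on or after today,
    // e.g. "Thursday" on Monday, February 15, 1999 -> February 18, 1999.
    static LocalDate fromDayOfWeek(LocalDate today, DayOfWeek dow) {
        return today.with(TemporalAdjusters.nextOrSame(dow));
    }

    // Rules 2-3: a bare day of month stays in the current month if it has not
    // yet passed; otherwise it rolls into the following month (and, for
    // December -> January, into the following year).
    static LocalDate fromDayOfMonth(LocalDate today, int day) {
        if (day >= today.getDayOfMonth()) {
            return today.withDayOfMonth(day);
        }
        return today.plusMonths(1).withDayOfMonth(day);
    }

    // Rule 4: a month earlier than the current month means next year,
    // e.g. "January 3" said in February 1999 -> January 3, 2000.
    static LocalDate fromMonthDay(LocalDate today, int month, int day) {
        int year = (month < today.getMonthValue()) ? today.getYear() + 1 : today.getYear();
        return LocalDate.of(year, month, day);
    }
}
```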
The SpeechObject performs the following validation checking of the recognized date:
When inconsistent information is provided by the caller, such as a conflicting day of month and day of week (for example, Tuesday, February 15, 1999), the Date SpeechObject plays a prompt that identifies the correct information (February 15th is a Monday) and then reprompts the caller.
Likewise, if the Modifier such as "today" is inconsistent with the day of month, the SpeechObject will play a prompt specifying what 'today's' date is and reprompt the caller.
Invalid date handling:
When the caller responds with an invalid date such as "April 31", the SpeechObject plays a prompt that explains why this date is invalid ("... there are only 30 days in April") and then reprompts the caller.
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| DateTooEarlyPrompt | | Prompt played if the stated date is before the lower DateLimit, e.g. "I'm sorry, I thought you said 'February 12th, 1985' but that day is too far in the past" |
| DateTooLatePrompt | | Prompt played if the stated date is after the upper DateLimit, e.g. "I'm sorry, I thought you said 'February 12th, 2085' but that day is too far in the future" |
| DayOfMonthSO | | SODayOfMonth instance used to obtain a day of month when just the month or just the month and year are specified |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| InconsistentDayOfWeekPrompt | | Prompt played if the stated date includes a day of week that doesn't match, e.g. "I'm sorry, I thought you said 'Tuesday, December 10th', but December 10th is a Friday." |
| InconsistentModifiedDayOfWeekPrompt | | Prompt played if the stated date includes a modified day of week that doesn't match, e.g. "I'm sorry, I thought you said 'next Tuesday, December 10th', but next Tuesday is December 14th." |
| InconsistentNamedTodayEtcPrompt | | Prompt played if the stated date includes a today expression as well as a day of week that actually refers to another date, e.g. "I'm sorry, I thought you said 'today, Tuesday, December 10th', but today is December 14th." |
| InconsistentTodayEtcPrompt | | Prompt played if the stated date includes a today expression that refers to another date, e.g. "I'm sorry, I thought you said 'today, December tenth', but today is December fourteenth." |
| InvalidDatePrompt | | Prompt played if the stated date is invalid (the day of month exceeds the number of days in the month) |
| LowerDate | java.util.Calendar or int or SODate.DateLimit | Earliest permissible date, represented by a Calendar object, an offset in days, or a DateLimit object |
| UpperDate | java.util.Calendar or int or SODate.DateLimit | Latest permissible date, represented by a Calendar object, an offset in days, or a DateLimit object |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| Calendar | java.util.Calendar | The date as a Calendar object |
| DayOfMonth | int | Day of month specified by the caller |
| DayOfWeek | int | Day of week specified by the caller |
| Month | int | Month represented as an integer between 1 and 12 |
| Year | int | Year represented as a four-digit integer |
Time
Description:
This SpeechObject defines a generic dialog for getting a time expression from the speaker.
The Time SpeechObject is generic and may be specialized (through modification of parameters, prompts, and/or grammars) for use in a range of applications, for example, flight information and reservation systems, personal agenda management, or package delivery/pickup scheduling.
In response to a prompt requesting the time, the caller speaks the time in a natural way (using natural expressions such as "in the morning" or "at night" as well as "am" or "pm"). The Time SpeechObject recognizes a clock time, for example, "three forty-five am". If the time is ambiguous (am/pm not specified), the SpeechObject conducts any additional dialog with the caller needed to ensure that an unambiguous time is obtained. This dialog is implemented by invoking an instance of the DisambiguateTime SpeechObject.
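A minimal sketch of the ambiguity check and a numeric time encoding (illustrative names only; the exact integer encoding of the ClockTime result is an assumption here, not stated by the specification):

```java
public class TimeSketch {
    // Illustrative only: encode hours/minutes/am-pm as a single integer
    // in hhmm form, e.g. 3:45 pm -> 1545, 12:00 am (midnight) -> 0.
    static int toClockTime(int hours, int minutes, String amPm) {
        int h = hours % 12;             // 12 am -> 0, 12 pm -> 12 below
        if ("pm".equals(amPm)) h += 12;
        return h * 100 + minutes;
    }

    // An utterance needs the DisambiguateTime dialog when no am/pm
    // modifier (or equivalent natural expression) was recognized.
    static boolean needsDisambiguation(int hours, String amPm) {
        return amPm == null && hours >= 1 && hours <= 12;
    }
}
```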
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| DisambigObject | | Sets the object that disambiguates ambiguous times, e.g. '10' => 10 am or 10 pm |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| InconsistentTimePrompt | | Prompt played when the user's response in the disambiguation dialog is inconsistent with the original time they said. For example, the user is asked to disambiguate whether 11 o'clock is in the morning or evening, and replies "in the afternoon". |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| AM_PM | String | Returns whether the time said by the caller was AM or PM |
| Calendar | java.util.Calendar | Returns the time in a Calendar representation |
| ClockTime | int | A numerical representation of the time |
| ClockTimePlayable | | The time as a Playable in the standard format (with trailing "am" or "pm") |
| Hours | int | The hour portion of the time said by the caller |
| Minutes | int | The minutes portion of the time said by the caller |
| SmartTimePlayable | | A time Playable in the "intelligent" (colloquial) format (for example, "7 in the evening", "5 in the morning", "noon") |
| UserStatedModifier | String | Any user-stated modifier that disambiguated the time |
Menu
Description:
The Menu SpeechObject does not itself define a default dialog. The dialog is generated dynamically based on the number of items defined. The dialog presents the list of menu items and allows the caller to choose one of them. It enables the developer to dynamically build menus from pairs of grammars and prompt atoms, and in addition it permits the developer to associate a listener with any of the items so that the listener's action is performed in response to selecting the item.
The menu may be defined dynamically by calling a method that adds menu items sometime before invocation. Each menu item is defined in terms of:
- an item name, which is the text string returned in the result if the item is selected
- an optional Playable for representing the item in the menu prompts if they are autogenerated
- an optional grammar expression to trigger selection of this item
Note that at this time the menu items cannot be set merely by setting JavaBean properties.
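A sketch of how such dynamically built menu items might be held and listed (all class, field, and method names here are hypothetical, chosen to mirror the item definition above; they are not the SpeechObjects API):

```java
import java.util.ArrayList;
import java.util.List;

public class MenuSketch {
    // Hypothetical menu item: the name returned on selection, the prompt
    // used in auto-generated listings, and a grammar expression that
    // triggers selection of this item.
    static class Item {
        final String name, prompt, grammarExpr;
        Item(String name, String prompt, String grammarExpr) {
            this.name = name; this.prompt = prompt; this.grammarExpr = grammarExpr;
        }
    }

    private final List<Item> items = new ArrayList<>();

    void addItem(String name, String prompt, String grammarExpr) {
        items.add(new Item(name, prompt, grammarExpr));
    }

    // An auto-generated item listing of the kind ItemListPrompt replaces.
    String itemListPrompt() {
        StringBuilder sb = new StringBuilder("Please choose one of:");
        for (Item i : items) sb.append(' ').append(i.prompt).append(',');
        sb.setLength(sb.length() - 1); // drop trailing comma
        return sb.toString();
    }
}
```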
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| ErrorPromptPostfix | | Audio to use as the postfix of the error prompt (if the prompt is being auto-generated) |
| ErrorPromptPrefix | | Audio to use as the prefix of the error prompt (if the prompt is being auto-generated) |
| HelpPromptPostfix | | Audio to use as the postfix of the help prompt (if the prompt is being auto-generated) |
| HelpPromptPrefix | | Audio to use as the prefix of the help prompt (if the prompt is being auto-generated) |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| InitialPromptPostfix | | Audio to use as the postfix of the initial prompt (if the prompt is being auto-generated) |
| InitialPromptPrefix | | Audio to use as the prefix of the initial prompt (if the prompt is being auto-generated) |
| ItemListPrompt | | Prompt that is an explicit listing of all the menu items |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| ItemName | String | The name of the selected item |
| RecResult | | The recognition result of the interaction |
US Currency
Description:
This SpeechObject prompts for and recognizes a dollar-and-cent amount in one utterance. If necessary, disambiguation is performed for utterances like "seven fifty". This disambiguation is performed by invoking a default instance of the DisambiguateCurrency SpeechObject (which may, of course, be overridden). This SpeechObject provides a DTMF backoff strategy if the caller encounters recognition problems.
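The ambiguity being resolved can be sketched as follows (illustrative names only; the real disambiguation is a spoken dialog, not a method call):

```java
public class CurrencySketch {
    // Illustrative expansion of an ambiguous "N fifty"-style utterance into
    // its two candidate amounts: dollars-and-cents vs. whole dollars,
    // e.g. "seven fifty" -> $7.50 or $750.
    static double[] interpretations(int first, int second) {
        return new double[] { first + second / 100.0, first * 100.0 + second };
    }
}
```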
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| DisambigObject | | Object that disambiguates ambiguous currencies, e.g. 'ten fifty' => $10.50 or $1050 |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| Range | | Sets the allowed value range (also propagated to the disambiguation object) |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| Amount | float | Floating point number indicating the recognized amount of dollars and cents |
| Cents | int | Integer indicating the recognized amount of cents |
| Dollars | int | Integer indicating the recognized dollar amount, not including cents |
North American Telephone Number
Description:
This SpeechObject prompts for and obtains a telephone number from the user, in the standard 10-digit format used in Canada, Mexico, and the USA.
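The relationship between the full number and the sectioned return values can be sketched as (illustrative helper, not the SpeechObjects API):

```java
public class PhoneSketch {
    // Illustrative split of the 10-digit PhoneNumber result into the
    // AreaCode, Exchange, and Subscriber fields described below.
    static String[] split(String tenDigits) {
        return new String[] {
            tenDigits.substring(0, 3),  // AreaCode: first 3 digits
            tenDigits.substring(3, 6),  // Exchange: next 3 digits
            tenDigits.substring(6)      // Subscriber: last 4 digits
        };
    }
}
```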
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| UseNatural | boolean | If true, allows natural numbers within each section, e.g. 'six five oh, eight four seven, eleven fifty five' |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| AreaCode | String | The first 3 digits of the 10-digit recognized phone number |
| Exchange | String | The second set of 3 digits of the 10-digit recognized phone number |
| Subscriber | String | The last 4 digits of the 10-digit recognized phone number |
| PhoneNumber | String | Entire phone number as a string |
Alpha Digit String
Description:
This SpeechObject can be configured to prompt for and recognize alphanumeric digit strings, which may consist of sections, such as an account number, credit card number, Social Security number, and the like. Natural numbers are optionally allowed within each section.
As with the Sectioned Digit String SpeechObject, each format of the sectioning is specified as a '-' delimited string. Each format should be of the form:
DDD-DD-DDDD or DDD-DD-AADD
and so forth. The first format specifies that the digit-string grammar should recognize a section of three digits, two digits, and four digits (e.g., a Social Security number). The letter D stands for a digit (0-9), and the letter A stands for any letter (A-Z).
One can also use a user-defined group to recognize a subset of the alphabet, optionally allowing digits in certain positions as well. For example, one could define "V" to correspond to "AEIOU", the vowels. This is useful when only certain letters are allowed in a position within the digit string; the automatically generated grammar can reflect this constraint directly.
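The group mechanism can be sketched by translating a format section into a regular-expression character class per position (illustrative class only; the real SpeechObject generates a recognition grammar, not a regex):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class AlphaFormatSketch {
    // Built-in groups: D = digit, A = any letter; callers may register
    // additional single-letter groups such as V = the vowels.
    private final Map<Character, String> groups = new HashMap<>();

    AlphaFormatSketch() {
        groups.put('D', "0-9");
        groups.put('A', "A-Z");
    }

    void defineGroup(char name, String letters) {
        groups.put(name, letters);
    }

    // Translate e.g. "AADD" into the pattern [A-Z][A-Z][0-9][0-9].
    Pattern toPattern(String section) {
        StringBuilder re = new StringBuilder();
        for (char c : section.toCharArray()) {
            re.append('[').append(groups.get(c)).append(']');
        }
        return Pattern.compile(re.toString());
    }
}
```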
Configuration parameters:
All configuration parameters of the Dialog and Sectioned Digit String SpeechObjects, plus
| Parameter | Type | Description |
| --- | --- | --- |
| Group | Group[] | Defines the groups |
| Group | (int, Group) | Defines a specific group |
Return results:
All return results of the Dialog and Sectioned Digit String SpeechObjects.
Confirm and Correct
Description:
The Confirm and Correct SpeechObject can be used to have the caller confirm one or more pieces of information together, and correct (by invoking suitable SpeechObjects) any pieces of information that are incorrect. The items that can be so confirmed, identified, and/or corrected must be SpeechObjects implementing the Identifiable interface. This is done in the three phases of Confirmation, Identification, and Correction:
- The first phase is the Confirmation phase. During the Confirmation phase, the inner Confirmation object is invoked to play the confirmation prompt ("...is this correct?"), so that the caller can indicate whether all the information is correct. If the caller answers in the affirmative, this SpeechObject is finished.
- If the caller indicates that information is not completely correct, Confirm and Correct moves on to the Identification phase, invoking the inner Identify object. During this phase, the caller identifies which piece(s) of information need to be corrected. The caller can respond by indicating up to two items that are incorrect -- or the caller can specify that all of the information is wrong; the caller can also answer that none of the items is incorrect, in which case execution returns to the Confirmation phase, and starts over.
If the caller identifies at least one incorrect piece of information during the Identification phase, execution moves on to the Correction phase. During the Correction phase, Confirm and Correct obtains the re-do object for each SpeechObject whose Result is wrong. After getting and invoking each re-do object, execution returns to the Confirmation phase, to confirm all of the contained SpeechObject.Results.
At the end of a successful invocation (i.e., after all results have been confirmed), the Confirm and Correct SpeechObject returns a Result that contains all the results of the contained SpeechObjects, with each contained SpeechObject Result stored under the contained SpeechObject's SO key. For example, if the contained SpeechObjects are SODate and SOTime, the Result instance returned by Confirm and Correct will contain an SODate.Result stored under SODate's SO key, and an SOTime.Result stored under SOTime's SOKey.
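The three-phase loop above can be sketched as a small driver (all interfaces and names below are hypothetical stand-ins for the Confirmation and Identify objects and the contained SpeechObjects; they are not the SpeechObjects API):

```java
import java.util.List;

public class ConfirmAndCorrectSketch {
    // Hypothetical stand-in for a contained SpeechObject: it exposes its
    // current result and can be re-invoked (its "re-do object") to fix it.
    interface Slot {
        String value();
        void redo();
    }

    // Hypothetical caller interactions for the first two phases.
    interface Caller {
        boolean confirm(List<Slot> slots);          // "...is this correct?"
        List<Slot> identifyWrong(List<Slot> slots); // "Which would you like to change?"
    }

    // Cycle Confirmation -> Identification -> Correction until the caller
    // confirms everything or the retry budget (MaxRetryCount) is exhausted.
    static void run(List<Slot> slots, Caller caller, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (caller.confirm(slots)) return;              // Confirmation phase
            List<Slot> wrong = caller.identifyWrong(slots); // Identification phase
            for (Slot s : wrong) s.redo();                  // Correction phase
        }
    }
}
```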
Configuration parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| Confirmation | | The object that performs confirmation |
| GetInitialResultsIfNeeded | boolean | If true, will initially invoke all contained SpeechObjects that have not yet obtained results |
| Identify | | The object that identifies which information needs to be corrected |
| MaxRetryCount | int | Maximum number of retries attempted by Confirm and Correct |
| SpeechObject | SpeechObject[] | SpeechObjects to be contained (confirmed/corrected) |
| SpeechObject | (int, SpeechObject) | Adds/sets a SpeechObject for confirmation/correction |
Return results:
| Return value | Type | Description |
| --- | --- | --- |
| SOKeysEnum | Enumeration | Enumeration of contained SpeechObjects' Result keys |
Browsable Selection
Description:
The Browsable Selection SpeechObject acts similarly to the Browsable List SpeechObject except that it also supports a "select" command the caller can use to select the current item being browsed.
Configuration parameters:
All configuration parameters of the Browsable List SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| SelectionExpression | | Application-specific grammar rule to specify that the current item is to be selected |
Return results:
All return results of the Browsable List SpeechObject.
Browsable Action
Description:
The Browsable Action List SpeechObject acts similarly to the Browsable List SpeechObject except that the user can add any custom command and an associated handler into the list. When a custom command is spoken, the corresponding handler is fired to handle it.
Note that at this time there is no way to specify these custom commands and handlers using JavaBean properties.
Configuration parameters:
All configuration parameters of the Browsable List SpeechObject.
Return results:
All return results of the Browsable List SpeechObject.
US Zip Code
Description:
This SpeechObject will collect either a 5- or 9-digit US ZIP code. The filter used to validate 5-digit codes is based on a list of currently existing codes issued by the U.S. Postal Service. The 4-digit extension, if spoken, is not validated. It is possible to disable the filter if you want to accept any 5-digit code.
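The filter behavior can be sketched as follows (class and method names are illustrative, and `issued` stands in for the U.S. Postal Service code list, which is bundled with the real SpeechObject):

```java
import java.util.Set;

public class ZipFilterSketch {
    // Illustrative validity check: the 5-digit code must appear in the
    // issued-code list unless the filter is disabled; a 4-digit extension,
    // if present, is not validated beyond its shape.
    static boolean accept(String zip, String ext, Set<String> issued, boolean filterDisabled) {
        if (!zip.matches("\\d{5}")) return false;
        if (ext != null && !ext.matches("\\d{4}")) return false;
        return filterDisabled || issued.contains(zip);
    }
}
```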
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| FilterDisabled | boolean | True prevents the recognized result from being validated |
| FiveDigitsOnly | boolean | True restricts recognition to 5 digits |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| Extension | String | Either a string representation of the last 4 digits (if a 9-digit zip code was recognized) or null (if a 5-digit zip code) |
| ZipCode | String | String representation of the 5-digit zip code (or first 5 digits, if a 9-digit zip code was recognized) |
Credit Card Info
Description:
This SpeechObject encapsulates the functionality of acquiring the credit card type, credit card number, and credit card expiration date.
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| AcceptExpiredCard | boolean | If true, credit cards that expired before today are accepted |
| AllTypesEnabled | boolean | If true, all eight built-in credit card types are acceptable |
| CardTypeEnabled | (int, boolean) | Sets whether a specific card type is accepted or not |
| CreditCardExpirationDateSpeechObject | SpeechObject | Internal expiration date speech object |
| CreditCardInfoCANDCSpeechObject | SpeechObject | Internal confirm and correct speech object |
| CreditCardNumberSpeechObject | SpeechObject | Internal card number speech object |
| CreditCardTypeSpeechObject | SpeechObject | Internal card type speech object |
| InitialState | String | Initial state for the call-flow |
| PreamblePrompt | | Prompt that is played at the beginning of the dialog |
| TypeQueryExplicit | boolean | If true, the credit card type is queried explicitly |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| CreditCardExpirationMonth | int | Credit card expiration month |
| CreditCardExpirationYear | int | Credit card expiration year |
| CreditCardNumber | String | Credit card number as digit string |
| CreditCardType | String | Credit card type as string |
| ResultStatus | int | Result status |
| isResultOk | boolean | Whether the result is Ok or not |
Sectioned Digit String
Description:
This SpeechObject can be configured to recognize a string of digits broken up into various sections. Alternate sectionings can be provided, and the use of natural numbers in the grammar can be enabled. The maximum length of a section is six digits.
Each format of the sectioning is specified as a '-' delimited string; the number of digits in each section of a given format is specified by a sequence of 'D' characters. For example,
"DDD-DDD-DDDD"
specifies a sectioning of three digits, three digits, and four digits.
"DDD-DD-DDDDD"
specifies a sectioning of three digits, two digits, and five digits.
Developers can also set the delimiter that may be spoken by callers when reading the sectioned digitstring. By default, this is a "dash", but this can be changed to any single word or valid GSL expression, such as "dot" or "[dash dot]" or null, etc.
Natural numbers can also be enabled through a simple property setting.
This SpeechObject does not itself perform any confirmation or validity checking. If there are specific constraints on what constitutes a valid digit string for your application, using the result filter mechanism to filter out inconsistent hypotheses is highly recommended.
The configuration of the digit string -- that is, the number of sections and the length of each section -- determines the construction of the grammar used for recognition. If the speaker says a digit string that does not match one of the defined patterns, the recognizer rejects the utterance.
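The grammar construction described above can be approximated with a regular expression per format (an illustrative sketch; the real SpeechObject builds a recognition grammar, and the "-" here stands in for the configurable spoken delimiter):

```java
import java.util.regex.Pattern;

public class SectionedFormatSketch {
    // Turn a format like "DDD-DD-DDDD" into a pattern that accepts the
    // digits with or without the delimiter between sections.
    static Pattern toPattern(String format) {
        StringBuilder re = new StringBuilder();
        for (String section : format.split("-")) {
            if (re.length() > 0) re.append("-?"); // delimiter is optional
            re.append("\\d{").append(section.length()).append("}");
        }
        return Pattern.compile(re.toString());
    }
}
```

For "DDD-DD-DDDD", both "123-45-6789" and "123456789" match, while a mis-sectioned string such as "12-345-6789" is rejected.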
Configuration parameters:
All configuration parameters of the Dialog SpeechObject, plus
| Parameter | Type | Description |
| --- | --- | --- |
| DelimiterExpression | String | The (optional) delimiter expression used in the grammar between sections of the digit string (dash, dot, etc.) |
| DelimiterPrompt | | Audio played between sections of the recognized digit string (e.g. 'dash.wav') |
| Format | String[] | Defines formats of all sections of the string |
| Format | (int, String) | Defines the format for the given section of the string (e.g. 'DDD-DD-DDDD') |
| Format | | Defines formats of all sections of the string |
| Format | (int, WeightedFormat) | Defines the format for the given section of the string |
| IdentifyExpression | | Identification expression for the Identifiable interface |
| IdentifyPrompt | | Identification prompt for the Identifiable interface |
| UseNatural | boolean | If true, allows natural numbers within each numeric section, e.g. 'three six two, fifty seven, eleven hundred' |
Return results:
All return results of the Dialog SpeechObject, plus
| Return value | Type | Description |
| --- | --- | --- |
| DigitString | String | The recognized string without any section delimiters |
| Section | String[] | The sections of the recognized string |
| SectionedDigitString | String | The recognized string with '-' between sections, e.g. "52-764" |
Browsable List
Description:
The Browsable List SpeechObject allows the caller to hear items in a list in sequence, and navigate through the list. The object provides methods through which the developer may dynamically define the list. The list of items to be browsed is encapsulated within a Browsable object.
When invoked, the list plays prompts associated with items, one after another. Depending on configuration, the list advances automatically or through a "next" command to the next item. Based on configuration, the list may terminate automatically at the end, or as the result of an "exit list" command.
The general dialog flow is as follows:
The dialog begins with a preamble, if enabled (a recognition state that accepts the relevant navigational commands), which automatically advances to the first item. Some commands are invalid in the preamble (for example, "previous" or application-specific commands like "delete"); invalid commands are handled as errors. For each item, the item prompt is played, with optional pre-pended and post-pended prompts, and navigation and application-specific commands are active. If enabled, a timeout automatically advances the list to the next item, and a timeout after the last item automatically exits the list.
The default navigation commands are: next, previous, last, first, and exit.
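The navigation behavior can be sketched as a pure index transition (illustrative names only; `returnAtEnd` mirrors the ReturnAtEnd property, and -1 stands for leaving the list):

```java
public class BrowseNavSketch {
    // Apply one default navigation command to the current position over a
    // list of `size` items; returns the new index, or -1 to exit the list.
    static int navigate(int index, int size, String command, boolean returnAtEnd) {
        switch (command) {
            case "next":
                if (index + 1 < size) return index + 1;
                return returnAtEnd ? -1 : index; // ReturnAtEnd exits past the last item
            case "previous": return Math.max(index - 1, 0);
            case "first":    return 0;
            case "last":     return size - 1;
            case "exit":     return -1;
            default:         return index;       // unrecognized commands are errors
        }
    }
}
```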
Configuration parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| AutoAdvance | boolean | If set to true, forces the list to advance automatically to the next item if there is no response from the user |
| Browsable | | Object providing access to items to be browsed |
| ExitPrompt | | Prompt played when the list exits |
| FirstPrePrompt | | The pre-prompt played before the first item |
| Grammar | | The grammar used for both preamble and list item recognitions |
| LastPrePrompt | | The pre-prompt played before the last item |
| ListNoSpeechTimeoutPrompt | | The prompt played if there is a "no-speech" timeout when the user says a command after the list item |
| ListRecognizerTooSlowTimeoutPrompt | | The prompt played if there is a "recognizer-too-slow" timeout when the user says a command after the list item |
| ListRejectedPrompt | | The prompt played if there is a rejection when the user says a command after the list item |
| ListSpeechTooEarlyPrompt | | The prompt played if there is a "speech-too-early" condition when the user says a command after the list item |
| ListTooMuchSpeechTimeoutPrompt | | The prompt played if there is a "too-much-speech" timeout when the user says a command after the list item |
| ListUnexpectedKeyPrompt | | The prompt played if the user presses a DTMF key instead of speaking a command after the list item |
| MultiItemListErrorPrompt | | The recognition error prompt when the list has more than one item |
| MultiItemListHelpPrompt | | The help prompt when the list has more than one item |
| NextPrePrompt | | The pre-prompt played when the user says "next", or when the list auto-advances to the next item |
| OnlyItemListErrorPrompt | | The recognition error prompt when the list has only one item |
| OnlyItemListHelpPrompt | | The help prompt when the list has only one item |
| OnlyItemPrePrompt | | The pre-prompt played when there is only one item |
| PreambleHelpPrompt | | The prompt played if the user asks for help during the preamble |
| PreamblePrompt | | The prompt that is played only once when the user first enters the list |
| PreambleRecognitionErrorPrompt | | The prompt played if there is a recognition error in the preamble |
| PreviousPrePrompt | | The pre-prompt played when the user says "previous" |
| ReturnAtEnd | boolean | If true, list will automatically exit when it reaches the end |
Return results:
| Return value | Type | Description |
| --- | --- | --- |
| exitedFromList | boolean | True if the user exited during the list portion |
| exitedFromPreamble | boolean | True if the user exited during the preamble |
| Index | int | The index of the item on which the list exited |
| toString | String | The index of the item on which the list exited, in string form. If the list exited in the preamble, returns the string "PREAMBLE" |
5. Appendix
This appendix describes the types of the configuration parameters and return values.
- Browsable
- One class implementing this interface is BrowsableVector, which includes methods to add to, set, and check items in the list of browsable items. An "item" consists of a Playable, which will be played as the name of the item when the list is browsed, and optional arbitrary user data (to be returned if the item is selected).
- Confirmation
- This class is used by Confirm and Correct during the Confirmation phase to confirm the values of the Speech Objects being managed by Confirm and Correct. It will ask a question like "I have you flying from Boston to New York on Friday, May 5th. Is that correct?" It is rare that a developer will need to override the default instance for this class.
- Expression
An Expression is basically an encoded representation of the right-hand side of an arbitrarily complex grammar production rule, including semantic tags and probabilities.
- Grammar
- Extensions of this class include both static and dynamic grammars, specified in code or in files, and combinations thereof. For all grammars it is possible to set the top-level Rulename (a String) that should be used for recognition. These grammars can include probabilities and semantic tags.
- Group
- This class allows you to define a group: a single-letter name that represents a collection of letters. This group name, along with others, can then be used in a format string for the Alpha Digit String SpeechObject. Each group can also enable or disable digit recognition.
- Identify
- This class is used by Confirm and Correct during the Identification phases to obtain the user's selection of which Object's value was incorrect. It will ask a question like "Which would you like to change - X, Y, or Z?" It is rare that a developer will need to override the default instance for this class.
- Playable
There are many classes that implement the Playable interface. Approximately speaking, a Playable can be any concatenation of silence, recorded audio, TTS, random prompts (2), and escalating prompts (3). See Section 3.3 for a longer explanation of Playables.
- Range
This class is a property of US Currency that captures the range as an order of magnitude. For example, Range[1,3] captures a range of $0 to $999. By default, the Range is [1,8].
- RecResult
- This is the standard result object from a single recognition and includes such things as a recognized text string, a confidence score, natural language interpretations and their scores.
- ResultFilter
- This class (actually an interface) provides one method, pass, which examines a SpeechObject.Result and returns a boolean indicating whether or not the specified result passes this filter. It is used internally by the Dialog SpeechObject and its subclasses to postprocess recognition results.
- SODate.DateLimit
- Base class allowing absolute and relative date specifiers to set a lower or upper date for use by the Date SpeechObject.
- SODayOfMonth
- The SpeechObject responsible for getting the date of the month. It is invoked by the Date SpeechObject if it is determined that the day of the month has not been specified and cannot be inferred.
- SODisambiguateCurrency
- The SpeechObject responsible for disambiguating an ambiguous currency expression. When invoked by the Currency SpeechObject, it lists all the possible interpretations of the recognition result and asks the caller to specify the actual amount in dollars and cents. For example, if the caller said "two fifty" it asks, "Did you mean two dollars fifty cents or two hundred and fifty dollars?". The caller inputs an amount in this state that will override the amount obtained by the Currency SpeechObject.
- SODisambiguateTime
- The SpeechObject responsible for disambiguating an ambiguous time expression. When invoked by the Time SpeechObject, it asks whether the time is "in the morning or in the evening", or "in the afternoon or in the morning", and so on, depending on the candidate time (i.e. the time to be disambiguated). For the value "12", for example, it asks "Is that twelve noon or midnight?"
- WeightedFormat
- This is a convenience class used by the Sectioned Digit String SpeechObject to represent a weighted format. It consists of a string representing the sectioning, for example "DDDD-DD-DD", and of an associated probability.
Footnotes
(1) e.g. one could set a parameter to have a linked list as a value but not to add an element to the end of the list - setting to a value is allowed, but executing a function on the value is not. Of course, the calling application is free to check the value, compute a new value using this value, and set the parameter to the new value.
(2) a set of Playables, one of which is selected at random each time the random prompt is to be played.
(3) an ordered set of Playables 1 ... n such that 1 is played the first time the escalating prompt is to be played, 2 the second, and so on.