Layer 3 Collaboration

Layer 3 technology adds intelligence to the IVR dialogue, enabling better collaboration between the user and the machine. It minimizes errors and keeps the dialogue as effective, efficient, and satisfying as possible.

Turn Taking

Turn-taking problems that arise when barge-in is used with speech recognition are a major cause of user interface failure and caller frustration. The term “turn-taking” refers to the pattern of interaction when two or more people communicate using spoken language. At any given moment, any or all of the people in a conversation may be speaking, thinking of speaking, or remaining silent. Turn-taking is the protocol by which the participants in the conversation decide whether, and when, it is their turn to speak.

The normal pattern of turn-taking is for one person to speak at a time. However, there are instances where speakers overlap their speech. Turn-taking also exists in a conversation between a machine and a person, or “user”. Just as with human-human conversation, human-machine conversation must deal with the problem of interruptions.

Barge-in technology alone cannot resolve these turn-taking problems. SPT Layer 3 technology works whether or not barge-in is enabled and gracefully resolves all major turn-taking issues, including:

  • Yielding the floor
  • Holding the floor
  • Mutual back-off and re-start

Dynamic Mode-Switching

As the user interacts with the IVR, immediate circumstances determine which mode to present first. SPT directed dialogues include extremely well-researched methods for switching modes, degrading from speech to touch-tone in noise, upgrading from touch-tone to speech for speech-relevant interactions, detecting user modality preferences, and presenting effective prompts that support both modalities. In fact, the term ShadowPrompt® was originally coined as a descriptor for these multimodal prompting solutions.

Layer 3 technology includes multi-dimensional confidence. Prior to this invention, speech recognition systems returned only confidence values associated with the acoustic/phonetic patterns of the spoken input. Layer 3 extracts additional confidence dimensions, such as:

  • Speech timing (onset of speech)
  • Speech duration
  • State-completion
  • Mode (e.g. touch-tone or speech)
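
As a rough illustration of what a multi-dimensional confidence record could look like, the sketch below holds the dimensions listed above next to the conventional acoustic score and folds them into one number. The class name, field names, weights, and the weighted-average rule are all assumptions made for illustration; they are not taken from the Layer 3 product.

```python
from dataclasses import dataclass


@dataclass
class TurnConfidence:
    """Hypothetical per-turn record; numeric fields are assumed to lie in [0, 1]."""
    acoustic: float          # conventional acoustic/phonetic recognizer confidence
    onset_timing: float      # plausibility of when the speech started
    duration: float          # plausibility of how long the speech lasted
    state_completion: float  # how fully the dialogue state was completed
    mode: str                # "speech" or "touch-tone"


# Illustrative weights only; how the real dimensions are combined is not documented here.
WEIGHTS = {
    "acoustic": 0.4,
    "onset_timing": 0.2,
    "duration": 0.2,
    "state_completion": 0.2,
}


def combined_confidence(turn: TurnConfidence) -> float:
    """Weighted average of the numeric dimensions (an assumed combination rule)."""
    total = sum(WEIGHTS.values())
    weighted = sum(getattr(turn, name) * weight for name, weight in WEIGHTS.items())
    return weighted / total


turn = TurnConfidence(acoustic=0.9, onset_timing=0.5, duration=0.5,
                      state_completion=1.0, mode="speech")
score = combined_confidence(turn)
```

Keeping the dimensions separate until the last step means a dialogue can react to a single weak dimension, for example an implausible speech onset, even when the acoustic score alone looks acceptable.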

In addition, speech recognition systems did not track confidence between dialogue states prior to this invention. Layer 3 manages turn-taking across multiple turns of the dialogue so that mode (e.g. speech or touch-tone) can gracefully adapt to how well the application is performing. A mode confidence level parameter is introduced and adjusted up or down based on the confidence at each stage of the dialogue.
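
One way to picture a mode confidence level that persists across dialogue states is the minimal sketch below. The class name, the integer scale, and every threshold are hypothetical choices for illustration, not the actual Layer 3 parameters.

```python
class ModeConfidenceTracker:
    """Hypothetical tracker for a mode confidence level carried across dialogue states."""

    def __init__(self, level=5, max_level=10, degrade_at=2, upgrade_at=8):
        self.level = level            # current mode confidence level (integer scale)
        self.max_level = max_level    # upper bound of the scale
        self.degrade_at = degrade_at  # at or below this level, fall back to touch-tone
        self.upgrade_at = upgrade_at  # at or above this level, return to speech
        self.mode = "speech"

    def update(self, turn_confidence, high=0.7, low=0.4):
        """Adjust the level after one dialogue turn and return the mode to use next."""
        if turn_confidence >= high:
            self.level = min(self.max_level, self.level + 1)
        elif turn_confidence <= low:
            self.level = max(0, self.level - 1)
        # Two separate thresholds keep the mode from flipping back and forth
        # when the level hovers near a single cut-off.
        if self.level <= self.degrade_at:
            self.mode = "touch-tone"
        elif self.level >= self.upgrade_at:
            self.mode = "speech"
        return self.mode


tracker = ModeConfidenceTracker()
after_low_turns = [tracker.update(0.2) for _ in range(3)]
after_high_turns = [tracker.update(0.9) for _ in range(6)]
```

In this sketch, three consecutive low-confidence turns move the dialogue to touch-tone, and it returns to speech only after a sustained run of high-confidence turns, so a single good or bad turn does not change the mode.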

Open Speech (this is an optional add-on package for those with SLM applications)

For further information, please contact us.