Reducing recording time when constructing a concatenative TTS voice using a reduced script and pre-recorded speech assets
    1.
    Granted invention patent
    Status: in force

    Publication number: US08019605B2

    Publication date: 2011-09-13

    Application number: US11748256

    Filing date: 2007-05-14

    CPC classification number: G10L13/04

    Abstract: The present invention discloses a system and a method for creating a reduced script, which is read by a voice talent to create a concatenative text-to-speech (TTS) voice. The method can automatically process pre-recorded audio to derive speech assets for a concatenative TTS voice. The pre-recorded audio can include sets of recorded phrases used by a speech user interface (SUI). A set of unfulfilled speech assets needed for full phonetic coverage of the concatenative TTS voice can be determined. A reduced script can be constructed that includes a set of phrases, which when read by a voice talent result in a reduced corpus. When the reduced corpus is automatically processed, a reduced set of speech assets result. The reduced set includes each of the unfulfilled speech assets. When this reduced corpus is combined with existing speech assets the result will be a voice with a complete set of speech assets.
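
The selection step in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that speech assets are represented as string labels and that the reduced script is chosen with a greedy set-cover heuristic; the abstract does not say how phrases are selected, and the function names and toy data below are hypothetical.

```python
def unfulfilled_assets(required, existing):
    """Assets still missing once the pre-recorded audio has been processed."""
    return set(required) - set(existing)


def build_reduced_script(candidates, missing):
    """Greedy cover: keep picking the phrase that supplies the most missing assets.

    candidates maps each candidate phrase to the set of assets it would yield
    when recorded. Returns the chosen phrases and any assets left uncovered.
    """
    remaining = set(missing)
    script = []
    while remaining:
        best = max(candidates, key=lambda p: len(candidates[p] & remaining), default=None)
        if best is None or not (candidates[best] & remaining):
            break  # no candidate phrase covers what is left
        script.append(best)
        remaining -= candidates[best]
    return script, remaining


# Hypothetical toy data: the labels stand in for phonetic units.
required = {"a-b", "b-c", "c-d", "d-e", "e-f"}
existing = {"a-b", "c-d"}  # assumed to come from pre-recorded prompts
candidates = {
    "phrase one": {"b-c", "d-e"},
    "phrase two": {"e-f"},
    "phrase three": {"b-c"},
}
missing = unfulfilled_assets(required, existing)
script, uncovered = build_reduced_script(candidates, missing)
print(script, uncovered)
```

Greedy selection does not guarantee the shortest possible script, but it is a common and simple way to keep the number of phrases a voice talent must read small.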

    Adjusting a speech engine for a mobile computing device based on background noise
    2.
    Granted invention patent
    Status: in force

    Publication number: US09076454B2

    Publication date: 2015-07-07

    Application number: US13358097

    Filing date: 2012-01-25

    CPC classification number: G10L21/0208 G10L15/20

    Abstract: Methods, apparatus, and products are disclosed for adjusting a speech engine for a mobile computing device based on background noise, the mobile computing device operatively coupled to a microphone, that include: sampling, through the microphone, background noise for a plurality of operating environments in which the mobile computing device operates; generating, for each operating environment, a noise model in dependence upon the sampled background noise for that operating environment; and configuring the speech engine for the mobile computing device with the noise model for the operating environment in which the mobile computing device currently operates.
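
A minimal sketch of the three steps named in this abstract (sample, build one noise model per operating environment, configure the engine with the current environment's model). The noise model here is simply the mean and variance of sample energy, which is an assumption made for illustration; `NoiseModel`, `build_noise_model`, and `SpeechEngine` are hypothetical names, not an API described in the patent.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass(frozen=True)
class NoiseModel:
    mean_energy: float
    variance: float


def build_noise_model(samples):
    """Summarize sampled background noise as energy statistics (an assumed model)."""
    energies = [s * s for s in samples]
    return NoiseModel(fmean(energies), pvariance(energies))


class SpeechEngine:
    """Hypothetical stand-in for a real speech engine."""

    def __init__(self):
        self.noise_model = None

    def configure(self, model):
        self.noise_model = model


# Hypothetical microphone samples for two operating environments.
samples_by_env = {
    "car": [0.5, -0.5, 0.5, -0.5],
    "office": [0.1, -0.1, 0.1, -0.1],
}
models = {env: build_noise_model(s) for env, s in samples_by_env.items()}

engine = SpeechEngine()
current_environment = "car"  # assumed to be detected elsewhere
engine.configure(models[current_environment])
```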

    Enhancing media playback with speech recognition
    3.
    Granted invention patent
    Status: in force

    Publication number: US08478592B2

    Publication date: 2013-07-02

    Application number: US12180583

    Filing date: 2008-07-28

    CPC classification number: G10L15/22 G10L15/19 G10L2015/228

    Abstract: A method for enhancing a media file to enable speech-recognition of spoken navigation commands can be provided. The method can include receiving a plurality of textual items based on subject matter of the media file and generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine. The method can further include associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar. The method can further include associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
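
The association between grammars, time stamps, and media locations might look roughly like the sketch below. It assumes each "grammar" is just a small set of accepted phrases; a real system would more likely use a grammar format understood by its recognizer. All names and the chapter data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedGrammar:
    phrases: frozenset   # utterances this grammar accepts
    timestamp_s: float   # location of the textual item in the media file


def make_grammar(text, timestamp_s):
    """Build one grammar for one textual item (simplified to a phrase set)."""
    phrase = text.lower().strip()
    accepted = {phrase, f"go to {phrase}", f"skip to {phrase}"}
    return TimedGrammar(frozenset(accepted), timestamp_s)


def attach_grammars(items):
    """Generate a grammar with a time stamp for each (text, seconds) item."""
    return [make_grammar(text, ts) for text, ts in items]


def locate(grammars, recognized):
    """Map recognized speech to a media location, or None if nothing matches."""
    spoken = recognized.lower().strip()
    for grammar in grammars:
        if spoken in grammar.phrases:
            return grammar.timestamp_s
    return None


# Hypothetical textual items derived from the media file's subject matter.
chapters = [("Opening Credits", 0.0), ("The Chase", 754.5), ("Finale", 5120.0)]
grammars = attach_grammars(chapters)
position = locate(grammars, "Go to the chase")
print(position)
```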

    Partially filling mixed-initiative forms from utterances having sub-threshold confidence scores based upon word-level confidence data
    4.
    Granted invention patent
    Status: in force

    Publication number: US07870000B2

    Publication date: 2011-01-11

    Application number: US11692741

    Filing date: 2007-03-28

    CPC classification number: G10L15/22 G10L15/193

    Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
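
The partial-fill logic described here can be sketched as follows. The threshold values, the field names, and the choice to re-prompt when a score exactly equals the threshold are assumptions for illustration; the abstract only distinguishes scores above and below a threshold.

```python
def fill_form(utterance_conf, elements, utterance_threshold=0.8, element_threshold=0.6):
    """Return (accepted field values, fields that need a follow-up prompt).

    elements maps a field name to a (value, element-level confidence) pair.
    """
    if utterance_conf >= utterance_threshold:
        # The whole utterance is trusted, so every field is filled.
        return {field: value for field, (value, _) in elements.items()}, []
    # Utterance-level score is below threshold: fall back to per-element scores.
    accepted = {f: v for f, (v, c) in elements.items() if c > element_threshold}
    reprompt = [f for f, (_, c) in elements.items() if c <= element_threshold]
    return accepted, reprompt


# Hypothetical recognition result for "from Boston to Austin on Friday".
elements = {
    "origin": ("Boston", 0.92),
    "destination": ("Austin", 0.41),
    "date": ("Friday", 0.85),
}
form, reprompt = fill_form(0.55, elements)
prompt = ("Please say the " + " and ".join(reprompt) + ".") if reprompt else ""
print(form, prompt)
```

The caller keeps the two confident values and asks only for the destination, rather than discarding the whole utterance.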

    SYSTEM AND METHOD FOR IMPROVING MESSAGE DELIVERY IN VOICE SYSTEMS UTILIZING MICROPHONE AND TARGET SIGNAL-TO-NOISE RATIO
    5.
    Invention patent application
    Status: in force

    Publication number: US20080147386A1

    Publication date: 2008-06-19

    Application number: US11612329

    Filing date: 2006-12-18

    CPC classification number: G10L21/0208

    Abstract: A method for delivering a message to a recipient in an environment with ambient noise includes the steps of recording the ambient noise in the environment at a certain time interval, analyzing the recorded ambient noise to obtain an average power P_noise or a RMS amplitude A_noise of the ambient noise, providing a predetermined desired SNR_desired, calculating an average signal power P_signal or a RMS amplitude A_signal of the message to be delivered based on the P_noise or A_noise and the desired SNR_desired, and adjusting a volume of the message to be delivered according to the P_signal or A_signal. Alternatively, the actual SNR_actual will be computed and the message will be repeated if the SNR_actual falls below the SNR_min. Systems for delivering a message to a recipient in an environment with ambient noise and computer-readable media having computer-executable instructions for carrying out the methods are also provided.
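
The calculation in this abstract can be worked through in a short sketch. It assumes the signal-to-noise ratios are expressed in decibels, which the abstract does not state; under that assumption the required signal RMS amplitude is the noise RMS amplitude times 10^(SNR/20), and the required signal power is the noise power times 10^(SNR/10). The function names and sample values are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a block of recorded samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def required_signal_rms(noise_rms, snr_desired_db):
    """Signal RMS amplitude needed to reach the desired SNR (dB assumed)."""
    return noise_rms * 10 ** (snr_desired_db / 20)


def actual_snr_db(signal_rms, noise_rms):
    """SNR in dB actually achieved for a given signal and noise amplitude."""
    return 20 * math.log10(signal_rms / noise_rms)


def should_repeat(signal_rms, noise_rms, snr_min_db):
    """Repeat the message when the achieved SNR falls below the minimum."""
    return actual_snr_db(signal_rms, noise_rms) < snr_min_db


# Hypothetical recorded ambient noise and a desired SNR of 20 dB.
noise = [0.02, -0.02, 0.02, -0.02]
a_noise = rms(noise)
a_signal = required_signal_rms(a_noise, 20.0)
print(a_noise, a_signal, should_repeat(a_signal, a_noise, 6.0))
```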

    Associating file types with web-based applications for automatically launching the associated application
    6.
    Granted invention patent
    Status: in force

    Publication number: US08990697B2

    Publication date: 2015-03-24

    Application number: US11834315

    Filing date: 2007-08-06

    CPC classification number: G06F3/048

    Abstract: The present invention discloses a launching engine configured to automatically launch a Web site and load an electronic document responsive to a launching event for the electronic document. The launching engine can be a component of a computer operating system (e.g., MAC OS, OS/2, WINDOWS XP, etc.) or a graphics management component (e.g., KDE, GNOME, etc.) of a computer. A launching event can be initiated by user selection of a document icon, a user selection of an electronic document from a file management application, a launching script for the electronic document triggered by a media insertion action, and the like.
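
One way to picture the file-type association is a lookup from file extension to a Web application URL, resolved when a launching event occurs. This is only a sketch: the association table, the `example.com` URL, and the `file` query parameter are invented for illustration, and the step of actually opening a browser and uploading the document is left out.

```python
from pathlib import PurePath
from urllib.parse import quote

# Hypothetical association table: file extension -> Web application URL template.
ASSOCIATIONS = {
    ".doc": "https://docs.example.com/open?file={path}",
    ".xls": "https://sheets.example.com/open?file={path}",
}


def launch_url(document_path):
    """Return the Web application URL for a document, or None if unassociated."""
    suffix = PurePath(document_path).suffix.lower()
    template = ASSOCIATIONS.get(suffix)
    if template is None:
        return None
    return template.format(path=quote(document_path, safe=""))


# A launching event (for example, selecting a document icon) would call this.
url = launch_url("/home/user/report.DOC")
print(url)
```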

    System and method for improving message delivery in voice systems utilizing microphone and target signal-to-noise ratio
    7.
    Granted invention patent
    Status: in force

    Publication number: US08027437B2

    Publication date: 2011-09-27

    Application number: US11612329

    Filing date: 2006-12-18

    CPC classification number: G10L21/0208

    Abstract: A method for delivering a message to a recipient in an environment with ambient noise includes the steps of recording the ambient noise in the environment at a certain time interval, analyzing the recorded ambient noise to obtain an average power P_noise or a RMS amplitude A_noise of the ambient noise, providing a predetermined desired SNR_desired, calculating an average signal power P_signal or a RMS amplitude A_signal of the message to be delivered based on the P_noise or A_noise and the desired SNR_desired, and adjusting a volume of the message to be delivered according to the P_signal or A_signal. Alternatively, the actual SNR_actual will be computed and the message will be repeated if the SNR_actual falls below the SNR_min. Systems for delivering a message to a recipient in an environment with ambient noise and computer-readable media having computer-executable instructions for carrying out the methods are also provided.

    ENHANCING MEDIA PLAYBACK WITH SPEECH RECOGNITION
    8.
    Invention patent application
    Status: in force

    Publication number: US20100010814A1

    Publication date: 2010-01-14

    Application number: US12180583

    Filing date: 2008-07-28

    CPC classification number: G10L15/22 G10L15/19 G10L2015/228

    Abstract: A method for enhancing a media file to enable speech-recognition of spoken navigation commands can be provided. The method can include receiving a plurality of textual items based on subject matter of the media file and generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine. The method can further include associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar. The method can further include associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.

    Adjusting A Speech Engine For A Mobile Computing Device Based On Background Noise
    9.
    Invention patent application
    Status: in force

    Publication number: US20090271188A1

    Publication date: 2009-10-29

    Application number: US12109151

    Filing date: 2008-04-24

    CPC classification number: G10L21/0208 G10L15/20

    Abstract: Methods, apparatus, and products are disclosed for adjusting a speech engine for a mobile computing device based on background noise, the mobile computing device operatively coupled to a microphone, that include: sampling, through the microphone, background noise for a plurality of operating environments in which the mobile computing device operates; generating, for each operating environment, a noise model in dependence upon the sampled background noise for that operating environment; and configuring the speech engine for the mobile computing device with the noise model for the operating environment in which the mobile computing device currently operates.

    VOICE RECOGNITION INTERACTIVE SYSTEM
    10.
    Invention patent application
    Status: in force

    Publication number: US20080140400A1

    Publication date: 2008-06-12

    Application number: US11609667

    Filing date: 2006-12-12

    CPC classification number: G10L15/22

    Abstract: A system and method for voice recognition interaction is provided. The system can have a processor for receiving a voice signal and determining a command based on the voice signal. The system can also have a confirmation interface operably connected to the processor, where the confirmation interface is capable of receiving a confirmation signal from a user and providing the confirmation signal to the processor. The system can have a user identifying device for determining an identity of the user. The processor can determine a confirmation criteria based at least in part on the identity of the user or a type of the command. The satisfaction of the confirmation criteria can be applied to allow or prevent performance of the command.
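
The confirmation step could be sketched as a small policy function. The user roles, command types, and the specific rules below are assumptions chosen for illustration; the abstract says only that the criteria depend at least in part on the user's identity or the type of command.

```python
def confirmation_required(user_role, command_type):
    """Decide whether a command must be confirmed before it is performed."""
    if command_type == "critical":
        return True                    # always confirm critical commands
    if command_type == "routine":
        return user_role != "trusted"  # trusted users skip routine confirmations
    return True                        # unknown command types: confirm to be safe


def execute(command, command_type, user_role, confirmed):
    """Allow or prevent a command depending on whether the criteria are satisfied."""
    if confirmation_required(user_role, command_type) and not confirmed:
        return f"blocked: {command} needs confirmation"
    return f"performed: {command}"


# Hypothetical commands; 'confirmed' would come from the confirmation interface.
print(execute("unlock doors", "critical", "trusted", confirmed=False))
print(execute("raise volume", "routine", "trusted", confirmed=False))
```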
