Systems and methods for principled bias reduction in production speech models

Invention Grant

US10657955B2 Systems and methods for principled bias reduction in production speech models (In force)

Patent Title: Systems and methods for principled bias reduction in production speech models
Application No.: US15884239

Application Date: 2018-01-30
Publication No.: US10657955B2

Publication Date: 2020-05-19
Inventor: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
Applicant: Baidu USA, LLC
Applicant Address: US CA Sunnyvale
Assignee: Baidu USA LLC
Current Assignee: Baidu USA LLC
Current Assignee Address: US CA Sunnyvale
Agency: North Weber & Baugh LLP
Main IPC: G10L17/18
IPC: G10L17/18; G10L15/16; G10L15/04; G10L15/22; G10L15/02; G10L25/18

Abstract:

Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward-only recurrences may be removed in a deployed model.
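
The abstract names the CTC loss as the training objective. As a rough illustration only, and not the patented implementation, the following is a minimal pure-Python sketch of the standard CTC forward recursion, which sums the probability of every frame-level alignment that collapses to a given character sequence. The function names `ctc_label_probability` and `ctc_loss`, the choice of index 0 for the blank symbol, and the nested-list input layout are all assumptions made for this example; the sketch works in plain probability space and assumes at least one frame, whereas a production system would work in log space for numerical stability.

```python
import math


def ctc_label_probability(probs, labels, blank=0):
    """Sum the probability of all CTC alignments that collapse to `labels`.

    probs  : list of T frames (T >= 1), each a list of K symbol probabilities
             (hypothetical layout: softmax outputs, one row per frame)
    labels : target symbol indices, without blanks
    blank  : index of the blank symbol (assumed to be 0 here)
    """
    # Interleave blanks around the labels: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for symbol in labels:
        ext += [symbol, blank]
    size = len(ext)

    # Initialisation: an alignment may start with a blank or the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for frame in probs[1:]:
        prev = alpha
        alpha = [0.0] * size
        for s in range(size):
            total = prev[s]                      # stay on the same symbol
            if s >= 1:
                total += prev[s - 1]             # advance by one position
            # Skipping the blank between two labels is allowed only when
            # the two labels differ.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * frame[ext[s]]

    # Valid alignments end on the last label or on the trailing blank.
    result = alpha[size - 1]
    if size > 1:
        result += alpha[size - 2]
    return result


def ctc_loss(probs, labels, blank=0):
    """Negative log-likelihood of `labels`; infinite if no alignment exists."""
    p = ctc_label_probability(probs, labels, blank)
    return math.inf if p == 0.0 else -math.log(p)
```

For instance, with two frames of uniform probabilities over a blank and one character, `ctc_label_probability([[0.5, 0.5], [0.5, 0.5]], [1])` gives 0.75, because three of the four possible frame sequences (character-character, blank-character, character-blank) collapse to that single character.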

Public/Granted literature

US20180247643A1 SYSTEMS AND METHODS FOR PRINCIPLED BIAS REDUCTION IN PRODUCTION SPEECH MODELS Public/Granted day: 2018-08-30

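IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/18	. Artificial neural networks; connectionist approaches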