Multi-type acoustic feature integration method and system based on deep neural networks
Abstract:
The application discloses a multi-type acoustic feature integration method and system based on deep neural networks. The method and system train a multi-type acoustic feature integration model based on deep neural networks using a labeled speech data set, thereby determining or updating the model's network parameters; multiple types of acoustic features extracted from test speech are then fed into the trained model to extract deep integrated feature vectors at the frame level or segment level. The solution supports integrated feature extraction over multiple types of acoustic features for different kinds of speech tasks, such as speech recognition, speech wake-up, spoken language recognition, speaker recognition, and anti-spoofing. It encourages the deep neural networks to exploit the internal correlations between multiple types of acoustic features according to the practical speech task, improving the recognition accuracy and stability of speech applications.
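The abstract does not specify a network architecture, so the following is only a minimal sketch of the general idea: several per-frame acoustic feature types (here, hypothetical 40-dim MFCC and 64-dim filterbank features) are fused by a deep network into frame-level integrated vectors, which can then be pooled into a segment-level vector. The dimensions, the concatenation-plus-hidden-layer fusion, and mean pooling are illustrative assumptions, not the patented model.

```python
import numpy as np

rng = np.random.default_rng(0)

def integrate_features(feature_list, weights, biases):
    """Fuse multiple per-frame acoustic feature types into one deep vector.
    feature_list: list of (n_frames, dim_i) arrays, one per feature type.
    A single tanh layer stands in for the deep integration network."""
    # Frame-level concatenation of all feature types
    x = np.concatenate(feature_list, axis=1)   # (n_frames, sum of dims)
    return np.tanh(x @ weights + biases)       # frame-level integrated vectors

# Hypothetical features for a 200-frame utterance
mfcc = rng.standard_normal((200, 40))    # e.g. 40-dim MFCC per frame
fbank = rng.standard_normal((200, 64))   # e.g. 64-dim filterbank per frame

# Randomly initialized layer (trained from labeled data in the real system)
W = rng.standard_normal((104, 128)) * 0.1
b = np.zeros(128)

frame_vecs = integrate_features([mfcc, fbank], W, b)  # (200, 128), frame level
segment_vec = frame_vecs.mean(axis=0)                 # (128,), segment level
```

In the patented system the network parameters would be learned from the labeled speech data set for the target task (e.g. speaker recognition), rather than randomly initialized, and the pooling into a segment-level vector would likewise be task-dependent.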