Providing high quality speech recognition

Invention Grant

US11721324B2 Providing high quality speech recognition (Active)

Patent Title: Providing high quality speech recognition
Application No.: US17343431

Application Date: 2021-06-09
Publication No.: US11721324B2

Publication Date: 2023-08-08
Inventor: Yuan Jin, Xi Xi Liu, Li Ping Wang, Fan Xiao Xin, Zheng Ping Chu
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Shackelford, Bowen, McKinley & Norton, LLP
Agent: Robert A. Voigt, Jr.
Main IPC: G10L15/30
IPC: G10L15/30; G10L15/02; G06N3/08; G10L25/51; G10L25/30

Abstract:

A computer-implemented method, system and computer program product for providing high quality speech recognition. A first speech-to-text model is selected to perform speech recognition of a customer's spoken words and a second speech-to-text model is selected to perform speech recognition of the agent's spoken words during a call. The combined results of the speech-to-text models used to process the customer's and agent's spoken words are then analyzed to generate a reference speech-to-text result. The customer speech data that was processed by the first speech-to-text model is reprocessed by multiple other speech-to-text models. A similarity analysis is performed on the results of these speech-to-text models with respect to the reference speech-to-text result resulting in similarity scores being assigned to these speech-to-text models. The speech-to-text model with the highest similarity score is then selected as the new speech-to-text model for performing speech recognition of the customer's spoken words during the call.
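The re-selection step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not name a similarity metric, so word-level `difflib.SequenceMatcher` is used here as an assumed stand-in, and `SttModel`, `similarity`, and `select_model` are hypothetical names introduced for illustration. The reference speech-to-text result is taken as an input rather than derived from the combined customer and agent transcripts.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Tuple

# Hypothetical type: a speech-to-text model is any callable mapping audio bytes to text.
SttModel = Callable[[bytes], str]


def similarity(candidate: str, reference: str) -> float:
    """Word-level similarity in [0, 1]. SequenceMatcher is an assumed metric,
    not one named in the abstract."""
    return SequenceMatcher(
        None, candidate.lower().split(), reference.lower().split()
    ).ratio()


def select_model(
    customer_audio: bytes,
    reference_text: str,
    candidates: Dict[str, SttModel],
) -> Tuple[str, Dict[str, float]]:
    """Reprocess the customer audio with every candidate model, score each
    transcript against the reference result, and return the best model name
    together with all similarity scores."""
    scores = {
        name: similarity(model(customer_audio), reference_text)
        for name, model in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

In this sketch the model with the highest score would replace the first speech-to-text model for the remainder of the call. In practice each candidate would wrap a real recognizer instead of a plain callable.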

Public/Granted literature

US20220399006A1 PROVIDING HIGH QUALITY SPEECH RECOGNITION Public/Granted day: 2022-12-15

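IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/28	.Constructional details of speech recognition systems
G10L15/30	..Distributed recognition, e.g. in client-server systems, for mobile phones or network applications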