Speaker diarization with early-stop clustering

Invention Grant

US12112759B2 Speaker diarization with early-stop clustering (Active)


Patent Title: Speaker diarization with early-stop clustering
Application No.: US17432454

Application Date: 2019-03-29
Publication No.: US12112759B2

Publication Date: 2024-10-08
Inventor: Liping Chen, Kao-Ping Soong
Applicant: Microsoft Technology Licensing, LLC
Applicant Address: US WA Redmond
Assignee: Microsoft Technology Licensing, LLC
Current Assignee: Microsoft Technology Licensing, LLC
Current Assignee Address: US WA Redmond
Agency: Schwegman Lundberg & Woessner, P.A.
International Application: PCT/CN2019/080617 (2019-03-29)
International Publication: WO2020/199013A (2020-10-08)
National Phase Entry Date: 2021-08-19
Main IPC: G10L17/16
IPC: G10L17/16; G10L17/02; G10L17/06; G10L17/18; G10L21/028

Abstract:

A method and apparatus for speaker diarization with early-stop clustering. The method comprises: segmenting an audio stream into at least one speech segment (710), the audio stream comprising speech from at least one speaker; clustering the at least one speech segment into a plurality of clusters (720), the number of clusters being greater than the number of speakers; selecting, from the plurality of clusters, at least one cluster of the highest similarity (730), the number of selected clusters being equal to the number of speakers; establishing a speaker classification model based on the selected at least one cluster (740); and aligning, through the speaker classification model, speech frames in the audio stream to the at least one speaker (750).
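
The five steps in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patented implementation: segments and frames are taken to be fixed-length embedding vectors already extracted from the audio (so step 710 is treated as given), similarity is cosine similarity, "highest similarity" is read as highest mean within-cluster similarity, and the speaker classification model is a simple nearest-centroid classifier standing in for whatever model the patent actually uses (the IPC class G10L17/16 suggests hidden Markov models). All function names are hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def early_stop_cluster(segments, stop_at):
    """Step 720 (assumed form): agglomerative clustering of segment
    embeddings, halted once stop_at clusters remain. stop_at is chosen
    larger than the number of speakers, hence 'early stop'."""
    clusters = [[i] for i in range(len(segments))]
    while len(clusters) > stop_at:
        centroids = np.array([segments[c].mean(axis=0) for c in clusters])
        sim = cosine_sim(centroids, centroids)
        np.fill_diagonal(sim, -np.inf)  # never merge a cluster with itself
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        i, j = min(i, j), max(i, j)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def select_clusters(segments, clusters, num_speakers):
    """Step 730 (assumed form): keep the num_speakers clusters with the
    highest mean within-cluster similarity."""
    def cohesion(c):
        if len(c) < 2:
            return -np.inf  # a singleton offers no similarity evidence
        sim = cosine_sim(segments[c], segments[c])
        upper = np.triu_indices(len(c), k=1)
        return sim[upper].mean()
    return sorted(clusters, key=cohesion, reverse=True)[:num_speakers]

def align_frames(frames, segments, selected):
    """Steps 740-750 (stand-in): build one centroid per selected cluster
    and assign each frame to the most similar centroid."""
    centroids = np.array([segments[c].mean(axis=0) for c in selected])
    return np.argmax(cosine_sim(frames, centroids), axis=1)

# Toy data: two speakers plus two impure segments (overlap, noise).
segs = np.array([
    [1.00, 0.00, 0.0], [0.98, 0.05, 0.0], [0.99, 0.02, 0.0],  # speaker A
    [0.00, 1.00, 0.0], [0.05, 0.98, 0.0], [0.02, 0.99, 0.0],  # speaker B
    [0.70, 0.70, 0.1],                                        # overlap
    [0.00, 0.10, 1.0],                                        # noise
])
frames = np.array([[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.95, 0.0, 0.05]])

clusters = early_stop_cluster(segs, stop_at=4)   # 4 clusters > 2 speakers
selected = select_clusters(segs, clusters, num_speakers=2)
labels = align_frames(frames, segs, selected)
```

Stopping at four clusters instead of two leaves the overlap and noise segments in their own clusters, so they are discarded at the selection step and do not contaminate the per-speaker centroids. Note that this simple selection rule does not guard against picking two clusters that belong to the same speaker.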

Public/Granted literature

US20220122615A1 SPEAKER DIARIZATION WITH EARLY-STOP CLUSTERING Public/Granted day: 2022-04-21


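IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/16	.Hidden Markov models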