Invention Grant
- Patent Title: Permutation invariant training for talker-independent multi-talker speech separation
- Application No.: US15226527
- Application Date: 2016-08-02
- Publication No.: US10249305B2
- Publication Date: 2019-04-02
- Inventor: Dong Yu
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Schwegman Lundberg & Woessner, P.A.
- Main IPC: G10L21/0272
- IPC: G10L21/0272 ; G06K9/62 ; G10L17/04 ; G10L17/18 ; G10L19/022 ; G10L21/0208

Abstract:
The techniques described herein improve methods to equip a computing device to conduct automatic speech recognition (“ASR”) in talker-independent multi-talker scenarios. In some examples, permutation invariant training of deep learning models can be used for talker-independent multi-talker scenarios. In some examples, the techniques can determine a permutation-considered assignment between a model's estimate of a source signal and the source signal. In some examples, the techniques can include training the model generating the estimate to minimize a deviation of the permutation-considered assignment. These techniques can be implemented into a neural network's structure itself, solving the label permutation problem that prevented making progress on deep learning based techniques for speech separation. The techniques discussed herein can also include source tracing to trace streams originating from a same source through the frames of a mixed signal.
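As a rough illustration of the permutation-considered assignment described in the abstract, the sketch below scores every possible pairing of model estimates with reference sources and keeps the pairing with the smallest deviation. It is a minimal sketch under stated assumptions, not the patented implementation: mean squared error is assumed as the deviation measure, the search over permutations is exhaustive, and the names `mse` and `pit_loss` are hypothetical.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences (assumed deviation measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, sources):
    """Return (loss, perm) for the best estimate-to-source assignment.

    perm[i] is the index of the source paired with estimates[i].
    All permutations are tried, so the cost grows factorially with
    the number of sources; this is only practical for a few talkers.
    """
    n = len(sources)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average deviation over all output streams for this assignment.
        loss = sum(mse(estimates[i], sources[perm[i]]) for i in range(n)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two reference sources and model outputs that arrive in swapped order.
sources = [[1.0, 2.0], [5.0, 6.0]]
estimates = [[5.0, 6.0], [1.0, 2.0]]
loss, perm = pit_loss(estimates, sources)
```

In this example the outputs match the sources exactly but in swapped order, so the selected assignment pairs output 0 with source 1 and output 1 with source 0. In training, the minimum loss would be the quantity the model is optimized to reduce, so the order in which the model emits the talkers does not affect the result.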
Public/Granted literature
- US20170337924A1 PERMUTATION INVARIANT TRAINING FOR TALKER-INDEPENDENT MULTI-TALKER SPEECH SEPARATION, Public/Granted day: 2017-11-23