Invention Grant
- Patent Title: Multi-encoder end-to-end automatic speech recognition (ASR) for joint modeling of multiple input devices
- Application No.: US17354480
- Application Date: 2021-06-22
- Publication No.: US11978433B2
- Publication Date: 2024-05-07
- Inventor: Felix Weninger, Marco Gaudesi, Ralf Leibold, Puming Zhan
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Barta Jones, PLLC
- Main IPC: G10L19/02
- IPC: G10L19/02; G10L15/04; G10L21/0208; G10L25/24

Abstract:
An end-to-end automatic speech recognition (ASR) system includes: a first encoder configured for close-talk input captured by a close-talk input mechanism; a second encoder configured for far-talk input captured by a far-talk input mechanism; and an encoder selection layer configured to select at least one of the first and second encoders for use in producing ASR output. The selection is made based on at least one of short-time Fourier transform (STFT), Mel-frequency Cepstral Coefficient (MFCC) and filter bank derived from at least one of the close-talk input and the far-talk input. If signals from both the close-talk input mechanism and the far-talk input mechanism are present for a speech segment, the encoder selection layer dynamically selects between the close-talk encoder and the far-talk encoder to select the encoder that better recognizes the speech segment. An encoder-decoder model is used to produce the ASR output.
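The selection behavior described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the encoder selection layer scores its inputs, so everything below is an assumption made for illustration: the function names (`select_encoder`, `encode_segment`), the linear score over mean filter-bank features followed by a softmax, and the two placeholder encoders are all hypothetical and are not the patented implementation.

```python
import math
from typing import List, Optional, Sequence, Tuple

Frames = List[Sequence[float]]  # one feature vector per frame (e.g. filter-bank energies)


def close_talk_encoder(frames: Frames) -> List[List[float]]:
    # Placeholder for the first (close-talk) encoder: copies the features.
    return [list(f) for f in frames]


def far_talk_encoder(frames: Frames) -> List[List[float]]:
    # Placeholder for the second (far-talk) encoder: doubles the features.
    return [[2.0 * x for x in f] for f in frames]


def mean_vector(frames: Frames) -> List[float]:
    # Average each feature dimension over all frames of the segment.
    n = len(frames)
    return [sum(f[d] for f in frames) / n for d in range(len(frames[0]))]


def softmax(scores: Sequence[float]) -> List[float]:
    # Numerically stable softmax.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_encoder(
    close_feats: Optional[Frames],
    far_feats: Optional[Frames],
    weights: Sequence[float],
) -> Tuple[str, List[float]]:
    """Hypothetical selection layer: returns ("close" | "far", probabilities)."""
    if close_feats is None and far_feats is None:
        raise ValueError("at least one input stream must be present")
    # Only one input mechanism delivered a signal: use its encoder.
    if far_feats is None:
        return "close", [1.0, 0.0]
    if close_feats is None:
        return "far", [0.0, 1.0]
    # Both streams present: score each one from its features (assumed linear
    # score) and pick the encoder with the higher probability.
    scores = [
        sum(w * x for w, x in zip(weights, mean_vector(feats)))
        for feats in (close_feats, far_feats)
    ]
    probs = softmax(scores)
    return ("close" if probs[0] >= probs[1] else "far"), probs


def encode_segment(
    close_feats: Optional[Frames],
    far_feats: Optional[Frames],
    weights: Sequence[float],
) -> Tuple[str, List[List[float]]]:
    # Run only the selected encoder; its output would feed the decoder.
    choice, _ = select_encoder(close_feats, far_feats, weights)
    if choice == "close":
        return choice, close_talk_encoder(close_feats)
    return choice, far_talk_encoder(far_feats)
```

In this sketch the selection is made once per speech segment and is a hard choice. In a trained system the weights would be learned jointly with the encoders and the decoder rather than set by hand.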
Public/Granted literature
- US20220406295A1 MULTI-ENCODER END-TO-END AUTOMATIC SPEECH RECOGNITION (ASR) FOR JOINT MODELING OF MULTIPLE INPUT DEVICES Public/Granted day: 2022-12-22