Invention Grant
- Patent Title: Long-context end-to-end speech recognition system
- Application No.: US17069538
- Application Date: 2020-10-13
- Publication No.: US11978435B2
- Publication Date: 2024-05-07
- Inventor: Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux
- Applicant: Mitsubishi Electric Research Laboratories, Inc.
- Applicant Address: US MA Cambridge
- Assignee: Mitsubishi Electric Research Laboratories, Inc.
- Current Assignee: Mitsubishi Electric Research Laboratories, Inc.
- Current Assignee Address: US MA Cambridge
- Agent: Gene Vinokur; Hironori Tsukamoto
- Main IPC: G10L15/07
- IPC: G10L15/07; G10L15/16; G10L15/22

Abstract:
This invention relates generally to speech processing and more particularly to end-to-end automatic speech recognition (ASR) that utilizes long contextual information. Some embodiments of the invention provide a system and a method for end-to-end ASR suitable for recognizing long audio recordings such as lectures and conversational speech. This disclosure includes a Transformer-based ASR system that utilizes contextual information, wherein the Transformer accepts multiple utterances at the same time and predicts the transcript for the last utterance. This is repeated in a sliding-window fashion with one-utterance shifts to recognize the entire recording. In addition, when the long audio recording includes multiple speakers, some embodiments of the present invention may use acoustic and/or text features obtained only from the previous utterances spoken by the same speaker as the last utterance.
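
The sliding-window procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Utterance` class, the `context_windows` and `transcribe_recording` helpers, the `context_size` parameter, and the `recognize_last` callable (a stand-in for the Transformer, which receives a window of utterances and returns a transcript for the last one) are all hypothetical names chosen for illustration. The sketch only shows how the windows might be formed, including the optional same-speaker restriction, and it does not model how text features from earlier transcripts would be passed along.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Sequence, Tuple


@dataclass
class Utterance:
    """Hypothetical container; `features` stands in for acoustic features."""
    features: str
    speaker: Optional[str] = None


def context_windows(
    utterances: Sequence[Utterance],
    context_size: int,
    same_speaker_only: bool = False,
) -> Iterator[Tuple[List[Utterance], Utterance]]:
    """Yield (context, target) pairs, shifting by one utterance each step."""
    for i, target in enumerate(utterances):
        previous = list(utterances[:i])
        if same_speaker_only:
            # Keep only earlier utterances spoken by the target's speaker.
            previous = [u for u in previous if u.speaker == target.speaker]
        # Guard against previous[-0:], which would return the whole list.
        context = previous[-context_size:] if context_size > 0 else []
        yield context, target


def transcribe_recording(
    utterances: Sequence[Utterance],
    recognize_last: Callable[[List[Utterance]], str],
    context_size: int = 2,
    same_speaker_only: bool = False,
) -> List[str]:
    """Recognize every utterance, each time using the preceding context."""
    transcripts = []
    for context, target in context_windows(
        utterances, context_size, same_speaker_only
    ):
        # The model sees the whole window but transcribes only the last item.
        transcripts.append(recognize_last(context + [target]))
    return transcripts
```

With `same_speaker_only=True`, an utterance whose speaker has not spoken before is recognized with an empty context, which is one plausible reading of the same-speaker embodiment rather than a detail stated in the abstract.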
Public/Granted literature
- US20220115006A1 Long-context End-to-end Speech Recognition System Public/Granted day: 2022-04-14