Invention Grant
- Patent Title: Multimodal speech recognition for real-time video audio-based display indicia application
- Application No.: US14967726
- Application Date: 2015-12-14
- Publication No.: US09959872B2
- Publication Date: 2018-05-01
- Inventor: Priscilla Barreira Avegliano, Carlos Henrique Cardonha, Stefany Mazon, Julio Nogima
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Agency: Cantor Colburn LLP
- Main IPC: G10L15/00
- IPC: G10L15/00; G10L15/26; G10L15/32; G10L21/10; G10L21/18; H04N21/488; H04N21/44; H04N21/439; H04N21/84; H04N21/845; G10L21/06

Abstract:
Aspects relate to computer-implemented methods, systems, and processes to automatically generate audio-based display indicia of media content, including: receiving, by a processor, a plurality of media content categories including at least one feature; receiving a plurality of categorized speech recognition algorithms, each speech recognition algorithm being associated with a respective one or more of the plurality of media content categories; determining a media content category of a current media content based on at least one feature of the current media content; selecting one speech recognition algorithm from the plurality of categorized speech recognition algorithms based on the determination of the media content category of the current media content; and applying the selected speech recognition algorithm to the current media content.
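
The flow described in the abstract (categorize the current content, select the recognizer associated with that category, apply it) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `MediaCategory`, `determine_category`, `select_recognizer`, and `caption`, the feature-overlap matching rule, and the stub recognizers are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable

# Hypothetical type: a recognizer turns raw audio bytes into caption text.
Recognizer = Callable[[bytes], str]


@dataclass(frozen=True)
class MediaCategory:
    name: str
    features: FrozenSet[str]  # each category includes at least one feature


def determine_category(categories: Iterable[MediaCategory],
                       content_features: FrozenSet[str]) -> MediaCategory:
    """Pick the category sharing the most features with the current content.

    Feature overlap is an assumed matching rule, chosen only for illustration.
    """
    best = max(categories,
               key=lambda c: len(c.features & content_features),
               default=None)
    if best is None or not (best.features & content_features):
        raise LookupError("no category matches the content features")
    return best


def select_recognizer(recognizers: Dict[str, Recognizer],
                      category: MediaCategory) -> Recognizer:
    """Look up the recognizer associated with the determined category."""
    return recognizers[category.name]


def caption(categories: Iterable[MediaCategory],
            recognizers: Dict[str, Recognizer],
            content_features: FrozenSet[str],
            audio: bytes) -> str:
    """Determine the category, select a recognizer, and apply it to the audio."""
    category = determine_category(categories, content_features)
    recognizer = select_recognizer(recognizers, category)
    return recognizer(audio)


# Stub recognizers standing in for real category-specific speech models.
def sports_recognizer(audio: bytes) -> str:
    return f"[sports model] {len(audio)} bytes transcribed"


def news_recognizer(audio: bytes) -> str:
    return f"[news model] {len(audio)} bytes transcribed"


CATEGORIES = [
    MediaCategory("sports", frozenset({"crowd_noise", "scoreboard"})),
    MediaCategory("news", frozenset({"studio", "anchor_desk"})),
    MediaCategory("talk_show", frozenset({"studio", "audience"})),
]

# One recognizer may serve one or more categories, as in the abstract.
RECOGNIZERS: Dict[str, Recognizer] = {
    "sports": sports_recognizer,
    "news": news_recognizer,
    "talk_show": news_recognizer,
}
```

In this sketch, calling `caption(CATEGORIES, RECOGNIZERS, frozenset({"scoreboard"}), audio)` routes the audio to the sports stub. A real system would presumably derive features from the video and audio streams rather than receive them as ready-made labels.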
Public/Granted literature
- US20170169827A1 MULTIMODAL SPEECH RECOGNITION FOR REAL-TIME VIDEO AUDIO-BASED DISPLAY INDICIA APPLICATION (Public/Granted day: 2017-06-15)