SYSTEMS AND METHODS FOR ALIGNING A REFERENCE SEQUENCE OF SYMBOLS WITH HYPOTHESIS REQUIRING REDUCED PROCESSING AND MEMORY

    Publication No.: US20220115003A1

    Publication Date: 2022-04-14

    Application No.: US17069462

    Filing Date: 2020-10-13

    Applicant: Rev.com, Inc.

    Abstract: A method of determining an alignment sequence between a reference sequence of symbols and a hypothesis sequence of symbols includes loading a reference sequence of symbols to a computing system and creating a reference finite state automaton for the reference sequence of symbols. The method further includes loading a hypothesis sequence of symbols to the computing system and creating a hypothesis finite state automaton for the hypothesis sequence of symbols. The method further includes traversing the reference finite state automaton, adding new reference arcs and new reference transforming properties arcs and traversing the hypothesis finite state automaton, adding new hypothesis arcs and new hypothesis transforming properties arcs. The method further includes composing the hypothesis finite state automaton with the reference finite state automaton creating alternative paths to form a composed finite state automaton and tracking a number of the alternative paths created. The method further includes pruning the alternative paths based on likely top paths, backtracking over most likely paths of the composed finite state automaton, and rescoring edit-distances of the composed finite state automaton.
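
    For orientation, the sketch below shows what an alignment sequence and its edit distance look like for two short symbol sequences. It uses ordinary dynamic-programming edit distance with backtracking, not the automaton composition, path tracking, or pruning described in the abstract. The function name `align` and the operation labels are assumptions made for illustration.

```python
from typing import List, Tuple

Op = Tuple[str, str, str]  # (operation, reference symbol, hypothesis symbol)


def align(ref: List[str], hyp: List[str]) -> Tuple[int, List[Op]]:
    """Return (edit distance, alignment) for two symbol sequences.

    Illustrative baseline only: a full cost table is built, so no paths
    are pruned and no finite state automata are involved.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] holds the edit distance between ref[:i] and hyp[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete every reference symbol
    for j in range(1, m + 1):
        cost[0][j] = j  # insert every hypothesis symbol
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrack from the final cell to recover one lowest-cost path.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            kind = "match" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((kind, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], ""))
            i -= 1
        else:
            ops.append(("ins", "", hyp[j - 1]))
            j -= 1
    ops.reverse()
    return cost[n][m], ops
```

    Calling `align("a b c".split(), "a x c".split())` gives a distance of 1 with one substitution between two matches. The full table needs memory proportional to the product of the two sequence lengths, which is the kind of cost that a pruned approach such as the one in the abstract aims to reduce.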

    SYSTEMS AND METHODS FOR A TWO PASS DIARIZATION, AUTOMATIC SPEECH RECOGNITION, AND TRANSCRIPT GENERATION

    Publication No.: US20210050015A1

    Publication Date: 2021-02-18

    Application No.: US17087330

    Filing Date: 2020-11-02

    Applicant: Rev.com, Inc.

    Abstract: In one embodiment, a method for transcript generation includes receiving an audio file and dividing it into a plurality of chunks. The method further includes sending each instance of the plurality of chunks to a speech service module. The method further includes converting speech to text for each instance of the plurality of chunks and returning the text for each instance of the plurality of chunks. The method further includes merging the text for each instance of the plurality of chunks to yield an audio file transcript and sending the audio file and chunks to a diarization module. The method further includes performing first pass diarization on the chunks to yield a plurality of diarized chunks and performing second pass diarization on the plurality of diarized chunks and the audio file to yield a diarized audio file. The method further includes merging the files to yield a final transcript.
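
    The chunk, transcribe, and merge steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the names `split_into_chunks` and `transcribe_in_chunks` are hypothetical, the speech service is passed in as a plain callable, chunks are fixed-size, and the two diarization passes are not modeled because the abstract does not say how they work.

```python
from typing import Callable, List, Sequence


def split_into_chunks(samples: Sequence[float], chunk_size: int) -> List[Sequence[float]]:
    """Divide the audio samples into consecutive fixed-size chunks.

    The last chunk may be shorter. Fixed-size splitting is an assumption;
    the abstract does not state how chunk boundaries are chosen.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def transcribe_in_chunks(
    samples: Sequence[float],
    chunk_size: int,
    speech_service: Callable[[Sequence[float]], str],
) -> str:
    """Send each chunk to the speech service and merge the returned text."""
    chunks = split_into_chunks(samples, chunk_size)
    texts = [speech_service(chunk) for chunk in chunks]
    # Merge in chunk order, skipping chunks that produced no text.
    return " ".join(text for text in texts if text)
```

    Any callable that maps a chunk to a string can stand in for `speech_service` when trying this out. In a fuller pipeline the same chunks would also go to the diarization module before the final merge.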

    SYSTEMS AND METHODS FOR A TWO PASS DIARIZATION, AUTOMATIC SPEECH RECOGNITION, AND TRANSCRIPT GENERATION

    Publication No.: US20200135204A1

    Publication Date: 2020-04-30

    Application No.: US16177061

    Filing Date: 2018-10-31

    Applicant: Rev.com, Inc.

    Abstract: In one embodiment, a method for transcript generation includes receiving an audio file and dividing it into a plurality of chunks. The method further includes sending each instance of the plurality of chunks to a speech service module. The method further includes converting speech to text for each instance of the plurality of chunks and returning the text for each instance of the plurality of chunks. The method further includes merging the text for each instance of the plurality of chunks to yield an audio file transcript and sending the audio file and chunks to a diarization module. The method further includes performing first pass diarization on the chunks to yield a plurality of diarized chunks and performing second pass diarization on the plurality of diarized chunks and the audio file to yield a diarized audio file. The method further includes merging the files to yield a final transcript.

    Systems and methods for aligning a reference sequence of symbols with hypothesis requiring reduced processing and memory

    Publication No.: US12254866B2

    Publication Date: 2025-03-18

    Application No.: US17069462

    Filing Date: 2020-10-13

    Applicant: Rev.com, Inc.

    Abstract: A method of determining an alignment sequence between a reference sequence of symbols and a hypothesis sequence of symbols includes loading a reference sequence of symbols to a computing system and creating a reference finite state automaton for the reference sequence of symbols. The method further includes loading a hypothesis sequence of symbols to the computing system and creating a hypothesis finite state automaton for the hypothesis sequence of symbols. The method further includes traversing the reference finite state automaton, adding new reference arcs and new reference transforming properties arcs and traversing the hypothesis finite state automaton, adding new hypothesis arcs and new hypothesis transforming properties arcs. The method further includes composing the hypothesis finite state automaton with the reference finite state automaton creating alternative paths to form a composed finite state automaton and tracking a number of the alternative paths created. The method further includes pruning the alternative paths based on likely top paths, backtracking over most likely paths of the composed finite state automaton, and rescoring edit-distances of the composed finite state automaton.
