Invention Grant
- Patent Title: Systems and methods for identification of repetitive language in document using linguistic analysis and correction thereof
- Application No.: US16902034
- Application Date: 2020-06-15
- Publication No.: US11544467B2
- Publication Date: 2023-01-03
- Inventor: Davide Turcato, Alfredo R. Arnaiz, Domenic Joseph Cipollone, Michael Wilson Daniels
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Main IPC: G06F40/30
- IPC: G06F40/30; G06N20/00; G06F40/211

Abstract:
The present disclosure relates to processing operations that provide a linguistic-based approach to evaluating repetition in the content of an electronic document. Rather than merely identifying occurrences of identical words or strings, as traditional language checks do, the approach detects terms, words, and phrases that native speakers of a language are likely to perceive as repetitious. The processing detects and evaluates terms or phrases using positive linguistic evidence derived from evaluating, in syntactic terms, the linguistic relationships between words in a string. This yields a more accurate and efficient determination of whether a term is truly repetitious at the linguistic level than traditional language checks provide. Compared with string-based evaluation, fewer flags are raised for repetitive or over-used language, but the repetition that is flagged is identified more precisely.
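
The contrast the abstract draws between string-based and linguistic repetition checks can be illustrated with a minimal sketch. This is not the patented method: the `Token` type, the hand-annotated lemma and part-of-speech tags, the `CONTENT_POS` set, and both function names are assumptions introduced here for illustration only. The sketch treats "same lemma with the same part of speech, content words only" as a simple stand-in for linguistic evidence.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    lemma: str  # hand-annotated for this illustration
    pos: str    # hand-annotated coarse part of speech


# Assumption: only content words are candidates for perceived repetition.
CONTENT_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def string_repeats(tokens):
    """Baseline: flag every surface form that occurs more than once."""
    counts = defaultdict(int)
    for t in tokens:
        counts[t.text.lower()] += 1
    return {word for word, n in counts.items() if n > 1}


def linguistic_repeats(tokens):
    """Flag a lemma only when it recurs as a content word with the same part of speech."""
    counts = defaultdict(int)
    for t in tokens:
        if t.pos in CONTENT_POS:
            counts[(t.lemma, t.pos)] += 1
    return {lemma for (lemma, _pos), n in counts.items() if n > 1}


# "The can can hold the great soup and a great sauce"
sentence = [
    Token("The", "the", "DET"),
    Token("can", "can", "NOUN"),
    Token("can", "can", "AUX"),
    Token("hold", "hold", "VERB"),
    Token("the", "the", "DET"),
    Token("great", "great", "ADJ"),
    Token("soup", "soup", "NOUN"),
    Token("and", "and", "CCONJ"),
    Token("a", "a", "DET"),
    Token("great", "great", "ADJ"),
    Token("sauce", "sauce", "NOUN"),
]

print(sorted(string_repeats(sentence)))      # ['can', 'great', 'the']
print(sorted(linguistic_repeats(sentence)))  # ['great']
```

In this toy sentence the string-based check raises three flags, including the function word "the" and the noun/auxiliary pair "can can", while the tag-aware check raises one. That mirrors the abstract's claim of fewer but more precise flags, under the stated assumptions.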