Invention Grant
- Patent Title: Preprocessing text to enhance statistical features
- Patent Title (Chinese, translated): Preprocessing text to enhance statistical features
- Application No.: US12395319
- Application Date: 2009-02-27
- Publication No.: US08527500B2
- Publication Date: 2013-09-03
- Inventor: James Paul Schneider
- Applicant: James Paul Schneider
- Applicant Address: US NC Raleigh
- Assignee: Red Hat, Inc.
- Current Assignee: Red Hat, Inc.
- Current Assignee Address: US NC Raleigh
- Agency: Lowenstein Sandler LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A document preprocessor preprocesses a document to enhance the statistical features of the document. The system preprocesses the document by matching a prefix and a trailing context in the document with one or more matching prefixes in a transformation database, where the prefix is a first string of one or more tokens in the document and the trailing context is a second string of one or more tokens in the document that trails the prefix. Alternatively, the system preprocesses the document by computing cyclic permutations of the document, sorting these permutations, and taking the last token from each of the sorted permutations.
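
The second approach in the abstract (cyclic permutations, sorting, last token of each) can be illustrated with a minimal sketch. It assumes that tokens are single characters and that permutations are compared lexicographically; the abstract fixes neither choice. The function name is hypothetical and is not taken from the patent. The procedure resembles the forward Burrows-Wheeler transform; the sketch adds no end marker or row index, so it shows only the forward step described above and does not by itself support inversion.

```python
def last_tokens_of_sorted_rotations(tokens):
    """Return the last token of each cyclic permutation after sorting.

    Hypothetical helper for illustration; `tokens` is a list of tokens
    (single characters in the example below, which is an assumption).
    """
    n = len(tokens)
    # Build every cyclic permutation (rotation) of the token sequence.
    rotations = [tokens[i:] + tokens[:i] for i in range(n)]
    # Sort the rotations lexicographically, token by token.
    rotations.sort()
    # Keep only the final token of each sorted rotation.
    return [rotation[-1] for rotation in rotations]


tokens = list("banana")
transformed = last_tokens_of_sorted_rotations(tokens)
print("".join(transformed))  # prints "nnbaaa"
```

In the output, identical tokens tend to cluster into runs, which is one way such a step could make a document's statistical features more pronounced. Building all rotations explicitly takes quadratic space, so this form is suitable only for short inputs.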
Public/Granted literature
- US20100223288A1 PREPROCESSING TEXT TO ENHANCE STATISTICAL FEATURES Public/Granted day: 2010-09-02