Invention Grant
- Patent Title: System and method for automatic document classification in eDiscovery, compliance and legacy information clean-up
- Application No.: US14989969
- Application Date: 2016-01-07
- Publication No.: US10565502B2
- Publication Date: 2020-02-18
- Inventor: Johannes Cornelis Scholtes
- Applicant: MSC INTELLECTUAL PROPERTIES B.V.
- Applicant Address: NL Amsterdam
- Assignee: MSC INTELLECTUAL PROPERTIES B.V.
- Current Assignee: MSC INTELLECTUAL PROPERTIES B.V.
- Current Assignee Address: NL Amsterdam
- Agency: The Villamar Firm PLLC
- Agent: Carlos R. Villamar
- Main IPC: G06N5/02
- IPC: G06N5/02 ; G06N20/00 ; G06F16/00 ; G06F16/35

Abstract:
A system, method and computer program product for automatic document classification, including an extraction module configured to extract structural, syntactical and/or semantic information from a document and normalize the extracted information; a machine learning module configured to generate a model representation for automatic document classification based on feature vectors built from the normalized and extracted semantic information for supervised and/or unsupervised clustering or machine learning; and a classification module configured to select a non-classified document from a document collection, and via the extraction module extract normalized structural, syntactical and/or semantic information from the selected document, and generate via the machine learning module a model representation of the selected document based on feature vectors, and match the model representation of the selected document against the machine learning model representation to generate a document category, and/or classification for display to a user.
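
The three modules named in the abstract (extraction, machine learning, classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names (`extract_features`, `train`, `classify`) are hypothetical, term counts stand in for the "structural, syntactical and/or semantic information", and a simple nearest-category match by cosine similarity stands in for the unspecified machine learning model.

```python
import math
import re
from collections import Counter, defaultdict


def extract_features(text):
    """Extraction step (assumed): lowercase the text, split it into
    alphanumeric tokens, and count them as a sparse feature vector."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(labeled_docs):
    """Learning step (assumed): sum the feature vectors of all training
    documents per category to get one model vector per category."""
    model = defaultdict(Counter)
    for text, label in labeled_docs:
        model[label].update(extract_features(text))
    return dict(model)


def classify(model, text):
    """Classification step (assumed): extract features from an unclassified
    document and return the category whose model vector is most similar."""
    vector = extract_features(text)
    return max(model, key=lambda label: cosine(model[label], vector))


# Hypothetical training data; the categories are invented for this example.
training_docs = [
    ("The contract and license agreement was signed by both parties", "legal"),
    ("Quarterly invoice and payment schedule for the budget", "finance"),
]
model = train(training_docs)
category = classify(model, "Please review the signed agreement")
```

A production system of the kind the abstract describes would likely use richer features and a trained statistical model; the sketch only shows how the output of one module feeds the next.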