Invention Grant
- Patent Title: Identifying training documents for a content classifier
- Patent Title (Chinese): 识别内容分类器的培训文档 (Identifying training documents for a content classifier)
- Application No.: US12497467
- Application Date: 2009-07-02
- Publication No.: US08352386B2
- Publication Date: 2013-01-08
- Inventor: Srinivas Varma Chitiveli, Barton Wayne Emanuel, Alexander Wolcott Holt, Michael E. Moran
- Applicant: Srinivas Varma Chitiveli, Barton Wayne Emanuel, Alexander Wolcott Holt, Michael E. Moran
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Patterson & Sheridan, LLP
- Main IPC: G06F15/18
- IPC: G06F15/18

Abstract:
Systems, methods and articles of manufacture are disclosed for identifying a training document for a content classifier. One or more thresholds may be defined for designating a document as a training document for a content classifier. A plurality of documents may be evaluated to compute a score for each respective document. The score may represent suitability of a document for training the content classifier with respect to a category. The score may be computed based on content of the plurality of documents, metadata of the plurality of documents, link structure of the plurality of documents, user feedback (e.g., user supplied document tags) received for the plurality of documents, and document metrics received for the plurality of documents. Based on the computed scores, a training document may be selected. The content classifier may be trained using the selected training document.
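The scoring-and-threshold flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say how the signals are combined, so the weighted sum, the `WEIGHTS` values, the `DocumentSignals` class, and the function names below are all hypothetical, and each signal is assumed to be pre-normalised to the range [0, 1].

```python
from dataclasses import dataclass


@dataclass
class DocumentSignals:
    """Hypothetical per-document signals for one category, each in [0, 1]."""
    doc_id: str
    content: float    # content-based evidence
    metadata: float   # metadata-based evidence
    links: float      # link-structure evidence
    feedback: float   # user feedback, e.g. user-supplied tags
    metrics: float    # document metrics


# Assumed weights; the abstract does not specify how signals are combined.
WEIGHTS = {
    "content": 0.4,
    "metadata": 0.1,
    "links": 0.2,
    "feedback": 0.2,
    "metrics": 0.1,
}


def suitability_score(doc: DocumentSignals) -> float:
    """Combine the signals into one score (a weighted sum, by assumption)."""
    return sum(weight * getattr(doc, name) for name, weight in WEIGHTS.items())


def select_training_documents(docs, threshold):
    """Return the ids of documents whose score meets the threshold."""
    return [d.doc_id for d in docs if suitability_score(d) >= threshold]


docs = [
    DocumentSignals("a", content=0.9, metadata=0.8, links=0.7, feedback=1.0, metrics=0.6),
    DocumentSignals("b", content=0.2, metadata=0.5, links=0.1, feedback=0.0, metrics=0.3),
]
selected = select_training_documents(docs, threshold=0.5)
print(selected)
```

With these made-up inputs, only document "a" clears the threshold. The selected documents would then be passed to whatever training routine the content classifier uses, which is outside the scope of this sketch.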
Public/Granted literature:
- US20110004573A1 IDENTIFYING TRAINING DOCUMENTS FOR A CONTENT CLASSIFIER (Public/Granted day: 2011-01-06)