Invention Grant
- Patent Title: Ground truth generation for machine learning based quality assessment of corpora
- Application No.: US15269253
- Application Date: 2016-09-19
- Publication No.: US10552498B2
- Publication Date: 2020-02-04
- Inventor: Corville O. Allen, Shannen B. Lambdin, Nicolas B. Lopez, Anuj Sharma
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Stephen R. Tkacs; Stephen J. Walder, Jr.; Diana R. Gerhardt
- Main IPC: G06F16/9535
- IPC: G06F16/9535

Abstract:
A mechanism is provided in a computing device configured with instructions executing on a processor of the computing device to implement a ground truth generation system for quality assessment scoring of articles in a corpus. The ground truth generation system receives recommendations of a set of recommended articles from subject matter experts. The ground truth generation system identifies a set of non-recommended articles. A topic clustering component within the ground truth generation system performs topic clustering on a combination of the set of recommended articles and the set of non-recommended articles to form a set of topic clusters containing recommended articles and non-recommended articles. The ground truth generation system identifies a first number of recommended articles and a second number of non-recommended articles in each of the set of topic clusters to form a quality assessment training set. The mechanism trains a quality assessment machine learning model using the quality assessment training set.
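The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `build_training_set`, the `assign_topic` callable (a stand-in for the topic clustering component), the toy `first_word_topic` clusterer, the sample article strings, and the choice to take the first articles in input order are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple


def build_training_set(
    recommended: List[str],
    non_recommended: List[str],
    assign_topic: Callable[[str], Hashable],  # hypothetical clustering stand-in
    n_recommended: int,
    n_non_recommended: int,
) -> List[Tuple[str, int]]:
    """Return (article, label) pairs; label 1 = recommended, 0 = non-recommended."""
    # Cluster the combined set so each topic cluster can hold both kinds of article.
    clusters: Dict[Hashable, Dict[int, List[str]]] = defaultdict(
        lambda: {1: [], 0: []}
    )
    for label, articles in ((1, recommended), (0, non_recommended)):
        for article in articles:
            clusters[assign_topic(article)][label].append(article)

    # From every cluster take up to n_recommended recommended articles and
    # up to n_non_recommended non-recommended ones (assumption: input order).
    training_set: List[Tuple[str, int]] = []
    for members in clusters.values():
        training_set += [(a, 1) for a in members[1][:n_recommended]]
        training_set += [(a, 0) for a in members[0][:n_non_recommended]]
    return training_set


def first_word_topic(article: str) -> str:
    """Toy clusterer (assumption): the first word of the text is the topic."""
    return article.split()[0].lower()


# Invented sample data, for illustration only.
recommended = [
    "cardiology trial results",
    "oncology review",
    "cardiology guideline update",
]
non_recommended = [
    "cardiology forum post",
    "oncology press release",
    "oncology blog",
    "cardiology advert",
]

training = build_training_set(
    recommended, non_recommended, first_word_topic,
    n_recommended=1, n_non_recommended=2,
)
```

Sampling a fixed number of each label per topic cluster keeps the training set balanced across topics, so a quality assessment model trained on it is less likely to learn topic as a proxy for quality. A real system would replace `first_word_topic` with an actual topic clustering method.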
Public/Granted literature:
- US20180082211A1 Ground Truth Generation for Machine Learning Based Quality Assessment of Corpora; Public/Granted day: 2018-03-22