Invention Grant
- Patent Title: Labeling of data for machine learning
- Application No.: US15654750
- Application Date: 2017-07-20
- Publication No.: US10565526B2
- Publication Date: 2020-02-18
- Inventor: Prasanta Ghosh, Shantanu R. Godbole, Sachindra Joshi, Srujana Merugu, Ashish Verma
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Scott S. Dobson
- Main IPC: G06N20/00
- IPC: G06N20/00 ; G06F16/35 ; G06F16/901 ; G06N5/02

Abstract:
A computer generates labels for machine learning algorithms by retrieving, from a data storage circuit, multiple label sets that contain labels that each classify data points in a corpus of data. A graph is generated that includes a plurality of edges, each edge between two respective labels from different label sets of the multiple label sets. Weights are determined for the plurality of edges based upon a consistency between data points classified by two labels connected by the edges. An algorithm is applied that groups labels from the multiple label sets based upon the weights for the plurality of edges. Data points are identified from the corpus of data that represent conflicts within the grouped labels. An electronic message is transmitted in order to present the identified data points to entities for further classification. A new label set is generated using the further classification received from the entities.
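The graph, weighting, grouping, and conflict-identification steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. Several choices are assumptions made purely for illustration: Jaccard overlap as the "consistency" measure, grouping by connected components over edges at or above a hypothetical `threshold`, and treating a data point as a conflict when it is covered by some but not all labels in a group. All function names and the sample data are hypothetical. The messaging and relabeling steps are omitted.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two sets of data point ids (assumed consistency measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_edges(label_sets):
    """Create weighted edges between labels that come from different label sets.

    label_sets maps a label-set name to {label: set of data point ids}.
    """
    edges = {}
    for (s1, labels1), (s2, labels2) in combinations(label_sets.items(), 2):
        for l1, pts1 in labels1.items():
            for l2, pts2 in labels2.items():
                weight = jaccard(pts1, pts2)
                if weight > 0:
                    edges[((s1, l1), (s2, l2))] = weight
    return edges


def group_labels(label_sets, edges, threshold):
    """Group labels joined by edges with weight >= threshold (union-find)."""
    parent = {(s, l): (s, l) for s, labels in label_sets.items() for l in labels}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (u, v), weight in edges.items():
        if weight >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


def find_conflicts(label_sets, groups):
    """Return points covered by some, but not all, labels of a multi-label group."""
    conflicts = set()
    for group in groups:
        members = [label_sets[s][l] for s, l in group]
        if len(members) < 2:
            continue
        conflicts |= set().union(*members) - set.intersection(*members)
    return conflicts


# Hypothetical example: two annotators label the same six data points.
label_sets = {
    "A": {"spam": {1, 2, 3}, "ham": {4, 5, 6}},
    "B": {"junk": {1, 2, 4}, "ok": {3, 5, 6}},
}
edges = build_edges(label_sets)
groups = group_labels(label_sets, edges, threshold=0.4)
conflicts = find_conflicts(label_sets, groups)
print(sorted(conflicts))  # [3, 4]
```

In this example, "spam" groups with "junk" and "ham" groups with "ok". Points 3 and 4 are the ones on which the two label sets disagree, so they are the points that would be presented to entities for further classification.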
Public/Granted literature
- US20170316348A1 LABELING OF DATA FOR MACHINE LEARNING Public/Granted day: 2017-11-02