Invention Grant
- Patent Title: Data curation for corpus enrichment
- Application No.: US16656280
- Application Date: 2019-10-17
- Publication No.: US11436505B2
- Publication Date: 2022-09-06
- Inventor: Tracy Canada, Jim Dewan
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Patterson + Sheridan, LLP
- Main IPC: G06N5/04
- IPC: G06N5/04

Abstract:
Techniques for data curation are provided. A data set is received for ingestion into a question answering system, where the data set includes a first question and a first answer. Relevance of the first question is validated by comparing the first question to a first question cluster in the question answering system, and it is determined that the first answer satisfies predefined security criteria. The first data set is evaluated to identify a set of references, and a generalized data set is generated by replacing each respective reference of the set of references with a corresponding entity identifier. The first generalized data set is then ingested into the question answering system.
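The steps named in the abstract (relevance validation against a question cluster, a security check on the answer, replacement of references with entity identifiers, and ingestion) can be illustrated with a minimal Python sketch. It is not the patented implementation: the token-overlap similarity, the 0.3 threshold, the sensitive-content pattern, and every name here (`QAPair`, `is_relevant`, `is_secure`, `generalize`, `curate`) are assumptions chosen for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str


def _tokens(text):
    # Lowercased alphanumeric tokens; a stand-in for real text analysis.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def _jaccard(a, b):
    # Token-set overlap in [0, 1]; an assumed similarity measure.
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_relevant(question, cluster, threshold=0.3):
    # Relevant if the question resembles any question already in the cluster.
    # The threshold value is hypothetical.
    return any(_jaccard(question, known) >= threshold for known in cluster)


# Hypothetical "predefined security criteria": reject answers that mention
# credentials or contain something shaped like a US social security number.
_SENSITIVE = re.compile(r"password|api[_ ]?key|\b\d{3}-\d{2}-\d{4}\b", re.IGNORECASE)


def is_secure(answer):
    return _SENSITIVE.search(answer) is None


def generalize(text, references):
    # Replace each known reference with its entity identifier.
    for reference, entity_id in references.items():
        pattern = re.compile(re.escape(reference), re.IGNORECASE)
        # A function replacement avoids backslash escapes in entity_id.
        text = pattern.sub(lambda _match, eid=entity_id: eid, text)
    return text


def curate(pair, cluster, references, corpus):
    # Returns the generalized pair if it was ingested, otherwise None.
    if not is_relevant(pair.question, cluster):
        return None
    if not is_secure(pair.answer):
        return None
    generalized = QAPair(
        generalize(pair.question, references),
        generalize(pair.answer, references),
    )
    corpus.append(generalized)  # "ingestion" is modeled as a list append
    return generalized
```

For instance, calling `curate` with a pair whose question resembles one in the cluster, and with a hypothetical mapping such as `{"Acme X200": "<PRODUCT>"}`, would append a copy of the pair in which each mention of the product is replaced by the identifier. A pair whose question matches nothing in the cluster, or whose answer trips the sensitive-content pattern, is skipped.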
Public/Granted literature
- US20210117820A1 DATA CURATION FOR CORPUS ENRICHMENT Public/Granted date: 2021-04-22