Invention Grant
- Patent Title: Linking data elements based on similarity data values and semantic annotations
- Application No.: US13491724
- Application Date: 2012-06-08
- Publication No.: US10229200B2
- Publication Date: 2019-03-12
- Inventor: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael James Ward
- Applicant: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael James Ward
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Agency: Patent Portfolio Builders PLLC
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Data elements from data sources, each having a data value set, are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element. This yields a plurality of dimensionally reduced instance signatures of equivalent fixed size, such that similarities among the data values in the data value sets across all data elements are maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.
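
The pipeline described in the abstract (fixed-size signatures, locality sensitive hashing to find candidate pairs, then a thresholded similarity index) can be illustrated with a minimal sketch. The abstract does not name the hash family or the similarity measure, so the choices here are assumptions made only for illustration: MinHash for the signatures, banding for the locality sensitive step, and Jaccard similarity as the pre-determined measure. All function names, `NUM_HASHES`, `BANDS`, and the default threshold are hypothetical, and each data value set is assumed to be non-empty.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 32  # assumed signature length (fixed size for every element)
BANDS = 16       # assumed number of LSH bands; rows per band = NUM_HASHES // BANDS


def _hash(value: str, seed: int) -> int:
    """Seeded 64-bit hash of one data value (one member of a hash family)."""
    data = f"{seed}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def signature(values):
    """MinHash signature: for each seed, keep the minimum hash over all values.

    Assumption: MinHash stands in for the unspecified hash functions.
    """
    return tuple(min(_hash(v, seed) for v in values) for seed in range(NUM_HASHES))


def candidate_pairs(signatures):
    """Banding LSH: elements that agree on every row of some band share a bucket."""
    rows = NUM_HASHES // BANDS
    pairs = set()
    for band in range(BANDS):
        buckets = defaultdict(list)
        for key, sig in signatures.items():
            buckets[sig[band * rows:(band + 1) * rows]].append(key)
        for members in buckets.values():
            pairs.update(combinations(sorted(members), 2))
    return pairs


def jaccard(a, b):
    """Assumed similarity measure: size of intersection over size of union."""
    return len(a & b) / len(a | b)


def link(elements, threshold=0.5):
    """Return candidate pairs whose similarity index is above the threshold."""
    sigs = {key: signature(values) for key, values in elements.items()}
    return sorted(
        (a, b)
        for a, b in candidate_pairs(sigs)
        if jaccard(elements[a], elements[b]) > threshold
    )
```

Calling `link` on a dictionary that maps element identifiers to sets of string values returns the linked identifier pairs. Only pairs that collide in at least one band are scored, which is what keeps the exact similarity computation off the full cross product of elements.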
Public/Granted literature
- US20130332466A1 Linking Data Elements Based on Similarity Data Values and Semantic Annotations (Public/Granted day: 2013-12-12)