Invention Grant
US08391614B2 Determining near duplicate “noisy” data objects (In force)

  • Patent Title: Determining near duplicate “noisy” data objects
  • Application No.: US12161775
    Application Date: 2007-01-25
  • Publication No.: US08391614B2
    Publication Date: 2013-03-05
  • Inventor: Yiftach Ravid, Amir Milo
  • Applicant: Yiftach Ravid, Amir Milo
  • Applicant Address: IL Rosh Haayin
  • Assignee: Equivio Ltd.
  • Current Assignee: Equivio Ltd.
  • Current Assignee Address: IL Rosh Haayin
  • Agency: Oliff & Berridge, PLC
  • International Application: PCT/IL2007/000095 WO 20070125
  • International Announcement: WO2007/086059 WO 20070802
  • Main IPC: G06K9/68
  • IPC: G06K9/68, G06K9/40
Determining near duplicate “noisy” data objects
Abstract:
A system configured to find near-duplicate documents. For any two (or more) documents that are similar to each other, the system identifies which of the differences between them were likely introduced by Optical Character Recognition (OCR) software and which reflect actual differences between the original documents. This improves similarity detection in two ways: it identifies documents that were originally exact duplicates and differ from one another only because of OCR errors, and it corrects the similarity level between documents by correcting errors introduced by the OCR tool.
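
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the confusion table `OCR_CONFUSIONS`, the helpers `normalize` and `compare`, and the word-level similarity measure are all assumptions made for illustration. The sketch aligns two documents word by word, labels each differing word pair as OCR-like or genuine, and reports the similarity before and after discounting the OCR-like differences.

```python
from difflib import SequenceMatcher

# Hypothetical table of character confusions typical of OCR output.
# Each pair (variant, canonical) is an assumption, not taken from the patent.
OCR_CONFUSIONS = [("rn", "m"), ("cl", "d"), ("0", "O"), ("1", "l"), ("5", "S")]


def normalize(word):
    """Rewrite confusable character sequences to one canonical form."""
    for variant, canonical in OCR_CONFUSIONS:
        word = word.replace(variant, canonical)
    return word


def compare(doc_a, doc_b):
    """Return (raw similarity, corrected similarity, OCR-like diffs, genuine diffs)."""
    a, b = doc_a.split(), doc_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    raw = matcher.ratio()

    # Similarity after mapping OCR variants onto the same canonical word.
    norm_a = [normalize(w) for w in a]
    norm_b = [normalize(w) for w in b]
    corrected = SequenceMatcher(None, norm_a, norm_b, autojunk=False).ratio()

    ocr_like, genuine = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of words on both sides: judge each pair separately.
            for wa, wb in zip(a[i1:i2], b[j1:j2]):
                target = ocr_like if normalize(wa) == normalize(wb) else genuine
                target.append((wa, wb))
        else:
            # Insertions, deletions and uneven replacements count as genuine.
            genuine.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return raw, corrected, ocr_like, genuine


original = "The modern contract was signed on 10 May"
scanned = "The modem contract was signed on 1O May"
raw, corrected, ocr_like, genuine = compare(original, scanned)
print(raw, corrected)  # 0.75 1.0
print(ocr_like)        # [('modern', 'modem'), ('10', '1O')]
print(genuine)         # []
```

In this example the two texts look only 75% similar at the word level, but every difference matches a known OCR confusion, so the corrected similarity rises to 100% and the pair can be treated as originally exact duplicates. A fixed confusion table is the simplest possible choice; it will mislabel genuine differences that happen to match an entry, for instance the real words "modern" and "modem".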