Invention Grant
US08687886B2 Method and apparatus for document image indexing and retrieval using multi-level document image structure and local features
In force
- Patent Title: Method and apparatus for document image indexing and retrieval using multi-level document image structure and local features
- Application No.: US13340513
- Application Date: 2011-12-29
- Publication No.: US08687886B2
- Publication Date: 2014-04-01
- Inventor: Yibin Tian
- Applicant: Yibin Tian
- Applicant Address: US CA San Mateo
- Assignee: Konica Minolta Laboratory U.S.A., Inc.
- Current Assignee: Konica Minolta Laboratory U.S.A., Inc.
- Current Assignee Address: US CA San Mateo
- Agency: Chen Yoshimura LLP
- Main IPC: G06K9/00
- IPC: G06K9/00

Abstract:
An image based document index and retrieval method is described. During document indexing, each source document is analyzed to generate index information at document, page, region and unit levels. Region and unit level index information is generated by segmenting each text region into units, constructing unit length or unit density histograms, and analyzing the units in a few most frequent bins of the histogram. The index information and the source document images are stored in a database. During document retrieval, a target document is analyzed to generate target index information in the same way as during document indexing. The target index information is compared to stored index information in a progressive manner (from higher to lower levels) to identify source documents with index information that matches the target index information. Fuzzy logic is used in the comparison steps to increase the robustness of the document retrieval.
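
The abstract names three ideas that a small sketch can make concrete: a unit length histogram, selection of the most frequent bins, and a coarse-to-fine comparison that uses fuzzy matching. The Python below is only an illustration under stated assumptions, not the patented implementation. The function names, the bin width, the choice of a triangular membership function, and the representation of each level's index information as a single number are all hypothetical; the abstract does not specify any of them.

```python
from collections import Counter

LEVELS = ("document", "page", "region", "unit")  # higher to lower, as in the abstract


def length_histogram(unit_lengths, bin_width=10):
    """Count units per length bin; bin i covers [i*bin_width, (i+1)*bin_width).

    Assumes non-negative integer lengths (e.g. unit widths in pixels).
    """
    return Counter(length // bin_width for length in unit_lengths)


def top_bins(histogram, k=2):
    """Return the indices of the k most populated bins (ties go to the lower bin)."""
    ranked = sorted(histogram.items(), key=lambda item: (-item[1], item[0]))
    return [bin_index for bin_index, _ in ranked[:k]]


def fuzzy_similarity(a, b, tolerance):
    """Triangular membership: 1.0 when a == b, falling linearly to 0.0 at `tolerance`.

    This is one common fuzzy membership shape, chosen here as an assumption.
    """
    return max(0.0, 1.0 - abs(a - b) / tolerance)


def progressive_match(target, source, tolerances, threshold=0.5):
    """Compare index values level by level and stop at the first level that fails.

    `target` and `source` map each level name to one number (a simplification).
    Returns (matched, failed_level); failed_level is None on a match.
    """
    for level in LEVELS:
        score = fuzzy_similarity(target[level], source[level], tolerances[level])
        if score < threshold:
            return False, level
    return True, None


if __name__ == "__main__":
    lengths = [12, 14, 13, 31, 33, 12, 55, 11, 32]  # made-up unit lengths
    hist = length_histogram(lengths)
    print(top_bins(hist))  # the two most frequent length bins
```

Stopping at the first failing level means most non-matching source documents are rejected by the cheap document-level and page-level checks, so the finer region and unit comparisons only run on the remaining candidates. The tolerance in the fuzzy comparison lets small measurement differences (for example from scanning noise) reduce the score gradually instead of causing an outright mismatch.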