Performing optical character recognition using spatial information of regions within a structured document

Invention Grant

US10013643B2 Performing optical character recognition using spatial information of regions within a structured document (Active)

Patent Title: Performing optical character recognition using spatial information of regions within a structured document
Application No.: US15219888

Application Date: 2016-07-26
Publication No.: US10013643B2

Publication Date: 2018-07-03
Inventor: Vijay Yellapragada, Peijun Chiang, Sreeneel K. Maddika
Applicant: INTUIT INC.
Applicant Address: US CA Mountain View
Assignee: INTUIT INC.
Current Assignee: INTUIT INC.
Current Assignee Address: US CA Mountain View
Agency: Patterson + Sheridan, LLP
Main IPC: G06K9/00
IPC: G06K9/00; G06K9/62; G06T7/00

Abstract:

Techniques are disclosed for facilitating optical character recognition (OCR) by identifying one or more regions in an electronic document on which to perform the OCR. For example, a method for identifying information in an electronic document includes obtaining a set of training documents for each template of a plurality of templates for the electronic document, extracting spatial attributes for at least a first label region and at least a first corresponding value region from the set, and training a classifier model based on the extracted spatial attributes, wherein the classifier model is used to identify the information in the electronic document. The spatial attributes represent a position of at least the first label region and at least the first value region within the electronic document.
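
The training flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name a classifier type, so a simple nearest-centroid model is swapped in here as a stand-in, and the `Region` class, the region names ("wages_label", "wages_value"), and the normalized coordinates are all hypothetical. The sketch covers a single template; under the abstract's wording, one such training set would be gathered per template.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Region:
    """A rectangular region in page-normalized coordinates (0..1). Hypothetical type."""
    kind: str      # assumed region name, e.g. "wages_label" or "wages_value"
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def spatial_attributes(self):
        # Position and size stand in for the abstract's "spatial attributes".
        return (self.x, self.y, self.width, self.height)


def train(training_regions):
    """Average the spatial attributes per region kind (nearest-centroid stand-in)."""
    sums, counts = {}, {}
    for region in training_regions:
        attrs = region.spatial_attributes()
        total = sums.setdefault(region.kind, [0.0] * len(attrs))
        for i, value in enumerate(attrs):
            total[i] += value
        counts[region.kind] = counts.get(region.kind, 0) + 1
    return {kind: tuple(v / counts[kind] for v in total)
            for kind, total in sums.items()}


def classify(model, attrs):
    """Return the region kind whose centroid lies closest to the given attributes."""
    return min(model, key=lambda kind: dist(model[kind], attrs))


# Made-up regions taken from two training documents of one template.
training_regions = [
    Region("wages_label", 0.10, 0.20, 0.20, 0.03),
    Region("wages_value", 0.35, 0.20, 0.15, 0.03),
    Region("wages_label", 0.11, 0.21, 0.20, 0.03),
    Region("wages_value", 0.36, 0.21, 0.15, 0.03),
]
model = train(training_regions)

# A region detected in a new document of the same template.
predicted = classify(model, (0.34, 0.19, 0.16, 0.03))
```

Once a region is classified as a value region, OCR could be restricted to that region instead of the whole page, which is the benefit the abstract points to.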

Published/Granted literature

US20180032842A1 PERFORMING OPTICAL CHARACTER RECOGNITION USING SPATIAL INFORMATION OF REGIONS WITHIN A STRUCTURED DOCUMENT Publication date: 2018-02-01

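IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)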