Automating creation of accurate OCR training data using specialized UI application

Invention Grant

US10282604B2 Automating creation of accurate OCR training data using specialized UI application (Active)

Patent Title: Automating creation of accurate OCR training data using specialized UI application
Application No.: US16111121

Application Date: 2018-08-23
Publication No.: US10282604B2

Publication Date: 2019-05-07
Inventor: Eugene Krivopaltsev, Sreeneel K. Maddika, Vijay S. Yellapragada
Applicant: INTUIT INC.
Applicant Address: US CA Mountain View
Assignee: Intuit, Inc.
Current Assignee: Intuit, Inc.
Current Assignee Address: US CA Mountain View
Agency: Patterson + Sheridan, LLP
Main IPC: G06K9/00
IPC: G06K9/00; G06K9/46; G06K9/52; G06K9/62; G06T7/60; G06T11/60; G06T7/73; G06T7/13; G06T7/70; G06F3/0481; G06F17/21

Abstract:

Systems of the present disclosure generate accurate training data for optical character recognition (OCR). The disclosed systems generate images of a text passage as displayed piecemeal in a user interface (UI) element rendered in a selected font type and size, determine accurate dimensions and locations of bounding boxes for each character pictured in the images, stitch together a training image by concatenating the images, and associate the training image, the bounding box dimensions and locations, and the text passage together in a collection of training data. The collection of training data also includes a computer-readable master copy of the text passage with newline characters inserted therein.
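
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Chunk`, `render_chunk`, and `build_training_record` are hypothetical, the UI capture is replaced by a stand-in that assumes a monospace font with fixed glyph dimensions, the pieces are assumed to be stacked vertically, and the "image" is represented only by its pixel dimensions. What it shows is the bookkeeping: per-piece bounding boxes are shifted by the accumulated offset when the pieces are concatenated, and the master text receives a newline at each piece boundary.

```python
import textwrap
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class Chunk:
    text: str         # the piece of the passage shown in the UI element
    width: int        # pixel width of the captured piece
    height: int       # pixel height of the captured piece
    boxes: List[Box]  # one box per character, relative to this piece


def render_chunk(text: str, char_w: int = 8, char_h: int = 16) -> Chunk:
    """Stand-in for capturing a UI element (assumption: monospace glyphs).

    A real system would query the UI toolkit for each glyph's geometry;
    here every character, spaces included, gets a fixed-size box.
    """
    boxes = [(i * char_w, 0, char_w, char_h) for i in range(len(text))]
    return Chunk(text, len(text) * char_w, char_h, boxes)


def build_training_record(passage: str, max_chars: int) -> Dict[str, object]:
    """Split the passage, 'render' each piece, and stitch the results."""
    pieces = textwrap.wrap(passage, max_chars)
    chunks = [render_chunk(piece) for piece in pieces]

    boxes: List[Box] = []
    y_offset = 0
    for chunk in chunks:
        # Shift each box down by the height of the pieces stacked above it.
        boxes.extend((x, y + y_offset, w, h) for (x, y, w, h) in chunk.boxes)
        y_offset += chunk.height

    return {
        "width": max((c.width for c in chunks), default=0),
        "height": y_offset,
        "boxes": boxes,
        # Master copy of the text with a newline at each piece boundary.
        "master_text": "\n".join(c.text for c in chunks),
    }


if __name__ == "__main__":
    record = build_training_record("the quick brown fox", max_chars=10)
    print(repr(record["master_text"]))
    print(record["boxes"][:3])
```

Because the boxes come from the known layout rather than from detection on the rendered pixels, they stay exact by construction in this sketch, which is the property the abstract emphasizes.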

Public/Granted literature

US20180365487A1 AUTOMATING CREATION OF ACCURATE OCR TRAINING DATA USING SPECIALIZED UI APPLICATION Public/Granted day: 2018-12-20

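IPC classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)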