Method and apparatus for extracting web page content

Invention Grant

US09934206B2 Method and apparatus for extracting web page content (in force)

Patent Title: Method and apparatus for extracting web page content
Application No.: US14341446

Application Date: 2014-07-25
Publication No.: US09934206B2

Publication Date: 2018-04-03
Inventor: Tingyong Tang, Yulei Liu, Wei Li, Xi Wang, Bo Hu, Kai Zhang, Bosen He, Ying Huang, Huijiao Yang, Zhengkai Xie, Zhipei Wang, Cheng Feng, Sirui Liu
Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Applicant Address: CN Shenzhen
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Current Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Current Assignee Address: CN Shenzhen
Agency: Anova Law Group, PLLC
Priority: CN201310101245 20130327
Main IPC: G06F17/22
IPC: G06F17/22; G06F3/0484; G06F3/0488; G06F17/30

Abstract:

Methods and apparatus for extracting web page content are provided herein. An exemplary method can be implemented by a mobile terminal. A request command to open a first web page can be received. Whether the source code corresponding to the first web page contains text content tags can then be determined. When the source code contains the text content tags, a reader can extract the text content of the first web page enclosed within those tags. When the source code does not contain the text content tags, a start position and an end position that delimit the text content of the first web page can be identified in the source code. The text content tags can be added after the start position and before the end position, respectively. The text content of the first web page enclosed within the text content tags can then be extracted.
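
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the tag names `<reader-text>` and `</reader-text>`, the function names, and the heuristic for locating the start and end positions (first `<p>` to last `</p>`) are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical marker tags; the abstract does not name the actual tags.
OPEN_TAG = "<reader-text>"
CLOSE_TAG = "</reader-text>"


def find_text_bounds(source):
    """Assumed heuristic: the text runs from the first <p> to the last </p>."""
    start = re.search(r"<p[\s>]", source, re.IGNORECASE)
    ends = list(re.finditer(r"</p\s*>", source, re.IGNORECASE))
    if start is None or not ends or ends[-1].end() <= start.start():
        return None
    return start.start(), ends[-1].end()


def ensure_text_tags(source):
    """Return the source with text content tags, adding them if absent."""
    if OPEN_TAG in source and CLOSE_TAG in source:
        return source  # tags already present, nothing to add
    bounds = find_text_bounds(source)
    if bounds is None:
        return source  # no text region found under the assumed heuristic
    start, end = bounds
    # Opening tag goes at the start position, closing tag at the end position.
    return source[:start] + OPEN_TAG + source[start:end] + CLOSE_TAG + source[end:]


def extract_text(source):
    """Extract the plain text enclosed within the text content tags."""
    tagged = ensure_text_tags(source)
    begin = tagged.find(OPEN_TAG)
    if begin == -1:
        return ""
    begin += len(OPEN_TAG)
    finish = tagged.find(CLOSE_TAG, begin)
    if finish == -1:
        return ""
    inner = tagged[begin:finish]
    # Drop remaining markup and collapse whitespace for reader display.
    return re.sub(r"\s+", " ", re.sub(r"<[^>]+>", " ", inner)).strip()
```

For a page without the tags, `extract_text` first inserts them around the detected text region and then reads the enclosed content; for a page that already carries them, it reads the enclosed content directly. A real reader would likely need a more robust boundary heuristic than paragraph tags alone.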

Public/Granted literature

US20140337699A1 METHOD AND APPARATUS FOR EXTRACTING WEB PAGE CONTENT Public/Granted day: 2014-11-13
