Invention Grant
- Patent Title: Incremental crawling of multiple content providers using aggregation
- Patent Title (中): 使用聚合增量爬取多个内容提供商
- Application No.: US12343009
- Application Date: 2008-12-23
- Publication No.: US08799261B2
- Publication Date: 2014-08-05
- Inventor: Batya Kenig, Constantin Radchenko, Eitan Shapiro
- Applicant: Batya Kenig, Constantin Radchenko, Eitan Shapiro
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Edell, Shapiro & Finnan, LLC
- Agent: Joe Polimeni
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F17/30

Abstract:
A method for incremental crawling of content stored on a plurality of content providers using aggregation is provided. The method comprises receiving a request to crawl content on one or more associated content providers; retrieving one or more first references to content on a first content provider; retrieving one or more second references to content on one or more second content providers during the same request; aggregating the first and second references; and returning the aggregated first and second references. The crawl takes into account an opaque timestamp object that is managed in a distributed manner: the content providers fill in the opaque timestamp, but it is stored on the crawler side between crawling sessions.
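The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation; every name in it (`ContentProvider`, `Aggregator`, `get_updates`, `crawl`, the integer version counter, and the byte-string token format) is a hypothetical choice made for illustration. The sketch assumes each provider can report which references changed since a token it issued earlier, and that the crawler stores the combined token without interpreting it.

```python
from typing import Dict, List, Optional, Tuple


class ContentProvider:
    """Hypothetical provider that tracks a version counter per document."""

    def __init__(self, name: str, documents: Dict[str, int]):
        self.name = name
        self._documents = dict(documents)  # reference -> version counter

    def update(self, ref: str, version: int) -> None:
        """Add or modify a document (simulates a content change)."""
        self._documents[ref] = version

    def get_updates(self, token: Optional[bytes]) -> Tuple[List[str], bytes]:
        """Return references changed since `token`, plus a new token.

        Only the provider knows how to read its token; here it is
        simply an encoded integer (an assumption of this sketch).
        """
        since = int(token.decode()) if token else 0
        refs = sorted(r for r, v in self._documents.items() if v > since)
        latest = max(self._documents.values(), default=since)
        return refs, str(max(latest, since)).encode()


class Aggregator:
    """Hypothetical aggregator that fans one crawl request out to providers."""

    def __init__(self, providers: List[ContentProvider]):
        self.providers = list(providers)

    def crawl(
        self, timestamp: Optional[Dict[str, bytes]]
    ) -> Tuple[List[str], Dict[str, bytes]]:
        """Serve one crawl request: gather and aggregate references.

        `timestamp` is the opaque object the crawler kept from the
        previous session (None on the first crawl).
        """
        timestamp = timestamp or {}
        aggregated: List[str] = []
        new_timestamp: Dict[str, bytes] = {}
        for provider in self.providers:
            refs, token = provider.get_updates(timestamp.get(provider.name))
            aggregated.extend(f"{provider.name}:{ref}" for ref in refs)
            new_timestamp[provider.name] = token  # filled in by the provider
        return aggregated, new_timestamp


# Usage: the crawler stores `ts` between sessions and hands it back unchanged.
wiki = ContentProvider("wiki", {"page1": 1, "page2": 2})
blog = ContentProvider("blog", {"post1": 1})
aggregator = Aggregator([wiki, blog])

first_refs, ts = aggregator.crawl(None)   # full crawl
wiki.update("page3", 3)                   # content changes between sessions
second_refs, ts = aggregator.crawl(ts)    # incremental crawl
```

In this sketch the first call returns references from both providers, and the second call returns only the reference that changed after the stored timestamp was issued. Keeping the token as a per-provider byte string is one way to keep it opaque to the crawler; the source does not specify the token's structure.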
Public/Granted literature
- US20090307211A1 INCREMENTAL CRAWLING OF MULTIPLE CONTENT PROVIDERS USING AGGREGATION Public/Granted day: 2009-12-10