Invention Grant
US08407193B2 Data deduplication for streaming sequential data storage applications
In force
- Patent Title: Data deduplication for streaming sequential data storage applications
- Application No.: US12695127
- Application Date: 2010-01-27
- Publication No.: US08407193B2
- Publication Date: 2013-03-26
- Inventor: Daniel F. Gruhl, Jan H. Pieper, Mark A. Smith
- Applicant: Daniel F. Gruhl, Jan H. Pieper, Mark A. Smith
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Sherman & Zarrabian LLP
- Agent: Kenneth L. Sherman, Esq.; Michael Zarrabian, Esq.
- Main IPC: G06F17/00
- IPC: G06F17/00

Abstract:
Data deduplication compression in a streaming storage application is provided. The disclosed deduplication process produces a deduplication archive that can be stored to, and extracted from, a streaming storage medium. One implementation compresses fully sequential data stored in a data repository to sequential streaming storage by: splitting the fully sequential data into data blocks; hashing the content of each data block and comparing each hash against an in-memory lookup table, which stores all hashes encountered during the compression of the fully sequential data; for each data block without a hash match, adding the data block as a new data block for compression of the fully sequential data; and encoding duplicate data blocks into data segments using the in-memory lookup table.
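The steps named in the abstract (split, hash, look up, store or encode) can be illustrated with a minimal Python sketch. This is not the patented implementation: the fixed block size, the SHA-256 hash, the `("new", ...)` / `("ref", ...)` record layout, and the names `split_blocks`, `deduplicate`, and `extract` are all assumptions made for illustration. Because each reference points only at a block emitted earlier, the sketch can be written and read back in a single forward pass, which is the property a streaming medium needs.

```python
import hashlib
from typing import Dict, Iterator, List, Tuple, Union

# Hypothetical record layout: ("new", block_bytes) or ("ref", block_index).
Record = Tuple[str, Union[bytes, int]]


def split_blocks(data: bytes, block_size: int) -> Iterator[bytes]:
    """Split sequential data into fixed-size blocks (block size is an assumption)."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def deduplicate(data: bytes, block_size: int = 4096) -> List[Record]:
    """Emit each unique block once; encode repeats as back-references."""
    seen: Dict[bytes, int] = {}  # in-memory lookup table: hash -> stored block index
    archive: List[Record] = []
    for block in split_blocks(data, block_size):
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            # Hash match: encode the duplicate as a reference to the earlier block.
            archive.append(("ref", seen[digest]))
        else:
            # No match: record the hash and store the block itself.
            seen[digest] = len(seen)
            archive.append(("new", block))
    return archive


def extract(archive: List[Record]) -> bytes:
    """Rebuild the original data by reading the archive front to back."""
    stored: List[bytes] = []
    out = bytearray()
    for kind, payload in archive:
        if kind == "new":
            stored.append(payload)
            out += payload
        else:
            out += stored[payload]
    return bytes(out)
```

For example, `deduplicate(b"A" * 8 + b"B" * 8 + b"A" * 8, block_size=8)` stores two blocks and one reference, and `extract` on that result returns the original 24 bytes. The sketch keeps every stored block in memory during extraction, a simplification chosen for brevity.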
Public/Granted literature
- US20110185149A1 DATA DEDUPLICATION FOR STREAMING SEQUENTIAL DATA STORAGE APPLICATIONS Public/Granted day: 2011-07-28