Distributed data set indexing
Abstract:
An apparatus including a processor to: index multiple data records within a first data cell by first and second data fields in a single read pass through the first data cell; wherein, for each data record within the first data cell, the processor is to retrieve data values from the first and second data fields, search a first binary tree to determine whether the data value from the first data field comprises a unique value and add the data value to the first binary tree if it is unique, and search a second binary tree to determine whether the data value from the second data field comprises a unique value and add the data value to the second binary tree if it is unique; and generate first and second unique values indexes of identifiers of the data records associated with the unique data values within the first and second binary trees.
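The single-pass scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes records are dictionaries, that a record's position in the list serves as its identifier, and that an unbalanced binary search tree is sufficient. The names `Node`, `UniqueValueTree`, `find_or_add`, and `index_data_cell` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    """One unique data value plus the identifiers of records holding it."""
    value: Any
    record_ids: list = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class UniqueValueTree:
    """Unbalanced binary search tree of unique values (illustrative only)."""

    def __init__(self):
        self.root = None

    def find_or_add(self, value):
        """Search for value; add a node only if the value is not yet present."""
        if self.root is None:
            self.root = Node(value)
            return self.root
        node = self.root
        while True:
            if value == node.value:
                return node  # value already seen, so it is not unique
            branch = "left" if value < node.value else "right"
            child = getattr(node, branch)
            if child is None:
                child = Node(value)  # first occurrence of this value
                setattr(node, branch, child)
                return child
            node = child

    def in_order(self):
        """Yield (value, record_ids) pairs in ascending value order."""
        stack, node = [], self.root
        while stack or node:
            while node:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.value, node.record_ids
            node = node.right


def index_data_cell(records, first_field, second_field):
    """Build both unique-value indexes in one read pass over the records."""
    first_tree, second_tree = UniqueValueTree(), UniqueValueTree()
    for record_id, record in enumerate(records):
        # Assumption: the list position stands in for the record identifier.
        first_tree.find_or_add(record[first_field]).record_ids.append(record_id)
        second_tree.find_or_add(record[second_field]).record_ids.append(record_id)
    return dict(first_tree.in_order()), dict(second_tree.in_order())
```

For example, calling `index_data_cell(records, "city", "year")` on a list of dictionaries with `city` and `year` keys returns two mappings from each unique value to the identifiers of the records that contain it, with each record read only once. A production system would likely use a self-balancing tree to bound search cost on sorted input; that choice is an assumption here, not something the abstract states.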