System and method for generating a multi dimensional data cube for analytics using a map-reduce program
Abstract:
In accordance with an embodiment, described herein is a system and method for generating a data cube for analytics. A map-reduce program running in a data processing cluster can read each line of source data and generate a key-value pair for each of a plurality of data combinations in that line. Values paired with the same key can be aggregated to generate one or more frequency values or one or more aggregate values that represent the result of a query on the source data. Each query result can be stored in an output file and encapsulated into a data cube cached in a distributed file system of the data processing cluster. The data cube can map a query from a client application to an output file and return the pre-computed result stored in that output file to the client application.
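The flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the claimed implementation: it assumes comma-separated source lines, treats one column as a numeric measure, and keeps the results in an in-memory dictionary rather than in output files cached in a distributed file system. The names `map_line`, `reduce_pairs`, `build_cube`, and `query`, as well as the sample columns and data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_line(line, columns, measure):
    """Map step: emit one (key, value) pair per combination of dimension values.

    The key is a frozenset of (column, value) pairs, so it does not depend
    on the order in which a later query names its columns.
    """
    fields = dict(zip(columns, line.strip().split(",")))
    dims = [c for c in columns if c != measure]
    value = float(fields[measure])
    for size in range(1, len(dims) + 1):
        for combo in combinations(dims, size):
            yield frozenset((c, fields[c]) for c in combo), value


def reduce_pairs(pairs):
    """Reduce step: aggregate all values that share a key.

    'count' plays the role of a frequency value and 'sum' of an aggregate value.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {k: {"count": len(v), "sum": sum(v)} for k, v in groups.items()}


def build_cube(lines, columns, measure):
    """Run map then reduce over every source line (assumed in-memory stand-in)."""
    pairs = (p for line in lines for p in map_line(line, columns, measure))
    return reduce_pairs(pairs)


def query(cube, filters):
    """Look up a pre-computed result; returns None if the combination is absent."""
    return cube.get(frozenset(filters.items()))


# Hypothetical sample data, not taken from the source document.
columns = ["region", "product", "sales"]
lines = ["east,widget,10", "east,gadget,5", "west,widget,7"]
cube = build_cube(lines, columns, measure="sales")

east = query(cube, {"region": "east"})
east_widget = query(cube, {"product": "widget", "region": "east"})
```

In this sketch a client query never scans the source lines: it is answered by a dictionary lookup against results computed ahead of time, which mirrors the role the abstract assigns to the cached data cube.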