In: Computer Science
State the data cube computation methods. Give example for each method
Data Cube Computation Methods:
1. Multiway Array Aggregation
Example
Suppose the data size on each dimension A, B and C is 40, 400 and 4000, respectively.
Minimum memory required when traversing the order, 1,2,3,4,5,…, 64
Total memory required is 100×1000 + 40×1000 + 40×400.
Cuboids should be sorted and computed according to the data size on each dimension.
Keeps the smallest plane in the main memory, fetches and computes only one chunk at a time for the largest plane.
2. Bottom-Up Computation (BUC)
example: Iceburg Running Process
Partitioning
Sorts data values
Partitions into blocks that fit in memory
Apriori Pruning
For each block • If it does not satisfy min_sup, its descendants are pruned • If it satisfies min_sup, materialization and a recursive call including the next dimension.
Computation of sparse data cubes.
3. Star-Cubing
Example:Iceberg Pruning Strategy
Iceberg Pruning in a Shared Dimension
If AB is a shared dimension of ABD, then the cuboid AB is computed simultaneously with ABD
If the aggregate on a shared dimension does not satisfy the iceberg condition (min_sup), then all the cells extended from this shared
dimension cannot satisfy the condition either - If we can compute the shared dimensions before the actual cuboid, we can apply Apriori pruning.
Cuboid Trees
Tree structure to represent cuboids
Base cuboid tree, 3-D cuboid trees, 2-D cuboid trees, …
Each level represents a dimension
Each node represents an attribute
Each node includes the attribute value, aggregate value, descendant(s)
The path from the root to a leaf represents a tuple
Example • count(a1,b1,,) = 10 • count(a1,b1,c1,*) = 5 → Pruning while aggregating simultaneously on multiple dimensions.