Question

In: Computer Science

There are several typical cube computation methods such as Multi-Way, BUC, and Star-cubing. Briefly describe each...

There are several typical cube computation methods such as Multi-Way, BUC, and Star-cubing. Briefly describe each one of these methods outlining the key points.

Solutions

Expert Solution

1).ANSWER :

GIVENTHAT :

Data Cube Computation Methods

1. Multi-Way Array Aggregation for Cube Computation (MOLAP)

2. BUC (Bottom-Up Computation)

3. Star-Cubing

4. High-Dimensional OLAP

Multi-Way Array Aggregation

1. Array-based bottom-up algorithm

2. Using multi-dimensional chunks

3. No direct tuple comparisons

4. Simultaneous aggregation on multiple dimensions

5. Intermediate aggregate values are re-used for computing ancestor cuboids

6. Cannot do Apriori pruning: No iceberg optimization

7. Partition arrays into chunks (a small subcube which inserts in memory).

8. Compressed sparse array addressing: (chunk_id, offset)

9. the simplest traversing order to try to to multi-way aggregation is that the one that minimizes the memory requirement and reduced I/Os

10. Method: the planes should be sorted and computed consistent with their size in ascending order

11. Idea: keep the littlest plane within the main memory, fetch and compute just one chunk at a time for the most important plane

12. Limitation of the method: computing well just for alittle number of dimensions

13. If there are an outsized number of dimensions, “top-down” computation and iceberg cube computation methods are often explored.

Bottom-Up Computation

1. Divides dimensions into partitions and facilitates iceberg pruning

a. If a partition doesn't satisfy min_sup, its descendants are often pruned

b. If minsup = 1 Þ compute full CUBE!

2. No simultaneous aggregation

3. Usually, entire data set can’t slot in main memory

4. Sort distinct values

5. partition into blocks that fit

6. Continue processing

7. Optimizations

8. Partitioning

9. External Sorting, Hashing, Counting Sort

10. Ordering dimensions to encourage pruning

11. Cardinality, Skew, Correlation

12. Collapsing duplicates

13. Can’t do holistic aggregates anymore!

Star-Cubing

1. Explore shared dimensions

2. Allows for shared computations

3. Aggregate during a top-down manner but with the bottom-up sub-layer underneath

4. Shared dimensions grow in bottom-up fashion

5. Anti-monotonic property of shared dimensions

6. Intuition: if we'll compute the shared dimensions before the actual cuboid, we'll use them to undertake to to Apriori pruning

7. Problem: the way to prune while still aggregate simultaneously on multiple dimensions?

8. Use a tree structure almost like H-tree to represent cuboids

9. Collapses common prefixes to save lots of memory

10. Keep count at node

11. Traverse the tree to retrieve a specific tuple

14. Solution: Replace such attributes by a *. Such attributes are star attributes, and therefore the corresponding nodes within the cell tree are star nodes

15. Suppose minsup = 2

16. Perform one-dimensional aggregation. Replace attribute values for which count < 2 with *. And collapse all *’s together

17. Resulting table has all such attributes replaced with the star-attribute

18. With regards to the iceberg computation, this new table may be a lossless compression of the first table

19. Given the new compressed table, it's possible to construct the corresponding cell tree—called star tree

20. Keep a star table at the side for straightforward lookup of star attributes

21. The star tree may be a lossless compression of the first cell tree

High-Dimensional OLAP

1. None of the previous cubing method can handle high dimensionality!

2. Challenge to current cubing methods:

a. The “curse of dimensionality’’ problem

b. Iceberg cube and compressed cubes: only delay the inevitable explosion

c. Full materialization: still significant overhead in accessing results on disk

3. High-D OLAP is required in applications

a. Science and engineering analysis

b. Bio-data analysis: thousands of genes

c. Statistical surveys: many variables

4. Observation: OLAP occurs only on alittle subset of dimensions at a time

5. Semi-Online Computational Model

a. Partition the set of dimensions into shell fragments

b. Calculate the data cubes for every shell fragment while retaining inverted indices or value-list indices

c. Given the pre-computed fragment cubes, dynamically compute cube cells of the high-dimensional data cube online


Related Solutions

I.Financial Intermediaries: Briefly describe each of the following financial intermediaries in terms of the way they...
I.Financial Intermediaries: Briefly describe each of the following financial intermediaries in terms of the way they help issuers raise capital: Commercial Bank: Lending and deposit- taking; also expanded into investment advisory/trust services. Investment Bank: Handle securities issuance, merger/acquisition and reorganization advisory services. Financial Services Company: Includes broker/dealers, investment advisory firms. B.  In what ways do efficient capital markets help both issuersand investors? II Capital Raising by Corporations: Identifyand describetwo typesof securities corporationsmay issue in order to raise capital. Is an initial...
Pick two of the following methods and briefly describe how they work. For each method that...
Pick two of the following methods and briefly describe how they work. For each method that is chosen, include a description of a research question (real or hypothetical) for each method, and please explain why it why chosen. Be as descriptive as possible Following methods to choose from: functional magnetic resonance imaging (fMRI) transcranial magnetic stimulation (TMS) electroencephalography (EEG) computed tomography (CT) diffusion tensor imaging (DTI)
describe the methods for determining road-way conditions
describe the methods for determining road-way conditions
Sketch the layout for a typical conventional wastewater treatment plant. Label each process. Briefly describe pretreatment,...
Sketch the layout for a typical conventional wastewater treatment plant. Label each process. Briefly describe pretreatment, primary treatment, and secondary treatment and their basic functions.
Briefly describe one way you could use each of the models in a research setting. Simple...
Briefly describe one way you could use each of the models in a research setting. Simple linear regression Multiple regression Factorial analysis of variance Multivariate analysis of variance Factor analysis
Briefly describe each and every way in which funds can be accessed through internal sources.
Briefly describe each and every way in which funds can be accessed through internal sources.
List three (3) methods or techniques for generating business ideas and briefly describe each of them.
List three (3) methods or techniques for generating business ideas and briefly describe each of them.
What are the two basic methods used by airlines to calculate earned revenue? Briefly describe each....
What are the two basic methods used by airlines to calculate earned revenue? Briefly describe each. Cite a section of the FASB ASC to support your answer.
What are the two basic methods used by airlines to calculate earned revenue? Briefly describe each....
What are the two basic methods used by airlines to calculate earned revenue? Briefly describe each. Cite a section of the FASB ASC to support your answer.
Section 28.1 Briefly describe the general body plan of a typical sponge. Also describe the germ...
Section 28.1 Briefly describe the general body plan of a typical sponge. Also describe the germ layer and symmetry. What is the purpose of the choanocytes and amoebocytes? How do sponges obtain nutrients? Mode of locomotion.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT