Question


Data Science for Data Mining

Why is it often better to perform reductions using operators rather than excluding attributes or observations as data are imported? (Write Minimum 100 words)

Solutions

Expert Solution

Answer:

Data reduction is the process of reducing the amount of capacity required to store data. It can increase storage efficiency and reduce costs. Storage vendors often describe storage capacity in terms of raw capacity and effective capacity, where effective capacity refers to the amount of data that can be stored after reduction is applied.

Data reduction strategies are welcomed by small and medium-sized businesses (SMBs), as they help reduce storage costs and preserve storage capacity for future data growth. Below are a few tried and widely followed techniques that reduce the storage space used in a SAN environment and keep data growth under control.

Thin Provisioning- Thin provisioning makes more efficient use of storage capacity by not reserving space for unwritten blocks of storage. With this technique, enterprises can realize storage savings of up to 30%, with little or no impact on regular storage operations. A variety of vendors offer thin provisioning, so enterprises of all sizes can benefit from it.

Data Deduplication- Data deduplication identifies repeated data patterns and reduces them to a single instance to save capacity in the SAN storage environment. Depending on the data, reducing repeated patterns to a single physical copy yields storage savings ranging from 2:1 to 10:1.
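
The deduplication idea can be illustrated with a minimal Python sketch. It is not how any particular SAN product works: the fixed 4-byte block size, the SHA-256 fingerprint, the function names `deduplicate` and `restore`, and the sample data are all assumptions chosen for illustration. Each distinct block is stored once, and an ordered list of fingerprints is kept so the original data can be rebuilt.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4):
    """Split data into fixed-size blocks and keep one copy of each distinct block.

    block_size=4 is an illustrative assumption; real systems use much larger
    blocks and often variable-size chunking.
    """
    store = {}   # fingerprint -> block bytes (the single physical copy)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store the block only the first time
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the stored blocks and the recipe."""
    return b"".join(store[d] for d in recipe)


# Contrived sample: one block repeated 8 times, another repeated twice.
data = b"ABCD" * 8 + b"WXYZ" * 2
store, recipe = deduplicate(data)

logical = len(data)                             # bytes before deduplication
physical = sum(len(b) for b in store.values())  # bytes actually stored
ratio = logical / physical
print(f"{logical} bytes -> {physical} bytes, ratio {ratio:.0f}:1")
```

The ratio here counts only the stored block bytes and ignores the overhead of the fingerprint list, so real savings would be somewhat lower. As the answer notes, the achievable ratio depends heavily on how repetitive the data is.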

