Data Science for Data Mining
Why is it often better to perform reductions using operators rather than excluding attributes or observations as data are imported? (Write Minimum 100 words)
Answer:-
Data reduction is the process of reducing the amount of capacity required to store data. It can increase storage efficiency and reduce costs. Storage vendors often describe storage capacity in terms of raw capacity and effective capacity, where effective capacity refers to how much data can be stored once reduction has been applied.
Data reduction strategies are welcomed by SMBs (small and medium-sized businesses), as they help reduce storage costs and preserve storage capacity for future data growth. Below are a few tried and widely used techniques that reduce the storage space consumed in a SAN environment and keep data growth under control.
Thin Provisioning- Thin provisioning lets an enterprise use storage capacity more efficiently by not reserving physical space for blocks that have not yet been written. With this technique, enterprises can realize storage savings of up to 30%, with little or no impact on regular storage operations. A variety of vendors offer thin provisioning, so enterprises of all sizes can benefit from it.
Data Deduplication- Data deduplication identifies repeated data patterns and reduces them to a single instance to save capacity in the SAN storage environment. Depending on the data, reducing repeated patterns to a single physical copy yields storage savings ranging from 2:1 to 10:1.
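The deduplication idea described above can be illustrated with a minimal Python sketch. It is not how any particular SAN product works: it assumes fixed-size blocks and SHA-256 fingerprints, and the function names (deduplicate, restore, reduction_ratio) are hypothetical, chosen only for this example.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Assumption: fixed-size blocks and SHA-256 fingerprints; real systems may
    use variable-size chunking and other hash functions.
    """
    store = {}   # fingerprint -> block contents (the single physical copy)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # store only the first occurrence
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


def reduction_ratio(store, recipe) -> float:
    """Logical size divided by physical size, e.g. 2.0 means a 2:1 saving."""
    logical = sum(len(store[fingerprint]) for fingerprint in recipe)
    physical = sum(len(block) for block in store.values())
    return logical / physical if physical else 1.0
```

For example, calling deduplicate on four 4096-byte blocks of which three are identical stores only two unique blocks, so reduction_ratio reports 2.0, which corresponds to the 2:1 end of the range mentioned above. Data with more repetition gives a higher ratio.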