In: Statistics and Probability
A company has invested tens of millions of dollars to collect, store, back-up, secure, and hire individuals to manage a data warehouse. Still, whenever data are actually extracted from the warehouse and used to help solve a business problem, the process of data preparation (e.g., cleaning, filtering, exploring, etc.) can typically take up to 80-90% of the time spent on an analytics project. The other 10-20% makes up the actual analytics/modeling of the data. This data paradigm has existed for many years in the company. What are some of the things that a company can do so that most of an analyst's time is spent actually doing analytics rather than getting the data ready?
ANSWER :-
Given that,
A company has spent tens of millions of dollars to collect, store, back up, secure, and staff a data warehouse.
Even so, whenever data are actually extracted from the warehouse and used to help solve a business problem, the process of data preparation
(e.g., cleaning, filtering, exploring, etc.)
typically takes up to 80-90% of the time spent on an analytics project. The other 10-20% goes to the actual analysis/modeling of the data.
This pattern has existed in the company for many years.
We need to find some of the things the company can do so that most of an analyst's time is spent actually doing analytics rather than getting the data ready.
They are:
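Smoothing: remove noise from the data, for example with binning or a moving average.
Aggregation: summarize detailed records into totals or averages, for example daily sales rolled up to monthly sales.
Generalization: replace low-level values with higher-level categories, for example exact age with an age band.
Normalization: rescale numeric attributes to a common range such as 0 to 1.
Attribute construction: derive new attributes from existing ones so they are ready for modeling.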
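
One way to keep analysts from repeating these steps by hand on every project is to write each one once as a reusable function. The following is only a minimal sketch in plain Python: the function names, the sample `orders` records, the age cut-offs, and the window size are all assumptions made for illustration and do not come from the question.

```python
from collections import defaultdict
from statistics import mean


def smooth(values, window=3):
    """Smoothing: replace each value with the mean of a trailing window."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(mean(values[start:i + 1]))
    return out


def aggregate(rows, key, field):
    """Aggregation: total `field` for each distinct value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[field]
    return dict(totals)


def generalize_age(age):
    """Generalization: map an exact age to a coarser band (cut-offs are arbitrary)."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle"
    return "senior"


def min_max_normalize(values):
    """Normalization: rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def add_revenue(rows):
    """Attribute construction: derive revenue = price * quantity."""
    return [{**row, "revenue": row["price"] * row["quantity"]} for row in rows]


# Hypothetical sample records standing in for a warehouse extract.
orders = [
    {"region": "east", "age": 25, "price": 10.0, "quantity": 2},
    {"region": "west", "age": 41, "price": 4.0, "quantity": 5},
    {"region": "east", "age": 67, "price": 8.0, "quantity": 1},
]

orders = add_revenue(orders)
revenue_by_region = aggregate(orders, "region", "revenue")
age_bands = [generalize_age(o["age"]) for o in orders]
scaled_prices = min_max_normalize([o["price"] for o in orders])
smoothed_quantities = smooth([o["quantity"] for o in orders], window=2)
```

If functions like these were run once when data are loaded into the warehouse, rather than by each analyst after extraction, the prepared columns would already be available at analysis time.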