Explain what is meant by each of the terms below as a limitation of Pattern Discovery. Provide an example for each.
- poor data quality
- opportunity
- intervention
- separability
- oblivious
- non-stationarity
1) Poor data quality:
Poor data quality limits pattern discovery because noisy, missing, or inconsistent data can produce spurious patterns or mask real ones. More broadly, data quality matters: quality problems can have a significant impact on a company's bottom line. Bad data results in redundant work and missed opportunities, and quality problems accumulate, growing in scope and impact as data moves through the enterprise. In the worst cases, this causes executives to reach incorrect conclusions and make bad business decisions. Yet most companies have no formal data quality program to measure and mitigate these problems; many are not even aware that they have a data quality problem.
The solution is to institute an enterprise data quality (DQ) program. By its very nature, an enterprise DQ program is beyond the capabilities of any single canned solution. DQ requires a holistic approach, with touchpoints throughout the business, implemented across a range of technologies. DQ should be an integral part of the data processing pipeline, not limited to offline, retrospective analysis. Nor is DQ just about cleansing customer names and addresses; it is about the consistency and representation of all enterprise information.
If the technologies used for DQ are to be part of the processing pipeline, they have to be production-level robust. They have to deal with complex legacy data, real-time transactions, and high sustained processing volumes. Approaches that do not meet all these requirements end up being relegated to offline deployments and rarely meet expectations. This is what typically happens with special-purpose niche DQ tools that specialize in certain types of data and that can be used only in limited circumstances.
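To make the pipeline idea concrete, here is a minimal sketch (using pandas; the column names and the domain rule are invented for illustration) of the kind of inline checks a DQ step might run on each batch before it flows downstream:

```python
import pandas as pd

def dq_report(df: pd.DataFrame) -> dict:
    # Basic per-batch data quality checks that could run inside a pipeline
    # rather than as offline, retrospective analysis.
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "null_counts": df.isna().sum().to_dict(),
        "negative_amounts": int((df["amount"] < 0).sum()),  # hypothetical domain rule
    }

batch = pd.DataFrame({
    "customer_id": [1, 2, 2, None],
    "amount": [10.0, -5.0, -5.0, 30.0],
})
report = dq_report(batch)
print(report)
if report["duplicate_rows"] or report["negative_amounts"]:
    raise ValueError(f"Batch failed DQ checks: {report}")
```

Real enterprise DQ covers far more (consistency across systems, reference-data validation, address standardization), but even checks this simple stop duplicates and nulls from accumulating downstream.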
2) Opportunity:
Big data is a collection of data sets so large and complex that they become awkward to work with using traditional database management tools. The volume, variety, and velocity of big data have introduced challenges across the board for capture, storage, search, sharing, analysis, and visualization. Examples of big data sources include web logs, RFID, sensor data, social networks, Internet search indexing, call detail records, military surveillance, and complex data in the astronomical, biogeochemical, genomic, and atmospheric sciences. Big data is at the core of most predictive analytic services offered by IT organizations. Thanks to technological advances in computer hardware, such as faster CPUs, cheaper memory, and MPP architectures, and to new technologies such as Hadoop, MapReduce, and in-database and text analytics, it is now feasible to collect, analyze, and mine massive amounts of structured and unstructured data for new insights.
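The paragraph leans on MapReduce as the canonical big-data processing model. As a rough sketch (plain Python standing in for Hadoop, with a made-up word-count task over log lines), the two phases look like this:

```python
from collections import defaultdict

# Minimal illustration of the MapReduce idea: map each record to
# (key, value) pairs, then group by key and reduce each group.

def map_phase(records):
    # Emit (token, 1) for every token in every log line.
    for line in records:
        for token in line.split():
            yield token.lower(), 1

def reduce_phase(pairs):
    # Group values by key and sum them.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

logs = ["GET /index.html", "GET /about.html", "POST /index.html"]
print(reduce_phase(map_phase(logs)))  # e.g. {'get': 2, '/index.html': 2, ...}
```

Hadoop distributes these same two phases across a cluster, which is what makes mining web logs or call detail records at scale feasible.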
3) Non-stationarity:
Data points are often non-stationary: their means, variances, and covariances change over time. Non-stationary behavior can take the form of trends, cycles, random walks, or combinations of the three.
Non-stationary data are, as a rule, unpredictable and cannot be reliably modeled or forecast. Results obtained from non-stationary time series may be spurious, indicating a relationship between two variables where none exists. To obtain consistent, reliable results, non-stationary data must be transformed into stationary data. In contrast to a non-stationary process, whose variance changes over time and whose mean does not remain near, or return to, a long-run level, a stationary process reverts to a constant long-run mean and has a constant variance independent of time. For pattern discovery this means a pattern mined from one stretch of a non-stationary series may no longer hold later.
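As a concrete sketch (assuming NumPy and statsmodels are available), the example below generates a random walk, a textbook non-stationary series, and shows that first differencing makes it stationary according to the augmented Dickey-Fuller test:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)

# A random walk is a classic non-stationary series: its variance grows with time.
walk = np.cumsum(rng.normal(size=500))

# First differencing is the standard transformation to stationarity.
diffed = np.diff(walk)

# Augmented Dickey-Fuller test: a low p-value rejects the unit-root
# (non-stationarity) hypothesis.
for name, series in [("random walk", walk), ("differenced", diffed)]:
    p_value = adfuller(series)[1]
    print(f"{name}: ADF p-value = {p_value:.3f}")
```

Differencing is the standard first remedy for random-walk behavior; trend-stationary series are instead detrended.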
4) Separability:
The processing unit of a single-layer perceptron network can categorize a set of patterns into two classes because its linear threshold function defines a linear decision boundary; consequently, the two classes must be linearly separable for the perceptron to classify them correctly. The classic counterexample is XOR: its two classes cannot be divided by any straight line, so no single-layer perceptron can learn it.
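A minimal NumPy sketch (the training helper below is written from scratch for this example, not taken from a library) makes the limitation visible: the same perceptron that learns the linearly separable AND function cannot get XOR right:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=0.1):
    # Single-layer perceptron with a linear threshold activation,
    # trained with the classic perceptron learning rule.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += lr * (target - pred) * xi
            b += lr * (target - pred)
    return w, b

def accuracy(X, y, w, b):
    preds = (X @ w + b > 0).astype(int)
    return (preds == y).mean()

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])  # linearly separable
y_xor = np.array([0, 1, 1, 0])  # not linearly separable

for name, y in [("AND", y_and), ("XOR", y_xor)]:
    w, b = train_perceptron(X, y)
    print(f"{name}: accuracy = {accuracy(X, y, w, b):.2f}")
```

The perceptron convergence theorem guarantees convergence only for linearly separable classes; for XOR, no weight setting classifies more than three of the four points correctly, and the learning rule oscillates rather than converging.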
5) Intervention:
An intervention is an action taken on the basis of a discovered pattern, such as changing prices, rearranging products, or rejecting loan applicants flagged as risky. Pattern discovery works on observational data, so the patterns it finds describe associations under the conditions that produced the data; they do not tell us what will happen when we act. Correlation is not causation: two variables may move together only because a third variable drives both, and intervening on one of them will not change the other.
Moreover, acting on a pattern can itself change the system that generated it. For example, if a bank rejects every applicant who matches a discovered risk pattern, the population of future applicants changes, and the pattern may no longer hold.
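A small simulation (invented numbers, plain NumPy) illustrates the point: ice cream sales and swimming accidents are correlated in observational data because temperature drives both, but setting sales directly, as an intervention would, destroys the pattern:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Observational regime: temperature drives both ice cream sales and
# swimming accidents, so the two are correlated without causation.
temperature = rng.normal(25, 5, n)
ice_cream = 2.0 * temperature + rng.normal(0, 3, n)
accidents = 0.5 * temperature + rng.normal(0, 3, n)
print("observed corr:", np.corrcoef(ice_cream, accidents)[0, 1])

# Intervention regime: sales are set by decree (e.g., a promotion),
# breaking their link to temperature. The pattern vanishes.
ice_cream_do = rng.normal(50, 10, n)
print("interventional corr:", np.corrcoef(ice_cream_do, accidents)[0, 1])
```

Any pattern discovery method applied to the observational data would report the association; only the intervention reveals that acting on it is useless.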