25. In Data Mining, ___ is a set of input variables used to predict an observation's outcome class or continuous outcome value.
26. During each iteration of cluster analysis, the distances between new clusters are determined until any two clusters are sufficiently close to be linked using an algorithm called ___.
27. In the CRISP-DM process for data mining, which phase is the cleaning of the data so it is ready for modeling tools?
25. In Data Mining, ___ is a set of input variables used to predict an observation's outcome class or continuous outcome value.
Ans : Features (also called predictors or input attributes)
Explanation : The set of input variables used to predict an observation's outcome is its features. A hypothesis is related but different: it is the function a learning algorithm produces to map the features to the target, which is either a class label or a continuous value. When designing a learning algorithm, the first decision is how to represent the hypothesis. In most supervised learning algorithms, the goal is then to find the hypothesis that best maps the inputs to the correct outputs.
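To make the relationship between the input variables and a hypothesis concrete, here is a minimal sketch in Python. The toy data, the variable names and the choice of a one-variable least-squares line are all assumptions made for illustration; they are not part of the original question.

```python
from statistics import mean

# Hypothetical toy data: one input variable per observation (hours studied)
# and a continuous outcome value (exam score).
features = [1.0, 2.0, 3.0, 4.0]
outcomes = [52.0, 54.0, 56.0, 58.0]


def fit_linear_hypothesis(xs, ys):
    """Return a hypothesis h(x) = slope * x + intercept fitted by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar

    def hypothesis(x):
        # Maps an input value to a predicted outcome value.
        return slope * x + intercept

    return hypothesis


h = fit_linear_hypothesis(features, outcomes)
prediction = h(5.0)  # predicted outcome for a new observation
print(prediction)
```

Here `features` holds the input variables, while `h` is the learned mapping from those inputs to a predicted outcome.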
26. During each iteration of cluster analysis, the distances between new clusters are determined until any two clusters are sufficiently close to be linked using an algorithm called ___.
Ans : Agglomerative Hierarchical clustering algorithm
Explanation : This is one type of hierarchical clustering algorithm. It starts by treating every individual point as its own cluster and then successively links the two closest clusters until only one cluster remains.
For more information:
There are two types of hierarchical clustering algorithm:
Agglomerative hierarchical clustering, or AGNES (agglomerative nesting), which works bottom-up by merging clusters.
Divisive hierarchical clustering, or DIANA (divisive analysis), which works top-down by splitting clusters.
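The agglomerative procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distance between two clusters is taken as the smallest point-to-point Euclidean distance (single linkage), the function names are hypothetical, and the points are made-up 2-D data.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(cluster_a, cluster_b):
    """Distance between two clusters: their closest pair of points (an assumption)."""
    return min(dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerative(points, n_clusters=1):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    while len(clusters) > n_clusters:
        # Recompute distances between the current clusters and pick the closest pair.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6)]  # hypothetical data
two_clusters = agglomerative(points, n_clusters=2)
one_cluster = agglomerative(points)
print(two_clusters)
```

Stopping at `n_clusters=1` reproduces the "until only one cluster remains" behaviour; a larger value stops the merging earlier.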
27. In the CRISP-DM process for data mining, which phase is the cleaning of the data so it is ready for modeling tools?
Ans : Phase 3 - Data Preparation Phase
Explanation : This phase of CRISP-DM involves the following tasks:
1. Select data
2. Clean data
3. Construct data
4. Integrate data
5. Format data
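As a rough illustration of how the five tasks above might look on a small table, here is a sketch in plain Python. The records, field names and the `prepare` helper are all hypothetical, and real projects would typically use a data-frame library instead.

```python
# Hypothetical raw records; every field name here is illustrative.
raw_customers = [
    {"id": 3, "age": "45", "income": "48000", "notes": "call back"},
    {"id": 1, "age": "34", "income": "52000", "notes": ""},
    {"id": 2, "age": "", "income": "61000", "notes": "new"},
]
region_by_id = {1: "north", 2: "south", 3: "north"}  # a second, hypothetical source


def prepare(rows, regions):
    prepared = []
    for row in rows:
        # 1. Select data: keep only the attributes the model will use.
        record = {key: row[key] for key in ("id", "age", "income")}
        # 2. Clean data: drop records with a missing age, convert text to numbers.
        if record["age"] == "":
            continue
        record["age"] = int(record["age"])
        record["income"] = int(record["income"])
        # 3. Construct data: derive a new attribute from an existing one.
        record["income_k"] = record["income"] / 1000
        # 4. Integrate data: merge in an attribute from another source.
        record["region"] = regions[record["id"]]
        prepared.append(record)
    # 5. Format data: sort the records and fix the column order.
    prepared.sort(key=lambda r: r["id"])
    return [(r["id"], r["age"], r["income_k"], r["region"]) for r in prepared]


table = prepare(raw_customers, region_by_id)
print(table)
```

Dropping incomplete records is only one possible cleaning choice; filling in a default or estimated value is another.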