Question

In: Statistics and Probability

1. A good predictive model is one that fits the data closely whereas a good explanatory model is one that predicts new cases accurately. A. True B. False

2. The specificity of a classifier is its ability to detect the important class members correctly and sensitivity is its ability to rule out C0 members correctly. A. True B. False

3. This method of finding the best subset of predictors relies on a partial, iterative search through the space of all possible regression models. The end product is one best subset of predictors. A. Exhaustive Search B. Subset Selection Algorithms C. Stepwise Regression D. All the above

4. This model is used to fit a linear relationship between a quantitative dependent variable (also called the outcome or response variable) and a set of predictors (also called independent variables). A. Multiple Linear Regression B. Simple Linear Regression C. Stepwise Regression D. All the above

5. The advantage of choosing k > 1 is: A. Fitting to the noise in the data B. Higher values of k provide smoothing that reduces the risk of overfitting due to noise in the training data C. It helps in finding the outliers D. None of the above

6. The number of records required in the training set to qualify as large increases exponentially with the number of predictors p. This is because the expected distance to the nearest neighbor goes up dramatically with p unless the size of the training set increases exponentially with p. This phenomenon is known as: A. Curse of Dimensionality B. Overfitting C. Smoothing D. All the above

7. The naive Bayes classifier's beauty is in its A. Simplicity B. Computational efficiency C. Good Classification Performance D. All the above

8. This is a difficulty with the practical exploitation of the power of the k-NN approach: A. The time to find the nearest neighbors in a large training set can be prohibitive B. The number of records required in the training set to qualify as large increases exponentially with the number of predictors C. The time-consuming computation is deferred to the time of prediction D. All the above

9. Which of the following statements is not true? A. The naive Bayes classifier requires a very large number of records to obtain good results. B. With naive Bayes, good performance is obtained when the goal is classification or ranking. C. The naive Bayes method is used frequently in credit scoring. D. None of the above

10. This is a recursive partitioning method that predates CART (Classification and Regression Tree) procedures by several years and is widely used in database marketing applications to this day. A. Pruning B. CHAID C. Naive Bayes D. All the above

Solutions

Expert Solution

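1. A good predictive model is one that fits the data closely whereas a good explanatory model is one that predicts new cases accurately. ANS: B. FALSE

It is the other way round. A good explanatory model is one that fits the data closely, whereas a good predictive model is one that predicts new cases accurately.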

2. The specificity of a classifier is its ability to detect the important class members correctly and sensitivity is its ability to rule out C0 members correctly. ANS: B. FALSE

It is the other way round. Sensitivity is the ability to detect the important class members correctly; in diagnostic terms, the ability of a test to identify as positive all the patients who actually have the disease. Specificity is the ability to rule out C0 members correctly; that is, the ability of a test to identify as negative all the patients who do not have the disease.
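
The two definitions can be checked on a small example. The sketch below is only an illustration: the function name `sensitivity_specificity` and the ten made-up labels are assumptions, with 1 standing for the important class and 0 for C0.

```python
def sensitivity_specificity(actual, predicted, positive=1):
    """Return (sensitivity, specificity) for two equal-length label lists.

    Assumes both classes occur at least once in `actual`;
    otherwise one of the divisions below would fail.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # important class, detected
    fn = sum(a == positive and p != positive for a, p in pairs)  # important class, missed
    tn = sum(a != positive and p != positive for a, p in pairs)  # C0, correctly ruled out
    fp = sum(a != positive and p == positive for a, p in pairs)  # C0, wrongly flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical labels: 4 important-class records and 6 C0 records.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Here 3 of the 4 important-class records are detected and 4 of the 6 C0 records are ruled out, so the two measures answer different questions about the same classifier.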

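3. This method of finding the best subset of predictors relies on a partial, iterative search through the space of all possible regression models. The end product is one best subset of predictors. ANS: B. Subset Selection Algorithms

Exhaustive search evaluates every possible subset, so it is not a partial search, which rules out "All the above". Stepwise regression is one of the subset selection algorithms, along with forward selection and backward elimination.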

4. This model is used to fit a linear relationship between a quantitative dependent variable (also called the outcome or response variable) and a set of predictors (also called independent variables). ANS: A. Multiple Linear Regression
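
For reference, the multiple linear regression model is usually written as below. The notation is an assumption for illustration and does not come from the question: Y is the outcome, x_1, ..., x_p are the predictors, the betas are the coefficients to be estimated, and epsilon is the noise term.

```latex
Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon
```

With a single predictor (p = 1) this reduces to simple linear regression, which is why option A is the more general answer when the question speaks of a set of predictors.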

5. The advantage of choosing k > 1 is ANS: B. Higher values of k provide smoothing that reduces the risk of overfitting due to noise in the training data.
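
A toy example shows the smoothing effect. This is a minimal sketch with made-up one-dimensional data, not a production k-NN: the helper `knn_predict` and the record at x = 3.0, deliberately mislabeled to act as noise, are assumptions for illustration.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training records.

    `train` is a list of (x, label) pairs with a single numeric predictor x.
    """
    neighbors = sorted(train, key=lambda rec: abs(rec[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Class 0 for small x, class 1 for large x, plus one noisy record at x = 3.0.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (5.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]

pred_k1 = knn_predict(train, 3.2, k=1)  # follows the single noisy neighbor
pred_k3 = knn_predict(train, 3.2, k=3)  # the two clean neighbors outvote it
print(pred_k1, pred_k3)
```

With k = 1 the prediction copies the noisy record, while with k = 3 the vote of the surrounding records overrides it. Choosing k too large has the opposite problem of smoothing away real local structure.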

6. The number of records required in the training set to qualify as large increases exponentially with the number of predictors p. This is because the expected distance to the nearest neighbor goes up dramatically with p unless the size of the training set increases exponentially with p. This phenomenon is known as ANS: A. Curse of Dimensionality
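
The growth in nearest-neighbor distance can be seen in a small simulation. The sketch below assumes records drawn uniformly from the unit hypercube and keeps the training set fixed at 200 records; the function name, the sample sizes, and the seed are arbitrary choices for illustration.

```python
import math
import random


def mean_nn_distance(n_records, p, n_queries=50, seed=0):
    """Average Euclidean distance from random query points to their
    nearest neighbor among n_records uniform points in [0, 1]^p."""
    rng = random.Random(seed)
    train = [[rng.random() for _ in range(p)] for _ in range(n_records)]
    total = 0.0
    for _ in range(n_queries):
        query = [rng.random() for _ in range(p)]
        total += min(math.dist(query, rec) for rec in train)
    return total / n_queries


# Same number of records, increasing number of predictors p.
distances = {p: mean_nn_distance(200, p) for p in (1, 2, 10, 50)}
for p, d in distances.items():
    print(f"p = {p:2d}  mean nearest-neighbor distance = {d:.3f}")
```

With the training set held fixed, the "nearest" neighbor drifts further away as p grows, so it stops being a genuinely similar record unless far more data is collected.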

7. The naive Bayes classifier's beauty is in its ANS: D. All the above

8. This is a difficulty with the practical exploitation of the power of the k-NN approach ANS: D. All the above

9. Which of the following statements is not true? ANS: C. The naive Bayes method is used frequently in credit scoring.

Naive Bayes performs well for classification and ranking, but its estimates of the class membership probabilities themselves are biased, so it is rarely used in credit scoring, where the probability itself matters.

10. This is a recursive partitioning method that predates CART (Classification and Regression Tree) procedures by several years and is widely used in database marketing applications to this day. ANS: B. CHAID

