Question

  1. Consider the following validation-set confusion matrix, produced by a logistic regression model in which the dependent variable is whether the patient will have a heart attack, modeled on a range of independent variables. In this model, y=1 indicates a heart attack and y=0 indicates no heart attack. The cutoff value is 50 percent.
     a. Calculate the sensitivity, specificity and overall error of the model.
     b. Given this confusion matrix, should the cutoff value be changed? Justify your answer.

                  Predicted class
                  1        0
Actual class  1   2700     1000
              0   70       3068

Solutions

Expert Solution

                  Predicted Class
                  1        0
Actual Class  1   2700     1000
              0   70       3068

a.

From the confusion matrix: True Positive (TP) = 2700, False Negative (FN) = 1000, False Positive (FP) = 70, True Negative (TN) = 3068.

Sensitivity:

True Positive Rate = (True Positive)/(all actual 1) = (predicted as 1 given it was 1)/(all that were actually 1) = 2700/(2700+1000)

= 2700/3700 = 0.7297

or Sensitivity = 0.7297

Specificity:

True Negative Rate = (True Negative)/(all actual 0) = (predicted as 0 given it was 0)/(all that were actually 0) = 3068/(3068+70) = 3068/3138 = 0.9777

or Specificity = 0.9777

Total Error: (all false predictions)/total = (predicted as 1 but was 0, plus predicted as 0 but was 1)/total

= (70+1000)/(2700+1000+70+3068) = 1070/6838 = 0.1565

or Total error = 0.1565

b. The total accuracy of the logistic regression model is 1 - 0.1565 = 0.8435

Important:

Here we are predicting whether a patient has a heart attack, so consider the two measures:

1. Sensitivity: predicting that the patient has a heart attack given that he actually had one

2. Specificity: predicting that the patient does not have a heart attack given that he actually did not have one

For a given model, the two trade off against each other through the cutoff: to increase Sensitivity we have to reduce Specificity, and vice versa.

In medical diagnosis, predicting a heart attack for a patient who does not have one (false positive) is less risky than predicting no heart attack for a patient who actually has one (false negative).

Here Specificity is already very high (0.9777), but Sensitivity is only 0.7297: the model misses 1000 of the 3700 patients who actually had a heart attack. So we can compromise some Specificity to gain Sensitivity.

If we increase the cutoff, Specificity rises further but Sensitivity falls, which we do not want. Lowering the cutoff classifies more patients as 1, which increases Sensitivity at the cost of some Specificity.

Hence, we should change the cutoff from 0.5 to a lower probability.
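
As a cross-check, the metrics can be recomputed directly from the four cell counts of the confusion matrix. The sketch below uses the standard definitions Sensitivity = TP/(TP+FN) and Specificity = TN/(TN+FP); the function name confusion_metrics and the variable names are illustrative choices, not part of the original question.

```python
# Cell counts read from the confusion matrix
# (rows = actual class, columns = predicted class).
tp = 2700  # actual 1, predicted 1
fn = 1000  # actual 1, predicted 0
fp = 70    # actual 0, predicted 1
tn = 3068  # actual 0, predicted 0


def confusion_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, error and accuracy for a 2x2 matrix.

    Illustrative helper (hypothetical name), using the standard definitions.
    """
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),  # share of actual 1s predicted as 1
        "specificity": tn / (tn + fp),  # share of actual 0s predicted as 0
        "error": (fp + fn) / total,     # share of all cases misclassified
        "accuracy": (tp + tn) / total,  # share of all cases classified correctly
    }


metrics = confusion_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that the denominators are row totals of the matrix (all actual 1s, all actual 0s), not column totals; dividing by column totals would give the positive and negative predictive values instead.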


