Question

Plot logistic regression in RStudio:

The data set in the table records the spread of prostate cancer to the lymph nodes for 53 patients.
For a sample of prostate cancer patients, a set of possible predictor variables was measured before surgery to assess whether the lymph nodes were compromised. Each patient then underwent surgery, and the status of his lymph nodes was determined.
The data set contains 53 observations of 7 variables:
id: identifiers for each subject in the study.

ssln: takes the value of 1 if the cancer has spread to the lymph nodes and 0 if not.

age: a numeric vector containing the age of the patient at the time of diagnosis.

acid: a numeric vector containing the level of acid phosphatase in the blood (serum acid phosphatase, or prostatic acid phosphatase, PAP). High PAP levels may be associated with the presence of prostate cancer.

xray: a measure of the seriousness of the cancer obtained from a radiological examination.
A value of 1 represents a more serious case.

size: the size of the tumor as determined by palpation.
A value of 1 identifies a large tumor that can be palpated without difficulty.

grade: another measure of tumor seriousness, obtained from a pathologist's reading of a needle biopsy taken before surgery.
A value of 1 represents a more serious case.

Using RStudio, plot the logistic regression model associated with this exercise to explain which of the variables measured before surgery are associated with the spread of cancer to the lymph nodes.

Use the table below:

id ssln age acid xray size grade
1 0 66 0.48 0 0 0
2 0 68 0.56 0 0 0
3 0 66 0.5 0 0 0
4 0 56 0.52 0 0 0
5 0 58 0.5 0 0 0
6 0 60 0.49 0 0 0
7 0 65 0.46 1 0 0
8 0 60 0.62 1 0 0
9 1 50 0.56 0 0 1
10 0 49 0.55 1 0 0
11 0 61 0.62 0 0 0
12 0 58 0.71 0 0 0
13 0 51 0.65 0 0 0
14 1 67 0.67 1 0 1
15 0 67 0.47 0 0 1
16 0 51 0.49 0 0 0
17 0 56 0.5 0 0 1
18 0 60 0.78 0 0 0
19 0 52 0.83 0 0 0
20 0 56 0.98 0 0 0
21 0 67 0.52 0 0 0
22 0 63 0.75 0 0 0
23 1 59 0.99 0 0 1
24 0 64 1.87 0 0 0
25 1 61 1.36 1 0 0
26 1 56 0.82 0 0 0
27 0 64 0.4 0 1 1
28 0 61 0.5 0 1 0
29 0 64 0.5 0 1 1
30 0 63 0.4 0 1 0
31 0 52 0.55 0 1 1
32 0 66 0.59 0 1 1
33 1 58 0.48 1 1 0
34 1 57 0.51 1 1 1
35 1 65 0.49 0 1 0
36 0 65 0.48 0 1 1
37 0 59 0.63 1 1 1
38 0 61 1.02 0 1 0
39 0 53 0.76 0 1 0
40 0 67 0.95 0 1 0
41 0 53 0.66 0 1 1
42 1 65 0.84 1 1 1
43 1 50 0.81 1 1 1
44 1 60 0.76 1 1 1
45 1 45 0.7 0 1 1
46 1 56 0.78 1 1 1
47 1 46 0.7 0 1 0
48 1 67 0.67 0 1 0
49 1 63 0.82 0 1 0
50 1 57 0.67 0 1 1
51 1 51 0.72 1 1 0
52 1 64 0.89 1 1 0
53 1 68 1.26 1 1 1

Solutions

Expert Solution
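
One possible approach, offered as a minimal sketch in base R rather than a definitive answer: enter the 53 rows from the table as a data frame, fit a binomial `glm` of ssln on the five pre-surgery predictors, and draw the fitted probability of nodal spread against acid. The data frame name `nodal`, the helper `curve_for`, and the choice to plot against acid with age held at its sample mean and size = grade = 0 are assumptions made for illustration; other plots of the same model are equally valid.

```r
# Data transcribed from the question's table (53 patients).
# The name `nodal` is a hypothetical choice for this sketch.
nodal <- data.frame(
  id   = 1:53,
  ssln = c(0,0,0,0,0,0,0,0,1,0, 0,0,0,1,0,0,0,0,0,0, 0,0,1,0,1,1,0,0,0,0,
           0,0,1,1,1,0,0,0,0,0, 0,1,1,1,1,1,1,1,1,1, 1,1,1),
  age  = c(66,68,66,56,58,60,65,60,50,49, 61,58,51,67,67,51,56,60,52,56,
           67,63,59,64,61,56,64,61,64,63, 52,66,58,57,65,65,59,61,53,67,
           53,65,50,60,45,56,46,67,63,57, 51,64,68),
  acid = c(0.48,0.56,0.50,0.52,0.50,0.49,0.46,0.62,0.56,0.55,
           0.62,0.71,0.65,0.67,0.47,0.49,0.50,0.78,0.83,0.98,
           0.52,0.75,0.99,1.87,1.36,0.82,0.40,0.50,0.50,0.40,
           0.55,0.59,0.48,0.51,0.49,0.48,0.63,1.02,0.76,0.95,
           0.66,0.84,0.81,0.76,0.70,0.78,0.70,0.67,0.82,0.67,
           0.72,0.89,1.26),
  xray = c(0,0,0,0,0,0,1,1,0,1, 0,0,0,1,0,0,0,0,0,0, 0,0,0,0,1,0,0,0,0,0,
           0,0,1,1,0,0,1,0,0,0, 0,1,1,1,0,1,0,0,0,0, 1,1,1),
  size = rep(c(0, 1), times = c(26, 27)),
  grade = c(0,0,0,0,0,0,0,0,1,0, 0,0,0,1,1,0,1,0,0,0, 0,0,1,0,0,0,1,0,1,0,
            1,1,0,1,0,1,1,0,0,0, 1,1,1,1,1,1,0,0,0,1, 0,0,1)
)

# Logistic regression of nodal spread on all pre-surgery predictors.
fit <- glm(ssln ~ age + acid + xray + size + grade,
           data = nodal, family = binomial)

# Coefficient table (estimates, standard errors, Wald z tests).
print(summary(fit))

# Odds ratios: exponentiated coefficients.
print(round(exp(coef(fit)), 2))

# Fitted probability along a grid of acid values for a given xray value.
# Assumption for illustration: age at its sample mean, size = 0, grade = 0.
acid_grid <- seq(min(nodal$acid), max(nodal$acid), length.out = 100)
curve_for <- function(xray_value) {
  new <- data.frame(age = mean(nodal$age), acid = acid_grid,
                    xray = xray_value, size = 0, grade = 0)
  predict(fit, newdata = new, type = "response")
}

# Observed outcomes (0/1) with the two fitted curves overlaid.
plot(nodal$acid, nodal$ssln, pch = 16,
     xlab = "Serum acid phosphatase (acid)",
     ylab = "Probability of nodal spread (ssln)",
     main = "Logistic regression: ssln vs acid")
lines(acid_grid, curve_for(0), lty = 1)
lines(acid_grid, curve_for(1), lty = 2)
legend("right", legend = c("xray = 0", "xray = 1"), lty = c(1, 2))
```

In the `summary(fit)` output, predictors with small p-values in the `Pr(>|z|)` column are the ones most clearly associated with nodal spread, and the exponentiated coefficients can be read as odds ratios. With only 53 patients the estimates are imprecise, so the table is best read alongside the plot rather than on its own.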

  

  

