Parameter estimates:
Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept   14179.871    3298.484    4.30      0.0003*
InCostAid   0.7768753    0.254022    3.06      0.0058*
Baruch College has substantially less average debt compared to the other schools with similar in-state costs. Figure 10.13 contains JMP output for the simple linear regression of AveDebt on InCostAid with this case removed.
a) State the least-squares regression line.
b) The University of North Florida is one school in this sample. It has an in-state cost of $11,421 and average debt of $17,617. What is the residual?
c) Construct a 95% confidence interval for the slope. What does this interval tell you about the change in average debt for a $1000 change in the in-state cost?
ans:
a) Predicted AveDebt = 14179.871 + 0.7768753 × InCostAid
b) Predicted AveDebt = 14179.871 + 0.7768753 × 11421 = 23052.56
residual = observed - predicted = 17617 - 23052.56 = -5435.56
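The arithmetic for (a) and (b) can be checked with a short Python sketch. The coefficients and the University of North Florida values are taken from the output and the problem statement; the helper name `predict_ave_debt` is made up for illustration, and the residual is taken as observed minus predicted.

```python
# Coefficients from the JMP output (fit with Baruch College removed).
INTERCEPT = 14179.871
SLOPE = 0.7768753


def predict_ave_debt(in_cost_aid):
    """Predicted average debt from the least-squares line (hypothetical helper)."""
    return INTERCEPT + SLOPE * in_cost_aid


# University of North Florida: in-state cost and observed average debt.
observed = 17617
predicted = predict_ave_debt(11421)

# Residual = observed - predicted.
residual = observed - predicted

print(f"predicted = {predicted:.2f}")
print(f"residual  = {residual:.2f}")
```

The residual comes out negative (about -5435.56), which means this school's average debt is lower than the line predicts for its in-state cost.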
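c) 0.7768753 ± t* × 0.254022 = 0.7768753 ± 2.073873 × 0.254022 = (0.2500659, 1.3036847), where t* is the critical value for 22 degrees of freedom.
With 95% confidence, a $1000 increase in in-state cost is associated with an increase in average debt of between about $250 and $1304.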
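As a rough check of (c), the interval can be recomputed in Python. The estimate and standard error are from the JMP output and t* is the value quoted in the answer; the 22 degrees of freedom are an assumption (24 schools after removing Baruch College, minus 2 estimated parameters).

```python
# Slope estimate and standard error from the JMP output.
slope = 0.7768753
std_error = 0.254022

# Critical value t* quoted in the answer (assumed df = 24 - 2 = 22).
t_star = 2.073873

margin = t_star * std_error
lower, upper = slope - margin, slope + margin

print(f"95% CI for slope: ({lower:.4f}, {upper:.4f})")
# Scale both endpoints to a $1000 change in in-state cost.
print(f"Per $1000 of cost: (${1000 * lower:.0f}, ${1000 * upper:.0f})")
```

Scaling the endpoints by 1000 gives the change in average debt, in dollars, for a $1000 change in in-state cost: roughly $250 to $1304.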
Baruch College was removed from this analysis because it was deemed an outlier. Let’s investigate its impact on the fit.
A) Refit the model using the entire sample of 25 schools. Create a table that summarizes the model estimates with and without this case.
B) Describe the impact this observation has on the fit of the linear regression model.
C) If you were writing a report for publication, would you include the fit with or without this case? Explain your answer.
Need answer for A, B, C
Ans. Suppose the 25-observation sample contains the variables InCostAid (x) and average debt (y), and I want to run a regression of y on x in SPSS. Please look at my data first. Suppose the intercept is 14,180 in my case.
Next, here are the calculations of the predicted value of average debt (ŷ) and the residual for each observation. Please look at the data.
A). ans.
From the above image you can see that Error = Observed (y) - Predicted (y).
B) ans. To see the impact of the observations on the model, we need to look at the entire regression result. Please look at the image below:
From the first table in the result we see that R-square = 0.44.
So about 44% of the variation in average debt is explained by the fitted model.
From the ANOVA table we can see that the total variation (sum of squares) of average debt is 6657172.5, of which 2932067.36 is explained by InCostAid; the rest is residual (error) variation.
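The link between the ANOVA sums of squares and R-square can be verified with a few lines of Python. This is only a sketch using the two figures quoted from the ANOVA table; the variable names are illustrative.

```python
# Sums of squares quoted from the ANOVA table in the answer.
ss_total = 6657172.5
ss_regression = 2932067.36

# Whatever the regression does not explain is residual (error) variation.
ss_residual = ss_total - ss_regression

# R-square is the explained share of the total variation.
r_squared = ss_regression / ss_total

print(f"SS residual = {ss_residual:.2f}")
print(f"R-squared   = {r_squared:.2f}")
```

The ratio rounds to 0.44, which agrees with the R-square value reported in the model summary.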
Our null hypothesis in this regression is H0: Beta = 0
alternative Ha: Beta ≠ 0
The p-value in the coefficients table of the regression result (Sig. column) is 0.000, that is, less than 0.001, so we reject the null hypothesis that Beta = 0. The F test in the ANOVA table leads to the same conclusion.
Lastly from the residual statistics table we get the entire descriptive statistics for the residuals.
C) ans. The model has R-square = 0.44, so only 44% of the variation in average debt is explained by InCostAid. I would therefore look for other predictor variables that may help explain average debt and then rerun the regression. Adding relevant predictors should give a higher R-square.