In one of Boston’s public parks, mugging in summer months has been a serious issue. A police cadet took a random sample of 10 days and compiled the data. For each day, x represents the number of police officers on duty in the park and y represents the number of reported muggings on that day. A scatter plot of the data is provided below.
(a) What information can we learn from the above scatter plot?
(b) One wishes to use the linear model for this question. Please specify the theoretical linear model and the common assumptions.
(c) Based on the following SAS output, please write out the regression line.
(d) Can you predict y value when x = 30? Why or why not?
(e) Please construct a 95% confidence interval for the slope.
(f) Overall is this model useful? (Please set up hypotheses and report the result from SAS output).
(g) Find the sample correlation coefficient.
Answer:
(a) What information can we learn from the above scatter plot?
From the scatter plot, we can say that the two variables are related and that the trend is decreasing: days with more police officers on duty tend to have fewer reported muggings.
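(b) One wishes to use the linear model for this question. Please specify the theoretical linear model and the common assumptions.
The theoretical simple linear regression model is y = β0 + β1x + ε, where β0 is the intercept, β1 is the slope, and ε is a random error term.
The common assumptions are:
Linearity: the mean of y is a linear function of x.
Independence: the errors are independent of one another (no autocorrelation).
Normality: the errors are normally distributed with mean 0.
Homoscedasticity: the errors have the same variance for every value of x.
Multicollinearity is not a concern here because the model has only one predictor.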
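As a rough illustration of how the intercept and slope of a simple linear model are estimated, here is a minimal least-squares sketch. The function names `fit_line` and `residuals` are hypothetical helpers, not part of the SAS output, and no data from the question is assumed; the caller supplies the x and y samples.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (b0, b1): least-squares intercept and slope for y = b0 + b1*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


def residuals(xs, ys, b0, b1):
    """Observed y minus fitted y; used to check the error assumptions."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
```

Plotting the residuals against x is a common informal check of the linearity and equal-variance assumptions.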
(c) Based on the following SAS output, please write out the regression line.
From the output, the fitted regression line is ŷ = 9.78 - 4.48x.
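(d) Can you predict y value when x = 30? Why or why not?
When x = 30, the line gives y = 9.78 - 4.48(30) = -124.62. A negative number of muggings makes no sense, so this prediction is not valid. The value x = 30 is outside the scope of the model, and extrapolating the fitted line that far is unreliable.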
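The arithmetic for this prediction can be checked with a short sketch. The coefficients are the ones quoted in part (c); the names `B0`, `B1`, and `predict` are illustrative only.

```python
# Coefficients of the regression line as reported in part (c).
B0 = 9.78    # intercept
B1 = -4.48   # slope


def predict(x):
    """Evaluate the fitted line at x."""
    return B0 + B1 * x


y_hat = predict(30)
print(round(y_hat, 2))  # about -124.62, a negative count of muggings
```

A negative predicted count is a sign that the line is being used well beyond where it was fitted.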
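(e) Please construct a 95% confidence interval for the slope.
The 95% confidence interval for the slope is given by b1 ± t(0.025, n - 2) · SE(b1), where b1 is the estimated slope, SE(b1) is its standard error from the SAS output, and the t critical value has n - 2 = 8 degrees of freedom.
The confidence interval is (-5.26404, -3.695).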
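The reported interval can be sanity-checked with a small sketch of the usual formula, estimate ± t critical value × standard error. The standard error is not quoted in the answer, so here it is backed out from the interval itself, assuming the table value t(0.025, 8) ≈ 2.306; the names below are illustrative.

```python
# Interval endpoints as reported in the answer.
LOWER, UPPER = -5.26404, -3.695
# Assumed t critical value for 95% confidence with n - 2 = 8 degrees of freedom.
T_CRIT = 2.306


def slope_ci(b1, se, t_crit):
    """Confidence interval for the slope: b1 +/- t_crit * se."""
    margin = t_crit * se
    return b1 - margin, b1 + margin


# The midpoint of the interval is the slope estimate (about -4.48, as in part (c)).
b1 = (LOWER + UPPER) / 2
# Implied standard error, derived from the interval width rather than read from SAS.
se = (UPPER - LOWER) / (2 * T_CRIT)

lo, hi = slope_ci(b1, se, T_CRIT)
```

Both endpoints are negative, so the interval excludes 0, which is consistent with a significant negative slope.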
(f) Overall is this model useful? (Please set up hypotheses and report the result from SAS output).
H0: The slope of the regression line is 0 (the model does not explain y significantly).
Ha: The slope of the regression line is not 0 (the model explains y significantly).
Since the p-value of the F test is less than 0.05, we reject the null hypothesis and conclude that the overall model is significant.
(g) Find the sample correlation coefficient.
The correlation coefficient is r = -√(R²), where R² is the R-Square value in the SAS output.
(The negative sign is due to the decreasing trend.)
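In simple linear regression, the sample correlation coefficient has magnitude √(R²) and the same sign as the fitted slope. A minimal sketch of that conversion follows; `sample_correlation` is a hypothetical helper, and the R² value must be read from the SAS output, since it is not quoted in the answer.

```python
import math


def sample_correlation(r_squared, slope):
    """Return r = sign(slope) * sqrt(R^2) for a simple linear regression."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R-squared must lie between 0 and 1")
    # copysign gives the square root the sign of the slope.
    return math.copysign(math.sqrt(r_squared), slope)
```

With the negative slope from part (c), the result is the negative square root of whatever R² the output reports.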