Use the data on snowfall and mood to create a regression equation. Then, you will use this regression equation to make predictions about negative affect levels. Using your graphing calculator, create a regression equation with the following data:
Total Daily Snowfall | Negative Affect Score |
---|---|
5in | 10 |
3in | 4 |
10in | 24 |
12in | 30 |
8in | 12 |
10in | 39 |
24in | 50 |
36in | 50 |
20in | 47 |
Then, use this equation to predict Negative Affect Score for three different snowfall amounts not listed. Provide a 1-2 paragraph summary in which you address the following:
Present the regression equation.
Identify the three snowfall amounts you selected to predict Negative Affect Score for.
Identify the Negative Affect Score predicted for each.
Explain whether or not you agree with these predictions, and reflect on how appropriate regression is for these variables.
To describe the data, we first create a scatterplot to examine the relationship between Y and X, where
Y denotes the Negative Affect Score
X denotes the Total Daily Snowfall (in inches)
___________________________________________________________________________
# The following R code produces the scatterplot:
x=c(5,3,10,12,8,10,24,36,20)    # total daily snowfall (inches)
y=c(10,4,24,30,12,39,50,50,47)  # negative affect score
plot(x,y,main="Scatterplot of Mood on Snowfall",xlab="Total Daily Snowfall",ylab="Negative Affect Score")
___________________________________________________________________________
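[Figure: scatterplot of Negative Affect Score (Y) against Total Daily Snowfall (X)]
Interpretation: The plot suggests a logarithmic relationship between Y and X. Scores rise steeply at low snowfall amounts and level off near 50 for the largest amounts.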
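The logarithmic shape can be checked roughly by comparing how linear the data look before and after taking the log of X. The sketch below is a quick sanity check rather than part of the original analysis; the names `r_raw` and `r_log` are introduced here for illustration only.

```r
# Data from the question
x <- c(5, 3, 10, 12, 8, 10, 24, 36, 20)      # total daily snowfall (inches)
y <- c(10, 4, 24, 30, 12, 39, 50, 50, 47)    # negative affect score

# Pearson correlation of Y with X and with log(X)
r_raw <- cor(x, y)
r_log <- cor(log(x), y)

print(round(c(raw = r_raw, log = r_log), 3))
```

A clearly higher correlation on the log scale is consistent with the curved pattern in the scatterplot, although with only nine observations it is weak evidence on its own.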
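So, I have decided to fit the following no-intercept regression model,
$Y = \beta \log(X) + \varepsilon$
where $\beta$ is the regression coefficient on $\log(X)$, $\log$ is the natural logarithm (as computed by R's `log`), and $\varepsilon$ is the random error term.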
___________________________________________________________________________
#Using the following R code, we obtain the fitted regression equation and its summary output,
fit=lm(y~-1+I(log(x)))
summary(fit)
___________________________________________________________________________
Running this code on the data above gives a fitted regression equation of approximately $\hat{Y} = 13.09 \log(X)$.
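As the summary output shows, the adjusted R-squared is 0.922, so by this measure about 92.2% of the variability in Y is accounted for by log(X), which indicates a good fit. Note that for a model without an intercept, R computes R-squared about zero rather than about the mean of Y, so this value is not directly comparable to the R-squared of a model that includes an intercept.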
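The fit can be reproduced and cross-checked as in the sketch below. For a regression through the origin with a single predictor, the least-squares coefficient has the closed form sum(y * log(x)) / sum(log(x)^2), so the value returned by `lm` can be compared against it. The names `beta_hat`, `beta_manual`, and `adj_r2` are illustrative and do not come from the original answer.

```r
# Data from the question
x <- c(5, 3, 10, 12, 8, 10, 24, 36, 20)      # total daily snowfall (inches)
y <- c(10, 4, 24, 30, 12, 39, 50, 50, 47)    # negative affect score

# No-intercept regression of Y on log(X), as in the answer
fit <- lm(y ~ -1 + I(log(x)))

# Coefficient from lm and from the closed-form formula
beta_hat    <- unname(coef(fit))
beta_manual <- sum(y * log(x)) / sum(log(x)^2)

# Adjusted R-squared as reported by summary()
adj_r2 <- summary(fit)$adj.r.squared

print(round(c(beta_hat = beta_hat, beta_manual = beta_manual, adj_r2 = adj_r2), 3))
```

The two coefficient values should agree, and the adjusted R-squared should be close to the 0.922 quoted above.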
The three snowfall amounts chosen for predicting the Negative Affect Score are as follows:
The predicted Negative Affect Scores for these three snowfall amounts are obtained as follows:
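As an illustration of how such predictions can be computed, the sketch below uses `predict` on the fitted model. The snowfall amounts 6, 15, and 30 inches are hypothetical values picked here because they do not appear in the table; they are not necessarily the amounts used in the original answer. The names `dat`, `new_x`, and `pred` are also illustrative.

```r
# Data from the question, stored in a data frame so predict() can match columns
dat <- data.frame(
  x = c(5, 3, 10, 12, 8, 10, 24, 36, 20),    # total daily snowfall (inches)
  y = c(10, 4, 24, 30, 12, 39, 50, 50, 47)   # negative affect score
)

# No-intercept regression of Y on log(X)
fit <- lm(y ~ -1 + I(log(x)), data = dat)

# Hypothetical snowfall amounts (assumption: chosen only for illustration)
new_x <- c(6, 15, 30)

# Predicted negative affect scores at those amounts
pred <- predict(fit, newdata = data.frame(x = new_x))

print(data.frame(snowfall_in = new_x, predicted_score = round(pred, 2)))
```

All three amounts lie inside the observed range of 3 to 36 inches, so the predictions are interpolations rather than extrapolations.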
Yes, I agree with these predictions: the preceding analysis shows that the model fits the data well and that its coefficient is highly significant. The predictions are most trustworthy for snowfall amounts within the observed range of 3 to 36 inches.
To visualize where the predicted observations fall, we create a scatterplot that includes the above results.
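A minimal sketch of such a plot is given below. It overlays the fitted curve and the predicted points on the observed data. The snowfall amounts in `new_x` are hypothetical placeholders, not the values from the original answer, and the plotting choices (symbols, colors, legend position) are assumptions made for illustration.

```r
# Data from the question
x <- c(5, 3, 10, 12, 8, 10, 24, 36, 20)      # total daily snowfall (inches)
y <- c(10, 4, 24, 30, 12, 39, 50, 50, 47)    # negative affect score

# No-intercept regression of Y on log(X)
fit <- lm(y ~ -1 + I(log(x)))
b   <- unname(coef(fit))

# Hypothetical snowfall amounts and their predicted scores
new_x <- c(6, 15, 30)
pred  <- b * log(new_x)

# Observed data
plot(x, y,
     main = "Observed and Predicted Negative Affect Scores",
     xlab = "Total Daily Snowfall (in)",
     ylab = "Negative Affect Score",
     xlim = range(c(x, new_x)), ylim = range(c(y, pred)))

# Fitted curve over the observed range of X
grid_x <- seq(min(x), max(x), length.out = 200)
lines(grid_x, b * log(grid_x), lty = 2)

# Predicted points
points(new_x, pred, pch = 17, col = "red")

legend("bottomright",
       legend = c("Observed", "Predicted", "Fitted curve"),
       pch = c(1, 17, NA), lty = c(NA, NA, 2),
       col = c("black", "red", "black"))
```

Because the predicted points are computed from the fitted equation, they fall exactly on the dashed curve.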