data set will need at least four variables - at least two categorical and at least two quantitative. For example, you might consider the following variables for American participants in a survey: birth month (categorical), state of birth (categorical), average number of bowls of cereal eaten per week (quantitative), and amount spent on groceries (quantitative).
(a) First, formulate a research question relating to two of your quantitative variables along the lines of "how does *quantitative variable 1* relate to *quantitative variable 2*?" For example, you might ask "Does the average height for students relate to the average number of hours slept by students?" Include the question in your Word document.
(b) Create a least-squares regression line that answers the research question posed in part (a). Your answer here will be graded on the following: (i) an appropriate scatterplot related to the two variables, (ii) the correlation coefficient "r" and coefficient of determination "r²" between the two variables, (iii) a determination of whether the correlation coefficient is significant, and (iv) whether your line is correct (with slope and intercept) based on the data provided!
| Hours of sleep | Height (inches) |
|---|---|
| 4 | 62 |
| 5 | 65 |
| 6 | 65 |
| 6 | 62 |
| 7 | 63 |
| 7 | 67 |
| 7 | 60 |
| 7 | 74 |
| 7 | 64 |
| 7 | 63 |
| 8 | 73 |
| 8 | 62 |
| 8 | 66 |
| 8 | 70 |
| 8 | 72 |
| 8 | 69 |
| 8 | 63 |
| 9 | 60 |
| 9 | 67 |
| 10 | 73 |
## ANSWER:
a) The research question is:
Is there a significant linear correlation between the number of hours of sleep and height?
b)
i) scatter plot is attached above
ii) Correlation coefficient r = 0.3658
This indicates a weak positive linear correlation between the number of hours of sleep and the height of a person.
Coefficient of determination r² = 0.1338
This indicates that only 13.38% of the variability in y (height) is explained by x (hours of sleep).
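The values of r and r² can be checked directly from the data table. The following is a minimal sketch using only the Python standard library; the helper name `pearson_r` and the list names `hours` and `height` are chosen here for illustration and are not part of the original answer.

```python
import math

# Data copied from the table in the question
hours = [4, 5, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 10]
height = [62, 65, 65, 62, 63, 67, 60, 74, 64, 63,
          73, 62, 66, 70, 72, 69, 63, 60, 67, 73]

def pearson_r(x, y):
    """Pearson correlation: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(hours, height)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")  # r = 0.3658, r^2 = 0.1338
```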
iii) To test whether the correlation coefficient is significant:
Test statistic t = 1.6677
df = 18
p-value = 0.1127
Taking the significance level to be 0.05, which is smaller than the p-value,
we fail to reject the null hypothesis of no linear correlation (ρ = 0).
We conclude that there is no significant linear relationship between these two variables.
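The test statistic follows from t = r·√(n − 2) / √(1 − r²) with n − 2 = 18 degrees of freedom. The sketch below recomputes it from the data and approximates the two-sided p-value by integrating the Student t density with Simpson's rule. The numerical integration is a stand-in chosen here to avoid external libraries, not necessarily how the original figures were obtained, and the function names are illustrative.

```python
import math

hours = [4, 5, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 10]
height = [62, 65, 65, 62, 63, 67, 60, 74, 64, 63,
          73, 62, 66, 70, 72, 69, 63, 60, 67, 73]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(height) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(hours, height))
sxx = sum((a - mean_x) ** 2 for a in hours)
syy = sum((b - mean_y) ** 2 for b in height)
r = sxy / math.sqrt(sxx * syy)

df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density over [0, |t|].

    Uses Simpson's rule; steps must be even.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

p_value = two_sided_p(t_stat, df)
alpha = 0.05  # significance level assumed in the answer
reject_null = p_value < alpha
print(f"t = {t_stat:.4f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

If the sketch is correct, t and the p-value should agree with the figures reported above up to rounding, and `reject_null` should be `False`.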
iv) The fitted regression equation is
height = 57.3529 + 1.1765 × number of hours of sleep
intercept = 57.3529
slope = 1.1765
To test the regression model, the test statistic is F = 2.7812
p-value = 0.1127
Since the p-value is greater than 0.05, we conclude that the regression model is not significant.
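The slope, intercept, and F statistic can be reproduced with the usual least-squares formulas: slope = Sxy / Sxx, intercept = ȳ − slope·x̄, and F = SSR / (SSE / (n − 2)). This is a minimal sketch with illustrative variable names. In simple linear regression F equals t², which is why the p-value matches the one from the correlation test.

```python
hours = [4, 5, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 10]
height = [62, 65, 65, 62, 63, 67, 60, 74, 64, 63,
          73, 62, 66, 70, 72, 69, 63, 60, 67, 73]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(height) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(hours, height))
sxx = sum((a - mean_x) ** 2 for a in hours)
syy = sum((b - mean_y) ** 2 for b in height)  # total sum of squares

# Least-squares estimates
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# ANOVA decomposition for simple linear regression
ssr = slope * sxy          # regression sum of squares (1 df)
sse = syy - ssr            # residual sum of squares (n - 2 df)
f_stat = ssr / (sse / (n - 2))

print(f"height = {intercept:.4f} + {slope:.4f} * hours")
print(f"F = {f_stat:.4f} on (1, {n - 2}) df")
```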