A student wants to see if the number of times a book has been checked out of the library in the past year and the number of pages in the book are related. She guesses that the number of times a book has been checked out of the library in the past year will be a predictor of the number of pages it has. She randomly sampled 10 books from the library. Test her claim at the 0.01 level of significance.
number of times checked out | number of pages |
---|---|
4 | 277 |
31 | 303 |
15 | 242 |
12 | 382 |
25 | 329 |
22 | 263 |
37 | 361 |
43 | 323 |
44 | 278 |
31 | 350 |
The correlation coefficient:
r= ___ (round to 3 decimal places)
The equation y=a+bx is: (round to 3 decimal places)
y=___+__ x
The hypotheses are:
H0: ρ = 0 (no linear relationship)
HA: ρ ≠ 0 (linear relationship) (claim)
Since α is 0.01, the critical values are -3.355 and 3.355
The test value is: (round to 3 decimal places)
The p-value is: (round to 3 decimal places)
The decision is to
Thus the final conclusion sentence is
Solution :-
1). The correlation coefficient:
r= 0.182
To find this result, run the following code in Python.
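from scipy.stats import pearsonr
# Sample data: checkouts in the past year (x) and page counts (y)
times_checked=[4,31,15,12,25,22,37,43,44,31]
pages=[277,303,242,382,329,263,361,323,278,350]
# pearsonr returns (r, two-sided p-value); only r is needed here
corr,_= pearsonr(times_checked, pages)
print(corr)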
0.18231726381436067
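As a cross-check on the library call, r can also be computed directly from its definition, r = Sxy / sqrt(Sxx·Syy). The following is a minimal sketch using only the standard library; the names `s_xx`, `s_yy` and `s_xy` are illustrative and do not appear in the original solution.

```python
import math

# Data copied from the table in the question
times_checked = [4, 31, 15, 12, 25, 22, 37, 43, 44, 31]
pages = [277, 303, 242, 382, 329, 263, 361, 323, 278, 350]

n = len(times_checked)
mean_x = sum(times_checked) / n
mean_y = sum(pages) / n

# Sums of squares and cross-products about the sample means
s_xx = sum((x - mean_x) ** 2 for x in times_checked)
s_yy = sum((y - mean_y) ** 2 for y in pages)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(times_checked, pages))

# Pearson correlation coefficient from its definition
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))
```

The rounded value should agree with the 0.182 reported above.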
2). The equation y=a+bx is: (round to 3 decimal places)
y=294.258+ 0.627x
The values a = 294.258 (intercept) and b = 0.627 (slope) are the answer to this question. They are obtained by running a linear regression of pages on times checked out, which can be done with the following Python code.
import numpy as np
from sklearn.linear_model import LinearRegression
# scikit-learn expects the predictor as a 2-D column, hence reshape(-1,1)
times_checked=np.array([4,31,15,12,25,22,37,43,44,31]).reshape(-1,1)
pages=np.array([277,303,242,382,329,263,361,323,278,350])
model = LinearRegression().fit(times_checked, pages)
print('intercept:', model.intercept_)
print('slope:', model.coef_)
intercept: 294.257935516121
slope: [0.62659335]
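For a single predictor, the same coefficients follow from the least-squares formulas b = Sxy / Sxx and a = ȳ - b·x̄, so the regression can be verified by hand. Here is a small sketch under that assumption, with illustrative variable names that are not part of the original solution.

```python
# Data copied from the table in the question
times_checked = [4, 31, 15, 12, 25, 22, 37, 43, 44, 31]
pages = [277, 303, 242, 382, 329, 263, 361, 323, 278, 350]

n = len(times_checked)
mean_x = sum(times_checked) / n
mean_y = sum(pages) / n

# Sum of squares of x and cross-products of x and y about the means
s_xx = sum((x - mean_x) ** 2 for x in times_checked)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(times_checked, pages))

b = s_xy / s_xx          # slope
a = mean_y - b * mean_x  # intercept
print(round(a, 3), round(b, 3))
```

Rounded to three decimals, these should match the intercept and slope printed by the scikit-learn model.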
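3). The hypotheses to be tested in this case are as follows:
H0: ρ = 0 (no linear relationship)
HA: ρ ≠ 0 (linear relationship) (claim)
Since α = 0.01 and the test is two-tailed with n - 2 = 8 degrees of freedom, the critical values are -3.355 and 3.355.
Test statistic:
t = r·√(n - 2) / √(1 - r²) = 0.1823·√8 / √(1 - 0.1823²) ≈ 0.524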
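The test value and the critical value can be reproduced numerically. The sketch below assumes the usual t test for a correlation coefficient with n - 2 degrees of freedom and uses `scipy.stats.t.ppf` for the two-tailed critical value; the variable names are illustrative.

```python
import math
from scipy.stats import t

r = 0.18231726381436067  # correlation coefficient from step 1
n = 10                   # number of sampled books
alpha = 0.01             # significance level
df = n - 2               # degrees of freedom for the correlation t test

# Test statistic: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Two-tailed critical value: upper alpha/2 quantile of t with df degrees of freedom
t_crit = t.ppf(1 - alpha / 2, df)

print(round(t_stat, 3), round(t_crit, 3))
```

If |t_stat| is smaller than t_crit, the test value lies outside the rejection region.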
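4). Having found the test value t = 0.524, we can find the p-value. Because the test is two-tailed, the p-value is the probability that a t-distributed variable with n - 2 = 8 degrees of freedom is at least 0.524 in absolute value: p-value = 2·P(T ≥ 0.524) ≈ 0.614.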
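One way to obtain the p-value is from the upper tail of the t distribution, doubled for a two-tailed test. The sketch below assumes `scipy.stats.t.sf` for the tail probability and compares the result with the p-value that `pearsonr` returns as its second value (which the code in step 1 discards).

```python
from scipy.stats import pearsonr, t

# Data copied from the table in the question
times_checked = [4, 31, 15, 12, 25, 22, 37, 43, 44, 31]
pages = [277, 303, 242, 382, 329, 263, 361, 323, 278, 350]

# pearsonr returns the correlation and its two-sided p-value
r, p_from_pearsonr = pearsonr(times_checked, pages)

df = len(pages) - 2                      # n - 2 degrees of freedom
t_stat = r * (df / (1 - r ** 2)) ** 0.5  # same test value as in step 3

# Two-tailed p-value: twice the upper-tail probability beyond |t|
p_value = 2 * t.sf(abs(t_stat), df)

print(round(p_value, 3), round(p_from_pearsonr, 3))
```

The two printed values should agree, since both rest on the same null distribution for r.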
5). Hence the decision is to not reject H0.
We reach this decision because the p-value is far greater than the level of significance, i.e. 0.01.
Equivalently, we fail to reject the null hypothesis because the t-statistic falls between the critical values -3.355 and 3.355, that is, outside the rejection region.
6). Thus the final conclusion sentence is:
There is not enough evidence to support the claim that there is a linear relationship.
This is because we fail to reject the null hypothesis.