3.1. A researcher is interested in determining whether a new and cheaper method of analysis to determine the fat content of small organisms works as well as the older method used for many years. She has decided to split 15 samples and determine fat content in the two halves using the two methods. She has decided that simple linear regression will be the best method of analysis because she can test the hypothesis that the two methods give a 1:1 relationship (slope = 1) over the working range of the methods. The data collected are (mg fat/g tissue):
Sample# | Old Method (mg/g) | New Method (mg/g) |
1 | 18.6 | 18.1 |
2 | 5.9 | 6.1 |
3 | 19.1 | 19.6 |
4 | 23.2 | 20.1 |
5 | 12.4 | 8.1 |
6 | 14.2 | 15.2 |
7 | 18.3 | 15.6 |
8 | 15.6 | 13.5 |
9 | 14.8 | 10.9 |
10 | 27.6 | 27.8 |
11 | 13.6 | 12.1 |
12 | 17.5 | 19.2 |
13 | 26.4 | 24.6 |
14 | 6.4 | 4.5 |
15 | 9.2 | 8.1 |
A). Create a well labelled xy scatter plot (where x = old method) on the spreadsheet page with the data; make note of the general pattern of the plotted data and whether there are any suspect data points.
B). Use the advanced regression analysis tool to conduct a complete simple linear regression analysis for the data. You will not require the options for "residuals" for this analysis. Remember to complete the seven steps of hypothesis testing in the spaces provided on the worksheet.
C). Use the linear regression equation determined in part "B" to calculate a set of "predicted y values" for each observed x value.
D). What are the predicted y values (ie predicted “fat content using the new method”) for the following x values? a). 11.3 mg/g b). 17.2 mg/g c). 21.5 mg/g d). 148.4 mg/g
E). Whether the regression from question part “B” is significant or not, use the t-test for slope to test the hypothesis that the slope = 1.0. Remember to complete the five steps of hypothesis testing in the spaces provided on the worksheet. What does this analysis tell you about the two methods?
**please show how you got the answers so that I can learn and understand, as well as each of the steps for the seven steps of hypothesis testing so that I can see each step clearly **
Ans A) The scatter plot of the data (x = old method, y = new method) shows the following pattern:
There is a positive linear relationship between the fat content measured by the new method and by the old method. There are no obviously suspect data points.
B) Using Excel > Data > Data Analysis > Regression,
we have
SUMMARY OUTPUT
Regression Statistics
Statistic | Value |
Multiple R | 0.964485 |
R Square | 0.930232 |
Adjusted R Square | 0.924866 |
Standard Error | 1.857378 |
Observations | 15 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 597.9719 | 597.9719 | 173.3325 | 6.82E-09 |
Residual | 13 | 44.8481 | 3.449854 | | |
Total | 14 | 642.82 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | -1.50844 | 1.335397 | -1.12958 | 0.279067 | -4.39339 | 1.376516 |
Old Method (mg/g) | 1.013701 | 0.076996 | 13.16558 | 6.82E-09 | 0.84736 | 1.180041 |
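For readers without Excel, the coefficients and regression statistics in the output can be cross-checked by hand with the usual least-squares formulas. The following is a minimal Python sketch using only the standard library; the helper name `fit_line` and the dictionary keys are illustrative choices, not part of the original answer.

```python
import math

# Data from the question (mg fat / g tissue)
old = [18.6, 5.9, 19.1, 23.2, 12.4, 14.2, 18.3, 15.6,
       14.8, 27.6, 13.6, 17.5, 26.4, 6.4, 9.2]
new = [18.1, 6.1, 19.6, 20.1, 8.1, 15.2, 15.6, 13.5,
       10.9, 27.8, 12.1, 19.2, 24.6, 4.5, 8.1]


def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1*x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Corrected sums of squares and cross-products
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_resid = syy - slope * sxy        # residual sum of squares
    s = math.sqrt(ss_resid / (n - 2))   # "Standard Error" in the Excel output
    return {
        "slope": slope,
        "intercept": intercept,
        "r_square": 1 - ss_resid / syy,
        "std_error": s,
        "se_slope": s / math.sqrt(sxx),
        "se_intercept": s * math.sqrt(1 / n + mean_x ** 2 / sxx),
    }


fit = fit_line(old, new)
for name, value in fit.items():
    print(f"{name}: {value:.6f}")
```

The printed values should agree with the Coefficients, Standard Error, and R Square entries of the Excel output to within rounding.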
B). Use the advanced regression analysis tool to conduct a complete simple linear regression analysis for the data.
The null and alternative hypotheses are:
Ho: there is no linear relationship between the old and new methods (slope = 0)
Ha: there is a linear relationship between the old and new methods (slope ≠ 0)
Test statistic: F = MS Regression / MS Residual = 597.9719 / 3.449854 = 173.33, with 1 and 13 df
p-value (Significance F) = 6.82E-09
Since the p-value is less than 0.05, we reject Ho and conclude that there is a significant linear relationship between the old and new methods.
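The F statistic in the ANOVA table comes from splitting the total variation in y into a regression part and a residual part. The sketch below reproduces that decomposition in plain Python. The critical value `F_CRIT = 4.667` is an assumption taken from a standard F table (upper 5% point for 1 and 13 df), not something computed by the code, and the variable names are illustrative.

```python
# Data from the question (mg fat / g tissue)
old = [18.6, 5.9, 19.1, 23.2, 12.4, 14.2, 18.3, 15.6,
       14.8, 27.6, 13.6, 17.5, 26.4, 6.4, 9.2]
new = [18.1, 6.1, 19.6, 20.1, 8.1, 15.2, 15.6, 13.5,
       10.9, 27.8, 12.1, 19.2, 24.6, 4.5, 8.1]

n = len(old)
mean_x = sum(old) / n
mean_y = sum(new) / n

# Least-squares slope and intercept
sxx = sum((x - mean_x) ** 2 for x in old)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(old, new))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# ANOVA decomposition: SS Total = SS Regression + SS Residual
predicted = [intercept + slope * x for x in old]
ss_total = sum((y - mean_y) ** 2 for y in new)
ss_resid = sum((y - p) ** 2 for y, p in zip(new, predicted))
ss_regr = ss_total - ss_resid

df_regr, df_resid = 1, n - 2
ms_regr = ss_regr / df_regr
ms_resid = ss_resid / df_resid
f_stat = ms_regr / ms_resid

# Assumption: critical value read from a standard F table (alpha = 0.05, df = 1 and 13)
F_CRIT = 4.667
reject_h0 = f_stat > F_CRIT

print(f"SS regression = {ss_regr:.4f}, SS residual = {ss_resid:.4f}, SS total = {ss_total:.4f}")
print(f"F = {f_stat:.4f} on {df_regr} and {df_resid} df; reject Ho: {reject_h0}")
```

Comparing F with a tabulated critical value leads to the same decision as comparing the p-value with 0.05; the exact p-value itself needs the F distribution, which Excel reports as Significance F.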
C). Use the linear regression equation determined in part "B" to calculate a set of "predicted y values" for each observed x value. (2 marks)
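The fitted equation from part B is: predicted new method = -1.50844 + 1.013701 × (old method). Applying it to each observed x value gives:
Old Method (mg/g) | New Method (mg/g) | Predicted New Method (mg/g) |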
18.6 | 18.1 | 17.3464 |
5.9 | 6.1 | 4.472396 |
19.1 | 19.6 | 17.85325 |
23.2 | 20.1 | 22.00942 |
12.4 | 8.1 | 11.06145 |
14.2 | 15.2 | 12.88611 |
18.3 | 15.6 | 17.04229 |
15.6 | 13.5 | 14.3053 |
14.8 | 10.9 | 13.49433 |
27.6 | 27.8 | 26.46971 |
13.6 | 12.1 | 12.27789 |
17.5 | 19.2 | 16.23133 |
26.4 | 24.6 | 25.25327 |
6.4 | 4.5 | 4.979246 |
9.2 | 8.1 | 7.817609 |
D). What are the predicted y values (ie predicted “fat content using the new method”) for the following x values?
(4 marks) a). 11.3 mg/g b). 17.2 mg/g c). 21.5 mg/g d). 148.4 mg/g
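Part | Old method x (mg/g) | Predicted new method (mg/g) |
a | 11.3 | 9.946381 |
b | 17.2 | 15.92722 |
c | 21.5 | 20.28613 |
d | 148.4 | 148.9248 |
Note that x = 148.4 mg/g lies far outside the observed range of the old method (5.9 to 27.6 mg/g), so prediction d is an extrapolation and should not be relied on.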
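Each prediction is just the fitted equation evaluated at the given x. As a minimal sketch, the calculation can be written in Python as below. The coefficients are copied from the Excel output in part B, while the function name `predict_new` and the range check are illustrative additions.

```python
# Coefficients copied from the Excel regression output (part B)
INTERCEPT = -1.50844
SLOPE = 1.013701

# Smallest and largest old-method values in the data set
X_MIN, X_MAX = 5.9, 27.6


def predict_new(old_value):
    """Predicted new-method fat content (mg/g) for a given old-method value."""
    return INTERCEPT + SLOPE * old_value


for label, x in [("a", 11.3), ("b", 17.2), ("c", 21.5), ("d", 148.4)]:
    # Flag x values outside the range the regression was fitted on
    note = "" if X_MIN <= x <= X_MAX else "  (extrapolation beyond the observed x range)"
    print(f"{label}) x = {x:6.1f} -> predicted y = {predict_new(x):.4f}{note}")
```

The range check is there because a regression line is only supported by data within the range of x values used to fit it.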
E). Whether the regression from question part “B” is significant or not,
The regression is significant, as shown in part B.
use the t-test for slope to test the hypothesis that the slope = 1.0.
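The null and alternative hypotheses are:
Ho: b1 = 1
Ha: b1 ≠ 1
t = (b1 - 1) / SE(b1) = (1.013701 - 1) / 0.076996 = 0.178
Critical t (two-tailed, alpha = 0.05, 13 df) = 2.160
Since |t| = 0.178 < 2.160, we do not reject Ho: there is no evidence that the slope differs from 1.0. This agrees with the 95% confidence interval for the slope (0.84736 to 1.180041), which contains 1.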
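The test of slope = 1 uses the same t formula as Excel's "t Stat" column, except that the hypothesized value is 1 instead of 0. Here is a minimal Python sketch of that calculation. The slope and its standard error are copied from the Excel output. The critical value `T_CRIT = 2.160` is an assumption taken from a standard t table (two-tailed, alpha = 0.05, 13 df); it matches the value implied by the Upper 95% limit in the output, (1.180041 - 1.013701) / 0.076996. The function name is illustrative.

```python
# Values copied from the Excel regression output (part B)
SLOPE = 1.013701
SE_SLOPE = 0.076996
N = 15

HYPOTHESIZED_SLOPE = 1.0

# Assumption: critical value read from a standard t table
# (two-tailed, alpha = 0.05, df = N - 2 = 13)
T_CRIT = 2.160


def slope_t_statistic(b1, se_b1, beta0):
    """t statistic for Ho: slope = beta0."""
    return (b1 - beta0) / se_b1


df = N - 2
t_stat = slope_t_statistic(SLOPE, SE_SLOPE, HYPOTHESIZED_SLOPE)
reject_h0 = abs(t_stat) > T_CRIT

# Equivalent check: does the 95% confidence interval for the slope contain 1?
ci_low = SLOPE - T_CRIT * SE_SLOPE
ci_high = SLOPE + T_CRIT * SE_SLOPE

print(f"t = {t_stat:.3f} on {df} df, critical value = {T_CRIT}")
print(f"Reject Ho (slope = 1): {reject_h0}")
print(f"95% CI for slope: {ci_low:.4f} to {ci_high:.4f}")
```

Setting `HYPOTHESIZED_SLOPE` to 0 instead gives the ordinary "t Stat" that Excel reports for the slope.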
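This analysis tells us that there is a significant linear relationship between the old and new methods, and that about 93.02% of the variation in the new-method values is explained by the old-method values through the regression line. Because the slope does not differ significantly from 1 (and the intercept, -1.50844, does not differ significantly from 0, p = 0.279), the data are consistent with a 1:1 relationship between the two methods over the working range.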