A | B | C | D | E | RESPONSE |
-1 | -1 | -1 | -1 | 1 | 38.9 |
1 | -1 | -1 | -1 | -1 | 35.3 |
-1 | 1 | -1 | -1 | -1 | 36.7 |
1 | 1 | -1 | -1 | 1 | 45.5 |
-1 | -1 | 1 | -1 | -1 | 35.3 |
1 | -1 | 1 | -1 | 1 | 37.8 |
-1 | 1 | 1 | -1 | 1 | 44.3 |
1 | 1 | 1 | -1 | -1 | 34.8 |
-1 | -1 | -1 | 1 | -1 | 34.4 |
1 | -1 | -1 | 1 | 1 | 38.4 |
-1 | 1 | -1 | 1 | 1 | 43.5 |
1 | 1 | -1 | 1 | -1 | 35.6 |
-1 | -1 | 1 | 1 | 1 | 37.1 |
1 | -1 | 1 | 1 | -1 | 33.8 |
-1 | 1 | 1 | 1 | -1 | 36 |
1 | 1 | 1 | 1 | 1 | 44.9 |
Determine if there are any two-factor interactions. Determine and illustrate the combinations from any interactions that give the greatest results.
# The analysis below is done in R, with each step explained in the comments.
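# The answer does not show how df was created. A minimal sketch, assuming the
# table from the question is typed in by hand (the original may instead have
# read it from a file):

```r
# Hypothetical reconstruction of `df`; all values are copied from the table
# in the question. Integer literals are used so that str(df) reports "int".
df <- data.frame(
  A = rep(c(-1L, 1L), times = 8),
  B = rep(rep(c(-1L, 1L), each = 2), times = 4),
  C = rep(rep(c(-1L, 1L), each = 4), times = 2),
  D = rep(c(-1L, 1L), each = 8),
  E = c(1L, -1L, -1L, 1L, -1L, 1L, 1L, -1L,
        -1L, 1L, 1L, -1L, 1L, -1L, -1L, 1L),
  RESPONSE = c(38.9, 35.3, 36.7, 45.5, 35.3, 37.8, 44.3, 34.8,
               34.4, 38.4, 43.5, 35.6, 37.1, 33.8, 36.0, 44.9)
)
```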
str(df)
'data.frame': 16 obs. of 6 variables:
$ A : int -1 1 -1 1 -1 1 -1 1 -1 1 ...
$ B : int -1 -1 1 1 -1 -1 1 1 -1 -1 ...
$ C : int -1 -1 -1 -1 1 1 1 1 -1 -1 ...
$ D : int -1 -1 -1 -1 -1 -1 -1 -1 1 1 ...
$ E : int 1 -1 -1 1 -1 1 1 -1 -1 1 ...
$ RESPONSE: num 38.9 35.3 36.7 45.5 35.3 37.8 44.3 34.8 34.4 38.4 ...
# Now we will make a copy of the data for future use (the numeric -1/+1 columns are needed later to compute the B*E product).
copy.df <- df
# Now making factors
df$A <- as.factor(df$A)
df$B <- as.factor(df$B)
df$C <- as.factor(df$C)
df$D <- as.factor(df$D)
df$E <- as.factor(df$E)
str(df)
'data.frame': 16 obs. of 6 variables:
$ A : Factor w/ 2 levels "-1","1": 1 2 1 2 1 2 1 2 1 2 ...
$ B : Factor w/ 2 levels "-1","1": 1 1 2 2 1 1 2 2 1 1 ...
$ C : Factor w/ 2 levels "-1","1": 1 1 1 1 2 2 2 2 1 1 ...
$ D : Factor w/ 2 levels "-1","1": 1 1 1 1 1 1 1 1 2 2 ...
$ E : Factor w/ 2 levels "-1","1": 2 1 1 2 1 2 2 1 1 2 ...
$ RESPONSE: num 38.9 35.3 36.7 45.5 35.3 37.8 44.3 34.8 34.4 38.4 ...
********************************************************************************************
# Now we will perform an analysis of variance for each possible two-factor interaction.
anova(lm(RESPONSE~A*B,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
A 1 0.001 0.001 0.0000 0.99501
B 1 57.381 57.381 3.7438 0.07693 .
A:B 1 0.031 0.031 0.0020 0.96508
Residuals 12 183.922 15.327
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
********************************************************************************************
anova(lm(RESPONSE~A*C,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
A 1 0.001 0.0006 0.0000 0.9956
C 1 1.156 1.1556 0.0578 0.8140
A:C 1 0.456 0.4556 0.0228 0.8825
Residuals 12 239.723 19.9769
********************************************************************************************
anova(lm(RESPONSE~A*D,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
A 1 0.001 0.0006 0.0000 0.9956
D 1 1.501 1.5006 0.0753 0.7884
A:D 1 0.766 0.7656 0.0384 0.8479
Residuals 12 239.067 19.9223
********************************************************************************************
anova(lm(RESPONSE~A*E,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
A 1 0.001 0.001 0.0001 0.9929554
E 1 147.016 147.016 19.1162 0.0009087 ***
A:E 1 2.031 2.031 0.2640 0.6166888
Residuals 12 92.288 7.691
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
********************************************************************************************
anova(lm(RESPONSE~B*C,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
B 1 57.381 57.381 3.7705 0.07601 .
C 1 1.156 1.156 0.0759 0.78757
B:C 1 0.181 0.181 0.0119 0.91505
Residuals 12 182.617 15.218
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
********************************************************************************************
anova(lm(RESPONSE~B*D,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
B 1 57.381 57.381 3.7808 0.07566 .
D 1 1.501 1.501 0.0989 0.75858
B:D 1 0.331 0.331 0.0218 0.88511
Residuals 12 182.122 15.177
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
********************************************************************************************
anova(lm(RESPONSE~B*E,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
B 1 57.381 57.381 91.717 5.702e-07 ***
E 1 147.016 147.016 234.990 3.031e-09 ***
B:E 1 29.431 29.431 47.042 1.750e-05 ***
Residuals 12 7.507 0.626
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
********************************************************************************************
anova(lm(RESPONSE~C*D,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
C 1 1.156 1.1556 0.0584 0.8132
D 1 1.501 1.5006 0.0758 0.7878
C:D 1 1.051 1.0506 0.0531 0.8217
Residuals 12 237.628 19.8023
********************************************************************************************
anova(lm(RESPONSE~C*E,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
C 1 1.156 1.156 0.1489 0.7063867
E 1 147.016 147.016 18.9367 0.0009421 ***
C:E 1 0.001 0.001 0.0001 0.9929886
Residuals 12 93.163 7.764
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
********************************************************************************************
anova(lm(RESPONSE~D*E,data=df))
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
D 1 1.501 1.501 0.1940 0.6674214
E 1 147.016 147.016 19.0081 0.0009286 ***
D:E 1 0.006 0.006 0.0007 0.9789286
Residuals 12 92.813 7.734
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# The possible two-factor interactions are
# AB, AC, AD, AE
# BC, BD, BE
# CD, CE
# DE
# Using the p-value, which is Pr(>F) in each table, we evaluate each interaction on the following basis.
# H0: The interaction effect does not exist.
# H1: The interaction effect does exist.
# If p-value < 0.05, we reject the null hypothesis.
# If p-value >= 0.05, we fail to reject the null hypothesis.
# In the table corresponding to the interaction B*E we see that p-value = 0.0000175 < 0.05.
# Hence we can reject the null hypothesis and conclude that the B*E interaction exists.
# For every other interaction the p-value is well above 0.05, so there is no evidence that those interactions exist.
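# The ten separate anova() calls above can also be run in one loop. This is
# only a sketch of the same pairwise screening, not a different method; the
# names factors, pair_list and p_int are made up for this illustration.

```r
# Data copied from the table in the question.
df <- data.frame(
  A = rep(c(-1, 1), times = 8),
  B = rep(rep(c(-1, 1), each = 2), times = 4),
  C = rep(rep(c(-1, 1), each = 4), times = 2),
  D = rep(c(-1, 1), each = 8),
  E = c(1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1),
  RESPONSE = c(38.9, 35.3, 36.7, 45.5, 35.3, 37.8, 44.3, 34.8,
               34.4, 38.4, 43.5, 35.6, 37.1, 33.8, 36.0, 44.9)
)
factors <- c("A", "B", "C", "D", "E")
df[factors] <- lapply(df[factors], as.factor)

# All 10 unordered pairs of factors, one pair per column.
pair_list <- combn(factors, 2)

# For each pair, fit RESPONSE ~ X * Y and extract the interaction p-value.
p_int <- apply(pair_list, 2, function(p) {
  fit <- lm(reformulate(paste(p, collapse = "*"), response = "RESPONSE"),
            data = df)
  anova(fit)[paste(p, collapse = ":"), "Pr(>F)"]
})
names(p_int) <- apply(pair_list, 2, paste, collapse = ":")

# Interaction p-values, smallest first.
print(round(sort(p_int), 5))
```

# Only the interactions with a p-value below 0.05 are carried forward.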
********************************************************************************************
# Now we add a variable BE to the data to represent the B*E interaction (the product of the -1/+1 codes of B and E).
df$BE <- copy.df$B*copy.df$E
df$BE <- as.factor(df$BE)
str(df)
'data.frame': 16 obs. of 7 variables:
$ A : Factor w/ 2 levels "-1","1": 1 2 1 2 1 2 1 2 1 2 ...
$ B : Factor w/ 2 levels "-1","1": 1 1 2 2 1 1 2 2 1 1 ...
$ C : Factor w/ 2 levels "-1","1": 1 1 1 1 2 2 2 2 1 1 ...
$ D : Factor w/ 2 levels "-1","1": 1 1 1 1 1 1 1 1 2 2 ...
$ E : Factor w/ 2 levels "-1","1": 2 1 1 2 1 2 2 1 1 2 ...
$ RESPONSE: num 38.9 35.3 36.7 45.5 35.3 37.8 44.3 34.8 34.4 38.4 ...
$ BE : Factor w/ 2 levels "-1","1": 1 2 1 2 2 1 2 1 2 1 ...
********************************************************************************************
# Now we will use the lm function to fit a model with B, E, and the BE interaction.
model <- lm(RESPONSE~B+E+BE, data=df)
summary(model)
Call:
lm(formula = RESPONSE ~ B + E + BE, data = df)
Residuals:
Min 1Q Median 3Q Max
-1.050 -0.450 0.025 0.600 0.950
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 31.9875 0.3955 80.882 < 2e-16 ***
B1 3.7875 0.3955 9.577 5.70e-07 ***
E1 6.0625 0.3955 15.329 3.03e-09 ***
BE1 2.7125 0.3955 6.859 1.75e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.791 on 12 degrees of freedom
Multiple R-squared: 0.9689, Adjusted R-squared: 0.9611
F-statistic: 124.6 on 3 and 12 DF, p-value: 2.622e-09
# R-squared is very high (0.9689): B, E, and their interaction explain about 97% of the variation in the response, so the model fits the data well.
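# Because B, E and BE are factors, the coefficients above are for 0/1
# indicators, not for the -1/+1 codes in the data table. As a cross-check,
# here is a sketch of the same model fitted on the numeric -1/+1 codes; the
# names d and coded are assumptions made for this illustration.

```r
# B, E and RESPONSE copied from the table in the question.
d <- data.frame(
  B = rep(rep(c(-1, 1), each = 2), times = 4),
  E = c(1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1),
  RESPONSE = c(38.9, 35.3, 36.7, 45.5, 35.3, 37.8, 44.3, 34.8,
               34.4, 38.4, 43.5, 35.6, 37.1, 33.8, 36.0, 44.9)
)

# B * E expands to B + E + B:E; with numeric columns, B:E is the product.
coded <- lm(RESPONSE ~ B * E, data = d)
print(coef(coded))

# Predicted response at the high level of both factors.
pred_high <- predict(coded, newdata = data.frame(B = 1, E = 1))
print(pred_high)
```

# With -1/+1 coding the intercept is the overall mean and each slope is half
# of the corresponding indicator coefficient. The fitted values, F tests and
# R-squared are the same in both codings.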
# You can also use the following anova table for the sums of squares and other statistics.
anova(model)
Analysis of Variance Table
Response: RESPONSE
Df Sum Sq Mean Sq F value Pr(>F)
B 1 57.381 57.381 91.717 5.702e-07 ***
E 1 147.016 147.016 234.990 3.031e-09 ***
BE 1 29.431 29.431 47.042 1.750e-05 ***
Residuals 12 7.507 0.626
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
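# You can see that the residual Mean Sq (0.626) is very small compared with the Mean Sq of B, E, and BE, hence the model fits very well.
# Because B, E, and BE were converted to factors, each term below is a 0/1 indicator (1 when the level is "1", 0 when it is "-1"):
Response = 31.9875 + 3.7875*[B = 1] + 6.0625*[E = 1] + 2.7125*[BE = 1]
# Predicted response for each combination of B and E:
# B = -1, E = -1 (BE = 1): 31.9875 + 2.7125 = 34.70
# B = -1, E = 1 (BE = -1): 31.9875 + 6.0625 = 38.05
# B = 1, E = -1 (BE = -1): 31.9875 + 3.7875 = 35.775
# B = 1, E = 1 (BE = 1): 31.9875 + 3.7875 + 6.0625 + 2.7125 = 44.55
# Hence the greatest response (about 44.55) is obtained with both B and E at their high level (+1).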
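# To illustrate the B*E interaction, the mean response for each combination
# of B and E can be tabulated and plotted. A minimal sketch; the names d,
# cell_means and best are made up for this illustration.

```r
# B, E and RESPONSE copied from the table in the question.
d <- data.frame(
  B = rep(rep(c(-1, 1), each = 2), times = 4),
  E = c(1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1),
  RESPONSE = c(38.9, 35.3, 36.7, 45.5, 35.3, 37.8, 44.3, 34.8,
               34.4, 38.4, 43.5, 35.6, 37.1, 33.8, 36.0, 44.9)
)

# Mean response in each of the four B x E cells (4 runs per cell).
cell_means <- aggregate(RESPONSE ~ B + E, data = d, FUN = mean)
print(cell_means)

# The combination with the greatest mean response.
best <- cell_means[which.max(cell_means$RESPONSE), ]
print(best)

# Interaction plot: non-parallel lines indicate an interaction.
interaction.plot(x.factor = d$B, trace.factor = d$E, response = d$RESPONSE,
                 xlab = "B", ylab = "Mean RESPONSE", trace.label = "E")
```

# If the two lines in the plot are clearly not parallel, the effect of B
# depends on the level of E, which is what the significant B:E term says.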