The variance in a production process is an important measure of the quality of the process. A large variance often signals an opportunity for improvement in the process by finding ways to reduce the process variance. The following sample data show the weight of bags (in pounds) produced on two machines: machine 1 and 2.
m1 = (2.95, 3.45, 3.50, 3.75, 3.48, 3.26, 3.33, 3.20, 3.16, 3.20, 3.22, 3.38, 3.90, 3.36, 3.25, 3.28, 3.20, 3.22, 2.98, 3.45, 3.70, 3.34, 3.18, 3.35, 3.12)
m2 = (3.22, 3.30, 3.34, 3.28, 3.29, 3.25, 3.30, 3.27, 3.38, 3.34, 3.35, 3.19, 3.35, 3.05, 3.36, 3.28, 3.30, 3.28, 3.30, 3.20, 3.16, 3.33)
A) Provide descriptive statistical summaries of the data for each machine; in particular, the sample variance and the sample size for each machine.
Please copy your R code and the result and paste them here.
B) Conduct a statistical test to determine whether there is a significant difference between the variances in the bag weights for the two machines. First, clearly formulate your hypotheses below.
C) Compute the test statistic.
Please copy your R code and the result and paste them here.
D) Compute the p value.
Please copy your R code and the result and paste them here.
E) Use a .05 level of significance to compute both critical values for your test statistic.
Please copy your R code and the result and paste them here.
F) Use a .05 level of significance. What is your conclusion?
G) Use the function var.test() in R to run the test directly to confirm your results above are correct.
Please copy your R code and the result and paste them here.
H) Construct a 95% confidence interval for the variance of the weight of bags produced on machine 1.
Please copy your R code and the result and paste them here.
I) Construct a 95% confidence interval for the standard deviation of the weight of bags produced on machine 2.
Please copy your R code and the result and paste them here.
J) Which machine, if either, provides the greater opportunity for quality improvements?
A) R code to print descriptive summaries
#create the sample data
m1<-c(2.95,3.45,3.50,3.75,3.48,3.26,3.33,3.20,3.16,3.20,3.22,3.38,3.90,3.36,3.25,3.28,3.20,3.22,2.98,3.45,3.70,3.34,3.18,3.35,3.12)
m2<-c(3.22,3.30,3.34,3.28,3.29,3.25,3.30,3.27,3.38,3.34,3.35,3.19,3.35,3.05,3.36,3.28,3.30,3.28,3.30,3.20,3.16,3.33)
#a) descriptive summaries
summary(m1)
summary(m2)
#get the sample sizes
n1<-length(m1)
n2<-length(m2)
#get the sample variances
v1<-var(m1)
v2<-var(m2)
print(paste("The sample size of m1 is",n1,"and the sample variance is",round(v1,4)))
print(paste("The sample size of m2 is",n2,"and the sample variance is",round(v2,4)))
#--get the following output
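B) Let $\sigma_1^2$ be the true variance in bag weights for machine 1 and $\sigma_2^2$ be the true variance in bag weights for machine 2.
We want to conduct a statistical test to determine whether there is a significant difference between the variances in the bag weights for the two machines. This means we want to test whether $\sigma_1^2 \neq \sigma_2^2$.
The hypotheses that we want to test are
$H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \neq \sigma_2^2$
C) The test statistic is the ratio of the sample variances, $F = s_1^2 / s_2^2$, which follows an F distribution with n1-1 and n2-1 degrees of freedom when the null hypothesis is true and the weights are normally distributed.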
R-code
#c) test statistic
f<-v1/v2
print(paste("The test statistic is F=",round(f,4),sep=""))
## get the following output
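D) This is a two-tailed test, so the p-value combines the area in both tails. The F distribution is not symmetric, so the two tail areas P(F<1/8.2844) and P(F>8.2844) are not equal in general; the usual convention is to double the smaller tail area.
The observed F=8.2844 lies in the upper tail, so the p-value is 2*P(F>8.2844). The numerator df is n1-1 = 25-1 = 24
and the denominator df is n2-1 = 22-1 = 21.
R code is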
#d) p-value
pvalue<-2*pf(f,df1=n1-1,df2=n2-1,lower.tail=FALSE)
print(paste("The p-value is ",round(pvalue,9),sep=""))
#get the following
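E) This is a two-tailed test. Hence $\alpha/2 = 0.025$ is placed in each tail.
The upper critical value of F is $F_{0.025}$, the point with $P(F > F_{0.025}) = 0.025$,
with degrees of freedom n1-1=24 and n2-1=21.
In terms of the left tail this can be expressed as $P(F < F_{0.025}) = 0.975$.
The lower critical value of F is $F_{0.975}$, the point with $P(F < F_{0.975}) = 0.025$,
with degrees of freedom n1-1=24 and n2-1=21.
R code to get the values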
#E)Critical values
alpha<-0.05
#get the lower tail critical value
lcf<-qf(alpha/2,df1=n1-1,df2=n2-1)
#get the upper tail critical value
ucf<-qf(1-alpha/2,df1=n1-1,df2=n2-1)
print(paste("The lower critical value is",round(lcf,4),", the upper critical value is",round(ucf,4)))
#get the following
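Printed F tables usually list only upper-tail values, so the lower critical value is often obtained from the reciprocal identity $F_{1-\alpha/2}(\nu_1,\nu_2) = 1/F_{\alpha/2}(\nu_2,\nu_1)$. The following is a minimal sketch of that check, assuming the sample sizes from part A; the variable names `lower_direct` and `lower_reciprocal` are illustrative and not part of the original solution.

```r
# Sample sizes from part A (hard-coded so the snippet runs on its own)
n1 <- 25
n2 <- 22
alpha <- 0.05

# Lower critical value computed directly from the F(24, 21) distribution
lower_direct <- qf(alpha / 2, df1 = n1 - 1, df2 = n2 - 1)

# Same value via the reciprocal identity: swap the degrees of freedom
# and invert the upper-tail quantile of the F(21, 24) distribution
lower_reciprocal <- 1 / qf(1 - alpha / 2, df1 = n2 - 1, df2 = n1 - 1)

print(c(direct = lower_direct, reciprocal = lower_reciprocal))
print(isTRUE(all.equal(lower_direct, lower_reciprocal)))
```

Both routes should agree up to floating-point tolerance, which is why `all.equal()` is used rather than `==`.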
F) The p-value is less than alpha = 0.05. The test statistic of 8.2844 is also greater than the upper critical value of 2.3675.
We reject the null hypothesis.
We conclude that there is sufficient evidence to support the claim that there is a significant difference between the variances in the bag weights for the two machines.
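The two decision rules used here (p-value against alpha, and test statistic against the critical values) should always agree. A short sketch that applies both rules side by side is given below; the names `reject_by_critical` and `reject_by_pvalue` are hypothetical helpers introduced only for this illustration.

```r
# Sample data from the problem statement
m1 <- c(2.95,3.45,3.50,3.75,3.48,3.26,3.33,3.20,3.16,3.20,3.22,3.38,3.90,
        3.36,3.25,3.28,3.20,3.22,2.98,3.45,3.70,3.34,3.18,3.35,3.12)
m2 <- c(3.22,3.30,3.34,3.28,3.29,3.25,3.30,3.27,3.38,3.34,3.35,3.19,3.35,
        3.05,3.36,3.28,3.30,3.28,3.30,3.20,3.16,3.33)

alpha <- 0.05
df1 <- length(m1) - 1    # numerator degrees of freedom
df2 <- length(m2) - 1    # denominator degrees of freedom
f <- var(m1) / var(m2)   # F statistic

# Two-sided p-value: twice the smaller of the two tail areas
pvalue <- 2 * min(pf(f, df1, df2), pf(f, df1, df2, lower.tail = FALSE))

# Critical-value rule: reject if F falls outside [lower, upper]
lower <- qf(alpha / 2, df1, df2)
upper <- qf(1 - alpha / 2, df1, df2)
reject_by_critical <- (f < lower) || (f > upper)

# p-value rule: reject if the p-value is below alpha
reject_by_pvalue <- pvalue < alpha

print(c(critical = reject_by_critical, pvalue = reject_by_pvalue))
```

Taking the smaller tail with `min()` keeps the p-value valid even if the variance ratio had come out below 1.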
G) R code
#g) use var.test
var.test(m1,m2,ratio = 1,alternative = "two.sided",conf.level = 0.95)
# get the following
The results match those obtained earlier.
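Rather than comparing printed output by eye, the match can also be checked programmatically. The sketch below pulls the statistic and p-value out of the object returned by `var.test()` and compares them with the manual calculations from parts C and D; the names `res`, `f_manual` and `p_manual` are assumptions made for this example.

```r
# Sample data from the problem statement
m1 <- c(2.95,3.45,3.50,3.75,3.48,3.26,3.33,3.20,3.16,3.20,3.22,3.38,3.90,
        3.36,3.25,3.28,3.20,3.22,2.98,3.45,3.70,3.34,3.18,3.35,3.12)
m2 <- c(3.22,3.30,3.34,3.28,3.29,3.25,3.30,3.27,3.38,3.34,3.35,3.19,3.35,
        3.05,3.36,3.28,3.30,3.28,3.30,3.20,3.16,3.33)

# Built-in test (defaults: ratio = 1, two-sided, conf.level = 0.95)
res <- var.test(m1, m2)

# Manual values, computed as in parts C and D
f_manual <- var(m1) / var(m2)
p_manual <- 2 * pf(f_manual, df1 = length(m1) - 1, df2 = length(m2) - 1,
                   lower.tail = FALSE)

# unname() drops the "F" label so only the values are compared
print(isTRUE(all.equal(unname(res$statistic), f_manual)))
print(isTRUE(all.equal(res$p.value, p_manual)))

# Degrees of freedom reported by var.test
print(res$parameter)
```

Note that `res$conf.int` is a confidence interval for the ratio of the two variances, not for either variance alone, so it is not the interval asked for in parts H and I.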
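H) A 95% confidence interval means that the total area under the two tails is $\alpha = 0.05$, or the area under each tail is 0.025. The critical value for the upper tail is $\chi^2_{0.025}$ with $n_1 - 1 = 24$ degrees of freedom;
the lower tail critical value is $\chi^2_{0.975}$ with the same degrees of freedom.
The confidence interval for the variance is
$\frac{(n_1-1)s_1^2}{\chi^2_{0.025}} \le \sigma_1^2 \le \frac{(n_1-1)s_1^2}{\chi^2_{0.975}}$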
R code for machine 1
#h) 95% confidence interval for machine 1
alpha<-0.05
#get the lower tail critical value
chil<-qchisq(alpha/2,df=n1-1)
#get the upper tail critical value
chiu<-qchisq(1-alpha/2,df=n1-1)
#get the lower value of CI
lci1<-(n1-1)*v1/chiu
#get the upper value of CI
uci1<-(n1-1)*v1/chil
print(paste("The 95% confidence interval for the variance of the weight of bags produced on machine 1 is [",round(lci1,4),",",round(uci1,4),"]",sep=""))
# get the output
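I) The 95% confidence interval for the standard deviation is the square root of the interval for the variance, with $n_2 - 1 = 21$ degrees of freedom:
$\sqrt{\frac{(n_2-1)s_2^2}{\chi^2_{0.025}}} \le \sigma_2 \le \sqrt{\frac{(n_2-1)s_2^2}{\chi^2_{0.975}}}$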
R-code
#i) 95% confidence interval of standard deviation for machine 2
alpha<-0.05
#get the lower tail critical value
chil<-qchisq(alpha/2,df=n2-1)
#get the upper tail critical value
chiu<-qchisq(1-alpha/2,df=n2-1)
#get the lower value of CI
lci2<-sqrt((n2-1)*v2/chiu)
#get the upper value of CI
uci2<-sqrt((n2-1)*v2/chil)
print(paste("The 95% confidence interval for the standard deviation of machine 2 is [",round(lci2,4),",",round(uci2,4),"]",sep=""))
# output is
J) Machine 1 has a larger variance (0.0489) in the weights of bags compared to machine 2 (0.0059).
Machine 1 also has a wider confidence interval for the standard deviation:
R- code
print(paste("length of CI of SD of machine 1 is", round(sqrt(uci1) - sqrt(lci1),4)))
print(paste("length of CI of SD of machine 2 is", round(uci2 - lci2,4)))
#output
Hence machine 1 provides the greater opportunity for quality improvement.