Step by step in R, and attach the R file and R code too. Thanks.
Use one of the real-world example data sets from R (not previously used in the R practice assignment) or a dataset you have found, and at least two of the tests and R functions covered in the practice assignment to conduct a hypothesis test then report your findings and give proper conclusion(s).
Use the following supporting materials for R syntax, data sets and tools, along with other resources found in this module or that you find on your own.
• Using T-Tests in R from the Department of Statistics at UC Berkeley
• Test of equal or given proportions from R Documentation
• F-Test: Compare Two Variances in R from STHDA (Statistical tools for high-throughput data analysis)
Please answer step by step with R files attached and R codes
Using one of the real-world example data sets from R
Using T-Tests in R from the Department of Statistics at UC Berkeley
Let us import the data from the MASS library.
We will use the "cats" data set.
Total rows = 144, columns = 3
> library(MASS)
> D = cats          # import the "cats" data
> head(D, 10)       # show only the first 10 rows of the data set
Sex Bwt Hwt
1 F 2.0 7.0
2 F 2.0 7.4
3 F 2.0 9.5
4 F 2.1 7.2
5 F 2.1 7.3
6 F 2.1 7.6
7 F 2.1 8.1
8 F 2.1 8.2
9 F 2.1 8.3
10 F 2.1 8.5
> D[44:54, ]        # observations 44 to 54
Sex Bwt Hwt
44 F 2.9 10.1
45 F 2.9 10.1
46 F 3.0 10.6
47 F 3.0 13.0
48 M 2.0 6.5
49 M 2.0 6.5
50 M 2.1 10.1
51 M 2.2 7.2
52 M 2.2 7.6
53 M 2.2 7.9
54 M 2.2 8.5
This data set records the sex of each cat and two measurements:
Column 1 - Sex: "Female" = F, "Male" = M
Column 2 - Bwt: body weight of the cat (kg)
Column 3 - Hwt: heart weight of the cat (g)
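Before testing, it can help to confirm the shape of the data and the group sizes. The lines below are a minimal sketch, assuming the MASS package is available (it ships with standard R installations); they are an illustration and not part of the original script.

```r
library(MASS)                     # provides the cats data frame

dim(cats)                         # 144 rows, 3 columns
table(cats$Sex)                   # 47 female (F) and 97 male (M) cats
tapply(cats$Bwt, cats$Sex, mean)  # mean of the Bwt column within each sex
```

Running `?cats` opens the help page that documents each column.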
Now we wish to check whether the weight of female cats is less than that of male cats.
I.e., let u1 be the mean weight of female cats and u2 be the mean weight of male cats.
The hypotheses to test are
H0 : u1 = u2 (the mean weight of female and male cats is the same)
H1 : u1 < u2 (the mean weight of female cats is less than that of male cats)
R function - t.test()
First we will separate the data for male and female cats
> weight_Female = D[D$Sex == "F", 2]   # only the observations for female cats
> weight_Male = D[D$Sex == "M", 2]     # only the observations for male cats
> weight_Female     # weight of female cats
 [1] 2.0 2.0 2.0 2.1 2.1 2.1 2.1 2.1 2.1 2.1 2.1 2.1 2.2 2.2 2.2 2.2 2.2 2.2 2.3
[20] 2.3 2.3 2.3 2.3 2.3 2.3 2.3 2.3 2.3 2.3 2.3 2.4 2.4 2.4 2.4 2.5 2.5 2.6 2.6
[39] 2.6 2.7 2.7 2.7 2.9 2.9 2.9 3.0 3.0
> weight_Male       # weight of male cats
 [1] 2.0 2.0 2.1 2.2 2.2 2.2 2.2 2.2 2.2 2.2 2.2 2.3 2.4 2.4 2.4 2.4 2.4 2.5 2.5
[20] 2.5 2.5 2.5 2.5 2.5 2.5 2.6 2.6 2.6 2.6 2.6 2.6 2.7 2.7 2.7 2.7 2.7 2.7 2.7
[39] 2.7 2.7 2.8 2.8 2.8 2.8 2.8 2.8 2.8 2.9 2.9 2.9 2.9 2.9 3.0 3.0 3.0 3.0 3.0
[58] 3.0 3.0 3.0 3.0 3.1 3.1 3.1 3.1 3.1 3.1 3.2 3.2 3.2 3.2 3.2 3.2 3.3 3.3 3.3
[77] 3.3 3.3 3.4 3.4 3.4 3.4 3.4 3.5 3.5 3.5 3.5 3.5 3.6 3.6 3.6 3.6 3.7 3.8 3.8
[96] 3.9 3.9
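The two weight vectors can also be built in one step with split(), which groups a numeric column by a factor. This is a sketch of an alternative, assuming only the cats data from MASS; the name by_sex is introduced here for illustration.

```r
library(MASS)

# split() returns a list with one element per level of Sex ("F" and "M")
by_sex <- split(cats$Bwt, cats$Sex)

weight_Female <- by_sex$F
weight_Male   <- by_sex$M

length(weight_Female)   # 47
length(weight_Male)     # 97
```

Checking the lengths against the group counts is a quick way to catch a subsetting mistake before running any test.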
# now we will test our hypothesis
# to test the alternative hypothesis " < ", use the command t.test(x, y, alternative = "less")
> t.test(weight_Female, weight_Male, alternative = "less")
Welch Two Sample t-test
data: weight_Female and weight_Male
t = -8.7095, df = 136.84, p-value = 4.416e-15
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -0.4376663
sample estimates:
mean of x mean of y
2.359574 2.900000
Since the p-value = 4.416e-15 is far below 0.05, we reject the null hypothesis at the 5% level of significance.
Hence the mean weight of female cats is less than the mean weight of male cats.
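To see where the t statistic, the degrees of freedom and the p-value come from, the Welch test can be reproduced by hand. The following is a sketch assuming the cats data from MASS; the variable names (f, m, se2_f, se2_m and so on) are chosen here for illustration.

```r
library(MASS)

f <- cats$Bwt[cats$Sex == "F"]   # female weights
m <- cats$Bwt[cats$Sex == "M"]   # male weights

# Squared standard error of each sample mean
se2_f <- var(f) / length(f)
se2_m <- var(m) / length(m)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat   <- (mean(f) - mean(m)) / sqrt(se2_f + se2_m)
df_welch <- (se2_f + se2_m)^2 /
  (se2_f^2 / (length(f) - 1) + se2_m^2 / (length(m) - 1))

# One-sided p-value for H1: u1 < u2 (lower tail of the t distribution)
p_less <- pt(t_stat, df_welch)

res <- t.test(f, m, alternative = "less")
c(manual_t = t_stat, manual_df = df_welch, manual_p = p_less)
```

The manual values should agree with those stored in res$statistic, res$parameter and res$p.value.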
Suppose we wish to test the claim that the proportion of male cats in the town's population is 65%.
That is, we wish to check whether 65 out of every 100 cats are male.
But someone says that the proportion of male cats is more than 65%.
So,
The hypotheses to test are
H0 : p = 0.65 (the proportion of male cats in town is 65%)
H1 : p > 0.65 (the proportion of male cats in town is more than 65%)
We have a sample of size 144, so we will use the sample proportion of male cats to test whether the population proportion is 65% or more than 65%.
Here we will use prop.test(x, n, p = , alternative = )
> n = length(D[, 1])
> n                 # total sample size
[1] 144
> x = sum(D$Sex == "M")
> x                 # number of male cats in the sample
[1] 97
> prop.test(x, n, p = 0.65, alternative = "greater")
1-sample proportions test with continuity correction
data: x out of n, null probability 0.65
X-squared = 0.25672, df = 1, p-value = 0.3062
alternative hypothesis: true p is greater than 0.65
95 percent confidence interval:
0.6030755 1.0000000
sample estimates:
p
0.6736111
Here we have p-value = 0.3062 > 0.05.
So at the 5% level of significance we do not reject the null hypothesis.
Hence there is not enough evidence that the proportion of male cats in town exceeds 65%; the sample is consistent with a proportion of 65%.
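The X-squared value and p-value reported by prop.test() can be checked by hand. The sketch below assumes the counts from the sample (97 males out of 144) and applies the continuity correction of 0.5; the variable names are chosen here for illustration.

```r
x  <- 97      # male cats in the sample
n  <- 144     # sample size
p0 <- 0.65    # proportion under H0

expected <- n * p0                      # expected number of males under H0

# Chi-squared statistic with continuity correction
chi_sq <- (abs(x - expected) - 0.5)^2 / (n * p0 * (1 - p0))

# Signed z value and upper-tail p-value for H1: p > p0
z         <- sign(x / n - p0) * sqrt(chi_sq)
p_greater <- pnorm(z, lower.tail = FALSE)

res <- prop.test(x, n, p = p0, alternative = "greater")
c(manual_chi_sq = chi_sq, manual_p = p_greater)

# An exact binomial test is a possible alternative to the normal approximation
binom.test(x, n, p = p0, alternative = "greater")$p.value
```

The manual statistic should match res$statistic, and the manual p-value should match res$p.value.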
{Note: the STHDA page does not provide a convenient data set. The F-test for comparing two variances needs a numeric variable measured in two groups (for example, height by gender). If such a data set is available, you can use it instead.}
Since we are supposed to perform an F-test to compare two variances, we can reuse the cats data set and check whether female and male cats have the same variability in their weight:
To test
H0 : σ1² = σ2² (the variance of weight is the same for female and male cats)
H1 : σ1² ≠ σ2² (the variance of weight differs between female and male cats)
where σ1² and σ2² are the variances of the weights of female and male cats.
# weight_Female and weight_Male were already obtained in the first part (weights of the 47 female and 97 male cats), so they are not printed again here
# to compare two variances we use var.test(x, y, conf.level = )
> var.test(weight_Female, weight_Male, conf.level = 0.95)
F test to compare two variances
data: weight_Female and weight_Male
F = 0.3435, num df = 46, denom df = 96, p-value = 0.0001157
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.2126277 0.5803475
sample estimates:
ratio of variances
0.3435015
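Conclusion: since the p-value = 0.0001157 < 0.05,
we reject the null hypothesis at the 5% level of significance.
Hence the variability in weight differs between male and female cats; the estimated variance ratio of 0.34 indicates that female weights are less variable than male weights.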
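The F statistic is simply the ratio of the two sample variances, so the var.test() result can be reproduced by hand. This is a sketch assuming the cats data from MASS; the names F_ratio, df1, df2 and p_two are chosen here for illustration.

```r
library(MASS)

f <- cats$Bwt[cats$Sex == "F"]   # female weights
m <- cats$Bwt[cats$Sex == "M"]   # male weights

# F statistic: variance of the first sample over variance of the second
F_ratio <- var(f) / var(m)
df1 <- length(f) - 1             # numerator degrees of freedom (46)
df2 <- length(m) - 1             # denominator degrees of freedom (96)

# Two-sided p-value: double the smaller tail probability
p_lower <- pf(F_ratio, df1, df2)
p_two   <- 2 * min(p_lower, 1 - p_lower)

res <- var.test(f, m, conf.level = 0.95)
c(manual_F = F_ratio, manual_p = p_two)
```

A ratio below 1 means the first sample (female cats) has the smaller variance.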