Question (4) [6 marks]
I am a firm believer in sketching things on tests/assignments. I think that sometimes the sketch can help remind you what you have to do next, or correctly guide your thought process. One of my TAs for a previous Statistics course gathered data from marked tests to determine if sketching had an influence on grades. For each student in the class, we recorded whether or not they sketched during a specific question, and also recorded whether or not they obtained the correct p-value. The results were as follows:
At the 10% level of significance, is there any statistical reason to believe that sketching is associated with grades?
Note: You can use the function qchisq() in R to help you solve the following. Why are we using the qchisq() function in R?
The qchisq() function in R allows us to specify a desired area in a tail and the number of degrees of freedom. From that information, qchisq() computes the required x-value to get the specified area in the specified tail with the specified number of degrees of freedom.
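As a minimal sketch of that description, the call below asks qchisq() for the x-value that leaves an area of 0.10 in the upper tail. The variable names alpha, df and crit are illustrative choices, and df = 1 assumes the 2 x 2 table used in this question.

```r
# Upper-tail critical value of the chi-square distribution.
alpha <- 0.10                  # significance level stated in the question
df <- (2 - 1) * (2 - 1)        # (rows - 1) * (columns - 1) for a 2 x 2 table

# lower.tail = FALSE means alpha is the area to the right of the returned value.
crit <- qchisq(alpha, df = df, lower.tail = FALSE)
crit                           # about 2.706
```

The same value comes from qchisq(1 - alpha, df = df), which specifies the area in the lower tail instead.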
O | sketch | No sketch | Total |
correct | 50 | 30 | |
incorrect | 60 | 90 | |
Total | | | |

E | sketch | No sketch | Total |
correct | | | |
incorrect | | | |
Total | | | |
a) State the two hypotheses of interest.
b) Calculate an appropriate test statistic for (a) by hand.
$\chi^2 = \sum \frac{(O - E)^2}{E}$
c) Write your conclusion using the rejection region method (the "critical value method"). Include both a statistical interpretation and a practical interpretation related to the topic of the question. Use the function qchisq() in R.
Question 5
In this question we'll use the housetasks dataset from STHDA: http://www.sthda.com/sthda/RDoc/data/housetasks.txt. The dataset is a contingency table containing 13 house tasks and their distribution in the couple:
→ rows are the different tasks
→ values are the frequencies of the tasks done:
1) by the wife only
2) alternatively
3) by the husband only
4) or jointly
Using R, test whether the two variables, house tasks and their distribution in the couple, are statistically significantly associated (dependent) by answering the following questions.
a) (1 mark) State the two hypotheses of interest.
b) (0.5 mark) Import the data into R using the function read.table.
Note: show your R code but not the output (the dataset).
c) (0.5 mark) Calculate the Chi-square statistic using the function chisq.test() in R. Note: show your R code and output.
d) (1 mark) Using α = 0.05, write your conclusion using the p-value (include both a statistical interpretation and one related to the topic of the question).
DATA for Q5
           Wife Alternating Husband Jointly
Laundry     156          14       2       4
Main_meal   124          20       5       4
Dinner       77          11       7      13
Breakfeast   82          36      15       7
Tidying      53          11       1      57
Dishes       32          24       4      53
Shopping     33          23       9      55
Official     12          46      23      15
Driving      10          51      75       3
Finances     13          13      21      66
Insurance     8           1      53      77
Repairs       0           3     160       2
Holidays      0           1       6     153
Q4: (a) Null hypothesis, H0: sketching and grades are independent (not associated)
vs. Alternative hypothesis, Ha: sketching and grades are associated (not independent)
(b)
O | sketch | No sketch | Total |
correct | 50 | 30 | 80 |
incorrect | 60 | 90 | 150 |
Total | 110 | 120 | 230 |

E | sketch | No sketch | Total |
correct | 80*110/230=38.2609 | 80*120/230=41.7391 | |
incorrect | 150*110/230=71.7391 | 150*120/230=78.2609 | |
Total | | | |
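The expected counts above feed into the test statistic from part (b) and the critical-value comparison from part (c). The following is a sketch of that arithmetic in R, not the original marking scheme; the object names (obs, expected, stat, crit, check) are illustrative, and df = 1 assumes the usual (rows - 1) * (columns - 1) rule for a 2 x 2 table.

```r
# Observed counts from the O table (rows: correct/incorrect; columns: sketch/no sketch).
obs <- matrix(c(50, 30,
                60, 90),
              nrow = 2, byrow = TRUE,
              dimnames = list(grade = c("correct", "incorrect"),
                              sketch = c("sketch", "no sketch")))

# Expected counts under independence: row total * column total / grand total.
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Pearson chi-square statistic: sum of (O - E)^2 / E over the four cells.
stat <- sum((obs - expected)^2 / expected)

# Critical value at the 10% level with (2 - 1) * (2 - 1) = 1 degree of freedom.
df <- (nrow(obs) - 1) * (ncol(obs) - 1)
crit <- qchisq(0.10, df = df, lower.tail = FALSE)

stat          # about 10.585
crit          # about 2.706
stat > crit   # TRUE: the statistic falls in the rejection region

# Cross-check with chisq.test(); correct = FALSE switches off the Yates
# continuity correction so the result matches the hand calculation.
check <- chisq.test(obs, correct = FALSE)
```

Under these assumptions the statistic exceeds the critical value, so H0 would be rejected at the 10% level: the data give statistical evidence that sketching is associated with obtaining the correct answer.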
Q5: (a) Null hypothesis, H0: the two variables, house tasks and their distribution in the couple, are not associated (independent)
vs. Alternative hypothesis, Ha: the two variables, house tasks and their distribution in the couple, are associated (dependent)
(b) and (c) R code:
# The counts are entered column by column (Wife, Alternating, Husband, Jointly).
# matrix() fills by column by default, so byrow is not set.
counts <- matrix(c(156,124,77,82,53,32,33,12,10,13,8,0,0,
                   14,20,11,36,11,24,23,46,51,13,1,3,1,
                   2,5,7,15,1,4,9,23,75,21,53,160,6,
                   4,4,13,7,57,53,55,15,3,66,77,2,153), ncol=4)
colnames(counts) <- c("Wife","Alternating","Husband","Jointly")
rownames(counts) <- c("Laundry","Main_meal","Dinner","Breakfeast","Tidying",
                      "Dishes","Shopping","Official","Driving","Finances",
                      "Insurance","Repairs","Holidays")
# Convert to a contingency table and run the chi-square test of independence.
housetasks <- as.table(counts)
chisq.test(housetasks)
Output:
Pearson's Chi-squared test
data: housetasks
X-squared = 1944.5, df = 36, p-value < 2.2e-16
(d) Since the p-value < 0.05, we reject H0 at the 5% level of significance and conclude that the two variables, house tasks and their distribution in the couple, are statistically significantly associated. In practical terms, who does a task in the couple depends on which task it is.
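Part (b) of the question asks for read.table. A minimal sketch of that route is given here, under the assumption that the data are laid out as in "DATA for Q5"; the text is embedded in the script so the example runs offline, and the names housetasks_txt, housetasks and res are illustrative. Passing the STHDA URL from the question as the file argument should work similarly, possibly with sep = "\t" depending on how the file is delimited.

```r
# Data copied from "DATA for Q5". The header line has one fewer field than the
# data rows, so read.table() uses the first column as row names.
housetasks_txt <- "
Wife Alternating Husband Jointly
Laundry 156 14 2 4
Main_meal 124 20 5 4
Dinner 77 11 7 13
Breakfeast 82 36 15 7
Tidying 53 11 1 57
Dishes 32 24 4 53
Shopping 33 23 9 55
Official 12 46 23 15
Driving 10 51 75 3
Finances 13 13 21 66
Insurance 8 1 53 77
Repairs 0 3 160 2
Holidays 0 1 6 153
"

housetasks <- read.table(text = housetasks_txt, header = TRUE)
dim(housetasks)        # 13 tasks by 4 categories

# chisq.test() accepts the data frame and treats it as a contingency table.
res <- chisq.test(housetasks)
res$parameter          # degrees of freedom: (13 - 1) * (4 - 1) = 36
res$p.value < 0.05     # TRUE: reject H0 at the 5% level
```

Importing the table this way avoids retyping 52 counts by hand, which is where ordering mistakes tend to creep in.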