For the following scenario, answer the following questions. The underlined text is the name of the StatCrunch data set to be used for that part. Please note, do not conduct inference in this problem; just answer each question.
Heights of Fathers and Sons. To test the claim that sons are
taller than their fathers on average, a researcher randomly
selected 13 fathers who have adult male children. She records the
height of both the father and son in inches.
Note: to answer the questions below, subtract (Son’s Height –
Father’s Height).
Data:
Sons Fathers
64.4 79
69.2 67.1
76.4 70.9
69.2 66.8
78.2 72.8
76.9 70.4
71.8 70.3
79 70.1
75.8 79.5
72.3 65.5
69.2 65.4
66.9 69.1
64.5 74.5
a) What is (are) the parameter(s) of interest? Choose one of the following symbols: μ (the population mean), μ_D (the mean difference from paired (dependent) data), or μ1 - μ2 (the difference of two independent means), and describe the parameter in the context of this question in one sentence.
b) Depending on your answer to part (a), construct one or two relative frequency histograms. Remember to properly title and label the graph(s). Copy and paste these graphs into your document.
c) Describe the shape of the histogram(s) in one sentence.
d) Depending on your answer to part (a), construct one or two boxplots and copy and paste these graphs into your document.
e) Does the boxplot (or do the boxplots) show any outliers? Answer this question in one sentence and identify any outliers if they are present.
f) Considering your answers to parts (c) and (e), is inference appropriate in this case? Why or why not? Defend your answer using the graphs in two to three sentences.
Que.a
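Since each son is paired with his own father, the samples are dependent, so the parameter of interest is μ_D = the mean difference in height (son's height - father's height, in inches) for the population of fathers and their adult sons.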
Que.b
I used R software to solve this question.
R code:
> son = scan('clipboard'); son
Read 13 items
[1] 64.4 69.2 76.4 69.2 78.2 76.9 71.8 79.0 75.8 72.3 69.2 66.9 64.5
> father = scan('clipboard'); father
Read 13 items
[1] 79.0 67.1 70.9 66.8 72.8 70.4 70.3 70.1 79.5 65.5 65.4 69.1 74.5
> d = son - father
> hist(d, main='Histogram of height differences (son - father)', xlab='Difference in height (inches)', ylab='Frequency')
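Part (b) asks for a relative frequency histogram, whereas hist() plots raw counts by default. The following is a minimal sketch of one way to rescale the bars, assuming the data are typed in directly rather than read from the clipboard; the names counts, rel_freq and h_rel are illustrative and not part of the original answer.

```r
# Heights in inches, entered from the data table above
# (assumption: typed in directly instead of scan('clipboard')).
son    <- c(64.4, 69.2, 76.4, 69.2, 78.2, 76.9, 71.8, 79.0, 75.8, 72.3, 69.2, 66.9, 64.5)
father <- c(79.0, 67.1, 70.9, 66.8, 72.8, 70.4, 70.3, 70.1, 79.5, 65.5, 65.4, 69.1, 74.5)
d <- son - father  # son's height minus father's height

# Compute the default bins without drawing anything.
h <- hist(d, plot = FALSE)
counts <- h$counts

# Convert bin counts to proportions of the 13 pairs.
rel_freq <- counts / sum(counts)

# Replace the counts with proportions and draw the histogram object,
# so the bar heights are relative frequencies.
h_rel <- h
h_rel$counts <- rel_freq
plot(h_rel,
     main = 'Relative frequency histogram of height differences (son - father)',
     xlab = 'Difference in height (inches)',
     ylab = 'Relative frequency')
```

Note that hist(d, freq = FALSE) plots densities, which equal relative frequencies only when the bin width is 1, so the rescaling above is used instead.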
Que.c
The histogram of the differences is skewed to the left (negatively skewed): most differences are positive, with a long tail toward large negative values.
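As a rough numerical cross-check of the shape (a heuristic only, not a replacement for the graph), the mean of a left-skewed sample is usually pulled below its median. A small sketch, assuming the data are typed in directly:

```r
# Heights in inches, entered from the data table above (assumption: typed in directly).
son    <- c(64.4, 69.2, 76.4, 69.2, 78.2, 76.9, 71.8, 79.0, 75.8, 72.3, 69.2, 66.9, 64.5)
father <- c(79.0, 67.1, 70.9, 66.8, 72.8, 70.4, 70.3, 70.1, 79.5, 65.5, 65.4, 69.1, 74.5)
d <- son - father  # son's height minus father's height

# Compare the mean with the median: a mean below the median
# is consistent with a tail on the left.
mean_d   <- mean(d)
median_d <- median(d)
mean_below_median <- mean_d < median_d

print(c(mean = mean_d, median = median_d))
print(mean_below_median)
```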
Que.d
> boxplot(d, main='Boxplot for difference in height')
Que.e
> summary(d)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-14.6000  -2.2000   2.4000   0.9538   5.5000   8.9000
Yes, the boxplot shows one outlier in the data.
Outlier = -14.6 (the pair with son = 64.4 and father = 79)
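The outlier can also be confirmed from the usual 1.5 x IQR rule that a boxplot applies. The sketch below assumes the data are typed in directly; the names q, iqr, lower_fence, upper_fence and outliers are illustrative.

```r
# Heights in inches, entered from the data table above (assumption: typed in directly).
son    <- c(64.4, 69.2, 76.4, 69.2, 78.2, 76.9, 71.8, 79.0, 75.8, 72.3, 69.2, 66.9, 64.5)
father <- c(79.0, 67.1, 70.9, 66.8, 72.8, 70.4, 70.3, 70.1, 79.5, 65.5, 65.4, 69.1, 74.5)
d <- son - father  # son's height minus father's height

# Quartiles and interquartile range of the differences.
q   <- unname(quantile(d, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Fences: values beyond 1.5 * IQR from the quartiles are flagged.
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr
outliers <- d[d < lower_fence | d > upper_fence]
print(outliers)

# The points that boxplot() itself would draw as outliers.
print(boxplot.stats(d)$out)
```

With Q1 = -2.2 and Q3 = 5.5 from the summary, the IQR is 7.7 and the lower fence is -2.2 - 1.5(7.7) = -13.75, so -14.6 falls below it while the next smallest difference, -10.0, does not.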
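Que.f
Inference is not appropriate in this case. With only 13 pairs, the differences need to look roughly normal, but the histogram is skewed to the left, so the data do not appear to come from a normal population. In addition, the boxplot shows an outlier (-14.6).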