The following data were collected as part of a study of coffee consumption among graduate students. The following reflect cups per day consumed:
3 4 6 8 2 1 0 2
X | X^2 |
0 | 0 |
1 | 1 |
2 | 4 |
2 | 4 |
3 | 9 |
4 | 16 |
6 | 36 |
8 | 64 |
∑X = 26 | ∑X^2 = 134 |
Coffee consumption among graduate students.
(a.) The sample mean (X̅) is the sum of all observed values (∑x) divided by the number of observations (n).
i.e. X̅= (∑x)/ (n)
Here, the number of observations is n = 8.
So, X̅= (3+4+6+8+2+1+0+2)/8
X̅= 26/8
X̅= 3.25
Hence, sample mean (X̅) is 3.25.
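The arithmetic for the mean can be checked with a few lines of Python. This is only a minimal sketch; the variable names (cups, total, mean) are chosen here for illustration and are not part of the question.

```python
# Cups of coffee per day, as listed in the question.
cups = [3, 4, 6, 8, 2, 1, 0, 2]

n = len(cups)        # number of observations
total = sum(cups)    # sum of all observed values
mean = total / n     # sample mean = (sum of x) / n

print(n, total, mean)  # prints: 8 26 3.25
```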
(b.) The sample standard deviation (s) measures the spread of the data around the mean. It is the square root of the sum of squared deviations from the mean, ∑(xi-X̅)^2, divided by (n-1):
s= √[∑(xi-X̅)^2/(n-1)]
where,
s= sample standard deviation
xi= observed values of sample
n= number of observed values
X̅= sample mean
observed values (xi) | (xi-X̅) | (xi-X̅)^2 |
3 | (3-3.25)= (-0.25) | 0.0625 |
4 | (4-3.25)= (0.75) | 0.5625 |
6 | (6-3.25)= (2.75) | 7.5625 |
8 | (8-3.25)= (4.75) | 22.5625 |
2 | (2-3.25)= (-1.25) | 1.5625 |
1 | (1-3.25)= (-2.25) | 5.0625 |
0 | (0-3.25)= (-3.25) | 10.5625 |
2 | (2-3.25)= (-1.25) | 1.5625 |
total [∑(xi-X̅)^2] = 49.5 |
Now, s= √[∑(xi-X̅)^2/(n-1)]
s= √[49.5/(8-1)]
s= √(49.5/7)
s= √7.07
s= 2.66
Hence, the sample standard deviation (s) is 2.66.
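As a cross-check on the standard deviation, the sketch below computes it three ways: from the squared deviations, from the column totals ∑X and ∑X^2 via the shortcut s = √[(∑x^2 - (∑x)^2/n)/(n-1)], and with Python's built-in statistics.stdev, which also uses the (n-1) denominator. The variable names are illustrative assumptions.

```python
import math
import statistics

cups = [3, 4, 6, 8, 2, 1, 0, 2]
n = len(cups)
mean = sum(cups) / n

# Definitional form: sum of squared deviations from the mean.
ss_dev = sum((x - mean) ** 2 for x in cups)
s_def = math.sqrt(ss_dev / (n - 1))

# Shortcut form using the totals sum(x) and sum(x^2).
sum_x = sum(cups)
sum_x2 = sum(x * x for x in cups)
s_short = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Library cross-check: statistics.stdev is the sample (n - 1) version.
s_lib = statistics.stdev(cups)

print(ss_dev, round(s_def, 2), round(s_short, 2), round(s_lib, 2))
```

All three routes should agree to rounding; a mismatch usually points to an arithmetic slip in one row of the deviation table.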
(c.) The median is the middle value of the ordered data.
First, we arrange the data in ascending order: 0,1,2,2,3,4,6,8.
Since the number of values (n=8) is even, the median is the average of the two middle observations.
The first is the (n/2)th observation: 8/2 = 4th observation.
The second is the (n/2)+1 th observation: (8/2)+1 = 4+1 = 5th observation.
The 4th observation= 2
The 5th observation= 3
Now, the median is the average of 4th and 5th observation.
median= (4th obs+ 5th obs)/ 2
median= (2+3)/2 =5/2 = 2.5
Hence, median is 2.5.
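The same steps can be sketched in Python: sort the values, pick the two middle positions for even n, and average them. The names ordered and median are illustrative; statistics.median is used only as a cross-check.

```python
import statistics

cups = [3, 4, 6, 8, 2, 1, 0, 2]
ordered = sorted(cups)
n = len(ordered)

if n % 2 == 0:
    # Even n: average the (n/2)th and (n/2 + 1)th values.
    # Python lists are 0-based, so these sit at indexes n//2 - 1 and n//2.
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd n: the single middle value.
    median = ordered[n // 2]

print(ordered, median)  # prints: [0, 1, 2, 2, 3, 4, 6, 8] 2.5
```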
(d.) For data with an even number of observed values,
such as - 0,1,2,2,3,4,6,8
the median falls between the 4th and 5th observations, which divides the data into a lower half and an upper half of four values each. The values in the lower half are 0,1,2,2. The values in the upper half are 3,4,6,8. The medians of these halves are the 1st and 3rd quartiles respectively.
So, for the 1st quartile (Q1): the numbers are 0,1,2,2, and the middle two are 1 and 2. So, Q1 = (1+2)/2 = 1.5.
Further, for the 3rd quartile (Q3): the numbers are 3,4,6,8, and the middle two are 4 and 6. So, Q3 = (4+6)/2 = 5.
Hence, Q1 is 1.5 and Q3 is 5.
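Quartile conventions differ between textbooks and software, so results can vary slightly. The sketch below assumes the median-of-halves rule, in which each half holds n/2 values when n is even and the middle value is left out of both halves when n is odd. The helper name quartiles_by_halves is hypothetical.

```python
import statistics

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                  # the smallest n//2 values
    upper = ordered[len(ordered) - half:]   # the largest n//2 values
    return statistics.median(lower), statistics.median(upper)

cups = [3, 4, 6, 8, 2, 1, 0, 2]
q1, q3 = quartiles_by_halves(cups)
iqr = q3 - q1  # inter-quartile range

print(q1, q3, iqr)
```

Under this convention the halves are 0,1,2,2 and 3,4,6,8. Library routines such as statistics.quantiles interpolate differently and may not return the same values.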
(e.) For normally distributed data, the mean is better because it uses every observed value. For skewed data, the median is better because it is less affected by extreme values.
(f.) Both the standard deviation and the inter-quartile range are measures of dispersion. The inter-quartile range depends on only two values, Q1 and Q3, while the standard deviation takes into account all the observed values in the data. So, I feel the standard deviation is a better measure of dispersion than the inter-quartile range.
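The trade-off discussed in (e.) and (f.) can be illustrated with a small experiment. The second data set below is hypothetical and not part of the study: it replaces the largest value, 8, with 20 to mimic one extreme observation. The helper iqr_by_halves is also an assumption, using the median-of-halves convention for quartiles.

```python
import statistics

def iqr_by_halves(values):
    """Inter-quartile range using medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])
    q3 = statistics.median(ordered[len(ordered) - half:])
    return q3 - q1

cups = [3, 4, 6, 8, 2, 1, 0, 2]
# Hypothetical variant: the heaviest drinker reports 20 cups instead of 8.
with_outlier = [3, 4, 6, 20, 2, 1, 0, 2]

for label, data in [("observed", cups), ("with outlier", with_outlier)]:
    print(
        label,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "sd:", round(statistics.stdev(data), 2),
        "iqr:", iqr_by_halves(data),
    )
```

In this sketch the mean and standard deviation move with the extreme value because they use every observation, while the median and inter-quartile range stay the same.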