The data below are the densities of clover flowers in a lawn (flowers/m²) for 20 quadrat samples. For this data set, create a frequency table and histogram, and determine the sample size, mean, median, mode, range, standard deviation, variance, standard error, and the 95% confidence interval for the population mean.
Describe the data using all of the following terms that apply: population, census, sample, quantitative data, qualitative data, discrete data, continuous data, symmetric, skewed, bimodal, and/or multimodal.
5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23
We will use R to create the frequency table and histogram and to carry out the other calculations; any other statistical software could be used instead. If manual calculations are required, you can ask for them in the comment box.
The given data are
5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23
We first enter the data into R:
> x=c(5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23 )
Next, we create the frequency table and histogram.
1) Frequency table
> table(x) # frequency table
x
 0  1  4  5  6  7  8 10 18 19 21 22 23 24
 4  1  1  1  2  1  1  1  1  1  1  1  3  1
2) Histogram
> b=seq(0,27,3) # to create bins of size 3
> hist(x,breaks=b,col=13) # to plot histogram
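The counts behind the histogram bars can also be tabulated as a grouped frequency table. The following is a minimal sketch, assuming the same bin edges `b` as above and `hist()`'s default settings (`right = TRUE`, `include.lowest = TRUE`); the names `grouped` and `h` are introduced here only for illustration.

```r
x <- c(5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23)
b <- seq(0, 27, 3)  # bin edges 0, 3, ..., 27

# Grouped frequency table; right = TRUE and include.lowest = TRUE
# mirror the defaults that hist() uses for the same breaks
grouped <- table(cut(x, breaks = b, right = TRUE, include.lowest = TRUE))
print(grouped)

# hist() can return the same counts without drawing the plot
h <- hist(x, breaks = b, plot = FALSE)
print(h$counts)
```

With these settings the first bin is [0,3] and the remaining bins are of the form (3,6], (6,9], and so on, so a value that falls exactly on an edge is counted in the bin to its left.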
3) Sample Size
> n=length(x)
> n
[1] 20
Thus the sample size is n = 20
4) Mean, median, mode, range, standard deviation, variance, and standard error
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    3.25    7.50   11.00   21.25   24.00
> mean(x) # to find mean
[1] 11
> median(x) # to find median
[1] 7.5
> sd(x) # to find standard deviation s
[1] 9.403247
> var(x) # to find variance Var
[1] 88.42105
> sd(x)/n^(1/2) # to find standard error S.E = sd/sqrt(n)
[1] 2.10263
The mode of a set of data values is the value that appears most often.
From the frequency table we have
x
 0  1  4  5  6  7  8 10 18 19 21 22 23 24
 4  1  1  1  2  1  1  1  1  1  1  1  3  1
The value 0 appears most often (4 times).
Hence the mode is 0
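The mode can also be read off the frequency table in code. R's built-in `mode()` reports an object's storage mode, not the statistical mode, so the sketch below works from `table(x)` instead; the names `tab` and `modes` are assumptions made for this example.

```r
x <- c(5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23)

tab <- table(x)

# Keep every value whose count equals the largest count,
# so tied modes (if there were any) would all be reported
modes <- as.numeric(names(tab)[tab == max(tab)])
print(modes)
```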
The range is given by range = Maximum - Minimum
= 24 - 0 = 24
Hence Range = 24
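In R, `range(x)` returns the minimum and the maximum as a pair, not their difference, so the single-number range has to be computed from them. A short sketch (the name `r` is chosen here for illustration):

```r
x <- c(5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23)

r <- range(x)       # c(minimum, maximum)
print(r)
print(diff(r))      # maximum - minimum
print(max(x) - min(x))  # the same value, written out explicitly
```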
Thus
Mean = 11
Median = 7.5
Mode = 0
Range = 24
Standard deviation s = 9.403247
Variance = 88.42105
Standard error S.E = s / sqrt(n) = 2.10263
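For anyone who wants to follow the manual calculation, the same quantities can be built up step by step from their definitions. This is only a sketch of the usual sample formulas (with n - 1 in the denominator of the variance, as `var()` uses); the variable names are chosen here for illustration.

```r
x <- c(5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23)

n    <- length(x)
xbar <- sum(x) / n            # sample mean
ss   <- sum((x - xbar)^2)     # sum of squared deviations from the mean
s2   <- ss / (n - 1)          # sample variance
s    <- sqrt(s2)              # sample standard deviation
se   <- s / sqrt(n)           # standard error of the mean

print(c(mean = xbar, variance = s2, sd = s, se = se))
```

Here the sum of squared deviations is 1680, so the variance is 1680 / 19 = 88.42105, in agreement with `var(x)`.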
5) 95% confidence interval for the population mean
Because the population standard deviation is unknown and is estimated by s, the 95% confidence interval is based on the t-distribution:
CI = ( x̄ - t * S.E , x̄ + t * S.E )
Here x̄ = 11 and S.E = 2.10263.
t is the critical value of the t-distribution with n-1 = 20-1 = 19 degrees of freedom and α = 0.05, that is, the 1 - α/2 = 0.975 quantile.
We can find this value from a statistical table or from software such as R, as follows:
> qt(1-0.05/2,df=19)
[1] 2.093024
Thus
CI = ( x̄ - t * S.E , x̄ + t * S.E )
= ( 11 - 2.093024 * 2.10263 , 11 + 2.093024 * 2.10263 )
= ( 6.599145 , 15.40086 )
Thus the 95% confidence interval for the population mean is ( 6.5991 , 15.4009 )
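As a cross-check, the interval can be recomputed in one pass and compared with the interval reported by R's one-sample `t.test()`. This is a sketch, not part of the original working; the names `tcrit`, `ci_manual` and `ci_ttest` are assumptions made for the example.

```r
x <- c(5, 0, 23, 10, 1, 6, 19, 0, 4, 8, 0, 18, 22, 23, 0, 21, 7, 24, 6, 23)

n     <- length(x)
se    <- sd(x) / sqrt(n)
tcrit <- qt(1 - 0.05 / 2, df = n - 1)   # critical value with 19 degrees of freedom

# Interval built from the formula: mean -/+ t * S.E
ci_manual <- mean(x) + c(-1, 1) * tcrit * se
print(ci_manual)

# Interval reported by the one-sample t-test (95% is the default level)
ci_ttest <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)
print(ci_ttest)
```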
6) Describe the data using all of the following terms that apply: population, census, sample, quantitative data, qualitative data, discrete data, continuous data, symmetric, skewed, bimodal, and/or multimodal
This is sample data, not a population or a census (we are given only 20 quadrat samples, not the whole population).
This is quantitative data (quantitative data are measurements or counts and are expressed as numbers).
This is discrete data (discrete data are based on counts, and the given values are nothing but counts).
From the histogram obtained in part 2, we can say that the distribution is not symmetric.
It is also neither right skewed nor left skewed, since there is a peak at each end of the histogram.
So the distribution is bimodal: a histogram with two peaks is called "bimodal" because it has two values or ranges of values that appear most often in the data. Because there are only two peaks in the histogram, it is not multimodal.