1. Consider the built-in dataset iris.
a. What is the structure of the iris data frame?
b. Create a histogram of the Sepal.Width variable.
c. Create a histogram of the Petal.Width variable.
d. For both histograms, does the data appear normally distributed? Are they skewed?
e. For both histograms, does it appear that the data come from more than one population?
f. What are the mean and median of Sepal.Width? What are the variance and standard deviation?
g. What are the mean and median of Petal.Width? What are the variance and standard deviation?
R code
a)
> data = iris
> class(data)
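[1] "data.frame"
Calling str(data) shows the full structure: iris has 150 observations of 5 variables, four numeric measurements (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) and one factor, Species, with 3 levels.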
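As a minimal sketch of how the structure could be inspected in more detail, the lines below use only base R functions. The variable names dims, col_classes and species_levels are illustrative and not part of the original answer.

```r
# Inspect the structure of the built-in iris data frame.
str(iris)                            # compact overview: size, column types, first values

dims <- dim(iris)                    # number of rows and columns
col_classes <- sapply(iris, class)   # class of each column
species_levels <- levels(iris$Species)  # levels of the factor column

print(dims)
print(col_classes)
print(species_levels)
```

str() answers the question in one call; dim(), sapply(iris, class) and levels() break the same information into pieces that can be reused in later steps.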
b)
> hist(iris$Sepal.Width, main = 'Histogram of Sepal.Width')
c)
> hist(iris$Petal.Width, main = 'Histogram of Petal.Width')
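To compare the two histograms from b) and c) directly, one option is to draw them side by side. This is only a sketch: the axis labels assume the measurements are in centimeters, as stated in the iris help page, and the names old_par, h_sepal and h_petal are illustrative.

```r
# Draw both histograms in one row so their shapes can be compared.
old_par <- par(mfrow = c(1, 2))      # remember the old layout, switch to 1 row x 2 columns

h_sepal <- hist(iris$Sepal.Width,
                main = "Histogram of Sepal.Width",
                xlab = "Sepal.Width (cm)")
h_petal <- hist(iris$Petal.Width,
                main = "Histogram of Petal.Width",
                xlab = "Petal.Width (cm)")

par(old_par)                         # restore the previous plotting layout
```

hist() also returns the bin breaks and counts, so h_sepal$counts and h_petal$counts can be inspected numerically if the plot alone is not convincing.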
d)
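The Sepal.Width histogram is single-peaked and roughly symmetric, so the data appear approximately normally distributed, with at most a slight right skew (the mean, 3.057, is a little above the median, 3).
The Petal.Width histogram does not look normal: it has a tall peak at small widths and a separate, wider cluster at larger widths. It is bimodal rather than skewed in one clear direction.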
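The visual judgment in d) could be backed up numerically. The sketch below assumes a simple moment-based skewness (third central moment divided by the 1.5 power of the second) and uses the Shapiro-Wilk test from base R; the helper name skewness is hypothetical, since base R does not provide one.

```r
# Hypothetical helper: moment-based sample skewness (not a base R function).
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

skew_sepal <- skewness(iris$Sepal.Width)
skew_petal <- skewness(iris$Petal.Width)

# Shapiro-Wilk test: a small p-value is evidence against normality.
sw_sepal <- shapiro.test(iris$Sepal.Width)
sw_petal <- shapiro.test(iris$Petal.Width)

print(c(Sepal.Width = skew_sepal, Petal.Width = skew_petal))
print(c(Sepal.Width = sw_sepal$p.value, Petal.Width = sw_petal$p.value))

# Normal Q-Q plots: points close to the line suggest approximate normality.
qqnorm(iris$Sepal.Width, main = "Normal Q-Q plot of Sepal.Width")
qqline(iris$Sepal.Width)
```

A positive skewness indicates a longer right tail and a negative one a longer left tail. For bimodal data such as Petal.Width, skewness alone says little, which is why the test and the Q-Q plot are included as well.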
e)
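The Sepal.Width histogram has a single peak, so it gives no clear sign of more than one population.
The Petal.Width data appear to come from more than one population, since the histogram has two separated clusters. This is plausible because iris contains measurements from three species.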
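One way to check the answer to e) is to split Petal.Width by the Species column, which is part of the iris data frame. The sketch below is illustrative; the names petal_means and petal_ranges are not from the original answer.

```r
# Summarize Petal.Width separately for each species.
petal_means  <- tapply(iris$Petal.Width, iris$Species, mean)
petal_ranges <- tapply(iris$Petal.Width, iris$Species, range)

print(petal_means)
print(petal_ranges)

# Side-by-side boxplots make the separation between groups visible.
boxplot(Petal.Width ~ Species, data = iris,
        main = "Petal.Width by species",
        ylab = "Petal.Width (cm)")
```

If the groups have clearly different centers and one of them does not overlap the others, the pooled histogram will show separate clusters, which matches what the Petal.Width histogram suggests.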
f)
> Sepal_width = iris$Sepal.Width
> mean(Sepal_width)
[1] 3.057333
> median(Sepal_width)
[1] 3
> var(Sepal_width)
[1] 0.1899794
> sd(Sepal_width)
[1] 0.4358663
g)
> Petal_width = iris$Petal.Width
> mean(Petal_width)
[1] 1.199333
> median(Petal_width)
[1] 1.3
> var(Petal_width)
[1] 0.5810063
> sd(Petal_width)
[1] 0.7622377
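The calculations in f) and g) repeat the same four calls, so they could also be done in one step. The following is a minimal sketch; the helper name describe is hypothetical and chosen only for illustration.

```r
# Hypothetical helper: the four summary statistics asked for in f) and g).
describe <- function(x) {
  c(mean = mean(x), median = median(x), var = var(x), sd = sd(x))
}

# Apply it to both columns; the result is a matrix with one column per variable.
stats <- sapply(iris[c("Sepal.Width", "Petal.Width")], describe)
print(round(stats, 4))
```

Note that var() and sd() in R use the sample formulas with an n - 1 denominator, and that sd is the square root of var, which can be used as a quick consistency check on the values above.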