Consider the following hypothetical dataset of the compressive strength of a concrete slab (ksi): 2.5, 3.5, 2.2, 3.2, 2.9, 4.3, 3.7, 3.4, 3.1, 2.8, 1.9, and 2.1.
(a) Compute the mean and standard deviation of the above data set
(b) Compute the 25th, 50th, 75th and 90th percentile values of the compressive strength from the above dataset
(c) Construct a boxplot for the above data set
(d) Check if the largest value is an outlier following the z-score approach
Solution :
We have the hypothetical dataset regarding the compression strength of a concrete slab (ksi).
---------------------------------------------------------------------------------------------
(a) Compute the mean and standard deviation of the above data set :
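We know that the mean and standard deviation of a set of "n" sample values are given as
$$ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i , \qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2} $$
For the given data (n = 12), these give a mean of about 2.967 ksi and a standard deviation of about 0.713 ksi.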
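The arithmetic behind the mean and standard deviation of the 12 observations can be checked by hand. This is a sketch that assumes the sample form of the standard deviation (divisor n - 1), which is what R's sd() computes:

```latex
\begin{align*}
\sum_{i=1}^{12} x_i &= 35.6, &
\bar{x} &= \frac{35.6}{12} \approx 2.967 \text{ ksi} \\
\sum_{i=1}^{12} x_i^2 &= 111.20, &
s &= \sqrt{\frac{111.20 - 35.6^2/12}{11}}
   = \sqrt{\frac{5.5867}{11}} \approx \sqrt{0.5079} \approx 0.713 \text{ ksi}
\end{align*}
```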
---------------------------------------------------------------------------------------------
(b) Compute the 25th, 50th, 75th and 90th percentile values of the compressive strength from the above dataset :
Using the R statistical software, we have computed the required percentile values.
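The percentile values can also be worked out by hand. The sketch below assumes R's default quantile definition (type 7), which places the p-th quantile at position h = (n - 1)p + 1 in the sorted data and interpolates linearly between neighbouring order statistics; other definitions give slightly different values.

```latex
\begin{align*}
&\text{Sorted data: } 1.9,\ 2.1,\ 2.2,\ 2.5,\ 2.8,\ 2.9,\ 3.1,\ 3.2,\ 3.4,\ 3.5,\ 3.7,\ 4.3 \quad (n = 12) \\
&h = (n-1)p + 1, \qquad
 Q(p) = x_{(\lfloor h \rfloor)} + (h - \lfloor h \rfloor)\bigl(x_{(\lfloor h \rfloor + 1)} - x_{(\lfloor h \rfloor)}\bigr) \\
&P_{25}:\ h = 3.75, \quad 2.2 + 0.75\,(2.5 - 2.2) = 2.425 \text{ ksi} \\
&P_{50}:\ h = 6.5, \quad 2.9 + 0.5\,(3.1 - 2.9) = 3.0 \text{ ksi} \\
&P_{75}:\ h = 9.25, \quad 3.4 + 0.25\,(3.5 - 3.4) = 3.425 \text{ ksi} \\
&P_{90}:\ h = 10.9, \quad 3.5 + 0.9\,(3.7 - 3.5) = 3.68 \text{ ksi}
\end{align*}
```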
---------------------------------------------------------------------------------------------
(c) Construct a boxplot for the above data set :
Using the R statistical software, we have constructed the boxplot given below.
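The quantities drawn in the boxplot can be checked by hand. This sketch assumes the defaults of R's boxplot(), which uses Tukey's hinges (as returned by fivenum()) for the box and extends the whiskers to the most extreme observations within 1.5 times the hinge spread. Note that the hinges differ slightly from the type 7 quartiles returned by quantile().

```latex
\begin{align*}
\text{minimum} &= 1.9, \qquad \text{maximum} = 4.3 \\
\text{lower hinge} &= \tfrac{1}{2}(2.2 + 2.5) = 2.35, \qquad
\text{median} = \tfrac{1}{2}(2.9 + 3.1) = 3.0, \qquad
\text{upper hinge} = \tfrac{1}{2}(3.4 + 3.5) = 3.45 \\
\text{hinge spread} &= 3.45 - 2.35 = 1.10 \\
\text{fences} &: \ 2.35 - 1.5(1.10) = 0.70, \qquad 3.45 + 1.5(1.10) = 5.10
\end{align*}
```

All 12 observations lie between 0.70 and 5.10, so under these assumptions the whiskers end at 1.9 and 4.3 and no point is drawn separately as an outlier.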
---------------------------------------------------------------------------------------------
(d) To check if the largest value is an outlier following the z-score approach :
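We know that an observation of a dataset is considered to be an outlier if
$$ |z| = \left| \frac{x - \bar{x}}{s} \right| > 3 $$
where $\bar{x}$ is the mean, $s$ is the standard deviation, and $x$ is the observation being tested (the suspected outlier).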
From the data, the mean is about 2.967 ksi and the standard deviation is about 0.713 ksi.
We have computed the z-scores for all 12 values.
Clearly, none of the z-scores is less than -3 or greater than 3, so the largest value (4.3 ksi) is not an outlier; in fact, there are no outliers in the dataset. (Ans)
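To make the check explicit for the extreme observations, the z-scores of the largest and smallest values are worked out below. This assumes the sample standard deviation (divisor n - 1), giving a mean of about 2.9667 and a standard deviation of about 0.7127 for the data:

```latex
\begin{align*}
z_{\max} &= \frac{4.3 - 2.9667}{0.7127} \approx 1.871 \\
z_{\min} &= \frac{1.9 - 2.9667}{0.7127} \approx -1.497
\end{align*}
```

Every other z-score lies between these two, so all 12 values fall well inside the interval from -3 to 3.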
---------------------------------------------------------------------------------------------
The R code is given below:
## Data Entry ##
x=c(2.5, 3.5, 2.2, 3.2, 2.9, 4.3, 3.7, 3.4, 3.1, 2.8, 1.9, 2.1)
## Mean and Standard Deviation ##
m=mean(x);m
s=sd(x);s
## Percentile Values ##
a=sort(x);a
quantile(x,probs=c(0.25,0.5,0.75,0.90))
## Box Plot ##
boxplot(x,main="Box - Plot of the Compression Strengths")
## Z - Scores ##
z=(x-m)/s;z
## Use the element-wise "|" here; "||" only accepts single logical values ##
length(which(z < -3 | z > 3))
##### NO OUTLIERS #####
The outputs are given below: