To assess whether the salmon in a cage at a fish farm are
approaching harvest size, a random sample was taken: 10 randomly
selected salmon from the cage were weighed. The results (in kg)
were:
3.3 3.6 3.7 3.9 1.6 3.4 3.8 3.8 4.1 3.9
a) Enter the numbers in a data vector in R, and then calculate the
mean, median, sample variance and sample standard deviation of the
data.
b) Create a box plot and histogram of the data.
It turned out that one salmon, which was much smaller than the
others, was sick. It is therefore removed from the data so that the
data are representative of the healthy salmon in the cage.
c) Calculate the mean, median, sample variance and sample standard
deviation of the data without the smallest value.
Comment on the differences and similarities between these results
and the results you got in part a).
Ans: The R output is given below; the complete code is attached at the end.
> x=c(3.3,3.6,3.7,3.9,1.6,3.4,3.8,3.8,4.1,3.9) #Given sample of salmon in the cage
> length(x)
[1] 10
a)
> m=mean(x)
> m #mean of the sample
[1] 3.51
> md=median(x)
> md #median of the sample
[1] 3.75
> v=var(x)
> v #sample variance of the sample
[1] 0.5076667
> sd=sd(x)
> sd #standard deviation of the sample
[1] 0.7125073
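As a cross-check on part a), the sample variance can be recomputed from its definition, dividing the sum of squared deviations by n - 1 rather than n. This is only an illustrative sketch; the names `manual_var` and `manual_sd` are introduced here and are not part of the original answer.

```r
# Weights (kg) of the 10 sampled salmon, as given in the question
x <- c(3.3, 3.6, 3.7, 3.9, 1.6, 3.4, 3.8, 3.8, 4.1, 3.9)
n <- length(x)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)

# Sample standard deviation: square root of the sample variance
manual_sd <- sqrt(manual_var)

# Both comparisons should give TRUE, because var() and sd() use the n - 1 divisor
all.equal(manual_var, var(x))
all.equal(manual_sd, sd(x))
```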
b)
> boxplot(x) #it gives the boxplot of the given data
> hist(x) #it gives the histogram of the given data
Note that the minimum value (1.6) appears to be an outlier in the data.
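The visual impression from the box plot can be checked numerically. A minimal sketch, assuming the default 1.5 times box-length rule that `boxplot()` uses for drawing individual points; the name `flagged` is introduced here for illustration only.

```r
# Weights (kg) of the 10 sampled salmon, as given in the question
x <- c(3.3, 3.6, 3.7, 3.9, 1.6, 3.4, 3.8, 3.8, 4.1, 3.9)

# boxplot.stats() returns the numbers behind boxplot(); its $out component
# holds the observations lying more than 1.5 box lengths beyond the hinges
flagged <- boxplot.stats(x)$out
flagged
```

For these data the only value flagged is the 1.6 kg fish, which matches the point drawn separately below the lower whisker.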
c)
Now we remove the smallest value, treating it as the sick fish (an outlier).
> a=min(x)
> a #remove this value
[1] 1.6
> x1=c(3.3,3.6,3.7,3.9,3.4,3.8,3.8,4.1,3.9)
> length(x1)
[1] 9
> m1=mean(x1)
> m1 #mean after removing the smallest value
[1] 3.722222
> md1=median(x1)
> md1 #median after removing the smallest value
[1] 3.8
> v1=var(x1)
> v1 #variance after removing the smallest value
[1] 0.06444444
> sd1=sd(x1)
> sd1 #standard deviation after removing the smallest value
[1] 0.2538591
> #we can tabulate our results as:
> statistics=c("mean","median","variance","standard deviation")
> output.all=c(m,md,v,sd) #statistics with all values
> output.noleast=c(m1,md1,v1,sd1) #statistics without the smallest value
> d=data.frame(statistics,output.all,output.noleast)
> d
          statistics output.all output.noleast
1               mean  3.5100000     3.72222222
2             median  3.7500000     3.80000000
3           variance  0.5076667     0.06444444
4 standard deviation  0.7125073     0.25385910
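Retyping the vector by hand works for 10 values but is easy to get wrong. One possible alternative, sketched here as a suggestion rather than as the method used in the answer, is to drop the smallest observation by index with `which.min()`.

```r
# Weights (kg) of the 10 sampled salmon, as given in the question
x <- c(3.3, 3.6, 3.7, 3.9, 1.6, 3.4, 3.8, 3.8, 4.1, 3.9)

# which.min() gives the position of the (first) smallest value;
# a negative index removes that position from the vector
x1 <- x[-which.min(x)]
length(x1)

# Should be TRUE: the result matches the vector typed by hand in the answer
identical(x1, c(3.3, 3.6, 3.7, 3.9, 3.4, 3.8, 3.8, 4.1, 3.9))
```

Note that `which.min()` removes only the first occurrence if the minimum is tied, which is the desired behaviour when exactly one fish is to be excluded.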
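Both the mean and the median increase when the smallest observation is removed, but by different amounts: the mean rises from 3.51 to 3.72, while the median only moves from 3.75 to 3.80, because the median is much less sensitive to a single extreme value. The smallest observation appears to be an outlier, so removing it reduces the spread substantially: the variance falls from 0.508 to 0.064 (roughly a factor of eight) and the standard deviation from 0.713 to 0.254.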
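The comparison in part c) can also be expressed as relative changes, which makes the difference in sensitivity between the statistics easier to see. This is a sketch only; the helper `summary_stats` and the name `rel_change` are hypothetical and do not appear in the original answer.

```r
# Weights (kg) of the 10 sampled salmon, as given in the question
x <- c(3.3, 3.6, 3.7, 3.9, 1.6, 3.4, 3.8, 3.8, 4.1, 3.9)

# Same data without the smallest observation (the sick fish)
x1 <- x[-which.min(x)]

# Hypothetical helper: the four statistics asked for in parts a) and c)
summary_stats <- function(v) {
  c(mean = mean(v), median = median(v), variance = var(v), sd = sd(v))
}

# Percentage change in each statistic after removing the smallest value
rel_change <- (summary_stats(x1) - summary_stats(x)) / summary_stats(x) * 100
round(rel_change, 1)
```

The median changes by a smaller percentage than the mean, and the variance shows the largest change of all, consistent with the comment above.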
###########################################################
R code:
x=c(3.3,3.6,3.7,3.9,1.6,3.4,3.8,3.8,4.1,3.9) #Given sample of salmon in the cage
length(x)
#a)
m=mean(x)
m
#mean of the sample
md=median(x)
md
#median of the sample
v=var(x)
v
#sample variance of the sample
sd=sd(x)
sd
#standard deviation of the sample
#b)
boxplot(x) #it gives the boxplot of the given data
hist(x) #it gives the histogram of the given data
#note that the minimum value appears to be an outlier in the data.
#c)
#now we remove the smallest value, treating it as the sick fish (outlier)
a=min(x)
a #remove this value
x1=c(3.3,3.6,3.7,3.9,3.4,3.8,3.8,4.1,3.9)
length(x1)
m1=mean(x1)
m1 #mean after removing the smallest value
md1=median(x1)
md1 #median after removing the smallest value
v1=var(x1)
v1 #variance after removing the smallest value
sd1=sd(x1)
sd1 #standard deviation after removing the smallest value
#we can tabulate our results as:
statistics=c("mean","median","variance","standard deviation")
output.all=c(m,md,v,sd)
output.noleast=c(m1,md1,v1,sd1)
d=data.frame(statistics,output.all,output.noleast)
d