I was given a sample data set of systolic blood pressures and
was asked to 1) calculate the mean, median, standard deviation,
and interquartile range (IQR), and 2) apply what I have learned
in the course to describe or comment on the systolic blood
pressure data.
minimum : 86.0
maximum : 267.0
mean: 140.2
median: 137.0
standard deviation: 22.92012
Q1: 123.0
Q3: 154.0
I calculated the IQR (Q3 - Q1 = 31.0) and used it to get the outlier limits: a lower limit of Q1 - 1.5 × IQR = 76.5 and an upper limit of Q3 + 1.5 × IQR = 200.5. Since the maximum systolic blood pressure of 267.0 is above the upper limit, I stated that there are outliers in this sample, and that because there are outliers, the median and IQR should be used to summarize the typical value and variability rather than the mean and standard deviation. Is my assessment correct? Additionally, how can I interpret the median value in this example and connect it to systolic blood pressures? For example, is it accurate to say that the average systolic blood pressure in this sample is 137 (which is the median value)?
Yes, your assessment is correct. The mean is the most common measure of center, and it is what most people think of when they hear the word "average". However, the mean is affected by extreme values, so it may not be the best measure of center for a skewed distribution. Here there is at least one outlier (the maximum of 267.0 is above the upper limit of 200.5), and the mean (140.2) is larger than the median (137.0), which suggests the distribution is skewed to the right rather than symmetric. The standard deviation is pulled by extreme values in the same way as the mean. Therefore the mean is not an appropriate summary of the typical value, and the standard deviation is not an appropriate summary of the variability. The median should be used as the measure of center instead: half of the values are less than the median and half are more than the median, so it is not pulled toward the extreme values. The IQR, which covers the middle half of the data, should be used for the variability.
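The outlier check described above can be reproduced from the reported summary statistics alone. This is a minimal sketch that assumes the usual 1.5 × IQR rule for the limits; the variable names are only illustrative.

```python
# Summary statistics reported in the question (mmHg)
q1, q3 = 123.0, 154.0
minimum, maximum = 86.0, 267.0

# Interquartile range and the 1.5 * IQR outlier limits (fences)
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# An outlier exists on a side if the most extreme value is beyond that limit
has_low_outlier = minimum < lower_limit
has_high_outlier = maximum > upper_limit

print("IQR:", iqr)
print("limits:", lower_limit, upper_limit)
print("low outlier:", has_low_outlier, "high outlier:", has_high_outlier)
```

With these numbers the limits come out to 76.5 and 200.5, so the minimum of 86.0 is not an outlier but the maximum of 267.0 is.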
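The "average" normally refers to the mean, which is defined as the sum of the values divided by n. The median is the middle observation: half of the values are less than the median and half are more than the median. Therefore it is not accurate to say that the average systolic blood pressure in this sample is 137. Instead, say that the median systolic blood pressure is 137, meaning that half of the people in the sample have a systolic blood pressure below 137 and half have one above 137. The average (mean) in this sample is 140.2.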
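To see why the mean and the median answer different questions, here is a small sketch using Python's standard `statistics` module. The readings below are hypothetical values made up for illustration, not the data from the question; only the extreme value 267 is borrowed from the reported maximum.

```python
from statistics import mean, median

# Hypothetical readings (mmHg), NOT the sample from the question
typical = [118, 125, 131, 137, 142, 150, 156]

# Same readings plus one extreme value equal to the reported maximum
with_extreme = typical + [267]

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_extreme), median(with_extreme)

# How far each measure moves when the extreme value is added
mean_shift = mean_after - mean_before
median_shift = median_after - median_before

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

In this made-up sample, one extreme reading moves the mean from 137 to 153.25 but moves the median only from 137 to 139.5, which is why the median is the better description of a typical value when outliers are present.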