When data is collected, the most commonly computed statistics are the measures of central tendency: mean, median, and mode. It is important to know how to compute these values, and just as important to know what they mean in the context of the data set.
Use the internet to find a data set. Key terms to search: Free Public Data Sets and Medical Data Sets.
Download the data from the following site.
https://perso.telecom-paristech.fr/eagan/class/igr204/datasets
Cars: A dataset of about 400 cars with 8 characteristics such as horsepower, acceleration, etc.
Download the CSV file.
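Once the file is downloaded, the three measures can be computed with Python's standard library. The following is a minimal sketch, not a definitive procedure: the inline sample, its column names (Car, MPG, Cylinders), the comma delimiter, and the helper name summarize are all assumptions made for illustration, and the real file may be laid out differently.

```python
import csv
import io
import statistics

# Made-up sample standing in for the downloaded file. The real file's
# column names, delimiter, and values may differ.
SAMPLE_CSV = """Car,MPG,Cylinders
car_a,18,8
car_b,15,8
car_c,24,4
car_d,13,8
car_e,13,8
car_f,31,4
"""


def summarize(rows, column):
    """Return the mean, median, and mode(s) of one numeric column."""
    # Skip blank or "NA" cells so missing values do not break float().
    values = [float(row[column]) for row in rows if row[column] not in ("", "NA")]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # every most-frequent value
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
mpg_summary = summarize(rows, "MPG")
print(mpg_summary)
```

To run this against the downloaded file instead, replace io.StringIO(SAMPLE_CSV) with an open file handle, and pass a delimiter argument to csv.DictReader if the file does not use commas.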
Observations about the data
MPG
The mean and median are almost equal, so we can expect the data to
be roughly symmetric and approximately normally distributed.
The most common MPG value (the mode) is 13.
Cylinders
We find that the median and mode are equal. However, some cars have
a higher number of cylinders, which pulls the mean above the median
and the mode.
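The effect of a few large values on the mean can be seen in a small example. The cylinder counts below are made up for illustration and are not taken from the dataset.

```python
import statistics

# Made-up cylinder counts: mostly 4, with a few larger engines.
cylinders = [4, 4, 4, 4, 6, 8, 8]

mean_c = statistics.mean(cylinders)      # 38 / 7, pulled up by the 6 and 8s
median_c = statistics.median(cylinders)  # middle value of the sorted list
mode_c = statistics.mode(cylinders)      # most frequent value

print(f"mean={mean_c:.2f} median={median_c} mode={mode_c}")
# prints: mean=5.43 median=4 mode=4
```

The median and mode stay at 4 because they depend only on position and frequency, while the mean uses every value and so moves toward the larger engines.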
Displacement
The standard deviation is very high, which indicates high
variability in the data for this variable.
The mean and median are higher than the mode, which indicates that
the distribution is right-skewed.
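Whether a standard deviation counts as "high" depends on the scale of the variable, so it can help to compare it with the mean. The sketch below uses made-up displacement values and computes the sample standard deviation together with the coefficient of variation (standard deviation divided by mean). The use of the coefficient of variation here is an illustrative choice, not something the original analysis states.

```python
import statistics

# Made-up displacement values, for illustration only.
displacement = [97, 98, 120, 151, 250, 350, 400]

mean_d = statistics.mean(displacement)
stdev_d = statistics.stdev(displacement)  # sample standard deviation (n - 1)
cv = stdev_d / mean_d                     # spread relative to the mean

print(f"mean={mean_d:.1f} stdev={stdev_d:.1f} cv={cv:.2f}")
```

A coefficient of variation that is a large fraction of 1 means the typical deviation is large compared with the average value, which supports describing the variable as highly variable.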
Horsepower
We find the mode is greater than median and mean, indicating that
the data is left skewed.
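The rule of thumb used here compares the order of the mean, median, and mode. A minimal sketch of that rule follows. The function name skew_hint and the sample lists are hypothetical, and the rule is only a heuristic that can mislead for multimodal or heavily discrete data.

```python
import statistics


def skew_hint(values):
    """Rough skew label from the ordering of mean, median, and mode."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    mode = statistics.mode(values)  # first most-frequent value if tied
    if mode < median < mean:
        return "right-skewed"  # long tail of large values
    if mean < median < mode:
        return "left-skewed"   # long tail of small values
    return "no clear hint"


# Made-up samples for illustration.
right_tail = [1, 1, 1, 2, 2, 3, 4, 9]
left_tail = [1, 6, 7, 8, 8, 9, 9, 9]

print(skew_hint(right_tail))  # prints: right-skewed
print(skew_hint(left_tail))   # prints: left-skewed
```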
Weight
Here we find the mean and median are almost equal, so we can expect
the data to be approximately normally distributed.
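One way to put a number on "almost equal" is Pearson's second skewness coefficient, 3 * (mean - median) / standard deviation. The sketch below is illustrative only: the function name and the weight values are made up, and a coefficient near zero suggests symmetry but does not by itself establish that the data is normal.

```python
import statistics


def pearson_median_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / stdev."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    return 3 * (mean - median) / statistics.stdev(values)


# Made-up weights, symmetric around 3000.
weights = [2000, 2500, 2800, 3000, 3200, 3500, 4000]
# Made-up sample with a long right tail, for contrast.
right_tail = [1, 1, 1, 2, 2, 3, 4, 9]

symmetric_score = pearson_median_skewness(weights)
skewed_score = pearson_median_skewness(right_tail)
print(symmetric_score, skewed_score)
```

Values close to zero correspond to the mean and median nearly coinciding, while clearly positive or negative values point to a right or left tail.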