Running times (Y) and maximal aerobic capacity (X) for 14 female runners.
Data collected for running times and maximal aerobic capacity are listed below:
X: 61.32 55.29 52.83 57.94 53.31 51.32 52.18 52.37 57.91 53.93 47.88 47.41 47.17 51.05
Y: 39.37 39.80 40.03 41.32 42.03 42.37 43.93 44.90 44.90 45.12 45.60 46.03 47.83 48.55
(a) Calculate the mean, median, MAD, MSD, and standard deviation
for each variable. [Include all your steps and explain all the
steps involved in detail]
(b) Which of these statistics give a measure of the center of data
and which give a measure of the spread of data?
(c) Calculate the correlation of the two variables and produce a
scatterplot of Y against X. [Use Excel for the scatterplot, show all
your computations concerning the correlation and explain all your
steps]
(d) Why is it inappropriate to calculate the autocorrelation of
these data?
(a) For X,
Mean = (Σ xi) / n, where the sum runs over i = 1, 2, ..., n (the xi's are the values of X and n is the total number of values).
Mean = (61.32 + 55.29 + .... + 51.05) / 14 = 52.99357
Median is the middle-most value when the values are arranged in increasing order. The values in increasing order are 47.17, 47.41, 47.88, 51.05, 51.32, 52.18, 52.37, 52.83, 53.31, 53.93, 55.29, 57.91, 57.94, 61.32. There are 14 values, an even number, so there is no single middle-most value. In such a case, the median is the average of the two middle-most values (the 7th and 8th). Hence, median = (52.37 + 52.83) / 2 = 52.6
MAD is taken here as the median absolute deviation, defined as the median of the absolute deviations from the median of the data, i.e., MAD = median of |xi - median(xi)|. (Some textbooks use MAD for the mean absolute deviation instead, which would give a different value.)
The absolute deviations of the data from its median, when arranged in increasing order are 0.23, 0.23, 0.42, 0.71, 1.28, 1.33, 1.55, 2.69, 4.72, 5.19, 5.31, 5.34, 5.43, 8.72
Therefore, MAD = (1.55 + 2.69) / 2 = 2.12
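The median and MAD steps for X can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names (`x`, `med`, `abs_dev`, `mad`) are illustrative and not part of the original exercise, and MAD is assumed to mean the median absolute deviation, as in the working above.

```python
from statistics import median

# Maximal aerobic capacity (X) for the 14 runners, as listed in the question.
x = [61.32, 55.29, 52.83, 57.94, 53.31, 51.32, 52.18,
     52.37, 57.91, 53.93, 47.88, 47.41, 47.17, 51.05]

# With an even number of values, median() averages the two middle values.
med = median(x)

# Absolute deviations from the median, sorted in increasing order.
abs_dev = sorted(abs(v - med) for v in x)

# MAD (assumed here: median absolute deviation) is the median of those deviations.
mad = median(abs_dev)

print("median:", round(med, 2))
print("sorted |xi - median|:", [round(d, 2) for d in abs_dev])
print("MAD:", round(mad, 2))
```

Rounding is applied only when printing, so small floating-point differences do not affect the intermediate values.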
MSD, or mean squared deviation, is defined as Σ(xi - mean)^2 / n, where the sum runs over i = 1, 2, ..., n
MSD = Σ(xi - 52.99357)^2 / 14 = 15.93665
Standard deviation is the root mean squared deviation i.e., the square root of MSD
Standard deviation = 3.99207
Similarly, for Y, mean = 43.69857
Median of Y = 44.415
MAD of Y = 2.215
MSD of Y = 8.015541
Standard deviation of Y = 2.83117
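All five statistics for both variables can be reproduced in a few lines. The sketch below assumes the same conventions as the working above: MAD is the median absolute deviation, and MSD and the standard deviation divide by n rather than n - 1 (hence `pvariance` and `pstdev`, the population versions in Python's `statistics` module). The helper name `summarize` is hypothetical.

```python
from statistics import fmean, median, pvariance, pstdev

x = [61.32, 55.29, 52.83, 57.94, 53.31, 51.32, 52.18,
     52.37, 57.91, 53.93, 47.88, 47.41, 47.17, 51.05]
y = [39.37, 39.80, 40.03, 41.32, 42.03, 42.37, 43.93,
     44.90, 44.90, 45.12, 45.60, 46.03, 47.83, 48.55]


def summarize(values):
    """Return mean, median, MAD, MSD and standard deviation (divisor n)."""
    med = median(values)
    return {
        "mean": fmean(values),
        "median": med,
        # Assumption: MAD = median of absolute deviations from the median.
        "mad": median([abs(v - med) for v in values]),
        # Population variance = mean squared deviation from the mean.
        "msd": pvariance(values),
        # Population standard deviation = square root of the MSD.
        "sd": pstdev(values),
    }


for name, values in (("X", x), ("Y", y)):
    stats = summarize(values)
    print(name, {key: round(val, 5) for key, val in stats.items()})
```

If the sample versions (divisor n - 1) were wanted instead, `variance` and `stdev` from the same module could be swapped in, and the results would be slightly larger.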
(b) The mean and median give a measure of center of data while the MAD, MSD and standard deviation give a measure of the spread of data.
(c) The correlation of two variables is given by
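corr(X, Y) = Σ{(xi - mean(x))(yi - mean(y))} / {n * (standard deviation(x)) * (standard deviation(y))}, where the sum runs over i = 1, 2, ..., n and the standard deviations are those from part (a), computed with divisor n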
Therefore,
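corr(X, Y) = {(8.32643 * -4.32857) + (2.29643 * -3.89857) + ... + (-1.94357 * 4.85143)} / (14 * 2.83117 * 3.99207)
= -104.39343 / 158.2312
= -0.65975
The correlation is negative: runners with a higher maximal aerobic capacity tend to have shorter running times.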
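The correlation arithmetic can be verified directly from the formula above. This is a minimal sketch in plain Python; the variable names (`sxy`, `sd_x`, `sd_y`, `r`) are illustrative, and the standard deviations use divisor n so that they match part (a).

```python
from math import sqrt

x = [61.32, 55.29, 52.83, 57.94, 53.31, 51.32, 52.18,
     52.37, 57.91, 53.93, 47.88, 47.41, 47.17, 51.05]
y = [39.37, 39.80, 40.03, 41.32, 42.03, 42.37, 43.93,
     44.90, 44.90, 45.12, 45.60, 46.03, 47.83, 48.55]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Numerator: sum of the products of paired deviations from the means.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Standard deviations with divisor n, as in part (a).
sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)

r = sxy / (n * sd_x * sd_y)

print("sum of products:", round(sxy, 4))
print("correlation:", round(r, 5))
```

With these data the sum of products is negative, so the correlation comes out at about -0.66. Most of the large positive deviations in X are paired with negative deviations in Y, which is why the sign is negative.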
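Steps in Excel are:
Enter the X values in one column and the Y values in the column to its right
Select both columns
On the Insert tab, choose Scatter from the Charts group
Label the horizontal axis X (maximal aerobic capacity) and the vertical axis Y (running time)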
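(d) Autocorrelation is the correlation of a variable with lagged values of itself, so it is meaningful only for a time series, where the observations are ordered in time. These data are not a time series: each observation comes from a different runner, and the order in which the 14 runners are listed has no time meaning (here they are simply listed in increasing order of Y). A lagged correlation would reflect only that arbitrary ordering, so autocorrelation is not appropriate in this case. What is of interest is the correlation between the two variables, running time and maximal aerobic capacity, as computed in (c).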