For all calculations, please keep two decimals for the final answer.
Use the data set below for Questions 1 & 2:
ID | Sit-ups (Test A) | Softball Throw / ft (Test B) | 40-Yard Dash (Test C) |
1 | 23 | 25 | 9 |
2 | 34 | 40 | 7.7 |
3 | 40 | 38 | 7.1 |
4 | 31 | 38 | 7.5 |
5 | 35 | 39 | 7.4 |
6 | 20 | 24 | 8.3 |
7 | 44 | 53 | 7.2 |
8 | 22 | 34 | 7.6 |
9 | 39 | 42 | 6.9 |
10 | 37 | 45 | 7.2 |
11 | 32 | 45 | 7.3 |
12 | 33 | 39 | 7.6 |
Make three scatter plots:
the first one is for the data sets of Test A (Sit-Ups) & Test C (40-Yard Dash),
the second one is for the data sets of Test A (Sit-Ups) & Test B (Softball Throw), and
the last one is for the data sets of Test B (Softball Throw) & Test C (40-Yard Dash).
For each scatter plot, briefly describe your observations (e.g., does the plot show a positive or negative relationship between the two variables? Is the relationship strong, weak or moderate? Does a straight line fit the data better than a curved line?)
Show all work. Calculate correlation coefficients for:
Test A (Sit-Ups) & Test C (40 Yard Dash) = __________
Test A (Sit-Ups) & Test B (Softball Throw) = __________
Test B (Softball Throw) & Test C (40-Yard Dash) = __________
Based on your calculation, please answer the following question for each correlation coefficient: what does the correlation coefficient indicate (hint: direction, degree and form)?
Solution:
Question 1) Construction of the 3 Scatter Plots:
Scatter Plot for the data sets of Test A (Sit-Ups) & Test C (40-Yard Dash)
We construct the scatter plot of Test A (Sit-Ups) & Test C (40-Yard Dash) using R.
Observations: The scatter plot shows a negative relationship between Test A (Sit-Ups) and Test C (40-Yard Dash): as the Test A scores (number of sit-ups) increase, the Test C scores (time taken for the 40-yard dash) tend to decrease. The relationship between the two variables appears moderate. A straight line is adequate to fit the data. A least-squares polynomial curve will have a coefficient of determination (r²) at least as high as that of the straight line, because r² never decreases as the degree of the polynomial increases; a higher r² alone therefore does not show that a curve describes the data better.
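The remark about r² and the degree of the polynomial can be checked numerically. The following is a minimal sketch, not part of the original solution: it fits least-squares polynomials of degree 1 to 3 to the Test A and Test C data with `lm()` and `poly()` and prints the r² of each fit. The variable names `sit_ups`, `dash` and `r2_by_degree` are chosen here for illustration.

```r
# Data from the table above (Test A and Test C)
sit_ups <- c(23, 34, 40, 31, 35, 20, 44, 22, 39, 37, 32, 33)
dash    <- c(9, 7.7, 7.1, 7.5, 7.4, 8.3, 7.2, 7.6, 6.9, 7.2, 7.3, 7.6)

# Fit least-squares polynomials of degree 1, 2 and 3 and collect r-squared
r2_by_degree <- sapply(1:3, function(deg) {
  fit <- lm(dash ~ poly(sit_ups, deg))
  summary(fit)$r.squared
})
names(r2_by_degree) <- c("linear", "quadratic", "cubic")

# r-squared cannot decrease when a higher-degree term is added
print(round(r2_by_degree, 2))
```

For the straight-line fit, r² equals the square of the correlation coefficient between the two variables, so the first value should agree with `cor(sit_ups, dash)^2`.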
Scatter Plot for the data sets of Test A (Sit-Ups) & Test B (Softball Throw)
We construct the scatter plot of Test A (Sit-Ups) & Test B (Softball Throw) using R.
Observations: The scatter plot shows a positive relationship between Test A (Sit-Ups) and Test B (Softball Throw): as the Test A scores (number of sit-ups) increase, the Test B scores (distance of the softball throw in feet) also tend to increase. The relationship between the two variables appears strong. A straight line is adequate to fit the data; as with the first plot, a higher-degree polynomial curve would give an r² at least as high as the straight line, since r² never decreases as the degree increases.
Scatter Plot for the data sets of Test B (Softball Throw / ft) & Test C (40-Yard Dash)
We construct the scatter plot of Test B (Softball Throw / ft) & Test C (40-Yard Dash) using R.
Observations: The scatter plot shows a negative relationship between Test B (Softball Throw / ft) and Test C (40-Yard Dash): as the Test B scores (distance of the softball throw in feet) increase, the Test C scores (time taken for the 40-yard dash) tend to decrease. The relationship between the two variables appears moderate. A straight line is adequate to fit the data; as with the other two plots, a higher-degree polynomial regression curve would give an r² at least as high as the straight line, since r² never decreases as the degree increases.
-------------------------------------------------------
-------------------------------------------------------
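Question 2) Calculation of the three Correlation Coefficients:
Let "X" and "Y" be two random variables having "n" observations each. That is,

$$X = (x_1, x_2, \dots, x_n), \qquad Y = (y_1, y_2, \dots, y_n).$$

Then the formula for calculating the correlation coefficient between X and Y is

$$r_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}},$$

where $\bar{x}$ and $\bar{y}$ are the sample means of X and Y.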
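The product-moment (Pearson) correlation coefficient, which is what R's `cor()` computes by default, can also be computed directly from sums of deviations about the means. The sketch below is an illustration rather than part of the original solution: `pearson_r` is a hypothetical helper name, and the Test A and Test B data are used only as an example.

```r
# Hypothetical helper: Pearson's r from deviations about the means
pearson_r <- function(x, y) {
  stopifnot(length(x) == length(y))
  dx <- x - mean(x)   # deviations of x from its mean
  dy <- y - mean(y)   # deviations of y from its mean
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

# Data from the table above (Test A and Test B)
sit_ups  <- c(23, 34, 40, 31, 35, 20, 44, 22, 39, 37, 32, 33)
throw_ft <- c(25, 40, 38, 38, 39, 24, 53, 34, 42, 45, 45, 39)

r_manual  <- pearson_r(sit_ups, throw_ft)
r_builtin <- cor(sit_ups, throw_ft)   # built-in, Pearson by default

# Both values round to 0.86
print(round(c(manual = r_manual, builtin = r_builtin), 2))
```

Agreement between the two values is a quick way to confirm that the hand calculation and the software are using the same formula.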
Correlation Coefficient for the data sets of Test A (Sit-Ups) & Test C (40-Yard Dash)
We calculate the correlation coefficient between Test A (Sit-Ups) & Test C (40-Yard Dash).
Let X represent the Test A (Sit-Ups) scores and Y represent the Test C (40-Yard Dash) scores.
Thus, the correlation coefficient between Test A (Sit-Ups) & Test C (40-Yard Dash) is rAC = -0.77.
Interpretation: Since rAC = -0.77 is negative, the relationship between Test A (Sit-Ups) & Test C (40-Yard Dash) runs in the negative direction. From the value of the correlation coefficient, we can say that a moderate negative relationship of linear form exists between Test A (Sit-Ups) & Test C (40-Yard Dash).
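For readers who want the intermediate work, the value for Test A and Test C can be reproduced from the data table by dividing the sum of products of deviations by the square root of the product of the two sums of squared deviations. The symbols S_XY, S_XX and S_YY are shorthand introduced here for those sums; they are not notation from the original solution. With X as Test A, Y as Test C and n = 12:

```latex
\begin{align*}
\bar{x} &= \tfrac{390}{12} = 32.5, \qquad \bar{y} = \tfrac{90.8}{12} \approx 7.567 \\
S_{XY} &= \sum x_i y_i - n\,\bar{x}\,\bar{y} = 2914.2 - 2951 = -36.8 \\
S_{XX} &= \sum (x_i - \bar{x})^2 = 619 \\
S_{YY} &= \sum (y_i - \bar{y})^2 \approx 3.6467 \\
r_{AC} &= \frac{S_{XY}}{\sqrt{S_{XX}\,S_{YY}}}
        = \frac{-36.8}{\sqrt{619 \times 3.6467}}
        \approx \frac{-36.8}{47.51} \approx -0.77
\end{align*}
```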
Correlation Coefficient for the data sets of Test A (Sit-Ups) & Test B (Softball Throw)
We calculate the correlation coefficient between Test A (Sit-Ups) & Test B (Softball Throw).
Let X represent the Test A (Sit-Ups) scores and Y represent the Test B (Softball Throw) scores.
Thus, the correlation coefficient between Test A (Sit-Ups) & Test B (Softball Throw) is rAB = 0.86.
Interpretation: Since rAB = 0.86 is positive, the relationship between Test A (Sit-Ups) & Test B (Softball Throw) runs in the positive direction. From the value of the correlation coefficient, we can say that a strong positive relationship of linear form exists between Test A (Sit-Ups) & Test B (Softball Throw).
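The intermediate work for Test A and Test B can be laid out in the same way, using sums of deviations about the means computed from the data table. The symbols S_XY, S_XX and S_YY are shorthand introduced here, not notation from the original solution. With X as Test A, Y as Test B and n = 12:

```latex
\begin{align*}
\bar{x} &= \tfrac{390}{12} = 32.5, \qquad \bar{y} = \tfrac{462}{12} = 38.5 \\
S_{XY} &= \sum x_i y_i - n\,\bar{x}\,\bar{y} = 15588 - 15015 = 573 \\
S_{XX} &= \sum (x_i - \bar{x})^2 = 619 \\
S_{YY} &= \sum (y_i - \bar{y})^2 = 723 \\
r_{AB} &= \frac{S_{XY}}{\sqrt{S_{XX}\,S_{YY}}}
        = \frac{573}{\sqrt{619 \times 723}}
        \approx \frac{573}{668.98} \approx 0.86
\end{align*}
```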
Correlation Coefficient for the data sets of Test B (Softball Throw) & Test C (40-Yard Dash)
We calculate the correlation coefficient between Test B (Softball Throw) & Test C (40-Yard Dash).
Let X represent the Test B (Softball Throw) scores and Y represent the Test C (40-Yard Dash) scores.
Thus, the correlation coefficient between Test B (Softball Throw) & Test C (40-Yard Dash) is rBC = -0.81.
Interpretation: Since rBC = -0.81 is negative, the relationship between Test B (Softball Throw) & Test C (40-Yard Dash) runs in the negative direction. From the value of the correlation coefficient, we can say that a strong negative relationship of linear form exists between Test B (Softball Throw) & Test C (40-Yard Dash).
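The intermediate work for Test B and Test C follows the same pattern, using sums of deviations about the means computed from the data table. The symbols S_XY, S_XX and S_YY are shorthand introduced here, not notation from the original solution. With X as Test B, Y as Test C and n = 12:

```latex
\begin{align*}
\bar{x} &= \tfrac{462}{12} = 38.5, \qquad \bar{y} = \tfrac{90.8}{12} \approx 7.567 \\
S_{XY} &= \sum x_i y_i - n\,\bar{x}\,\bar{y} = 3454.3 - 3495.8 = -41.5 \\
S_{XX} &= \sum (x_i - \bar{x})^2 = 723 \\
S_{YY} &= \sum (y_i - \bar{y})^2 \approx 3.6467 \\
r_{BC} &= \frac{S_{XY}}{\sqrt{S_{XX}\,S_{YY}}}
        = \frac{-41.5}{\sqrt{723 \times 3.6467}}
        \approx \frac{-41.5}{51.35} \approx -0.81
\end{align*}
```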
---------------------------------------------------------------
---------------------------------------------------------------
R code for constructing the scatter plots and calculating the correlation coefficients:
## Data Entry ##
# Descriptive names are used so that the base function c() is not masked.
sit_ups  <- c(23, 34, 40, 31, 35, 20, 44, 22, 39, 37, 32, 33)            # Test A
throw_ft <- c(25, 40, 38, 38, 39, 24, 53, 34, 42, 45, 45, 39)            # Test B
dash     <- c(9, 7.7, 7.1, 7.5, 7.4, 8.3, 7.2, 7.6, 6.9, 7.2, 7.3, 7.6)  # Test C
## Scatter Plots ##
# text() annotates an existing plot, so it is called after plot(), not inside it.
plot(sit_ups, dash, xlab = "Test A (Sit-Ups)", ylab = "Test C (40-Yard Dash)",
     main = "Scatter Plot for Sit-Ups (A) & 40-Yard Dash (C)")
text(35, 8.5, "Correlation Coefficient (r) = -0.77")
plot(sit_ups, throw_ft, xlab = "Test A (Sit-Ups)", ylab = "Test B (Softball Throw)",
     main = "Scatter Plot for Sit-Ups (A) & Softball Throw (B)")
text(25, 50, "Correlation Coefficient (r) = +0.86")
plot(throw_ft, dash, xlab = "Test B (Softball Throw)", ylab = "Test C (40-Yard Dash)",
     main = "Scatter Plot for Softball Throw (B) & 40-Yard Dash (C)")
text(45, 8.5, "Correlation Coefficient (r) = -0.81")
## Correlation Coefficients ##
# cor() returns the Pearson correlation by default; results are rounded to two decimals.
cor_ac <- round(cor(sit_ups, dash), 2); cor_ac
cor_ab <- round(cor(sit_ups, throw_ft), 2); cor_ab
cor_bc <- round(cor(throw_ft, dash), 2); cor_bc
---------------------------------------------------------------