For this question, calculate the deviations from the mean, compute the correlation coefficient manually, and cross-check it with Excel's CORREL function. Then generate the scatter plot with a trendline, and indicate whether there is a strong, weak, or no correlation between the two variables.
Question:
Fair Isaac, the company that developed the credit score (FICO) model used by most lenders today, would like to test the linear relationship between an individual's age and credit score. The following table shows the credit scores and ages of 10 randomly selected individuals:
AGE: 36 24 54 28 31 47 35 59 40 42
FICO: 675 655 760 615 660 790 720 760 685 610
Using Excel, insert a scatter plot with Age on the x-axis and FICO on the y-axis. Right-click any point on the plot, select Add Trendline, and choose Linear. Tick Display Equation on chart and Display R-squared value on chart.
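For readers who want to check the trendline outside Excel, here is a minimal Python sketch of the least-squares line and R-squared that a linear trendline reports. The function name `linear_trendline` is hypothetical, and the data are assumed to be the eight (age, FICO) pairs that appear in the worked table of this solution.

```python
# Hypothetical helper: least-squares line y = slope * x + intercept, plus R^2.
def linear_trendline(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Assumption: the eight (age, FICO) pairs used in the worked table.
age = [36, 24, 54, 28, 31, 47, 35, 59]
fico = [675, 655, 760, 615, 660, 790, 720, 760]

slope, intercept, r_squared = linear_trendline(age, fico)
print(f"y = {slope:.4f}x + {intercept:.2f}, R^2 = {r_squared:.4f}")
```

The printed equation and R-squared should agree with what Excel displays on the chart when the same pairs are plotted.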
The worked table uses the first eight (age, FICO) pairs listed above (n = 8):
Age (x) | FICO (y) | (x-x̅) | (y-y̅) | (x-x̅)(y-y̅) | (x-x̅)² | (y-y̅)² |
36 | 675 | -3.25 | -29.375 | 95.469 | 10.563 | 862.891 |
24 | 655 | -15.25 | -49.375 | 752.969 | 232.563 | 2437.891 |
54 | 760 | 14.75 | 55.625 | 820.469 | 217.563 | 3094.141 |
28 | 615 | -11.25 | -89.375 | 1005.469 | 126.563 | 7987.891 |
31 | 660 | -8.25 | -44.375 | 366.094 | 68.063 | 1969.141 |
47 | 790 | 7.75 | 85.625 | 663.594 | 60.063 | 7331.641 |
35 | 720 | -4.25 | 15.625 | -66.406 | 18.063 | 244.141 |
59 | 760 | 19.75 | 55.625 | 1098.594 | 390.063 | 3094.141 |
Σx=314 | Σy=5635 | Σ(x-x̅)(y-y̅)=4736.25 | Σ(x-x̅)²=1123.5 | Σ(y-y̅)²=27021.88 |
x̅ = Σx/n = 314/8 = 39.25, y̅ = Σy/n = 5635/8 = 704.375
Correlation: r = Σ(x-x̅)(y-y̅)/(Σ(x-x̅)²*Σ(y-y̅)²)^0.5 = 4736.25/(1123.5*27021.88)^0.5 = 0.8596
With r ≈ 0.86, there is a strong positive correlation between the two variables.
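The manual calculation above can be reproduced with a short Python sketch. The names `deviation_rows` and `pearson_r` are hypothetical, and the input is assumed to be the eight pairs shown in the table.

```python
import math

def deviation_rows(xs, ys):
    """Return one row per observation: (x - x_bar, y - y_bar, product, dx^2, dy^2)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    rows = []
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        rows.append((dx, dy, dx * dy, dx ** 2, dy ** 2))
    return rows

def pearson_r(xs, ys):
    """Pearson correlation from the column sums of the deviation table."""
    rows = deviation_rows(xs, ys)
    sxy = sum(row[2] for row in rows)
    sxx = sum(row[3] for row in rows)
    syy = sum(row[4] for row in rows)
    return sxy / math.sqrt(sxx * syy)

# Assumption: the eight (age, FICO) pairs from the table.
age = [36, 24, 54, 28, 31, 47, 35, 59]
fico = [675, 655, 760, 615, 660, 790, 720, 760]

r = pearson_r(age, fico)
print(f"r = {r:.4f}")
```

The same function can be applied to all ten pairs listed in the question by extending the two lists.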
Test for linear relationship
Cross-check with Excel: r = CORREL(age,FICO) = 0.8596, which matches the manual calculation.
n = 8, Degrees of freedom: df = n-2 = 6, Level of significance: α = 0.05
H0: ρ = 0, There is no significant linear relationship between age and credit score of an individual
Ha: ρ ≠ 0, There is a significant linear relationship between age and credit score of an individual
Test statistic: t = r*((n-2)/(1-r*r))^0.5 = 0.8596*((8-2)/(1-0.8596*0.8596))^0.5 = 4.121
Critical value (Using Excel function T.INV.2T(probability,df)) = T.INV.2T(0.05,6) = 2.447
Since the test statistic (4.121) is greater than the critical value (2.447), we reject H0.
So, there is a significant linear relationship between age and credit score of an individual.
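As a check on the arithmetic, the t statistic for testing ρ = 0 is conventionally computed as r·sqrt((n−2)/(1−r²)). The sketch below evaluates it in Python; the function name `correlation_t_statistic` is hypothetical, and the critical value is taken from the T.INV.2T result quoted in the text rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

r = 0.8596          # correlation from the worked calculation
n = 8               # number of pairs used in the table
t_critical = 2.447  # assumption: T.INV.2T(0.05, 6) as quoted in the text

t_stat = correlation_t_statistic(r, n)
reject_h0 = abs(t_stat) > t_critical  # two-tailed decision rule
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Comparing the absolute value of t with the critical value keeps the decision rule valid for negative correlations as well.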