Jim is a transportation engineer who is interested in whether roadways that score high in one traffic engineering aspect also tend to score high in other aspects. To address this question, he completed a survey of 80 roadways that contains five measures of effectiveness, with ratings for safety, capacity, speed, alignment, and flow.
a) Conduct a correlational analysis to investigate these relationships. What are your conclusions?
b) Jim determines that the speed of a facility affects the other roadway scores. Reevaluate Jim's hypothesis by controlling for speed. What effect does partialling out the effects of speed have on the relationships?
(a) From the given data set, we first compute the pairwise (zero-order) Pearson correlation coefficients among the five ratings.

| safety | capacity | speed | alignment | flow | 
| x1 | x2 | x3 | x4 | x5 | 
| 48 | 56 | 48 | 35 | 47 | 
| 44 | 50 | 47 | 41 | 51 | 
| 38 | 41 | 49 | 59 | 46 | 
| 48 | 59 | 59 | 49 | 52 | 
| 44 | 44 | 52 | 37 | 47 | 
| 43 | 57 | 42 | 52 | 54 | 
| 49 | 53 | 56 | 32 | 49 | 
| 52 | 65 | 57 | 54 | 54 | 
| 55 | 63 | 56 | 58 | 53 | 
| 50 | 60 | 49 | 52 | 54 | 
| 45 | 47 | 44 | 51 | 49 | 
| 61 | 56 | 57 | 63 | 50 | 
| 49 | 57 | 50 | 51 | 50 | 
| 44 | 45 | 46 | 31 | 47 | 
| 39 | 50 | 51 | 28 | 52 | 
| 50 | 51 | 51 | 43 | 54 | 
| 49 | 51 | 50 | 50 | 49 | 
| 37 | 52 | 44 | 45 | 53 | 
| 49 | 55 | 51 | 47 | 53 | 
| 47 | 48 | 46 | 44 | 54 | 
| 55 | 56 | 55 | 42 | 54 | 
| 49 | 60 | 62 | 58 | 53 | 
| 45 | 58 | 65 | 59 | 54 | 
| 39 | 40 | 36 | 44 | 46 | 
| 54 | 56 | 51 | 55 | 52 | 
| 58 | 58 | 51 | 44 | 53 | 
| 52 | 52 | 62 | 47 | 53 | 
| 50 | 56 | 38 | 47 | 51 | 
| 62 | 57 | 52 | 25 | 53 | 
| 55 | 55 | 59 | 57 | 49 | 
| 50 | 52 | 52 | 47 | 53 | 
| 44 | 35 | 44 | 46 | 45 | 
| 55 | 55 | 39 | 24 | 48 | 
| 46 | 45 | 44 | 54 | 47 | 
| 48 | 51 | 47 | 50 | 50 | 
| 52 | 52 | 38 | 53 | 47 | 
| 52 | 49 | 50 | 59 | 49 | 
| 50 | 49 | 39 | 40 | 48 | 
| 48 | 50 | 52 | 52 | 43 | 
| 50 | 45 | 46 | 36 | 47 | 
| 50 | 46 | 44 | 44 | 52 | 
| 43 | 40 | 43 | 27 | 45 | 
| 40 | 44 | 54 | 50 | 47 | 
| 57 | 67 | 53 | 41 | 58 | 
| 57 | 52 | 56 | 67 | 58 | 
| 61 | 62 | 51 | 60 | 56 | 
| 46 | 52 | 52 | 60 | 56 | 
| 42 | 44 | 48 | 41 | 56 | 
| 60 | 58 | 66 | 50 | 57 | 
| 47 | 47 | 58 | 51 | 56 | 
| 50 | 59 | 51 | 43 | 57 | 
| 63 | 65 | 56 | 63 | 62 | 
| 59 | 52 | 42 | 42 | 56 | 
| 50 | 52 | 59 | 57 | 56 | 
| 49 | 59 | 62 | 51 | 61 | 
| 51 | 52 | 56 | 44 | 58 | 
| 47 | 62 | 67 | 54 | 66 | 
| 52 | 62 | 58 | 47 | 62 | 
| 60 | 48 | 55 | 45 | 63 | 
| 45 | 45 | 59 | 47 | 60 | 
| 59 | 53 | 51 | 39 | 57 | 
| 57 | 51 | 68 | 59 | 59 | 
| 46 | 60 | 64 | 54 | 57 | 
| 51 | 61 | 46 | 44 | 59 | 
| 47 | 53 | 49 | 41 | 55 | 
| 50 | 63 | 52 | 48 | 64 | 
| 57 | 69 | 70 | 51 | 57 | 
| 50 | 57 | 51 | 37 | 56 | 
| 65 | 69 | 62 | 60 | 55 | 
| 50 | 58 | 54 | 49 | 56 | 
| 56 | 63 | 54 | 49 | 58 | 
| 54 | 65 | 57 | 38 | 56 | 
| 42 | 53 | 47 | 45 | 56 | 
| 53 | 53 | 62 | 49 | 59 | 
| 61 | 54 | 57 | 63 | 61 | 
| 47 | 56 | 50 | 58 | 55 | 
| 46 | 55 | 48 | 55 | 57 | 
| 57 | 54 | 58 | 58 | 57 | 
| 56 | 60 | 51 | 47 | 58 | 
| 50 | 52 | 50 | 46 | 56 | 
Because the hand calculation is lengthy, we use Excel to compute the correlations.
[How to do it in Excel: go to the Data tab, choose Data Analysis, choose Correlation, and select the data range.]
The correlation coefficient matrix is given below:
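For readers without Excel, the same calculation can be sketched in plain Python. This is only an illustration: it uses the first six rows of the table above rather than all 80, so its output will not match the full-sample matrix, and the helper names `pearson` and `correlation_matrix` are made up for this example.

```python
from math import sqrt

# Column order follows the table above (x1..x5).
NAMES = ["safety", "capacity", "speed", "alignment", "flow"]

# First six rows of the data table only; the real analysis uses all 80 rows.
ROWS = [
    (48, 56, 48, 35, 47),
    (44, 50, 47, 41, 51),
    (38, 41, 49, 59, 46),
    (48, 59, 59, 49, 52),
    (44, 44, 52, 37, 47),
    (43, 57, 42, 52, 54),
]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(rows):
    """Correlate every column of `rows` with every other column."""
    columns = list(zip(*rows))  # transpose rows into columns
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


matrix = correlation_matrix(ROWS)
for name, row in zip(NAMES, matrix):
    print(f"{name:>9}", " ".join(f"{r:6.3f}" for r in row))
```

Replacing `ROWS` with the full 80-row table should give the same kind of matrix as Excel's Correlation tool.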
| | x1 | x2 | x3 | x4 | x5 |
| x1 | 1.000 | | | | |
| x2 | 0.552 | 1.000 | | | |
| x3 | 0.351 | 0.462 | 1.000 | | |
| x4 | 0.218 | 0.244 | 0.400 | 1.000 | |
| x5 | 0.393 | 0.546 | 0.525 | 0.261 | 1.000 |
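From the above, the following pairs of variables are moderately correlated, with coefficients around 0.5:
(x1, x2) at 0.552; (x2, x5) at 0.546; and (x3, x5) at 0.525. The pair (x2, x3), at 0.462, is close behind.
The remaining pairs have weak correlations (0.218 to 0.400). All coefficients are positive, so roadways that score high on one measure tend to score somewhat higher on the others, but no pair of ratings is strongly related.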
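One way to back up the "moderate" and "low" labels is to test each coefficient against zero with the usual t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2), with n = 80. The sketch below does this for the matrix values. The critical value 1.991 is an approximate two-sided 5% cutoff for 78 degrees of freedom, taken from standard t tables and hard-coded here as an assumption. The names `t_statistic` and `significant` are illustrative.

```python
from math import sqrt

N = 80          # number of roadways surveyed
T_CRIT = 1.991  # assumed: approximate two-sided 5% critical t for 78 df

# Lower triangle of the correlation matrix reported above.
R = {
    ("x1", "x2"): 0.552,
    ("x1", "x3"): 0.351,
    ("x1", "x4"): 0.218,
    ("x1", "x5"): 0.393,
    ("x2", "x3"): 0.462,
    ("x2", "x4"): 0.244,
    ("x2", "x5"): 0.546,
    ("x3", "x4"): 0.400,
    ("x3", "x5"): 0.525,
    ("x4", "x5"): 0.261,
}


def t_statistic(r, n):
    """t statistic for testing H0: population correlation = 0."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Flag each pair whose |t| exceeds the assumed critical value.
significant = {pair: abs(t_statistic(r, N)) > T_CRIT for pair, r in R.items()}

for pair, r in R.items():
    flag = "significant" if significant[pair] else "not significant"
    print(f"{pair[0]}-{pair[1]}: r = {r:.3f}, t = {t_statistic(r, N):.2f} ({flag})")
```

Under these assumptions, pairs that fail the test are best read as showing no reliable linear relationship, not merely a weak one.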
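(b) According to Jim, speed affects the other ratings, so we need to see how the relationships among the other variables change once speed (x3) is held constant.
In the table above, speed correlates most strongly with flow (0.525), followed by capacity (0.462), alignment (0.400), and safety (0.351), so speed is moderately related to flow and capacity and more weakly related to the other two ratings. Controlling for speed means computing first-order partial correlations, r_ij.3 = (r_ij - r_i3 * r_j3) / sqrt((1 - r_i3^2) * (1 - r_j3^2)). Computed from the rounded matrix above, partialling out speed lowers every coefficient: safety-capacity falls from 0.552 to about 0.47, capacity-flow from 0.546 to about 0.40, and safety-flow from 0.393 to about 0.26, while the three pairs involving alignment all drop below 0.10. Speed therefore accounts for part of the shared variation among the ratings, and for most of alignment's apparent association with the other measures, but the safety-capacity and capacity-flow relationships remain moderate after speed is controlled.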
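Partialling out speed can be done directly from the correlation matrix, without going back to the raw data. The sketch below applies the standard first-order partial correlation formula to the rounded coefficients reported above, so the results carry that rounding. The function names `r` and `partial_r` are illustrative.

```python
from math import sqrt

# Rounded zero-order correlations from the matrix above, keyed by (i, j), i < j.
R = {
    (1, 2): 0.552, (1, 3): 0.351, (1, 4): 0.218, (1, 5): 0.393,
    (2, 3): 0.462, (2, 4): 0.244, (2, 5): 0.546,
    (3, 4): 0.400, (3, 5): 0.525,
    (4, 5): 0.261,
}
LABELS = {1: "safety", 2: "capacity", 3: "speed", 4: "alignment", 5: "flow"}


def r(i, j):
    """Zero-order correlation between variables i and j (symmetric lookup)."""
    return 1.0 if i == j else R[(min(i, j), max(i, j))]


def partial_r(i, j, k):
    """First-order partial correlation of i and j, controlling for k."""
    numerator = r(i, j) - r(i, k) * r(j, k)
    denominator = sqrt((1 - r(i, k) ** 2) * (1 - r(j, k) ** 2))
    return numerator / denominator


SPEED = 3
others = [1, 2, 4, 5]  # every variable except speed

# Partial correlation for each pair of non-speed variables.
partials = {
    (i, j): partial_r(i, j, SPEED)
    for a, i in enumerate(others)
    for j in others[a + 1:]
}

for (i, j), value in partials.items():
    print(f"{LABELS[i]}-{LABELS[j]}: zero-order {r(i, j):.3f}, "
          f"controlling for speed {value:.3f}")
```

A partial correlation that is clearly smaller than its zero-order value suggests that speed accounted for part of that pair's relationship. A value that barely moves suggests the relationship does not depend on speed.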