A newly employed well-qualified data analyst suggested to the
management that a more thorough analysis of the situation could be
done by using data which includes ridership numbers in terms of bus
routes combined with the time-slots information.
Number of Riders | Bus Route | Time Slot |
54 | 1 | 1 |
51 | 2 | 1 |
63 | 3 | 1 |
66 | 4 | 1 |
68 | 1 | 2 |
66 | 2 | 2 |
87 | 3 | 2 |
75 | 4 | 2 |
63 | 1 | 3 |
63 | 2 | 3 |
78 | 3 | 3 |
66 | 4 | 3 |
75 | 1 | 4 |
72 | 2 | 4 |
84 | 3 | 4 |
84 | 4 | 4 |
48 | 1 | 5 |
69 | 2 | 5 |
69 | 3 | 5 |
66 | 4 | 5 |
a. Plot the mean ridership against the factors using
Minitab.
b. From this interaction diagram, which combinations of the TSlot
and BRoute (‘treatment cells’ as they are called) have the highest
and lowest mean ridership respectively?
c. Based on various residual plots, what can you say about the
aptness of this model?
The following is the plot of mean ridership against the two factors, produced in Minitab:
From the above graph, the highest mean ridership occurs at bus route 3 in time slot 2 (87 riders).
The lowest mean ridership occurs at bus route 1 in time slot 5 (48 riders).
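The reading of the graph can be cross-checked directly from the table. The following is only a rough sketch in Python rather than Minitab; the variable names are illustrative and not part of the original answer. Because there is one observation per route and time-slot combination, each cell mean is just the observed count.

```python
# Ridership counts copied from the table: one row per time slot (1..5),
# one column per bus route (1..4).
counts = [
    [54, 51, 63, 66],  # time slot 1
    [68, 66, 87, 75],  # time slot 2
    [63, 63, 78, 66],  # time slot 3
    [75, 72, 84, 84],  # time slot 4
    [48, 69, 69, 66],  # time slot 5
]

# Map each treatment cell (bus route, time slot) to its ridership.
riders = {
    (route, slot): counts[slot - 1][route - 1]
    for slot in range(1, 6)
    for route in range(1, 5)
}

# Cells with the highest and lowest ridership.
highest = max(riders, key=riders.get)
lowest = min(riders, key=riders.get)

# Marginal means, i.e. the quantities a main-effects plot would show.
route_means = {r: sum(riders[(r, s)] for s in range(1, 6)) / 5 for r in range(1, 5)}
slot_means = {s: sum(riders[(r, s)] for r in range(1, 5)) / 4 for s in range(1, 6)}

print("highest cell (route, slot):", highest, riders[highest])
print("lowest cell (route, slot):", lowest, riders[lowest])
print("route means:", route_means)
print("slot means:", slot_means)
```

The marginal means also show route 3 and time slot 4 as the busiest on average, which is a different question from which single cell is largest.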
The residual plots can be examined one by one.
First, the normal probability plot of the residuals.
The normal probability plot indicates that the residuals are approximately normally distributed, so the model fits the data well.
Next, the plot of residuals versus the order of the data.
This plot checks whether the errors are independent. It also supports the model in which the two factors, time slot and bus route, contribute significantly to ridership.
Finally, the plot of residuals versus fitted values.
From all of the above residual plots, we can say that the model is appropriate for the mean ridership.
Both factors, bus route and time slot, are significant for the mean ridership.
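For readers without Minitab, the residuals behind these plots can be reproduced by hand. The sketch below assumes an additive two-way model with no interaction term, which is the usual choice when there is only one observation per cell; the original answer does not state which model was fitted in Minitab, so this is an assumption. Under that model, each fitted value is the route mean plus the time-slot mean minus the grand mean.

```python
# Ridership counts copied from the table: one row per time slot (1..5),
# one column per bus route (1..4).
counts = [
    [54, 51, 63, 66],
    [68, 66, 87, 75],
    [63, 63, 78, 66],
    [75, 72, 84, 84],
    [48, 69, 69, 66],
]
n_slots, n_routes = len(counts), len(counts[0])

grand_mean = sum(map(sum, counts)) / (n_slots * n_routes)
slot_means = [sum(row) / n_routes for row in counts]
route_means = [sum(row[j] for row in counts) / n_slots for j in range(n_routes)]

# Additive fit (assumption: no interaction term in the model).
fitted = {}
residuals = {}
for i in range(n_slots):
    for j in range(n_routes):
        key = (j + 1, i + 1)  # (bus route, time slot)
        fitted[key] = slot_means[i] + route_means[j] - grand_mean
        residuals[key] = counts[i][j] - fitted[key]

# Print residuals ordered by fitted value, as in a residual-versus-fit plot.
for key in sorted(residuals, key=lambda k: fitted[k]):
    print(f"route {key[0]}, slot {key[1]}: "
          f"fitted = {fitted[key]:.2f}, residual = {residuals[key]:+.2f}")
```

A roughly even scatter of these residuals around zero, with no funnel shape as the fitted value increases, is what the residual-versus-fit plot is meant to confirm.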