Scatterplot of daily cycling distances and type of climb: Every summer, the touring company America by Bicycle conducts the “Cross Country Challenge,” a 7-week bicycle journey across the United States from San Francisco, California, to Portsmouth, New Hampshire. At some point during the trip, the exhausted cyclists usually start to complain that the organizers are purposely planning for days with lots of hill and mountain climbing to coincide with longer distances. The tour staff counter that no relation exists between climbs and mileage and that the route is organized based on practical issues, such as the location of towns in which riders can stay. The organizers who planned the route (these are the company owners who are not on the tour) say that they actually tried to reduce the mileage on the days with the worst climbs. Here are the approximate daily mileages and climbs (in vertical feet), as estimated from one rider’s bicycle computer.
From the scatterplot we can see that the points are randomly scattered, with no visible pattern. At every mileage, most of the climbs fall below 2000 feet. For example, one 73-mile day has a climb of about 1000 feet, and a 103-mile day also has a climb of about 1000 feet, yet another 73-mile day has the highest climb of all, about 8500 feet. A day of a given length can therefore have either a small climb or a large one, so we cannot predict climb as a function of mileage, and no relationship between mileage and climb can be fitted.
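One way to check the visual impression of "no pattern" is to compute the Pearson correlation coefficient between mileage and climb. The sketch below is only an illustration: the function name `pearson_r` and the variable names are assumptions, and because the full data table is not reproduced here, the demo uses only the three days quoted above. Three points are far too few to support any conclusion, so the printed value shows the mechanics, not a result.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Only the three days quoted in the answer, NOT the full tour data.
miles = [73, 103, 73]
climb_ft = [1000, 1000, 8500]

r = pearson_r(miles, climb_ft)
print(round(r, 2))
```

Applied to the complete list of daily mileages and climbs, a value of r close to 0 would agree with the conclusion that there is no linear relationship, while a clearly positive or negative value would support the cyclists or the route planners respectively.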
The cyclists' impression may be mistaken because of their intense schedule. They ride day after day, so after a few days they may be exhausted and conclude that longer days also come with more climbing. The organizers' claim is not supported by the data either. They may have misjudged the climbs, or the roads may wind with many turns, so that even a modest climb requires a longer distance. Both views reflect personal bias, so we need actual data and inferential statistics to determine the true relationship.