Le Tour: the Good & the Bad
It would be interesting to see whether we could use graphics (or other statistical methods) to identify potential doping candidates ...
The only thing we would need is a list of riders who either
- have admitted doping
- have been convicted of doping
- or have been accused of being involved in doping.
With such a list we could at least look at the 2005 and 2006 results to find patterns ...
Sergej collected some data from:
- http://en.wikipedia.org/wiki/Operaci%C3%B3n_Puerto_doping_case
- http://en.wikipedia.org/wiki/Doping_at_the_Tour_de_France
- http://sports.yahoo.com/sc/news?slug=ap-tourdefrance-dropouts&prov=ap&type=lgns
- http://en.wikipedia.org/wiki/List_of_sportspeople_who_tested_positive_for_banned_substances
There is more to look at, but at first glance the picture is pretty consistent ...

Thank you
Frederic
The boxplots show the cumulative times of all riders, with each boxplot centered at the median time of its stage. The highlighted boxplots show the subgroup of potentially doped riders. As the highlighted medians are clearly better (shorter times) than the overall medians, this might suggest that better performance is often caused by doping ...
It is also interesting to compare the shapes of the distributions for the 2007 data with those of earlier Tours ...
Best
Martin
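For anyone who wants to try the comparison behind the boxplots numerically, here is a minimal sketch in Python. It is not the code used to produce the plots. The rider names, the times, and the `flagged` set are made up for illustration, and the helpers `centered_times` and `median_gap_by_stage` are hypothetical names. For each stage, the cumulative times are centered at the stage median, and the median of the flagged subgroup is then reported relative to the whole field.

```python
from statistics import median

# Hypothetical cumulative times in seconds, per stage: {stage: {rider: time}}.
# These numbers are invented for illustration, not real Tour results.
cumulative_times = {
    1: {"A": 16200, "B": 16230, "C": 16260, "D": 16320, "E": 16410},
    2: {"A": 33000, "B": 33090, "C": 33200, "D": 33420, "E": 33650},
}

# Hypothetical set of riders who admitted, were convicted of, or were accused of doping.
flagged = {"A", "B"}


def centered_times(stage_times):
    """Subtract the stage median so that every stage is centered at zero."""
    stage_median = median(stage_times.values())
    return {rider: t - stage_median for rider, t in stage_times.items()}


def median_gap_by_stage(times_by_stage, flagged_riders):
    """Median centered time of the flagged riders per stage.

    A negative value means the flagged subgroup is faster than the field median.
    """
    gaps = {}
    for stage, stage_times in sorted(times_by_stage.items()):
        centered = centered_times(stage_times)
        flagged_values = [v for r, v in centered.items() if r in flagged_riders]
        if flagged_values:  # skip stages where no flagged rider is still racing
            gaps[stage] = median(flagged_values)
    return gaps


gaps = median_gap_by_stage(cumulative_times, flagged)
for stage, gap in gaps.items():
    print(f"stage {stage}: flagged median is {gap:+.0f} s relative to the field")
```

A consistently negative gap only shows that the flagged riders are faster than the field. It does not by itself say anything about why, since the riders who attract accusations tend to be the ones at the front.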