The ICC and the International Year of Statistics
2013 is the International Year of Statistics. On the balance of probability it was bound to get its own year sooner or later. Whether it is found to be significant for the whole 12 months, time will tell.
On hearing of this year’s appellation, I thought it an opportunity to explore some issues with data recorded at international cricket fixtures that intrigued me. The answers I sought weren’t on the ICC website, so I emailed a query to email@example.com.
I noticed that 2013 has been designated International Year of Statistics, which prompted some thoughts about the statistics that I care most about: cricket statistics. I would be grateful if you could help me with these queries:
1. Is the ICC planning any developments to mark the International Year of Statistics?
2. Who owns the data from which international cricket statistics are drawn? Is it the ICC, the associations of participating teams, the host?
3. Are the detailed data of international matches collated and held centrally? Match scorecards are widely published, but is there a repository of the underlying ball-by-ball data? If so, how can an individual access it?
4. Is there a database of non-match cricket data held by the ICC (e.g. players’ dates and places of birth)?
5. Most international matches have high standards of television coverage, including ball-tracking and ball speed cameras. Who owns the data from these devices? Are these data collated centrally? Are these data held in a way they can be cross-matched with the ball-by-ball score data?
I look forward very much to hearing from you.
Regards
I sent that email on 8 January and requested an acknowledgement a week later. Then, having stumbled across the name of the ICC’s Official Statistician, David Kendix, I re-sent it for his attention on 27 January. You may have guessed that I’ve not had a reply.
I am more interested in the answers than in criticising the ICC. It would be wrong to be too harsh on their customer service. I would imagine they get hundreds of queries daily, many scurrilous, many others asking for services they don’t provide or pitching sales. And they have had a busy month: the build-up to an international tournament, managed in a ‘last-minute Larry’ fashion; a Board meeting where one of the game’s most contentious issues – DRS – was kept off the formal agenda. So, I can see why my questions dithered in somebody’s in-box before disappearing to the trash.
Anyway, the ICC have done me the honour of following me on Twitter. I double-checked this after I told a cricket-playing colleague, who looked alarmed that the International Criminal Court was taking an interest in me.
Back to my questions. I make use of cricinfo’s statsguru function for exploring cricket statistics. It’s an admirable application: pretty flexible, free and authoritative. But increasingly I am finding that the features of the game I want to explore are not easily queried.
For example, a friend (@ghdunn1) asked whether the new ball was a more potent feature of Test cricket now than in the past. It took me over an hour, working through scorecards, to generate the analysis depicted below on last summer’s England v South Africa series. A database of ball-by-ball data could return an answer to that very important question about one of the variables in the game in less time than it took me to analyse a single series manually (NB clearly the availability of ball-by-ball data is a limitation).
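To give a flavour of how such a query might look, here is a minimal Python sketch. It is not based on any real database: the `Delivery` record, its `ball_age_overs` and `wicket` fields, and the function name are all my own assumptions for illustration. It simply counts balls bowled and wickets taken in bands of ball age, which is the tally I built by hand from scorecards.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Delivery:
    # Hypothetical record layout; a real ball-by-ball source may differ.
    ball_age_overs: int  # completed overs bowled with the current ball
    wicket: bool         # True if a wicket fell on this delivery


def wickets_by_ball_age(deliveries, band_width=10):
    """Tally balls and wickets in bands of ball age (0-9 overs, 10-19, ...)."""
    totals = defaultdict(lambda: {"balls": 0, "wickets": 0})
    for d in deliveries:
        # Round the ball's age down to the start of its band.
        band = (d.ball_age_overs // band_width) * band_width
        totals[band]["balls"] += 1
        totals[band]["wickets"] += int(d.wicket)
    # Return plain dicts, ordered from newest ball to oldest.
    return {band: dict(t) for band, t in sorted(totals.items())}


if __name__ == "__main__":
    # Made-up deliveries purely to show the call; not real match data.
    sample = [Delivery(0, True), Delivery(3, False), Delivery(12, False)]
    print(wickets_by_ball_age(sample))
```

Run over many series and eras, a tally like this would let the new-ball bands be compared with the older-ball bands, subject to the same caveat about how far back ball-by-ball data goes.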
While it would be wonderful for ball-by-ball data to be available for a hobbyist such as myself, I don’t pretend that is sufficient reason for time to be invested developing such a database. But I do think there is a justification.
England are probably the world’s best-resourced Test team. One of the methods they have employed to gain competitive advantage is the detailed analysis of their own players’ and opponents’ performance. The exact manner in which they do this, and the resources used, are not made public.
One of the concerns supporters of Test cricket wrestle with is the polarisation in performance amongst the small number of Test-playing nations. The ability to carry out detailed analysis has not created that polarisation, but it reinforces the competitive advantage of the richer nations. A free database would counter that. Even if a team lacked the money to employ analysts, I reckon they could crowd-source the analysis they needed from the many part-time and hobbyist statisticians across the world.
I emphasise that the database should be free to use. It may be that I misunderstand the ownership of cricket data (see question 2 to the ICC), but the data ought to belong to all of us who follow the game.
You may have the answers to the questions I posed the ICC. If so, please let me know. If you don’t, but think they are important or interesting, perhaps you would email or tweet the ICC?