Cricketr learns new tricks : Performs fine-grained analysis of players

“He felt that his whole life was some kind of dream and he sometimes wondered whose it was and whether they were enjoying it.”

“The ships hung in the sky in much the same way that bricks don’t.”

“We demand rigidly defined areas of doubt and uncertainty!”

“For a moment, nothing happened. Then, after a second or so, nothing continued to happen.”

“The Answer to the Great Question… Of Life, the Universe and Everything… Is… Forty-two,’ said Deep Thought, with infinite majesty and calm.”

 The Hitchhiker's Guide to the Galaxy - Douglas Adams

Introduction

In this post, I introduce 2 new functions in my R package ‘cricketr’ (cricketr v0.22) see Re-introducing cricketr! : An R package to analyze performances of cricketers which enable granular analysis of batsmen and bowlers. They are

  1. Step 1: getPlayerDataHA – This function is a wrapper around getPlayerData(), getPlayerDataOD() and getPlayerDataTT(), and adds an extra column ‘homeOrAway’ which says whether the match was played at home/away/neutral venues. A CSV file is created with this new column.
  2. Setp 2: getPlayerDataOppnHA – This function allows you to slice & dice the data for batsmen and bowlers against specific oppositions, at home/away/neutral venues and between certain periods. This reduced subset of data can be used to perform analyses. A CSV file is created as an output based on the parameters of opposition, home or away and the interval of time

Note All the existing cricketr functions can be used on this smaller fine-grained data set for a closer analysis of players

Note 1: You have to call the above functions only once. You can reuse the CSV files in other functions

Important note: Don’t go too fine-grained by choosing just one opposition, in one of home/away/neutral and for too short a period. Too small a dataset may defeat the purpose of the analysis!

This post has been published in Rpubs and can be accessed at Cricketr learns new tricks

You can download a PDF version of this post at Cricketr learns new tricks

If you are passionate about cricket, and love analyzing cricket performances, then check out my racy book on cricket ‘Cricket analytics with cricketr and cricpy – Analytics harmony with R & Python’! This book discusses and shows how to use my R package ‘cricketr’ and my Python package ‘cricpy’ to analyze batsmen and bowlers in all formats of the game (Test, ODI and T20). The paperback is available on Amazon at $21.99 and  the kindle version at $9.99/Rs 449/-. A must read for any cricket lover! Check it out!!

Untitled

Untitled

1. Analyzing Tendulkar at 3 different stages of his career

The following functions analyze Sachin Tendulkar during 3 different periods of his illustrious career. a) 1st Jan 2001-1st Jan 2002 b) 1st Jan 2005-1st Jan 2006 c) 1st Jan 2012-1st Jan 2013

#Note: I have commented the lines to getPlayerDataHA() as I already have # CSV file df1=getPlayerDataOppnHA(infile="tendulkarHA.csv",outfile="tendulkarTest2001.csv", startDate="2001-01-01",endDate="2002-01-01") df2=getPlayerDataOppnHA(infile="tendulkarHA.csv",outfile="tendulkarTest2005.csv", startDate="2005-01-01",endDate="2006-01-01") 

`

1a Mean strike rate of Tendulkar in 2001,2005,2012

Note: Any of the cricketr R functions can be used on the fine-grained subset of data as below. The mean strike rate of Tendulkar is of the order of 60+ in 2001 which decreases to 50 and later to around 45


batsmanMeanStrikeRate ("./tendulkarTest2001.csv","Tendulkar-2001")

batsmanMeanStrikeRate ("./tendulkarTest2005.csv","Tendulkar-2005")

batsmanMeanStrikeRate ("./tendulkarTest2012.csv","Tendulkar-2012")

1b. Plot the performance of Tendulkar at venues during 2001,2005,2012

On an average Tendulkar score 60+ in 2001 and is really blazing. This performance decreases in 2005 and later in 2012

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanAvgRunsGround("tendulkarTest2001.csv","Tendulkar-2001")
batsmanAvgRunsGround("tendulkarTest2005.csv","Tendulkar-2005")
batsmanAvgRunsGround("tendulkarTest2012.csv","Tendulkar-2012")

dev.off()

1c. Plot the performance of Tendulkar against different oppositions during 2001,2005,2012

Sachin uniformly scores 50+ against formidable oppositions in 2001. In 2005 this decreases to 40 in 2005 and in 2012 it is around 25

batsmanAvgRunsOpposition("tendulkarTest2001.csv","Tendulkar-2001")

2. Analyzing Virat Kohli’s performance against England in England in 2014 and 2018

The analysis below looks at Kohli’s performance against England in ‘away’ venues (England) in 2014 and 2018

 df=getPlayerDataOppnHA(infile="kohliTestHA.csv",outfile="kohliTestEng2014.csv", opposition=c("England"),homeOrAway=c("away"),startDate="2014-01-01",endDate="2015-01-01") df1=getPlayerDataOppnHA(infile="kohliHA.csv",outfile="kohliTestEng2018.csv", opposition=c("England"),homeOrAway=c("away"),startDate="2018-01-01",endDate="2019-01-01")

3a. Compare the performances of Ganguly, Dravid and VVS Laxman against opposition in ‘away’ matches in Tests

The analyses below compares the performances of Sourav Ganguly, Rahul Dravid and VVS Laxman against Australia, South Africa, and England in ‘away’ venues between 01 Jan 2002 to 01 Jan 2008

 df=getPlayerDataOppnHA(infile="gangulyTestHA.csv",outfile="gangulyTestAES2002-08.csv" ,opposition=c("Australia", "England", "South Africa"), homeOrAway=c("away"),startDate="2002-01-01",endDate="2008-01-01") df=getPlayerDataOppnHA(infile="dravidTestHA.csv",outfile="dravidTestAES2002-08.csv" ,opposition=c("Australia", "England", "South Africa"), homeOrAway=c("away"),startDate="2002-01-01",endDate="2008-01-01") df=getPlayerDataOppnHA(infile="laxmanTestHA.csv",outfile="laxmanTestAES2002-08.csv" ,opposition=c("Australia", "England", "South Africa"), homeOrAway=c("away"),startDate="2002-01-01",endDate="2008-01-01")

3b Plot the relative cumulative average runs and relative cumative strike rate

Plot the relative cumulative average runs and relative cumative strike rate of Ganguly, Dravid and Laxman

-Dravid towers over Laxman and Ganguly with respect to cumulative average runs. – Ganguly has a superior strike rate followed by Laxman and then Dravid

frames=list("gangulyTestAES2002-08.csv","dravidTestAES2002-08.csv","laxmanTestAES2002-08.csv")
names=list("GangulyAusEngSA2002-08","DravidAusEngSA2002-08","LaxmanAusEngSA2002-08")
relativeBatsmanCumulativeAvgRuns(frames,names)

relativeBatsmanCumulativeStrikeRate(frames,names)

4. Compare the ODI performances of Rohit Sharma, Joe Root and Kane Williamson against opposition

Compare the performances of Rohit Sharma, Joe Root and Kane williamson in away & neutral venues against Australia, West Indies and Soouth Africa

  • Joe Root piles us the runs in about 15 matches. Rohit has played far more ODIs than the other two and averages a steady 35+
 df=getPlayerDataOppnHA(infile="rohitODIHA.csv",outfile="rohitODIAusWISA.csv" ,opposition=c("Australia", "West Indies", "South Africa"), homeOrAway=c("away","neutral")) df=getPlayerDataOppnHA(infile="joerootODIHA.csv",outfile="joerootODIAusWISA.csv" ,opposition=c("Australia", "West Indies", "South Africa"), homeOrAway=c("away","neutral")) df=getPlayerDataOppnHA(infile="williamsonODIHA.csv",outfile="williamsonODIAusWiSA.csv" ,opposition=c("Australia", "West Indies", "South Africa"), homeOrAway=c("away","neutral"))

5. Plot the performance of Dhoni in T20s against specific opposition at all venues

Plot the performances of Dhoni against Australia, West Indies, South Africa and England

 df=getPlayerDataOppnHA(infile="dhoniT20HA.csv",outfile="dhoniT20AusWISAEng.csv" ,opposition=c("Australia", "West Indies", "South Africa","England"), homeOrAway=c("all"))

6. Compute and performances of Anil Kumble, Muralitharan and Warne in ‘away’ test matches

Compute the performances of Kumble, Warne and Maralitharan against New Zealand, West Indies, South Africa and England in pitches that are not ‘home’ pithes

 df=getPlayerDataOppnHA(infile="kumbleTestHA.csv",outfile="kumbleTest-NZWISAEng.csv" ,opposition=c("New Zealand", "West Indies", "South Africa","England"), homeOrAway=c("away")) df=getPlayerDataOppnHA(infile="warneTestHA.csv",outfile="warneTest-NZWISAEng.csv" ,opposition=c("New Zealand", "West Indies", "South Africa","England"), homeOrAway=c("away")) df=getPlayerDataOppnHA(infile="muraliTestHA.csv",outfile="muraliTest-NZWISAEng.csv" ,opposition=c("New Zealand", "West Indies", "South Africa","England"), homeOrAway=c("away"))

7. Compute and plot the performances of Bumrah in 2016, 2017 and 2018 in ODIs


df=getPlayerDataHA(625383,tfile="bumrahODIHA.csv",type="bowling",matchType="ODI")
## [1] "Working..."

df=getPlayerDataOppnHA(infile="bumrahODIHA.csv",outfile="bumrahODI2016.csv", startDate="2016-01-01",endDate="2017-01-01") df=getPlayerDataOppnHA(infile="bumrahODIHA.csv",outfile="bumrahODI2017.csv", startDate="2017-01-01",endDate="2018-01-01") df=getPlayerDataOppnHA(infile="bumrahODIHA.csv",outfile="bumrahODI2018.csv", startDate="2018-01-01",endDate="2019-01-01")

8. Compute and plot the performances of Shakib, Bumrah and Jadeja in T20 matches for bowling


df=getPlayerDataHA(56143,tfile="shakibT20HA.csv",type="bowling",matchType="T20")
## [1] "Working..."
df=getPlayerDataHA(625383,tfile="bumrahT20HA.csv",type="bowling",matchType="T20")
## [1] "Working..."
df=getPlayerDataHA(234675,tfile="jadejaT20HA.csv",type="bowling",matchType="T20")
## [1] "Working..."

df=getPlayerDataOppnHA(infile="shakibT20HA.csv",outfile="shakibT20-SLAusSAEng.csv" ,opposition=c("Sri Lanka","Australia", "South Africa","England"), homeOrAway=c("all"))
df=getPlayerDataOppnHA(infile="bumrahT20HA.csv",outfile="bumrahT20-SLAusSAEng.csv" ,opposition=c("Sri Lanka","Australia", "South Africa","England"), homeOrAway=c("all")) df=getPlayerDataOppnHA(infile="jadejaT20HA.csv",outfile="jadejaT20-SLAusSAEng.csv" ,opposition=c("Sri Lanka","Australia", "South Africa","England"), homeOrAway=c("all"))

Conclusion

By getting the homeOrAway data for players using the profileNo, you can slice and dice the data based on your choice of opposition, whether you want matches that were played at home/away/neutral venues. Finally by specifying the period for which the data has to be subsetted you can create fine grained analysis.

Hope you have a great time with cricketr!!!

Also see

1. My book ‘Deep Learning from first principles:Second Edition’ now on Amazon
2. Cricpy takes a swing at the ODIs
3. My book ‘Practical Machine Learning in R and Python: Third edition’ on Amazon
4. Googly: An interactive app for analyzing IPL players, matches and teams using R package yorkr
5. Big Data-2: Move into the big league:Graduate from R to SparkR
6. Rock N’ Roll with Bluemix, Cloudant & NodeExpress
7. A method to crowd source pothole marking on (Indian) roads
8. De-blurring revisited with Wiener filter using OpenCV

To see all posts click Index of posts


R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…


If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook
Favorite

Leave a Comment