Who is the Real RBI Leader for 2012?

by Ryan Kruse
July 12, 2013

We all know that Miguel Cabrera had a phenomenal year in 2012, winning the Triple Crown and later being named the American League MVP. His 44 home runs and .330 batting average are all his own, but the 139 RBI he amassed are a shared number, as he couldn’t accumulate RBI without the R (runners). What if everybody had Cabrera’s opportunities? Would others have eclipsed his RBI total?

To analyze this I calculated a percentage measure called the Runner Movement Indicator, or RMI for short. It’s a simple calculation once you have the data. Each time a batter comes to the plate with runners on base, the potential bases that the runners can move are added together. A runner on 1st can move three bases, a runner on 2nd can move two and a runner on 3rd can move one. Then, at the end of the at-bat, the final positions of the runners are compared with their starting positions to determine the total bases moved out of the potential bases. For example, if Cabrera hits a single with a runner on 1st, moving the runner to 3rd base, he is credited with two of the possible three bases, for a 0.667 clip. By calculating RMI as a percentage of the opportunities, we’re factoring out the increased benefit Cabrera gets from his stellar teammates.

One of the beautiful things about RMI is not just that it is a simple calculation, but that it reads nearly like a batting average. This makes it immediately easy to tell the good from the bad. Below is a histogram of the RMI for all qualifying players in 2012.

Now let’s overlay that with the batting averages from the same year in red. You’ll see the distribution is quite similar.

One might think that players with high batting averages also have high RMI, but that’s not quite the case. If we try to correlate RMI with Batting Average, OBP or SLG, the R² stays below 0.5 in each case, although all three fits have the expected positive slope.
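For readers who want to check this kind of fit themselves, here is a minimal sketch of how the slope and R² of a straight-line fit can be computed. The author's own charts use scipy and numpy; this version sticks to plain Python so it runs without dependencies. The function name `slope_and_r_squared` is hypothetical, and the sample values are made up for illustration, not taken from the 2012 data.

```python
def slope_and_r_squared(xs, ys):
    """Return (slope, R^2) of a simple least-squares line fit of ys on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")

    slope = sxy / sxx
    # For a one-variable linear fit, R^2 is the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, r2


# Made-up illustrative values, NOT the 2012 data set.
toy_ba = [0.250, 0.270, 0.290, 0.310, 0.330]
toy_rmi = [0.280, 0.300, 0.285, 0.320, 0.310]

slope, r2 = slope_and_r_squared(toy_ba, toy_rmi)
print(f"slope={slope:.3f}, R^2={r2:.3f}")
```

A positive slope with an R² well below 1 is the pattern described above: the two stats move together, but one does not determine the other. For the actual 2012 qualifiers, the fits came out as follows.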
| RMI vs BA | RMI vs OBP | RMI vs SLG |
| --- | --- | --- |
| 0.411 R² | 0.429 R² | 0.323 R² |

* * *

Now that we know a little about RMI, let’s look at the leaders from 2012.

| Player | RMI | Actual Bases Moved | Potential Bases Moved | RBI |
| --- | --- | --- | --- | --- |
| Joey Votto | 0.342 | 218 | 637 | 56 |
| Joe Mauer | 0.332 | 336 | 1011 | 85 |
| Torii Hunter | 0.328 | 300 | 915 | 92 |
| Josh Hamilton | 0.323 | 288 | 891 | 128 |
| Adrian Gonzalez | 0.317 | 329 | 1037 | 108 |
| Yasmani Grandal | 0.317 | 117 | 369 | 36 |
| Miguel Cabrera | 0.316 | 319 | 1008 | 139 |
| Josh Rutledge | 0.316 | 128 | 405 | 37 |
| Garrett Jones | 0.315 | 249 | 791 | 86 |
| Elvis Andrus | 0.311 | 271 | 871 | 62 |

We see that Cabrera is 7th on the list for 2012. Still great, but not the best. We also see that Joey Votto moved runners around the bases at the highest rate, 26 points higher than Cabrera.

So let’s use the RMI data above to see if anybody would have taken over the RBI lead given the same opportunities as Cabrera. To do this we first subtract home runs from RBI, as the batter’s own bases aren’t used in RMI. Of Cabrera’s 139 RBI in 2012, 44 came from Cabrera himself scoring on his own home runs. This means he had 95 RMI-influenced RBI based on a 0.316 RMI. If we scale those 95 RBI by the ratio of Votto’s RMI to Cabrera’s (0.342 / 0.316), we get 103 RBI. Votto’s 14 home runs bring him up to 117 RBI, still well shy of Cabrera.

Of course we know that Josh Hamilton was the one chasing Cabrera’s home run total in 2012, so let’s do the same calculation with him. Hamilton’s 0.323 RMI would give him about 97 equivalent RBI (95 × 0.323 / 0.316). Adding in his 43 home runs brings him to 140 RBI, 1 higher than Cabrera.

Too close to call? Nah… Hamilton wins.

Takeaways

The ability to get on base is one of the best predictive factors of runs and therefore wins. The prediction gets better if you add RMI, but the two should be considered distinct contributions.

RMI leaders may not have great batting averages and vice versa.

Undervalued players can be found among those with a high RMI but only average OBP and BA stats.
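The equivalent-RBI arithmetic used for Votto and Hamilton can be sketched in a few lines of Python. This is only an illustration of the ratio method described in the post, not the author's code. The function name `equivalent_rbi` and its parameters are hypothetical, and the inputs are the rounded RMI values from the leaders table plus the home run totals quoted in the text.

```python
def equivalent_rbi(player_rmi, player_hr, ref_rmi, ref_rbi, ref_hr):
    """Estimate a player's RBI given the reference batter's opportunities.

    The reference batter's non-home-run RBI are scaled by the ratio of the
    two RMI values, then the player's own home runs are added back.
    """
    ref_runner_rbi = ref_rbi - ref_hr            # RBI that came from moving runners
    scaled = ref_runner_rbi * (player_rmi / ref_rmi)
    return round(scaled) + player_hr


# Cabrera's 2012 line as quoted in the post: 0.316 RMI, 139 RBI, 44 HR.
CABRERA = {"ref_rmi": 0.316, "ref_rbi": 139, "ref_hr": 44}

votto = equivalent_rbi(player_rmi=0.342, player_hr=14, **CABRERA)
hamilton = equivalent_rbi(player_rmi=0.323, player_hr=43, **CABRERA)

print("Votto:", votto)
print("Hamilton:", hamilton)
```

Because the table reports RMI rounded to three decimals, totals can shift by a run depending on where the rounding is applied. Either way, Hamilton lands just ahead of Cabrera's 139.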
More Data

Complete player and team RMI stats can be found at the links below.

Player RMIs from 2010 to 2013
Team RMIs from 2010 to 2013

Data Collection & Mining Techniques

All of the data used in this post was loaded from MLB’s gameday servers into a MongoDB database using my atbat-mongodb project. This project is open source code that anybody can use, modify, contribute to, etc. Fork me please!

https://github.com/kruser/atbat-mongodb

All data aggregation code and charts are written in Python using pymongo’s MongoClient along with the matplotlib, scipy and numpy modules. You can find that code on github as well.

https://github.com/kruser/mlb-research

Other Notes on RMI

After collecting my data I ran across Gary Hardegree’s Base-Advance Average paper from 2005, which does a very similar calculation, with the exception that it gives the batter credit for moving themselves. I prefer to keep this a clutch stat and remove the batter’s bases.

The RMI data does not correlate with team run production as strongly as Batting Average, Slugging Percentage or On-Base Percentage do. Combining OBP with RMI correlates much more strongly, but then again, that’s what a run is: getting on base and moving around to home. So there isn’t anything noteworthy enough there to post numbers.

In order to qualify for my list a batter must have a minimum of two potential base movement opportunities per game. Opportunities vary widely among regular players, so it is important not to set this requirement too low.
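As a closing illustration, the RMI bookkeeping and the qualification rule can be sketched as below. This is not the code from the mlb-research repository. The data layout (each plate appearance as a list of `(start_base, end_base)` pairs), the function names, the zero credit for a runner who is put out, and the reading of the rule as two potential bases per game over a 162-game season are all assumptions made for this example.

```python
HOME = 4  # bases are numbered 1, 2, 3; home plate counts as 4


def tally_bases(plate_appearances):
    """Return (actual, potential) bases moved by runners, excluding the batter.

    plate_appearances: iterable of lists of (start_base, end_base) pairs,
    one pair per runner on base. end_base is None if the runner was put out
    (assumption: an out earns zero bases but still counts as an opportunity).
    """
    actual = 0
    potential = 0
    for runners in plate_appearances:
        for start_base, end_base in runners:
            potential += HOME - start_base      # 1st -> 3, 2nd -> 2, 3rd -> 1
            if end_base is not None:
                actual += max(0, end_base - start_base)
    return actual, potential


def rmi(actual, potential):
    """Runner Movement Indicator: share of potential bases actually moved."""
    if potential == 0:
        raise ValueError("no runners on base, RMI is undefined")
    return actual / potential


def qualifies(potential, games=162, min_per_game=2):
    """Assumed reading of the rule: at least 2 potential bases per game."""
    return potential >= games * min_per_game


# The example from the post: a single with a runner on 1st who reaches 3rd.
actual, potential = tally_bases([[(1, 3)]])
print(f"{rmi(actual, potential):.3f}")

# Cabrera's 2012 totals from the leaders table.
print(f"{rmi(319, 1008):.3f}", qualifies(1008))
```

Under that reading of the rule, the threshold is 324 potential bases, which is consistent with every player in the leaders table, including Grandal at 369.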