# Who is the Real RBI Leader for 2012?

We all know that Miguel Cabrera had a phenomenal year in 2012, winning the Triple Crown and later being named the American League MVP. His 44 home runs and .330 batting average are all his own, but the 139 RBI he amassed are a shared number, as he couldn’t accumulate RBI without the R (runners). What if everybody had Cabrera’s opportunities? Would others have eclipsed his RBI total?

To analyze this I calculated a percentage measure called the Runner Movement Indicator, or RMI for short. It’s a simple calculation once you have the data. Each time a batter comes to the plate with a runner on base, the potential bases that the runners can move are added together. A runner on 1st can move three total bases, a runner on 2nd can move two and a runner on 3rd can move one. Then, at the end of the at-bat, the final positions of the runners are compared with their starting positions to determine the total bases moved out of the potential bases. For example, if Cabrera gets a single with a runner on 1st, moving the runner to 3rd base, he is awarded two of the possible three bases, for a 0.667 clip. By calculating RMI as a percentage of the opportunities, we’re factoring out the increased benefit Cabrera gets from his stellar teammates.
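The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the code from the mlb-research repository: the names `POTENTIAL_BASES`, `runner_movement` and `rmi` are made up for this example, and the handling of a runner who is put out (he simply earns zero bases) is an assumption, since the post does not say how that case is scored.

```python
# Potential bases for a runner, keyed by starting base (1st, 2nd, 3rd).
# A runner on 1st can advance three bases to score, and so on.
POTENTIAL_BASES = {1: 3, 2: 2, 3: 1}
HOME = 4  # treat scoring as reaching "base 4"


def runner_movement(start_bases, end_bases):
    """Return (bases_moved, potential_bases) for one plate appearance.

    start_bases: bases occupied before the plate appearance, e.g. [1, 3]
    end_bases:   where each of those runners ended up, in the same order;
                 use HOME (4) for a runner who scored and the starting
                 base for a runner who did not advance (assumption: a
                 runner who is put out is recorded the same way).
    """
    potential = sum(POTENTIAL_BASES[b] for b in start_bases)
    moved = sum(max(end - start, 0) for start, end in zip(start_bases, end_bases))
    return moved, potential


def rmi(plate_appearances):
    """Aggregate RMI over an iterable of (start_bases, end_bases) pairs."""
    moved = potential = 0
    for start_bases, end_bases in plate_appearances:
        m, p = runner_movement(start_bases, end_bases)
        moved += m
        potential += p
    return moved / potential if potential else 0.0


# The example from the text: a single with a runner on 1st who reaches 3rd.
print(round(rmi([([1], [3])]), 3))  # 0.667
```

Summing bases moved and potential bases across the season before dividing, rather than averaging per-plate-appearance ratios, matches the "Actual Bases Moved" and "Potential Bases Moved" columns reported later in the post.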

One of the beautiful things about RMI is not just that it is a simple calculation, but that it reads nearly like a batting average. This makes it immediately easy to tell the good from the bad. Below is a histogram of the RMI for all qualifying players in 2012.

Now let’s overlay that with the batting averages from the same year in red. You’ll see the distribution is quite similar.

One might think that players with high batting averages also have high RMI, but that’s not quite the case. If we try to correlate RMI with Batting Average, OBP or SLG, we stay below an R² of 0.5 in each case, although all have the expected positive slopes.

| RMI vs BA | RMI vs OBP | RMI vs SLG |
| --- | --- | --- |
| 0.411 R² | 0.429 R² | 0.323 R² |

* * *
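For readers who want to reproduce this kind of comparison, here is a minimal pure-Python sketch of R² for a simple linear fit, which equals the squared Pearson correlation. It assumes the figures above come from an ordinary least-squares line; the original analysis used scipy and numpy, and the function name `r_squared` and the input values below are purely illustrative, not player data.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear fit of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Made-up, perfectly linear inputs just to exercise the function.
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```

In practice `xs` would be each qualifying player's BA, OBP or SLG and `ys` the matching RMI values.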

Now that we know a little about RMI, let’s look at the leaders from 2012.

| Player | RMI | Actual Bases Moved | Potential Bases Moved | RBI |
| --- | --- | --- | --- | --- |
| Joey Votto | 0.342 | 218 | 637 | 56 |
| Joe Mauer | 0.332 | 336 | 1011 | 85 |
| Torii Hunter | 0.328 | 300 | 915 | 92 |
| Josh Hamilton | 0.323 | 288 | 891 | 128 |
| Adrian Gonzalez | 0.317 | 329 | 1037 | 108 |
| Yasmani Grandal | 0.317 | 117 | 369 | 36 |
| Miguel Cabrera | 0.316 | 319 | 1008 | 139 |
| Josh Rutledge | 0.316 | 128 | 405 | 37 |
| Garrett Jones | 0.315 | 249 | 791 | 86 |
| Elvis Andrus | 0.311 | 271 | 871 | 62 |
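As a quick sanity check, the RMI column can be recomputed from the two base-count columns. The snippet below is a small sketch using only the numbers in the leaderboard; the variable and function names are illustrative.

```python
# (player, actual bases moved, potential bases moved) from the 2012 leaderboard.
rows = [
    ("Joey Votto", 218, 637),
    ("Joe Mauer", 336, 1011),
    ("Torii Hunter", 300, 915),
    ("Josh Hamilton", 288, 891),
    ("Adrian Gonzalez", 329, 1037),
    ("Yasmani Grandal", 117, 369),
    ("Miguel Cabrera", 319, 1008),
    ("Josh Rutledge", 128, 405),
    ("Garrett Jones", 249, 791),
    ("Elvis Andrus", 271, 871),
]


def season_rmi(actual, potential):
    """Season RMI: total bases moved divided by total potential bases."""
    return actual / potential


# Rank by unrounded RMI, highest first, and print to three decimals.
ranked = sorted(rows, key=lambda r: season_rmi(r[1], r[2]), reverse=True)
for name, actual, potential in ranked:
    print(f"{name:<16} {season_rmi(actual, potential):.3f}")
```

Sorting on the unrounded ratio also settles the apparent ties at 0.317 and 0.316 in the same order the table lists them.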

We see that Cabrera is 7th on the list for 2012. Still great, but not the best. We also see that Joey Votto moved runners around the bases at the highest rate, 26 points higher than Cabrera. So let’s use the RMI data above to see if anybody would have taken over the RBI lead given the same opportunities as Cabrera.

To do this we first subtract home runs from RBI, as the batter’s own bases aren’t used in RMI. Of Cabrera’s 139 RBI in 2012, 44 came from Cabrera driving himself in on his own home runs. This means he had 95 RMI-influenced RBI based on a 0.316 RMI. If we apply this same ratio to Votto’s RMI of 0.342 we get 103 RBI. Votto’s 14 home runs bring him up to 117 RBI, still well shy of Cabrera.

Of course we know that Josh Hamilton was the one chasing Cabrera’s home run total in 2012, so let’s do the same calculation with him. Hamilton’s 0.323 RMI would give him 98 equivalent RBI. Adding in his 43 home runs brings him to 141 RBI, 2 higher than Cabrera. Too close to call? Nah… Hamilton wins.
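The projection in the last two paragraphs can be written as a small function. This is a sketch of the arithmetic as described, using the rounded RMI values from the table; the function name `projected_rbi` and its parameter names are made up for illustration.

```python
def projected_rbi(player_rmi, player_hr, ref_rmi=0.316, ref_rbi=139, ref_hr=44):
    """Project a player's RBI given the reference hitter's opportunities.

    The reference hitter's (here Cabrera's) non-home-run RBI are scaled by
    the ratio of the two RMI values, then the player's own home runs are
    added back, one RBI per home run for the batter himself.
    """
    ref_runner_rbi = ref_rbi - ref_hr  # 95 for Cabrera in 2012
    runner_rbi = ref_runner_rbi * player_rmi / ref_rmi
    return runner_rbi + player_hr


for name, rmi_value, hr in [("Joey Votto", 0.342, 14), ("Josh Hamilton", 0.323, 43)]:
    print(name, round(projected_rbi(rmi_value, hr), 1))
```

With the rounded inputs this comes out to roughly 116.8 for Votto and 140.1 for Hamilton. The 103 and 98 quoted above appear to be rounded up (or based on unrounded data), so the totals differ by about one RBI, but either way Hamilton's projection lands just above Cabrera's 139.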

## Takeaways

The ability to get on base is one of the best predictive factors of runs and therefore wins. The prediction gets better if you add RMI, but the two should be considered distinct contributions. RMI leaders may not have great batting averages and vice versa. Undervalued players can be found who have high RMI but average OBP and BA stats.

## More Data

Complete player and team RMI stats can be found at the links below.

## Data Collection & Mining Techniques

All of the data used in this post was loaded from MLB’s gameday servers into a MongoDB database using my atbat-mongodb project. This project is open source code that anybody can use, modify, contribute to, etc. Fork me please!
https://github.com/kruser/atbat-mongodb

All data aggregation code and charts are written in Python using MongoClient, matplotlib, scipy and numpy modules. You can find that code on github as well. https://github.com/kruser/mlb-research

## Other Notes on RMI

• After collecting my data I ran across Gary Hardegree’s Base-Advance Average paper from 2005, which does a very similar calculation, except that it gives the batter credit for moving himself. I prefer to keep this a clutch stat and remove the batter’s bases.

• RMI does not correlate with team run production as strongly as Batting Average, Slugging Percentage or On-Base Percentage. Adding OBP to RMI correlates much more strongly, but then again, that’s what a run is–getting on base and moving around to home. So there isn’t anything noteworthy enough there to post numbers.

• In order to qualify for my list a batter must have a minimum of two potential base movement opportunities per game. Opportunities vary widely among regular players, so it is important not to set this requirement too low.
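The qualification rule in the last note could be expressed as a simple filter. The sketch below rests on two assumptions that the post does not spell out: that "opportunities" means potential bases, and that "per game" means games the batter played. The function name `qualifies` and the sample records are hypothetical.

```python
def qualifies(potential_bases, games, min_per_game=2.0):
    """True if the batter averaged at least min_per_game potential bases.

    Assumption: opportunities are counted as potential bases and divided
    by the batter's own games played; the post does not define either.
    """
    if games <= 0:
        return False
    return potential_bases / games >= min_per_game


# Made-up records (name, potential bases, games) purely for illustration.
sample = [("Hitter A", 400, 150), ("Hitter B", 250, 150), ("Hitter C", 130, 60)]
qualified = [name for name, potential, games in sample if qualifies(potential, games)]
print(qualified)  # ['Hitter A', 'Hitter C']
```

Raising `min_per_game` trims part-time players whose RMI rests on few opportunities, which is the concern the note raises.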

Software Developer from Austin, TX. Transplant from Minnesota. Big Twins fan.

Guest
jcxy

Small quip–

“For example if Cabrera gets a single with a runner on 1st, moving the runner to 3rd base, he is awarded two of the possible three bases… we’re factoring out the increased benefit Cabrera gets from his stellar teammates.”

Reconsider that 1st to 3rd example–a stellar or slow baserunner–say, Mike Trout vs Delmon Young–would add/subtract value on the basepaths in a way that isn’t factored out here, right?

Guest
UCLAboi

I bet baserunning would account for more than 2 rbi’s worth of bases.

Member

“We’re factoring out the increased benefit Cabrera gets from his stellar teammates”

But leaving in teammate speed without correction.

Guest
Sam

I had an idea for this exact stat a couple of years ago but couldn’t find anything about it. When was it created?

Guest
Ian Sales

How did you accumulate your data? I would love to make my own reports as high quality as yours, but I don’t know where to find good data.

Guest
Ian Sales

I apologize for my last comment. I did not read the whole article before I posted a comment. Not a smart move. Thanks for the info.

Guest

I like this statistic for its simplicity. As others have pointed out, speed of base runners is a factor and you could make a more complex stat based on that. That would be interesting, but I think it’s a valuable stat as is.

Member
mlstarr

I may be way off (and if so by all means tell me), but wouldn’t a more telling measure be the sum of the RMIs for each base relative to league average at that base? I realize this is meant to be a simple measure and that over the course of a season’s opportunities it’s likely to be mostly a wash. I just can’t shake the idea that if you have a hitter who hits only singles (BA 1.000) but always has only a runner on third and a hitter who hits only doubles (BA 1.000) but always has only…

Guest
Jefferson

What would the RMI be if a runner is at 3rd with one out and the batter hits a sac fly? I’m guessing 1.000? Even though the player didn’t get a hit.

Guest
Jay

Great article and stat. Who created it?

Member
mlstarr

Do you have a spreadsheet with number of times each player has had a runner on first, second, and third instead of the total potential bases, or would that take some mining?

Guest
rob

It’s amazing what pace Joey Votto was on in his 2012 season. It would’ve been a historic one if he had stayed healthy.

Guest
Jay

How did you get the numbers for each player? Did you have to watch each AB?

Member

This is good. I have the same comments about baserunner speed and certain advances being easier than others, but appreciate the simplicity for now. Also worth pointing out: you should probably base the conversion to RBI on games played or PA. RBI are dependent on your team getting on base for you and you want to strip that away, but Cabrera played 13 more games than Hamilton, so he actually earned some extra RBI chances simply by being healthy.

Guest
jcxy

@kruser- thanks for the response. While I applaud your goal of simplicity, I feel that baserunner speed is a worthwhile factor to further consider–if only for completeness sake. We certainly wouldn’t be shocked if it turns out that the sum difference is on the magnitude of fractional, right?

Guest
pft

The scoreboard does not change based on getting on base or movement on the bases, even though both are instrumental to the run generation process. It only changes when a runner scores, and runs only score with RBI, errors, or PB/WP (the vast majority, 95+%, by RBI). RBI, including your own RBI from a HR, always change the scoreboard, so why replace it with something else given its importance? I agree with the effort to adjust RBI totals for opportunity. However, limit it to actual runs driven in rather than putting lipstick on a pig and calling the pig honey. Perhaps…

Guest

Solid work, but I do not think it answers the question of “who is the real RBI leader”. To answer that, I prefer my method: find the “expected” value of runners driven in based on the base-out situation each player is presented with, such as .59 RBI per at-bat with a runner on 3rd and 2 out. You can read more at:

http://www.hardballtimes.com/main/article/the-opportunity-of-rbi/

Guest

I love the simplicity. It makes it easier to understand and if I were a GM, I’d definitely use it. GMs know if they have good base runners on their team, a guy who has a high RMI would flourish even more if added to their roster. Very well thought out article and very impressive knowing that you wrote a program to grab the info yourself.

Guest
Garbanzo

What do you do with PAs that end in IBBs and UIBBs?

Guest

I think this would make more sense if you counted the batter as a runner, too. Add four “potential bases moved” for the batter each at-bat, and count his total bases toward “actual bases moved”. Then you don’t have to count home runs separately at the end.

Guest
Garbanzo

If a player gets a 0 for 2+ when he’s walked with RISP and 1b open…… Maybe you should do RE24 changes but vaporize the baserunner if he reaches (or HRs) before calculating the RE24 value of the end state.

Guest

(Aha, if I actually read the article my point is mentioned. I completely disagree with your reasoning here. Even with the batter’s own advancement accounted for, at-bats with runners on are still weighted much higher! I bet the stat with the batter’s own bases moved added in correlates much better with overall team offensive production than this one. Why add OBP, which is calculated quite differently, when you can instead fluidly account for the hitter’s own advancement? There’s probably a better way to do it, though.)

Member

Excellent article, I’ve been wondering for some time how batters would rank in a stat which measures not just RBIs, but moving runners ahead on the bases.

Having just perused the comment section, I see the questions on how much baserunner speed changes these numbers, and would wonder if team baserunning/stolen base values could be averaged and applied to this to balance batters moving slower or faster runners. Also, this applied on a team level.

Member

The study isn’t fair. Power-hitting lefties, switch-hitters, and slap-hitting righties are going to have more runners moved than power hitting right handers because it’s easier for these guys to move the runner from first to third, and is not indicative of the players’ ability to drive in a run. Yes, that hitter is better at moving runners along because the distance from right field to third is a lot farther than left field to third, but the distance from left field to home and the distance from right field to home is the same, unless you’re playing at a ballpark…

Guest
Ron

It’s an interesting article and concept for sure, but several major flaws including these:

1) Do I understand correctly that if a batter comes up with runners on 2nd & 3rd, and gets intentionally walked, that’s an 0-for-3 on the RMI? Sheesh, no wonder Cabrera didn’t pan out as well.

2) Not counting the batter moving himself reminds me of the same flaw as that “runs created” stat. Both of these seem to say that if a batter gets an RBI single with a runner on 2nd, that’s of equal value as if the guy had slugged a 2-run homer.

Guest
PackBob

Nice work. It seems like hit-and-run could have an effect too, especially if some teams are more prone to this tactic than others. A hitter with good contact skills coupled with a team that likes to hit-and-run may get a little advantage.

Guest
Oh, Beepy

You could consider building in a weighted average of the BsR of the 5 previous batters, if you were comfortable with a value-based stat, that is.

This is really cool work, it seems to me to be a stat that a non-stathead would be able to get behind as is, but I think it would obviously be more accurate if it factored in baserunning.

Guest
ImKeithHernandez

Just wanted to say great job man. Very impressive work, but an even more impressive presentation. I’m glad the FanGraphs staff acknowledged this.

Member
Phils_Goodman

Now we need to know how much of RMI is a repeatable skill and how much of it is randomness.

Guest
chief00

This is great. I’m no statistician by any stretch, but I like to see people tinker with numbers because my thinking needs to be challenged regularly.

I enjoy seeing Elvis Andrus’ name among the usual suspects. It seems to suggest 2 things at a glance: (1) your theory and its attendant equation doesn’t exclude lighter hitters; and (2) Elvis brings runners around the bases. Boy, does that say a lot about Andrus and how valuable he is.

Guest
Aidan

I disagree with many of the people suggesting that teammate baserunning should be accounted for in RMI. While it certainly would make it more accurate, RMI seems like a very good base for similar statistics. As the OP has pointed out, the simplicity would be lost, given the imperfections in those metrics. Also, adding teammates running would only plug in one of the several missing variables from the perfect stat.

Guest
gouis

Thanks for using a good plotting tool (MATLAB?) instead of using Excel or whatever that abomination Dave Cameron uses.

Member
dbssaber

I’m curious about year to year correlations- does RMI indicate a repeatable skill or does it tend to jump around?

Guest
Mister

Interesting stuff. What was the sample size for the AVG/OBP/SLG correlations? 2010-2013 or just 2012? I’m wondering if a stronger correlation would show up given a larger sample size. I see Elvis Andrus at #11 in 2012 and I just can’t bring myself to believe that he’s really THAT efficient at moving runners over. Who do you want batting in a close game with runners on base, Elvis Andrus or Edwin Encarnacion? I’ve got to think that this is measuring luck to a large extent. This could be similar to BABIP, and could be used to help identify who is…

Guest
Jeromey Thornton

I’m trying to get your AtBat project up and running and I’ve hit a roadblock when attempting to load the DB for the first time. I’m hoping to perform some data analysis in support of a project that I’m working on. If you’re willing to help me out with some debug advice, please send me an email. Thanks!

Guest
JDX19

Hey, Ryan.

Are you still updating these stats somewhere for 2013 and 2014? Is someone else carrying the torch?

Thanks!