Who is the Real RBI Leader for 2012?

We all know that Miguel Cabrera had a phenomenal year in 2012, winning the Triple Crown and later being named the American League MVP. His 44 home runs and .330 batting average are all his own but the 139 RBI he amassed are a shared number, as he couldn’t accumulate RBI without the R (runners). What if everybody had Cabrera’s opportunities? Would others have eclipsed his RBI total?

To analyze this I calculated a percentage measure called the Runner Movement Indicator, or RMI for short. It’s a simple calculation once you have the data. Each time a batter comes to the plate with runners on base, the potential bases that those runners can move are added together. A runner on 1st can move three total bases, a runner on 2nd can move two and a runner on 3rd can move one. Then, at the end of the at-bat, the final positions of the runners are compared with their starting positions to determine the total bases moved out of the potential bases. For example if Cabrera gets a single with a runner on 1st, moving the runner to 3rd base, he is awarded two of the possible three bases, for a 0.667 clip. By calculating RMI as a percentage of the opportunities, we’re factoring out the increased benefit Cabrera gets from his stellar teammates.
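The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration and not the code behind the post: the function names, the representation of runners as base numbers (1, 2, 3, with 4 meaning the runner scored), and the choice to give zero bases for a runner who is put out are all assumptions made for the example.

```python
from typing import Optional

HOME = 4  # a runner who scores ends at "base 4"


def potential_bases(start_base: int) -> int:
    """Bases a runner could still advance: 3 from 1st, 2 from 2nd, 1 from 3rd."""
    return HOME - start_base


def bases_moved(start_base: int, end_base: Optional[int]) -> int:
    """Bases actually gained; None (runner put out) counts as zero (assumption)."""
    if end_base is None:
        return 0
    return max(end_base - start_base, 0)


def rmi(at_bats) -> float:
    """RMI over a list of at-bats.

    Each at-bat is a list of (start_base, end_base) pairs,
    one pair per runner on base when the batter came up.
    """
    actual = 0
    potential = 0
    for runners in at_bats:
        for start, end in runners:
            actual += bases_moved(start, end)
            potential += potential_bases(start)
    return actual / potential if potential else 0.0


# The example from the text: a single moves a runner from 1st to 3rd.
example = [[(1, 3)]]
print(round(rmi(example), 3))  # 0.667
```

Summing actual and potential bases over a whole season before dividing, rather than averaging per at-bat ratios, matches the "Actual Bases Moved" and "Potential Bases Moved" totals reported for each player.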

One of the beautiful things about RMI is not just that it is a simple calculation, but that it reads nearly like a batting average. This makes it immediately easy to tell the good from the bad. Below is a histogram of the RMI for all qualifying players in 2012.

Now let’s overlay that with the batting averages from the same year in red. You’ll see the distribution is quite similar.

One might think that players with high batting averages also have high RMI, but that’s not quite the case. If we try to correlate RMI with Batting Average, OBP or SLG, R² stays below 0.5 in each case, although all three show the expected positive slope.

| Comparison | R² |
|---|---|
| RMI vs BA | 0.411 |
| RMI vs OBP | 0.429 |
| RMI vs SLG | 0.323 |
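For readers who want to run this kind of check themselves, the R² of a straight-line fit of one stat against another is the square of the Pearson correlation coefficient. The sketch below computes it in plain Python. The post's own analysis used scipy and numpy, so this is a stand-in, and the sample numbers are placeholders invented for illustration, not real 2012 player data.

```python
from math import sqrt


def r_squared(xs, ys):
    """R^2 of a simple linear fit of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    r = sxy / sqrt(sxx * syy)
    return r * r


# Placeholder values, NOT real player data.
batting_avg = [0.250, 0.270, 0.290, 0.310, 0.330]
rmi_values = [0.270, 0.300, 0.280, 0.320, 0.310]
print(round(r_squared(batting_avg, rmi_values), 3))
```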

* * *

Now that we know a little about RMI, let’s look at the leaders from 2012.

| Player | RMI | Actual Bases Moved | Potential Bases Moved | RBI |
|---|---|---|---|---|
| Joey Votto | 0.342 | 218 | 637 | 56 |
| Joe Mauer | 0.332 | 336 | 1011 | 85 |
| Torii Hunter | 0.328 | 300 | 915 | 92 |
| Josh Hamilton | 0.323 | 288 | 891 | 128 |
| Adrian Gonzalez | 0.317 | 329 | 1037 | 108 |
| Yasmani Grandal | 0.317 | 117 | 369 | 36 |
| Miguel Cabrera | 0.316 | 319 | 1008 | 139 |
| Josh Rutledge | 0.316 | 128 | 405 | 37 |
| Garrett Jones | 0.315 | 249 | 791 | 86 |
| Elvis Andrus | 0.311 | 271 | 871 | 62 |
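As a sanity check, the RMI column can be recomputed from the two base counts listed for each player. The short Python sketch below does that for the ten leaders; the tuple layout and the variable names are just one way to arrange the published numbers.

```python
# (player, actual bases moved, potential bases moved, RBI) as listed above
leaders = [
    ("Joey Votto", 218, 637, 56),
    ("Joe Mauer", 336, 1011, 85),
    ("Torii Hunter", 300, 915, 92),
    ("Josh Hamilton", 288, 891, 128),
    ("Adrian Gonzalez", 329, 1037, 108),
    ("Yasmani Grandal", 117, 369, 36),
    ("Miguel Cabrera", 319, 1008, 139),
    ("Josh Rutledge", 128, 405, 37),
    ("Garrett Jones", 249, 791, 86),
    ("Elvis Andrus", 271, 871, 62),
]


def rmi(actual, potential):
    """Share of the potential bases that the runners actually gained."""
    return actual / potential


# Rank by unrounded RMI, highest first.
ranked = sorted(leaders, key=lambda row: rmi(row[1], row[2]), reverse=True)

for rank, (name, actual, potential, rbi) in enumerate(ranked, start=1):
    print(f"{rank:>2}. {name:<16} {rmi(actual, potential):.3f}  RBI {rbi}")
```

Sorting on the unrounded ratio also settles the apparent ties at 0.317 and 0.316, and it keeps the players in the same order as the list above.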

We see that Cabrera is 7th on the list for 2012. Still great, but not the best. We also see that Joey Votto moved runners around the bases at the highest rate, 26 points higher than Cabrera. So let’s use the RMI data above to see if anybody would have taken over the RBI lead given the same opportunities as Cabrera.

To do this we first subtract home runs from RBI, as the batter’s own bases aren’t used in RMI. Of Cabrera’s 139 RBI in 2012, 44 came from Cabrera himself scoring on his own home runs. That leaves 95 RMI-influenced RBI, earned at a 0.316 RMI. If we scale those 95 RBI by the ratio of Votto’s RMI to Cabrera’s (0.342 / 0.316), we get 103 RBI. Votto’s 14 home runs bring him up to 117 RBI, still well shy of Cabrera.

Of course we know that Josh Hamilton was the one chasing Cabrera’s home run total in 2012, so let’s do the same calculation with him. Hamilton’s 0.323 RMI would give him 98 equivalent RBI. Adding in his 43 home runs brings him to 141 RBI, 2 higher than Cabrera. Too close to call? Nah… Hamilton wins.
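The conversion used for Votto and Hamilton can be written as a small Python function. This is a sketch of the arithmetic as described in the text, with Cabrera as the reference hitter; the function and parameter names are made up for the example, and the inputs are the three-decimal RMI values from the leaders list.

```python
def equivalent_rbi(player_rmi, player_hr, ref_rbi=139, ref_hr=44, ref_rmi=0.316):
    """Projected RBI for a player given the reference hitter's opportunities.

    Defaults are Cabrera's 2012 numbers as quoted in the post.
    """
    runner_rbi = ref_rbi - ref_hr                # RBI from runners: 139 - 44 = 95
    scaled = runner_rbi * player_rmi / ref_rmi   # rescale by the RMI ratio
    return scaled + player_hr                    # add back the player's own home runs


print(round(equivalent_rbi(0.342, 14)))  # Joey Votto
print(round(equivalent_rbi(0.323, 43)))  # Josh Hamilton
```

With these rounded inputs the sketch reproduces the 117 for Votto and lands at roughly 140 for Hamilton, one below the 141 quoted here; that figure may reflect different rounding or unrounded inputs. Either way Hamilton finishes ahead of Cabrera’s 139.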

Takeaways

The ability to get on base is one of the best predictors of runs and therefore wins. The prediction gets better if you add RMI, but the two should be considered distinct contributions. RMI leaders may not have great batting averages, and vice versa. Undervalued players can be found among those who have a high RMI but only average OBP and BA.

More Data

Complete player and team RMI stats can be found at the links below.

 

Data Collection & Mining Techniques

All of the data used in this post was loaded from MLB’s gameday servers into a MongoDB database using my atbat-mongodb project. This project is open source code that anybody can use, modify, contribute to, etc. Fork me please!
https://github.com/kruser/atbat-mongodb

All data aggregation code and charts are written in Python using the pymongo (MongoClient), matplotlib, scipy and numpy modules. You can find that code on GitHub as well. https://github.com/kruser/mlb-research

Other Notes on RMI

  • After collecting my data I ran across Gary Hardegree’s Base-Advance Average paper from 2005, which does a very similar calculation, with the exception that it gives the batter credit for moving himself. I prefer to keep this a clutch stat and remove the batter’s bases.

  • The RMI data does not correlate with team run production as strongly as Batting Average, Slugging Percentage or On-Base Percentage. Combining OBP with RMI correlates much more strongly, but then again, that’s what a run is–getting on base and moving around to home. So there isn’t anything noteworthy enough there to post numbers.

  • In order to qualify for my list a batter must have a minimum of two potential base movement opportunities per game. Opportunities vary widely among regular players, so it is important not to set this requirement too low.
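The qualification rule in the last bullet could be applied as a simple filter. The sketch below assumes that "per game" means per game of a 162-game team schedule, which would put the cutoff at 324 potential bases. That reading, the function name and the parameters are assumptions, not something the post spells out.

```python
def qualifies(potential_bases, games=162, min_per_game=2):
    """True if a batter has enough potential bases to make the list.

    Assumes the threshold scales with a 162-game schedule (hypothetical reading).
    """
    return potential_bases >= games * min_per_game


# Yasmani Grandal's 369 potential bases, the lowest total among the listed leaders.
print(qualifies(369))  # True under this assumption
```

If the rule is instead meant per game the batter actually played, passing that player's games as `games` would tighten or loosen the cutoff accordingly.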

 





Software Developer from Austin, TX. Transplant from Minnesota. Big Twins fan.

57 Comments
Red Wood
10 years ago

A very interesting read.

jcxy
10 years ago

Small quip–

“For example if Cabrera gets a single with a runner on 1st, moving the runner to 3rd base, he is awarded two of the possible three bases… we’re factoring out the increased benefit Cabrera gets from his stellar teammates.”

Reconsider that 1st to 3rd example–a stellar or slow baserunner–say, Mike Trout vs Delmon Young–would add/subtract value on the basepaths in a way that isn’t factored out here, right?

guy who knows where the beds are
10 years ago
Reply to  Ryan Kruse

dude, there’s a reply button

Dude who doesn't know where his car is
10 years ago

Dude, where’s my car?

UCLAboi
10 years ago

I bet baserunning would account for more than 2 rbi’s worth of bases.

John Choiniere
10 years ago

“We’re factoring out the increased benefit Cabrera gets from his stellar teammates”

But leaving in teammate speed without correction.

Spitball McPhee
10 years ago

…and would he please stand up, please stand up?

Sam
10 years ago

I had an idea for this exact stat a couple of years ago but couldn’t find anything about it. When was it created?

Ian Sales
10 years ago

How did you accumulate your data? I would love to make my own reports as high-quality as yours, but I don’t know where to find good data.

Ian Sales
10 years ago

I apologize for my last comment. I did not read the whole article before I posted a comment. Not a smart move. Thanks for the info.

Lee Panas
10 years ago

I like this statistic for its simplicity. As others have pointed out, speed of base runners is a factor and you could make a more complex stat based on that. That would be interesting, but I think it’s a valuable stat as is.

mlstarr
10 years ago

I may be way off (and if so by all means tell me), but wouldn’t a more telling measure be the sum of the RMIs for each base relative to league average at that base? I realize this is meant to be a simple measure and that over the course of a season’s opportunities it’s likely to be mostly a wash. I just can’t shake the idea that if you have a hitter who hits only singles (BA 1.000) but always has only a runner on third and a hitter who hits only doubles (BA 1.000) but always has only a runner on first, the singles hitter is guaranteed to have an RMI equal or greater than that of the doubles hitter. It seems the OBP of teammates is being factored out, but the positioning of teammates which can be affected by their SB and XBH abilities is not.

Jefferson
10 years ago

What would the RMI be if a runner is at 3rd with one out and the batter hits a sac fly? I’m guessing 1.000? Even though the player didn’t get a hit.

Jay
10 years ago

Great article and stat. Who created it?

mlstarr
10 years ago

Do you have a spreadsheet with number of times each player has had a runner on first, second, and third instead of the total potential bases, or would that take some mining?

rob
10 years ago

It’s amazing what pace Joey Votto was on in his 2012 season. It would’ve been a historic one if he had stayed healthy.

Jay
10 years ago

How did you get the numbers for each player? Did you have to watch each AB?

Neil Weinberg
10 years ago

This is good. I have the same comments about baserunner speed and certain advances being easier than others, but appreciate the simplicity for now. Also worth pointing out you should probably base the conversion to RBI based on games played or PA. RBI are dependent on your team getting on base for you and you want to strip that away, but Cabrera played 13 more games than Hamilton so he has actually earned some extra RBI chances simply by being healthy.

jcxy
10 years ago

@kruser- thanks for the response. While I applaud your goal of simplicity, I feel that baserunner speed is a worthwhile factor to further consider–if only for completeness sake. We certainly wouldn’t be shocked if it turns out that the sum difference is on the magnitude of fractional, right?

pft
10 years ago

The scoreboard does not change based on getting on base or movement on the bases, even though both are instrumental to the run generation process. It only changes when a runner scores, and runs only score with RBI, errors, or PB/WP (the vast majority, 95+%, by RBI). RBI, including your own RBI from a HR, always change the scoreboard, so why replace them with something else given their importance?

I agree with the effort to adjust RBI totals for opportunity. However, limit it to actual runs driven in and not put lipstick on a pig and call the pig honey.

Perhaps weight the runners on base by RE (average RE with 1 out for simplicity), and then calculate (total RBI-sum of RE) divided by the (sum of the RE).

jason_mitchell
10 years ago

Solid work, but I do not think it answers the question of “who is the real RBI leader”. To answer that, I prefer my method. Find the ‘expected’ value of runners driven in based on the base-out situation each player is presented with, such as .59 RBI per every at bat with a runner on 3rd and 2 out. You can read more at:

http://www.hardballtimes.com/main/article/the-opportunity-of-rbi/

Grady
10 years ago

I love the simplicity. It makes it easier to understand and if I were a GM, I’d definitely use it. GMs know if they have good base runners on their team, a guy who has a high RMI would flourish even more if added to their roster. Very well thought out article and very impressive knowing that you wrote a program to grab the info yourself.

Garbanzo
10 years ago

What do you do with PAs that end in IBBs and UIBBs?

Al Dimond
10 years ago

I think this would make more sense if you counted the batter as a runner, too. Add four “potential bases moved” for the batter each at-bat, and count his total bases toward “actual bases moved”. Then you don’t have to count home runs separately at the end.

Garbanzo
10 years ago

If a player gets a 0 for 2+ when he’s walked with RISP and 1b open…… Maybe you should do RE24 changes but vaporize the baserunner if he reaches (or HRs) before calculating the RE24 value of the end state.

Al Dimond
10 years ago

(Aha, if I actually read the article my point is mentioned. I completely disagree with your reasoning here. Even with the batter’s own advancement accounted for, at-bats with runners on are still weighted much higher! I bet the stat with the batter’s own bases moved added in correlates much better with overall team offensive production than this one. Why add OBP, which is calculated quite differently, when you can instead fluidly account for the hitter’s own advancement? There’s probably a better way to do it, though.)

jake
10 years ago

Excellent article, I’ve been wondering for some time how batters would rank in a stat which measures not just RBIs, but moving runners ahead on the bases.

Having just perused the comment section, I see the questions on how much baserunner speed changes these numbers, and would wonder if team baserunning/stolen base values could be averaged and applied to this to balance batters moving slower or faster runners. Also, this applied on a team level.

Weston Taylor
10 years ago

The study isn’t fair. Power-hitting lefties, switch-hitters, and slap-hitting righties are going to have more runners moved than power hitting right handers because it’s easier for these guys to move the runner from first to third, and is not indicative of the players’ ability to drive in a run. Yes, that hitter is better at moving runners along because the distance from right field to third is a lot farther than left field to third, but the distance from left field to home and the distance from right field to home is the same, unless you’re playing at a ballpark with odd dimensions, such as Fenway. The only way to even this out is if slugging percentage was factored in there somehow.

Ron
10 years ago

It’s an interesting article and concept for sure, but several major flaws including these:

1) Do I understand correctly that if a batter comes up with runners on 2nd & 3rd, and gets intentionally walked, that’s an 0-for-3 on the RMI? Sheesh, no wonder Cabrera didn’t pan out as well.

2) Not counting the batter moving himself reminds me of the same flaw as that “runs created” stat. Both of these seem to say that if a batter gets an RBI single with a runner on 2nd, that’s of equal value as if the guy had slugged a 2-run homer.

PackBob
10 years ago

Nice work. It seems like hit-and-run could have an effect too, especially if some teams are more prone to this tactic than others. A hitter with good contact skills coupled with a team that likes to hit-and-run may get a little advantage.

Oh, Beepy
10 years ago

You could consider building in a weighted average of the BsR of the 5 previous batters. if you were comfortable with a value-based stat that is.

This is really cool work, it seems to me to be a stat that a non-stathead would be able to get behind as is, but I think it would obviously be more accurate if it factored in baserunning.

ImKeithHernandez
10 years ago

Just wanted to say great job man. Very impressive work, but an even more impressive presentation. I’m glad the FanGraphs staff acknowledged this.

Phils_Goodman
10 years ago

Now we need to know how much of RMI is a repeatable skill and how much of it is randomness.

chief00
10 years ago

This is great. I’m no statistician by any stretch, but I like to see people tinker with numbers because my thinking needs to be challenged regularly.

I enjoy seeing Elvis Andrus’ name among the usual suspects. It seems to suggest 2 things at a glance: (1) your theory and its attendant equation doesn’t exclude lighter hitters; and (2) Elvis brings runners around the bases. Boy, does that say a lot about Andrus and how valuable he is.

ajkreider
10 years ago
Reply to  Ryan Kruse

Good stuff.

In addition to the IBB issue, regular walks seem over-valued – if RMI is supposed to be a real RBI indicator. I think your calculation treats a walk with runners on first and second as RMI equivalent to a single with a runner on second. But the latter plates a run while the former doesn’t.

This over-values high-walk guys like Votto, and penalizes free-swinging, high-BA players. This is why I suspect that Votto will have significantly lower RBI totals than Cabrera, which if a matter of luck would balance out over a career.

ajkreider
10 years ago
Reply to  Ryan Kruse

You are correct of course. I should’ve picked a better example. As with runners on first and third, a sac fly or fielder’s choice plates a run. A walk doesn’t but is credited with the same RMI (.250). A guy who homers 1/3rd the time with a guy on first gets the same RMI as a guy who walks three times in the same situation.

The point is that walks very seldom plate runs. Doesn’t mean your stat isn’t useful, obviously.

Aidan
10 years ago

I disagree with many of the people suggesting that teammate baserunning should be accounted for in RMI. While it certainly would make it more accurate, RMI seems like a very good base for similar statistics. As the OP has pointed out, the simplicity would be lost, given the imperfections in those metrics. Also, adding teammates running would only plug in one of the several missing variables from the perfect stat.

gouis
10 years ago

Thanks for using a good plotting tool (MATLAB?) instead of using Excel or whatever that abomination Dave Cameron uses.

dbssaber
10 years ago

I’m curious about year to year correlations- does RMI indicate a repeatable skill or does it tend to jump around?

Mister
10 years ago

Interesting stuff. What was the sample size for the AVG/OBP/SLG correlations? 2010-2013 or just 2012? I’m wondering if a stronger correlation would show up given a larger sample size.

I see Elvis Andrus at #11 in 2012 and I just can’t bring myself to believe that he’s really THAT efficient at moving runners over. Who do you want batting in a close game with runners on base, Elvis Andrus or Edwin Encarnacion? I’ve got to think that this is measuring luck to a large extent. This could be similar to BABIP, and could be used to help identify who is getting lucky in the RBI department and who is not, similar to how BABIP is used for AVG.

Jeromey Thornton
10 years ago

I’m trying to get your AtBat project up and running and I’ve hit a roadblock when attempting to load the DB for the first time. I’m hoping to perform some data analysis in support of a project that I’m working on. If you’re willing to help me out with some debug advice, please send me an email. Thanks!

JDX19
9 years ago

Hey, Ryan.

Are you still updating these stats somewhere for 2013 and 2014? Is someone else carrying the torch?

Thanks!