# Weighted Runs Batted In Efficiency

Imagine that throughout high school, teachers gave their favorite students easier tests than the rest of the class. Results would be clear: the majority of the favored students would come out with stronger scores. However, one would question if those strong scores would be a result of high intellect or because of an easy test. Contrarily, there would be other students who would still score well while given a difficult test. Now there’s an issue. If the teachers want to know which of the students know the material the best, how should they figure it out? They know that they can’t take the highest score, because they are aware that the scores are not an accurate representation due to the skewed tests. This is the situation in which the RBI has put the baseball world.

When the RBI was first documented as an official statistic in 1920, the wording of the definition in Rule 86, Section 8 of the Official Baseball Rules was “The number of runs batted in by each batsman.” Although this definition was slightly vague, its intention was to quantify which batter is the best at batting in runs. For years, this statistic has been praised. The RBI is always one of the first statistics to be mentioned while summarizing a player’s year and career. The RBI is even in the most prestigious hitting award, The Triple Crown. Despite its strong reputation, over the last few years it has become clear that the RBI doesn’t answer “Which batsman is the best at batting in runs?” The RBI only answers “Who has batted in the most runs?” Although that may seem like a small wording change, the two questions are tremendously different.

The RBI may seem like a valuable statistic, but it rests on two assumptions that drastically devalue it: (1) that all hitters have the same quality of plate appearances and (2) that all RBI are created equal. The first assumption breaks down because of two factors: the batting order effect (players get a different quality of plate appearances depending on their spot in the order) and the team effect (players on teams with better offenses have an easier time accumulating RBI). Here, a plate appearance with a man on base counts as high quality, while a plate appearance with the bases empty counts as low quality. Table 1 below shows how the quality of plate appearances depended on batting order across MLB from 2015-2019:

Batting Order | Plate Appearances | PAs with Men On | Percentage |
---|---|---|---|
1 | 113,119 | 36,977 | 33% |
2 | 110,540 | 45,550 | 41% |
3 | 107,970 | 49,375 | 46% |
4 | 105,538 | 51,150 | 48% |
5 | 103,098 | 45,710 | 44% |
6 | 100,491 | 43,532 | 43% |
7 | 97,711 | 43,284 | 44% |
8 | 94,797 | 41,805 | 44% |
9 | 91,810 | 40,548 | 44% |
Average | 102,786 | 44,214 | 43% |

This shows that the quality of plate appearances depends strongly on the batting order. Most starkly, the average No. 4 hitter has 48% of their plate appearances with a man on base while the leadoff hitter has only 33%. The average hitter gets 43% of their plate appearances with a man on base, meaning that the leadoff hitter is at a 10-percentage-point disadvantage and the cleanup hitter has a 5-point advantage. This can be attributed to the fact that on a typical team, players hitting first or second will often have the highest on-base percentage, leading to the third and fourth hitters having a plethora of plate appearances with men on base. On the other hand, players batting seventh, eighth, or ninth will typically have the lowest OBP, resulting in the leadoff hitter having a minimal number of plate appearances with a man on base and subsequently fewer premium spots to collect an RBI.
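The percentages in Table 1 can be reproduced directly from the raw counts. The short Python sketch below does that and expresses each slot’s gap to the league-wide rate in percentage points. The variable and function names are illustrative choices, not part of any published tool.

```python
# Table 1 data: lineup slot -> (plate appearances, PAs with men on), MLB 2015-2019
TABLE_1 = {
    1: (113_119, 36_977),
    2: (110_540, 45_550),
    3: (107_970, 49_375),
    4: (105_538, 51_150),
    5: (103_098, 45_710),
    6: (100_491, 43_532),
    7: (97_711, 43_284),
    8: (94_797, 41_805),
    9: (91_810, 40_548),
}


def men_on_rates(table):
    """Return {slot: share of plate appearances with at least one man on}."""
    return {slot: men_on / pa for slot, (pa, men_on) in table.items()}


def league_rate(table):
    """Share of all plate appearances, across every slot, with a man on."""
    total_pa = sum(pa for pa, _ in table.values())
    total_men_on = sum(men_on for _, men_on in table.values())
    return total_men_on / total_pa


rates = men_on_rates(TABLE_1)
baseline = league_rate(TABLE_1)

for slot, rate in rates.items():
    gap_points = (rate - baseline) * 100  # difference in percentage points
    print(f"slot {slot}: {rate:.1%} ({gap_points:+.1f} points vs. average)")
```

Expressing the gaps in percentage points keeps them from being confused with relative changes.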

Throughout a game, it is equally difficult for a hitter to hit a single or home run from any spot in the lineup, as those outcomes rely almost entirely on one’s ability to hit. However, the difficulty of collecting RBI changes greatly depending on where in the order the hitter bats, because it also relies on others getting on base. This means that some hitters simply have an easier time driving in a run, creating an uneven playing field. It builds an element into the RBI that is unfair to hitters who do not bat in premium spots.

Furthermore, the team effect also contributes to the inequality of plate appearances. A team that scores more runs will undoubtedly accumulate more RBI. This means that players who play for the best offense will have a higher likelihood of being an elite RBI collector. The following table, Table 2, represents the frequency in which a player who ranks in the top 10 in RBI for the year plays for an above-average offense, according to runs scored, from the years 2015-2019.

Year | Top-10 RBI Players on Above-Average Offense |
---|---|
2019 | 70% |
2018 | 90% |
2017 | 70% |
2016 | 80% |
2015 | 100% |
Average | 82% |

This suggests a strong correlation between a team’s offense and resulting RBI leaders. This correlation is in part due to better offenses having runners on base at a higher frequency, so it is much easier for their hitters to drive in runs. In a statistical baseball world where everything can be quantified in a vacuum, this prominent metric has clear outside factors. Therefore, having a cumulative statistic where there are multiple factors out of the hitters’ control that completely skew results is not a good representation of the desired results.
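As a rough sketch of how a Table 2 entry could be computed, the Python below counts how many RBI leaders play for a team whose runs scored exceed the league mean, then averages the five yearly shares reported above. Treating "above-average" as "more runs than the mean of all teams" is an assumption, and the team names and run totals in the example are hypothetical placeholders, not real data.

```python
from statistics import mean


def share_on_above_average_offense(leader_teams, team_runs):
    """Share of leaders whose team scored more runs than the league mean.

    leader_teams: one team label per leaderboard player (repeats allowed).
    team_runs: {team label: runs scored}; "above average" is assumed to
    mean strictly more runs than the mean over all teams.
    """
    league_average = mean(team_runs.values())
    above = [team for team in leader_teams if team_runs[team] > league_average]
    return len(above) / len(leader_teams)


# Hypothetical four-team league, for illustration only.
example_runs = {"Team A": 900, "Team B": 800, "Team C": 700, "Team D": 600}
example_leaders = ["Team A", "Team A", "Team B", "Team C"]
example_share = share_on_above_average_offense(example_leaders, example_runs)

# Yearly shares as reported in Table 2.
TABLE_2 = {2019: 0.70, 2018: 0.90, 2017: 0.70, 2016: 0.80, 2015: 1.00}
five_year_average = mean(TABLE_2.values())
print(f"five-year average: {five_year_average:.0%}")
```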

As a case study, Anthony Rendon’s 2019 is the perfect example. Rendon led the league with 126 RBI. However, the batting order effect and team effect were both at work. Rendon hit in the No. 3 spot for 137 of the 146 games he played. He also played for the World Series Champion Nationals, who ranked sixth in the league in runs scored. Within that league-leading total, Rendon also led the league in RBI from third base, driving in 43 runners. He was third in the league in the number of plate appearances with a man on third base. Meanwhile, he ranked 76th in plate appearances with no one on base and 42nd in PAs with a man on first base. So was Rendon just very lucky to hit third for the sixth-best offense? Did he just make the best of a great situation? Would he still have led the league if he had a worse quality of plate appearances? Using the RBI, it is impossible to know whether Rendon was truly an elite hitter at driving in runs or if he just took advantage of his circumstances.

The last assumption that the RBI makes is that all RBI are equal. The RBI presumes that every run batted in is worth exactly 1. On the surface that may make sense: a run is worth 1 in baseball, so a run batted in should also be worth 1. However, one must take into account the level of difficulty of driving in a run if accurate credit is to be given. Should a soft line drive that scores a man from third base be worth the same as a home run? Or the same as a hard-hit double in the gap that scores a runner from first? What if there is a man on third, the hitter hits a soft ground ball, and the runner is able to score? I don’t think they should be worth the same. It is far easier to drive in a run from third than it is from second or first, or with the bases empty. Level of difficulty is already built into many other baseball statistics, such as slugging percentage, which quantifies power by making a triple worth more than a single. So if the RBI is truly trying to determine the best at driving in runs, then the level of difficulty of driving in a run must be taken into account.
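Slugging percentage is a handy reference point for how weighting works: each hit is multiplied by the number of bases it is worth before dividing by at-bats. The Python sketch below uses the standard total-bases definition; the stat line in the example is hypothetical.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat: each hit type is weighted by its bases."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical season line, for illustration only.
example_slg = slugging_percentage(
    singles=100, doubles=30, triples=5, home_runs=25, at_bats=550
)
print(f"SLG: {example_slg:.3f}")
```

The RBI has no equivalent of these weights, which is the gap the difficulty argument above points at.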

Since the explosion of Sabermetrics, these gaping holes in the RBI have been assessed numerous times by quite a few statisticians. In fact, there have been multiple new statistics that loosely look to solve this problem. However, they change the question from “Who is the best at driving in runs?” to a more fundamental one: “Who creates/produces the most runs?” This is arguably the most important possible statistic, as the goal of baseball is to score more runs, so the best hitters will create and produce the most runs. Bill James attempted to quantify one’s ability to create runs in a statistic called Runs Created. RC estimates a player’s offensive value in terms of runs from the hitter’s ability to hit and get on base, a key distinction from a statistic that only quantifies a player’s ability to drive in runs. Another statistic was created with a similar goal: Weighted Runs Above Average. wRAA is similar to RC in that it quantifies a player’s offensive value in terms of runs, but instead of giving an outright number, it compares the hitter’s production to that of the average hitter. Both are fantastic statistics for answering their own question, but the question “Who’s the best at driving in runs?” is not being answered.
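For readers unfamiliar with these two metrics, here is a minimal Python sketch of their commonly cited forms: the basic version of Runs Created and the usual wRAA definition built on wOBA. More elaborate versions of RC exist, and the league wOBA and wOBA scale are season-specific constants; every number in the example is a hypothetical placeholder.

```python
def runs_created_basic(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)


def wraa(woba, league_woba, woba_scale, plate_appearances):
    """Weighted Runs Above Average: runs relative to a league-average hitter."""
    return (woba - league_woba) / woba_scale * plate_appearances


# Hypothetical inputs, for illustration only.
example_rc = runs_created_basic(hits=160, walks=60, total_bases=280, at_bats=550)
example_wraa = wraa(woba=0.380, league_woba=0.320, woba_scale=1.2,
                    plate_appearances=600)
print(f"RC: {example_rc:.1f}, wRAA: {example_wraa:+.1f}")
```

Neither function takes runners on base or runs driven in as an input, which is why both measure run creation rather than run driving.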

Both RC and wRAA provide solid alternatives to the RBI. They are much closer to answering the question than the RBI, as neither of them has outside factors affecting the results. However, they are solely alternatives, not replacements or direct solutions to the problem. As such, Others Batted In (OBI%) was created. This is a measure of how efficiently a hitter drives in runs. In fact, it completely fixes the first assumption that the original RBI makes, that all hitters have the same quality of plate appearances. OBI% is an efficiency rating, which in turn creates a neutral playing field for all hitters. However, there are still problems with OBI% when answering “Who’s the best at driving in runs?” The second assumption, the inequality of driving in runs depending on the situation, is not addressed, and the metric doesn’t credit solo homers, as it only counts “others” and not driving in oneself. In OBI%, each situation is worth the same, so being efficient at driving in runs from third is valued the same as driving in runs from first. Although many other questions have been answered, the question originally posed in 1920 has yet to be truly answered.
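To make the two limitations concrete, here is a small Python sketch of OBI% as it is usually described: runners other than the batter driven in, divided by the runners on base during the hitter’s plate appearances. Counting "others batted in" as RBI minus home runs is an assumption about the bookkeeping, and the stat lines are hypothetical.

```python
def obi_percentage(rbi, home_runs, runners_on_base):
    """Share of baserunners a hitter drove in, excluding himself.

    Assumes "others batted in" = RBI - HR, since each home run includes
    one RBI for the batter's own run.
    """
    if runners_on_base == 0:
        return 0.0  # no opportunities, so no efficiency to report
    others_batted_in = rbi - home_runs
    return others_batted_in / runners_on_base


# Two hypothetical hitters with identical RBI totals and opportunities.
slugger = obi_percentage(rbi=100, home_runs=30, runners_on_base=400)
contact_hitter = obi_percentage(rbi=100, home_runs=10, runners_on_base=400)
print(f"slugger: {slugger:.1%}, contact hitter: {contact_hitter:.1%}")
```

The slugger rates lower despite the same RBI total because the runs he scored himself on home runs are not counted, and every runner counts the same regardless of which base he started from.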

Most aspects of the game of baseball have been quantified in advanced fashion. There are advanced baserunning metrics, advanced fielding metrics, and even statistics about how many rotations the ball makes on every single throw in the ballpark. So why is there no advanced statistic about RBI? Why has the question been left unanswered for so long? With that in mind, I have created a statistic called Weighted Runs Batted In Efficiency.

wRBIe is the weighted sum of a hitter’s efficiencies at driving in runners from each base. This statistic directly addresses the two assumptions from the RBI. First, like OBI%, this is an efficiency rating, ensuring an equal playing field for all hitters. It doesn’t matter how many plate appearances a player gets with men on third; it solely matters how well they perform in those situations.

Next, wRBIe weights every RBI situation. The weight used in wRBIe was initially created by statistical analyst Tom Tango, who created a metric that quantified the chance of scoring from each base. wRBIe takes the chance of scoring and uses the reciprocal to quantify how difficult it is to score from that base. This means that if the chance of scoring from first base is .265, then the difficulty of scoring is 1/.265, or roughly 3.77. wRBIe then multiplies that weight by the hitter’s efficiency of driving in runs from first base. The sum is then taken of all the weighted efficiencies. In all, the formula for wRBIe is:

wRBIe = ((DoS0 × E0) + (DoS1 × E1) + (DoS2 × E2) + (DoS3 × E3)) × ScaleAdj

Where: DoS is difficulty of scoring (reciprocal of chance of scoring), E is efficiency of scoring runners from each base (runners driven in from that base/PAs with runners at that base) and the digits are the location of the runners. “ScaleAdj” is the added weight to make a simple scale. Furthermore, the difficulties of scoring are as such:

No one on | Man on first | Man on second | Man on third |
---|---|---|---|
8.403361345 | 3.773584906 | 2.37529691 | 1.66944908 |
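Because each difficulty is simply the reciprocal of a chance of scoring, the underlying chances can be recovered by inverting the weights in the table. The Python sketch below does that as a sanity check; the dictionary keys and function names are illustrative.

```python
# Difficulty-of-scoring weights exactly as listed in the table above.
DIFFICULTY = {
    "no one on": 8.403361345,
    "man on first": 3.773584906,
    "man on second": 2.37529691,
    "man on third": 1.66944908,
}


def difficulty_of_scoring(chance):
    """Difficulty weight is the reciprocal of the chance of scoring."""
    return 1.0 / chance


def chance_of_scoring(difficulty):
    """Invert a difficulty weight to recover the chance of scoring."""
    return 1.0 / difficulty


implied_chances = {state: chance_of_scoring(w) for state, w in DIFFICULTY.items()}

for state, chance in implied_chances.items():
    print(f"{state}: chance of scoring {chance:.3f}")
```

The man-on-first weight inverts back to the .265 used in the worked example, and the weights shrink as the runner gets closer to home.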

And the scale is as such:

wRBIe | Rating |
---|---|
135 | Superstar |
120 | All Star |
100 | Starter |
85 | Role Player |
70 | Bench |
Less Than 70 | Scrub |
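Putting the formula, the weights, and the scale together, a minimal Python sketch of the calculation might look like the following. Several things here are assumptions rather than part of the published method: the value of ScaleAdj is not stated, so it is a required argument; how base-0 opportunities are counted is left to the caller; and the scale values are read as lower bounds for each rating. The player counts are hypothetical.

```python
# Difficulty of scoring by runner location (0 = batter, 1-3 = bases).
DIFFICULTY_BY_BASE = {0: 8.403361345, 1: 3.773584906, 2: 2.37529691, 3: 1.66944908}

# Scale from the table above, read as lower bounds (an assumption).
SCALE = [(135, "Superstar"), (120, "All Star"), (100, "Starter"),
         (85, "Role Player"), (70, "Bench")]


def wrbie(counts, scale_adj):
    """Weighted sum of per-base efficiencies, times a scale adjustment.

    counts: {base: (runners driven in from that base, opportunities)}.
    scale_adj: the ScaleAdj factor; its real value is not given here,
    so the caller must supply one.
    """
    total = 0.0
    for base, weight in DIFFICULTY_BY_BASE.items():
        driven_in, opportunities = counts[base]
        efficiency = driven_in / opportunities if opportunities else 0.0
        total += weight * efficiency
    return total * scale_adj


def rating(value):
    """Map a wRBIe value to its label on the scale."""
    for floor, label in SCALE:
        if value >= floor:
            return label
    return "Scrub"


# Hypothetical hitter and a placeholder scale adjustment of 50.
example_counts = {0: (20, 400), 1: (15, 150), 2: (25, 120), 3: (30, 70)}
unscaled = wrbie(example_counts, scale_adj=1.0)
scaled = wrbie(example_counts, scale_adj=50.0)
print(f"unscaled: {unscaled:.3f}, scaled: {scaled:.1f}, rating: {rating(scaled)}")
```

Because every term is an efficiency, doubling both the runs driven in and the opportunities at any base leaves the result unchanged.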

Now that there is finally a metric that quantifies a hitter’s ability to drive in runs, the leaderboard can be found. For the first time, it will be possible to find which hitters are truly the best at driving in runs. The following table compares the top 20 hitters from 2019 in terms of wRBIe and RBI:

Rank | wRBIe Leader | RBI Leader |
---|---|---|
1 | Nelson Cruz | Anthony Rendon |
2 | Freddie Freeman | Jose Abreu |
3 | DJ LeMahieu | Freddie Freeman |
4 | Eric Hosmer | Pete Alonso |
5 | Anthony Rendon | Eduardo Escobar |
6 | Max Kepler | Nolan Arenado |
7 | Anthony Rizzo | Jorge Soler |
8 | Nolan Arenado | Xander Bogaerts |
9 | Charlie Blackmon | Josh Bell |
10 | Pete Alonso | Cody Bellinger |
11 | Bryce Harper | Rafael Devers |
12 | Rafael Devers | Bryce Harper |
13 | Mike Trout | Alex Bregman |
14 | Cody Bellinger | Juan Soto |
15 | Marcus Semien | Eddie Rosario |
16 | Eddie Rosario | Nelson Cruz |
17 | Josh Bell | JD Martinez |
18 | Alex Bregman | Mike Trout |
19 | Danny Santana | Yuli Gurriel |
20 | Austin Meadows | Eugenio Suarez |

This shows exactly what wRBIe hoped to accomplish: giving true credit to hitters who can drive in runs at an elite level on a neutral playing field. The RBI had been the quickest, easiest, simplest way to quantify a hitter’s ability to drive in a run, and there is some overlap between the two leaderboards (12 of the 20 names in the table, or 60%, appear on both). But the fact that leadoff hitters like DJ LeMahieu (3) and Max Kepler (6) are on this list, and that 20% of these players were on teams with below-average offenses, shows the power of a neutral playing field.
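The overlap between the two leaderboards can be checked by counting the names that appear in both columns of the 2019 table. The Python sketch below does only that, using the names as listed; it does not recompute either statistic.

```python
WRBIE_TOP_20 = [
    "Nelson Cruz", "Freddie Freeman", "DJ LeMahieu", "Eric Hosmer",
    "Anthony Rendon", "Max Kepler", "Anthony Rizzo", "Nolan Arenado",
    "Charlie Blackmon", "Pete Alonso", "Bryce Harper", "Rafael Devers",
    "Mike Trout", "Cody Bellinger", "Marcus Semien", "Eddie Rosario",
    "Josh Bell", "Alex Bregman", "Danny Santana", "Austin Meadows",
]
RBI_TOP_20 = [
    "Anthony Rendon", "Jose Abreu", "Freddie Freeman", "Pete Alonso",
    "Eduardo Escobar", "Nolan Arenado", "Jorge Soler", "Xander Bogaerts",
    "Josh Bell", "Cody Bellinger", "Rafael Devers", "Bryce Harper",
    "Alex Bregman", "Juan Soto", "Eddie Rosario", "Nelson Cruz",
    "JD Martinez", "Mike Trout", "Yuli Gurriel", "Eugenio Suarez",
]

rbi_names = set(RBI_TOP_20)
# Keep wRBIe order so the shared names read from best to worst by wRBIe.
common = [name for name in WRBIE_TOP_20 if name in rbi_names]
share_in_common = len(common) / len(WRBIE_TOP_20)

# Rank on each list (1-based) for the hitters who appear on both.
rank_pairs = {
    name: (WRBIE_TOP_20.index(name) + 1, RBI_TOP_20.index(name) + 1)
    for name in common
}

print(f"{len(common)} of {len(WRBIE_TOP_20)} names in common ({share_in_common:.0%})")
for name, (wrbie_rank, rbi_rank) in rank_pairs.items():
    print(f"{name}: wRBIe #{wrbie_rank}, RBI #{rbi_rank}")
```

Listing the two ranks side by side also shows how far individual hitters move, such as Nelson Cruz going from 16th in RBI to first in wRBIe.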

While looking at this table, one may wonder if the team effect has actually been neutralized. The earlier table showed that in 2019, 70% of players ranked in the top 10 in RBI played for above-average offenses in terms of runs, 20 percentage points lower than the figure for wRBIe. However, 2019 was an outlier. From 2015-2019, 74% of the players ranked in the yearly top 10 in wRBIe played for above-average offenses, while 82% of the yearly top 10 RBI collectors did, an 8-percentage-point difference. The biggest gap came in 2016, when 80% of top 10 RBI hitters played for above-average offenses compared to 50% for wRBIe. This 8-point gap highlights two very important results. First, wRBIe is not a function of the team for which a player plays. The frequency of wRBIe leaders playing for above-average offenses is noticeably lower, creating a more equal world for this statistic. Second, 8 points is not that substantial. This can be attributed to the fact that players who can drive in runs efficiently often play for the best teams: 74% of the time in the last five years. Prior to wRBIe, front offices combined many different statistics, such as wRAA, RC, and OBI%, to try to find players who drive in runs efficiently, but now wRBIe provides one simple statistic to quantify this aspect of hitting.

Furthermore, only 55% of the wRBIe leaders hit in either the third or fourth spot of their teams’ lineups. This is a sizeable difference compared to the 80% mark of the RBI leaders. That trend has been consistent over the past five years, as 63% of the top 20 wRBIe leaders from 2015-2019 batted in those spots compared to 77% of the top 20 RBI leaders in that span.

This is the exact reason why wRBIe was created. wRBIe offers more of an opportunity to players hitting outside of the two most premium spots in the lineup in terms of quality of at-bats. For example, leadoff hitters are currently more than three times as likely to appear on a wRBIe leaderboard as on an RBI leaderboard. This shows that just as wRBIe is not a function of the team effect like the RBI, it is also not a function of the batting order.

The two major applications for the wRBIe have to do with lineup creation and pinch-hitting. First, wRBIe can actually spark two schools of thought while creating a lineup, and both are valid. One would be to slot the hitters with the best wRBIe — the best at driving in runs — in the 3/4/5 spots because those are the plate appearances that most often have runners on-base in front of them. This would give those hitters the best chance at maximizing their abilities. Contrarily, another manager may say that because they know their top wRBIe hitters can produce runs in any situation, they won’t slot them in 3/4/5 and give those spots to players who need more help driving in runs, players who are less efficient. For example, a player with a wRBIe above 120 will be able to drive in runs regardless of the quality of at-bats in terms of plate appearances with men on base, whereas a player with a wRBIe in the range of 70-100 will drive in runs at a higher rate if they are slotted in the 3/4/5 spots because they will be helped by a premium batting spot with more plate appearances with men on base.
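The two schools of thought can be sketched as two selection rules over a roster’s wRBIe values. The Python below is only an illustration: the function name, the strategy labels, the choice to order the chosen hitters by descending wRBIe, and the roster itself are all hypothetical, and the 70-100 band is taken from the example in the paragraph above.

```python
def middle_of_order(wrbie_by_player, strategy="maximize"):
    """Pick hitters for the 3/4/5 spots under one of two philosophies.

    "maximize": give the premium spots to the three best wRBIe hitters.
    "support": give them to hitters in the 70-100 band, who are assumed
               to gain the most from extra plate appearances with men on.
    """
    ranked = sorted(wrbie_by_player, key=wrbie_by_player.get, reverse=True)
    if strategy == "maximize":
        chosen = ranked[:3]
    elif strategy == "support":
        chosen = [p for p in ranked if 70 <= wrbie_by_player[p] <= 100][:3]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Slots are filled in descending wRBIe order; that ordering is arbitrary.
    return dict(zip((3, 4, 5), chosen))


# Hypothetical roster, for illustration only.
roster = {
    "Hitter A": 140, "Hitter B": 125, "Hitter C": 121,
    "Hitter D": 98, "Hitter E": 90, "Hitter F": 84,
    "Hitter G": 72, "Hitter H": 65, "Hitter I": 110,
}

print(middle_of_order(roster, "maximize"))
print(middle_of_order(roster, "support"))
```

Either rule is defensible; which one scores more runs over a season is an empirical question this sketch does not answer.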

The next application has to do with substitutions during a game. In general, there are four reasons for a manager to substitute a non-pitcher during a game: injury, defensive substitution, pinch-running, or pinch-hitting. Typically when a manager performs a pinch-hitting substitution, it is to have a better chance at driving in a run in a critical spot. Not only does this situation suggest the importance of pinch-hitting in general, but it shows how wRBIe can change the roles of benches forever. Throughout the year, teams acquire players in order to further their chances of winning a championship. Teams are allowed to have up to 13 batters at a time, including five bench hitters, so it would make sense to roster bench players who are most effective at driving in runs. It is the front office’s job to provide the manager with the most efficient batters at driving in runs to choose from, a job that is most effectively done using wRBIe.

In 2019, Anthony Rendon was the kid in high school who was the favorite student and was given an easy test. He passed the test with flying colors and made the best of his situation. Fans were quick to deem him the best at driving in runs. In reality though, Nelson Cruz, 16th in RBI, led the league in wRBIe and should’ve been crowned best at driving in runs, as he had one of the best seasons ever recorded at 153 wRBIe, while Rendon was fifth at 138 wRBIe. There is a consensus that the RBI is going to be obsolete in the near future. For the first time ever, wRBIe is a direct advanced replacement that can finally answer the question, “Who is the best at driving in runs?”

I think you need to make a distinction between driving in a man on third with 0 or 1 outs and doing so with 2 outs. The man on third with 2 outs is a lot harder to drive in, as a force at first on a grounder or a long fly caught in the air both result in no run with two outs but can easily score a run with 0 or 1 outs.

I was thinking the same though it looks like the tables from Tom Tango take that into account. Though some clarification would be good.

This article slaps and I learned something from pretty much every paragraph. Thank you! More of this type of analysis, FG, please!

This is definitely interesting work. One issue that I see that needs to be addressed is baserunning. After all, it’s a lot easier to drive in Rickey Henderson than Cecil Fielder. And we know there are definite differences between teams in terms of speed/baserunning skill. And even within a team, there’s probably not a uniform distribution of RBI opportunities relative to speed/baserunning.

I realize that what I’m suggesting would probably take a lot of work but seems like something that’s missing.

Eamon Sinclair! Now that’s the name of someone who’s going places.

Extremely interesting article! One thing I’m curious about for the applications is how much additional value/insight this provides over other hitting stats. I.e. for pinch hitting, how often would using this change the optimal decision over looking at OPS (for example).

This is interesting and very well done, first and foremost, but I’ll disagree with the notion that a solo home run is more valuable or should be given more credit than an RBI groundout, if production is what we’re trying to judge here.

Awesome stuff! How does this correlate to RE24…and how well does Ryan Howard come off in this stat, haha?

No need to mention Baseball Prospectus that introduced OBI like 15 years ago?

I wasn’t sure how to credit/link this in editing. Do you have a URL perhaps?

One of the most valuable articles in years! I do have a nitpick: wouldn’t a baserunner’s speed help hitters drive them in, and is this taken into account? For example, if Billy Hamilton is on first, he is going to score more often than Nick Ahmed. Does your metric take this into account?

Just noticed that left of centerfield mentioned baserunning as well. This post adds to his post.

Yeah, I’m not sure it would have a huge effect. But there’s certainly an effect. And if the effect can’t be measured accurately, I think there needs to be a disclaimer at least.

Eric Hosmer fans must be liking this metric!

Josh Bell on this wRBIe list makes me smile! Juan Soto not being on the wRBIe list makes me sad!

Josh Bell was a beast with men in scoring position in 2019. His .336 batting average was 16th best in the league and his 12 HR were tied for 2nd most in the league.

(He also had 13 intentional walks with men in scoring position, which was the 3rd most in the league. So other teams knew what was up.)

Last point, there are several surprising names on the wRBIe list. How does this metric change for a hitter year to year? In other words, how consistent is it?

I always thought the down & dirty answer to the question you’re trying to answer was to look at batting average with runners in scoring position. It’s not perfect by any means (for example, a batter could get a hit and a run not score, or the run could score but the batter isn’t credited with a hit, and it doesn’t take into account the self-batted-in run from a home run), but the names are similar to your wRBIe list.

And that should make sense: if your goal is to drive in runs, more than anything else you would want a player that would get a hit when they have the opportunity to drive in a run. And if a player is more likely than others to get a hit when that hit would lead to a run being scored, then they’d have a high wRBIe.

Also, Nelson Cruz had an insane .479 BABIP with runners in scoring position in 2019. I could imagine some legit reasons for that (though it certainly wasn’t year-over-year), but there’s probably an element of luck to Cruz’s wRBIe the same way there’s an element of luck in Rendon’s RBI total being a result of having more RBI opportunities.