Bases Produced and a Consideration of the 2016 AL/NL MVPs
Bases Produced is the keystone stat in a paradigm for baseball statistics that I have been developing, off and on, for the past 18 years.* Bases Produced measures a player’s overall offensive productivity by counting, quite simply, the number of times that player enables either himself or a teammate to advance to the next base. Each time this happens, a player is considered to have “produced a base.” Counting these events is important because producing bases is quite literally the only way that a baseball player can contribute to the scoring of runs by his team. When a player scores a run, after all, he has done nothing more than advance to all four bases in succession.
The Bases Produced system assigns credit for the production of these bases in a way that is based on traditional baseball statistics, but is also an expansion thereof. This expansion enables most traditional numbers to be tied together into a unified whole, evaluated in terms of Bases Produced, rather than remaining the haphazard collection of unrelated counts that they have always seemed to be.
How does it work? To calculate Bases Produced (BP), I first unify all of a player’s productive batting stats into one sub-total called “Batting Bases Produced” (BBP). This counts each base the player reaches on his own base hits, walks, or times hit by pitch:
BBP = 1 * 1B + 2 * 2B + 3 * 3B + 4 * HR + BB + HBP
A player’s success at producing BBP may be contextualized by dividing his BBP by his total number of “Batting Base Production Chances” (BBPC). This total includes all of a player’s plate appearances (PA), except for those times when a player has attempted to lay down a sacrifice bunt (SHA) — where his primary goal is ostensibly to produce bases for his teammates, rather than himself — and also his catcher’s interferences (CI), where the defense literally takes away his ability to put the ball in play.
BBPC = PA - SHA - CI
The ratio of BBP to BBPC then becomes a player’s “Batting Base Production Average” (BBPAVG):
BBPAVG = BBP / BBPC
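The three batting formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample season line are made up and are not taken from basesproduced.com.

```python
def batting_bases_produced(singles, doubles, triples, home_runs, walks, hit_by_pitch):
    """BBP = 1*1B + 2*2B + 3*3B + 4*HR + BB + HBP."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs + walks + hit_by_pitch


def batting_base_production_chances(plate_appearances, sacrifice_attempts, catcher_interferences):
    """BBPC = PA - SHA - CI."""
    return plate_appearances - sacrifice_attempts - catcher_interferences


def batting_base_production_average(bbp, bbpc):
    """BBPAVG = BBP / BBPC.

    Returning 0.0 for a player with no chances is an assumption of this sketch.
    """
    return bbp / bbpc if bbpc else 0.0


# Made-up season line, for illustration only.
bbp = batting_bases_produced(singles=100, doubles=30, triples=5, home_runs=25,
                             walks=60, hit_by_pitch=5)
bbpc = batting_base_production_chances(plate_appearances=650, sacrifice_attempts=2,
                                       catcher_interferences=1)
bbpavg = batting_base_production_average(bbp, bbpc)
print(bbp, bbpc, round(bbpavg, 3))
```

With this sample line the batter has 340 BBP in 647 chances.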
Second, a player may produce bases for himself as a runner by stealing bases (SB), advancing on fielder’s indifference (FI), or “gaining” bases (BG). “Gaining Bases” is the term I use for a player who advances a base when the defense attempts to make a play on a runner somewhere else on the basepaths. For example, if a runner tries to score from second on a single, the batter may advance to second when the defense tries to throw out the runner at the plate. In this case, the batter/runner “gains” second base.
Taken altogether, the bases a player produces for himself as a runner are then called “Running Bases Produced” (RBP):
RBP = SB + FI + BG
Lastly, an offensive player can produce bases for teammates who are already on base by drawing a walk, getting hit by a pitch, or putting the ball in play. Collectively, these bases are known as “Team Bases Produced” (TBP). The number of times a batter enables a teammate to reach home (TBP4) can be intuitively understood as the number of RBIs he has produced for his teammates, without including any that he has produced for himself. Team Bases Produced expands this concept by also including the number of times a player enables his teammates to advance to second (TBP2) or third (TBP3):
TBP = TBP2 + TBP3 + TBP4
While of course the batter depends on the presence — and subsequent baserunning actions — of a teammate on base to produce these bases, I assign the credit for producing them solely to the batter, without whose actions the runner(s) would not be able to advance on the play. The presence of the runners on base, however, is important to recognize when trying to evaluate how successful a batter is at producing team bases; each runner on base therefore counts as one “Team Base Production Chance” (TBPC) for a batter. (Note: When a batter draws an intentional walk, I do not count TBPC for runners whom the batter cannot force ahead to the next base.)
A batter’s Team Base Production Average (TBPAVG) then becomes, generally (and simply):
TBPAVG = TBP / TBPC
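Here is a rough Python sketch of how chances might be counted for a single plate appearance, including the intentional-walk exception. The function names and the set-of-bases representation are hypothetical. The forcing logic is just the ordinary rule that a walk forces the runner on first, the runner on second only if first is occupied, and the runner on third only if first and second are both occupied.

```python
def team_base_production_chances(runners, intentional_walk=False):
    """Count TBPC for one plate appearance.

    `runners` is the set of occupied bases, e.g. {1, 3}.
    Normally every runner on base counts as one chance. On an intentional
    walk, only runners the batter can force ahead are counted.
    """
    if not intentional_walk:
        return len(runners)
    forced = 0
    base = 1
    # Walk up the chain of consecutively occupied bases starting at first.
    while base in runners:
        forced += 1
        base += 1
    return forced


def team_base_production_average(tbp, tbpc):
    """TBPAVG = TBP / TBPC (0.0 with no chances is an assumption of this sketch)."""
    return tbp / tbpc if tbpc else 0.0


print(team_base_production_chances({2, 3}))                         # 2
print(team_base_production_chances({2, 3}, intentional_walk=True))  # 0
print(team_base_production_chances({1, 3}, intentional_walk=True))  # 1
```

So an intentional walk with runners on second and third adds no chances, while the same walk with the bases loaded adds three.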
Overall, a player’s total Bases Produced (BP) is simply the sum of his Batting Bases Produced, Running Bases Produced and Team Bases Produced:
BP = BBP + RBP + TBP
This number may also be evaluated in terms of the player’s total number of chances to produce bases (BPC), including his Plate Appearances, Team Base Production Chances, and the number of times he enters the game as a pinch runner (PRS):
BPC = PA + PRS + TBPC
Rounding out this approach, I calculate a general measure of “Base Production Average” as the ratio of Bases Produced to Base Production Chances:
BPAVG = BP / BPC
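Putting the pieces together, the overall totals can be sketched as a small Python class. The class name, the field names, and the sample numbers are all hypothetical; only the formulas come from the definitions above.

```python
from dataclasses import dataclass


@dataclass
class SeasonLine:
    """Season totals for one player (field names are hypothetical)."""
    bbp: int    # Batting Bases Produced
    sb: int     # stolen bases
    fi: int     # advances on fielder's indifference
    bg: int     # bases gained
    tbp2: int   # teammates advanced to second
    tbp3: int   # teammates advanced to third
    tbp4: int   # teammates advanced home
    pa: int     # plate appearances
    prs: int    # appearances as a pinch runner
    tbpc: int   # Team Base Production Chances

    @property
    def rbp(self):
        # RBP = SB + FI + BG
        return self.sb + self.fi + self.bg

    @property
    def tbp(self):
        # TBP = TBP2 + TBP3 + TBP4
        return self.tbp2 + self.tbp3 + self.tbp4

    @property
    def bp(self):
        # BP = BBP + RBP + TBP
        return self.bbp + self.rbp + self.tbp

    @property
    def bpc(self):
        # BPC = PA + PRS + TBPC
        return self.pa + self.prs + self.tbpc

    @property
    def bpavg(self):
        # BPAVG = BP / BPC; 0.0 with no chances is an assumption of this sketch.
        return self.bp / self.bpc if self.bpc else 0.0


# Made-up totals, for illustration only.
line = SeasonLine(bbp=340, sb=10, fi=1, bg=4, tbp2=90, tbp3=70, tbp4=80,
                  pa=650, prs=0, tbpc=420)
print(line.rbp, line.tbp, line.bp, line.bpc, round(line.bpavg, 3))
```

For this invented player, 595 Bases Produced in 1070 chances works out to a BPAVG of about .556.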
On my website, www.basesproduced.com, I fill in the blanks of this general paradigm with similar breakdowns for “Outs Produced” and “Bases Run” (= bases a player reaches, but does not necessarily produce); interested readers may follow the link to learn all of the gruesome details for themselves. On the same website, I also calculate and update the BP stats for the current MLB season on a daily basis. You are welcome to check it out to follow along and see how they play out in real life.
While the Bases Produced paradigm may not enjoy all of the mathematical sophistication that goes into many modern sabermetric measures of offensive performance, it does have the advantage of reflecting straightforward events that take place in every baseball game, events that any fan can quickly recognize and easily count for themselves (with or without a smartphone!). A grand slam home run, for instance, counts as 10 BP: 4 for the batter, 3 for the runner at first, 2 for the runner at second, and 1 for the runner at third. 10 Bases Produced is also a pretty good standard for an excellent game of baseball: I’ll mention in passing that there were just 7 performances of 10 BP or greater in last night’s (9/16) slate of 15 MLB games, with three different players tying for the top mark of 14 BP.
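The grand slam arithmetic generalizes to any home run: the batter produces 4 bases, and each runner is advanced from his starting base to home. A minimal Python sketch, with a hypothetical function name:

```python
def home_run_bases_produced(runners):
    """Bases Produced on a home run.

    `runners` is the set of occupied bases, e.g. {1, 2, 3} for bases loaded.
    The batter produces 4 bases for himself, and each runner is advanced
    the (4 - base) bases remaining between his starting base and home.
    """
    batter_bases = 4
    runner_bases = sum(4 - base for base in runners)
    return batter_bases + runner_bases


print(home_run_bases_produced({1, 2, 3}))  # 10 (grand slam)
print(home_run_bases_produced({2}))        # 6 (two-run homer, runner on second)
print(home_run_bases_produced(set()))      # 4 (solo homer)
```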
On basesproduced.com, I have also tabulated the same stats, using data from retrosheet.org, going back to the 1922 season. For those who are curious, the highest single-season BP total in history is 1005, by Lou Gehrig in 1927, while the highest BPAVG of all time is Barry Bonds’ .885, in 2004. There are still many bases produced statistics left to be calculated from the very olden days of baseball, however, before any of these numbers might be considered “records.”
Although Bases Produced is not, strictly speaking, a system that was designed to determine who ought to be the “Most Valuable Player” in any given season (whatever you might interpret that to mean), it is fun to use as another data point in the never-ending discussions about who most deserves the MVP award each year. So let’s consider what the system can show us about the best players in the American and National Leagues in 2016.
The AL MVP race has generally been described this season as a five-man horse race between David Ortiz, Mike Trout, Jose Altuve, Josh Donaldson and Mookie Betts. The Base Production Average numbers back that perception up, as all five of those players sit on top of the current AL BPAVG leaderboard, as of September 16th:
Player BPAVG BBPAVG TBPAVG
1. David Ortiz .709 .673 .760
2. Mike Trout .649 .628 .613
3. Jose Altuve .645 .590 .652
4. Josh Donaldson .644 .630 .651
5. Mookie Betts .605 .564 .607
Although these numbers should ideally be normalized to account for the influence of hitter-friendly venues like Fenway Park, Ortiz is still enjoying his best season there ever (his previous season high BPAVG was .697, in 2007), and he’s well ahead of his career BPAVG of .620, too. As far as base-production statistics are concerned, David Ortiz is unambiguously the 2016 AL MVP.
Over in the National League, I have heard many people talk about the great year that Kris Bryant is having, but his performance fails to even register in the NL’s top five base producers, by average:
Player BPAVG BBPAVG TBPAVG
1. Daniel Murphy .665 .619 .718
2. Anthony Rizzo .634 .607 .659
3. Joey Votto .619 .602 .617
4. Nolan Arenado .617 .607 .624
5. Freddie Freeman .612 .612 .597
(9. Kris Bryant .601 .618 .541)
Daniel Murphy of the Nationals has clearly had the standout year, instead. And it is worth noting that Bryant’s teammate, Anthony Rizzo, is actually doing considerably better than Bryant in overall BPAVG. The big difference amongst these three players can largely be attributed to Bryant’s mediocre TBPAVG, which is near the National League median of .529 (Aledmys Diaz). That difference can, in turn, be attributed to a combination of Bryant’s high strikeout percentage (.219) and very low ground-out percentage (.113). The one outcome of a plate appearance that never produces bases for teammates is a strikeout, and ground outs tend to be about three times as team-productive as fly outs, in those situations where a batter hasn’t succeeded in producing a base for himself. Bryant’s current numbers place him squarely on the wrong side of both of these team-base-production tendencies.
While Kris Bryant has had a great baserunning season this year…these numbers give reason to question any suggestion that he might have been the best player in the league this season — or even, for that matter, the best player on his own team. But at least it is manifestly clear that Joe Maddon has Bryant and Rizzo in the correct order in the Cubs’ lineup. 🙂
*While I am not as up on the current literature in baseball statistical analysis as I should be, I do know that others have developed similar statistical measures independently of me, including at least Gary Hardegree, Alfredo Nasiff Fors, and someone named EvanJ on this forum. If there are other similar thinkers out there, then I apologize for my ignorance of their work.
This is cool and obviously you’ve put a lot of work into it. A couple questions. Any consideration for negative bases? This would come into play for double plays. A lot of research suggests that ground balls are actually less valuable (with guys on base) because of the potential for double plays.
Second, how does going 1st to 3rd on a single count (and other similar plays)? Seems like maybe both the hitter and runner deserve a little credit for that.
One criticism on your write-up. I like the way you talk about this being another way of looking at things at the beginning, but then to say “as far as base-production statistics are concerned, David Ortiz is unambiguously the 2016 AL MVP” is a bit silly. Even with the caveat, yours is not the only “base-production” statistic, and it is also just odd. It would be like saying “as far as catchers go, Gary Sanchez is unambiguously the AL ROY.” It might be technically true, but it doesn’t really mean anything.
Thanks for the feedback! Let me try to address your points in the order you made them:
1. I collect all of the “negative bases” together into a total called “Bases Lost”. These include both Bases Lost as a runner, and Bases Lost “for” teammates, by producing double and triple plays. For instance, a runner who gets caught stealing at second loses one base (and a runner caught stealing at third loses two bases, etc.). On the other hand, a batter who hits into a ground ball double play in which a runner is put out at second loses one base. It would be relatively straightforward to subtract all of these bases from a player’s overall Bases Produced to get a more general measure of productivity or effectiveness (call it what you will), but I’ve just never done that. I will mention in passing, though, that the book by Alfredo Nasiff Fors that I cite in the footnote advocates this approach, as well.
1b. Your point about more ground balls leading to more double plays is interesting and is one I hadn’t considered. Unfortunately, I don’t have time at the moment to quantify, exactly, how those Bases Lost would mitigate the Team Base Production effectiveness of ground balls. However, I can quickly cherry pick a couple of examples to see what effect accounting for Team Bases Lost would have on Team Base Production Average. So, here goes: Greg Maddux was a ground ball pitcher and had a career TBPAVG of .514, but after eliminating his Team Bases Lost, that goes down to .469. Jim Palmer was more of a fly ball pitcher, with a career TBPAVG of .450. After eliminating his Team Bases Lost, that goes down to just .409.
My guess is that there is some correlation there, but probably not enough to overcome the general trend of ground outs being more productive than fly outs.
2. I count plays where a runner advances one base more than the batter on a base hit as “Extra Bases Run” (XBR) for the runner. So, when a runner goes from first to third on a single, he gets credit for one XBR13. I went back and forth for a while on whether the batter or runner ought to get credit for “producing” that base, but ultimately decided that it ought to go to the batter, the rationale being that the runner wouldn’t have been able to advance without the batter putting the ball in play in the first place. One of my goals is to keep the numbers in the system as simple as possible, without splitting them up or having them represent mathematical abstractions, and taking that approach seemed like the best way to do that.
It would be straightforward to split credit for XBR13 right down the middle between the batter and runner, but note that it is a little easier for a runner to score from second on a single (XBR24), so perhaps those runners should get somewhat less credit for the same base? As it stands, I just avoid that question altogether, give full credit to the batter, and post all of the baserunning data so that those who are interested could calculate a more sophisticated metric if they want to.
By the way, I have a separate stat on my site, “Running Bases Created”, which doesn’t strictly fit into the Bases Produced paradigm. That one adds to Running Bases Produced the number of times runners advance at least one more base than the batter on the same play, including Extra Bases Run and bases advanced on fly outs. That’s about as general a measure of the overall contribution a runner makes to advancing bases as I have developed with this sort of approach.
3. I’ll plead no contest on your last point and just clarify that I was only referring to the Bases Produced statistics I describe in my post. David Ortiz is clearly the AL leader in those for this season. I’ll also reiterate that I have been working in a bubble with all of this for a long time, so I would be happy to hear more about any other base production statistics that you think I ought to know!
Thanks again for your comments.
This is a good idea. I am curious about the ratio of a team’s total bases produced to its runs scored:
Total bases produced by a team / total runs scored by a team
That would show how far each team is from 1 run for every 4 bases, and how efficient each team is at converting bases into runs.
Hi–it’s a little hard to format tables on this interface, but here you go, for the 2016 season:
Team BP RS BP/RS
col 5006 821 6.10
bos 5328 850 6.27
bal 4482 709 6.32
tex 4665 731 6.38
cle 4733 740 6.40
stl 4660 728 6.40
tor 4651 722 6.44
sdg 4149 642 6.46
was 4663 713 6.54
sea 4675 714 6.55
ari 4650 709 6.56
det 4578 697 6.57
cin 4358 663 6.57
tbr 4220 642 6.57
lad 4500 683 6.59
chc 4983 755 6.60
hou 4566 690 6.62
min 4577 688 6.65
laa 4425 663 6.67
oak 4169 623 6.69
nyy 4365 644 6.78
kan 4301 631 6.82
pit 4735 693 6.83
chw 4395 637 6.90
sfo 4624 663 6.97
nym 4222 604 6.99
phi 4029 576 6.99
mil 4453 635 7.01
mia 4366 622 7.02
atl 4394 610 7.20
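For anyone who wants to reproduce the last column, here is a short Python sketch using three rows from the table above. The variable and function names are hypothetical. It also shows how far each ratio sits above the 4 bases that a single run strictly requires.

```python
# (BP, RS) pairs copied from three rows of the table above.
teams = {
    "col": (5006, 821),
    "bos": (5328, 850),
    "atl": (4394, 610),
}


def bases_per_run(bp, rs):
    """Bases Produced per run scored (BP/RS)."""
    return bp / rs


for team, (bp, rs) in teams.items():
    ratio = bases_per_run(bp, rs)
    # Bases produced beyond the 4 needed for each run that actually scored.
    surplus = ratio - 4
    print(f"{team} {ratio:.2f} {surplus:.2f}")
```

This prints ratios of 6.10, 6.27 and 7.20, so even the most efficient team in the table produced about 2.1 bases per run that did not end up on the scoreboard.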
Thank you for the data.
You have presented this as an alternative to sabermetric analyses that doesn’t require a large body of historical data (linear weights, run expectancy, etc.), but can be calculated just from the actual events that each player produces. Fair enough. But an obvious flaw if one did want to use this method to rate the value of players is apparent in your first equation, for batting bases produced:
BBP = 1 * 1B + 2 * 2B + 3 * 3B + 4 * HR + BB + HBP
This formula implies that the value of various offensive events is directly proportional to the number of bases—e.g., a HR is worth four times as much as a single. But we know that this is incorrect, that a HR is worth less than four times a single. In fact your BBP average is a lot like slugging average, and the other two averages you use to account for baserunning and other types of advancing on the bases follow the same premise. Like BA and OBP, these are relatively easy to calculate, but provide only a rough indication of a player’s value.
Also, the running bases produced (RBP) metric does not seem to include bases advanced when they involve a fractional value. E.g., a runner is expected to advance from first to second when the following batter hits a single, but sometimes may advance to third. Since the latter is not a given, the runner gets some credit in sabermetric analyses for taking this extra base, but since it sometimes happens, he doesn’t get credit for a full base (more precisely, does not get full credit for the extra run value associated with the extra base). If I understand your approach, you give all the credit for the advance to the hitter, when in fact, advancing the extra base is clearly due in large part to the base runner.
Thanks for your feedback.
You’re right–I am just counting bases + events here, and they may be taken as a measure of a player’s overall “value”, but ultimately they just reflect those events.
I think you might be missing the connections between the various measures, however, in your criticism of the calculation of Batting Bases Produced. That stat counts how many bases a batter advances to on base hits (+ walks and times hit by pitch), and in that context a home run certainly is four times more valuable than a single, since a batter advances four bases on a homer and only one on a single.
Base hits also allow runners to advance, however, which means that batters produce TBP with them, too. Over the 90 seasons I’ve got in my database, an average of .835 TBP are produced on singles and 1.412 TBP are produced on each home run. If you factor in the BBP produced on the same plays, then on average 2.95 (=5.412 / 1.835) times more BP are produced on home runs than on singles. That average, of course, depends on how many men are on base in each situation. It turns out, interestingly enough, that there tend to be fewer men on when players hit home runs than when they hit singles, so if you factor in that difference, the BPAVG for home runs is almost exactly 3 times as much as that for singles. I don’t know what sabermetricians have calculated the “value” of home runs to be in comparison to singles, but if it is in the neighborhood of 2.95 to 3, I wouldn’t be too surprised.
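The 2.95 figure can be checked directly. In this Python sketch the two TBP averages are the ones quoted above, and the constant names are hypothetical:

```python
# Average Team Bases Produced per event, as quoted above.
AVG_TBP_PER_SINGLE = 0.835
AVG_TBP_PER_HOME_RUN = 1.412

# Add the Batting Bases Produced on the same play: 1 for a single, 4 for a homer.
bp_per_single = 1 + AVG_TBP_PER_SINGLE
bp_per_home_run = 4 + AVG_TBP_PER_HOME_RUN

ratio = bp_per_home_run / bp_per_single
print(round(bp_per_single, 3), round(bp_per_home_run, 3), round(ratio, 2))
```

That gives 1.835 BP per single, 5.412 BP per home run, and a ratio of 2.95.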
As far as Batting Base Production Average is concerned, yes, it is just an extension of slugging percentage (with a few minor modifications), and I make no bones about that. You’re also right in that there are no “fractional values” in this system; that’s part of its aesthetic. That may make it less powerful or exact than sabermetric models, but I think that also gives this approach the benefits of clarity and simplicity.
As I mentioned in a response to previous comments, it would be relatively easy to split up the credit for team bases produced between the batter and the runner when the runner advances more bases than the batter on the same play, but I haven’t done so in part because it does lead down a rabbit hole of further analysis. The ability to do this depends not only on the runner, but also on the base being advanced to, in which direction the ball is hit and how far, how many outs there are in the inning, and so on and so forth. As it stands, I do account for the times that a runner is able to do this, but just do not include it in his BP total. Since it sounds like others have already tackled much of the rest of the analysis problem, I will be happy to not have to replicate their work. 🙂
Thanks again for your comments.