Seeing the Complete Picture: Building New Statistics to Find Value in the Details

by Colin Dew-Becker
October 29, 2013

Attempting to accurately estimate the number of runs produced by players is one of the most important tasks in sabermetrics. While there is value in knowing that a player averages four hits every ten at-bats, that value comes from knowing that more hits tend to lead to more runs. On-base percentage became popularized through Moneyball in the early 2000s because the Oakland Athletics, among other teams, realized that getting more runners on base would lead to more opportunities to score runs. Knowing a player’s batting average or on-base percentage can be informative, but that information does nothing to quantify how the player contributed to a team’s ability to score runs.

The classic method for determining how many runs a player contributes to his team is to look at his RBI and runs scored totals. However, both of those statistics are extremely dependent on timely hitting and the quality of the rest of the team. A player will not score many runs nor have many RBI opportunities if the rest of the players on his team, particularly the players around him in the lineup, are not productive.

One of the more popular sabermetric methods to estimate a player’s run production is to find the average number of runs that certain offensive events are worth across all situations and then apply those weights to a player’s stat line. In this way, it doesn’t matter if a player comes to the plate with the bases loaded every time or the bases empty every time, just that he produced the specific type of event.

Here is a chart that shows the average number of runs that scored in an inning following each combination of base and out states in 2013^^.
Base State   0 OUT   1 OUT   2 OUT
0**          0.47    0.24    0.09
1            0.82    0.50    0.21
2            1.09    0.62    0.30
3            1.30    0.92    0.34
1-2          1.39    0.84    0.41
1-3          1.80    1.11    0.46
2-3          2.00    1.39    0.56
1-2-3        2.21    1.57    0.71

We can see in the chart that in 2013, with no men on base and zero outs, teams scored an average of 0.47 runs through the end of the inning. If a batter came to the plate in that situation and hit a single, the new base/out state is a man on first with zero outs, a state in which teams scored an average of 0.82 runs through the end of the inning. If the batter had instead caused an out, the new base/out state would have become bases empty with one out, a state in which teams only averaged 0.24 runs through the remainder of the inning. Consequently, we can say that a single in that situation was worth 0.58 runs (0.82 minus 0.24) in relation to the value of an out in the same situation.

If we repeat this process for every single that was hit in 2013, and apply the averages from the chart to each single depending on when it occurred, we find that an average single in 2013 was worth approximately 0.70 runs in relation to the average value of an out. This is known as the linear weights method for calculating the context-neutral value of certain events. Check this article from the FanGraphs Library, and the links within, for more information on linear weights estimation methods.

There have been a variety of statistics created to estimate a player’s performance in a context-neutral environment using the linear weights method over the last few decades. Recently, one of the more popular linear weight run estimators, particularly here at FanGraphs, has been weighted On-Base Average (wOBA), introduced in The Book: Playing the Percentages in Baseball. wOBA is arguably the best publicly available run estimator, but I think it has potential for improvement by incorporating more specific and different kinds of events into its estimate.
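The leadoff-single arithmetic described above can be sketched in a few lines of Python. This is only an illustration built from the 2013 chart, not the code behind the article: the names `RUN_EXPECTANCY`, `run_value`, and `average_run_value` are hypothetical, and inning-ending events are deliberately left out.

```python
# Run expectancy through the end of the inning, copied from the 2013 chart.
# Keys are base states ("0" = bases empty); tuple index is the number of outs.
RUN_EXPECTANCY = {
    "0":     (0.47, 0.24, 0.09),
    "1":     (0.82, 0.50, 0.21),
    "2":     (1.09, 0.62, 0.30),
    "3":     (1.30, 0.92, 0.34),
    "1-2":   (1.39, 0.84, 0.41),
    "1-3":   (1.80, 1.11, 0.46),
    "2-3":   (2.00, 1.39, 0.56),
    "1-2-3": (2.21, 1.57, 0.71),
}


def run_value(before, after, runs_scored=0):
    """Change in run expectancy for one event, plus any runs that scored.

    Each state is a (base_state, outs) pair. Inning-ending events
    (three outs) are not handled in this sketch.
    """
    base_before, outs_before = before
    base_after, outs_after = after
    return (RUN_EXPECTANCY[base_after][outs_after]
            - RUN_EXPECTANCY[base_before][outs_before]
            + runs_scored)


def average_run_value(occurrences):
    """Average run value over (before, after, runs_scored) occurrences."""
    values = [run_value(b, a, r) for b, a, r in occurrences]
    return sum(values) / len(values)


leadoff = ("0", 0)
single = run_value(leadoff, ("1", 0))  # 0.82 - 0.47
out = run_value(leadoff, ("0", 1))     # 0.24 - 0.47
print(f"single vs. out: {single - out:+.2f}")  # +0.58
```

A linear weight for singles would then come from passing one `(before, after, runs_scored)` tuple per single in the play-by-play data to `average_run_value`, and comparing the result with the same average computed over outs.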
wOBA is traditionally built with seven statistics: singles, doubles, triples, home runs, reaches on error, unintentional walks, and hit by pitches. While some versions may exclude reaches on error and others may include components like stolen bases and caught stealing, I will focus exclusively on the version presented in The Book that uses those seven statistics. By limiting the focus to just those seven components, wOBA can be calculated perfectly in every season since at least 1974 (as far back as most play-by-play data goes), and can be calculated reasonably well for the entire history of the game. While it can be informative to see what Babe Ruth’s wOBA was in 1927, when analyzing players in recent history, particularly those currently playing, accuracy in the estimation should be the most important consideration. Narrowing the focus to just seven statistics, some broadly defined, will limit how accurately we can estimate the number of runs a player produced in a context-neutral environment.

The statistics I refer to as “broadly defined” are singles and doubles. I say that because it is a relatively easy task to convince even a casual baseball fan that not all singles are created equal. If we compare singles hit to the infield with singles hit to the outfield, we’ll notice that outfield singles, on average, advance runners farther along the basepaths than infield singles. For example, in 2013, with a man on first, only 3.2% of infield singles ended with men on first and third base compared to 29.9% of outfield singles. If outfield singles create more “1-3” base states than infield singles, and we know from the chart above that “1-3” base states have a higher run expectancy than “1-2” base states in the same out state, then we know that outfield singles are producing more runs on average than infield singles.
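To get a rough feel for the size of that gap, the sketch below combines the two percentages quoted above with the zero-out run expectancies from the 2013 chart. It rests on a loud simplification that is not in the article: every single that does not end with men on first and third is assumed to end with men on first and second, and the quoted percentages are applied to the zero-out row only. The function name is hypothetical.

```python
# Zero-out run expectancies from the 2013 chart.
RE_FIRST_AND_SECOND = 1.39  # "1-2" base state, 0 outs
RE_FIRST_AND_THIRD = 1.80   # "1-3" base state, 0 outs

# Share of singles with a man on first that ended with men on first and third.
SHARE_FIRST_AND_THIRD = {"infield": 0.032, "outfield": 0.299}


def expected_re_after_single(share_first_and_third):
    """Expected run expectancy after a single with a man on first, 0 outs.

    ASSUMPTION (not from the article): all remaining singles leave
    runners on first and second, with no runs scored and no outs made.
    """
    return (share_first_and_third * RE_FIRST_AND_THIRD
            + (1 - share_first_and_third) * RE_FIRST_AND_SECOND)


infield = expected_re_after_single(SHARE_FIRST_AND_THIRD["infield"])
outfield = expected_re_after_single(SHARE_FIRST_AND_THIRD["outfield"])
print(round(infield, 3), round(outfield, 3), round(outfield - infield, 3))
# 1.403 1.513 0.109
```

Under those assumptions the outfield single leaves the team about a tenth of a run better off in this one situation; the real gap depends on every base/out state and on outcomes the sketch ignores.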
If outfield and infield singles are producing different amounts of runs on average, then we should differentiate between the two events. Beyond just breaking down hits by fielding location, we can refine hit types even further. If we differentiate singles and doubles by direction (left, center, right) and by batted ball type (bunt, groundball, line drive, fly ball, pop up) we can more accurately reflect the value of each of those offensive events. While the difference in value between a groundball single to right field and a line drive single to center field is minimal, about 0.04 runs, those minimal differences add up over a season or career of plate appearances. Reach on error events should also be broken down like singles and doubles, as balls hit to the third baseman that cause errors are going to have a different effect on the base state than balls hit to the right fielder that cause errors.

The two other ways that wOBA accounts for run production by a batter are through unintentional walks and hit by pitches, notably excluding intentional walks. If a statistic is attempting to estimate the number of runs produced by a player at the plate, I believe the value created by unskilled events should also be counted. While it takes no skill to stand next to home plate and watch four balls go three feet wide of the strike zone, the batter is still given first base and affects his team’s run expectancy for the remainder of the inning. Distinguishing between runs produced from skilled and unskilled events is something that should be considered when forecasting future performance, as unskilled events may be harder to repeat. However, when analyzing past performance, all run production should be accounted for, no matter the skill level it required to produce those runs.
There is an argument that the value from an intentional walk should just be assigned to the batting team as a whole, as the batter himself is doing nothing to cause the event to occur; that is, the batter is not swinging the bat, getting hit by a pitch, or astutely taking balls that could potentially be strikes. However, as the players on the field are the only ones who directly affect run production (it isn’t an abstract “ghost runner” on first base after an intentional walk, it’s the batter), the value from the change in run expectancy must be awarded to players on the field. While it can be difficult to determine how to award that value for the pitching team with multiple fielders involved in every event (pitcher and catcher most notably and the rest of the fielders for balls put into play), the only player on the batting team who can receive credit for the event is the batter.

If we accept that the intentional walk requires no skill from the batter, but agree that he should still receive credit for the event, then we can extend that logic to all unskilled events in which the batter could be involved. Along with intentional walks, that would include “reaching on catcher’s interference” and “striking out but reaching on an error, passed ball, or wild pitch.” In those cases, it is the catcher rather than the pitcher causing the batter to reach base, but it doesn’t matter to the batting team. If the team’s run expectancy changed due to the batter reaching base, it makes no difference if it was the pitcher, catcher, or any other fielder causing the event to occur.

When building wOBA, the value of the weight for each component is calculated with respect to the value of an average out, like in the example above. Using the average value of all outs is very similar to using the broad definition of “single,” as discussed earlier.
Very often we hear about productive outs, and yet we rarely see statistics quantify the value of different types of outs in a context-neutral manner. If a batter were to exclusively make all of his outs as groundballs to the right side of the infield, he would hurt his team less than if he were to make all of his outs as groundballs to the center of the infield. Groundouts to the right side of the infield allow runners on second and third base to advance more easily than groundouts to the center of the infield. Additionally, groundouts to the center of the infield have more potential to turn into double plays than groundouts to the right side of the infield. As above, the differences in value are minimal, around 0.04 runs in this case, but they add up over a large enough sample.

To deal with the difference in the value of outs, all specific types of outs should also be included in any run estimation, weighted in relation to the average value of an out. For instance, in 2013 the average value of all outs in relation to the average value of a plate appearance was -0.258 runs while the average value of a fly out to center field in relation to the average value of a plate appearance was -0.230 runs. Consequently, we can say that a fly out to center field is worth +0.028 runs in relation to the average value of an out. We can do the same for groundouts to the left side of the infield (-0.015) or lineouts to center field (+0.021), as well as every other type of out broken down by direction, batted ball type, and fielding location. Interestingly, and perhaps not surprisingly, all fly outs and lineouts to the outfield are less damaging than an average out while all types of outs in the infield are more damaging than an average out, except for groundouts to the right side of the infield and sacrifice bunts.
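The re-basing of out values described above is a single subtraction, sketched here with the 2013 figures quoted in the text. The dictionary keys, function names, and the out counts in the example are hypothetical and only show how small per-out differences accumulate; they are not taken from any real player.

```python
# Average value of all outs relative to an average plate appearance, 2013.
AVG_OUT_VS_PA = -0.258


def relative_to_average_out(event_vs_pa, avg_out_vs_pa=AVG_OUT_VS_PA):
    """Re-express an out's value relative to the average out."""
    return event_vs_pa - avg_out_vs_pa


# Weights relative to the average out, using figures quoted in the article.
OUT_WEIGHTS = {
    "fly_out_cf": relative_to_average_out(-0.230),  # +0.028
    "groundout_left_infield": -0.015,
    "lineout_cf": 0.021,
}


def out_mix_value(out_counts, weights=OUT_WEIGHTS):
    """Total runs, relative to average outs, for a given mix of outs."""
    return sum(weights[kind] * count for kind, count in out_counts.items())


# HYPOTHETICAL counts, chosen only to illustrate the accumulation.
hypothetical_counts = {
    "fly_out_cf": 100,
    "groundout_left_infield": 50,
    "lineout_cf": 20,
}
print(round(out_mix_value(hypothetical_counts), 2))  # 2.47
```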
Taking the weights for each of these 104 components, applying them to the equivalent statistics for a league average hitter, and dividing by plate appearances, generates values that tend to fall between .280 and .300 based on the scoring environment, somewhat similar to the batting average for a league average player. In 2013, a league average player would have a score of .256 from this statistic compared to a batting average of .253. To make the statistic easily relatable in the baseball universe, I’ve chosen to scale the values in each season to batting average. The end result is a statistic called Offensive Value Added rate (OVAr) which has an average value equal to that of the batting average of a league average player in each season. So, if a .400 batting average is an historic threshold for batters, the same threshold can be applied to OVAr. Since 1993, as far back as this statistic can be calculated with current data, Barry Bonds is the only qualified player to post an OVAr above .400 in a single season, and he did it in four straight seasons (2001-2004).

Where OVAr mirrors the construction of the rate statistic wOBA, another statistic, Offensive Value Added (OVA), mirrors the construction of the counting statistic weighted Runs Above Average (wRAA). Here is the equation for OVA followed by the equation for wRAA.

OVA = ((OVAr - league OVAr) / OVAr Scale) x PA

wRAA = ((wOBA - league wOBA) / wOBA Scale) x PA

OVA values tend to be very similar to their wRAA counterparts, though they can potentially vary by over 10 runs at the extremes. In 2013, David Ortiz produced 48.1 runs above average according to OVA and “just” 40.3 runs above average according to wRAA, a 19.4% increase from his wRAA value. Of Ortiz’s extra 7.8 runs estimated by OVA, 4.3 of those runs came from the inclusion of intentional walks, and 2.5 of those runs came from Ortiz’s ability to produce slightly less damaging outs through his tendency to pull the ball to the right side of the field.
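Because OVA and wRAA share the same form, one small function covers both. The sketch below is illustrative only: the article does not publish the OVAr Scale, so the inputs in the first example (rate, scale, and plate appearances) are hypothetical placeholders. The second part simply re-derives the Ortiz comparison from the figures quoted above.

```python
def runs_above_average(rate, league_rate, scale, pa):
    """((rate - league rate) / scale) x PA, the shared form of OVA and wRAA."""
    return (rate - league_rate) / scale * pa


# HYPOTHETICAL inputs: only the .253 league figure comes from the article.
example = runs_above_average(rate=0.330, league_rate=0.253, scale=1.1, pa=600)
print(round(example, 1))  # 42.0

# David Ortiz, 2013, using the values quoted in the article.
ortiz_ova, ortiz_wraa = 48.1, 40.3
gap = ortiz_ova - ortiz_wraa
pct_increase = gap / ortiz_wraa * 100
print(round(gap, 1), round(pct_increase, 1))  # 7.8 19.4

# Runs not explained by intentional walks (4.3) or less damaging outs (2.5).
other = gap - 4.3 - 2.5
print(round(other, 1))  # 1.0
```

The remaining run or so would come from the other refinements, such as the finer breakdown of hit types.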
You won’t find many box scores or player pages that list direction, batted ball type, or whether the ball was fielded in the infield or outfield, but the data is publicly available for all seasons since 1993. While wOBA gives non-programmers the ability to calculate an advanced run estimator relatively easily, if we have data that makes the estimation more precise, then programmers should take advantage. Due to the relative difficulty in calculating these values, I’m providing links to spreadsheets with yearly OVAr and OVA values for hitters, Opponent OVAr and OVA values for pitchers, splits for hitters and pitchers based on handedness of the opposing player, and team OVA and OVAr values for offense and defense, with similar splits. Additionally, I’ve included wRAA values for comparison. Those values may not exactly match those you would find on FanGraphs due to rounding differences at various steps in the process, but they should give a general feel for the difference between OVA and wRAA.

I’ve obviously omitted the meat of the programming work, as I felt it was too technical to include every detail in an article like this. For more information on run estimators built with linear weights methodology I’d highly recommend reading The Book, The Hidden Game of Baseball by John Thorn and Pete Palmer, or any of a variety of articles by Colin Wyers over at Baseball Prospectus, like this one. I used ten years of play-by-play data to get a substantive sample++ of when each type of event happened on average, and I used a single season of data to create the run environments. Otherwise, the general construction of OVAr mirrors the work done by Tom Tango, Mitchel Lichtman, and Andrew Dolphin in The Book.

The next step for this statistic is to make it league and park neutral (nOVAr and nOVA). I chose to omit this step for the initial construction of these statistics as it was also omitted in the initial construction of wOBA and wRAA.
Also, the current methods to determine park factors used by FanGraphs and ESPN, among other sites, are somewhat flawed and not something I want to implement. Until that next step, enjoy a pair of new statistics.

OVAr and OVA, Ordered Batters
OVAr and OVA, Alphabetical Batters
OVAr and OVA, Ordered Batter Splits
OVAr and OVA, Alphabetical Batter Splits
OVAr, Ordered Qualified Batters
OVAr, Ordered Qualified Batter Splits
Opponent OVAr and OVA, Ordered Pitchers
Opponent OVAr and OVA, Alphabetical Pitchers
Opponent OVAr and OVA, Ordered Pitcher Splits
Opponent OVAr and OVA, Alphabetical Pitcher Splits
Opponent OVAr, Ordered Qualified Pitchers
Opponent OVAr, Ordered Qualified Pitcher Splits
OVAr and OVA, Teams
OVAr and OVA, Team Splits
OVAr and OVA, Ordered Weights
OVAr and OVA, Alphabetical Weights

^^ These averages exclude all events in home halves of the 9th inning or later to avoid biases created by walk-off hits and the inability of the home team to score an unlimited number of runs in the 9th inning or later like they can in any other inning.

** A number in the Base State column represents a runner on that base, with 0 representing bases empty.

++ I have one note on sample size that I didn’t think fit anywhere comfortably in the main body of the article. The biggest issue with a statistic built with very specific events is that some of those events are extremely rare. For instance, groundouts to the outfield have happened just 111 times since 1993, compared to groundouts to the infield that have happened 891,175 times since 1993. Consequently, the average value of outfield groundouts, split up by direction, can vary substantially from year to year as different events are added or taken away from the sample. I chose to use a ten-year sample to attempt to limit those effects as much as possible, but they still will be evident upon close examination. With that sample size, in 2013 a groundout to left field was worth -0.447 runs in relation to the average value of an out.
In 2006 the same event was worth -0.089 runs, while in 2000 it was worth +0.154 runs. As long as the statistic is built in a logically consistent manner, I don’t mind that low frequency events like outfield groundouts and infield doubles vary somewhat from year to year in estimated value, as the cumulative effect will be quite minimal. That being said, as I am trying to estimate the value of events as accurately as possible, the variation in value is a bit off-putting. It may be that a sample of 20 or more years would be necessary for those rare events, with a smaller sample size for the more common events. That adjustment will be considered for the nOVAr and nOVA implementations, but for OVAr and OVA I wanted the construction to be completely consistent.
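A back-of-the-envelope sketch shows why the rare events are so noisy and how much a longer window might help. It rests on assumptions that are not in the article: "since 1993" is taken to mean the 21 seasons from 1993 through 2013, events are treated as evenly spread across seasons, and the noise in an average is assumed to shrink with the square root of the number of independent events. All names are hypothetical.

```python
import math

OUTFIELD_GROUNDOUTS = 111      # since 1993, from the article
INFIELD_GROUNDOUTS = 891_175   # since 1993, from the article
SEASONS = 2013 - 1993 + 1      # ASSUMPTION: 1993 through 2013 inclusive


def expected_events(total, seasons, window):
    """Events expected in a window, assuming an even spread across seasons."""
    return total / seasons * window


def noise_ratio(n_small, n_large):
    """How much noisier an average over n_small events is than over n_large.

    ASSUMPTION: independent events with equal variance, so the standard
    error of the mean scales as 1 / sqrt(n).
    """
    return math.sqrt(n_large / n_small)


ten_year = expected_events(OUTFIELD_GROUNDOUTS, SEASONS, 10)
twenty_year = expected_events(OUTFIELD_GROUNDOUTS, SEASONS, 20)
print(round(ten_year, 1))                            # 52.9
print(round(noise_ratio(ten_year, twenty_year), 2))  # 1.41

# Spread of the three left-field groundout values quoted in the article.
quoted = {2013: -0.447, 2006: -0.089, 2000: 0.154}
print(round(max(quoted.values()) - min(quoted.values()), 3))  # 0.601
```

Under these assumptions a ten-year window holds only about 53 outfield groundouts in total, before splitting by direction, and doubling the window to 20 years would cut the noise by a factor of roughly 1.4 rather than eliminating it.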