We all know the story: Yoenis Cespedes is a bad defensive center fielder. In 912 career innings in center field, Cespedes has rated miserably in both Ultimate Zone Rating (UZR), with a -17.6 UZR/150, and Defensive Runs Saved (DRS), with a prorated -23.7 DRS/150. Based on those metrics, he should continue to be an awful defensive center fielder in 2016, right?
Not necessarily. Let’s use a few different methods to estimate Cespedes’ defensive value as a center fielder and determine how effective he will be in the future.
Method 1: Regress Past Defensive Data in CF
This is the simplest (and crudest) method of all. If we average Cespedes’ center-field contributions per 150 games by UZR (-17.6) and DRS (-23.7), we find that Cespedes is a -20.65-run defender per 150 games. Because of the small 912-inning sample, we’ll regress that by 50% and estimate that Cespedes is a -10.3 runs per 150 games defender in center. This is roughly what many people in the analytical community believe Cespedes’ defensive value in center field to be. The methods below illustrate why I disagree with this valuation.
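Method 1’s arithmetic can be sketched in a few lines. Note that the straight average of -17.6 and -23.7 works out to -20.65, and that the 50% regression factor is a judgment call, not a derived quantity:

```python
# Method 1: average the two per-150-games defensive ratings in CF, then regress.
uzr_per_150 = -17.6   # UZR/150 in center field (912 innings)
drs_per_150 = -23.7   # prorated DRS/150 in center field

raw_estimate = (uzr_per_150 + drs_per_150) / 2   # -20.65 runs per 150 games
regressed = raw_estimate * 0.5                   # ~ -10.3 runs per 150 games
```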
Method 2: Combine Cespedes’ Range in CF with his Arm Throughout the Outfield
One thing everyone can agree on with Cespedes: he has a cannon of an arm. Whether he’s playing center field or left field, we should expect his arm to be significantly above average, right?
Well, over his 912 career innings in center field, UZR and DRS seem to disagree with that expectation, rating his arm at -0.8 runs and +2 runs, respectively. Passable, no doubt, but not the cannon most of us associate with Cespedes.
Yet, if we look at his entire career in the outfield, including time in both center field and left field, his arm has been worth +28 runs by DRS and +26.5 runs by UZR in roughly 4300 innings. When averaged and scaled to 150 games, the value of his arm comes out to roughly +9.5 runs per 150 games over a very large sample, much more in line with what we would expect.
Next, we must factor Cespedes’ center-field range into the equation. In 912 innings, DRS pegs his range (they term it rPM) at -17, while UZR estimates his range (they use RngR) at -12.2. When averaged and scaled to 150 games, his range comes out to -20.4 runs per 150 games. Because of the small 900-inning sample, we’ll once again regress his range by 50%, getting us to -10.2 runs per 150 games.
Factor in his arm, worth +9.5 runs per 150 games, and suddenly our estimate of Cespedes comes to -0.7 runs per 150 games in center field. In other words: his excellent arm makes up for his poor range, making him a roughly league-average defensive center fielder.
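As a sanity check, Method 2’s combination can be written out directly; all inputs are the per-150-games figures cited above:

```python
# Method 2: regressed CF range plus career arm value, per 150 games.
range_cf = -20.4                   # CF range (rPM and RngR averaged, scaled to 150 games)
range_regressed = range_cf * 0.5   # 50% regression for the small 912-inning sample
arm = 9.5                          # career arm value across LF and CF

estimate = range_regressed + arm   # ~ -0.7 runs per 150 games
```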
Method 3: Isolate the Value of Cespedes’ Arm, Then Use Positional Adjustments to Estimate Cespedes’ Range in CF
This is the most complicated method so far. First, we must become comfortable with the idea of positional adjustments. Essentially, positional adjustments assign a run value to each position, using historical data on players switching positions to estimate each position’s defensive difficulty. For example, shortstop is a difficult position to play — and hence carries a +7.5 run positional adjustment (per 162 games) — while first base is not, with a -12.5 run adjustment. Theoretically, if a shortstop were to switch to first base, positional adjustments would predict a 20-run improvement in his defensive numbers per 162 games.
Of course, positional adjustments don’t always work so conveniently, a reality the Red Sox discovered the hard way after moving Hanley Ramirez from shortstop to left field backfired tremendously. Indeed, the difficulty of learning a new position oftentimes overshadows the theoretical improvement that should come from moving down the defensive spectrum.
In the outfield, however, things work much more smoothly, simply because each outfield position requires roughly the same skill set: speed, first-step quickness, and efficient route running. Using the positional adjustments from FanGraphs, we’d expect a left fielder (-7.5 run positional adjustment) to rate approximately 10 runs worse in center field (+2.5).
For this exercise, we’ll isolate Cespedes’ arm from his range, using the +9.5 runs per 150 games figure from Method 2 to estimate the value of his arm (or +10.3 runs per 162 games). Why? Throwing arm strength is, for the most part, something we don’t expect to change much when shifting from left field to center. The main difference between the two positions is the range they require.
Estimating Cespedes’ range in center field using positional adjustments requires some tricky math. First, let’s examine Cespedes’ range throughout his entire outfield career. In 4295.33 innings combined between the two positions, Cespedes’ range is estimated at -13 runs by DRS (rPM) and -4.3 runs by UZR (RngR) — an average of -8.65 runs over that span, or -2.9 runs per 162 games (FanGraphs’ positional adjustments are scaled to 1458 innings, or 162 games).
Next, let’s calculate the percentage of his innings in left and center. 3383/4295.33 shows us that 78.76% of his innings came in left field, and, by extension, that 21.24% of his innings came in center.
Now, the tricky part: algebra. If “x” is his range in CF, “x+10” is his range in LF, and +10 is the positional adjustment per 162 games from LF to CF, we solve for x with the following formula:
0.2124 * x + 0.7876 * (x+10) = -2.9
Wolfram Alpha, what say you?
x = -10.8, or -10.8 range runs per 162 games in CF.
Now, factor in Cespedes’ +10.3 runs per 162 games from his arm, and you arrive at his defense being worth -0.5 runs per 162 games. Just as in Method 2, the value of Cespedes’ throwing arm essentially counteracts his poor range, once again making him a roughly league-average defender in center field.
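The Method 3 algebra is simple enough to skip Wolfram Alpha: since the two innings shares sum to 1, the equation reduces to x = combined range minus (LF share times 10). A minimal sketch:

```python
# Method 3: solve pct_cf*x + pct_lf*(x + 10) = combined_range for x,
# where x is CF range per 162 games and x + 10 is LF range.
pct_cf, pct_lf = 0.2124, 0.7876   # career innings shares in CF and LF (sum to 1)
combined_range = -2.9             # career range runs per 162 games (DRS/UZR average)
gap = 10                          # LF-to-CF positional adjustment per 162 games

x = (combined_range - pct_lf * gap) / (pct_cf + pct_lf)   # ~ -10.8 range runs
total = x + 10.3                  # add the arm value: ~ -0.5 runs per 162 games
```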
Method 4: Use Positional Adjustments to Estimate Cespedes’ Total Value in CF
While Methods 2 and 3 are certainly improvements over Method 1, there are some minor flaws in the methodology for each of the two methods. In Method 2, we arbitrarily regressed Cespedes’ range in CF by 50%, when in truth we don’t know exactly how much his range needs to be regressed. In both Methods 2 and 3, we assumed that the value of Cespedes’ arm wouldn’t change significantly by moving from LF to CF, when in reality it may be more difficult to accumulate value via throwing as a center fielder.
To address these concerns, let’s repeat the Method 3 calculation, except instead of attempting to find Cespedes’ range in CF, we’ll try to estimate his total value in CF, using nothing other than positional adjustments, UZR, and DRS. Rather than breaking those metrics down into their individual components, we’ll simply apply the positional adjustments to the metrics themselves — a more traditional calculation.
First, let’s average Cespedes’ total DRS (15 runs) and UZR (20.7 runs) and scale it to 162 games, arriving at +6.1 runs per 162 games between left and center. Then, let’s do the same algebra we did in Method 3, with “x” representing his UZR/DRS in CF and “x+10” representing his UZR/DRS in LF.
0.2124 * x + 0.7876 * (x+10) = 6.06
We’ll head over to Wolfram Alpha one last time, with x = -1.8 runs per 162 games.
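The same one-line solve handles Method 4, with x now representing Cespedes’ total UZR/DRS value in center per 162 games:

```python
# Method 4: solve pct_cf*x + pct_lf*(x + 10) = 6.06 for x,
# where x is total CF value per 162 games and x + 10 is the LF equivalent.
pct_cf, pct_lf = 0.2124, 0.7876   # career innings shares in CF and LF

x = (6.06 - pct_lf * 10) / (pct_cf + pct_lf)   # ~ -1.8 runs per 162 games
```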
This might be the most accurate estimate of his value in CF of all, as it relies on neither the raw value of his arm (as in Methods 2 and 3) nor a regressed version of his range in center (as in Methods 1 and 2).
Don’t believe the skeptics. While Cespedes has rated terribly in roughly 900 innings of data in center field, it’s silly to limit ourselves to such a small center-field sample when we have more than 4000 innings of outfield data, separate range and arm ratings, and positional adjustments at our disposal. Using some basic arithmetic, we’ve shown that Cespedes should probably be no worse than a hair below average defensively in center field, as his extremely valuable arm (+10.3 runs per 162 games) makes up for his below-average range.
In Part 1 of the “Trying to Improve fWAR” series, we focused on how using runs park factors for a FIP-based WAR leads to problems when calculating fWAR, and suggested the use of FIP park factors instead. Today we’ll analyze a different yet equally important problem with the current construction of FanGraphs Wins Above Replacement for both position players and pitchers: league adjustments.

When calculating WAR, the reason we adjust for league is simple: the two leagues aren’t equal. The American League has been the superior league for some time now, and considering that all teams play about 88% of their games within their league, the relative strength of the leagues is relevant when trying to put a value on individual players. If a player moved from the American League, the stronger league, to the National League, the weaker league, we’d expect his basic numbers to improve; yet, if we properly adjust for quality of league when calculating WAR, his WAR shouldn’t change significantly.
The adjustments that FanGraphs makes for strength of league are unclear. The glossary entry “What is WAR?” and the links within it don’t seem to reference adjusting for the strength of a player’s league/division at all. The only league adjustment is within position player fWAR, described as “a small correction to make it so that each league’s runs above average balances out to zero”. Not exactly a major adjustment. Rather than evaluating FanGraphs’ methods of adjusting for league, let’s instead look at how the two leagues compared in fWAR for both pitchers and position players in 2014:
Interestingly, AL pitchers seem to get a much greater advantage than AL position players from playing in a superior league. Yes, the AL does have a DH, but the effect of having a DH should show up as the AL replacement-level RA/9 being higher than the NL replacement-level RA/9. Having a DH (and hence a higher run environment) does not mean that the league should have more pitching fWAR. Essentially, somewhere in the calculation and implementation of fWAR, the WAR of AL pitchers is being inflated by around 13% and the WAR of NL pitchers is being deflated by the same amount. Meanwhile, AL position players don’t benefit at all from playing in a superior league. To account for league strength, the entire American League should benefit from playing in the stronger league, not just the pitchers. In order to find out what the league adjustment should be (at least for the 2015 season), let’s look at each league’s interleague performance since 2013:
The “Regressed Winning Percentage” is simply the league’s interleague Winning Percentage regressed to the mean by a factor of .1, meaning that 90% of the league’s interleague WP% is assumed to be skill. Each league’s interleague winning percentage is regressed slightly to ensure that we aren’t overestimating the differences between the two leagues. Part of the reason we regress each league’s interleague winning percentage is because the interleague system is admittedly not perfect; while NL teams believe that the AL has an inherent advantage because of their everyday DH, AL teams complain about having pitchers who can’t bunt and a managerial style that is strategically difficult for their managers. While both sides have valid points, interleague games probably don’t hurt one side significantly more than the other, meaning that the vast amount of data that comes from interleague games is reliable as long as it is properly regressed.
Just knowing each league’s regressed interleague winning percentage, however, is not enough. We also need to know the percent of games each league plays within its own league. Why? The more games the league plays against the other league, the less playing in a superior league matters; the only reason we have to adjust for strength of league in the first place is because of the disparity in competition between the leagues. In a 162-game season, a team plays exactly 20 games against interleague opponents, meaning that 142 of 162 games, or 87.7% of a team’s schedule, is intra-league. Therefore, in order to find each league’s multiplier, the following equation is used:
League Multiplier = 2 * ((.877 * Regressed WP%) + ((1-.877) * Opponent Regressed WP%))
In this calculation, the “Opponent Regressed WP%” is simply the opposing league’s Regressed WP%. This is incorporated into the formula because each league plays 12.3% of its games (20 games) against the other league. Without further ado, here are the league multipliers:
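The league-multiplier formula translates directly to code. Note that the .525 regressed WP% below is an illustrative value I have back-filled — it reproduces the roughly 1.038 AL multiplier this article arrives at — and is not a figure taken from the table:

```python
def league_multiplier(reg_wp: float, opp_reg_wp: float, intra_share: float = 0.877) -> float:
    """2 * ((intra_share * Regressed WP%) + ((1 - intra_share) * Opponent Regressed WP%))."""
    return 2 * (intra_share * reg_wp + (1 - intra_share) * opp_reg_wp)

# Illustrative regressed interleague winning percentages (assumed, not from the table).
al = league_multiplier(0.525, 0.475)   # ~ 1.038
nl = league_multiplier(0.475, 0.525)   # ~ 0.962
```

Because interleague play is zero-sum, the two multipliers always sum to exactly 2, which is a handy check on the arithmetic.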
As expected, the American League comes out as the stronger league, albeit by a smaller margin than its advantage in fWAR (remember, the AL’s league multiplier in fWAR was 1.056). Still, there are other adjustments that can be made besides adjusting for league. In the same way that the superiority of the American League is no secret, the fact that all divisions are not created equal is relatively obvious to most baseball fans. The AL East has long been considered the best division in baseball, and their inter-division record backs up that reputation; they have a .530 inter-division winning percentage over the last two seasons (only including games in their own league), best in the American League. Using the same process we used to calculate the league multipliers, division multipliers were calculated as shown below, with the data from the 2013-2014 seasons:
One difference from the league multiplier calculation: not all games were counted when determining what percent of a division’s games were intra-division. Because we already adjusted for league, the 20 interleague games on each team’s schedule were excluded. The .535 figure in column 6 is simply the number of games each team plays against its own division, 76, divided by the number of non-interleague games each team plays, 142. In addition, the “Interdivision Opponent Regressed WP%” is the average opponent each division faces when playing outside its division in non-interleague games. The AL East, for example, plays the AL Central and AL West in its remaining intra-league games, so its .487 inter-division opponent regressed WP% is a simple average of the AL Central’s Regressed WP%, .489, and the AL West’s Regressed WP%, .484.
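The division multipliers follow the same formula, with the intra-division share of 76/142 ≈ .535. A sketch using the AL East: regressing its .530 inter-division winning percentage by 10% toward .500 gives .527, and the .487 opponent figure is the average of the AL Central’s and AL West’s regressed WP%:

```python
def division_multiplier(reg_wp: float, interdiv_opp_reg_wp: float,
                        intra_share: float = 76 / 142) -> float:
    # Same structure as the league multiplier, applied over intra-league games only.
    return 2 * (intra_share * reg_wp + (1 - intra_share) * interdiv_opp_reg_wp)

al_east_reg_wp = 0.530 * 0.9 + 0.500 * 0.1          # regress .530 by 10% -> .527
al_east = division_multiplier(al_east_reg_wp, 0.487)  # opp = (.489 + .484) / 2
```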
Now that we have both divisional and league multipliers, we can derive each division’s total (observed) multiplier by simply multiplying the two:
How do these multipliers, which were fairly easy to calculate, compare with the multipliers implied in FanGraphs’ WAR calculations? Below, the multipliers are compared in bar graph form:
As you can see, the current construction of fWAR artificially helps certain divisions while hurting others. Let’s get a closer look at the problem by graphing how much fWAR inflates each division’s pitchers and position players relative to the multipliers we just calculated:
A clear theme emerges from the chart: pitching WAR at FanGraphs is in need of serious repair. Pitching fWAR dramatically overvalues the American League. All three American League divisions have Pitching fWAR Multipliers at least 4.5% higher than they should be, while each National League division’s Pitching fWAR Multiplier is at least 6% lower than it should be.
Is this just a random aberration for 2014? Probably not; in 2013, the American League’s Pitching fWAR Multiplier was 1.095, not much lower than 2014’s 1.127 (and nowhere near the 1.038 value we got). For whatever reason, Pitching fWAR overvalues American League pitchers and undervalues their National League counterparts. The strongest National League division, the NL Central, suffers the most from this calculation error, while the weaker American League divisions (the AL Central and AL West) experience the greatest benefit. Fans of the Reds and Brewers in particular should take solace in the fact that their teams were hurt the most by not only the errors discussed here but also the park factor miscalculation discussed in Part 1 (hint: fWAR seriously undervalues Cueto).
As the chart shows, position player fWAR overvalues the National League, albeit to a lesser extent. Position player fWAR suffers an almost entirely different problem than pitcher fWAR: unlike pitcher fWAR, which seems to over-adjust for league, position player fWAR doesn’t adjust for strength of league and division at all. This inflates the fWAR of players and teams in weaker divisions (the NL East and NL West, for example) while deflating the fWAR of players in stronger divisions, like the AL East.
While the issue with position player fWAR is obvious (a lack of league and divisional adjustments), the problem with pitching fWAR is less clear. Perhaps part of the problem lies in how replacement level is calculated; I am not familiar enough with FanGraphs’ process for calculating WAR to know whether there is a clear, fixable mistake. Either way, I hope this article inspires change in the way fWAR is calculated for both pitchers and position players, with the changes to position player fWAR being much simpler to incorporate.
FanGraphs Wins Above Replacement is considered by many in the sabermetric community to be the holy grail of WAR. And even though I’m writing a piece critical of fWAR, FanGraphs is still the first website I visit when I want a basic understanding of a specific player’s or team’s value. Don’t view this article as an attack on fWAR or FanGraphs, both of which I use frequently; instead, consider it constructive criticism.
fWAR, specifically for pitchers, is riddled with minor problems that together make the metric less valuable. In Part 1 of the series, we’re going to look at a hotly debated issue regarding fWAR that has been brought up by other readers before: the fWAR park factors.
According to the FanGraphs glossary, a basic runs park factor is used when calculating fWAR. The implicit assumption: because FIP models ERA, using runs park factors for FIP shouldn’t be a problem.
Unfortunately, that assumption simply isn’t true. The inputs of FIP (home runs, walks, and strikeouts) cover only about 30% of plate appearances, and some ballparks (Citi Field, for example) inflate HR/9 and FIP despite suppressing runs in general. If Pitcher fWAR is based on FIP, FIP park factors, not runs park factors, must be used. Below is a table comparing runs and FIP park factors for different teams/ballparks, with the FIP park factor equal to ((13*HRPF) + (3*BBPF) - (2*SOPF)) / 14, using data from the FanGraphs park factors.
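The FIP park factor formula used for the table can be sketched as follows, assuming the standard convention that park factors are centered at 100 (the weights mirror FIP’s own 13/3/-2 coefficients):

```python
def fip_park_factor(hr_pf: float, bb_pf: float, so_pf: float) -> float:
    # FIP PF = ((13 * HRPF) + (3 * BBPF) - (2 * SOPF)) / 14
    return (13 * hr_pf + 3 * bb_pf - 2 * so_pf) / 14

neutral = fip_park_factor(100, 100, 100)      # a fully neutral park stays at 100
homer_park = fip_park_factor(110, 100, 100)   # a HR-friendly park inflates FIP
```

Note that the denominator 14 equals 13 + 3 - 2, which is what keeps a neutral park at exactly 100.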
In addition, the typical gap between the Basic and FIP park factors was a staggering 5.5 points. Clearly, applying runs park factors to FIP significantly benefits some teams’ Pitcher fWAR and hurts others’.
While the Marlins, Red Sox, Pirates, Twins, and Royals benefit from park factors that overestimate their ballpark’s FIP-inflating ability, the Reds, Brewers, White Sox, Yankees and Mets experience the opposite effect, falsely increasing/decreasing these teams’ Pitcher fWAR.
Looking at the team pitching leaderboards, the effect of this mistake is pronounced for several teams. For example, the Mets, despite ranking 9th in the National League in FIP while playing in a ballpark that inflates FIP by 2%, rank dead last in the National League in Pitcher fWAR. Similarly, the Red Sox rank 5th in the AL in Pitcher fWAR despite ranking 10th in the AL in FIP and playing in a ballpark that suppresses FIP by 4%.
Using FIP park factors instead of runs park factors is a simple change that would vastly improve the accuracy of Pitcher fWAR. In the next segment of “Trying to Improve fWAR”, I’ll examine the league adjustments (or lack thereof) in both Position Player and Pitcher fWAR.