Trying to Improve fWAR Part 2: League and Divisional Factors
In Part 1 of the “Trying to Improve fWAR” series, we focused on how using runs park factors for a FIP-based WAR leads to problems when calculating fWAR, and suggested the use of FIP park factors instead. Today we’ll analyze a different yet equally important problem with the current construction of FanGraphs Wins Above Replacement for both position players and pitchers: league adjustments. When calculating WAR, the reason we adjust for league is simple; the two leagues aren’t equal. The American League has been the superior league for some time now, and considering that all teams play about 88% of their games within their league, the relative strength of the leagues is relevant when trying to put a value on individual players. If a player moved from the American League, a stronger league, to the National League, a weaker league, we’d expect the player’s basic numbers to improve; yet, if we properly adjust for quality of league when calculating WAR, his WAR shouldn’t change significantly by moving into a weaker league.
The adjustments that FanGraphs makes for strength of league are unclear. The glossary entry “What is WAR?” and the links within it don’t seem to reference adjusting for the strength of a player’s league/division at all. The only league adjustment is within position player fWAR, and is described as “a small correction to make it so that each league’s runs above average balances out to zero”. Not exactly a major adjustment. Rather than evaluating FanGraphs’ methods of adjusting for league, let’s instead look at how the two leagues compared in fWAR for both pitchers and position players in 2014:
League | Position Player fWAR | Pitcher fWAR | Total fWAR
AL | 285.7 | 242.3 | 528.0
NL | 284.3 | 187.7 | 472.0
AL fWAR / League Average | 1.002 | 1.127 | 1.056
NL fWAR / League Average | 0.998 | 0.873 | 0.944
Interestingly, AL pitchers seem to get a much greater advantage than AL position players from playing in a superior league. Yes, the AL does have a DH, but the effect of having a DH should be in the form of the AL replacement level RA/9 being higher than the NL replacement level RA/9. Having a DH (and hence a higher run environment) does not mean that the league should have more pitching fWAR. Essentially, somewhere in the calculation and implementation of fWAR, the WAR of AL pitchers is being inflated by around 13% and the WAR of NL pitchers is being deflated by the same amount. Meanwhile, AL position players don’t benefit at all from playing in a superior league. In order to account for league strength, the entire American League should benefit from playing in the stronger league, not just the pitchers. In order to find out what the league adjustment should be (at least for the 2015 season), let’s look at each league’s interleague performance since 2013:
League | Wins | Losses | Interleague WP% | Regressed WP% |
AL | 317 | 283 | 0.528 | 0.5255 |
NL | 283 | 317 | 0.472 | 0.4745 |
The “Regressed Winning Percentage” is simply the league’s interleague Winning Percentage regressed to the mean by a factor of .1, meaning that 90% of the league’s interleague WP% is assumed to be skill. Each league’s interleague winning percentage is regressed slightly to ensure that we aren’t overestimating the differences between the two leagues. Part of the reason we regress each league’s interleague winning percentage is because the interleague system is admittedly not perfect; while NL teams believe that the AL has an inherent advantage because of their everyday DH, AL teams complain about having pitchers who can’t bunt and a managerial style that is strategically difficult for their managers. While both sides have valid points, interleague games probably don’t hurt one side significantly more than the other, meaning that the vast amount of data that comes from interleague games is reliable as long as it is properly regressed.
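The regression step is simple enough to sketch in code. The function name and variable names below are mine; the records and the 0.1 regression factor come from the table above:

```python
def regress_wp(observed_wp, regression=0.1, mean=0.500):
    # Pull the observed winning percentage 10% of the way toward .500,
    # treating the remaining 90% as skill.
    return (1 - regression) * observed_wp + regression * mean

# 2013-2014 interleague records from the table above
al_wp = 317 / (317 + 283)   # observed: ~0.528
nl_wp = 283 / (283 + 317)   # observed: ~0.472

al_regressed = regress_wp(al_wp)   # ~0.5255
nl_regressed = regress_wp(nl_wp)   # ~0.4745
```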
Just knowing each league’s regressed interleague winning percentage, however, is not enough. We also need to know the percent of games each league plays within its own league. Why? The more games the league plays against the other league, the less playing in a superior league matters; the only reason we have to adjust for strength of league in the first place is because of the disparity in competition between the leagues. In a 162-game season, a team plays exactly 20 games against interleague opponents, meaning that 142 of 162 games, or 87.7% of a team’s schedule, is intra-league. Therefore, in order to find each league’s multiplier, the following equation is used:
League Multiplier = 2 * ((.877 * Regressed WP%) + ((1-.877) * Opponent Regressed WP%))
In this calculation, the “Opponent Regressed WP%” is simply the opposing league’s Regressed WP%. This is incorporated into the formula because each league plays 12.3% of its games (20 games) against the other league. Without further ado, here are the league multipliers:
League | Regressed WP% | Percent of Games Intra-league | Interleague Opponent Regressed WP% | League Multiplier
AL | 0.5255 | 0.877 | 0.4745 | 1.0384
NL | 0.4745 | 0.877 | 0.5255 | 0.9616
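These multipliers are easy to reproduce; here is a minimal sketch of the formula above (values taken straight from the table):

```python
def league_multiplier(own_wp, opp_wp, intra_frac=0.877):
    # Own-league strength weighted by the intra-league share of the schedule,
    # plus the other league's strength for the interleague share, doubled so
    # that an average (.500) context comes out to a multiplier of 1.0.
    return 2 * (intra_frac * own_wp + (1 - intra_frac) * opp_wp)

al_mult = league_multiplier(0.5255, 0.4745)   # ~1.038
nl_mult = league_multiplier(0.4745, 0.5255)   # ~0.962
```

Note that the two multipliers always sum to exactly 2, so whatever the AL gains, the NL gives back.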
As expected, the American League comes out as the stronger league, albeit by a smaller margin than its advantage in fWAR (remember, the AL’s league multiplier in fWAR was 1.056). Still, there are other adjustments that can be made besides adjusting for league. In the same way that the superiority of the American League is no secret, the fact that all divisions are not created equal is relatively obvious to most baseball fans. The AL East has long been considered the best division in baseball, and their inter-division record backs up that reputation; they have a .530 inter-division winning percentage over the last two seasons (only including games in their own league), best in the American League. Using the same process we used to calculate the league multipliers, division multipliers were calculated as shown below, with the data from the 2013-2014 seasons:
Division | W | L | Inter-division WP% | Regressed WP% | Percent of Non-Interleague Games Intra-division | Inter-division Opponent Regressed WP% | Division Multiplier |
AL East | 350 | 311 | 0.530 | 0.527 | 0.535 | 0.487 | 1.041 |
AL Central | 322 | 338 | 0.488 | 0.489 | 0.535 | 0.505 | 0.983 |
AL West | 319 | 342 | 0.483 | 0.484 | 0.535 | 0.508 | 0.976 |
NL East | 318 | 342 | 0.482 | 0.484 | 0.535 | 0.508 | 0.975 |
NL Central | 350 | 310 | 0.530 | 0.527 | 0.535 | 0.486 | 1.042 |
NL West | 322 | 338 | 0.488 | 0.489 | 0.535 | 0.505 | 0.983 |
One difference between this calculation and the league multiplier calculation was that, in this calculation, not all games were used when determining what percent of a division’s games were intra-division; because we already adjusted for league earlier, the 20 interleague games on each team’s schedule were excluded from the calculation. The .535 figure in column 6 is simply the number of games each team plays against its own division, 76, divided by the number of non-interleague games each team plays, 142. In addition, the “Inter-division Opponent Regressed WP%” is the average opponent each division faces while playing out of division in non-interleague games. The AL East, for example, plays the AL Central and AL West in its remaining intra-league games, so the .487 inter-division opponent regressed WP% is calculated by taking a simple average of the AL Central’s Regressed WP%, .489, and the AL West’s Regressed WP%, .484.
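As a sketch, the “Regressed WP%” column of the division table can be reproduced from the raw records with the same 0.1 regression factor used for the leagues (the dictionary layout is mine; the records are from the table above):

```python
def regress_wp(observed_wp, regression=0.1, mean=0.500):
    # Same regression as before: 90% of observed WP% is treated as skill.
    return (1 - regression) * observed_wp + regression * mean

# 2013-2014 inter-division records (non-interleague games only)
records = {
    "AL East":    (350, 311),
    "AL Central": (322, 338),
    "AL West":    (319, 342),
    "NL East":    (318, 342),
    "NL Central": (350, 310),
    "NL West":    (322, 338),
}

regressed = {
    div: round(regress_wp(w / (w + l)), 3) for div, (w, l) in records.items()
}
# regressed["AL East"] matches the table's 0.527
```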
Now that we have both divisional and league multipliers, we can derive each division’s total (observed) multiplier by simply multiplying the two:
Division | Division Multiplier | League Multiplier | Total Multiplier |
AL East | 1.0408 | 1.0384 | 1.081 |
AL Central | 0.9833 | 1.0384 | 1.021 |
AL West | 0.9760 | 1.0384 | 1.013 |
NL East | 0.9749 | 0.9616 | 0.937 |
NL Central | 1.0419 | 0.9616 | 1.002 |
NL West | 0.9833 | 0.9616 | 0.945 |
How do these multipliers, which were fairly easy to calculate, compare with the multipliers implied in FanGraphs’ WAR calculations? Below, the multipliers are compared in bar graph form:
As you can see, the current construction of fWAR artificially helps certain divisions while hurting others. Let’s get a closer look at the problem by graphing how much fWAR inflates each division’s pitchers and position players relative to the multipliers we just calculated:
Upon viewing the chart, a theme emerges: pitching WAR at FanGraphs is in need of serious repair. Pitching fWAR dramatically overvalues the American League. All three American League divisions have Pitching fWAR Multipliers at least 4.5% higher than they should be, while the Pitching fWAR Multipliers for the National League divisions are all at least 6% lower than they should be.
Is this just a random aberration for 2014? Probably not; in 2013, the American League’s Pitching fWAR Multiplier was 1.095, not much lower than 2014’s 1.127 (and nowhere near the 1.038 value we got). For whatever reason, Pitching fWAR overvalues American League pitchers and undervalues their National League counterparts. The strongest National League division, the NL Central, suffers the most from this calculation error, while the weaker American League divisions (the AL Central and AL West) experience the greatest benefit. Fans of the Reds and Brewers in particular should take solace in the fact that their teams were hurt the most by not only the errors discussed here but also the park factor miscalculation discussed in Part 1 (hint: fWAR seriously undervalues Cueto).
As the chart shows, position player fWAR overvalues the National League, albeit to a lesser extent. Position player fWAR suffers an almost entirely different problem than Pitcher fWAR: Unlike pitcher fWAR, which seems to over-adjust for league, position player fWAR doesn’t adjust for strength of league and division at all. This inflates the fWAR of players/teams in weaker divisions – the NL East and NL West, for example – while deflating the fWAR of players in stronger divisions, like the AL East.
While the issue with position player fWAR is more obvious – a lack of league and divisional factors – the problem with pitching fWAR is less clear. Perhaps part of the problem is how replacement level is calculated. I am not familiar enough with FanGraphs’ process of calculating WAR to know if there is a clear, fixable mistake. Either way, hopefully this article will inspire change in the way that fWAR is calculated for both pitchers and position players, with the changes to position player fWAR being much simpler to incorporate.
Founder of NothingButNumbers.com
Good stuff again Noah. We need to kick the tires on areas like the league adjustment in fWAR that might contain some systematic bias or material inaccuracy.
Another approach to reflect the league/strength of schedule impact on WAR components might look something like this:
– For each team, adjust the hitters’ wOBA by the weighted average of the wRC+ allowed by the pitching staffs that team had faced.
– Also for each team, adjust the pitchers’ FIP by the weighted average of the bbFIP of the hitters on the teams they had faced (as a % of the MLB average bbFIP)
– Use the schedule-adjusted Runs Above Average from the above steps to calculate Wins Above Average in the usual fashion. However, there is absolutely no reason to force the WAA to equal zero for each league individually.
– Determine the replacement level for WAR using the WAA for all players in both leagues in the usual fashion.
I’d be curious how this approach would compare to your approach here.
Your approach seems technically sound. To me, however, it seems much easier to just start off with AL Position Player WAR = NL Position Player WAR and AL Pitcher WAR = NL Pitcher WAR and then adjust for strength of league/division with multipliers that change each year.
Some nice food for thought. Hopefully, it makes some of the decision makers consider an update to WAR.
Even knowing the formula/structure of WAR has flaws, my biggest complaint is the repeated advice (from Dave Cameron among others) that WAR is not accurate to 0.1 value. If WAR shouldn’t be used to that precision, Fangraphs shouldn’t display it. It wouldn’t be difficult to change the displayed precision to 0.5 or 1.0 (whatever is accurate enough for general use).
Agreed. The reason WAR isn’t accurate to .1 value is because of structural problems; if WAR was being calculated correctly, it should be far more accurate than many writers here let on.
I don’t really agree. I don’t like “sig figs” as a way of trying to capture uncertainty also I think people are too vague about what they mean by “accuracy” for the term to have any meaning. I think ideally, we’d be very clear about exactly what uncertainty we’re talking about and use a +/- instead of limiting the number of decimal places.
I agree that showing a confidence interval (or error bars or whatever you want to call it) is a better solution than reducing the displayed precision. However, I highly doubt that Fangraphs would display this information. A ‘sig figs’ type conceptual approach has a much better chance of actually being publicly implemented.
I think the problem with Pitching fWAR might relate to how FanGraphs calculates replacement level. In Part 4 of their Pitching WAR explainer, they write, “The league average runs per game in the AL last year was 4.78,” implying that RA/9, not FIP scaled to RA/9, is the league average baseline.
Here’s the problem: the AL’s league average FIP in 2008, scaled to RA/9, was 4.71. The AL’s league average FIP scaled to RA/9 (4.71) is different than the league average RA/9 (4.78). This makes the AL’s replacement level RA/9 artificially higher than it should be if it was based on FIP while the NL’s replacement level RA/9 is artificially lower.
This isn’t just present in 2008. Take 2014:
NL: League Average FIP scaled to RA/9: 4.01; League Average RA/9: 3.96
AL: League Average FIP scaled to RA/9: 4.12; League Average RA/9: 4.20
This problem is then compounded when calculating replacement level. I think we found the solution to Pitcher fWAR: use League Average FIP scaled to RA/9 instead of League Average RA/9 when calculating league average/replacement level.
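To lay the discrepancy out concretely, here is a quick sketch using the 2014 figures above (the dictionary arrangement is mine; the numbers are from this comment):

```python
# 2014 league average baselines, from the figures above
baselines = {
    "NL": {"fip_as_ra9": 4.01, "ra9": 3.96},
    "AL": {"fip_as_ra9": 4.12, "ra9": 4.20},
}

# Distortion from anchoring the baseline on RA/9 instead of
# FIP-scaled-to-RA/9: a positive gap means the RA/9 baseline is
# higher than the FIP baseline, inflating that league's pitcher value.
gap = {lg: v["ra9"] - v["fip_as_ra9"] for lg, v in baselines.items()}
# gap["AL"] is about +0.08 runs per 9; gap["NL"] is about -0.05
```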
Great catch. Just goes to show the importance of using a consistent basis for the components of a metric so you don’t introduce a distortion (just like OPS had the structural issue of different denominators for the OBA and SLG components).
In addition to this, Pitching WAR (in my opinion) should also be calculated like Position Player WAR in the following way: Pitchers should be evaluated in runs above/below average. Then you add replacement runs in on a per plate appearance (or, for pitchers, total batter faced minus intentional walks) basis. Calculating separate replacement levels for the two leagues simply exacerbates the difference in RA/9 between the leagues.
Hey Noah,
I appreciate your work here. I actually played around with the FIP-park factors myself a couple of years ago so we are on the same page in that department 🙂
Some things I wanted to address. I think the league average RA/9 is ALWAYS the league average FIP scaled to RA/9 in any given year. I might be mistaken, but I think the scale is a quotient, not a difference. You take the quotient of all league earned runs and all league runs (ER/R), which should come out as something close to .92. This should be the scale IIRC, and therefore the FIP scaled to RA/9 and the RA/9 should always match.
For batting value there is already a league adjustment worked into the formula; wRAA uses the run environment of both leagues but compares the actual output to the respective league the player plays in. Additionally, pitcher batting is removed – thus both leagues’ batting runs should be a lot closer together than one would think when taking a first look at a league’s overall batting line.
Yes, the scale is a quotient. You divide the league’s ERA by .92 to get the average RA/9. The issue is that most years the league ERA and the league FIP are not equal.
With regards to the batting league adjustment, the adjustment is meant to set the leagues equal to each other. My point is that the leagues are not equal; players in the AL should be rewarded for putting up their production in a more difficult league.
Well, the .92 is not set in stone. It is whatever the league’s (earned runs allowed / runs allowed) ratio is in that given year.
And I am not talking about the “league adjustment” which I think you refer to. I am talking about “batting runs” which you can find in the value section on a player page, which is basically a player’s wRAA adjusted for park and league.
Actually I am writing a community piece at this moment that deals with a solution to the league adjustment.
The issue is not the .92 figure. It doesn’t really matter what exact figure they use each year, even though they do say in Part 4 of the WAR explainer that they use it. From the WAR explainer:
“The league average runs per game in the AL last year was 4.78. The FIPs that are displayed on the pitcher’s player card here at FanGraphs are scaled to ERA, but for the win values, we modified the formula slightly to scale it to match league RA. However, there’s a shortcut if you want to take a pitcher’s traditional FIP and have it match up with the league RA – that’s dividing his FIP by .92.
For instance, a 4.40 FIP divided by .92 will give you a 4.78 FIP. That .92 is the ERA-RA bridge, and allows us to conclude that 4.40 would be a league average FIP in the American League last year. So, a pitcher with a 4.40 FIP in a neutral park would be a league average pitcher. Or, put back into win% terms, a .500 pitcher.”
They start off by using the league average Runs per 9 innings, which is 4.78. They then multiply that by .92 to determine that the league average FIP in the AL in 2008 was 4.40. Here’s the problem: the league average AL FIP in 2008 wasn’t 4.40. It was 4.33. The difference in ERA and RA/9 between the leagues is greater than the FIP difference.
I mean they said explicitly in the explainer that a 4.40 FIP pitcher in the AL in 2008 was presumed to be a .500 pitcher by fWAR. That’s false, because the American League average FIP was 4.33, not 4.40.
So again, like the park factor problem, it’s a FIP-related problem. In the same way that RA/9 park factors are usually much more extreme than FIP park factors, the difference between the AL and NL is far more pronounced in ERA and RA/9 than it is in FIP.
It’s just another reason to use SIERA instead of FIP. Some guys like Kyle Lohse will always tend to outperform (or underperform) their FIP simply due to their batted ball profiles. In Lohse’s case, he doesn’t get very many strikeouts (even compared to his low walk rates), but he makes up for it by getting a lot of ground balls and weak contact.
I agree with you that SIERA is better, and should be used in WAR, but Lohse has actually been seeming to outpitch his SIERA by more than he’s outpitched his FIP in recent years.
On a related note: BBRef has been doing AL vs. NL adjustments from the onset. Here is the interesting part. Despite the NL actually beating the AL in interleague play from 1997-2004 to the tune of close to .510-.490, BBRef still gives the AL a .520-.480 or so advantage. I wonder why nobody has noticed this?
Tying it back in, since BBref and FG came to some common ground re: replacement level, this seems to be another area in which the two could line up pretty easily. At that point, the only major difference would be FIP vs. RA (adjusted for defense) for pitchers.
Well, the other major difference you didn’t mention is UZR vs DRS. UZR is definitely a better system when evaluating individual defensive players because it adjusts for the batted ball tendencies of the pitching staff. Still, when it comes to looking at team defense, I think I prefer DRS because teams often optimize their pitching staffs to play into the strengths of their defense.
Also, do you know where the BBRef League Adjustment is shown or how it is calculated?
Yeah, UZR vs. DRS is another big difference.
Here are the BBRef adjustments…down at the bottom. It seems like NLers are being hurt from 1997-2004 a bit.
http://www.baseball-reference.com/about/war_explained_position.shtml
I didn’t follow all the “multipliers” that you were doing, but you have to do it relative to replacement level.
If there was no league adjustment needed, we get to 1000 WAR by doing:
(.500 – .294) x 162 x 30 = 1000
In 2013-2014, the AL has a .528 record. Given past history, the Astros moving in, that’s just about right. But, that’s a .528 record against NL teams.
If we treat the .528 as a true talent against NL teams, then it would be .514 true talent against .500 teams. That’s because a .514 team facing a .486 team will have a .528 win%.
Therefore, AL would be:
(.514 – .294) x 162 x 15 = 535
And NL is 465.
Fangraphs numbers look about right.
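The back-of-envelope numbers above can be checked with a short sketch (.294 is the replacement-level win% implied by the 1000-WAR calculation; variable names are mine):

```python
REPL_WP = 0.294          # replacement-level winning percentage
GAMES, TEAMS = 162, 30

# With no league adjustment, the whole league pool is worth ~1000 WAR
total_war = (0.500 - REPL_WP) * GAMES * TEAMS

# Treating the AL as a .514 true-talent league against .500 opposition
al_war = (0.514 - REPL_WP) * GAMES * (TEAMS // 2)   # ~535
nl_war = (0.486 - REPL_WP) * GAMES * (TEAMS // 2)   # ~466 before rounding
```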
I have an article coming up here which I wrote some days ago that will be about EXACTLY this, tango.
Email me when it’s online. Looking forward to it!
If you look at the each league’s total fWAR, then yes, the numbers are actually about right. I had the overall league multiplier at 1.038, theirs was 1.056, which makes sense given that I regressed the interleague WP% a bit.
The main point of the article (and the main problem with fWAR) was that for pitchers the implied multiplier for the AL far exceeded 1.056 (while for position players the multiplier was about 1). If the AL is superior to the NL, both pitchers and hitters should benefit equally. What happens in fWAR is that AL pitchers benefit an exorbitant amount (the Pitching fWAR Multiplier was a shocking 1.13) while position players barely benefited at all (their Multiplier was 1.00).
Then you have the issue of divisional adjustments. Not all divisions are equal, and we know that moving into a weaker division means better production. Yet WAR makes no such adjustment.
“If the AL is superior to the NL, both pitchers and hitters should benefit equally.”
That’s not true. You have to figure out what the split is. Check out MGL’s three-parter from 10 years ago on a few ways to do that:
http://www.hardballtimes.com/author/mgl59/
I love your work but you can’t just say “that’s not true”. It’s by no means a settled issue.
If you’re calculating WAR separately in each league then what I said is in fact true. With the way FanGraphs calculates WAR, then yes, maybe the benefit is not equally felt among the leagues.
Either way, you’re not suggesting that FanGraphs’ assumption that the two leagues should have equal Position Player WAR is accurate, right? Isn’t it a pretty simple idea that the AL position players should benefit from playing in a superior league, not just the pitchers?
Also the link that you posted doesn’t go to whatever article/series you’re referring to, which I’d love to read.
I mean if we’re comparing the International League to MLB, we don’t say that pitchers benefit more than hitters from playing in the weaker International League, do we? Relative to the offensive environment of their league, the effect of playing in a weaker/stronger league is the same. I firmly believe that (and have yet to see a convincing argument prove otherwise).
I’m not suggesting Fangraphs is correct in how they do it. I’m simply making the obvious point that if in 2015 Kershaw, Lee, Hamels, Strasburg, Bumgarner, and Zimmerman get traded to the AL, then the shift in WAR will have been disproportionately by pitchers.
Given that we’re only talking about 35 WAR in shift in 2014 between AL and NL, it’s pretty easy to see that if you get say 3 star pitchers moving one way, and 2 star hitters moving the other way, you’ll get an imbalance.
International League: again, you can’t assume that. It could very well be that you have some minor league in some year that has a lot more pitching talent than hitting talent, and there’s a shift year to year.
Yes, if Kershaw, Bumgarner, Lee, Hamels, Zimmerman, and Strasburg all got traded to the AL, their unadjusted WAR would go down, because the league average FIP would be lower from the influx of talent.
But then the league adjustment would increase as the AL’s interleague record improves from the influx of talent. Overall I’d expect their WAR to stay the same regardless of league.
At the same time you can’t forget that the addition of these talented pitchers would also significantly hurt AL hitters as well. If you used different league multipliers for pitchers and position players (which I don’t recommend), you wouldn’t be doing the AL hitters, who now have to face tougher competition, proper justice.
“It could very well be that you have some minor league in some year that has a lot more pitching talent than hitting talent, and there’s a shift year to year.”
How do we measure talent in the first place? To me, it’s always meant performance above league average. Calculating league factors separately for pitchers and position players would be failing to acknowledge the increased difficulty that comes from hitters facing “more pitching talent”.
Which is why I don’t really think there ever is such a thing as a league having “more hitting talent” or “more pitching talent”. I mean, sure, you can have leagues where the standard deviation is higher among pitchers than hitters, meaning that there are more elite pitchers than there are elite hitters, but that has nothing to do with the inherent talent of the two types of baseball players as a whole.
For example, I wouldn’t argue that pitching talent has vastly improved since Babe Ruth’s day. But that doesn’t mean we change the way that WAR is calculated to account for this difference. Even if a 3 WAR pitcher in 2014 is vastly more talented than a 3 WAR pitcher in 1924, we don’t treat pitchers and hitters differently when calculating WAR. We compare each player to league average (and thus replacement level) and move on.
What I’m suggesting is an additional step after simply determining each player’s value over that league’s replacement level:
Adjust that player’s WAR to reflect the strength of the league and division he is playing in. That’s it. And whether the player is a position player or a pitcher is irrelevant when determining the strength of that player’s league.
“To me, it’s always meant performance above league average. … Which is why I don’t really think there ever is such a thing as a league having “more hitting talent” or “more pitching talent”. ”
Well, you are simply wrong. You have to accept that you are wrong, otherwise, we’re not going to have a discussion. If you are going to say you are right, and I say you are wrong, then let’s stop right here. Otherwise, continue reading…
…ah, good… thank god you admitted you are wrong. As a matter of CONVENIENCE, I set the WAR ratio of 4:3 for nonpitchers:pitchers every year. But that doesn’t mean I believe it’s static year to year. It can’t possibly be. Now, it can be close enough year to year that it’s not worth the trouble, so we make it a SYSTEM LIMITATION. That’s fine. But let’s just remember that we are trying to take shortcuts here to get to a desirable point.
“I set the WAR ratio of 4:3 for nonpitchers:pitchers every year.”
I didn’t know that this was how WAR was calculated (there certainly isn’t any mention of this in FanGraphs’ WAR explainer), and I think this might be a problem.
Why do we have to set a ratio? Wouldn’t it be easier to simply calculate each player’s WAR and not worry about the ratio? If we’re doing it correctly, the ratio should be about the same every year.
Just do the same thing for hitters and pitchers. Calculate runs above/below average, add in replacement level runs, calculate the runs/win ratio for that specific player, add in a league/division adjustment, and you’re good. I don’t see any need to set a ratio, but then again I guess I’m not as familiar with the WAR calculation process as I thought.
I mean it’s kind of unfair to criticize me for being wrong when nobody has ever mentioned there being a WAR ratio.
Noah, the 4:3 or 57/43 split isn’t really explained, but it is mentioned in the WAR pages. Play around with the league leaderboards and see for yourself that position players get 57% or 570 WAR and pitchers get 43% or 430 WAR.
The split is coming out of a seemingly tough to understand calculation (I say seemingly tough because I remember TONS of posts over at tango’s blog in which he explained himself to his readers who couldn’t really grasp the concept)
The long answer is to go over to his blog and read through tons of comments and posts about this.
The short answer is: The split is a result of different variances. Run scoring (offense) versus run prevention (pitching + defense). Baseball is 50% offense and 50% defense. Position players take care of all the offense and part of the defense. The pitcher only takes care of some amount of the defense aspect. This can get long and mathy so I will stop here.
One thing to note: The 4:3 or 57/43 split can be observed in real world MLB organization’s spending on position players and pitchers. So it is really beautiful.
“…add in replacement level runs…”
Well, that’s the key. You don’t add the same number per PA for pitchers and nonpitchers. It has to be preset somehow.
Hence, I preset it based on the idea that nonpitchers have a .380 win%, SP have a .380 win% and relievers have a .470 win%.
If you take the .528 at face value then you’re correct in saying that the AL would be a .514 true talent against .500 teams.
The problem with your calculation, however, is that you’re not factoring in the 12.3% of games the AL plays against the weaker National League. If the AL played 50% of its games against the NL then a league adjustment wouldn’t even be necessary because of the equality in competition. If you want to completely adjust for the competition that a player in the AL faces, it’s:
(.877*.514)+(.123*.486) = .5106. The NL is .4894.
Therefore the AL is, in terms of fWAR:
(.5106-.294) X 162 X 15 = 526
With the NL at 474. Almost exactly what the FanGraphs numbers are. Still, in my opinion the AL’s advantage is probably less than this because the Interleague numbers need to be regressed.
If you read the article, I never said that overvaluing the AL was a major problem with fWAR when calculating TOTAL WAR.
The predominant issue is that they are over-adjusting for league with Pitcher fWAR and not adjusting at all for Position Player fWAR. That’s the improvement that fWAR needs to make, and it starts with how they calculate replacement level FIP in each league.
In fact if you regressed Interleague Winning Percentage like I did, and assumed that the AL was a true talent .5255 against the NL, then the multipliers I got were exactly right by using your method.
.5255 (AL) facing .4745 would be true talent .51275 (we’ll round to .513). The NL is then .487.
Then you account for the fact that each league plays 12.3% of its games against the other league:
AL = (.877*.513) + (.123*.487) = .510
NL = .490
AL = (.510 – .294) X 162 X 15 = 524
NL = 476
And, the multipliers:
AL = 524/500 = 1.0479
NL = 476/500 = 0.9521
Not really a major difference from the 1.0384 and 0.9616 values I got before.
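The whole chain above can be written out as a short sketch (variable names are mine; the halving step is the linear approximation used above to convert a head-to-head record into true talent against .500 teams):

```python
REPL_WP = 0.294

al_interleague = 0.5255                       # regressed interleague WP%
al_true = 0.5 + (al_interleague - 0.5) / 2    # ~.513 vs .500 opposition
nl_true = 1 - al_true                         # ~.487

# Schedule-weight: 87.7% of games intra-league, 12.3% interleague
al_context = 0.877 * al_true + 0.123 * nl_true   # ~.510

al_war = (al_context - REPL_WP) * 162 * 15       # ~524
al_multiplier = al_war / 500                     # ~1.048
```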
Sometimes we have to remember that the point of what we’re trying to do here is place a wins value on a player while assuming he is in a context neutral environment.
Knowing that the AL is a true talent .514 against .500 teams isn’t going far enough when you’re trying to neutralize an American League player’s context. He didn’t play all of his games within his league. His “context”, so to speak, was playing against .510 opponents.
We could, for example, make up a league consisting solely of the Dodgers, Nationals, and Cardinals. Let’s call it “Noah’s League”. This league has a .600 Winning Percentage against “Tango’s League”, which we’ll call the remaining 27 teams, meaning that against .500 opponents Noah’s League is a roughly .550 true talent.
Do we have to make a league adjustment? No. Because Noah’s League is made up – and therefore doesn’t play an abnormally large percentage of its games against itself – there’s no need to make an adjustment for league.
With the AL and NL we do have to make an adjustment, because each league plays 87.7% of its games against itself. But it’s important that we don’t forget about the 12.3% of games each league plays against the other league. The more games the league plays outside of its own league, the weaker the league adjustment is. Does this make sense to you? I took a little while to explain it because I think the concept can be a bit difficult (I also didn’t spend as much time explaining this in the article as I should have).
I agree and that is how I handle it (furthermore, I handle it based on the number of games against each opponent, and so, takes care of the divisional issue as well).