Thoughts on the MVP Award: Team-Based Value and Voter Bias

November 30, 2013

You are reading this right now. That is a fact. Since you are reading this right now, many things can be reasonably inferred:

1. You probably read FanGraphs at least fairly often

2. Since you probably read FanGraphs at least fairly often, you probably know that there are a lot of differing opinions on the MVP award and that many articles here in the past week have been devoted to it.

3. You probably are quite familiar with sabermetrics

4. You probably are either a Tigers fan or think that Mike Trout should have won MVP, or both

5. You might know that Josh Donaldson got one first-place vote

6. You might even know that the first-place vote he got was by a voter from Oakland

7. You might know that Yadier Molina got two first-place votes, and they both came from voters from St. Louis

8. You might even know that one of the voters who put Molina first on his ballot put Matt Carpenter second

9. You might be wondering if there is any truth to the idea that Miguel Cabrera is much more important to his team than Mike Trout is

I have thought about many of those things myself. So, in this very long 2-part article, I am going to discuss them. Ready? Here goes:

Part 1: How much of an impact does a player have on his team?

Lots of people wanted Miguel Cabrera to win the MVP award. Some of you reading this may be shocked, but it’s actually true. One of the biggest arguments for Miguel Cabrera over Mike Trout for MVP is that Cabrera was much more important and “valuable” than Trout. Cabrera’s team made the playoffs. Trout’s team did not. Therefore anything Trout did cannot have been important. Well, let’s say too important. I don’t think that anybody’s claiming that Trout had zero impact on the game of baseball or the MLB standings whatsoever.

OK. That’s reasonable. There’s nothing flawed about that thinking when it’s not a rationale for voting Cabrera ahead of Trout for MVP. As just a general idea, it makes sense: Cabrera had a bigger impact on baseball this year than Trout did. I, along with many other people in the sabermetric community, disagree that that’s a reason to vote for Cabrera, though. But the question I’m going to ask is this: did Cabrera have a bigger impact on his own team than Trout did?

WAR tells us no. Trout had 10.4 WAR, tops in MLB. Cabrera had 7.6 – a fantastic number, good for 5th in baseball and 3rd in the AL, as well as his own career high – but clearly not as high as Trout’s. Miggy’s hitting was out of this world, at least until September, and it’s pretty clear that he could easily have topped 8 WAR had he stayed healthy through the final month and been just as productive as he was April through August. But, fact is, he did get hurt, and did not finish with a WAR as high as Trout’s. So if they were both replaced with a replacement player, the Angels would suffer more than the Tigers. Cabrera was certainly valuable – if he were replaced by a replacement player, the 7 or 8 wins the Tigers would lose would probably be enough to cost them the AL Central. But take Trout out, and the Angels go from a mediocre-to-poor team to a really bad one. The Angels had 78 wins this year, and that would have been around 68 (if we trust WAR) without Trout. That would have been the 6th worst total in the league. So, by WAR, Trout meant more to his team than Cabrera did.

But WAR is not the be-all and end-all of statistics (though we may like to think it is sometimes). Let’s look at this from another angle. Here’s a theory for you: the loss of a key player on a good team would probably not hurt that team as much because the team is already good to begin with. If a not-so-good team loses a key player, though, the other players on the team aren’t as good, so they can’t carry the team very well.

How do we test this theory? Well, we have at our disposal a fairly accurate and useful tool to determine how many wins a team should get. That tool is pythagorean expectation – a way of predicting wins and losses based on runs scored and allowed. So let’s see if replacing Trout with an average player (I am using average and not replacement because all the player run values given on FanGraphs are above or below average, not replacement) is more detrimental to the Angels than replacing Cabrera with an average player is to the Tigers.

The Angels, this year, scored 733 runs and allowed 737. Using the Pythagenpat (sorry to link to BP but I had to) formula, I calculated their expected win percentage, and it came out to .497 – roughly 80.6 wins and 81.4 losses*. That’s actually significantly better than they did this year, which is good news for Angels fans. But that’s not the focus right here.
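For reference, the Pythagenpat calculation can be written out as below. This is only a sketch: the constant 0.287 is an assumption (the text does not say which value was used; 0.287 is a commonly cited one), and G = 162 games is assumed as well. With those choices, the Angels’ figures above come out the same:

```latex
% Pythagenpat: the exponent x depends on the run environment
x = \left(\frac{RS + RA}{G}\right)^{0.287}, \qquad
W\% = \frac{RS^{x}}{RS^{x} + RA^{x}}

% Angels, 2013: RS = 733, RA = 737, G = 162 (0.287 is an assumed constant)
x = \left(\frac{733 + 737}{162}\right)^{0.287} \approx 1.88, \qquad
W\% = \frac{733^{1.88}}{733^{1.88} + 737^{1.88}} \approx .4974, \qquad
162 \times .4974 \approx 80.6 \text{ wins}
```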

Trout, this year, added 61.1 runs above average at the plate and 8.1 on the bases for a total of 69.2 runs of offense. He also saved 4.4 runs in the field (per UZR). So, using the Pythagenpat formula again with adjusted run values for if Trout were replaced by an average hitter and defender (663.8 runs scored and 741.4 runs allowed), I again calculated the Angels’ expected win percentage. This came out to be .449 – roughly 72.7 wins and 89.3 losses. 7.9 fewer wins than the original one. That’s the difference, for that specific Angels team, that Trout made. Now, keep in mind, this is above average, not replacement, so it will be lower than WAR by a couple wins (about two WAR signifies an average player, so wins above average will be about two less than wins above replacement). 7.9 wins is a lot. But is it more than Cabrera?

Let’s see. This year, the Tigers scored 796 runs and allowed 624. This gives them a pythagorean expectation (again, Pythagenpat formula) of a win percentage of .612 – roughly 99.1 wins and 62.9 losses. Again much better than what they did this year, but also not the focus of this article. Cabrera contributed 72.1 runs above average hitting and 4.4 runs below average on the bases for a total of 67.7 runs above average on offense. His defense was a terrible 16.8 runs below average.

Now take Cabrera out of the equation. With those adjusted run totals (728.3 runs scored and 607.2 runs allowed) we get a win percentage of .583 – 94.4 wins and 67.6 losses. A difference of 4.7 wins from the original.
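The two calculations above can be sketched in a few lines of Python. The function names are made up for illustration, and the Pythagenpat constant of 0.287 is an assumption (the article does not state it), though it does reproduce the win totals quoted here to the nearest tenth:

```python
GAMES = 162
PYTHAGENPAT_CONSTANT = 0.287  # assumed; the article does not state the value used


def pythagenpat_win_pct(runs_scored, runs_allowed, games=GAMES):
    """Expected winning percentage from runs scored and allowed."""
    exponent = ((runs_scored + runs_allowed) / games) ** PYTHAGENPAT_CONSTANT
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def wins_above_average(team_rs, team_ra, offense_raa, defense_raa, games=GAMES):
    """Expected wins lost when a player is swapped for an average one.

    offense_raa: batting plus baserunning runs above average.
    defense_raa: fielding runs saved above average (negative if below average).
    """
    wins_with = pythagenpat_win_pct(team_rs, team_ra, games) * games
    # An average replacement scores offense_raa fewer runs and
    # allows defense_raa more runs than the player did.
    wins_without = pythagenpat_win_pct(
        team_rs - offense_raa, team_ra + defense_raa, games
    ) * games
    return wins_with - wins_without


# Run values quoted in the article (2013 season).
trout = wins_above_average(733, 737, offense_raa=61.1 + 8.1, defense_raa=4.4)
cabrera = wins_above_average(796, 624, offense_raa=72.1 - 4.4, defense_raa=-16.8)

print(f"Trout:   {trout:.1f} wins above average")
print(f"Cabrera: {cabrera:.1f} wins above average")
```

Note the sign convention: a below-average fielder like Cabrera has a negative `defense_raa`, so removing him lowers the team’s runs allowed.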

Talk about anticlimactic. Trout completely blew Cabrera out of the water (I would say no pun intended, but that was intended). This makes sense if we think about it – a team with more runs scored will be hurt less by x fewer runs because they are losing a lower percentage of their runs. In fact, if we pretend the Angels scored 900 runs this year instead of 733, they go from a 96.5-win team with Trout to an 89.8-win team without. Obviously, they are better in both cases, but the difference Trout makes is only 6.7 wins – pretty far from the nearly 8 he makes in real life.
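The 900-run thought experiment is easy to check with a short sketch. As before, the 0.287 constant and the function names are assumptions for illustration, not something the article specifies:

```python
def pythagenpat_wins(runs_scored, runs_allowed, games=162, constant=0.287):
    """Expected wins over a season; the 0.287 constant is an assumption."""
    exponent = ((runs_scored + runs_allowed) / games) ** constant
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed) * games


def trout_wins_added(team_runs_scored, team_runs_allowed=737):
    """Wins lost if Trout's 69.2 offensive and 4.4 fielding runs are removed."""
    with_trout = pythagenpat_wins(team_runs_scored, team_runs_allowed)
    without_trout = pythagenpat_wins(team_runs_scored - 69.2, team_runs_allowed + 4.4)
    return with_trout - without_trout


# 733 is the Angels' actual total; 900 is the hypothetical from the text.
for runs in (733, 900):
    print(f"{runs} runs scored: Trout worth {trout_wins_added(runs):.1f} wins")
```

The same player with the same run values is worth fewer wins on the higher-scoring team, which is the effect described above.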

The thing about this statistic is that it penalizes players on good teams. Generally, statistics such as the “Win” for pitchers are frowned upon because they measure things that the pitcher can’t control – just like this one. But if we want to measure how much a team really needs a player, which is pretty much the definition of value, I think this does a pretty good job. Obviously, it isn’t perfect: the numbers that go into it, especially the baserunning and fielding ones, aren’t always completely accurate, and when looking at the team level, straight linear weights aren’t always the way to go; overall, though, this stat gives a fairly accurate picture. The numbers aren’t totally wrong.

Here’s a look at the top four vote-getters from each league by team-adjusted wins above average (I’ll call it tWAA):

Player	tWAA
Mike Trout	7.9
Andrew McCutchen	6.4
Paul Goldschmidt	6.2
Chris Davis	6.1
Josh Donaldson	4.9
Miguel Cabrera	4.7
Matt Carpenter	4.0
Yadier Molina	3.1

This is interesting. As expected, the players on better teams have a lower tWAA than the ones on worse teams, just as we discussed earlier. One notable player is Yadier Molina, who, despite being considered one of the best catchers in the game, if not the best, has the lowest tWAA of anyone on that list. This may be because he missed some time. But let’s look at it a little closer: if we add the 2 wins that an average player would provide over a replacement-level player, we get 5.1 WAR, which isn’t so far off of his 5.6 total from this year. And the Cardinals’ pythagorean expectation was 101 wins, so obviously under this system he won’t be credited as much because his runs aren’t as valuable to his team. Another factor is that we’re not adjusting by position here (except for the fielding part), and Molina is worth more runs offensively above the average catcher than he is above the average hitter, since catchers generally aren’t as good at hitting.

But if Molina were replaced with an average catcher, I’m fairly certain that the Cardinals would lose more than the 3 games that this number suggests. They might miss Molina’s game-calling skills – if such a thing exists – and there’s no way to quantify how much Molina has helped the Cardinals’ pitchers improve, especially since they have so many rookies. But there’s also something else, something we can quantify, even if not perfectly. And that’s pitch framing. Let’s add the 19.8 runs that Molina saved through framing (measured by Statcorner) to Molina’s defensive runs saved. (For those, by the way, I used the Fielding Bible’s DRS, since there is no UZR for catchers. That may be another reason Molina’s number seems out of place, because DRS and UZR don’t always agree: Trout’s 2013 UZR was 4.4, and his DRS was -9. Molina did play 18 innings at first base, where he had a UZR of -0.2. We’ll ignore that, though, since it is such a small sample size and won’t make much of a difference.)

Here is the table with only Molina’s tWAA changed, to account for pitch framing:

Player	tWAA
Mike Trout	7.9
Andrew McCutchen	6.4
Paul Goldschmidt	6.2
Chris Davis	6.1
Yadier Molina	5.4
Josh Donaldson	4.9
Miguel Cabrera	4.7
Matt Carpenter	4.0

Now we see Molina move up into 5th place out of 8 with a much better tWAA of 5.4 – more than 2 wins better than without the pitch framing, and about 7.4 WAR if we want to convert from wins above average to wins above replacement. Interesting. I don’t want to get into a whole argument now about whether pitch framing is accurate or actually based mostly on skill instead of luck, or whether it should be included in a catcher’s defensive numbers when we talk about their total defense. I’m just putting that data out there for you to think about.

But as I mentioned before, I used DRS for Molina and not UZR. What if we try to make this list more consistent and use DRS for everyone? (We can’t use UZR for everyone.) Let’s see:

Player	tWAA	DRS	UZR
Mike Trout	6.5	-9	4.4
Andrew McCutchen	6.4	7	6.9
Paul Goldschmidt	7.0	13	5.4
Chris Davis	5.5	-7	-1.2
Molina w/ Framing	5.4	31.8	N/A
Josh Donaldson	5.0	11	9.9
Miguel Cabrera	4.6	-18	-16.8
Matt Carpenter	4.1	0	-0.9
Yadier Molina	3.1	12	N/A

We see Trout go down by almost a win and a half here. I don’t really trust that, though, because I really don’t think that Mike Trout is a significantly below average fielder, despite what DRS tells me. DRS actually gave Trout a rating of 21 in 2012, a 30-run swing from this year’s -9, so I don’t think it’s as trustworthy. But for the sake of consistency, I’m showing you those numbers too, with the DRS and UZR comparison so you can see why certain people lost/gained wins.

OK. So I think we have a pretty good sense for who was most valuable to their teams. But I also think we can improve this statistic a little bit more. Like I said earlier, the hitting number I use – wRAA – is based on league average, not on position average. In other words, if Chris Davis is 56.3 runs better than the average hitter, but we replace him with the average first baseman, that average first baseman is already going to be a few runs better than the average player. So what if we use weighted runs above position average? wRAA is calculated by subtracting the league-average wOBA from a player’s wOBA, dividing by the wOBA scale, and multiplying by plate appearances. What I did was subtract the position-average wOBA from the player’s wOBA instead. That penalizes players at positions where the position-average wOBA is high.
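As a rough sketch of the adjustment just described, the only thing that changes is the baseline wOBA that gets subtracted. Every input below is hypothetical, picked only to show the mechanics; none of it is real 2013 data:

```python
def wraa(woba, baseline_woba, woba_scale, plate_appearances):
    """Weighted runs above a baseline hitter (league or position average)."""
    return (woba - baseline_woba) / woba_scale * plate_appearances


# All inputs below are hypothetical, chosen only to show the mechanics.
player_woba = 0.400
league_woba = 0.320
first_base_woba = 0.335  # hypothetical position average, above league average
woba_scale = 1.25
plate_appearances = 600

league_wraa = wraa(player_woba, league_woba, woba_scale, plate_appearances)
position_wraa = wraa(player_woba, first_base_woba, woba_scale, plate_appearances)

print(f"wRAA vs. league average:   {league_wraa:.1f}")
print(f"wRAA vs. position average: {position_wraa:.1f}")
```

Because the hypothetical first-base average sits above the league average, the position-adjusted figure is lower, which is the direction of the Davis and Goldschmidt adjustments.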

Here’s your data (for the defensive numbers I used UZR because I think it is better than DRS, even though that means the metric isn’t the same for everyone, since Molina’s number has to come from DRS):

Player	position-adj. tWAA	Pos-adj. wRAA	wRAA
Trout	7.7	59.4	61.1
McCutchen	6.2	40.1	41.7
Molina w/ Framing	5.6	23.3	20.5
Goldschmidt	5.0	39.5	50.1
Davis	5.0	46.4	56.3
Donaldson	4.9	36.6	36.7
Cabrera	4.7	72.0	72.1
Carpenter**	4.3	41.7	37.8
Molina	3.4	23.3	20.5

I included here both the regular and position-adjusted wRAA for all players for reference. Chris Davis and Paul Goldschmidt suffered pretty heavily – each lost over a win of production – because the average first baseman is a much better hitter than the average player. Molina got a little better, as did Carpenter, because they play positions where the average player isn’t as good offensively. Everyone else stayed almost the same, though.

I think this position-adjusted tWAA is probably the most accurate. And I would also use the number with pitch framing included for Molina. It’s up to you to decide which one you like best – if you like any of them at all. Maybe you have a better idea, in which case you should let me know in the comments.

Part 2: Determining voter bias in the MVP award

As I mentioned in my introduction, Josh Donaldson got one first-place MVP vote – from an Oakland writer. Yadier Molina got 2 – both from St. Louis writers. Matt Carpenter got 1 second-place vote – also from a St. Louis writer. Obviously, voters have their bias when it comes to voting for MVP. But how much does that actually matter?

The way MVP voting works is that for each league, AL and NL, two sportswriters who are members of the BBWAA are chosen from each location that has a team in that league – 15 locations per league times 2 voters per location equals 30 voters total for each league. That way no single place is over- or under-represented among the voters, whichever way its writers may be biased.

But is there really voter bias?

In order to answer this question, I took all players who received MVP votes this year (of which there were 49) and measured how many points each of them got per 2 voters***. Then I took the number of points that each of them got from the 2 voters in their own city’s chapter and found the difference. Here’s what I found:

AL:

Player, Club	City	Points	Points/2 voter	Points From City voters	% Homer votes	Homer difference
Josh Donaldson, Athletics	OAK	222	14.80	22	9.91%	7.20
Mike Trout, Angels	LA	282	18.80	23	8.16%	4.20
Evan Longoria, Rays	TB	103	6.87	11	10.68%	4.13
David Ortiz, Red Sox	BOS	47	3.13	7	14.89%	3.87
Adam Jones, Orioles	BAL	9	0.60	3	33.33%	2.40
Miguel Cabrera, Tigers	DET	385	25.67	28	7.27%	2.33
Coco Crisp, Athletics	OAK	3	0.20	2	66.67%	1.80
Edwin Encarnacion, Blue Jays	TOR	7	0.47	2	28.57%	1.53
Max Scherzer, Tigers	DET	25	1.67	3	12.00%	1.33
Salvador Perez, Royals	KC	1	0.07	1	100.00%	0.93
Koji Uehara, Red Sox	BOS	2	0.13	1	50.00%	0.87
Chris Davis, Orioles	BAL	232	15.47	16	6.90%	0.53
Adrian Beltre, Rangers	TEX	99	6.60	7	7.07%	0.40
Yu Darvish, Rangers	TEX	1	0.07	0	0.00%	-0.07
Felix Hernandez, Mariners	SEA	1	0.07	0	0.00%	-0.07
Shane Victorino, Red Sox	BOS	1	0.07	0	0.00%	-0.07
Jason Kipnis, Indians	CLE	31	2.07	2	6.45%	-0.07
Torii Hunter, Tigers	DET	2	0.13	0	0.00%	-0.13
Hisashi Iwakuma, Mariners	SEA	2	0.13	0	0.00%	-0.13
Greg Holland, Royals	KC	3	0.20	0	0.00%	-0.20
Carlos Santana, Indians	CLE	3	0.20	0	0.00%	-0.20
Jacoby Ellsbury, Red Sox	BOS	3	0.20	0	0.00%	-0.20
Dustin Pedroia, Red Sox	BOS	99	6.60	5	5.05%	-1.60
Manny Machado, Orioles	BAL	57	3.80	2	3.51%	-1.80
Robinson Cano, Yankees	NY	150	10.00	8	5.33%	-2.00

NL:

Player, Club	City	Points	Points/2 voter	Points From City voters	% Homer votes	Homer difference
Yadier Molina, Cardinals	STL	219	14.60	28	12.79%	13.40
Hanley Ramirez, Dodgers	LA	58	3.87	7	12.07%	3.13
Joey Votto, Reds	CIN	149	9.93	13	8.72%	3.07
Allen Craig, Cardinals	STL	4	0.27	3	75.00%	2.73
Jayson Werth, Nationals	WAS	20	1.33	4	20.00%	2.67
Hunter Pence, Giants	SF	7	0.47	3	42.86%	2.53
Yasiel Puig, Dodgers	LA	10	0.67	3	30.00%	2.33
Matt Carpenter, Cardinals	STL	194	12.93	15	7.73%	2.07
Andrelton Simmons, Braves	ATL	14	0.93	2	14.29%	1.07
Paul Goldschmidt, D-backs	ARI	242	16.13	17	7.02%	0.87
Michael Cuddyer, Rockies	COL	3	0.20	1	33.33%	0.80
Andrew McCutchen, Pirates	PIT	409	27.27	28	6.85%	0.73
Clayton Kershaw, Dodgers	LA	146	9.73	10	6.85%	0.27
Craig Kimbrel, Braves	ATL	27	1.80	2	7.41%	0.20
Russell Martin, Pirates	PIT	1	0.07	0	0.00%	-0.07
Matt Holliday, Cardinals	STL	2	0.13	0	0.00%	-0.13
Buster Posey, Giants	SF	3	0.20	0	0.00%	-0.20
Adam Wainwright, Cardinals	STL	3	0.20	0	0.00%	-0.20
Adrian Gonzalez, Dodgers	LA	4	0.27	0	0.00%	-0.27
Troy Tulowitzki, Rockies	COL	5	0.33	0	0.00%	-0.33
Shin Soo Choo, Reds	CIN	23	1.53	1	4.35%	-0.53
Jay Bruce, Reds	CIN	30	2.00	1	3.33%	-1.00
Carlos Gomez, Brewers	MIL	43	2.87	1	2.33%	-1.87
Freddie Freeman, Braves	ATL	154	10.27	8	5.19%	-2.27

Where points is total points received, points/2 voter is points per two voters (points/15), points from city voters is points received from the voters in the player’s city, % homer votes is the percentage of a player’s points that came from voters in his city, and homer difference is the difference between points/2 voter and points from city voters. Charts are sorted by homer difference.
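The column definitions above can be sketched as a small Python function. The function and variable names are made up for illustration; the inputs are rows taken from the tables:

```python
VOTERS_PER_LEAGUE = 30  # two writers from each of 15 cities


def homer_stats(total_points, city_points, voters=VOTERS_PER_LEAGUE):
    """Return (points per 2 voters, % of points from home city, homer difference)."""
    points_per_two_voters = total_points / (voters / 2)
    homer_pct = 100 * city_points / total_points
    homer_difference = city_points - points_per_two_voters
    return points_per_two_voters, homer_pct, homer_difference


# (total points, points from home-city voters), taken from the tables above.
players = {
    "Josh Donaldson": (222, 22),
    "Yadier Molina": (219, 28),
    "Robinson Cano": (150, 8),
}

for name, (points, city_points) in players.items():
    per_two, pct, diff = homer_stats(points, city_points)
    print(f"{name}: {per_two:.2f} pts/2 voters, {pct:.2f}% homer, {diff:+.2f} difference")
```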

I don’t know that there’s all that much we can draw from this. Obviously, voters are more likely to vote for players from their own city, but that’s to be expected. Voting was a little bit less biased in the AL – the average player there received exactly 1 point more from the voters in his city than he did per 2 voters league-wide, whereas that number in the NL was 1.21. 8.08% of all points in the AL came from homers compared to 8.31% in the NL. If you’re wondering which cities were the most biased, here’s a look:

AL:

City	Points	Points/2 voter	Points From City voters	Difference
OAK	225	15.00	24	9.00
LA	282	18.80	23	4.20
TB	103	6.87	11	4.13
DET	412	27.47	31	3.53
BOS	152	10.13	13	2.87
TOR	7	0.47	2	1.53
BAL	298	19.87	21	1.13
KC	4	0.27	1	0.73
TEX	100	6.67	7	0.33
SEA	3	0.20	0	-0.20
CLE	34	2.27	2	-0.27
NY	150	10.00	8	-2.00

NL:

City	Points	Points/2 voter	Points From City voters	Difference
STL	422	28.13	46	17.87
LA	218	14.53	20	5.47
WAS	20	1.33	4	2.67
SF	10	0.67	3	2.33
CIN	202	13.47	15	1.53
ARI	242	16.13	17	0.87
PIT	410	27.33	28	0.67
COL	8	0.53	1	0.47
ATL	195	13.00	12	-1.00
MIL	43	2.87	1	-1.87

Where all these numbers are just the sum of the individual numbers for all players in that city.
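A minimal sketch of that per-city roll-up, using the St. Louis and Oakland rows from the player tables (the variable names are only illustrative):

```python
from collections import defaultdict

# (player, city, total points, points from home-city voters), from the tables above.
rows = [
    ("Yadier Molina", "STL", 219, 28),
    ("Allen Craig", "STL", 4, 3),
    ("Matt Carpenter", "STL", 194, 15),
    ("Matt Holliday", "STL", 2, 0),
    ("Adam Wainwright", "STL", 3, 0),
    ("Josh Donaldson", "OAK", 222, 22),
    ("Coco Crisp", "OAK", 3, 2),
]

city_totals = defaultdict(lambda: [0, 0])  # city -> [total points, city points]
for _player, city, points, city_points in rows:
    city_totals[city][0] += points
    city_totals[city][1] += city_points

city_difference = {}
for city, (points, city_points) in city_totals.items():
    # 30 voters per league, so points per two voters is points / 15.
    city_difference[city] = city_points - points / 15
    print(f"{city}: {points} points, {city_points} from city voters, "
          f"difference {city_difference[city]:.2f}")
```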

If you’re wondering what players have benefited the most from homers in the past 2 years, check out this article by Reuben Fischer-Baum over at Deadspin’s Regressing that I found while looking up more info. He basically used the same method I did, only for 2012 as well (the first year that individual voting data was publicized).

So that’s all for this article. Hope you enjoyed.

———————————————————————————————————————————————————–

*I’m using fractions of wins because that gives us a more accurate number for the statistic I introduce by measuring it to the tenth and not to the single digit. Obviously a team can’t win .6 games in real life but we aren’t concerned with how many games the team won in real life, only their runs scored and allowed.

**Carpenter spent time both at second base and third base, so I used the equation (Innings played at 3B*average wOBA for 3rd basemen + Innings played at 2B*average wOBA for 2nd basemen)/(Innings played at 3B + Innings played at 2B) to get Carpenter’s “custom” position-average wOBA. He did play some other positions too, but very few innings at each of them so I didn’t include those. It came out to about .307.
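The weighting in that equation can be sketched as follows. The innings and position averages here are hypothetical placeholders, since the actual inputs are not listed; only the structure of the calculation comes from the footnote:

```python
def weighted_position_woba(innings_by_position, woba_by_position):
    """Innings-weighted average of position-average wOBA values."""
    total_innings = sum(innings_by_position.values())
    weighted_sum = sum(
        innings * woba_by_position[position]
        for position, innings in innings_by_position.items()
    )
    return weighted_sum / total_innings


# Hypothetical inputs: the article does not list the innings or position averages it used.
innings = {"2B": 1000.0, "3B": 500.0}
position_woba = {"2B": 0.300, "3B": 0.330}

custom_woba = weighted_position_woba(innings, position_woba)
print(f"Custom position-average wOBA: {custom_woba:.3f}")
```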

***Voting works like this: each voter puts 10 players on their ballot, with the points going 14-9-8-7-6-5-4-3-2-1.

2 Comments


triple_r

11 years ago

Three things:
1. You should try numerical footnotes instead of asterisks. They’re much less disruptive of the flow of the piece.
2. To what were you referring with the “no pun intended” line?
3. Y’know, “great job” and all that shit.

Jonah Pemstein

Reply to triple_r

1. Thanks, I don’t profess to be a writing expert but I’ll try to change that once I figure out how the hell I can edit this if that’s possible
2. A trout is a fish… “blow out of the water”… maybe that was pretty lame
3. Thanks again
