The Cascading Bias of ERA

There are so many problems with ERA that it’s unbelievable. I’m not going to sit here and tell you what’s wrong with ERA, though, because you’re probably smart. But there’s one problem with ERA that transcends ERA. It trickles down through FIP, xFIP, SIERA, TIPS, and whatever your favorite stat happens to be, and it’s something I don’t see talked about much.

All of our advanced pitcher metrics are trying to predict or estimate ERA. They’re trying to figure out what a pitcher’s ERA should be, and herein lies the problem: they could be exactly right about ERA and still be a little off about the pitcher, thanks to one little assumption.

This assumption, that pitchers have no control over whether the fielders behind them make errors, seems easy to make. Like most assumptions, however, this one is subtly incorrect. Thankfully, the reason is pretty simple: ground balls are much harder to field cleanly than fly balls. And the difficulty gap is pretty huge.

How big? Well, in 2013 there were 58,388 ground balls, 1,344 of which resulted in errors. On the other hand, a mere 98 out of 39,328 fly balls resulted in errors. That means 2.3% of ground balls resulted in errors while a tiny 0.25% of fly balls did. It’s time to stop pretending that this gap doesn’t exist, because it does.
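To make the arithmetic behind those percentages explicit, here is a minimal Python sketch that uses only the 2013 counts quoted above. The constant and function names are illustrative choices, not taken from any data source.

```python
# 2013 league totals as quoted in the article.
GROUND_BALLS = 58_388
GROUND_BALL_ERRORS = 1_344
FLY_BALLS = 39_328
FLY_BALL_ERRORS = 98


def error_rate(errors: int, batted_balls: int) -> float:
    """Fraction of batted balls of one type that ended in an error."""
    return errors / batted_balls


gb_rate = error_rate(GROUND_BALL_ERRORS, GROUND_BALLS)
fb_rate = error_rate(FLY_BALL_ERRORS, FLY_BALLS)

print(f"Ground-ball error rate: {gb_rate:.2%}")
print(f"Fly-ball error rate:    {fb_rate:.2%}")
print(f"Ground balls are about {gb_rate / fb_rate:.1f}x as error-prone")
```

On these counts a ground ball is a little over nine times as likely as a fly ball to end in an error.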

So now that we know this, what does it mean? Well, it means this: ground-ball pitchers will have an ERA that suggests they are better than their actual value, while fly-ball pitchers will have an ERA that suggests they are worse. Pitchers who allow a lot of contact are also worse off than their ERA suggests, because every time they allow contact they put pressure on their defense. They’re giving themselves a chance to stockpile unearned runs, which nobody will count against them if they’re only looking at ERA derivatives. When it comes to winning baseball games, however, earned runs don’t matter. Runs matter.

I am going to call this the “pressure on the defense” effect, which causes some pitchers to be more prone to unearned runs than others. How big is this effect? Well, not huge. The gap between the best pitcher and worst pitcher in the league is roughly three runs over the course of the season. But keep in mind that three runs is about a third of a win, and a third of a win is worth about $2 million. We’re not discussing mere minutiae here.
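The runs-to-wins-to-dollars conversion can be sketched as follows. Both constants are assumptions: 10 runs per win is a common rule of thumb, and $6 million per win is simply the figure implied by "a third of a win is worth about $2 million." Neither is stated outright in the article.

```python
RUNS_PER_WIN = 10.0          # assumption: common rule of thumb
DOLLARS_PER_WIN = 6_000_000  # assumption: implied by "a third of a win is about $2 million"


def runs_to_dollars(runs: float,
                    runs_per_win: float = RUNS_PER_WIN,
                    dollars_per_win: float = DOLLARS_PER_WIN) -> float:
    """Convert a run difference into an approximate dollar value."""
    wins = runs / runs_per_win
    return wins * dollars_per_win


# The roughly three-run gap between the best and worst pitchers.
gap_value = runs_to_dollars(3.0)
print(f"Three runs is worth roughly ${gap_value:,.0f}")
```

With these inputs the gap comes to about $1.8 million, in line with the "about $2 million" figure above. Changing either constant scales the result proportionally.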

To better quantify this effect I have developed the xUR/180 metric, which estimates how many unearned runs should have scored behind each pitcher with an average defense, per 180 innings. Below is a table of all qualified starting pitchers from 2013 ranked according to this metric. I have also included how many unearned runs they actually allowed in 2013, scaled to 180 innings for comparative purposes.

# Name xUR/180 UR/180
1 Joe Saunders 7.24 9.84
2 Jeff Locke 7.11 4.33
3 Wily Peralta 6.97 17.7
4 Edwin Jackson 6.88 13.36
5 Edinson Volquez 6.81 6.35
6 Kyle Kendrick 6.77 8.9
7 Justin Masterson 6.66 0.93
8 Doug Fister 6.58 5.19
9 Wade Miley 6.57 7.12
10 Rick Porcello 6.51 2.03
11 Jerome Williams 6.47 7.45
12 Jorge de la Rosa 6.43 5.38
13 Yovani Gallardo 6.42 7.99
14 A.J. Burnett 6.35 8.48
15 Scott Feldman 6.32 8.94
16 Mike Leake 6.26 5.62
17 Andrew Cashner 6.25 8.23
18 Felix Doubront 6.22 6.66
19 Jhoulys Chacin 6.13 5.48
20 Kevin Correia 6.13 2.92
21 Jeremy Guthrie 6.13 3.41
22 Mark Buehrle 6.11 5.31
23 Andy Pettitte 6.05 7.78
24 Hyun-Jin Ryu 6.01 2.81
25 Jeff Samardzija 6.0 5.07
26 C.J. Wilson 5.93 11.03
27 CC Sabathia 5.9 8.53
28 Jon Lester 5.84 4.22
29 Ryan Dempster 5.8 10.52
30 Tim Lincecum 5.77 5.48
31 Hiroki Kuroda 5.72 4.48
32 Bud Norris 5.72 7.15
33 Jordan Zimmermann 5.69 3.38
34 Patrick Corbin 5.68 1.73
35 Dillon Gee 5.67 3.62
36 Ervin Santana 5.67 7.68
37 Kris Medlen 5.66 8.22
38 Bronson Arroyo 5.63 2.67
39 Stephen Strasburg 5.62 9.84
40 Mat Latos 5.62 6.85
41 Ubaldo Jimenez 5.61 7.9
42 Jarrod Parker 5.61 4.57
43 John Lackey 5.6 5.71
44 Gio Gonzalez 5.55 5.53
45 Lance Lynn 5.55 2.68
46 Eric Stults 5.5 7.09
47 Felix Hernandez 5.49 4.41
48 Zack Greinke 5.48 2.03
49 Hisashi Iwakuma 5.47 3.28
50 Jose Quintana 5.46 4.5
51 Ian Kennedy 5.46 8.95
52 Ricky Nolasco 5.45 7.23
53 R.A. Dickey 5.44 6.42
54 Jeremy Hellickson 5.4 3.1
55 Homer Bailey 5.38 3.44
56 Miguel Gonzalez 5.36 9.47
57 Madison Bumgarner 5.34 5.37
58 James Shields 5.32 1.58
59 Adam Wainwright 5.32 2.99
60 Bartolo Colon 5.32 3.79
61 Derek Holland 5.3 7.61
62 Kyle Lohse 5.26 3.63
63 Cole Hamels 5.18 4.91
64 Anibal Sanchez 5.18 3.96
65 David Price 5.18 8.7
66 Chris Sale 5.14 6.73
67 Justin Verlander 5.06 8.25
68 Chris Tillman 5.04 1.75
69 Jose Fernandez 5.03 5.23
70 Shelby Miller 4.98 6.24
71 Matt Cain 4.97 2.93
72 Clayton Kershaw 4.9 5.34
73 Julio Teheran 4.9 2.92
74 Matt Harvey 4.86 1.01
75 Cliff Lee 4.79 4.86
76 Travis Wood 4.78 3.6
77 Dan Haren 4.78 4.26
78 Yu Darvish 4.53 1.72
79 A.J. Griffin 4.46 5.4
80 Mike Minor 4.46 5.29
81 Max Scherzer 4.15 3.36
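The UR/180 column in the table above is actual unearned runs rescaled to a 180-inning workload, which the first function below reproduces. The article does not spell out how xUR/180 is computed, so the second function is not the author's formula. It is a hypothetical first step that estimates expected errors behind a pitcher from nothing but the league error rates quoted earlier. All function names and sample inputs are made up for illustration.

```python
# League error rates from the 2013 counts quoted in the article.
LEAGUE_GB_ERROR_RATE = 1_344 / 58_388
LEAGUE_FB_ERROR_RATE = 98 / 39_328


def ur_per_180(unearned_runs: float, innings: float) -> float:
    """Scale unearned runs to a 180-inning workload.

    `innings` must be true innings (183 1/3 innings is 183.333...),
    not box-score notation such as 183.1.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return unearned_runs * 180 / innings


def expected_errors(ground_balls: int, fly_balls: int) -> float:
    """Hypothetical sketch: errors an average defense would commit.

    This is NOT the article's xUR formula. It ignores line drives and
    the differing run cost of each error type, both of which the
    author says the real metric accounts for.
    """
    return (ground_balls * LEAGUE_GB_ERROR_RATE
            + fly_balls * LEAGUE_FB_ERROR_RATE)


# Made-up pitcher: 5 unearned runs in 200 innings.
print(ur_per_180(5, 200))

# Made-up batted-ball profiles with the same number of balls in play.
print(expected_errors(ground_balls=300, fly_balls=200))
print(expected_errors(ground_balls=200, fly_balls=300))
```

Holding total balls in play fixed, shifting 100 of them from fly balls to ground balls adds roughly two expected errors in this sketch, which is the direction of the effect the table shows.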

Some notes:

  • Ground balls are still good; they’re just not as good as ERA-based metrics suggest.
  • A combination of ground balls and contact leads to more unearned runs. The pitchers at the top of the board demonstrate this.
  • A combination of strikeouts and fly balls will tend to limit the impact of unearned runs, as demonstrated by the bottom of the board.
  • Errors that occur on fly balls tend to be more costly than errors on ground balls. This metric accounts for that gap, but the low likelihood of fly-ball errors makes the effect relatively negligible.
  • Line drives are similar to fly balls in terms of error rate, but line-drive errors tend to be less costly than fly-ball errors.

I’m sure there is more to be gleaned, but the point is this: we need to stop trying to predict ERA, because ERA is not a pure value stat. We should be trying to figure out how many runs a pitcher should give up, or should have given up, because that’s what matters. Runs matter, and who cares if they’re unearned? They’re kind of the pitcher’s fault, anyways.





Brandon Reppert is a computer "scientist" who finds talking about himself in the third-person peculiar.

31 Comments
Baltar
10 years ago

An excellent idea well-supported and well-written.

Spencer
10 years ago

Very good article: concise, and a good attempt at putting a value on something that should be measured. Could we potentially see a new SIERA-type metric that predicts RA? I would guess that the FB/GB values would only need to be edited slightly to account for the difference, along with the numbers being scaled up to reflect RA instead of ER.

Bryce
10 years ago
Reply to  Spencer

The Foils
10 years ago

You know an article’s good when you’re frustrated that it’s ending. 🙂

jss
10 years ago

Can you tell us how you get xUR? I’m not sure what to make of these numbers. For example, Masterson should have given up about 5.5 more unearned runs than he did. Why? Did those runs disappear, or were they earned runs?

olethros
10 years ago
Reply to  jss

The metric assumes a neutral defense. So deviation +/- the xUR means the defense behind that pitcher performed above or below average.

jss
10 years ago
Reply to  olethros

“means the defense behind that pitcher performed above or below average.”

That’s better than average for errors only, right? Could be high BABIP, low errors, no?

jss
10 years ago

Can guys have lower or higher ER than they should because the defense did not get to balls that ‘should’ have been errors? Maybe look at types of balls in play, number of errors made on each type, number of expected errors, number of expected men getting to each base, on balls in play by type? Or something like that.

Christopher Carruthers
10 years ago

So really you’re saying we should test against RA9? This is already known, but the common theme is to scale to ERA, so testing against ERA is simple and easy. The differences are fairly small in testing. It’s not going to change which estimator wins in a sample if you use RA9 or ERA. If you think every stat should already be scaled to RA9, then I agree with you, but for the sake of conformity and familiarity, ERA scale is used.

Dan Farnsworth
10 years ago

I’ve been screaming this at my computer for a year or two. I’m glad you put into print what I couldn’t. Great job making a huge point with simple ideas!

Ralph
10 years ago

I think you are underestimating the value of ground balls. Ground balls are much more likely to result in a double play than a fly ball. It seems pitchers should at least get partial credit for inducing double plays.

Additionally, I suspect that errors on fly balls to the outfield have more severe consequences than ground ball errors.

Jon L.
10 years ago

This was a great idea and a great article. I look forward to seeing this tool used to assess how particular pitchers are misvalued. Already it’s helping to show how a pitcher can win the Cy Young Award with an infield of mostly DHs.

MustBunique
10 years ago
Reply to  Jon L.

Unless I am mistaken in thinking that you are talking about Scherzer, according to Brandon’s numbers Scherzer allowed fewer unearned runs (UR/180 3.36) than would be expected with an average defense (xUR/180 4.15). The numbers do not support your claim that it was unearned runs by DH-like infielders which earned Scherzer the Cy Young Award.

Good work Brandon, I liked the article. Thanks for keeping it concise, it really made your point that much more powerful.

randhyllcho
10 years ago

Run value of an error: 0.24
Run value of a HR: 1.39
HR/FB%: ~9.5%
Percent of ground balls that turn into HR: 0.000001%
I’ll take grounders all day…

randhyllcho
10 years ago

What’s the std dev between xUR/180 and UR/180? I’m wondering if that could be used to “jiggle” cleaned data +/- to get a range of ERA that is more realistic to what pitchers do.

http://www.insidethebook.com/ee/index.php/site/comments/run_values_of_events/

Not sure about the difference, he’s also got an error listed @0.47 runs on the same page. I missed that the first time…

studstats_13
10 years ago

Yes totally agree

Charlie
10 years ago

Damn you, Brandon. Stop making me rethink my current baseball philosophies. Well done.

One question: Isn’t it unfair to bundle FIP in with the rest of the advanced metrics? FIP blatantly ignores any batted ball in play. In my observation, FIP is often used in a context it was not originally intended for. Metrics like xFIP and SIERA take batted-ball types into account, which is where the assumption you are pointing out actually applies.

Asa
10 years ago

Interesting article. I like a lot of where it goes with the data, but unfortunately the entire premise is based on faulty assumptions. If ERA is a bad stat (which it is), trying to improve it using errors (an even worse stat) makes little sense. How many misplayed fly balls are officially recorded as singles, doubles, or triples? That just affects the counting numbers. What about the weight of fly-ball mistakes? An error in the outfield can be costlier than an infield error in terms of advancing bases and runners scoring. The gist of my long-winded point is that I would love to see the same(ish) numbers run using other fielding metrics.

cass
10 years ago

Totally agree.

I’d actually like to get rid of errors entirely. As was pointed out during this year’s AL MVP debate, some players (fast ones like Mike Trout) reach base on error far more often than other players (slow ones like Miguel Cabrera). But OBP gives no credit for reaching on an error. It should. The stat should simply be times on base divided by plate appearances. Shouldn’t be hard. I actually had never realized reached on errors weren’t included.