Introducing Two New Pitching Metrics: exOUT% and exRP27

exOUT%

In the early 21st century, Oakland Athletics General Manager Billy Beane revolutionized baseball. He was the first general manager in baseball to heavily utilize sabermetrics in his baseball operations. This isn’t a history lesson, though. I bring him up because of his idea that outs are precious and that a hitter’s goal is to not make an out, which is why he prioritized OBP so heavily. In the years since, baseball statistics have seen phenomenal progress for both hitters and pitchers. While I believe FIP and xFIP are both very useful statistics for measuring a pitcher’s skill, my problem is that they essentially ignore all the batted ball data that we have (GB%, FB%, LD%). SIERA and tERA have solved some of these problems but are far from perfect, and I believe the more statistics we have, the better.

As I mentioned with Beane, we largely focus on a hitter’s ability to not make an out, but we still don’t have a catch-all statistic for how effective pitchers are at getting batters out. If the batter’s goal is to not make an out, the pitcher’s goal is to get the batter out. So I present to you expected out percentage, or exOUT% (the name is certainly a work in progress). exOUT% sets out to answer a simple question: for any plate appearance, what is the likelihood that the pitcher will get the batter out? You could estimate this by just looking at a pitcher’s opponent OBP, but that is rather primitive. We can get a better estimate by focusing on the pitcher’s own skills (striking batters out, not walking them, and the type of contact he gives up) and by negating the effect of the defense behind him with league averages on batted balls. So to calculate a pitcher’s exOUT%, I used K%, BB%, GB%, LD%, FB%, IFFB%, and the 2014 league batting averages on ground balls, line drives, and fly balls. (HBPs are essentially ignored but can certainly be incorporated in a future version; this is pretty much exOUT% v1.0.)

Full disclosure: I am not a statistician or close to it. Math and statistics are an area of interest and I am currently pursuing a degree in math-economics, but I am far from a professional, so I recognize there are going to be errors in my data. This is an extremely rough version; there’s even a combination of data from this year and last year, so there will be inconsistencies, as I don’t have the resources to gather all the data I need. If after reading this you would like to take it further, please feel free to contact me if you have the skills necessary to advance it (or even if you don’t).

I will first post a simple step-by-step breakdown of how to calculate exOUT%, and then get into more detail and take you through it with Clayton Kershaw, because well, he is awesome.

1- Add K% and BB% and subtract the sum from 100%. This leaves you with the percentage of plate appearances that end with a ball in play; let’s just call it BIP%.

2- Multiply the pitcher’s GB% (as a decimal, so 40% is .4) by his BIP% (left on the 0 to 100 scale). This gives you a GB% for all PAs, not just balls in play; we’ll call this overall GB%, or oGB%. Now multiply oGB% by the league average share of ground balls that don’t go for hits. The league batting average on ground balls in 2014 was .239, so the out rate on ground balls is 76.1%, or .761. The result stays on the 0 to 100 scale: if the number is 20%, there’s a 20% chance the pitcher will induce a ground ball out in that PA, assuming league average defense (which we can assume because we’re using the league batting average on ground balls). We’ll call this exgbOUT%.

3- Follow the same steps with LD% to get exldOUT%, the percentage chance for any given PA that the pitcher will produce a line drive out. (The league average on line drives last season was .685 (!), so there is a 31.5% chance a line drive will result in an out.)

4- Same thing with FB%, sort of, because we also want to incorporate IFFB%. Multiply the pitcher’s FB% by his IFFB% (as a decimal); this gives you the percentage of balls in play that are infield fly balls (bipIFFB%). Multiply this by his BIP% (as a decimal) to get the overall percentage of PAs that result in an infield fly (oIFFB%). This is also his exiffbOUT%, because any infield fly ball should be converted to an out, and if it isn’t, that’s no fault of the pitcher, so we won’t punish him. Next, subtract the pitcher’s IFFB% from 100% and multiply his FB% by the result (as a decimal); this is the percentage of balls in play that are normal fly balls to the outfield (bipFB%). Multiply this by his BIP% to get his overall normal FB% for all PAs, not just balls in play (onFB%). Multiply that by .793 (the league average on fly balls in 2014 was .207, so there’s a 79.3% chance that a fly ball will result in an out). The result, exnfbOUT% (n for normal), is the percentage chance that for any given PA the pitcher will produce a fly ball out to the outfield. Add exnfbOUT% and exiffbOUT% and you have exfbOUT%, the percentage chance that for any given PA the pitcher will produce a fly ball out, to the infield or outfield.

5- Add K% + exgbOUT% + exldOUT% + exfbOUT%

6- You have your exOUT%
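The six steps above can be sketched in a few lines of Python. This is a minimal sketch, not the spreadsheet behind the numbers in this post: the function name `ex_out_pct` and its parameter names are made up for illustration, and the default out rates are simply 1 minus the 2014 league batting averages quoted in the steps (.239, .685, .207).

```python
def ex_out_pct(k, bb, gb, ld, fb, iffb,
               gb_out=0.761, ld_out=0.315, fb_out=0.793):
    """Return exOUT% on a 0-100 scale.

    All inputs are percentages on a 0-100 scale. gb, ld and fb are shares
    of balls in play; iffb is the share of fly balls that are infield flies.
    The *_out defaults are assumed league out rates (1 - league AVG).
    """
    bip = 100.0 - (k + bb)                              # step 1: BIP%
    ex_gb_out = bip * gb / 100.0 * gb_out               # step 2: exgbOUT%
    ex_ld_out = bip * ld / 100.0 * ld_out               # step 3: exldOUT%
    o_iffb = bip * (fb / 100.0) * (iffb / 100.0)        # infield flies, all treated as outs
    o_nfb = bip * (fb / 100.0) * (1.0 - iffb / 100.0)   # normal (outfield) fly balls
    ex_fb_out = o_iffb + o_nfb * fb_out                 # step 4: exfbOUT%
    return k + ex_gb_out + ex_ld_out + ex_fb_out        # step 5: exOUT%


# Clayton Kershaw's line as of 8/21: K%, BB%, GB%, LD%, FB%, IFFB%
print(round(ex_out_pct(32.3, 4.9, 52.8, 21.2, 26, 11.8), 2))  # 75.07
```

Keeping the out rates as keyword arguments makes it easy to swap in another season's league averages without touching the steps themselves.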

 

I kept the terms from getting too technical or scientific so I don’t confuse anyone; I tried to simplify a very complicated procedure as much as possible. To clarify with an example, let’s go through Clayton Kershaw.

Kershaw profiles like this (I compiled this data on 8/21): 32.3 K%, 4.9 BB%, 52.8 GB%, 26 FB%, 11.8 IFFB%, 21.2 LD%.

So let’s look at the plate appearances that don’t end with a ball in play: strikeouts and walks. Add the two and the not-in-play percentage is 37.2. The 4.9% that are walks won’t be outs, and the 32.3% that are strikeouts will be. That leaves a BIP% of 62.8. Thus far, Kershaw’s exOUT% is 32.3 (of a possible 37.2).

Now let’s look at the balls in play. People will usually say that a pitcher can’t control what happens when a ball is in play, but I vehemently disagree: the type of contact the pitcher gives up can’t be ignored and largely affects what will happen to the ball in play. I will quote a FanGraphs article here to explain it: “Generally speaking, line drives go for hits most often, ground balls go for hits more often than fly balls, and fly balls are more productive than ground balls when they do go for hits (i.e. extra base hits). Additionally, infield fly balls are essentially strikeouts and almost never result in hits or runner advancement.” FanGraphs also gives us this data from 2014.

GB: AVG- .239, ISO- .020, wOBA- .220

LD: AVG- .685, ISO- .190, wOBA- .684

FB: AVG- .207, ISO- .378, wOBA- .335

 

So this means that fly ball pitchers are most likely to get outs, although they may be less effective because when they don’t get outs, it’s more trouble than for ground ball pitchers. But remember, this statistic is just finding the chance that the pitcher will get a hitter out.
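To make that tradeoff concrete, here is a small sketch that turns the FanGraphs table into out rates and lists them next to wOBA. It assumes every ball in play that is not a hit is an out, which ignores errors and sacrifices; the variable names are mine.

```python
# 2014 league numbers by batted ball type, as quoted from FanGraphs above
batted_ball = {
    "GB": {"avg": 0.239, "iso": 0.020, "woba": 0.220},
    "LD": {"avg": 0.685, "iso": 0.190, "woba": 0.684},
    "FB": {"avg": 0.207, "iso": 0.378, "woba": 0.335},
}

# Assumption: every non-hit is an out (errors and sacrifices are ignored)
out_rate = {kind: 1.0 - row["avg"] for kind, row in batted_ball.items()}

# Most out-friendly contact type first
ranking = sorted(out_rate, key=out_rate.get, reverse=True)
for kind in ranking:
    woba = batted_ball[kind]["woba"]
    print(f"{kind}: out rate {out_rate[kind]:.3f}, wOBA {woba:.3f}")
```

Fly balls come out on top for out rate, ground balls second, and line drives a distant third, while the wOBA column shows why a fly ball that does drop is costlier than a ground ball that gets through.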

 

All right, so, let’s calculate Kershaw’s exgbOUT%, exldOUT%, and exfbOUT%; you can follow the numbers along with the steps I listed above.

 

GB%- 52.8

62.8 x .528 = 33.1584

(33.1584 x .761) = 25.2335424 exgbOUT%

 

LD%- 21.2

62.8 x .212 = 13.3136

(13.3136 x .315) = 4.193784 exldOUT

 

FB%- 26

26 x .118= 3.068 bipIFFB%

26 x .882= 22.932 (bipFB%)

62.8 x .22932= 14.401296 (onFB%)

14.401296 x .793 = 11.4202277 exnfbOUT%

62.8 x .03068 = 1.926704 oIFFB% and exiffbOUT%

exnfbOUT% + exiffbOUT% = 13.3469317 exfbOUT%. This is the number the Excel doc chugged out, so I’m trusting it; if you follow along on a phone calculator that can’t carry all the decimals, your last digits may differ slightly.

Now add them all up

32.3 + 25.2335424 + 4.193784 + 1.926704 + 11.4202277 = 75.07%

K% + exgbOUT% + exldOUT% + exiffbOUT% + exnfbOUT% = exOUT%

The league average exOUT%, using 2014 league average statistics for the inputs involved, is 69.8%. Scherzer leads the majors (well, the 89 pitchers whose data I was able to export from FanGraphs) with a 76.43 exOUT%. If you want to look at it as a more concise and better version of opponent OBP, his is .236, so, you know, good. Here is a picture of the data for the top 37; the J column is what you are looking at. Betances is in there because I wanted to calculate one reliever.

[Image: exOUT% leaderboard for the top 37 pitchers, hosted on imgur.com]

All right, I’ve explained it a bit in the prologue, but now that you’ve seen it, let me explain more why I like this stat. Well first, I created it and calculated it, so, well, yeah… but I also like this stat because it answers a very simple question: “How good is a pitcher at getting people out?” Pitching, in its simplest form, is exactly that, getting people out. The stat recognizes that there are basically only these outcomes for a plate appearance: strikeout, walk, ground ball, line drive, and fly ball. It then looks at the pitcher’s stats in these categories to determine how many people he should be getting out. The stat is more predictive than evaluative in nature: you can calculate a pitcher’s actual out percentage, but that doesn’t nearly tell the whole story, because a lot of luck is involved with balls in play and other fluky outcomes.

This operates under the basis that a ground ball will perform the way the average ground ball does, a line drive performs the way an average line drive does, and a fly ball behaves the way a typical fly ball does. There could be guys getting very fortunate with ground balls: having a great infield behind them, balls not squeaking through the holes; with line drives: being hit right at people; and fly balls: staying in the park, having outfielders who cover a lot of ground. And there could be guys who are getting unlucky: the ground balls are getting through the holes, the infielders don’t have range; line drives seem like they are always going for hits, and fly balls are falling in. This says that a pitcher can’t control that, but they can control how much they strike out people, how much they walk people, and how often they give up ground balls, line drives, and fly balls, and if these balls in play behaved the way they should, the pitcher should be getting this percentage of people out.

I will address the flaws I have found with it. As important as getting people out is, what happens in the plate appearances that don’t end in outs is almost as important. exOUT% only deals in batting average on balls in play, but wOBA is very important too. Fly balls are more likely to be outs than ground balls, but the wOBA on fly balls is over 100 points higher. Additionally, instead of ground balls, line drives, and fly balls, I’d prefer to use soft, medium, and hard contact, because that is a truer test of pitcher skill. However, I did not have league batting averages on soft, medium, and hard contact at my disposal (if someone does, please contact me like I said). So what I have for now will do, and this batted ball data is still a good measure. I set out to calculate what percentage of batters a pitcher should be getting out, and that is exactly what I found. So while it’s not perfect, it has its use, and it’s something to build on.

 

exRP27

And build on I did. While the out percentage is nice, it doesn’t give us a measure like ERA or FIP or xFIP, that tells us how many runs a pitcher should be giving up. So using the data I used to calculate exOUT%, I present to you exRP27 (expected runs per 27 outs, a stupid name for a hopefully not stupid stat).

The basis for this stat is this data from FanGraphs: “Line drives are death to pitchers, while ground balls are the best for a pitcher. In numerical terms, line drives produce 1.26 runs/out, fly balls produce 0.13 R/O, and ground balls produce only 0.05 R/O.” (I don’t know how this was calculated, or what period it is accurate for, but this is what I got.) We don’t know this for soft, medium, and hard contact, so again I’m sticking with ground balls, line drives, and fly balls.

All right, so what I am going to do using this stat and the pitcher’s K%, BB%, GB%, LD%, and FB% is see how many runs the pitcher should be allowing over 27 outs, and then adjust it to get it on a scale similar to ERA, FIP, and xFIP.

Keeping Clayton Kershaw as our example, let’s take a look.

Kershaw’s K% is 32.3. Multiply this (as a decimal) by 27 (for outs in a game) and we get 8.721 K’s, and 0 runs so far, because a K will never produce a run.

Now GB%. His exgbOUT% is 25.2335424; multiply this by 27 and we get 6.8 (ish; the final number will be exact via the Excel doc). Multiply this by .05 (the runs per GB out he gets) and we get .34 runs.

LD%- his exldOUT% is 4.193784, multiply by 27 and get 1.13232168, and multiply this by 1.26 for LD runs/out and we get 1.43 runs

His exfbOUT% is 13.3469317; multiply by 27 to get 3.6, then multiply that by .13 and you get .47 runs.

Add up all these exRUNS and Kershaw’s total is 2.24. However, we can’t stop here, because the number of outs he’s recorded is only about 20.3 (8.7+6.8+1.1+3.6, summed before rounding). So get this 20.3 (or whatever the pitcher’s exOUTS is) up to 27 by multiplying by whatever it takes (27 divided by exOUTS), and then multiply his exRUNS by this same number. For Kershaw you end up with 2.97 exRP27. The league average would be 3.78. Last year’s average ERA/FIP/xFIP was 3.74, but when I adjust everything to that, everyone’s exRP27 just goes down slightly (Kershaw’s from 2.97 to 2.94). I want it to be on a more realistic scale, where everyone’s totals are lower and a really good exRP27 is comparable to a really good FIP, like in the low 2s.
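Here is a rough sketch of that raw exRP27 calculation, before any rescaling. The function name and the `RUNS_PER_OUT` table are my own labels for illustration; the run values are the FanGraphs figures quoted above, and the inputs are the exOUT% components from the Kershaw walkthrough.

```python
# Runs per out by batted ball type (FanGraphs figures quoted above)
RUNS_PER_OUT = {"gb": 0.05, "ld": 1.26, "fb": 0.13}


def raw_ex_rp27(k, ex_gb_out, ex_ld_out, ex_fb_out):
    """Return (exRP27, exRUNS, exOUTS) before any rescaling.

    Inputs are the exOUT% components on a 0-100 scale.
    """
    # Expected outs of each type over 27 plate appearances
    outs = {
        "k": 27 * k / 100.0,
        "gb": 27 * ex_gb_out / 100.0,
        "ld": 27 * ex_ld_out / 100.0,
        "fb": 27 * ex_fb_out / 100.0,
    }
    ex_outs = sum(outs.values())
    # Strikeouts are charged zero runs, so only batted ball outs count here
    ex_runs = sum(outs[kind] * RUNS_PER_OUT[kind] for kind in RUNS_PER_OUT)
    # Stretch the expected outs to a full 27 and scale the runs to match
    return ex_runs * 27 / ex_outs, ex_runs, ex_outs


ex_rp27, ex_runs, ex_outs = raw_ex_rp27(32.3, 25.2335424, 4.193784, 13.3469317)
print(round(ex_runs, 2), round(ex_outs, 1), round(ex_rp27, 2))
```

With these inputs the sketch gives about 2.24 exRUNS and 20.3 exOUTS, and an exRP27 just under 3. That is in the neighborhood of the 2.97 above; the last digit can move depending on how the inputs were rounded in the spreadsheet.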

So I don’t know what the statistically correct way is, but here is what I did to make it work. I calculated what his “ERA” would be by multiplying his exRUNS by 9 and then dividing that by his exOUTS. His was .99; the league average was 1.26. I then did .99/1.26 to get .78 or so, multiplied that by his exRP27, and got 2.34. I felt like this was more realistic and in line with his ERA/FIP/xFIP. Obviously it can’t be the same, because they measure different things, but this gets it in the area. The same is done for all pitchers, though of course not everyone gets multiplied by .78. The league average remains 3.78, between last season’s and this season’s average for ERA/FIP/xFIP.
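The rescaling step can be sketched like this. The function name and the `league_pseudo_era` parameter are hypothetical labels; the 1.26 default is the league figure mentioned above.

```python
def adjusted_ex_rp27(ex_rp27, ex_runs, ex_outs, league_pseudo_era=1.26):
    """Rescale raw exRP27 by the pitcher's pseudo-ERA relative to league."""
    pseudo_era = ex_runs * 9 / ex_outs        # the "ERA" described above
    ratio = pseudo_era / league_pseudo_era    # below 1 means better than league
    return ex_rp27 * ratio


# Kershaw's rounded figures from the text: 2.97 exRP27, 2.24 exRUNS, 20.3 exOUTS
kershaw_adjusted = adjusted_ex_rp27(2.97, 2.24, 20.3)
print(round(kershaw_adjusted, 2))  # 2.34
```

One thing the sketch makes visible: exRUNS x 9 / exOUTS is just raw exRP27 divided by 3, and 1.26 is 3.78 divided by 3. So the adjustment amounts to multiplying a pitcher's exRP27 by exRP27/3.78, which is why a league average pitcher stays at 3.78 while better pitchers get pulled lower.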

Here is the leaderboard for that (S column):

[Image: exRP27 leaderboard, hosted on imgur.com]

 I really like this stat a lot, and feel like it does what I wanted to accomplish: figure out how many runs a pitcher should allow per 27 outs given his K%, BB%, GB%, LD%, FB%, and the notion that balls in play will behave the way they normally do, as anything else is likely luck and not indicative of the pitcher’s performance.

I look at Sonny Gray as someone this stat is perfect for. His ERA is outstanding at 2.04, but his FIP is 3.00, his xFIP is 3.47, and his SIERA is 3.50. The problem, at least with FIP and xFIP, is that they ignore what happens when the ball is in play. He doesn’t strike out too many people, he has a good but not spectacular BB%, and he’s given up 10 home runs, a fair amount, so this hurts his FIP and whatnot. However, instead of saying “well, he will regress, look at his FIP/xFIP/SIERA,” this looks at why he’s having this success, and it has to do with the balls in play, which are getting ignored. Gray’s LD% is just 14.6! That is really good! Second best of the 90 pitchers I did this for. And his GB% is 54%, 9th best, also really good. The pitcher does have control over the type of contact he allows, and the fact that Gray is producing a ton of ground balls and very few line drives is why he’s been so successful. His 2.34 exRP27 suggests that he has not been as good as his 2.04 ERA suggests, but he’s not as far off as the other stats suggest.

Obviously exRP27 is far from perfect, and is in no way supposed to replace FIP/xFIP/SIERA, but it is something to look at with them. I am a big believer in aggregation, so I think that averaging some combination of these four stats, or all of them, is an even better way to evaluate a pitcher. We’ve got more data than ever, so it makes sense to use it; exRP27 and exOUT% are just more examples of utilizing this data to help better evaluate pitchers.

I hope you guys enjoyed this. If you have any feedback, please comment or contact me. Next I will be looking at exWOBA against for pitchers using similar data, and exWOBA for batters using the same data from the hitter’s side.





Follow @TomScibelliWFUV, email tscibelli@fordham.edu

8 Comments
Darren
8 years ago

Nice work. However, how is this different from tRA or tERA, which was on StatCorner a few years ago? Also, how are you calculating your run value for each batted ball event? Is this the delta before and after during the 2015 season, or for multiple years? Is it park adjusted?

tz
8 years ago
Reply to  Darren

It looks a lot like tERA as I understood it, although I’m glad to have a transparent version of it actively being looked at, since Fangraphs has taken it off their standard set of reported stats (I’m a huge fan of this concept).

One small technical note – BB% should really be (BB% – IBB% + HBP%), since pitchers do have influence on HBP but don’t really have it for IBB.

Good stuff. I like the core premise of tRA, and your work here is useful to keep it going. Look forward to your follow-up!

endlessgonzales
8 years ago

Awesome, might not be perfect but this is getting us closer to properly evaluating effective contact pitchers. Chris Young looks better under these stats, I would imagine.

Chuck
8 years ago

When will these stats be available from FanGraphs? A side question: do you prefer xFIP over SIERA?

bluejays49
8 years ago

I can appreciate the effort. You did a good job explaining your process.

I think you’re missing some of the edge events. Sacrifices aren’t at-bats so aren’t included in the batting average on contact type numbers that you’re using. But they are outs. Also, reaching on error counts against batting average but does not result in an out. Out% isn’t technically the inverse of OBP and OBP isn’t the product of BA + BB%.

Also, I think the purpose should be better defined. If you include line drive rates, you’ll move away from measuring “how good a pitcher is at getting people out,” since line drive rates aren’t really talent-based. The stat right now is answering “given basic batted ball and defense-independent outcomes, what should a pitcher’s out% have been in a past sample?”

Ben
8 years ago

A possible suggestion for the exOUT%: constants to adjust for defense. Let’s say Kershaw pitched in front of a defense that was 2% above the league average defense, so you could multiply his exOUT% for all BIP by 1.02, and add that to the K%. Or if you wanted to take it a step further, for GB% you could multiply whatever the exOUT% is by what the constant for the infield only. For FB% it’d be for the outfield only, LD% by both. It’s a small improvement that may not make much of a difference anyway, but it takes out the ambiguity of the defense, and you can easily change them in the event of a trade.

Ben
8 years ago
Reply to  Ben

I suppose you could do that for park factors as well

Spitball McPhee
8 years ago

I was just taking a shit and pondering baseball statistics, as i am wont to do, and i was thinking how much we needed a pitching metric that encompasses quality of contact surrendered.