Battle of the Ks: K/9, K/BB and K%

The great debate has been raging for years: which strikeout-related metric is the best predictor of actual pitching success? Some would say there is no right or wrong answer: that each metric has its own unique merit and value, and that one must look at certain strikeout-related metrics in combination with others. Unfortunately, as tragic as it may seem, the statistical evidence begs to differ. Statistics tell us there is in fact a right answer, and it’s a whopper.

Let’s start with K/9. Looking at all 2013 pitchers with 80+ innings, the correlation (R²) between strikeouts per nine innings and ERA is a solid .1081. This correlation has been consistent, plus or minus a few hundredths, for the past five years, so nothing exciting or anomalous turns up in other seasons. Yu Darvish leads the category, with Tony Cingrani, Max Scherzer, Anibal Sanchez, and A.J. Burnett rounding out the top five. Additionally, eight of the top ten K/9 leaders ended up with sub-3.10 ERAs. So, a decent indicator all around.
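For readers who want to reproduce this kind of number, here is a minimal sketch of the calculation, not the exact method used for the figures above. The pitcher lines are invented placeholders rather than real 2013 data, the function names `k_per_9` and `r_squared` are hypothetical, and innings are assumed to be true decimals (81.1 in box-score notation would be 81.33).

```python
def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings pitched."""
    return 9 * strikeouts / innings


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


# Invented pitcher lines for illustration only: (innings, strikeouts, ERA).
toy_pitchers = [
    (200.0, 220, 2.90),
    (180.0, 150, 3.80),
    (150.0, 140, 3.40),
    (120.0, 80, 4.50),
    (90.0, 95, 3.60),
]

k9_values = [k_per_9(k, ip) for ip, k, _ in toy_pitchers]
era_values = [era for _, _, era in toy_pitchers]
toy_r2 = r_squared(k9_values, era_values)
print(f"R^2 between K/9 and ERA on the toy data: {toy_r2:.4f}")
```

Swapping in a real season's pitcher lines (with whatever innings minimum you prefer) is all it takes to run the same comparison for any year.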

K/BB gets a bit more interesting. We see a jump in linear correlation to .1671, more than a 50% increase over K/9. Clayton Kershaw, Cliff Lee, and Adam Wainwright all leap into the top ten of this metric, with Hisashi Iwakuma climbing into the top fifteen: four elite hurlers in 2013 who were left off the K/9 leaderboard.

But the real gem is K%. At .2089, it shows nearly double the correlation of K/9. Plus, the top fifteen in this category all ended the year with sub-3.30 ERAs, whereas Scott Kazmir (4.04) and Josh Johnson (6.20) smeared the good name of the K/9 leaderboard, and Kevin Slowey (4.11) and Dan Haren (4.67) loitered unpleasantly on the K/BB board.

The reason K% is so powerful is that it directly measures how effective a pitcher is at striking out each batter he faces. K/9, by contrast, lets BABIP get involved: it counts strikeouts per inning, and an inning always takes three outs no matter how many hits are allowed. A high-BABIP pitcher faces more batters per inning and so gets more chances to pile up strikeouts, which rewards him on K/9 even if he’s giving up, say, 10+ hits per game. In that case, the value of each strikeout is severely reduced.
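A rough worked example of that effect, under simplifying assumptions: every plate appearance ends in either an out or a hit (no walks, hit batters, double plays, or outs on the bases), and both pitchers are hypothetical. Each strikes out 20% of the batters he faces; they differ only in hits allowed.

```python
OUTS_PER_9 = 27  # nine innings always require 27 outs


def k_per_9_from_k_pct(k_pct, hits_per_9):
    """K/9 implied by a strikeout rate, assuming batters either make an out or get a hit."""
    batters_faced_per_9 = OUTS_PER_9 + hits_per_9
    return k_pct * batters_faced_per_9


# Two hypothetical pitchers with identical 20% strikeout rates.
stingy = k_per_9_from_k_pct(0.20, hits_per_9=6)     # faces 33 batters per nine
hittable = k_per_9_from_k_pct(0.20, hits_per_9=12)  # faces 39 batters per nine

print(f"{stingy:.1f} vs {hittable:.1f}")  # 6.6 vs 7.8
```

Same K%, yet the more hittable pitcher shows about 1.2 more strikeouts per nine, simply because he faces more batters.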

To recap:

2013 R² (correlation to ERA)
K/9 .1081
K/BB .1671
K% .2089
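
The comparisons quoted earlier follow directly from these three numbers. As a quick check, the sketch below computes each metric's R² relative to K/9 and also takes the square root to get the size of the underlying correlation coefficient r. The sign of r cannot be recovered from R²; it is presumably negative here (more strikeouts, lower ERA), which is an assumption, not something the figures state.

```python
from math import sqrt

# R^2 values against ERA, as listed in the recap.
r2 = {"K/9": 0.1081, "K/BB": 0.1671, "K%": 0.2089}

baseline = r2["K/9"]
summary = {}
for metric, value in r2.items():
    ratio = value / baseline  # how many times the K/9 figure
    abs_r = sqrt(value)       # magnitude of r; the sign is not recoverable from R^2
    summary[metric] = (ratio, abs_r)
    print(f"{metric:5s} R^2={value:.4f}  x{ratio:.2f} vs K/9  |r|={abs_r:.3f}")
```

K/BB comes out at roughly 1.55 times the K/9 figure and K% at roughly 1.93 times, in line with the "more than 50%" and "double" descriptions.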

So should we end the debate completely? No. But if you asked me to put money on Tim Lincecum, a career 25.8 K% pitcher with no decline in the stat over the past two years, or on Tyler Chatwood, a career 13.0 K% pitcher who had a breakout year in 2013 with his freakish 76.3% LOB, I would bet on Lincecum every doggone time.





Slava Heretz is a Finance Manager, Red Sox fan, and adult ball league hack in Somerville, MA. Business metrics is like sabermetrics... just a different form of nerdy storytelling.

9 Comments
Noah Baron
10 years ago

Great analysis. My only quibble would be a preference to see the graphs in a scatter plot form, but I think the bar graph was pretty good as well.

Also, maybe we would get different results if we set a higher minimum innings threshold. There’s a reason Josh Johnson is the huge outlier: he only pitched 81 innings. I feel like the correlations end up being more reflective of where the outliers fall than of the overall trend.

Either way, interesting results. I’m somewhat surprised K/BB doesn’t have the greatest correlation with ERA.

KK-Swizzle
10 years ago

Awesome! I’ve been thinking K% was the way to go for quite some time now for the exact reasons you ended up specifying… It’s nice to see my hunch verified by some simple, effective analysis!

AE1324
10 years ago

Hate to nitpick, but R squared is not correlation. R squared is a measure of how well the data fits around your model or line of best fit (correlation). The higher the number, the better predictor and more accurate your model or line is.

Correlation is a number between -1 and 1.

AE1324
10 years ago
Reply to  AE1324

Forgot to add that R squared explains variability. The more variability that’s explained, the more accurate your line or model!

Interesting analysis though. I’ve always used K/BB…

Peter Jensen
10 years ago

Why the neglect of K-B in your analysis?

Nathaniel Dawson
10 years ago
Reply to  Peter Jensen

Yeah, if you want to use two components, K rate minus BB rate should have a better correlation than K to BB ratio.

Spitball McPhee
10 years ago

What about k%/bb%…?

a eskpert
10 years ago

It’s the same as k/bb