The “Exceptional” Kyle Lohse

After the 2012 season, Kyle Lohse declined the St. Louis Cardinals’ qualifying offer and hit the free agent market.  Lohse’s 2012 season was exactly what any starter would want in a contract year: a career-best 2.86 ERA over 211 innings.  It completed a comeback from a rough 2010 in which Lohse battled arm trouble and had one of his worst seasons.

Many commentators felt that Lohse’s 2012 campaign was a one-time affair.  Lohse’s ERA benefited from an unusually low .262 Batting Average on Balls in Play (BABIP), and the usually reliable pitching statistic of Fielding Independent Pitching (FIP) dinged him for it, pegging his real performance at 3.51, roughly two-thirds of a run higher.  Furthermore, Lohse spent 2012 at Busch Stadium, a pitcher’s park, and got to have his pitches called by Yadier Molina, perhaps the best catcher in the game.
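For readers who want the arithmetic behind those two statistics, here is a minimal sketch of the standard public formulas.  The FIP constant is recalculated every season to put FIP on the ERA scale; the 3.10 default below is only an assumed placeholder, and the stat line at the bottom is invented for illustration rather than taken from Lohse’s record.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: non-homer hits per ball put in play."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching.

    `innings` must be true decimal innings (174 2/3 is 174.667, not the
    box-score shorthand 174.2). `constant` is an assumed placeholder; the
    real value is recomputed for each season.
    """
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings + constant


# Hypothetical stat line, invented for illustration (not Lohse's actual totals).
example_babip = babip(hits=200, home_runs=20, at_bats=800, strikeouts=150, sac_flies=5)
example_fip = fip(home_runs=20, walks=40, hit_by_pitch=5, strikeouts=150, innings=200.0)
print(f"BABIP {example_babip:.3f}, FIP {example_fip:.2f}")
```

Note that nothing a fielder touches appears in the FIP formula, which is why a pitcher with a low BABIP will post an ERA below his FIP.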

But was Lohse’s low BABIP in 2012 truly a fluke? 

Let’s start by comparing Lohse to other Cardinals starters with at least 150 IP that year.  Like Lohse, they pitched their home games in the same pitcher’s park, and also took their signs from Yadier Molina:

Pitcher          IP      BABIP   ERA    FIP
Kyle Lohse       211     0.262   2.86   3.51
Jake Westbrook   174.2   0.312   3.97   3.8
Adam Wainwright  198.2   0.315   3.94   3.1
Lance Lynn       169     0.316   3.67   3.47

Among these Cardinals starters, Kyle Lohse had the lowest BABIP by 50 points, and was the only one below the league BABIP average.  Interesting.  But one season proves nothing.  So let’s look at 2011, again for Cardinals starters with at least 150 IP:

Pitcher          IP      BABIP   ERA    FIP
Kyle Lohse       188.1   0.269   3.39   3.67
Chris Carpenter  237.1   0.312   3.45   3.06
Jake Westbrook   183.1   0.313   4.66   4.25
Jaime Garcia     194.2   0.318   3.56   3.23

In 2011, Kyle Lohse’s BABIP was a mere seven points higher than his 2012 BABIP, and still absurdly low.  Once again, Lohse’s BABIP was far better than that of any other Cardinals starter, and well below league average.  Is this still a fluke?  Does Yadi just save his best calls for his friend Kyle?

Perhaps the key is to get Lohse away from Molina and Busch Stadium.  Fortunately for our purposes, the Milwaukee Brewers indulged this notion, signing Lohse at the conclusion of 2013 Spring Training.  Miller Park, where the Brewers play, is a hitter’s park where the fly balls go a long way and batters get more hits.  Furthermore, in 2012, the Brewers had one of the worst defenses in baseball.  The stage seemed to be set for a substantial BABIP regression.

The 2013 season is now almost complete for the Brewers.  As of this writing, here are the statistics for Brewers starters with at least 150 IP in 2013:

Pitcher          IP      BABIP   ERA    FIP
Kyle Lohse       184.2   0.284   3.46   4.1
Yovani Gallardo  161.2   0.299   4.18   3.95
Wily Peralta     172.1   0.292   4.49   4.28

Lohse’s BABIP did regress a bit.  Yet it is not only the lowest of the three qualifying Brewers starters, but still notably below the .294 league-wide BABIP.

One last comparison: other NL Central starters play in many of the same stadiums that Kyle Lohse does.  How does his BABIP compare to starters who have also spent the last three years pitching at least 450 innings exclusively for NL Central teams?

Pitcher          BABIP   ERA    FIP
Kyle Lohse       0.271   3.22   3.75
Bronson Arroyo   0.278   4.13   4.63
Mike Leake       0.284   3.87   4.21
Homer Bailey     0.292   3.76   3.67
Yovani Gallardo  0.293   3.79   3.83
Jake Westbrook   0.307   4.23   4.15

There he is again.  The lowest BABIP in the NL Central for starters over the last three years belongs to Kyle Lohse.
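The four comparisons above reduce to one question: how many points of BABIP separate Lohse from the next-best pitcher in each group?  A quick sketch, using the BABIP figures from the tables expressed in points (thousandths) so the subtraction stays exact; the function and variable names are mine.

```python
# BABIP in points (thousandths), copied from the four tables above.
groups = {
    "2012 Cardinals": {"Lohse": 262, "Westbrook": 312, "Wainwright": 315, "Lynn": 316},
    "2011 Cardinals": {"Lohse": 269, "Carpenter": 312, "Westbrook": 313, "Garcia": 318},
    "2013 Brewers": {"Lohse": 284, "Gallardo": 299, "Peralta": 292},
    "2011-2013 NL Central": {
        "Lohse": 271, "Arroyo": 278, "Leake": 284,
        "Bailey": 292, "Gallardo": 293, "Westbrook": 307,
    },
}


def gap_to_next_best(babips, pitcher="Lohse"):
    """Points between `pitcher` and the lowest BABIP among everyone else.

    A positive result means `pitcher` has the lowest BABIP in the group.
    """
    others = [value for name, value in babips.items() if name != pitcher]
    return min(others) - babips[pitcher]


gaps = {label: gap_to_next_best(babips) for label, babips in groups.items()}
for label, gap in gaps.items():
    print(f"{label}: Lohse is lower by {gap} points")
```

The gap is positive in every group, though it shrinks considerably once Lohse leaves St. Louis.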

What is going on?  Does Kyle Lohse simply possess The Will to Pitch? 

Certainly, many of you might claim Kyle Lohse is the beneficiary of nothing more than good luck.  It is almost an article of faith among observers that BABIP is essentially a random attribute beyond the pitcher’s control, benefiting substantially from defense.  One could also argue I am using arbitrary endpoints.  While Kyle Lohse had a terrific pitching BABIP from 2011–2013, his major league BABIP was .364 in 2010.  Move the goalposts, some would say, and get a different result.  Finally, Derek Carty suggests that BABIP can take as long as 8 years (~3729 batters) to stabilize into a predictable indicator of a pitcher’s ability, which is another way of saying that it never really stabilizes at all, and is therefore indicative of nothing.

As to Kyle Lohse, that view may be correct.  But I suspect it is not.  Rather, I suspect that Kyle Lohse’s career renaissance has actually been driven in part by his ability to limit the damage caused by balls put into play.  To explain why, I’ll first address the arguments I just made in favor of his performance being unsustainable.

First, let’s talk about BABIP.  Although it is common to attribute BABIP entirely to luck, it is more complicated than that.  Tom Tango and his colleagues found, for example, that BABIP was 44% luck.  The remainder (majority) of BABIP was attributed to a combination of the pitcher, the park, and fielding.  The pitcher was given 28% of the credit for his BABIP, but that is just an average; many observers suspect that a small class of pitchers has a unique ability to control their BABIP by inducing less effective contact.  Strikeout pitchers are one example.  So, while it is common to dismiss good BABIPs as flukes, it is intellectually lazy to do so, particularly if a pitcher is generating low BABIPs on a consistent basis.

Second, let’s address arbitrary endpoints.  Am I excluding Kyle Lohse’s dreadful 2010 season from my endpoints?  Yes.  Why?  A few reasons.  First, because Lohse was injured that year and dealing with arm trouble that he was finally able to resolve.  In fact, the 2010 season was the culmination of a few injury-plagued seasons for Lohse.  But since the 2011 season that followed, Lohse has consistently pitched at least 180 innings per year and has also been consistently effective, more so than he ever was before.  Since 2011, his walk rates have been the best of his career, as has his strikeout-to-walk ratio, and everyone agrees that both are controlled primarily by the pitcher’s ability.  Also, as Russell Carleton has found, a pitcher’s recent BABIP performance tends to be more predictive of his BABIP going forward.  So, what some would call an arbitrary endpoint (the beginning of Lohse’s 2011 season), I would call appropriate, and indicative.

Finally, there is the issue of sample size.  Although I have no quarrel with the method Derek Carty used to conclude that a pitcher’s BABIP can take 3729 batters to stabilize, Kyle Lohse has faced over 2400 batters in the past three years.  That is not a trivial sample, particularly when it spans home stadiums at opposite ends of the park factor spectrum.
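To put a rough number on what 2400 batters is worth, here is a back-of-the-envelope regression to the mean.  It rests on assumptions of mine, not Carty’s: that the stabilization point can be treated as the sample size at which observed and league-average performance deserve equal weight (a common rule of thumb), that the .294 league BABIP from 2013 holds for all three seasons, and that 2400 is close enough to Lohse’s actual batters faced.  The function name is hypothetical.

```python
def regressed_babip(observed, batters_faced, league_mean, stabilization_point):
    """Shrink an observed BABIP toward the league mean.

    Adds `stabilization_point` batters of league-average performance to the
    sample, so a sample of exactly that size is weighted 50/50.
    """
    total = batters_faced + stabilization_point
    return (observed * batters_faced + league_mean * stabilization_point) / total


estimate = regressed_babip(
    observed=0.271,            # Lohse's 2011-2013 BABIP from the NL Central table
    batters_faced=2400,        # "over 2400 batters" in the text, used as a round figure
    league_mean=0.294,         # 2013 league figure, assumed to hold for all three years
    stabilization_point=3729,  # Carty's figure as cited in the text
)
print(f"Regressed BABIP estimate: {estimate:.3f}")
```

Under those assumptions the estimate still lands roughly nine points below the league mean: heavily regressed, but not regressed away.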

My suspicions about Lohse are further confirmed when you consider the differential between his RA9-WAR and his fWAR.  FanGraphs bases fWAR for pitchers entirely on their FIP.  However, FanGraphs also recognizes that FIP, while effective in evaluating most pitchers, does not properly evaluate pitchers who actually possess the skill to limit the damage on balls put into play.  Rather than toss FIP and fWAR aside, FanGraphs last year began publishing RA9-WAR as an alternative metric to allow a comparison between the number of runs that actually come across the plate while a pitcher is on the mound, versus those that FIP is willing to credit to the pitcher as having personally prevented.  The differential between a pitcher’s RA9-WAR and fWAR tells you how much of that pitcher’s run prevention cannot be explained by the three “true” outcomes of home runs, walks, and strikeouts.  Niftily, FanGraphs also estimates how the other runs were prevented — through BABIP (BIP-Wins) and by runners stranded (LOB-Wins).  Both RA9-WAR and fWAR are also park-adjusted.

Let’s start with the entire time period of 2011-2013.  Among starters with at least 450 IP, Lohse’s RA9-WAR / fWAR differential is in the top 10% in the game.

Name               RA9-WAR  BIP-Wins  LOB-Wins  FDP-Wins  RAR    fWAR  RA9-WAR minus fWAR
Jered Weaver       17       6.1       -0.1      6         102    10.9  6.1
Jeremy Hellickson  9        4.6       0.6       5.2       37.2   3.8   5.2
Hiroki Kuroda      14       1.7       2.6       4.3       90.4   9.7   4.3
Clayton Kershaw    21.9     5.6       -1.5      4.1       152.9  17.8  4.1
Bronson Arroyo     6.6      2.3       1.7       3.9       23.3   2.6   4
Kyle Lohse         11       3.6       0.2       3.8       66.1   7.2   3.8
Ervin Santana      8.2      4.6       -0.9      3.7       41.9   4.5   3.7
R.A. Dickey        11.8     3.2       0         3.2       80     8.6   3.2
James Shields      15.5     2         1         3         117.3  12.5  3
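The differential in the last column should be almost entirely accounted for by BIP-Wins plus LOB-Wins.  Here is a small check on three rows of the table, taking the WAR column as fWAR; the function name is mine, and the one-decimal rounding mirrors the table’s precision.

```python
# (RA9-WAR, BIP-Wins, LOB-Wins, fWAR) for 2011-2013, copied from the table above.
rows = {
    "Jered Weaver": (17.0, 6.1, -0.1, 10.9),
    "Clayton Kershaw": (21.9, 5.6, -1.5, 17.8),
    "Kyle Lohse": (11.0, 3.6, 0.2, 7.2),
}


def decompose(ra9_war, bip_wins, lob_wins, fwar):
    """Return (RA9-WAR minus fWAR, BIP-Wins plus LOB-Wins), each to one decimal."""
    differential = round(ra9_war - fwar, 1)
    explained = round(bip_wins + lob_wins, 1)
    return differential, explained


results = {name: decompose(*stats) for name, stats in rows.items()}
for name, (differential, explained) in results.items():
    print(f"{name}: differential {differential}, BIP + LOB {explained}")
```

For Lohse, nearly all of the 3.8-win gap comes from balls in play (3.6) rather than from stranding runners (0.2).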

Lohse’s differential has intensified in 2012-2013.  Over the last two years, among those with at least 300 IP, only one starter in baseball had a larger RA9-WAR / fWAR differential (last column) than Kyle Lohse:

Name               RA9-WAR  BIP-Wins  LOB-Wins  FDP-Wins  fWAR  RA9-WAR minus fWAR
Clayton Kershaw    14.6     4.3       -0.9      3.4       11.2  3.4
Kyle Lohse         8.3      2.4       0.9       3.3       5     3.3
Hiroki Kuroda      10.3     1.6       1.1       2.7       7.6   2.7
Bronson Arroyo     6.7      1.3       1.1       2.5       4.2   2.5
Jarrod Parker      7.2      2.1       0.2       2.3       5     2.2
Jordan Zimmermann  8.3      1.1       0.8       2         6.4   1.9
Ervin Santana      3.5      3.4       -1.6      1.9       1.7   1.8
R.A. Dickey        8.2      2.3       -0.4      1.9       6.4   1.8
Chris Sale         11.3     0.8       0.8       1.6       9.7   1.6

That guy’s name is Clayton Kershaw, and he is pretty good.  In fact, Kershaw and Lohse have beaten their FIP by basically the same amount over the past two years.  Unlike Kershaw, Lohse has pitched one of those seasons at home in Miller Park.

Overall, it is safe to say Lohse is showing a strong and consistent ability to beat his FIP, and over the last few years has done so better than almost any starter in baseball.  He is doing it by generating balls in play that are uniquely unsuccessful at becoming hits, and which his defense seems unusually able to field for outs.

How is he doing this?  It certainly is not his strikeout rate.  Lohse is not anybody’s idea of a strikeout pitcher.

What Lohse does do, however, is control the count, minimize walks, and consistently pitch from ahead.  This quality makes Lohse an extremely enjoyable pitcher to watch: despite topping out at 90 mph, he pounds the zone and challenges hitters.  His BB/9 over the last three years has ranged from 1.62 to 2.01.  During that same time frame, only Cliff Lee was more likely than Kyle Lohse to throw a first-pitch strike, which Lohse did 67.5% of the time.  The fact that Lohse is throwing first-pitch strikes against 2/3 of the batters he faces without getting killed suggests that he is putting those strikes in locations where batters want no part of them.  In short, Lohse has terrific control and consistently finds himself in counts where he and his catcher have the luxury of choosing their pitch.

Does Lohse’s control affect the quality of the ball being put into play against him?  It very well may.  Although his sample size could have been larger, Russell Carleton found that pitcher BABIPs correlated with the ball-strike count the hitter was facing when he put the bat on the ball.  The more favorable the count to the pitcher, the less likely the hitter is to reach base on the ball he puts in play.  Kyle Lohse’s three best counts for limiting batter wOBA this year?  Why, those would be 0-2, 1-2, and 0-1.  And the three counts Kyle Lohse faces far less often than any others?  Those would be 3-0, 3-1, and 3-2.

The bottom line is that Kyle Lohse is an exception among aging starters: a pitcher who has gained effectiveness in his mid-thirties through terrific control that not only forces hitters to beat him, but also apparently limits the damage even when batters do hit the ball.  Should the Brewers make Lohse available at the trade deadline next year, contenders would be foolish not to give him a close look, particularly with Lohse under control through 2015.  When the difference between collecting a pennant and going home can be a batted ball just out of reach, it makes sense to have a pitcher with a demonstrated knack for putting the ball in the defender’s glove.  

Jonathan Judge has a degree in piano performance, but is now a product liability lawyer. He has written for Disciples of Uecker and Baseball Prospectus. Follow him on Twitter @bachlaw.

10 years ago

In 2011, Kyle Lohse’s BABIP was seven points lower than his supposedly fluky BABIP from 2012.

A minor typo in an otherwise excellent article.

10 years ago

FDP-Wins is BIP-Wins plus LOB-Wins, not a completely separate statistic. That is why it is essentially the same as the RA9-WAR/fWAR differential column in the tables. I am not sure, but I think the few 0.1 differences may be due to changes in yearly park factors and other constants in WAR.

10 years ago

So, while it is common to dismiss good BABIPs as flukes, it is intellectually lazy to do so

Thanks for not being lazy. Great article.

It would be awesome if Fangraphs had a park adjusted BABIP skill statistic (descriptive and/or predictive).

10 years ago

Thank you for finally giving Kyle Lohse the respect he deserves! He’s been underrated and dismissed for years because of his low K/9%, relatively high FIP, and horrible 2009-10 seasons (during most of which he was pitching hurt).

As you said, FIP and the idea that BABIP should hover around .300 are fundamentally flawed by not taking into account the type of contact given up, most specifically GB% and LD%, and I wish more baseball analysts would realize that. Contact pitchers can be very good as long as they limit walks and line drives while getting plenty of ground balls, otherwise Greg Maddux would not be a future Hall of Famer.