Visualizing Pitcher Consistency
When evaluating starting pitcher performance, fantasy owners and fans alike lament the relative inconsistency of certain pitchers deemed especially volatile (Francisco Liriano will break your heart), while others like Mark Buehrle are workhorses often viewed as among the most steady arms available. A.J. Mass of ESPN has written about the value of calculating “Mulligan ERAs,” in which a pitcher’s three worst outings are subtracted from his overall ERA. His colleague Tristan Cockroft routinely publishes Consistency Ratings to let readers know which pitchers have remained relatively high on ESPN’s player rater from week to week.
While these methods focus on pitcher performance from start to start, it may be useful to evaluate pitcher performance against individual batters. If Tommy Milone gets rocked pitching on the road in Texas, we may be less concerned than if he is routinely unable to retire low-quality hitters. To this end, we can examine how pitchers perform against different levels of batters. How well does a given pitcher avoid putting low OBP batters on base? How does this compare to his rate of putting a high OBP batter on base? We would expect to see a linear relationship: the Emilio Bonifacios of the world should be easier to get out than the Joey Vottos.
Methods
We begin by examining the 31 pitchers with the most innings pitched for the 2012-2013 seasons. After obtaining batter vs. pitcher data for each of these pitchers during the last season and a half, we can calculate the OBP allowed by each pitcher to any batter with at least 5 plate appearances during this time period (arbitrary cutoff alert!). We can now see how Buster Posey fares against the likes of Clayton Kershaw, Ian Kennedy, and any other NL pitcher against whom he has accrued at least 5 PA. It turns out Posey did pretty well for himself.
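As a rough illustration of this filtering step, here is a minimal sketch in Python. It is not the code behind this article: the `Matchup` class, the `qualified` helper, and the sample rows are all hypothetical, and the 5 PA cutoff is applied to the standard OBP denominator (AB + BB + HBP + SF) as a simplifying assumption.

```python
from dataclasses import dataclass

MIN_PA = 5  # the article's admittedly arbitrary cutoff


@dataclass
class Matchup:
    """One batter's career line against one pitcher (hypothetical layout)."""
    pitcher: str
    batter: str
    ab: int
    h: int
    bb: int
    hbp: int
    sf: int

    @property
    def pa(self) -> int:
        # Assumption: count only the events in the OBP denominator.
        return self.ab + self.bb + self.hbp + self.sf

    @property
    def obp(self) -> float:
        # Standard OBP: times on base divided by AB + BB + HBP + SF.
        return (self.h + self.bb + self.hbp) / self.pa


def qualified(matchups, min_pa=MIN_PA):
    """Keep only matchups with at least min_pa plate appearances."""
    return [m for m in matchups if m.pa >= min_pa]


# Made-up rows for illustration only; these are not real stat lines.
sample = [
    Matchup("Pitcher A", "Batter X", ab=10, h=4, bb=2, hbp=0, sf=0),
    Matchup("Pitcher A", "Batter Y", ab=3, h=1, bb=0, hbp=0, sf=0),
]
kept = qualified(sample)
```

Here only the Batter X row survives the cutoff, since Batter Y has just 3 plate appearances against Pitcher A.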
In order to obtain the OBP of batters in general, not in relation to particular pitchers, we can examine the leaderboards for players with at least 450 PA in 2012-2013. Based on the work of Russell Carleton, we have confidence that after ~450 PA, a batter’s OBP tends to stabilize and represents their long-term OBP skill level.
Batters were then placed in five buckets: lowest, low, medium, high, and highest OBP levels.
Batter On-Base Percentage Classification
OBP Category | OBP | Player Examples
Lowest | .243-.311 | Colby Rasmus, J.J. Hardy, Raul Ibanez
Low | .311-.330 | Ruben Tejada, Eric Hosmer, Michael Young
Medium | .330-.338 | Elvis Andrus, Jason Heyward, Yoenis Cespedes
High | .338-.349 | Brandon Belt, Jason Kipnis, Coco Crisp
Highest | .349-.458 | Allen Craig, Andrew McCutchen, Mike Trout
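The classification above can be sketched as a simple lookup. The cutoffs below are taken from the table; the function name `obp_category` is hypothetical, and because the published ranges share their endpoints, sending a batter who sits exactly on a boundary to the higher bucket is an assumption rather than something stated here.

```python
from bisect import bisect_right

LABELS = ["Lowest", "Low", "Medium", "High", "Highest"]

# Boundaries between adjacent buckets, read off the table above.
UPPER_BOUNDS = [0.311, 0.330, 0.338, 0.349]


def obp_category(obp: float) -> str:
    """Return the OBP bucket label for a batter's overall OBP.

    Assumption: a value exactly on a boundary goes to the higher bucket.
    """
    return LABELS[bisect_right(UPPER_BOUNDS, obp)]
```

For example, `obp_category(0.283)` returns `"Lowest"`, which matches the bucket the article assigns to a .283 OBP hitter.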
Each batter, assigned a score of lowest to highest, was then matched with the batter vs. pitcher dataset, allowing us to calculate the mean OBP allowed by individual pitchers to hitters in each of the categories. So, although someone like Zack Cozart sports a .283 OBP in 2012-2013, earning a spot in the lowest category, he does own a .329 OBP against Yovani Gallardo. Maybe this is all the evidence Reds manager Dusty Baker needs to keep batting Cozart second in the lineup.
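The matching and averaging step might look something like the sketch below. The function name and the dictionary layouts are hypothetical, the sample values are made up, and taking an unweighted mean of each batter's OBP against the pitcher is one reading of "mean OBP allowed" (a PA-weighted average would be another).

```python
from collections import defaultdict
from statistics import mean


def mean_obp_by_category(matchup_obp, batter_category):
    """Average OBP allowed per (pitcher, batter category).

    matchup_obp:     {(pitcher, batter): OBP allowed in that matchup}
    batter_category: {batter: category label}
    """
    grouped = defaultdict(list)
    for (pitcher, batter), obp in matchup_obp.items():
        category = batter_category.get(batter)
        if category is None:
            continue  # batter has no category (e.g. too few PA overall)
        grouped[(pitcher, category)].append(obp)
    # Unweighted mean: every qualifying batter counts equally (assumption).
    return {key: mean(values) for key, values in grouped.items()}


# Made-up values for illustration only.
matchup_obp = {
    ("Pitcher A", "Batter X"): 0.300,
    ("Pitcher A", "Batter Y"): 0.400,
    ("Pitcher A", "Batter Z"): 0.250,  # Batter Z has no category
}
batter_category = {"Batter X": "Lowest", "Batter Y": "Lowest"}
result = mean_obp_by_category(matchup_obp, batter_category)
```

In this toy input, Pitcher A ends up with a single entry for the "Lowest" category, averaging the two categorized batters and skipping the third.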
Results
If we examine the performance of pitchers across the five categories of OBP skill, we can fit a straight line to these five points and calculate R2, the square of the correlation coefficient. R2 in this case is a measure of how well the data fit a straight line: if a pitcher allows a low OBP to low OBP hitters, and a correspondingly higher OBP to high OBP hitters, the data points should increase linearly and the value of R2 should approach 1. Conversely, pitchers who are inconsistent in their ability to get hitters of a certain skill level out would have an R2 much closer to 0.
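To make the calculation concrete, here is a minimal sketch of R2 for five points. It assumes the five categories are coded as equally spaced values 1 through 5 on the x-axis, which is not stated in the article, and the two OBP series are invented purely to show the contrast.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R2 is undefined when either variable is constant")
    return sxy ** 2 / (sxx * syy)


# Assumption: Lowest..Highest coded as 1..5.
tiers = [1, 2, 3, 4, 5]

# Invented OBP-allowed values, not real pitcher data.
steady = [0.28, 0.30, 0.32, 0.34, 0.36]   # rises evenly with batter skill
erratic = [0.33, 0.29, 0.35, 0.30, 0.33]  # no clear trend

steady_r2 = r_squared(tiers, steady)
erratic_r2 = r_squared(tiers, erratic)
```

The evenly rising series gives an R2 of essentially 1, while the zigzag series lands near 0. One caveat worth keeping in mind: R2 ignores the sign of the slope, so a pitcher whose OBP allowed fell steadily as batter quality rose would also score close to 1.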
Correlation Coefficient for OBP Allowed Among Differently Skilled Batters
Name | R2
Adam Wainwright | 0.798
Jason Vargas | 0.793
Max Scherzer | 0.771
Ricky Nolasco | 0.740
Matt Cain | 0.734
Yu Darvish | 0.717
Wade Miley | 0.705
C.J. Wilson | 0.700
Jordan Zimmermann | 0.697
Kyle Lohse | 0.660
Bronson Arroyo | 0.657
Yovani Gallardo | 0.638
Justin Verlander | 0.619
Mat Latos | 0.617
Cliff Lee | 0.553
Hiroki Kuroda | 0.536
James Shields | 0.469
Justin Masterson | 0.443
Homer Bailey | 0.377
Ian Kennedy | 0.353
Clayton Kershaw | 0.329
Cole Hamels | 0.159
Gio Gonzalez | 0.140
Mark Buehrle | 0.105
Trevor Cahill | 0.083
Felix Hernandez | 0.076
Chris Sale | 0.031
R.A. Dickey | 0.029
CC Sabathia | 0.028
Jon Lester | 0.028
Madison Bumgarner | 0.025
There is a wide range of R2 values among this list of starting pitchers. Adam Wainwright takes the grand prize for consistency. He is far more prone to putting elite OBP hitters on base than lowly hitters. Madison Bumgarner, on the other hand, strangely performs worse against low OBP than high OBP hitters, and has the lowest R2. And R.A. Dickey, as you might expect, is sort of all over the place.
Below is a visual representation of the OBP against pitchers with high and low R2 values. We can see that the pitchers with the highest correlation coefficient have a much more linear relationship overall with OBP allowed than pitchers with low values.
Additional analyses showed that there was no relationship between a starter’s FIP and their correlation coefficient. A quick glance at the names in the two graphs above confirms this. Jason Vargas, with an R2 of .793, is a worse pitcher, in pretty much all respects, than Felix Hernandez at .076. Interestingly, Jason Vargas has one of the league’s highest HR/9 at 1.28 during 2012-2013, while King Felix sports one of the lowest ratios at .62.
What, then, does pitcher consistency tell us? While it may not tell us much about the overall skill of a pitcher by itself, we can discern from the data which pitchers are doing a good job getting out poor hitters. Pitchers like Adam Wainwright and Max Scherzer are doing extremely well, and their R2 values indicate that they are pitching steadily: they are less likely to blow up against poor hitters. Of course, pitcher performance can differ greatly from start to start, but one can have confidence that Ricky Nolasco will probably dominate his former Marlins teammates (30th in team OBP), because he consistently allows a low OBP to low OBP hitters. Conversely, perhaps it’s a good thing Jason Vargas does not have to pitch against his Angels teammates, who collectively have the 4th highest team OBP in the majors.
Oddly enough, Justin Masterson’s OBP allowed has a small range, from .299 in the middle OBP tier to .371 against the highest tier, indicating that when he’s brought his good stuff, he mostly dominates all batters regardless of their level of skill. We can have less confidence that Masterson will dominate a middling OBP team like Kansas City (against whom he has a 6.39 ERA this year), ranked 20th overall in the majors, while he has repeatedly humiliated the Blue Jays, who just beat out the Royals at 17th overall.
Despite the comically bad timing of his recent piece on batting Raul Ibanez against CC Sabathia, David Cameron was right to point out the relative worthlessness of individual batter vs. pitcher matchups and the danger of drawing conclusions from such small sample sizes. However, we can use aggregated batter vs. pitcher data to learn more about what kinds of players pitchers are more likely to strike out, or to whom they are more likely to serve up the long ball or a base on balls. While it’s easy to assume that pitcher X will be less likely to strike out Norichika Aoki than Ike Davis, by studying consistency we may be able to see who deviates from this linear pattern. Are some average strikeout pitchers more likely to strike out low strikeout hitters? We can already see from the data above that R.A. Dickey is as likely to put a low OBP hitter on base as a high OBP hitter. While this fact seems to make little sense, these results indicate that the knuckleball can baffle expert hitters as much as less skilled batsmen. It may be worthwhile to use consistency ratings such as these to determine what kinds of pitchers deviate from the expected patterns.
All data courtesy of Fangraphs and Baseball Reference.
Because I’m a big believer in open data, here is a link to the R code used to find Batter vs. Pitcher OBP by quintile.