The Year-to-Year Consistency of Contact Quality: Pitchers

A few months ago, I read an article on FiveThirtyEight by Rob Arthur about a pitcher’s ability to suppress hard contact. One of his conclusions was that some pitchers are better at limiting hard contact than others. This makes good sense, and we can see that suppressed contact in guys like Johnny Cueto and Chris Young. He used the Statcast dataset to find, in MPH, how much faster or slower, on average, a ball would come off the bat from a given pitcher. While the Statcast dataset is still a work in progress, and the metrics may not be super reliable at the moment, the basic idea that pitchers can suppress contact quality, and therefore hits, remains.

That’s all fine, but these statistics would only be useful if they are predictive. I want to see if contact quality is consistent from year to year. I went back through the FanGraphs leaderboards and pulled pitcher seasons from 2010-2014 with at least 200 balls in play. I chose 2010 as the start year because it was the first season Baseball Info Solutions (BIS) used an algorithm to determine contact quality, instead of the video scouts’ judgments. I wanted to see how the Hard% compared from one year to the next, so I took the 20 best and 20 worst pitchers by the metric in each year and matched them with the next year’s data.
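The selection and matching step can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this article: the record layout (`pitcher`, `year`, `bip`, plus a metric key), the function name `match_extremes`, and the unweighted league average over qualifying pitchers are all assumptions, and the sample records are made up purely to show the mechanics. Pitchers without a qualifying season the following year simply drop out of the comparison.

```python
from collections import defaultdict
from statistics import mean


def match_extremes(seasons, metric, n=20, min_bip=200):
    """Pair the n highest and n lowest pitchers by `metric` with their next season."""
    # Keep only pitcher seasons that clear the balls-in-play cutoff.
    by_year = defaultdict(dict)
    for s in seasons:
        if s["bip"] >= min_bip:
            by_year[s["year"]][s["pitcher"]] = s[metric]

    rows = []
    for year in sorted(by_year):
        this_year = by_year[year]
        next_year = by_year.get(year + 1)
        if not next_year:
            continue  # no following season to compare against
        ranked = sorted(this_year, key=this_year.get, reverse=True)
        for label, group in (("highest", ranked[:n]), ("lowest", ranked[-n:])):
            # Drop pitchers who did not qualify again the following year.
            kept = [p for p in group if p in next_year]
            if kept:
                rows.append({
                    "year": year,
                    "group": label,
                    "pitchers": kept,
                    "avg": mean(this_year.values()),
                    "group_avg": mean(this_year[p] for p in kept),
                    "avg_next": mean(next_year.values()),
                    "group_next": mean(next_year[p] for p in kept),
                })
    return rows


# Made-up records, for illustration only (not real pitchers or real data).
toy = [
    {"pitcher": "A", "year": 2010, "bip": 300, "soft_pct": 0.25},
    {"pitcher": "B", "year": 2010, "bip": 300, "soft_pct": 0.15},
    {"pitcher": "C", "year": 2010, "bip": 150, "soft_pct": 0.40},  # under the cutoff
    {"pitcher": "A", "year": 2011, "bip": 300, "soft_pct": 0.20},
    {"pitcher": "B", "year": 2011, "bip": 100, "soft_pct": 0.10},  # under the cutoff
]
rows = match_extremes(toy, "soft_pct", n=1)
```

The groups are labeled "highest" and "lowest" rather than "top" and "bottom" because the better group depends on the metric: the best pitchers have the highest Soft% but the lowest Hard%.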

Since I used a 200-ball-in-play cutoff, some of the top 20 for a given year did not qualify the next year, so I only used pitchers who qualified in consecutive years. I did the same thing for Soft%, but not Med%, as nobody cares about who gave up the least medium contact. I had to measure everything relative to the league average in that season because the league average changed drastically from year to year (league average Soft% was .1716 in 2010 and .2417 in 2011 for pitchers in my sample). Starting with Soft%:

Year   AVG     Top 20  Diff     Top 20 Next  AVG Next  Diff Next  Change
2010   0.1716  0.2201  0.0485   0.2474       0.2417    0.0057     -0.0428
2011   0.2417  0.2905  0.0488   0.1677       0.1565    0.0112     -0.0376
2012   0.1565  0.1956  0.0391   0.1591       0.1499    0.0092     -0.0299
2013   0.1499  0.1877  0.0378   0.1926       0.1810    0.0116     -0.0262
Total  0.1799  0.2235  0.0436   0.1917       0.1823    0.0094     -0.0341

Year   AVG     Bot 20  Diff     Bot 20 Next  AVG Next  Diff Next  Change
2010   0.1716  0.1318  -0.0398  0.2344       0.2417    -0.0073    0.0325
2011   0.2417  0.2019  -0.0398  0.1549       0.1565    -0.0016    0.0382
2012   0.1565  0.1189  -0.0376  0.1364       0.1499    -0.0135    0.0241
2013   0.1499  0.1140  -0.0359  0.1818       0.1810    0.0008     0.0367
Total  0.1799  0.1417  -0.0383  0.1769       0.1823    -0.0054    0.0329

These tables are not the easiest to read, but the columns to focus on in each are Diff, Diff Next, and Change. Diff is the Top/Bot 20 average minus the league average for that year. Diff Next is the same pitchers’ average the next year minus the league average for that next year. Change is Diff Next minus Diff, so a value close to the negative of Diff means the group gave back almost all of its edge.
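As a quick check on how the three columns fit together, here is a small sketch that rebuilds them for the 2010 top 20 Soft% row. The function name `table_columns` and its argument names are my own labels for illustration, not anything from FanGraphs.

```python
def table_columns(avg, group, group_next, avg_next):
    """Rebuild Diff, Diff Next, and Change from the four averages in a row."""
    diff = group - avg                  # group vs. league, year one
    diff_next = group_next - avg_next   # same pitchers vs. league, year two
    change = diff_next - diff           # how much of the gap disappeared
    return diff, diff_next, change


# 2010 row of the top 20 Soft% table.
diff, diff_next, change = table_columns(
    avg=0.1716, group=0.2201, group_next=0.2474, avg_next=0.2417
)
print(f"{diff:.4f} {diff_next:.4f} {change:.4f}")  # 0.0485 0.0057 -0.0428
```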

On average, the top 20 pitchers by Soft% had a Diff of .0436 in year one and .0094 in year two. In other words, they generated 24.2% more soft contact than the league average in year one, and only 5.1% more the next year. Similarly, the bottom 20 pitchers generated 21.3% less soft contact than average in the first year and 3.0% less the next.
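The percentages are each Diff divided by the league average for the same year. A minimal sketch of that arithmetic, using the Total rows of the Soft% tables, is below; `pct_vs_league` is a name I made up for illustration. Because the inputs are the rounded table values, a result can land a tenth of a percentage point away from the figure quoted in the text.

```python
def pct_vs_league(diff, league_avg):
    """Express a gap from league average as a percentage of that average."""
    return 100 * diff / league_avg


# (label, Diff or Diff Next, league average for the same year), Total rows.
soft_totals = [
    ("top 20, year one", 0.0436, 0.1799),
    ("top 20, year two", 0.0094, 0.1823),
    ("bottom 20, year one", -0.0383, 0.1799),
    ("bottom 20, year two", -0.0054, 0.1823),
]
for label, diff, league_avg in soft_totals:
    print(f"{label}: {pct_vs_league(diff, league_avg):+.1f}% vs. league")
```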

Here are the same results for Hard%:

Year   AVG     Bot 20  Diff     Bot 20 Next  AVG Next  Diff Next  Change
2010   0.3033  0.3462  0.0429   0.2523       0.2465    0.0058     -0.0371
2011   0.2465  0.2853  0.0388   0.2907       0.2858    0.0049     -0.0339
2012   0.2858  0.3282  0.0424   0.3136       0.3066    0.0070     -0.0354
2013   0.3066  0.3530  0.0464   0.3095       0.2917    0.0178     -0.0286
Total  0.2856  0.3282  0.0426   0.2915       0.2827    0.0089     -0.0338

Year   AVG     Top 20  Diff     Top 20 Next  AVG Next  Diff Next  Change
2010   0.3033  0.2606  -0.0427  0.2346       0.2465    -0.0119    0.0308
2011   0.2465  0.1996  -0.0469  0.2692       0.2858    -0.0166    0.0303
2012   0.2858  0.2419  -0.0439  0.3013       0.3066    -0.0053    0.0386
2013   0.3066  0.2570  -0.0496  0.2820       0.2917    -0.0097    0.0399
Total  0.2856  0.2398  -0.0458  0.2718       0.2827    -0.0109    0.0349

The 20 pitchers who allowed the most hard contact allowed 14.9% more than average in year one, but only 3.1% more in year two. The 20 best pitchers by Hard% allowed 16.0% less than average one year and 3.9% less the next.

It is obvious that some regression should be expected for these over- and under-performers. For both metrics, the top and bottom 20 pitchers in one season come much closer to average the next. These quality-of-contact metrics are similar to BABIP in that they are highly volatile from year to year.

The numbers, however, don’t come all the way back to league average in year two. The top 20 pitchers stay slightly above average the next year, while the bottom 20 guys similarly stay slightly below average. This suggests that, as is often the case with noisy statistics, a single year of these highly variable quality-of-contact metrics can still carry some predictive value. It is hard to say just how much predictive power they have without knowing how much to regress someone’s Hard%, for example, given some number of balls in play.
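One rough way to put a number on "not all the way back" is to divide Diff Next by Diff for each group, which gives the share of the year-one gap still present in year two. The sketch below does this with the Total rows of the four tables. It is only a group-level back-of-the-envelope figure, not a per-pitcher regression estimate, and `share_retained` is a hypothetical helper name.

```python
def share_retained(diff, diff_next):
    """Fraction of the year-one gap from league average still there in year two."""
    return diff_next / diff


# (Diff, Diff Next) from the Total row of each table.
groups = {
    "Soft% top 20": (0.0436, 0.0094),
    "Soft% bottom 20": (-0.0383, -0.0054),
    "Hard% bottom 20": (0.0426, 0.0089),
    "Hard% top 20": (-0.0458, -0.0109),
}
for name, (diff, diff_next) in groups.items():
    print(f"{name}: {share_retained(diff, diff_next):.0%} of the gap retained")
```

By this crude measure, each group keeps somewhere between about 14% and 24% of its year-one gap, and the rest is gone by the following season.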

While there is some predictive value in a season’s worth of batted-ball data, there isn’t much, so it’s hard to attribute an extremely high Soft% to talent. More likely, these metrics behave similarly to BABIP, in that one fortunate season is not enough to determine the talent level of a player. Batted-ball profiles and BABIP are closely connected, as hard-hit balls tend to fall for hits more often than softly hit balls.

Groundballs, line drives, and fly balls also have their own expected BABIPs, so we could combine this entire batted-ball profile and come up with an expected BABIP for a pitcher, both within a season and for a career. While we know how many groundballs and how much soft contact a pitcher gives up, we don’t know how many soft groundballs a pitcher gives up. Ideally, we could classify each batted ball by both flight type and speed. This is what Statcast tries to do with its launch angle and launch speed data, but that system still has a ways to go. For now, don’t put too much stock in a pitcher’s apparent ability to suppress hard contact in a single season, just as we don’t put too much stock in a pitcher’s low BABIP for the year.

Beau played baseball at Williams College and is currently an MBA/MS Sport Management student at UMass Amherst

Beau – what happens if you do the same analysis but use (Hard% minus Soft%) as the metric?

If pitchers do have the ability to control the contact quality on batted balls, there’s a likelihood that Hard% and Soft% would be negatively correlated, and the year-to-year correlation of (Hard% minus Soft%) could be higher than the year-to-year correlation of either Hard% or Soft% on its own.