How Sticky Are Walk Rates After Velocity Changes?

An increasingly popular strategy for drafting pitchers is to take ones with plus control and underwhelming fastballs, with the idea that the club’s player development team can coax out a velocity jump. Intuitively, this makes sense, since it is relatively easy to develop velocity and relatively difficult to improve control. Turn a control-only pitcher into one with above-average stuff and plus control, and suddenly you have a solid rotation arm instead of an org-depth pitcher.

However, skill improvement does not happen in a vacuum, and this strategy has potential side effects, namely that control might get worse as velocity increases. In general, this ends up being a fine tradeoff since we are in a power-oriented offensive environment, but investigating these effects is important for evaluating how valuable the strategy is, which is what I do in this article.

Methodology

The variable we will be testing is the difference in walk percentage (BB%) from one year to the next. I prefer this approach to something more sophisticated like Command+ since I do not want to regress on a metric that is itself derived from a regression. I model this difference using linear regression, with the independent variables being percent change in fastball velocity, age, and prior-season walk percentage. For age, we would expect older pitchers’ skills to stay roughly the same, whereas younger pitchers might be more susceptible to an increase in velocity leading to an increase in walks.
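As a rough sketch of the kind of model described above (not the code or data used for this article), the fit can be written in plain Python. The function name `fit_ols`, the predictor order, and the toy rows are all illustrative assumptions; the toy response is built from made-up coefficients so the fit can be checked by eye.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows:    list of predictor tuples, here (velo_change_pct, prior_bb_pct, age)
    targets: list of responses, here BB% this year minus BB% last year
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0, *map(float, r)] for r in rows]  # prepend the intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical toy rows (NOT real pitchers): (velo change %, prior BB%, age).
toy_rows = [(1.0, 8.0, 25), (-0.5, 10.0, 31), (2.0, 6.5, 23),
            (0.0, 9.0, 28), (-1.5, 7.0, 34), (0.5, 11.0, 27)]
# Response generated from made-up coefficients, with no noise added.
toy_diff = [5.0 - 0.2 * v - 0.5 * bb - 0.03 * age for v, bb, age in toy_rows]

coefs = fit_ols(toy_rows, toy_diff)
print([round(c, 3) for c in coefs])  # should recover the made-up coefficients
```

In practice a statistics package would also report the standard errors, t values, and F statistic shown in the output tables; this sketch only recovers the coefficients.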

An important note is that this dataset is biased. Unfortunately, we do not have minor league pitch velocity data, so we cannot include players who had a big velocity jump but did not end up making the major leagues. Our dataset comprises pitchers who are already good enough to make the majors, so it makes sense that if they add velocity, their BB% would remain close to unchanged, since they are already so talented. This limits how well the study applies to the amateur draft and minor league baseball, though if age is a significant factor in the regression, then I think it is safe to assume some of the results hold for younger and less polished players.

The data comes from the 2015-2019 seasons. I did not want to include the COVID year or this current year. This is only five seasons, but going back even further would not help, since we would be including a different era of pitching instruction.

Results

Running this linear regression, we get a model with an F statistic of 90 and a corresponding p value of less than 2.2 × 10^-16, indicating that the model is statistically significant. The R Squared is only .1262 though, which is low. The output table is shown below:
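A large F statistic and a low R Squared are not in conflict; with enough observations, even a weak fit is easily distinguished from no fit at all. The sketch below uses the standard identity linking the two for a model with an intercept. It assumes the reported R Squared is the ordinary (unadjusted) one and that the model has three predictors; the function names are illustrative.

```python
def f_from_r2(r2, n_predictors, df_resid):
    """Overall F statistic implied by R^2 for an OLS model with an intercept."""
    return (r2 / n_predictors) / ((1.0 - r2) / df_resid)


def implied_df_resid(f_stat, r2, n_predictors):
    """Residual degrees of freedom that reconcile a reported F and R^2."""
    return f_stat * n_predictors * (1.0 - r2) / r2


# Reported values: F = 90, R^2 = .1262; three predictors is an assumption.
df_resid = implied_df_resid(90.0, 0.1262, 3)
print(round(df_resid))

# The same R^2 with only 100 residual degrees of freedom gives a far smaller F.
print(round(f_from_r2(0.1262, 3, 100), 2))
```

Under those assumptions the reported figures imply roughly 1,870 residual degrees of freedom. Since the F statistic is quoted as a round number, this is only an approximation of the sample size.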

Linear Regression Output Table
Variable          Coefficient   Std Error   T Value   P(>|t|)
Intercept             5.68307     0.59275     9.588   < 2 × 10^-16
Change_FB_Velo       -0.21481     0.05462    -3.933   8.7 × 10^-5
Prior_BB_Per         -0.52674     0.02243   -23.482   < 2 × 10^-16
Player_Age           -0.03137     0.01811    -1.732   0.0834

From the table, we can see that change in fastball velocity and prior walk percentage are significant predictors of the one-year difference in walk percentage. Age was not significant at the .05 level (p = .0834), which makes it unclear whether we can apply these results to minor leaguers.
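To make the coefficients concrete, here is a small sketch that plugs a pitcher into the fitted equation from the full-sample table. The coefficients are copied from the table; the function name, the example pitcher, and the unit reading (velocity change in percent, walk rate in percentage points) are assumptions for illustration.

```python
# Coefficients copied from the full-sample output table.
COEFS = {
    "intercept": 5.68307,
    "change_fb_velo": -0.21481,
    "prior_bb_per": -0.52674,
    "player_age": -0.03137,
}


def predicted_bb_change(velo_change_pct, prior_bb_pct, age):
    """Predicted year-over-year change in BB% from the fitted linear model."""
    return (COEFS["intercept"]
            + COEFS["change_fb_velo"] * velo_change_pct
            + COEFS["prior_bb_per"] * prior_bb_pct
            + COEFS["player_age"] * age)


# Hypothetical pitcher: age 27, 8.0% BB% last year.
flat = predicted_bb_change(0.0, 8.0, 27)  # no velocity change
jump = predicted_bb_change(2.0, 8.0, 27)  # assumed 2% velocity gain
print(round(flat, 3), round(jump, 3), round(jump - flat, 3))
```

For this hypothetical pitcher, the 2% velocity gain lowers the predicted change in BB% by about 0.43 points relative to no velocity change. Given the low R Squared, the prediction for any single pitcher is very noisy.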

This does not quite answer the question of whether increasing the velocity of pitchers who already have plus control affects their control at all. To get at that, I fit a second model restricted to above-average control pitchers (BB% below 7.5% in the prior year), again modeling difference in walk percentage; the predictors in the output table below are change in fastball velocity, prior walk percentage, and player age. Here we see that all of the inputs in the model are significant at the .05 level, including age, which is interesting. Additionally, the F statistic has a statistically significant p value, but the R Squared of this model is only .03864, so it is hard to make much of these results.

Linear Regression Output Table (Control Pitchers)
Variable          Coefficient   Std Error   T Value   P(>|t|)
Intercept             5.50138     0.94504     5.821   8.8 × 10^-9
Change_FB_Velo       -0.16946     0.07989    -2.121   0.0342
Prior_BB_Per         -0.40993     0.07934    -5.166   3.1 × 10^-7
Player_Age           -0.05254     0.02591    -2.028   0.0430

Conclusion

To summarize, predicting walk rates is difficult, which is not really a surprise. Control is difficult to measure, so modeling it using primitive statistics is not necessarily accurate. From what I have shown here, there is some slight evidence that an increase in fastball velocity does not necessarily make a pitcher’s walk percentage worse; the velocity coefficient is negative in both models. There is also some evidence that player age matters for the change in walk rate among pitchers who already have above-average command, suggesting that increasing velocity at a young age for players with premium command does not result in command getting worse.

While this may be disappointing, I think the best we can do to answer this question is to wait and see. Early adopters of the “draft plus control with below-average velocity and develop velocity” model will either win big or end up with a bunch of Quad-A pitchers. These sorts of effects are hard to capture quantitatively with the information we have now, at least publicly. Following pitchers like Reid Detmers, who have made a velocity jump without sacrificing plus command, will tell us more about this question than this study can. Exciting times are ahead of us.

2 Comments
newsensemember
2 years ago

The reason your F statistic is so high is a methodological error. Change in walk pct will always be negatively correlated with the prior year’s walk %. You’re regressing (BB1-BB0) against BB0. To prove it to yourself, create two columns of random numbers on a spreadsheet and produce a third column where each row is the difference between the first two entries in the row (c1=b1-a1). To eliminate this effect, your dependent variable should be the current year’s walk rate, not the difference between the current year and the prior year. Alternatively, if you want to use the year-to-year change as your dependent variable, instead of using the prior year’s walk rate, use the sum of the current year rate and the prior year rate: (BB1+BB0) is uncorrelated with (BB1-BB0).
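The spreadsheet experiment described in this comment can be sketched in a few lines of Python. The column names, seed, and sample size are arbitrary choices for illustration, and the two columns are independent uniform random numbers rather than real walk rates.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)                      # fixed seed for repeatability
a = [rng.random() for _ in range(10_000)]   # stand-in for prior-year rate
b = [rng.random() for _ in range(10_000)]   # stand-in for current-year rate
diff = [bi - ai for ai, bi in zip(a, b)]    # c = b - a
total = [bi + ai for ai, bi in zip(a, b)]   # b + a

r_diff = pearson(a, diff)      # prior value vs. change
r_sum = pearson(total, diff)   # sum vs. change
print(round(r_diff, 2), round(r_sum, 2))
```

For two independent columns with equal variance, the correlation between the first column and the difference is -1/sqrt(2), about -0.71, even though the columns are unrelated. More generally, if both seasons have the same variance and year-to-year correlation rho, the correlation between the prior value and the change is -sqrt((1 - rho) / 2), which is negative whenever rho is below 1. The sum and the difference are uncorrelated when the two variances are equal.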