How Sticky Are Walk Rates After Velocity Changes?

by David Gerth

June 25, 2021

An increasingly popular strategy for drafting pitchers is to take ones with plus control and underwhelming fastballs, the idea being that the club’s player development team can coax out a velocity jump. Intuitively, this makes sense, since it is relatively easy to develop velocity and relatively difficult to improve control. Get a pitcher from control-only to above-average stuff with plus control, and suddenly you have a solid rotation arm out of an org-depth pitcher.

However, skill improvement does not happen in a vacuum, and there are potential side effects to this strategy, namely that control might get worse as velocity increases. In general, these tend to be acceptable tradeoffs in today’s power-oriented offensive environment, but investigating these effects is important in evaluating how valuable this strategy is, which is what I do in this article.

Methodology

The variable we will be testing is the difference in walk percentage (BB%) from one year to the next. I prefer this approach to something more sophisticated like Command+ since I do not want to regress on a metric that is itself derived from a regression. This will be tested using linear regression, with the independent variables being percent change in fastball velocity, age, and prior-season walk percentage. For age, we would expect older pitchers’ skills to stay roughly the same, whereas younger pitchers might be more susceptible to an increase in velocity leading to an increase in walks.
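As a rough illustration of how these variables could be assembled, here is a minimal Python sketch that pairs each pitcher-season with the same pitcher's previous season. The record type, field names, and sample numbers are all hypothetical, and using the pitcher's age in the later season is an assumption; the article does not say which season's age was used.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    # Hypothetical record layout; not the author's actual data schema.
    player: str
    season: int
    age: int
    bb_pct: float   # walk percentage, e.g. 8.5 means 8.5%
    fb_velo: float  # average fastball velocity in mph


def build_rows(seasons):
    """Pair each season with the same pitcher's prior season."""
    by_key = {(s.player, s.season): s for s in seasons}
    rows = []
    for s in seasons:
        prev = by_key.get((s.player, s.season - 1))
        if prev is None:
            continue  # no consecutive prior season, so no year-to-year change
        rows.append({
            "player": s.player,
            "season": s.season,
            # Dependent variable: change in BB%, in percentage points.
            "diff_bb_pct": s.bb_pct - prev.bb_pct,
            # Percent change in fastball velocity.
            "change_fb_velo": 100.0 * (s.fb_velo - prev.fb_velo) / prev.fb_velo,
            "prior_bb_pct": prev.bb_pct,
            "player_age": s.age,  # assumption: age in the later season
        })
    return rows


# Made-up sample data, for illustration only.
example = [
    PitcherSeason("Pitcher A", 2018, 25, 6.0, 92.0),
    PitcherSeason("Pitcher A", 2019, 26, 7.0, 93.84),
    PitcherSeason("Pitcher B", 2019, 30, 9.0, 95.0),
]
rows = build_rows(example)
```

Only Pitcher A yields a row here, since Pitcher B has no 2018 season to compare against.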

An important note is that this dataset is biased. Unfortunately, we do not have minor league pitch velocity data, so we cannot include players who had a big velocity jump but did not end up making the major leagues. Our dataset comprises pitchers who are already good enough to make the majors, so we would expect that if they add velocity, their BB% stays close to unchanged since they are already so talented. This limits how well the study applies to the amateur draft and minor league baseball, though if age is a significant factor in the regression, then I think it is safe to assume some of the results hold for younger and less polished players.

The data covers 2015-2019. I did not want to include the COVID year or this current year. This is only five seasons, but going back even further would not help, since we would be including a different era of pitching instruction.

Results

Running this linear regression, we get an F statistic of 90 with a corresponding p value of less than 2.2 X 10^-16, indicating that the model is statistically significant. The R Squared is only .1262 though, which is low. The output table is shown below:

Linear Regression Output Table

Variable	Coefficient	Std Error	T Value	P(>|t|)
Intercept	5.68307	.59275	9.588	< 2 X 10^-16
Change_FB_Velo	-.21481	.05462	-3.933	8.7 X 10^-5
Prior_BB_Per	-.52674	.02243	-23.482	< 2 X 10^-16
Player_Age	-.03137	.01811	-1.732	.0834

From the table, we can see that change in fastball velocity and prior walk percentage are significant predictors of the one-year difference in walk percentage. Age was not significant at the 5% level (p = .0834), which makes it unclear whether we can apply these results to minor leaguers.
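To make the coefficients concrete, here is a minimal sketch that plugs the point estimates from the table into the fitted equation. The function name is hypothetical, and the units are assumptions: BB% in percentage points and velocity change in percent, as described in the methodology. It ignores the standard errors entirely, so treat the outputs as illustrations rather than forecasts.

```python
# Point estimates copied from the regression output table.
INTERCEPT = 5.68307
COEF_CHANGE_FB_VELO = -0.21481
COEF_PRIOR_BB_PER = -0.52674
COEF_PLAYER_AGE = -0.03137


def predict_bb_change(change_fb_velo, prior_bb_per, player_age):
    """Predicted one-year change in BB% (hypothetical helper, assumed units)."""
    return (INTERCEPT
            + COEF_CHANGE_FB_VELO * change_fb_velo
            + COEF_PRIOR_BB_PER * prior_bb_per
            + COEF_PLAYER_AGE * player_age)


# A 25-year-old who adds 1% to his fastball velocity, at two prior walk rates.
low_walk = predict_bb_change(1.0, 7.0, 25)    # roughly +1.0 points
high_walk = predict_bb_change(1.0, 10.0, 25)  # roughly -0.6 points

# Effect of the velocity term alone: the same pitcher with and without the gain.
velo_effect = predict_bb_change(1.0, 7.0, 25) - predict_bb_change(0.0, 7.0, 25)
```

The two example pitchers differ only in prior walk rate, and that term dominates the prediction: the low-walk pitcher is projected to walk more and the high-walk pitcher fewer. The velocity term by itself shifts the prediction by about -0.21 points per 1% gain.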

This does not quite answer the question of whether increasing the velocity of pitchers who already have plus control affects their control at all. To get at that, I fit a second model restricted to above-average control pitchers (BB% below 7.5% in the prior year), using player age and change in fastball velocity, along with the prior walk percentage shown in the table below, to model difference in walk percentage. Here all of the inputs in the model are significant, including age, which is interesting. Additionally, the F statistic has a statistically significant p value, but the R Squared of this model is only .03864, so it is hard to make much of these results.

Linear Regression Output Table (Control Pitchers)

Variable	Coefficient	Std Error	T Value	P(>|t|)
Intercept	5.50138	.94504	5.821	8.8 X 10^-9
Change_FB_Velo	-.16946	.07989	-2.121	.0342
Prior_BB_Per	-.40993	.07934	-5.166	3.1 X 10^-7
Player_Age	-.05254	.02591	-2.028	.0430
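Both models pair a significant F statistic with a low R Squared, and the two are not in conflict. For a regression with k predictors, F = (R²/k) / ((1 - R²)/residual df), so a large sample can produce a large F even when little variance is explained. The sketch below applies that identity to the first model's reported figures (F of 90, R Squared of .1262, three predictors). It assumes the reported R Squared is the unadjusted one, and the implied sample size is a back-of-the-envelope inference from a rounded F, not a number stated in the article.

```python
def f_statistic(r_squared, n_predictors, resid_df):
    """Overall F statistic implied by R^2, predictor count, and residual df."""
    return (r_squared / n_predictors) / ((1.0 - r_squared) / resid_df)


def implied_resid_df(f_stat, r_squared, n_predictors):
    """Invert the identity above to recover the residual degrees of freedom."""
    return f_stat * n_predictors * (1.0 - r_squared) / r_squared


# Figures reported for the first model; 90 is a rounded F, so this is approximate.
resid_df = implied_resid_df(90.0, 0.1262, 3)  # on the order of 1,900

# The same R^2 on a sample one-tenth the size gives an F one-tenth as large.
small_sample_f = f_statistic(0.1262, 3, resid_df / 10)
```

In other words, the significance of the F test says the predictors are not useless, while the R Squared says they leave most of the year-to-year variation in walk rate unexplained.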

Conclusion

To summarize, predicting walk rates is difficult, which is not really a surprise. Control is difficult to measure, so modeling it using primitive statistics is not necessarily accurate. From what I have shown here, there is some slight evidence that an increase in fastball velocity does not necessarily make a pitcher’s walk percentage worse. There is also some evidence that player age matters to the change in walk rate for pitchers who already have above-average command, which suggests that increasing velocity at a young age for pitchers with premium command does not result in command getting worse.

While this may be disappointing, I think the best we can do to answer this question is to wait and see. Early adopters of the “draft plus control with below-average velocity and develop velocity” model will either win big or end up with a bunch of Quad-A pitchers. These sorts of effects are hard to capture quantitatively with the information we have now, at least publicly. Following pitchers like Reid Detmers, who have made a velocity jump without sacrificing plus command, will be more informative for answering this question than this study. Exciting times are ahead of us.

2 Comments

newsense (Member since 2020)

4 years ago

The reason your F statistic is so high is a methodological error. Change in walk pct will always be negatively correlated with the prior year’s walk %. You’re regressing (BB1-BB0) against BB0. To prove it to yourself, create two columns of random numbers on a spreadsheet and produce a third column where each row is the difference between the first two entries in the row (c1=b1-a1). To eliminate this effect, your dependent variable should be the current year’s walk rate, not the difference between the current year and the prior year. Alternatively, if you want to use the year-to-year change as your dependent variable, instead of using the prior year’s walk rate, use the sum of the current year rate and the prior year rate: (BB1+BB0) is uncorrelated with (BB1-BB0).
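The spreadsheet experiment described above can be sketched in a few lines of Python. This is a simulation on made-up uniform random numbers, not walk-rate data, and the helper name is hypothetical. For two independent columns with equal variance, theory gives a correlation of -1/sqrt(2), about -0.71, between the first column and the difference, and a correlation of zero between the sum and the difference. The zero result depends on the two columns having the same variance.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows = 20000
a = [rng.random() for _ in range(n_rows)]  # stand-in for BB0
b = [rng.random() for _ in range(n_rows)]  # stand-in for BB1, independent of a

diff = [y - x for x, y in zip(a, b)]   # c1 = b1 - a1
total = [x + y for x, y in zip(a, b)]  # BB1 + BB0

# Strongly negative even though a and b are unrelated by construction.
r_prior_vs_diff = pearson(a, diff)
# Close to zero, because a and b have equal variance here.
r_sum_vs_diff = pearson(total, diff)
```

The first correlation comes out strongly negative purely because BB0 appears on both sides of the regression, which is the effect being described.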

David Gerth (Member since 2022)

Reply to newsense

This is really helpful, thank you. I have attached corrected code at the bottom. Thankfully, the results are still roughly the same, though with lower F values and R^2.

https://github.com/dgerth5/mlb_draft_walk_article/blob/main/fangraphs%20bb%20study.R
