PitchingBot: Using Machine Learning To Understand What Makes a Good Pitch

People have long tried to understand what makes a good pitch. With advances in pitch tracking technology and computing power, we can begin to use large amounts of data to answer this question more definitively. I’ve created a model called PitchingBot which uses machine learning to try to find what makes a good pitch.

Machine learning describes a general class of algorithms that are very flexible and “learn” patterns from large amounts of data. This means I don’t have to tell PitchingBot what I think a good pitch is, but instead I can give it a load of pitches (and the results of those pitches) and it will train itself to recognize a pitch that gives good results.

I intend to investigate a couple of key questions:

Does PitchingBot reach the same conclusions as conventional wisdom about what makes a good pitch?

Naively, I would expect a good pitch to have the following qualities: high velocity, plenty of movement, and good location in the corner of the strike zone. I will look at whether these are true for PitchingBot and how the definition of a good pitch changes with the ball/strike count.

Can we meaningfully compare and evaluate pitchers using PitchingBot?

Are the pitchers who are best according to PitchingBot those who get the best results? PitchingBot isn’t very useful if it does not agree with real pitcher performance.

I’m not the first to try to answer these questions: the Quality Of Pitch (QOP) statistic was developed a few years ago with similar aims.

Building PitchingBot

This section is a little technical, so feel free to skip straight to the results if you aren’t interested.

The first, hardest, and most important part of making a machine learning model is the gathering and manipulation of data. This can make or break a model before you even start.

The input data to PitchingBot was taken from all recorded pitches in the Statcast era (2015-). The input pitch characteristics included: pitch type, speed, spin rate, position over the plate, horizontal and vertical movement, ball/strike count, and batter and pitcher handedness. This gave me around 3.5 million pitches on which to train PitchingBot.

I aimed to predict the change in expected runs from before the pitch was thrown to after it. The number of expected runs changes with the count, and also when an event happens, such as an out, a hit, or a walk. I calculated the expected runs by count myself, arriving at values similar to the table found here, and took the expected runs by event from here.
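To make the target concrete, here is a minimal sketch of how a per-pitch run value could be computed from a count table and an event table. The function name, the outcome labels, and every number in both tables are hypothetical placeholders chosen for illustration; they are not the values used to train PitchingBot.

```python
# Placeholder expected runs by (balls, strikes), relative to an 0-0 count.
# Illustrative numbers only, NOT the table used for PitchingBot.
COUNT_RUNS = {
    (0, 0): 0.00, (1, 0): 0.03, (2, 0): 0.09, (3, 0): 0.19,
    (0, 1): -0.04, (1, 1): -0.02, (2, 1): 0.03, (3, 1): 0.12,
    (0, 2): -0.09, (1, 2): -0.07, (2, 2): -0.03, (3, 2): 0.05,
}

# Placeholder expected runs by plate-appearance-ending event (also illustrative).
EVENT_RUNS = {"walk": 0.30, "strikeout": -0.27, "out": -0.26,
              "single": 0.45, "home_run": 1.40}


def pitch_run_value(balls, strikes, outcome):
    """Expected runs after the pitch minus expected runs before it."""
    before = COUNT_RUNS[(balls, strikes)]
    if outcome == "ball":
        # Ball four ends the plate appearance with a walk.
        after = EVENT_RUNS["walk"] if balls == 3 else COUNT_RUNS[(balls + 1, strikes)]
    elif outcome == "strike":
        # Strike three ends the plate appearance with a strikeout.
        after = EVENT_RUNS["strikeout"] if strikes == 2 else COUNT_RUNS[(balls, strikes + 1)]
    elif outcome == "foul":
        # A foul adds a strike unless the batter already has two.
        after = COUNT_RUNS[(balls, min(strikes + 1, 2))]
    else:
        # Anything else is treated as an event that ends the plate appearance.
        after = EVENT_RUNS[outcome]
    return after - before
```

Under this convention a negative value is good for the pitcher, since it means the pitch lowered the batting team’s expected runs.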

I used a common machine learning method called XGBoost, which builds a model from a series of decision trees. To avoid overfitting, I used cross-validation and adjusted the max depth of the trees and the number of training steps to get the lowest mean squared error without significant differences between the training and test sets.
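The idea of adding trees one at a time while watching the gap between training and test error can be shown with a toy example. The sketch below is not XGBoost and not the PitchingBot code: it is a deliberately tiny gradient booster that fits depth-1 trees (stumps) to squared-error residuals on synthetic one-dimensional data, and it uses a single held-out split rather than full cross-validation. All names and the synthetic data are assumptions made for illustration.

```python
import random


def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest x so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]


def boost(xs, ys, n_rounds, learning_rate=0.3):
    """Fit n_rounds stumps, each one to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    base, lr, stumps = model
    return base + lr * sum(lm if x <= t else rm for t, lm, rm in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic data: a noisy parabola, split into training and held-out sets.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [x * x + random.gauss(0, 0.1) for x in xs]
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

# Compare training and held-out error as the number of boosting rounds grows.
results = {}
for n_rounds in (1, 10, 50):
    model = boost(train_x, train_y, n_rounds)
    results[n_rounds] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(n_rounds, results[n_rounds])
```

Training error can only fall as rounds are added, so the number to watch is the held-out error: once it stops improving while training error keeps dropping, the extra rounds are fitting noise.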

I did run into a few problems and areas for potential improvement. For instance, around 10% of pitches had incomplete tracking data and had to be thrown away. I also didn’t include the pitch release point, which could be a useful input. The model is given no information about other pitches in the same at-bat, whereas an ideal model would include the effect of sequencing and pitcher arsenal on pitch quality. Finally, I didn’t explore a wide range of machine learning models and model parameters; doing so could tune PitchingBot for better performance.

StuffBot and CommandBot

In addition to PitchingBot, I created two other models to focus on some very different aspects of pitching: StuffBot and CommandBot.

Stuff is the raw quality of the pitches thrown, regardless of where they end up. A pitcher with good stuff will have high velocity, high spin rates, and lots of movement on their pitches, making them hard to hit. To train StuffBot, I removed the plate position of the pitch from the input variables, so the model has to predict how good a pitch is without knowing where it ends up.

Command is the ability to throw the ball in a precise location. A pitcher with good command will dot the corners of the strike zone for strikes and throw balls in positions which encourage batters to chase out of the zone. To train CommandBot, I removed the speed, spin rate, and movement of the pitch from the input variables.
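The relationship between the three models comes down to which input columns each one is allowed to see. A minimal sketch, assuming hypothetical column names (the article lists the inputs in words only, so every identifier below is invented for illustration):

```python
# Hypothetical column names for the inputs described in the article.
ALL_FEATURES = [
    "pitch_type", "speed", "spin_rate",
    "plate_x", "plate_z",          # position over the plate
    "move_x", "move_z",            # horizontal and vertical movement
    "balls", "strikes",
    "batter_hand", "pitcher_hand",
]

LOCATION_FEATURES = {"plate_x", "plate_z"}
STUFF_FEATURES = {"speed", "spin_rate", "move_x", "move_z"}


def features_for(model_name):
    """Return the input columns each model is trained on."""
    if model_name == "PitchingBot":
        dropped = set()                # sees everything
    elif model_name == "StuffBot":
        dropped = LOCATION_FEATURES    # blind to where the pitch ends up
    elif model_name == "CommandBot":
        dropped = STUFF_FEATURES       # blind to speed, spin, and movement
    else:
        raise ValueError(f"unknown model: {model_name}")
    return [f for f in ALL_FEATURES if f not in dropped]
```

Both reduced models still see the pitch type, count, and handedness, which matches the description above: only location is hidden from StuffBot, and only speed, spin rate, and movement are hidden from CommandBot.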

Results: What Makes a Good Pitch

The most important feature of a good pitch is its location: throwing it in the appropriate place for the pitch type and the count. Let’s look at some of PitchingBot’s predictions. In all of the following images, the expected runs above average are shown as a function of the position of the pitch over the plate. Blue means PitchingBot thinks it’s a good pitch, red means PitchingBot disapproves.

For a full count four-seam fastball, it’s best to hit the corners of the zone and it’s better to be high than low if you throw just outside the zone.

Meanwhile for a 3-0 fastball, all you have to do is hit the zone. PitchingBot doesn’t care if it’s middle-middle or not.

The picture changes dramatically in an 0-2 count. When throwing an 0-2 slider, PitchingBot thinks it is ideal to place it down and breaking away from the strike zone — just don’t hang it!

It looks like the way PitchingBot evaluates pitches is consistent with conventional wisdom about what is a good pitch in different situations. It’s learned the position of the strike zone well, how to get hitters to chase when they’re behind in the count, and to avoid the center of the zone.

As for stuff, I won’t bore you with lots of images here, but I’ll summarize what StuffBot likes the most from each pitch type:

What StuffBot “Likes” For Each Pitch Type
Pitch Type         | StuffBot Likes
Four-Seam Fastball | High velocity, lots of vertical movement, high spin rate
Sinker             | High spin rate, lots of armside and downward movement
Cut Fastball       | High spin rate, more important to have downward movement than horizontal movement
Curveball          | High velocity, lots of horizontal movement
Slider             | High spin rate, lots of horizontal movement
Changeup           | High velocity, lots of armside and downward movement

As expected, StuffBot likes pitches which have higher velocity and more movement. An interesting feature is that for sliders and curveballs, StuffBot likes horizontal movement more than vertical movement. Additionally, StuffBot likes cut fastballs and sinkers to have downward vertical movement, to dive below bats which are looking for flat fastballs.

Results: Who Does PitchingBot Like

Looking back at the 2020 season, let’s check who PitchingBot thinks would have performed the best and whether the model is making sensible guesses or not. Before drawing any conclusions, we should check that PitchingBot, StuffBot, and CommandBot are all doing the right job. We can compare each bot to a traditional pitching stat which represents the target measure. Each point on the following graphs represents one pitcher’s season from 2015-19.

We can compare PitchingBot to FIP:

We can compare StuffBot to strikeout rate:

And we can compare CommandBot to walk rate:

We can see that the results from the models are correlated with traditional statistics. With this in mind, let’s look at who PitchingBot, StuffBot, and CommandBot think were 2020’s best performers.
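The article does not say how the strength of these relationships was measured. One common choice is the Pearson correlation coefficient, sketched below in plain Python; the function name is my own, and using Pearson’s r here is an assumption rather than the author’s stated method.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Covariance and spread terms share the same 1/n factor, so it cancels out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

In use, `xs` would hold one model score per pitcher-season (for example a PitchingBot rating) and `ys` the matching traditional stat (for example FIP). Values near 1 or -1 indicate a strong linear relationship and values near 0 indicate none.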

To make this comparison, I took predicted runs saved above average per hundred pitches for each pitcher. First we can compare by pitch type (min. 100 pitches of each type).
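A minimal sketch of this rate stat follows, assuming each pitch carries a predicted run value where lower is better for the pitcher, and assuming “average” means the average over all pitches of the same type. Both assumptions, along with the function name and input layout, are mine; the article does not spell out the exact calculation.

```python
from collections import defaultdict


def runs_saved_per_100(pitches, min_pitches=100):
    """Predicted runs saved above average per 100 pitches, by pitcher and pitch type.

    pitches: iterable of (pitcher, pitch_type, predicted_run_value) tuples.
    """
    by_type = defaultdict(list)      # pitch_type -> every prediction of that type
    by_pitcher = defaultdict(list)   # (pitcher, pitch_type) -> that pitcher's predictions
    for pitcher, pitch_type, value in pitches:
        by_type[pitch_type].append(value)
        by_pitcher[(pitcher, pitch_type)].append(value)

    result = {}
    for (pitcher, pitch_type), values in by_pitcher.items():
        if len(values) < min_pitches:
            continue  # too few pitches of this type to rank
        league_avg = sum(by_type[pitch_type]) / len(by_type[pitch_type])
        pitcher_avg = sum(values) / len(values)
        # Allowing fewer runs than average counts as runs saved.
        result[(pitcher, pitch_type)] = 100 * (league_avg - pitcher_avg)
    return result
```

Pitchers below the minimum are left out of the rankings but still count toward the average for that pitch type, which is one reasonable choice among several.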

Looking a little closer at the data shows that most of the contribution to these rankings comes from pitch command. CommandBot agrees with these rankings much more than StuffBot.

The previous tables show the pitcher performance by rate. As for the pitchers who PitchingBot thinks were the best cumulatively over the 2020 season, it ranked them as Gerrit Cole, Zac Gallen, Jacob deGrom, Trevor Bauer, and Blake Snell, first through fifth. It seems that pitchers who throw the best pitches according to PitchingBot also get some of the best results.


Machine learning seems to be applicable to evaluating the quality of baseball pitches. The results agree with conventional pitching wisdom about where to throw pitches in particular counts, and PitchingBot can identify great pitchers from their pitch quality alone. This model excluded the effects of sequencing and pitcher arsenal; including them could take this approach further.

I’ve created an app to show how the different models rank pitchers for the 2020 regular season, offering the results in more detail for each individual pitch type. In case your favorite pitcher didn’t make any of the lists I’ve shown, the data can be found here and here.

This article is based on a Reddit post which I wrote a couple of months ago that can be found here.

Greg Golden

I’m an MBA student and I’m taking some data science classes. This seems like the grownup version of stuff I was trying to do for assignments last semester. Very cool.


Thanks Cameron. Interesting stuff.