# Reverse Engineering Swing Mechanics from Statcast Data

There’s no question that Statcast has revolutionized the way we think about hitting. Now in year three of the Statcast era, everyone from players to stat-heads to the average fan is talking about exit velocities and launch angles. But what can a player do to improve both their exit velocity and launch angle? It all comes down to the mechanics of the swing.

The next great revolution in baseball is leveraging data about swing mechanics to optimize exit velocities and launch angles. It’s a revolution that has already begun. Using technologies developed by companies like Zepp, Blast Motion, and Diamond Kinetics, players and coaches can now get detailed analyses of every swing during practice. Teams are already starting to integrate these swing analyses into their player-development programs. However, none of these sensors are currently being used during MLB games.

It’s only a matter of time before MLB starts tracking swing data during games, but until then we can use Statcast data and a little physics to reverse engineer the mechanics of the swing. A couple of weeks ago, Eno Sarris and Andrew Perpetua wrote some great articles about the importance of making contact out in front of the plate and how we can infer the contact point from Statcast data. Other than contact point, what are the other important characteristics of a swing? Well, let’s look at Eno’s favorite graphic, from the time Zepp analyzed his swing:

It all comes down to swing speed, attack angle, and timing! The time to impact is probably impossible to get from the Statcast data, so let’s focus on the two remaining metrics: swing speed and attack angle.

# Swing speed

Statcast doesn’t measure swing speed directly, but nonetheless reports an estimated swing speed, computed using an algorithm with all the transparency of a black box. In fact, it’s so secretive that estimated swing speeds have all but disappeared from Baseball Savant in recent weeks. Just to find the data, I had to dig up a couple of the saved searches from Alex Chamberlain’s article from a few weeks ago on that topic. Here is the leaderboard of the fastest average estimated swing speeds as reported in that article:

Player | Year | AB | Estimated Swing Speed (MPH) |
---|---|---|---|
Giancarlo Stanton | 2015 | 437 | 66.5 |
Aaron Judge | 2017 | 406 | 66.1 |
Nelson Cruz | 2016 | 325 | 65.5 |
Giancarlo Stanton | 2016 | 192 | 64.8 |
Miguel Cabrera | 2016 | 342 | 64.8 |

Eno swings like Giancarlo Stanton!

Now, I don’t want to shatter anyone’s dreams of blasting a home run off of a Major League pitcher, but something is clearly off about the data. It turns out that not all reported bat speeds are equal. Physics tells us that as the bat rotates, the barrel (the end) of the bat moves the fastest and that the bat speed decreases in an approximately linear fashion as we move toward the hands. According to Patrick Cherveny, the lead biomechanist for Blast Motion, the official swing sensor of MLB, measuring the barrel speed is essentially meaningless:

“We see some swing speeds where people claim that you get into the 90s. That would make sense if it’s at the end of the bat, but if you hit it at the end of the bat, it’s not going to travel as far because some of the energy is lost in the bat’s vibration. So that kind of a swing speed is essentially ‘false.’ Swing speed is dependent on where you’re measuring on the bat. In order to maximize quality of contact, the best hitters want to hit the ball in the “sweet spot” of the bat.”

Measuring the speed of the bat at the sweet spot, a two-inch-long area whose center is located six inches from the barrel of the bat, Blast Motion reports that MLB players swing the bat between 65 and 85 MPH. Zepp, on the other hand, reports the barrel speed, which accounts for its elevated values. Still, none of the swing-tracking devices on the market report swing speeds as low as those estimated by Statcast.

Let’s see if we can uncover more information about the black-box algorithm used by Statcast to estimate swing speeds. A quick linear regression between average estimated swing speed and average exit velocity for all batters with at least 100 batted ball events (BBE) in a season from 2015-2017 yields an R^{2} of 0.99. Wow! Statcast estimates swing speeds almost entirely from exit-velocity data. No wonder the names at the top of the list are so obvious.
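For readers who want to repeat this check, here is a minimal sketch of the R^{2} calculation for a one-variable linear regression. The function name and the sample numbers are invented for illustration; they are not the Baseball Savant values behind the 0.99 figure.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination for a least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # For a single predictor, R^2 equals the squared correlation coefficient.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical player-season averages in MPH, made up for this example.
avg_exit_velocity = [84.0, 86.5, 88.0, 90.5, 93.0]
avg_est_swing_speed = [59.1, 60.9, 62.2, 63.8, 65.9]
print(round(r_squared(avg_exit_velocity, avg_est_swing_speed), 3))
```

With real data, each list would hold one entry per player-season with at least 100 BBE.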

Exit velocity, however, isn’t the only velocity measured by Statcast. We also know the speed of the pitch as it is released from the pitcher’s hand. Thinking about the physics, the bat transfers energy and momentum to the oncoming ball at the point where the bat collides with the ball. Thus, any estimation of swing speed based on Statcast’s EV and pitch speed data represents the speed of the bat at the point where it makes contact with the ball. Since hitters want to hit the ball at the sweet spot, swing speeds estimated from Statcast data should fall in approximately the same range as those measured by Blast Motion.

Much of the research on the physics of bat-ball collisions has been conducted by Dr. Alan Nathan, so let’s start with one of his equations:

$$EV = e_A v_{ball} + (1 + e_A) v_{bat}$$

where $EV$ is the exit velocity, $v_{ball}$ is the velocity of the ball before it hits the bat, and $v_{bat}$ is the velocity of the bat. Here $e_A$ is a fudge factor called the collision efficiency, and depends on the COR of the ball, which was at the center of the juiced-ball controversy, the physical properties of the bat, and the point on the bat at which the bat strikes the ball. Thus, assuming all MLB players use a standard ball and bat, $e_A$ can be viewed as a measure of quality of contact. Nathan found that at the sweet spot of a wood bat, $e_A = 0.2$. Using that value of $e_A$ and the release speed and exit velocities from Statcast, we can estimate the bat speed for every ball in play. According to Nathan’s pitch-trajectory calculator, the average pitch slows down by 8.4% from the release point to when it crosses the plate, so we’ll also make that adjustment to the release speed reported by Statcast. Here’s the relationship between our physics-based model for swing speed and the estimated swing speed from Statcast/Baseball Savant:

Look at that! When you get a slope of 1 and an intercept of about 0, you know you’ve hit the nail on the head. This must be the equation that Statcast is using to estimate swing speed. After doing a little digging, it appears that Nathan gave them that exact formula, but assumed that the pitch slows down by 10% by the time it crosses the plate.
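Solving the collision equation above for the bat speed gives v_bat = (EV - e_A · v_ball) / (1 + e_A). The sketch below applies that rearrangement with the sweet-spot value of 0.2 and the 8.4% pitch slowdown quoted in the text. The function and constant names are my own, and this is an illustration of the arithmetic rather than Statcast's actual code.

```python
PITCH_SPEED_LOSS = 0.084  # fraction of release speed lost by the plate (from the text)
SWEET_SPOT_E_A = 0.2      # collision efficiency at the sweet spot of a wood bat


def plate_speed(release_speed_mph, loss=PITCH_SPEED_LOSS):
    """Approximate pitch speed at the plate from the release speed."""
    return release_speed_mph * (1.0 - loss)


def bat_speed(exit_velocity_mph, release_speed_mph,
              e_a=SWEET_SPOT_E_A, loss=PITCH_SPEED_LOSS):
    """Rearranged collision equation: v_bat = (EV - e_A * v_ball) / (1 + e_A)."""
    v_ball = plate_speed(release_speed_mph, loss)
    return (exit_velocity_mph - e_a * v_ball) / (1.0 + e_a)


# A 100 MPH batted ball off a pitch released at 93 MPH.
print(round(bat_speed(100.0, 93.0), 1))  # prints 69.1
```

Passing `loss=0.10` instead reproduces the 10% slowdown assumption mentioned above, which shifts this example by only a few tenths of a MPH.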

The problem with this algorithm is it assumes that the hitter always hits the ball at the sweet spot. Nathan’s paper actually shows that $e_A$ varies linearly as a function of EV, from about -0.1 for the weakest hit balls to 0.21 for the best hit, depending on how far from the barrel the bat collides with the ball. To get a good estimate of swing speed, we’ll need to get a better estimate of $e_A$. Unfortunately, $e_A$ must be computed independently for every hitter due to inherent differences in a hitter’s strength. For instance, when Giancarlo Stanton hits a ball with an EV of 100 MPH, he is making weaker contact than when Billy Hamilton hits a ball 100 MPH.

I calibrated $e_A$ for each hitter with at least 100 BBE in a season by estimating that the average of the top 15 BBE by exit velocity corresponds to $e_A = 0.21$ and the average of the bottom 15 BBE by exit velocity corresponds to $e_A = -0.1$ for each player. Since $e_A$ and EV are related linearly, we can compute $e_A$ from EV for each player. Finally, I will assume that every player uses a standard 34 in., 32 oz. bat. Since Nathan’s study used a 34 in., 31 oz. bat, I subtracted 0.42 MPH from the estimated swing speeds, because every extra ounce reduces that bat speed by about 0.42 MPH. Here’s a look at our new average estimated swing speeds:

We see that swing speed still correlates strongly with exit velocity, but with a much more reasonable R^{2} value of 0.81. Much of the remaining variance is due to the quality of contact, as estimated by $e_A$. The colors here show the soft-hit rates from FanGraphs. We can see not only that slower swing speeds result in more soft contact, but also that the regression line strongly divides hitters based on their soft-contact rates. Hitters above the line tend to make better contact and hit the ball more efficiently than those below the line, given their swing speeds.
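The per-hitter calibration described above can be sketched as follows. The anchor values (0.21 and -0.1), the 15-BBE averages, and the 0.42 MPH bat-weight correction come from the text. The function names, the extrapolation outside the anchors, and the sample hitter are assumptions made for illustration.

```python
from statistics import mean

E_A_BEST, E_A_WORST = 0.21, -0.1   # anchor values quoted in the article
N_ANCHOR = 15                      # BBE averaged at each end of the EV range
BAT_WEIGHT_ADJ_MPH = 0.42          # article's correction for a 32 oz. vs. 31 oz. bat


def calibrate_e_a(exit_velocities):
    """Return a function mapping exit velocity (MPH) to e_A for one hitter-season."""
    evs = sorted(exit_velocities)
    if len(evs) < 2 * N_ANCHOR:
        raise ValueError("need at least 30 batted balls to form both anchors")
    ev_worst = mean(evs[:N_ANCHOR])    # average of the 15 weakest BBE
    ev_best = mean(evs[-N_ANCHOR:])    # average of the 15 hardest BBE
    slope = (E_A_BEST - E_A_WORST) / (ev_best - ev_worst)
    # Straight line through (ev_worst, -0.1) and (ev_best, 0.21); values outside
    # the anchors are extrapolated, which is an assumption of this sketch.
    return lambda ev: E_A_WORST + slope * (ev - ev_worst)


def swing_speed(exit_velocity_mph, plate_speed_mph, e_a):
    """Bat speed at the contact point, with the bat-weight correction applied."""
    return (exit_velocity_mph - e_a * plate_speed_mph) / (1.0 + e_a) - BAT_WEIGHT_ADJ_MPH


# Hypothetical hitter: 51 batted balls with exit velocities of 60, 61, ..., 110 MPH.
e_a_of = calibrate_e_a(range(60, 111))
print(round(e_a_of(85.0), 3))
print(round(swing_speed(100.0, 85.0, e_a_of(100.0)), 1))
```

The article applies this only to hitters with at least 100 BBE in a season; the 30-BBE check here is just the minimum needed for the two anchors not to overlap.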

Knowing the value of $e_A$ also gives us an estimate of where the ball hit the bat in relation to the barrel. Nathan found that $e_A \sim d^2$, where $d$ is the distance from the barrel. Since a quadratic function has no inverse, we’re forced to infer $d$ from our computed values of $e_A$ by assuming a linear relationship between the two variables. Once we know where the ball struck the bat, we can also estimate the barrel speed and hand speed, assuming that those speeds are proportional to distance from the axis of rotation.

All speeds are in MPH.

Year | Contact Min | Contact Avg | Contact Max | Barrel Min | Barrel Avg | Barrel Max | Hands Min | Hands Avg | Hands Max |
---|---|---|---|---|---|---|---|---|---|
2015 | 63.9 | 71.9 | 83.3 | 76.3 | 85.8 | 98.9 | 22.8 | 26.7 | 32.2 |
2016 | 63.7 | 72.2 | 80.8 | 76.2 | 86.2 | 95.5 | 22.9 | 26.8 | 31.0 |
2017 | 63.0 | 71.1 | 78.6 | 75.3 | 84.9 | 93.8 | 22.5 | 26.4 | 30.7 |
Overall | 63.0 | 71.7 | 83.3 | 75.3 | 85.7 | 98.9 | 22.5 | 26.6 | 32.2 |
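The scaling from contact-point speed to barrel and hand speed can be sketched as below, under the stated assumption that speed is proportional to distance from the axis of rotation. The article does not say where that axis sits or where the hands are placed, so `axis_beyond_knob_in` and `hands_from_knob_in` are hypothetical placeholders, not the values behind the table.

```python
def barrel_and_hand_speed(contact_speed_mph, d_from_barrel_in,
                          bat_length_in=34.0,
                          axis_beyond_knob_in=3.0,   # hypothetical axis location
                          hands_from_knob_in=8.0):   # hypothetical hand position
    """Scale the contact-point speed to the barrel end and to the hands.

    Every point on the bat is assumed to move at a speed proportional to its
    distance from a fixed axis of rotation located past the knob.
    """
    r_barrel = bat_length_in + axis_beyond_knob_in    # axis to end of bat
    r_contact = r_barrel - d_from_barrel_in           # axis to contact point
    r_hands = hands_from_knob_in + axis_beyond_knob_in
    if r_contact <= 0:
        raise ValueError("contact point must lie between the axis and the barrel")
    mph_per_inch = contact_speed_mph / r_contact
    return mph_per_inch * r_barrel, mph_per_inch * r_hands


# Contact 6 in. from the end of the bat at 62 MPH.
barrel, hands = barrel_and_hand_speed(62.0, 6.0)
print(barrel, hands)
```

Because the outputs depend directly on the two placeholder distances, the barrel and hand estimates carry more uncertainty than the contact-point speed.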

I have no idea how accurate these estimates are, but they look pretty good! The swing speeds at the point of contact line up nicely with those from Blast Motion (65-85 MPH range and league average of 70 MPH), as do the barrel speeds (Zepp claims 75-95 MPH) and hand speeds (Blast Motion says 23-29 MPH). There’s a lot more uncertainty in the barrel and hand speeds than at the point of contact, because they require additional assumptions about bat size, axis of rotation, and distance from barrel of the point of contact. Even with all of those assumptions, the accuracy probably isn’t much worse than that of the swing-tracking devices on the market today, which claim an uncertainty of about 3-7 MPH for individual swings.

Here are the fastest and slowest average swing speeds in a season during the Statcast era:

Player | Year | BBE | Point of Impact (MPH) | Barrel (MPH) | Hands (MPH) |
---|---|---|---|---|---|
Giancarlo Stanton | 2015 | 187 | 83.3 | 98.9 | 32.2 |
Rickie Weeks Jr. | 2016 | 127 | 80.8 | 95.5 | 29.5 |
Giancarlo Stanton | 2016 | 275 | 80.3 | 95.5 | 31.0 |
Greg Bird | 2015 | 107 | 80.2 | 95.2 | 30.4 |
Gary Sanchez | 2016 | 145 | 80.1 | 95.0 | 29.9 |
Kelby Tomlinson | 2017 | 131 | 63.8 | 76.3 | 24.1 |
Dee Gordon | 2017 | 497 | 63.8 | 76.2 | 23.2 |
Shawn O’Malley | 2016 | 152 | 63.7 | 76.2 | 23.2 |
Mallex Smith | 2017 | 178 | 63.5 | 75.6 | 22.6 |
Billy Hamilton | 2017 | 436 | 63.0 | 75.3 | 22.5 |

At the top of the list we see some well-known sluggers and … Rickie Weeks? Who knew he had such elite bat speed? Unfortunately for him, his average *e _{A}* in 2016 was the lowest of any player in the Statcast era, indicating that he was making a ton of weak contact. Weeks is the quintessential over-swinger, whose impressive bat speed is often nullified by a lack of bat control. That’s completely unsurprising for a player whose 2016 highlight reel features at least one hack that would make even Charlie Brown blush:

I was also going to include a table of all of the fastest individual swings, until it turned into an exercise in how many times I can write Giancarlo Stanton’s name. He has 18 of the top 19 swings by barrel speed, which tops out at 108 MPH.

# Attack Angle

Unlike swing speed, Statcast doesn’t give us an estimate of attack angle. Instead, we’ll again turn to some research done by Dr. Alan Nathan, this time from his 2017 Saberseminar presentation. To better understand the geometry of the bat-ball collision, let’s look at a diagram from his presentation:

The attack angle, or swing plane, is the angle that the bat is moving at when it hits the ball. Drawing a line between the centers of the bat and ball at the time of impact defines a second angle, called the centerline angle. When a hitter swings the bat such that the attack angle lines up with the centerline angle, he generates his maximum exit velocity and launches the ball at an angle equal to that of the attack angle.

Armed with this information, we can compute the attack angle by looking at the launch angles when a hitter produces his highest exit velocities. Nathan does this by plotting EV against LA for each hitter (below is his figure for Khris Davis’s BBE, whose attack angle is about 20°). He then divides the data, presumably binning the data by launch angle and then pulling out the top few BBE by exit velocity in each bin (red points). Once the data has been divided, a parabola can be fit to the red points, such that the attack angle corresponds to the peak of the parabola.

I found that the computed attack angle is fairly sensitive to the number of bins and number of data points in each bin, so this method is far from perfect. Ultimately, I chose the number of bins based on each player’s standard deviation in launch angle (~3° bins), and selected the top 20% of data points by exit velocity. I then computed a second version of attack angle by averaging the launch angles of the top 15 BBE by exit velocity (just as I did when computing swing speeds). Finally, I averaged the values from the two different methods to get a final value for the attack angle.
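Here is a rough sketch of the two estimates described above and their average. It simplifies in two ways that are my own assumptions: the bins have a fixed 3° width rather than being derived from each player's launch-angle standard deviation, and the top 20% by exit velocity is taken within each bin. The function names are illustrative, and the parabola is fit with a plain least-squares solve.

```python
import math
from statistics import mean


def fit_parabola(xs, ys):
    """Least-squares coefficients (a, b, c) of y = a*x^2 + b*x + c."""
    s = [sum(x ** k for x in xs) for k in range(5)]                  # sums of x^0..x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # sums of y*x^0..x^2
    # Normal equations for the unknowns (c, b, a), as an augmented matrix.
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    for i in range(3):  # Gaussian elimination with partial pivoting
        p = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, 3):
            f = m[r][i] / m[i][i]
            for col in range(i, 4):
                m[r][col] -= f * m[i][col]
    sol = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        sol[i] = (m[i][3] - sum(m[i][j] * sol[j] for j in range(i + 1, 3))) / m[i][i]
    c, b, a = sol
    return a, b, c


def parabola_attack_angle(bbe, bin_width=3.0, top_frac=0.2):
    """bbe is a list of (launch_angle_deg, exit_velocity_mph) pairs."""
    bins = {}
    for la, ev in bbe:
        bins.setdefault(math.floor(la / bin_width), []).append((la, ev))
    selected = []
    for pts in bins.values():
        pts.sort(key=lambda p: p[1], reverse=True)        # hardest-hit first
        keep = max(1, math.ceil(top_frac * len(pts)))     # top 20% of each bin
        selected.extend(pts[:keep])
    a, b, _ = fit_parabola([p[0] for p in selected], [p[1] for p in selected])
    if a >= 0:
        raise ValueError("fitted parabola has no peak")
    return -b / (2.0 * a)                                 # launch angle at the peak


def top_ev_attack_angle(bbe, n=15):
    """Average launch angle of the n hardest-hit balls."""
    best = sorted(bbe, key=lambda p: p[1], reverse=True)[:n]
    return mean(la for la, _ in best)


def preferred_attack_angle(bbe):
    """Average of the parabola-peak and top-15 estimates."""
    return 0.5 * (parabola_attack_angle(bbe) + top_ev_attack_angle(bbe))


# Synthetic hitter whose exit velocity peaks at a 15° launch angle.
demo = [(la, 105.0 - 0.02 * (la - 15) ** 2) for la in range(-30, 61)]
print(round(preferred_attack_angle(demo), 1))
```

On real data the result moves noticeably with `bin_width` and `top_frac`, which is the sensitivity noted above.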

This method of computing the attack angle gives us what I’ll call the “preferred” attack angle. Batters change their attack angles slightly based on pitch location, but the preferred attack angle represents the plane of a hitter’s natural swing when he gets a good pitch to hit (à la batting practice).

A lot of digital ink has been spilled over the last few years trying to make sense of how to evaluate hitters using launch angles. While a ton of progress has been made, we still have a long way to go. Who knew launch angles could be so complicated? Here, we see a relatively weak correlation between attack angle and launch angle, because launch angle is also strongly dependent on a hitter’s aim, timing, and bat speed. While we don’t have any direct measurements of aim or timing, we can see from the color scale that players with flatter swings (lower attack angles) have more margin for error when it comes to timing, and therefore tend to have higher contact rates than players with uppercut swings (larger attack angles).

Year | Launch Angle (°) | Attack Angle (°) |
---|---|---|
2015 | 10.5 | 11.4 |
2016 | 11.1 | 12.0 |
2017 | 11.4 | 13.8 |
Overall | 11.0 | 12.4 |

The fly-ball revolution is even more evident when looking at league-wide attack angles instead of launch angles. There was a lot of buzz before this season about players reworking their swings to increase their launch angle. Not all of them were successful though, as the average launch angle only increased by 0.3°, despite a nearly 2° jump in attack angle.

Here are the highest and lowest preferred attack angles in a season during the Statcast era:

Player | Year | BBE | Attack Angle (°) |
---|---|---|---|
Brian Dozier | 2017 | 433 | 29.2 |
Mike Napoli | 2017 | 268 | 29.0 |
Ryan Schimpf | 2016 | 351 | 27.6 |
Ryan Howard | 2016 | 220 | 25.7 |
Chris Davis | 2015 | 265 | 25.1 |
Jarrod Dyson | 2016 | 269 | -0.1 |
Jason Bourgeois | 2015 | 164 | -0.2 |
Justin Morneau | 2015 | 143 | -1.4 |
Billy Burns | 2016 | 279 | -1.7 |
Jonathan Herrera | 2015 | 107 | -4.5 |

It’s good confirmation to see Ryan Schimpf’s name on this list, though it’s interesting that his attack angle isn’t the extreme outlier that his GB/FB ratio and LA are. An analysis of attack angle may also finally give us an answer to why Brian Dozier’s home runs have gone missing this season. His 2017 batting line is almost identical to that of 2016, except his ISO (and HRs) have plummeted. The biggest difference is that his attack angle has skyrocketed from 20° to 29°. We know that the optimal LA for hitting home runs is about 24°, so he’s probably getting too much loft on his fly balls this year. All of these guys at the top of the list would probably benefit from flattening out their swings a bit. Interestingly, Joey Gallo, everyone’s other favorite extreme fly-ball hitter, has an attack angle right at 24° this year. He has built the perfect swing for his batted-ball profile, which explains why he is among the league leaders in HR/FB ratio.

This turned out to be an extremely lengthy primer on swing mechanics, but there are plenty of questions that can be tackled with estimates of swing metrics. For instance, can we use swing speed and attack angle to predict future exit velocities and launch angles? How much do hitters reduce their swing speeds on two-strike counts? How do attack angles change with pitch location? But, alas, those questions will have to be answered at a later time.

A complete list of swing speeds and attack angles for players with at least 100 BBE is available here.

Wow this is awesome work. Great analysis!

great article. BTW I think the estimated swing Speeds are gone because they are wrong. I sent the statcast guys on twitter a couple months ago that I have hit 80+ on my zepp and I’m a fat rec Player. zepp uses stanton for Marketing and they have his swing like mid 90s or so. I assume they reverse engeneered the batspeeds by using a collision coeffitient between bat and ball but they were probably assuming flush perfect contact in the sweetspot. however a lot of balls have back, top or sidespin or are hit off the sweet spot and…

sorry about that post, I replied before finishing the article, you adressed on most of what I wrote, so it really is redundant and should be deleted:).

BTW I think the average LA and attack angle might usually be quite Close together. the hardest EVs usually occur around 10 degrees and that is also the average LA of the league give or take a bit. some Players talk about hitting the top of the ball or power air ball hitters like murphy talk about the bottom but I think in reality at 90 plus it is pretty random where contact is made and the LAs are distributed more or less around the attack angle in Kind of a bell curve. now I could see that high pitches…

You’re probably right that LAs are approximately normally distributed around the attack angle of the swing, so the average attack and LAs should be about the same. What I’m reporting here isn’t the average attack angle, but the preferred attack angle, because it is calculated primarily from the best hit balls. Players intentionally make adjustments to their swing planes to get to high and low pitches, so the average attack angle should be slightly different. Comparing the average LA and preferred attack angle may actually be the best way to figure out which players change their swing planes the most.…

Hire this man!

This is awesome!

technically not swing mechanics, but you could use average spin on batted ball as a proxy for aim (players with a greater offset will generate backspin, which might not be on baseball savant but you can infer it based on EV launch angle and distance) also you could use horizontal stray angle as a proxy for timing, pulled balls a player is early on etc.

I really like your idea about using horizontal spray angle as a proxy for timing. It’s not completely straightforward, though, because spray angle depends on other factors like side spin. That will have to be the next analysis that I do.

Using spin as a proxy for aim is also tricky, because Statcast only reports spin rates of pitches. I’ll have to look into calculating it from EV, LA, etc. I imagine that it’s a bit complicated because you have to consider the spin rate and direction of both the pitch and batted ball.

I have to confess that I don’t understand how you created the plot of exit velocity vs.swing speed. Does each point represent a single BBE? Is the scatter about the regression line due to different impact locations along the bat? If so, how do you take into account that the bat speed is different at different locations? Perhaps you could clarify.

I’m happy to clarify. Each point on the plot of EV vs. swing speed represents a season long average for a player with at least 100 BBE. The scatter about the regression line is due to the fact that each player has a different season long average impact location along the bat. Therefore, this average swing speed is really the average swing speed at the average point of impact, which is different for every player. Even though the bat speed is different at different locations, I would argue that the swing speed at the point of impact is most important,…

Sorry, I’m still not getting it. The formula that you quote from me involving eA *only* applies for the components of ball and bat velocities normal (i.e., perpendicular) to the surface of the bat. If we ignore any tangential components (which may be due to either non-zero offset or timing), then the formula applies. For the moment, let me therefore ignore all tangential components. We know that eA varies along the length of the bat. We also know that the speed of the bat varies along the length of the bat. If we knew the impact location, we would know…

Thank you for your comments.

I will send you an email with my reply.

BTW does Quality of contact consider cutting the ball or just off Center hits vs sweet spot hits? I assume the 0.1 as not so good contact would consider not hitting the sweetspot but not “getting a part of the ball”, right?

Yes, quality of contact in this case considers cutting the ball (missing high or low) as well as hitting the ball too close to the barrel or hands. This definition of ea differs from Alan Nathan’s, which I believe only considers different locations along the axis of the bat.

The way I’ve calculated it, we can’t determine whether a value of 0.1 was caused by hitting the ball too close to the barrel or hands, or getting too far over or under the ball.

Very good article. There is one very important aspect on launch/attack angles that is being missed – as players increase vertical bat angle to adjust to lower pitches, there is loft coming from timing. For high pitches, loft is determined by primarily by swing angle with timing determining direction but for low pitches, timing determines a significant portion of overall loft. So there are three loft factors – swing angle, timing and ball contact point. One other point is the best, most consistent hitters are not hitting with significant offset or spin. More on both issues here if you are…

You’re absolutely right that timing is extremely important. I only mentioned it in passing, because it isn’t easy to get from Statcast data. My first thought about measuring timing was the same as yours: it should be reflected in the standard deviation in LA. However, the standard deviation would also increase if a hitter changes his swing plane frequently. Changes in swing plane can be analyzed by looking at the attack angles at different locations in the zone, but that’s still a work in progress. Additionally, problems with timing are magnified by hitters with larger attack angles. If a player…

But I suspect that the best hitters are adjusting their swing angle based on pitch location. It’s about the total amount of loft in the swing, not necessarily where it comes from. The adjustment of the loft components in the best hitters is definitely occurring because the standard deviation for low vs. high LA is not significantly different. Thus, the hitters – at least in terms of “muscle memory knowledge” are well out in front of everyone. No surprise but mind boggling how this is being done automatically as they pick up the pitch location.

My concern is that the Statcast data doesn’t allow us to differentiate an adjustment in swing plane from a mistake in timing just from looking at LAs. For example, consider two players that each take 3 swings which are measured by Statcast. Player A swings with an attack angle of 20 degrees on each of the 3 swings, but the resulting LAs are 10 degrees, 20 degrees and 30 degrees. Player B on the other hand adjusts his swing plane for each pitch. His LAs are also 10, 20, and 30 degrees, but he adjusted his swing each time such…

Understood. I think the main problem here is that you can’t measure the true attack angle with only one factor – swing angle. When it comes to the exact path of the bat in 3D space, both timing and swing angle impact the path/attack angle so I’m not sure how you define true attack angle without it. You are correct in that it is complicated. In order to truly separate the two factors you would need multiple high speed cameras from several different angles to isolate the loft coming from each separate axis. It can be done but currently, only…

You’re right, it gets a lot more complicated when we look at it in 3D. I know Blast Motion has some type of motion capture setup that they use to calibrate their swing tracking devices. Hopefully they get those set up in MLB parks soon, but until then the best we can do is make some rough estimations from the data we do have.

Great stuff! I’m also glad to know that Estimated Swing Speed didn’t just disappear for me on the Savant site recently. I thought I was going crazy. Hopefully they bring it back.