Finding the Indians a Hand for the Pen

It’s no secret that the American League has been incredibly top-heavy this year, with the playoff race having essentially been decided by the All-Star break (insert crying emoji for Angels fans). The Astros, Yankees, and Red Sox have been incredible, ranking first, second, and third respectively in total team WAR. The Mariners have been a pleasant surprise, pairing a breakout season from Mitch Haniger and dominant pitching from Edwin Diaz and James Paxton with a ludicrous record in one-run games (that almost certainly won’t be sustained over the course of a full season). Lurking just outside of the playoff picture are the exciting Oakland A’s, who have an interesting mix of blossoming young stars and seasoned vets, all of whom seem to be clicking at the right time.

Forgotten in all of the competitiveness that defines the AL West and AL East is the mediocrity of the AL Central. Much has been written about how historically bad the division is this year: it houses three rebuilding teams (the Tigers, White Sox, and Royals) and a vastly underperforming, fringy pre-season playoff contender in the Twins. As it currently stands, those four teams all project to finish below .500, with the White Sox and Royals both on pace to lose over 100 games. There is one bright spot, though; this team won 102 games in 2017, lost in extra innings of Game 7 of the World Series in 2016, and is home to a two-time Cy Young Award winner and the best SS/3B combo in MLB. That team is the Cleveland Indians.

(Note: All statistics are current as of 07/02/18)

The Tribe have flown under the radar this year in terms of AL World Series contenders, partly because the Astros/Yankees/Red Sox have been so great, and partly because the Indians… well, haven’t been. Early on in the season, the lineup struggled mightily to score runs, the team couldn’t get reliable starts out of its 5th rotation spot (read: Josh Tomlin), and the bullpen was a revolving door. As many expected, the offense snapped out of its funk, and now ranks 6th in MLB in total team batter WAR with 14.8, thanks to monster first halves from Francisco Lindor (.936 OPS, 153 wRC+, 21 HRs) and Jose Ramirez (1.007 OPS, 169 wRC+, 24 HRs). The rotation is now as formidable as ever, ranking 2nd in total SP WAR with 11.0 after Tomlin was replaced by a combination of a serviceable Adam Plutko and top prospect Shane Bieber. Not to mention, Trevor Bauer and Mike Clevinger have taken massive steps forward, Corey Kluber has maintained his ace status, and 2017 4th-place Cy Young finisher Carlos Carrasco hasn’t even found his stride yet. The team would be positioned as well as anyone to make a deep postseason run… if their bullpen weren’t still an atrocity. Take a look at this chart from Sunday’s game against Oakland.

[Chart: Indians team statistics, from Sunday’s game against Oakland]

Going down the list, everything looks pretty good until… WOW. A bullpen ERA over 5, up almost 2.5 runs from a season ago. The bullpen might not be quite as bad as it appears on the surface, as it is running an abnormally low LOB% as well as an unusually high HR/FB rate, which explains why its xFIP is 4.11 compared to its 5.17 ERA and 4.64 FIP. The Tribe’s relief corps ranks nearly dead last in total team RP WAR, and its futility, combined with the excellence of the starting pitching, has led Terry Francona to use his pen for only 222.2 innings this year, good for dead last in MLB. From 2016-2017, the Indians had the 4th best bullpen in baseball by WAR, including the best bullpen ERA and FIP, and the second best xFIP. What happened to this once formidable group? For one, the team lost bullpen stalwart Bryan Shaw via free agency, as well as mid-season acquisition Joe Smith. From 2013-2017, no relief pitcher threw more innings than Shaw, who carried a solid 3.11 ERA over those 5 seasons. In addition, superstar Andrew Miller has thrown just 14.1 innings, having been on the shelf for 52 days this season across two separate DL stints. Because of this, Francona has had to rely on closer Cody Allen much more than he would like to, with the results being slightly underwhelming. Allen’s ERA has jumped from 2.94 in 2017 to 3.62 in 2018, all while running an abnormally low BABIP of .233. This seems to suggest that there’s more regression on the way, which doesn’t bode well for a guy whose K% has fallen almost three percent, and whose out-of-zone swing rate has decreased while his out-of-zone contact rate (and overall contact rate) has increased.

Take a look at the motley crew the Tribe has run out this year (no bullpen carts yet in Cleveland, unfortunately).

[Chart: Indians relievers used in 2018]

When two of the top three guys in innings pitched have combined for -0.5 WAR, you know things aren’t good. Oliver Perez and Neil Ramirez have actually been pretty solid over the last month or so, but those aren’t the guys you want coming in to face Aaron Judge, J.D. Martinez, or Carlos Correa in October. With both Cody Allen and Andrew Miller likely to be in different uniforms next season, it would greatly benefit the Tribe to capitalize on the team they have now by fortifying their biggest weak spot with pitchers controllable beyond this season. MLB Trade Rumors confirmed this idea, writing, “The Indians hope to acquire at least one quality reliever who’s under control past this season, per [Buster] Olney.”

So, who fits that mold? Obviously, we can rule out a few teams right off the bat:

  • Astros, Red Sox, Yankees, Mariners, Nationals, Braves, Phillies, Cubs, Cardinals, Brewers, Dodgers, Giants, Diamondbacks, Rockies, A’s, Twins, Tigers, Royals, White Sox

All of the teams on that list are firmly entrenched in playoff races, or have invested enough in their rosters that it would be tough to justify tearing them down right now. The A’s are interesting, given their penchant for selling off productive players at peak value. I think their recent success and competitive timeline will lead them to hold onto their stud closer Blake Treinen, though. I also added the other AL Central teams in there, given that they are unlikely to want to trade within the division (which, for the record, is stupid).

That leaves the following teams as potential trade partners:

  • Rays, Blue Jays, Rangers, Angels, Orioles, Padres, Pirates, Reds, Marlins, Mets

I find it highly unlikely that the Rays move another controllable bullpen piece, given their recent success with the “opener” and the fact that they already traded their former All-Star closer, Alex Colome. The Angels came into the year expecting to compete for a playoff spot, but injuries have derailed them this year. They will be looking to bounce back next year, and probably will only sell off short-term assets. The Mets don’t have anyone worth trading for, while teams like the Pirates and Blue Jays with ace closers (Vazquez and Osuna, respectively) probably will cost too much in prospect currency. The Orioles have relief pitchers with solid track records, but Britton/Brach are both free agents at the end of the year, and Darren O’Day is out for the year after hamstring surgery. Mychal Givens—who is a good reliever in his own right—is controlled through 2021, but probably isn’t the type of arm you can count on as the main guy in a bullpen.

That leaves four teams to choose from: Rangers, Marlins, Reds, and Padres. While each of the first three organizations possesses quality, controllable arms, I think the Indians should be aggressive in obtaining Padres relievers Brad Hand (LHP) and Kirby Yates (RHP).

While Hand has been an above average reliever for a few years now, Yates really has broken onto the scene this year. Check out some numbers and rankings and see for yourself:

[Chart: statistics and rankings for Brad Hand and Kirby Yates]

Both rank in the upper echelon in K%-BB%, which shows an ability to miss bats while still maintaining command of the zone. Hand’s ERA and FIP look better, mostly because he had more success in previous years than Yates, although this season Yates is actually vastly outperforming Hand in this area. For those who don’t know what SwStr% means, it is calculated by dividing swings and misses by total pitches thrown by the pitcher, with league average being around 9.5%. Both Hand and Yates are well above average in this category, with Yates’ numbers bordering on elite. These two excellent relievers would fit in nicely in 2018 with a healthy Andrew Miller and a less overworked Cody Allen, and also would provide a sturdy base for a 2019 relief corps that is almost certainly going to lose the two aforementioned Indians.
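For anyone who wants to see the arithmetic behind those two rate stats, here is a minimal Python sketch. The counts are made-up illustration values, not Hand’s or Yates’ actual totals, and the function names are my own. K%-BB% is taken here as strikeout rate minus walk rate, each per batter faced.

```python
def swinging_strike_rate(swings_and_misses, total_pitches):
    """SwStr%: swings and misses divided by total pitches thrown."""
    return swings_and_misses / total_pitches


def k_minus_bb_rate(strikeouts, walks, batters_faced):
    """K%-BB%: strikeout rate minus walk rate, both per batter faced."""
    return (strikeouts - walks) / batters_faced


# Hypothetical reliever line, for illustration only.
swstr = swinging_strike_rate(swings_and_misses=90, total_pitches=600)
k_bb = k_minus_bb_rate(strikeouts=50, walks=10, batters_faced=160)

LEAGUE_AVG_SWSTR = 0.095  # "around 9.5%", as quoted in the text

print(f"SwStr%: {swstr:.1%} (league average about {LEAGUE_AVG_SWSTR:.1%})")
print(f"K%-BB%: {k_bb:.1%}")
```

A pitcher in this made-up example would sit well above the league-average swinging-strike rate, which is the kind of gap the chart shows for both Padres relievers.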

Check out these two surplus value graphs that I calculated for both Padres relievers:

[Charts: surplus value calculations for Brad Hand and Kirby Yates]

Let me briefly explain my methodology. When calculating surplus value on a contract, the goal is to figure out how much “extra” value the player is giving to his team by subtracting the dollars he is actually owed via his contract from the market value of his production. The $/WAR figures for 2018-2020 were taken directly from an article written here on FanGraphs by Matt Swartz, and the expected WAR totals were generated using the generally accepted decline rates laid out by former FanGraphs editor-in-chief Dave Cameron. Players on average perform at 90% of their previous year’s WAR output through age 30, 85% from 31-35, and 80% from 36 and up. For 2018 dollars owed, I divided each player’s contract by 3 to account for the fact that each would only be in Cleveland for about a third of the current season. Similarly, the 2018 final WAR calculation shown at the bottom of my charts uses 2016 and 2017 WAR totals added to each player’s current WAR, plus an average of what ZiPS and Steamer (FanGraphs’ projection systems) think they will produce the rest of the season. Finally, that total is divided by three to get the average of the three seasons, and then divided by three again to account for the third of the season Hand and Yates would pitch for the Tribe. Using the previous two seasons’ WAR in addition to the current season gives us a number that is less influenced by one particular season’s performance, and thus a sturdier baseline from which to calculate the next season’s WAR totals.
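To make the steps concrete, here is a rough Python sketch of that calculation. It is only my reading of the method described above, with a few assumptions: the decline factor is applied using the player’s age in the season being projected, dollar amounts are in millions, and every input in the example (WAR totals, salaries, $/WAR rates) is a hypothetical placeholder rather than a figure from the charts or from Swartz’s article.

```python
def decline_factor(age):
    """Share of the previous year's WAR a player is expected to retain."""
    if age <= 30:
        return 0.90
    if age <= 35:
        return 0.85
    return 0.80


def baseline_war(war_2016, war_2017, war_2018_to_date, zips_ros, steamer_ros):
    """Three-season average WAR, with projections finishing the current year."""
    rest_of_season = (zips_ros + steamer_ros) / 2
    return (war_2016 + war_2017 + war_2018_to_date + rest_of_season) / 3


def surplus_by_year(baseline, age, salaries, dollars_per_war, first_year_share=1 / 3):
    """Return a list of (WAR, surplus value) pairs, one per contract year.

    The first year is a partial season, so both production and pay are
    scaled by first_year_share. Later years apply the aging decline.
    """
    rows = []
    war = baseline
    for i, (salary, rate) in enumerate(zip(salaries, dollars_per_war)):
        if i == 0:
            year_war = war * first_year_share
            owed = salary * first_year_share
        else:
            war *= decline_factor(age + i)
            year_war = war
            owed = salary
        rows.append((year_war, year_war * rate - owed))
    return rows


# Hypothetical 28-year-old reliever; all numbers are placeholders.
base = baseline_war(1.0, 2.0, 1.0, zips_ros=0.5, steamer_ros=0.5)
example = surplus_by_year(
    base,
    age=28,
    salaries=[3.0, 6.0, 12.0],          # $M, doubling each year
    dollars_per_war=[10.0, 10.5, 11.0],  # $M per WAR, made up
)
for year, (war, surplus) in zip((2018, 2019, 2020), example):
    print(f"{year}: {war:.2f} WAR, surplus ${surplus:.1f}M")
```

Summing the surplus column gives the total figure to weigh against a prospect package.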

These charts aren’t perfect, and are meant to be used simply as a rough estimate of how valuable each player would be over the course of his contract. For example, Yates will head to arbitration in 2019 and 2020. It is tough to say how much of a raise he will get each season, so I kept the raises equal, and simply doubled his previous year’s salary. Arbitration pays for saves, as well as other counting statistics like strikeouts and holds. Since Yates probably will never do much closing, and really hasn’t become dominant until this season, I feel comfortable saying his salary won’t escalate too much the next few years, which is great for a small-market team like the Indians.

Clearly, if we can ascertain how valuable Hand and Yates are, the Padres certainly know their value as well. What would it take to get both of these guys? Would it take a massive overpay? Is there a point where it becomes too expensive? The answer, in short, is “it depends.” The Padres have an absolutely loaded farm system, so they might have more specific desires when it comes to prospect returns. Of course, prized Tribe prospect Francisco Mejia’s name will most definitely be talked about first. I’ve seen him ranked anywhere from a 55FV to a 60FV on the 20-80 scouting scale, meaning some see him as an above average MLB contributor, while others see him as a potential All-Star. Using Kevin Creagh and Steve DiMiceli’s prospect valuation model, which according to Dave Cameron “looks at the level of expected performance and the expected cost of a player during the years before he reaches free agency, and then estimates a player’s value to his organization during that time,” here is how top prospects can be valued:

[Chart: estimated prospect values by FV grade]

As you can see, pitchers are valued less due to the fact that pitchers get injured much more than hitters, and thus are riskier over the course of their control years. Depending on how you value Mejia, there’s a 22-million-dollar difference in his estimated prospect value. To elaborate, if Hand/Yates combined are around 40 million dollars in surplus value by my model, Mejia’s value outweighs theirs by 20 million if he is a 60FV, while it actually is about 2 million less if he ends up a 55FV. Is there a lot more that goes into trades than simply surplus value? Absolutely. Teams all have their own ways of valuing MLB/MiLB players, and the value of a championship sometimes is the necessary boost to make a move that might otherwise be inadvisable (see: 2015 Kansas City Royals).

Based on what I’ve seen/read, it’s tough for me to value Mejia at a 60FV, mostly due to the uncertainty surrounding his long-term projectability behind the dish. Everyone knows he has an absolute hose, but it appears as if his footwork, blocking, and game-calling need work. Not to mention, he doesn’t have a traditional “sturdy” catcher’s frame, and there is legitimate concern that his body will not hold up after catching 110+ games a season. If he isn’t able to develop into an average to slightly above average catcher, it’s not as if his advanced barrel control from both sides combined with some sneaky raw power won’t play at a corner outfield spot or 3B. That profile will just make him less valuable, because it is extremely difficult to find a catcher that can hit as well as Mejia is projected to.

Let’s construct two trade scenarios based on Mejia being a 55FV prospect. Because of the scarcity of dominant relievers on the trade market with team-friendly contracts, and the fact that the demand for Hand/Yates will be high among contenders, the Tribe will probably have to overpay for their services.

Here are two mock trades I think could work:

[Charts: mock trade 1 and mock trade 2]

We’ve already talked at length about Mejia, but here’s some quick information on the other guys I included in these deals (note: all FV ratings are via FanGraphs). Talking about deal number one first, Willi Castro (45FV) is a slick-fielding SS with solid contact ability, and would fill in nicely at SS if top Padres prospect Fernando Tatis Jr. moves off of short like many are expecting. Quentin Holmes (40FV) is a plus runner with immense athleticism that will help him play plus defense in CF someday. The hit tool isn’t there yet, but if he develops the way the Tribe are hoping, he will be a key cog at the top of a lineup down the road. Julian Merryweather (NR) is a big, right-handed reliever whom FanGraphs prospect analysts Eric Longenhagen and Kiley McDaniel have up to 98 MPH, with a plus curveball and changeup. He’s coming off Tommy John surgery, but is an interesting high-upside bullpen piece that could break camp with the team as early as 2019. In the second deal (not including the aforementioned Mejia and Merryweather) there is Yu Chang (50FV), who probably has just enough athleticism/twitchiness to stick at short, but also could fill in nicely at 3B down the road. He has some swing-and-miss issues, but there’s legitimate raw pop in his bat, and he should be able to get to that power in game enough to be a valuable everyday regular. Jordan Milbrath (NR) is a massive 6’6”, 215 lb reliever who has what Longenhagen and McDaniel describe as a plus sinker and slider.

Will either of these deals be satisfactory to the Padres? Who knows. Will the Indians think that Mejia headlining a deal is too expensive? Who knows. These mock trades were meant to put out a framework of what could potentially move the needle on two relievers with justifiably high price tags, all while staying within the parameters of what a fair deal would be in terms of surplus value.

With the recent departure of LeBron James, and the consistent futility of the Browns (save us, Baker Mayfield), the Indians are Cleveland’s best chance at fielding a championship team over the next five years. Although the price might be steep, it’s impossible to put a dollar amount on what a World Series would mean for the city of Cleveland, even if it requires trading an arm and a leg for a Hand (and a Yates).


The Atlanta Braves Have No Fear of Swinging

The austere face of Freddie Freeman; the resounding crack of Dansby Swanson’s bat; Ozzie Albies’ brimming smile: these are the surprising Atlanta Braves, whose success is no longer a surprise but part of a fun run through the National League East. The Braves are baseball joy with a mix of relaxed confidence and even brimming optimism, an optimism that few Braves players have enjoyed in the past.

A sort of confidence is sweeping the organization as every player is contributing, allowing each player to be distinctly himself. No longer does Swanson have to turn himself into an All-Star slugger; he can be a second-year player who is still learning. Nick Markakis can take time to grow more confident as he refines his mechanics.

The simple undertone is twofold. First, the Braves’ lineup is collectively playing at a career high, which has allowed the hitters to refine their approach throughout the first half of the season. The dominoes fell right, and the Braves learned how to optimize, cutting their progression time in half through analytical chemistry. Second, the one point that defined their functional progression: they have no fear of swinging. Their 48.6 percent swing rate is second highest in MLB, and it comes with the third highest contact rate at 79.6 percent.

The natural assumption is that swinging this often would lead to undue risk. And for most developing teams, it has. The Detroit Tigers, Kansas City Royals, Chicago White Sox, and Baltimore Orioles round out the top five in swing percentage, each with a correspondingly high swinging-strike rate. The Braves, however, have a 9.9 percent swinging-strike rate, eighth best in MLB. The magic is not accidental; it comes from a combination of veterans who are more patient and young power hitters whose slugging means swinging more is appropriate. Nick Markakis has a 4.8 percent swinging-strike rate and Kurt Suzuki is at a mere 7.5. This does not excuse Swanson or Freeman posting 11.7 and 11 percent swinging-strike rates, respectively, but it allows them to take those extra risks to optimize slugging opportunity.

Suzuki has been an enigma for the Braves, but one of the most important supporting pieces to their run creation. After spending his career before last season with only one wRC+ over 100 (Minnesota Twins, 2014, 106), Suzuki is now on pace to break his 129 wRC+ and tie his 2.7 WAR from last season. It might not be a coincidence that these two seasons have also seen him break the 50-percent swing mark (52.8 and 53.6 percent) while maintaining a high contact rate, specifically in the zone (93.5 percent this season).

Suzuki’s solution has been the backward-seeming notion of no longer trying to hit the ball the opposite way (below 20 percent of hits), instead opting simply to pull the ball for run creation. His placement map shows he is better at hitting sharp, pulled balls, and his hits to the opposite field were traditionally drab and futile, with long hang time. Hence, he has been allowed to play Suzuki baseball rather than a league-wide meta style.

While Suzuki has added value by changing his batting style, Nick Markakis is playing the exact same baseball, just with better contact and better high-leverage production. He is hitting well above his career average in ISO at .160 while striking out remarkably less, in only 10.2 percent of plate appearances. His batted ball profile remains the same, making Markakis a player benefiting from the simple adage of relaxed baseball and improving at tearing pitchers apart in high-leverage situations.

Ozzie Albies, in his second season for the Braves, has already blown away a good first season, posting a 2.4 WAR with a 118 wRC+ (1.9, 112 in 2017). Much like Markakis, Albies has been a run-creating machine with his hitting in high-leverage situations. He doubles down on chaos creation by forcing pitchers to throw uncomfortably far from the zone, lest he turn on a pitch for a deep shot. Albies has refined his slap-shot hitting by achieving his best slugging percentage in the bottom of the zone; thus, as pitchers throw breaking balls and off-speed pitches to induce poor contact, and those pitches drift, Albies is able not only to make contact but to make decisive contact.

The macro change has come with a micro improvement in finding the changeup. He has starkly increased his contact percentage, now above 85 percent in all but two zones. Last season he was above 80 percent in only four of nine zones. Albies is sending more of that contact higher into the air, a bit of a downside to the slugging revolution, but at the same time he is expanding his placement map. He has placed more balls sharply under three seconds of hang time, specifically under 1.5 seconds, implying an ability to turn even soft contact into hits. The career-trajectory implication is that Albies is developing into a well-rounded hitter, known for more than home runs.

That, then, is how the Braves have become a team still fighting at the July trade deadline: a buyer and not a depressed seller. The sudden power from the veterans meant the younger players had time to relax and optimize their best abilities, creating a waterfall effect. The Braves have the best high-leverage numbers in MLB because they find creative ways to get on base, and those veterans now have players to send home.


New Swing Brings New Struggles for Kyle Seager

Behind many high hopes for the 2018 Mariners was a quiet confidence in the continued performance of veteran players. Among those players was Kyle Seager. Although his offensive numbers dipped quite a bit in 2017, he was still viewed as a quality hitter going forward, but as we close in on the season’s halfway mark, Seager’s performance is still leaving something to be desired.

The power is still there. Twelve homers and 17 doubles put Seager on pace to finish around his typical extra-base production, but the strikeouts are way up, the walks are way down, and as a result, his OBP is a disastrous .270. Even though he has generally come through when the Mariners need him the most (.300/.333/.750 175 wRC+ in 33 PA in high leverage situations), Seager’s overall production has been a disappointment, as he’s slashing .222/.270/.408 (86 wRC+) on the year. Perhaps it all started last season when he curiously turned in his worst full-season performance (106 wRC+) immediately after a career-year at the plate in 2016 (132 wRC+), so let’s cozy up in our armchairs and play hitting coach for a few minutes.

First, we’ll get familiar with Kyle’s swing this year:

Note: There’s nothing wrong with your internet connection. The gifs are just in slooooowww moootiooonnnn.

[GIF: Seager’s 2018 swing, slow motion]

Pretty upright. Medium leg kick. A lot of pre-swing action and an obvious hitch before the hands load. There are a lot of moving parts here, but nothing jumps out as clearly flawed.

Back in 2014, things were much quieter though:

[GIF: Seager’s 2014 swing]

Here, Seager’s leg kick is more subdued with a quick toe tap and his hands are much quieter throughout.

In 2015, Seager adopted a more substantial stride, leaving the toe tap behind:

[GIF: Seager’s 2015 swing, slow motion]

His hands start lower, but as they load, they come up to a position consistent with 2014. The camera angles make it difficult to tell, but there also appears to be greater separation between his hands and chest.

Moving onto 2016 (Seager’s career year), we start to see a more exaggerated pre-swing motion:

[GIF: Seager’s 2016 swing, slow motion]

But that extra motion is inconsequential because, once again, as his swing comes together, his hands return to a position consistent with the previous couple years. We also see the return of his toe tap.

Now 2017:

[GIF: Seager’s 2017 swing, slow motion]

His stance looks a little more open here with a slightly bigger stride, but Seager’s swing looks very much the same as it did in 2016. The bat waggle is there, the toe tap is there, but his hands seem to drop ever so slightly more and don’t return to their usual position.

Compare these two gifs from a different angle (from Baseball Swingpedia on YouTube):

[GIF: Seager’s 2016 swing, side view]

[GIF: Seager’s 2017 swing, side view]

In 2016 (top), as his foot comes down, Seager keeps his left elbow up and his hands around chin high; however, in 2017 (bottom), his left elbow creeps down just a bit and his hands settle around shoulder high. For a clearer picture, check out these screenshots just as he plants his right foot:

[Screenshots: Seager’s hand position at foot plant, 2016 and 2017]

Hopefully, this lower position is more obvious in these screenshots because it’s subtle in real time. Now, hand position isn’t everything, but it is hugely important, and Seager’s hands might be a prominent factor in his recent offensive woes and might partially explain why 2017 was a year of great change for him.

That change may be best illustrated in the following table:

Year  LD%   GB%   FB%
2011  27.7  30.4  41.9
2012  21.9  35.9  42.3
2013  20.8  34.3  45.0
2014  22.2  36.7  41.1
2015  24.0  35.2  40.8
2016  21.9  36.1  42.0
2017  17.1  31.3  51.6

Whether it was the lowering of his hands that created more fly balls or the desire to hit more fly balls that lowered his hands, Seager’s fly ball percentage skyrocketed in 2017 and his average launch angle on line drives and fly balls combined jumped from 26.4° and 26.7° in consecutive years to 29.5°.
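To put a number on how far 2017 strayed from his norm, here is a small Python sketch that compares each 2017 rate in the table with the 2011-2016 average. The values are copied straight from the table; using a simple unweighted six-year average as the baseline is my own choice.

```python
# Batted-ball rates from the table: year -> (LD%, GB%, FB%)
batted_ball = {
    2011: (27.7, 30.4, 41.9),
    2012: (21.9, 35.9, 42.3),
    2013: (20.8, 34.3, 45.0),
    2014: (22.2, 36.7, 41.1),
    2015: (24.0, 35.2, 40.8),
    2016: (21.9, 36.1, 42.0),
    2017: (17.1, 31.3, 51.6),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


labels = ("LD%", "GB%", "FB%")
changes = {}
for i, label in enumerate(labels):
    # Unweighted average of the six seasons before the swing change.
    baseline = mean(batted_ball[year][i] for year in range(2011, 2017))
    changes[label] = batted_ball[2017][i] - baseline
    print(f"{label}: 2011-2016 avg {baseline:.1f}, 2017 {batted_ball[2017][i]:.1f}, "
          f"change {changes[label]:+.1f}")
```

The fly ball gain comes out of both the line drive and ground ball columns, with line drives giving up the larger share.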

In theory, this wasn’t a bad idea. Seager does most of his damage in the air, and the quality of the fly balls he hit in 2017 (.412 xwOBA) was similar to that of his fly balls in 2016 (.434 xwOBA). A higher volume of fly balls should have meant more damage, but his altered swing may have caused his line drive rate to plummet. And with far fewer of those high-percentage hits, Seager may have lowered what was an impressive offensive floor.

[Screenshot: Seager’s hand position]

This season, Seager’s hands appear much closer to their 2016 position. His average launch angle on line drives and fly balls combined is eerily similar to last year at 29.6°, but his line drive launch angle has gone down while his fly ball launch angle has gone up, which appears to be a good thing based on his xwOBA on those batted balls. That’s not to say he’s 100% mechanically sound though (not that I would know exactly what that is but I digress). Currently, he’s on pace for a 44.7% FB% and a 19.2% LD% (the third highest and second lowest of his career, respectively), and that still-deflated LD% might be a “feel” or timing issue due in part to his hands’ tendency to drift as a pitch is coming in. Watch Seager’s hands closely in a couple examples from 2018:

Between the pitch being released and him getting his foot down, Seager’s hands are still drifting backward whereas, in previous years, he’s been surgically steady:

I promise these are the last two gifs.

2018:

[GIF: Seager’s hands, 2018]

2015:

[GIF: Seager’s hands, 2015]

It’s a minute difference, but small disruptions in a hitter’s mechanics can have significant consequences. A diminished ability to square up line drives may be among those consequences and that problem has been magnified by the shift. Given that Seager is one of the more consistently shifted on batters in the league, his ability to hit line drives may be the crux of returning to normalcy at the plate.

Year  PA (any shift on)  LD%   GB%   FB%   wRC+
2013   47                19.6  34.8  45.7   75
2014  212                28.4  41.3  30.3  143
2015  280                27.5  39.9  32.6   86
2016  358                22.8  40.3  36.9   86
2017  374                18.8  32.4  48.8   56
2018  192                19.3  40.1  40.6   54

Generally, the higher Seager’s line drive percentage has been with the shift on, the better he has performed against it by wRC+. And although that doesn’t tell the whole story, it certainly makes a lot of sense. He’ll hit homers over the shift and he’ll poke some grounders through the shift, but if he can’t line balls past the shift a bit more, as we saw last year and are seeing this year, his offensive ceiling just won’t be the same.
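As a quick sanity check on that relationship, here is a short Python sketch that computes the Pearson correlation between LD% and wRC+ with the shift on, using the six rows of the table. Treat it as a rough illustration only: six seasons is a tiny sample, the seasons are weighted equally even though the plate appearance totals differ a lot, and the helper function is my own.

```python
import math

# Shift-on LD% and wRC+ from the table, 2013 through 2018.
ld_pct = [19.6, 28.4, 27.5, 22.8, 18.8, 19.3]
wrc_plus = [75, 143, 86, 86, 56, 54]


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(ld_pct, wrc_plus)
print(f"Correlation between shift-on LD% and wRC+: {r:.2f}")
```

The coefficient is strongly positive, though the 2014 season does a lot of the work, so it supports the trend without proving much on its own.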

Various little changes from one year to the next are what make the best players in the game the best players in the game, but in that quest to become the best, sometimes you can lose what once made you successful. Before you know it, square one isn’t where it was a few years ago. What is normal for Seager now is not what was normal when he was at his most successful, but considering that his power is still evident, he seems far from broken. For the majority of the season, his poor offensive performance has been buoyed by good teammates, yet the challenging past few games show that if the Mariners really hope to hang with the best of ’em, they need Kyle Seager to show up on both sides of the ball.


All data from FanGraphs and Baseball Savant and referenced prior to games on 06/20/18. 


Blast Motion Sensor: Correlation to On-Field Performance and How to Utilize it

Introduction

Athletes are always looking for an edge – how to get better, how to raise their level of play – and are looking for different tools and resources to help them to accomplish this goal. In recent years, baseball has been going through a boom of data-driven player development, with players and coaches looking for the best technology to increase on-field performance as much as possible. Technology allows for coaches to stop guessing, and instead to leverage data in order to deliver answers. As just one example, swing sensors have become extremely popular in recent years in both baseball and softball to help create better hitters.

The Blast Motion sensor, when attached to the end of the bat, gives the hitter different metrics pertaining to the swing. The Blast Motion sensor is the official bat sensor of Major League Baseball. As a baseball player looking for a data-driven way to objectively look at my swing and try to improve it, I purchased a Blast sensor last May. While using the sensor, two important questions came up: (i) which metrics matter the most in creating the best possible swing? (ii) do any of these metrics correlate to on-field success? If these questions can be answered, users of the Blast Motion sensor can be better prepared to use it to create better swings and better hitters.

There appear to have been no empirical studies of these questions, and so I decided to try to answer them myself.  I had an entire college baseball team full of hitters to use as my sample. My goal was to create a study looking into the metrics on the Blast Motion sensor and see which correlate the best to on-field success and therefore are the most important to focus on and gear your training towards.

Data Collection and Exploration

The sample of this study is the Babson College Varsity baseball team. Each member of the team took 50 swings using the Blast Motion Sensor on their bat over the course of a single batting practice session sometime between November 2017 and February 2018. Each player was given the opportunity to warm up before swinging to ensure they were loose and were swinging as hard as they could, just as happens in real games.  The swings were taken against an underhand toss that I threw. Having players hit against front toss rather than swing off a tee more closely simulates an in-game scenario. The 50 swings each player recorded were taken in 5 rounds of 10 swings. Players took a short break in between each round of 10 swings.

Gaining a better understanding of each metric provided by the Blast sensor will help me comprehend and analyze the conclusions I gather later on in the study. The definitions and calculations for each metric are detailed in the table below.

| Metric | Measurement | Definition |
| Bat Speed (BS) | MPH | The speed of the bat at contact. |
| Attack Angle (AA) | degrees | The angle of the bat path at contact, where a completely flat bat that is parallel to the ground is 0 degrees. A down-to-up path gives a positive angle and an up-to-down path gives a negative angle. |
| Time to Contact (TTC) | seconds | The time it takes for a hitter to make contact with the ball from the start of their swing. |
| Peak Bat Speed (PBS) | MPH | The fastest speed observed at any point in the swing. |
| Vertical Bat Angle (VBA) | degrees | The angle of the bat at contact, with 0 degrees being a perfectly flat and horizontal barrel. A barrel that is below the hands at contact results in a negative angle (9). The ideal vertical bat angle range is -25 to -35 degrees (3). Figure 1 below displays Mike Trout making contact and is a good visual of where vertical bat angle is measured. |
| Power (P) | kW | A measurement incorporating both bat mass and bat velocity. The average power generated during the swing is found from the effective mass of the bat, bat speed at impact, and the average acceleration during the downswing (10). Players who can swing a heavier bat and accelerate it faster produce more power. |
| Blast Factor (BF) | 0-100 | A metric created by the Blast team on a scale of 0-100, where 0 is the worst possible score and 100 is the best possible score. The 100 possible points are made up of two equally weighted components: power and swing efficiency (1). The power component comes from the power metric. The efficiency component is more complicated, and we will discuss it in more depth below. |
| Body Rotation (BR) | 0%-100% | Body rotation is expressed as the ratio of body rotation during the time a player’s “wrists unhinge” to the total rotation during this time (8). The ideal number for body rotation is 45% and the ideal range is 40%-50%. |
| On Plane Percentage (OPP) | 0%-100% | This metric calculates how “on plane” a player is. The red line in Figure 2 below represents the pitch plane. The green dots represent different points of Miguel Cabrera’s swing. The two dots that are roughly on the line represent the “on plane” portion of Cabrera’s swing. How well a player does this is represented in OPP. Blast calculates OPP by defining how long the sweet spot of a player’s barrel is on plane. The percentage is calculated by how well a player’s bat speeds up during this point (11). A typical range for a good swing is 55%-65% (4). |
| Peak Hand Speed (PHS) | MPH | The top speed of a hitter’s hands during the swing. |

Figure 1:

Figure 2:

Regression models for Blast Factor:

Now that I understand the meaning of every variable, I can move on to understanding how the Blast factor is calculated. I have seen conflicting formulas, so I will run my own analysis in R to see whether the data can explain the calculation of swing efficiency. The goal of these models is to figure out which variables go into the equation that produces the swing efficiency score. I will run a linear regression model with the other nine swing metrics (excluding Blast factor) as the predictors and blast factor as the target, using the 1,000 swings captured on the Blast sensor in this study as my training dataset. This model should show which variables lead to a better blast factor and so help explain it. The full results of this output can be found in the appendix.

We find that a linear regression model on these nine metrics captures 72.18% of the variance in the blast factor (R^2 = 0.7218). Every variable had a p-value less than 0.05 and was statistically significant in the model. To verify that all variables were making meaningful contributions to the model (beyond statistical significance), I used backward variable selection in R to see if any variables should be taken out of the model. Once again, every variable was retained.

To gain more of an understanding of blast factor, I will next see which metric influences it the most. To do so, we will run nine different models. In each model, we will remove exactly one variable and compare the R^2 of that model to the 72.18% benchmark from the full model. The difference between these two quantities gives a measure of importance for each variable.

The table below shows, for each variable, the difference (in percentage points) between the full model’s R^2 of 72.18% and the R^2 of the model with that variable removed.

| Metric | BS | AA | TTC | PHS | OPP | P | PBS | BR | VBA |
| Difference in R^2 | 5.84 | 2.80 | 0.92 | 0.52 | 13.52 | 0.64 | 1.26 | 9.12 | 1.05 |

By far the most important variable in computing blast factor is on plane percentage (OPP). When OPP was taken out of the model, the R^2 value dropped 13.52 percentage points from its value in the full model. The next most important variable was body rotation (BR) at 9.12 points. Power (P) and peak hand speed (PHS) are the least important. This tells us that OPP and BR are the most important variables in understanding blast factor.
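The drop-one comparison described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R code used for the study (which is in the appendix). The toy data, the three column names, and the helper functions `solve`, `r_squared`, and `drop_one_importance` are all assumptions made for the example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(rows, y, cols):
    """R^2 of a least-squares fit of y on an intercept plus the named columns."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def drop_one_importance(rows, y, cols):
    """Full-model R^2 minus the R^2 of each model with one column removed."""
    full = r_squared(rows, y, cols)
    return {c: full - r_squared(rows, y, [k for k in cols if k != c]) for c in cols}

# Toy data (hypothetical): blast factor driven mostly by OPP, a little by BR,
# and not at all by PHS.
rng = random.Random(0)
columns = ["OPP", "BR", "PHS"]
swings = [{c: rng.random() for c in columns} for _ in range(200)]
blast = [50 + 40 * s["OPP"] + 10 * s["BR"] + rng.gauss(0, 1) for s in swings]

importance = drop_one_importance(swings, blast, columns)
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: R^2 falls by {100 * drop:.2f} points when removed")
```

Because each reduced model is nested inside the full model, removing a variable can never raise R^2, so every difference is zero or positive. Correlated predictors can share explanatory power, so these differences are a rough ranking rather than an exact decomposition.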

Below is a table with each metric and the metric coefficient in the full model.

| Metric | BS | AA | TTC | PHS | OPP | P | PBS | BR | VBA |
| Coefficient | 1.903 | 0.256 | -232.96 | -0.36 | 26.57 | -4.54 | -0.84 | 69.44 | -0.09 |

Before interpreting any of these numbers we have to remember a few things. First, these coefficients describe each metric’s relationship to blast factor and blast factor alone. Also, while coefficient values vary in scale, the main thing to look at is their impact on blast factor. Each interpretation includes a scatter plot showing the relationship between the variables, and the coefficients are interpreted in the context of each metric’s scatterplot. The most notable variables and their relationship to blast factor are documented below, and the rest (along with the code used to produce these plots) can be found in the Appendices.

Bat speed: The coefficient of bat speed indicates that as bat speed increases, so will Blast factor. This agrees with intuition and baseball common sense because players who have fast swings generally have better swings. The following plot depicts the relationship between bat speed and blast factor. As bat speed increases from 55 MPH to 70 MPH, blast factor increases steadily. Past 70 MPH, blast factor stays pretty consistent, although there is a decrease past 80 MPH. The overall trend of this plot is: swing faster for a better blast factor. Of course, one variable in a model does not explain all the variability in the result, but bat speed is the third most important variable in the model by the R^2 comparison above, so its relationship to Blast factor is important and must be examined fully.

Peak hand speed: The coefficient of peak hand speed indicates that as peak hand speed increases, blast factor decreases. That is counterintuitive. Having fast hands is a good thing according to conventional wisdom in baseball, so when hand speed increases, a metric that rates the overall quality of the swing, such as blast factor, should not decrease. Examining the plot below, the relationship between the two variables is interesting. As peak hand speed initially increases, blast factor rises rapidly until hand speed reaches about 23 MPH, and then it stays pretty consistent to about 26.5 MPH. From there, increases in peak hand speed appear to have diminishing blast factor returns.

On plane %: The more on plane a hitter is, the higher their blast factor. OPP produced the largest change in the model’s R^2 value when it was removed, meaning it has the strongest relationship with blast factor. This makes sense, as being on plane should correlate with a better swing. According to Blast, an OPP of 55% or better is a good rating (4). Another thing to consider is that, according to a spokesperson at Blast, Jose Altuve, the reigning AL MVP, has the highest on plane % of anyone the company has tracked. It stands to reason that one of the best hitters in baseball would also have the best on plane percentage. There is evidence of a clear linear relationship between OPP and blast factor: as OPP progressively increases, so does blast factor. Although the returns on blast factor slightly diminish as OPP surpasses 75%, there are not enough swings in this region for us to fully resolve the trend.

Power: The coefficient of power says that as power increases, Blast factor decreases. This is a little perplexing because you would think more power in a swing is a good thing. Also, power is half of the blast factor rating (1). One potential explanation is that as a player’s swing becomes more powerful it could become more erratic and lose efficiency. Beyond that, I do not have a ready explanation for the sign of this coefficient. In the plot, as power increases towards 4 kW blast factor increases, then it stays pretty consistent until power reaches 6 kW, and then blast factor slightly decreases as power increases further. The plot shows little evidence of the negative relationship the coefficient indicated, although the narrow range of the axis may make any relationship hard to see. Removing power from the model barely changed the R^2 value, which suggests power contributes little to determining blast factor, and that is strange. Maybe Blast uses a different power measure in its blast factor formula than the power metric it reports. Otherwise, as the coefficient, R^2 value, and plot suggest, there is not much of a relationship between the variables, and quite possibly an inverse relationship if any.

 

Before jumping fully into the study, one last way to understand blast factor is to build a regression tree that uses every metric to predict a swing’s blast factor. The regression tree offers another view of each metric’s relationship to blast factor and of which variables help predict it. The regression tree:

In this regression tree model, the only variables used to predict blast factor were body rotation, time to contact, on plane %, attack angle, and power. Swings fell into 14 leaves, each with its own average blast factor. Swings with body rotation below 44% and a time to contact equal to or greater than 0.17 seconds struggled with their blast factor: the 27 swings in this leaf produced an average blast factor of 66. The last two leaves of this regression tree have average blast factors of 91 and 95 respectively. For a swing to fall in one of these leaves, it must have an on plane % greater than 44% and a body rotation greater than 42%. The distinction for swings in the leaf with a 95 average blast factor was an on plane % greater than 54%. Swings in the leaf with a 91 average blast factor had an on plane % between 45% and 53% and attack angles greater than 3.5 degrees. There were 156 swings in the leaf with an average blast factor of 91 and 151 swings in the leaf with an average blast factor of 95. Together that accounts for over 30% of swings in the dataset. To have a good blast factor, it seems players should concentrate on having a body rotation greater than 42% and an on plane % above 44%. The following table displays the error rates for this model.

| | MAPE | MAPE Benchmark | RMSE | RMSE Benchmark |
| Error Rate | .03279 | .07234 | 3.55628 | 7.66747 |

The benchmark is the error of a naive prediction that does not use any of the swing metrics in the dataset. For example, the MAPE benchmark of 7.234% means that such a naive prediction misses the actual blast factor by 7.234% on average. The model’s MAPE and RMSE must be lower than their benchmark values for the model to be considered useful.

Both MAPE and RMSE are below their benchmark rates meaning this tree model is a good model and results can be taken seriously.
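For readers who want to reproduce this kind of comparison, here is a minimal Python sketch of MAPE and RMSE alongside a benchmark. It assumes the benchmark is a prediction of the sample mean for every observation, which is one common choice; the post does not spell out exactly how its benchmark was computed. The function names and the three-value toy data are hypothetical.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.03 means 3%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the units of the response variable."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def benchmark_errors(actual):
    """Errors of a naive model that always predicts the mean of the response.

    Assumption: this mean-only baseline stands in for the post's benchmark.
    """
    mean_value = sum(actual) / len(actual)
    baseline = [mean_value] * len(actual)
    return mape(actual, baseline), rmse(actual, baseline)

# Toy blast factors and model predictions (hypothetical values).
actual = [80.0, 90.0, 100.0]
predicted = [82.0, 90.0, 97.0]

model_mape, model_rmse = mape(actual, predicted), rmse(actual, predicted)
bench_mape, bench_rmse = benchmark_errors(actual)
print(f"MAPE {model_mape:.5f} vs benchmark {bench_mape:.5f}")
print(f"RMSE {model_rmse:.5f} vs benchmark {bench_rmse:.5f}")
```

A model that beats both benchmark values is doing better than ignoring the predictors altogether. How much better is a separate question, and errors measured on the training swings will tend to flatter the model.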

The linear regression model, plots, and regression tree all indicate that on plane % is probably the best indicator of blast factor and that it can explain a lot of the variability in a player’s swing efficiency.

Predicting On-field Performance

The actual goal of this study is to see which of the metrics Blast provides, if any, correlate well with on-field success. In other words, if a player’s swing performs well in some metrics, does that make the player a better hitter? If I am able to identify certain metrics that make a better hitter, then the Blast Motion sensor can be better utilized to create and identify good hitters. To measure how good a hitter is, the response variable I chose is wOBA (weighted on-base average). The following snippet from Fangraphs shows the definition and formula used to calculate wOBA.

I use wOBA over other notable offensive metrics like batting average, on-base %, slugging %, on-base plus slugging, and RBIs for a few reasons. Batting average does not account for extra-base hits being worth more than singles. On-base %, while extremely valuable in player evaluation, has the same limitation. On-base plus slugging, the sum of a player’s on-base % and slugging %, is great, but it does not have the advantage of giving each outcome a specific weight like wOBA does. Luck and randomness account for a lot of the variation in RBIs, making them a poor representation of offensive output. Luck and randomness occur in all stats, but there is more a player cannot control in their RBI total than in other metrics. Hitters on teams whose lineups get on base often receive more RBI chances and generally drive in more runs, while the RBI totals of quality hitters in bad lineups generally suffer due to a lack of base runners.

wOBA is superior because it provides specific weights for each outcome and it is easy to understand. The linear weights are calculated by taking every individual play of a given type that occurred in a given season, summing the run expectancy (RE) value of those plays, and dividing by how many times that event occurred (7). The RE value of a play is based on the Run Expectancy Matrix created by Tom Tango: it is the run expectancy of the end state of the play, minus the run expectancy at the beginning of the play, plus any runs scored.

There is one potential way to improve the metric being used as the response variable, but because my sample consists of college baseball players, I do not have the necessary technology or resources to calculate it. In the MLB there is a metric called xwOBA, or expected weighted on-base average. This is similar to wOBA, with the only difference being that it is calculated from what a player is expected to get on their batted balls based on the exit velocity and launch angle of each hit, two metrics measured by Statcast (6). Elements like luck, an opposing team’s defensive ability, and wind can lead to a difference between xwOBA and wOBA, so xwOBA would be better to use if possible because it is computed solely from what a player does at the plate. At the collegiate level, without the funding necessary for such a system, it is impossible to get this, which is alright because wOBA is a sufficient response variable for this study.
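To make the wOBA arithmetic concrete, here is a small Python sketch. The general form (weighted events divided by AB + unintentional BB + SF + HBP) follows the Fangraphs definition, but the weights below are illustrative placeholders and not the values used in this study; actual weights change every season. The stat line and the function names `woba` and `run_value` are hypothetical.

```python
# Illustrative linear weights only; real weights are recomputed each season.
ILLUSTRATIVE_WEIGHTS = {
    "uBB": 0.69,  # unintentional walk
    "HBP": 0.72,  # hit by pitch
    "1B": 0.89,   # single
    "2B": 1.27,   # double
    "3B": 1.62,   # triple
    "HR": 2.10,   # home run
}

def woba(counts, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average for one hitter's counting stats."""
    numerator = sum(weight * counts.get(event, 0) for event, weight in weights.items())
    denominator = counts["AB"] + counts.get("uBB", 0) + counts.get("SF", 0) + counts.get("HBP", 0)
    return numerator / denominator

def run_value(re_before, re_after, runs_scored):
    """Run value of a single play from a run expectancy matrix:
    end-state expectancy minus start-state expectancy, plus runs scored."""
    return re_after - re_before + runs_scored

# Hypothetical stat line for one hitter.
line = {"AB": 100, "uBB": 10, "HBP": 2, "SF": 1, "1B": 20, "2B": 6, "3B": 1, "HR": 3}
example_woba = woba(line)
print(f"wOBA = {example_woba:.3f}")
```

Averaging `run_value` over every occurrence of an event type in a season is what produces that event’s linear weight, which is why a home run is worth more than twice as much as a single.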

Now that we have an understanding of the independent and response variables, we can build the models. I will be building four different models: a linear regression and a regression tree for each of two samples. While there are 20 members of the Babson College baseball team who are hitters, only a certain number of team members actually get to play in games. Because of this, one sample uses players who got at least 40 plate appearances during the 2017 spring season, and the other uses players’ performance during fall intrasquad scrimmages. This gives me a larger sample to compare against and lets me see whether the same metrics are significant in both. If a metric is significant in both models, there is a better chance it indicates a good hitter, and instruction with the Blast sensor should be tailored towards it.

Linear Regression:

To create the models I will use backward selection in R so that it chooses the variables in each model for me. Code and output can be found in the appendix.
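As a rough illustration of what backward selection does, here is a Python sketch that starts from the full model and repeatedly drops the variable whose removal lowers AIC the most, stopping when no removal helps. It assumes an AIC criterion, which is what R’s `step()` function uses by default; the post does not say which R function or criterion was used. The toy data, column names, and helper functions are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(rows, y, cols):
    """Residual sum of squares of an OLS fit with an intercept."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))

def aic(rows, y, cols):
    """AIC up to an additive constant: n * ln(RSS / n) + 2 * (parameters)."""
    n = len(y)
    return n * math.log(rss(rows, y, cols) / n) + 2 * (len(cols) + 1)

def backward_select(rows, y, cols):
    """Drop one variable at a time while doing so lowers AIC."""
    current = list(cols)
    best = aic(rows, y, current)
    while current:
        trial_aic, drop = min((aic(rows, y, [c for c in current if c != d]), d) for d in current)
        if trial_aic >= best:
            break
        best = trial_aic
        current.remove(drop)
    return current

# Toy data (hypothetical): wOBA depends on bat_speed and power, not on "noise".
rng = random.Random(0)
rows = [{"bat_speed": rng.uniform(60, 80), "power": rng.uniform(2, 6), "noise": rng.random()}
        for _ in range(200)]
y = [0.2 + 0.005 * r["bat_speed"] + 0.03 * r["power"] + rng.gauss(0, 0.01) for r in rows]
kept = backward_select(rows, y, ["bat_speed", "power", "noise"])
print("variables kept:", kept)
```

One consequence of an AIC-based criterion is that a variable can stay in the model even when its p-value is above .05, because AIC applies a milder penalty than a 5% significance cutoff.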

For the Spring model, the following variables were deemed insignificant by backward selection: on plane %, peak hand speed, and body rotation. The R^2 output for this model was 48.27%.

For the Fall model, the following variables were deemed insignificant by backward selection: attack angle and vertical bat angle. The R^2 output for this model was 57.48%.

To begin evaluating the two models the following table shows the coefficients for each variable in the models with an interpretation of each coefficient.

| Metric | Fall coefficient | Spring coefficient |
| Intercept | 0.5909757 | 0.7133880 |
| Bat Speed | -0.0351286 | -0.0133164 |
| Attack Angle | N/A | 0.0014424 |
| Time to Contact | 5.3589336 | -1.3025226 |
| Blast Factor | 0.0028392 | 0.0026883 |
| Power | 0.3698810 | 0.0672630 |
| Peak Bat Speed | -0.0039033 | 0.0042570 |
| Vertical Bat Angle | N/A | 0.0023798 |
| Peak Hand Speed | -0.0048995 | N/A |
| Body Rotation | 0.2172541 | N/A |
| On Plane % | -0.2789730 | N/A |

Bat Speed: For both of the models the coefficients were pretty consistent. Each model said that for each mile per hour faster a player swings, their wOBA decreases slightly. While this may seem confusing, it may be that bat speed is good but at a certain point the returns diminish. Consider the graphic in Figure 3:

The MLB average bat speed according to this graphic is 69.6 MPH. The average swing speed in this sample is 70.98 MPH. Perhaps swinging as hard as you can leads to a slight decrease in production. This graphic was taken from the 2016 MLB Futures Game, which consists of the top prospects across baseball playing against each other in a scrimmage game. The top speed in this game was just 77.4 MPH. Three players in the sample I collected had average swing speeds above this peak speed, and nine of the 20 players in this sample recorded at least one swing faster than 77.4 MPH. Maybe players swing slower in games. I do not think Division 3 college baseball players would swing harder than professional players, who are older and stronger than college players.

 

Attack Angle: Attack angle was one of the variables that was significant in one model and not in the other. I don’t really know why that is. The spring sample has 9 hitters and the fall sample has 20, so there has to be a difference in the 11 additional hitters and their production. As attack angle increases, wOBA increases slightly in the Spring model.

Time to Contact: This is the most perplexing metric on the list. In the spring, as time to contact decreased wOBA increased, which makes sense. For the Fall model, it was the exact opposite and the coefficient was quite large: hitters in the fall sample who were slower to the ball had higher wOBAs. It would make sense that hitters who had success during the actual Spring season were faster to the ball. The reversal could potentially be due to time to contact’s relationship with peak bat speed. In each model, the coefficients of time to contact and peak bat speed have opposite signs: where one is positive, the other is negative.

Blast Factor: This metric was extremely consistent across the spring and the fall. As blast factor increased, wOBA increased in both instances, which is as expected.

Power: Power was positive in both models, meaning that as a player’s power increased their wOBA did too.

Peak Bat Speed: Peak bat speed is significant in both models. In the fall model, as peak bat speed increases production slightly decreases, which is consistent with the results for bat speed. For the spring, production slightly increases as peak bat speed increases. Overall, the net of the two would suggest an increased peak bat speed doesn’t lead to an increase in a player’s wOBA.

Vertical bat angle: As vertical bat angle increased, production increased. This metric was only significant for players in the spring sample. This coefficient makes sense because as vertical bat angle increases towards the desired angle of -25 degrees, production should increase with it as well.

Peak Hand Speed: Peak hand speed was only significant for the model measuring players’ production in the fall. As peak hand speed increased, production slightly decreased. Going into this study I did not think peak hand speed was very important in determining a hitter’s success and the quality of their swing.

Body Rotation: This is another metric that was only significant for the fall model. As body rotation increased, generally so did a player’s production.

On plane %: This metric also was only significant for the sample consisting of players’ production during the fall. As on plane % decreased, production increased. This is interesting because it was the most important metric for predicting blast factor, which is one of the best predictors of a productive player. This could be evidence that being on plane does not necessarily indicate a productive hitter and vice versa.

I have to now calculate the MAPE and RMSE for the two models. The following table has the MAPE, RMSE, and benchmarks for these measurements for the Spring and Fall models. All measurements are rounded to the fifth decimal place.

| Season | MAPE | MAPE Benchmark | RMSE | RMSE Benchmark |
| Spring | .13943 | .24806 | .06071 | .08441 |
| Fall | .26036 | .36031 | .08377 | .12840 |

Regression Tree:

The next step in this study is to build regression trees to see if there is a fluid way to predict each player’s success through a visualization. The following picture is the regression tree from the Spring data:

The variables that appear in the spring regression tree are time to contact, attack angle, vertical bat angle, peak bat speed, and blast factor. The least productive leaf had an average wOBA of 0.170 and contained 44 swings. Swings in this leaf had a time to contact of 0.16 seconds or greater and an attack angle less than 7.5 degrees. The most productive leaves had average wOBAs of 0.420 and 0.390. Swings in the leaf with an average wOBA of 0.390 were separated by a single metric: a time to contact lower than 0.14 seconds. There were 165 swings, or 37% of the data, in this leaf. The leaf next to it, with an average wOBA of 0.420, held swings with times to contact of 0.14 or 0.15 seconds and vertical bat angles greater than -20 degrees. Only 20 swings, or 4.4% of the dataset, ended up in this leaf. In this model, the players who had the most success simply had times to contact of 0.15 seconds or lower. They were the quickest to contact.

The following picture is the regression tree from the Fall data:

For the fall regression tree, the variables included are time to contact, on plane %, peak hand speed, body rotation, and power. The least productive leaf in this model contains just 34 swings and has an average wOBA of 0.150. Swings in this leaf had a time to contact greater than or equal to 0.15 seconds, on plane % greater than or equal to 36%, body rotation less than 40%, and peak hand speed less than 26 MPH. The most productive leaf in this tree had an average wOBA of 0.630. This leaf contained 79 swings, or about 8% of the data. Swings in this leaf had a time to contact less than 0.14 seconds, on plane % less than 58%, and peak hand speed greater than or equal to 22 MPH. The largest leaf had an average wOBA of 0.280 and contained 323 swings, or roughly 34% of the data. This leaf contained swings with a time to contact greater than or equal to 0.14 seconds, peak hand speed less than 26 MPH, body rotation greater than or equal to 40%, and power below 4 kW.
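The thresholds in these trees (for example, a time to contact of 0.14 seconds) come from a search over candidate cut points. The Python sketch below shows how a single split can be chosen on one metric by minimizing the squared error around each side’s mean, which is the usual CART-style criterion. It is an assumption that the R tree used here works this way, and the six toy swings and the `best_split` function are hypothetical.

```python
def sum_squared_error(values):
    """Squared error of predicting the mean for every value in the group."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(feature, target):
    """Return (threshold, total_error) for the best single split on one feature.

    Candidate thresholds are midpoints between consecutive distinct values.
    Each side of the split predicts its own mean of the target.
    """
    pairs = sorted(zip(feature, target))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        error = sum_squared_error(left) + sum_squared_error(right)
        if best is None or error < best[1]:
            best = (threshold, error)
    return best

# Toy swings (hypothetical): quicker swings belong to hitters with higher wOBA.
time_to_contact = [0.13, 0.13, 0.14, 0.15, 0.16, 0.17]
hitter_woba = [0.40, 0.38, 0.39, 0.30, 0.20, 0.18]

threshold, error = best_split(time_to_contact, hitter_woba)
print(f"split at time to contact < {threshold:.3f} s (squared error {error:.5f})")
```

A full tree repeats this search over every metric at every node and keeps splitting until a stopping rule is reached, and each leaf then reports the average wOBA of the swings that land in it.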

The following table shows the MAPE, MAPE benchmark, RMSE, and RMSE benchmark for the regression trees from the Fall and Spring data. Numbers are rounded to five decimal places.

| Season | MAPE | MAPE Benchmark | RMSE | RMSE Benchmark |
| Spring | .21909 | .36031 | .07490 | .12840 |
| Fall | .10838 | .24806 | .05292 | .08441 |

Discussion and Conclusion

First, before evaluating the actual results of the models, we need to evaluate the MAPE and RMSE of each regression model and regression tree. MAPE and RMSE are error rates that indicate how good a model is, and they are compared to the MAPE benchmark and RMSE benchmark. If the MAPE and RMSE are less than their benchmark rates, then the model is good. If not, the model is not really useful. Fortunately, the MAPE and RMSE of both regression models and both regression trees are all well below their benchmark rates. This means the models do a good job of predicting the response variable. Now that I know the models are useful, I can draw conclusions.

In the Spring linear regression model, the following metrics were deemed significant in determining a player’s wOBA: bat speed, attack angle, time to contact, blast factor, power, peak bat speed, and vertical bat angle. Time to contact and peak bat speed both had p-values above .05 in this model, but backward selection in R indicated these variables added to the validity of the model, so I kept them. Of all these variables, bat speed, blast factor, and vertical bat angle had the lowest p-values, so these metrics are the most significant in predicting a player’s success in the Spring linear regression model.

The metrics included in the Spring regression tree are: time to contact, attack angle, vertical bat angle, bat speed, blast factor, and peak bat speed. Time to contact was the metric in this tree that best predicted a player’s success. Players who had success during the Spring season were quick to the ball and players who struggled were slow to the ball.

The last conclusion to draw from the spring linear regression model concerns the R^2. The R^2 for the spring model is 48.27%, meaning 48.27% of the variance in a player’s wOBA in the spring can be explained by the player’s performance in the significant swing metrics. This may seem a little low, but I think it shows a strong relationship. First, a swing is not *everything* in hitting a baseball. There are other variables I knew could not be accounted for in this model, such as approach and vision, among others. Second, there are only nine hitters in this sample. Nine! That is significantly fewer than in the fall sample. In a small sample there is more variance and each swing matters more. I think “swing metrics” being able to explain about half of the variance in a player’s production in this model is great and suggests the Blast Motion sensor can be a useful tool to help improve a player’s swing and production.

In the fall model, the following metrics were deemed significant in determining a player’s wOBA: bat speed, time to contact, blast factor, power, peak bat speed, peak hand speed, body rotation, and on plane %. Peak bat speed has a p-value above .05, but backward selection in R selected the variable, indicating that it contributed to the overall validity of the model. Every other variable had a p-value under .05. The most significant variables in their contribution to this model were bat speed, time to contact, power, and on plane %, because they had the lowest p-values.

The regression tree for the fall model included the following metrics: time to contact, on plane %, peak hand speed, body rotation, and power. Unlike the spring tree, the fall regression tree showed no clear trend separating successful players from unsuccessful ones.

The last conclusion to draw is from the R^2 value. In the Fall model there is a 19 player sample, more than twice as large as the spring sample. The only difference is that in this sample players did not accumulate as many plate appearances as those in the spring model did. The R^2 in the fall linear regression model was 57.44%, nearly 10 percentage points greater than that of the spring model. This means that how a player performed in the swing metrics included in this model can explain 57.44% of the variance in their wOBA. This is a better result than the spring model in terms of indicating a relationship between performance on the Blast sensor and on-field performance. This is further evidence that the Blast Motion sensor is a useful tool in the evaluation and development of a hitter. The sensor is not able to explain everything that leads to a player’s performance, but it can certainly explain a large portion of it.

Now that both models have been evaluated, I want to look at the variables deemed significant in both models and decide which are probably the best to gear instruction towards to create better swings. The following variables were significant in both models: bat speed, time to contact, blast factor, power, and peak bat speed. Of these metrics, there is one I immediately want to eliminate, and that is peak bat speed. Bat speed and peak bat speed pretty much measure the same thing. There is no reason to try to improve both, because if you increase bat speed you increase peak bat speed and vice versa. I will eliminate peak bat speed from this group and only consider bat speed in this analysis. The remaining four metrics appear to be the most significant contributors to a good swing and on-field success according to this study. Each is evaluated below according to the results of both linear regression models. The following table shows the coefficient of each metric in the spring and fall models:

 

| Metric | Fall coefficient | Spring coefficient |
| Bat Speed | -0.0351286 | -0.0133164 |
| Time to Contact | 5.3589336 | -1.3025226 |
| Power | 0.3698810 | 0.0672630 |
| Blast Factor | 0.0028392 | 0.0026883 |

 

Bat Speed: For guys who swing slightly slower their production increases. One reason I think this happened is the player with the best production in the spring model had the 5th lowest average bat speed of any player in the complete sample size. This player also had the 3rd lowest average bat speed of the players in the spring model. The player with the lowest average bat speed happened to be fairly productive in both the spring and fall model. I don’t necessarily believe because of this swinging slower is better, but maybe specifically training to increase bat speed isn’t the best to increase a players production. I think what this tells is that players are capable of being successful at different bat speeds. You do not necessarily have to swing harder to be a productive hitter. In terms of training to swing harder to increase production, that is an entirely different question that I cannot answer based on this study alone.

Time to Contact: Like bat speed, time to contact is another perplexing metric, in terms of its interpretation. In the spring model the quicker a player was to make contact the better their production was according to our model. In the spring regression tree, there was evidence that time to contact was the most important metric for predicting success. In the fall model, it was the exact opposite. I do not think being slower to contact necessarily indicates a more successful hitter like it did in the Fall model. I think what this says is players do not necessarily have to be fast to make contact to have success as a hitter. Players should try and be as quick to the ball as possible because if they do this it means a player can wait and recognize a pitch for longer, even for a few hundredths of a second it allows them to decide later whether they want to swing or not at a ball or strike and make adjustments better within in their swings. A player does not necessarily have to be fast to the ball to be successful, there are other variables that go into a hitter being productive that a player can excel in making them productive. If a player can utilize their time to contact properly and ensure they can have proper timing with it, they can be successful with a slower time to contact.

Power: A metric that both models agree upon! Players who had a higher power output in both models tended to perform better by wOBA. This is parallel to what you might think. Players using the Blast Motion can try and increase their power output to increase their production. This makes sense because players who hit for more power and produce more of a power output tend to be better hitters. This represents a clear conclusion and guidance in use of the Blast Motion Sensor. If a player increases their power output they have a better chance of increasing their production, therefore those using the sensor should train to better their performance in this metric.

Blast Factor: Another metric that the models agree upon! The company Blast says the better the blast factor, the better the hitter, and according to the models, wOBA increases slightly as blast factor increases. This means players and coaches using the Blast Motion sensor can work on increasing blast factor to become better hitters. A problem with this, as explained above, is that blast factor is extremely complex. While it is still somewhat unclear after the analysis exactly what goes into it, it is known that blast factor is half power index and half efficiency index. The models run to explain blast factor indicated that on plane % is the best indicator of a strong blast factor. To increase blast factor, it is beneficial to tailor instruction toward improving a player’s power output and on plane %.

Those are the four metrics that, based on this study, I would suggest players using the Blast Motion sensor focus on to increase their output on the field. Power and blast factor have the strongest evidence that excelling in them leads to a productive hitter. This is not to say that the other metrics are worthless and not worth analyzing and attempting to improve, but over the sample I analyzed, these two metrics explained players’ on-field production the best. A player should also understand that while increasing bat speed and decreasing time to contact can make them a better hitter, according to this study it is not necessary to perform well in these metrics to be a successful hitter.

After drawing conclusions, the last thing to do is evaluate the study design and what could be improved. There are certain aspects of the design that could have been done better. The first is obvious to me, but would have been impossible to accomplish with the resources I had: obtaining swing metrics from swings players actually took in games. This is done at the MLB Futures Game every year, as noted earlier in a graphic, but unfortunately wearable technology is prohibited by the NCAA. The second is getting more swings from each player. With the sample I had, it was difficult to ensure everyone participated and swung with the Blast Motion sensor. Ideally, I would have had every player take 1,000 swings using the sensor, because I believe the swing results would have been more consistent. In a single 50-swing sample, results can be inconsistent: a player could have been fatigued, could have gone a while without hitting, or could have been taking lazy swings, and a whole bunch of other factors could have affected their performance. Over a large sample of swings, these external variables would have evened themselves out. The last thing I could have done better is getting a larger sample of players. Of course, I could not get other players involved because, as a college student, I did not have the luxury of seeing baseball players who were not a part of the Babson team. Ideally, I would have randomly selected thousands of players from across the country so that the sample could approximate the entire population of baseball players. Unfortunately, that is not realistic at the moment, and I had to work with the sample I was provided at Babson College.

Overall, what I learned from this study is that the Blast Motion sensor is a useful tool for predicting a hitter’s performance and evaluating their swing, and that it can be utilized by coaches. According to the models this study produced, about half of the variance in a player’s production by wOBA can be explained by their performance in the swing metrics Blast provides. Although the sample size constraints mean this cannot be generalized to the entire population of hitters, it does give evidence that the Blast sensor is a good indicator of a player’s performance. The most important metrics this study indicated to concentrate on are bat speed, blast factor, time to contact, and power. While it is potentially beneficial to train to increase bat speed, a player with slow bat speed is not necessarily a bad hitter. The same goes for time to contact: a player who is not quick to the ball can be successful as well if they utilize their timing properly. Players should work to increase their power output and blast factor. Thank you for taking the time to read this study. I hope coaches and players alike can use it and the Blast Motion sensor to better themselves as players and instructors. If anyone would like to access the code and spreadsheets used for this study, you can find them on GitHub HERE. There are also some additional visualizations in the appendix that are not included here; if anyone would like access to those or the full appendix, feel free to reach out to me at studentsofbaseball@gmail.com. Lastly, if anyone would like to have any further dialogue about this study, feel free to reach out to the email provided!

 

Works Cited

 

  1. “What Is Blast Factor?” Blast Motion, blastmotion.com/training-center/baseball/metrics/blast-factor/what-is-blast-factor/.
  2. “What Should Body Rotation Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/body-rotation/what-should-body-rotation-be/.
  3. “What Should Vertical Bat Angle Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/vertical-bat-angle/vertical-bat-angle-2/.
  4. “What Should On Plane Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/on-plane/what-should-on-plane-be/.
  5. “What Should Attack Angle Be?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/attack-angle/what-should-attack-angle-be/.
  6. “What Is a Expected Weighted On-Base Average (XwOBA)? | Glossary.” Major League Baseball, m.mlb.com/glossary/statcast/expected-woba.
  7. “Linear Weights.” FanGraphs Sabermetrics Library, www.fangraphs.com/library/principles/linear-weights/.
  8. “What Is Body Rotation?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/body-rotation/what-is-body-rotation/.
  9. “What Is Vertical Bat Angle?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/vertical-bat-angle/vertical-bat-angle/.
  10. “What Is Power?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/power/what-is-power/.
  11. “What Is On Plane?” Blast Motion, blastmotion.com/training-center/baseball/metrics-2/on-plane/what-is-on-plane/.

How Sabathia Reformed his Career

I love rooting for late career resurgences. Seeing a player with diminished skills, who likely considered retirement, turn their career around for a few more years instills a feeling of hope. From an analytics perspective, how the player resurrects his career is fascinating.

A few years ago, a season after undergoing arthroscopic debridement surgery, CC Sabathia changed his style of pitching. He found a cutter. In 2016, CC began to ditch his four-seam fastball and replace it with a cutter. He learned his cutter from former teammates that may have had decent careers: Mariano Rivera and Andy Pettitte, one of whom is very likely a first ballot Hall of Famer largely because of this pitch. Sabathia’s cutter drove the resurrection of his career.

Note: The pitch type data is from Pitch Info, hosted on Fangraphs. Performance data is also from Fangraphs. Tunneling information is from Baseball Prospectus, through May 12th, 2018.

sabathia_pitches

CC Sabathia has been changing his pitch distribution quite a bit over the last five-plus seasons. The change that revitalized his career, though, came during the 2015 offseason. His four-seam fastball usage dropped from 28.3% to almost nothing at 2%, while his cutter usage increased from 0.6% to 31.6%. Since 2016, CC has increased his slider and cutter usage while decreasing his sinker and changeup usage.

stats.png

Statistical Summaries: ERA- and FIP- measure ERA and FIP, compare them to the league average, and normalize them to 100. An ERA- of 51, for example, is extremely good – it means CC Sabathia has an ERA 49% below league average. wOBA, or weighted On Base Average, is a batting average-like measure that combines a batter’s overall offensive contribution. R wOBA is wOBA from right handed batters against CC Sabathia, and (R-L) wOBA is the difference between righty and lefty wOBA against.
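To make the reading of a "minus" stat concrete, here is a minimal sketch in Python. It assumes a plain ratio to league average and ignores the park adjustments that the published ERA- and FIP- include. The function names and the sample ERA values are hypothetical, picked only so the result lands near 51.

```python
def minus_stat(player_value: float, league_value: float) -> float:
    """Scale a rate stat so that 100 equals league average (lower is better).

    Simplified assumption: no park adjustment is applied here.
    """
    if league_value <= 0:
        raise ValueError("league_value must be positive")
    return 100.0 * player_value / league_value


def percent_below_league(minus_value: float) -> float:
    """Convert a minus stat into 'percent better than league average'."""
    return 100.0 - minus_value


# Hypothetical inputs for illustration, not Sabathia's actual ERA or the real league ERA.
era_minus = minus_stat(player_value=2.10, league_value=4.12)
print(round(era_minus), round(percent_below_league(era_minus)))
```

Under this simplification, a value of 100 means exactly league average, and every point below 100 is one percent better than the league.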

As CC’s cutter usage has increased, his performance has as well. Relative to league-average, his ERA and FIP have dropped annually since implementing a cutter. Each season he has used a cutter, CC has been above-average. I included innings pitched to indicate his surgical leave in 2014.

Most of this improvement has been driven by CC’s performance against right-handed batters. Righties had a .347 wOBA in 2013 and .370 wOBA in 2015 against Sabathia, both at least 54 points above lefty wOBA against him. Since adding a cutter, CC has lowered right handed batter wOBA against from .316 to .310 and now .278, with the largest gap between lefty and righty wOBA being 26 points.

Replacing a four seam fastball with a cutter has its benefits. A cutter runs in on the hands of a righty, inducing weak contact. It deceives batters, appearing as a fastball yet cutting glove side instead of running arm side. And for CC Sabathia, it tunnels well with his secondary pitches.

Pitch tunneling, in a basic sense, occurs when two pitches appear similar at the ‘point of no return,’ where the batter decides whether or not to swing. By the time a batter realizes he should or shouldn’t have swung, the second pitch would ideally be far from what he expected.

Below are two examples of pitch tunneling. These pitches were from at bats between Sabathia and Randal Grichuk early in 2018. CC tried to use his slider to set up the cutter. The dashed black lines are the pitch trajectories. The flags are the pitch destinations, while the smaller flags on the trajectories are pitch locations at Grichuk’s swing decision point.

Tunnels.png

The pitch sequence on the left was tunneled well. The two pitches are almost indistinguishable at the batter’s decision point. The sequence on the right, however, was poorly tunneled. It’s clear that the pitches thrown were different types and in different locations.

Statistical summaries: PreMax measures the average distance, in inches, between the two tunneled pitches at the batter’s decision point. The average PreMax is said to be about 1.54 inches. PlatePreRatio measures the ratio between the average perceived distance and average actual distance between the tunneled pitches at the plate. The perceived distance is the distance the batter expects will be between the pitches when they reach the plate. The median PlatePreRatio in 2018 is 11.8. This ratio represents how many times further apart the pitches are than expected. For example, the median pitch tunnel sequence results in pitches being 11.8 times further apart than expected.
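As a rough sketch of how these two measures could be read together, the Python below follows the description above: PlatePreRatio is treated as the actual gap at the plate divided by the gap the batter expected. The function names and the sample pitch pair are hypothetical, and this is not Baseball Prospectus’s actual calculation.

```python
AVERAGE_PRE_MAX_IN = 1.54      # average PreMax quoted in the text, in inches
MEDIAN_PLATE_PRE_RATIO = 11.8  # 2018 median PlatePreRatio quoted in the text


def plate_pre_ratio(actual_plate_gap_in: float, perceived_plate_gap_in: float) -> float:
    """How many times further apart the pitches end up than the batter expected."""
    if perceived_plate_gap_in <= 0:
        raise ValueError("perceived_plate_gap_in must be positive")
    return actual_plate_gap_in / perceived_plate_gap_in


def describe_tunnel(pre_max_in: float, ratio: float) -> dict:
    """Compare one tunnel sequence against the league figures quoted in the text."""
    return {
        "tighter_than_average_at_decision": pre_max_in < AVERAGE_PRE_MAX_IN,
        "more_separation_than_median": ratio > MEDIAN_PLATE_PRE_RATIO,
    }


# Hypothetical pitch pair: expected 1.5 inches apart at the plate, actually 18 inches apart.
ratio = plate_pre_ratio(actual_plate_gap_in=18.0, perceived_plate_gap_in=1.5)
print(ratio, describe_tunnel(pre_max_in=1.40, ratio=ratio))
```

A sequence that is both tighter than average at the decision point and more separated than the median at the plate would be the most deceptive kind under this reading.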

tunneling_speed

CC Sabathia has improved his PlatePreRatios through replacing his four seam fastball with a cutter. He also has improved his tunneling skills with his cutter over time, as he has gotten more comfortable using it and as he has gotten further from his surgery. CC’s tunneled pitches are much further apart at the plate than expected when he leads with a cutter instead of a four seamer. The current assumed average PreMax is 1.54 inches. Sabathia is above that mark with his cutter, though he is improving over time. Quite a bit of research is needed to better understand pitch tunnels, but it is generally assumed that tunnels with higher PlatePreRatios, all else being equal (pitch types, movement, location, PreMax), are harder to hit and are more successful.

One thing to note, though, is that not everything improved for Sabathia in regards to pitch tunnels. PreMax, in my opinion, is very important for pitch tunnels – perhaps more so than PlatePreRatio. Regardless of how far apart two pitches end up compared to their expected destinations, if the pitches can be clearly identified prior to the swing decision time, the batter can make a much more educated decision. Ideally, a batter decides whether or not to swing purely based on the perceived location and his opinion of whether or not he can make quality contact. Pitches with smaller PreMax measures appear more similar and can deceive the batter. Pitches with higher PreMax measures provide the batters with more information – whether it be pitch type (which could influence a batter to not swing if he knows he struggles against it) or a variable like pitch location, which lowers the PlatePreRatio through providing a more accurate perceived distance.

All three of Sabathia’s commonly-used pitch tunnels, listed above, became more differentiable when the cutter replaced CC’s four seam. More research is needed to understand if this is actually bad, like I theorize, or if the PlatePreRatio increase is enough to offset any of the hypothesized issues with higher PreMax tunnels.

If Sabathia asked me for help (which his shiny 51 ERA- in 2018 suggests he doesn’t need), I would recommend that he begin to pitch backwards more often. See the table below:

Tunnels_both

Pitching backwards is when a pitcher uses his secondary pitches initially instead of his speedier offerings. The above table compares CC Sabathia’s tunneling sequences when his cutter is the first pitch to when his cutter is the second pitch. Each of his cutter-second tunnel sequences has better PreMax distances and better PlatePreRatios than his cutter-first sequences. As mentioned above, the average PreMax distance is 1.54 inches, and Sabathia is below it on two of his three secondary-cutter sequences. When leading with the cutter, all three of his sequences are further apart than average. Similarly, Sabathia’s sequences have a higher PlatePreRatio when leading with the secondary than when leading with the cutter.

CC Sabathia had to transform his game to adapt to his diminishing velocity. He’s excelled at this, utilizing the cutter instead of the four seam fastball. Despite his changed approach and success, there are ways he could improve, such as pitching backwards with tunnels. He plans to retire if the Yankees win the World Series, though. He’s had a storied career, and may be HOF bound.


Brandon Crawford’s changed approach – raised hands and raised stats

Brandon Crawford has always been known as a defensive shortstop. His three straight Gold Glove awards can attest to that, as do advanced metrics (he isn’t pulling a Derek Jeter). It wasn’t until Crawford’s third full season (2014) that he became an above-average bat. Though, with a 101 wRC+, he was more average than above. Thanks to a power surge in the following season (that may or may not have been aided by the juiced ball), Brandon had his offensive career-year, running a 113 wRC+ along with 21 home runs, 11 more than his previous career high.

Essentially, this is a long-winded way of saying Brandon Crawford hasn’t been a middle-of-the-order, annual Silver Slugger-contending batter. The majority of his value is produced on the field. Because of this, Brandon Crawford’s 44 wRC+ from the start of the season through April 25th was concerning but not devastating. All the analysis in this piece was done using data from March 29th, 2018 through June 26th, 2018.

April 27th, 2018 may be remembered as the day the Giants’ shortstop energized their offense. According to Alex Pavlovic, Crawford made a mechanical adjustment in his swing. In his own words, Brandon is “getting [his] hands up and into the right slot by the time [he] start [his] swing.” Below are two set positions, immediately prior to the pitcher lifting his leading leg in his motion. Notice his hand positions.

crawford bats.png

On the left is an at bat against Alex Wood from March 30th, 2018, where he struck out and went 1-3. On the right is an at bat versus Brooks Pounders of the Rockies, on May 19th, 2018. He went 3-5 with 4 RBIs that day. Both images were from videos found on MLB’s YouTube page.

Like Brandon said in Pavlovic’s piece, the change was only a few inches of hand relocation. Below I highlighted the hand & bat angle to help. Note: the camera angle is slightly tilted, contributing partially to the angle of his hands in the second image. Through viewing multiple swings, I can confirm the angle seen is close to or equals what he is currently doing.

crawford bats highlight.png

It may still be tough to see, but it’s there. This subtle change, contrary to the current ‘air ball’ revolution’s lowering of one’s hands for added loft, has fueled Brandon Crawford’s May. He had one of the hottest Mays of 2018, running a .448 wOBA and a 190 wRC+.

How has this mechanical change led to such a hot streak? Well, one could say he’s gotten lucky. Pitchers began to throw more pitches in the strike zone, of which Crawford is taking advantage. On the left is a heat map of pitches thrown to Brandon Crawford prior to the mechanical change, and on the right is after the change. All the heat maps are from Fangraphs.

crawford pitches

Pitchers aren’t the only ones locating the outside corner more. Crawford has increased his plate coverage since raising his hands. Before the change, he was struggling to make contact anywhere besides on the inside corner. Now, however, he is covering both corners, and up in the zone. Like above, the left heat map is from the period prior to the change, and the right heat map is from after the change was made.

crawford contact.png

What does this look like statistically? Through Statcast, we are able to measure the changes in Crawford’s batted ball distribution and quality of contact. The data in this table and the below distribution are from Statcast, through Baseball Savant.

crawford statcast.png

Brandon Crawford has hit the ball much harder since the hand position change, increasing his exit velocity by 6 mph! xwOBA, a stat that encompasses all offensive contributions and can be read like batting average, validates this improved batted ball profile. xwOBA uses a batter’s launch angle and exit velocity for each batted ball to calculate the expected wOBA value for each event, as an attempt to strip away defense and luck from the batter’s offensive performance. Brandon’s launch angle has dropped, however, moving further from the ideal fly ball range in the low-to-mid 20s of degrees (though some research may suggest that, at a 90 mph average exit velocity, a 13 degree launch angle may be optimal).

Average launch angle is deceiving, however, as extreme batted balls aren’t captured as well in the mean of all batted balls. A ground ball with a -10 degree launch angle and a pop up with a 45 degree launch angle would imply that the batter has an ideal launch angle of 17.5 degrees, though a ground ball and a pop up aren’t ideal outcomes. Below are Brandon Crawford’s launch angle distributions, before and after his hand position change.

launch_angle
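The averaging pitfall described above can be checked with a short Python sketch. The two-ball samples and the "ideal" band of 10 to 30 degrees are assumptions for illustration only, not Statcast definitions or Crawford’s actual batted balls.

```python
from statistics import mean

# Hypothetical "ideal" launch-angle band in degrees (an assumption for this sketch).
IDEAL_BAND = (10.0, 30.0)


def share_in_band(launch_angles, band=IDEAL_BAND):
    """Fraction of batted balls whose launch angle falls inside the band."""
    if not launch_angles:
        return 0.0
    low, high = band
    in_band = sum(1 for angle in launch_angles if low <= angle <= high)
    return in_band / len(launch_angles)


# Two hypothetical hitters with the same average launch angle.
grounder_and_pop_up = [-10.0, 45.0]
two_line_drives = [15.0, 20.0]

for label, angles in [("grounder + pop up", grounder_and_pop_up),
                      ("two line drives", two_line_drives)]:
    print(label, mean(angles), share_in_band(angles))
```

Both samples average 17.5 degrees, but only the second puts any balls inside the band, which is why the full distribution says more than the mean.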

If anything, Crawford has trimmed worse batted balls in favor of ideal batted balls. Despite lowering his average launch angle, Brandon Crawford increased the frequency of high-performance batted balls, namely line drives. As seen in the post-change pink distribution, Brandon reduced the number of pop ups and extreme ground balls. This can be seen in his batted ball rate statistics. This data is from Fangraphs.

crawford rates.png

The changes in his offensive profile are reflected in the above table. Brandon’s increased line drive rate is seen in both the distribution and the rate statistics. His average launch angle decrease comes from replacing many fly balls with line drives. This high line drive rate helps explain the high BABIP (batting average on balls in play). Similarly, pulled balls are hit harder and, shift-dependent, can do more damage to the opposing team. Brandon’s K-rate decreased from dangerously high in one period to far below average in the post-change period, while his walk rate fell further below average between periods. Both of these drops were caused by the increased zone rate mentioned above (in the heat maps).

Brandon Crawford had a far-too-high K-rate while being far too unproductive for his team. After receiving a bit of swing advice, raising his hands a few inches, he has become one of the hottest hitters in baseball. As the season has gone on, Crawford has slightly cooled off – his high post-change BABIP and line drive rates likely aren’t sustainable – though with the stronger plate coverage and better approach at the plate, Crawford shouldn’t return to his April self.

 

– tb


Democrats Are Good At Baseball — Big League

Maybe it’s the history. Maybe it’s the nostalgia for small-town Americana. Maybe it’s simply the fact that “baseball’s the perfect sport for nerds.” (I can relate.) Whatever the reason, politicians, their staffers, and other dwellers of “the swamp” have always been in love with baseball. Though politics and baseball are more intertwined than you might think, the most explicit crossover has always been the annual Congressional Baseball Game, played June 14th, which last year raised $1.5 million for charity.

Even though the game pits Democrats against Republicans, the Congressional Baseball Game is regarded as one of the few events that still promotes bipartisan camaraderie in Washington. Its participants—actual U.S. senators and congressmen (and three congresswomen)—practice months in advance. They play through injuries and even assassination attempts like last year’s shooting at a Republican practice. In the game itself, they take the field at an actual major-league stadium (Nationals Park) and pitch overhand at speeds of up to 80 miles per hour.

Clearly, Congress treats the game as seriously as if it were the major leagues—so I figured we at FanGraphs should too. For years, the game’s scorekeepers have kept track of each player’s basic stats; I’ve taken their work one step further and made a FanGraphs Leaderboard out of them. Yes, we now have a way to sabermetrically judge the baseball skills of our elected officials. I calculated all stats, from FIP− to wOBA to WAR, the same way FanGraphs does; there are even different sections for Standard, Advanced, and Value stats (unfortunately, there’s no batted-ball, Pitch Info, or Inside Edge Fielding data for congressional contests—get on that, guys). The overwhelming conclusion? Democrats are much better at the national pastime than Republicans; in fact, they’ve won the Congressional Baseball Game in eight of the last nine years (as far back as these stats go). To see if a blue wave is going to wash over the diamond again this year, let’s dive into the starting lineups:

Democrats

Projected Lineup | AVG/OBP/SLG | wRC+
2B Raul Ruiz | .188/.278/.250 | 58
CF Pete Aguilar | .429/.556/.429 | 126
P Cedric Richmond | .650/.750/1.000 | 211
SS Tim Ryan | .474/.524/.632 | 142
DH Jared Polis | .429/.480/.571 | 126
C Chris Murphy | .261/.346/.304 | 76
RF Jimmy Panetta | NA/1.000/NA | 219
1B Joe Donnelly | .250/.400/.300 | 88
3B Tom Suozzi | .000/.000/.000 | -25
LF Hakeem Jeffries | .200/.200/.200 | 35

 

Probable Pitcher | ERA | FIP | BB% | K%
RHP Cedric Richmond | 2.38 | 4.61 | 10.6% | 27.5%

Democrats can boast five of the seven best congressional baseball players by WAR, and four of them anchor a lineup that has averaged 12.7 runs per game since 2009. (The fifth is speedy pinch-runner Eric Swalwell, who is a perfect nine for nine in stolen base attempts and leads the league with 1.8 wSB, or stolen base runs above average.) Tim Ryan, who is rumored to be running for president in 2020, is a rare combination of speed (a 15.0 speed score) and power (.632 slugging percentage). Jared Polis leads the league in RBIs with 13 and has never struck out in 25 plate appearances, but unfortunately for Team Blue, he’s retiring from Congress this year. And look for singles hitter Pete Aguilar to earn a promotion to the top of the order this year thanks to his .429 average and 22.2% walk rate, perhaps displacing Democrats’ usual leadoff hitter, Raúl Ruiz, who is mired in a slump (a .528 OPS) but has gotten unlucky (a .214 BABIP).

But the real star of the Congressional Baseball Game is the Democrats’ own Shohei Ohtani: pitcher/slugger Cedric Richmond. It’s impossible to overstate how good Richmond is: he has 13 hits and 11 runs scored in just seven games. He has power (.350 ISO), speed (six for seven in stolen bases), and patience (a 28.6% walk rate). On the mound, the former Morehouse College pitcher has 57 strikeouts in 47 innings (including six complete games) and a 39 ERA−. Between his hitting and pitching, he has amassed 2.3 WAR—eight times that of the game’s second-best player, Ryan.

In the late innings, expect Linda Sánchez to pinch-hit for Democrats. The game’s longest-tenured female player is both a crowd favorite and a tough out with a .444/.500/.444 slash line in 10 plate appearances. And keep an eye on sophomore right fielder Jimmy Panetta, whose father Leon played in the Congressional Baseball Game back in the 1970s. Scouting reports of the younger Panetta are off the charts, but he was hit by a pitch and reached on catcher’s interference in his two plate appearances last year, so he couldn’t show off what he could do.

Republicans

Projected Lineup | AVG/OBP/SLG | wRC+
SS Ryan Costello | .167/.400/.333 | 87
CF Jeff Flake | .318/.348/.455 | 92
DH Kevin Brady | .417/.517/.500 | 127
2B Steve Scalise | .500/.750/.500 | 166
RF Mike Bishop | .200/.333/.200 | 64
1B Tom Rooney | .200/.200/.250 | 41
C Rodney Davis | .375/.444/.375 | 102
LF Rand Paul | .273/.273/.273 | 56
3B Trent Kelly | .000/.667/.000 | 130
DH Barry Loudermilk | .375/.375/.375 | 87

 

Probable Pitcher | ERA | FIP | BB% | K%
RHP Mark Walker | 5.37 | 7.89 | 13.3% | 8.0%

Mark Walker has been a godsend for a Republican team that long struggled with run prevention, but his pitching defies the sabermetric odds. Walker lives up to his name with poor control (10 walks and six HBPs in 14.1 innings) and strikeout numbers (six), but he has a solid 89 ERA−. A .283 BABIP in a league whose fielders don’t exactly cover a lot of ground suggests he’s been very lucky, but Democratic batters complain that his offspeed pitches are just very hard to get good swings on. If Walker runs into trouble, expect the GOP to turn to John Shimkus, who used to be their starting pitcher in the mid-2000s. Shimkus is Walker’s opposite as a pitcher: he has a below-average 6.89 ERA, but he is more of a strike-thrower (224 of his 358 pitches since 2009 have been strikes) and therefore has a 97 FIP−.

Ryan Costello and Jeff Flake constitute a potent one-two punch at the top of the lineup, and the fact that they are both retiring from Congress this year is a gut punch to Republicans’ future chances. Costello is a better player than his .167 average suggests. A great 30% walk rate has elevated his OBP to .400, and he’s been very unlucky with a .200 BABIP. He’s also got decent pop (.167 ISO) and is the GOP’s slickest fielder, manning shortstop every year since 2015. And Flake has been a constant presence on the Republican team since 2001 but is leaving office amid his feud with President Trump. Flake could stand to take more pitches (4.3% walk percentage) but he’s one of the few Republican hitters with power (a .455 slugging percentage).

The GOP’s best hitter by far is ageless wonder Kevin Brady, who first played in the Congressional Baseball Game in 1997 at the age of 42. Our statistics don’t go back that far, but he has amazingly posted a .451 wOBA from his age-54 through age-62 seasons. Although he’s not in the starting lineup, Chuck Fleischmann is Republicans’ second-most-valuable position player. He’s another pinch-running weapon off the bench, leading his team with four stolen bases (and no caught stealings) and a 26.4 speed score.

The biggest question mark of the night is whether Steve Scalise, the House majority whip who was shot in the leg at last year’s shooting and remained in critical condition for several days thereafter, will be able to man his old position at second base. Although it was once feared that he may never walk again, Scalise told Fox News this week that “being able to walk out on to that field Thursday night is going to be a special, special moment.” Even if he just gets one at-bat, it will be to his team’s advantage: known even before the shooting as one of the GOP’s hardest-working players, Scalise has gotten on base in three of his four career plate appearances. He’s also scored more runs (five) than anyone else on his team, although that’s more an indictment of a Republican offense that’s averaged only 4.4 runs per game since 2009. Only if they improve on that number, and if Walker continues his sleight of hand on the mound, do Republicans have a shot at winning this year.


Domingo German Gets Whiffs Like Shohei Ohtani

If you first heard of Domingo German when he threw 6 no-hit innings in his debut start against the Indians, you are not alone. Travis Sawchik posted last month that many hardcore baseball enthusiasts may be like you. Domingo German threw 6 1/3 perfect innings Tuesday, but he wasn’t perfect through 6 1/3, as Dee Gordon led off the game with a hustle double. He struck out 9 and walked none in a dominating performance.

I want to point out a start that I think is more interesting than either of those, and it occurred last Thursday against the Rays. German was excited after the game to have picked up his first career pitcher win. That’s not why I think it’s interesting. Thursday, Rays hitters swung and missed an astounding 26 times in 91 pitches. That’s the best rate in a start all year. In fact, it’s the best rate since Yu Darvish baffled the Rays in July of last year.

Josh Hader got 15 swings and misses in 32 pitches in a relief outing against the Twins. Please appreciate Josh Hader before continuing.

Returning to our regularly scheduled programming, Domingo German now has a swinging strike rate of 15.8%, which ranks second in baseball behind only Max Scherzer.

As Jeff Sullivan put it yesterday in his excellent article about German, “when you sort by swinging strikes, you get a list of extremely talented pitchers.” This is that list, and the pitchers on it have elite stuff.

You can also look at contact rate, where German has the third lowest in the league, behind Shohei Ohtani and Scherzer. One third of the time batters have swung at German’s pitches, they have missed. Swinging strike rate is just a function of swing rate and contact rate, specifically: swinging strike rate = swing rate × (1 − contact rate).
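The identity above is easy to sanity-check in Python. This sketch plugs in the 15.8% swinging strike rate and treats the "one third of swings missed" figure as an exact contact rate of two thirds, which is an approximation. The function names are hypothetical, and the implied swing rate is a back-of-envelope value, not a reported stat.

```python
def swinging_strike_rate(swing_rate: float, contact_rate: float) -> float:
    """Swinging strikes per pitch = swings per pitch * misses per swing."""
    return swing_rate * (1.0 - contact_rate)


def implied_swing_rate(swstr_rate: float, contact_rate: float) -> float:
    """Invert the identity to back out the swing rate it implies."""
    miss_rate = 1.0 - contact_rate
    if miss_rate <= 0:
        raise ValueError("contact_rate must be below 1 to back out a swing rate")
    return swstr_rate / miss_rate


# Approximation: "one third of swings missed" taken as a contact rate of exactly 2/3.
contact_rate = 2 / 3
swing_rate = implied_swing_rate(swstr_rate=0.158, contact_rate=contact_rate)
print(round(swing_rate, 3))
```

Because the two rates multiply, a pitcher can post a high swinging strike rate either by drawing lots of swings or by getting misses on the swings he does draw.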

German was never much of a prospect. Kiley gave him a 40 FV in 2015 on the strength of his long healthy track record in the minors, and then he promptly needed Tommy John. He worked his way back to a 40 FV this year. His stuff graded out above average, but now he’s tougher to make contact against than Chris Sale or Noah Syndergaard. I can’t fully explain it. I have some guesses, and interesting things to show you, but I’m still frankly surprised and confused.


Nick Punto was Right: Evaluating the Game’s Dramatic Bullpen Evolution via Machine Learning

When I played for Oakland, the guys who weren’t playing tended to congregate at the far end of the dugout, next to the bat rack. Mind you, that was usually me. It’s kind of a weird place to stand since we were pretty much always in the way, but there weren’t a ton of options.

One of those days, I was down there with Nick Punto. I didn’t spend much time with him, but he was one of the funniest guys I’ve played with. He had just dispatched Billy Burns up the approximately fifteen flights of stairs to the clubhouse to make him a PB&J. While we were waiting for Billy, I was asking Nick about how the game had changed since he’d started playing. He debuted in 2001. It was 2014.

I wasn’t taking notes, but I’d paraphrase what he said as, “Bullpens are way nastier than they used to be.”

Side note: It was probably fate that the first thing he thought of was the bullpen. I still can’t think of bullpens without thinking of the 2014 Royals. For those of us in the dugout, the Wild Card Game that year was heartbreaking. We had a four run lead in the eighth. I knew I’d come off the playoff roster if we won and went to Anaheim, but I’d get to make the trip, not to mention collect a full playoff share. What I didn’t know was that it would be my last game in the big leagues. 

I was trying to fathom what playing fourteen years in the big leagues would be like when Billy got back down. He had just gotten called up for September and (like me) wanted to be on the veterans’ good side. He was walking towards us when Punto gave us a quick wink.

“I said CRUNCHY peanut butter!” he yelled. “Go get another one.” And he took the sandwich and stomped on it.

Bullpens Aren’t Created Equal

As the right-handed hitting half of a first base platoon, I needed to be ready for lefty relievers. I’d get to the field and watch video on all the lefties in the other team’s pen. And in the fifth inning, I’d go inside and start getting loose and hitting flips in case I got to pinch hit. I was always asking for flips. I probably annoyed the hell out of Chili Davis, our hitting coach, especially since there was usually little chance I’d actually pinch hit.

There was a lot of variation in what we could see out of a bullpen.

We’ve talked a lot about how league average has changed. Do I even need to link to a story about how strikeouts and velocity have been rising? You’re reading the community blog at FanGraphs. You already knew.

With all the talk about aggregate changes, I think something that gets lost in the discussion is how some teams just have nastier pens than others. It’s tempting to see league average fastball velocity and forget that it’s just an average.

I’ve been thinking about what Shredder said that day (Nick Punto’s nickname is Shredder). Yes, bullpens as a whole have changed, but can we look at individual ones? Can we assign a “beginning”, “middle” and “end” to this story? Can we categorize the bullpens by where in the story they fall?

Let’s Try

It just so happens that FanGraphs has velocity and plate discipline stats going back to 2002, which is basically when Nick Punto started playing. That’s the data I used for this post. I did the analysis, and made the graphs, in R.

Our first chart represents fastball velocity in four seasons: 2002, 2008, 2013 and 2018.

It’s clear that relievers are throwing harder. What’s interesting is the 2002 curve is much more spread out. There was more team-to-team variation in what you’d see out of the pen. See that little blip all the way at the left? That’s the 2002 Expos, averaging 86.2 mph. Yes, with the fastball.

Fast forward to 2018. The curves are filled in with 80% opacity so we can see what’s behind them. Sure enough, all the way at the right, we’re in pretty uncharted territory. That’s the Yankees and Pirates, both averaging 95 mph.

In case you’re wondering, that purple outlier hovering by itself at 90 mph is the Padres (and their 87 bullpen ERA-).

More Than Velocity

It’s fun to look at how velocity has evolved, but I’d like to try looking at more variables. In fact, I’d like to look at ten variables and try to see how they fit together. We’re only going to be looking at input variables such as velocity and swing percentages. I’m not going to use results variables like ERA or WAR. I gathered the data from FanGraphs and built a correlation chart:

It’s not surprising to see that fastball velocity has a 0.65 correlation with o-swing percentage. As a hitter, it’s pretty simple. The faster the ball comes in, the less time you have to make a good decision. It’s also pretty straightforward to see that o-swing% and zone% (percent of pitches that are in the zone) have a strong negative correlation. If you’re going to swing outside the zone, I’m going to throw it outside the zone.

It also looks like hard throwing pitchers sacrifice control for velo (Zone and FBv correlate at -0.62). That or they take advantage of the higher o-swing% afforded by said velo and throw more pitches for chase. That’s so 2018.

I was interested to see that fastball velocity has a -0.29 correlation with fastball percentage. Brandon Moss used to say that pitchers who throw the hardest seem to use their fastballs the least. He may have been on to something.
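For readers who want to build a correlation chart like this themselves, here is a minimal sketch in Python with NumPy. The analysis in this post was done in R, so this is not that code, and the numbers below are synthetic stand-ins rather than FanGraphs data. The variable names and the strength of the built-in relationships are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of team-seasons

# Synthetic stand-ins, NOT real data: velocity nudges O-Swing% up
# and Zone% down, with noise on top.
fbv = rng.normal(92.5, 1.2, n)
o_swing = 30 + 0.8 * (fbv - 92.5) + rng.normal(0, 1.0, n)
zone = 48 - 0.7 * (fbv - 92.5) + rng.normal(0, 1.0, n)

names = ["FBv", "O-Swing%", "Zone%"]
# np.corrcoef treats each row as one variable and returns Pearson r.
corr = np.corrcoef(np.vstack([fbv, o_swing, zone]))

for name, row in zip(names, corr):
    print(f"{name:>9}", " ".join(f"{r:6.2f}" for r in row))
```

With the real ten variables, the same call gives a 10x10 matrix, and each off-diagonal cell is one of the numbers quoted above.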

Fun with Dimensionality Reduction

Now let’s use these variables to make a ten-dimensional graph! In order to do this, we’ll need to start with a principal components analysis. PCA creates new variables, called principal components, that are linear combinations of our original ten. They are ordered by how much of the data’s variance they capture, so the first two hold more of it than any other pair. What’s nice is we can now express our data in terms of these new variables. Because each principal component draws from all ten of the original variables, we can graph a good approximation of our ten-dimensional data using just two axes: Principal Components 1 and 2.
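To make the idea concrete, here is a minimal PCA sketch in Python with NumPy, using a singular value decomposition of the standardized data. This is one common way to do it, not necessarily what the R code behind this post does. The function name `pca_scores` and the random 60x10 matrix are assumptions for illustration only.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project rows of X onto the first principal components."""
    # Standardize so variables on different scales (mph vs. percent)
    # contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions, ordered by variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    # Each score is a linear combination of all original variables.
    scores = Z @ components.T
    return scores, components


rng = np.random.default_rng(1)
X = rng.normal(size=(60, 10))  # synthetic: 60 bullpens, 10 variables
scores, components = pca_scores(X)
print(scores.shape)  # (60, 2)
```

Each row of `scores` is one bullpen's (PC1, PC2) coordinate, which is what gets plotted on the two axes.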

Before we move on, let’s take a look at our new variables:

In the correlation circle above, the horizontal axis is Principal Component 1 and the vertical axis is Principal Component 2. Each arrow corresponds to one of our original variables from FanGraphs. In order to interpret the arrows, we’ll start by looking at how far they go horizontally. Let’s look at O.Swing%. It points very far to the right, but only a little bit down. That means that Principal Component 1 (horizontal axis) has a strong positive correlation with O.Swing. In other words, if you have a high score for PC1, it’s associated with having a high O.Swing rate. The fact that it only points a little bit downward means PC2 has only a weak negative correlation.

We can see that PC1 is going to be associated with arrows that point far to the right (positive) or left (negative). So PC1 looks like it’s going to be associated with high O.Swing rates, high fastball velocity, and high swinging strike rates. It will also be negatively associated with zone rate, fastball percentage and contact rate. In summary, if you score high on PC1, you throw hard, throw a lot of offspeed, get lots of swinging strikes and throw lots of pitches out of the zone. Sounds familiar.

Let’s look at PC2. This one looks like it’s most associated with low contact rates.

One more point to make. Those percentages on the axis labels represent the percentage of the total variance that each PC captures. So by using PC1 and PC2 together, we can see over half the variance of our ten dimensional data.
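Both the arrows and the axis percentages fall out of the same decomposition. The sketch below, in Python with NumPy, assumes the usual correlation-circle convention that an arrow's coordinates are the correlations between a standardized variable and the PC scores. The data is synthetic, and this is not the R code used for the charts.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic data with one shared underlying factor plus noise.
latent = rng.normal(size=(90, 1))
X = latent @ rng.normal(size=(1, 10)) + rng.normal(size=(90, 10))

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
_, S, Vt = np.linalg.svd(Z, full_matrices=False)

# Variance captured by each component, and its share of the total.
eigenvalues = S**2 / (len(Z) - 1)
explained = eigenvalues / eigenvalues.sum()

# For standardized data, corr(variable j, PC k) equals
# loading[j, k] * sqrt(eigenvalue[k]); these are the arrow tips.
arrows = Vt[:2].T * np.sqrt(eigenvalues[:2])

print("PC1 + PC2 share of variance:", round(float(explained[:2].sum()), 3))
print("first variable's arrow (x, y):", np.round(arrows[0], 2))
```

An arrow that nearly reaches the unit circle means the two plotted components explain almost all of that variable; a short arrow means most of its variation lives in components that are not shown.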

K-Means and PCA Chart

I said earlier that I hoped our story would have a beginning, a middle and an end. I wanted to see if there were three distinct phases to the evolution of bullpens since the beginning of Punto’s career. To help visualize this, I ran a machine learning algorithm called K-means. It “learns” the data and generates clusters centered at different points. In order to run the algorithm, you have to specify how many clusters you want. I chose three (k=3). Ideally, the three clusters would represent some kind of narrative. (I got the idea for this method here.)
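For a sense of what the algorithm does under the hood, here is a bare-bones k-means (Lloyd's algorithm) in Python with NumPy. It is a teaching sketch on synthetic two-dimensional points, not the R routine used for this post. The function name `kmeans` and the three made-up blobs are assumptions.

```python
import numpy as np


def nearest(points, centers):
    """Index of the closest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k=3, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(points, centers)
        # Move each center to the mean of its assigned points;
        # keep the old center if a cluster ends up empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return nearest(points, centers), centers


rng = np.random.default_rng(42)
# Three made-up blobs spread along the horizontal axis.
points = np.vstack([
    rng.normal(loc=(-3.0, 0.0), scale=0.6, size=(30, 2)),
    rng.normal(loc=(0.0, -1.0), scale=0.6, size=(30, 2)),
    rng.normal(loc=(3.0, 0.0), scale=0.6, size=(30, 2)),
])
labels, centers = kmeans(points, k=3)
print(np.bincount(labels, minlength=3))  # cluster sizes
```

The result depends on the random starting centers, so a single run can settle on a poor split. Library implementations usually rerun from several starts and keep the best one.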

Finally, here’s the graph:

There’s a lot going on here. We’re looking at a two-dimensional representation of ten-dimensional data. The dots represent each team bullpen since 2002. The circles contain the bullpens in four different seasons: 2002, 2008, 2014 and 2018. Finally, the colors are our clusters. Sure enough, the clusters give us a pretty decent story. The points are basically moving from left to right.

These axes are our principal components. Like we said earlier, having a high score in PC1 means you throw hard, throw a lot of offspeed, throw lots of pitches for chase, and get lots of swinging strikes. The data is clearly moving to the right as the years go by, which means all of these things are increasing.

What’s cool is that the k-means algorithm settled on three clusters that definitely demonstrate an evolution in bullpens. We can call these “Phase 1,” “Phase 2,” and “Phase 3.” The names are arbitrary, and so was the choice of three clusters, but they help tell a story. Intuitively, a team in Phase 1 pitches like a 2002 bullpen, whatever that means. A team in Phase 3 pitches like a 2018 bullpen.

To simplify, I made another graph with just the four years we’ve been talking about.

The three cluster centers are in red. The 2014 Royals are their own color, as are the 2018 Yankees.

Phase 1 is associated with the lowest values of PC1. In Phase 2, the values of PC1 are higher but the PC2 values are lower. In Phase 3, the PC1 values are the highest, while PC2 is back to about where it was in Phase 1. Again, these are abstract, but just meant to tell a story.

Every team in 2002 was in Phase 1. By 2008, the game had clearly changed. The circles hardly overlap and while the 2002 circle contained all bullpens in Phase 1, the 2008 circle has bullpens in all three phases. 2002 to 2008 appears to have the most drastic changes.

I figured that the 2014 Royals would be some type of temporal outlier. They were one of the only teams that didn’t try to play matchups to get those last nine outs. They didn’t need to. Herrera, Davis, Holland. I’d be hitting flips in the cage next to the visitors’ dugout in Kauffman, but once those guys came into the game the righty pinch hitters could pretty much sit back down.

It turns out that they are a Phase 2 bullpen right in the middle of the other 2014 teams. They had some guys that threw gas, but in terms of the way they attacked hitters, it was still a 2014 approach.

The 2018 circle is much more spread out. Twenty-three bullpens look like they could be at home in 2014 or even 2008, but there are seven outliers:

Rather than point to outliers in one variable such as velocity, we can look at these seven bullpens and say that using all ten of the original FanGraphs variables, these are some of the most unique bullpens we’ve seen.

In 2018, twenty-five out of the thirty teams are pitching in Phase 3. Again, this has nothing to do with success variables like WAR or ERA. It’s more about their velocity, their mix of pitches, and how they attack the strike zone.

If you’re interested, the five Phase 2 bullpens of 2018:

Cardinals, DBacks, Marlins, Reds and Royals.

And the point is?

It would be interesting to explore PCA and k-means further, maybe even look at starting rotations. PCA is pretty abstract, especially compared to something like ERA- or FIP. I wanted to dive into this to see if we could visualize the way things have changed. The k-means gave us a cool breakdown of the story, which we arbitrarily called Phases 1, 2 and 3. It was a fun way to represent how the game has changed.

Thanks, Shredder.


Examining the Struggles of Ozzie Albies Through the Lens of Neuroscience

Ozzie Albies has been at the heart of his team’s unexpected push for the NL East division lead all season. He was there before Ronald Acuña came up. He’s been healthy since Acuña got hurt. He blasted through April with a triple slash of .293/.341/.647. A .647 slugging percentage! Everyone was astounded. Articles were written about how rare and mystifying it was, whether it was sustainable, and how it was nearly impossible to provide a comp for him because there hasn’t been a player like him before. He appeared to be imposing his will on anyone who dared to pitch to him.

Well, gang, May happened. And June is in the midst of happening. And while his overall performance to date still provides us great insight into the player we can look forward to, Albies has had a much tougher go of things. That triple slash slunk to .264/.306/.432 in May. So far this month, it’s at .154/.200/.346.

The good has been unprecedented; the bad has turned abysmal. Each has been more extreme than his profile ever seemed to offer. When Albies was first called up last year, Baseball Prospectus said he “has a slash-and-dash offensive approach that marries well with his advanced bat control and plus-plus speed.” But since he’s been in the Bigs, he’s been more of a free-swinging, freewheelin’ monster.

In 2017, he offered at more than 51% of the pitches he saw. Had he qualified, that would’ve placed him in the bottom 20% of the league, in the company of Yangervis Solarte and Brandon Crawford. This season he’s been even more severe, swinging at more than 55% of all pitches faced. That puts him in the bottom 5% of qualifiers. So, really, what is going on?


This gif shows the plate from the catcher’s view, and consists of only left-handed plate appearances by Albies. That side accounts for about 70% of his plate appearances and is where the struggles have really come in, as he’s hit only .232 from the left side as opposed to .318 from the right.

On the left side of the gif is a heatmap of Albies’s swing percentages. On the right is where pitchers have located to him. The first is through April, and the second is from May through 6/14. At the start of the season, pitchers filled the zone and challenged him. Per Baseball Savant, more than 41% of pitches he faced crossed the plate that month, and he used his exceptional bat control to punish those balls. However, since May, pitchers have thrown it in the zone far less — a shade under 33% of their total pitches to him. When you’re swinging at more than 55% of the pitches you’re seeing, but only one in three is over the plate, you’re bound to run into trouble.
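A little arithmetic shows why those two numbers can't coexist comfortably. The Python sketch below uses the rounded 55% and 33% figures from the text and assumes the best case for the hitter, that he swings at every pitch in the zone. The function name is hypothetical, and the two rates come from slightly different samples (full season versus left-handed plate appearances since May), so treat the result as a rough floor, not a measured chase rate.

```python
def min_chase_rate(swing_rate: float, zone_rate: float) -> float:
    """Lowest possible swing rate on pitches outside the zone.

    Assumes the hitter swings at every in-zone pitch, so any
    remaining swings must come on pitches out of the zone.
    """
    swings_outside = max(0.0, swing_rate - zone_rate)
    return swings_outside / (1.0 - zone_rate)


# Rounded figures from the text: 55% swing rate, 33% of pitches in the zone.
floor = min_chase_rate(swing_rate=0.55, zone_rate=0.33)
print(f"{floor:.2f}")  # prints 0.33
```

So even in the most generous scenario, he would be chasing about a third of everything off the plate, and any in-zone pitch he takes pushes that number higher.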

There are two possible suggestions to make for Albies here. One would be mechanical, assuming something is wrong with his swing. That would probably be premature, given how good he’s been at such a young age. The other would be mental, which seems more likely. His advanced bat control appears to have convinced him that he can hit anything, so he’s going for it. But by doing so, he might be poorly manipulating the signals in his brain he uses to make contact.

Bijan Pesaran, a professor of neuroscience at New York University, explains it this way through the scope of ping pong players:

“When [they] are playing at a high level, they look at the ball up to the point where they hit it. As soon as the paddle makes contact with the ball, you can see their eyes and head turn to now look at their opponent. They think they are looking at their opponent when they are hitting the ball, but they are looking at the ball. Their eyes are tracking the ball, even though they are aware of their opponent.”

Pesaran also says that the cerebral cortex is arranged more like a mosaic than a traditional puzzle. That’s the part of the brain ballplayers would use for pitch recognition and location. If Albies is going to parts of the zone he’s unfamiliar with — parts he doesn’t approach when he’s hitting at a high level — he’s essentially attempting to rearrange the mosaic network that relays the signals from his brain to his swing. It also means he could be looking at the ball longer since he’s not used to seeing it in those places.

The result is a hitch in the 200-millisecond cycle where his brain processes a pitch and tells his body to swing, which may be causing, or at least contributing to, the struggles in which Albies finds himself swamped.

Ozzie Albies didn’t suddenly turn into a pumpkin after a flare of greatness. He’s too good for that. But he does need to adapt to a league that’s already adapted to him. His next step forward could require recognizing his limits.

Pitch charts from Baseball Savant. All other data from FanGraphs. Gif made with Giphy.