Is Exit Velocity Important?

Last season, MLB released Statcast, an innovative tool used to evaluate player movements and athletic skill. Defensively, it can track how efficient a player’s route to the ball was, how much ground he covered, arm strength, top speed, and many other factors. It can also track baserunning metrics, such as lead distance, taking an extra base, max speed, and home-run trot, among other things. Statcast also tracks pitching and hitting metrics. MLB teams can now use iPads in the dugout, meaning they have an endless supply of information at the touch of a finger.

Recently, Albert Chen of Sports Illustrated wrote a piece on various teams’ use of Statcast. The article notes how Pirates hitters would review a pitcher’s spin rate before an at-bat. If the spin rate was high, they would expect something lower in the zone. Even Kris Bryant credits Statcast, saying he improved his launch angle, aiding in his breakout, possibly MVP, season. All teams have been using the data, says Chen, though in different ways. Daren Willman, who heads BaseballSavant, describes the use of Statcast as an “arms race,” as teams now have this bank of information at their disposal. Willman analyzes this Statcast data himself, looking at player comparisons and evaluations. The tricky thing, according to Willman, is knowing what information to look at. He says, “It’s so massive, it’s just about asking the right questions . . . the answers are all there.”

The Tampa Bay Rays, a forward-thinking club, tell their players on the first day of spring training that the Rays value their batted-ball velocity, rather than batting average. Similarly, the New York Mets decided to take Lucas Duda over Ike Davis to be their 1st baseman of the future. Duda soon started to mash the ball, before struggling with injuries. Davis, on the other hand, is still looking for major-league employment.

Some of the highest exit velocities belong to sluggers like David Ortiz, Josh Donaldson, Miguel Cabrera, and Giancarlo Stanton. Perhaps this is not surprising. There are, however, some players with high exit velocities who are not in the upper echelon of MLB, such as Chris Carter and Khris Davis. Both of these sluggers have low batting averages but high exit velocities. At the same time, both have solid slugging percentages, hovering around .500. What can this data tell us? Is exit velocity related to batting average? Slugging percentage? wOBA?

My initial thoughts pointed me towards BABIP (batting average on balls in play). My thinking was that if these players hit the ball harder, on average, then their contact would be more likely to go for a hit. If the ball is hit harder, the defense has less time to react and make a play. I looked at BABIP instead of just batting average, since BABIP overlooks a player’s tendency to strike out. A lot of the guys with high velocities are big swingers, so it would make sense if they tend to swing and miss. So I set out to test these hypotheses, and the results may surprise you.
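For readers who want the definition spelled out, here is a minimal sketch of the standard BABIP formula, (H - HR) / (AB - K - HR + SF). The function name and the sample stat line are hypothetical and only illustrate why strikeouts and home runs drop out of the calculation.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical big swinger: 150 H, 35 HR, 550 AB, 180 K, 5 SF.
# His batting average is 150 / 550 = .273, but strikeouts and
# home runs are removed before BABIP is computed.
print(round(babip(150, 35, 550, 180, 5), 3))  # 0.338
```

Because the strikeouts leave the denominator, a high-strikeout hitter can post a low batting average and a healthy BABIP at the same time.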

At first, I looked at the relationship between BABIP and exit velocity by performing a linear regression between the two. Here is the result:

[Figure: BABIP vs. exit velocity]

No relationship at all: an R-squared of 0.03. Looks like I’m 0 for 1 so far. My theory that harder-hit balls would result in more hits, on average, looks to be incorrect, as there is no relationship between the two in the data. Perhaps this aligns with the idea that a pitcher has little control over a ball once it is put in play: unless the batter hits a HR, he too has little or no control over the result (as a reminder, HR is not included in BABIP since the ball is not in play).
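As a rough illustration of what an R-squared like this measures, here is a sketch of the calculation for a one-variable linear regression, written in plain Python. The function name is mine, and this is not necessarily how the regressions in this post were run; for a single predictor, R-squared is simply the squared correlation between the two columns.

```python
def r_squared(xs, ys):
    """R-squared of a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    # With one predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Toy data (not real players): a perfect line gives 1.0,
# a noisier relationship gives something smaller.
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))
print(r_squared([1, 2, 3, 4], [1, 3, 2, 4]))
```

An R-squared of 0.03 means exit velocity accounts for about 3% of the variation in BABIP across the sample, which is why the scatter looks like a cloud rather than a line.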

So, I will continue to my next ideas. If these players are big swingers, they probably strike out more, right? Well, sort of; a weak correlation exists, if any at all. I’ll take the loss on this one — 0/2. With a correlation of 0.11, it is hard to say a relationship exists. Here is the graph:

[Figure: strikeouts vs. exit velocity]

I then looked at other hitting metrics to see if a relationship exists. Specifically, I looked at the stats generally associated with exit velocity: Home runs, slugging percentage, and isolated power.

First, I’ll show the relationship between home runs and exit velocity. A relationship definitely exists here. It may not be a direct relationship, but players with high exit velocities had more home runs. Now, some of this is tied to other factors, such as how often they make contact with a pitch, what their fly-ball and ground-ball rates are, and how often they strike out. These factors also play a role in the number of home runs hit, as does exit velocity. Nonetheless, as one might expect, a relationship exists. The R-squared on the regression is 0.37. Here is the graph:

[Figure: home runs vs. exit velocity]

Next, I looked at slugging percentages as well as isolated power. The difference between these two metrics is that isolated power equals batting average subtracted from slugging percentage. It tracks how often a player hits for extra bases, since singles are subtracted out of the equation. Nonetheless, both of these metrics track total bases and include more information about the hitter’s power.
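To make the difference between the two metrics concrete, here is a small sketch using the standard definitions of batting average, slugging percentage, and isolated power. The function names and the sample stat line are hypothetical.

```python
def batting_average(singles, doubles, triples, home_runs, at_bats):
    """Hits per at-bat."""
    return (singles + doubles + triples + home_runs) / at_bats


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def isolated_power(singles, doubles, triples, home_runs, at_bats):
    """ISO = SLG - AVG, i.e. extra bases per at-bat."""
    line = (singles, doubles, triples, home_runs, at_bats)
    return slugging(*line) - batting_average(*line)


# Hypothetical season: 90 singles, 30 doubles, 2 triples, 28 HR in 550 AB.
line = (90, 30, 2, 28, 550)
print(round(slugging(*line), 3))        # 0.487
print(round(isolated_power(*line), 3))  # 0.215
```

Subtracting batting average removes one base from every hit, so a single contributes nothing to ISO while a double counts once, a triple twice, and a home run three times.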

After running my regression between slugging percentage and exit velocity, the graph shows another relationship. Again, it is a weaker relationship, but a relationship exists. The R-squared on the regression again was 0.37, so about the same value as home runs and exit velocities. So again, players with higher exit velocities are more likely to have a higher slugging percentage. Here is the graph:

[Figure: slugging percentage vs. exit velocity]

Isolated power again shows a similar relationship, as the R-squared on the regression was 0.39. Other factors, such as strikeout rate, help explain isolated power, just as they do slugging percentage and home runs. Nonetheless, isolated power is positively related to exit velocity.

[Figure: isolated power vs. exit velocity]

For those wondering, I left out metrics such as OBP and wOBA because they incorporate how often a player walks, which has nothing to do with how hard a player hits the ball. I did run the regressions, and the R-squared values were around 0.30 for both metrics.

So what does this all mean? Should teams focus on exit velocity? What about launch angle?

For the record, launch angle did seem to have a weak relationship with HR, with an R-squared value of 0.25, so another relationship seems to exist.

Wrapping it all up, it seems that exit velocity is a good way to gauge the power of a player. Yes, other things matter, such as launch angle, strikeout rate, and fly-ball and ground-ball rates. Is it the end-all, be-all of a player? No, of course not, but it may be better able to tell a player’s true power than a recent stretch of hot play. Players must also learn to work the count and draw walks, which is separate from exit velocity.

Nonetheless, it is smart to look at exit velocities. There are other important factors, and teams should not neglect these factors, but focusing on exit velocities is a good way to determine the raw power of a player. Also, it can show the potential in an undervalued player, who may have a low batting average, but has an ability to hit for power that is hiding beneath a cold stretch.

Anyways, it looks like major-league baseball teams do know more than me. Oh well, I’m working on it.


Ken Giles is Back to His Dominant Self

It didn’t take long for Ken Giles to make a name for himself, despite coming up as “just” a set-up man on a relatively bad Philadelphia Phillies team. In his debut season in 2014, Giles struck out 64 batters in 45.2 innings and posted a minuscule 1.18 ERA and 1.34 FIP as a 23-year-old. Last year, Giles followed up his stellar freshman season with an equally impressive sophomore campaign, fanning 87 batters in 70.0 innings of work, notching a 1.80 ERA and 2.13 FIP. After a trade-deadline deal sent incumbent closer Jonathan Papelbon to the Washington Nationals, Giles officially took over the closer role in Philadelphia and finished the year with 15 saves, all coming after July 28.

After those two fantastic seasons in the back end of Philadelphia’s bullpen, the rebuilding Phillies decided that their young relief ace was more valuable to them as a trade chip than a current player, and on December 12, 2015 Giles was traded (along with a low-level prospect) to the Houston Astros for a quintet of players, including former No. 1 overall draft pick Mark Appel and young right-hander Vince Velasquez.

Giles’ role with his new club was not immediately obvious, but many speculated he would be the team’s closer heading into spring training. The club, however, kept quiet on the matter, further fueling public debate over Giles’ best fit on the team. On April 4, Astros manager AJ Hinch announced that club veteran Luke Gregerson would begin the season as the team’s closer. This displeased some — despite Gregerson’s success within the role in 2015 — due to the seemingly steep price the club paid to acquire Giles.

Hinch’s decision was validated almost immediately, as Giles’ season began about as poorly as one could’ve imagined. In 115.2 innings pitched between 2014 and 2015 in Philadelphia, Giles allowed just three home runs — a number he matched in less than four innings with the Astros, as he allowed longballs in three of his first four outings with Houston. Through April, Giles had allowed 10 earned runs in 10 innings. However, his peripheral numbers were not awful, as he struck out 14 batters and walked just four, giving him an xFIP of 3.23 despite the 6.75 FIP and 9.00 ERA. Giles’ HR/FB rate was an astounding 40 percent, compared to the league average of 11.8 percent over the season’s first month.

May was a slight improvement for Giles, as he went the entire month without allowing a home run and continued to strike out batters at a good rate. Over 11.1 innings, he fanned 14 batters and walked five while allowing five earned runs. For the month, he accumulated a 3.97 ERA, 2.00 FIP, and 3.93 xFIP. At the end of May, Giles had pitched 21.1 innings with a 28:9 K:BB ratio, and had a 6.33 ERA, 4.23 FIP, and 3.60 xFIP. Perhaps the most troubling statistic, however, was Giles’ ground-ball rate, which sat at just 31.1 percent after his first two months. Over his first two seasons, Giles’ ground ball rate was much higher, at 44.6 percent. Giles’ strand rate was also nearly 78 percent in his time with Philadelphia, but stood at just 66.9 percent over April and May of 2016.
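The monthly ERA figures above can be reproduced with a short sketch. One detail trips people up: in baseball notation the digit after the decimal point counts outs, so 11.1 innings means 11 and 1/3, not 11.1. The helper names below are hypothetical.

```python
def innings_to_float(ip):
    """Convert baseball notation ('11.1' = 11 and 1/3 innings) to a number."""
    whole, _, outs = str(ip).partition(".")
    outs = int(outs) if outs else 0
    if outs not in (0, 1, 2):
        raise ValueError("the digit after the point must be 0, 1, or 2 outs")
    return int(whole) + outs / 3


def era(earned_runs, ip):
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_to_float(ip)


print(f"{era(10, '10.0'):.2f}")  # April: 9.00
print(f"{era(5, '11.1'):.2f}")   # May: 3.97
print(f"{era(15, '21.1'):.2f}")  # April and May combined: 6.33
```

Passing the innings as a string avoids any floating-point surprises when splitting the whole innings from the outs.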

While May was an improvement over April, Giles was still not nearly as effective as he was in his stint with the Phillies. June was a bigger step in the right direction, though, and Giles once again brought down his monthly ERA to 2.31. He allowed three earned runs in 11.2 innings of work, striking out 14 and walking just two. That month, his FIP and xFIP were both around 2.50 and his strand rate and ground-ball rate increased to 88.2 and 38.7 percent respectively.

July went even better than Giles could’ve hoped, as he allowed just three hits and no runs over 8.2 innings, striking out an astounding 18 batters while walking just two. Once again, his strand rate and ground-ball rate increased, to 100 percent and 45.5 percent, respectively. For the month, his FIP was actually negative, at -0.31. August has gone well for Giles, too, as he’s struck out 21 batters against two walks in 10.2 innings (through 8/30). The long ball has hurt him a bit, with solo homers accounting for two of the three earned runs allowed in the month, but his ERA for the month sits at 2.53, and he owns a 2.49 FIP. His strand rate in August sits at 88.2 percent, and his ground-ball rate has gone up again to 52.4 percent. On August 7, Giles even had a game in which he struck out six batters in just 1.2 innings.
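A negative FIP looks odd, but it falls straight out of the formula. The sketch below uses the standard FIP definition, (13*HR + 3*(BB + HBP) - 2*K) / IP plus a league constant. Two things here are assumptions rather than figures from this article: a constant of roughly 3.15 (it changes every season) and zero hit batters for Giles in July.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.15):
    """Fielding Independent Pitching.

    `constant` is an assumed league constant; the real value varies by season.
    `innings` is a true number of innings (8 and 2/3, not the notation 8.2).
    """
    raw = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return raw / innings + constant


# July: no home runs (he allowed no runs), 2 BB, 18 K in 8 2/3 innings.
# Zero hit batters is an assumption.
print(round(fip(0, 2, 0, 18, 8 + 2 / 3), 2))  # -0.31
```

With 18 strikeouts and nothing on the other side of the ledger except two walks, the strikeout credit outweighs the league constant, which pushes the result below zero.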

Since the beginning of June, Giles’ numbers are eye-popping, especially when compared to his numbers through May:

[Table image comparing Giles’ numbers through May with his numbers since June 1. Source: BaseballEssential.com]

Giles is a pitcher who relies heavily on his “stuff” to get outs. He throws just two pitches — fastball and slider — so getting hitters to guess on a wide variety of pitches isn’t his game. However, both his fastball and slider are excellent offerings, which gives him the ability to succeed despite a limited arsenal. When working with just two pitches, location is important to keep hitters off-balance. In the first two months of the season, Giles’ location was his issue — the velocity and movement on both pitches has been comparable throughout the season — as you can see from the heat maps of his pitches through May:

Fastball:

[Heat map image. Source: BaseballSavant.com/Plotly]

Slider:

[Heat map image. Source: BaseballSavant.com/Plotly]

The fastball seemed to be erratic, with no one area particularly heavily worked compared to others. The most concentrated area was inside to right-handers, which is an area that allows batters to hit to the pull field, where the most damage is done. His slider was also left close to the zone most of the time, which limited his ability to generate swings and misses on the pitch. Since the beginning of June, however, Giles has improved his command considerably, as evidenced by the second set of heat maps from June-August:

Fastball:

[Heat map image. Source: BaseballSavant.com/Plotly]

Slider:

[Heat map image. Source: BaseballSavant.com/Plotly]

Giles’ fastball is more consistently located closer to the middle of the zone and towards the lower half, which not only allows him to force batters to swing at the pitch but keeps them from turning on balls and doing damage to the pull field. His slider heat map is almost identical to the one from April and May, but shifted lower by almost a foot. Instead of working from the middle of the zone to the bottom, he’s now working the slider from the bottom of the zone to down below the knees. This has given Giles the ability to not only get more whiffs on balls out of the zone, but to generate more ground balls with the pitch. The fact that the fastball and slider locations are more similar also likely gives Giles an advantage, as he can play the slider off of the fastball or vice versa.

[Table image: Giles, table 2]

Giles has also — perhaps even more importantly — changed his usage patterns since the end of May. He’s not only used the slider much more, and more effectively, but he’s also changed when he uses the slider, particularly to right-handed hitters. Compare Giles’ usage charts from the first and second “halves” of his season:

April-May:

[Pitch usage chart image. Source: BrooksBaseball.com]

June-August:

[Pitch usage chart image. Source: BrooksBaseball.com]

As you can see, Giles has begun to pitch to right-handed batters the same way he has pitched to left-handers this year. All season, Giles has used his fastball heavily to begin at-bats against lefties, and even more so when behind in the count. However, to righties — the majority of the batters he’s faced — he’d more or less mixed the two pitches equally in all situations. Yet, since the start of June, Giles has leaned more towards using the slider when ahead of righties and with two strikes. The adjustment has worked to perfection, as Giles has allowed just a .083 batting average and .167 slugging percentage to righties on the slider since June 1.

Thanks to both of the major adjustments he’s made, Ken Giles has been able to reclaim what looked in May to be a down year. Due to other struggles in the Houston bullpen, he’s even taken over the closer role, recording saves in four of his last five appearances. With the Astros desperately needing to make a push for the playoffs in the season’s final month — they enter play on August 30 two games behind Baltimore for the second American League Wild Card spot — Giles is the type of power reliever that could help the team’s playoff chances immensely down the stretch. If Giles could be the difference between winning and losing just a game or two in September, he could be the difference between the Astros making and missing the playoffs. With the way he’s been performing lately, there’s no reason to doubt that he will be a dominant closer down the stretch for Houston.


Can Dan Straily Keep Beating BABIP?

As a former prospect struggling to find his footing in the majors, Dan Straily wasn’t given an extended look in a big-league rotation after 2013. He bounced around from the A’s to the Cubs to the Astros. Now he’s on the rebuilding Reds. With the Reds, he has finally gotten another shot. The Reds were looking for someone with any kind of upside to fill the hole in their rotation. Straily fit the bill. 154 innings later, Straily is running an insanely low .239 BABIP, the third-lowest among qualified starting pitchers. That has helped him to a solid 3.92 ERA, which was at 3.50 before a recent blowup against the Angels. Before then, however, he had managed 10 starts in a row without allowing more than three runs. Can Straily keep running a BABIP this low? Let’s find out.

The first thing that sticks out to me about Straily is that he’s an extreme fly-ball pitcher. He has the third-lowest groundball percentage and the eighth-highest fly-ball percentage among qualified starters. He also has allowed the 11th-highest average launch angle on batted balls out of the 92 pitchers who have thrown at least 2000 pitches this year. Ground balls go for hits far more often than do fly balls (although fly balls go for extra-base hits far more often), so that explains part of why Straily has such a low BABIP.

If you’re like me, you would have thought that since Straily gets a lot of fly balls, maybe he gets a lot of popups. That would certainly help him keep a low BABIP, as popups almost never go for hits. Although Straily’s fastball has good rise (he’s tied for 27th out of the 78 qualified starters who throw four-seamers), he doesn’t actually generate many popups. In fact, his IFFB% of 7.9% this year puts him firmly below the league average of 9.7%. While his career IFFB% is at 11.8%, that doesn’t help explain why he’s run such a low BABIP this year specifically. Let’s look elsewhere.

Does he do a good job of limiting quality contact? He has allowed the 39th-highest exit velocity out of the 92 pitchers who have thrown at least 2000 pitches this year. That’s worse than average. He’s also worse than average in terms of hard-hit rate: he has the 30th-highest out of the 81 qualified pitchers. Worse, he’s tied for the seventh-lowest soft-hit rate. His line-drive rate is worse than average, the 32nd-worst out of 81. These are some troubling signs.

On the other hand, there is some good news. Straily has a nasty changeup. Observe:

[GIF: Straily’s changeup]

Among the 73 qualified pitchers who throw a changeup, Straily’s is tied for sixth in drop. That’s not surprising, especially when you consider this: there are 133 pitchers who have thrown at least 150 changeups this year, and Straily’s has the fifth-lowest average spin rate. A low spin rate allows gravity to do its job and make that sucker drop right off the table.

Straily has a nice slider, too. It’s a frisbee, with solid horizontal movement and decent drop, without sacrificing too much velocity. Observe:

[GIF: Straily’s slider]

I don’t think that Straily will maintain a .239 BABIP. Although his extreme fly-ball tendencies seemingly make it easier for him to maintain a lower BABIP, he doesn’t do enough things right otherwise. He allows too much quality contact. On the other hand, he has three solid pitches, which are also his three most-used pitches (his sinker and curve aren’t great, and he uses them accordingly). His four-seamer has good rise, and, despite mediocre velocity, that can work. Just look at what Marco Estrada is doing with a four-seamer that has good rise and averages a mere 88 MPH.

Straily’s change and slider have above-average swinging-strike rates, at 15.8% and 14.5%, respectively. The change and slider even have average groundball rates (44.9% and 46.7%). They both have lofty O-Swing percentages as well (45.5% and 38.8%), which leads me to believe their swinging-strike rates are for real. My advice for Straily would be to stop pitching to contact. I say he is pitching to contact because his Zone% this year is at 46.9%, the highest of his career. That mark ties him for 20th-highest among the 81 qualified starters, and it is well above the league average of 44.8%. So, he should stop pitching to contact because 1) he has strikeout potential and it would be worth trying to tap into it and 2) his luck with BABIP will probably run out soon.

Data from FanGraphs and Baseball Savant. Gifs courtesy of Bleacher Report and MLB.com.

Thanks for reading!


Constructing a Lineup for the Blue Jays

Jose Bautista recently came off the DL for a second time this year. John Gibbons has stated that he will predominantly DH and not occupy his usual spot in RF (this is good news if you’re a fan of the Blue Jays or a fan of outfield defense in general). But perhaps the bigger question is, where in the lineup is he going to hit? Last year he was their No. 3. He then moved to the leadoff spot, and he’s even hit second for 10 games this year.

The reason this is even a perceived issue is that Devon Travis has looked quite decent in the leadoff spot in Bautista’s absence. But let’s get something straight: Travis is no Bautista. Coming into the 2016 season, Bautista’s numbers compared to every other human in MLB since 2010 have him 1st in HR, 4th in wRC+, 3rd in wOBA, 3rd in runs scored, and 2nd in BB%. That’s spectacular. The question has become whether to keep both Travis and Bautista at the top of the lineup and simply shift Josh Donaldson and Edwin Encarnacion down one slot.

That’s the reason I started thinking about this; I was perplexed that both Bautista and Travis were going to be put ahead of Donaldson and Encarnacion. The idea that the Blue Jays’ two best hitters would be moved down the lineup for a player with just about a full season of MLB under his belt, and another who’s had a little more than 80 plate appearances since late June didn’t add up. This is a pennant race, supposedly the most critical time of the year.

Don’t get me wrong, this isn’t about dumping on Bautista, or even Travis. They’re great and good hitters, respectively. It’s about maximizing the production of your lineup. So before I moan and let everyone know my opinion is best, I thought I’d look at the data and let the numbers speak for themselves.

Hitting statistics from 2002 to 2015 were gathered and filtered by batting order. This produced 420 cases (each team the last 14 years) with six variables per place in the batting order (variables were: wOBA, BB%, ISO, wRC+, OBP, & HR). Data was then analyzed utilizing multiple regression analyses to identify what metrics at different spots in the order best predicted team runs. Results can be seen below.

Figure 1. R² values for total team runs with wOBA values for each spot in the batting order.

The most obvious component of Figure 1 is the drop in R² for the 3rd place hitter. At first this may seem counter-intuitive, as it’s typically assumed that your 3rd place hitter is the team’s best, and that as that player goes, so should the team. But the most likely rationale is that most teams, regardless of how awful or great they are, can typically muster at least one decent hitter. They place that hitter 3rd and away they go. Think about the Tigers and the Blue Jays in 2015: they had Miguel Cabrera and Jose Bautista, respectively. Comparing their 2015 stats shows an edge for Cabrera, with a .413 wOBA compared to Bautista’s .389. Yet the Blue Jays vastly outscored the Tigers. This is because good to great teams have more than one “3rd place hitter.” And they apparently stack them 2nd and 4th in the order. In fact, these two spots in the batting order combined to account for slightly more than 50% of the variance explained when analyzing team runs (using the 2nd and 4th place hitters’ wOBA).

So if a team like the Blue Jays can afford to have Jose Bautista taking up the “menial” 3rd spot in the order, shouldn’t they do it? The mean wRC+ and wOBA for No. 3 hitters in the AL last year were 116 and .351, respectively. Bautista, who’s having a down year by his standards, has a wRC+ of 115 and wOBA of .346 as of August 29th. ZiPS has him closing out the year with a wRC+ of 132 and wOBA of .369, well above league average. So it could work.

But does Bautista hitting 3rd help them?

Well, since Donaldson has the best offensive numbers (and is the reigning MVP), the offense should be built around him. And the 2nd and 4th spots in the order are the most crucial in the presented team-runs-scored analysis, so we’re going to move forward with the idea that Donaldson hits 2nd. A runs-scored model that predicts the number of runs a player will score when batting in the #2 spot can be seen in Figure 2.

Figure 2. Predicted runs for the #2 hitter by actual runs scored, using wOBA values for all 9 hitters. Fitted model: predicted runs = (315.5 * wOBA2) + (132.15 * wOBA3) + (65.6 * wOBA4) + (71.63 * wOBA5) - 100.

This analysis produces an R² value of .7 and incorporates, in order of model entry, the wOBA of the 2nd, 3rd, 4th, and 5th hitters. The variables that enter are quite sensible: the greatest source of variability in the 2nd hitter’s runs scored is that hitter’s own wOBA, followed by the next three spots in the lineup, all in sequential order. If this is done for each spot in the lineup, the same pattern emerges: the hitter’s own wOBA is the greatest source of runs-scored variability, and the 2-3 hitters following him account for an additional ~20%. So essentially, if you want to score runs, bunch your hitters together. Don’t spread them out, and don’t try to place a poor hitter in the middle to get him more fastballs. Stick all of your threats in a row.

But is there a specific combination of clustering? Predicting how many runs Donaldson will score in a season, depending on the order of Bautista and Encarnacion around him, reveals the following results: 132 runs/season where the order is JB – JD – EE, 130 runs/season where the order is JD – EE – JB, and 129 runs/season where the order is JD – JB – EE. Again, this is the number of runs scored specifically by Donaldson when batting in the 2nd spot of the order. The differences between these combinations are within the model’s error, and they wouldn’t be significant over the remaining ~30 games of the season. So as long as they’re clustered together it’s fine. This indicates that sensible lineup options have JB – JD – EE batting in succession, with Donaldson occupying the 2nd spot, and with some combination of Travis, Troy Tulowitzki, and Russell Martin surrounding them (depending on matchups, splits, who’s hot, etc.).

If clustering these three hitters together is the option, the question that follows will invariably be: who leads off? The answer: it really doesn’t seem to matter (as long as that person isn’t Kevin Pillar). A similar run-prediction model has Travis, Martin, Tulo, Saunders, Bautista, and Upton all averaging over 110 runs/season when batting leadoff with JD and EE behind them.


Dave Dombrowski Still Can’t Value Relievers

In 2015, the Boston Red Sox had one of the worst bullpens in Major League Baseball. Red Sox relievers were worth -1.3 WAR with a FIP of 4.64, finishing 30th in the league in both measures. They allowed opposing hitters to hit .261 with a BABIP of .300. Unsurprisingly, last offseason newly-installed president of baseball operations Dave Dombrowski set out to remake Boston’s bullpen. Throughout his long and storied career as a general manager, Dombrowski has consistently turned lagging franchises into contenders. His one weakness, as Dave Cameron pointed out last year, has been constructing bullpens. After examining Dombrowski’s tenure with the Detroit Tigers, Cameron wrote, “There was not a single aspect to pitching that the Tigers bullpen excelled at during Dombrowski’s tenure.” In the 2015 offseason, Dombrowski made two significant trades to bolster the back end of the Red Sox pitching staff. He shipped four prospects to the San Diego Padres for closer Craig Kimbrel and sent left-handed starter Wade Miley to the Seattle Mariners in exchange for reliever Carson Smith. Both of these moves reveal that despite his years of experience, Dombrowski still has difficulty properly valuing relievers.

THE KIMBREL TRADE

From 2011-2015, Craig Kimbrel led all relievers with 12.6 WAR. He struck out 40.9% of opposing hitters, allowing a .159 batting average with a 1.73 FIP. Only Aroldis Chapman struck out more hitters over the same time period. Kimbrel’s league-leading 224 saves were 58 more than the closest reliever, Huston Street. The difference between Kimbrel and Street is roughly equivalent to the difference between Street and Addison Reed, who had the 15th-most saves from 2011-2015.

A closer examination of Kimbrel’s peripheral stats, however, reveals that he’s been slipping from his career peak in 2011 and 2012. In 2015, Kimbrel’s FIP rose to 2.68. Opposing hitters hit more home runs against him and their batting average against his four-seam fastball rose from .180 from 2011-2014 to .212 in 2015. In 2016, this decline has continued. Kimbrel’s walk rate has ballooned to 12.2%. His ground-ball and fly-ball rates have reversed themselves and he’s allowing much more hard contact. Just take a look at the chart below.

Period      GB/FB   LD%     GB%     FB%     IFFB%   SOFT    MED     HARD
2011-2015   1.33    20.2%   45.6%   34.2%   12.3%   20.1%   55.6%   24.3%
2016        0.64    21.0%   30.9%   48.1%   5.1%    14.8%   53.1%   32.1%

Opposing hitters are now hitting more of Kimbrel’s pitches as fly balls, they’re grounding out less often, and they’re making more hard and less soft contact than ever before. These factors have turned Kimbrel from an otherworldly reliever to merely an effective one. Looking at his yearly WAR figures, we can see that this transformation has been underway for a while now.

Year   2011   2012   2013   2014   2015
WAR    3.2    3.3    2.3    2.3    1.5

In 2015, Kimbrel ranked 19th in reliever WAR, right between Justin Wilson of the Yankees and Keone Kela of the Rangers. That’s hardly inspiring, especially since Kimbrel earned $9 million in 2015 while Wilson and Kela made the league minimum.

Considering the price in prospects the Red Sox paid to acquire Kimbrel, they need him to perform at an elite level. In November 2015, Boston sent 3B Carlos Asuaje, SS Javier Guerra, OF Manuel Margot, and LHP Logan Allen to the Padres for Kimbrel. Asuaje profiles as a utility infielder. According to Ben Badler of Baseball America, Logan Allen, whom the Red Sox drafted in the 8th round, had the talent of a 2nd or 3rd round pick. Margot and Guerra were both among the top 100 or even top 50 prospects in the minors depending on which prospect list you prefer. Using the prospect valuation system developed by Kevin Creagh and Steve DiMiceli (you can read about their methodology here), I’ve estimated the cost to the Red Sox in terms of the surplus value of Margot and Guerra. Due to the varying nature of prospect valuations I’ve included the players’ rankings in Keith Law’s Top 100 prospects and Baseball America’s Top 100 as of February 2016.

Prospect         BA Ranking   Surplus Value (BA)   Keith Law Ranking   Surplus Value (Law)
Manuel Margot    56           $22,400,000          25                  $62,000,000
Javier Guerra    54           $22,400,000          34                  $38,200,000
Total                         $44,800,000                              $100,200,000

Even if Kimbrel were the pitcher of 2011-2012 that would still be an astronomically high price to pay for a reliever who throws 60-70 innings per year. Now that Kimbrel is a 2-WAR reliever, it’s even worse.

THE SMITH TRADE

After acquiring Kimbrel, Dombrowski wasn’t finished remaking the Red Sox bullpen. On December 7, 2015 he traded left-handed starter Wade Miley and right-handed reliever Jonathan Aro to the Seattle Mariners for right-handed reliever Carson Smith and left-handed pitcher Roenis Elias. Aro is currently pitching at Triple-A Tacoma and Elias has a grand total of three appearances for the Red Sox this season, so the crux of the trade is Smith for Miley.

Based on their salaries and performances in 2015, Smith and Miley were both valuable pitchers and trade assets. Relying heavily on his slider, Smith held opposing hitters to a .194/.278/.262 batting line. He struck out 32.4% of opposing hitters with a 2.12 FIP and finished fifth among relievers with a 2.1 WAR. Additionally, Smith comes with five more years of team control. He isn’t arbitration-eligible until 2018 and won’t become a free agent until 2021. In 2015, Miley was a 2.6-WAR pitcher, best among any qualified starter on the Red Sox. From 2012-2015, Miley threw an average of 198 innings per season. Prior to the 2015 season, he signed a team-friendly three-year, $19.5-million contract from 2015-2017 with a $12-million club option in 2018.

After signing David Price to a seven-year contract in December 2015, the Red Sox believed they had an excess of starting pitching. With Price, Rick Porcello, Miley, Clay Buchholz, Joe Kelly, and Eduardo Rodriguez, they had six starters for five rotation spots. Additionally they had prospects Henry Owens, Brian Johnson, and knuckleballer Steven Wright waiting in the wings. In order to bolster the bullpen, Dombrowski decided to trade Miley, recognizing that he was the most valuable trade chip among the remaining starters. Porcello had just underperformed in 2015 and was entering the first year of a four-year, $82.5-million extension. Joe Kelly, while having an electrifying arm, had not really shown himself to be an effective starter. While Buchholz had pitched well in 2015, he managed only 18 starts. And Eduardo Rodriguez, the 23-year-old left-hander and potential top-of-the-rotation starter, was untouchable. This left Miley as the most logical trade chip.

By trading Miley, a serviceable innings eater, the Red Sox left themselves open to injuries and ineffectiveness. While Steven Wright effectively stepped into the rotation after Rodriguez dislocated his kneecap in spring training, Buchholz and Kelly were disasters. In 22.1 innings as a starter, Kelly allowed opposing hitters to hit .316/.437/.564 for a wOBA of .419 or the equivalent of Mike Trout this season. He sported a walk rate of 16% and a 5.88 FIP. In his 88 IP as a starter, Buchholz allowed opposing hitters to hit .268/.347/.470, good for a .349 wOBA and a 5.68 FIP. Since 2010, Buchholz has never been healthy and effective at the same time. For all of the talk about Kelly improving last season, a look at his peripheral numbers revealed a pitcher that was merely getting lucky with stranding runners as opposed to improving his underlying performance. By trading away Miley, the Red Sox cost themselves a cushion for the failures of Buchholz and Kelly. In order to fill the rotation void, Dombrowski traded highly-regarded pitching prospect Anderson Espinoza (the 19th-best prospect in baseball according to Baseball America) to San Diego for Drew Pomeranz. Carson Smith, meanwhile, underwent Tommy John surgery in May after straining a flexor muscle in spring training.

In trading for Craig Kimbrel and Carson Smith, Dave Dombrowski has revealed that his biggest weakness remains properly valuing bullpen talent. For a baseball executive with a generally sterling record, this may seem like a minor flaw, but it’s one that caused him to overpay for a declining closer, to trade Miley while relying on a pair of risky starters, and then to swap a prospect who garners comparisons to Pedro Martinez to fill the resulting void in the rotation. With Smith’s injury and the failings of Buchholz and Kelly, Dombrowski has little to show for all his bullpen efforts other than generously restocking the Padres’ farm system.


Yasmany Tomas Is Better Than You Think

When the Arizona Diamondbacks signed Yasmany Tomas to a $60-million deal, many thought the Cuban “third baseman” would be an instant star. Little is known about Cuban players when they come over; their skills are often exaggerated and their numbers in the Cuban National Series inflated. While some players, such as Yoenis Cespedes and Jose Abreu, do come over and become instant stars, others, such as Hector Olivera, simply don’t have what it takes to make it in the majors. For the better part of last year, Tomas seemed a lot closer to the bust category than major-league stardom. That assessment seems destined to change soon.

Too quick and binary is our collective assessment of players. They’re either good or bad and we know within the first weeks of April. We care little about their story, or struggles to adapt. It’s the Twitter era; context and nuance are dead.

That is the story of Yasmany Tomas. The Diamondbacks miscast Tomas as a third baseman and the metrics hated him there. They probably knew he was not a third baseman, but there he was. Unable to help the team defensively, and struggling a bit in his first offensive season at the major-league level, Tomas got a label. He was a bust, just another of the missteps in a reign of terror for a Diamondbacks front office that doesn’t even know the rules.

But that label loses all context. Craig Edwards reminded us about context with regards to Byron Buxton’s struggles. To paraphrase Edwards: Buxton has been really bad, but he’s also young and has plenty of time to figure it out. With Tomas, the story is similar. Yes, Tomas was a -1.3 bWAR and -1.4 fWAR player in 2015, but reducing a player to a single number does him an injustice. He was actually a positive contributor on offense. As a starter (non-pinch-hitter) he had a 103 OPS+. That’s not bad. He was also just 24. Joc Pederson and Jorge Soler are just 24 this year and we think of them as young players. Why are we so unforgiving with Tomas?

Fast-forward to this year and Tomas is still not very well regarded in baseball circles. He’s at 0.2 bWAR, -1.1 wins above average, and 0.6 fWAR. He still grades out as a very average player, but is now around 10% better than league average offensively according to the advanced stats. He’s got 26 homers (24th in the league) and a .519 slugging (30th in the league), both better than Paul Goldschmidt, Carlos Beltran and Giancarlo Stanton.

Those raw numbers suggest that Tomas is already among the top-30 or so sluggers in the league.  He even gets on base at a non-Trumbonian clip. But my early season introduction to xSLGBB said that Tomas was due for some improvement in his slugging percentage based on his batted-ball profile. Andrew Perpetua’s set of stats based on batted-ball information (the Google doc at the bottom of the post, also available on xstats.org) appears to show that Tomas has leveled. Perpetua’s xSLG stat shows Tomas’ expected slugging percentage at .549. Such a mark would tie him with Josh Donaldson.

I appreciate Perpetua’s stats, but I made up my own (xSLGBB), just for these types of analyses. I ran the numbers and by my xSLGBB, based on the league-wide expected set of outcomes from when I ran this the first time in May, Tomas is expected to improve by a grand total of .005 points of slugging.

Still, even if he has already normalized to the stats that we would expect based on his batted-ball profile, a 25-year-old with 26 home runs and a top-30 slugging percentage is pretty darn good. Yes, he has deficiencies in his game, but Tomas still has room for improvement. He’s never going to be Kris Bryant, Nolan Arenado, or one of the other MVP-type of young stars in the game, but he’s quietly hitting himself out of the bad label that we too quickly stuck him with.


Ask Not for Whom the Bell Tolls; It Tolls for the NL DH

Depending on the news outlet, the designated hitter coming to the National League is either a foregone conclusion, or something that will never, ever happen.  Regardless of which outcome is true, it’s still a fine idea to think about, and it’s also fun to try to identify which current 2016 teams might most benefit from the inclusion of a DH spot on their roster.

To create a manageable list of names, I searched for player-seasons since 2013 that resulted in an offensive runs above average (Off) value of 25 or greater, a defensive runs above average (Def) value of -5 runs or less, and filtered for National League teams.  My thinking was that by FanGraphs’ own rule of thumb, 25 Off is a notch below great, and -5 Def is starting to make its way into the poor range.  The results are as follows:

Player             Qualifying Seasons
Paul Goldschmidt   4
Freddie Freeman    3
Joey Votto         3
Jayson Werth       2
Andrew McCutchen   1
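For anyone who wants to repeat the search, here is a minimal sketch of the filter described above. The rows are made-up placeholders (not real FanGraphs values), and the field names and the `qualifying_seasons` helper are my own assumptions for illustration; only the cutoffs (25 Off, -5 Def, 2013 onward, National League) come from the text.

```python
from collections import Counter

# Hypothetical rows for illustration only; these are NOT real FanGraphs values.
player_seasons = [
    {"player": "Player A", "season": 2013, "league": "NL", "off": 31.0, "def": -7.5},
    {"player": "Player A", "season": 2014, "league": "NL", "off": 27.2, "def": -5.0},
    {"player": "Player B", "season": 2015, "league": "NL", "off": 40.1, "def": 2.3},
    {"player": "Player C", "season": 2014, "league": "AL", "off": 35.0, "def": -12.0},
    {"player": "Player D", "season": 2012, "league": "NL", "off": 45.0, "def": -9.0},
]


def qualifying_seasons(rows, min_off=25.0, max_def=-5.0, first_season=2013, league="NL"):
    """Count, per player, the seasons with Off >= min_off and Def <= max_def."""
    counts = Counter()
    for row in rows:
        if (
            row["season"] >= first_season
            and row["league"] == league
            and row["off"] >= min_off
            and row["def"] <= max_def
        ):
            counts[row["player"]] += 1
    return counts


counts = qualifying_seasons(player_seasons)
for player, n in counts.most_common():
    print(player, n)
```

Loosening either cutoff is a one-argument change, which makes it easy to see how sensitive the list is to where the lines are drawn.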

This is interesting for a couple of reasons.  The first is that this once again proves that the answer to pretty much any question about baseball can be “Joey Votto.”  The second is that most of the names on this list are generally thought to be acceptable, sometimes even exceptional, fielders.

Andrew McCutchen’s lone appearance on this list is somewhat of an outlier: a -8.6 in 2014.  He’s been below average defensively in the past, but this is his low water mark for the past five years, going beyond the lower boundary of my arbitrary cutoff of 2013.  I think that we can ignore this for another reason: his bat doesn’t play at DH going forward at 2016 levels of production, rendering his inclusion moot for this purpose.  His contract extends to 2017 with an option for 2018, so I doubt we’ll see NL DH at bats for McCutchen on the Pirates before then.

Jayson Werth shows up twice, but outside of his defensively tremendous 2008 season, I don’t think that anyone would put him in the company of elite defenders.  Like McCutchen, Werth also is only under contract until the end of the 2017 season, which would qualify him for a maximum of one year of Nationals DH service.  I think that he can be dismissed from the list.

In his first full season with the Braves in 2011, Freddie Freeman was a disaster at first base (-23.6).  He was better in his second year with a -13 Def.  He’s gotten better over the intervening years, with his high water defensive mark coming in 2015 with -3.9 Def.  He has gone from an awful defender to a below-average defender, and since he plays for a Braves team that won’t be good for a couple of years, he will probably stay at first even with the inclusion of a DH spot.  But since he is under contract until 2021, should the DH rule be put into effect, we may see him getting meaningful at bats as a DH before the end of the decade.

Signed through 2018 with a club option for 2019, Paul Goldschmidt’s defensive rating seems to be a victim of positional adjustment more than anything.  His worst defensive performance was -11.5 Def in 2012, his first full season in MLB.  Since then his performance has been within the run adjustment for his position, outside of this current season.  He is on pace to have a truly bad defensive year in 2016.  But he’s not even the most eligible DH candidate on his own team.  The Diamondbacks would likely be better served putting Yasmany Tomas at DH and letting Goldschmidt continue to play most days in the field.

The third first baseman on the list is Joey Votto.  Like Goldschmidt, he is penalized heavily for being a first baseman.  But unlike Goldschmidt, he is on the wrong side of 30 and signed to an immensely long and expensive contract.  Of all the teams with players on this list, the Reds might be one of the best-positioned to take advantage of the DH immediately.  Outside of a magnificently terrible start to 2016, Votto has shown that he is still an offensive juggernaut, with a skillset that doesn’t seem to be deteriorating at all.  His defense, on the other hand, peaked in an injury-shortened 2012 season and has gotten progressively worse in each full season he’s played since.  For the 2016 season he’s performed at his worst in the field, with a -13.6 Def.  He accounts for 100% of the Reds’ currently committed payroll for 2021, and is still signed for two more years beyond that.

I’m sure there are more teams that would benefit from the addition of the DH, and I’m sure there are teams that would acquire other talent to man their DH positions.  Realistically, I think that most teams would end up using the DH much the same way as it’s been used in the AL for decades: as an extension of a hitting career, and a half day off for players while still keeping their bats in the lineup.  But, among the teams that met the criteria laid out above, I think that the Reds would be the team most suited to immediately improve through the designated hitter.


An Inquiry Into How Players are Ranked

Perspective
How we rank players in our own minds can tell us a lot about what we value in a ballplayer. For decades the statistics that mattered to sportswriters and the public at large were those that were simple, easily understood, and still relevant to the game. Stats like batting average (AVG), runs batted in (RBI), and home runs (HR) were regularly quoted when writing articles or voting for MVP awards. Each of these numbers tells a piece of the story of what a ballplayer is. AVG shows a player’s ability to put a ball in play and reach base, RBI is a representation of run creation and hitting while men are on base in front of you, and HR show your power in hitting.

These numbers still hold great significance today. That said, they are not flawless expressions of player prowess with the bat. A player could have a high average and still struggle to get on base often due to strikeouts or weak contact. RBI is often a product of opportunity as much as hitting success. After all, you can still receive RBI when creating an out. HR meanwhile can be a very one-sided affair if your average is low, leading to an all-or-nothing scenario for a hitter.

I’m not trying to discourage anyone from using AVG, RBI, and HR in a debate of great players, but when you use them keep in mind that they make up only a fraction of what a ballplayer can be.

Modern statisticians have begun using much more advanced numbers like WAR or OPS+ to determine a player’s quality. These numbers take into account positional skill differences, park factors, and many other aspects of the game. Much like the traditional stats mentioned before, these stats have both positive and negative aspects to them. No one stat can give you a complete picture of a player’s skillset and value.

Whenever an article comes out discussing the quality of a player’s career or season we often get quotes like these:

“Since Trout debuted in 2011, he leads all players with 37.9 WAR. Further, that 37.9 WAR through Trout’s age-23 season are the most by a player in the modern era.” — ESPN Stats & Information

OR…

“Harper finally displayed his prodigious tools last season, as he led the National League in runs (118) and home runs (42) while leading MLB in OBP (.460) and slugging percentage (.649).” — ESPN Stats & Information

While all of the numbers in these quotes are valuable, and even more so impressive, they come with very little context with respect to the league as a whole. It’s great that Trout has 37.9 WAR since 2011, but who is second? And by how much is he second? So Harper led the league in OBP, but what was the league average? Or how many plate appearances did he have? Did he miss any time with injury?

Answering each of these questions would add to our understanding of the value and quality of the players mentioned, but they are never going to be answered in this context. Additionally, this practice of “cherry picking” the best stats to fit our argument negates the whole and presents the players out of context. For example, these numbers neglect the fact that Harper struck out about 25% of the time that season. Even by today’s standards that is a lot of strikeouts. I understand of course that a lawyer is never going to give out unnecessary information about a client’s failings, but in the context of ranking players it is paramount that we take into account as much of the information as we can. Ultimately, we find ourselves back where we started.

If all stats are flawed, then how are we to determine an adequate ranking for players? I propose that we use more stats. That’s right. More stats, not less.

When you fixate a ranking on a single stat, then that stat accounts for 100% of your result every time. It doesn’t matter if the stat is meant to incorporate a host of stats together. Your results are the result of a singular point of reference. If you use three stats, then each is equivalent to one-third of your conclusion.

What would happen if we used 20 different stats to determine a ranking? While each individual stat is devalued, the whole average together will give us a better understanding of the whole spectrum of a player’s ability in the game. Be warned…results may incite head-scratching.

There is a great axiom in the world of baseball stats that goes something like this: “Just because a stat has Babe Ruth at the top and Mario Mendoza at the bottom does not mean it is a good stat.” Like all statistical analysis, take this one with a grain of salt.

Methodology
My process here is rather simple. Take a group of player data, a single year or all-time, across 20 stats. Rank each player individually against the others in the set from 1 to the total number of players across all the data. Finally, average each player’s rankings across the 20 stats. Our result…rAVG (Rank Average).

For ease in data gathering and processing, I’ve decided to use the 19 dashboard stats from FanGraphs plus hits to make 20 total stats. For all-time stats, the pool of players has been limited to players with a minimum of 5,000 plate appearances.
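For readers who want to try this at home, here is a minimal sketch of the rAVG calculation as described above. The three hitters and their numbers are invented for illustration, and two details are assumptions on my part rather than anything stated in the method: stats where a lower value is better (such as K%) are ranked in ascending order, and ties are broken by input order.

```python
def rank_average(players, higher_is_better):
    """Average each player's rank across every stat.

    players: {name: {stat: value}}
    higher_is_better: {stat: bool}; False means a lower value earns rank 1 (assumption).
    Ties are broken by input order because sorted() is stable (assumption).
    """
    names = list(players)
    ranks = {name: [] for name in names}
    for stat, high_is_good in higher_is_better.items():
        ordered = sorted(names, key=lambda n: players[n][stat], reverse=high_is_good)
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return {name: sum(r) / len(r) for name, r in ranks.items()}


# Hypothetical three-player, three-stat pool; the real exercise uses 20 stats.
players = {
    "Hitter A": {"HR": 40, "AVG": 0.280, "K%": 25.0},
    "Hitter B": {"HR": 25, "AVG": 0.310, "K%": 12.0},
    "Hitter C": {"HR": 10, "AVG": 0.260, "K%": 18.0},
}
higher_is_better = {"HR": True, "AVG": True, "K%": False}

ravg = rank_average(players, higher_is_better)
for name, value in sorted(ravg.items(), key=lambda item: item[1]):
    print(name, round(value, 2))
```

With 20 stats each rank contributes 5% of the final number, so one terrible category can only drag a player down so far.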

Notes:
• Each position has t50/b50: how many times a player ranks in the
  top 50 or bottom 50 across all categories.
• * denotes active player.

All-Time • Position Players (895 total)

Rank  Name - Pos                rAVG    t50  b50
1     Willie Mays - OF          93.2    17   0
2     Barry Bonds - OF          95.3    16   0
3     Tris Speaker - OF         105.3   15   0
4     Rogers Hornsby - 2B       110.7   16   0
5     Stan Musial - 1B/OF       113.6   17   0
6     Ty Cobb - OF              118.2   16   0
7     Alex Rodriguez* - SS/3B   118.9   15   1
8     Honus Wagner - SS         133.1   14   0
9     Mel Ott - OF              136.2   15   0
10    Eddie Collins - 2B        136.6   16   0
11    Babe Ruth - OF            137.2   16   1
12    Hank Aaron - OF           143.6   14   0
13    Mickey Mantle - OF        147.7   15   1
14    Ted Williams - OF         150.2   16   2
15    Lou Gehrig - 1B           156.1   15   1
16    Charlie Gehringer - 2B    158.5   13   0
17    Larry Walker - OF         159.7   13   0
18    Chipper Jones - 3B        162.4   15   0
19    Frank Robinson - OF       163.2   14   1
20    Jimmie Foxx - 1B          167.8   16   1
102   Mike Piazza - C           272.7   9    2

Thoughts

  1. Larry Walker. At first glance this list appears to contain all the requisite names for a best-of-all-time list… that is until you reach #17 Larry Walker. I can assure you that I have not fudged the data in any way. I, like you, am shocked to find Mr. Walker parading alongside greats like Ruth, Mays, and Gehrig. Maybe we all should re-evaluate our opinions on Larry Walker.
  2. Mike Piazza. I have included him at the bottom of the chart, because he is the highest-ranking catcher of the 73 that met the 5,000 plate appearance requirement. While ranking #102 would appear to be a slight to him, when viewed in the context of the total list of 895 players…Piazza ranks in the top 12% of all players in history.
  3. Babe Ruth. Many of you, me included, probably feel that there is no way that the Great Bambino could rank outside of the top 10 all-time. I will remind you that this list is a ranking of statistics. It cannot evaluate impact on the game, cultural relevance, or popularity. It simply counts each stat as 5% of the whole and spits out a result. A closer look at Babe’s numbers and you will find that he was a terrible baserunner (SB & BsR) and his defense left much to be desired as well. Out of 421 outfielders he ranks 229 in SB, 411 in BsR, and 110 in Def. All this serves to remind me that no player, however great they might be, is without deficiencies.

Conclusion
As part of my research into this topic I ran numbers for each of the nine positions all-time and the cumulative all-time list seen above. In order to keep this article from becoming a novel, I’ve chosen to only include the top 20 of all-time here. The rest of this information will be available for viewing some time in the near future either on here or on my website.

While I may not agree entirely with the outcomes of this exercise in rankings, I do feel that it has caused me to better consider the totality of a player’s stat line rather than a few simple metrics. No one stat can give you a well-rounded, complete view of a player’s value and skill.

I await your fevered comments below.


Using Statcast to Substitute the KC Outfield for Detroit’s

As I write this post the KC outfield defense is ranked No. 1 in Defensive Runs Saved (DRS) with 43, and is No. 2 in UZR at 28.6 (first is the Cubs with 29.0).  KC sports one of the best, if not the best defensive outfield in the majors this season.

Detroit on the other hand has a fairly poor one.  They rank last in DRS, with -44, and last in UZR at -31.8.  Though Baltimore gives them a good run for their money, Detroit is probably the worst defensive outfield in the majors so far this season.

So I wondered if we could do an analysis to show what would happen if we substituted them entirely for one another.  How would that work?  Well, one simple approach would be to just use the DRS metrics for each team and basically say that DET would go from -44 to +43, so that’s a swing of +87 runs. Using the 10-runs-per-win rule of thumb, that’d be a pretty big swing, nearly nine games. Detroit is a whole lot better.  But I’m not sure this method is really the best we can do.  After all, we have all this Statcast data now.  Could we use that?

I set out to try to do just that.  So my first step was to hypothesize that the likelihood of a ball hit to the outfield actually dropping for a base hit could be correlated to the launch angle provided by Statcast and then that this likelihood would change depending on the team.  So to test this theory out I went to Baseball Savant and grabbed all the Statcast data for balls hit to the outfield for KC and for Detroit.

The KC data consisted of 1722 balls hit to the OF (when removing the few points that had NULL data for launch angle).  I took these 1722 points and bucketed them by launch angle in buckets that were 2 degrees each.  I then calculated the percentage of hits to total (hits + outs) for each bucket.  This percentage was the likelihood that a ball hit to the outfield at a certain launch angle would end up being a base hit.  This led me to my first realization, which was that anything that was basically < 8 degrees on launch angle (so including all negative angles), and made it to the OF, was a guaranteed hit.

The results of this analysis for the 1722 KC points made a lot of sense intuitively.  As the launch angle increased, so did the likelihood that it was an out, so my hit percentage trend went down.  A simple linear regression projecting the likelihood of a hit by angle had an R^2 of 92.5%.  This equation was going to work nicely.
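The bucketing and trendline steps can be sketched roughly like this. The batted balls below are invented placeholders, not Baseball Savant data, and the function names are my own; the only pieces taken from the text are the 2-degree buckets, the hits / (hits + outs) rate per bucket, and the straight-line fit with its R^2.

```python
import math


def bucket_hit_rates(batted_balls, width=2.0):
    """Group (launch_angle, is_hit) pairs into fixed-width buckets.

    Returns {bucket_start: hits / (hits + outs)} ordered by bucket start.
    """
    totals = {}
    for angle, is_hit in batted_balls:
        start = math.floor(angle / width) * width
        hits, n = totals.get(start, (0, 0))
        totals[start] = (hits + (1 if is_hit else 0), n + 1)
    return {start: hits / n for start, (hits, n) in sorted(totals.items())}


def linear_fit(xs, ys):
    """Ordinary least squares for y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return m, b, 1 - ss_res / ss_tot


# Hypothetical outfield batted balls: (launch angle in degrees, fell for a hit?)
sample = [
    (9.0, True), (9.5, True), (12.1, True), (12.9, False), (20.3, True),
    (21.0, False), (21.9, False), (30.2, False), (31.5, False),
]

rates = bucket_hit_rates(sample)
slope, intercept, r_squared = linear_fit(list(rates.keys()), list(rates.values()))
print(rates)
print(round(slope, 4), round(intercept, 4), round(r_squared, 3))
```

The intercept here plays the role of the “b” value compared between teams later on: with similar slopes, a lower intercept means fewer balls dropping at every angle.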

I then considered running the same drill but this time using exit velocity of the hit to see how that impacted the likelihood of a ball being a hit.  There have been at least a couple of articles written on this topic, and the results I got matched up with the projections I had seen in those articles.  That’s to say the trend isn’t linear, but more parabolic. Using a simple second-order polynomial trend, a very reasonable projection could again be made of a hit likelihood based on the exit velocity of a ball hit to the OF.

Using these two points of data for any ball put in play to the outfield (exit velocity and launch angle) it seems as though OF defense could be projected fairly reasonably.

I proceeded to re-run those same drills using Baseball Savant’s Detroit outfield data. Launch angle provided another great fit, 95% R^2 and a slightly higher overall trendline than KC’s (notice the higher y-intercept or “b” value).  KC’s OF was almost 4% more likely to catch a ball just from the “b” value.

Using a simple second-order poly trend for Detroit’s exit velocity also resulted again in an 85% R^2, very similar to that of KC.  It also showed the expected parabolic action.

What I now had was a way to project the likelihood of the KC outfield or the DET outfield making a play on any ball hit to the outfield.  All I needed to know was what the angle and exit velocity were.  Lucky for us, Statcast gives us all that information.
My next step was to take all the OF plays made by Detroit and, using my newfound Detroit projection system, project the number of real hits based on the hit events to the OF.  My Detroit projection system projected 1089 hits; in reality there were 986 hits. Not perfect, and something that could undergo some more tweaking, but reasonable.  My projection system was overly simplistic: I took the likelihood from the angle * the likelihood from the exit velocity.  If the product was > 25% (i.e. 50% for each as the minimum threshold) then I projected a hit; otherwise, an out.
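Here is a rough sketch of that hit/out rule. Every coefficient below is a placeholder I picked so the example runs, not a fitted value from the KC or Detroit data, and clipping each likelihood to the 0-1 range is my own assumption; the product rule and the 25% cutoff are the only parts taken from the description above.

```python
def angle_likelihood(angle, slope, intercept):
    """Linear hit likelihood from launch angle, clipped to [0, 1] (clipping is an assumption)."""
    return min(1.0, max(0.0, slope * angle + intercept))


def velocity_likelihood(exit_velocity, a, b, c):
    """Second-order polynomial hit likelihood from exit velocity, clipped to [0, 1]."""
    return min(1.0, max(0.0, a * exit_velocity ** 2 + b * exit_velocity + c))


def project_hits(batted_balls, angle_coefs, velocity_coefs, threshold=0.25):
    """Count the balls whose combined likelihood is above the threshold."""
    hits = 0
    for angle, exit_velocity in batted_balls:
        p = angle_likelihood(angle, *angle_coefs) * velocity_likelihood(exit_velocity, *velocity_coefs)
        if p > threshold:
            hits += 1
    return hits


# Placeholder coefficients for one hypothetical outfield (NOT fitted values).
angle_coefs = (-0.03, 1.2)              # slope, intercept ("b" value)
velocity_coefs = (0.001, -0.16, 6.7)    # a, b, c of a parabola bottoming out near 80 mph

# Hypothetical balls hit to the outfield: (launch angle in degrees, exit velocity in mph)
balls = [(10, 100), (30, 80), (20, 100), (35, 95)]

print(project_hits(balls, angle_coefs, velocity_coefs))
```

Swapping one team's coefficients for another's while keeping the same list of batted balls is all the "substitution" amounts to.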
So my Detroit system projecting Detroit’s own plays resulted in 1089 hits.  When I substituted the KC projection equations in, the projected hits to the OF against Detroit dropped to 903.  This was a reduction of 186 expected hits!  Wow.  That’s some serious work the KC outfielders would’ve done.

The last step here was then to attempt to convert this reduction in hits to a reduction in runs.  I grabbed FanGraphs’ year-to-date pitching stats by team and used that to do a simple regression on hits allowed to runs allowed.

This showed strong correlation with a ~77% R^2.  Using the slope of this equation it shows that each hit allowed correlates to 0.7298 runs.  This means that a reduction of 186 hits would correlate to a reduction of 136 runs! Again, using the 10-run rule of thumb, that’s a nearly 14-win move.  That’s amazing improvement.   Now of course we are expecting drastic improvement; we’re talking about replacing the worst OF defense in the league with the best!
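The arithmetic behind those last few numbers, spelled out as a quick check. All inputs are the figures quoted above; only the variable names are mine.

```python
RUNS_PER_HIT = 0.7298  # slope of the hits-allowed to runs-allowed regression quoted above
RUNS_PER_WIN = 10      # the 10-runs-per-win rule of thumb

detroit_projected_hits = 1089  # Detroit equations applied to Detroit's batted balls
kc_projected_hits = 903        # KC equations applied to the same batted balls

hits_saved = detroit_projected_hits - kc_projected_hits
runs_saved = hits_saved * RUNS_PER_HIT
wins_gained = runs_saved / RUNS_PER_WIN

print(hits_saved, round(runs_saved), round(wins_gained, 1))  # 186 136 13.6
```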
Conclusions
Are there some bold assumptions made here? Yes.  However, I do think it’s a fairly reasonable approach.  It’s fun to see all the different ways this new Statcast data can be used.  This same drill could be run on all sorts of “swap” evaluations and could be a whole lot of fun for a variety of what-if scenarios.  I enjoyed attempting to answer this question using the new data and hopefully you found this entertaining as well!

Power and Strikeouts

Adam Dunn is an all-time leader in both home runs and strikeouts, a connection that could be universal. (Photo by Danny Moloshok for the Associated Press.)

I’ve been a Washington Nationals fan since the team moved to D.C. in 2005. One of my favorite players to watch — though he was with the team for just two seasons — was Adam Dunn. The 6’6, 250-pound lefty masher was an incredible physical specimen who could hit home runs like nobody’s business. Unfortunately, the only thing he did better than hit homers was strike out. He’s 36th on the MLB all-time home run list with 462, and third on the all-time strikeout list with 2,379. Because of his high strikeout numbers and sub-par batting average on balls in play, he sported a lifetime batting average of just .237.

I bring up Adam Dunn because he’s a prime example of the baseball truism that I’ll be investigating today: Do power hitters tend to strike out more often?

This claim is deceptively tough to evaluate because there’s no one clear way to tell if, and to what degree, a player is a power hitter. I came up with as many rational ways to measure power as I could and compared each with strikeout rates. I’ll let you decide for yourself exactly how well each metric relates to power.

Traditional Stats

Let’s start with the most obvious measure of a power hitter: Home-run hitting.

Here’s the correlation between a player’s home-run rate (HR/AB) and strikeout rate (K/AB).

HR per AB vs. K rate

r = 0.527

A correlation coefficient of 0.527 isn’t bad, and you can see a clear upward trend in the data, but let’s keep going.
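For reference, the r values quoted throughout are the kind of number a Pearson correlation produces. Here is a minimal sketch of that calculation on per-at-bat rates; the four stat lines are invented, and I am assuming Pearson's r is the coefficient being reported.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical career lines: (home runs, strikeouts, at-bats)
hitters = [(30, 150, 550), (10, 80, 600), (20, 130, 500), (5, 60, 580)]

hr_rates = [hr / ab for hr, _, ab in hitters]
k_rates = [k / ab for _, k, ab in hitters]

r = pearson_r(hr_rates, k_rates)
print(round(r, 3))
```

The same function works for every pairing that follows; only the numerator and denominator of the power rate change.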

Home runs obviously aren’t the only way to measure power. Let’s see what happens when we expand our study from home runs to all extra-base hits.

EBH per AB vs. K rate

r = 0.427

So it turns out there’s actually even less of a correlation with extra-base-hit rate than with home-run rate.

There is a flaw to evaluating power using per at-bat rates. If a player has a high strikeout rate his rate of any type of hit will be lower. Here’s what happens when we redo the previous two graphs using home runs and extra-base hits per hit instead of per at-bat.

HRsperH vs. K rate

r = 0.609

EBHperH vs. K rate

r = 0.627

Much higher correlation. Correlation in the .600 range isn’t the goal — but it’s definitely an indication that something’s there. Since non-per-at-bat rates seem promising, let’s try per ball in play as opposed to per hit.

HRperBIP vs. K rate

r = 0.634

EBHperBIP vs. K rate

r = 0.669

Even stronger correlation. Let’s move on now to a classic measure of power: Isolated power (ISO).

ISO vs. K rate

r = 0.508

Good correlation, but not as strong as we just saw with HR and XBH per hit and per BIP. But when you look at what ISO actually is, it’s a per-at-bat rate statistic.

ISO = (2B + 2 × 3B + 3 × HR) / AB

Why don’t we redo ISO as per hit and per ball in play instead of per at-bat?

ISOperH vs. K rate

r = 0.642

ISO per BIP v. K rate

r = 0.673

So it turns out reworking ISO as per ball in play actually gave us our strongest correlation yet at 0.673.

Side note: I tried adjusting the ISO coefficients a couple of different ways since valuing a triple twice as much as a double and a home run three times as much as a double but just 1.5 times as much as a triple seemed odd to me. As it turned out, the correlation didn’t get any better. Touché sabermetrics community, touché.
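Here is a small sketch of the reworked ISO variants, with the extra-base weights exposed so they can be adjusted as in the side note. The stat line is invented, and the ball-in-play definition (at-bats minus strikeouts, so home runs count as in play) is my assumption for illustration; the standard 1/2/3 weights are the usual ISO numerator.

```python
def iso_numerator(doubles, triples, home_runs, weights=(1, 2, 3)):
    """Extra bases: by default a double adds 1, a triple 2, a home run 3."""
    w_double, w_triple, w_homer = weights
    return w_double * doubles + w_triple * triples + w_homer * home_runs


def iso_variants(at_bats, hits, strikeouts, doubles, triples, home_runs, weights=(1, 2, 3)):
    """ISO per at-bat (the standard form), per hit, and per ball in play."""
    extra_bases = iso_numerator(doubles, triples, home_runs, weights)
    balls_in_play = at_bats - strikeouts  # assumption: home runs count as balls in play
    return {
        "per_ab": extra_bases / at_bats,
        "per_hit": extra_bases / hits,
        "per_bip": extra_bases / balls_in_play,
    }


# Hypothetical stat line for illustration
line = iso_variants(at_bats=500, hits=130, strikeouts=150, doubles=25, triples=2, home_runs=30)
print({name: round(value, 3) for name, value in line.items()})

# Alternative weights, e.g. valuing a home run at 4 extra units
alt = iso_variants(500, 130, 150, 25, 2, 30, weights=(1, 2, 4))
print(round(alt["per_bip"], 3))
```

Dividing by balls in play removes the strikeouts from the denominator, which is why a high-strikeout slugger looks more powerful under that version than under the per-at-bat one.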

Statcast Stats

One of the great things about doing this study in 2016 is that we aren’t limited to traditional outcome-based stats. That being said, one of the less great things about doing this study in 2016 is there’s only one full season of publicly available Statcast data. As a result, I’m lowering my minimum observations per player from 1000 plate appearances to 100 at-bats. For context, Manny Machado led the league in plate appearances in 2015 with 713. So we’re clearly going to see decreased correlation because of poor sample size. To give you an idea of what that looks like, here are a few of the correlations from the previous section compared with what they would have been had I used 2015 Statcast data instead:

Stat          1000 Plate Appearance Correlation   100 At-Bat Correlation
HR per BIP    0.634                               0.457
EBH per AB    0.427                               0.133
ISO per BIP   0.673                               0.495
HR per AB     0.527                               0.302

What you should take from this is that the strength of pretty much all of the correlations we’re going to look at will be diluted. Many stats that appear to have rather weak correlation could have a real relationship given more data; we just can’t know. It’s unlikely we’ll see conclusive evidence that a specific measure of power implies a higher strikeout rate, but it could give us a good clue of where to look in the future. So with that out of the way, let’s crunch some numbers.

One obvious way to use Statcast to measure power is to look at exit velocity. If you tend to hit the ball hard, chances are you’re a power hitter. Here’s how average exit velocity correlates with strikeout rate.

Avg. EV vs. K rate

r = 0.338

There’s some correlation, albeit pretty weak. Perhaps power isn’t best represented by whose hits on average are the hardest but rather who has the highest rate of very hard-hit balls. Home runs tend to be hit at least 95 mph, so let’s check the correlation between rate of 95+ mph balls in play and strikeout rate.

HR.EV vs. K Rate

r = 0.393

There’s better correlation, but it’s still rather weak. Let’s move on.

Next up is launch angle. Power hitters hit more fly balls because that’s the only way to get a ball out of the park and a common way to hit a double.

Avg. LA vs. K rate

r = 0.260

There’s even less correlation than with exit velocity, and when I looked at the rate of “home-run launch angles” (25˚ – 30˚) the correlation went down even further to 0.093. While we’re on the subject, I checked the correlation for the rate of balls in play that had both an exit velocity of at least 95 mph and a launch angle between 25˚ and 30˚ and got 0.323, lower than both exit velocity-only correlations.

Perhaps distance will yield better results. Below is the correlation between average ball in play distance and strikeout rate.

Avg. Dist. vs. K rate

r = 0.353

Still not much correlation, but as with exit velocity it would make sense for the true sign of power to be a high rate of balls hit 300 feet or more rather than the exact distribution of balls hit 100 or 200 feet.

300perBIP vs. K rate

r = 0.398

So we see improved correlation, but 300 feet was a rather arbitrary number. Let’s try 350 feet.

350perBIP vs. K rate

r = 0.481

There’s some decent correlation here, but maybe we’ve made a mistake in lumping together distances to all parts of the field. Here’s what happens when we redo the previous two graphs but require balls hit to center field to travel an extra 50 feet (first 300+ ft. to left or right field and 350+ ft. to center, then 350+ ft. and 400+ ft.).

300:350perBIP vs. K rate

r = 0.416

350:400perBIP vs. K rate

r = 0.463
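The field-adjusted cutoffs used in these two graphs can be sketched like this. How a ball gets classified as “center field” isn’t specified here, so the spray-angle convention and the 15-degree half-width below are assumptions, as are the function name and the sample data.

```python
def clears_field_adjusted_cutoff(distance_ft, spray_angle_deg, corner_cutoff_ft,
                                 center_bonus_ft=50.0, center_half_width_deg=15.0):
    """True if a ball in play meets the cutoff for the part of the field it was hit to.

    Assumptions: spray_angle_deg is 0 at dead center, negative toward left
    field and positive toward right field. Anything within
    center_half_width_deg of 0 counts as center field and must travel
    center_bonus_ft farther than a ball hit to left or right.
    """
    is_center = abs(spray_angle_deg) <= center_half_width_deg
    cutoff = corner_cutoff_ft + (center_bonus_ft if is_center else 0.0)
    return distance_ft >= cutoff


# Placeholder (distance in feet, spray angle in degrees) pairs for one hitter.
balls_in_play = [(320, -30), (320, 5), (360, 0), (290, 25), (410, -10), (150, 40)]

rate_300_350 = sum(
    clears_field_adjusted_cutoff(d, a, 300) for d, a in balls_in_play
) / len(balls_in_play)
rate_350_400 = sum(
    clears_field_adjusted_cutoff(d, a, 350) for d, a in balls_in_play
) / len(balls_in_play)
print(f"300/350 rate: {rate_300_350:.3f}, 350/400 rate: {rate_350_400:.3f}")
```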

The correlation went up from 300 to 300/350 and down from 350 to 350/400 (interestingly both by .018). This brings up an interesting question: Does power manifest itself more or less on balls in play in different parts of the field? In looking at this I organized balls in play relative to each hitter’s handedness, dividing them into pull/center/opposite field rather than LF/CF/RF. (I omitted switch-hitters from this part and looked only at balls hit to the outfield.) Rather than show 21 graphs, I made a table below with the correlation coefficients.

Location | Avg. Exit Velocity | Avg. Launch Angle | Avg. Distance | HR Range Exit Velocities | 300+ ft. | 350+ ft. | 400+ ft.
Pull | .306 | .433 | .399 | .327 | .386 | .442 | .293
Center | .410 | .148 | .270 | .379 | .267 | .353 | .388
Oppo | .336 | -.147 | .021 | .293 | .028 | .054 | .215
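The handedness-relative split can be sketched as a simple mapping: left field is the pull side for a right-handed hitter and the opposite field for a left-handed one. The function name, the `"R"`/`"L"` and `"LF"`/`"CF"`/`"RF"` codes, and the sample balls in play are all hypothetical.

```python
def relative_field(field, bats):
    """Map an absolute field ('LF', 'CF', 'RF') to 'pull', 'center', or 'oppo'.

    A right-handed hitter pulls the ball to left field; a left-handed hitter
    pulls it to right field. Switch-hitters are rejected, mirroring the
    choice to omit them from this part of the analysis.
    """
    if bats not in ("R", "L"):
        raise ValueError("only right- or left-handed hitters are supported")
    if field == "CF":
        return "center"
    if field not in ("LF", "RF"):
        raise ValueError(f"unknown field: {field!r}")
    pull_field = "LF" if bats == "R" else "RF"
    return "pull" if field == pull_field else "oppo"


# Placeholder outfield balls in play: (handedness, field, exit velocity in mph).
balls = [
    ("R", "LF", 101.0), ("R", "RF", 89.5), ("L", "RF", 98.2),
    ("L", "CF", 94.0), ("R", "CF", 96.5), ("L", "LF", 85.1),
]

by_location = {"pull": [], "center": [], "oppo": []}
for bats, field, ev in balls:
    by_location[relative_field(field, bats)].append(ev)

for location, evs in by_location.items():
    print(f"{location}: average EV {sum(evs) / len(evs):.1f} mph over {len(evs)} balls")
```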

The last stat I’m going to look at is arc angle. Arc angle is a stat I created to evaluate a batted ball’s trajectory. You can find out more about it in my Hardball Times article. Just note that it’s only for balls hit in the air and lower angles are fly balls while higher angles are line drives.

Avg. AA vs. K Rate

r = -0.474

So none of the Statcast stats yielded a correlation coefficient with a magnitude of 0.5 or more. As I said at the top, this is likely, at least in part, a sample-size issue. I’ll update these numbers after the season to see what difference that makes.

Recap

That was a lot, so here’s a table of all the correlation coefficients and increase in strikeout rate per unit of the stat for the comparisons we made.

Stat | Correlation Coefficient | Increase in K Rate per 1 Unit of Stat
Home Runs per AB | .527 | 2.16
Extra Base Hits per AB | .427 | 1.40
Home Runs per Hit | .609 | 0.63
Extra Base Hits per Hit | .627 | 0.53
Home Runs per Ball in Play | .634 | 1.85
Extra Base Hits per Ball in Play | .669 | 1.44
Isolated Power | .508 | 0.67
Isolated Power per Hit | .642 | 0.21
Isolated Power per Ball in Play | .673 | 0.61
Average Exit Velocity | .338 | 0.01
Home Run Exit Velocity Rate | .393 | 0.32
Average Launch Angle | .260 | 0.01
Average Ball in Play Distance | .353 | 0.002
300+ ft. Balls in Play Rate | .398 | 0.49
350+ ft. Balls in Play Rate | .481 | 0.77
300+ ft. LF/RF, 350+ ft. CF Rate | .416 | 0.72
350+ ft. LF/RF, 400+ ft. CF Rate | .463 | 1.12
Average Arc Angle | -.474 | -0.01

Location | Avg. Exit Velocity | Avg. Launch Angle | Avg. Distance | HR Range Exit Velocities | 300+ ft. | 350+ ft. | 400+ ft.
Pull | .306 | .433 | .399 | .327 | .386 | .442 | .293
Center | .410 | .148 | .270 | .379 | .267 | .353 | .388
Oppo | .336 | -.147 | .021 | .293 | .028 | .054 | .215
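A note on reading the two numeric columns together. If the “increase in K rate per 1 unit of stat” is taken to be the slope of a simple least-squares line of K rate on each stat (an assumption; the fitting method isn’t spelled out here), then the slope depends on the stat’s units while the correlation does not. The sketch below uses placeholder numbers and a hypothetical helper, `slope_and_r`, to show that rescaling a stat changes the slope but leaves r alone.

```python
import math


def slope_and_r(xs, ys):
    """Least-squares slope of y on x, plus the Pearson correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Placeholder values, not real data: average exit velocity (mph) and K rate.
avg_ev = [86.0, 88.0, 90.0, 92.0]
k_rate = [0.16, 0.20, 0.19, 0.25]

slope, r = slope_and_r(avg_ev, k_rate)

# Same relationship with the stat rescaled (mph divided by 100):
# the correlation is unchanged, but the slope is 100 times larger.
slope_scaled, r_scaled = slope_and_r([v / 100 for v in avg_ev], k_rate)

print(f"per mph:     slope = {slope:.4f}, r = {r:.3f}")
print(f"per 100 mph: slope = {slope_scaled:.4f}, r = {r_scaled:.3f}")
```

Under that reading, a small slope such as the 0.01 for average exit velocity says more about the stat being measured in single miles per hour than about the strength of the relationship, which is what the correlation column is for.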

As to our initial question: Does power correlate with strikeouts? I think it’s pretty clear that yes, power correlates with strikeouts in some capacity. As for how much it correlates and what exactly power is? That’s not clear. Hopefully additional seasons of Statcast data will help.