Jacob deGrom is Leveling Up

So far this year, more than 170 starters have thrown at least 10 innings. Of those starters, Jacob deGrom has been the fifth best in all of Major League Baseball. In the prior three seasons he was 12th overall, then 28th, then 12th again. He’s already been worth more than two wins…in less than a third of a season. Last year, he was worth 4.4. John Edwards noted just how berserker his start has been:

[Tweet from John Edwards]

Nine wins, y’all. DeGrom is on pace to be worth nine wins. The last pitcher to be that good was Randy Johnson in 2004. Being that deGrom is “only” the 5th best pitcher so far this season, that means four others — Max Scherzer, Justin Verlander, Gerrit Cole, and Luis Severino — have been even better, and that they’re on pace to break that nine WAR barrier, too. Given that less than a third of the season has passed, maybe none of them will, or maybe we’re in for a heck of a season from the mound despite a ball that favors hitters.

DeGrom might be of particular interest, though, because he’s showing us a completely different look this year than in the past. Just see for yourself.

[GIF: deGrom pitch-location heat maps]

Those heat maps are all from the catcher’s perspective. DeGrom is combining his crazy high talent level with a whole new level of conviction. The result? Video game-like command that’s yielded a career-high 12.1 strikeouts per nine and a typical 2.45 walks per nine.

[Chart: deGrom whiffs per pitch type]

DeGrom is just baffling hitters. His four-seam fastball is generating whiffs at more than twice the average rate of the whole league. It’s always been above average but it’s off the charts this year. What’s interesting is it’s got less run right now, per Brooks, meaning it’s straighter. That isn’t fascinating on its own, but his changeup is straighter, too. Basically, the two pitches look more like each other for deGrom in 2018 than they ever have, but they’re working different parts of the zone. That means they’re creating a wrinkle for hitters that they’ll continue to have a difficult time ironing out moving forward.

All of his offerings have created pretty much league average swing-and-miss or better. There are two outliers: the slider and the sinker. Like the fastball and changeup, the slider appears tighter in its movement to the plate, with less drop but slightly more side-to-side break. I can’t discern if it’s playing up because of that, or because of his other stuff, or if he’s due for some regression on whiffs there. It’s something to keep an eye on, though.

Meanwhile, the curve is plowing away at the low, glove side corner. And the sinker isn’t a pitch anyone uses for whiffs very often, but deGrom’s has been about 80% worse than average this season. Instead of throwing it more arm side, though, he’s using the other side of the plate so it zings back to the edge of the zone to steal called strikes.

Let’s take a breath and recap. DeGrom’s generating a crazy amount of whiffs with his fastball up in the zone. He can mess with hitters’ eye level with his changeup low in the zone. The sinker can steal strikes on the edge. And then the curve and slider are breaking toward that same spot with pinpoint authority. Is this even fair?

Hitters will certainly say no, but that’s kind of the point. Bless their hearts, though; they’re trying. DeGrom’s improved command has coaxed them into 8% less hard contact against him so far this year compared to last year. That’s nice by itself, certainly. But it’s fueled almost the entirety of deGrom’s 8.6% increase in soft contact generated. He now leads the league by that measure at 29.9%. Hitters are hitting less against him, and when they do manage to put the bat on the ball, they’re making life easy for defenders.

The last pitcher to show this kind of jump — from really good to amazing — was Corey Kluber from 2013 to 2014. In 2013 he was worth 2.8 wins in 147.1 innings. A year later he was worth 7.4 wins in 235.2 innings. He generated more soft contact, too, but only half as much as deGrom has added this season, and it didn’t come directly from his hard contact allowed. He struck out about two more batters per nine than the year before. His stuff was in the zone but he didn’t quite command it like deGrom has.

There isn’t much precedent for what Jacob deGrom is doing this season. Time will tell if he maintains his new dominance, but for now he’s pacing nearly the entire league. He’s leveling up. 

League average whiff rates and WAR from FanGraphs. Heat maps and deGrom whiffs per pitch type from Baseball Savant. Gif made with Giphy.


A player’s take on xwOBA

When I was playing in the Arizona Fall League in 2012, I led the league in line-outs. At least it seemed like it. It was the fall before I was Rule-5 eligible and I was hoping to show the Padres I could hit high level pitching. Unfortunately, a .726 OPS in the desert wasn’t going to have them breaking down my door with a team-friendly extension in hand.

If only there were x-stats! XwOBA is the shiny new eight-figure toy that we hitters can play with after an 0-for-15 slump. “But I was hitting the ball hard. See, look!” Back in the pre-Statcast dark ages, a lineout might have had some anecdotal benefit buried in the bottom of a report. Now we have the data.

There’s been a lot written about xwOBA this week. Craig Edwards, Tom Tango, and Jonathan Judge have all weighed in. I was especially interested in the ways they addressed its predictive capabilities.

Judge’s study compared season xwOBA for pitchers with the following season. Tango explored the correlations of small sample sizes of xwOBA to a larger sample.

I looked at this through the lens of a player. When a guy is getting lots of hits but they are bloopers and seeing-eye grounders (remember when ground balls went through the infield?), it’s a soft hot streak. Likewise, a guy might be hitting the loudest .220 in the history of the PCL.

If you’re hitting the ball hard, they’ll start falling. Right? I wanted to test this theory by measuring xwOBA’s predictive capability month-to-month.

Methodology

(All data from BaseballSavant)

I started by getting data for each month of the regular season in the Statcast Era (2015-) for players with 50 PA in that month. I then did a series of inner joins in R to get what I’ll call “double-months.” A double month is when a player has 50 PA in two consecutive months. So Aaron Judge in April-May 2017 is one player-double-month. 

The column labels in the Double Month data frame were: “wOBA,” “xwOBA,” and “Next month wOBA.” I ended up with 3,173 data points. Running these correlations gives us an idea of how your month might predict your next month.
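The double-month join described above can be sketched in a few lines. The original work was done in R; this version uses plain Python for illustration, and the player labels and numbers in `monthly` are made up. It assumes the 50 PA filter has already been applied, so any player-month present in the table qualifies.

```python
from math import sqrt

# Hypothetical player-months that already meet the 50 PA minimum.
# Key: (player, season, month) -> value: (wOBA, xwOBA). All numbers are made up.
monthly = {
    ("A", 2017, 4): (0.400, 0.390),
    ("A", 2017, 5): (0.350, 0.370),
    ("A", 2017, 6): (0.380, 0.360),
    ("B", 2017, 4): (0.300, 0.320),
    ("B", 2017, 5): (0.340, 0.330),
    ("C", 2017, 4): (0.310, 0.300),
    ("C", 2017, 6): (0.320, 0.310),  # no May, so C forms no double-month
}

def double_months(table):
    """Inner join of the table with itself shifted forward by one month."""
    rows = []
    for (player, season, month), (woba, xwoba) in sorted(table.items()):
        nxt = table.get((player, season, month + 1))
        if nxt is not None:  # keep only pairs where both months qualify
            rows.append((woba, xwoba, nxt[0]))
    return rows

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

pairs = double_months(monthly)
woba, xwoba, next_woba = zip(*pairs)
r_woba = pearson_r(woba, next_woba)    # wOBA -> next month wOBA
r_xwoba = pearson_r(xwoba, next_woba)  # xwOBA -> next month wOBA
```

The previous-season and double-season data frames follow the same pattern, with the join key changed from consecutive months to consecutive seasons.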

I also wanted to see whether you’d be better off using your entire previous season to predict the next month. For this I got full-season data (min 200 PA) for 2015 and 2016 and did another series of inner joins to get a data frame representing the previous full-season metrics and the current month metric. These columns would look like this:

“Previous season wOBA,” “Previous Season xwOBA,” “Current season month wOBA.”

I got 2311 of these data points.

For good measure, I also created a data frame for double-seasons. If you had 200 PA in two consecutive seasons, congratulations: you just got a double-season. There ended up being 532 of them.

Finally, I ran all the correlations.

Results

Double-Months

wOBA to Next Month wOBA: r=0.203

xwOBA to Next Month wOBA: r=0.274

 

Previous season to current month:

wOBA to wOBA: r=0.238

xwOBA to wOBA: r=0.25

 

Double-Seasons

wOBA to wOBA: r=0.403

xwOBA to wOBA: r=0.451

 

The differences are small, but they are consistent. xwOBA appears to be a better short term predictor than wOBA. What interested me the most was that while wOBA predicts your next month better if used in large sample size, the opposite is true for xwOBA. If you want to use xwOBA, you’re (slightly) better off using the most recent data.

Let’s talk about this in baseball terms. Baseball is so complex that a couple broken bat bloopers here and there can give you a really good month. Maybe you’re getting shifted but the pitcher doesn’t execute his spot and misses away and you shoot the wide open side of the infield a couple times. Maybe you made the mistake of hitting the ball hard in the middle of the field against the Cubs. Stats like wOBA practically scream regression to the mean.

But there’s no hiding from Statcast. If you’re hitting the ball hard it probably means you’re seeing the ball well and are consistently on time. Plate appearances aren’t independent events; we feel things in the cage one day that might get us locked in for a week. Or the other way around.


Analyzing Ozzie Albies

Ozzie Albies is one of this season’s breakout stars, and the one thing that stands out to me about the Atlanta Braves second baseman is that he’s tied for the home run lead in the Majors with 13. This is pretty impressive considering this is his first full season in the Majors and he was never projected to be a power hitter in the Minors. He is also a stolen base threat and is decent defensively. Is he becoming a contender to Jose Altuve for the title of best second baseman in the game, or is this unsustainable?

Let’s start by looking at the basics: Albies is hitting .277/.312/.588 with a .376 wOBA. One look at his batting line and we can clearly see that he’s not an elite contact hitter and that he walks at a below-average rate, which his 4.2% BB% confirms. Interestingly, his low walk rate doesn’t come with a high strikeout rate: he strikes out a decent 18.4% of the time. In other words, he’s generally putting balls in play. His .275 BABIP implies that he’s not getting lucky either, while his unsustainable .311 ISO combined with his 34.5% Hard Hit% indicates that his power is not really as good as it seems. A look at his HR/FB% makes it even more obvious: 21.0% is more than double his highest previous rates of 8.2% (from last season) and 7.6% (his highest rate in the Minors).
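For readers who want to check the arithmetic, ISO is simply slugging percentage minus batting average, so the .311 figure falls straight out of the slash line. The helper name below is only for illustration.

```python
def isolated_power(avg, slg):
    """ISO = SLG - AVG, i.e. extra bases per at-bat."""
    return slg - avg

# Slash line quoted above: .277/.312/.588
albies_iso = isolated_power(avg=0.277, slg=0.588)

# HR/FB% this season versus his previous high (last season).
hr_fb_ratio = 21.0 / 8.2

print(f"ISO: {albies_iso:.3f}")
print(f"HR/FB is {hr_fb_ratio:.2f}x his previous high")
```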

Albies swings at pitches outside the strike zone at a 35.8% rate and, surprisingly, connects 76.1% of the time. Albies hits pitches outside the strike zone more often than other hitters. Think about that for a moment. He swings a lot at pitches inside the zone too (80.0%), but connects at a surprisingly below-average 84.8% rate. What’s going on here? He also swings at an above-average rate overall, as seen in his 54.9% Swing%. If it wasn’t obvious before, he prefers to swing rather than take a pitch. I can’t imagine how that won’t affect him negatively in the future, once pitchers start challenging him more at the plate.

According to this analysis by Jeff Zimmermann, Albies has improved his launch angle from 15 to 17.3 degrees. Combined with the fact that he also hits more fly balls (43.1 FB%) than ground balls (36.1 GB%), there’s at least some merit to him improving his power this season. However, everything else appears to be the same according to him.

So what conclusion does all of this information bring us to? Albies has improved his power but not nearly as much as his current production indicates. Despite his improved launch angle, he still doesn’t hit the ball particularly hard and seems to have too many of his fly balls end up becoming home runs. His plate discipline is below average and he swings at too many pitches that he shouldn’t. This is something that should and most certainly will be taken advantage of by pitchers in the near future. What happens when they start challenging him more at the plate? Will he keep connecting so well with pitches outside the strike zone? In short, I just don’t think that he’s going to keep up his current pace. I fully expect more of his fly balls to be caught and for his batting average to thus drop to the .260-.270 range. My guess is that he finishes with 20-23 home runs and a batting line in the vicinity of .265/.300/.440. Albies’s biggest concern going forward should be his plate discipline. If he becomes more patient and starts taking more balls, he can truly become an elite second baseman. Until then he’s just a good player performing better than his talent level indicates.


Revisiting Changes in Spin Rate and Spin-Surgers

Why I Care About Spin (and You Should, Too)

After last week’s deep dive on Gerrit Cole’s release point change and resulting spin increase, I decided it was time to brush off the old physics textbooks and try to identify a causal link between the two. Before I get into the results, I’ll warn you that the second part of this article where I talk about which mechanical changes correspond to the trends we see in the data is almost entirely guesswork. I’m in way over my head on this stuff and you should consider most of it wild speculation in the hopes of provoking the interest of people who can write “biomechanics” without a spell-checker. But as my dad (who happens to be a mathematician himself) has said, “sometimes asking the right questions is more important than finding the answer yourself (Forman, 2018)”.

I think it’s important to explain to readers why I decided to revisit the question of release point and spin. Up to this point, baseball Research and Development departments and private labs like Driveline have learned an incredible amount about the effects of spin on a baseball; however, how to increase one’s own spin rate remains to be understood.

The significance of this research should not come as a surprise to anyone who has been paying attention to baseball since the public dissemination of Trackman data. As noted in last week’s piece, Trevor Bauer has spent five years of his life trying to naturally boost his spin rate and I’m guessing he’s not the only pitcher going down that rabbit hole. If this link between release point and spin truly exists and is widely generalizable, breakout pitchers could be identified long before their true talent level is shown in their ERA and WHIP. Observers could test the sustainability of a pitcher’s success just by looking at changes in their release point. As this summer’s historically slow free-agent market has demonstrated, teams are starting to turn inward to their player development systems for a cheap, alternate talent pool. If this research is confirmed, teams could unlock the true spin potential of their own players, consequently spiking the talent level of the entire field (which fans of the game like myself love to see).

More than anything, this research question makes me excited about the future of baseball. I see a baseball future in which pitchers intentionally vary their fastball spin rate to high and low extremes to get maximum separation on their four-seam lift and sinker drop. One where hitters take batting practice off of virtual reality AI replications of pitchers with realistic spin patterns and pitch physics so their first time facing the pitcher feels like the third time through the order. Harnessing spin rate is not just another tool to which the rest of the league will soon respond. It is an entire framework for understanding the game we all love that changes the nature of the competition itself. Now, how do we get there?

Gerrit Cole’s Adjustment

First, data was scraped from Baseball Savant on every pitch Gerrit Cole has thrown in the 2017 and 2018 seasons. Because we want to examine within-pitch spin variation, a subset was created containing only four-seam fastballs. A linear regression of spin rate was then run on all available release point coordinates and release velocity. We use the variables “release_pos_x,” “release_pos_y,” and “release_pos_z” as regressors. X-axis release point is measured from the center of the rubber from the perspective of the catcher, so right-handed pitchers will have negative values. Z-axis release point measures the height of the release point using the bottom of the rubber as a baseline. Y-axis release point tracks the extension of the pitcher. All measurements are in feet.
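As a rough illustration of that regression setup (not the code used for this article), here is a minimal ordinary least squares sketch in plain Python. The pitches are synthetic rather than real Statcast rows, the "true" coefficients are made up, and spin is assumed to be expressed in standard deviations from the pitcher's mean. The `solve` and `ols` helpers are hypothetical names.

```python
import random

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic pitches, NOT real data: (velocity, release_pos_x, release_pos_y, release_pos_z).
rng = random.Random(0)
pitches = [(rng.uniform(94, 100), rng.uniform(-2.5, -1.5),
            rng.uniform(5.5, 6.5), rng.uniform(5.5, 6.5)) for _ in range(200)]

# Standardized spin generated from made-up coefficients plus a little noise.
true_b = [-30.0, 0.2, 0.0, 0.5, 1.0]
spin_z = [true_b[0] + sum(b * v for b, v in zip(true_b[1:], p)) + rng.gauss(0, 0.1)
          for p in pitches]

intercept, b_velo, b_x, b_y, b_z = ols(pitches, spin_z)
```

On this toy data the fit should land close to the made-up coefficients. A real analysis would also want standard errors and p-values, which a statistics package provides.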

Gerrit Cole Release Point Effects

Regressor Estimate Standard Error P-Value
Velocity*** 0.23 0.02 0.00
X-Axis 0.03 0.14 0.80
Y-Axis*** 0.55 0.09 0.00
Z-Axis*** 1.19 0.15 0.00

First, the estimates suggest that there is a positive relationship between an increase in y-axis release point and the spin rate of that pitch. The plot below demonstrates this. Velocity is listed on the x-axis because it is such an important predictor of spin rate. To see the effect of y-axis release point, pick any given velocity value and look at the difference in spin between a point with a relatively small y-value and a large one. The results are pretty jarring:

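[Plot: Gerrit Cole four-seam fastballs with velocity on the x-axis, points colored by spin rate in standard deviations from his mean]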

The color of the points represents how many standard-deviations away from Cole’s mean spin-rate that pitch was. Because spin-rate varies so much from pitcher-to-pitcher, we should look to see how changes in release point affect within-pitcher spin variation.
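The within-pitcher scaling described here is a z-score. A small sketch follows, assuming the population standard deviation is used (the article does not say which version was applied) and using made-up spin values.

```python
from statistics import mean, pstdev

def spin_z_scores(spins):
    """Express each pitch's spin in standard deviations from the pitcher's own mean."""
    mu = mean(spins)
    sd = pstdev(spins)  # assumption: population SD, not sample SD
    return [(s - mu) / sd for s in spins]

# Made-up four-seam spin rates (rpm) for one pitcher.
z = spin_z_scores([2200.0, 2300.0, 2400.0])
```

Scaling this way puts a 50 rpm change on the same footing for a high-spin and a low-spin pitcher, relative to how much each one's spin normally varies.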

This same observation between y-axis release (extension) and spin has been documented previously in Nagami et al., as follows:

“The angle at which the fingertips reached forward over the ball during the top-spin phase was highly correlated with ball spin rate. In other words, ball spin rate was greater for the pitchers whose palm was facing more downward at the initiation of the back-spin phase.”

Because the angle between the palm and ground increases as release position along the y-axis increases, we can confirm our intuition: the longer you hold onto the ball, the more spin it has. Can this be used to help transform pitchers with mediocre fastball spin to elite rotation anchors as has been seen with Gerrit Cole this year? To answer that, we need to have a more sophisticated understanding of the biomechanical process of spinning the baseball.

Again, Nagami et al. have an answer:

“The greater the ball speed, the more downward it must travel. To accomplish this, pitchers with a faster speed would need to hold the ball longer, which means that the palm would have to face more downward at the initiation of the back-spin phase. This would result in a longer period for acceleration to produce spin, and thus produce a higher ball spin rate.”

This suggests that because higher velocity pitches have to be thrown at a steeper angle downward [because downward acceleration due to gravity has less time to act on the pitch], the pitcher then holds the ball longer as it is traveling down the y-axis and thus has more time to impart spin on the ball. Work is force times distance. If we want to transfer more energy into an object, we can either increase the magnitude of the force or apply it over a longer distance. We already knew that higher velocity pitches have higher spin. The results of our regression, however, suggest that even after controlling for velocity, release position along the y-axis (that is, releasing the ball further in front of the rubber) has a statistically significant effect on the spin rate. This means that for two pitches with equivalent velocity, a one-foot increase in y-axis release increases the spin rate of the pitch by about half a standard deviation. While no pitcher can actually extend his release point by an entire foot, small adjustments in spin can have career-altering results. In combination with a velocity increase and z-axis release point increase, it seems Gerrit Cole has found his optimal release point for maximizing spin. If this isn’t his peak, the MLB better look out.

Next, there is the problem of accuracy. Can an individual pitcher adjust his y-axis release position to improve the spin rate of his fastball to a significant extent while still throwing strikes? The answer seems to be yes. As the spin rate of a pitch increases with a fixed axis of rotation, the deflection force orthogonal to the velocity vector of the ball increases. It speeds the air above the ball, which decreases the air pressure relative to the air below it. The air below it travels upward, pushing the ball along with it and generating “lift”. This is referred to as the Magnus effect. Not only does this mean pitchers can spin the ball more without sacrificing strikes, but the Magnus effect alone makes pitchers more effective for two reasons. First, because hitters cannot optically track the ball in the last few milliseconds of the pitch, their brain oftentimes has to linearly extrapolate the trajectory and guess where the ball will end up at the point of contact. This means a small amount of lift can create the perception of a “rising fastball” in the batter’s mind. Second, vertical ball movement decreases the area of pitch-plane and bat-plane intersection. More simply put, the ball is harder to hit with upward movement.

Why is a Higher Release Point Better?

Second, and perhaps more surprisingly, a higher z-axis release point was significantly correlated with spin rate. Last week I forgot to mention that clicking on these plots takes you to my official “plotly” page where the graphs are all cool and interactive, so try it out if you’re interested.

[Plot: spin rate and z-axis release point]

I tried to find a convincing explanation for why the estimate for the z-axis was positive without any luck. A few potential explanations come to mind. First, the higher you hold the ball, the more gravitational potential energy it has. Conservation of energy and the fact that the ball is thrown downward suggest that extra potential energy could be transferred to rotational kinetic energy, which grows with the square of angular velocity. One of the problems with this theory is that, in general, the gravitational potential energy is not large enough to have a significant impact on spin compared to the overwhelming kinetic energy the pitcher is transferring to the ball.

The second (and more likely) potential explanation I came up with is that when pitchers throw with a three-quarters delivery, they decrease the component of force that they exert orthogonal to the moment arm on the ball. This is the only force that matters for torque (and the resulting rotational acceleration). When managers say the pitcher throws “through” the ball instead of “around” it, this is what they’re talking about. The rest gets transferred as translational kinetic energy, which is applied to the center of mass and contributes to what we call “velocity”. However, theoretically the math does not change along with the arm angle. The only thing that would change is the spin axis, which means the Magnus effect would have less of an upward component and would push the ball sideways. Because Trackman calculates spin regardless of the axis, this should not affect our estimate. The change would have to be a mechanical quirk that could be picked up on a high-speed camera.

We have to keep in mind, however, that not all spins are equal. For example, throwing over-the-top has the same transverse spin rate but adds gyro-spin. Gyro spin is the spin of a projectile which is rotating around a spin axis that is parallel with the direction of the velocity vector (as shown in the picture below). This is sometimes referred to by those within the industry as “not useful” spin, due to the fact that it does not trigger the Magnus effect. This change would again have to be due to another mechanical quirk at the release point that is beyond my abilities to track as a college undergrad who has no biomechanical experience and a Khan Academy video’s worth of knowledge. Answering the question of why z-axis release height is correlated with spin rate really should be left to a dedicated biomechanical researcher who has access to a lab.
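The split between gyro spin and transverse spin is a vector projection: the part of the spin vector that lies along the direction of flight is gyro spin, and whatever is left over is transverse. Here is a minimal sketch with a made-up pitch; the coordinate convention (ball traveling along +y) and the function name are assumptions for illustration only.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def split_spin(spin, velocity):
    """Split a spin vector (rpm) into gyro and transverse magnitudes."""
    speed = sqrt(dot(velocity, velocity))
    unit_v = [v / speed for v in velocity]
    gyro = dot(spin, unit_v)  # component along the flight path
    perp = [s - gyro * u for s, u in zip(spin, unit_v)]
    transverse = sqrt(dot(perp, perp))  # component perpendicular to flight
    return abs(gyro), transverse

# Hypothetical pitch: ball travels along +y, spin axis tilted between x and y.
gyro, transverse = split_spin(spin=(1800.0, 1200.0, 0.0),
                              velocity=(0.0, 130.0, 0.0))
total = sqrt(gyro ** 2 + transverse ** 2)  # the axis-independent total spin
```

In this toy case the total is about 2,163 rpm but only 1,800 rpm of it is transverse, which is why two pitches with the same measured spin rate can move very differently.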

 

Is this true for everyone?

Our next task is to test whether or not this trend is generalizable. This is a little easier said than done. In order for release point to be a useful regressor, it has to be variable so that we can test the effects of a change. The problem is that release point consistency is also a skill that Major League teams prioritize both for command and tunneling (making two distinct pitch-types seem alike until the very last second). Ideally, we’d have release point data distributed as a Gaussian, but for now we will have to make do with release point varied as a conscious effort by the pitcher. That causes another problem: if our regressors covary with a variable that correlates with spin rate and that variable is erroneously left out of the regression, it will create an endogeneity problem. This is especially prevalent with release point data that is roughly constant until a conscious correction is made, meaning the release point varies with time (along with potentially velocity, a different pitch-mix, stride length, workout regimen, etc.). This means a study of multiple pitchers will have time-variant error. We are using a fixed effects model, meaning that we time de-mean both the regressors and variables of interest (as shown below). Data on every four-seam fastball thrown by this year’s starting pitchers over the last two years was collected and spin was regressed on the release point. For those following along at home, we used the absolute value of the X-axis release position so we get the measure of sideways extension for both left-handed pitchers and right-handed pitchers.
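To make the de-meaning step concrete, here is a small sketch of the "within" transformation with a single regressor. It assumes the fixed effect is one per pitcher, and the pitcher labels, extensions, and spin rates are made up. With de-meaned data the least-squares slope needs no intercept.

```python
from collections import defaultdict

def demean_by_group(groups, values):
    """Subtract each group's own mean from its values (the 'within' transform)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]

def slope(xs, ys):
    """No-intercept least-squares slope; valid because the inputs are de-meaned."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Made-up pitches: pitcher id, extension (ft), spin (rpm).
pitcher = ["P1", "P1", "P1", "P2", "P2", "P2"]
extension = [6.0, 6.2, 6.4, 5.0, 5.2, 5.4]
spin = [2100.0, 2120.0, 2140.0, 2400.0, 2420.0, 2440.0]

# Fixed effects: compare each pitch only with the same pitcher's other pitches.
within_slope = slope(demean_by_group(pitcher, extension),
                     demean_by_group(pitcher, spin))

# Pooled: treat every pitch as if it came from one pitcher.
one_group = ["all"] * len(pitcher)
pooled_slope = slope(demean_by_group(one_group, extension),
                     demean_by_group(one_group, spin))
```

In this toy data each pitcher gains 100 rpm per foot of extra extension, yet the pooled slope is negative because the shorter-extension pitcher happens to have more spin overall. That is the kind of between-pitcher difference the fixed effects model is meant to strip out.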

Population Release Point Effects

Regressor Estimate Standard Error P-Value
Velocity*** 0.08 0.00 0.00
X-Axis*** -0.06 0.00 0.00
Y-Axis*** 0.27 0.01 0.00
Z-Axis*** 0.16 0.01 0.00

I’m going to give you a taste of one of the applications of this research. We can calculate predicted change in spin rate by using the regression coefficients above. If we weigh changes in release point, multiply them by the standard deviation in spin, and add them together, we should be able to get an idea of which pitchers are making mechanical changes and (more importantly) how important those changes are in terms of spin rate. Below is a list of pitchers who rank the highest in “weighted release point change” based on recorded changes in release point from 2017 to 2018.

Weighted Release Point Leaderboard
Name Weighted RP Change 2017 Spin Rate (RPM) 2018 Spin Rate (RPM)
Kyle Hendricks 36.2 2021 2073
Clayton Richard 27.1 2085 2132
Reynaldo Lopez 16.2 2119 2099
Mike Foltynewicz 13.6 2258 2369
Dallas Keuchel 12.9 2041 2089
Stephen Strasburg 10.4 2175 2100
Daniel Mengden 9.6 2092 2110
Gio Gonzalez 9.4 2220 2177
Andrew Cashner 9.2 2099 2129
Gerrit Cole 8.6 2155 2326
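One plausible reading of how the "weighted release point change" is computed is sketched below. It assumes the population coefficients are in standard deviations of spin per foot of release-point change, and it uses a made-up spin standard deviation of 100 rpm and a hypothetical pitcher. The function name is illustrative only.

```python
# Population regression estimates for the release-point axes (taken from the table).
COEF = {"x": -0.06, "y": 0.27, "z": 0.16}

def weighted_rp_change(delta_ft, spin_sd_rpm):
    """Predicted spin change (rpm) implied by year-over-year release-point changes."""
    in_sd = sum(COEF[axis] * delta_ft[axis] for axis in COEF)  # change in SD units
    return in_sd * spin_sd_rpm                                 # convert to rpm

# Hypothetical pitcher: |x| unchanged, 0.2 ft more extension, 0.1 ft higher release.
example = weighted_rp_change({"x": 0.0, "y": 0.2, "z": 0.1}, spin_sd_rpm=100.0)
```

Under these assumptions the hypothetical pitcher would be credited with roughly 7 rpm of predicted spin gain. The actual leaderboard values depend on each pitcher's own release-point changes and spin variation.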

First things first: while this isn’t the most important thing in the world, it is comforting to see Gerrit Cole’s name near the top of the list in the metric we created with his spin change in mind. Full disclosure, I was only going to show the first ten pitchers but realized he was sitting at 11th. Still pretty good. Second, there are a lot of interesting names accompanying him. Mike Foltynewicz has made drastic strides this year in limiting hard contact, which has been reflected in his ERA and WHIP. I like Daniel Mengden quite a bit this year. He has had flashes of brilliance including his most recent outing where he limited the Red Sox to 1 earned run over 6 innings. Also, it is interesting how a lot of the guys listed here are known as extreme low-spin pitchers (Dallas Keuchel is a great example). Low spin can also bring a tactical advantage by exploiting the flip-side of the Magnus effect. The lower your transverse spin, the more drop you have relative to the rest of the league. For them it might be disadvantageous to be on this list. As a result, it might be worth examining the rate of return a pitcher gets from arm angle changes at different ends of the spin spectrum.

We note that some pitchers our model predicts would increase spin rate actually experience a decline in spin rate, which demonstrates the complexities of the biomechanical process of spinning a baseball. It should be kept in mind that our model is relatively simple, that our model should be used as a general guideline for understanding mechanical changes and not the last word on spin rate, and that release point should not be studied independent of other factors. For example, more complex models might start by examining the interaction effects of release point changes and velocity to determine diminishing or increasing marginal returns to mechanical tweaks as velocity increases.

Where do we go from here?

As mentioned earlier, the study of spin rate and the relationship between spin and release point has wide applications for internal baseball research and development departments along with casual observers wondering if a short-term spike in spin rate is sustainable. While I realize I’m getting into the habit of ending articles by saying smarter people should take a look and see if this is a real thing, the next step is figuring out exactly why we are seeing these trends in the data. Then, we will finally have a strong basis for answering the question of which factors contribute to a pitcher changing his own spin rate.


Salvador Perez Has a Complicated Relationship With the Strike Zone

Between catching pitches for one of the worst pitching clubs in baseball (the Royals have the worst team ERA in baseball) and being made a fool of by Adeiny Hechavarria at the plate (5/14/18), Salvador Perez is having an embarrassing year. Yet below the obvious misfortune lies a slow, insidious killer. Salvador Perez seems to have forgotten about the strike zone.

In 2016 Salvador Perez won a Silver Slugger award. How can a relatively recent award winning catcher have forgotten about the strike zone? Well, the thing is, the strike zone and Ol’ Salvador have been in a tenuous relationship for a long time now. From 2016 to 2018, nobody in the MLB has swung at more outside pitches than Perez. Over the past 4 years, Perez has swung at 42.5%, 44.2%, 47.9% and 49.1% of pitches outside the strike zone (O-Swing%), respectively. All these percentages place him near the top of the leaderboards for each of these years. His contact rate on outside pitches during that time (O-Contact%) is 73.6%, 65.8%, 70.4%, and 63.1%, respectively. The nature of Perez’s efficacy on swinging for outside pitches is worth a deeper dive.

Does Perez benefit from his lack of plate discipline? In order to simplify the study, I am going to look only at Salvador Perez in 2018 so far. Whether the lack of discipline worked for him in the distant past is not the focus; instead, I am going to look at the efficacy of this kind of batting for Perez moving forward, using 2018 data to support my prediction. Perez’s season started April 24th due to an MCL tear. As of the end of play on 5/18, Perez has seen 333 pitches this year. Perez has swung at 56.4% of those pitches, meaning that he has swung at roughly 187 of the pitches he has seen this season. Of these 187 swings, approximately 46 have come on pitches outside the strike zone. One look at Perez’s Swing% heat map shows that he seems to believe that the strike zone is larger than it actually is.

Perez swings at a markedly higher percentage of pitches outside the strike zone than his contemporaries. Jorge Alfaro and Wilson Ramos are the only two catchers so far in 2018 (minimum 100 PA) who have swung at outside pitches at anything near Perez’s O-Swing% of 49.1%, at rates of 44.1% and 43.2%, respectively. Perez has been a far better contributor to his team this season when he has shown more plate discipline. He has had a far inferior wOBA on days in which he has an O-Swing% above 50%. His average wOBA on those days is an abysmal .237, which is .067 less than league average for catchers and .078 less than the overall league average. In comparison, on days in which Perez has an O-Swing% below 50%, his wOBA is .440, a vast improvement, and a wOBA that puts him .04 above Mike Trout. If an outlier game against Detroit on May 5th, in which he had a wOBA of .000, is removed from the below-50% dataset, his below-50% O-Swing wOBA would become .484, a number that would put him not far off the wOBA of Mookie Betts (.495). All this is to say that Perez is a very valuable hitter on the days in which he shows better, more league-average (29.9% O-Swing%) plate discipline.

What of the pitches outside the strike zone that Perez swings at and actually makes contact with? Perez boasts a 63.1 O-Contact%, which is the best contact percentage among catchers (100 PA minimum) with an O-Swing% above 40. Are these contacts worth anything, or are they mostly foul balls and popups? Perez has made contact with 22 pitches outside the strike zone. (There is a discrepancy of approximately 6 pitches here between the data supplied to FanGraphs and the data supplied to BaseballSavant. I have decided that this slight difference does not compromise the integrity of the article, as my conclusions are the same. As such, some of the pitch numbers may be slightly off due to the slight difference between the O-Swing% and O-Contact% of FanGraphs and the statistical equivalents, Chase% and Chase Contact%, of BaseballSavant; however, the use of BaseballSavant was necessary for the exact pitch breakdowns.) Of these 22 pitches, Perez has fouled off 14 and has put the other 8 chased pitches in play. Of these 8, he hit into an out on 7, with the remaining contact being a single. So while Perez’s contact numbers while chasing are impressive, they amount to naught. Even with this high contact percentage, the previous conclusion still stands: Perez is a bad hitter when he is in a chasing mood, and a very good one when he works the strike zone.

Is there something special about the 46 pitches that Perez chased outside the strike zone? (The data of both sites confirm that Perez has swung at 46 pitches outside the strike zone, so there is no problem here.) Is the number mostly made up of pitches that are right on the edge of the zone? The answer to both these questions is no. Perez has been lit up for a total of 19 swinging strikes just to the outside bottom-right of the strike zone alone, meaning that of the 46 chased pitches so far this season, a staggering 41% have been swinging strikes to the outside bottom-right. The final tally of Perez’s adventures outside the strike zone sits at a pitiful, but not wholly unexpected, 24 swinging strikes, 14 fouls, 7 hit into outs, and 1, lone, sad, pathetic, inconsequential, single.
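The tally above can be checked with a few lines of arithmetic. This is only a sketch using the counts quoted in the text; the variable and function names are made up for illustration.

```python
# Outcomes of Perez's swings at pitches outside the zone, as tallied in the text.
chase_outcomes = {"swinging_strike": 24, "foul": 14, "out_in_play": 7, "single": 1}

total_chases = sum(chase_outcomes.values())
contact = total_chases - chase_outcomes["swinging_strike"]
low_away_whiffs = 19  # swinging strikes to the outside bottom-right, per the text


def share(part, whole):
    """Return part/whole as a fraction."""
    return part / whole


print("chased pitches:", total_chases)
print("contact on chases:", contact)
print("share of chases that were low-away whiffs:", round(share(low_away_whiffs, total_chases), 3))
print("share of chases that became hits:", round(share(chase_outcomes["single"], total_chases), 3))
```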


In conclusion, Salvador Perez desperately needs to work on his plate discipline if he wants to continue to be a Major League catcher worth anything close to the $7.5M and $10M the Royals are paying him this year and next. If Perez cannot reverse the negative course that his batting discipline has been on the last couple of years, with his O-Swing% having jumped 4.9% in the past two years alone, he will become a non-factor at the plate. Perez’s WAR has been in steady decline ever since his O-Swing% began the leap to its current heights. If Salvador Perez cannot find more discipline at the plate, the former Silver Slugger will no longer be worth having on a Major League team.

(Data courtesy of Fangraphs and Baseballsavant)


Let’s Enjoy This Michael Brantley While we Can

It’s been a tough couple years for Michael Brantley. In 2016, he played in just 11 games. In 2017, he played in more than eight times as many…and still topped out at just 90 games. He registered a mere 418 plate appearances in that span because of injuries and was only worth 1.5 wins.

These injuries were the kind that start small, like inflammation or a sprain so often do, and cost a player a few games. Then news comes out about them being more serious than expected or about how the player has experienced a setback. And when those types of injuries start to pile up and happen in back-to-back years, it’s easy to wonder when, exactly, that player will be themselves again. Or if they ever will.

So far in 2018, though, Michael Brantley is showing us he’s back to being his vintage self.

brantley5

Alone, the numbers this season are impressive. But compared to 2014, they’re downright eerie. It’s as if he’s looking into a mirror and seeing the 2014 version of himself looking back. He was worth 6.5 wins that year. The biggest difference is that he’s traded in steals for more power — he had 23 stolen bags in ’14 and is on pace for about 5 this year — but that matches the direction of today’s game, anyway.

Everything else paints a special picture. The league’s average strikeout rate has hovered around 16.5% for the last five years. Its average isolated slugging is around .150, and the average weighted on-base average is about .325. Brantley has been 50% better than average at not whiffing, at least 20% better at driving the ball, and 60% better at creating offense. Those kinds of results put him in rarefied air.

If we look at the single season leaderboards, we can see just how rare. Here’s a list of qualified players since 2014, which was when Brantley was last healthy for a full season, who have struck out in less than 10% of their at-bats and had an ISO of .170 or better:

  • Michael Brantley, 2014
  • Victor Martinez, 2014
  • Michael Brantley, 2015
  • Daniel Murphy, 2016

There were 537 qualifiers over that time period. It happened four times. Brantley did it twice. No one managed to do it in 2013 or 2017. While we’ll have to wait to see if they can keep it up, the only three players to do it so far in 2018 are Brantley, teammate Jose Ramirez, and Nick Markakis(?!).

In many ways, ISO and strikeout rate in tandem can inform us a great deal about who’s being productive and how. Brantley’s skill at deciding when to swing is truly unique.

But what really makes his start to the 2018 season special is that he’s 31. With evidence building over the last several years that players peak earlier than we ever thought, it was fair to wonder if the time he lost to injury meant we were all robbed of some of his best years. Aging curves consider as large a pool of players as possible, though, so getting to witness players who force exceptions is always a blast. His 15 game rolling wOBA and K% averages tell us we’re having a pretty good time.

brantley4

The bigger the gap between the red and blue lines, the better. We can see what he was like at his peak in 2014 and his valleys over the last couple years. As the space between the two lines grows in 2018, so does the one where we get to appreciate what he’s doing. We don’t know when the next injury will come or when Father Time will show up. We should enjoy this Michael Brantley while we can.

Data from FanGraphs.


Gerrit Cole and the Pine Tar Controversy

Hot take: the Astros are good at baseball. This is thanks in no small part to Gerrit Cole’s early success on the mound. After the news broke that the Astros signed the righty, several analysts wondered why they would give up three national top-100 prospects (Musgrove, Moran, and Feliz) for two years of control over what some called a “soft” upgrade at starting pitcher. We found out in a big way. After 8 starts, he leads all qualified starters in FIP (1.56), WAR (2.8), K-BB% (35.6%), and is second only to Max Scherzer in Z-Contact% (75.3%). He has been, to date, (subject to some debate) the most valuable starting pitcher of 2018.

He is not the only bright spot of the Astros’ 2018. 35-year-old Justin Verlander looks to be returning to his 2011 AL Cy Young self. While it is early, he seems to be a promising candidate for this year’s award as long as his teammate Gerrit Cole doesn’t steal it from him. Charlie Morton, once a journeyman, has found his home and filed his very own claim to being one of the very best pitchers in the league. I haven’t even mentioned Dallas Keuchel, proud owner of his very own 2015 Cy Young trophy, who is a top-35 pitcher on STATS’ command leaderboard and is finding his way back to the pinpoint control that helped his team win the franchise’s first World Series in 2017. Oh… also Lance McCullers is filthy. It’s looking like blue skies ahead for the reigning world champs, and the beginning of the season only confirmed the rosy outlook. Then, a simple tweet:

 

Ever since Trevor Bauer pointed out the potential source of the increased spin rate of Gerrit Cole, there has been a cloud surrounding the validity of the Astros’ recent pitching success. It’s easy to understand the frustration for someone who claims to have spent a solid five years of his life trying to naturally improve his spin rate. The advantage of a high spin rate has been documented extensively in the literature by people a lot smarter than me, so I’m not going to go into it (plus my last encounter with the subject ended with a B in high school physics). All you need to know is the faster the ball spins, the more it moves in the last few moments before it reaches the hitter, which makes it harder to hit which, as you might guess, a pitcher generally likes.

In the Statcast era, teams are clamoring for every inch of an advantage and detecting small changes in fastball spin-rate is everything. If the Astros really are using some sticky substance (or just training new acquisitions in the art of spinning baseballs), we should be able to detect it in some way. Here are the average spin rates before and after the transition to Houston of some of their finest pitchers:

Astros Spin Rate Changes
Player             Spin Before (RPM)   Spin After (RPM)
Justin Verlander   2535                2591
Charlie Morton     2103                2244
Gerrit Cole        2165                2332
SOURCE: FanGraphs Team Stats

Just looking at this table, however, can be misleading. Average spin-rates can look a lot different depending on where you split the data. We would never know, for instance, if Gerrit Cole’s spin rate spiked to 2300 the start before moving to the Astros, which got lost in the pure volume of data suggesting a lower overall spin-rate before the move. It is important to understand exactly where this significant change in spin rate occurs.

The key behind detecting significant changes in data is this:

,

where X is the observed prior data. η here is what’s called a changepoint. Changepoint detection uses likelihood-based estimation to find the number of different population means (or variances) in a time series. That’s just a mathy way of saying it looks at how likely it is that the data fundamentally changes at some point (or points). Before we can say the Astros are cheating, we should look at whether the change in spin rate is really that significant to begin with and determine where that change actually occurs. We’re going to be using Bayesian changepoint detection. The advantages of Bayesian detection as opposed to binary segmentation are twofold. First, the probability of having a changepoint is directly proportional to the prior probability of observing the data. This helps prevent overreaction to new information and makes the overall estimation process much more robust, which is especially important in this case. It is tempting to see a big number next to the post-Astros spin rate and jump to conclusions, but it is important to appropriately weight the prior probability of that spike occurring. Second, detecting the changepoint requires a much smaller window of data. This is important in this case as well. If we are correct that the change happens in a one-game window, i.e. it happens as a result of a game-to-game transition to a different team, being able to detect changepoints within small data windows is especially important. Specifically, our algorithm computes the probability of having a particular changepoint configuration as follows:

$$P(\eta \mid X) \propto \pi(\eta)\, f(X \mid \eta),$$

where π(η) is the prior probability of that configuration and f(X|η) is the likelihood of the observed data given the change point configuration. There’s some other math behind the detection algorithm, but for now we’ll just take a look at the plots. First up: Charlie Morton.
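As an aside before the plots, the prior-times-likelihood idea described above can be sketched in a few lines. This is not the algorithm used for the plots in this article; it is a minimal illustration under loud assumptions: exactly one changepoint, a Gaussian likelihood with a known noise level `sigma`, a uniform prior over where the change falls, and segment means plugged in from the data rather than integrated out. The function name `changepoint_posterior` and the synthetic spin data are hypothetical.

```python
import numpy as np


def changepoint_posterior(x, sigma):
    """Posterior over the index where a single mean shift begins.

    Assumptions (illustrative only): one changepoint, Gaussian noise with
    known standard deviation `sigma`, uniform prior over locations, and
    plug-in segment means instead of a full marginalization.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_post = np.full(n, -np.inf)  # index 0 is not a valid changepoint
    for tau in range(1, n):  # the new regime starts at x[tau]
        left, right = x[:tau], x[tau:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        log_lik = -sse / (2.0 * sigma ** 2)   # log f(X | eta), up to a constant
        log_prior = -np.log(n - 1)            # log pi(eta), uniform
        log_post[tau] = log_prior + log_lik
    log_post -= log_post.max()                # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


# Synthetic example: 30 starts near 2165 RPM followed by 10 starts near 2332 RPM.
spin = np.array(
    [2165 + 20 * (-1) ** i for i in range(30)]
    + [2332 + 20 * (-1) ** i for i in range(10)],
    dtype=float,
)
posterior = changepoint_posterior(spin, sigma=30.0)
print("most likely changepoint index:", int(np.argmax(posterior)))
```

With a shift this large relative to the noise, nearly all of the posterior mass lands on the start where the synthetic series jumps. Real game-by-game spin data is noisier, which is where the prior matters more.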

There’s a pretty clear changepoint here. The posterior probability spikes at timepoint 23. That game date is 9/30/2015, a late-September game against the Cardinals, notably while Morton was pitching for the Pittsburgh Pirates. The Astros weren’t even his next team (in November, he was traded to the Phillies). Several analysts weighing in on the spin question have noted that spin rate is positively related to velocity. In an oft-quoted interview with Matt Gelb of Philly.com, Morton said:

“For some reason, I just went out there and tried to throw the ball hard one game. I wound up throwing it harder.”

Below is the change-point detection plot for Morton’s velocity. It looks like there was clearly a change during the spin-rate spike.

Regardless of the cause, our likelihood-based examination suggests it would be naïve to attribute Morton’s spike to an organizational conspiracy to increase fastball spin with a foreign substance.

Second, Gerrit Cole:

There is a clear spike at time 86, which is the time of his trade. Something has changed. However, look at the spike in spin rate at time 44. This could provide a hint to a given organization that a player is capable of a spike in spin rate given a change in mechanics. There is a 20% probability that that game contained a change point, which would be higher except Cole’s spin significantly declined right after that start. If spin rate is associated with a specific mechanical quirk, not only could that help us acquit the Astros, but also identify potential steals on the free agent or trade market that have yet to harness the maximum potential of their spin rate.

Some have hinted that high fastballs increase spin-rate by a significant enough margin to where a change in location could be responsible for Cole’s dramatic spike. Below is the graph of Cole’s spin rate broken down by location.

Cole’s spin rate increases by about 100 RPM when he pitches high and inside. That’s a significant jump, but not enough to explain a 300 RPM spike.

In a recent interview with MLB radio, manager A.J. Hinch mentioned that two things could potentially be behind the change in spin rate (without divulging any organizational secrets). First, he said that sinkers have a drastically lower spin rate than four-seam fastballs, and their pitching staff has prioritized the four-seamer.

Below are Cole’s pitch-usage charts before and after his transition to the Astros, courtesy of Brooks Baseball:

First thing first, Hinch is right about pitch usage. Sinker rate is way, way down. But, as you can see below, it cannot account for within-pitch spin variation, as his individual sinker and 4-seam spin are both spiking this year.

Gerrit Cole Spin Rate Change
Pitch (as Recorded by Statcast)   Spin (RPM)
Sinker (Pirates)                  2121
Sinker (Astros)                   2288
4-Seam (Pirates)                  2165
4-Seam (Astros)                   2332
SOURCE: Brooks Baseball

The second factor that Hinch hinted at was “getting behind the baseball”. We can examine the relationship between release point and spin rate and see if this can really explain such a significant jump. Below is a 3D plot of spin rate by release position.

r-releasepoint

This seems like it could actually explain a good bit of the change. There are two changes associated with the spike in spin rate. First, he’s releasing the ball much further along the Y-axis than before. Second, it looks like he is releasing the ball higher, but closer to his body than the pitches at the very top left of the distribution. Almost every pitch he has thrown in the neighborhood of (-2.2, 54.4, 5.6) has been at around 2300 RPM. Further research should be done on the physical mechanics that generate that spin and if this could really be a causal relationship.

After a quick look at Cole’s mechanics, it does look like there is a conscious change in release point. Below is a screenshot from Cole’s start against the Nationals on 9/29/2017. This looks like one of those pitches further along the x-axis than the sweet spot that we found on the release point chart. See how his elbow is almost parallel to his shoulder.

Remember that one-game blip in spin rate that showed up on his changepoint plot? Below is a screenshot of a 97-mph fastball from that start. Notice how his elbow is higher than his shoulder. This could be the change that Hinch was talking about when he said “getting behind the ball” could be behind the increase in spin. When watching the video of the previous pitch, it looks almost as if he’s throwing around the ball instead of throwing through it.

Apologies for the blurry screenshot (it was one of two fastballs with media on baseball savant). Lastly, here’s a screenshot from this year. His release point is much, much more vertical than the previous two.

Overall, we should not jump to conclusions on the Gerrit Cole spin question. Just to be perfectly clear, I personally have no idea how mechanical changes actually affect spin rate. I haven’t done the experiments myself and certainly have not spent as much time as Trevor Bauer trying different grips and substances in a controlled setting. However, this article suggests that there is an association, whether it be correlative or causative, in Gerrit Cole’s release point that has come with an increase in spin on both two-seam and four-seam fastballs. If further research can confirm this association, the results could be of incredible use to teams looking for value either in their player development system or trade market.


What Does April Mean?

                            HOW MUCH DOES APRIL MATTER ?

April is the first hard evidence of what may be in store for the new season. But like any single month, April usually has some extreme results. Typically 40% of all teams start conspicuously well (.600+) or poorly (.400 minus), whereas by season’s end only 12% of all teams are at those outer edges of win percentage. With 85% of the season to go, plenty of time remains for fates to change. Or do they really alter that much post-April? This study focuses on that question and is based on the April records of all teams from 2000-2017 compared to their post-April results and chances of making the playoffs. Two issues are addressed: first, how closely teams’ remaining seasons have matched their Aprils, and second, what effect April has had on teams’ playoff chances.

                 PREDICTIVE VALUE OF APRIL FOR REMAINDER OF SEASON (ROS) RESULTS

April records of all teams from 2000-2017 were divided into six win % categories: Excellent (.650+), Good (.550-.649), Slightly Positive (.500-.549), Slightly Negative (.450-.499), Weak (.350-.449) and Poor (below .350). Teams in each April win category were compared to their post-April and full season win percentages. The percentage of teams in each April category who played at playoff level (.580+), contention level (.540+) or near-contention level (.500+) after April was also measured, as well as the percentage of teams in each April win category who made the playoffs. Following are the results:

April W/L    Teams   April     Last 5 Month   Full Yr   .580+ Last   .540+ Last   .500+ Last   Made       Pct of      Pct of
Category             Win %     Win %          Win %     5 Months     5 Months     5 Months     Playoffs   All Teams   Playoff Teams
.650+        60      0.688     0.535          0.558     25%          53%          72%          60%        11%         23%
.550-.649    119     0.595     0.519          0.530     18%          44%          65%          43%        22%         33%
.500-.549    104     0.519     0.510          0.511     19%          37%          55%          33%        19%         22%
.450-.499    75      0.472     0.495          0.491     11%          29%          48%          20%        14%         10%
.350-.449    126     0.409     0.484          0.473     11%          25%          37%          14%        23%         11%
under .350   56      0.305     0.445          0.424     4%           7%           23%          4%         10%         1%

Each level of April performance has had better remaining performance and playoff chances than the level immediately below it. Contrasts between the top 1/3 and bottom 1/3 April teams are stark.  Two thirds of teams who’ve started well (.550 +) play  .500 + ball afterward and roughly half make the playoffs.  Only one third of teams who have been .449 or worse in April play .500+ ROS and just 9 % of these early strugglers have made the playoffs. Nearly 4 of every 5 of all playoff teams were .500+ or better in April and less than 1 in 10 playoff teams started .425 or worse. April has done a good job of quickly identifying contenders and non-contenders.
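The pooled figures for the strong-April group can be recomputed from the table by weighting each category by its team count. The sketch below does that. The names `april_table` and `pooled_rate` are made up for illustration, and because it works from the table’s rounded percentages, its results can differ by a point or two from figures computed on the raw team data.

```python
# Per April category: (number of teams, share playing .500+ ROS, share making playoffs),
# copied from the table's rounded percentages.
april_table = {
    ".650+":      (60, 0.72, 0.60),
    ".550-.649":  (119, 0.65, 0.43),
    ".500-.549":  (104, 0.55, 0.33),
    ".450-.499":  (75, 0.48, 0.20),
    ".350-.449":  (126, 0.37, 0.14),
    "under .350": (56, 0.23, 0.04),
}

ROS_500 = 1    # column index: share playing .500+ after April
PLAYOFFS = 2   # column index: share making the playoffs


def pooled_rate(categories, column):
    """Team-weighted average of a rate across several April categories."""
    teams = sum(april_table[c][0] for c in categories)
    hits = sum(april_table[c][0] * april_table[c][column] for c in categories)
    return hits / teams


strong_starts = [".650+", ".550-.649"]
print("strong Aprils, .500+ ROS:", round(pooled_rate(strong_starts, ROS_500), 3))
print("strong Aprils, playoffs: ", round(pooled_rate(strong_starts, PLAYOFFS), 3))
```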

Much of that comes from April strongly relating to prior season results and current season records having a solid resemblance to prior season records. So April simply confirms that many (if not most) teams are headed for the same general fate as last year. While these are generally reliable maxims, they’re hardly infallible. Many teams’ Aprils wrongly suggest things will stay the same. Likewise, April often indicates that a change is coming in the current season and it doesn’t materialize. That will be explored next: just how often does April “fool”?

HOW MANY TEAMS ARE APRIL “FOOLERS” ?

Roughly 3 in 8 teams had more than .100 pt change in their April v. rest of season (ROS) win percentage. For some teams at the April extremes, even huge April v. ROS differentials don’t change their basic seasonal fate. The 2003 Yankees started the year at .769 win %  but played “only” .596 ball the rest of the way and won over 100 games.  Conversely a .260 point jump in win % for the 2000 Tigers over the last 5 months still didn’t push them over .500 on the season. Still for a great number of teams the last 5 months can wash out a great deal of the good or bad April does. Hot starters fall apart and teams buried in April can rise from the ashes.

To show both the amount and type of “foolery” April provides, a logical starting point is prior year records. Fully 70% of teams with 90+ wins the prior year have .500+ Aprils v. 37% for teams with 69 or fewer wins. Since many teams tend to have Aprils which are “characteristic” of their prior season, measuring the true amount of “April deception” can be done in two ways: 1 – How many Aprils “characteristic” of last year give true v. false signals of another similar season? 2 – How often do “uncharacteristic” Aprils end up being true v. false signals of a better or worse year?

 1st – How reliable are “characteristic” Aprils? The following chart illustrates this:

Prior Yr Wins   April Win Category   # Teams   % win 90+   % win 81+   % Playoffs
90+             .520+                83        64%         84%         62%
80-89           .520+                71        48%         76%         49%
70-79           .499 minus           73        5%          22%         7%
Below 70        .499 minus           65        3%          6%          6%

As can be seen, prior year good teams with good Aprils have had even better odds than their already favorable odds of making the playoffs. On the other side, prior year sub-.500 teams who have sub-.500 Aprils have seen their already thin odds get even slimmer. Only 9 of 138 such teams have overcome the odds of a bad start. The percentage of teams finishing above .500 is also remarkably different: 80% of good teams with good Aprils end up above .500 on the year, where a mere 14% of bad teams with bad Aprils do. Conclusion: “characteristic” Aprils are highly reliable indicators of either continued contention or non-contention.

“Highly reliable” does not mean “perfect”. The 90+ game winner/good April formula didn’t work for  defending world champions 2004 Marlins, 2013 Giants nor defending NL West champion 2008 D-Backs or 2005 Dodgers.  Nor did bad team/bad April deter the 2015 Rangers, 2011 D-Backs, and two Rockies teams (2009, 2007) plus the 2007 Cubs from rising up and making the playoffs.  But these are exceptions.

Applying these principles of “characteristic” Aprils to 2018 would bode well for the Red Sox, Yankees, Astros, Cubs and Diamondbacks. It would not for the Tigers, Rangers, Marlins, Padres, White Sox, Orioles, Royals and Reds.  Only 1 in 15 sub-.500 prior year teams with sub-.500 Aprils have made the playoffs so such odds would indicate  none of the above 8 will either.  Of course none of these clubs were expected to contend but neither were the Braves, Phillies, Pirates, and Mets who’ve had good Aprils.  Which brings up “uncharacteristic” Aprils.

2nd – How often are “uncharacteristic” Aprils false or true signals of a change in team fortunes ?

Logically when Aprils deceive, teams often return to their “true selves” (good or bad) in the last 5 months. The data largely supports that logic.  But it also supports the notion that some uncharacteristic Aprils correctly signal changes in a team’s fortunes. This time the data is parsed more finely to show where the false and true indicators of change can appear.

Prior Yr        April        % win 90+   % win 81+   % in playoffs
90+ wins        .401-.499    26%         77%         29%
90+ wins        .399 minus   14%         43%         21%
70-79 wins      .600 plus    39%         72%         44%
70-79 wins      .500-.599    22%         51%         25%
Below 70 wins   .600+        18%         64%         18%
Below 70 wins   .500-.599    7%          22%         14%

 

More extreme “uncharacteristic” Aprils have greater accuracy in signaling real change. Of the 90 + winners who started below .400 in April, less than half finished above .500 and only 22 % made the playoffs. By contrast, 90 + prior year winners who had .520+ Aprils made the playoffs 62 % of the time.  So very bad Aprils from previously good teams are strong early warning signs of trouble ahead.

What about extremely strong Aprils from previously bad teams? Of the 29 teams who had prior losing records but .600+ Aprils, 20 finished above .500 and 10 made the playoffs. This includes some notable turnarounds: 2000 Cardinals, 2000 White Sox, 2006 Tigers, 2012 Orioles, 2013 Red Sox, 2015 Astros, 2015 Cubs, 2017 Rockies, 2017 D-Backs. Of course, excitement over great Aprils by prior bad teams should be tempered by the fact that 65% of such teams have still missed the playoffs. This may somewhat dampen expectations raised by the start of the 2018 Mets. Then again, the 2015-2016 Mets were playoff teams, so it’s quite possible their strong April 2018 could be a legitimate sign of revival. Teams who gained 15 or more wins over the prior year showed an average jump of 120 points in April v. prior year win %. The Mets have jumped 220 points.

Mildly uncharacteristic Aprils have a higher incidence of sending wrong signals. Only 22 % of the bad (“below 70”) teams with .500-.599 Aprils finished over .500. Only 23% of the 90+ winners who started .400-.499 became bad sub-.500 teams thereafter. Former 70-79 winners with good but not great Aprils (.500-.599) have boosted their playoff odds but at 20% those odds are still below average. Seven teams fit this description in  2018 – Mariners, Pirates, Phillies, A’s, Braves, Blue Jays, Giants.  Historical odds are that 1 will make the playoffs, 2 would be very optimistic.

However, one type of mildly uncharacteristic April is very noteworthy. Previously good teams who show up with modestly bad Aprils (.400-.499) have seen a big drop in playoff odds (29% for bumpy .400-.499 starts v. 62% for “characteristic” good Aprils).  That is not good news for 2018 Dodgers, Nationals or Twins who all were expected to be playoff teams again.  Which brings up the next issue – how does April influence playoff chances ?

                  WHAT IMPACT DOES APRIL HAVE ON MAKING THE PLAYOFFS  ?

As one old adage goes, “you can’t win the pennant in April but you can lose it”. While there is definite truth in this, April has the fewest games of any month and still leaves ample time to recover. The 2001 Oakland A’s had a miserable 8-17 April but went on to win 102 games. That said, they were the only sub-.400 April team to win 100+ games and one of only 8 teams who started that poorly and still made the playoffs. Another 10 such bad-start teams played the rest of the season at contender levels (.540+) yet failed to make the playoffs, and their poor Aprils were instrumental in that.

While it is clearly possible to recover from a bad start, bad Aprils leave a diminished margin of error. Two-thirds of all playoff teams start the year solidly (.530+), the vast majority (78%) are at least .500, and fully 95% of all playoff teams have avoided disastrous (sub-.400) Aprils. Teams who’ve started miserably haven’t been able to count on fellow playoff contenders being in the same underwater boats. They have to play serious catch-up with rivals whose yachts have begun to float away. An “average” playoff team has a .564 April win percentage. So a club with a 9-15 April usually has to catch teams who’ve gone 14-10 or better. If the 15-9 April team plays at “only” .537 ROS and gets 88 wins, it takes a .580 ROS from the poor April starter to overcome that. Playing at a .580+ level ROS has been done by only 15% of all teams, which equates to being one of the top three teams in one’s league after May 1st.
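The catch-up arithmetic can be sketched as follows, assuming a 162-game schedule and a 24-game April. The function names are hypothetical. Under these assumptions a rival that starts 15-9 and plays .537 the rest of the way projects to roughly 89 wins, a win above the 88 quoted above, and a 9-15 starter needs a ROS pace of about .580 to match it.

```python
def projected_wins(april_wins, april_losses, ros_pct, season_games=162):
    """Full-season wins for a team with a given April record and ROS win percentage."""
    remaining = season_games - (april_wins + april_losses)
    return april_wins + ros_pct * remaining


def required_ros_pct(april_wins, april_losses, target_wins, season_games=162):
    """ROS win percentage needed to reach target_wins after a given April record."""
    remaining = season_games - (april_wins + april_losses)
    return (target_wins - april_wins) / remaining


# A rival with a strong April and a merely good remainder of the season.
rival_total = projected_wins(15, 9, 0.537)

# What a 9-15 starter must do after April to finish level with that rival.
needed = required_ros_pct(9, 15, rival_total)

print("rival projected wins:", round(rival_total, 1))
print("ROS pace needed by the 9-15 team:", round(needed, 3))
```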

Those 8 playoff teams who started sub-.400 averaged a .604 ROS win percentage.  Five of the eight were .580 + ROS and the lowest ROS win percentage was .572 which translates to 93 wins over a full season.    In addition, there were four teams with a .580+ ROS that missed the playoffs since their average Aprils were 9 – 14.    Despite Herculean May 1st-on efforts and being better ROS than their key divisional or wild card rivals those four teams (2004 Giants, 2005 Indians, 2011 Red Sox, 2012 Angels) lost out on the playoffs because of their inferior April records to rivals who had solid or even stellar starts.

 

Losing ground to playoff competitors due to a bad April and rivals’ typically good starts makes the task tougher. Even slightly below .500 starts mean a club usually needs to play at .560+ ROS to catch up, as most playoff-bound teams are already above .550 in April. The other side is that very strong Aprils can provide a cushion to play at less than .550 ROS (the average playoff team is .581 ROS). There are even occasions where teams with less than a .520 ROS have made it in due to strong Aprils: 2016 Mets, 2015 Astros, 2014 A’s, 2006 Cardinals, and 2000 Yankees. Ironically, there are two World Series winners in that group (Cardinals and Yankees). The Cardinals were particularly unusual as they were the only team to make the playoffs with a sub-.500 ROS record.

The lone .600+ ROS team who missed the playoffs (2005 Indians) provides a classic example. They were the second best team in the AL after April, outplaying their division rival and subsequent World Series winner, the White Sox, 84-55 to 82-56 ROS. However, with the 17-7 April of the Sox and the Indians’ poor 9-14 start, the Sox gained a 7.5 game cushion. The double whammy was that the Tribe’s poor April also cost them the wild card to Boston. Had the Indians played even 12-13 instead of 9-14, they’d have ousted Boston. Of course, Cleveland’s 13-16 July didn’t help either, nor did going 1-5 the last week of the season (including a 3 game sweep by the Sox) after the Indians had whittled the lead down to 1.5 games on Sept. 24. But the April cushion built by the Sox allowed them to withstand an incredible Aug/Sept run by Cleveland, and the Tribe was forced to play unbelievably well to stay in the hunt. This all happened before the second wild card was introduced in 2012, and if it had applied back then Cleveland would have made it as that second wild card. So has this second wild card now made it easier for April stumblers to recoup?

HOW THE SECOND WILD CARD HAS CHANGED PLAYOFF ODDS

Adding a second wild card team has changed the odds in some meaningful ways. Prior to 2012, teams had a 94% chance of making it with 92+ wins, but only 43% of teams with 87-91 wins did. Since 2012, teams have had a 100% chance of making it with 92+ wins, and an 83% chance with 87-91 wins.

Clearly, teams with 87-91 wins now stand nearly twice the chance of making the playoffs as before, when 92 wins was the benchmark to lock in a spot. Prior to 2012, good Aprils helped, but a team needed to be stoking the engine every month to reach that 92-win plateau. Only when teams hit the magic .580 ROS (a 94-win pace over a full season) did they have a near lock (96%) on getting in. Now that lock is at .560 ROS. This is a critical difference. It can make it easier for a strong April team to ease into the playoffs with a lower ROS, or it can make it easier for a team that stumbles in April to recoup. In fact, for those teams who started below .450 and then made the playoffs prior to 2012, the average ROS was .604. For the three teams that have done that since 2012, the average ROS is .574. But the fact that only 3 teams have made it with subpar Aprils says the new wild card system has not been a boon to bad April starters.

It’s a lot easier to have a .600+ April than a .600+ ROS: 111 teams have had such Aprils, but only 42 have done so ROS. The new wild cards have actually made it easier for the good April teams to coast in with lower ROS records, and there is no shortage of good April teams. Since 2012, an average of 15 teams each year have started with .520+ Aprils to compete for 10 playoff spots. Of those teams, 56% (or 8 per year) make it in. That leaves only 2 spots on average per year for sub-.500 April teams to compete for.

Following is a breakdown of the number of playoff teams since 2012 at each ROS level, categorized by their April win levels:

APRIL WIN LEVEL   .600+ ROS   .580-.599 ROS   .560-.579 ROS   .540-.559 ROS   .520-.539 ROS   .500-.519 ROS   # TEAMS
.600+             4           4               6               4               2               3               23
.550-.599         3           4               2               2               2               0               13
.500-.549         4           3               4               1               1               0               13
.450-.499         3           3               1               1               0               0               8
.400-.449         0           0               1               0               0               0               1
below .400        0           0               2               0               0               0               2

As can be seen, 49 of the 60 playoff teams were .500+ in April. However, 91% of the under-.500 April teams who made the playoffs were forced to play .560+ ROS to get in, whereas only 64% of teams who started .550 or better had to play at that level; the rest made it with a sub-.560 ROS. So the second wild card has so far given fast-starting April teams a better chance to coast into the playoffs at a lesser ROS pace, rather than making it easier for the slow starters to catch up. But the same math could be applied to any month, good or bad, so to paraphrase the classic Passover question: why is April so different from all other months?
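The tallies in this paragraph can be re-derived from the table. The sketch below is one way to do it, with the table keyed by April win level and each list holding the six ROS columns from left to right. The dictionary layout and helper name are my own.

```python
# Columns: .600+, .580-.599, .560-.579, .540-.559, .520-.539, .500-.519 ROS
TABLE = {
    ".600+":      [4, 4, 6, 4, 2, 3],
    ".550-.599":  [3, 4, 2, 2, 2, 0],
    ".500-.549":  [4, 3, 4, 1, 1, 0],
    ".450-.499":  [3, 3, 1, 1, 0, 0],
    ".400-.449":  [0, 0, 1, 0, 0, 0],
    "below .400": [0, 0, 2, 0, 0, 0],
}


def ros_560_plus(rows):
    """Return (teams at .560+ ROS, total teams) for the given April rows."""
    strong = sum(sum(TABLE[r][:3]) for r in rows)  # first three columns
    total = sum(sum(TABLE[r]) for r in rows)
    return strong, total


all_teams = sum(sum(counts) for counts in TABLE.values())
over_500 = sum(sum(TABLE[r]) for r in (".600+", ".550-.599", ".500-.549"))
slow = ros_560_plus([".450-.499", ".400-.449", "below .400"])
fast = ros_560_plus([".600+", ".550-.599"])

print(over_500, "of", all_teams)                  # 49 of 60
print(slow, f"{slow[0] / slow[1]:.0%}")           # (10, 11) 91%
print(fast, f"{fast[0] / fast[1]:.0%}")           # (23, 36) 64%
```

By this count, 10 of the 11 sub-.500 April teams needed a .560+ ROS, against 23 of the 36 teams that started .550 or better.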

PSYCHOLOGICAL EFFECTS OF GOOD OR BAD APRILS

Despite the fact that April is only 15% of the overall season, April results can still have considerable impact when early-to-mid-season personnel decisions are being made. April’s record can influence decisions such as how much patience a team has for a younger player with early-season struggles, whether the team tries for a mid-May trade to replace an injured starter, and whether it considers promoting a top-notch AAA player despite his arb clock issues. When trading season starts in June, 35%-40% of the team’s record at that point has been baked in by April’s wins and losses. Even by the late July deadline, April still comprises 25% of the season. If early results have affected fan attitudes and attendance, ownership may be either more or less willing to commit dollars to bigger-name players at the deadline.
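For a rough sense of what those shares mean in games, here is a small sketch. It assumes April is 15% of a 162-game schedule, as stated above, which works out to about 24 games. The implied games-played figures are simple arithmetic from that assumption, not actual schedule data.

```python
SEASON_GAMES = 162
APRIL_SHARE_OF_SEASON = 0.15  # figure quoted in the text

# Assumed number of April games implied by the 15% figure (about 24).
april_games = round(SEASON_GAMES * APRIL_SHARE_OF_SEASON)


def games_played_at_share(share, april=april_games):
    """Games played at the point where April makes up `share` of the record."""
    return round(april / share)


for share in (0.40, 0.35, 0.25):
    print(f"April = {share:.0%} of record after about "
          f"{games_played_at_share(share)} games")
```

Under that assumption, the 35%-40% range corresponds to roughly 60 to 69 games played, and the 25% mark to about 96 games.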

These factors may give April importance beyond its mathematical impact on the standings. That teams tend to mirror their April win-loss percentages as the season progresses may be in part because April can create a sort of self-fulfilling prophecy. The 2014 Cleveland Indians were 75-60 after April 30th; the Oakland A’s were 70-63 over the same span. Yet Oakland got the wild card by 3 games over Cleveland due to an 18-9 April vs. the Indians’ 10-17. Early-season results were still impacting July decisions, as the A’s were buyers and the Indians sellers. To be sure, the A’s record worsened post-July and the Indians got better. But without Jon Lester (2.35 ERA with the A’s) and Jeff Samardzija (3.14), who knows how much worse it might have been for the A’s. While ridding themselves of Justin Masterson may have helped the Indians, and trading Cabrera didn’t hurt, how much better would they have been had they gotten an OFer and a starting pitcher in July instead of being sellers?

Conversely, the 2016 White Sox benefited from a 17-8 April despite a May tailspin, which left them with a 29-27 record on June 4th and only 2 games behind in the AL Central. They then traded a very talented young prospect named Fernando Tatis Jr. for James Shields. Needless to say, this is a trade that has not worked out well in the short or long term. Despite the fact that the Sox had 3 straight losing seasons prior to 2016, management seemed to believe that April/early May represented the success they felt the team was capable of, as opposed to the more recent reality of losing. Successful Aprils can sometimes keep wishful thinking alive for too long.

CONCLUSIONS

Clearly, April has proven to be a good proxy for a team’s chances going forward that season. But April records have to be viewed in the context of all the evidence. How much of a factor were injuries, over- or underperformances, or new offseason acquisitions? Thoughts of dumping contracts and rebuilding, while premature on May 1st, are still logical for clubs that were poor the prior year and are off to bad starts.

For clubs in the playoff hunt, April can have a real impact, since 3-5 more or fewer wins can make or break a season. Since 2013, teams with 87-88 wins have made the playoffs in 10 of 11 cases, whereas only 2 of 14 teams with 85-86 wins have. Clearly these margins apply to other months’ results too, but as noted before, a very good or bad April can affect team decisions in June and July. Having some breathing room afforded by a 16-10 April instead of the catch-up pressure of a 10-16 start can play into the psychology as well.

Winning the division is far preferable to having to win a one-game playoff as a wild card, and April provides a good checkup on division rivals. Both Boston and NY look like they will fight it out all year, although Toronto can’t be ignored. In the Central, the Indians are in a strong position, with their chief rival, the Twins, already 4.5 games back and the rest of the Central made up of bad teams who’ve started with bad Aprils. Houston’s good start helps, especially since all of their closest chasers (Mariners, Angels, A’s) were sub-.500 teams last year, though the Angels improved through offseason acquisitions. In the NL East, even though the Nationals are 5 games back, they’re chasing teams who were all sub-.500 last year (Mets, Braves, Phillies). The Dodgers, by contrast, are 8 games back and pursuing a D-Backs team that won 93 last year. Turner’s absence hurt in April, but so did the lack of offense from the rest of the team, which may continue, particularly since Seager is lost for the year. Plus, the bullpen woes cannot be overlooked. So April’s 12-16 record cannot be easily dismissed as an aberration. Nor can the historical evidence of diminished playoff odds (in the 20-25% range) for good teams who’ve had the Dodgers’ kind of April. We shall all see soon enough.


World Series Hangover: A Different Look

Following up on a recent Jay Jaffe post, I am examining the question of whether there is a World Series hangover. Unlike that post (which was great, but answered a slightly different question than the one I am interested in), I compared the full-season performance of World Series winners and losers relative to their true talent level. I looked at all teams that went to the World Series in 2012-2016. I only went back to 2012 because that is as far back as I could find projections in my less-than-thorough internet search. Finally, I omitted the 2017 teams because they have played too few games this year for my purpose. As a proxy for true talent level I used projected wins from Clay Davenport.

Why did I look at actual versus projected win totals? I did not find changes in absolute win totals informative in terms of the question I was asking. Teams change year by year, so changes in absolute win total could simply reflect changes in talent level. By using projected wins as a baseline I hoped to control, at least somewhat, for changes in talent level. Using projected wins as a baseline also allowed me to examine whether any changes in performance across years were due to over/underperforming in the World Series year versus over/underperforming in the year after the World Series.

Let us get to it. Below you will find the projected win total (pWins), the actual win total, and the average projected win total (average pWin) for the World Series teams in the year they went to the World Series (WSyear) and the year afterwards (WSyear+1). Busy figure, bear with me.

Win totals versus projected win totals

My main point here is that the average projected win total is similar for the World Series year and the year after (a 1.3 win increase). The second point is to show the raw data as good practice. Next, I cared about how the teams performed compared to their projections in each year. That information can be found in the figure above, but better yet, here is a figure showing actual win total minus projected win total for the year the teams went to the World Series.

Actual win total - projected win total (WS year)

This is interesting. Teams that went to the World Series outperformed their projections by 8.2 games on average. With the exception of the 2012 Tigers, all teams outperformed their projections (note that the 2017 Astros and Dodgers outperformed their projected win totals by six and seven games, respectively). If each team were equally likely to beat or miss its projection, the probability of 9 of 10 teams outperforming their projected win total would be 0.010. Teams that go to the World Series outperform their talent level. What about in the year after the World Series? Below is the same figure as above with the year after the World Series added.
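The 0.010 figure is consistent with a simple binomial calculation, sketched below. It assumes each team independently has a 50% chance of beating its projection, which is my reading of the calculation rather than something the post states.

```python
from math import comb


def prob_exactly(k, n, p=0.5):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k, n, p=0.5):
    """Binomial probability of k or more successes in n trials."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


print(round(prob_exactly(9, 10), 3))   # exactly 9 of 10
print(round(prob_at_least(9, 10), 3))  # 9 or more of 10
```

Exactly 9 of 10 comes to 10/1024, or about 0.010. Counting the 10-of-10 case as well gives 11/1024, or about 0.011, so the conclusion is the same either way.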

Actual win total - projected win total (year after World Series)

Alright then. In their post-World Series season, teams have, on average, performed right at their true talent level (-0.8 wins). What have we learned? Obviously the sample is small, and the data for the year after the World Series trip is quite noisy. That said, within this sample, teams were projected to win a similar number of games in their World Series year as in the year after. They substantially outperformed their projections in the year they went to the World Series. They then came back to earth in the year after their World Series trip.

Keeping in mind my question was regarding a year-by-year change in a team’s performance relative to their true talent level, I conclude that there is a World Series hangover of a sort. Yet, its nature is quite different than one might think. Rather than teams underperforming after going to the World Series it appears that they over-performed in the year they went to the World Series. In other words, any World Series hangover may result from our powerful friend regression.


The Effect of Batted-Ball Direction on Launch Angle

Fortunately, Statcast now has a function that allows you to sort by batted-ball direction. This opens up the chance for some new studies. Until now we just had launch angle (LA) and exit velocity (EV); however, that is not quite sufficient, because we already knew that it is easier to pull fly balls for power. This was known intuitively for a long time https://www.fangraphs.com/fantasy/getting-to-know-fly-ball-pull-percentage-fb-pull/ but was hard to quantify until now.

One of the effects is certainly that parks are bigger in center field than they are down either line. However, I also looked at the EV and average distance of balls pulled, hit to center, and hit oppo at launch angles of 20-35 degrees, which are typical HR angles. For this article I only looked at right-handed hitters; -45 to -15 degrees was defined as pull, -15 to 15 as center, and 15 to 45 as oppo.
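The bucketing described above could be sketched as follows for right-handed hitters. The function names are hypothetical, and how the exact boundary values (-15 and 15 degrees) are assigned is my assumption, since the text does not say.

```python
def classify_direction(spray_angle_deg):
    """Bucket a right-handed hitter's batted ball by horizontal spray angle.

    Negative angles are the pull side. Boundary handling is an assumption:
    exactly -15 and 15 degrees are counted as center.
    """
    if -45 <= spray_angle_deg < -15:
        return "pull"
    if -15 <= spray_angle_deg <= 15:
        return "center"
    if 15 < spray_angle_deg <= 45:
        return "oppo"
    return "foul"  # outside the 90-degree field of play


def is_hr_angle(launch_angle_deg):
    """True for the 20-35 degree launch-angle window used in the article."""
    return 20 <= launch_angle_deg <= 35


print(classify_direction(-30), is_hr_angle(27))  # pull True
print(classify_direction(5), is_hr_angle(12))    # center False
print(classify_direction(40), is_hr_angle(35))   # oppo True
```

For left-handed hitters the sign of the spray angle would have to be flipped before applying the same buckets.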

View post on imgur.com

You can see that pulled balls yield an average distance of 343 ft and an EV of 92.4 mph. To center it is slightly lower (338 ft/91.4 mph), but to the opposite field it drops dramatically to 290 ft/86.2 mph. From a physics standpoint that makes sense, because contact on inside pitches is supposed to be made further out front, so the swing is slightly longer and thus has more time to accelerate to contact, which probably means more bat speed at impact.

wOBA supports this: while liners are relatively stable in production, the wOBA of pulled fly balls is dramatically higher. On grounders this trend is reversed, and oppo grounders are better than pulled grounders.

View post on imgur.com

I also looked at the top and bottom 20 of the league in pull and oppo LA:

View post on imgur.com

You can see that pull LA has a pronounced positive effect, while oppo LA even has a slightly negative effect. It might make sense to try to lift more on pulled balls and to try to slightly suppress LA (“get on top”) on balls hit oppo. I am not sure if this is possible with the same swing, though; I think the guys with a high FB pull rate usually also have high grounder pull rates, because that is the natural tendency of the swing.

So it seems to be pretty simple: pull the ball in the air and be productive.

However, it isn’t quite that simple. Even before Statcast, it was known that pulled balls are hit on the ground at a much higher frequency https://www.fangraphs.com/blogs/the-pros-and-cons-of-pulling-the-baseball-2/.

Launch angle supports that: pulled balls last year had an average LA of 5.6 degrees, vs. 13.1 for balls up the middle and 20 degrees oppo.

This makes sense, and it is actually something that isn’t easily combatted with the modern swing. The modern swing goes slightly up, and pulled balls are hit out front. You can lift a ball like this, but if you are a little too far out front, the bat has risen above the plane of the pitch, which means you hit the top of the ball and roll over, hitting a hard topspin grounder, often into the shift.

This is especially pronounced on low pitches.

There are some hitters who have developed a tool to combat that rolling over with the uppercut swing by using a steeper bat angle, as I have shown in this article https://www.fangraphs.com/community/finding-keys-to-elevate-the-ball-more/ but it is not easy to do, as the league still tends to have much lower launch angles on low and especially away pitches https://www.fangraphs.com/community/effect-of-pitch-selection-on-launch-angle-and-exit-velocity/.

I broke this down a little more by looking at batted-ball directions and pitch locations inside the zone:

View post on imgur.com

You can see that low pitches that are pulled are especially hard to lift. The effect is most extreme on low-and-away pitches, but even the down-and-in pitch only yields a modest 6-degree LA.

I also looked at pulled balls above 10 degrees on low pitches. The leaders in that stat were, in this order, Stanton, Machado, Salvador Perez, Hunter Renfroe, Nelson Cruz and Mookie Betts. Those were some pretty good hitters last year, so maybe that is a skill that deserves further examination.

We have all seen Bryce Harper pull outside pitches for a homer, and it does happen, but generally trying to pull anything away is not a good recipe. If it works, it is usually on pitches up (pulling up-and-away pitches still yields a positive 7-degree LA).

An adjustment that might make sense is trying to hit up-and-away and middle-away balls to center rather than the other way. That way you could bring down the average LA on those pitches from a too-high upper-20s average to a better low-20s average, which yields a better BABIP on those pitches, which tend to yield lower EVs. I elaborated in this article on why mid-20s LAs are ideal but average LAs should actually be lower (between about 12 and 18 degrees or so)
https://www.fangraphs.com/community/why-launch-angle-can-only-be-optimized-not-maximized/.

Overall an LA optimizing strategy using batted ball direction could look like this.

View post on imgur.com

So pulling the ball is good, but only if you have the skill to put it in the air. Selecting the right pitches to do it on certainly helps. On pitches that are low and away, it still makes sense to follow the old advice to hit it where it is pitched. And for pitchers it might make sense to work the outside corner more; however, that is also a fine line, since you need to prevent the old Jose Bautista strategy of creeping closer to the plate and turning the outside pitch into a middle pitch. For this you need to pitch inside some to keep the hitters honest.