Archive for Research

Reason For Optimism For… Matt Davidson?

Matt Davidson was not good last year. He got 443 plate appearances in his first full MLB year on a rebuilding White Sox club, and it didn’t go well: he posted a WAR of -0.9, the seventh-worst mark in MLB for position players with at least 400 PA. There’s little mystery how he got there, as he combined DH-only caliber defense with a paltry 83 wRC+.

Davidson achieved that uninspiring number by hitting like a three-true-outcomes guy without the walks, more or less a poor man’s Chris Carter. Good news first: last year, he ran a pretty decent ISO of .232, putting him close to good-to-great hitters like Francisco Lindor, Anthony Rendon, and Anthony Rizzo, and he cracked 26 homers along the way. His raw strength is very real: he blasted a tape-measure 476-foot moonshot out of Wrigley with a 111 MPH exit velocity in July. Big power is a good trait to have, but it’s been devalued in today’s game, where guys like Carter and Logan Morrison can hit 35+ homers in a year and then can’t find contracts of even $5M the following offseason.

Still, significant pop is necessary for a high offensive ceiling, so what’s holding Davidson back? In a word, strikeouts. He struck out a horrifying 37.2% of the time in 2017, second-most in the majors. Unsurprisingly, his whiff rate was a scary 16.3%, sixth-highest among his peers; for reference, that’s identical to how often hitters swung and missed against Andrew Miller last year. The walk rate that keeps most K-prone sluggers’ OBP somewhat afloat wasn’t in evidence, as Davidson walked only 4.3% of the time. You won’t be shocked to find that he finished second-worst in BB/K with an ugly 0.12. Although he did hit the ball hard (we’ll come back to that), his flyball-heavy batted-ball profile and below-average speed suppressed his BABIP to .285, a mark in close agreement with his xBABIP of .283.

The astronomical K% and below-average BABIP held him to a meager .220 AVG, which, combined with the poor BB%, led to a truly abysmal OBP of .260, second-worst among hitters with 400+ PA. The only guy worse in that column was Rougned Odor, who has a similar offensive profile but can at least partially blame a particularly unlucky .224 BABIP.

Looking at last year’s stats, there appears to be approximately zero reason for optimism for Matt Davidson. He hit for power, but he was near the top of all the peripheral leaderboards you really don’t want to be at the top of. So why write this post at all? In short, Davidson seems to have turned over a new leaf this spring.

Now, I know the sabermetric kneejerk reaction to that last sentence: spring training means nothing, and spring training stats mean less than that. But that’s not entirely true, as this excellent piece in the Economist from back in 2015 details. If you don’t want to read the whole piece, that’s fine, because it can be summed up very briefly: a hitter’s strikeout rate in spring training actually has a pretty high correlation with his strikeout rate in the regular season. Of course, one of the chief objections to drawing conclusions from spring training stats is the tiny sample sizes we’re working with. Fortunately, strikeout rate is one of the fastest-stabilizing peripheral rates there is; FanGraphs itself puts the stabilization threshold for strikeout rate at about 60 PA.

That piece was linked somewhere recently, and I read it for the first time. A couple days later, entirely starved for any form of baseball through this long winter, I hit rock bottom: scouring the spring training stats of the team I support, the White Sox. To my own surprise, there was actually something interesting buried there; as you might guess, it was in Matt Davidson’s stat line.

Luckily for us, and for this piece, Davidson has played more than any other White Sox hitter this spring, totaling 60 PA as of March 20. He’s struck out twelve times, a K rate of 20%. He has walked seven times, a walk rate of 11.7%. In this small sample, he’s nearly halved his strikeout rate and nearly tripled his walk rate from 2017. On the one hand, that sounds like an insane improvement that cannot possibly be maintained; on the other, those spring rates are by themselves quite unremarkable for a major league hitter. Using BBRef’s summed 2017 stats to calculate league-wide rates, a 20% K rate and an 11.7% BB rate would both have been slightly better than the 2017 league averages.
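For the curious, here is the back-of-the-envelope arithmetic (a minimal sketch; the league baselines are approximate 2017 rates computed from BBRef’s summed totals, as described above):

```python
# Davidson's spring sample through March 20.
pa, so, bb = 60, 12, 7

k_rate = so / pa    # 0.200 -> 20.0% (down from 37.2% in 2017)
bb_rate = bb / pa   # 0.117 -> 11.7% (up from 4.3% in 2017)

# Approximate 2017 league-wide rates from BBRef's summed totals.
league_k, league_bb = 0.216, 0.085

print(f"K%:  {k_rate:.1%} (2017 league average ~{league_k:.1%})")
print(f"BB%: {bb_rate:.1%} (2017 league average ~{league_bb:.1%})")
```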

A significant walk rate improvement wouldn’t actually be terribly surprising. Peruse Davidson’s player page and you’ll find that, from Double-A onwards (a total of five seasons spent mostly at Triple-A, plus a month in the majors with Arizona), he never posted a BB% worse than 9.1% before last year, ranging up to 12.0%. His walk rate at least doubling this coming year wouldn’t come out of left field; rather, it would be him returning to the player he has been, in that sense, for pretty much his entire professional career minus last year. It will probably come down from 11.7%, given that MLB pitchers likely have better control than those he’s faced this spring, but a big jump in walk rate still seems likely for him this year.

That strikeout rate is a different animal, though. He’s always struck out a lot, never posting a K rate below 20% at any stop in the minors, and the whiff rate mentioned previously supports that. On the other hand, the sample size is now at the point where this being a complete fluke is pretty unlikely. Is this a real improvement or a mirage? I don’t know, and we don’t have plate discipline numbers in ST to see underlying patterns, but according to Davidson himself, making more contact is exactly what he’s trying to do. It sure seems like he’s succeeding in that thus far. As another small data point, he doesn’t seem to have a pattern of ST flukes in K rate, as in 58 PAs during last year’s spring training he struck out in 37.8% of his plate appearances, a number that echoes his full-season 37.2%.

This wouldn’t be as interesting a case if Davidson did nothing well offensively. He’s a large and very strong man, which is why the White Sox haven’t simply released him. Take a look at his contact profile: last year, he pulled the ball, hit more fly balls than ground balls, and vaporized balls in play, with a quality-of-contact line of 15.7% Soft/46.1% Med/38.2% Hard. His HR/FB% was a robust 22.0%, rubbing statistical shoulders with established sluggers like Nelson Cruz and Edwin Encarnacion. In short, when he actually did hit the ball, he looked for all the world like a poster child for the fly ball revolution. Those underlying numbers hint at a lot more offensive potential than anyone outside of the White Sox organization sees in him, if he could just shrink that giant 32.9 K-BB%.

Now he’s showing signs of significant improvement in that fatal flaw of plate discipline. The improvement in K% and BB% thus far in spring training doesn’t seem to have cost him much power, either, considering he’s demolished ST pitching to the tune of .358/.433/.679 (a 1.113 OPS and .321 ISO). Obviously, he’s not going to keep hitting quite that well, but the still-rebuilding White Sox aren’t about to bench or demote him. Maybe it’s all a lot of noise, and he’ll be bad again this year. Or maybe Matt Davidson, at the age of 26, is about to be the Next Big Breakout™. Just as a reminder, it took J.D. Martinez until 26 to figure it out and become the “King Kong of Slug”; Justin Turner was a 29-year-old replacement-level utility infielder who suddenly blossomed offensively in 2014; Jose Bautista was almost 30 before he turned into a nightmare for AL pitchers in 2010. So, here’s a prediction for 2018 that I would have laughed off a month ago: Matt Davidson is about to bust out in a big way.

 

UPDATE 3/29: Davidson hit three homers on a cold day in Kauffman Stadium, every single one of them with a 114+ MPH exit velocity. He also walked and did not strike out. Jump on the bandwagon now while there’s still room.


Temporarily Replacement-Level Pitchers and Future Performance

As I’d like to think I’m an aspiring sabermetrician, or saberist (to borrow Mr. Tango’s term), I decided to test my skills and explore this research question: how did starters who made 25 or more starts in one season, with an ERA of 6.00 or higher over their final 10 starts, perform in the following season? This explores whether past performance, regardless of intermediary performance, adequately predicts future performance. Mr. Tango proposed this question as a way to explore the concept of replacement level. From his blog: “These are players who are good enough to ride the bench, but lose some talent, or run into enough bad luck that you drop below ‘the [replacement level] line’.” Do these players bounce back to their previous levels of performance, or are they “replacement level” in perpetuity?

To explore this, I gathered game-level performance data for all starters from 2008 through 2017 from FanGraphs, grouped by season. I then filtered out pitchers who had fewer than 25 starts or an ERA below 6.00 in their final 10 starts. This left me with a sample of 78 starters from 2008 through 2016 (excluding 2017, as there is no next-year data yet). I assumed that a starter with an ERA above 6.00 was at or below replacement level. Lastly, as some starters were converted to relievers in the following year, I adjusted the following-year ERA accordingly (assuming relievers allow 0.7 fewer runs per nine innings than starters; see this thread).
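To make the sample construction concrete, here is a minimal pandas sketch (the file name and column layout are my assumptions, not FanGraphs’ actual export format):

```python
import pandas as pd

# Assumed layout: one row per start, with columns
# ['pitcher_id', 'season', 'start_num', 'ip', 'er'].
logs = pd.read_csv("starter_game_logs_2008_2016.csv")

def qualifies(season_starts):
    """25+ starts in the season, ERA of 6.00+ over the final 10 starts."""
    if len(season_starts) < 25:
        return False
    last10 = season_starts.sort_values("start_num").tail(10)
    return 9 * last10["er"].sum() / last10["ip"].sum() >= 6.00

sample = logs.groupby(["pitcher_id", "season"]).filter(qualifies)

def adjust_next_year_era(er, ip_start, ip_relief):
    """Put a starter-turned-reliever's following-year ERA back on a
    starter scale, assuming relievers allow ~0.7 fewer runs per nine."""
    raw = 9 * er / (ip_start + ip_relief)
    return raw + 0.7 * ip_relief / (ip_start + ip_relief)
```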

final10.png

Seems like the 10-game stretch to end each season is a bit of an aberration. The following year’s adjusted ERA is much closer to the first 15+ games than the final 10 games for pitchers in our sample. In fact, the largest difference between any first 15+ game ERA and its following year adjusted ERA is .58 runs, in 2011. The smallest difference between any last 10 games ERA and its following year adjusted ERA counterpart, for comparison, is 1.7 runs, in 2009.

Using adjusted ERA corrects for a potential slight downward bias in our following-year totals. Following-year games started fell by ~9%, while reliever innings rose from zero, and relievers, on average, have a lower ERA than starters. As mentioned above, I adjusted each season’s following-year ERA in proportion to reliever innings pitched, using the assumed 0.7 runs-per-nine difference in runs allowed between starters and relievers. Another source of potential downward bias is sample size: of the 78 pitchers who fit our sample qualifications, only 69 pitched in the majors the following season. A survivor bias could exist, in that the better pitchers in the sample kept pitching while the worse pitchers weren’t signed by a team, took a season off, or retired.

What is driving these final-10-start ERA spikes? It has been shown that pitchers don’t have much control over batted-ball outcomes. Generally, it is assumed pitchers control home runs, strikeouts, and walks – the basis of many defense-independent pitching stats. Changes in these three stats could explain what happens during our sample’s final 10 games. Looking at each stat’s raw rate per nine innings, however, would be misleading, as each season exhibits league-wide change (such as the recent home run revolution, or the ever-growing increase in strikeouts). So I calculated three metrics for each subset (first 15+, last 10, and following year) to use in evaluation: HR/9–, K/9–, and BB/9–. All three are similar to ERA– in interpretation – a value of 100 is league average, and lower values are better.

A quick aside on the math. For HR/9– or BB/9–, a value of 90 means that subset’s HR/9 or BB/9 is 10% lower, or better, than league average. For K/9–, a value of 90 means the league average is 10% lower, or worse, than the subset’s K/9. To create these measures, I calculated HR/9, K/9, and BB/9 for each subset and normalized them to the league value for each season – using the next season’s league value for the following-year rates – then scaled the ratios to 100. For HR/9 and BB/9, I divided by the league average and multiplied by 100. Because a higher K/9 is better (unlike HR/9 and BB/9), I instead divided the league average by the subset’s K/9 and multiplied by 100, which slightly changes its interpretation (as noted above).
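In code, the three indices look like this (a minimal sketch; the function names are mine):

```python
def hr9_minus(hr9, lg_hr9):
    """HR/9-: 100 is league average; lower is better."""
    return 100 * hr9 / lg_hr9

def bb9_minus(bb9, lg_bb9):
    """BB/9-: 100 is league average; lower is better."""
    return 100 * bb9 / lg_bb9

def k9_minus(k9, lg_k9):
    """K/9-: the league average divided by the subset's K/9,
    because a higher K/9 is better; again, lower is better."""
    return 100 * lg_k9 / k9

# Example: striking out 7.2 per nine against a league average of 8.0
# yields a K/9- of ~111, i.e., 11% worse than league average.
print(round(k9_minus(7.2, 8.0), 1))  # 111.1
```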

final10-2.png

As mentioned above, the issue of starters-turned-relievers within our sample likely influences our following year statistics. I was able to adjust the ERA, but I did not adjust the rate stats – HR/9, K/9 or BB/9 – as I have not seen research suggesting specific conversion rates between starters and relievers for these.

Interestingly, our sample of pitchers improved their K/9– across the three subsets, despite having fluctuating ERAs. They were below average, regardless, but improved relative to league average over time. Part of this could be calculation issues, as league K/9 fluctuates monthly, and I used season-level averages in calculations.

Both HR/9– and BB/9– get drastically worse during the 10-start end-of-season stretch, and these clearly drive the ERA increase. In fact, despite seven of the nine seasons’ samples having better-than-average HR/9 in their first 15+ starts, every season’s sample has a much-worse-than-average HR/9 in its last 10 starts, with eight of the nine coming in 40%+ worse than league average. Likewise, though less drastically, our samples’ BB/9 marks are much worse than league average in the last-10-starts subset. Unlike HR/9–, though, our samples’ BB/9– is already worse than league average in the first-15+-starts subset. The first 15+ games’ HR/9– and BB/9– are identical to the following year’s values, unlike K/9–.

It appears that starters with an ERA greater than or equal to 6.00 in their final 10 starts, assuming 25 or more starts in the season, generally return to close to their pre-collapse levels in the following year. This end of season collapse seems to be driven primarily by a drastic increase in home run rates allowed, coupled with an increase in walk rate. These pitchers performed at a replacement level (or worse) for a short period and bounced back soon after. Mr. Tango & Bobby Mueller, in their email chain (posted on Mr. Tango’s blog), acknowledge this conclusion: “they are paid 0.5 to 1.0 million$ above the baseline… At 4 to 8 MM$ per win, that’s probably an expectation of 0.1 wins to 0.2 wins.” We can debate the dollars per WAR, and therefore the expected wins, but one thing’s for sure – past performance is a better predictor of the future than most recent performance.

 

– tb

 

Special thanks to Mr. Tango for his motivation and adjusted ERA suggestion.

How Long Before Things Go Bad?

Spring is a time for optimism, in baseball and in life. Teams are starting to think about their opening day starters and more broadly, their starting rotations. Some rotations look “set” while some have a “battle for the 5th spot”. Some are toying with the idea of a 6-man rotation.

But here’s the thing: we know that (almost) every team will end up using a 6-man rotation, whether they like it or not. Eventually, your favorite team will need to call in reinforcements. This can happen because of poor performance or injury. But hey, we’ll cross that bridge when we come to it, right?

… when do you think we might come to it?

We know, as do those in charge, that teams use something like 11 starters per year (11.3 in 2017). In a six-month season, how long does it take before the first reinforcements arrive?

Cumulative Starters Used, 2017

In a few words: not very long. Some pitchers get injured, some get moved to the bullpen, some get sent to the minors. One way or another, at least one of them will be gone pretty soon, so don’t name the puppy.

Of course, fate comes at different paces. In 2017, the Cardinals didn’t use a sixth starter until June 13th. And even then, Marco Gonzales only pitched because they had a double-header. In contrast, Junior Guerra, the Brewers’ opening day starter, was injured that same opening day. He wouldn’t pitch in the majors for another seven weeks (and it turns out, not very well either).

Half of teams used a sixth starter before April 25th. 90% of teams used a sixth starter before their 50th game.

Some of those sixth starters, along with their full-season WAR: Alex Wood (3.4), Mat Latos (-0.3), Mike Clevinger (2.2), Mike Pelfrey (-1.0).

We know that teams need depth. Not only that, but life comes at you fast.

Data: Baseball Savant


Let’s Strategize Under the Potential Extra Inning Rule

As I’m sure you know, Major League Baseball is toying with the idea of putting a runner on second base sometime around the 12th inning. I’m not here to argue its validity or lack thereof; rather, I’m going to discuss and evaluate some scenarios that could happen under those conditions. It won’t be anything groundbreaking; I’ll simply walk through the metrics a team faces under the various circumstances I set up.

The following scenarios are played out with the goal of scoring at least one run in a given inning. Top or bottom of the inning, I envisage the same sort of conditions playing out for both teams. And because there is never any telling what part of the order will lead off under this setup, I speak in generalizations.

I’ve thought about what the likeliest moves under this arrangement would be, and my guess is it comes down to the most boring events in baseball: the offense bunts the runner to third, or the pitcher intentionally walks the first batter to set up the double play. Of course, there will be times when managers decide to simply attack the situation as-is. That’s a more volatile situation and therefore much harder to work with.

First, the basics. From 2010-2015, having a runner on second base with no one out produces the following:

  • The predicted number of runs scored is 1.100
  • The percent chance of scoring a run under those conditions is 61.4%

So from the get-go, the offense is expected to score a run in three out of every five chances.

Play the bunt or a standard defense?

Let’s start with the first of two scenarios: the bunt to move the runner over to third. I feel like this is the most likely action, but also the most difficult to work with, because of varying defensive strategy. Will the defense make an anticipatory shift for a bunt, or will they be in ‘straight up’ formation? In 2011, Bill James found that bunting in sacrifice situations produced a .102 batting average. Not that we needed the study; we could have guessed that you’re going to be out roughly 90% of the time.

To bunt or to swing away?

So assume the hitter lays down a bunt that moves the runner while making an out at first. Run expectancy is now 0.95, with a 66% chance of scoring a run. Your run expectancy went down 0.15 runs, BUT you increased your chances of scoring by a little less than 5%. Would bunting make sense to you as a manager? Taking out any sacrifice-type contact, if your hitter produces an out and the runner has to stay at second, your run expectancy drops to 0.664 and the chance of scoring a run plummets to roughly 40%. Still feel the same way (regardless of the hitter’s bunting ability)?
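Here is that trade-off laid out in code, using only the 2010-2015 figures quoted in this piece (a full RE24 matrix would cover all 24 base-out states):

```python
# (base state, outs) -> (run expectancy, P(scoring at least one run)),
# per the 2010-2015 figures cited above.
states = {
    ("runner on 2nd", 0): (1.100, 0.614),  # extra-inning starting point
    ("runner on 3rd", 1): (0.950, 0.660),  # successful sacrifice bunt
    ("runner on 2nd", 1): (0.664, 0.400),  # plain out, runner holds
}

start_re, start_p = states[("runner on 2nd", 0)]
for state in [("runner on 3rd", 1), ("runner on 2nd", 1)]:
    re, p = states[state]
    print(f"{state}: RE {re - start_re:+.3f}, P(score) {p - start_p:+.1%}")
# ('runner on 3rd', 1): RE -0.150, P(score) +4.6%
# ('runner on 2nd', 1): RE -0.436, P(score) -21.4%
```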

Walk or pitch to the next hitter?

Keeping with the initial decision, we have a runner on third and one out. Pitch to the next hitter, or put him on to set up the double play? The strategy could be further altered here, because the defense might be inclined to bring in a ground-ball pitcher or play the platoon matchup (lefty vs. lefty and vice versa). But again, let’s assume the team does the safest thing and walks the next hitter, putting runners on first and third with one out. That decision causes run expectancy to jump back up 0.18 to 1.13, but the probability of scoring at least one run drops to 63.4%. Would you make that same call (remember, we are in a vacuum)?

Runners on first and third with one out produce the following expectancy:

  • Average number of runs scored is 1.130
  • The chance of scoring a run under those conditions is 63%

One of a couple of outcomes will follow should you elect not to intentionally walk the hitter. He might drive in the run by putting the ball in play in various ways (sacrifice fly, fielder’s choice, hit, etc.) and accomplish what the offense set out to do: score at least once and put the pressure on the other team. Or the hitter could strike out, ground out (which could turn into a double play, an out at home, etc.), or fly out. If contact is made, this could alter our base-out states: two outs and runners at various bases (first and third, second and third, second or first should the runner somehow get thrown out at home). Due to the randomness of contact in this event, we’ll stay with the intentional walk.

To bunt or to swing away, pt II?

So what about the offensive strategy with runners on first and third and one out? The options are much broader. You could sacrifice bunt to move the runner over to second (assuming the runner on third holds), thereby dropping run expectancy to 0.580 and your scoring chances to 26%. The risk here is having the batter somehow bunt into a double play: the runner at third is tagged or thrown out, and the batter is retired at first. Do you, as a manager, take the initial risk that sets up this problem? It is challenging to turn a double play on a bunt, but a defense that’s ready for it has a much easier time.

This time, let’s assume the hitter botches the bunt to the first-base side and the overeager runner is thrown out at home (or caught in a rundown), with the batter safe at first. Now, with two outs and runners on first and second, we sit at a very poor run expectancy of 0.429 and have just over a one-in-five chance of driving in that run.

Walk or pitch to the next hitter, pt. II?

At this point, again in a neutral context, you can walk the batter to load the bases (if the hitter is too good and the next one isn’t great, etc.), or you can just pitch to him (maybe bringing in a bullpen specialist). Walking the batter gives the offense a 10% better chance of scoring and a 0.33 increase in run expectancy.

If you elect to pitch to the batter, either the final out is made or runs score. Walking the batter loads the bases and forces the defense to hope for the best. The latter situation would actually produce the most excitement, as a crucial decision would need to be made. Either way, my tangent baseball universe will end: three outs and the inning is over, or the needed run(s) score.

While I don’t necessarily agree with or enjoy the thought of the game being altered in this way, it could produce some interesting strategic decisions and test the maneuvering skills of team managers.

This post and others like it can be found over at The Junkball Daily.


Do Teams That Shift More Have Lesser Defenders?

Defensive shifts are designed to prevent hits. By placing fielders in spots of higher hit frequency, the logic follows, fewer batted balls will drop in as hits. Notably, though, as the number of shifts has drastically increased, league-wide BABIP hasn’t fallen. Since 2011, shift deployment has increased tenfold, yet BABIP has actually risen 1.7%, from .295 in 2011 to .300 in 2017. Better positioning could let teams utilize fielders who have less range, as they’d be located closer to batted balls. Do teams that shift frequently employ fielders with worse range?

First, the recent MLB environment. Through a combination of enhanced analysis and deeper data, teams across MLB are increasing shift usage. Positioning fielders in locations of high hit density, for specific batters, allows them to field more batted balls. Every team is increasing its shift usage, driving up the total number of shifts deployed.

shifts_league

The intuitive result of this would be batters recording fewer hits. As fielders reach more balls, they should convert more of those would-be hits into outs. However, league-wide BABIP has actually increased as shift usage has increased. Perhaps the quality of the batted balls has decreased, though, with doubles and triples traded for singles. According to league-wide wOBA, however, the overall quality of offense has increased.

woba_league

Clearly, shifts aren’t having the effect one would expect them to have. Rather than explore what effect they do have (as if they had no effect, why would teams continue to shift?), I want to see if perhaps the defenders being used are worse. Perhaps shifts have allowed teams to mask poor defenders with better positioning.

After browsing the data, I thought it best to compare year-to-year changes in range runs saved above average (RngR) to changes in shift deployment, in an attempt to analyze the effect of a large change in shift use on range. Conveniently, this variable excludes shift plays: any shift-influenced batted balls are left out. That exclusion is what makes RngR perfect for the analysis, since we can isolate plays that are standard and similar fielder-to-fielder while controlling for the frequency of shifts.

To do this, I first prorated range runs above average to a 150-defensive-game rate (RngR.150), as each team had slightly different innings totals. I then took the year-over-year difference in RngR.150, called RangeDiff, to analyze changes in range runs above average. Similarly, I took the year-over-year percentage changes in shift deployments. Due to the drastic increase in shift usage across the majors, comparing those absolute numbers would be meaningless here, so I scaled the percentage changes to each season’s average change in shift usage. This variable, ShiftScaleYOY, represents a team’s shift-usage change as standard deviations above or below the season’s average change. All of this data is from FanGraphs: 2011-2017 team defense statistics and shift deployments.
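A minimal sketch of those transformations (the file names and column layouts are my assumptions, standing in for FanGraphs’ team defense and shift tables):

```python
import pandas as pd

# Assumed layouts:
# defense: one row per team-season-position: ['team', 'season', 'pos', 'rngr', 'games']
# shifts:  one row per team-season: ['team', 'season', 'shifts']
defense = pd.read_csv("team_defense_2011_2017.csv")
shifts = pd.read_csv("team_shifts_2011_2017.csv")

# Prorate range runs above average to a 150-defensive-game rate.
defense["RngR.150"] = defense["rngr"] * 150 / defense["games"]
defense = defense.sort_values("season")
defense["RangeDiff"] = defense.groupby(["team", "pos"])["RngR.150"].diff()

# Year-over-year percent change in team shift usage, expressed as
# standard deviations above/below each season's average change.
shifts = shifts.sort_values("season")
shifts["pct_chg"] = shifts.groupby("team")["shifts"].pct_change()
by_season = shifts.groupby("season")["pct_chg"]
shifts["ShiftScaleYOY"] = ((shifts["pct_chg"] - by_season.transform("mean"))
                           / by_season.transform("std"))

merged = defense.merge(shifts[["team", "season", "ShiftScaleYOY"]],
                       on=["team", "season"])
```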

My hypothesis is that teams that have a drastic increase in shift usage between seasons, compared to league-average, would have worse defenders, as measured by range. The results:

positions.jpg

First, notice the axes: third basemen have a larger variance. Teams with larger year-to-year increases in shift usage, relative to the rest of the league over the same period, appear to have defenders at third with range values closer to zero, though this is difficult to see through inspection alone. There doesn’t appear to be much of a relationship for second basemen or shortstops.

When I regressed between-year range change on the standardized between-year shift change, with dummies for position and season, the shift-change variable was insignificant. In fact, there were no significant variables, and the R-squared was a mere 0.13%. Notice the symmetry in the graphs above, though: a team’s range values seem to converge as its standardized shift change increases.

To explore this, I ran two regressions, on the subsets where the dependent variable was positive and where it was negative. The positive regression had an R-squared of 9.2%, implying it poorly describes the variance in positive year-over-year range changes. The 2017, 3B, and SS variables were all statistically significant at the 99% confidence level, implying a decrease of 2.15 range runs per 150 defensive games in 2017 versus the other seasons, a 1.5-run increase for third basemen over second basemen, and a 1.4-run increase for shortstops over second basemen. The negative regression had an R-squared of 8.6%, again implying the model poorly describes the variance in the data. Here, 2017 and 3B were both statistically significant, at the 99.9% or greater confidence level. The values were larger, but the implications similar: 2017 implies a 2.7-run increase, and third basemen a 2.4-run decrease relative to second basemen. These analyses suggest that 2017 produced fewer outlier defenders, and that third basemen were higher-variance than second basemen.

There are a few issues with this analysis, and improvements that could be made. First, publicly available data is limited; directly comparing shifted plays and non-shifted plays would be best for this analysis, and what I did could be seen as cursory, at best an introduction. Secondly, the sample size of defensive-shift data is small. Defense data for individual, full-time players is generally used in three-year samples, and I was using single-year measurements (albeit at the team level, a slightly larger sample per position than for individual players). Lastly, a deeper analysis of shift impacts on player abilities would use individual players, comparing each player’s defensive prowess on shifted and non-shifted plays. This would allow us to measure the impact of shifts on defensive performance, and to better understand whether teams would employ different-skilled players as they increase shift usage, or whether their players perform differently when shifted.

There are suggestions in the data that certain years or positions differ with respect to defensive range, but nothing suggested that relative increases in shift usage impact the range or quality of the defenders on the field. All in all, I think this study can be summarized by the wisdom attributed to Albert Einstein: “the more I know, the more I realize how much I don’t know.”

 

– tb


Will We See a Record Number of Three True Outcomes Specialists in 2018?

Last season was the year of the three true outcomes specialist.  Aaron Judge’s dominant three true outcomes season was the most prominent example of this: he ranked second in home runs (52) and walks (127) and first in strikeouts (208).  In total, 57% of his plate appearances resulted in one of the three true outcomes.  He was the American League Rookie of the Year and in the running for the 2017 American League Most Valuable Player award, finishing second.  His performance helped the Yankees reach the American League Division Series.

We know that the three true outcomes rate has been increasing.  In part, this is due to the average player increasing his rate of home runs, strikeouts and walks.  But there is also the unusual player in the mold of Judge who takes an extreme approach at the plate resulting in dominant three true outcomes seasons.  The number of these hitters has been increasing over time.

Figure 1. Three True Outcomes Specialists per Season, 1960-2017


Figure 1 shows the number of dominant three true outcomes player seasons over time.  To get here I examined all players since 1913 with at least 170 plate appearances in a season.  I considered a dominant season one with a three true outcomes rate of at least 49%.  There have been 132 player seasons with a three true outcomes rate of at least 49%.  All of them have taken place after 1960.
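The selection criteria are simple enough to sketch (assumed column names on a standard player-season batting table):

```python
import pandas as pd

# Assumed layout: one row per player-season, with columns
# ['player', 'season', 'pa', 'hr', 'bb', 'so'].
bat = pd.read_csv("batting_1913_2017.csv")

bat["tto_rate"] = (bat["hr"] + bat["bb"] + bat["so"]) / bat["pa"]
specialists = bat[(bat["pa"] >= 170) & (bat["tto_rate"] >= 0.49)]

# The counts behind Figure 1, e.g., 13 in 2016 and 16 in 2017.
print(specialists.groupby("season").size().tail())
```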

The graph shows that the number of dominant seasons has been increasing over time. Since Dave Nicholson first did it in 1962, most years have had at least one player cross the threshold, and since 1994, every season has had at least one. From 2001 to 2010 there were four seasons with five three true outcomes hitters. There were six in 2012 and eight in 2014. The trend is currently peaking, with 13 in 2016 and 16 in 2017. This is similar to, though a bit more extreme than, the average increase in three true outcomes rates over time. It seems that more players pursue (and more teams tolerate) an approach to hitting that includes extreme rates of the three outcomes.

It is worth pointing out that those 16 players in 2017 account for about 4% of all players with at least 170 plate appearances. Three true outcomes specialists are more common than ever, but still rare. Who are those players? Table 1 lists them, including the home run, walk, and strikeout rates, and the combined three true outcomes rate for the year.

Table 1. Three True Outcomes Specialists, 2017

Player HR/PA BB/PA SO/PA TTO
Joey Gallo 8% 14% 37% 59%
Aaron Judge 8% 19% 31% 57%
Ryan Schimpf 7% 14% 36% 56%
Chris Davis 5% 12% 37% 54%
Miguel Sano 6% 11% 36% 53%
Alex Avila 4% 16% 32% 52%
Mike Zunino 6% 9% 37% 51%
Drew Robinson 5% 12% 35% 51%
Jabari Blash 3% 14% 34% 51%
Keon Broxton 4% 9% 38% 51%
Chris Carter 4% 10% 37% 50%
Mike Napoli 6% 10% 34% 50%
Kyle Schwarber 6% 12% 31% 49%
Matt Olson 11% 10% 28% 49%
Cameron Rupp 4% 10% 34% 49%
Eric Thames 6% 14% 30% 49%
Jake Marisnick 6% 8% 35% 49%
2017 Averages 3% 9% 21% 33%

The list includes many of the unique player stories of the year.  Aaron Judge’s rookie year was historic.  Joey Gallo made waves, particularly for his extreme three true outcomes rates.  Miguel Sano was an All-Star who helped lead the Twins to a bounce back year and a wildcard spot.  Eric Thames was a surprise story of the year, returning from a year in Japan and sparking the Brewers to an early lead in the National League Central.

Notable about this list is the young cohort of hitters who have consistently taken the all-or-nothing approach of the three true outcomes specialist. Judge, Olson, and Blash were all rookies in 2017. Gallo still qualified as a rookie despite debuting in 2015. Keon Broxton, Ryan Schimpf, and Kyle Schwarber are in their second year. Sano has been a specialist for three years running. Sure, there are old hands like Napoli, Carter, and Davis who take the all-or-nothing approach, but the record number of specialists over the last couple of years has been due to this young cohort. A new record will come down to the 2018 rookies who bring this all-or-nothing approach into their major league debuts, and the number of teams willing to tolerate the strikeouts that come with it.


The Trickiest Third Strike Pitcher in MLB

I ran some queries over at Baseball Savant and came across this tidbit of information: since 2015, no pitcher has frozen hitters on strike three more often than the Cleveland Indians’ Corey Kluber.

cKluber

I decided to write an article on Kluber’s caught-looking data, along with how he’s able to be the best at freezing hitters on that third strike.

Sifting through the last three years of Statcast data, and filtering the results down to a 5,000-pitch minimum, Kluber ranks second overall to Clayton Kershaw (2.38%) in called third strikes as a share of total pitches (2.28%).

So, why am I not writing about Kershaw? Well, I’m not concerned with the ratio because, in this case, the ratio is independent of the raw number of times Kluber is able to deal that third strike. Kershaw might be better at working over hitters (thereby throwing fewer pitches), but that doesn’t necessarily lend itself to more swing-less third strikes.

Kluber has thrown with two strikes nearly 1,500 more times than Kershaw has in the last three years. But Kershaw has pitched much less (mainly due to injuries), so we’re not going to ‘punish’ Kluber for this. And, again, we’re talking about a difference in the ratio of a tenth of a percent.

Moving on, I wondered whether there is any advantage to pitching in the American League. First, I looked at the overall plate discipline numbers for the entirety of Major League Baseball from 2015-2017.

mlbPlateDiscipline

So we have a 3-1 ratio of swings, as well as contact, in versus out of the zone. Now I’ll compare the AL and NL three-year averages.

alnlPlateDiscipline

We’re talking about fractions of a percent difference, with the only real disparity (if you can call it that) being out-of-zone contact, where the two leagues differ by nearly 1%. So, there is no advantage to pitching in either league in terms of the type of at-bat you’ll experience.

Using a minimum of 1,000 pitches each year, I found that Kluber finished first in 2015, third in 2016, and second in 2017 in strikeouts looking. Furthermore, in the context of plate appearances with two strikes over those three years, Kluber was ahead in the count (0-2 or 1-2) 24% of the time, even (2-2) 45% of the time, and behind (a full count) 31% of the time. So in nearly a quarter of his two-strike situations, hitters are forced to be aggressive at the plate; and just under a third of the time, the batter has to make a mandatory choice.

Before I proceed, I need to point out that there is some discrepancy as to what Kluber actually throws. He uses something of a sinking fastball that is hard to classify; my main source of research indicates it’s basically a sinker. As for his breaking pitch, some sites call it a slider, some call it a curve, and it may be a slurve. For argument’s sake, we will refer to the two as a sinker and a slider.

So what is it that Kluber is using that’s laying waste to hitters on strike three? His sinker, which he’s thrown for strike three 108 times (50%) since 2015.

kluberPitchTypes

The graph above shows his pitch selection with two strikes over the last three seasons.

Below is his sinker location, regardless of count. Good luck telling a hitter where to concentrate his swing when Kluber throws it.

chart (21)

chart (22)

However, something changed in 2017: he cut back on his bat-confining sinker by 7% and increased his change-up and slider/curve/slurve usage by 1.5% and 7.3%, respectively.

kluberSIvsCH

Just for curiosity’s sake: Kluber’s release points are nearly identical on all three pitches, so the hitter may not know what’s coming at him with two strikes (until it’s too late).

chart-(23)

OK, so he leaned more on his slider last year. What can we make of that using his last three years’ run values in the context of runs above average?

Screen Shot 2018-02-28 at 4.48.06 PM

The sinker, his bread and butter pitch for strikeouts, seems to hover around league average in terms of run value. Upping his change and slider usage appears to have paid dividends; Kluber seems to believe those are better suited to set the batter up for the strikeout. I would also venture to guess his sinker isn’t nearly as effective when thrown earlier in the count, hence the negative run value.

To note, Kluber’s two-strike stats: .136 BA/.392 OPS/10-1 K-BB

His sinker is clearly working when he needs it to. Overall, it’s his least-effective pitch, as hitters eat it up to the tune of a .300 average. Nevertheless, according to the data, it’s a tough pitch to gauge when used for that third strike.

Maybe Kluber will start using his slider more with two strikes. If he does, though, that could dethrone him as the ‘King of Caught Looking’; his slider is swung at more than any other pitch he throws, making a swinging strikeout the likelier outcome.

Regardless, Kluber should still be able to put batters away with that devastating sinking fastball; opponents face 2-to-1 odds that they’ll be dealing with it when the count has their backs against the wall. It usually doesn’t end well.


Predicting Arbitration Hearings; Was Mookie an Outlier?

Mookie Betts went to an arbitration hearing. Marcus Stroman went to an arbitration hearing. George Springer and Jonathan Schoop did not. Other than the obvious differences between these players, there are others— related to the arbitration process itself— that may have affected these outcomes. Particularly, the differences and qualities of their filings.

For those unfamiliar with the arbitration process: eligible players and teams who are unable to come to a settlement ahead of the given deadline submit salary filings which reflect each party’s evaluation of the player’s worth. Even after filing, teams and players are able to negotiate a one-year contract, but in some cases a panel of arbitrators will decide a salary: either the player’s bid or the team’s bid, but no number in between. This “final-offer arbitration” system is designed to create compromise and negotiation between the bargaining parties, as the threat of losing a large amount of money increases the incentive to settle early, while a midpoint is still available. By extension, teams and players are encouraged to moderate their bids, as an outlandish one is sure to be challenged and lost.

But, two different theories exist as to how the difference in bids itself affects the likelihood of hearing. Some argue that higher differences between teams and players in valuation would increase the likelihood of an arbitration hearing as the difference in bids reflects differences in valuation. However, others— namely Carell and Manchise in Negotiator Magazine (2013)— argue that differences in bids increase the risk of heading to a hearing and incentivize teams and players to hammer out a settlement.

Using two separate probability models and data on all players who filed for arbitration between 2011 and 2017, I examined the likelihood that a player goes to an arbitration hearing based on the difference in bids between the player and the team. Both models control for player performance, by incorporating the effect of WAR, and use a dummy variable for Super-Two status, controlling for the effect of players granted a “bonus year” of arbitration eligibility. The only difference between the two models is the variable of interest. The first uses the ratio of the absolute bid difference to the midpoint between the two salaries, measuring the effect of a growing gap between filings relative to the size of the filings themselves. The second separates those two effects, to test whether absolute gaps and absolute filing size each affect the chance of a hearing. The model specifications and regression results are shown below; the table essentially shows the marginal effect on the likelihood of going to a hearing from a one-unit change in the corresponding variable.

Model 1:

Model 2:

Results:
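The specifications and results above were originally posted as images. For readers who want the shape of the model, here is a rough statsmodels sketch of the kind of binary-outcome specification described (the variable names are mine, and logit is one plausible choice; the post doesn’t state whether a logit or probit was used):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumed layout: one row per arbitration filing, 2011-2017, with a
# 0/1 'hearing' outcome, bid gap and midpoint in $100K units, WAR,
# and a 0/1 Super-Two dummy.
arb = pd.read_csv("arb_filings_2011_2017.csv")

# Model 1: relative gap between the two filings.
m1 = smf.logit("hearing ~ diff_to_midpoint + war + super_two", data=arb).fit()

# Model 2: absolute gap and absolute filing size, separated.
m2 = smf.logit("hearing ~ bid_diff_100k + midpoint_100k + war + super_two",
               data=arb).fit()

# Marginal effects, as reported in the results table.
print(m2.get_margeff().summary())
```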

 

Both models produce highly significant coefficients indicating that players with large gaps in salary filings are less likely to enter hearings. In fact, in the aggregate sample of players, an increase of $100,000 in the bid difference reduces the likelihood of a hearing by 2.7%, and a 1% increase in the bid-difference-to-midpoint ratio decreases the likelihood of a hearing by 1.1%. This stands as an incredibly significant effect considering only 16.73% of players in the sample even made it to a hearing. Quite evidently, teams and players are incredibly risk-averse: they fear losing the arbitration hearing and being forced into a suboptimal salary, so higher bid differences drive up the incentive to settle.

Another interesting result shows that, in all samples, an increase in the filing midpoint of $100,000 increases hearing likelihood by 0.56%. All else equal, players with higher filing midpoints are more likely to head to a hearing. The intuition behind this is best explained alongside the negative coefficient on WAR, as WAR and Midpoint are highly related but carry opposite, significant signs. While WAR indicates that better players are less likely to head to a hearing, the positive coefficient on Midpoint says that “better” players are more likely to head to one.

Though these indicate opposite effects, considering the effect of a high midpoint with WAR held constant (and vice versa) gives the theory explanatory power. A more aggressive salary bid, given a fixed and exogenous level of production, is easier to dispute for a low-value player than for a high-value player. Thus, independent of the player’s production level, a higher Midpoint leads to a higher likelihood of entering an arbitration hearing. As such, the positive coefficient on Midpoint likely reflects bad players bargaining for extra money rather than good players, whose effects on hearing likelihood are captured by the WAR coefficient. Considering the WAR coefficient independent of the filing midpoint, teams are more likely to focus their negotiation efforts on their better players, thereby reducing the likelihood that high-WAR players end up in hearings.

The final variable of interest in these regressions is the dummy control for Super-Two status. As mentioned earlier, Super-Twos are young players with substantial playing time who are rewarded with an extra year of arbitration eligibility. The models predict that Super-Two status increases the likelihood of a hearing by 14.3%-16.9%, depending on the model. These young players, then, seem more likely to challenge their teams on salary evaluations. This comes as no surprise, since challenging a team in your first (and bonus) year of arbitration eligibility can lead to significant level effects in subsequent arbitration years. With an arbitration victory, a salary increase from the league minimum of $545,000 to even $1M can snowball into much larger raises in the following years. As such, these players may have a higher incentive to enter hearings and capture these multiplicative effects.

Now, revisiting the four cases above (Betts, Stroman, Springer, and Schoop), some interesting points pop out. Betts may not have been the most likely candidate to head to an arbitration hearing; the $3M difference between Betts and the Red Sox was incredibly high and reflected an enormous risk for either party entering a hearing. The predicted path for Betts was likely closer to George Springer’s contract extension or Jonathan Schoop’s one-year deal. By contrast, Stroman may represent the classic arbitration case: a low-risk hearing for either party, bargaining over a small fraction of their bids. And while Stroman expressed his frustration (or lack thereof) following the hearing, history shows that the Stromans of the world will likely end up there again. Ultimately, the final-offer arbitration system does its job: those who disagree significantly tend to work toward compromise, while those who disagree a little take a chance and roll the dice.


Making Baseball Slow Again

If you’re a baseball fan, you may have noticed you’ve been watching, on average, 10-15 minutes more baseball than you were 10 years ago. Or maybe you’re always switching between games like me and never stop to notice. If you’re not a fan, it’s probably why you don’t watch baseball in the first place: 3+ hour games, with only 18 minutes of real action. You’re probably more of a football guy/gal, right? Believe it or not, NFL games are even longer, and, according to a WSJ study, deliver even less action.

The way the MLB is going, however, it may not be long before it dethrones the NFL as the slowest “Big Four” sport in America (and takes away one of my rebuttals to “baseball is boring”). Currently, the MLB is proposing pitch clocks and has suggested limiting privileges such as mound visits.

Before I get into the specific proposal and the consequences of these changes, let me give you some long-winded insight into pace of play in MLB.

A WSJ study back in 2013 broke down the game into about 4 different time elements:

  1. Action ~ 18 minutes (11%)
  2. Between batters ~ 34 minutes  (20%)
  3. Between innings ~ 43 minutes (25%)
  4. Between pitches ~ 74 minutes  (44%)

The time between pitches, or “pace,” is what everyone is focused on, and rightly so. It makes up almost twice as much time as any other element and is almost solely responsible for the 11-12 minute increase in game length since 2008. Don’t jump to the conclusion that this is all the fault of the batter dilly-dallying or the pitcher taking his sweet time. This time also includes mound conferences, waiting for foul balls or balls in the dirt to be replaced, shaking off signs and stepping off, etc. Even if we take all of those factors out, there are still two other integral elements that increase the total time between pitches: the total batters faced and the number of pitches per plate appearance (PA). If either of these increases, the total time between pitches increases by default. In the graph below, I separated the effects of each factor by holding the others constant at 2008 levels to see how each would contribute to the total time added.

Any modest game time reduction due to declining total batters faced was made up by a surge in pitches per PA. Increasing pace between pitches makes up the rest.

As we have heard over and over again in the baseball world, the average game time has increased and is evident in the graph above. It’s not just that the number of long outlier games has increased; the median game time has actually crept up by about the same amount.

Plenty of players are at fault for the recent rise in game time. You can check out Travis Sawchik’s post about “Daniel Nava and the Human Rain Delays” or just check out the raw player data at FanGraphs. Rather than list the top violators here, I thought it would be amusing to make a useless mixed model statistic about pace of play.

A mixed model based statistic, like the one I created in this post, helps control for opposing batter/pitcher pace and for common situations that result in more time between pitches. Essentially, for the time between each pitch, we allocate some of the “blame” to the pitcher, batter, and the situation or “context”.

I derive pace from PITCHf/x data, which contains details about each play and pitch of the regular season. I define pace as the time between any two consecutive pitches to the same batter, excluding intervals that include pickoff throws, stolen bases, and other actions documented in PITCHf/x. (This is very similar to FanGraphs’ definition, but they calculate pace by averaging over all pitches in the PA, while I calculate it by pitch.) For more specifics, as always, the code is on GitHub.
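As noted, the real code is on GitHub; as a simplified sketch of the per-pitch pace definition (the column names here are my assumptions about a PITCHf/x pitch table, not the actual schema):

```python
import pandas as pd

# Assumed layout: one row per pitch with a timestamp, ordered within
# each plate appearance, plus a flag for documented interruptions
# (pickoff throws, stolen bases, etc.).
pitches = pd.read_csv("pitchfx_2017.csv", parse_dates=["timestamp"])
pitches = pitches.sort_values(["game_id", "pa_id", "pitch_num"])

# Seconds since the previous pitch *to the same batter*; the first
# pitch of each PA gets NaN and drops out.
pitches["pace"] = (pitches.groupby(["game_id", "pa_id"])["timestamp"]
                   .diff().dt.total_seconds())

valid = pitches["pace"].notna() & ~pitches["interrupted"]
pace_by_pitch = pitches.loc[valid, ["pitcher", "batter", "pace"]]
```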

It’s a nice idea and all, but does context really matter?

The most obvious example comes from looking at the previous pitch. Foul balls or balls in the dirt trigger the whole routine involved in getting a new ball, which adds even more time. The graph below clearly shows that time lags when pitches aren’t caught by the catcher.

The biggest discrepancy comes with men on base. Even though pickoff attempts and stolen bases are removed from the pace calculation, it still doesn’t account for the games pitchers play with runners on base, like changing up their timing after coming set or stepping off the rubber to reset.

The remainder of the context I’ve included illustrates how pace slows with pressure and fatigue as players take that extra moment to compose themselves.

As the game approaches the last inning and the score gets closer, time between pitches rises (with the exception of a score differential of 0, since this often occurs in the early innings).

And similarly, as we get closer to the end of a PA from the pitcher’s point of view, pace slows.

Context plays a large part in pace, meaning that some players who find themselves in notably slow situations are not completely at fault. I created the mixed-model statistic pace in context, or cPace, which accounts for all of the factors above. cPace can essentially be interpreted as the pace added above the average batter/pitcher, but it can’t be compared across positions.

When comparing the year-to-year correlations of Pace and cPace, cPace seems like a better representation of batters’ true tendencies. My guess is that pitchers’ pace varies more than hitters’, so many batters’ cPace values benefited from controlling for the pitcher and other context.

After creating cPace, I came up with a fun measure of overall pace: Expected Hours Added Per Season Above Average or xHSAA for short. It’s essentially what it sounds like: how many hours would this player add above average given 600 PA (or Batters Faced) in a season and league average pitches per PA (or BF).

The infamous tortoise, Marwin Gonzalez, leads all batters with over 3 extra hours per season more than the average batter.
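For reference, the xHSAA arithmetic looks something like this (a sketch under my own assumptions: a round 3.9 pitches per PA, and one pace interval per pitch after the first of each PA):

```python
def xhsaa(cpace_above_avg, pa=600, pitches_per_pa=3.9):
    """Expected Hours Added Per Season Above Average: seconds of pace
    above average, applied over a 600 PA season at a league-average
    number of pitches per PA, converted to hours."""
    intervals = pa * (pitches_per_pa - 1)  # pace is undefined on a PA's first pitch
    return cpace_above_avg * intervals / 3600

# A batter 5 seconds slower than average per pitch adds ~2.4 hours.
print(f"{xhsaa(5.0):.1f} hours")
```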

That was fun. Now back to reality and MLB’s new rule changes. Here is the latest proposal via Ken Rosenthal:

The MLB tried to implement pace of play rules in 2015, one of which required batters to keep one foot inside the box with some exceptions. The rules seemed to be enforced less and less, but an 18- or 20-second pitch clock is not subjective and will potentially have drastic consequences for a league that averages 24 seconds in-between pitches. Some sources say the clock actually starts when the pitcher gets the ball. Since my pace measure includes the time between the last pitch and the pitcher receiving the ball, the real pace relative to clock rules may be 3-5 seconds faster.

Let’s assume five seconds to be safe: if a pitcher takes 20 seconds between two pitches, we’ll count it as 15 seconds. To estimate the percentage of pitches that would be affected by these new rules, I took out any pitches not caught by the catcher, assuming all the pitches left were returned to the pitcher within the allotted five seconds.

Under the 18-second clock, about 14% of 2017’s pitches with no runners on would have been violations. That doesn’t even include potential limits on batters’ time outside the box or time limits between batters, so we can safely call it a lower bound. If both clocks are implemented in 2020, at least 23% of all pitches would be in violation of the pitch clock (excluding the first pitch of each PA). Assume it only takes three seconds to return the ball to the pitcher instead of five, and that number jumps to 36%!
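Building on the pace table from the earlier sketch, the violation estimate is only a few lines (the five-second ball-return allowance and 18-second limit come straight from the text; the bases-empty flag is an assumed column):

```python
# Share of bases-empty pitches violating an 18-second clock, assuming
# the clock starts ~5 seconds after the previous pitch (ball return).
RETURN_TIME = 5   # assumed seconds for the ball to get back to the pitcher
CLOCK = 18        # proposed limit with the bases empty

empty = pitches.loc[valid & pitches["bases_empty"], "pace"]
violation_rate = (empty - RETURN_TIME > CLOCK).mean()
print(f"{violation_rate:.1%} of bases-empty pitches would violate the clock")
```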

And now we are on the precipice of the 2018 season, which could produce the longest average game time in MLB history for the second year in a row as drastic changes loom ahead. I don’t know who decided that 3:05 was too long or that 15 minutes was a good amount of time to give back to the fans. Most likely just enough time for fans to catch the end of a Shark Tank marathon.

Anyways, if game times keep going up, something will eventually have to be done. However, even I, a relatively fast-paced pitcher in college, worry that pitch clocks will add yet another element to the countless factors pitchers already think about on the mound.

There are certainly some other innovative ideas out there: Ken Rosenthal suggests the possibility of using headsets for communication between pitchers and catchers, and Victor Mather of the NYT suggests an air horn to bring in new pitchers instead of the manager. Heck, maybe it’ll come down to limiting the number of batting-glove adjustments per game. Whatever the league implements will certainly be a jolt to players’ habits and to hardcore fans’ intractable traditionalist attitudes. The strategy, technology, and physicality of today’s baseball are changing more rapidly than ever. When the rules catch up, I have a feeling we will still like baseball.

 


Analysis and Projection for Eric Hosmer

Eric Hosmer is one of those guys you either love or hate. His career, which includes one World Series championship and two American League pennants, has been just as polarizing.

First, who Hosmer is. Consider his WAR each season since 2011:
2011: 1.0
2012: -1.7
2013: 3.2
2014: 0.0
2015: 3.5
2016: -0.1
2017: 4.1

Interesting pattern; let’s look into that. The chart below is Hosmer’s career plate discipline (bolded data are positive WAR seasons).

Nothing appears to be out of sorts, no obvious clues to suggest a divergent plate approach.

Moving on, I noticed his BB/K rate did track with his productive seasons, but that alone can’t possibly explain his offensive oscillations. While his strikeout and walk rates did vary, the differences were a matter of two or three percentage points, at best.

So, I decided to look at his batted-ball contact trends and found that his line drive rate directly correlated with his higher-WAR seasons: 22%, 24%, and 22% in 2013, 2015, and 2017, respectively, versus 19%, 17%, and 17% in 2012, 2014, and 2016.

OK, so his launch angle must be skewed. But, like his plate discipline, no outliers show up; if launch angle were the culprit, his 2017 season would be easy to pick out. The animation below is a glance at Hosmer’s three-year launch angle charts, in chronological order.

 

How about his defense? Well, something seems off about that, too.

He’s won a Gold Glove at first base in four of the last five years. He looks great on the field but, unfortunately, his defense reflects the same way as a skinny mirror: his UZR/150 sits at -4.1 and his defensive runs saved at -21. Since 2013 (the first year he won the award), he ranks 13th in DRS and 12th in UZR/150 among all qualifying first basemen. So, middle of the pack, basically, but worth four Gold Gloves? Probably not.

As we could have surmised, he’s simply an inconsistent player. Falling to one side of the fence yet?

One thing is a certainty: his best season was, oddly enough, his walk year with the Kansas City Royals in 2017. Now, I’m not about to speculate that Hosmer played up his last year with the Royals to get a payday (which he most certainly got). Looking back at his WAR at the top of this article, you can see his seasonal fluctuations suggest he was due for a good year.

In keeping with the wavering support for Hosmer is the contract he signed to play first base for the San Diego Padres. His eight-year deal (with an opt-out in year five) will net him $21 million each season, drawing 25.8% of team payroll. When his option year arrives in 2022, he’s due for a pay cut to $13 million for the final three years.

It’s a soundly constructed contract, as, according to Spotrac’s evaluation, his market value is set at $20.6 million a year. To note, the best first baseman in baseball, Joey Votto, signed a ten-year, $225 million deal (with a full no-trade clause) back in 2012. Starting in 2018, Votto is slated to make just $4 million more than Hosmer will in the early portion of his deal. Did San Diego overspend? It all depends on what their future plans are for him.

In any case, Hosmer joins a team that, following his arrival, is currently 24th in team payroll. In 2019, they will hop to 23rd. That ranking could fall further upon the arrival of their handful of prospects, who look to be the core of the team.

So who will the Padres have going forward? Using wOBA, probably the most encompassing offensive statistic, I decided to forecast what the coming years will look like for Hosmer. It goes without saying that defense is nearly impossible to project. So, for argument’s sake, we’ll continue to assume Hosmer will be an average defender at first.

Since Hosmer’s rookie year in 2011, the league average wOBA is approximately .315. Hosmer should stay above that through the majority of the contract. But, let’s be more accurate. Using both progressive linear and polynomial trend line data (based on both Hosmer’s past performance and league average wOBA by age), I was able to formulate a projection for Hosmer through age 35 (no, I’m not going to lay out any of my gory math details).

OK, I lied. Here is the equation I used to come to my prediction:

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_m x_i^m + \varepsilon_i, \qquad i = 1, 2, \dots, n$$
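Fitting and applying that polynomial takes only a few lines of numpy (a sketch: the ages, wOBA values, and degree below are illustrative placeholders, not the actual blend of Hosmer’s history and league aging curves described above):

```python
import numpy as np

# Illustrative inputs only: age and wOBA pairs standing in for
# Hosmer's history blended with league-average wOBA by age.
ages = np.array([21, 22, 23, 24, 25, 26, 27])
woba = np.array([0.330, 0.280, 0.340, 0.300, 0.345, 0.310, 0.385])

# Least-squares fit of y = b0 + b1*x + ... + bm*x^m, then project.
coeffs = np.polyfit(ages, woba, deg=2)
projection = np.polyval(coeffs, np.arange(28, 36))  # ages 28-35
print(np.round(projection, 3))
```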

Age 28 onward is what we want to look at. Hosmer is expected to take a dive offensively in 2019, with a bounce-back year in 2020, in keeping with his past trends. A year before his opt-out clause (where he’s slated to make $13 million), his wOBA is expected to regress at a stable rate. He’ll continue to be league average or better during the twilight years of his career.

Prognosis

Hosmer seems to be appropriately compensated. You could argue that he’s making too much, but the Padres had the money to give him, and they’re banking on Hosmer being highly productive at Petco. Chances are, though (going by his history), he won’t maintain or exceed his 4.1 WAR in 2018. He’ll be labeled a bust, but he ought to have a few good years in him during the $21 million salary period. And, as my forecast chart shows, his 2022 pay cut comes at just the right time.

*This post and others like it can be found over at The Junkball Daily