Archive for Outside the Box

Does Speed Kill?

Speed kills. At least, that’s what people say.

Speed is certainly a good tool to have. All else equal, any manager would pick the faster guy. Of course, speed is a huge asset in the field, especially for outfielders. Good speed increases range, providing a sort of buffer zone for players who don’t get a good jump on the ball or who don’t read the ball well off the bat. No one in their right mind, when given the choice, would pick the player with less range (again, all else equal). And so we can all agree that speed very clearly increases a player’s value in the field.

Whether or not speed increases a player’s value at the plate is a different story. The faster guy may leg out an infield hit every now and then or stretch a single into a double or a double into a triple, but this won’t significantly increase a player’s value outside of a small uptick in average.

Luckily, Baseball Savant’s sprint-speed leaderboard gives us some interesting data to examine (you can find the interactive tool here).

wSkcbNu.0.png

Here, we can see that the league average sprint speed is 27 ft/s. Catchers, first basemen, and designated hitters are typically below league average. And it comes as no surprise that outfielders, especially center fielders, are typically above league average.

If we look at the fastest player at each position for 2017, we can come to a better understanding of the value of speed.

scWVCyU.0.png

Notably, of the nine players on this list, only four of them have a wRC+ above 100 — league average. Is this significant? Probably not as a stand-alone statistic. But it is safe to say that speed does not directly correlate to value. And it certainly doesn’t correlate to value at the plate. Even when examining the WAR column, you won’t be blown away. Dickerson and Bryant are having great years, but for the most part these players represent a pretty average group.

As mentioned previously, only four of these players are above average in terms of creating runs (highlighted in red and orange). The players with wRC+ values in red have not had success because of their speed. They all have ISOs that are at least 50 points above league average. Basically, their success can be attributed to power, not speed.

However, JT Realmuto’s ISO is essentially league average. Did speed boost his value that much? (NOTE: speed is not taken into account when calculating wRC+; still, the value of each outcome, which is considered in the calculation, can be affected by speed) Realmuto’s speed puts additional pressure on opposing defenses, especially relative to other catchers, but I would be very hesitant to say that speed alone created a difference of 9 wRC+ between him and the average player.

Billy Hamilton is the fastest player in the league. And while most would call him a plus defender, very few would call him a good all-around player. His wRC+ value of 57 is seventh-worst out of all qualified players (highlighted in blue). Although he leads the league in stolen bases, even that wasn’t enough to raise his WAR above a dismal 0.5. We can safely say that speed does not correlate to success.

What about specific teams? Do teams compiled of speedsters at every position win more games?

w9geMYB.0.png

Here is the same image as above with only Marlins players highlighted. Miami has a player with above-average speed at every single position, save for Justin Bour at 1B who has been a top-20 player in the MLB based on offensive production this year. Without question, the Marlins have a lot of speed, but still, they are six games under .500 and 10.5 games out of the wild-card race in the National League.

HXJEqCo.0.png

Here is the same image with San Diego players. The Padres are a speedy team. They have not one, but two players above league average at three different positions. Even their catcher, Austin Hedges, is only slightly below league average while still significantly faster than the average catcher. Despite having one of the fastest teams in the MLB, the Padres are 14 games below .500 and 19 games out of first place in the NL West.

Speed isn’t a stand-alone tool. It is a great complement to someone who makes contact at high rates (see: Ichiro) and it can put pressure on a defense, forcing fielders to rush to make a play. Furthermore, it is a crucial tool in the field, increasing range for all players, most significantly for outfielders. However, speed in and of itself is by no means an indicator of overall value. In baseball, speed doesn’t kill.


Detroit’s Batted-Ball Readings Are Hot

Editor’s Note: Analysis in this article was conducted using Baseball Info Solutions Hard Hit batted ball data.

To be clear, this did not begin as an example of investigative journalism. While I do occasionally enjoy media pieces such as Spotlight and S-Town, my curiosity in this topic all began with the incredible amount of attention given to a seemingly mediocre player named Nick Castellanos. To give some examples, below are three popular FanGraphs/RotoGraphs articles written about Castellanos:

In theory, the hype surrounding Nick Castellanos makes sense. High hard-hit rate, few ground balls, sustainable HR/FB%, and a decent home ballpark. If only he could get those strikeouts down and avoid bad luck, he could turn into Kris Bryant or Nolan Arenado. The analytics community, which has been waiting for the Castellanos breakout for five years, is more divided than ever on the Tigers third baseman. Some continue to beat the drum while others are abandoning ship, arguing that the breakthrough will never happen.

This season, Castellanos is not the only Detroit Tigers player who has received love from the analytics community:

The claims brought up by all of these writers have one thing in common: high or increased hard-hit rate. As presented in Matthew Ludwig’s article The Value of Hitting the Ball Hard, hard-hit rate and wRC+ have a positive correlation. In general, a player who hits the ball harder would be expected to have more favorable results when they make contact.

This brings us to the question, is it possible for so many Detroit Tigers players to be underperforming their batted-ball profiles? In order to gauge exactly how much harder the Tigers are hitting the ball than their opponents this year, I took a look at the hard-hit rate for the Tigers as a team. The point that is colored “Tiger orange” represents the Detroit Tigers.

Screen Shot 2017-06-17 at 2.59.39 PM

It isn’t even close; the 2017 Detroit Tigers are currently the best team at making hard contact and the worst team at preventing hard contact. Thinking qualitatively, are the Tigers hitters really that much better at making hard contact than the hitters on the Astros, Nationals, or Diamondbacks? Are the pitchers really that much worse at preventing hard contact than the pitching on the Padres, Orioles, or Reds? If so, the results are not proving it. The Tigers currently rank ninth in runs scored and 20th in runs against. Park factors and other variables do apply, so it may be possible that the hitters are getting unluckier and the pitchers are getting luckier than the batted-ball data shows. Assuming that players’ abilities are transferable across stadiums, we should see only small differences in hard-hit rate for Tigers hitters and pitchers when looking at home/away splits.

Screen Shot 2017-06-18 at 9.18.25 PM

Quadrant I (x,y) represents the teams that have a higher hard-hit rate for both hitters and pitchers on the road than at home. Quadrant III (-x,-y) represents the teams that have a higher hard-hit rate for both hitters and pitchers at home than on the road. The Detroit Tigers (orange point) rank as the team with the largest negative difference for both hitters and pitchers. One thing to note about the data is that 22 out of the 30 points lie within either quadrant I or quadrant III. This could give some validity to the assumption that hard-hit rate is not consistently measured from park to park. There could be a variety of reasons for this (humidity, air density, etc.). For more on this, I would point to Andrew Perpetua’s article Home And Road Exit Velocity. If there were truly something unique about Detroit causing these balls to be measured harder, this trend would be seen over a wider time period. Let’s look at where the Tigers ranked for the years 2012-2016.

Screen Shot 2017-06-19 at 10.14.50 PM.png

See that orange circle almost directly in the middle of the chart? That is the Detroit Tigers. The only point that has a closer distance to the direct center is the Atlanta Braves, who now play in an entirely different city and stadium.

So what about all other stadiums? If hard-hit rate is being artificially increased at Comerica Park, it is likely that there are slight adjustments at all ballparks. Based on 2017 data, the difference for each stadium (hitters or pitchers) is listed below:

Screen Shot 2017-06-19 at 9.05.05 PM

Looking at an individual-player level (min. 50 AB home and away, min. 20 IP home and away), let’s see how many Tigers players appear on the top 20 away-home hard-hit-rate difference leaderboards for hitters and pitchers. Detroit Tigers players are highlighted in orange.

Screen Shot 2017-06-19 at 9.37.18 PM
Screen Shot 2017-06-19 at 10.31.05 PM.png

I can see four possible scenarios to explain why Detroit Tigers players may be experiencing this phenomenon:

  1. Tigers hitters and pitchers have actually experienced large splits between home/away hard-hit rate this year (with no other variables changing)
  2. Something about Comerica Park is causing increased error in the variables used for the quality of contact algorithm
  3. Changes are being made to the ball or environment at Comerica Park, making it act differently
  4. Small sample size bias is skewing the data

Unfortunately, this is about as far as I can take this piece. Something is going on in Detroit this year that is skewing the hard-hit-rate calculations. However, the whys and hows beyond the data are not clearly evident. Until they are, I will continue to monitor this unintended project of investigative journalism from the sidelines.


The Value of Hitting the Ball Hard

There is value in the fly ball. That statement isn’t something that will surprise any fan. Even someone who knows very little about baseball could piece together the logic behind it. The most valuable individual outcome is a home run. How do you hit a home run? Hit a fly ball. As Travis Sawchik found for 2016, fly balls produced a wRC+ of 139, while ground balls put up a mark of 27 wRC+.

Of course, the sabermetrically inclined will quickly point out that it’s not that simple. Judging the value of a hit based on whether it is a fly ball or a ground ball is a futile exercise. You have to consider batted ball distance, launch angle, and exit velocity. Much has been made about the recent “fly ball revolution” occurring throughout the league. And while some believe hitting more fly balls really does increase the value of a player, data suggests that the fly ball revolution is hurting as many batters as it’s helped.

It’s possible that there are benefits to hitting more fly balls, but that doesn’t seem to correlate to an increased value.

2LMPbub.0.png

There really is no correlation between fly ball % and wRC+. So, it seems that value is added not by hitting the ball higher, but by hitting the ball harder.

Ll87TiG.0.png

Now this is a pretty clear correlation. Hit the ball harder and a better outcome is more likely. A soft liner toward the second baseman will probably be an out. But, a laser to right-center field could be a triple.

This trend is not a new development or a new discovery. As far back as 2002, when batted-ball data became available, there has always been a positive correlation between Hard% and wRC+. In fact, the average correlation (R-squared value) between these two variables over the last 15 years is .475.
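For anyone who wants to check a figure like this on their own leaderboard export, here is a minimal Python sketch of the calculation. It assumes the R-squared in question is the squared Pearson correlation between two columns, which is what a simple one-variable linear fit reports. The function name and the toy numbers are placeholders of mine, not the data behind the charts in this article.

```python
from math import fsum


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Made-up placeholder values, NOT real player data.
toy_hard_pct = [25.0, 30.0, 35.0, 40.0]
toy_wrc_plus = [80.0, 95.0, 105.0, 130.0]
print(round(r_squared(toy_hard_pct, toy_wrc_plus), 3))
```

Running the same function on each season's Hard% and wRC+ columns and averaging the results would be one way to arrive at a multi-year figure like the one quoted above.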

Hard% also has predictive value. Take a look at the data for 2017 thus far.

yKhUSON.0.png

Although the correlation from past years isn’t there, it doesn’t need to be. We should no more expect the data to already have an R-squared value above .4 than we would expect an MVP to have a WAR higher than 6 at this point of the season. Because there are quite a few outliers that will come back to the mean, Hard%, based on its historical data, has considerable predictive value.

Ignoring the one point above the 200 wRC+ line (Mike Trout, whose entire career is an outlier), let’s examine a couple of outliers. First, the point on the far right toward the bottom: Nick Castellanos is hitting the ball harder than Aaron Judge, who just set a Statcast record for hardest home run ever hit, yet Castellanos has a wRC+ of only 82, well below average. Toward the top of the chart at the 175 wRC+ mark, we see that Zack Cozart is making hard contact only 32% of the time.

It is reasonable to expect, based on this chart, that Castellanos’s numbers will start to improve and Cozart’s will regress. As it turns out, Andrew Perpetua found the same outliers by looking at exit velocity and xOBA in a RotoGraphs article last week. These statistics all point toward the same thing — Castellanos has been very unlucky and Cozart has been just the opposite. The takeaway here is that Hard% can be used as a predictor for value even over a smaller sample size.

If Hard% is such a good indicator of success, what is the actual value of hitting the ball hard? Hitting the ball hard has been a hallmark of both HR leaders and batting champions. Over the last five years, the HR champion has had an average Hard% of 40.12% and the batting champion 35.16%. Although the almost five-point spread is a lot, a Hard% above 35% is nothing to laugh at; it’s still in the upper half of all players.

For the last full season (2016), increasing Hard% by even just 5% added 13 points to the wRC+ value. That is pretty significant. For context, 13 wRC+ is the difference between Aaron Judge and Yonder Alonso so far this year. But has it always been this way? Not exactly. In 2002, a 5% increase in Hard% increased a player’s wRC+ by 20 points. This points toward an interesting trend.

J3hjnZi.0.png

For the last 15 years, the correlation between Hard% and wRC+ has decreased. In other words, hitting the ball hard is not as valuable as it once was. My initial thought was that players aren’t hitting as many HRs as they did in 2002. But that is simply not true. 14.2% of flies result in HRs — the highest rate ever recorded. Perhaps this trend is a result of defenses shifting. Are batters hitting the ball harder than ever, but fielders are now better positioned? The shift is certainly a powerful tool — it kept Ryan Howard out of the Hall of Fame. Still, I’m not convinced the shift is solely responsible for this eerie trend.

Hitting a ball hard is much more important than hitting it high, that is, if you can’t have it both ways. However, the value of hitting the ball hard has decreased for more than a decade. Looking at the data, is it possible that in 10 years we’ll see a sort of “v” shape, indicating a return to the value of hitting the ball hard? Maybe. But for now, this is an interesting trend with no clear indicator.


Ballplayers and the Karmic Practice of Yoga

Injuries are something that pronounce their impact differently on every player in the game. Some guys have freakish bodies and recover faster naturally. Others push themselves to accelerate their return. But recovery from some injuries can’t be sped up. Maladies like inflammation are plainly matters of time.

JA Happ went on the 10-day DL on April 18 with inflammation in his elbow. He’s finally back on the mound in the Majors after being out for more than a month.

Kendall Graveman just hit the DL for soreness in his throwing shoulder and is “taking anti-inflammatory medication and resting,” per Susan Slusser of the San Francisco Chronicle. Manager Bob Melvin says he’s been through this before, that it’ll take longer this time, and that the team is going to “let this thing calm down” before trying to build up his endurance again. The passivity in his words is telling.

And if you have the heart to remember the end of Roy Halladay’s career, you’ll remember inflammation in his throwing shoulder cost him time on the DL amidst his body simply telling him, “please, no more.”

Inflammation is a general response from the body that results from cell agitation. It can occur from normal use — “normal” being a relative word. It intends to clear out damaged cells but the process causes pain, discomfort, and inherently imbalanced levels of certain proteins in our bodies. And the things a ballplayer does every day, the extreme motions they constantly put themselves through for more than half a calendar year, make them prime candidates to become victims of it.

Enter yoga.

There’s no causal relationship between yoga and reduced injuries. But as I researched its impact on ballplayers for a job, I couldn’t help but think of the benefits. And I did find that it has been connected to balancing the proteins that can get whacked out in players’ bodies through the course of a season.

Researchers have studied a particular form called Hatha yoga, which combines poses (asanas), breath control (pranayama) and meditation. They explored its ability to help aid in recovery from the regular wear-and-tear we put our bodies through. “Regular” is another relative term — think of the twist and torque your favorite hitter exhibits on each swing and how that could eventually cause a dreaded oblique strain.

The study’s trippiest finding centers on epinephrine levels in the brain, which are fueled by the adrenal gland and play a large role in regulating both physical and emotional stress. Comparing novices and experts, the researchers found that experts experienced higher levels of epinephrine on a regular basis. That surprised even the researchers.

Common sense might tell us that the more we have of something, the more we get used to it, and then the less impact it has on us. It’s why a person going skydiving for the first time can find it exhilarating while it’s just another day at the office for the instructor they’re attached to. It’s the same when a pitcher isn’t excited about his velocity inching up through the spring. He can expect it because of what he’s thrown in the past.

The study found the opposite with yoga, though. The body adapts to the poses, breathing patterns, and meditation in Hatha yoga, and the person gets better at it; but chemically, they don’t get used to it. It doesn’t become old hat. Instead, the practice becomes invigorating and those who practice it build up what becomes an expandable physiological embankment of wellness.

What’s more is that, based on the study’s parameters, a player could approach expert level at yoga over the course of a single season. A few hours a week could help keep their protein levels balanced through the summer and avoid the fickle complications of inflammation. And beyond even that, it offers a fresh, low impact way to optimize their body that could pay long-term dividends.


A Situational Lineup: Management Questions With No Clear Answers

It has come to my attention that in the 1880s and early 1890s an interesting management phenomenon presented itself around baseball. At this time, managers were not required to submit a lineup card before the start of the day’s game. Because of this, the first time through the batting order could be constructed as the manager saw fit, based upon situations in the game. Once the lineup had gone through its progression once, however, that order would hold for the rest of play. In light of this, an interesting set of strategic questions comes into play. How would managers set lineups if this rule existed today? How would this affect a team’s run totals for the season? Would lineup construction change its form or remain largely the same as it is now? This article does not analyze or provide solutions but, instead, poses questions that are interesting and engaging to any baseball connoisseur.

The implications and strategy behind this lineup maneuverability provide plenty of opportunities for discussion. I think the lead-off hitter, if this rule were applied to the game today, would remain mostly the same. Managers would continue to look for an on-base machine to start off the game in a positive fashion. Along with this, I believe that the seven through nine batters would remain mostly static. Managers would look to place their worst hitters and their pitcher in these spots in order to diminish their number of at-bats in impact situations. With these assumptions established, a world of possibilities opens up for the two through six hitters in the lineup. Each manager would approach this construction differently based upon the day’s match-up and the game’s progression. That said, here is a set of scenarios with interesting implications for the progression of a game and for run production in that game.

Let’s assume we’re the Angels and we have their current set of middling players that play alongside a healthy, and studly, Mike Trout. It’s the top of the first inning and the first two outs have already been made, no one’s on base, and we have to choose who will hit. Although there are no runners in scoring position, would you (as the manager) decide to hit Trout in this spot? Or, would you wait and hit Trout to lead off next inning and hope he starts off the inning strong? Or, would you wait to bat Trout sixth and hope that the first two batters in the next inning get on base and Trout can drive them in?

If you choose the last option, the consequence would be fewer at-bats in the game for Trout. Would it be worth it to wait on an impact situation to have Trout hit for the first time, even if this led to one fewer at-bat for the rest of the game? I think, personally, in this scenario I would hit Cameron Maybin in the three hole, following Yunel Escobar and Kole Calhoun. I think Maybin has enough pop to hit a home run every once in a while with the bases empty. I also think that if he got on base, I’d hit Trout directly following in the four hole. If Maybin got on with a single, he would go on the first pitch and try to swipe second. If he got thrown out, it would be fine and I’d have Trout leading off my next inning, followed by Albert Pujols and Luis Valbuena. If he swiped the bag, we would now have a runner in scoring position for our best hitter, which is exactly what we want.

I can see as I’m writing that my ideas are getting harder and harder to follow, but I think this is a direct result of the vast array of possibilities this type of management choice presents. It would be interesting to see major-league managers, much more knowledgeable than myself, go about making these decisions on a daily basis. What do you think would be the best lineup set in this situation? And what other situations would be interesting to discuss as baseball fans?


The 2016 Strike Zone and the Umpires Who Control It

Introduction

One of the most-discussed issues in Major League Baseball is the consistency of the strike zone. The rule-book strike zone states “The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.” After watching games throughout the regular season and playoffs, it is easy to realize this is not the strike zone that is called. Each umpire has tendencies and dictates his own strike zone and how he will call a game. With the rise of PITCHf/x and Trackman in the last few years, umpires have been increasingly monitored and judged for their accuracy and impartiality. For this reason, umpires are criticized for incorrect calls more than ever before and I believe are now trending towards enforcing the rule-book strike zone more than in years past.

The purpose of this research is to do two things. First, I will identify overarching themes: how umpires are adjusting to modern technology, and how the rule-book strike zone differs from the strike zone we know. After this, I will dive into a few umpire-specific tendencies. The latter would be helpful to teams in preparing their advance reports, since it shows how certain umpires call “their” strike zone depending on situations in a game.

Analysis

Using PITCHf/x data downloaded through Baseball Savant, I have looked at major-league umpires since 2012 with regard to their accuracy in correctly labeling pitches, primarily strikes, and their tendencies in specific situations. While the height of the strike zone is often influenced by the height of the batter, there are other factors to take into account, such as how the batter readies himself to swing at a pitch. Unfortunately, the information publicly available to conduct this research does not include the batter handedness, pitcher name, or measurements of individual strike-zone limits. For this reason, a static strike zone serves our needs best. The height of the strike zone is set from 1.5 feet to 3.6 feet above the ground. This is the strike zone the pitchRx package (used through RStudio) assigns a batter when individual batter height is not included.

All PITCHf/x data is from the catcher/umpire perspective, with negative horizontal location to the left and positive to the right. The width of home plate is 17 inches, 8.5 inches to each side, where the middle of the plate represents 0 inches. Taking the average diameter of a baseball as 2.91 inches, we add one ball diameter to each side of the plate. Our strike-zone width is therefore 17 + 5.82, or 22.82 inches, and the horizontal limits are -.951 to .951 feet (11.41 inches divided by 12). Throughout the paper I will be referring to pitches that fall within the boundaries of our zone as “Actual Strikes” and pitches correctly identified as strikes within this zone as “Correctly Called Strikes.”
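The zone arithmetic above can be written out as a short Python sketch. The constants come straight from the text; the function name, the argument names, and the choice to count a pitch exactly on the boundary as a strike are my own assumptions for illustration, not part of PITCHf/x or pitchRx.

```python
PLATE_WIDTH_IN = 17.0      # width of home plate, inches
BALL_DIAMETER_IN = 2.91    # average baseball diameter used in this paper, inches
ZONE_BOTTOM_FT = 1.5       # fixed lower limit of the zone, feet
ZONE_TOP_FT = 3.6          # fixed upper limit of the zone, feet

# One ball diameter is added to each side of the plate.
ZONE_WIDTH_IN = PLATE_WIDTH_IN + 2 * BALL_DIAMETER_IN   # about 22.82 inches
HALF_WIDTH_FT = ZONE_WIDTH_IN / 2 / 12                  # about 0.951 feet


def is_actual_strike(horizontal_ft, height_ft):
    """Return True if a pitch location falls inside the fixed zone.

    horizontal_ft: distance from the middle of the plate, in feet
                   (negative = left from the catcher/umpire view).
    height_ft:     height above the ground, in feet.
    Boundary pitches are treated as strikes (an assumption).
    """
    inside_width = abs(horizontal_ft) <= HALF_WIDTH_FT
    inside_height = ZONE_BOTTOM_FT <= height_ft <= ZONE_TOP_FT
    return inside_width and inside_height


print(round(ZONE_WIDTH_IN, 2), round(HALF_WIDTH_FT, 3))
print(is_actual_strike(0.9, 2.0), is_actual_strike(1.0, 2.0))
```

A "Correctly Called Strike" is then simply a pitch for which this check is true and the umpire's call was also a strike.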

Called Strike Accuracy By Year

As Table 1 shows, the rate at which umpires correctly identify strikes that fall within the rule-book strike zone has risen substantially. While 2015 has a higher percentage of correctly called strikes than 2016, the 2016 PITCHf/x data from Baseball Savant was incomplete, with 28 days’ worth of games unavailable at the time of this research. A rise of 5.90 percent in correctly called strikes from 2012 to 2015 shows the rule-book strike zone is being more strictly enforced.

table-one

While this provides some information, we can also look into where strikes are correctly being called using binned zones. Understanding that the evolution of umpires over the last five years is taking place and trending toward correctly identifying strikes more today than in years past, we can analyze where, in the strike zone, strikes have been correctly labeled.

Called Strike Accuracy by Pitch Location

In Table 2, we can see a tendency among umpires. Strikes are called strikes more routinely over the middle of the plate and to the left (from the umpire’s perspective). As I have mentioned before, the publicly available PITCHf/x data I used did not include batter handedness, so I am unable to determine who is receiving the benefit or disadvantage of these calls. Based on previous research on the subject, lefties are presumably having the away strike called more often than their right-handed counterparts, which would explain the gap in correctly identified strikes between zones 11 and 13 versus 12 and 14.

Binned Strike Zone
binned-strike-zone

table-two

While one may argue that there should not be strikes in these bordering zones, we consider any pitch that crosses any portion of the plate a strike. Due to our zone including the diameter of the baseball on both sides of the plate, the outer portion of the plate includes pitches where the majority of the ball is located in one of these zones.

Called Strike Accuracy by Individual Umpire

When gauging an umpire’s ability to correctly identify a rule-book strike, an 85.67% success rate sets the mark with Bill Miller, while Tim Tschida ranks at the bottom of this list, only calling 71.57% correctly. We can infer from Tables Three and Four along with Table One, that while umpires are calling strikes within the strike zone more often, they are still missing over 17% of these pitches. It is important to note that this information does not take into account incorrectly identifying pitches outside the rule-book strike zone as strikes, which when considering an umpire’s overall accuracy, should absolutely be taken into account.


table-three

table-four

Called Strike and Ball Accuracy by Count

One of the most influential factors in whether a taken pitch is called a strike or a ball is the count of the at-bat. We have all seen pitches in a 3-0 count substantially off of the plate called a strike, just as we have seen 0-2 pitches over the plate ruled balls. Table Five shows the percentage of strikes and balls called correctly by count. While this shows that umpires are overwhelmingly more accurate at identifying strikes as strikes in a 3-0 count (91.06%) than in an 0-2 count (56.66%), we must acknowledge this only paints part of the picture. Conversely, umpires are most likely to correctly label balls in 0-2 counts (98.73%) and least likely to do so in 3-0 counts (90.32%). I included their accuracy on both strikes and balls here, rather than throughout the entire paper, because this information clearly shows that umpires are giving hitters the benefit of the doubt over pitchers. Umpires are far more likely overall to correctly identify a ball than a strike, as evidenced by the fact that there are no counts in which umpires correctly call fewer than 90% of balls.
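A tally of this kind can be sketched in a few lines of Python. The record layout here (balls, strikes, whether the pitch was inside the fixed zone, whether it was called a strike) is a hypothetical one chosen for illustration, not the column layout of the Baseball Savant download, and the sample rows are invented.

```python
from collections import defaultdict


def accuracy_by_count(pitches):
    """Strike and ball accuracy for each count.

    pitches: iterable of (balls, strikes, in_zone, called_strike) tuples,
             one per taken pitch. This layout is an assumed one.
    Returns {(balls, strikes): (strike_accuracy, ball_accuracy)}, where an
    accuracy is None if no pitch of that kind was seen in that count.
    """
    # Per count: [strikes called correctly, actual strikes,
    #             balls called correctly, actual balls]
    tally = defaultdict(lambda: [0, 0, 0, 0])
    for balls, strikes, in_zone, called_strike in pitches:
        row = tally[(balls, strikes)]
        if in_zone:
            row[1] += 1
            if called_strike:
                row[0] += 1
        else:
            row[3] += 1
            if not called_strike:
                row[2] += 1
    return {
        count: (sc / st if st else None, bc / bt if bt else None)
        for count, (sc, st, bc, bt) in tally.items()
    }


# Invented sample rows, not real pitch data.
sample = [
    (3, 0, True, True), (3, 0, True, True),
    (3, 0, False, True), (3, 0, False, False),
    (0, 2, True, False), (0, 2, True, True),
    (0, 2, False, False),
]
for count, (strike_acc, ball_acc) in sorted(accuracy_by_count(sample).items()):
    print(count, strike_acc, ball_acc)
```

Keeping strike accuracy and ball accuracy as separate numbers per count is what makes the hitter-friendly pattern visible; a single combined accuracy would hide it.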

table-five

The data in Table Five is corroborated by the visualizations in Figure One and Figure Two. These visualizations of the strike zone include pitches off of the plate, and we can see that in a 3-0 count, a more substantial portion of the rule-book strike zone is called a strike, while some balls are also incorrectly called strikes. In an 0-2 count, by contrast, the smaller shaded area of the rule-book strike zone is consistent with our finding that fewer strikes are identified correctly but more balls are correctly called.

figure-one-and-two

Called Strike Accuracy by Pitch Type

The next area I looked at was whether pitch type significantly altered the accuracy of umpires. In order to do this, I grouped all variations of fastballs into “Fastball” and all other pitches into “Offspeed”, while omitting pitch outs and intentional balls. I was able to see how umpires fared in correctly identifying strikes by pitch type in Table Six.
table-six

Not surprisingly, we see Bill Miller near the top of the list with both Offspeed and Fastball accuracy. For umpires as a whole, the difference in accuracy between the two is not large (79.05% Offspeed accuracy vs. 78.91% Fastball strike accuracy). On the other hand, what may come as a surprise is the fact that eight of the top ten highest accuracies were for Offspeed pitches.

Called Strike Accuracy for Home and Away

One of the most-mentioned tendencies of referees or umpires in any sport is home-team favoritism. Whether a foul or no-foul call in basketball, in or out-of-bounds call in football, or a strike or ball ruling in baseball, many think that the home team receives more of an advantage than their visiting counterparts. Looking at top and bottom half of innings, away and home team respectively, we can identify trends and favoritism in major-league umpire strike zones.

While a difference of 0.62% in accuracy may seem like a lot, especially in a sample size of over 650,000 total pitches, we can look at this on a game-by-game level to see the actual discrepancies. For simplicity’s sake, we can assume 162 games a season, making for roughly 11,780 games played in our data set (this subtracts all games from the unavailable 2016 data). This leaves us with 23.03 Correctly Called Strikes out of 29.05 Actual Strikes per game for away teams, meaning that 6.02 strikes were not called. As for home teams, we have 22.04 Correctly Called Strikes a game out of 28.02 Actual Strikes, averaging 5.98 missed strikes a game. By this measurement we can see that more hitter leniency was given to the away team than the home team.

During this time frame, while a higher percentage of strikes were judged correctly, hitters were given more leniency as the away team than the home team on a game-by-game basis.

table-seven
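The per-game arithmetic above is easy to check. The short sketch below uses only the per-game figures quoted in the text; the variable and function names are just for illustration.

```python
# Per-game figures quoted in the text.
away = {"correct": 23.03, "actual": 29.05}
home = {"correct": 22.04, "actual": 28.02}


def summarize(side):
    """Return (missed strikes per game, strike accuracy in percent)."""
    missed = side["actual"] - side["correct"]
    accuracy = side["correct"] / side["actual"]
    return round(missed, 2), round(100 * accuracy, 2)


away_missed, away_acc = summarize(away)
home_missed, home_acc = summarize(home)
gap = round(away_acc - home_acc, 2)  # accuracy gap in percentage points
```

This gives 6.02 missed strikes per game at 79.28% accuracy for away teams and 5.98 missed strikes at 78.66% for home teams, a gap of 0.62 percentage points. The higher accuracy and the higher number of missed strikes can both belong to the away team because away teams also see more actual strikes per game.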

Called Strike Likeliness in Specific Game Situation

Included in Table Eight are the three most and least likely umpires to call any non-fastball a strike below the vertical midpoint of our zone. I split the strike zone at 2.55 vertical feet and looked at any pitch (not necessarily within the zone) below that height. Here, we are not judging an umpire’s accuracy of correctly identifying pitches, but rather looking at where a certain umpire may call specific pitches. We can see that Doug Eddings is 5.34% more likely to call a strike on a non-fastball as compared to Carlos Torres.

While this does not paint the entire picture, we are able to see how their tendencies can play an important role in the game. Information like this may be valuable to a team in deciding how to pitch a specific batter, which reliever to bring into a game, or whether to be more patient or aggressive at the plate.
table-eight

Conclusion

External pressures and increased standards undoubtedly affect umpire strike zones. As evidenced throughout this paper, strike zones are called smaller than the rule-book strike zone specifies. And while umpires are trending toward correctly identifying strikes, situations such as count and pitch type can affect their judgment.

While the system in place is not 100% accurate, we must understand that these umpires are judging the fastest and most visually-deceptive pitches in the world and are the best at what they do. Major League Baseball must use modern technology to its advantage and provide the best training for umpires to achieve the goal of calling the rule-book strike zone. Another option, while more drastic and difficult to implement, may be to adapt the definition of the rule-book strike zone, something that has not been changed since 1996.


The Least Interesting Player of 2016

Baseball is great! We all love baseball. That’s why we’re here. We love everything about it, but we especially love the players who stick out. You know, the ones who’ve done something we’ve never seen before, or the ones that make us think, “Wow, I didn’t know that could happen.” It’s fun to look at players who are especially good — or, let’s face it, especially bad — at some aspect of this game. They’re the most interesting part of this game we love.

But not everyone can be interesting. Some players are just plain uninteresting! Like this guy.
http://gfycat.com/TinyWeakBonobo
OMG taking a pitch? That’s boring. You’re boring everybody. Quit boring everyone!

https://gfycat.com/GargantuanCreamyAmberpenshell
You caught a routine fly ball? YAWN! Wake me when something interesting happens.

But it’s hopeless; nothing interesting will ever happen with Stephen Piscotty. I’m sure the two GIFs above have convinced you that he was the least interesting player in baseball last year. But, on the off-chance that you have some lingering doubts, we can quantify it. I’ve made a custom leaderboard of various statistics for all qualified batters in 2016. For each of these statistics, I computed the z-score and the square of the z-score. In this way, we can boil down how interesting each player was to one number — the sum of the squared z-scores. The idea is that if a player was interesting in even one of these statistics, they’d have a high number there. Here are the results:

Click through for an interactive version

I don’t need to tell you who the guy on the far right is. On the flip side, though, there are two data points on the left that stick out. The slightly higher of the two is Marcell Ozuna, with an interest score of 1.627. The one on the very far left is Stephen Piscotty, with an interest score of 0.997. That’s right — if you sum the squares of his z-scores, you don’t even get to 1! This is as boring and average as baseball players get.
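For anyone who wants to try the interest score themselves, here is a minimal sketch of the idea described above: a z-score per statistic, squared, then summed per player. The player names and numbers are made up, and the use of the population standard deviation is an assumption on my part; the actual code on GitHub may do this differently.

```python
from statistics import mean, pstdev


def interest_scores(table):
    """table: {player: {stat: value}} -> {player: sum of squared z-scores}."""
    stats = sorted(next(iter(table.values())))
    scores = {player: 0.0 for player in table}
    for stat in stats:
        values = [row[stat] for row in table.values()]
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (an assumption)
        if sigma == 0:
            continue  # everyone is identical in this stat, so it adds nothing
        for player, row in table.items():
            z = (row[stat] - mu) / sigma
            scores[player] += z * z
    return scores


# Made-up leaderboard, only to exercise the function (not real data).
leaderboard = {
    "Player A": {"AVG": 0.250, "ISO": 0.100},
    "Player B": {"AVG": 0.270, "ISO": 0.150},
    "Player C": {"AVG": 0.290, "ISO": 0.200},
}
scores = interest_scores(leaderboard)
most_boring = min(scores, key=scores.get)
```

In this toy example Player B sits exactly at the mean of both statistics and ends up with a score of essentially zero, while the two players at the extremes tie for most interesting. Squaring the z-scores means that being far above average and far below average count as equally interesting.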

Where the real fun begins, though, is when you start making scatter plots of these statistics against each other. I’ve made an interactive version where you can play around with making these yourself, but here are a few highlights:


AVG vs. SLG


IFFB% vs. OPS


ISO vs. wRC+

Pretty boring, right? But wait, there’s more! Let’s investigate a little further what went into his interest score. Remember how we summed his squared z-scores and got a value below 1? Well, let’s look at the individual components that went into that sum.

The Most Boring Table Ever
Statistic   Squared z-score
LD%         0.108
GB%         0.002
PA          0.296
G           0.220
OPS         0.001
BB%         0.057
SLG         4.888e-05
WAR         0.007
BABIP       0.141
K%          0.103
IFFB%       0.0004
ISO         5.313e-05
FB%         0.007
wOBA        0.022
AVG         1.69e-29
wRC+        0.025
OBP         0.006

Yes, you’re reading that right — where he stood out the most was in games played and plate appearances. Yay, we got to see that much more boring! Also, I think it is especially apt that his AVG was EXACTLY league average.
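As a quick sanity check, the components in the table can be added back up. The sketch below uses the rounded values exactly as printed, so the total comes out to roughly 0.996 rather than the 0.997 quoted earlier; the small gap is presumably rounding in the table.

```python
# Squared z-scores for Piscotty, copied from the table as printed (rounded).
components = {
    "LD%": 0.108, "GB%": 0.002, "PA": 0.296, "G": 0.220, "OPS": 0.001,
    "BB%": 0.057, "SLG": 4.888e-05, "WAR": 0.007, "BABIP": 0.141,
    "K%": 0.103, "IFFB%": 0.0004, "ISO": 5.313e-05, "FB%": 0.007,
    "wOBA": 0.022, "AVG": 1.69e-29, "wRC+": 0.025, "OBP": 0.006,
}

total = sum(components.values())

# The two statistics that contribute the most to the total.
top_two = sorted(components, key=components.get, reverse=True)[:2]
```

The two largest contributors are PA and G, which together account for a bit more than half of the total.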

All right, time to step back and be serious for a second. As Brian Kenny is always reminding us, there is great value in being a league-average hitter. Piscotty was worth 2.8 WAR last year, just his second year in the league. He’s already a very valuable contributor to a very good team. Maybe it’s time we started noticing guys who do everything just as well as everyone else, and valuing their contributions too?

(Nah, I’m going to go back and pore over Barry Bonds’s early-2000s stats for the next few hours.)

All the code used to generate the data and visualizations for this post can be found on my GitHub.


dSCORE: Pitcher Evaluation by Stuff

Confession: fantasy baseball is life.

Second confession: the chance that I actually turn out to be a sabermetrician is <1%.

That being said, driven purely by competition and a need to have a leg up on the established vets in a 20-team, hyper-deep fantasy league, I had an idea to see if I could build a set of formulas that attempted to quantify a pitcher’s “true-talent level” by the performance of each pitch in his arsenal. With the help of one of my buddies in the league who happens to be (much) better at numbers than yours truly, dSCORE was born.

dSCORE (“Dominance Score”) is designed as a luck-independent analysis (similar to FIP) that shows whether a pitcher might be overperforming or underperforming based on the quality of the pitches he throws. It analyzes each pitch at a pitcher’s disposal using outcome metrics (K-BB%, Hard/Soft%, contact metrics, swinging strikes, weighted pitch values), with each metric weighted by importance to success. For relievers, missing bats, limiting hard contact, and one to two premium pitches are better indicators of success; starting pitchers with a better overall arsenal plus contact and baserunner management tend to have more success. We designed dSCORE as a way to identify possible high-leverage relievers or closers early, as well as to strip out as much luck as possible and view a pitcher from as pure a talent point of view as possible.

We’ve finalized our evaluations of MLB relievers, so I’ll be going over those below. I’ll post our findings on starting pitchers as soon as we finish up that part, but you’ll be able to see the work in progress in this Google Sheets link that also shows the finalized rankings for relievers.

Top Performing RP by Arsenal, 2016
Rank  Name                  Team        dSCORE
1     Aroldis Chapman       Yankees     87
2     Andrew Miller         Indians     86
3     Edwin Diaz            Mariners    82
4     Carl Edwards Jr.      Cubs        78
5     Dellin Betances       Yankees     63
6     Ken Giles             Astros      63
7     Zach Britton          Orioles     61
8     Danny Duffy           Royals      61
9     Kenley Jansen         Dodgers     61
10    Seung Hwan Oh         Cardinals   58
11    Luis Avilan           Dodgers     57
12    Kelvin Herrera        Royals      57
13    Pedro Strop           Cubs        57
14    Grant Dayton          Dodgers     52
15    Kyle Barraclough      Marlins     50
16    Hector Neris          Phillies    49
17    Christopher Devenski  Astros      48
18    Boone Logan           White Sox   46
19    Matt Bush             Rangers     46
20    Luke Gregerson        Astros      45
21    Roberto Osuna         Blue Jays   44
22    Shawn Kelley          Mariners    44
22    Alex Colome           Rays        44
24    Bruce Rondon          Tigers      43
25    Nate Jones            White Sox   43

Any reliever list that’s headed up by Chapman and Miller should be on the right track. Danny Duffy shows up, even though he spent most of the summer in the starting rotation. I guess that shows just how good he was even in a starting role!

We had built the alpha version of this algorithm right as guys like Edwin Diaz and Carl Edwards Jr. were starting to get national helium as breakout talents. Even in our alpha version, they made the top 10, which was about as much of a proof-of-concept as could be asked for. Other possible impact guys identified include Grant Dayton (#14), Matt Bush (#19), Josh Smoker (#26), Dario Alvarez (#28), Michael Feliz (#29) and Pedro Baez (#30).

Since I led with the results, here’s how we got them. For relievers, we took these stats:

Set 1: K-BB%

Set 2: Hard%, Soft%

Set 3: Contact%, O-Contact%, Z-Contact%, SwStk%

Set 4: vPitch

Set 5: wPitch

Set 6: Pitch-X and Pitch-Z

(where “Pitch” includes FA, FT, SL, CU, CH, FS for all of the above)

…and threw them in a weighting blender. I’ve already touched on the fact that relievers operate on a different set of ideal success indicators than starters, so for relievers we settled on weights of 25% for Set 1, 10% for Set 2, 25% for Set 3, 10% for Set 4, 20% for Set 5 and 10% for Set 6. Before we weighted each arsenal, though, we compared each metric to the league mean, and gave it a numerical value based on how it stacked up to that mean. The higher the value, the better that pitch performed. Sum up the final weighted values, and you get each pitcher’s dSCORE.
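To make the weighting step concrete, here is a rough sketch of the final sum using the reliever weights listed above. The per-set values fed in are hypothetical: how each metric gets its numerical value against the league mean isn't spelled out here, so the sketch simply assumes each set has already been reduced to one number on a common scale.

```python
# Reliever weights from the text, keyed by set.
WEIGHTS = {
    "set1": 0.25,  # K-BB%
    "set2": 0.10,  # Hard%, Soft%
    "set3": 0.25,  # Contact%, O-Contact%, Z-Contact%, SwStk%
    "set4": 0.10,  # vPitch
    "set5": 0.20,  # wPitch
    "set6": 0.10,  # Pitch-X and Pitch-Z
}


def dscore(set_values):
    """Weighted sum of per-set values.

    set_values: {set name: value already scored against the league mean}.
    The scoring of each set is assumed to have happened upstream.
    """
    return sum(WEIGHTS[name] * set_values[name] for name in WEIGHTS)


# Hypothetical pitcher: strong at missing bats, ordinary elsewhere.
example = {"set1": 90, "set2": 50, "set3": 90, "set4": 50, "set5": 60, "set6": 50}
example_score = dscore(example)
```

Because the weights add up to 100%, a pitcher who scored the same value in every set would get that value back as his dSCORE. Changing the weights is how you would tilt the rankings toward a particular skill, such as missing bats.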

What the algorithm rolls out is an interesting, somewhat top-heavy curve that would be nice to paste in here if I could get media to upload, but I seem to be rather poor at life, so that didn’t happen. BUT it’s on the Sum tab in the link above. Adjusting the weightings obviously skews the results and therefore introduces a touch of bias, but it also has some interesting side effects when searching for players who are heavily affected by certain outcomes (e.g. someone who misses bats but whose rest of the package is iffy). One last oddity/weakness we noticed was that pitchers with multiple plus-to-elite pitches got a boost in our rating system. That could be an issue because guys like Kenley Jansen, who rely on a single dominant pitch, can get buried more than they deserve.


Maximizing the Minor Leagues

Throughout each level of the minor leagues, a lot of time and effort is devoted to travel. A more productive model would have an entire level play in one location. Spring training’s Grapefruit and Cactus Leagues are a great example. Like spring training, the goal of the minor leagues is to develop, not to win. In this system, players would have more time to work on strength, durability, and skill development. This system could be in effect until the prospect reaches Double-A. At that level, players could start acclimating to playing ball all over the map. However, this is merely a pipe dream. The more realistic option for improving the minor leagues would be to raise each player’s salary.

In 2014, three ex-minor-league baseball players filed a lawsuit against Major League Baseball, commissioner Bud Selig and their former teams in U.S. District Court in California. Sports Illustrated attorney and sports law expert, Michael McCann, explained their case.

“The lawsuit portrays minor league players as members of the working poor, and that’s backed up by data. Most earn between $3,000 and $7,500 for a five-month season. As a point of comparison, fast food workers typically earn between $15,000 and $18,000 a year, or about two or three times what minor league players make. Some minor leaguers, particularly those with families, hold other jobs during the offseason and occasionally during the season. While the minimum salary in Major League Baseball is $500,000, many minor league players earn less than the federal poverty level, which is $11,490 for a single person and $23,550 for a family of four….

The three players suing baseball also stress that minor league salaries have effectively declined in recent decades. According to the complaint, while big league salaries have risen by more than 2,000 percent since 1976, minor league salaries have increased by just 75 percent during that time. When taking into account inflation, minor leaguers actually earn less than they did in 1976.”

Like many big corporations, MLB teams would never increase minor-league salary just because it is the right thing to do. What’s in it for them? Think about it like this.

economics-milb

At point A, when the average MiLB player has a wage set at W2, the player will take Q2 hours out of the day to work toward baseball. As you can see, there is room to improve, as point B is optimal. Accomplishing point B would mean increasing a player’s salary to W1. In turn, players could afford to take Q1 hours out of the day toward baseball. With most minor-league players needing to find work in the offseason or even during the baseball season, a raise in salary would give them the opportunity to be full-time baseball players. These prospects would spend more time mastering their craft, speeding up the developmental process.

With a season as long as 162 games, there is no telling how much depth could be needed in a given year. Just ask the Mets. That’s why it is important to maximize the development in a team’s farm system. At the end of the day, this is merely a marginal benefit. It will not take an organization’s farm system from worst to first. However, it only takes one player who unexpectedly steps up in September to alter a playoff race and prove the investment worthwhile.


The St. Louis Cardinals Have a Type

This series will cover various trends I’ve observed major-league baseball teams following. Some trends will be analytical while others will be more…”conceptual.” Trends may span a season or even several; it doesn’t matter, and I don’t want to limit myself out of the box. Ideally, I’d like to cover all 30 teams, but I also don’t want to expect too much of myself out of the box, either. After all, I don’t have Francisco Lindor’s smile to pull it off.

Image result for francisco lindor trips on bat gif

Maybe that kind of thinking is limiting — not the part about Lindor’s smile (though in a weird way it does tie in), but maybe I like to undertake something with the caveat that I might not follow through because of experience, mine or others (see Stevens, Sufjan — 50 states project). Or maybe I don’t want to underestimate the extent of my laziness. Or maybe I’m just a glass-half-empty kind of guy…

And maybe all of this relates to my face.

From Wikipedia: “Physiognomy is the assessment of a person’s character or personality from his or her outer appearance, especially the face.”

It’s no secret that we’re all judged on our outer appearance. Some studies have shown it even relates to how well we’re paid. A predisposition towards handsome exists in baseball, too. It’s in the old scouting maxim, “the good face,” which essentially is the baseball colloquialism for “hottie.” But it can also refer to the potential presence of naturally-elevated levels of testosterone, as a strong jaw and well-defined cheekbones are sometimes indicative of the hormone.

Hogwash! Right? Well, have you ever wondered what it would be like to look like Brad Pitt and thought about how differently people would react to you? Now, I’m not talking about a matter of right or wrong, but people would generally respond more positively to you, both socially and professionally, and that does have an impact on confidence, which plays a massive, albeit intangible role in a baseball player’s on-field success.

But, come on. With all the advanced methodology we have to evaluate players, isn’t “The Good Face” adage a thing of the past? I’m sure it’s probably lost some of its weight in the player-evaluation process, but it hasn’t disappeared. In fact, in the evaluation process used by (arguably) the most successful team of this decade, it’s very much alive. In recent memory, there have been enough handsome doppelgangers in their mix to make you wonder whether the “Cardinal way” is less an iron-clad philosophy the organization established to get the most out of their young players and more just a certain type of guy!

You’re at FanGraphs, and so I assume you’re a savvy individual and that you know a ruse when you read one, but I want to qualify this writing by saying that this proposition is roughly 0.0000000000000001% serious. Okay, so essentially, there are six archetypes for Cardinals players’ faces.

1. The Wain-Os

oval

Pseudoscience says: “If you have an oval face shape, you always know the right things to say.”

wain0

2. The Kozmakazies

square

Pseudoscience says: “If you have a square-shaped face, you are gung-ho and a total go-getter.”

squaress

3. The DesCarpenSons

rectang

Pseudoscience says: “You value logic and you’re a really good thinker. Plus, you’re an awesome planner.”

4. The Lynnburger

heart

5. The Ambiguous Pham

ambig

Pseudoscience says: “If you have a diamond face shape, you’re a control freak. You’re very detail-oriented…”

6. We Don’t Want No Scruggs

oblon

Pseudoscience says: See the Wain-Os

Yes, it helps they’re in the same uniform, and yes, I very obviously cherry-picked some of those, but aren’t you still a little floored? The lack of variation here rivals the indistinguishability of the male contestants on the Bachelorette.

So isn’t this proof of old-school scouting at work? What gives with all the talk of the Cardinals’ cutting-edge front office — are they just masquerading with the hiring of NASA data analysts and organizational philosophies? Or have they truly married the new school and old school? Maybe there is something to building a roster of similar-looking players that prevents “fault lines” from forming.

Or maybe…

Think back to the hacking scandal of 2015. The Cardinals’ new Director of Scouting, Christopher Correa, hacked the Astros’ database for information on players regarding the draft, bonuses, and trade talks. Keep in mind, he was working for the Cardinals, not a brand-new expansion team; he could’ve hired anyone he wanted to work for him. He could’ve had his own NASA data analyst, just like Jeff Luhnow had done before him. I know that in the minds of these men, there’s a lot at stake, and so they look for any competitive advantage they can, but this scenario feels like it’s the smartest kid in class copying off the other smartest kid in class on a math test.

So what did Jeff Luhnow have access to that Correa didn’t?!

It was one file, actually. A file buried deep within the infrastructure of the Astros’ database. A file called “Stardust” (Yes, like in Rogue One). Allow me to explain.

This is daughter Luhnow. Her name and age are unknown (Jeff did not respond to my tweets), but my wife estimates her to be 19 in this photo. If we work off that number, she’s at least 20 now, and that means she’s probably been able to identify boys she thinks are cute for around 15 years, which lines up perfectly with when her dad was hired by the Cardinals in 2003.

Imagine it, “it’s easy if you try;” one day, a five-year-old daughter Luhnow wandered into her dad’s office and climbed up onto his lap while he was looking at some files of some players he was targeting to acquire. Mostly just talking to himself, Jeff explained the pros and cons of each player to his daughter and showed her their pictures. When he got to a young pitcher in the Atlanta Braves’ farm system, she put her hands to her mouth and giggled. “What’s so funny?” Jeff asked his suddenly-bashful daughter. Her face was nuzzled in her dad’s chest, so the words were a bit muffled, but Jeff heard them clear as day. “He’s cute,” she responded.

It was a strange moment for Jeff — he wrestled back the protective instincts welling up inside him, but as he looked at the picture of the lanky, young right-handed pitcher, he realized that she wasn’t wrong. Adam Wainwright was handsome in an awkward, President’s son kind of way.

While this was the deciding factor for Jeff, he was thrilled that he wouldn’t have to admit that to his bosses, because Wainwright was also a top-100 prospect. So the Cardinals sent J.D. Drew and Eli Marrero to the Braves for Wainwright, Jason Marquis, and Ray King (an admittedly motley crew).

Jeff remembered that moment and would, from time to time, call his daughter into his office and gauge her reactions to the players he’d show her. Eventually, however, he didn’t need to call her in anymore. Daughter Luhnow liked baseball, and liked looking at the pictures of the young men; it was like reading a Teen Beat magazine with her dad!

Before I go any further, I want to note that this is one of those conceptual pieces I referred to in the intro, and that the parts about daughter Luhnow are entirely fictitious. There are also no underlying misogynistic themes at play here. I believe a woman could run a major-league baseball team as well as any man — I just think the idea of a team as renowned and successful as the Cardinals being run on the lustful whims of a teenage girl is really funny.

So the way I see it, she had her own Excel spreadsheet where she could rate the features of potential acquisitions on the same 20 – 80 scale as scouts. She could comb through high-school, college, minor-league, and major-league rosters and highlight her favorite guys by coming up with an overall score.

This authoritative list, while completely undisclosed until now, has unwittingly been at the center of a couple of controversies. It is what ultimately drove the wedge between Walt Jocketty and the Cardinals, and also, as previously mentioned, is the holy grail that Christopher Correa was in search of when he hacked the Astros’ database — and what he is currently serving a 46-month jail sentence for.

As for the moves that Correa made without the elusive “Stardust” file: he had an idea of her type of guy based on previous transactions, and he was able to make some quality, daughter-Luhnow-inspired acquisitions. Of course, that’s hardly a silver lining. Try explaining to your cell mate that you’re in prison for hacking into someone else’s computer for a list of cute, young men (some of whom are still in high school!).

You get it. You’re on board. The Cardinals’ success has largely been driven by a teenager’s romantic fantasies. Okay, maybe not. Regardless, I still have a hard time telling the difference between Adam Wainwright and Michael Wacha and I want to see if you are any better. Here are eight pictures of the two Cardinal pitchers, four of each; in the comments section, please attempt to sequence these correctly, and that’s it. This is what happens in the doldrums of the offseason!

answer-key