Archive for Research

The Angels Are Defying the Strikeout Trends

While perusing the newly introduced +Stats section on FanGraphs, I couldn’t help but notice at the time that three Los Angeles Angels players held the top three spots for the lowest K%+ among qualified hitters in 2019 thus far, with two more Angels joining them inside the top 30. The first two were David Fletcher and Tommy La Stella; they are roughly average hitters at best, but each has run a far-below-average K% throughout his professional career, so seeing them at the top in a small sample isn’t too shocking. The third was of course Mike Trout, who has decided he doesn’t feel like striking out anymore while still maintaining his incredible hitting prowess. Among all position players who have been qualified hitters in both 2018 and 2019, only Matt Chapman has lowered his K%+ by more in absolute terms (Chapman’s -67 to Trout’s -62), and nobody has lowered his K%+ by more in percentage terms than Trout has, as detailed in the chart below:

Overall, no other team in baseball has more than two players in the top 30 of this K%+ measure, and by simple deduction, a handful of teams have not had one single player within that cutoff. Devan Fink has also written about how the Angels are not striking out in 2019, but I was curious to see how their players are stacking up with other recent seasons, so I set the parameters to include all qualified seasons from this decade, and the results were surprising.

Although this is just a small sample, as mentioned earlier, it’s still noteworthy that those three players account for three of the top four qualified seasons since the beginning of this decade. I’ve also highlighted Andrelton Simmons’ 2018 season, which was another top-10 placing for the Angels. Although Simmons doesn’t appear in this chart for his 2019 season, he wasn’t far off with his K%+ of 46, ranking 12th for the season. With all of these Angels players posting such low K%+ figures, I became even more curious about how they stack up as a team historically, and whether this is an intentional approach they’re implementing.


Getting Ejected Works

Getting mad at an umpire, and then tossed from the game, may seem like an ineffective display of emotion since calls are never reversed after a little more yelling. But what about future calls? In order to answer this question, we need good data on a large number of adjudicated events. Close out and safe calls happen fairly rarely, and good data quantifying how close the play was would be difficult to collect. But the home plate umpire calls balls and strikes for every batter, and pitches at the edges of the zone provide plenty of opportunities to grow or shrink the zone slightly.

It’s difficult to measure the zone in a particular game since there aren’t enough pitches at each spot on the boundary of the zone, but by combining data from many games, we can get a clear idea of what the average zone looks like. As for quantifying the zone, it’s easy to get carried away with details (location of each side, correcting for player height, etc.), but with enough data, all of those variables should average out and we can focus on the simplest measure: zone size.

During the past four years, there have been 308 games featuring an ejection over the strike zone, containing about 47,000 pitches. Splitting by team (team with ejected player/coach/manager and opposing team) and before/after the ejection, we have groups with between 9,500 and 14,000 pitches, plenty for a good estimate of the strike zone.

The results, shown below, reveal two clear trends. First, one team is clearly justified in being upset, as its hitters face a larger zone. Second, umpires fix this after making an ejection, even over-correcting slightly.

Umpires are Human

We all see the humanity of umpires in their fallibility, but it shows in other ways too: the zone shrinks on 0-2 counts and expands on 3-0 ones, showing that they don’t like ending an at-bat with their own judgement call. This doesn’t mesh well with the fiery persona of the umpire and their emotive strike-three calls, but we have to remember that they are playing a part, and their main goal is to keep the game firmly in their control. We see more evidence of this here: if umpires ejected arguing players out of a sense of holy wrath, we would expect no change in the strike zone at all.

Instead, we see a clear reaction in the direction that the arguing player desires. While the data cannot point to the exact mechanism, I see two distinct explanations: signaling and aversion to conflict.

In the signaling hypothesis, players are frequently sending messages to the umpire, but the umpire weighs each message according to the cost of sending it. A few words muttered under a player’s breath cost him nothing, and so they are usually ignored. An ejection is costly, so the umpire takes that signal seriously.

The second hypothesis is a simple human aversion to being yelled at in front of a crowd of thousands. It’s not a fun experience for anyone, so they take action to avoid it happening again.

About the Models

To measure the zone, I took two approaches, k-nearest neighbor (which knows nothing about the expected shape of the strike zone) and a logistic regression based model (which looks for a rounded rectangle). Error estimates were calculated using bootstrapped samples. Both gave similar results, and the code and data behind this post are available on Kaggle.
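As a rough illustration of the k-nearest-neighbor approach with bootstrapped error estimates, here is a minimal Python sketch. It is not the author's Kaggle code: the function names, the grid range, the 0.25 ft cell size, the value of k, and the synthetic rectangular zone are all assumptions made for this example. Each pitch is a tuple (x, z, called_strike) in feet, and zone size is taken to be the area of grid cells where most of the nearest called pitches were strikes.

```python
import random

def knn_strike_prob(pitches, x, z, k):
    """Share of the k called pitches nearest to (x, z) that were strikes."""
    nearest = sorted(pitches, key=lambda p: (p[0] - x) ** 2 + (p[1] - z) ** 2)[:k]
    return sum(p[2] for p in nearest) / len(nearest)

def zone_area(pitches, k=15, step=0.25, x_range=(-2.0, 2.0), z_range=(0.0, 5.0)):
    """Area (sq ft) of grid cells whose k-NN strike probability is at least 0.5."""
    nx = int(round((x_range[1] - x_range[0]) / step))
    nz = int(round((z_range[1] - z_range[0]) / step))
    cells = 0
    for i in range(nx):
        for j in range(nz):
            # Evaluate the classifier at the center of each grid cell.
            cx = x_range[0] + (i + 0.5) * step
            cz = z_range[0] + (j + 0.5) * step
            if knn_strike_prob(pitches, cx, cz, k) >= 0.5:
                cells += 1
    return cells * step * step

def bootstrap_se(pitches, n_boot=10, seed=0, **kwargs):
    """Standard error of the zone area from resampling pitches with replacement."""
    rng = random.Random(seed)
    areas = []
    for _ in range(n_boot):
        sample = [rng.choice(pitches) for _ in pitches]
        areas.append(zone_area(sample, **kwargs))
    mean = sum(areas) / len(areas)
    variance = sum((a - mean) ** 2 for a in areas) / (len(areas) - 1)
    return variance ** 0.5

if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic called pitches: a strike if and only if inside a made-up rectangle.
    pitches = []
    for _ in range(400):
        x, z = rng.uniform(-2, 2), rng.uniform(0, 5)
        pitches.append((x, z, 1 if abs(x) <= 0.83 and 1.5 <= z <= 3.5 else 0))
    print("zone area (sq ft):", zone_area(pitches))
    print("bootstrap SE:", bootstrap_se(pitches))
```

To mirror the comparison in the post, one would call zone_area separately on each of the four groups (ejected team or opposing team, before or after the ejection) and compare the areas against their bootstrap errors.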


Evaluating Trevor Bauer’s Pitch Usage

Trevor Bauer is a walking headline. Whether he is turning himself into some kind of pitching robot in a lab or calling out his peers for using a foreign substance to enhance their spin rate, Bauer tends to attract plenty of attention away from the field. However, Bauer’s most noteworthy accomplishments lately have occurred on the field. Last season, Bauer had more fWAR than Blake Snell, winner of the American League Cy Young Award, despite pitching fewer innings and landing on the injured list for over a month. Bauer, unsatisfied with last year’s performance, developed his previously sparingly used changeup in the offseason to complement his already ample repertoire. Taking a look at Bauer’s pitch usage this year shows a clear difference in the way he attacks righties as opposed to lefties.

Here is his pitch mix vs. righties this season:

And his usage vs. lefties:

Of course, the small sample size caveat applies at this point of the season, but Bauer has been featuring a changeup against lefties at a much higher rate than last season.

Here’s Bauer’s 2018 pitch usage against lefties:

Bauer now throws his changeup twice as often as last season against lefties, and so far the results have been good. The pitch has produced a 75% ground-ball rate when put in play, and opposing batters have only recorded a single hit off of it.

While Bauer has certainly adjusted his method of attacking lefties, an early breakdown of how he has attacked righties is even more intriguing. Here’s his 2018 pitch usage vs. right-handed batters:

Comparing Bauer’s 2018 and 2019 pitch breakdown against righties reveals a few monumental adjustments. Bauer has evidently abandoned his signature knuckle curve and replaced it with a sharp increase in the usage of a cutter. In my opinion, these adjustments were made in the name of tunneling. Sliders and cutters both have primarily sideways movement, which makes it more difficult for the batter to differentiate between them. Curveballs and changeups both tend to break downwards, causing the same confusion for batters. By pairing these pitches against righties and lefties respectively, Bauer decreases the chance that a batter can read the pitch correctly out of his hand.

Up until this point, Bauer has been sharp, using his new changeup and dedication to tunneling to strike out a third of the batters he has faced and to fire seven no-hit innings on April 4th against the Blue Jays. Through five starts, Bauer has struck out 32.6% of batters, with at least seven strikeouts in each outing. As Bauer continues to tweak his approach, perhaps he could benefit even further by lowering his fastball usage, mimicking the strategy of many pitchers before him, in order to combat hitters who sit on the pitch ready to unleash uppercut swings. By lowering his fastball usage and further utilizing his tunneling ability, Bauer will be even more unpredictable to hitters.

Time will tell if Bauer’s new strategy will be successful all year, but based on his dedication to both analytics and his craft, he seems to be on pace for another Cy Young caliber season.

Gabriel Billig is currently a student at Baruch College studying data analytics.


Are Ted Williams’ Hitting Philosophies Still Relevant Based on the Data?

In hindsight, it’s unfortunate that Ted Williams’ philosophies on hitting took so long to become universally accepted. His thoughts on batting were clearly ahead of his time, and it has only been in the past few years that the more prevalent “swing down” views have largely exited the baseball community.

In his book, The Science of Hitting, Williams suggested an upward swing path that aligns the bat path and pitch path for a better chance of contact – about 5 degrees for a fastball and 10-to-15 degrees for a curveball. This research note is not about the total amount of loft in the swing today — everyone knows that swing loft is greater now than in Williams’ day. However, there are some very interesting findings in the data in terms of whether players are utilizing consistent amounts of swing loft for different pitch locations, which is implied in Williams’ book.

One observation that seems to hold in many sports is that the best performers are typically out in front of the popular views of the day in terms of changing mechanics for the better. However, as we will see in the data, this does not necessarily mean that these superior mechanics are being understood and directed by conscious understanding.

It turns out that there is a very important element that wasn’t considered by Williams in his book which the data shows the best hitters are “considering” — the amount of Vertical Bat Angle (VBA) in the swing. VBA can be defined as the amount of vertical swing tilt as viewed from the center field camera. The swings in Williams’ day as well as the illustrations in his book clearly have much less VBA than today’s hitters. While there is no broad data on VBA, a study of minor league hitters by David Fortenbaugh in 2011 showed the following averages of VBA at contact:

There is evidence which suggests that VBA goes well beyond player “style” and is more of a core swing mechanic that is associated with higher quality contact as well as superior levels of performance. Here is a chart showing VBA by playing level.




The Most- and Least-Potent Pitch Combos in 2018

I believe that pitches aren’t thrown in a vacuum, and the effectiveness of one pitch is certainly affected by the pitches that preceded it. Thus, I wanted to identify the most- and least-potent 1-2 pitch combinations in the 2018 Major League Baseball season. To accomplish this, I built a Pitch Combo Effectiveness Tool based on all 2018 pitches thrown in the major leagues.

The approach I took was to evaluate every pitch as the second pitch in a 1-2 combo, which means first pitches in an at-bat are excluded. I defined these pitch combos using the pitcher, the pitch types of both the first and second pitches (e.g. “four-seam fastball followed by a curveball”), and the pitch location change from the first to the second pitch (e.g. “the second pitch was further down and more inside than the first pitch”). I then gauged the effectiveness or value of these pitch combinations using the sum of the wOBA added for both the first and second pitches. Lastly, to ensure the results cover only common pitch combos, I filtered them to combos observed at least 10 times in 2018.
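The grouping described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual code: the field names (pitcher, at_bat, pitch_type, x, z, woba_added) are hypothetical, the location change is reduced to up/down and left/right rather than inside/outside, and combos are summarized by their average summed wOBA added per occurrence.

```python
from collections import defaultdict

def location_change(first, second):
    """Coarse label for how the second pitch moved relative to the first."""
    vertical = "up" if second["z"] > first["z"] else "down"
    horizontal = "right" if second["x"] > first["x"] else "left"
    return f"{vertical}-{horizontal}"

def combo_values(pitches, min_count=10):
    """Summarize 1-2 pitch combos.

    pitches: list of dicts in the order thrown, each with the hypothetical
    keys pitcher, at_bat, pitch_type, x, z, woba_added.
    Returns {combo: (count, mean summed wOBA added)} for combos seen at
    least min_count times.
    """
    totals = defaultdict(lambda: [0, 0.0])  # combo -> [count, summed wOBA added]
    prev = None
    for pitch in pitches:
        # Only pair pitches within the same at-bat, so first pitches are
        # never treated as the second half of a combo.
        if prev is not None and prev["at_bat"] == pitch["at_bat"]:
            combo = (
                pitch["pitcher"],
                prev["pitch_type"],
                pitch["pitch_type"],
                location_change(prev, pitch),
            )
            totals[combo][0] += 1
            totals[combo][1] += prev["woba_added"] + pitch["woba_added"]
        prev = pitch
    return {
        combo: (count, total / count)
        for combo, (count, total) in totals.items()
        if count >= min_count
    }
```

Sorting the returned dictionary by the mean value would then surface the most and least effective combos, with lower wOBA added being better for the pitcher.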

The chart showing every pitch combo is below, and you can click it to go to the full tool and results:

Most and Least Effective Pitch Combos by wOBA Added



Why There May Just Be Hope for the Miami Marlins in 2019

As the 2019 season begins, Las Vegas determines the annual over/under win totals for all 30 major league teams and gives us a chance to examine intriguing over/under win lines for the upcoming season. Not surprisingly, the Miami Marlins found a spot right at the bottom of the list at over/under 63.5 wins. Will the Miami Marlins, under the ownership of Derek Jeter and the tutelage of Michael Hill, elude the worst record in baseball? Call me crazy, but there are a number of reasons why Vegas’ determination of 63.5 wins is undervaluing the Marlins.

J.T. Realmuto, a 2018 All-Star and arguably the last star on the Marlins roster, was acquired by the Philadelphia Phillies for Jorge Alfaro, Sixto Sanchez, and Will Stewart this past offseason. While Sanchez is a potential budding ace pitcher and Stewart has a real future as a middle-of-the-rotation starter, Alfaro is the most interesting addition for the 2019 season. He rates as a guy with incredible raw power when he puts the bat on the ball, with the only issue thus far in his career being that his contact percentage is quite low:

His K% ranks 245th out of 247 players (min. 350 PAs), and his BB% ranks in the 8th percentile among those same 247. His O-Swing% is second-to-last, 16 percentage points above the 2018 league average of 30.9%, and at a 61% contact rate he clearly isn’t making enough contact. However, when Alfaro does manage to put bat on ball, the results are quite impressive:

How about a video of the swing in action? This ball, at 115 mph off the bat of Alfaro, was absolutely crushed, and I think Junichi Tazawa’s reaction says it all…

With more patience and a better approach at the plate, the Marlins could have something special in Alfaro. His second-half statistics from July 2018 to September 2018 suggest that this improved approach could be on its way:

Alfaro managed to cut his K% and increase his BB% while performing as an above-average hitter according to wRC+. He made strides at the plate by lowering his whiff percentage outside of the zone from 28% in the first half to 25% in the second half. His batted ball quality also improved against breaking pitches, which he had struggled with mightily in the first half: his xwOBA against them increased from 0.246 to 0.338, and his whiff percentage on breaking balls decreased from 34.68% to 26.52%.


An Analysis of the Relationship Between Pitcher Size and UCL Tears

A UCL tear is a death sentence for a player’s season, and it can have large repercussions for the team and league as a whole, making it crucial for front offices to understand what puts players at a heightened risk for this injury. In this research, the height, weight, age, and fastball velocity of MLB pitchers in the years 2000-17 are analyzed to determine the impact of pitcher size on UCL tear probability. The results of this study will aid executives and front offices in evaluating pitchers and their risk of needing Tommy John surgery. Moreover, these findings may aid pitchers in lowering chances for injury by guiding their offseason training goals.

1. Introduction

As Tommy John surgery and UCL tears are thrust further into the spotlight, more is revealed about possible factors and causes. In this paper, I will inspect the correlation between pitcher size (BMI) and UCL tear probability in order to determine whether the former has a statistically significant impact on the latter. The data used in this study was taken from FanGraphs, the Lahman Database, and Jon Roegele’s Tommy John Database, all of which are publicly available sources. Due to the many variables which are closely correlated with BMI and have an impact on UCL health, such as age and velocity, pitcher size was analyzed independent of these variables, which are controlled through partial correlations.

2. Analysis

2.1 BMI and Tommy John: In Aggregate

When the data set is viewed in its entirety, the pattern is clear. The mean BMI of pitchers who have undergone Tommy John surgery is 27.09, whereas the mean BMI of pitchers who have not is 26.34. The difference between these means is statistically significant, as the p-value (the probability of seeing a difference at least this large if BMI were in fact unrelated to UCL tears) in a two-sample t-test is .000001153, far below the .05 benchmark commonly used in statistics. To test this relationship in a different way, the BMIs of the 2,383 pitchers in the data set (298 who had torn their UCL, 2,085 who had not) were split into deciles. The correlation between decile number and probability of Tommy John was .91, with a p-value of .0002556, revealing that there is a statistically significant linear correlation between UCL tears and BMI, with higher-BMI pitchers having higher risk for Tommy John surgery. The graph of these deciles and the probability of Tommy John is shown below.
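For readers who want to reproduce the decile test, the following Python sketch shows one way it could be set up. It assumes each pitcher is represented as a (BMI, had_tommy_john) pair and that BMI is computed from height in inches and weight in pounds with the standard 703 conversion factor; the function names are hypothetical, and the t-test and p-values reported above are not computed here.

```python
import math

def bmi(height_in, weight_lb):
    """Body mass index from height in inches and weight in pounds."""
    return 703.0 * weight_lb / height_in ** 2

def decile_rates(records):
    """Tommy John rate within each BMI decile.

    records: list of (bmi, had_tommy_john) pairs, where had_tommy_john is
    1 or 0. Needs at least 10 records so that no decile is empty.
    """
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    rates = []
    for d in range(10):
        group = ordered[d * n // 10:(d + 1) * n // 10]
        rates.append(sum(tj for _, tj in group) / len(group))
    return rates

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def decile_correlation(records):
    """Correlation between decile number (1-10) and Tommy John rate."""
    return pearson(list(range(1, 11)), decile_rates(records))
```

Because the correlation is computed on only ten decile points, it summarizes the trend across bins rather than the pitcher-level relationship.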


Is Yoan Moncada’s Breakout Coming?

Yoan Moncada has frustrated talent evaluators over the past two years. He’s about as physically talented as a baseball player can be; while still a prospect, the team here at FanGraphs thought he merited future grades of 60 hit, 60 power, 70 speed, 50 field, and 70 throw, with an OFP of 70, good for No. 1 overall prospect status. Prospects don’t get evaluated much better than that; in fact, a 70 OVR on a position player is as good as it gets. He was the kind of prospect that could headline a trade for a top-five starting pitcher, a bona fide ace, in his prime on a team-friendly contract with three years left.

Flash forward two years, about a year and a half into Moncada’s major league career, and he hasn’t performed quite as billed. In 901 career plate appearances before Opening Day 2019, he posted a 97 career wRC+ and 3.1 total fWAR, almost exactly league average or slightly below. His defense at second base has not impressed, and so he’s being moved to the hot corner in the wake of 1) the White Sox whiffing on Manny Machado, and 2) the White Sox drafting “future Gold Glove second sacker” Nick Madrigal with the 4th overall pick in 2018. If nothing changes, he’ll be in danger of becoming a utilityman.

Moncada’s offensive struggles are a little unusual. He has two traits required to be an offensive monster, power and patience, in abundance. Last year, his average exit velo of 90.6 mph was in the 86th percentile, while his 4.12 pitches seen per PA were in the 81st percentile. However, those positive traits were offset by the modern game’s bugaboo: strikeouts. Moncada struck out in an ugly 33.4% of his PAs last year, behind only Chris Davis and Joey Gallo, and his career K rate sat at 33.6% this offseason. This is very concerning, as contact issues are a flaw that is difficult to resolve.

The profile above seems to describe a three-true-outcomes hitter like the aforementioned Gallo. Dig a little deeper, though, and you’ll find that how Moncada struck out that often is not normal, and in a sense he doesn’t actually have contact issues, at least not 33.4% bad. He didn’t chase many pitches out of the zone last year, only 23.3%, sitting in the 87th percentile of qualified hitters. Nor does his whiff rate of 12.2% (league average in 2018 was 10.7%) jibe with that huge strikeout rate. Taken together, we can conclude that while Moncada’s contact ability may be somewhat below average, he limits how much he swings and misses by rarely chasing pitches out of the zone. So if Moncada doesn’t chase much, and doesn’t swing and miss that much, how is he striking out so much?


The Evolution of Stealing Bases at the College Level

After the end of the BESR era, runs per game, home runs per game, and stolen bases per game all trended downward in college baseball. Since the introduction of flat-seam balls, home runs per game and runs per game have been on an upward trend. Neither of these rule changes would seem to have any impact on stolen bases per game, and why would they? Analytics suggests that stealing bases is not worth the risk. I still believe there is value in stealing bases in today’s game, and its decline has hurt teams’ performance, especially squads that are at a disadvantage to Power 5 Conference teams. Programs such as Wright State, UCF, UConn, and Campbell are able to stay competitive year after year by implementing the run game in their offense.

In 2018, 38 of the top 50 teams in stolen bases had a record above .500, while 38 of the bottom 50 teams in stolen bases had a record below .500. Of the 35 non-Power 5 teams in the 2018 NCAA Tournament, 14 ranked in the top 50 in stolen bases.



Does Warm Weather Create Better Players?

My high-school-aged son sits at home yet again. Why? Because another of his baseball games has been canceled due to the wet and cold Ohio spring, and my thoughts turn again to our days playing baseball in Florida. Before we moved to this less-agreeable northern climate, it was a rarity to have a game canceled due to weather. Not only that, but games were scheduled year-round, which of course meant more baseball on the calendar. This situation reminded me of the familiar equation known to baseball fans:

Good weather leads to more playing.
More playing means better players.

But is this true? After all, it’s well-known that the best player in baseball, Mike Trout, is from cold-weather New Jersey. Many quickly point to the fact that California, Texas, and Florida are at the top of the list for states with the most MLB draftees, but they’re the three most populous states. Perhaps proportionally they don’t stack up to colder states after all.

I decided to look at the data from the last two drafts — 2017 and 2018 — to see if there is a relationship between a state’s average temperature and how well its players do in the draft. Do warmer-weather states really produce more MLB draftees than average?

To do this, I first gathered population data for each state to determine what percentage of the overall US population it contains. Then I did the same for each state’s share of MLB draftees. Finally, I compared those two figures and determined the percentage difference between the state’s population proportion and its draft proportion. I call this figure the “Draft Difference”.

For example, let’s say State X makes up 10% of the US population, but State X’s draft class makes up only 8% of the overall class. Its Draft Difference is calculated as:

(Draft-Population)/Population = Draft Difference

In this case,

(8-10)/10 = -.20 = -20%

A state with 10% of the US population should, all things being equal, contribute 10% of all players in an MLB draft. But, in this case, State X did 20% worse than should be expected just from its population size.
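The Draft Difference calculation is simple enough to sketch in Python. The function names and the dictionary inputs below are assumptions for illustration; the formula itself is the one given above, applied to each state's share of the population and of the draft class.

```python
def draft_difference(draft_share, population_share):
    """(Draft - Population) / Population, with both shares in the same units."""
    return (draft_share - population_share) / population_share

def draft_differences(population, draftees):
    """Draft Difference for every state.

    population: {state: number of residents}
    draftees:   {state: number of players drafted}; missing states count as 0.
    """
    total_population = sum(population.values())
    total_draftees = sum(draftees.values())
    return {
        state: draft_difference(
            draftees.get(state, 0) / total_draftees,
            population[state] / total_population,
        )
        for state in population
    }

# The State X example from the text: 8% of the draft, 10% of the population.
state_x = draft_difference(8, 10)  # (8 - 10) / 10
```

A positive value means a state contributes more draftees than its population alone would predict, and a negative value means fewer.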