Archive for Research

The Compassionate Umpire or The Cold Automated Zone

Note: This is a piece I have blogged about previously for a British baseball site located here, and this is a slightly updated version.

Jeff Sullivan does pieces on the worst called balls and strikes at the halfway mark and end of each season. These are usually quite bizarre calls that have some unusual circumstances behind them, but for the most part they don’t have too much influence on the game. However, in this postseason, there was a poor “strike” call which had a huge impact on a game.

In the bottom of the second inning of Game Three of the NLDS between the Braves and the Dodgers, Walker Buehler was in a difficult situation with two outs and runners on second and third after an error from Cody Bellinger. The Dodgers decided to intentionally walk Charlie Culberson, loading the bases, to get to Braves pitcher Sean Newcomb – a fairly standard approach in the NL. But Buehler fired four balls to Newcomb and walked in the first run of the game, bringing up Ronald Acuna. Buehler then threw him three more balls to fall behind 3-0 in the count.

Then came “ball four,” but it wasn’t called a ball despite being two inches above the top of the zone, as home plate umpire Gary Cederstrom called a strike. That meant Buehler threw another pitch to Acuna, who launched it for a grand slam, resulting in a score of 5-0 and not 2-0. The potentially “pitcher friendly” call by the umpire cost the Dodgers three runs in a game they ended up losing by just one.

To go to a hyperbolic extent, this meant they lost the game, they then had to play a further game in the series against Atlanta, they were then more tired than the Brewers in what became a seven-game series, they were then more tired than the Red Sox, and they therefore lost the World Series. Certainly a stretch, but it’s not hard to see the effect in the game considering the Braves managed just three runs in the other 35 innings of their four-game series.

Not every mistake made by an umpire has an easily identifiable ramification like that, but they do happen in most games, and it is no surprise that MLB and the WUA (World Umpires Association) want to have the smallest number of mistakes possible. Nowadays they can do this by looking at how many calls an umpire got right or wrong thanks to systems that track the speeds and trajectories of pitched baseballs. Read the rest of this entry »


What to Make of Dallas Keuchel

Despite the generally slow free agent market and the continuing increase of bullpen usage, starting pitchers have done fairly well for themselves this winter. Patrick Corbin inked a nine-figure deal, blowing past most projections to get a guaranteed $140 million. The Rays shelled out their largest free agent contract ever, giving Charlie Morton $30 million over two years. Nathan Eovaldi parlayed a strong second half and postseason heroics into a four-year, $67.5 million pact to return to Boston, and J.A. Happ got half that from the Yankees for his age 36 and 37 seasons. Even past-their-prime options such as Lance Lynn, Anibal Sanchez, and Matt Harvey were given eight figures, the former two on multi-year guarantees.

Yet arguably the most accomplished hurler among this year’s crop of free agent starters remains unsigned – Dallas Keuchel. FanGraphs’ Crowd Source and MLB Trade Rumors projections both had the 2015 Cy Young winner in the neighborhood of four years and $80 million, which would exceed Eovaldi’s deal for the second-highest guarantee among starters.

Of available starters, Keuchel was worth the second-most WAR last year (3.6, behind only Corbin’s 6.3), and projects to be the second-most valuable next year (3.3 WAR, just behind Corbin’s 3.5). Much has been made of his decline in punchouts (his strikeout rate dipped to 17.5% in 2018, fourth-lowest among qualified pitchers), but his velocity has remained steady and he’s continued to limit both walks and homers while inducing lots of ground balls. In 2018, Keuchel topped 200 innings for the third time in five seasons, and he’s been an above-average starter in all of those years.

At 31, he’s not young, but he’s younger than Happ (36), Morton (35), Sanchez (35), and Lynn (32), all of whom received multi-year deals. It’s fair to say that Keuchel doesn’t have the upside of Corbin or Eovaldi (or maybe even that of Morton or Yusei Kikuchi), but his consistency and track record should appeal to plenty of teams in need of a rotation upgrade.

Happ, a southpaw with a similar reputation for durability and above-average-but-not-elite performance, and Keuchel have been almost identical over the past three years (518 innings and 9.1 fWAR for Happ, 518.1 and 8.6 for Keuchel). But Happ is four years older, so over the course of his next contract, Keuchel’s output could quite reasonably look a lot like Happ’s recent past – that is, a 170-inning, 3-win metronome.

However, there seems to be some concern or trepidation surrounding Keuchel, a pitcher whose raw stuff was never overpowering, and the sustainability of his results. And looking at some of his underlying metrics, it’s easy to see why. Read the rest of this entry »


Batter Performance vs. Pitcher Clusters

Managers are always attempting to optimize their lineups for success. Whether they make in-game decisions like double-switches and lefty-righty matchups, or choose to change things up based on recent or historical performances, every move is meant to give their team the competitive advantage. What if they also made alterations based on pitcher groupings? In this article, I will attempt to determine if batter performance is impacted by pitcher clusters that are organized by pitch speed and pitch proportion.

The parameters used to cluster pitchers are below:

  • Proportion of Pitch Thrown
  • Average Pitch Speed

These statistics were calculated for the following pitch types:

  • Changeup
  • Curveball
  • Eephus
  • Cutter
  • Four-seam fastball
  • Sinker
  • Two-seam fastball
  • Knuckle-curve
  • Knuckleball
  • Slider
  • Splitter

*All data in this study is from 2010-July 2017 (MLB Gameday). Read the rest of this entry »


Are Analysts Affecting the Behavior They’re Observing?

Introduction and Hypothesis

One of the longest standing tenets of sabermetrics, stemming from Voros McCracken’s seminal 2001 work on DIPS (Defense Independent Pitching Stats) theory, is that pitchers ought to try for strikeouts rather than focusing on inducing weak contact. McCracken asserted that pitchers have little control over the quality of contact they allow. However, they do control if they strike the batter out (good) or walk him (bad) or allow a home run (even worse). Put another way, McCracken found a strong negative correlation between a pitcher’s strikeout rate (K%) and his runs allowed per nine innings (RA9). It is a simple logical step from here to conclude that pitchers ought to try to strike batters out.

Or is it?

Might McCracken’s DIPS observations only hold as long as pitchers are trying to generate weak contact? If they begin to focus solely on strikeouts, might this observed correlation weaken? Might we find more pitchers who are able to generate strikeouts but are not particularly successful at preventing runs?

As an analogy, consider a farmer whose goal is to get a big harvest of high-quality crops. To this end, he regularly waters and fertilizes his plants. He hires a consultant who does some studies and points out that fertilizing is closely correlated with the quality and quantity of the harvest. As a result, the farmer shifts all of his efforts to fertilizing and ignores watering altogether. Clearly this is not the best strategy. In the same way, might a pitcher be hurt by focusing on strikeouts and ignoring the quality of contact his pitches will generate if the batter does make contact?

With this in mind, might we, as analysts, in fact be affecting the very phenomena that we’re observing? Read the rest of this entry »


An Analysis of Pitch Movement at Coors Field

Since opening in 1999, Coors Field has provided the most offense-friendly environment in baseball. Despite the inherent volatility in park factors for single-season data, Coors has “won” the park factor title in 15 of the past 20 years, never finishing lower than third. The dramatic increase in home runs may be the most striking effect of the thin air about a mile above sea level, but all balls in flight, including pitches, are affected. Due to the lower air density, the spin-induced movement of a pitch thrown at high altitude will be lower than that of a comparable pitch closer to sea level.

Check out the average movement on Adam Ottavino’s pitches in 2017 and 2018 separated by home (purple) and away (black).

Ottavino pitch chart

You may recall Ottavino said recently that he is confident Babe Ruth couldn’t hit any of this stuff. Read the rest of this entry »


Prospecting for the Mookie Betts of Pitching

Over the past several years, we have watched a number of minor-league hitters with good contact skills and average or below-average power get labeled with 45s and 50s, only to burst onto the scene with an explosion of power they never showed any hint of previously. Mookie Betts might be the best example, along with guys like Jose Ramirez, who showed up in the big leagues and announced themselves by mashing. Naturally, prospect hounds, analysts, and the baseball community investigated how these guys went so overlooked (unless you were Carson Cistulli). It was surmised that contact quality mixed with good exit velocity and appropriate launch angles allowed hitters to maximize their output even without Aaron Judge levels of thump.

This investigation, however, is not a hunt for the next minor leaguer who will smash his way onto the scene, but rather a search for the pitchers who will try to stop them. With modern conditioning and institutions (read: Driveline) making it more possible than ever to gain velocity, one no longer must be naturally gifted with a 6-foot-5 frame and an easy 95 to be considered a prospect. Furthermore, with openers, bulk guys, firemen, and more, traditional pitching roles are falling by the wayside.

This analysis attempts to seek out pitchers who possess above-average command or secondary offerings but lack the prototypical velocity grades we are seeing in today’s game. Identifying these pitchers would make them intriguing candidates for these high-intensity velocity training plans. While you may not find the next Luis Severino, you could uncover an explosive fireman reliever, matchup guy, or high-octane backend starter that pushes you closer to October glory.

The process for this analysis involved using the 2018 updated prospects list from THE BOARD, developed by Kiley McDaniel, Eric Longenhagen, and Sean Dolinar at this very site. I started by sorting for prospects who either currently have command of 55 or better or project for the same. This brought the sample to 85 pitchers. Next, I removed pitchers who have a present FB grade of 55 or better. Our sample now sits at 38 pitchers who have or project to have above-average command and an average-to-below-average fastball. Before diving into the next set of data, I wanted to provide some broader notes about this group. Notable pitchers with top 100–130 considerations on this list include Atlanta’s Kolby Allard and Joey Wentz, Miami’s Braxton Garrett, and the Angels’ Griffin Canning. There are 16 lefties and 22 righties. The Phillies lead the way with five of these guys, the Cubs and Rockies are tied with three each, and then the rest of the league has one or two on this list. Additionally, the average age of this group is 22.8 years old.
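The two-step filter described above can be sketched in a few lines of Python. This is only an illustration, not the query actually run against THE BOARD: the `Prospect` fields, the `velo_candidates` helper, and the two "Hypothetical" pitchers are made up for the example, and the cutoff treats a grade of 55 as above average, which is what the tables below imply.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    fb_present: int   # present fastball grade (20-80 scale)
    cmd_present: int  # present command grade
    cmd_future: int   # projected command grade


def velo_candidates(prospects, cutoff=55):
    """Keep pitchers with present or projected command at or above the
    cutoff, then drop anyone whose present fastball is also at or above it."""
    with_command = [
        p for p in prospects if max(p.cmd_present, p.cmd_future) >= cutoff
    ]
    return [p for p in with_command if p.fb_present < cutoff]


sample = [
    Prospect("Eli Morgan", 45, 45, 55),
    Prospect("Logan Shore", 40, 50, 60),
    # The next two rows are invented purely to show each filter at work.
    Prospect("Hypothetical Flamethrower", 60, 50, 55),  # dropped: fastball too good
    Prospect("Hypothetical Wild Arm", 50, 40, 45),      # dropped: command too light
]

print([p.name for p in velo_candidates(sample)])
```

Only Morgan and Shore survive both passes here, which matches how they appear in the first table.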

Now that we have our assorted pool, it is time to sort through this group’s off-speed arsenal. This part of the analysis was more subjective. I have attempted to group pitchers with similar traits who could fill a variety of roles. What follows are three tables of guys who could benefit most from additional velocity.

Elite Pitch Guys (70 Grade Pitch)
Name         Pos  Tm   Age   FB       SL       CH       CMD
Eli Morgan   RHP  CLE  22.5  45 / 45  50 / 55  60 / 70  45 / 55
Logan Shore  RHP  DET  23.9  40 / 45  40 / 45  60 / 70  50 / 60

This first group features two right-handers with a current 60-grade pitch that projects for 70. Of the 38, these two are the lone members who feature a current 60 pitch. Of the two, Morgan has the higher upside based on his slider. Both have fastballs that sit around 90 mph, but additional velo training could push the value of these guys up a tier. Guys from this tier could be featured as openers or one-time-through-the-order relievers that rely on one elite pitch. The selling point of this group is that they have that elite pitch to lean on even without elite velocity.

Mid-to-Backend Starter Type (One 60 and 55)
Name             Pos  Tm   Age   FB       CB       CH       CMD
Pedro Avila      RHP  SDP  21.8  50 / 50  55 / 60  55 / 60  45 / 55
Joey Wentz       LHP  ATL  21.1  45 / 50  45 / 55  60 / 60  45 / 55
Braxton Garrett  LHP  MIA  21.3  50 / 50  55 / 60  40 / 55  45 / 55
Foster Griffin   LHP  KCR  23.3  45 / 45  55 / 60  50 / 55  50 / 55

The next group features players with multiple 55-or-better future offerings, led by Padres righty Pedro Avila, who is rocking two future 60-grade pitches. Previously mentioned notables Garrett and Wentz also fall into this category. This group represents backend starter types who are useful during the season but less useful during the postseason. Additional velo here could push these guys into strong No. 3 starters or high-leverage multi-inning guys.

Kitchen Sinkers (High Secondary Scores)
Name             Pos  Tm   Age   FB       SL       CB       CH       CMD      ARS
Griffin Canning  RHP  LAA  22.5  50 / 50  50 / 50  50 / 50  45 / 55  45 / 55  155
Peter Lambert    RHP  COL  21.6  50 / 50  45 / 50  50 / 55  55 / 60  45 / 55  155
Jose Lopez       RHP  CIN  25.2  50 / 50  50 / 50  50 / 50  40 / 50  50 / 55  150
Aaron Civale     RHP  CLE  23.4  45 / 50  55 / 60  40 / 45  45 / 50  50 / 60  155
Cole Irvin       LHP  PHI  24.8  40 / 40  45 / 50  50 / 50  40 / 45  45 / 55  145
Alec Mills       RHP  CHC  26.9  45 / 45  50 / 50  40 / 40  55 / 55  55 / 60  145
Cory Abbott      RHP  CHC  23.1  45 / 45  50 / 55  45 / 45  40 / 45  45 / 55  145

The last group of guys profile as backend starter types who live on off-speed stuff and have no margin for error with their fastballs. I identified these players by adding their FV non-fastball pitch grades together, noted as ARS in the table (ARS = FCH + FSL + FCB). These guys walk the command and off-speed tightrope to end up as backend starters in the best case, or just middle-relief guys or up-and-down starters. Occasionally these guys become Kyle Hendricks, Tanner Roark, or Doug Fister, but these are exceptions and not the rule. Almost everyone in this group is older for a prospect, so the ceiling is limited. However, additional velo for these guys could turn them into more dynamic multi-inning relievers, bulk guys, or high-end No. 4-5 starters.
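For anyone who wants to reproduce the ARS column, here is a minimal sketch of the sum. It assumes each table cell reads "present / future" and that ARS adds the three future (FV) grades, as described above; the helper names `parse_grade` and `arsenal_score` are my own and not part of THE BOARD.

```python
def parse_grade(cell):
    """Split a 'present / future' cell such as '45 / 55' into two ints."""
    present, future = (int(part) for part in cell.split("/"))
    return present, future


def arsenal_score(sl, cb, ch):
    """ARS = future slider + future curveball + future changeup."""
    return sum(parse_grade(cell)[1] for cell in (sl, cb, ch))


# (SL, CB, CH) cells copied from three rows of the table above.
rows = {
    "Griffin Canning": ("50 / 50", "50 / 50", "45 / 55"),
    "Aaron Civale": ("55 / 60", "40 / 45", "45 / 50"),
    "Cole Irvin": ("45 / 50", "50 / 50", "40 / 45"),
}

for name, grades in rows.items():
    print(name, arsenal_score(*grades))
```

Summing only the future grades keeps the score focused on projection, which fits a list built around what these arms might become with more velocity.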

I should also note that all these guys fall into different buckets of age, level, and body types. Arguably, the most critical part of picking a prospect from this list would be targeting high-makeup guys who would be willing to experiment and acknowledge that they could use more gas to ascend to the next level. Some of these pitchers may be maxed out physically or unwilling to change what already seems to work. This analysis also looks past statistical performance, level, and even present pitch value a bit. What this analysis does do is identify guys who could rapidly improve with additional velocity due to advanced command and secondaries. The margin for error is incredibly slim for this type of pitcher, but a pitcher who bumps from 90-92 to 94-96 through intense training, with already above-average command and secondaries, would vault into a new tier of player. For teams looking to squeeze every ounce of value out of their farm system, this could be another way to target undervalued talent that has yet to be unlocked and developed.


Where Did Madison Bumgarner’s Four-Seamer Go?

Something appears to have happened to Madison Bumgarner. Specifically, his four-seam fastball has gone missing. Depending on which data source you use, it figuratively and literally disappeared. Regardless of data source used, Bumgarner’s fastball isn’t performing.

Two leading data sources disagree on what has happened to Bumgarner’s fastball. Because of this, I chose to look at both sources independently: Pitch Info (through Brooks Baseball) and Statcast (through Baseball Savant). This analysis spans four seasons, 2015 through 2018, encompassing Bumgarner’s two best and two worst complete seasons.

According to Pitch Info, Bumgarner threw four-seamers in 2018 at a career-low frequency — 34.5% of the time in 2018, down from 48.2% in 2016 and 49.6% in 2015. It has been losing effectiveness since its peak in 2014. Using Pitch Info’s runs above average metric, we see Bumgarner’s four-seamer peaked in quality at 1.25 runs above average per 100 pitches in 2014 and has dropped each year since then: 0.97 in 2015, 0.39 in 2016, -0.35 in 2017, and -1.14 in 2018, a career low.

bum brooks.png

As seen in the Pitch Info Whiff Percentage charts above, Bumgarner’s four-seam fastball had its lowest whiff rate of our period of study in 2018 (seen on the left), likely leading to its ineffectiveness. Similarly, Bumgarner’s four-seam is measured to have had more vertical sink, independent of gravity, than at any other point in this period (seen on the right). Depending on the pitch, more movement generally increases whiff rates. A four-seam fastball moving more like a two-seamer, however, would lose swing-throughs: sinkers (two-seamers) generate more contact in the form of ground balls.

Screen Shot 2018-10-10 at 3.45.35 PM.png

Bumgarner produced his highest ground-ball rate with his fastball since 2013 while also generating the fewest whiffs with his fastball of his career. Couple the results with the change (increased vertical movement), and it appears his fastball began to behave like a two-seam fastball.

This seems to be clear already. According to Statcast, Bumgarner threw his four-seam fastball only once in 2018, as compared to 38.6% of the time in 2016 and 41.1% of the time in 2015. He replaced it mainly with two-seam fastballs, but also with some curveballs and changeups.

bum_pitches_16-18

When comparing Statcast to Pitch Info, I wondered if Statcast could have been misclassifying four-seam fastballs as two-seamers. Looking at the above plots, however, it’s clear a cluster of pitches was missing in 2018. The above graphs are of every pitch Bumgarner threw, by horizontal (x-axis) and vertical (y-axis) movement, colored by Statcast pitch classifications. Even when ignoring pitch type labels, a pitch type is seen to be missing. Specifically, Bumgarner’s high-rising, fairly straight pitch was no longer thrown. On a side note, notice how inconsistent 2017’s movements were, likely because Bumgarner had to recover from a major shoulder injury and struggled.

With Statcast data, we can evaluate what happened with greater depth than through other methods. Below is a table of statistical changes in both Bumgarner’s two-seam and four-seam fastballs.

fastball stats

Velocity is measured in miles per hour, spin in revolutions per minute, extension in feet from the rubber, and horizontal and vertical movement in inches from release point. Ignore 2017, as it was a very inconsistent year (as seen with the movement chart above). Both two-seam and four-seam fastballs in 2015 and 2016 had significant vertical rise due to spin. In 2018, however, Bumgarner couldn’t or didn’t spin his fastballs as much, resulting in less rise and more downward movement. This could be why Statcast is misclassifying his fastballs.

Why has Bumgarner lost spin on his fastballs? The data suggests two reasons why, both of which could be correlated. He’s lost velocity, and release speed correlates with spin rate. Similarly, Bumgarner has less extension on his fastballs than in 2016. His 2018 extension is similar to his 2015 extension, but because he’s lost velocity, the loss of extension could be penalizing. This loss of extension could explain the loss of spin if it’s related to grip or release.

Extension loss to home plate reduces the perceived velocity the batter anticipates, making it easier for the batter to time the pitch. Both loss of velocity and extension would, when combined, greatly benefit the batter at the expense of Bumgarner’s fastball.

What could have caused the loss of velocity and extension? Bumgarner is 29 years old, so there is a chance he’s entered his decline. The likely culprit, however, is injury: Bumgarner fell off a dirt bike in April 2017, injuring his left shoulder, and he broke his left hand on a line-drive comebacker in spring training in 2018, requiring surgery. Because he is left-handed, both injuries could have significantly affected his 2018.

One year away from free agency, Bumgarner likely hopes he can recover lost velocity and spin on his fastball. Whether it was an organizational change, a declining skill set, or driven by injury, his 2018 fastball difference was one to forget. His shoulder should be better healed, one year further removed from his accident, and hopefully his throwing hand does the same.

This and other postings like it can be found on my personal blog, First Pitch Swinging.

Why Alex Bregman Will “Out Regress” Mookie Betts

A significant challenge in baseball research is identifying when a player has made a transformational adjustment that results in a step-change in playing level (e.g. J.D. Martinez in 2013) vs. a player who has a great, yet unrepeatable year. Mookie Betts and Alex Bregman both had excellent years in 2018 and a call for regression would be expected. However, this research note presents data which suggests that Mookie Betts did indeed make a transformational mechanical change and will likely perform at high levels going forward while Alex Bregman’s improvement does not share the same solid underpinnings.

I recently examined the relationship between backspin and performance in this post. One of the key takeaways from that research was that no player in the highest backspin quartile (since the data started in 2015) has consistently put up “superstar” numbers. In fact, Mookie Betts was in the high backspin group and had the second highest wRC+ of 122 over the 2015-2017 time period – not “bad” but far from a superstar level. With Betts’ phenomenal 2018, I was curious if he was the only high backspin hitter to “break out” or if he made a significant change to his swing mechanics to hit the ball more “square.”

After reading that he and J.D. Martinez were working together on mechanics, I was curious if his backspin profile changed from prior years. Not only did it change, Betts had the largest reduction in backspin of all Qualified Hitters in 2018! Here is a list of the top and bottom ten backspin changers over last year:


Alex Bregman, on the other hand, had the sixth largest increase in backspin of all Qualified Players. Take a look at a comparison of Exit Velocity (EV), Launch Angle (LA) and Distance for the two players on well-hit fly balls (EV>=90, LA>=15).


Both Betts and Bregman had an EV increase of approximately one MPH. The change in the launch angle profile between the two hitters is significant – Betts added five degrees of launch angle compared to Bregman’s two-degree reduction. Betts should have had a distance gain; however, the fact that he didn’t is actually a positive based on the data. Thus, while Betts is showing a 13 ft. distance decrease over last year, Bregman had a 14 ft. increase. Most of Bregman’s distance increase is from backspin – a very unhealthy source based on the data.

While beyond the scope of this research note, the mechanical drivers responsible for changes in spin are Vertical Bat Angle, the amount of Explicit Swing Loft (also referred to as “Attack Angle”), and the ball contact point (above or below the ball equator). Backspin increases with lower levels of Vertical Bat Angle and Explicit Swing Loft (Attack Angle) while “square” contact increases with larger values. More to follow on this in a future post. Because of the link between swing path quality and backspin, using distance as a performance metric in isolation is highly problematic – and can lead one to the opposite conclusion in projecting performance. In other words, it matters where the distance change is coming from.

In addition to the amount of backspin, other metrics such as the Standard Deviation of Launch Angle and a player’s IFFB% also have a strong relationship to the quality of a player’s swing path. Using a quartile ranking system for each of the three metrics, four players were in the top and bottom quartiles for all metrics in both 2017 and 2018. The difference in performance of the two groups is quite telling:


Wow! Considering only swing path quality metrics, the performance between the two groups is worlds apart.

To get a sense of the magnitude of the change for Mookie Betts in 2018, he was in the fourth quartile for all three metrics above in 2017. He moved from the fourth to the second quartile in backspin, fourth to first in Standard Deviation of Launch Angle, and fourth to second in IFFB%. Alex Bregman, on the other hand, moved into the fourth quartile for all three swing path quality metrics in 2018.

I have followed Bregman’s swing for some time and have made some timely performance predictions (in both directions) based on video. The backspin and swing path quality data, on the other hand, point to longer term issues that may not surface immediately. After all, backspin improves the performance of balls hit but is inversely related to player performance given sufficient frequency (i.e. PAs). Thus, getting the precise timing of a performance shift based on the above data is difficult. However, without a swing path change for Bregman, the odds suggest that significant regression is not a matter of “if” but “when”.


Analyzing Underlying Factors Impacting Tickets Sold for Major League Baseball Games

I. Introduction

In 2017, Major League Baseball exceeded 10 billion dollars in total revenue for the first time. Ticket sales were a major component, making up 29.84 percent of this revenue (Statista.com). Because fans continue to spend money once inside the stadium, 29.84 percent is merely a lower bound on the revenue that ticket sales generate. For example, the average 2017 ticket price was 31 dollars; however, once inside the stadium, fans spent an average of 16 additional dollars on food (Statista.com).

II. Data

The data for this project are in an unbalanced panel format and contain 60,705 observations from 35 teams spanning from 1992 to 2017. Other than the 2017 season data, which I collected myself from baseballreference.com, the data from 1990 to 2016 were scraped from baseballreference.com by Troy Hepper, a consultant at Morgan Franklin Consulting, and shared on his github.com page.

Descriptive statistics of my game-by-game data are displayed in Table 1. The dependent variable is the percentage of tickets sold relative to a stadium’s capacity (PERCENTSOLD). PERCENTSOLD ranges drastically from a little under 2 percent to over 150 percent with a mean of around 66 percent. PERCENTSOLD is sometimes greater than 100 percent because for certain important games ticket sales exceed stadium capacity; however, only 76 out of 60,705 observations exceed 110 percent and these outliers have almost no effect on the estimated coefficients in the models.

The explanatory variables in this model are designed to control for the time effects of when a baseball game was played, the quality of the home team, and the quality of the opponent. To control for the time that a game was played, indicators for the month and year are included in the model. To control for day of the week and whether or not the game was played at night or during the day, four dummy variables were created indicating whether or not a game was a night game during the week (NIGHTWEEKDAY), a day game during the week (DAYWEEKDAY), a night game during the weekend (NIGHTWEEKEND), or a day game during the weekend (DAYWEEKEND). Due to the immense popularity of the first game of the season, an indicator variable for Opening Day is also used.
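As a rough illustration of how the four game-time dummies could be built from a first-pitch timestamp, consider the sketch below. The paper does not state its cutoffs, so two things here are assumptions: that "weekend" means Saturday and Sunday, and that a "night" game starts at 5 p.m. local time or later. The function name `game_time_dummies` is also mine.

```python
from datetime import datetime


def game_time_dummies(start, night_hour=17):
    """Return the four mutually exclusive game-time indicators.

    Assumptions (not stated in the paper): the weekend is Saturday and
    Sunday, and a night game starts at or after `night_hour` local time.
    """
    weekend = start.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
    night = start.hour >= night_hour
    return {
        "NIGHTWEEKDAY": int(night and not weekend),
        "DAYWEEKDAY": int(not night and not weekend),
        "NIGHTWEEKEND": int(night and weekend),
        "DAYWEEKEND": int(not night and weekend),
    }


# A Tuesday evening start and a Saturday afternoon start.
print(game_time_dummies(datetime(2017, 7, 4, 19, 5)))
print(game_time_dummies(datetime(2017, 7, 8, 13, 5)))
```

Exactly one indicator is set for any game, so one of the four has to be left out of a regression to serve as the base case.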

The quality of the home team is assessed using both information on payroll and playoff chances. Better teams have better players and since players are paid based on skill and production, better teams consistently have higher payrolls. The payroll variable created here is the percentage deviation from league average payroll (HOMEDEVIATION). The minimum percentage deviation is a little under 20 percent of the league average while the maximum is over 280 percent of the league average. A standard deviation of a little under 40 percentage points shows the consistent variability of team payroll throughout the data. The playoff chances of a team are weighted by the number of games back or up they are on the guaranteed divisional playoff spot.

The quality of the visiting team is assessed using information on payroll and the opponent’s relationship with the home team. Fans want to come to the park to see good teams play, so more attractive visiting teams will consistently have higher payrolls. The visiting team’s payroll variable (AWAYDEVIATION) is constructed the same way as the home team’s payroll discussed above. Because fans want to see their teams make the playoffs and the best way to do this is by beating division rivals, an indicator variable to assess the draw of a divisional game is used as well.

III. Regression Specification and Results

To better understand the relationship between the explanatory variables and the long-run demand for tickets, the data were analyzed using three panel data estimation techniques: one-way fixed effects, two-way fixed effects, and random effects models. For these data, a fixed effects model is the better fit because the unobserved metric of fan loyalty, which is constant over time, correlates very strongly with the two explanatory variables that control for payroll. Fan loyalty can be treated as constant over time because some teams, like the Chicago Cubs, are deeply ingrained in the culture of their cities, and their fan bases remain loyal no matter what. Other teams, like the Oakland Athletics, have fan bases that consistently disregard them and never become engaged. Because loyal fans spend more money and demand higher quality teams, owners of these teams must spend more on players. For this reason, payroll is correlated highly with the omitted variable, fan loyalty, making the use of fixed effects essential for unbiased coefficient estimates.

The results of the three separate panel estimation techniques are recorded in Table 2; however, this paper will focus on the results of the following two-way fixed effects model:
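The equation itself is not reproduced in this version of the text. A specification consistent with the variables described in Section II might look like the sketch below. It is a reconstruction under stated assumptions and not necessarily the exact form estimated: the labels OPENINGDAY, GAMESBACK, DIVISION, and MONTH are assumed names, April and NIGHTWEEKEND are taken as the omitted base categories, and the team and season fixed effects are written as alpha and lambda.

```latex
\begin{aligned}
\mathrm{PERCENTSOLD}_{TSG} ={}& \alpha_T + \lambda_S
  + \sum_{m \neq \mathrm{April}} \gamma_m \, \mathrm{MONTH}^{(m)}_{TSG} \\
 &+ \beta_1 \, \mathrm{NIGHTWEEKDAY}_{TSG}
  + \beta_2 \, \mathrm{DAYWEEKDAY}_{TSG}
  + \beta_3 \, \mathrm{DAYWEEKEND}_{TSG} \\
 &+ \beta_4 \, \mathrm{OPENINGDAY}_{TSG}
  + \beta_5 \, \mathrm{HOMEDEVIATION}_{TSG}
  + \beta_6 \, \mathrm{AWAYDEVIATION}_{TSG} \\
 &+ \beta_7 \, \mathrm{GAMESBACK}_{TSG}
  + \beta_8 \, \mathrm{DIVISION}_{TSG}
  + \varepsilon_{TSG}
\end{aligned}
```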

In this model, T represents the team, S represents the season, and G represents the gth home game for each season. An interesting conclusion is that except in the case of DAYWEEKEND, both the fixed and random effects estimation have the same sign and approximate magnitudes for each coefficient.

In the two-way fixed effects model, all variables except the time fixed effect for 1996 are significant at any standard level. The largest coefficient is that of the Opening Day dummy, which is associated with an estimated 38.7 percentage point increase in percentage of tickets sold. Interestingly, the year dummy variable shows an approximate 11 percentage point drop in PERCENTSOLD in 1995 in comparison to 1994. This drop is most likely due to the disdain toward baseball that fans developed following the players’ strike of 1994. Another interesting league-wide trend is the approximate 4 percentage point drop in PERCENTSOLD from 2007 to 2009 during the Great Recession. For the average sized stadium, a drop of this size would result in a decrease of a little over 1,700 fans per game. According to statista.com, the average ticket price in 2009 was 26.6 dollars. Thus, the resulting setback of losing 1,700 fans paying 26.6 dollars per game over the course of 81 home games would be around 3.7 million dollars. According to the Hardball Times, league average revenue in 2007 was 171 million dollars, so for the average team, a 3.7 million dollar drop in revenue in 2009 would amount to a decline of around two percent from ticket sales alone. This is economically significant for a profit maximizing firm like a baseball team.
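The recession arithmetic above is easy to check. The sketch below uses only the figures quoted in the paragraph; the variable names are mine, and the implied stadium capacity is simply backed out from the 1,700-fan and 4-point figures rather than taken from the data.

```python
fans_lost_per_game = 1_700             # from a 4 percentage point drop
ticket_price_2009 = 26.6               # dollars, per statista.com
home_games = 81
league_avg_revenue_2007 = 171_000_000  # dollars, per the Hardball Times

lost_revenue = fans_lost_per_game * ticket_price_2009 * home_games
share_of_revenue = lost_revenue / league_avg_revenue_2007

# Capacity implied by "4 points is about 1,700 fans" (a back-of-envelope value).
implied_capacity = fans_lost_per_game / 0.04

print(f"Lost ticket revenue: ${lost_revenue:,.0f}")
print(f"Share of 2007 average revenue: {share_of_revenue:.1%}")
print(f"Implied average capacity: {implied_capacity:,.0f}")
```

The loss comes out a bit under 3.7 million dollars, or roughly 2 percent of the 2007 average, in line with the figures in the text.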

Using April as the base case, the coefficients of all other month dummies are positive. This indicates that the first month of the season is the weakest month for PERCENTSOLD. Notably, July and August lead all months, with an estimated 13 to 14 percentage point increase in PERCENTSOLD in comparison to April. Economically, scheduling more games in July and August and more off days in April would increase revenue. The gain is small, however: if three more games were moved into July and August, the additional fans paying the 2017 average price of 31 dollars per ticket would bring in a little over 500,000 dollars, an economically insignificant increase of about 0.2 percent in revenue.
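The scheduling estimate can be roughly reproduced as below. This is a sketch under assumptions: the stadium capacity is not stated in the text, so it is backed out from the earlier statement that a 4 percentage point drop equals about 1,700 fans, and 13.5 is simply the midpoint of the 13 to 14 point range. The function name is hypothetical.

```python
def extra_ticket_revenue(games_moved, pct_point_gain, capacity, ticket_price):
    """Added revenue from moving games into a month with higher PERCENTSOLD."""
    extra_fans_per_game = capacity * pct_point_gain / 100
    return games_moved * extra_fans_per_game * ticket_price

# Assumption: capacity implied by "4 percentage points is about 1,700 fans".
implied_capacity = 1_700 / 0.04

gain = extra_ticket_revenue(
    games_moved=3,
    pct_point_gain=13.5,   # assumed midpoint of the 13 to 14 point range
    capacity=implied_capacity,
    ticket_price=31,       # 2017 average ticket price from the text
)
print(f"implied capacity: {implied_capacity:,.0f} seats")
print(f"added revenue: ${gain:,.0f}")
```

Under these assumptions the gain comes out a little above 530,000 dollars, in line with the "little over 500,000 dollars" quoted above.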

The indicator variables designed to control for game time and game placement during the week also shed light on what type of games maximize PERCENTSOLD. In the model, NIGHTWEEKEND was left out as the base case and the coefficients of the other three dummies were negative. This tells us that weekend games played at night are the most popular. DAYWEEKEND has the smallest effect, decreasing PERCENTSOLD by around 1 percentage point, while NIGHTWEEKDAY has the largest, decreasing PERCENTSOLD by 14 percentage points.

The coefficient of HOMEDEVIATION implies that a 50 percentage point increase in HOMEDEVIATION would result in a 14 percentage point increase in PERCENTSOLD. The other assessment of the home team, games back from the playoffs, predicts that with a five-game lead in the division a team will see an approximate 2.5 percentage point increase in PERCENTSOLD, while with a ten-game deficit a team will see a 5 percentage point decrease in PERCENTSOLD. This variable is particularly effective because on Opening Day everyone is 0 games back from the playoffs, so it has no effect, but as the season continues and the games back variable becomes smaller or larger, its increased effect over the course of the season is naturally weighted in the model.

AWAYDEVIATION has a smaller coefficient than HOMEDEVIATION, but it is also positive and statistically significant. The effect of the opponent also shows up in the divisional game dummy, which tells us that if an opponent is in a team’s division, the percentage of tickets sold increases by a little under 1 percentage point. Although the divisional dummy is statistically significant, even if in 2017 MLB had scheduled 40 more games against divisional opponents for each team, this change would have added under 500,000 dollars in revenue and increased total revenue by less than 0.2 percent, which is an economically insignificant change.

Overall, the data seem to tell the story that one would expect; however, it is always nice to attempt to quantify these relationships. For further information, the author can be contacted at marinojc@kenyon.edu.


Exploring Batter xwOBA and its Applications, Part 1

We are around the halfway point of the fourth season for which we have had Statcast data. One of the primary metrics created with Statcast data, introduced on the excellent Baseball Savant, is xwOBA (expected weighted on-base average), which I have noticed being adopted more for public analysis, including at this site.

The primary component of xwOBA is a statistical model that estimates the wOBA that each batted ball is expected to have produced based on its exit velocity and launch angle. In addition, actual strikeouts, walks, and times hit by a pitch are added in, as is done in the normal wOBA formula.
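To make that structure concrete, here is a minimal sketch of how the pieces might be combined. It is not Baseball Savant's implementation. The walk and hit-by-pitch weights are illustrative placeholders (the real wOBA weights change every season), the denominator is simplified to batted balls plus strikeouts, walks, and hit by pitches, and the per-ball expected values would come from the exit velocity and launch angle model, which is not reproduced here.

```python
# Illustrative linear weights only; actual wOBA weights vary by season.
W_BB = 0.69
W_HBP = 0.72

def xwoba(expected_woba_on_contact, strikeouts, walks, hit_by_pitch):
    """Combine modeled batted-ball values with actual K, BB, and HBP.

    expected_woba_on_contact: one model estimate per batted ball.
    Strikeouts add nothing to the numerator but count in the denominator.
    """
    numerator = (sum(expected_woba_on_contact)
                 + W_BB * walks
                 + W_HBP * hit_by_pitch)
    plate_appearances = (len(expected_woba_on_contact)
                         + strikeouts + walks + hit_by_pitch)
    if plate_appearances == 0:
        raise ValueError("no plate appearances")
    return numerator / plate_appearances

# Hypothetical batter: four batted balls, two strikeouts, one walk.
example = xwoba([1.4, 0.6, 0.05, 0.0], strikeouts=2, walks=1, hit_by_pitch=0)
print(f"xwOBA: {example:.3f}")
```

The point of the sketch is that only the batted-ball part is "expected"; the strikeout, walk, and hit-by-pitch totals enter as they actually happened.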

There have been some explorations this year into the potential for predictive value added by xwOBA for pitchers, by Craig Edwards and Jonathan Judge, and batters, by Tom Tango and recent major leaguer Nate Freiman.

The pieces related to pitchers indicate what we would expect from our traditional DIPS principles: there is little evidence that pitchers have enough control over their results on balls in play to make including balls in play particularly worthwhile. For batter xwOBA, the pieces by Tom Tango and Nate Freiman serve as good jumping off points for a deeper dive, which is what I would like to present here (now that I’m finally done dragging my feet on writing this for a couple of months).

There is nothing too crazy presented here – think of this as a PSA on what batter xwOBA does, what goes into it, and why it is more of a stepping stone to future Statcast-based predictive metrics than something you should apply in a forward-looking manner today.

What does xwOBA do and what does that mean?

At the beginning of the article, I introduced the primary component of xwOBA as a statistical model that estimates wOBA for batted balls based on their exit velocity and launch angle. This more or less regresses the results of all batted balls to the mean wOBA we would expect of them without impact from or knowledge of the defense or park in which they were hit. In this way, it strips out a form of what could be called “BABIP luck” or “batted ball luck” that is associated with those things it does not include.

This is potentially powerful for predicting future performance, though it is not a predictive metric. In the case of batters, we know that they have substantially more control over their batted ball results than pitchers, generating a much wider range of BABIP and HR/FB% on a year-to-year or career basis than pitchers. Therefore, including analysis of balls in play for batters makes much more sense than for pitchers, which batter xwOBA could help to do.

However, while I have been seeing xwOBA regularly used to comment on early season breakouts or slumps, I have not come across a close look under the hood of batter xwOBA to both test its possible predictive capabilities and identify what sources of noise or “batted ball luck” it leaves in. Let’s see what we can find out.

What goes into xwOBA?

Statcast Quality of Contact Categories

To start, I decided to use some of the new “quality of contact” categories that the Statcast crew have defined. You’ve probably heard of barrels, the category that produces the highest wOBA (1.445, according to my calculations*), consisting generally of very hard hit fly balls and high line drives. It’s also the category seemingly most indicative of skill and thus signal amidst the noise, which is why it is the only one regularly used so far. The other five categories do contribute to xwOBA though, so let’s look at a quick summary of them.

*most of the numbers I use in here will be based on what I calculated using R from 2015-2017 Baseball Savant data, which may differ very slightly for a variety of reasons from what you see elsewhere – including, most likely, my personal failures. 

Statcast Quality of Contact Type Summary (2015-2017 data)

Table 1 - Quality of Contact Summary

Some of those names are more self-explanatory than others – if you would like to know more specifics, here is a Tom Tango blog post explaining them as well as providing some visualizations to help.

Aside from the specifics of what each of the six quality of contact types refers to, the takeaway should be this: While barrels contribute the highest wOBA on average and are most representative of skill, well over 90% of batted balls are not barrels. Expected results on these non-barreled balls are still fed into the xwOBA model. For batters, how much less indicative of skill are these other batted balls? And if they are less indicative of skill, are they useful to include?

First, let’s simply look at how each quality of contact type correlates year-to-year. Unfortunately, we only have three full seasons of data to compare, but let’s do what we can. For players with at least 300 batted balls in each year, I calculated the year-to-year R² value for the rate at which players hit each quality of contact type. (e.g. 2015 Barrels/batted ball to 2016 Barrels/bb)
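The year-to-year comparison described above can be sketched as follows. My actual numbers were calculated in R; this Python version is just an illustration, and the function names and the sample players are hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def year_to_year_r2(year_a, year_b, min_batted_balls=300):
    """R² of a per-batted-ball rate between two seasons.

    year_a, year_b: dicts mapping player -> (event_count, batted_balls).
    Only players meeting the minimum in BOTH seasons are compared.
    """
    shared = [p for p in year_a
              if p in year_b
              and year_a[p][1] >= min_batted_balls
              and year_b[p][1] >= min_batted_balls]
    rates_a = [year_a[p][0] / year_a[p][1] for p in shared]
    rates_b = [year_b[p][0] / year_b[p][1] for p in shared]
    return r_squared(rates_a, rates_b)

# Made-up barrel counts for four fictional hitters.
barrels_2015 = {"A": (30, 400), "B": (12, 350), "C": (45, 420), "D": (5, 120)}
barrels_2016 = {"A": (28, 410), "B": (15, 330), "C": (40, 400), "D": (9, 310)}
print(f"R²: {year_to_year_r2(barrels_2015, barrels_2016):.2f}")
```

Player "D" is dropped because he falls short of 300 batted balls in the first season, mirroring the minimum applied in the tables below.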

Year-to-Year R² of Statcast Quality of Contact Types

Table 2 - Year to year correlations for Quality of Contact types

^Red denotes categories that produce poor batting results, green denotes good batting results

From the above table, we can get a sense of why the Statcast crew has focused on barrels – they are the only quality of contact type that produces both above average results and quite a bit of year-to-year reliability. Balls categorized as “topped” or “hit under” appear to approach barrels in reliability, but are worth very little. The “flares and burners” and “solid contact” categories produce close to half the value of barrels, but are far less reliable on a year-to-year basis.

For comparison, below are the year-to-year R² values for a few other things for the same set of hitters. Each of these metrics refers to the number of occurrences of that event per plate appearance.

Year-to-Year R² of Some “per PA” Metrics

Table 3 - Year to year correlations for other plate appearance metrics

This is pretty cool to me. Barrels per plate appearance or per batted ball seem to be in at least the same vicinity of year-to-year reliability as K% and BB%, which are two of the most important simple analysis tools out there for hitters. Barrel% is also a distinct step above HR% in both sets of years compared.

But, what I really wanted to test going into this was smaller sample reliability, given the usage of xwOBA in so many early season articles.

In the following tables are R² values for the same quality of contact and per PA metrics we have discussed so far, but instead of looking at year-to-year R², we are testing the relationship between roughly the first third of a season (before June 1st) and the final two thirds of a season (June 1st onward).

R² Comparing Pre-June 1st to June 1st Onward – Statcast Quality of Contact Types

Table 4 - Pre and post June 1st correlations for Quality of Contact types

R² Comparing Pre-June 1st to June 1st Onward – Some “per PA” Metrics

Table 5 - Pre and post June 1st correlations for other PA metrics

Note: I simply proportionally adjusted my batted ball minimums for batters in this sample (batters with min. 100 bb before June 1st and min. 200 bb from June 1st onward), weirdly producing 149 batters in each year…
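For anyone wanting to try the split themselves, below is a rough sketch of the pre-June 1st and June 1st onward bookkeeping. It assumes a simple list of (player, game date, contact type) records for a single season, which is my own hypothetical layout rather than the Baseball Savant export format.

```python
from collections import defaultdict
from datetime import date

def split_rates(batted_balls, contact_type, season, min_early=100, min_late=200):
    """Per-batted-ball rate of one contact type before and from June 1st.

    batted_balls: iterable of (player, game_date, contact_type) tuples
    from a single season. Minimums must be at least 1.
    Returns {player: (early_rate, late_rate)} for players meeting both minimums.
    """
    cutoff = date(season, 6, 1)
    # Per player: [early matches, early total, late matches, late total]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for player, game_date, kind in batted_balls:
        offset = 0 if game_date < cutoff else 2
        counts[player][offset + 1] += 1
        if kind == contact_type:
            counts[player][offset] += 1
    return {
        player: (c[0] / c[1], c[2] / c[3])
        for player, c in counts.items()
        if c[1] >= min_early and c[3] >= min_late
    }

# Tiny made-up sample with lowered minimums so it returns something.
sample = [
    ("x", date(2017, 4, 10), "barrel"),
    ("x", date(2017, 5, 31), "topped"),
    ("x", date(2017, 6, 1), "barrel"),
    ("x", date(2017, 8, 1), "barrel"),
    ("y", date(2017, 4, 2), "barrel"),
]
print(split_rates(sample, "barrel", 2017, min_early=2, min_late=2))
```

The early and late rates for each qualifying player can then be compared with an R² calculation across players, just like the year-to-year version.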

In general, of course, these R² values are a bit worse than the year-to-year ones. Strikeouts and barrels look the best here, with the next tier probably being topped, hit under, and walks.

What struck me most was something I figured I would find here: flares and burners take a significant hit in this smaller sample. How many flares and burners a player hits through a couple of months tells you very little about how many they will hit for the rest of the season.

To help visualize this, below are two graphs from the 2017 “pre-June 1st to June 1st onward” comparison: flares and burners per batted ball (R² = 0.11) and barrels per batted ball (R² = 0.64).

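[Figure: 2017 flares and burners per batted ball, pre-June 1st vs. June 1st onward]

[Figure: 2017 barrels per batted ball, pre-June 1st vs. June 1st onward]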

There is no doubt here that barrels are more indicative of a repeatable skill in partial season samples than flares and burners. (I want to say thanks to Aaron Judge for stretching out the barrels graph, by the way.)

This is why, earlier in the article, I said that xwOBA only strips out certain types of batted ball luck. In a small sample, players could hit some extra soft line drives, hard ground balls, or bloop singles instead of cans of corn or weak grounders, causing them to have an uncharacteristically high wOBA and xwOBA. Our analysis to this point deems knowing about those flares and burners to be not very useful for assessing a batter’s future results partway through a season.

But how much of an impact could that possibly have? Well, I calculated that flares and burners produced a .633 wOBA from 2015-2017 while making up about a quarter of all batted balls. According to FanGraphs, the highest wOBA ever recorded in a qualified batting season was .598 by Babe Ruth in 1920.

So yes, I think that lucking into some extra peak Babe Ruth plate appearances could have a relevant impact on a batter’s small sample xwOBA.
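To put a rough number on it, here is a back-of-the-envelope sketch. It assumes the extra flares and burners replace batted balls worth a wOBA of zero (routine outs), which overstates the swing somewhat, and the batter and sample size are made up.

```python
FLARE_BURNER_WOBA = 0.633  # 2015-2017 average from the text

def woba_after_swap(base_woba, plate_appearances, extra_flares, replaced_woba=0.0):
    """wOBA after some plate appearances turn into average flares and burners.

    replaced_woba: assumed average value of the batted balls being replaced
    (0.0 treats them all as outs, an optimistic simplification).
    """
    gain = extra_flares * (FLARE_BURNER_WOBA - replaced_woba)
    return base_woba + gain / plate_appearances

# Hypothetical: a .320 wOBA hitter through 200 plate appearances who
# lucks into five extra flares and burners in place of routine outs.
lucky = woba_after_swap(0.320, 200, 5)
print(f"wOBA with five extra flares and burners: {lucky:.3f}")
```

Under those assumptions, five fortunate batted balls are worth roughly 16 points of wOBA over 200 plate appearances, which is enough to change how an early season line reads.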

Up next

We have covered a lot so far, so I will break things here. In Part 2, we will look at a similar analysis on wOBA and xwOBA themselves, see if we can create a simpler metric than xwOBA that is comparably predictive in small samples, and discuss how the Statcast crew is likely working to create predictive metrics based on Statcast data (since that’s not what xwOBA is, making this analysis pretty unfair to them!).