Author: Paul Schale

Author Archive

Getting Ejected Works

April 23, 2019

Getting mad at an umpire, and then tossed from the game, may seem like an ineffective display of emotion since calls are never reversed after a little more yelling. But what about future calls? In order to answer this question, we need good data on a large number of adjudicated events. Close out and safe calls happen fairly rarely, and good data quantifying how close the play was would be difficult to collect. But the home plate umpire calls balls and strikes for every batter, and pitches at the edges of the zone provide plenty of opportunities to grow or shrink the zone slightly.

It’s difficult to measure the zone in a particular game since there aren’t enough pitches at each spot on the boundary of the zone, but by combining data from many games, we can get a clear idea of what the average zone looks like. As for quantifying the zone, it’s easy to get carried away with details (location of each side, correcting for player height, etc.), but with enough data, all of those variables should average out and we can focus on the simplest measure: zone size.

During the past four years, there have been 308 games featuring an ejection over the strike zone, containing about 47,000 pitches. Splitting by team (team with ejected player/coach/manager and opposing team) and before/after the ejection, we have groups with between 9,500 and 14,000 pitches, plenty for a good estimate of the strike zone.

The results, shown below, show two clear trends: first, one team is clearly justified in being upset as their hitters face a larger zone. Second, we see that umpires fix this, even over-correcting slightly, after making an ejection.

Umpires are Human

We all see the humanity of umpires in their fallibility, but it shows in other ways too: the zone shrinks on 0-2 counts and expands on 3-0 ones, showing that they don’t like ending an at-bat with their own judgement call. This doesn’t mesh well with the fiery persona of the umpire and their emotive strike-three calls, but we have to remember that they are playing a part, and their main goal is to keep the game firmly in their control. We see more evidence of this here: if umpires ejected arguing players out of a sense of holy wrath, we would expect no change in the strike zone at all.

Instead, we see a clear reaction in the direction that the arguing player desires. While the data cannot point to the exact mechanism, I see two distinct explanations: signaling and aversion to conflict.

In the signaling hypothesis, we suggest that players are frequently sending messages to the umpire, but the umpire considers these messages according to the cost in sending it. A few words muttered under their breath doesn’t cost them anything, and so it is usually ignored. An ejection is costly, so the umpire takes that signal seriously.

The second hypothesis is a simple human aversion to being yelled at in front of a crowd of thousands. It’s not a fun experience for anyone, so they take action to avoid it happening again.

About the Models

To measure the zone, I took two approaches, k-nearest neighbor (which knows nothing about the expected shape of the strike zone) and a logistic regression based model (which looks for a rounded rectangle). Error estimates were calculated using bootstrapped samples. Both gave similar results, and the code and data behind this post are available on Kaggle.

BAL	CHW	ATH
BOS	CLE	HOU
NYY	DET	LAA
TBR	KCR	SEA
TOR	MIN	TEX

ATL	CHC	ARI
MIA	CIN	COL
NYM	MIL	LAD
PHI	PIT	SDP
WSN	STL	SFG