Hello World! As a software developer, automation is my way of life. It kills me to see the tedious yet important job of calling balls and strikes performed at less than 90% accuracy. Worse, catcher framing is now a thing, which is essentially baseball’s equivalent of selling the flop.
Today, I want to talk about how automating the strike zone would affect the MLB run-scoring environment. Don’t we all want to save the environment?
Let’s pretend that before the 2014 season, home plate umpires were fitted with earpieces giving them a simplified Pitch f(x) feed of balls and strikes. They heard a high beep for a strike, a low beep for a ball. They then called balls/strikes exactly as they were told, resulting in a perfect zone.
The most damaging ball/strike errors happen when ball 4 or strike 3 is thrown but not called. Sometimes the umpire is redeemed by luck, and a walk/strikeout happens eventually anyway, but not nearly every time. Think of how many times you’ve seen a 3–0 count where a ball was called a strike, only to have the hitter swing and ground out harmlessly on the 3–1 pitch.
For these experiments, let’s look at a short description of each situation, the number of instances of that situation in 2014, and the net runs that would have been added if a perfect zone had been called.
Data courtesy of Baseball Savant; click on a situation to see the query I used.
Are you surprised? The umpires made 545 more extra outs than extra ‘safes’. Using a rough walk-minus-out run differential of 0.6 runs, we see that a perfect zone would have added about 0.07 runs per team per game. Interesting, but not huge.
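If you want to check that arithmetic yourself, here is a minimal sketch. The 545 and 0.6 come straight from the paragraph above; the 4,860 team-games (30 teams times 162 games) is my own assumption about the denominator, chosen because it matches the per-team scale of the 4.07 runs/game figure used later.

```python
# Figures quoted in the post
NET_EXTRA_OUTS = 545          # extra outs minus extra 'safes' on missed ball 4 / strike 3
WALK_MINUS_OUT_RUNS = 0.6     # rough run value of a walk relative to an out

# Assumption (not stated in the post): 30 teams x 162 games each
TEAM_GAMES = 30 * 162


def added_runs_per_team_game(net_calls, run_value, team_games=TEAM_GAMES):
    """Convert a net count of miscalls into runs per team per game."""
    return net_calls * run_value / team_games


total_runs = NET_EXTRA_OUTS * WALK_MINUS_OUT_RUNS
per_game = added_runs_per_team_game(NET_EXTRA_OUTS, WALK_MINUS_OUT_RUNS)
print(f"{total_runs:.0f} runs over the season, {per_game:.2f} per team per game")
```

Counting each game once instead (2,430 games) would double the per-game figure, so the 0.07 only works on a per-team basis.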
But think again: this effect isn’t limited to plate appearances that should have ended on the miscalled pitch. We all know that the count affects the expected run value all on its own. So let’s expand this to all ‘bad calls’ in 2014.
Balls and strikes don’t obviously translate to runs. So I’ll lean on someone else’s much more careful research and use a ball-minus-strike run value of approximately 0.14 runs. Here’s what happens when we apply a perfect zone to all balls and strikes. Brace yourself!
Whoa. Are you kidding me? If we’d run last season with a perfect strike zone, the run environment would have gone from 4.07 runs/game to nearly 5! That would be the highest level since 2000. I know what you’re thinking: this is crazy, and probably wrong.
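For a sense of scale, here is a rough sketch that works backwards from that jump to the number of net strike-to-ball corrections it would take. The 4.07 and 0.14 are from the post; the 5.0 is only a ceiling for “nearly 5”, and the 4,860 team-games (30 teams times 162 games) is my own assumption, so treat the result as an upper bound and not as the actual count.

```python
# Figures quoted in the post
RUNS_BEFORE = 4.07             # 2014 runs per team per game
BALL_MINUS_STRIKE_RUNS = 0.14  # approximate run value of a ball relative to a strike

# Assumptions (not from the post)
RUNS_AFTER_CEILING = 5.0       # the post only says "nearly 5", so this is a ceiling
TEAM_GAMES = 30 * 162          # 30 teams x 162 games each


def implied_net_calls(runs_before, runs_after, run_value, team_games=TEAM_GAMES):
    """Net miscalls needed to move the run environment by the given amount."""
    return (runs_after - runs_before) * team_games / run_value


ceiling = implied_net_calls(RUNS_BEFORE, RUNS_AFTER_CEILING, BALL_MINUS_STRIKE_RUNS)
print(f"At most about {ceiling:,.0f} net strike-to-ball corrections")
```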
I also found this result to be larger than expected, to say the least. So let’s back up, check the mirrors, and look at the frequency of called strikes vs. balls.
There are a ton more called balls than called strikes. This makes sense because batters are more likely to swing at strikes. But the ratio of balls to strikes is only about 2:1, which doesn’t account for the 5:1 ratio among ‘mistaken’ balls/strikes! How do we account for this?
Here we dive into speculation, but stay with me for a minute. Maybe there’s a logical explanation.
What sequence of events must occur in order for a Pitch f(x) strike to become a ball?
By this ridiculously rough method, we would expect bad ‘ball’ calls about 1.6% of the time (0.10 * 0.35 * 0.45 ≈ 0.016). Compare that with the observed value of 1.2%.
Conversely, the sequence for a Pitch f(x) ball becoming a called strike is as follows:
We therefore expect bad ‘strike’ calls about 5.8% of the time (0.15 * 0.7 * 0.55 ≈ 0.058). Again, compare that to the observed value of, wait for it, 5.7%. Boom!
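Both estimates are just three rough probabilities multiplied together, so they are easy to reproduce. A minimal sketch, using only the numbers quoted above (the variable names and dictionary keys are my own placeholders, not the author’s):

```python
from math import prod

# Rough step probabilities quoted in the post, in the order given
strike_called_ball_steps = (0.10, 0.35, 0.45)
ball_called_strike_steps = (0.15, 0.70, 0.55)

# Observed bad-call rates quoted in the post
observed = {"strike called ball": 0.012, "ball called strike": 0.057}

# Each expected rate is the product of its chain of step probabilities
expected = {
    "strike called ball": prod(strike_called_ball_steps),
    "ball called strike": prod(ball_called_strike_steps),
}

for name in expected:
    gap = expected[name] - observed[name]
    print(f"{name}: expected {expected[name]:.4f}, "
          f"observed {observed[name]:.4f}, gap {gap:+.4f}")
```

Since every input is an eyeballed guess, a gap of a few tenths of a percentage point says more about the rounding than about the umpires.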
I welcome your comments, criticisms, or even praise 🙂