Automate the Strike Zone, Unleash the Offense

Hello World! I’m a software developer, and automation is my way of life. It kills me to see the tedious yet important job of calling balls and strikes performed at less than 90% accuracy. Worse, catcher framing is now a thing, which is essentially baseball’s equivalent of selling the flop.

Today, I want to talk about how automating the strike zone would affect the MLB run-scoring environment. Don’t we all want to save the environment?

Let’s pretend that before the 2014 season, home plate umpires were fitted with earpieces giving them a simplified PITCHf/x feed of balls and strikes. They heard a high beep for a strike and a low beep for a ball, and they called every pitch exactly as they were told, resulting in a perfect zone.

Experiment 1: Walks/Strikeouts overturned

The most damaging ball/strike errors happen when ball 4 or strike 3 was thrown but not called. Sometimes the umpire is redeemed by luck, and a walk/strikeout happens eventually anyway, but not nearly every time. Think of how many times you’ve seen a 3–0 count where a ball was called a strike, only to have the hitter swing and ground out harmlessly on the 3–1 pitch.

For these experiments, let’s look at a short description of each situation, the number of instances of that situation in 2014, and the net runs that would have been added if a perfect zone had been called.

Data courtesy of Baseball Savant; click on a situation to see the query I used.

Situation | Instances | Net Runs (Rough)
Strike 3 thrown, batter safe | 146 | -88
Ball 4 thrown, eventual out | 691 | 415
Difference | 545 | 327 (0.07 runs per game per team)

Are you surprised? The umpires made 545 more extra outs than extra ‘safes’. Multiplying by a rough walk-minus-out run differential of 0.6 runs, we see that a perfect zone would have added about 327 runs, or 0.07 runs per team per game. Interesting, but not huge.
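For anyone who wants to check the arithmetic, here is a minimal Python sketch of the table above. The counts and the 0.6 run value come from this post. The 4,860 team-games (a full 2,430-game schedule, two teams per game) is an assumption about the denominator, and the function and variable names are made up for illustration.

```python
# Rough check of Experiment 1.
# Counts and the 0.6 run value are from the table above.
# TEAM_GAMES is an assumption: 2,430 scheduled games x 2 teams per game.
WALK_MINUS_OUT_RUNS = 0.6
TEAM_GAMES = 2430 * 2


def net_runs(missed_ball_fours, missed_strike_threes, run_value=WALK_MINUS_OUT_RUNS):
    """Runs a perfect zone would add: extra walks minus extra strikeouts, times the run value."""
    return (missed_ball_fours - missed_strike_threes) * run_value


added = net_runs(missed_ball_fours=691, missed_strike_threes=146)
per_team_game = added / TEAM_GAMES
print(f"{added:.0f} runs added, {per_team_game:.2f} per team per game")
```

Swapping in a different run value is a one-argument change, which makes it easy to see how sensitive the result is to that rough 0.6.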

But think again—this effect isn’t limited to plate appearances that should have ended with a bad call. We all know that the count affects the expected run value all on its own. So let’s expand this to all ‘bad calls’ in 2014.

Experiment 2: All balls/strikes called correctly

Balls and strikes don’t obviously translate to runs. So I’ll borrow someone else’s much more careful research and use a ball-minus-strike run value of approximately 0.14 runs. Here’s what happens when we apply a perfect zone to all balls and strikes. Brace yourself!

Situation | Instances | Net Runs (Rough)
Strike thrown, ball called | 8724 | -1212
Ball thrown, strike called | 40557 | 5633
Difference | 31833 | 4422 (0.91 runs per game per team)

Whoa. Are you kidding me? If we’d run last season with a perfect strike zone, the run environment would have gone from 4.07 runs/game to nearly 5! That would be the highest level since 2000. I know what you’re thinking: this is crazy, and probably wrong.
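Here is the same kind of back-of-the-envelope check for Experiment 2, again as a hedged Python sketch. The counts, the net run total, and the 4.07 baseline are taken from this post. The 4,860 team-games denominator is an assumption, and the variable names are illustrative only.

```python
# Rough check of Experiment 2.
# TEAM_GAMES is an assumption: 2,430 scheduled games x 2 teams per game.
TEAM_GAMES = 2430 * 2
BASELINE_RUNS_PER_GAME = 4.07  # 2014 scoring level quoted in the post

bad_strike_calls = 40557  # ball thrown, strike called
bad_ball_calls = 8724     # strike thrown, ball called
net_runs_from_table = 4422

# Per-call run value implied by the table; the post rounds it to "approximately 0.14".
implied_run_value = net_runs_from_table / (bad_strike_calls - bad_ball_calls)

added_per_team_game = net_runs_from_table / TEAM_GAMES
projected = BASELINE_RUNS_PER_GAME + added_per_team_game

print(f"implied run value per call: {implied_run_value:.3f}")
print(f"added {added_per_team_game:.2f}, projected {projected:.2f} runs per team per game")
```

The table’s totals imply a per-call value just under 0.14. Using exactly 0.14 would nudge the net total up to about 4,457 runs, which doesn’t change the picture.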

Sanity checking

I also found this result to be larger than expected, to say the least. So let’s back up, check the mirrors, and look at the frequency of called strikes vs. balls.

Called Ball | 233421
Called Strike | 123922
Difference | 109499

There are a ton more called balls than called strikes. This makes sense because batters are more likely to swing at strikes. But the ratio of called balls to called strikes is only about 2:1, and that doesn’t account for the roughly 5:1 ratio of balls mistakenly called strikes to strikes mistakenly called balls! How do we account for this?
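One way to probe the mismatch is to turn the counts into miss rates by true pitch location. The Python sketch below assumes the called-ball and called-strike totals above already include the bad calls from Experiment 2. That is my reading of the numbers, not something the data source confirms, and the variable names are made up for illustration.

```python
# Counts from the tables in this post.
called_balls = 233421
called_strikes = 123922
bad_ball_calls = 8724     # strike thrown, ball called
bad_strike_calls = 40557  # ball thrown, strike called

call_ratio = called_balls / called_strikes        # called balls per called strike
mistake_ratio = bad_strike_calls / bad_ball_calls  # bad strike calls per bad ball call

# Taken pitches by TRUE location.
# Assumption: the called totals include the miscalled pitches.
true_balls_taken = called_balls - bad_ball_calls + bad_strike_calls
true_strikes_taken = called_strikes - bad_strike_calls + bad_ball_calls

# Share of taken pitches of each type that the umpire got wrong.
miss_rate_on_balls = bad_strike_calls / true_balls_taken
miss_rate_on_strikes = bad_ball_calls / true_strikes_taken

print(f"called ball : called strike = {call_ratio:.2f} : 1")
print(f"bad strike  : bad ball      = {mistake_ratio:.2f} : 1")
print(f"miss rate on taken balls:   {miss_rate_on_balls:.1%}")
print(f"miss rate on taken strikes: {miss_rate_on_strikes:.1%}")
```

Under that assumption, umpires miss on roughly 15% of taken pitches outside the zone and a bit under 10% of taken pitches inside it, so the lopsided mistake count would reflect both more opportunities and a higher miss rate on balls.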

A possible explanation

Here we dive into speculation, but stay with me for a minute. Maybe there’s a logical explanation.

What sequence of events must occur in order for a PITCHf/x strike to become a called ball?

  1. Pitcher throws in strike zone: ~45% (Zone %)
  2. Hitter takes said pitch in the strike zone: ~35% (100% – Z-Swing %)
  3. Umpire makes bad ‘ball’ call: ~10%

By this ridiculously rough method, we would expect bad ‘ball’ calls on about 1.6% of pitches (0.10 * 0.35 * 0.45 = 0.01575). Compare that with the observed value of 1.2%.

Conversely, the sequence for a PITCHf/x ball becoming a called strike is as follows:

  1. Pitcher throws out of zone: ~55% (100% – Zone %)
  2. Hitter takes said pitch outside the strike zone: ~70% (100% – O-Swing %)
  3. Umpire makes bad ‘strike’ call: ~15%

We therefore expect bad ‘strike’ calls on about 5.8% of pitches (0.15 * 0.7 * 0.55 = 0.05775). Again, compare that to the observed value of, wait for it, 5.7%. Boom!
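The two chains above are just three-way products, so they are easy to play with. Here is a minimal Python sketch using the rough percentages from the lists above. The function name and constants are hypothetical, and the inputs are the same eyeballed estimates, not measured values.

```python
def expected_bad_call_rate(location_share, take_rate, umpire_miss_rate):
    """Share of ALL pitches expected to end in this kind of bad call."""
    return location_share * take_rate * umpire_miss_rate


ZONE_PCT = 0.45  # rough share of pitches thrown in the zone

# Strike thrown -> batter takes -> umpire calls it a ball.
bad_ball_rate = expected_bad_call_rate(ZONE_PCT, take_rate=0.35, umpire_miss_rate=0.10)

# Ball thrown -> batter takes -> umpire calls it a strike.
bad_strike_rate = expected_bad_call_rate(1 - ZONE_PCT, take_rate=0.70, umpire_miss_rate=0.15)

print(f"expected bad 'ball' calls:   {bad_ball_rate:.4f} of pitches")
print(f"expected bad 'strike' calls: {bad_strike_rate:.4f} of pitches")
print(f"expected ratio: {bad_strike_rate / bad_ball_rate:.1f} : 1")
```

The products come to 0.01575 and 0.05775, so this rough model predicts bad ‘strike’ calls outnumbering bad ‘ball’ calls by somewhere between 3:1 and 4:1 before any fine-tuning of the inputs.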

More reasons to automate

  1. Automatic things happen faster. As a professional automator, I guarantee this will speed up play by more than you think. I bet the umpire thinks for about 1 second on every pitch. That’s just the obvious part.
  2. Set the umpires free. Focusing on something as difficult as calling balls/strikes crowds out the umpire’s attention to other important matters, such as enforcing pace of play.
  3. Crazy cool things will happen. For example, we will finally see what happens to an insane control pitcher’s K-BB%. V-Mart might never strike out!

I welcome your comments, criticisms, or even praise 🙂