Archive for March, 2017

Adjusting Appearance Data for Base-Out State

So far, we’ve developed some mathematical principles for visualizing appearance data for relief pitchers, and for measuring how far apart they are. The goal has been to say something about how pitchers are being used, not only in a vacuum, but in the context of the way in which the team has chosen to divide up its relief innings for the season. We’ve only partially gotten there so far, but today let’s take a slight detour to ask: Is the underlying data conveying the most useful information?

Inning and score differential at the time of entering the game are the critical data elements in answering questions related to usage. The numbers and tables in my previous articles all focused on using these two elements. Here’s an example of the underlying data being used, in the form of three Daniel Hudson appearances which appear identical.

Three (Similar?) Daniel Hudson Appearances
| Date | Player | Season | Inning | Score |
| --- | --- | --- | --- | --- |
| 6/28/2016 | Daniel Hudson | 2016 | 8 | 1 |
| 8/20/2016 | Daniel Hudson | 2016 | 8 | 1 |
| 9/21/2016 | Daniel Hudson | 2016 | 8 | 1 |

Inning and score differential are critical; however, as far as data elements are concerned, they are somewhat raw. Fortunately, those aren’t the only data elements we can look at. The next-most impactful data, I would argue, is the base-out state at the time that the pitcher enters the game.

Let’s establish a baseline: It’s the norm for relief pitchers to enter the game in a clean inning (no outs, no runners on base). Among pitchers with 20+ relief appearances in 2016, this was the situation in 68.1% of appearances. That’s a very high percentage, considering that there are 24 base-out states. It’s also very intuitive when we think about the game. Among other reasons, pitchers need time to warm up, and mostly, they do so while their own team is batting. It’s also the only base-out state which is guaranteed to happen every inning.

It would be atypical – and therefore, interesting – for a pitcher to be used frequently in other base-out states. Moreover, we should be giving credit to pitchers who are being used in that way. An appearance where a pitcher enters with a four-run lead but the bases loaded should not be viewed in the same way as an appearance where a pitcher enters with a four-run lead in a clean inning. More than likely, the manager has two different pitchers in mind for each of these scenarios.

Adjusting the inning is easy: Credit partial innings in the event that the pitcher enters with more than zero outs in the inning. This will bump the inning component of every pitcher’s “center of gravity” up a bit, giving credit to players for working slightly later in the game when called upon mid-inning. (Note: we could also define terms in a different way, and say that a pitcher who enters in a “clean” 9th inning is actually entering at inning 8.0, as 8 innings have been recorded prior to his entrance; however, this makes the resulting metric less intuitive.)

Adjusting the score differential doesn’t seem as straightforward at first, but fortunately, we can use the concept of RE24 to accomplish this. Given that entering in a clean inning is the default status, we will make no adjustment to the score differential for a given appearance if the pitcher entered in a clean inning. For any other base-out state, we will add or subtract the difference between expected runs in that base-out state and expected runs in a clean inning state (0 on, 0 out).

Let’s return to the three appearances shown above. As you might have guessed by now, they are not identical. Rather, they illustrate the importance of adjusting for base-out state.

Three Daniel Hudson Appearances (in greater detail)
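| Date | Player | Inning | Score | Outs | Bases | Adj. Inn. | Adj. Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 6/28/2016 | Daniel Hudson | 8 | 1 | 0 | `___` | 8.00 | 1.00 |
| 8/20/2016 | Daniel Hudson | 8 | 1 | 0 | `123` | 8.00 | -0.82 |
| 9/21/2016 | Daniel Hudson | 8 | 1 | 2 | `_2_` | 8.67 | 1.16 |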
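
As a rough sketch of the two adjustments, the Python below credits one third of an inning per out already recorded and shifts the score differential by the gap between the run expectancy of the entry state and that of a clean inning. The function name `adjust_appearance` and the `RUN_EXPECTANCY` lookup are hypothetical, and the lookup holds only the three run-expectancy values this article quotes for these appearances (0.481, 2.282 and 0.319); a real version would need all 24 base-out states.

```python
# Run expectancy (expected runs through the end of the inning) keyed by
# (bases, outs). Only the three states quoted in the article are included;
# this is an illustrative subset, not a full 24-state table.
RUN_EXPECTANCY = {
    ("___", 0): 0.481,  # clean inning
    ("123", 0): 2.282,  # bases loaded, nobody out
    ("_2_", 2): 0.319,  # runner on second, two out
}
CLEAN = ("___", 0)


def adjust_appearance(inning, score_diff, outs, bases, run_exp=RUN_EXPECTANCY):
    """Return (adjusted inning, adjusted score differential) at entry."""
    # Each out already recorded is worth a third of an inning.
    adj_inning = inning + outs / 3
    # Runs the opponent is expected to score beyond a clean-inning baseline.
    extra_runs = run_exp[(bases, outs)] - run_exp[CLEAN]
    # More expected runs for the opponent means a smaller effective lead.
    adj_score = score_diff - extra_runs
    return adj_inning, adj_score


appearances = [
    ("6/28/2016", 8, 1, 0, "___"),
    ("8/20/2016", 8, 1, 0, "123"),
    ("9/21/2016", 8, 1, 2, "_2_"),
]
for date, inning, score, outs, bases in appearances:
    adj_inn, adj_score = adjust_appearance(inning, score, outs, bases)
    print(f"{date}: adj. inning {adj_inn:.2f}, adj. score {adj_score:+.2f}")
```

With these inputs the first and third rows match the table (8.00 and 1.00, 8.67 and 1.16). The bases-loaded row comes out near -0.80 rather than the listed -0.82, which suggests the table was built from a run-expectancy value slightly different from the 2.282 quoted in the text.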

If you were to ask Daniel Hudson to recall what he could about these three appearances, he’d probably feel very differently about each of them (if he remembers, anyway). In the first case, he’s coming into a clean 8th inning, protecting a one-run lead. It was a situation he found himself in with some regularity in 2016, prior to assuming the closer’s role.

The second situation is an absolute bear. Jake Barrett has allowed a leadoff single, and poor Steve Hathaway, who shouldn’t be touching this game situation with a 10-foot pole at this point in his career, has subsequently allowed a double and a walk to load the bases. Hudson has been brought in to protect a one-run lead with the bases loaded and nobody out. The opposing team has an expected run value of 2.282. While technically Hudson has been given a lead, it’s one that he would be hard-pressed to keep, even if he does everything right. The reality is that this appearance is associated with an expectation that Arizona will trail by the end of it; as you can see on the play-by-play log, the Padres have a 70.6% win probability at this point. It would be silly to give this appearance the same treatment as the other two. (Hudson, by the way, does a masterful job of escaping this situation without surrendering the lead!)

The third case is the one I want to focus on. Rather than a clean inning, Hudson was asked to get the third out of the 8th inning, with the tying run standing on second base. While the Leverage Index at the time of entry for this appearance is higher (3.50) than in the first instance (2.17), Hudson actually has an easier job: He needs just one out instead of three, and the opposing team is expected to score fewer runs in this situation, all else being equal. In the “clean” 8th inning, he can be expected to give up 0.481 runs, while in the two-out, runner-on-second situation, he can be expected to give up just 0.319 runs. Moreover, the chance of scoring at least one run – presumably the more important question where one-run leads are concerned – is also lower in the “higher leverage” situation. (This doesn’t even account for the batter, Hector Sanchez, who is hardly Wil Myers at the plate, and is probably inferior to the 4-5-6 hitters in the Phillies lineup, as well.)

This brings up an important distinction between leverage and run prevention. Leverage Index, certainly, is an important tool. What it measures, however, is variance in win probability for a single at-bat. Managers rarely have the luxury of giving their pitchers one-batter appearances in the regular season. Even the notoriously fleeting Javier Lopez averaged nearly three batters per appearance in 2016. Managers must therefore determine how to maximize the value of relief appearances as a whole, not just at the time when the reliever is entering the game. Leverage Index shows how much variance can arise from the current plate appearance, but a manager may very well be better served having their best pitcher throw the entirety of the 8th inning, rather than having him get the third out in a situation that commands high leverage but still has relatively low run expectation.

Next time, we’ll look at how base-out state adjustments impacted the raw inning-score matrix data in 2016, to draw conclusions about which relievers were used most often in high-pressure, mid-inning situations, and whether that sort of usage aligns with what we’d expect from an optimal manager.


An Attempt to Quantify Quality At-Bats (Part 2)

In my first article, I created a definition for what I feel constitutes a quality at-bat. I also examined a few test cases¹ and hypothesized different ways in which this data could be used going forward. As a reminder, my definition of a quality at-bat (QAB) is an at-bat that results in at least one of the following:

  1. Hit
  2. Walk
  3. Hit by pitch
  4. Reach on error
  5. Sac bunt
  6. Sac fly
  7. Pitcher throws at least six pitches
  8. Batter “barrels” the ball.

 

To calculate a QAB percentage I divided the player’s total number of QABs by his total number of plate appearances. I then dove a little deeper into QABs to see what conclusions I could draw from this statistic.
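
A minimal sketch of that calculation follows, assuming each plate appearance is available as a record with its result, pitch count and a Statcast barrel flag. The `PlateAppearance` class, its field names and the result labels are hypothetical stand-ins for whatever the underlying data actually provides.

```python
from dataclasses import dataclass

# Results that make a plate appearance a QAB on their own (criteria 1-6).
# The string labels are illustrative, not taken from any real data feed.
QUALITY_RESULTS = {
    "hit", "walk", "hit_by_pitch", "reach_on_error", "sac_bunt", "sac_fly",
}


@dataclass
class PlateAppearance:
    result: str             # e.g. "hit", "walk", "strikeout", "groundout"
    pitches: int            # pitches seen in the plate appearance
    barreled: bool = False  # True if Statcast classified the contact as a barrel


def is_quality_at_bat(pa):
    """A plate appearance is a QAB if it meets at least one criterion."""
    return pa.result in QUALITY_RESULTS or pa.pitches >= 6 or pa.barreled


def qab_percentage(plate_appearances):
    """QABs divided by total plate appearances, as a percentage."""
    if not plate_appearances:
        return 0.0
    qabs = sum(is_quality_at_bat(pa) for pa in plate_appearances)
    return 100 * qabs / len(plate_appearances)


# Made-up sample: a single, a seven-pitch strikeout, a quick groundout,
# and a barreled flyout. Three of the four qualify.
sample = [
    PlateAppearance("hit", 3),
    PlateAppearance("strikeout", 7),
    PlateAppearance("groundout", 2),
    PlateAppearance("flyout", 4, barreled=True),
]
print(f"{qab_percentage(sample):.2f}%")
```

Because the criteria are joined with `or`, a plate appearance that satisfies several of them (say, a seven-pitch walk) still counts as a single QAB.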

The first thing I did was run the numbers for every hitter in 2016 who had more than 400 at-bats and create a leaderboard. The players with the best and worst QAB% are displayed below. The average QAB percentage in 2016 was 48.54%. Not surprisingly, Mike Trout leads all hitters and is followed closely by Joey Votto, a player who always finds a way to get on base. The player who stuck out to me most on this list was Chris Carter. This is a player who had a lot of trouble getting a contract this offseason, despite leading the league in homers. In fact, he had so much trouble that he considered going to Japan before finally signing with the Yankees. Even so, he had the 10th-highest QAB percentage. Mike Napoli’s QAB% also surprised me because I do not view him as a particularly elite hitter, yet he ranked fourth, between two of baseball’s best hitters.

Players with the best and worst QAB%

| Best QAB%: Name | QAB % | Worst QAB%: Name | QAB % |
| --- | --- | --- | --- |
| Mike Trout | 64.02% | Josh Harrison | 41.83% |
| Joey Votto | 63.52% | Rajai Davis | 41.82% |
| Freddie Freeman | 57.93% | Andrelton Simmons | 41.74% |
| Mike Napoli | 57.89% | Ryan Zimmerman | 41.67% |
| Josh Donaldson | 57.71% | Alcides Escobar | 41.40% |
| Paul Goldschmidt | 57.65% | Jason Heyward | 41.34% |
| Dexter Fowler | 57.61% | Adeiny Hechavarria | 41.32% |
| DJ LeMahieu | 57.30% | Jonathan Schoop | 40.49% |
| David Ortiz | 55.27% | Salvador Perez | 40.22% |
| Chris Carter | 55.16% | Alexei Ramirez | 38.46% |

 

One commenter on my last post pointed out that OBP could be highly correlated with QAB%. They were right. In fact, there is a strong correlation (r² = .82) between OBP and QAB%, which makes sense since they share many of the same parameters. After this finding, I decided to create an interactive scatter plot of OBP and QAB% to see what the data looked like and to see if I could find any interesting patterns. If you interact with the graph you can see that the five players who sit a little above the rest of the data between .300 and .350 OBP are Chris Carter, Mike Napoli, Michael Saunders, Miguel Sano, and Jayson Werth.
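
For readers who want to check a figure like this on their own data, here is a small sketch that computes r² as the squared Pearson correlation between two equal-length lists. That reading of “r²” is an assumption, since the article does not say how the value was computed, and the function name `r_squared` and the sample numbers are made up for illustration rather than taken from real players.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov ** 2 / (var_x * var_y)


# Made-up OBP and QAB% values, purely to show the call.
obp = [0.300, 0.320, 0.350, 0.400]
qab_pct = [44.0, 50.0, 52.0, 62.0]
print(f"r^2 = {r_squared(obp, qab_pct):.2f}")
```

A value near 1 means one statistic is almost a linear function of the other; the players who sit above the trend are the ones whose QAB% is higher than their OBP alone would predict.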

 


Why does QAB% seem to favor this group of players more than others? By investigating the other parameters in my definition of QABs, I found that these five hitters were taking a lot of pitches. In fact, all five of these hitters were in the top 15 last year in pitches per plate appearance, with Jayson Werth and Mike Napoli being numbers one and two, respectively. Additionally, Chris Carter’s score was likely higher since he barreled the eighth-most balls last season. This leads me to believe that QAB% tends to favor or distinguish hard-hitting, patient sluggers.

Is QAB% another way in which we should be evaluating hitter performance? Probably not. As much as I love seeing Chris Carter on a list with the best players in baseball, this statistic uses an old-school mindset that does not show true value. That being said, it can still be helpful. It is a good way to show which hitters are taking a lot of pitches. It also helps quantify what coaches and broadcasters mean when they say a player had a “good at-bat.” Finally, perhaps you watched a lot of Indians games last season and you couldn’t help but feel like Mike Napoli was the best hitter ever. His QAB% may explain why you feel that way. Mike Napoli is a good hitter, but not nearly as good as former MVP Josh Donaldson, despite the fact that they both have a very similar number of at-bats that a coach would call “quality.” Overall, I think this statistic does a good job of quantifying something that used to be a lot harder to quantify. At the very least, QAB% has given me a reason to be excited about Chris Carter joining the Yankees, my favorite team. Opening Day cannot come soon enough.

 

  1. In my first article I made a mistake with my test cases. Barrels, a Statcast statistic, did not start being counted until 2015. I had provided QAB numbers starting in 2014. With the way I wrote my code this actually caused the barrels in 2015 and 2016 not to be counted. I should not have provided 2014 numbers at all, and the numbers for 2015 and 2016 were a little lower than they should have been. All of my calculations have been corrected for this article.