Expected Pitch Value

May 26, 2021

Pitch value is a metric that measures how much each pitch type increases or decreases runs scored. In this article I will try to create an environment-neutral version of pitch value.

Shortcomings of Existing Pitch Value

Pitch Value (hereafter PV) and Run Value (RV) are the sum or average of the changes in RE288 (run expectancy by base, out, and count state) on each pitch. This method has the advantage of measuring how much a pitch actually increased or decreased the runs scored on that pitch. However, the metric is not consistent enough to be used for a single year, because it depends on a relatively small number of batted balls and plate appearances.

The following shows the year-to-year relationship of the average delta_run_exp (RV/100) on sliders for pitchers who threw 500 or more of them in each year from 2017-20, with the data obtained from Statcast.

The correlation coefficient is 0.14, which means that there is almost no correlation. Even if a pitcher records an excellent RV/100 in one year, that tells us very little about the value he will record the following year. The existing PV and RV therefore appear poorly suited to measuring the stable value of a pitch type.
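The year-to-year check described above can be sketched in Python. This is only an illustration under stated assumptions: the pitchers and RV/100 numbers are made up rather than taken from Statcast, and `pearson` and `year_pairs` are hypothetical helper names. The idea is to pair each pitcher's value in one year with his value in the next and compute the Pearson correlation across those pairs.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_pairs(values):
    """Pair each (pitcher, year) value with the same pitcher's value one year later.

    `values` maps (pitcher, year) -> metric. A pitcher with no entry for the
    following year (for example, below the pitch-count cutoff) adds no pair.
    """
    pairs = []
    for (pitcher, year), value in values.items():
        next_value = values.get((pitcher, year + 1))
        if next_value is not None:
            pairs.append((value, next_value))
    return pairs

# Made-up RV/100 values for illustration only (not real data).
rv100 = {
    ("A", 2017): -1.2, ("A", 2018): 0.4, ("A", 2019): -0.3,
    ("B", 2017): 0.8, ("B", 2018): -0.5,
    ("C", 2018): -0.9, ("C", 2019): -1.1, ("C", 2020): 0.2,
}

pairs = year_pairs(rv100)
r = pearson([a for a, _ in pairs], [b for _, b in pairs])
print(f"{len(pairs)} year pairs, r = {r:.2f}")
```

The same two helpers would apply to any per-pitcher metric, so the comparison between metrics only requires swapping the input dictionary.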

Using xwOBAvalue for Situation-Neutral Run Value and Batted Ball Evaluation

Two adjustments can make pitch value more stable over the small number of pitches and plate appearances available in a single year.

First, we use a situation-neutral run value for each event rather than the change in run expectancy. For example, a home run with the bases empty and a home run with runners on base have different values in the existing RV, but the situation-neutral run value is the average run value of home runs across all situations combined. The reason is that a pitch's ability to prevent runs should not be judged by a number that depends on the circumstances in which it happened to be thrown.
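As a rough sketch of what "situation-neutral" means here, the Python below averages the run value of each event over every situation in which it occurred. The plays and their run values are invented for illustration, and `neutral_values` is a hypothetical helper name, not something from the original analysis.

```python
from collections import defaultdict

# (event, base-out situation, change in run expectancy); made-up values.
plays = [
    ("home_run", "bases empty", 1.0),
    ("home_run", "runner on 1st", 1.8),
    ("home_run", "bases loaded", 3.0),
    ("single", "bases empty", 0.4),
    ("single", "runner on 2nd", 0.8),
]

def neutral_values(plays):
    """Average the run value of each event over all situations it occurred in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, _situation, delta_runs in plays:
        totals[event] += delta_runs
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}

neutral = neutral_values(plays)
for event, value in neutral.items():
    print(event, round(value, 3))
```

Every home run then receives the same value regardless of who was on base when it was hit.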

The second adjustment is to use the xwOBA value (estimated_woba_using_speedangle in Statcast) instead of the actual batting result when a pitch is put in play. A pitcher has little control over whether a batted ball becomes a hit or an out, and batted-ball results are known to be unstable within a single season. If results are that noisy across all of a pitcher's batted balls in a season, they are noisier still for a single pitch type, which produces far fewer batted balls. Therefore, for batted balls we use the xwOBA value, which is estimated from the speed and angle of the batted ball. The purpose is to remove the influence of defense and chance as much as possible.

In this way, we try to make the pitch value as situation-neutral as possible.

Calculate wOBA by Count

I will call this situation-neutral pitch value xPV (expected pitch value) for now.

The first step is to find the wOBA by count. Here, the wOBA by count is calculated based on “all final batting results that have passed that count.” Note that this is not the same as the batting results recorded at the time of that count.

For example, if a strike on an 0-1 pitch moves the count to 0-2 and the batter then strikes out, one strikeout is recorded in the 0-1 record. If he instead hits a single on that 0-2 count, one single is recorded in the 0-1 record¹. Note also that every plate appearance passes through 0-0, so the 0-0 wOBA equals the wOBA of all plate appearances in that period.
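The "through the count" bookkeeping can be sketched in Python as follows. This is a minimal illustration, assuming each plate appearance is represented by the list of counts it passed through plus the wOBA value of its final result. The wOBA values are illustrative rather than official weights, and `woba_through_count` is a hypothetical helper name.

```python
from collections import defaultdict

# Each plate appearance: the counts it passed through (always starting at 0-0)
# and the wOBA value of its final result. All values are illustrative.
plate_appearances = [
    (["0-0", "0-1", "0-2"], 0.0),  # strikeout after reaching 0-2
    (["0-0", "0-1", "0-2"], 0.9),  # single hit on an 0-2 pitch
    (["0-0", "1-0"], 2.0),         # home run on a 1-0 pitch
    (["0-0"], 0.0),                # out on the first pitch
]

def woba_through_count(plate_appearances):
    """Average final wOBA value of all plate appearances that passed through each count."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for counts_seen, final_woba_value in plate_appearances:
        for count in counts_seen:
            # The final result is credited to every count the PA went through.
            totals[count] += final_woba_value
            counts[count] += 1
    return {count: totals[count] / counts[count] for count in totals}

woba_by_count = woba_through_count(plate_appearances)
for count in sorted(woba_by_count):
    print(count, round(woba_by_count[count], 3))
```

Because every plate appearance includes 0-0, the 0-0 entry is the overall average of the sample, which matches the note above.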

Calculating the Run Value by Count

Using this wOBA by count, we can calculate the run value of each pitch. Differences in wOBA are converted to runs by dividing by wOBAscale (≈1.15 in the Statcast CSV data).

First, when the count changes, the RAA is calculated as:

(wOBA of the count after the pitch – wOBA of the count before the pitch) / wOBAscale

When the pitch is put in play, the RAA is instead calculated as:

(xwOBAvalue – wOBA of the count before the pitch) / wOBAscale
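The two formulas can be written as small Python functions. This is a sketch under stated assumptions: the count wOBA values in `count_woba` are hypothetical, `WOBA_SCALE` uses the approximate 1.15 quoted above, and the function names are my own. Plate appearances that end without a batted ball, such as strikeouts and walks, are left out of the sketch.

```python
WOBA_SCALE = 1.15  # approximate wOBAscale value quoted in the text

# Hypothetical wOBA-through-count values, for illustration only.
count_woba = {"0-0": 0.320, "0-1": 0.270, "1-0": 0.360}

def count_change_raa(before, after, woba_scale=WOBA_SCALE):
    """RAA of a pitch that moves the count from `before` to `after`."""
    return (count_woba[after] - count_woba[before]) / woba_scale

def batted_ball_raa(before, xwoba_value, woba_scale=WOBA_SCALE):
    """RAA of a pitch put in play at count `before` with the given xwOBA value."""
    return (xwoba_value - count_woba[before]) / woba_scale

strike = count_change_raa("0-0", "0-1")        # count moves toward the pitcher
ball = count_change_raa("0-0", "1-0")          # count moves toward the batter
hard_contact = batted_ball_raa("0-1", 0.900)   # well-struck ball on an 0-1 pitch

print(round(strike, 4), round(ball, 4), round(hard_contact, 4))
```

With the formulas written this way, negative values favor the pitcher and positive values favor the batter.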

Total the Values, Take the Average

xPV is calculated by totaling the RAAs obtained in this way for a pitch type and then averaging them per pitch.
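The totaling and averaging step might look like the Python below. It is a minimal sketch that assumes each pitch has already been assigned an RAA. The pitch records are made up, the pitch-type codes `SL` and `FF` are used only as labels, and `xpv_summary` is a hypothetical helper name.

```python
from collections import defaultdict

def xpv_summary(pitches):
    """Total RAA and RAA per 100 pitches for each (pitcher, pitch_type) group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pitcher, pitch_type, raa in pitches:
        key = (pitcher, pitch_type)
        totals[key] += raa
        counts[key] += 1
    return {
        key: {
            "pitches": counts[key],
            "xpv": totals[key],
            "xpv_per_100": 100 * totals[key] / counts[key],
        }
        for key in totals
    }

# Made-up per-pitch RAA values: (pitcher, pitch type, RAA).
pitches = [
    ("A", "SL", -0.04), ("A", "SL", 0.03), ("A", "SL", -0.05), ("A", "SL", 0.50),
    ("A", "FF", 0.03), ("A", "FF", -0.04),
]

summary = xpv_summary(pitches)
for key, row in summary.items():
    print(key, row["pitches"], round(row["xpv"], 3), round(row["xpv_per_100"], 2))
```

A minimum pitch count, such as the 500-slider cutoff used in this article, could be applied by filtering on the `pitches` field before comparing pitchers.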

The advantage of xPV is that, by using situation-neutral values, it reduces the influence of chance as much as possible and makes the metric more consistent. The following is the year-to-year correlation of the xPV/100 (xPV per 100 pitches) of sliders for pitchers who threw at least 500 sliders from 2017-20.

The correlation coefficient was 0.49, which is a moderate correlation and much improved over the 0.14 of RV/100.

For xPV, I referred to this article.

¹We use batting results through a count rather than batting results at that count because doing so accounts for events that can occur only in particular counts and lets us evaluate pitches that do not directly produce a batting result. For a detailed explanation, snin’s article is very helpful.

I have also put the R code here.
