Plate Discipline Correlations, 2008-2013

In fall 2008 FanGraphs was kind enough to release new plate-discipline metrics, including first-pitch strike percentage (F-Strike %), outside-the-zone swing rate (O-Swing %), and inside-the-zone swing rate (Z-Swing %).  At the time, Eric Seidman was even kinder when he investigated the correlation of these plate-discipline statistics with standard pitcher metrics like WHIP, FIP, BB/9, and K/9. Very thoughtful indeed.

Now we have another 4.5 years of plate discipline data, compiled by PITCHf/x rather than Baseball Info Solutions. It may be worthwhile to see how these numbers compare with Seidman’s and to add a measure of uncertainty to the correlations. Two variables can have a strong relationship, yet with a small sample or other sources of variability the correlation estimate may be less precise than a high R-value suggests.

Bootstrapping

Correlation coefficients, which fall between -1 and 1, allow us to measure the strength of linear dependence between two variables, such as O-Swing % and K %. We can use bootstrapping techniques to obtain 95% confidence intervals for these correlation coefficients. Calculating confidence intervals for correlations adds a measure of uncertainty to the process—narrow intervals indicate we can have greater confidence that the R-value we obtain represents the true correlation between the two metrics.

Bootstrapping is a statistical technique in which we resample our current sample with replacement, in this case 500 times. Recomputing the statistic on each resample lets us attach a measure of accuracy to sample estimates such as medians, means, or correlation coefficients. For our purposes here, the key point is that we can be 95% confident that the true R-value lies within the interval. If the interval includes 0, which means no linear correlation at all, we conclude that there is not enough evidence to indicate any relationship between the two variables.
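
For readers who want to see the mechanics, here is a minimal Python sketch of the resampling procedure described above. It is not the R code used for this article. The function names (`pearson_r`, `bootstrap_ci`), the percentile method for choosing the interval endpoints, the fixed seed, and the ten made-up pitcher rows are all assumptions for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (the R-value) of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def bootstrap_ci(xs, ys, n_resamples=500, level=0.95, seed=0):
    """Percentile bootstrap interval for the correlation of paired data."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    while len(estimates) < n_resamples:
        # Resample pitchers with replacement, keeping each (x, y) pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_x = [xs[i] for i in idx]
        sample_y = [ys[i] for i in idx]
        # A resample with no spread has no defined correlation; draw again.
        if len(set(sample_x)) < 2 or len(set(sample_y)) < 2:
            continue
        estimates.append(pearson_r(sample_x, sample_y))
    estimates.sort()
    # Trim the same number of estimates from each tail (12 of 500 at 95%).
    cut = int((1 - level) / 2 * n_resamples)
    return estimates[cut], estimates[-1 - cut]


# Made-up F-Strike % and BB % values for ten hypothetical pitchers.
f_strike = [55.1, 57.3, 58.8, 59.4, 60.2, 61.0, 62.5, 63.1, 64.7, 66.0]
bb_pct = [10.4, 9.8, 9.9, 8.1, 8.6, 7.9, 7.2, 7.5, 6.1, 5.8]

r = pearson_r(f_strike, bb_pct)
lower, upper = bootstrap_ci(f_strike, bb_pct)
print(f"r = {r:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Each resample keeps a pitcher's two values paired, which is what lets the spread of the 500 recomputed R-values stand in for the uncertainty of the original estimate.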

First Strike %

The values in the table below correspond well enough to those obtained by Seidman, with one exception worth noting. While he correlated F-Strike % with K/9 and BB/9, here we examine the correlation with strikeout and walk percentages (K % and BB %). Our correlation coefficient for K % is similar in magnitude at .24 versus .19, but its wide confidence interval approaches the null value of 0 and suggests the estimate is not very precise. This stands out because BB % has such a strong correlation with F-Strike %, at -.72, with a relatively narrow confidence interval. Seidman observed a similar pattern: getting ahead 0-1 is tied more closely to avoiding walks than to recording strikeouts.

First Strike %

        R-Value   (95% CI)
K%       0.24     (.024, .455)
BB%     -0.72     (-.848, -.604)
WHIP    -0.52     (-.649, -.376)
FIP     -0.41     (-.576, -.237)

O-Swing %

O-Swing % is the percentage of pitches thrown outside the strike zone that batters swing at. Think of anyone facing Pablo Sandoval. Here we again see moderate correlations with relatively tight confidence intervals ranging from 0.30 to 0.19. Pitchers who induce swings at pitches outside the zone may be especially hard for hitters to do damage against. So far this season Adam Wainwright and Matt Harvey are both in the top three in O-Swing %, and the top two in both WHIP and FIP.
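
To make the definition concrete: O-Swing % is swings at pitches outside the zone divided by all pitches outside the zone, and Z-Swing % is the same ratio for pitches inside the zone. Here is a small Python sketch of that arithmetic. The `(in_zone, swung)` record layout, the `swing_rates` helper, and the ten sample pitches are invented for illustration and do not reflect how FanGraphs stores its data.

```python
def swing_rates(pitches):
    """Return (O-Swing %, Z-Swing %) from (in_zone, swung) records."""
    outside = [swung for in_zone, swung in pitches if not in_zone]
    inside = [swung for in_zone, swung in pitches if in_zone]
    # True counts as 1, so sum() is the number of swings in each group.
    o_swing = 100 * sum(outside) / len(outside)
    z_swing = 100 * sum(inside) / len(inside)
    return o_swing, z_swing


# Ten invented pitches: (was it in the zone?, did the batter swing?)
pitches = [
    (False, True), (False, False), (False, False), (False, True), (False, False),
    (True, True), (True, True), (True, False), (True, True), (True, False),
]

o_swing, z_swing = swing_rates(pitches)
print(f"O-Swing % = {o_swing:.1f}, Z-Swing % = {z_swing:.1f}")
```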

O-Swing %

        R-Value   (95% CI)
K%       0.39     (.274, .548)
BB%     -0.44     (-.637, -.254)
WHIP    -0.50     (-.677, -.317)
FIP     -0.45     (-.650, -.283)

Z-Swing %

We can see from the results below that Z-Swing %, the rate at which a pitcher induces swings at pitches in the zone, bears little relationship to any of these metrics. Seidman’s analysis likewise showed that the correlations were negligible at best. The confidence intervals for all of these metrics include 0, meaning that we cannot be 95% confident that any relationship is present. A quick glance at the leaderboards shows that Ian Kennedy and Miguel Gonzalez are near the top of the list this season, and these guys aren’t exactly shoving.

Z-Swing %

        R-Value   (95% CI)
K%      -0.17     (-.370, .035)
BB%     -0.17     (-.381, .048)
WHIP    -0.09     (-.276, .111)
FIP      0.10     (-.09, .286)
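
The zero-inclusion check used throughout can be written out directly. The short Python sketch below uses the interval endpoints exactly as reported in the three tables in this article. The dictionary layout and the `includes_zero` helper are illustrative choices, not part of the original analysis code.

```python
# (lower, upper) 95% bounds copied from the three tables in this article.
intervals = {
    "F-Strike %": {
        "K%": (0.024, 0.455), "BB%": (-0.848, -0.604),
        "WHIP": (-0.649, -0.376), "FIP": (-0.576, -0.237),
    },
    "O-Swing %": {
        "K%": (0.274, 0.548), "BB%": (-0.637, -0.254),
        "WHIP": (-0.677, -0.317), "FIP": (-0.650, -0.283),
    },
    "Z-Swing %": {
        "K%": (-0.370, 0.035), "BB%": (-0.381, 0.048),
        "WHIP": (-0.276, 0.111), "FIP": (-0.09, 0.286),
    },
}


def includes_zero(lower, upper):
    """True when the interval straddles 0, i.e. no clear evidence of a relationship."""
    return lower <= 0 <= upper


for metric, rows in intervals.items():
    for stat, (lower, upper) in rows.items():
        verdict = "includes 0" if includes_zero(lower, upper) else "excludes 0"
        print(f"{metric:<11} vs {stat:<4} ({lower:+.3f}, {upper:+.3f}) {verdict}")
```

Only the Z-Swing % intervals straddle 0, which matches the reading in the text.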

All data courtesy of FanGraphs.

 Because I’m a believer in open data, you can find my R code here.





9 Comments
stich09
10 years ago

Can’t find your R code… I’m learning to play around with R and it would be fun to see your code.

sam
10 years ago
Reply to  stich09

Try this link, it works for me, and just click on the R code document. If you still can't see it, I can send it to you via email.

http://www.synopsissyndrome.com.php53-13.ord1-1.websitetestlink.com/?attachment_id=34

Also, if you have any questions about what I did (it's not commented perfectly), just let me know. Happy to explain.

stich09
10 years ago
Reply to  sam

Thanks Sam. I’ll see what I can figure out.

Filip Piasevoli
10 years ago

I’m learning R as well and I’m looking forward to examining your code. Very interesting stuff!!

Neil Weinberg
10 years ago

How did you get the correlation graphs into the article? I’ve written a few things for this section and haven’t had any luck with images. Thanks.

Simon
10 years ago
Reply to  Neil Weinberg

In the HTML tab in WordPress there’s an “img” button where you can copy-paste the URL of an image to embed it in the article. Worked for me.

Neil Weinberg
10 years ago
Reply to  Simon

Excellent, thanks. Can’t believe I couldn’t find that.