I’d like to share some of my thoughts and research on how we evaluate Major League Baseball pitchers. I think for the most part when we use statistics to discuss a pitcher, we are really looking at the pitcher from one or more of the following three perspectives: 1) ability, 2) performance, and 3) contribution. Before I get into my research, I will take a moment to describe what I mean by each of the three terms.
When I use the word ‘ability’, I am describing the physical and mental skills the pitcher has at his disposal. Some examples of ability are how hard he can throw, what kind of movement he has on his pitches, how well he can locate, and how well he mixes his pitches. With the introduction of PITCHf/x, we can now measure ability better than ever before. That said, it is still difficult to quantify many aspects of ability accurately and meaningfully. Since a pitcher’s performance is based at least in part on his ability, performance statistics can sometimes be used as a substitute for direct ability measures.
Performance describes how well a pitcher performed. In other words, it refers to the outcome or outcomes that result from that pitcher throwing pitches. Nearly all baseball statistics describe performance. Some statistics measure a pitcher’s individual performance fairly well, whereas others combine the pitcher’s performance with the performance of his team and other factors. For example, ERA is generally not considered a great measure of a pitcher’s individual performance, whereas FIP is considered a better one.
I have not found much reference to the word ‘contribution’ in the baseball literature, but I do think it is an important concept to consider. I use it to describe how much a pitcher helps his team win baseball games. By this general definition, I suppose ERA (and other performance measures) could also be considered a contribution measure in some respects, since wins are related to runs allowed. I therefore propose that ability, performance, and contribution are not separated by solid lines but instead form a spectrum, where each statistic belongs to each category to some degree. However, in an attempt to clear up this somewhat murky discussion, I will offer stats such as W-L, WAR, and WPA as the most obvious contribution stats*.
*Note: Contribution stats can be measured directly (e.g. W-L) or derived from performance stats (e.g. FanGraphs WAR is derived from FIP).
Now on to my research… The hypothesis that drove this work was: pitcher ability measures are more consistent between seasons than performance or contribution measures. This hypothesis is based on my belief that, unlike performance and contribution, which are affected by countless outside factors, a pitcher’s ability is intrinsic to him and therefore less likely to change dramatically between seasons.
To test this, I took every pitcher who pitched a minimum of 120 innings in each season from 2008 to 2011. This gave me a pool of 63 pitchers.
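As a rough sketch of that selection step, assuming the season lines are available as (pitcher, season, innings) records: the function name, the record layout, and the sample rows below are all made up for illustration and are not my actual data.

```python
from collections import defaultdict

SEASONS = (2008, 2009, 2010, 2011)
MIN_INNINGS = 120.0


def qualifying_pitchers(rows, seasons=SEASONS, min_innings=MIN_INNINGS):
    """Return pitchers who reached min_innings in every one of the seasons.

    rows is an iterable of (pitcher, season, innings) tuples (assumed layout).
    """
    required = set(seasons)
    seasons_met = defaultdict(set)
    for pitcher, season, innings in rows:
        # Only count seasons inside the study window that clear the threshold.
        if season in required and innings >= min_innings:
            seasons_met[pitcher].add(season)
    return sorted(p for p, met in seasons_met.items() if met == required)


# Hypothetical rows, not real pitchers or real innings totals.
sample_rows = [
    ("Pitcher A", 2008, 201), ("Pitcher A", 2009, 188),
    ("Pitcher A", 2010, 215), ("Pitcher A", 2011, 176),
    ("Pitcher B", 2008, 190), ("Pitcher B", 2009, 182),
    ("Pitcher B", 2010, 95), ("Pitcher B", 2011, 203),   # misses 2010
    ("Pitcher C", 2009, 160), ("Pitcher C", 2010, 171),
    ("Pitcher C", 2011, 150),                            # no 2008 season
]

print(qualifying_pitchers(sample_rows))
```

Only a pitcher who clears the threshold in all four seasons survives the filter, which is why the pool is fairly small.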
For my ability measure, I took the statistic whiff/swing, which is the number of swings and misses divided by the total number of swings. I like this measure because, to me, it is the simplest measure of an isolated part of a pitcher’s ability. Since the batter has already decided to swing, we are only looking at the pitcher’s ability to throw a ball that will evade the hitter’s bat. I know that making contact also depends heavily on the hitter’s ability, but I think that requiring 120 innings in each season gives each pitcher a large enough mix of opposing batters to take the individual batter out of the equation, so that whiff/swing can be used as a measure of pitcher ability.
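For concreteness, here is a minimal sketch of the whiff/swing calculation. The function name and the counts in the example are hypothetical; the only assumption is that whiffs (swings and misses) and total swings have already been tallied for a pitcher-season.

```python
def whiff_per_swing(whiffs, swings):
    """Share of a pitcher's swings against that miss the ball entirely."""
    if swings <= 0:
        raise ValueError("swings must be positive")
    if not 0 <= whiffs <= swings:
        raise ValueError("whiffs must be between 0 and swings")
    return whiffs / swings


# Hypothetical season totals, not taken from any real pitcher.
rate = whiff_per_swing(whiffs=287, swings=1400)
print(f"{rate:.3f}")
```

Called pitches and takes never enter the calculation, which is what isolates this one skill from things like command or the hitter’s plate discipline.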
For my performance measures, I used ERA and FIP from FanGraphs. I agree that ERA is not the best performance measure and may be considered more of a contribution measure; however, I have included it nonetheless. Finally, for my contribution measure, I decided to use FanGraphs WAR.
I calculated the average whiff/swing, ERA, FIP, and WAR for each pitcher over the four-year period. I also calculated the standard deviation within each pitcher for each stat and the within-pitcher coefficient of variation (stdev/avg). I report variability as the coefficient of variation because it is unitless: dividing each standard deviation by the corresponding average lets me compare stats that are reported on very different scales.
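The calculation can be sketched as follows. This is an illustration rather than my exact spreadsheet: the function names and the season values are made up, and I am assuming the sample standard deviation (n - 1 in the denominator); using the population version would give somewhat smaller numbers.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumes a nonzero mean)."""
    return stdev(values) / mean(values)


def average_within_pitcher_cv(seasons_by_pitcher):
    """Average the per-pitcher CVs for one stat.

    seasons_by_pitcher maps a pitcher to his list of season values.
    """
    cvs = [coefficient_of_variation(v) for v in seasons_by_pitcher.values()]
    return mean(cvs)


# Hypothetical four-season whiff/swing values, not real data.
whiff_by_pitcher = {
    "Pitcher A": [0.20, 0.22, 0.21, 0.19],
    "Pitcher B": [0.25, 0.24, 0.26, 0.25],
}

for name, values in whiff_by_pitcher.items():
    print(name, f"{coefficient_of_variation(values):.1%}")
print("average", f"{average_within_pitcher_cv(whiff_by_pitcher):.1%}")
```

The same two functions would be run once per stat (whiff/swing, ERA, FIP, WAR), so that each stat ends up with a single average within-pitcher percentage.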
Overall, across the four-season period, the 63 pitchers in my group had the following averages:
whiff/swing = 0.205
ERA = 4.03
FIP = 3.97
WAR = 3.08
The average within-pitcher coefficient of variation was:
9.6% for whiff/swing
18.5% for ERA
12.0% for FIP
and 47.7% for WAR.
So what does this mean? Well, I know this is just a start, but based on these numbers I believe my hypothesis is supported. A pitcher’s ability, at least as measured by whiff/swing, is much more consistent between seasons than his performance or contribution. Furthermore, performance is more consistent than contribution. It appears that the further you get from pure ability measures, the more difficult it will be to reliably predict a pitcher’s future performance and contribution. I’d like to do some further research on performance prediction to confirm this, but my guess is that trying to predict future WAR from past WAR will be extremely difficult. Predicting future WAR from past ability measures may prove to be more effective.