Pitching Stats and the Quality of Batters Faced

by Gabe

July 2, 2010

Pat Andriola’s recent post about a pitcher’s opposition prompted me to present something I’ve been playing with for a few months. Several months ago, around the time of the Cy Young Awards, I saw a debate on another website focusing on the question of who the best pitcher in baseball was. The debate primarily centered on Roy Halladay and Tim Lincecum. One thing that was continually brought up in defense of Halladay was that he’d faced much stiffer competition than Lincecum, and that needed to be taken into account. Baseball Prospectus posts OPS of batters faced on their stat pages, but I thought that there had to be something that was better, something that was more quantifiable. This analysis, originally posted at Lookout Landing, is the result of that thought.

Special Thanks

I’d like to thank both Graham MacAree and Matthew Carruth up front. Graham allowed me to bounce the idea off of him and helped me start the list of caveats. He also put me in touch with Matthew, who was gracious enough to send me the data behind all pitcher/batter matchups in 2009. I’d also like to thank them publicly for StatCorner, as I used their tRA data as the basis for the pitching numbers and their wOBA data as the basis for the hitters. You guys rock.

The Steps

The solution to me seems to be wOBA of batters faced. It’s easily understood (well, if you’re a stats nerd anyway) and incredibly easy to use in analysis. If you can get the data, it’s not that hard to weight hitters’ wOBA figures together to get an aggregate. I started by hand-pulling data from baseball-reference, but that was incredibly time-consuming. I got in touch with Matthew and he graciously provided the batter/pitcher data that allowed me to run this for the pitcher universe in a much easier fashion.
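As a rough illustration of the weighting step, here is a minimal Python sketch. It assumes the matchup data can be reduced to pairs of (hitter's season wOBA, plate appearances against the pitcher); the function name and the sample numbers are hypothetical and are not taken from the StatCorner data.

```python
def woba_of_batters_faced(matchups):
    """Plate-appearance-weighted average wOBA of the hitters a pitcher faced.

    matchups: iterable of (hitter_season_woba, pa_vs_pitcher) pairs.
    This layout is an assumption made for illustration only.
    """
    matchups = list(matchups)
    total_pa = sum(pa for _, pa in matchups)
    if total_pa == 0:
        raise ValueError("pitcher has no batters faced")
    # Each hitter's wOBA counts in proportion to how often he was faced.
    return sum(woba * pa for woba, pa in matchups) / total_pa


# Hypothetical pitcher: 10 PA against a .400 hitter, 30 PA against a .300 hitter.
example = woba_of_batters_faced([(0.400, 10), (0.300, 30)])
print(round(example, 3))
```

The weighting matters because a hitter faced 30 times should pull the aggregate more than one faced 10 times.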

Once you get the wOBA figures for the average hitter that faces a given pitcher, you need the average league wOBA to convert that to a runs figure. I compiled the StatCorner data by league and got averages of .341 in the AL and .330 in the NL. For this analysis, I included pitchers’ hitting stats (from what I understand, they’re typically excluded from the averages that drive batting runs above average) since that’s a major component of the difference between leagues. Additionally, I created a major league average wOBA.

I then calculated the bRAA of the hitters facing a given pitcher just like you would to create a hitter’s batting contribution: ((wOBA - league average wOBA) / 1.15) * Plate Appearances (or in this case, Total Batters Faced). So if in 2009 Zack Greinke faced an average hitter with a .340 wOBA and the AL average is .341, that cumulative hitter over the number of batters Greinke faced was 1.23 runs below average. Similarly, I made the calculation substituting major league average wOBA (.335 from StatCorner) for the league-specific figure and calculated the average hitter faced by each pitcher under that scenario (the comparable Greinke figure was a +3.42 run hitter). For the record, there is roughly a 10 run spread between the pitchers who face the “worst” and “best” average hitters in each league, and roughly 20 runs between the worst and best average hitters across all major league pitchers.
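The bRAA calculation can be sketched in a few lines of Python. The 1.15 divisor and the league averages come from the text above; the function name and the 900 batters faced in the example are assumptions for illustration, not Greinke's actual total.

```python
WOBA_SCALE = 1.15  # divisor used in the article to convert wOBA points to runs


def braa_of_batters_faced(avg_woba_faced, baseline_woba, batters_faced):
    """Runs above average of the aggregate hitter a pitcher faced.

    baseline_woba can be a league-specific average (e.g. .341 AL, .330 NL)
    or the combined major league average (.335), as described in the article.
    """
    return (avg_woba_faced - baseline_woba) / WOBA_SCALE * batters_faced


# Hypothetical pitcher who faced 900 batters averaging a .340 wOBA.
vs_al = braa_of_batters_faced(0.340, 0.341, 900)   # slightly below AL average
vs_mlb = braa_of_batters_faced(0.340, 0.335, 900)  # above the MLB average
print(round(vs_al, 2), round(vs_mlb, 2))
```

The same set of hitters can come out below average against one baseline and above average against another, which is why the within-league and cross-league leaderboards differ.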

I then took those bRAA figures and used them to adjust tRA, which is easily done by multiplying the bRAA figure by 27, dividing by xOuts, and subtracting the result from tRA (so a pitcher that faces a below average hitter would see an upward adjustment to his tRA). Intuitively it makes sense to me that if Halladay is a +44 pitcher and the hitters he faced were +5, then he should get credit for actually being something close to +49. I do this both within leagues and across leagues, and the differences between the adjusted and unadjusted leaderboards are shown below. I limited it to pitchers with 300 or more expected outs (so approximately 100 innings pitched). Clearly there’s a bit of reshuffling and the largest change is the AL/NL reshuffling on the combined leaderboard (note that you may have to open the leaderboards for full effect).
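A minimal Python sketch of that tRA adjustment follows. The function name and the sample inputs are hypothetical; only the "multiply by 27, divide by xOuts, subtract" rule comes from the text.

```python
def adjust_tra(tra, braa_faced, x_outs):
    """Adjust tRA for the quality of batters faced.

    braa_faced is converted to a per-27-outs rate and subtracted, so facing
    better-than-average hitters (positive bRAA) lowers tRA and facing
    worse-than-average hitters (negative bRAA) raises it.
    """
    if x_outs <= 0:
        raise ValueError("x_outs must be positive")
    return tra - braa_faced * 27 / x_outs


# Hypothetical pitcher: 3.50 tRA over 675 expected outs.
print(round(adjust_tra(3.50, 5.0, 675), 2))   # faced +5 run hitters
print(round(adjust_tra(3.50, -5.0, 675), 2))  # faced -5 run hitters
```

Because the adjustment is divided by xOuts, the same bRAA figure moves a low-workload pitcher's tRA more than a high-workload pitcher's.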

Results and Application

In general, the changes were what I expected. AL pitchers face better hitters than their NL counterparts (which makes total sense given the DH rule). Within the leagues, the pitchers in each East division faced the toughest hitters. But somewhat surprisingly, there were some relatively meaningful differences even among starters on a given team (for instance, Adam Wainwright faced a +1.2 bRAA NL hitter, while Chris Carpenter and Joel Pineiro both faced hitters around -2.5 bRAA; granted, it’s not huge, but it’s still almost half a win).

As far as how it gets applied, I’m still not totally sure about applying it directly to tRA (or FIP). I think the adjustment works to an extent, but there’s probably some noise in there or perhaps a good reason why we shouldn’t just add pRAA to bRAA against, especially when trying to look at AL vs. NL pitchers. I also believe there’s likely to be some very good information contained in rolling this up by team or even division, which could aid in projecting “next year” for a player that changes teams/divisions/leagues from one year to the next (certainly multiple years would be needed).

Caveats

I have several caveats about this analysis. For one, it is heavily driven by the wOBA of hitters faced. It is possible that if, say, the AL is similarly better than the NL at both hitting and pitching, differences across leagues may not be picked up correctly (i.e., a .335 wOBA in the NL is potentially not the same as a .335 wOBA in the AL). Similar to that is the idea that there could be a disconnect within leagues as well due to the variation in the quality of pitchers that individual hitters face, which helps drive each individual’s wOBA (of course now we’re back to a very cyclical chicken vs. egg argument). Second, I’m using but one year of data, so I’d need to run this several more times to see if 2009 is a representative year. As described above, I’m not sure if it works as an actual adjustment or if it should just be informational. I’ve also made no effort yet to figure out next steps as far as how this may be regressed. Additionally, I considered attempting to use left/right split wOBA data in the analysis but decided against it. That is one more potential refinement. Lastly, I’m not sure how this interacts with stats like tRA* or xFIP, as the adjustment of certain underlying batted ball figures would undoubtedly take care of some aspects of “facing better hitters” or whatever you want to call it.

Conclusion

These are but some of my thoughts on adjusting pitching stats for the quality of batters faced. I’m very interested in what the larger group thinks about the merit of such an adjustment, especially given some new information on how big some of the tRA adjustments are. What else should be considered? Are there other reasons that you have why it may or may not work? How do we consider the chicken and egg nature of adjusting both hitters and pitchers for the quality of the opponent? I’d love to hear any comments any of you have, either positive or negative. Thanks for taking the time to read this!

15 Comments

LMack

13 years ago

Very interesting read, thanks.
Although it does lead to a mind-boggling cyclical phenomenon 🙂

mweir145

13 years ago

But Halladay played for the Blue Jays

William

13 years ago

Gabe –

(These columns should be on the main article column, as they are quite good but get far far fewer comments … maybe just don’t label them as “community”)

This is a great step you’ve taken here. So, I have no idea why, but some of the major upwards moves happened with three pitchers I targeted based on a relatively very-simplistic screen of SPs with lower than avg. *tERA and xFIP and higher than avg. BABIP: Lester, Wainwright and Johnson.

Is there some connection there?

Mason

13 years ago

I’ve been thinking about this for a while. I don’t know if the data is available, but I think looking at individual matchups might help explain some variations we couldn’t previously explain. Often we’ll notice a pitcher’s O-Swing% or F-Strike% will vary from year to year. We can chalk it up to random variation, but it might be that a given pitcher isn’t drawing as many swings on pitches outside the strike zone because he’s facing Kevin Youkilis every other at bat.

I think the same analysis could also be applied to quality of pitcher a given hitter faces. Maybe Adam Lind is having a similar year to last year except he’s facing Francisco Liriano or a LOOGY for a large portion of his at bats.

I’m sure this wouldn’t explain all good or bad years, but we might be able to quantify some effects that we previously attributed to luck.

Sorry I just added more questions instead of answering anything.

kds

13 years ago

Are these figures park adjusted? For example, let’s say that Lester’s batters hit at a .340 wOBA on average. But when they were facing Lester about half the time they were in Fenway, and this would affect their expected wOBA. And the effects would likely vary with the handedness of the batter. A lot of layers to this onion!

Gabe

13 years ago

Thanks for the comments, guys.

William – It stands to reason that one potential reason a pitcher might have a higher than expected BABIP is because he’s facing better hitters, which would lead to a larger adjustment here and a move upwards. A quick scan of the wOBA leaderboards over the past several seasons shows a bunch of the high wOBA hitters with higher than average BABIP.

Mason – I think you’re right. Since I’ve got the matchup data now, I’m going to play with this on the hitter side as well. It’s very cyclical though and I’m still not sure if it’s just informational or if I could use one to adjust the other and then back, etc. But the O-Swing% or F-Strike% ideas are interesting as well and definitely something to keep in mind.

kds – The figures are not park adjusted. That’s definitely another layer and could potentially be incorporated if I try to play with the handedness aspect of this (in the form of wOBA splits).

pft

13 years ago

These numbers are almost meaningless without taking into account how the hitters were hitting at the time the pitcher faced them. For example, anyone facing Big Papi last April and May was not facing the hitter his season-ending stats said he was. Same thing with JD Drew: he was hitting .244 near the end of July, and a hot 6 weeks at the end of the year padded his season-ending stats.

By the same token, strength of schedule should take into account how a team had been playing at the time opponents played them. Facing the Red Sox in April was much easier than facing them in June, yet the stats will show the Rangers, who played them in April, as playing a .600 team (they are, but they were playing as a sub-.500 team).

AC_Butcha_AC

13 years ago

Have you ever considered that great pitchers suppress an opponent’s wOBA?
Let’s say Lincecum faces a true .350 wOBA line-up every start. But because of his ability he holds them to a .310 wOBA.
Now you come along and say: Wait a second Lincecum didn’t face good hitters. This other guy (a terrible pitcher who makes a real .310 wOBA team a .350 wOBA team) was facing tougher competition.

I know that these examples are a little unrealistic, but I guess you know what my point is.

Gabe

13 years ago

AC – Not only have I thought of the idea that a great pitcher would suppress an opponent’s wOBA, I expect it. And I’m not looking at the results/performance against each pitcher, but rather the relative quality of the hitters each pitcher faced based on their performance against every pitcher.

So in your example, I’d be adjusting Lincecum to account for the fact that he faced a .350 wOBA lineup and the other pitcher to account for him facing a .310 wOBA lineup, if that makes sense. To continue with your example, if .330 wOBA was average, then Lincecum would get extra credit for facing much better hitters and the other nameless pitcher would get dinged because he faced bad ones.

evo34

13 years ago

Nice article. This is the next major step in improving player projections. Imagine if power rating systems for teams did not incorporate strength of schedule… That is where we are at present in baseball player projections.

pft – your point is absurd. You cannot use tiny sample sizes (such as those available in April and early May) to evaluate batter quality. Players don’t magically change from great to terrible to great hitters throughout the course of a season. It’s called natural variance. The best way to evaluate schedule strength of pitchers would be to use some weighted average performance of hitters that used past seasons as well as current season data, e.g., the formula used for CHONE rest-of-season projections.

AC_Butcha_AC

13 years ago

Hey Gabe,

actually what I mean is this:

Let’s stay with Lincecum for this example. And my other guinea pig is Tulowitzki. Now Let’s assume Tulowitzki is a .370 wOBA player versus league average pitching in a league average park.

-The reason I chose Tulo is simply because they are playing in the same division. –

So now let’s assume Lincecum is able to cut off 40 points of wOBA from the average player he faces. Simple math… that would mean Tulo’s offense would drop to a .330 wOBA mark.

So far so good. But this is where the problem starts. Assuming .330 wOBA is average it is logical to credit Lincecum who holds a true .370 wOBA hitter to average. BUT even if Tulo’s real talent level is .370 he is hitting .360. Why? because he faced Lincecum way more often than Mr. Jimmy Average because of the unbalanced schedule. So now you are crediting Lincecum to hold a .360 hitter to average while he is actually holding a true .370 hitter to average. So he gets underrated a little bit.

Now the other extreme. Let’s say Josh Johnson is having a terrible year. He adds 40 points of wOBA to the average hitter in an average park. Ryan Howard’s true talent level is the same as Tulo’s, sitting at .370. So we could say Howard would post a .410 wOBA vs. Johnson. But this time our batter is hitting .380 for the season because of the fact he faces Johnson more often.
This time you would say Johnson faces tougher competition than Lincecum while they are facing essentially the exact same hitter.

My first post could have been a little confusing because of my extreme examples. I hope this close-to-reality example helps you understand my point a little better.

To sum it up:
While facing the exact same talent level, you would assume Johnson had tougher competition, whereas the competition was identical.

Note: This example only works if the sum of all the other pitchers faced other than my mentioned pitchers were exactly average or really close to.

(sorry if there are mistakes in language… ich bin ein Deutscher who just likes baseball and has enough time since my team got kicked out of the World Cup xD)

AC_Butcha_AC

13 years ago

Just wanted to mention that I totally like the idea of adjusting for the talent faced and I appreciate your work! I just want to help you improve it. So don’t get me wrong 😉

peace

Gabe

13 years ago

Got it. That’s basically my second caveat. I’m not sure that it’s possible to adjust hitters for the pitchers they face and then turn around and adjust the pitchers for the “adjusted” hitters they faced. It’s just so cyclical. At that point you may be better off doing something based on some set of context-neutral projections…

But since I’ve got all the data, I’m going to try to run it in reverse to see how drastic an impact it would be for the hitters. I always thought of it as “pitcher vs. team” and “hitter vs. pitcher” which meant that Halladay facing the Yankees in ~15% of his starts was a much bigger deal than ARod facing Halladay 20 times over the course of the season. But it’s possible that the same argument could hold water if a hitter has a disproportionate share of plate appearances against a team that’s really good at pitching.

AC_Butcha_AC

13 years ago

Maybe this could be a way to solve this cycle:

Example:
Greinke faced 700 batters. These 700 plate appearances created a .320 wOBA against him. LgAVG wOBA is .330. Now you look at who these 700 guys actually were. They posted a combined .350 wOBA when NOT facing Greinke. So one might initially think he depressed opponents’ wOBA by 10 points while he actually depressed it by 30 points.
So you then use .350 as ‘league average’ and the .320 as the quality faced.

(.320 - .350) / 1.15 * 700 = -18.3 bRAA
instead of the wrong
(.320 - .330) / 1.15 * 700 = -6.1 bRAA

Actually, I think you can ignore the fact that hitters face too many good pitchers since they see a different starter every game and even in divisions with great pitching talent this problem can almost be ignored since it is a great improvement already.
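The arithmetic in this comment can be reproduced with a short Python sketch. The function name is hypothetical, and the inputs are the commenter's made-up Greinke numbers rather than real 2009 data.

```python
WOBA_SCALE = 1.15  # same wOBA-to-runs divisor used in the article


def runs_vs_baseline(woba_allowed, baseline_woba, batters_faced):
    """Runs relative to a baseline wOBA over a number of batters faced."""
    return (woba_allowed - baseline_woba) / WOBA_SCALE * batters_faced


# Baseline = what these same hitters did when NOT facing the pitcher (.350).
vs_opponents = runs_vs_baseline(0.320, 0.350, 700)
# Baseline = plain league average (.330), which ignores who was faced.
vs_league = runs_vs_baseline(0.320, 0.330, 700)
print(round(vs_opponents, 1), round(vs_league, 1))
```

Swapping the baseline from league average to the opponents' own wOBA elsewhere is what triples the size of the credit in this example.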

Paul Kasiński

13 years ago

In order to solve the problem that you mentioned at the end of the article (the AL may be better at both pitching and hitting), you should look at how many more runs the AL teams scored in this year’s interleague games. There were 252, which is a good sample size, so that should give you your answer.
