Archive for July, 2010

Cliff Lee Hates Walks

If Kevin Youkilis is the “Greek God of Walks,” does that make Cliff Lee his mortal enemy? It’s an interesting query to ponder, considering Cliff Lee’s complete refusal to walk batters this season. At this pace, Lee is set to shatter the single season record for K/BB ratio. Given that the record for highest K/BB isn’t as universally celebrated as the single season home run or RBI leaders, let’s take a closer look at Cliff Lee’s historic season.

The current record holder in K/BB rate is Bret Saberhagen, who posted a K/BB rate of 11.00 over 24 starts in 1994. By comparison, in 13 starts this season Cliff Lee’s current K/BB rate sits at 14.83. While the list of K/BB leaders is littered with players from the 1800s, recent players on the list include Curt Schilling, Pedro Martinez, Greg Maddux, Ben Sheets, and Carlos Silva. Since it’s difficult to compare players from the 1800s with players today, let’s take a look at how Lee stacks up against the recent control freaks.

Player               GS    K/BB     K/9    BB/9     FIP     WAR   WAR/GS
Cliff Lee            14   15.17    7.27    0.48    2.58     3.8     0.27
Bret Saberhagen      24   11.00    7.26    0.66    2.76     5.2     0.21
Curt Schilling       35    9.58   10.97    1.15    2.40     9.7     0.27
Pedro Martinez '99   29    8.46   13.20    1.56    1.39    12.1     0.42
Pedro Martinez '00   29    8.88   11.78    1.33    2.17    10.1     0.35
Ben Sheets           34    8.25   10.03    1.22    2.65     8.0     0.24
Carlos Silva         27    7.89    3.39    0.43    4.18     3.0     0.11
Greg Maddux          33    8.85    6.85    0.77    2.43     8.2     0.25

A quick look at the table reveals the true dominance of Cliff Lee this season. On a per-start basis, Lee is set to match or beat the WAR of every pitcher on the list except Pedro Martinez; only Schilling, also at 0.27 WAR/GS, keeps pace with him. While WAR/GS is a crude way to predict Lee’s WAR going forward, it does tell us how incredible his performance has been in the first half for the Mariners/Rangers. It’s also worth noting that even though he struggled in his Rangers debut, Lee did not give up a walk, increasing his K/BB rate while decreasing his overall BB/9 on the season. Despite a K/9 rate in line with Saberhagen, Lee is on pace to best Saberhagen in every single category in the table. Outside of the big strikeout guys (Schilling and Martinez), Lee may actually outproduce every other player in the table.
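For anyone who wants to check the per-start math, here is a minimal sketch that recomputes WAR/GS from the table's rounded figures and approximates K/BB from the two per-nine rates. The function names are my own, and because the inputs are already rounded, the results can differ from the table's columns by a hundredth or so.

```python
# Rows copied from the table above: (player, GS, K/9, BB/9, WAR).
TABLE = [
    ("Cliff Lee", 14, 7.27, 0.48, 3.8),
    ("Bret Saberhagen", 24, 7.26, 0.66, 5.2),
    ("Curt Schilling", 35, 10.97, 1.15, 9.7),
    ("Pedro Martinez '99", 29, 13.20, 1.56, 12.1),
    ("Pedro Martinez '00", 29, 11.78, 1.33, 10.1),
    ("Ben Sheets", 34, 10.03, 1.22, 8.0),
    ("Carlos Silva", 27, 3.39, 0.43, 3.0),
    ("Greg Maddux", 33, 6.85, 0.77, 8.2),
]


def war_per_start(war, games_started):
    """WAR divided by games started, i.e. the table's WAR/GS column."""
    return war / games_started


def k_per_bb_from_rates(k_per_9, bb_per_9):
    """Approximate K/BB from the per-nine rates (the innings cancel out)."""
    return k_per_9 / bb_per_9


def summarize(rows):
    """Return one dict per pitcher with the two derived figures."""
    return [
        {
            "player": name,
            "war_gs": war_per_start(war, gs),
            "k_bb": k_per_bb_from_rates(k9, bb9),
        }
        for name, gs, k9, bb9, war in rows
    ]


if __name__ == "__main__":
    # Print pitchers from best to worst WAR per start.
    ranked = sorted(summarize(TABLE), key=lambda r: r["war_gs"], reverse=True)
    for row in ranked:
        print(f"{row['player']:<20} WAR/GS {row['war_gs']:.3f}   K/BB ~{row['k_bb']:.2f}")
```

Run on the rounded inputs, Lee comes out near 0.271 WAR per start and Schilling near 0.277, so the two are effectively level. The small gaps against the table's WAR/GS column presumably come from the table having been built on unrounded WAR.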

Even though K/BB leader isn’t a highly distinguished title, it’s certainly a sign of a player’s dominance in a particular season. No pitcher in the history of baseball has shown the amount of control Lee has exhibited this season. Since Lee’s strikeout rates are only above-average, you might expect batters to make a lot of contact against Lee, leading to more hits and a higher WHIP. This hasn’t been a normal season for Lee, however. The lefty has posted a WHIP of 0.95 this season, the top mark in the league. Some of that can be attributed to luck, but his current BABIP of .291 is actually fairly close to his career average of .305. With the recent trade, however, it’s going to be tough for Lee to match or improve on his numbers going forward. Leaving Safeco (and the Mariners defense) and moving to Texas will affect Lee’s numbers slightly. Despite that move, Lee still has a chance to complete one of the finest seasons by a pitcher. Even if Kevin Youkilis is Cliff Lee’s mortal enemy, I think it’s safe to say that every hitter despises Lee, especially this season.

*This article was originally written for

Subjectivity Objectified: Measuring Fans’ Biases with All-Star Votes

It doesn’t take a hardcore sabermetrician to realize that the All-Star vote is a sham. After all, the undeniable best catcher in the game received only the 11th-most votes at his position, and Omar Infante made the cut while MVP candidate Ryan Zimmerman had to sit at home (not the fans’ fault, but still).

But even if it’s impossible to distinguish the game’s best players by looking at the vote totals, I wondered if it would be possible to gather some more unorthodox information from the results: namely, the impact of fans’ biases on their ballots.

I quickly scratched out an equation for a statistic I made up, called “All-Star Score,” to measure how deserving a player is of fans’ votes for the Midsummer Classic:

All-Star Score = (Wins Above Replacement* + 2) ^ 2

*—numbers as of the All-Star Game

I calculated the All-Star Scores for each player listed on the ballot and added them together. I then added up the total All-Star votes cast (Major League Baseball releases the vote totals for only the Top 25 outfielders and Top 8 vote-getters at other positions per league, so I used 300,000 as a baseline for those players whose results were not available) and divided that by the composite All-Star Score to find out what the average All-Star Score Point was worth (just under 74,000 votes).

Finally, I calculated the votes-per-All-Star Score points ratios for each team, then divided that by the league average to get an estimate of what proportion of votes each team’s players got relative to what they deserved. The numbers below show each team’s relative figure as a percentage—a “Bias Score” of 100 would mean the team received exactly the right amount of support (of course, no club came out at 100).
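The steps above can be sketched in a few lines of code. This is only an illustration of the method as described: the function names are mine, and the sample ballot is made up purely to show the arithmetic (it is not real voting data).

```python
from collections import defaultdict

# The article's baseline for players whose vote totals were not published.
UNLISTED_VOTES = 300_000


def all_star_score(war):
    """All-Star Score = (WAR + 2) ^ 2, as defined in the article."""
    return (war + 2) ** 2


def bias_scores(players):
    """Compute each team's Bias Score.

    players: iterable of (team, war, votes), where votes is None when
    the player's total was not released.
    """
    team_votes = defaultdict(float)
    team_score = defaultdict(float)
    for team, war, votes in players:
        team_votes[team] += UNLISTED_VOTES if votes is None else votes
        team_score[team] += all_star_score(war)

    # League-wide value of one All-Star Score point, in votes.
    league_rate = sum(team_votes.values()) / sum(team_score.values())

    # Team votes-per-point relative to the league rate, as a percentage.
    return {
        team: 100 * (team_votes[team] / team_score[team]) / league_rate
        for team in team_votes
    }


if __name__ == "__main__":
    # Hypothetical two-team ballot, for illustration only.
    sample = [
        ("Team A", 2.0, 1_600_000),
        ("Team A", 0.0, None),
        ("Team B", 2.0, None),
        ("Team B", 0.0, None),
    ]
    for team, score in bias_scores(sample).items():
        print(team, round(score))
```

One quirk worth knowing about: because the score squares (WAR + 2), a player with a WAR below -2 would see his score start rising again. That is unlikely to matter by the All-Star break, but it is a limit of the formula.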

I’m fully aware of the flaws in my experiment: the statistics used were compiled after the voting, not during it; I’m sure my 300,000-vote estimate for the lower-tier players is extremely generous to some and a big low-ball to others; and, of course, there’s no guarantee that my little equation represents the ideal proportion of All-Star votes a candidate should receive.

Nonetheless, I think the results are both somewhat meaningful and interesting:

Tier 1: The Unloved (79 and below)

1 White Sox 47
2 Royals 47
3 Athletics 48
4 Padres 49
5 Giants 50
6 Cubs 56
7 D-Backs 57
8 Blue Jays 59
9 Indians 59
10 Nationals 59
11 Orioles 60
12 Rockies 66

If you look at the vote totals, seeing the Royals and A’s at the top of the list shouldn’t come as a surprise: they’re two of the three miserable teams that didn’t get a single player on the voting leaderboards. Meanwhile, the starting nine for the Orioles—the only other club to be completely neglected—have been so bad that Baltimore landed in the middle third of the Bias Scores despite having the absolute minimum number of votes. Ouch.

It’s no surprise to see struggling teams like the Indians and Diamondbacks fall this low, but I would have expected Padres, Blue Jays, and Nationals fans to show their favorite players a little more love in light of their teams’ expectations-beating early performances. And I’m shocked that the Rockies haven’t been able to generate more excitement, what with their recent string of comeback wins in playoff races.

However, I’d say the biggest upsets here are the teams from Chicago—particularly the Cubs. North Side fans have a reputation for being among the most loyal and passionate in baseball (after more than a century without a championship, they’d have to be). It’s a telling sign that something is very wrong in Wrigleyville.

Tier 2: The Average (80 to 120)

13 Marlins 80
14 Pirates 81
15 Reds 84
16 Red Sox 90
17 Astros 102
18 Mariners 114
19 Rangers 120

The first team that jumps out at you here is Boston: how can Red Sox Nation be classified as a relatively unbiased fanbase? Take a look at the leaderboards and it becomes clear. Adrian Beltre finished behind Michael Young, Kevin Youkilis got barely half the votes of scuffling Mark Teixeira, and even local hero David Ortiz fell behind the anemic Hideki Matsui. Derek Jeter has been better than Marco Scutaro, fine, but does he really deserve six times as many votes?

Two teams in this grouping redefine pathetic. A 20th-place finish for Andrew McCutchen is enough to put the Pirates squarely in the middle of the pack because their eight candidates have combined to be of less value than Dan Uggla. Astros fans, meanwhile, turn out to have a positive bias because of Lance Berkman’s eighth-place finish at first base. That’s what happens when your team has a negative composite WAR.

The two AL West teams are both interesting cases. The Mariners don’t have much of a reputation for a strong fan base, but people love Ichiro and the now-retired Ken Griffey Jr. raked in over a million votes. Given that the Rangers have the third-highest team vote total in the game, you might expect them to have a far higher Bias Score. But you might not realize that Texas also has the third-highest composite WAR.

Tier 3: The Coddled (121-150)

20 Tigers 126
21 Angels 129
22 Dodgers 129
23 Cardinals 134
24 Brewers 138
25 Mets 146

Most of these names were pretty predictable. The Brewers are probably the most surprising team to be ranked this far up. Their high score is entirely the fault of Ryan Braun, who led all outfielders with just under 3 million votes despite a significant offensive dropoff and horrific defense, even by his standards.

Tier 4: The Overindulgent (151-190)

26 Braves 159
27 Rays 163
28 Twins 171
29 Phillies 181

Eight years ago, the Twins were on the verge of falling victim to contraction. Three years ago, the Rays had never finished a season with more than 70 wins. If you’d said then that both teams would soon have some of the most passionate fans in baseball, you would have been laughed out of the room.

Tier 5: The Insane (191 and up)

30 Yankees 199

I’m sure some commenter will accuse me of writing this article for the sole purpose of blasting the Yankees. I’ll say here for the first and only time that, while their coming out on top was somewhat predictable, this is just how it happened.

Just look at the vote totals. A-Rod over Beltre two-to-one, Curtis Granderson over Alex Rios by a nearly three-to-one margin, Teixeira over Paul Konerko almost five-to-one, Jeter over Cliff Pennington by over 10-to-one. Is there any logical explanation for that? And this isn’t even taking into consideration Nick Swisher’s Final Vote victory over Youkilis.

I’ll be the first to admit that this isn’t a definitive study—the rankings would surely be shuffled around if the full, precise vote totals were available (especially towards the lower end), and I don’t think anyone believes for a second that fans in Houston are more loyal than their counterparts in Boston. But I still think the results are somewhat telling, so in the future, fans in Minnesota and Wisconsin might want to think twice before complaining about East Coast bias.

Lewie Pollis is a recent high school graduate from outside of Cleveland, Ohio. He will be attending Brown University starting in Fall 2010. For more of his writing, click here.

It’s Time to Stop Using BABIP

I originally wrote this on Amazin’ Avenue, an analytics-friendly (to say the least) Mets blog/community.  It was well received so I am submitting it for cross-posting here.

* * *

A week or so ago, the Mets’ award-winning television team (well, the Gary and Ron parts) started talking sabermetrics — specifically, BABIP. They tore it a new one, and for the most part, it’s because they didn’t understand what BABIP meant, or did, or… whatever. It doesn’t matter.

What matters is that they talked about BABIP.  Which is horrible, because they’re going to botch it 100% of the time.  And that’s our fault, not theirs.  It’s time to stop using it.


By itself, batting average on balls in play means nothing. It tells us how often a player gets a hit during the at bats when he doesn’t homer or strike out, which in and of itself is worthless. We know better. Gary and Ron know better. BABIP doesn’t differentiate between lineouts and popouts. It treats a double in the gap the same as a bloop single. Gary and Ron know it, and they laugh at our geekiness. We don’t care how hard a guy hits a ball. We’re nerds and the numbers don’t tell us that. Literally:

Gary: Conversely, if a pitcher has a particularly low batting average on balls in play, they like to tell you it’s going to rise eventually. Well, to me that doesn’t make any sense. Certain guys hit the ball harder than other guys hit it. Certain pitchers induce more groundballs or more weakly hit balls than others. That’s part of what you’re trying to do. Am I totally off base with that?

Ron: No I totally agree with you, I think that for the average hitter, to have a high average putting balls in play, it’s probably because they do have some lucky hits. But certain hitters, like [David] Wright, hit the ball hard almost all the time.

Of course, we know it too.  We measure line drive rates and stuff like that.  We have xBABIP!   Yeah, go us!  And no, we don’t differentiate between the bloop single and the gap double — well, not independent of line drive percentage, etc.  But that’s the whole point.  We’re trying to measure how lucky the batter has been.  We want to know what the batter’s expected batting average is.

So let’s just say that.  Stop with the BABIP.  Stop with the esoteric number which only means something in relation to another number (BA) and even then really needs to incorporate other numbers (e.g. LD%) to truly say what we want to say.   Let’s do this instead.

1) Call it “Expected Batting Average.”

Obviously, BABIP isn’t a player’s expected batting average. BABIP is a tool we use to try and figure out a player’s xBA (ooh! I acronymified it!), but that’s OK. Let’s figure out the xBA and call it xBA.

2) Explain it in words.

Start with this:

Know what the difference between hitting .250 and .300 is? It’s 25 hits. 25 hits in 500 at bats is 50 points, okay? There’s 6 months in a season, that’s about 25 weeks. That means if you get just one extra flare a week – just one – a gorp… you get a ground ball, you get a ground ball with eyes… you get a dying quail, just one more dying quail a week… and you’re in Yankee Stadium.

That makes a ton of sense.  It has to.  It’s from Bull Durham.

But you know what?  Dying quails are fluky.  They’re luck.  Ground balls with eyes, same thing.  Flares, gorps, whatever.  Luck. That’s what Crash is saying there. The difference between a .250 hitter and a .300 hitter is a little bit of luck each week.

Guys who hit the ball hard, they don’t need as much luck.  Turn those grounders into line drives and those dying quails into warning track doubles and they’re hits — to hell with luck.  Luck is for guys like Alex Cora and Gary Matthews Jr. and that guy Rick Evans or something.

We say, screw that. Let’s look at each at bat. If a guy hits a frozen rope that’s caught, we know that’s not his fault. Over time, that’ll even out, and he’ll get more hits. If a guy strikes out, that’s an out every time. Same with a pop up. That won’t even out. Homers? Always a hit. Grounders with eyes? Well, that’s usually an out, and that’ll even out over time too. We look at every single at bat and ask if the guy hit the ball hard enough to “make his own luck.” That’s xBA.

(And you know what? At the end of the day, that’s what BABIP turns into, too. Except that BABIP sucks, because it doesn’t actually start there, either in name or in its equation.)
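For reference, this is the equation being complained about. The sketch below uses the conventional BABIP definition, (H - HR) / (AB - K - HR + SF); the function name and the sample stat line are hypothetical and only there to show the arithmetic.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to average over")
    return (hits - home_runs) / balls_in_play


if __name__ == "__main__":
    # Hypothetical season line: 150 H, 20 HR, 500 AB, 100 K, 5 SF.
    print(f"{babip(150, 20, 500, 100, 5):.3f}")
```

Notice that nothing in the inputs says how hard any ball was hit, which is exactly the objection Gary and Ron raised.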

3) Drop the arrogance of specificity.  Use ranges when possible.

We’re measuring luck.   Luck isn’t exact.   So we’ll never be right on the money.  You’ll never be able to find a season where a significant number of players have an xBA equal to their actual batting average.  That makes us look stupid, when in fact, we’re just being arrogant — by being so exact.

We should use ranges.  xBA should be the 50% confidence interval, not the midpoint thereof.  More made up numbers: If a guy’s xBA is .285, it’s probably better expressed by saying that it’s between .279 and .291, or whatever.  It makes that .290 BA not seem “lucky” (it really isn’t) but tells us that a .274 is really unlucky.   In other words, it does the job — without the excruciatingly nerdy exactitude we are (wrongly) associated with.
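Here is one rough way to put a range around an xBA, offered as a sketch rather than the right answer. It assumes every at bat is an independent trial with hit probability equal to the xBA (a simple binomial model) and uses the normal approximation for the middle 50%. The model, the function name, and the 500 at bats (borrowed from the Bull Durham quote) are my assumptions, not part of the original argument.

```python
import math

# z-score that bounds the central 50% of a normal distribution.
Z_50 = 0.6745


def xba_range(xba, at_bats, z=Z_50):
    """Return (low, high) bounds around xba for the given number of at bats.

    Assumes a binomial model with the normal approximation, so it is only
    a rough guide, especially for small samples.
    """
    standard_error = math.sqrt(xba * (1 - xba) / at_bats)
    return xba - z * standard_error, xba + z * standard_error


if __name__ == "__main__":
    low, high = xba_range(0.285, 500)
    print(f"xBA .285 over 500 AB: {low:.3f} to {high:.3f}")
```

Under these assumptions the band for a .285 xBA over 500 at bats runs from roughly .271 to .299, wider than the made-up numbers above. The exact width depends entirely on the model and the sample size, which is one more argument for not pretending to three-decimal precision.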

It’s our job to communicate this stuff. It’s not their job to get smarter (they’re not dumb) or to figure it out themselves (they’re busy), and the problem isn’t that they don’t respect us (true, but fixable). The problem is semantic, not logical, and semantic problems can — and indeed, must — be fixed by revising our language. It’s time to stop using BABIP.

Dan writes a daily email newsletter, “Now I Know,” which shares something interesting to learn each day.

Call-Up Time: Brett Wallace

With the Major League season at its halfway point, and the Jays quickly running away from the playoff hunt, it’s time to look at the top man in their minor league system and see if there’s room to work him into the everyday line-up in The Bigs. This would of course be Brett Wallace, acquired in the Roy Halladay trade this off-season in a three-way deal involving Philadelphia and Oakland. Brett started his pro career at third base, but was moved to first base to start the season in order to mitigate his defensive shortcomings. This also works out nicely for the Jays because the only one in Wallace’s way is the incumbent Lyle Overbay.

Overbay has been a Blue Jay since 2006, and hopefully Toronto GM Alex Anthopoulos makes this year his last. Praised for his defensive abilities (all too often by the Jays’ commentators) he has actually been a below average defender so far this year with a UZR of -1.5, which doesn’t make him a laughing stock, but also doesn’t make up for his below-average .320 wOBA, or (almost exactly average) .334 park-adjusted wOBA. But when you consider that he’s playing the most “hitter-friendly” position, average just doesn’t cut it. So if Overbay is an average to below-average player, is Wallace an upgrade?

Brett Wallace has produced a solid .300/.362/.503 triple-slash in AAA Las Vegas, but with the Pacific Coast League being hitter-friendly we need to take these stats with a grain of salt. Thanks to StatCorner, though, we have a park-adjusted wOBA for Wallace, and it is an above-average .361. Even expecting it to drop when he goes to the majors, Wallace still projects to be an average hitter and at least as good a hitter as Overbay has been so far this year. Throw in the hitter-friendly (and especially HR-friendly) nature of the Rogers Centre, and Wallace could be a fairly productive player in the Jays’ power-hitting line-up. While it will be hard for Wallace to keep his batting average at the .300 mark, having a 23.2% line-drive rate in the minors just reeks of above-average BABIP (if he keeps it up), which should help him to a sustainable .270-.280 average at the Major League level.

The only problem now is getting rid of Lyle Overbay. The Jays are looking to be big sellers at the deadline, with John Buck, Alex Gonzalez and Lyle Overbay himself being UFAs. All three of these guys should be moveable to teams with a weakness at thin positions (Catcher and Shortstop). Overbay has also been hitting better as the season progresses and could look like an attractive and cheap option for a team like Tampa Bay that is lacking at the DH spot, or an NL contender such as the Reds (as a World Series DH) or the Giants. He could conceivably bring in a B- or C+ prospect, depending on how much of his salary the Jays are willing to pick up. With the Giants also lacking at shortstop, the Jays could package the two together and try to pry away a high-level prospect from them, although a Bumgarner is probably out of the question.

All in all, the Jays should actively be looking to shop Overbay ASAP and give Wallace a good long look this season to see if he can cut it at the major league level, and thus get a better idea of where the organization stands moving forward.

Stephen Strasburg Should Be an All-Star

What is the All-Star Game really about? Joe Posnanski ponders that question in his most recent article. While Posnanski doesn’t answer the question directly, he presents the different opinions of the All-Star Game. When I think about the All-Star Game, I tend to use a combination of two opinions Posnanski presents.

• It’s all about watching the best players in baseball.
• It’s all about watching the best players IN THE FIRST HALF (which is a different thing).

Following that line of reasoning, there is no doubt in my mind that Stephen Strasburg belongs on the NL All-Star Team this season.

Let’s examine the potential arguments against putting Stephen Strasburg on the All-Star team.

1. He hasn’t pitched enough to justify an All-Star selection.

Strasburg has pitched about as much as any reliever on either All-Star team this season. As a matter of fact, the only reliever on the team with more innings pitched than Strasburg is Pittsburgh’s Evan Meek. The skeptics are so quick to point out Strasburg’s lack of playing time, but very few actually seem to realize that most of the relievers selected have actually pitched fewer innings.

2. He’s pitched well, but it’s a small sample. How do we know he won’t regress?

This argument goes hand in hand with our first point (somewhat). Strasburg has accumulated a larger sample (albeit barely) than most of the relievers selected. A look at his advanced stats reveals a pitcher that is as good as advertised.

Stat    Strasburg    Rank (Among All Pitchers/Among Starters)
K/9     13.01        4th/1st
K/BB    5.3          13th/4th
FIP     1.77         2nd/1st
xFIP    1.88         1st/1st

*Minimum of 30 innings pitched this season

Among starting pitchers, Strasburg ranks 1st in K/9, FIP, and xFIP. Those are truly terrifying numbers for any pitcher, especially a rookie. That level of dominance certainly suggests that Strasburg is already one of the best pitchers in baseball, and is worthy of pitching in the All-Star Game.

3. The league will adjust to Strasburg the second time around.

I suppose this part of the argument cannot be proved until Strasburg gains more experience in the major leagues. At the same time, this is Stephen Strasburg we are talking about! The most hyped pitching prospect in baseball since… well, maybe ever. As the stats in the table above show, it’s not as if Strasburg is using “smoke and mirrors” to confuse opponents. Anyone who has watched Strasburg pitch this season can tell you that he already has three plus pitches, and inferior hitters look useless against him. Much like Mark Prior, it appears only injuries can derail Strasburg’s dominance.

4. He’s young, he will have more opportunities to pitch in the All-Star Game.

Personally, I hope Strasburg goes on to pitch in a number of All-Star Games. The fact is, pitchers are so unpredictable these days, that we can’t be sure Strasburg will remain healthy throughout his career. Perhaps I am overreacting, but because pitching is so uncertain, we can never be 100% sure one guy will remain healthy. If I had to bet, I would guess Strasburg makes a number of All-Star Games throughout his career, but we just never know.

Chances are, Strasburg’s inclusion on the NL roster would be one of the best things to happen to MLB. Ratings would probably be higher if Strasburg were on the team. This was the same guy who broke NERD, Carson Cistulli’s method for picking the most exciting baseball games on any given day. Also, and no disrespect to Arthur Rhodes here, most baseball fans would likely rather see Strasburg come out to face Crawford-Hamilton-Morneau in a tie game than Arthur Rhodes. Strasburg’s rise to the majors has been one of the biggest stories of the 2010 season, and this was an opportunity for the “Legend of Strasburg” to grow larger.

Again, Strasburg has already proven that he is a fantastic young pitcher. He will very likely make a number of All-Star Games throughout his career, so I shouldn’t overreact to one snub. While I have dedicated this article to Strasburg, there are many players who were snubbed that were even more deserving than Strasburg this season. I still stand by my premise that Strasburg should have been included on the NL roster, but (barring injury) I’m so glad I will be able to watch his magnificence for many years to come.

*This article was originally written for

The Power of Expectations

“Oft expectation fails, and most oft where most it promises; and oft it hits where hope is coldest; and despair most sits” – William Shakespeare

If you’ve ever read the magnificent Joe Posnanski (and if you haven’t, what are you waiting for?), you’re probably familiar with his patented movie rating scale. The first time I read about it, I was blown away – the concept is so simple and elegant, yet it captures the intricacies of expectations and how they influence our final opinions of movies, books, beers, music, video games, first dates, and yes, baseball players. Let’s take the M. Night Shyamalan film “Lady in the Water” for example. If you rented the movie after watching “The Sixth Sense”, you’d obviously have high expectations for the film – maybe a 3 1/2 to 4 star rating. And well, it turns out the movie was nothing like what you were expecting; it wasn’t a thriller or suspense story, but more something like a fairy tale for kids. To you, the movie was a flop – a one star movie in the end. One star delivered minus four stars expected gives you a negative overall movie experience.

But suppose you entered watching that same movie with different expectations. I’d watched little Shyamalan before watching “Lady in the Water”, and I’d heard from others that the movie was a disappointment. I was expecting something like a 1/2 star performance, but hey, the girlfriend wanted to watch it so we did. I wasn’t expecting a thriller or suspense movie, so the movie struck me as actually quite fun. I’d rank it a two-ish star movie in the end, giving me a positive movie experience. My expectations were lower than the quality I received, so it made the movie fun to watch.

It’s an odd coincidence that when I first read about Joe Poz’s movie system, I was studying abroad in Denmark. The Danes are masters of low expectations; their entire culture is built around “Jantelov”, the idea that nobody is better than anyone else. If you succeed and admit it, you’re ridiculed and held in contempt. And if you talk to a Dane, they’ll constantly remind you of the fact that their nation is no great shakes.* If you take a look at their nation’s history, you can understand why. They’ve lost every war they’ve been involved in since Viking times, their nation has shrunk continuously for the past 200 years, and their land is cold, dark, and uninspiring for 11 1/2 out of 12 months every year. Heck, the cockiest thing you can find in the entire nation is the slogan of Carlsberg, their beer: “Probably the best beer in the world.” And even then… “probably”? What advertising agency over here would ever approve of such ambiguity? Danes are the kings of schadenfreude.

*My host family commented at one point that the war in Iraq probably wasn’t going to end well since the Danes were allied with the US. “We’ve lost every war we’ve been involved in – sorry, but it doesn’t look good for you.”

And yet, in multiple studies over different time spans, the Danes have been ranked the happiest nation in the world. Not who you would have expected, huh? In classic form, the Danes don’t have a great answer as to why – they just shrug their shoulders and say that they’re really not that happy. The weather stinks, their taxes are too high – jantelov all over again. The best answer I’ve ever heard came from one of my professors there; she claimed that if you were never expecting anything good to happen to you, you’d always end up pleasantly surprised.

What does any of this have to do with baseball? As fans, everything.


Flooring the WBC: How the World Baseball Classic Negatively Affects the Health and Performance of Pitchers

The World Baseball Classic is certainly a noble idea. I mean, what’s not to like about it on paper? You take the best players from each baseball-playing nation and have them battle it out to see which country reigns over the rest of the globe. Can anyone trot out a more thunderous lineup than the USA? Who has the more dynamic pitchers: the Dominican Republic or Venezuela? Does Japan really produce the most fundamentally sound players? Fans all over the world have shown their support for this, as have many players.

All of this would be fine if baseball were like basketball, hockey or soccer; sports where you could wake up, trip over your dog, tumble down the stairs into a pair of cleats, skates or sneakers and play. Those sports employ bio-mechanics the body was designed to handle like running, jumping, kicking and swinging. Baseball, specifically pitching, is not like that. The human arm was not designed to handle the stress and torque put on it by pitching. If you don’t believe me, then I have a few thousand shoulder and elbow scars to show you, including my own.

The lucky few who are able to withstand such actions and be successful are kept on a yearly routine: start throwing in mid-February, build strength and stamina through March before turning up the intensity at the beginning of April. But just like it isn’t wise to turn the ignition on a new Mustang and instantly floor it, it doesn’t seem right to take a pitcher conditioned to ease into a season during Spring Training and tell him to pitch with October-like intensity in March. Unfortunately, this is the case with the WBC.

After looking through the statistics of those who appeared in both WBC tournaments, it is my belief that pitchers who participate in the WBC, especially starters, are far more likely to see a regression in their performance, get hurt, or both than pitchers who do not play in the WBC. I reason that the most likely cause is that the tournament’s timing disrupts the normal routine of pitchers, whose arms are not yet ready to handle the stress and intensity at that point in the year. With data collected from various sources, I will demonstrate the stark differences between WBC pitchers and their counterparts who did not participate in the tournament, using spreadsheet data and graphs included in this analysis.


The pitchers who were included in this study had to satisfy a few conditions. First, pitchers in the WBC group had to have pitched primarily in Major League Baseball in 2005, 2006, 2008 and 2009[1]. Players who played in one year but not another (spent one year in the minors or injured; or retired after a WBC) were not included. For the baseline of starters and relievers, a pitcher who made 10 or more starts for the year was counted as a starter while a pitcher who made 25 or more appearances with nine or fewer starts was counted as a reliever. The “all pitchers” category includes every pitcher who made an appearance during the 2005, 2006, 2008 and/or 2009 seasons.
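To make the grouping rules concrete, here is a small sketch of how they could be applied to a pitcher's season line. The function names and the sample inputs are my own illustration, not the actual spreadsheet logic used for the study.

```python
# Seasons a WBC pitcher must have spent primarily in MLB to be included.
REQUIRED_SEASONS = {2005, 2006, 2008, 2009}


def eligible_for_wbc_group(mlb_seasons):
    """True if the pitcher pitched primarily in MLB in all four seasons."""
    return REQUIRED_SEASONS.issubset(set(mlb_seasons))


def classify_role(appearances, starts):
    """Assign a baseline role for one season.

    10 or more starts                        -> "starter"
    25 or more appearances, 9 or fewer starts -> "reliever"
    anything else                            -> None (only in "all pitchers")
    """
    if starts >= 10:
        return "starter"
    if appearances >= 25:
        return "reliever"
    return None


if __name__ == "__main__":
    print(classify_role(appearances=33, starts=33))  # full-time starter
    print(classify_role(appearances=60, starts=0))   # full-time reliever
    print(classify_role(appearances=12, starts=3))   # too few of either
    print(eligible_for_wbc_group([2005, 2006, 2007, 2008, 2009]))
```

A swingman with eight starts and 20 appearances falls into neither baseline, so he would only be counted under "all pitchers."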


At the heart of it, the key to successful pitching is how good you are at preventing runs from scoring, with ERA and component ERA (ERC)[2] being the primary statistics used to measure this aspect. MLB’s ERA usually falls between 4.25 and 4.45, with only small differences from season to season. The other four groups saw small-to-moderate increases in their ERA between 2005 and 2006, but WBC starting pitchers saw a dramatic jump, from 3.75 to 4.48, while their ERC inflated from 4.09 to 4.79. WBC relievers also saw a significant jump in their collective ERA (3.15 to 3.51), but that is only roughly half of what starters experienced, and WBC relievers actually saw their ERC drop from 3.86 to 3.41. Compared against the league-wide ERA and ERC jumps of 0.24 (4.29 to 4.53) and 0.25 (4.18 to 4.43), respectively, the WBC starters’ jumps look even more like one of Superman’s single bounds. A major factor in this spike may be the above-average rise in HR/9 ratios. The average MLB starter showed no increase in his HR/9 rate and all other groups had increases of 0.1, but the HR/9 rate of WBC starters rose by 0.2 (0.9 to 1.1).

Home runs aren’t so bad, just as long as there isn’t anyone on base, but WBC starters were putting more and more runners on in 2006. WBC starting pitchers saw the highest rise in WHIP out of the five groups. The major league-average increase in WHIP between 2005 and 2006 was 0.04 (1.37 to 1.41), but the average WBC-participating starter saw his WHIP rise by double that amount (0.08), from 1.29 to 1.37. Part of that increase was fueled by an up-tick in their BB/9 rates, which climbed from 2.9 to 3.1 (0.2). The most startling changes, though, were the starters’ rising H/9 rates and falling K/9 rates. While all other groups saw a 0.2 increase in their H/9 ratios, WBC-participating starters’ ratios shot up by 0.5, going from 8.7 in 2005 to 9.2 in 2006. This may be attributed to a pitcher’s prematurely tired arm or improper mechanics from being rushed along during what normally is Spring Training. Either way, the pitches became more hittable, which also showed up as a decrease in these pitchers’ ability to strike batters out.
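The year-over-year comparisons quoted so far boil down to simple subtraction, which the sketch below spells out. It uses only the 2005 and 2006 figures given in the text for WBC starters and for the league as a whole; the dictionary layout and function names are my own.

```python
# (2005 value, 2006 value) pairs quoted in the text.
WBC_STARTERS = {"ERA": (3.75, 4.48), "ERC": (4.09, 4.79), "WHIP": (1.29, 1.37)}
MLB_OVERALL = {"ERA": (4.29, 4.53), "ERC": (4.18, 4.43), "WHIP": (1.37, 1.41)}


def change(pair):
    """Year-over-year change, rounded to two decimals."""
    before, after = pair
    return round(after - before, 2)


def excess_change(group, baseline):
    """How much more each stat rose for the group than for the baseline."""
    return {
        stat: round(change(group[stat]) - change(baseline[stat]), 2)
        for stat in group
    }


if __name__ == "__main__":
    for stat, extra in excess_change(WBC_STARTERS, MLB_OVERALL).items():
        own = change(WBC_STARTERS[stat])
        league = change(MLB_OVERALL[stat])
        print(f"{stat}: WBC starters {own:+.2f}, league {league:+.2f}, gap {extra:+.2f}")
```

This only restates the raw gaps. It says nothing about whether they are statistically meaningful, which would need the underlying sample sizes.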

Every group I collected data on showed an improvement in their K/9 ratios by 0.2…except for WBC-participating starters. Their K/9 ratios actually fell, going from 7.0 in 2005 to 6.7 in 2006—a drop of 0.3 whiffs per nine innings. A good K/9 ratio shows both how good a pitcher is at retiring a batter without the help of his fielders and how dominant his repertoire is. The higher, the better. When I see that one group’s ratio is regressing while all others are improving, that would make me a little curious as to what may be causing such a downturn, especially with a group as valuable as starting pitchers. If I were in a team’s front office, it would make me wonder if this little event that is supposedly good for baseball is actually harming my pitcher and my team’s playoff chances.


Now, this wouldn’t be so much of a concern if the pitchers who saw this decline in performance were just hurlers on the wrong side of 30 and/or at the tail-end of their contracts, but that’s not the case. Pitchers like Jake Peavy and Dontrelle Willis saw their performances take a dive after participating in the 2006 WBC, while promising up-and-comers like Francisco Liriano and Gustavo Chacin suffered major injuries that year. Two of the more alarming examples are Peavy and Willis, two National League hurlers from pitcher-friendly ballparks who use complicated or violent deliveries.

Peavy seemed out of sorts during the first half of the 2006 season, posting ERAs of 5.17 or worse in three of the first four months. It was during this time that Peavy was also prone to the long ball, serving up 14 of his 23 home runs in April, May and June. The “gopher-itis” lessened once July hit, but then Peavy had a little more trouble finding the strike zone. After issuing no more than eight free passes in each of the first three months, Peavy walked 12 or more batters in every month during the latter half of the season. Peavy eventually straightened himself out in 2007, but the same cannot be said for Willis. After nearly winning the Cy Young in 2005, Willis never could establish any consistency in 2006. His WHIP climbed an astonishing 0.29 points from 1.13 (sixth in the NL) to 1.42 (outside the top 30). At the same time, his HR/9 rate doubled from 0.4 to 0.8 while his opponents’ OPS climbed from .644 to .745. Since then, Willis has gone from bad to worse, and he is now viewed as little more than a reclamation project for the Arizona Diamondbacks.


Whereas 2006 saw a decline in WBC pitchers’ performance, the 2009 tournament participants saw an even more disturbing trend: a steep drop in their time on the mound. There were only negligible decreases in innings pitched following the 2006 WBC—10.1 percent for starters, 2.6 percent for relievers—but those figures worsened dramatically following this past tournament. WBC starters pitched, on average, 21.1 percent fewer innings in 2009 than they did in 2008 while relievers saw their innings totals drop by 27.2 percent. Houston ace Roy Oswalt saw his streak of five consecutive 200-inning seasons come to an end due to chronic back problems. Cincinnati’s Edinson Volquez appeared in one WBC game, then made only nine starts during the regular season before undergoing “Tommy John” surgery[3].

A second trend I noticed involved those pitchers who were in the playoffs the previous season. Out of the 11 pitchers who appeared in both the ’08 playoffs and the ’09 WBC, eight of them missed time due to injury (or, in the case of Javier Lopez, demotion) or saw an overall regression in their performance. The pitchers from this group who spent time on the disabled list pitched anywhere from 13.5 percent to 80.3 percent fewer innings than they had in ’08. Some of the more notable examples include Red Sox right-hander Daisuke Matsuzaka, whose 59.1 innings in ’09 were the fewest he’s pitched in either Japan or America, and Angels set-up man Scot Shields, who had never been on the disabled list for his entire nine-year big league career.


There are more examples of pitchers seeing their fortunes change for the worse after either of the two WBCs, like Bartolo Colon’s shoulder falling apart after rushing through rehab and Esteban Loaiza’s collapse in Oakland in 2006 or how Volquez’s elbow went kaput in the middle of 2009. I won’t list every pitcher who suffered, but my point is clear: the WBC increases the chances for pitchers to suffer injuries, see an across-the-board decline in performance or both. As I stated earlier, I feel the biggest reason for these unfortunate trends is the timing of the tournament. Holding this tournament in the early spring can only damage the health and careers of the players who wish to represent their countries and, in turn, hurt those players’ teams both on the field and in their long-term organizational plans. I feel the best possible resolution would be to hold the tournament at two different times: have the preliminary rounds during the week of the All-Star Game (while giving MLB, the Japanese leagues and all other leagues a mid-season break) and the final two rounds shortly after the World Series. This way, not only would the careers and health of the pitchers be better preserved, but it would also be highly beneficial to MLB as a whole.

Under the current scheduling, the WBC and MLB have to battle against the NCAA men’s basketball championship tournament for ratings and coverage. Since all other major professional and collegiate leagues are inactive in July, a mid-season round would allow MLB a better opportunity to drum up interest in the tournament and give less well-known baseball-playing nations a bigger platform to perform. The week off would also benefit the players who are not in the WBC, as they would have time to recover from injuries and spend invaluable time with family and friends. Lastly, the buzz over a recently completed World Series could carry over to the final stages of the WBC, with story lines from the first phase being built up prior to the resumption of the tournament. Playoff-participating players could have the option of continuing in the tournament or allowing other players, who spent most of October resting and re-energizing, to go in their places. Those fresh bodies would also improve the quality of play seen by the fans.

The bottom line is this: the World Baseball Classic is an excellent idea, but is poorly executed in its current form, with pitchers suffering the most damage. Pitchers are the most valuable and volatile commodity in baseball and MLB should do its very best in order to protect that commodity. Even though there have been only two tournaments to study, the numbers are very clear and the logical decision to change should be made.

Michael Echan is a freelance sports writer from New Jersey. Please contact him if you would like to see the compiled spreadsheet data and graphs. He may be reached at

[1] Francisco Liriano spent most of 2005 in the minors, but was included because he spent most of 2006 with Minnesota before a season-ending elbow injury in August. Luis Ayala was on Washington’s roster in 2006, but injured his elbow during the WBC.

[2] ERC is a statistic created by Bill James. It takes the number of hits, walks, home runs, hit batters and total batters faced by a pitcher to give an “alternate” ERA that better reflects his performance.

[3] Volquez did pitch a career-high 196 innings in 2008, his first full season in the big leagues, but has had his workload gradually increased during his career. His combined innings progression: 140 in ’05, 154 in ’06, 178.2 in ’07, 196 in ’08.

Fun With ERA Estimators

There are a number of ERA estimators out there and just as many opinions on which one is the best. Among the more well-known estimators are FIP (Fielding Independent Pitching, developed by Tom Tango), xFIP (FIP, with a normalized HR-rate), SIERA (created by Matt Swartz and Eric Seidman at Baseball Prospectus), tRA (created by Graham MacAree), QERA (created by Nate Silver), Component ERA (created by Bill James), and DIPS, which was developed by Voros McCracken and was the first ERA estimator to attempt to use the three true outcomes (strikeouts, walks, home runs allowed) to separate the things pitchers have control over from other factors, such as defense, sequencing of hitting events, and luck. Ultimately, that’s what an ERA estimator attempts to do: allow us to evaluate pitching performance based on the things pitchers actually control.

For this article, the three estimators that will be used are FIP, xFIP, and SIERA.  A quick refresher on the three:

FIP—“Fielding Independent Pitching, a measure of all those things for which a pitcher is specifically responsible. The formula is (HR*13+(BB+HBP-IBB)*3-K*2)/IP, plus a league-specific factor (usually around 3.2) to round out the number to an equivalent ERA number. FIP helps you understand how well a pitcher pitched, regardless of how well his fielders fielded. FIP was invented by Tangotiger.” (from The Hardball Times glossary).

xFIP—“Expected Fielding Independent Pitching. This is an experimental stat that adjusts FIP and “normalizes” the home run component. Research has shown that home runs allowed are pretty much a function of flyballs allowed and home park, so xFIP is based on the average number of home runs allowed per outfield fly. Theoretically, this should be a better predicter of a pitcher’s future ERA.” (from The Hardball Times glossary).

SIERA—Skill Interactive Earned Run Average.  This is the most recent entry into the field and is more complex as it incorporates a number of adjustments to the basic three true outcomes formula.  From the introductory essay at BP, there are things that SIERA takes into account that other ERA estimators do not:  it allows for the fact that a high ground ball rate is more useful to pitchers who walk more batters, a low fly ball rate is less useful to high strikeout pitchers, adding more strikeouts is more useful to low strikeout pitchers, and adding ground balls is more useful for high ground ball pitchers.  SIERA also uses ground balls per plate appearance rather than ground balls per balls in play.
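The two glossary formulas quoted above are simple enough to sketch in a few lines of Python. This is only an illustration: the function names, the default constant of 3.2 (the glossary's "usually around" value), the league home-run-per-fly-ball rate and the sample stat line are all assumptions made for the example, not figures used elsewhere in this article. SIERA is left out because its coefficients are not given here.

```python
def fip(hr, bb, hbp, ibb, k, ip, league_constant=3.2):
    """FIP per the glossary formula: (HR*13 + (BB+HBP-IBB)*3 - K*2) / IP + constant.

    ip must be a true decimal (59 1/3 innings is 59.333..., not 59.1).
    league_constant is an assumed default; the real factor varies by league and year.
    """
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + league_constant


def xfip(fly_balls, league_hr_per_fb, bb, hbp, ibb, k, ip, league_constant=3.2):
    """xFIP sketch: replace actual home runs with an expected total,
    i.e. outfield flies allowed times a league-average HR-per-fly rate."""
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, hbp, ibb, k, ip, league_constant)


# Hypothetical stat line, not a real pitcher.
line = dict(bb=25, hbp=2, ibb=1, k=100, ip=100.0)
print(round(fip(hr=8, **line), 2))
print(round(xfip(fly_balls=100, league_hr_per_fb=0.10, **line), 2))
```

In this made-up line the pitcher allowed 8 home runs on 100 fly balls, fewer than the 10 an assumed 10% rate would predict, so his xFIP comes out higher than his FIP. That is the same mechanism behind the FIP/xFIP gaps in the lists that follow.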

For background information on FIP, xFIP, and SIERA, please see the following web pages:

Ultimately, we want an ERA estimator that will tell us how well the pitcher is pitching after you take away the defense and luck elements.  Also, we want our ERA estimator to be able to most accurately predict future performance.   If you have Dan Haren and his 4.56 ERA on your fantasy team, you want to know if he’s going to improve or if you should part ways with your expected Ace, so you look at an ERA estimator as a clue to his expected future performance.  Which ERA estimator you choose can give you very different expectations.

Here at Fangraphs, I’ve noticed a recent backlash against xFIP from commenters on articles that use the metric in their analysis.   These commenters feel that pitchers do have control over their HR-rate, whereas xFIP normalizes all pitchers to a league average rate.  Often, they will point out that a pitcher’s home ballpark could be a factor in a pitcher’s high home run rate and that it isn’t likely to come down as long as the pitcher continues to play for that team.  For them, FIP is the metric to use.  This can obviously make a big difference in predicting future performance.  I’m not going to weigh in on that particular debate, but I did want to highlight some pitchers and their respective ERA, FIP, xFIP, and SIERA numbers to illustrate the different expectations based on which ERA estimator you choose to use.

All pitcher data is as of June 30 and only pitchers with 75 or more innings were included. This produced a sample of 116 pitchers.

ERA Leaders

Rank Pitcher ERA FIP xFIP SIERA
1 Josh Johnson 1.83 2.47 3.16 2.99
2 Ubaldo Jimenez 1.83 3.07 3.68 3.49
3 Jaime Garcia 2.27 3.47 3.84 3.77
4 Roy Halladay 2.29 2.78 3.06 3.05
5 Adam Wainwright 2.34 3.11 3.27 3.12
6 Tim Hudson 2.37 4.37 4.29 3.94
7 David Price 2.44 3.73 4.07 3.97
8 Cliff Lee 2.45 2.34 3.30 3.09
9 Clay Buchholz 2.45 3.47 4.28 4.37
10 Yovani Gallardo 2.56 2.97 3.46 3.32

Generally, the league’s top 10 ERA leaders have had some good fortune to go along with their good pitching.  In the case of these pitchers, the first place to look is their BABIP.  In 2010, MLB hitters have a .299 BABIP.  Eight of the ten pitchers in the list above have BABIPs lower than .299 and the other two pitchers are at .304 and .305.  The lowest is Tim Hudson’s .234.  Left On Base Percentage (LOB%) is another key area.  Eight of the ten pitchers have a LOB% of 79% or higher, with the other two at 71.6% and 76.2%.  Ubaldo Jimenez leads the league with a LOB% of 86.2%.  Finally, HR-rate (HR/FB) is a key factor for a pitcher keeping his ERA low.  Nine of the ten pitchers have a HR/FB rate at 9% or lower, with Clay Buchholz leading the pack at 3.6%.

FIP Leaders

Rank Pitcher FIP ERA
1 Francisco Liriano 2.19 3.47
2 Cliff Lee 2.34 2.45
3 Josh Johnson 2.47 1.83
4 Roy Halladay 2.78 2.29
5 Tim Lincecum 2.88 3.13
6 Jered Weaver 2.93 3.01
7 Yovani Gallardo 2.97 2.56
8 Jon Lester 3.01 2.86
9 Ubaldo Jimenez 3.07 1.83
10 Adam Wainwright 3.11 2.34

When we shift over to look at FIP leaders, we have four pitchers who fall out of the top 10 based on ERA: Jaime Garcia, Tim Hudson, David Price, and Clay Buchholz. Joining the remaining six in this list of FIP leaders are Francisco Liriano, who surges to the top, along with Tim Lincecum, Jered Weaver, and Jon Lester. Francisco Liriano has a solid 3.47 ERA, but his FIP shows he could be much better going forward. The main culprit is a .355 BABIP, which should come down. All ten of these pitchers have great HR/FB rates. Adam Wainwright has the highest rate, at 9.0%. The other nine pitchers are at 8.7% or lower, with six pitchers sporting a rate below 7.0%.

xFIP Leaders

Rank Pitcher xFIP ERA
1 Francisco Liriano 3.01 3.47
2 Roy Halladay 3.06 2.29
3 Josh Johnson 3.16 1.83
4 Jered Weaver 3.21 3.01
5 Tim Lincecum 3.22 3.13
6 Adam Wainwright 3.27 2.34
7 Cliff Lee 3.30 2.45
8 Ricky Romero 3.43 2.83
9 Dan Haren 3.43 4.56
10 Jon Lester 3.44 2.86

The usual suspects remain on the list, with two additions in Ricky Romero and Dan Haren, while Yovani Gallardo barely drops out of the top 10, falling to 11 here, and Ubaldo Jimenez drops to 16. Romero had placed out of the top 10 in ERA (17th) and FIP (11th), so he receives just a slight bump up based on xFIP, where he places 8th. Dan Haren is the high-riser, though, as he’s allowed a HR/FB rate of 13.5%. Haren is 78th based on ERA and 47th based on FIP, but moves up to 9th based on xFIP. If you believe that HR-rates normalize over time, then Haren is a pitcher to target. If, however, you think Haren will continue to be plagued by the long ball, whether that’s due to his home park or his actual skill, then you might want to steer clear of him (his career rate is 11.0%, by the way).

SIERA Leaders

Rank Pitcher SIERA ERA
1 Jered Weaver 2.55 3.01
2 Francisco Liriano 2.91 3.47
3 Josh Johnson 2.99 1.83
4 Roy Halladay 3.05 2.29
5 Cliff Lee 3.09 2.45
6 Adam Wainwright 3.12 2.34
7 Dan Haren 3.14 4.56
8 Tim Lincecum 3.17 3.13
9 Jon Lester 3.28 2.86
10 Yovani Gallardo 3.32 2.56

The SIERA leader list and xFIP leader list have nine common names. The difference is Yovani Gallardo at #10 according to SIERA and #11 according to xFIP, and Ricky Romero (#11 based on SIERA, #8 based on xFIP). Looking at the entire list shows that xFIP and SIERA produce similar ERA estimates. I ran a correlation for all 116 pitchers between their xFIP and their SIERA and it produced a 0.96 correlation. I then took the absolute difference between each metric for each pitcher and found that, on average, the difference was 0.17. Seventy-seven of the 116 pitchers (66%) had xFIPs and SIERAs within 0.20 of each other and four pitchers had identical xFIPs and SIERAs.
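For anyone who wants to repeat that comparison, here is a rough Python sketch of the three calculations (correlation, average absolute difference, share within 0.20). The full sample isn't reproduced in this article, so the sketch uses only the ten xFIP/SIERA pairs from the ERA Leaders list (the last two numbers in each row); its results will therefore not match the full-sample figures quoted above. The helper name `pearson` is my own.

```python
from math import sqrt

# (xFIP, SIERA) pairs for the ten pitchers in the ERA Leaders list.
pairs = {
    "Josh Johnson": (3.16, 2.99),
    "Ubaldo Jimenez": (3.68, 3.49),
    "Jaime Garcia": (3.84, 3.77),
    "Roy Halladay": (3.06, 3.05),
    "Adam Wainwright": (3.27, 3.12),
    "Tim Hudson": (4.29, 3.94),
    "David Price": (4.07, 3.97),
    "Cliff Lee": (3.30, 3.09),
    "Clay Buchholz": (4.28, 4.37),
    "Yovani Gallardo": (3.46, 3.32),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xfips = [x for x, _ in pairs.values()]
sieras = [s for _, s in pairs.values()]
abs_diffs = [abs(x - s) for x, s in zip(xfips, sieras)]

r = pearson(xfips, sieras)
mean_abs_diff = sum(abs_diffs) / len(abs_diffs)
# Round before comparing so float noise cannot misclassify an exact 0.20 gap.
within_020 = sum(round(d, 2) <= 0.20 for d in abs_diffs)

print(f"r = {r:.2f}, mean |xFIP - SIERA| = {mean_abs_diff:.2f}, "
      f"{within_020} of {len(abs_diffs)} within 0.20")
```

Even in this small and unrepresentative slice (these are the pitchers with the lowest ERAs, not a random draw), the two estimators track each other closely.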

Pitchers the ERA Estimators Agree On

Some pitchers have FIPs, xFIPs, and SIERAs that are near matches for their actual ERA. It might be said that these pitchers are the easiest to predict going forward, simply because all three ERA estimators agree that their current ERA is likely to be a legitimate estimate of their ability. Below is a top 10 list of pitchers whose ERA estimators agree most closely with their actual ERA. The final column, “AVG”, shows the average of the three ERA estimators. To create the top 10 list, I found the absolute difference between each estimator and actual ERA, summed the three differences, then divided by three to get an average absolute difference for each pitcher.

Rank Pitcher ERA FIP xFIP SIERA AVG
1 Freddy Garcia 4.66 4.69 4.60 4.66 4.65
2 Kyle Kendrick 4.88 4.89 4.90 4.98 4.92
3 Roy Oswalt 3.55 3.51 3.55 3.39 3.48
4 Zack Greinke 3.72 3.74 3.76 3.52 3.67
5 Kenshin Kawakami 4.48 4.23 4.52 4.52 4.42
6 Chris Volstad 4.40 4.21 4.47 4.47 4.38
7 Felix Hernandez 3.28 3.38 3.49 3.33 3.40
8 Tim Lincecum 3.13 2.88 3.22 3.17 3.09
9 Scott Kazmir 5.42 5.27 5.46 5.15 5.29
10 Jeremy Bonderman 4.36 4.02 4.42 4.23 4.22
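
The ranking method described above can be sketched in Python as follows. The rows are taken from the lists in this article (each row is ERA, FIP, xFIP, SIERA); the function name `avg_abs_diff` is just a label I've chosen for the example.

```python
# (ERA, FIP, xFIP, SIERA) rows taken from the lists in this article.
rows = {
    "Tim Hudson": (2.37, 4.37, 4.29, 3.94),
    "Roy Oswalt": (3.55, 3.51, 3.55, 3.39),
    "Freddy Garcia": (4.66, 4.69, 4.60, 4.66),
    "Kyle Kendrick": (4.88, 4.89, 4.90, 4.98),
}


def avg_abs_diff(era, *estimators):
    """Mean of |estimator - ERA| across the supplied estimators."""
    return sum(abs(e - era) for e in estimators) / len(estimators)


# Smallest average gap first: the pitchers the estimators "agree on".
ranked = sorted(rows, key=lambda name: avg_abs_diff(*rows[name]))
for name in ranked:
    print(f"{name:15s} {avg_abs_diff(*rows[name]):.2f}")
```

Freddy Garcia comes out at about 0.03 and Tim Hudson at about 1.83. Sorting in the opposite direction gives the pitchers whose estimators disagree most with their ERA.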

Now, some of these pitchers are better than others.  In Joe Morgan terms, these are the most “consistent” pitchers when looking at how they fare according to advanced metrics but consistent doesn’t mean good (something Joe never seems to mention).  You can be consistent like Scott Kazmir and be of no use to anyone.  Or you can be consistent like Felix Hernandez or Tim Lincecum and be a top starting pitcher.  These pitchers generally have BABIPs within 10 points of the league average and HR/FB rates close to league average.

Most Volatile Pitchers

The following list shows the pitchers whose ERA estimators disagree with their actual ERA by the largest amount. These are the pitchers who advanced metrics suggest will either greatly improve or who are headed for a heaping dose of reality in the future.

Rank Pitcher ERA FIP xFIP SIERA AVG
1 Tim Hudson 2.37 4.37 4.29 3.94 4.20
2 Livan Hernandez 3.10 4.40 4.91 5.18 4.83
3 Clay Buchholz 2.45 3.47 4.28 4.37 4.04
4 Ubaldo Jimenez 1.83 3.07 3.68 3.49 3.41
5 Jeff Niemann 2.72 4.39 4.29 4.16 4.28
6 Jason Vargas 2.80 3.71 4.81 4.45 4.32
7 David Price 2.44 3.73 4.07 3.97 3.92
8 Jaime Garcia 2.27 3.47 3.84 3.77 3.69
9 Justin Masterson 5.21 4.04 3.94 3.55 3.84
10 Matt Cain 2.93 3.60 4.70 4.49 4.26

Of note here is that nine of these ten pitchers are expected to perform much worse going forward, with only sabermetric favorite Justin Masterson expected to improve.  Some of these names are sure to cause controversy.  Matt Cain, for example, consistently out-performs his FIP and xFIP.  He has a lifetime ERA of 3.44, with a lifetime FIP of 3.66 and xFIP of 3.97.  Every year, his HR/FB rate is below the league average (7.7% for his career), and in five of his six years in the league his BABIP has been below league average (.285 for his career).  At some point, we must conclude that Matt Cain is better than the ERA estimators think he is.  Another pitcher on this list, Tim Hudson, has a career ERA of 3.43, with a FIP of 3.82.  He’s done it with a better-than-expected career BABIP (.287).  This year, that BABIP is .234, so he should regress, but he has a history of bettering his FIP, so he has a good chance of not regressing as much as the ERA estimators believe he will.

The ERA Estimator “Get Them If You Can” Official List

For this list, I limited the pitchers to those for whom the average of the three ERA estimators suggests a 3.80 ERA or below. I don’t think it’s particularly helpful to know that the ERA estimators suggest Kyle Davies should have an ERA around 5.04 rather than the 6.06 he currently sports. The “AVG” column is the average of the ERA estimators. The “DIFF” column is the difference between that average and the pitcher’s actual ERA.

Rank Pitcher ERA FIP xFIP SIERA AVG DIFF
1 Randy Wells 4.96 3.47 3.77 3.94 3.73 -1.23
2 Dan Haren 4.56 3.90 3.43 3.14 3.49 -1.07
3 James Shields 4.76 4.13 3.55 3.41 3.70 -1.06
4 Gavin Floyd 4.66 3.41 3.81 3.73 3.65 -1.01
5 Brandon Morrow 4.50 3.45 3.90 3.55 3.63 -0.87
6 Tommy Hanson 4.50 3.45 4.10 3.54 3.70 -0.80
7 Francisco Liriano 3.47 2.19 3.01 2.91 2.70 -0.77
8 Jason Hammel 4.32 3.69 3.81 3.85 3.78 -0.54
9 Justin Verlander 4.02 3.38 4.10 3.74 3.74 -0.28
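
As a quick check on how the AVG and DIFF columns and the 3.80 cutoff fit together, here is a small Python sketch using three rows from this list plus Scott Kazmir's row from the earlier list as a pitcher who misses the cutoff. The names `THRESHOLD` and `avg_and_diff` are my own labels.

```python
THRESHOLD = 3.80  # maximum estimator average allowed on the list

# (ERA, FIP, xFIP, SIERA) rows taken from the lists in this article.
rows = {
    "Dan Haren": (4.56, 3.90, 3.43, 3.14),
    "Scott Kazmir": (5.42, 5.27, 5.46, 5.15),
    "Francisco Liriano": (3.47, 2.19, 3.01, 2.91),
    "Randy Wells": (4.96, 3.47, 3.77, 3.94),
}


def avg_and_diff(era, fip, xfip, siera):
    """Return (AVG, DIFF): the estimator average and that average minus ERA."""
    avg = (fip + xfip + siera) / 3
    return avg, avg - era


targets = []
for name, stats in rows.items():
    avg, diff = avg_and_diff(*stats)
    if avg <= THRESHOLD:  # drop pitchers who project poorly even after "improving"
        targets.append((name, round(avg, 2), round(diff, 2)))

targets.sort(key=lambda t: t[2])  # most negative DIFF (biggest expected drop in ERA) first
for name, avg, diff in targets:
    print(f"{name:18s} {avg:.2f} {diff:+.2f}")
```

Kazmir's estimators average about 5.29, so he is filtered out, while the other three come back in the same order and with the same AVG and DIFF values as in the list.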

The top eight pitchers on this list have BABIPs at .328 or higher.  The top six have LOB% below 70%.  Dan Haren and James Shields sport HR/FB rates of 13.5% and 14.4%.  Obviously, some of these pitchers are better than others and you can see for yourself the disagreement between the ERA estimators.  Haren and Shields, with their high HR/FB rates, have much higher FIPs than the others.   If you believe he can remain healthy, I’d say the #1 target would be Francisco Liriano, as his ERA is 40th among starting pitchers, while he’s ranked #1, #1, and #2 according to the ERA estimators.

The ERA Estimator “Sell!  Sell!  Sell!” Official List

For this list, I limited the pitchers to those who currently have ERAs below 3.50 and a K/9 greater than 6.0. Tim Hudson and Livan Hernandez, with K-rates around 4.0, are not likely to be easy to unload, despite their shiny ERAs. The pitchers below have good ERAs and solid strikeout rates, but the ERA estimators suggest they are not as good as their performance so far.

Rank Pitcher ERA FIP xFIP SIERA AVG DIFF
1 Clay Buchholz 2.45 3.47 4.28 4.37 4.04 1.59
2 Ubaldo Jimenez 1.83 3.07 3.68 3.49 3.41 1.58
3 Jeff Niemann 2.72 4.39 4.29 4.16 4.28 1.56
4 David Price 2.44 3.73 4.07 3.97 3.92 1.48
5 Jaime Garcia 2.27 3.47 3.84 3.77 3.69 1.42
6 Matt Cain 2.93 3.60 4.70 4.49 4.26 1.33
7 Ted Lilly 3.12 4.21 4.61 4.27 4.36 1.24
8 Andy Pettitte 2.72 3.76 4.04 4.05 3.95 1.23
9 Wade LeBlanc 3.25 4.19 4.60 4.57 4.45 1.20
10 Trevor Cahill 2.88 4.18 4.03 4.02 4.08 1.20

These pitchers have a mixture of low BABIPs, high LOB%, and low HR/FB, which makes them candidates to perform worse from here on out. Of course, Matt Cain, as mentioned before, always seems to defy expectations of ERA estimators. Also, Ubaldo Jimenez, currently #2 in ERA, is #9 in FIP, and #16 in xFIP and SIERA, so he’s still a top pitcher, just not as good as he’s shown so far. Depending on your confidence in these advanced metrics, there are moves to make as the baseball season reaches its halfway point.

Pitching Stats and the Quality of Batters Faced

Pat Andriola’s recent post about a pitcher’s opposition prompted me to present something I’ve been playing with for a few months. Several months ago, around the time of the Cy Young Awards, I saw a debate on another website focusing on the question of who the best pitcher in baseball was. The debate primarily centered on Roy Halladay and Tim Lincecum. One thing that was continually brought up in defense of Halladay was that he’d faced much stiffer competition than Lincecum, and that needed to be taken into account. Baseball Prospectus posts OPS of batters faced on their stat pages, but I thought that there had to be something that was better, something that was more quantifiable. This analysis, originally posted at Lookout Landing, is the result of that thought.

Special Thanks

I’d like to thank both Graham MacAree and Matthew Carruth up front. Graham allowed me to bounce the idea off of him and helped me start the list of caveats. He also put me in touch with Matthew, who was gracious enough to send me the data behind all pitcher/batter matchups in 2009. I’d also like to thank them publicly for StatCorner, as I used their tRA data as the basis for the pitching numbers and their wOBA data as the basis for the hitters. You guys rock.

The Steps

The solution to me seems to be wOBA of batters faced. It’s easily understood (well, if you’re a stats nerd anyway) and incredibly easy to use in analysis. If you can get the data, it’s not that hard to weight hitters’ wOBA figures together to get an aggregate. I started by hand-pulling data from baseball-reference, but that was incredibly time-consuming. I got in touch with Matthew and he graciously provided the batter/pitcher data that allowed me to run this for the pitcher universe in a much easier fashion.

Once you get the wOBA figures for the average hitter that faces a given pitcher, you need the average league wOBA to convert that to a runs figure. I compiled the StatCorner data by league and got averages of .341 in the AL and .330 in the NL. For this analysis, I included pitchers’ hitting stats (from what I understand, they’re typically excluded from the averages that drive batting runs above average) since that’s a major component of the difference between leagues. Additionally, I created a major league average wOBA.

I then calculated the bRAA of the hitters facing a given pitcher just like you would to create a hitter’s batting contribution: (wOBA - league average wOBA) / 1.15 * Plate Appearances (or in this case, Total Batters Faced). So if in 2009 Zack Greinke faced an average hitter with a .340 wOBA and the AL average is .341, that cumulative hitter over the number of plate appearances against Greinke was 1.23 runs below average. Similarly, I made the calculation substituting major league average wOBA (.335 from StatCorner) for the league-specific figure and calculated the average hitter faced by each pitcher under that scenario (the comparable Greinke figure was a +3.42 run hitter). For the record, there is roughly a 10 run spread between the pitchers who face the “worst” and “best” average hitters in each league, and roughly 20 runs between the worst and best average hitters across all major league pitchers.
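
For readers who want to see the arithmetic, here is a minimal Python sketch of that calculation. The league averages (.341 AL, .335 MLB) and the 1.15 divisor come from the text above; the function name and the sample pitcher (900 batters faced, .345 average wOBA against) are hypothetical and chosen only to make the numbers easy to follow.

```python
WOBA_SCALE = 1.15  # divisor used in the formula described above

AL_WOBA = 0.341   # league average from the StatCorner data cited above
MLB_WOBA = 0.335  # combined major league average cited above


def braa_faced(avg_woba_faced, baseline_woba, batters_faced):
    """Runs above average of the composite hitter a pitcher faced.

    Positive means tougher-than-average opposition; negative means weaker.
    """
    return (avg_woba_faced - baseline_woba) / WOBA_SCALE * batters_faced


# Hypothetical AL pitcher: 900 batters faced, .345 average wOBA of those hitters.
vs_league = braa_faced(0.345, AL_WOBA, 900)
vs_majors = braa_faced(0.345, MLB_WOBA, 900)
print(f"vs. AL average:  {vs_league:+.2f} runs")
print(f"vs. MLB average: {vs_majors:+.2f} runs")
```

The same composite hitter looks stronger against the lower MLB baseline than against the AL baseline, which is the direction of the shift between the two Greinke figures above.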

I then took those bRAA figures and used them to adjust tRA, which is easily done by multiplying the bRAA figure by 27, dividing by xOuts, and subtracting the results (so a pitcher that faces a below average hitter would see an upward adjustment to his tRA). Intuitively it makes sense to me that if Halladay is a +44 pitcher and the hitters he faced were +5, then he should get credit for actually being something close to +49. I do this both within leagues and across leagues, and the differences between the adjusted and unadjusted leaderboards are shown below. I limited it to pitchers with 300 or more expected outs (so approximately 100 innings pitched). Clearly there’s a bit of reshuffling and the largest change is the AL/NL reshuffling on the combined leaderboard (note that you may have to open the leaderboards for full effect).
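
In code form, the adjustment looks roughly like this. It is a sketch of the steps as described above, with a hypothetical function name and made-up tRA, bRAA and xOuts inputs; only the 27 multiplier and the 300-xOut minimum come from the text.

```python
MIN_X_OUTS = 300  # leaderboard cutoff used above (roughly 100 innings)


def adjust_tra(tra, braa, x_outs):
    """Opposition-adjusted tRA: subtract the hitters' bRAA scaled to 27 outs.

    Positive bRAA (tougher hitters) lowers tRA; negative bRAA raises it.
    """
    return tra - braa * 27 / x_outs


# Hypothetical pitchers as (tRA, bRAA of hitters faced, xOuts).
pitchers = {
    "Faced tough hitters": (3.00, 5.0, 700),
    "Faced weak hitters": (3.50, -2.5, 600),
    "Too few outs": (2.80, 1.0, 150),
}

for name, (tra, braa, x_outs) in pitchers.items():
    if x_outs < MIN_X_OUTS:
        continue  # below the expected-outs cutoff, so left off the leaderboard
    print(f"{name:20s} {tra:.2f} -> {adjust_tra(tra, braa, x_outs):.2f}")
```

With these inputs the first pitcher's tRA drops by about 0.19 and the second's rises by about 0.11, which shows how the size of the adjustment shrinks as expected outs grow.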

Results and Application

In general, the changes were what I expected. AL pitchers face better hitters than their NL counterparts (which makes total sense given the DH rule). Within the leagues, the pitchers in each East division faced the toughest hitters. But somewhat surprisingly, there were some relatively meaningful differences even among starters on a given team (for instance, Adam Wainwright faced a +1.2 bRAA NL hitter, while Chris Carpenter and Joel Pineiro both faced hitters around -2.5 bRAA; granted, it’s not huge, but it’s still almost half a win).

As far as how it gets applied, I’m still not totally sure about applying it directly to tRA (or FIP). I think the adjustment works to an extent, but there’s probably some noise in there or perhaps a good reason why we shouldn’t just add pRAA to bRAA against, especially when trying to look at AL vs. NL pitchers. I also believe there’s likely to be some very good information contained in rolling this up by team or even division, which could aid in projecting “next year” for a player that changes teams/divisions/leagues from one year to the next (certainly multiple years would be needed).


I have several caveats about this analysis. For one, it is heavily driven by the wOBA of hitters faced. It is possible that if, say, the AL is similarly better than the NL at both hitting and pitching, differences across leagues may not be picked up correctly (i.e., a .335 wOBA in the NL is potentially not the same as a .335 wOBA in the AL). Similar to that is the idea that there could be a disconnect within leagues as well due to the variation in the quality of pitchers that individual hitters face, which helps drive each individual’s wOBA (of course now we’re back to a very cyclical chicken vs. egg argument). Second, I’m using but one year of data, so I’d need to run this several more times to see if 2009 is a representative year. As described above, I’m not sure if it works as an actual adjustment or if it should just be informational. I’ve also made no effort yet to figure out next steps as far as how this may be regressed. Additionally, I considered attempting to use left/right split wOBA data in the analysis but decided against it. That is one more potential refinement. Lastly, I’m not sure how this interacts with stats like tRA* or xFIP, as the adjustment of certain underlying batted ball figures would undoubtedly take care of some aspects of “facing better hitters” or whatever you want to call it.


These are but some of my thoughts on adjusting pitching stats for the quality of batters faced. I’m very interested in what the larger group thinks about the merit of such an adjustment, especially given some new information on how big some of the tRA adjustments are. What else should be considered? Are there other reasons that you have why it may or may not work? How do we consider the chicken and egg nature of adjusting both hitters and pitchers for the quality of the opponent? I’d love to hear any comments any of you have, either positive or negative. Thanks for taking the time to read this!