Hall of Fame Voters Really Made Love to the Pooch with This Closer Situation
One of the hallmarks of the annual Hall of Fame debates is the comparison to players already enshrined. It can be a very good exercise in determining the merits of a particular player, especially because after so many years, there are now a lot of players in the Hall of Fame. There are plenty of players at every single position. There are pitchers. There are power hitters and hitters for average. There are great fielders. The one area where the present Hall of Fame fails to provide a good comparison is the closer situation.
As Wendy Thurm’s post indicated in evaluating Lee Smith’s candidacy*, it is difficult to judge because the only full-time relief pitchers in the Hall of Fame are Rollie Fingers, Goose Gossage and Bruce Sutter. Hoyt Wilhelm is not an apt comparison, having retired in 1972 with 500 more innings pitched than even Rollie Fingers. Wendy reached the conclusion that Smith was better than Sutter, not as good as Fingers and Gossage, and put Smith just on the other side of the Hall of Fame. It feels like the right call, but if Sutter is in the Hall, what exactly is the standard for relief pitchers?
Taking a look at only Fingers, Sutter, and Gossage is not very instructive, so I expanded the parameters for comparison to include Smith, a likely first ballot player in Mariano Rivera as well as Trevor Hoffman, whose candidacy is not really clear at this point.
The era that the current members of the Hall played in was different from the one current players pitch in, with Smith serving as sort of a bridge between the two. I wanted to compare their innings totals, so I took a look at each player’s twelve best years (omitting Gossage’s year as a starter) and created a cumulative IP graph.
As you can see, Fingers stands out, followed by Gossage, a small gap, then Sutter and Smith, followed by another gap, and then Rivera and Hoffman. Although Smith definitely compiled a lot of saves, it is not really fair to put him in the modern-day closer group.
Next, I looked at the players’ cumulative WAR in their twelve best years. I ordered the WAR totals in descending order so that the peak would come first. This is what I found:
As you can see, it was Sutter’s peak that appealed to voters as he jumped out to an early lead and then crashed. Gossage tailed off, but remained high with Smith not too far behind. Rivera’s graph shows why he will make the Hall while Hoffman lags well behind.
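For anyone who wants to rebuild curves like these, here is a minimal sketch of the method described above: take a pitcher's best twelve seasons, sort them from best to worst, and keep a running total. The function name and the season values are made up for illustration and are not any real pitcher's numbers.

```python
from itertools import accumulate


def cumulative_peak_curve(season_values, best_n=12):
    """Running totals of the best_n seasons, ordered best season first."""
    top_seasons = sorted(season_values, reverse=True)[:best_n]
    return list(accumulate(top_seasons))


# Hypothetical season WAR values, chosen only to show the two curve shapes.
peaky = [5.0, 4.0, 1.0, 0.5, 0.5, 0.0]    # big peak, quick fade
steady = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # no peak, steady value

peaky_curve = cumulative_peak_curve(peaky)
steady_curve = cumulative_peak_curve(steady)
```

Because the seasons are sorted before summing, a peak-heavy career jumps out to an early lead and then flattens, while a steady career climbs in a nearly straight line.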
I decided to take a look at a few other players who have already been dismissed from the ballot or will arrive on the ballot shortly. Leaving in Smith and Hoffman, and adding Billy Wagner, Dan Quisenberry, and Doug Jones, their cumulative WAR graphs look like this.
As you can see, Smith comes out as the clear leader, with Jones, Wagner and Hoffman bunched together and Quisenberry trailing behind. You can have a Hall of Fame that includes Trevor Hoffman, but that Hall of Fame needs to include better players like Lee Smith, and equivalents like Doug Jones and Billy Wagner. It seems too inclusive, yet that is the Hall the writers appear to have created.
Much of this debate would have been avoided if the Hall of Fame had never let Bruce Sutter in based on two exceptional years and a small handful of pretty good years. If Sutter does not get in, Gossage probably does not get the momentum he needs, and if Gossage doesn’t get in, Smith wouldn’t either. I am not entirely sure why Fingers made it, but in that scenario only Fingers and Rivera would make it in, along with hybrid players like Eckersley and John Smoltz. I am by no means a small-hall guy, but specialists pitching in at most half of their teams’ games for an inning should be truly exceptional to make the Hall of Fame.
*A few interesting facts about Smith. In the first seven ballots Lee Smith has appeared on beginning in 2003, every single player who finished higher than Smith on the ballot is in the Hall of Fame. If Jeff Bagwell and Jack Morris make the Hall, it will be true for Smith’s first ten ballots. Smith finished higher than Morris on his first seven ballots. Smith held the all-time saves lead for 13 years, longer than Fingers (12), Reardon (1), and Hoffman (5). One more: Smith was on the wrong end of the platoon advantage for 53.88% of his matchups. Rivera (51.27%), Sutter (49.46%), Hoffman (48.19%), Gossage (45.98%), Fingers (43.8%) all lag behind.
Craig Edwards can be found on twitter @craigjedwards.
I’m in a tough position with Smith, because I don’t think he necessarily deserves to be in, but he is also more deserving than people that are already in, or will most likely make it in. Personally, I’d put him in, but that’s mainly because he was always a favorite of mine, which is usually enough to push someone over the edge for me.
Great article. Amazing title.
The reliever who belongs in the Hall of Fame, beyond any doubt, is Hoyt Wilhelm, and he was duly elected in 1985. He was an excellent starter for 3 seasons in the middle of his career, but he pitched as many or nearly as many innings in the years when he was a reliever.
His 2254 IP dwarfs the relievers discussed in this article. His 3.06 FIP (2.54 ERA) must be as good or better than those guys, though I haven’t checked.
He was also a good hitter for a reliever (.121 wOBA in 493 PA’s).
Roy Face, a great reliever, is arguably HOF quality for his career, but should definitely have some kind of honorable mention plaque for his 1959 season: 2.70 FIP, 2.60 ERA, 18-1 W-L. I know pitchers wins and losses don’t normally mean much, but 18-1!
Poz did a great bit on the hall, in which it was found that the miserly bbwa led the vet committee to overcompensate with crap … especially the catcher position, which is remarkable since it is so under-represented. I hope that they do not think they have to overcompensate with crap since someone unworthy is in and because the position does not have much representation. That said, Wagner was awesome!
Just look at that Rivera cumulative WAR graph – maybe it’s just because I have a small screen on my netbook, but it looks almost perfectly linear. I’d call him inhuman, but there would probably be more variation between seasons if he actually WAS a machine than there is.
Fingers got in because he was the all-time saves leader, and the only man with 300+ saves at the time. Add in many postseason appearances, and he was voted in, but more as a matter of luck, given he pitched when saves first became “important” and was used accordingly. Had he come up 10-15 years later, he would not have made the Hall, and would have fallen into the second tier of all-time closers.
Sutter got into the HoF with help from voters giving him credit for inventing (or at least popularizing) the split finger fastball.
He did not get in on statistics alone, so using him as the baseline is a faulty assumption.
It seems like it should be much, much easier to maintain top performance if you know that you are only ever warming up once, and then pitching one inning at the most. Giving modern closers credit for a more effective usage pattern that they have nothing to do with seems out of line, which is why I would be fine with a “just Rivera” reliever selection from here on out. The managers/GMs are the ones who should get the credit, such as it is.
I don’t think an analogous situation is even possible for hitters – there doesn’t appear to be an in-game usage pattern that allows hitters a dramatic boost in their performance relative to what you would see in full-time action (platoon advantage is about it).
On those additional facts – how do Fingers and Gossage end up facing so few lefties compared to the other guys? Was it the pinch-hitting rate during their era?
Sutter may not have made it to the Hall on statistics alone, but you can be sure that future voters will be looking at the statistics when they vote. If Hoffman and Wagner stay on the ballot for multiple years, we are talking about an examination that will be done in 10-15 years, 40 years after Sutter’s prime. Gossage getting in the Hall shortly after Sutter also speaks to voters making comparisons between similar players.
Re: facing fewer lefties
I think Fingers and Gossage faced fewer lefties for the reason you stated regarding pinch hitting being less popular, but it was also not as feasible if a pitcher was going to pitch a couple innings. If a pitcher pitched two innings and let one runner on, he faced seven batters. There are only so many lefties you can stock your roster with, and the pitchers are more likely to face a team’s heart of the order where hitters would not likely be taken out of the game. Smith pitched in both eras, and his PAs against lefties got bigger in the mid to late eighties. I suspect Rivera and Hoffman’s aren’t as big because the cutter and changeup, respectively, neutralized lefties and did not provide as much of an advantage for hitters.
I think Billy Wagner belongs in the Hall of Fame. When you look at pure nastiness, Mariano is his only peer. Had he pitched ~100 more innings, he would have qualified for the ERA+ and K/9 titles. His ERA+ was 187 (2nd all-time) and his 11.90 K/9 would easily be the best mark in baseball history.
It would just be a shame to me to keep someone out of the Hall when they were clearly this historically great. Had he played two or more additional years, he might have 500+ saves and his WAR total would only be surpassed by Rivera and possibly Gossage. The guy just belongs.
The perfect linearity of Rivera’s line in the first chart is proof that he’s a robot. I don’t think it’s fair to judge humans and robots by the same standards.
Lou Whitaker’s career WAR (74) more than doubles all of these closers’ WAR…except Rivera (39). Even adjusting for some of WAR’s deficiencies, it upsets me how famous a closer can become despite not impacting his team’s performance that much.
Well, you’re only looking at an FIP-based fWAR. You know that Rivera has significantly outperformed his FIP for his entire career, right? That 39 number is a complete joke.
bWAR has Whitaker at 69.7 WAR, Rivera at 56.3.
If we’re comparing relief pitchers over an entire career, I think I prefer WPA to WAR. This is because relievers (especially “ace” relievers or closers) tend to rack up small numbers of innings, but they are highly leveraged innings that tend to have disproportional influence on the outcome of a game.
Being highly effective for 1-2 highly leveraged innings is the hallmark of modern reliever effectiveness. WAR doesn’t care at all about leverage, it just cares about quality (performance, specifically FIP) and quantity (innings).
If you look at career WPA, Hoffman (32.98) and Gossage (31.40) soundly defeat Lee Smith (23.97). Rivera trounces them both (54.70).
Sutter sits at 19.61 career WPA. I think that this stat, combined with the WAR data you have presented, makes for a pretty compelling case that perhaps he does not belong in the HOF. But in terms of Lee Smith’s consideration, I think it’s a big mistake when you are thinking about the HOF to always be comparing a player to the *worst* current HOFer at his position. If you don’t think player A deserved to be in the HOF, you shouldn’t vote for player B just because player A wound up getting in.
OzzieGuillen (above) is upset about closers getting famous when they don’t affect a team’s performance that much. But again, this statement relies on WAR, which discounts the leverage of the situation. WAR considers team performance in the “Pythagorean Win” sense—that is, a function of total runs scored and total runs allowed. And of course, since a reliever only pitches a small amount of innings, his influence over a team’s total runs allowed over the course of a long season is going to be small. As well, the difference in WAR between the best and worst relievers in baseball will not be anywhere near as striking as the difference between the best and worst shortstop or starting pitcher.
But on the other hand, a properly-utilized ace reliever will be injected into games in the most highly-leveraged situations, where the difference between allowing 0 and 1 runs can be the difference between winning or losing. I don’t think it’s naive, archaic, or sabermetrically-challenged for me to make the claim that superior performance in these situations provides value not captured by WAR.
Since a reliever’s perceived value is almost entirely tied up in performance in highly-leveraged situations, if you use a metric that is intentionally agnostic to this context, of course it’s not going to capture the reliever’s full value.
I encourage you to at least consider WPA when evaluating relievers. I’m pretty sure WPA only takes into account a position player’s offense, so take this with a grain of salt, but Lou Whitaker’s career WPA of 28.01 now seems like a saner comparison with the likes of Goose Gossage (31.40) and Trevor Hoffman (32.98).
Just to make another important point, statistics don’t need to be the absolute only criterion one uses when considering a player for the HOF. There will always be a long list of players that are on the very fuzzy borderline (statistically) for enshrinement. But what about the less quantifiable aspects of a player’s career that ultimately leave a lasting positive influence on the game and its fans? Breaking the color barrier (Jackie Robinson), being an iconic player defining baseball in a city for 20 years (Willie Stargell), giving inspiring and dominant postseason performances (here’s a more recent example, in Curt Schilling), being a race-transcending star on and off the field (Roberto Clemente)…
Lee Smith may be a marginal HOF case statistically, as many players are, for his regular season performance. But if I’m on the fence, I’d rather give the nod to a player like Curt Schilling or Chipper Jones on the basis of what they’ve truly meant to the game of baseball over the last couple of decades. I’m sure Lee Smith had his moments, but he’s no icon in any of the 8 cities he played in, and he never left his mark in the postseason.
Peter,
You make several very good points, especially in relation to the highly leveraged nature of relievers. I also agree that looking at the lowest level of Hall of Famer for comparison is generally not a good idea, although with Hoffman, Smith, and Jones, we are dealing with non-HOFers. A couple of points, though.
1. WAR, I believe, does take leverage index into account for relievers. Those relievers do get credit for pitching in more important situations.
2. WPA is a good tool, but it doesn’t tell you how the outs are made or how the runs are given up. WAR takes the most important factors: Ks, BBs, and homers. WPA is not aware of, nor does it consider, park factors or defense that the pitcher has no control over.
1. I don’t believe WAR takes leverage index into account for relievers. If you can show me a link from fangraphs where it says WAR is calculated that way, please share it. I would be surprised if it did, since one of the theoretical bases of WAR is that performance across situations in terms of leverage is random and therefore not indicative of true value.
2. WAR takes into account ks, bbs, and homers, but it doesn’t take *everything* meaningful into account. It makes assumptions that pitchers aren’t able to systematically induce weak contact (leading to lower BABIP) and that “clutchness” doesn’t exist. It doesn’t account for the pitcher’s own defensive abilities, nor his ability to keep runners from advancing on the basepaths via steal or wild pitch. You’ll notice that Mariano Rivera, who has had all of these fine qualities, routinely outperforms his FIP, and therefore is undervalued by WAR.
You can make the argument that ks, bbs, and homers are important because they tend to correlate well year-to-year, while other stats don’t. Therefore, as tools for *projection* people have had success with them.
That doesn’t mean that when a player’s career is finished you aren’t allowed to consider anything else, especially considering that a lot of that pure luck is going to even out over a 15-year career. WPA is more of a descriptive statistic than a projective statistic, but when we’re looking back at a player’s career we’re trying to describe, not project. You’re allowed to entertain the possibility that a player outperforming his K, BB, and HR rates might not have just gotten lucky for 15 years, and maybe give that player credit for at least some of the other variance contributing to his performance.
Since it seems at least one person is listening, I’m going to use this opportunity to make another case for WPA, in the context of how one votes for MVP. A good way to define MVP, in my opinion, is: What player has contributed most to the success of his team?
As in the case of rating HOFers, we want to *describe* what happened in a given season, rather than infer the player’s underlying skill. We give credit for what a player did, just like we award the World Series trophy to the team in the series that gets the most wins, not the most Pythagorean wins.
Going with WAR, I think, can really undermine the spirit of the MVP. Consider two players: Player A is an absolute terror at the plate, but only when the game is entirely out of reach in either direction. In situations where good hitting can actually affect the outcome of the game, he strikes out virtually every time. Player B strikes out virtually every time in meaningless situations, but is an absolute terror when it counts. Let’s say both these players finish with identical stats. WAR will say both these players are equally valuable, but it is evident that player B has contributed much more to the success of his team.
WPA tends to be highly correlated with WAR—you don’t have to be superhumanly clutch to have a very high WPA—but I think it adds something highly desirable.
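To put rough numbers on the Player A / Player B thought experiment, here is a toy sketch. It is not how WPA is actually calculated (real WPA sums changes in win probability from play-by-play data); it only weights each plate appearance by a made-up leverage value to show how two identical stat lines can separate once context counts. Every constant below is hypothetical.

```python
# Each plate appearance is a (leverage, outcome_value) pair.
# All values are illustrative assumptions, not real run or win values.
HIT, OUT = 0.5, -0.25      # hypothetical context-neutral value of each outcome
LOW, HIGH = 0.25, 2.0      # hypothetical leverage of blowout vs. close-game spots


def context_neutral_value(plate_appearances):
    """Sum outcome values while ignoring the game situation (the WAR-like view)."""
    return sum(value for _, value in plate_appearances)


def leverage_weighted_value(plate_appearances):
    """Scale each outcome by the leverage of its situation (the WPA-like view)."""
    return sum(leverage * value for leverage, value in plate_appearances)


# Player A hits only in blowouts; Player B hits only when it counts.
player_a = [(LOW, HIT)] * 10 + [(HIGH, OUT)] * 10
player_b = [(LOW, OUT)] * 10 + [(HIGH, HIT)] * 10

neutral_a = context_neutral_value(player_a)
neutral_b = context_neutral_value(player_b)
weighted_a = leverage_weighted_value(player_a)
weighted_b = leverage_weighted_value(player_b)
```

With these inputs the two context-neutral totals are identical, while the leverage-weighted totals put Player B well ahead, which is the gap the argument above is pointing at.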
See this one
http://www.fangraphs.com/blogs/index.php/war-and-relievers/
I believe that is still accurate. All implementations of the framework are going to be better at some things than others. I’m not going to argue that Rivera is underrated according to his WAR. He is an outlier. The rest of the relievers cited don’t have those same issues. Some stats may even out over the course of a career, but if a pitcher pitches half of all his innings in a pitcher’s park, that will not even out. If there is evidence that pitchers have control over the type of contact they induce and that they can sustain over the course of a career, I have not seen it. There are going to be outliers going both ways over the course of a career with great pitchers on either side of it. I feel more comfortable relying on ks, bbs, and homers not just because it is predictive, but also because it is descriptive of the events that the pitcher has the most control over. How much control a pitcher has over the type of contact I’m not sure about, but I’m very confident when there is no contact.
I went to the link. That honestly surprises me that they would even give “half credit” for leverage, as it seems (to me) to go generally against the whole spirit of WAR.
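A minimal sketch of what “half credit” for leverage could look like in practice, assuming it means blending the reliever’s average leverage index halfway back toward a neutral 1.0 before scaling his context-neutral wins. That reading is an assumption on my part, and the function names and inputs are hypothetical; the linked post is the place to check the actual method.

```python
def leverage_multiplier(avg_leverage_index, credit_share=0.5):
    """Blend the pitcher's average leverage index with neutral leverage (1.0).

    credit_share=0.5 is the assumed "half credit"; 0.0 would ignore leverage
    entirely and 1.0 would give full credit for it.
    """
    return 1.0 + credit_share * (avg_leverage_index - 1.0)


def leverage_adjusted_wins(context_neutral_wins, avg_leverage_index):
    """Scale context-neutral wins by the blended leverage multiplier."""
    return context_neutral_wins * leverage_multiplier(avg_leverage_index)


# Hypothetical relievers with the same context-neutral value.
closer = leverage_adjusted_wins(1.0, 2.0)   # pitches in tight ninth innings
mop_up = leverage_adjusted_wins(1.0, 0.5)   # pitches mostly in blowouts
```

Under this assumption a closer with twice the average leverage gets a 1.5x bump rather than 2x, so leverage matters but does not dominate the innings and quality terms.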
Surely you don’t really mean there is no evidence that pitchers have control over the type of contact they induce? GB/FB rates certainly have some stability, even over the course of a career. For example, Hoffman was consistently a flyball pitcher on balls in play.
A pitcher’s park won’t even out over the course of a career (a big park would certainly have played to Hoffman’s strengths). Nor will the contextual effects of league and era (In this case, though their careers overlapped, Smith played in a weaker offensive era than Hoffman, no?). But I was careful to say that “some” things would even out—namely the pure luck aspects.
Basically, my goal here is for neither of us to throw out the baby with the bathwater. Because WAR takes into account things like ballpark effects, it should not be dismissed. Then again, if you’ve got the time (say, you’re voting for the HOF), why even limit yourself to such a coarse summary statistic, right?
Anyway, I hope we both got something to mull over from this exchange.
Was doing some stat perusal and, while we’re on the topic of modern big name relief pitchers, there is one such closer from the 1990s/2000s who actually is 2nd *all-time* in lowest career BABIP against (with at least 300 IP). Any guesses?
Troy Percival (astonishingly) held opponents to a .230 BABIP over 700+ innings of relief. Considering his great (fortune?/skill?) on balls in play, and that he additionally struck out guys at a pretty good clip, and that he didn’t give up an insane amount of walks/homers (though a fair amount of each), one would think he’d have managed an ERA even better than his career 3.17 mark.
This is a great article, bummed I didn’t see it until now.
Basically, after Rivera, there’s no real sure things for modern era relievers and the HoF (Eckersley and Smoltz are hybrids).
I really like shutdowns and meltdowns, especially shutdown/meltdown ratio, but that seems to favor more current relievers and I’m not sure why. Perhaps the longer outings made relievers like Gossage and Fingers less effective?
Looking at the stats, Bruce Sutter is basically Francisco Rodriguez. Not that there’s anything wrong with that, but there’s no way K-Rod gets in the Hall. I say forget the past mistakes and keep ’em all out except Rivera.
WAR has to be the single worst stat in the world to use for closers, bar none. It is totally useless. For example, Valverde was a perfect 49-49 last year, with a 2.24 ERA. But he had a negligible 1.0 WAR. Benoit, who had a good season, but not great, had a 1.3 WAR. Coke, who started and relieved, was maybe average, at best, with a 4.49, and he had a 2.0 WAR.
So in Coke’s case, you are basically rewarding him for throwing an extra 36.1 innings, in far lower leverage situations, and allowing an extra 36 ERs in those extra 36.1 IPs.
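Taking the figures in this comment at face value, the implied ERA over those extra innings is easy to check. The only wrinkle is that “36.1” in box-score notation means 36 and one third innings, not 36.1 in decimal. The helper names below are made up for this sketch.

```python
def innings_from_box_score(ip_text):
    """Convert box-score innings ("36.1" = 36 innings plus 1 out) to a number."""
    whole, _, outs = ip_text.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


# The comment's figures: 36 earned runs over an extra 36.1 innings.
extra_innings = innings_from_box_score("36.1")
extra_era = era(36, extra_innings)
```

That works out to an ERA a little under 9 over the extra innings, which is the point the comment is making about what those innings were worth by runs allowed.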
The key thing I learned in statistics classes 30 years ago was that, if your model tells you that Coke is a much better pitcher than a guy that saves 49-49, then throw the model away.
And Smith should be a lock.
Joe
The guy that went 49 for 49 in saves walked more than 4 batters per nine innings. There was a lot of luck involved. Coke made 14 starts. Those 36 extra innings are not meaningless. In fact, Coke pitched 50 percent more innings than Valverde. If a hitter had 162 at bats in the ninth inning and was above average that wouldn’t make him more valuable than an average hitter with 250 at bats that were spaced throughout the game. Coke is not being rewarded for giving up more earned runs. He is simply not being punished for a lot of bad luck while Valverde is not being rewarded for a lot of good luck.
If the model says that Coke is better than Valverde, perhaps it is not the model that should be thrown out, but the conventional wisdom that closers are as valuable as everyone thinks they are. WAR says that Rivera, Gossage, and Smith have been the most valuable closers of the past thirty years. Seems like it is doing something right.
Craig,
I’ll give you a +1 on Mo, Goose and Smith being the top-3 according to WAR. From that perspective, it works as a relational tool for closers only.
But sometimes it’s good to let democracy work for you. Is there even one person in the entire world of 7B people that think Coke was better or more valuable than Valverde last year?
Here is the weakness of WAR for RPs, particularly closers.
An average player who consistently gets 650 PAs is a more valuable (WAR) player than a slightly better player who only gets 500 PAs. That’s because the loss in value from giving the missing 150 PAs to a replacement player is larger than the gain in value from the 500 PA player being slightly better.
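The arithmetic behind that claim can be sketched with made-up rates. Assume value is measured in runs per plate appearance relative to league average, and that a replacement player sits some fixed amount below average; both numbers here are hypothetical and chosen only to illustrate the mechanism.

```python
def runs_above_replacement(runs_per_pa_vs_average, plate_appearances,
                           replacement_runs_per_pa=-0.03):
    """Value over a replacement player across the given playing time.

    replacement_runs_per_pa is an assumed gap between a replacement player
    and a league-average one, not a published constant.
    """
    margin = runs_per_pa_vs_average - replacement_runs_per_pa
    return margin * plate_appearances


# Hypothetical players: exactly average over 650 PA vs. slightly better over 500 PA.
average_regular = runs_above_replacement(0.0, 650)
better_part_timer = runs_above_replacement(0.005, 500)
```

With these assumed rates the average regular comes out ahead, because every plate appearance earns credit for beating the replacement baseline and he has 150 more of them.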
But the replacement value doesn’t apply to an RP. There is no replacement for Valverde since he pitched the entire season. Therefore, an RP with 100 IPs should not accrue benefits for value over replacement because he pitched 30 more IPs than Valverde.
Thinking about it from a more intuitive perspective,
As a RS fan, I have always appreciated Wakefield. He was generally a nice #4 SP overall. 200-180 with a 4.41. One might even call him an adequate #3. But no one would even call him more than an average pitcher.
As a RS fan, I have a duty to hate Mo, but as a BB fan, he’s about the best closer in history, and no one would rank him any lower than maybe #5.
But Wake has a career WAR of 38.6, while Mo has a career WAR of 39.4, virtually identical. Would anyone consider them to be equals?
Joe,
I agree with your premise regarding Wakefield and Rivera. Rivera has definitely been the better pitcher, but at the same time all those innings have value.
Valverde did pitch the whole year, but consider if he hadn’t. All the relievers move up an inning, a AAA pitcher then gets 60 innings in the fifth and sixth of blowout games. The lost impact isn’t that great. Giving a replacement player 180 starter innings is going to hurt a team a lot more than losing a closer.
Joe,
You’ve uncovered something that people who take pitcher WAR too seriously fail to acknowledge: it’s a counting stat and basically an IP contest. The difference in WAR/inning between elite closers (and often elite starters) is phenomenally small. So, yes, whoever accumulates the most innings is going to finish on top. Also, when using fWAR, which doesn’t properly measure the ability of pitchers who can induce weak contact, you have to be a certain type of pitcher to finish on top. (See Greinke, Zack this year)