There are a lot of people who like baseball. Almost 74 million people attended an MLB game last year, and a 2006 Gallup poll estimated that 47% of Americans identify as baseball fans. Almost every one of those fans is more precisely described as a fan of a team rather than of the sport itself. FanGraphs readers certainly lie on the less casual end of the spectrum, and that seems to lead to a broader appreciation of baseball in general, but that’s not the case for the vast majority of the baseball-loving populace. Even those who have grown into sport-wide interest didn’t start that way, and probably maintain a preference for one team over all others.
The fan-team relationship, in many ways, is at the heart of baseball. There are easier ways to generate numbers at random, but preference for one outcome over another is what provides a narrative. To be sure, there are lots of ways to enjoy the sport. There is significant pleasure to be had from detailed analysis, or moments of physical grace and power, and everyone is free to enjoy baseball in whatever way they see fit. Rooting, however, is what turns baseball from a hobby into a sport.
What are fans rooting for when they root for a team? Teams today are thoroughly modern organizations, exceedingly large and multifaceted and difficult to grasp entirely. Fans are like the proverbial blind people around an elephant, who comprehend only what they immediately perceive. For most, that’s the laundry, as Jerry Seinfeld famously described it. Everything about a team changes but its name, and sometimes even that changes, but fans remain loyal to the concept of the team, to the history and experiences and hopes they share with other fans. But the second-most enduring aspect of most teams is the owner, who outlasts almost every player despite being virtually ignored by most fans. Owners are also the ones who benefit most when a team succeeds, and so rooting for a team is in many ways closer to rooting for its owner than rooting for its players.
There have been a few events in the news recently that have prompted thoughts on this topic from several people. The two major ones were Kris Bryant’s demotion by the Cubs, for reasons connected to his arbitration clock rather than his performance, and the comments made by Angels owner Arte Moreno about Josh Hamilton’s drug problem. Both are conflicts between people who are part of the same team and who, from a team-oriented viewpoint, should have the same goals but clearly don’t. The question becomes who is “right”, from the viewpoint of the fan. Lots of people have written excellent things about these conflicts, but my favorite is by Jason Wojciechowski, found here, about the paradigm from which we view these sorts of disagreements. The whole piece is well worth a read, but the relevant part for this discussion is in the last paragraph:
“A notion of ethics or even morals is something I think we ought to promote in business rather than celebrating the pure concept of moneymaking… We’ve created a political-legal-social scheme that allows firms to exist (thrive!) because we’ve judged the firm a useful construct. Where we go from that starting point… is up to us…. I would like us not to say ‘baseball teams are businesses and so they should be applauded for demoting Kris Bryant’ as our starting point. That’s not our starting point. That’s a moral/ethical choice that has been made from an earlier starting point. Recognition that there are other choices is the first step to reform.”
I don’t think there can be much disagreement with Jason’s point, and I think it’s critical to this discussion. In every aspect of baseball, the viewpoint and goals are decisively pro-team, but that is a starting point. I think it’s time for fans to take a different, pro-player, view as our starting point.
This can be traced back to analytics and the rise of sabermetrics, which have blurred the line between fans and observers on the outside and professionals on the inside. Anyone who demonstrates an ability to find useful information for a team has the potential to be richly rewarded, and as a result, analytics has one motivating goal in almost every case: to make teams more money. Usually, this takes the form of identifying or measuring undervalued skills and assets, and capitalizing on those market inefficiencies. Under the prevailing framework of baseball analysis, a researcher who identified (for example) the key to preventing Tommy John surgery would be entirely justified in keeping that information private and selling it to a single team for untold sums of money, rather than releasing it to the public and keeping the other 97% of pitchers healthy as well.
Now, I am not suggesting that someone who made such a major breakthrough should not be rewarded for their work, medical or analytical. Modern baseball analysis is increasingly a business rather than a hobby, and the researcher who identifies the perfect defense-independent pitching metric should be rewarded for the likely massive amounts of work that went into that discovery. But teams are trying to save money for one reason only: to make their owners more money. Every team, from the Red Sox and Yankees to the A’s and Rays, has the ability to spend more and chooses not to. The only “spending limits” they encounter are owner-imposed, and exist for the purposes of profit.
We, meaning fans and hobbyists, are not professional baseball researchers or owners of teams, and as such, are not restricted or motivated by the profit motive. We should feel no such compulsion to orient our passion solely toward teams and their profits.
Despite that, the perspective of the fan tends to always be pro-team, and in many cases, that means it is anti-player. Mike Trout’s contract is “good” because the Angels don’t pay him a lot despite his being very good, and Josh Hamilton’s contract is “bad” because the Angels do pay him a lot despite his not being very good. Really, what we mean when we say a contract is good or bad is that it makes or loses money for the owner. When the topic of contracts comes up, fans often view them solely as a question of what the team “should” do. This is an example (no offense, T-Sky, you were just the first I saw): the author writes that “if I were a general manager… I would hand out a lot more contracts like the one the Cleveland Indians just gave Carlos Carrasco.” To be fair, the author also discusses later in the article why he feels these deals are good for the player, so the focus is not just on the team (owner) saving money, but the wording suggests that the player has no agency or control over his own future. While people might not consciously think this, the language used is important, and it shows the subconscious assumptions of most fans: contracts are bequeathed by teams to deserving players, as determined by that same team. This obviously isn’t how contract negotiations work in reality, but it illustrates the viewpoint fans bring – team first, and frequently, team only.
Contract negotiations are not the only aspect of baseball in which this fan viewpoint reigns supreme – on the contrary, this is baked into everything we as fans do. It colors every aspect of the game. As another example, when each year’s Hall of Fame discussions are happening, players are often given accolades for spending their entire career with a single team. There might be valid and legitimate reasons for this – a rapport developed with the fans of that team really is cool, and worth giving someone a bump for – but truly, what is being rewarded is the decision not to test the free-agent market and take the highest contract possible, and instead to reward a team (and an owner) with performance at below-market rates.
Dustin Pedroia, for example, has played for the Red Sox his entire career, and is currently signed through the 2021 season, after which he will be 38 and either finished or very close to finished playing baseball. He signed his current contract in 2013, but was already extended through 2014 and 2015, so the net extension covered 2016 through 2021 (six years) for $89 million, or about $15 million per year. In 2013, Pedroia was worth over 5 WAR. To that point in his career, he had averaged 4.7 WAR per 600 PAs, and two years prior, in 2011, he was worth almost 8 WAR. Had he made his services available to the highest bidder, he would have signed for so, so, so much more than $15 million per year. Instead, he signed with the Red Sox, saving them that large amount of money. Maybe that meant more money was spent on other players, but the Red Sox are one of baseball’s richest teams, and the limits to their spending have always been self-imposed. What it definitely meant was that more money went to the team and its owners.
The standard is to consider Pedroia’s career in a slightly better light because of that. (I don’t mean to point fingers, either – I absolutely am guilty of this.) He sold his services for less than they were worth to a team that could absolutely afford to pay full price, and he’s more likely to make the Hall of Fame because of it. That also means that, implicitly, we’re punishing players that choose to go to the market, and make as much money as they can, which is the last thing I want to do! But when it’s portrayed as rewarding “loyalty”, or whatever other word is used to describe giving money back to team owners, it’s hard not to. This is but one example of the subtle but pervasive pro-team culture that’s endemic in all of baseball fandom.
If this resonates at all with you, I’d encourage you to try to shift your focus as a fan, away from the team and toward the players. There are some trends in baseball that make this more of a legitimate option. Fantasy baseball allows fans to have “their” guys, regardless of what team they’re on. The drive to recognize prospects as early as possible allows fans to keep track of players long before they do anything that impacts a major league team, and hopefully root for them no matter what team they debut with. National media coverage and MLB.tv means you aren’t restricted by geography to what players you follow. Those are steps in the right direction.
If we as fans take a more individual focus, perhaps the conversation will change. Perhaps it will no longer be considered automatically “good” that Bryant has to wait an extra year to sign his first free agent contract, and is more likely to see his career ended by an injury before he ever gets paid, or “bad” that Josh Hamilton capitalized on his excellent performance through age 31. The good/bad labels come from the perspective of the people paying those players, but we as fans are not those people, and we should feel no obligation to take that as our starting point. Root for the players, not the teams.
I really enjoyed Jeff Sullivan’s piece on the prospect pedigree of good players, and it was interesting to see how many solid players never cracked the Baseball America 100 in any year. This is an extension of that article, and not a particularly original one. In fact, I think it’s about the most obvious next step: how many great players were prospects?
It was interesting to see that someone can have a decent season as a totally unheralded player, but there are a lot of players who have a 3-win season and promptly fade into ignominy. Players at that threshold in 2010 included Cliff Pennington and Dallas Braden, and in 2011, Emilio Bonifacio and Alexi Ogando. Cherry-picked names, to be sure, but it’s easy to imagine they (and players like them) are the source of that ~33% of un-ranked good players, and the real elite players are usually identified as at least good. That doesn’t mean it’s true, though, so I tested it.
I pulled the top 10 pitchers and the top 10 position players by WAR for each year from 2010 through 2014. If there was a tie for 10th, I included both players, so the sample ended up at 101 players. Then, for each player, I found their highest ranking on the BA lists. The same caveats as in Jeff’s article apply here, but again, BA is the industry standard, and their lists go back long enough to make them very useful. Following: a giant table, with every qualifying player-year, their WAR in that year (and how that ranked among all players), and their highest prospect ranking and the year of that ranking.
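The core lookup, finding each player’s best-ever BA ranking, is simple enough to sketch. Everything below is placeholder data (the names and rankings are invented, not the actual leaderboards or BA lists):

```python
# Sketch of the sample-building step. All names and rankings here are
# invented placeholders, not real Baseball America data.
ba_rankings = {
    # player -> {list year: BA top-100 rank}
    "Player A": {2007: 55, 2008: 12},
    "Player B": {2010: 88},
}

def highest_ba_rank(player, rankings=ba_rankings):
    """Return (best rank, year of that rank), or None if never ranked.
    The 'highest' ranking is the lowest rank number (closest to #1)."""
    years = rankings.get(player)
    if not years:
        return None
    year, rank = min(years.items(), key=lambda kv: kv[1])
    return rank, year

# Stand-in for one year's top-10-by-WAR list
for player in ["Player A", "Player B", "Player C"]:
    print(player, highest_ba_rank(player))
```

Run over the real 101-player sample, the same lookup produces the table that follows.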
That is a big, ugly table, so here are some summary facts. Of this 101-player sample, 20 were never ranked by Baseball America, so indeed, top players appear to be more likely to have been a ranked prospect (80%) than merely good players (66%, per Jeff’s article). None of the unranked players was ever the best position player or pitcher in 2010-2014; among players who did finish first, the lowest prospect ranking belonged to Cliff Lee, who topped out at 30th in 2003. The unranked players tended to be concentrated toward the bottom of the WAR leaderboards; 75% of them ranked 5th through 10th. I expected more of the players finishing 8th through 10th in a given season to be beneficiaries of a fluke year, but there are far fewer of those than I expected. The unranked players with the least impressive careers outside their top seasons are probably Andres Torres and R.A. Dickey, but the rest are pretty uniformly great. Maybe not top-10-WAR-every-year great, but still, great.
What about pitchers versus position players? If the top 10 by WAR of one group was more likely to include unranked players than the other, that would suggest that group was more difficult to scout and accurately predict. But while the split between pitchers and hitters among the unranked players is not totally even, 12 to 8, it’s well within what I would expect from random variation. Maybe a bigger sample could pull something meaningful out, but I’m not comfortable concluding there’s a difference based on this alone.
The following chart digs more into the individual ranks in each season. The x-axis is the WAR rank, and the bar height is the percentage of players at that point that were in the BA top 100. The line running across the chart is the average BA ranking of the players that were ranked.
What this shows is a pretty steady decline in the percentage of players ranked in the BA Top 100 as you move down the WAR leaderboard, and an essentially random average ranking among those who were ranked. This fits with my perception of prospect rankings – being good enough to be ranked is pretty important, but the exact position within the rankings is not very predictive. As Jeff showed, it’s very tough to be good without being ranked, but this suggests that a prospect ranked as if he’ll be merely good can still turn out to be great in a given season.
What about consistent greatness? This list I created really doesn’t capture the best players of the last five years, but the best player-seasons. Can someone be really excellent over a sustained period of time if they weren’t ranked? For this, rather than looking at individual seasons, I grabbed the top 25 hitters and the top 25 pitchers by total WAR from 2010 through 2014. I thought about doing several five-year periods, but I didn’t want to double-count someone like Miguel Cabrera, who would show up for both 2010-14 and 2009-13. Below, a slightly less-giant table than the first, containing similar information: their WAR from 2010-2014, their highest BA ranking (if any), and the year that ranking came in.
Of these 50 players, 12 were unranked, or almost the exact same percentage as the single-season leaders (24% for the five-year vs. 20% for the single-season). Of the 12 unranked players, 7 came between 38th and 50th on the leaderboard, but 3 came in the top 10 (Robinson Cano, Jose Bautista, and Ben Zobrist). At first glance, there was no meaningful split in the unranked players between pitchers and hitters (7 vs. 5), but interestingly, all 7 of the unranked pitchers were in the bottom half of the pitcher leaderboard. All of the top 12 pitchers in the last five years were ranked, with Max Scherzer (#66 on BA’s 2008 list) the lowest, so perhaps it’s less likely a pitcher will be truly elite out of nowhere than a hitter. Again, with this small a sample, I’m not comfortable concluding anything, but it’s certainly interesting.
This is kind of an anticlimactic article, because none of my expectations were turned upside down. A great player was likely to have been ranked at some point, more likely than a merely good player, but there are still some who come out of nowhere. Of those ranked, the actual rank seems to matter less than the fact that they cracked the top 100. None of that is very surprising, but hopefully it’s still interesting to see it all laid out.
To me, the first few weeks of baseball each year are small sample size season. It seems that every article is either a) drawing wildly irresponsible conclusions based on a few dozen plate appearances or innings (either with or without the routine “This is a small sample, but…” disclaimer), or b) showing why those claims are wildly irresponsible and not very useful. This is how we get articles comparing Charlie Blackmon and Mike Trout. It gets a little repetitive, but writing this in March, when the closest thing to real baseball I can experience is play-by-play tweeting of a spring training game, it honestly sounds lovely.
Fairly often in those early articles, I see analyses that use past-calendar-year stats, which incorporate the first x games of the current season and the last 162-x games of the previous season. The idea is to rely on more than a few games of evidence, but still incorporate hot first months in some way. I’m always conflicted about how much trust to put in those stats and the resulting conclusions.
On the one hand, they have a reasonable sample size, and aren’t drawing any crazy conclusions off a few good games. Including a large portion of the prior season limits the effect a hot first month can have on the results, which is probably a good thing. On the other hand, a lot of changes can be made in the offseason, and those changes can have major effects on a player’s performance basically immediately. If that’s the case, stat lines that treat game 1 of 2014 as following game 162 of 2013 the same way game 162 of 2013 followed game 161 of 2013 don’t present an accurate picture of skill.
Consider the case of Brandon McCarthy, who made a lot of changes to his offseason training regimen between the 2013 and 2014 seasons (detailed in this Eno Sarris article). He went on to record his healthiest season to date in 2014, hitting 200 innings exactly with the second-best WAR (3.0) and best xFIP (2.87) of his career. Combining his results from September/October 2013 (42.0 IP, 7.6% K-BB%, 3.74 xFIP) and March/April 2014 (37.1 IP, 15.2% K-BB%, 2.89 xFIP) would not give an accurate sense of McCarthy going into 2014. But is he the exception, or the rule?
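One way to see the problem concretely is to blend those two splits, as a calendar-year stat line implicitly does. The sketch below weights K-BB% by innings pitched, which is only an approximation (K-BB% should really be weighted by batters faced), and first converts baseball’s .1/.2 innings notation into true fractions:

```python
def ip_to_innings(ip):
    """Convert baseball innings-pitched notation (.1 = one out,
    .2 = two outs) into a true fractional number of innings."""
    whole = int(ip)
    outs = round((ip - whole) * 10)
    return whole + outs / 3

# McCarthy's splits from the text: (IP, K-BB%)
sep_oct_2013 = (ip_to_innings(42.0), 7.6)
mar_apr_2014 = (ip_to_innings(37.1), 15.2)

total_ip = sep_oct_2013[0] + mar_apr_2014[0]
blended = (sep_oct_2013[0] * sep_oct_2013[1]
           + mar_apr_2014[0] * mar_apr_2014[1]) / total_ip
print(f"{blended:.1f}%")  # → 11.2%
```

The blended ~11.2% K-BB% describes neither the 2013 version of McCarthy nor the 2014 one.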
To test this, I looked at the correlations between players’ stats in the first and second halves of 2014, and compared that to the correlation between their stats in the second half of 2013 and the first half of 2014. I expect the six-month discontinuity in the second case to make the correlations weaker, but by how much? If it’s a lot, that’s a sign that analysis relying on stats from the last calendar year probably shouldn’t be trusted; if it’s not, then incorporating the last few months of the previous season to boost sample size is more likely to be a good idea. I also looked at the correlations between stats in 2013 and 2014, to provide a sort of baseline for how predictable each statistic is from season-to-season.
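The comparison itself is just a correlation between paired stat lines. A minimal sketch, with invented K% figures standing in for the real qualified-player samples:

```python
import numpy as np

# Hypothetical K% for five players who qualified in both periods
period_one = np.array([22.1, 18.4, 25.0, 15.2, 19.8])
period_two = np.array([21.0, 17.9, 23.5, 16.8, 20.4])

r = np.corrcoef(period_one, period_two)[0, 1]  # Pearson correlation
r_squared = r ** 2
print(round(r_squared, 3))  # → 0.951
```

The same computation, repeated for each stat and each pair of time periods, fills in the tables that follow.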
I tried to choose stats that reflect primarily the skill of each player, but that they can control to some extent. Hopefully these are stats that won’t change due to a player switching teams, but might if he changes his approach. I settled on BB%, K%, ISO, and BsR for batters, and BB%, K%, GB%, and HR% for pitchers. Those look reasonable to me, but I’d welcome any suggestions.
I set a minimum of 400 PAs or 160 IP for the full-year samples, and 200 PAs or 80 IP for the half-year samples, and looked at all the players that showed up in both of the time frames being compared. I’m going to look at position players first, then starters. In the following table, the value in each cell is the linear R2 of the stats in the two time periods, except in the last row, which shows the number of players in the sample. I bolded the stronger of the two half vs. half correlations.
So these are some seriously unintuitive results, to the point that I went back and triple-checked the data, but it’s accurate. BB%, K%, and ISO all tracked better from player to player from the second half of 2013 to the first half of 2014 than they did from the first half of 2014 to the second half of 2014. Of the four selected stats, only BsR had a stronger correlation inside 2014, but it was odd in its own way, as it was also the only stat for which the full year correlation wasn’t the strongest.
What could explain this? First, it’s possible that this is just randomness, and if we looked at this over a larger sample, the in-year correlations would tend to be stronger. But even if that’s the case, the fact that randomness can make the cross-year correlations stronger (as opposed to just making the lead of the in-year correlations larger) suggests that the difference between the two is relatively small. One possible explanation is survivor bias – perhaps players that get a lot worse between the first and second halves are still likely to see playing time until the end of the season, while players who get substantially worse between seasons might be benched in the first month or two and not get to the 200 PA/80 IP minimum. There’s no doubt that there is survivor bias in this sample, but I’m not convinced by that explanation. Settling on randomness always feels half-hearted, but I really have no idea what else it could be. If anyone has any thoughts, post them in the comments!
The table for the pitchers is set up in the same way.
This looks a lot more like I expected. Three of the four stats are more strongly correlated in season than between seasons, and the exception (HR%) also has the smallest gap between the two correlations, making me inclined to chalk that up to random variation. Interestingly, the gap between the season-to-season correlations and the half-to-half correlations is relatively small (again with the exception of HR%), which fits with my perception of BB%, K%, and GB% as stats that stabilize relatively quickly.
It also doesn’t surprise me that pitchers are less predictable than hitters from the second half of one season to the first half of the other, relative to their in-season predictability. Intuitively, pitchers seem to have a lot more control over their approach, and a much greater ability to shift significantly in the offseason by adding a new pitch, changing a grip, or just getting healthy for the first time in a while. Hitters, on the other hand, seem like they have less ability to change their approach drastically. Even when they can make a change, it’s not necessarily the sort of thing that has to happen in the offseason; if a hitter wants to be more aggressive, he can just decide to be more aggressive, whereas a pitcher looking to throw more strikes is probably going to have to work at that. If true, hitter changes would happen throughout the season and offseason, while pitcher changes would be clustered in the offseason. These correlations don’t provide nearly enough evidence to conclude that’s true, but they do fit these perceptions, which is encouraging.
Overall, this suggests that while going back to last season to get a year’s worth of PAs for a hitter might be a good way to beef up your sample size, it’s probably not as good an idea for a pitcher, and also less necessary. After the first few starts, most starters have thrown enough innings that the interesting metrics – BB%, K%, Zone%, etc. – are more signal than noise, and not much is added by going back to the previous season. This analysis also suggests that adding old stats may even reduce accuracy, by ignoring the potentially significant changes pitchers make in the offseason. So the next time you read about a starter’s performance over his last 30 starts, stretching back to May 2014, beware! Or at least be skeptical.
There was an article published in 2013 on FanGraphs that focused on the value of starter inconsistency. The basic idea is relatively simple – a starter who does terribly in one start and very well in the next (e.g., 8 runs in 2 innings followed by 2 runs in 8 innings) gives his team better chances to win than one who is mediocre in two starts (5 runs in 5 innings both outings). Mr. Hunter did some math to illustrate the fact, and quantify it somewhat, but it was a relatively rough measure, and I think the concept is intuitive enough not to gain a ton from a rough demonstration. Definitely read that article, though!
I think the first question that comes to mind upon reading that is: is this sustainable? Is consistent inconsistency possible? To find out, I came up with a relatively simple measure of inconsistency within a season. For every pitcher, I calculated the standard deviation of the Game Scores of each of their starts. If you’re not familiar with Game Score, it’s a Bill James-developed metric that gives pitchers points for outs and strikeouts and docks them points for hits, walks, and runs. It’s mostly a narrative stat, but I think it does a good job of illustrating the quality of a given start. The best start of 2014 by Game Score: Clayton Kershaw’s no-hitter against the Rockies, on June 18th, in which he didn’t allow a hit or a walk (damn you, Hanley Ramirez) and struck out 15, good for a Game Score of 102. The worst: Colby Lewis’s July 10th start, in which he went 2.1 innings and gave up 13 hits and 13 runs. Didn’t walk anybody! Still had an abysmal Game Score of -12. The 2014 Rangers, ladies and gentlemen.
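The Kershaw start checks out against James’ original Game Score formula: start at 50, add a point per out and per strikeout, add two points per completed inning after the fourth, and subtract two per hit, one per walk, four per earned run, and two per unearned run. A sketch (note that FanGraphs may display a newer version of the stat; this is the original):

```python
def game_score(outs, strikeouts, hits, walks, earned_runs, unearned_runs=0):
    """Bill James' original Game Score."""
    score = 50
    score += outs                        # +1 per out recorded
    score += 2 * max(0, outs // 3 - 4)   # +2 per completed inning after the 4th
    score += strikeouts                  # +1 per strikeout
    score -= 2 * hits                    # -2 per hit
    score -= walks                       # -1 per walk
    score -= 4 * earned_runs             # -4 per earned run
    score -= 2 * unearned_runs           # -2 per unearned run
    return score

# Kershaw's no-hitter: 27 outs, 15 K, 0 H, 0 BB, 0 R
print(game_score(outs=27, strikeouts=15, hits=0, walks=0, earned_runs=0))  # → 102
```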
By looking at the standard deviation of a season’s worth of Game Scores, we get a measure of how inconsistent a pitcher’s starts were. I set a minimum of 10 starts to qualify, which ensures no one is labeled consistent off a single week of pitching. The usual caveats apply – pitchers needed to be good enough to pick up 10 starts, so this is a snapshot of usage, not just skill. Before looking at the year-to-year correlation, I want to look at the most and least consistent starters of 2014.
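In code, the measure is a one-liner per pitcher. The Game Score logs below are invented, just to show the shape of the calculation, including the 10-start filter:

```python
import statistics

# Invented Game Score logs, for illustration only
game_scores = {
    "Steady Starter":  [48, 52, 50, 47, 51, 49, 53, 50, 48, 52],
    "Erratic Starter": [80, 15, 65, 30, 75, 20, 70, 25, 60, 35],
    "Short Stint":     [90, 10],  # fewer than 10 starts: excluded
}

inconsistency = {
    name: statistics.stdev(scores)  # sample standard deviation
    for name, scores in game_scores.items()
    if len(scores) >= 10
}
for name, sd in sorted(inconsistency.items(), key=lambda kv: kv[1]):
    print(f"{name}: {sd:.1f}")
```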
Not surprisingly, we see a lot of starters with fairly low numbers of starts, since extreme values (either high or low) are likely to regress toward the standard deviation of the whole sample (15.53 in 2014) as the number of starts increases. On the consistent end, David Buchanan started his first game for the Phillies on May 20th, and between then and the end of the season, his worst start by Game Score came on June 3rd, when he gave up 7 runs in 6 innings, striking out 2 and walking 6, good for a Game Score of 28. As worst starts go, that’s not that awful, and his best wasn’t that great either – about two weeks later, on June 19th, he threw 7.2 innings of 1-run ball, with 1 walk and 4 strikeouts, for a Game Score of 70. The rest of his season was extremely consistent in its mediocrity, with 16 of his 20 starts producing Game Scores between 40 and 60, so it’s no surprise that he takes the bottom spot on this list.
Miles Mikolas was worse, but also much more erratic, with outings like his August 25th start (8 innings, 1 walk, 5 strikeouts, and no runs, for a Game Score of 80) and his July 7th start (3.1 innings, 0 walks, 5 strikeouts (looks fine so far!) and 9 runs (oh), for a Game Score of 5). Between those two starts, he had an RA9 of 7.15, but my guess is he gave the Rangers a much higher expected win percentage than if he had distributed those runs evenly across two 6-inning outings.
But does this mean anything when it comes to evaluation? Should a GM view one of the inconsistent starters with a little more optimism for 2015 than one of the consistent starters? In a word, no.
That is a pile of random points, and a resulting R2 value that is basically zero. The inconsistency of a pitcher in 2013 had almost nothing to do with their inconsistency in 2014, so while inconsistency is a hidden way for a pitcher’s results to be better than they look, it doesn’t appear to be a skill.
Even if this were predictable, though, it doesn’t seem like the sort of thing that would move the needle much in either direction. The theoretical argument makes sense, but in practice, there are lots of mitigating factors that might make consistency more valuable. Maybe the starter the day before got bombed, and the bullpen really just needs a day off, so a 100% chance of 6 innings/4 runs is more valuable to the team that day than a 50/50 chance of 8 innings/1 run or 4 innings/7 runs. There’s also just a lot of randomness, probably enough to drown out the small effect. Inconsistency isn’t consistent year-to-year, which means it isn’t predictable. If a pitcher could control which games he was bad in, and bank some great innings to use when he needed them, that would be a big deal. They can’t.
Managers, however, can. They can use their bad innings in games where the outcome is already practically decided, and save their best innings for the tightest of moments, with optimal bullpen use. Day-to-day inconsistency of a pitcher isn’t predictable, but pitcher-to-pitcher inconsistency of a bullpen is, and a similar argument for its value applies. A team with a lights-out closer (FIP of 2.00) and a pretty terrible long man (FIP of 5.00) is going to win more games than a team with two okay relievers (FIP of 3.50 for both), if the manager of the first team deploys his closer in close games and lets the other pitcher eat innings in blowouts. The ability to choose those spots makes the effect potentially much larger than among starters.
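The closer-plus-long-man argument can be made concrete with a deliberately crude model. Assume a run allowed costs some fraction of a win depending on the game state (the win-cost weights below are invented), and use FIP as a rough stand-in for expected runs:

```python
# Toy model, not a real win-probability calculation. The win-cost
# weights per run are invented for illustration.
COST_PER_RUN_CLOSE = 0.8    # hypothetical: runs in close games are costly
COST_PER_RUN_BLOWOUT = 0.1  # hypothetical: runs in blowouts barely matter

# Team A: closer (2.00 FIP) in close games, long man (5.00 FIP) in blowouts
team_a_cost = 2.00 * COST_PER_RUN_CLOSE + 5.00 * COST_PER_RUN_BLOWOUT
# Team B: two average relievers (3.50 FIP), used interchangeably
team_b_cost = 3.50 * COST_PER_RUN_CLOSE + 3.50 * COST_PER_RUN_BLOWOUT

print(round(team_a_cost, 2), round(team_b_cost, 2))  # → 2.1 3.15
```

Under any weighting where close-game runs cost more than blowout runs, the inconsistent bullpen (deployed well) loses fewer win-equivalents, which is the whole argument in miniature.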
Balancing that, however, is the fact that relievers just have a much smaller effect on the game, so this still might not be big enough to matter. However, if it did have a noticeable effect, it would give a team an edge that wouldn’t be reflected in measures of collective performance, and so this could be one reason a team beat its BaseRuns estimated record. To see if that was perhaps the case in 2014, I developed a simple measure of bullpen-wide inconsistency. After discarding some more complicated ideas, I settled on calculating the standard deviation for each team’s eight relief pitchers that threw the most innings. This picks up most of each bullpen’s regulars and semi-regulars, and should be an okay measure of the distribution of skill in a bullpen.
Again, I wanted to first look at the most and least consistent bullpens of 2014 by this measure.
Seeing the Royals as the most inconsistent bullpen of 2014 is not a surprise. On the one hand, Wade Davis (1.19 FIP), Kelvin Herrera (2.69) and Greg Holland (1.83) combined to throw over 200 innings of absurdly good relief. The next five most-used relievers, however, were Aaron Crow (5.40 FIP), Louis Coleman (5.69), Francisley Bueno (3.84), Michael Mariot (3.93), and Tim Collins (4.80). Those are not good pitchers, and that’s a huge gap between the two groups, but by using the top three in close games and letting the other five eat as many non-crucial innings as possible, Kansas City might have been able to win a lot more games than a bullpen with eight relievers with FIPs around 3.30 (the figure for the bullpen as a whole). The Royals are also a good example of why the advantages of inconsistency might just not show up – Ned Yost was (in-)famous for not using his bullpen optimally, and sticking to strictly defined roles with his relievers, which is the sort of thing that could nullify this effect.
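Using the eight Royals FIPs above, the measure works out like this. (The article’s definition leaves a couple of choices open; taking the deviation over FIP, and using the population rather than the sample standard deviation, are my assumptions.)

```python
import statistics

# FIPs of the eight most-used 2014 Royals relievers, from the text
royals_fips = [1.19, 2.69, 1.83, 5.40, 5.69, 3.84, 3.93, 4.80]

# Bullpen inconsistency as the population standard deviation of FIP
# (the choice of FIP, and population vs. sample stdev, are assumptions)
spread = statistics.pstdev(royals_fips)
print(round(spread, 2))  # → 1.54
```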
The consistent bullpens are pretty boring, so I won’t spend much time on them. Seattle’s worst reliever by FIP in the eight most-used was Joe Beimel (4.18), and the best was Charlie Furbush (2.80), with the other six spread fairly evenly between them. Consistency has advantages, but not being able to turn to a true shutdown reliever when needed, or having to use a fairly valuable arm even in a blowout, might have its own costs, even compared to a bullpen with similar overall skill, such as Kansas City.
Unfortunately, either because of manager incompetence, the smallness of the effect, or something else entirely, bullpen inconsistency does very little to explain BaseRuns over- or under-performance in 2014. In the below graph, teams that beat their BaseRuns record are on the right, while those that fell below are on the left, and more inconsistent bullpens are higher versus consistent bullpens lower.
That, again, is basically a random collection of points. In the top right sit the Royals, both the most inconsistent bullpen and the team with the biggest positive gap between their actual winning percentage and their BaseRuns estimate (5.0%). But in the top left sits Houston, with the second-most inconsistent bullpen and the second-largest negative gap between actual and BaseRuns winning percentages (-4.6%).
At best, this is inconclusive, but I find the idea really interesting. This does at least show that, on an individual pitcher basis, inconsistency is not predictable, even when looking at previous years, which I think bucks conventional wisdom in a real way. Seeing what bullpens and pitchers were particularly erratic in 2014 is fun, and it’s something I’ll be keeping an eye on in 2015.
If you were feeling charitable, you could say this post owes a lot to Jeff Sullivan’s recent set of articles examining pitch comps. If you weren’t feeling charitable, you could say this post is a shameless appropriation of his ideas. Either way, you should read those articles! They were very good, and very entertaining, and directly inspired this post. There were seven, in total: here, here, here, here, here, here, and here. I’ll wait.
Back? Good! In the comments of the third article, someone asked Jeff about finding the “most signature” pitch, or the pitch with the worst/fewest comps. Jeff said: “Wouldn’t be surprised if it was Dickey or the Chapman fastball. That math… I’m afraid of that math, but I might make an attempt.” Jeff has looked at unique pitches twice (Carlos Carrasco’s changeup and Odrisamer Despaigne’s changeup, the last two articles linked above), but I wanted to attack the question in a less ad-hoc fashion, looking at all pitches rather than singling some out.
Jeff wasn’t wrong, though – the math is not simple. His methodology doesn’t really work here, for a couple of reasons. First, I’m looking for uniqueness rather than similarity. I could just flip Jeff’s method around and look for high comp scores, like he did for the Carrasco/Despaigne changeups. Second, I want to consider all pitch types at once. Jeff sort of did this in the Despaigne article, by comparing his changeup to a few different pitch types, but that approach is not really feasible for every pitch thrown.
What this means is that a new method is needed to directly calculate dissimilarity. We could find the maximum distances from the mean (basically Jeff’s method), which would work for a single pitch type: if all the pitches are clustered together, with similar velocities and breaks, calculating the distance from the mean to find the weirdest pitch makes sense. But consider this hypothetical set of pitches, graphed on two axes for simplicity:
Obviously, the pitch that corresponds to the red point is the sort of thing we’d like to identify as unique. But it sits almost exactly at the center of that dataset, so if distance from the mean were used to determine uniqueness, it would show up as the least unique pitch. Luckily, there’s an algorithm designed to find outliers in a more rigorous way.
This is where the math gets scary. The algorithm is called Local Outlier Factor (LOF) analysis, and it identifies outliers in a dataset by comparing the density of data around each point to the density around that point’s nearest neighbors. In this context, the density around a point is a function of how similar the best comps are for each pitch. Each point gets a score, where values near 1 indicate a normal point and higher values indicate greater isolation. I’m not going to go into detail, but if anyone wants to learn more, feel free to ask in the comments, or just Google it. It’s fairly simple to run on all pitches, with velocity, horizontal break, and vertical break as the relevant variables.
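For anyone who wants the scary math spelled out, here’s a minimal pure-Python sketch of the LOF computation on a toy two-dimensional dataset (real implementations, such as scikit-learn’s LocalOutlierFactor, do the same thing with more care; the toy points below mirror the hypothetical above, with one lone pitch sitting between two clusters, right at the mean):

```python
import math

def lof_scores(points, k=3):
    """Local Outlier Factor: scores near 1 mean a point sits in a region of
    typical density; larger scores mean it is more isolated than its neighbors."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbors of each point (the point itself is excluded)
    neighbors = []
    for i in range(n):
        order = sorted(range(n), key=lambda j: dist[i][j])
        neighbors.append([j for j in order if j != i][:k])

    # k-distance: distance to the k-th nearest neighbor
    k_dist = [dist[i][neighbors[i][-1]] for i in range(n)]

    # local reachability density: inverse of the mean reachability distance
    def lrd(i):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        return k / sum(reach)

    lrds = [lrd(i) for i in range(n)]

    # LOF: average density of a point's neighbors divided by its own density
    return [sum(lrds[j] for j in neighbors[i]) / (k * lrds[i]) for i in range(n)]

# Two tight clusters plus one isolated point at the dataset's mean (5.5, 5.5):
# distance-from-the-mean would call that point the most normal; LOF flags it.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5.5, 5.5)]
scores = lof_scores(points, k=3)
```

Running this, the cluster members all score right around 1, while the lone central point scores several times higher, exactly the behavior we want.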
Any pitch thrown more than 100 times in 2014 was included, and righties and lefties were considered separately (since pitches that move the same way obviously are very different based on what side of the rubber they come from). But enough about methodology! Here are the top five most signature pitches, for righties and lefties, along with their LOF scores, followed by some gratuitous gifs.
It’s nice when things work exactly like you expect them to. The top pitches on the two lists are incredible, and incredibly unique, and while it’s not a surprise to see them here, it does provide some reassurance that this measure is doing what it’s supposed to. Everyone knows about Dickey’s knuckleball, and if anything, it’s underrated by this measure. Because the pitch moves so unpredictably, the knuckleball’s season averages end up slow and pretty much neutral both horizontally and vertically. Those averages are enough to make it show up as very odd under this measure, even though individual knuckleballs rarely follow that straight average trajectory, as seen in the above gif. The same can be said for Steven Wright’s knuckleball in third, but it’s nice that this measure still picks them out as unique pitches.
As for Chapman, there’s not that much to say about his fastball that hasn’t already been said. It feels wrong in some way to call his fastball strange, since it is disturbingly direct in practice, but there was truly no pitch like it in 2014. The velocity is the carrying factor behind the massive outlier score, almost a full 2 MPH greater than the next fastest pitch. Interestingly, Chapman’s pitch was the only one in either top five with notably high velocity.
Looking at the weirdest pitches in baseball, what can we conclude about them as a group? First, the pitchers throwing them are generally not bad. While you’d expect someone to be at least halfway decent to be in a position to throw 100 pitches of a single type, the owners of these pitches averaged about 1 WAR in 2014. With eight of these 10 pitchers working primarily in relief, and only 710.2 collective innings, that comes out to a very respectable 2.4 WAR/200 IP.
The pitches themselves varied in usage, from Neshek’s changeup, thrown 13.4% of the time, to Britton’s sinker, thrown 89.3% of the time. They also varied in effectiveness, as measured by run values, from Neshek’s 3.6/100 to Marshall’s -1.63/100. Overall, the best pitch is probably Chapman’s fastball, followed by Britton’s sinker, given both the results on those pitches and how often they’re used. As a group, though, these pitches are pretty good. Maybe that isn’t totally surprising, but weird does not necessarily equal effective: any pitcher could immediately have the weirdest pitch in baseball if he threw 40 MPH meatballs. Less absurdly, mix and control matter just as much as the movement of the pitch.
Finally, all this stuff tracks fairly well with what Jeff identified previously. Obviously, he called Dickey and Chapman, but he also wrote this article about how Zach Britton’s sinker is pretty much comp-less, and we see that very pitch in fifth for lefthanders. Odrisamer Despaigne’s change was 12th for righthanders. Interestingly, Carrasco’s change is 98th on that same list, indicating this method doesn’t think he’s incredibly unique. Overall, this was mostly just a fun exercise, but maybe there’s more to this list, so if you want to poke around, it’s in a public Google Doc here. And like I said, if you have any questions about the methodology or anything like that, I’d be glad to answer them in the comments.
Rob Arthur published a really interesting piece at Baseball Prospectus, where he presented evidence in favor of the idea that batters are aware of the relative framing ability of the receiver they’re facing. That’s really fascinating to me, because it suggests that this skill, which the baseball research community has only recently begun to quantify, has been understood by players for a long enough time to show up in the behavior of major leaguers.
If that’s true, batters are not the first component of an at-bat I’d expect to adjust to the receiver. Quotes from pitchers have suggested that they’re aware of when their catcher is helping them out, and how; I want to know if that awareness is reflected in their pitch tendencies. Specifically, I want to know if pitchers are aware of the particular framing skills of their receivers. This article, by Community Blog Overlord Jeff Sullivan, is a little old, but it was one of the first framing articles I read, and the first I remember suggesting that some catchers were not just better at framing than their counterparts, but better at framing in specific parts of the zone. This more recent article, where Dave Cameron discusses the possibility of voting for Jonathan Lucroy as NL MVP, does talk about pitcher tendencies based on receiver skill, but it covers one pitcher and one catcher. Additionally, I’m just as interested in how pitchers react to bad receivers, which, as far as I can tell, hasn’t been covered. Do pitchers throw to their receivers’ strengths, and do they avoid their weaknesses?
The first thing to do is to establish how catchers do in different sections of the strike zone. I’m using Pitch F/X data from the wonderful Baseball Savant, which splits the zone like so:
For the purposes of this article, I’m concerned with the relative ability of receivers to preserve and gain strikes in different parts of the zone. As such, I’m going to categorize all pitches as “high in-zone” (zones 1, 2, and 3), “high out-zone” (zones 11 and 12), “low in-zone” (zones 7, 8, and 9), and “low out-zone” (zones 13 and 14). It is a little unfortunate that this doesn’t pick up the relative horizontal skill of receivers, but these divides should still allow for some real differentiation between catchers while also keeping our sample sizes large-ish. If we pick too narrow a slice of the zone, the results might get a bit iffy.
Calculating relative framing ability took a few steps. To begin with, I looked at receivers with at least 30 pitches in each of the four zones, which picks up 87 catchers. Thirty pitches might be way too small a sample size, but the fewest pitches caught overall by any of these receivers is 1,040, which is not terrible. For each receiver, I calculated their rate of strikes in each of the four zones, took the ratio between their strike rate and the average strike rate for the sample, and averaged that ratio across the two low zones and across the two high zones. That left two ratios for each player, high and low, where a number greater than 1 indicated better than average framing ability and a number less than 1 indicated worse than average framing ability.
Now, this is not a very good framing metric, but it does allow for a zone-oriented measure. I then divided the high-zone ratio by the low-zone ratio to get a final ratio, where greater than 1 indicated a receiver relatively better at getting the high strike, and less than 1 indicated a receiver relatively better at getting the low strike. Catchers notably better in the lower part of the zone: George Kottaras (.68), Jeff Mathis (.72), and Travis d’Arnaud (.73). Jonathan Lucroy, mentioned as a good low-ball framer, had a score of .89, but as he was good in both parts of the zone, there was a limit to how extreme his ratio could be. Catchers notably better in the high part of the zone: A.J. Ellis (1.46), Adrian Nieto (1.33), and Brett Hayes (1.29), again, three catchers with pretty bad receiving reputations.
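As a sketch of the whole calculation, with invented pitch counts standing in for the Baseball Savant data (the zone-group names and both catchers are illustrative, not real):

```python
from statistics import mean

ZONES = ["high_in", "high_out", "low_in", "low_out"]

# Invented (called strikes, called pitches) per zone group for two catchers.
catchers = {
    "low-ball framer": {"high_in": (24, 30), "high_out": (3, 30),
                        "low_in": (27, 30), "low_out": (15, 30)},
    "high-ball framer": {"high_in": (28, 30), "high_out": (9, 30),
                         "low_in": (24, 30), "low_out": (6, 30)},
}

# Sample-average strike rate in each zone group, pooled over all catchers
avg_rate = {z: sum(c[z][0] for c in catchers.values()) /
               sum(c[z][1] for c in catchers.values()) for z in ZONES}

def framing_ratios(c):
    """Return (high, low, high/low): the catcher's strike rate relative to the
    sample average, averaged over the two high zones and the two low zones."""
    rel = {z: (c[z][0] / c[z][1]) / avg_rate[z] for z in ZONES}
    high = mean([rel["high_in"], rel["high_out"]])
    low = mean([rel["low_in"], rel["low_out"]])
    return high, low, high / low  # final ratio > 1: better at the high strike
```

With these made-up counts, the low-ball framer lands well under 1 and the high-ball framer well over it, which is all the final ratio is meant to capture.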
So we now have a rough indication of how much better catchers are in the bottom and top of the zone. What kind of relation does this have to how they were pitched to? To estimate that, I stayed simple – I ran a linear regression, with the high/low ratio as the independent variable and the percentage of low or high pitches the catcher was thrown as the dependent variable. This, again, is a very rough measurement, since different pitchers are throwing to these catchers, but looking on a battery-by-battery basis would make the sample sizes tiny. Additionally, sometimes a catcher is catching a given pitcher because he’s good at receiving in a certain part of the zone that pitcher throws to frequently. So while this might be picking up manager actions as well as pitcher actions, it should be picking up something.
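The regression itself is nothing fancy; a hand-rolled least-squares fit like the sketch below, fed with each catcher’s high/low ratio and the share of high pitches he saw, is all that’s needed (the input numbers here are invented for illustration):

```python
def linreg(x, y):
    """Ordinary least squares of y on x: returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

# Invented inputs: high/low framing ratio (independent variable) and the
# share of high pitches thrown to that catcher (dependent variable).
hi_lo_ratio = [0.68, 0.79, 0.92, 1.04, 1.18, 1.33, 1.46]
high_pitch_pct = [0.22, 0.23, 0.25, 0.26, 0.28, 0.29, 0.31]

slope, intercept, r2 = linreg(hi_lo_ratio, high_pitch_pct)
```

A positive slope with a decent R² would be the signature of pitchers throwing higher to catchers who are relatively better up in the zone.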
Results! Two graphs.
Both graphs show the expected relationship, with this blunt measurement of relative framing ability doing a fairly good job of predicting the distribution of low and high pitches thrown to a given catcher. Obviously there’s more at play here, but clearly pitch selection is impacted by the strengths of the receiver behind the plate.
There’s another question that can potentially be answered using this metric: do pitchers react differently to strengths and weaknesses? Say one catcher is 30% better at framing low pitches than high pitches, and very good at framing low pitches; another catcher is also 30% better at framing low pitches than high pitches, but bad at it, and apparently even worse at framing high pitches (presumably he’s in the lineup for his bat). Is one of them more likely than the other to see an increased rate of low pitches? In other words, are pitchers more inclined to avoid the bad, or seek the good?
To answer this question, I split the receivers into above-average and below-average low pitch receivers (46 and 41 in each group) and above-average and below-average high pitch receivers (51 and 36 in each group), using the scale described above. I then plotted the rate of pitches in the appropriate zone against each group separately. Following: more graphs!
What we see here is a higher R² value in both of the below-average samples, indicating that the high-vs.-low ability of bad framers appears to influence pitcher decisions more than the high-vs.-low ability of good framers. The gap for low pitches isn’t huge, but the gap for high pitches is fairly substantial. While this analysis is far too rough to show anything conclusively, it does suggest that pitchers behave differently when throwing to good and bad framers, and may be more inclined to avoid weaknesses than to seek out strengths.
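The group comparison can be sketched like so, again with invented numbers; `r_squared` here is just the squared Pearson correlation, and the split is on low-zone skill relative to average:

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Invented catchers: (low-zone skill vs. average, high/low ratio,
# share of low pitches seen). Real values would come from the metric above.
data = [
    (1.10, 0.75, 0.33), (1.05, 0.85, 0.31), (1.02, 0.95, 0.30), (1.01, 1.05, 0.30),
    (0.99, 0.80, 0.34), (0.95, 0.90, 0.31), (0.90, 1.10, 0.28), (0.85, 1.25, 0.26),
]

# Split into above- and below-average low-pitch receivers, then regress
# within each group separately.
above = [(r, p) for skill, r, p in data if skill >= 1.0]
below = [(r, p) for skill, r, p in data if skill < 1.0]

r2_above = r_squared([r for r, _ in above], [p for _, p in above])
r2_below = r_squared([r for r, _ in below], [p for _, p in below])
```

In these made-up numbers the below-average group comes out with the higher R², which is the pattern described above, but only as an illustration of the mechanics of the comparison, not as evidence.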
As I said (several times), this is a rough analysis that relies on a rough metric, but I think it provides some evidence for some very interesting pitcher behavior. I’d love to hear about other ways of identifying receivers’ strengths and weaknesses in different parts of the zone, so if anyone knows of articles doing so, or has some different ideas, say so in the comments!