Archive for Research

Fantasy Comparables: Ceilings, Floors, and Most Likely Situations

I’m entering my fourth season of fantasy baseball this year and in my quest for my first championship I stepped up my preseason work to include making my own projections for players and creating my own dollar value system for my league’s custom scoring (6×6, standard with OPS and K/9 added). When making projections for players this year, I looked at their last three seasons in the Majors and used their Steamer and ZiPS projections to make sure I was in the same universe or had solid reasons for my different projection. I made projections for about 300 hitters and 200 pitchers, which I feel are grounded in reality and will give me an edge in my fantasy endeavors this year.

However, while I’m pleased with my projections, and they’re definitely better than when I first started playing and just knew Yankees and other AL East players, they are still very limiting. One of the main problems is that I’m producing a single stat line for each player. It’s based on what they’ve done previously, how they’re trending, and how I and other systems think they’re most likely to produce in 2014, but it’s still just a single projection. More advanced projection systems, like PECOTA, compare a given player to thousands of other Major Leaguers to find comparable careers and produce various projections, along with each projection’s probability of occurring.

Projection systems like this recognize the inherent uncertainty of projecting future baseball performance and, instead of giving one stat line, give us a range of outcomes with their likelihoods, which produces more accurate results. I am just dipping my toe in the water of finding comparable players and making projections based on them, but I wanted to see how this type of system would change my valuation of two outfielders who will turn 27 this season, Justin Upton and Jay Bruce. Bruce will turn 27 in April and Upton in August. They’ve both been big fantasy contributors in the past. Bruce is more consistent in his production, while Upton has been streakier, with hot and cool months and peaks and valleys in his home run and stolen base totals. I’ve put my projections for them below, with a dollar value based on a 12-team league with 22 roster spots and a 70-30 hitters-pitchers split.

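| Player | AB | BB | Hits | 2B | 3B | Runs | HR | RBI | SB | AVG | OBP | SLG | OPS | Dollar Value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Jay Bruce | 590 | 62 | 154 | 38 | 1 | 88 | 33 | 100 | 7 | .261 | .331 | .497 | .828 | $29.39 |
| Justin Upton | 550 | 68 | 150 | 28 | 2 | 95 | 25 | 78 | 13 | .273 | .353 | .467 | .820 | $26.96 |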

I’m projecting them to produce similar value, but Bruce definitely has an edge. To find comparable players to Bruce and Upton, I looked at all MLB seasons from 1961 through 2013 (’61 being an arbitrary start date based on how much data my laptop could sort through and organize without John Henrying its CPU). I narrowed the pool down to players with similar home run and stolen base totals in their age 23 to 26 seasons, along with similar average, OPS, strikeout and walk percentages, and playing time, in an attempt to find a list of similar hitters.
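As a rough illustration of the filtering step described above, here is a minimal Python sketch. Everything in it is a placeholder: the stat fields are a subset of the ones listed (strikeout rate, walk rate, and playing time are left out for brevity), the tolerance values are hypothetical rather than the cutoffs actually used for this article, and the players are made up. There was no set formula, so treat this as one possible way to express "similar", not the method itself.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Per-season averages over a player's age 23 to 26 seasons."""
    name: str
    hr: float
    sb: float
    avg: float
    ops: float


# Hypothetical tolerances: how far a candidate may sit from the target in each stat.
TOLERANCES = {"hr": 6.0, "sb": 6.0, "avg": 0.020, "ops": 0.050}


def is_comparable(target, candidate, tolerances=TOLERANCES):
    """True when the candidate is within tolerance of the target in every stat."""
    return all(
        abs(getattr(target, stat) - getattr(candidate, stat)) <= tol
        for stat, tol in tolerances.items()
    )


def find_comps(target, pool):
    """Return every player in the pool (other than the target) who qualifies."""
    return [p for p in pool if p.name != target.name and is_comparable(target, p)]


# Made-up profiles, for illustration only.
target = Profile("Target", 30, 8, 0.260, 0.820)
pool = [
    Profile("Player A", 33, 5, 0.255, 0.840),
    Profile("Player B", 12, 40, 0.270, 0.720),
    Profile("Player C", 27, 11, 0.272, 0.805),
]
comp_names = [p.name for p in find_comps(target, pool)]
```

Tightening or loosening the tolerances trades the number of comps against how closely they resemble the target, which is the judgment call behind ending up with 19 names for one player and 26 for another.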

I found 19 comps for Jay Bruce and 26 for Upton. There’s a link at the end to the Google doc with the full list, which I recommend checking out; it’s not included here to save some space. Now that I have the comparable players, I want to see how they performed in their age 27 seasons to give me a range of outcomes for both Bruce and Upton. I’ve included some bullet points here, again with the full spreadsheet linked at the end.

Mean and Median Value of Comparable Players’ Age 27 Season

  • The average dollar value of Upton comparables was $27.17 and the median value was $31.49.
  • The average of Bruce comparables was $21.39 and the median value was $19.51.

Best Case Scenario

  • The best case scenario for Upton would be to follow Bobby Bonds’ age 27 season, where he put together his power and speed (39 HRs and 43 SBs) and bumped his average up to .283 from .260 in the previous year. I don’t think the HR total is out of the question, definitely hard and more than I’m predicting, and I think the average is within reach, but Bonds was regularly stealing 40 bases a year at this point which Upton is clearly not.
  • The best case scenario for Bruce would be to follow Dale Murphy’s age 27 season. Murphy hit .302 that year, with 36 HRs, scoring 130 times and driving in 121 runs. While a .300 average may seem unfathomable for Bruce, Murphy hit .281 the year before and .247 the year before that. What makes this scenario most unlikely is that Murphy had a little more speed than Bruce (in most seasons his stolen bases totaled in the high single digits or low double digits), and he swiped 30 when he was 27, which is probably out of Bruce’s reach.

More Realistic Good Scenarios

  • While I don’t expect Upton to reach Bobby Bonds level, it’s not hard to imagine him producing a line similar to Reggie Jackson’s 1973 when Jackson was 27. From 1970 to 1972, Jackson’s home run highs and lows by season were 23 to 32, his stolen bases ranged from 26 to 9, and his average fluctuated from .237 to .277.  There’s the volatile situation that we’ve grown accustomed to seeing from Upton. In 1973, Jackson put it together and hit 32 dingers, stole 22 bags, and hit .290.  Upton has already produced remarkably similar lines (2011 – 31HR/21SB/.289 avg) and could put it together for 2014.
  • Jay Bruce isn’t going to steal 30 bases, but he could easily follow the age 27 season of a former Cincinnati Red, Adam Dunn. Dunn was reliably hitting 40 home runs a year at this point (seriously, four straight seasons with exactly 40), and while Bruce has yet to reach the 40 mark, it’s not outside the realm of possibility. The big difference between Dunn’s age 27 season and his other years is that he got his average up to .260 (bookended by .230 seasons), stole 9 bases, and had over 100 runs and RBIs. With Bruce entering his power prime, I think 40 homers is definitely possible, if still unlikely, and hitting .260 is definitely in his wheelhouse.

Outside of Injury, Worst Case Scenario

  • For Upton, if he stays healthy, the worst case scenario is following former Phillies 2B Juan Samuel. Samuel hit between .264 and .272 in each of the four previous seasons, with home run totals as high as 28 but reliably in the high teens, and had stolen at least 30 bases each year. At age 27, though, his average fell to .240, he only hit 12 home runs (and never exceeded 13 again), and while he could rely on his speed and stole 30 bases, he failed to produce 70 runs or RBIs. Not the most likely situation for Upton, but I could envision it with fewer stolen bases.
  • For Bruce, the floor doesn’t get that low. If he reaches 500 ABs, the worst comparable is Torii Hunter’s age 27 season, where he only hit .250 and stole 6 bags, but still hit 26 homers and drove in 100 runs. Given Bruce’s consistency and the consistency of his comparables, I’d expect a high floor.

The Merciful Conclusion

I know this took up a lot of room and we’re all happy this is almost over, but what does this mean? First, this is pretty rudimentary, with no set formula for finding comparable players. I did my best, but they’re definitely not one-to-one matches and should be taken with a grain of salt. However, I think this helps articulate a fundamental difference between Jay Bruce and Justin Upton. Bruce is a high floor, more limited ceiling guy, and I’ve got more confidence that his 2014 will fall close to my projections. I know I’m buying about a .260 average, with a couple of stolen bases and mid-30s home runs with a little wiggle room, in a good lineup.

Justin Upton is a lotto ticket guy. I’m sticking with my projection for his season which falls between the extremes, but if he repeats his 2011 or puts together his tools that he has demonstrated at different points of his career, he could finish right behind Mike Trout among fantasy outfielders. At the same time, I could see him producing a line like his big brother BJ did last year, okay maybe not that bad, but definitely not worth his draft price. Who you take depends on what path you want to believe and who you already have on your team, but I think laying out these options and using player comparables definitely adds to fantasy projections and will be a staple I’ll use next year.

 

As promised, here’s the link to the full list of comparable players used for this article: https://docs.google.com/spreadsheet/ccc?key=0AmP-CH5MqzENdFZSZ0xhQVZiYWxNSVQxYzBsOFh3YkE#gid=0


Why I Don’t Use FIP

Over the last decade, Fielding Independent Pitching (FIP) has become one of the main tools to evaluate pitchers. The theory behind FIP and similar Defense Independent Pitching metrics is that ERA is subject to luck and fielder performance on balls in play and is therefore a poor tool to evaluate pitching performance. Since pitchers have little to no control over where batted balls are hit, we should instead look only to the batting outcomes that a pitcher can directly control and which no other fielder affects. In the case of FIP, those outcomes are home runs, strikeouts, walks, and hit batters.

However there are many serious issues with FIP that collectively make me question its usage and value. These issues include the theory behind the need for such a statistic, the actual parameters of the formula’s construction, and the mathematical derivation of the coefficients. Let’s address these issues individually.

Control over Balls in Play

A common statement when discussing FIP or BABIP is that pitchers have little to no control over the result of a ball once it is hit into play. A pitcher’s main skill is found in directly controllable outcomes where no fielder can affect the play, such as home runs, strikeouts, and walks (and HBP). In trying to estimate a pitcher’s baseline ERA, which is the objective of FIP, the approximately 70% of plate appearances that end with a ball put into play can be ignored and we can focus only on the previously mentioned outcomes where no fielder touches the ball.

The concept of control is a little fuzzy though and something I believe has been misappropriated. It is definitely true that the pitcher does not have 100% absolute control over where a batted ball is hit. There is no pitch that anyone can throw that can guarantee a ball is hit exactly to a particular spot. However in the same vein, the batter doesn’t have 100% absolute control either. If you were to place a dot somewhere on the field, no batter is good enough to hit that spot every time, even if hitting off a tee.

However this lack of complete control should not in any way imply that the batter or pitcher doesn’t have any control at all over where the ball is hit. Batters hit the ball to places on the field with a certain probability distribution depending on what they are aiming for. Better batters have a tighter distribution with a more narrow range of possibilities and can more accurately hit their target. For example consider a right-handed batter attempting to hit a line drive into left field on an 80 mph fastball down the heart of the plate. A good hitter might hit that line drive hard enough for a double 30% of the time, for a single 30% of the time, directly at the left fielder 10% of the time, and accidentally hit a ground ball 20% of the time. Conversely, a worse batter who has less control over his swing may hit a double 10% of the time, a single 10% of the time, directly at the left fielder 15% of the time, an accidental ground ball 25% of the time, and in this case not even get his swing around the ball fast enough and instead hits the ball weakly towards the second baseman 40% of the time.

Where the pitcher fits into the entire scheme is in his ability to command the ball to specific locations, with appropriate velocity and spin, as to try to sway the batter’s hit distribution to outcomes where an out is most likely. Consider the good hitter previously mentioned. He accomplished his goal fairly successfully on the meatball-type pitch. What if the same good batter was still trying to hit that line drive to left field, but the pitch instead was a 90 mph slider on the lower outside corner? On such a pitch the good batter’s hit distribution may start to resemble the bad hitter’s hit distribution more closely. This is a slightly contrived and extreme example, but it also encompasses the entire theory of pitching. Pitchers are not trying to just strike out every batter, but instead pitch into situations and to locations where the most likely outcome for a batter is an out.

By this reasoning the pitcher has a lot of control over where and how a batted ball is hit. This does not mean that even on the tougher pitch that the batter can’t still pull a hard double, or even that the weak ground ball to the second baseman won’t find a hole into right field, these are all still possibilities. However by throwing good pitches the pitcher is able to control a shift in the batter’s hit probability distribution. Similarly, better batters are able to make adjustments so that their objective changes according to the pitch. On the slider, the batter may adjust to try to go opposite field. However a good pitch would still make the opposite field attempt difficult.

This is all to say that better pitchers have more control over how balls are hit into play. They are able to command more pitches to locations where the batter is more likely to hit into outs than if the pitch was thrown to a different location. Worse pitchers don’t have such command or control to hit those locations, and balls put into play are decided more by the whims of the batter. FIP takes this control argument too far to the extreme. Between absolute control over where a ball is hit and no control at all, there is a spectrum of possibilities that involves inducing changes in the probability distribution of where a ball is hit, which is how the game of baseball is actually played. As a simple example, we see that some pitchers are consistently able to induce ground balls more frequently than others. Since about 70% of all plate appearances result in balls being put into play, it is important to actually consider this spectrum of control instead of just assuming that the game is played only at one extreme.

Formula Construction

Let’s pause though and ignore my previous argument that a pitcher can control how balls are hit and we’ll instead assume that all the fielding independence theories are true and we can predict a pitcher’s performance using only the statistics in the FIP formula. This introduces an immediate contradiction since none of the statistics used in the FIP formula (except HBP, which has the smallest contribution and is a prime example of lack of control) are in fact fielder independent. The FIP formula is not actually accounting for its intended purpose.

The issue of innings pitched in the denominator has been addressed before. Fielders are responsible for collecting outs on balls in play which therefore determines how many innings a pitcher has pitched. However all three of the statistics in the numerator are also affected by the fielding abilities of position players, especially in relation to ballpark dimensions. Catchers’ pitch framing abilities have been shown recently to heavily affect strike and ball calls and could be worth multiple wins per season. Albeit rare events, better outfielders are able to scale the outfield fences and turn potential home runs into highlight reel catches.

More commonly though, better catchers, corner infielders, and outfielders can turn potential foul balls into outs. When foul balls are turned into caught pop-ups or fly balls, the at bat ends, thus ending any opportunity for a walk or a strikeout which may have been available to a pitcher with worse fielders behind him. This is particularly harmful to a pitcher’s strikeout total. A ball landing foul merely gives the batter another opportunity to draw a walk, but it also moves him one strike closer to striking out (when there are fewer than two strikes).

Similarly, instead of analyzing the effects of the fielders, we can look at the size of foul territory. Larger foul territory gives fielders more chances to make an out, since the ball remains over the field of play longer instead of going into the stands. Statistics like xFIP normalize for the park’s effect on home runs by applying the league-average HR/FB rate to the number of fly balls given up. However, there is no park factor normalization for the strikeout and walk components of FIP.

We can see the impact immediately by examining the Athletics and Padres, two teams whose home parks have extremely large foul territory. Considering only the home statistics for pitchers who threw over 50 IP in each of the last five seasons, the Athletics pitchers collectively had a 3.25 ERA, 3.74 FIP, and 4.05 xFIP, while the Padres pitchers collectively had a 3.38 ERA, 3.84 FIP, and 3.86 xFIP. In both cases FIP and xFIP drastically exceeded ERA. Also, of the 46 pitchers who met these conditions, only 9 had an ERA greater than their FIP and only 7 had an ERA greater than their xFIP, with 6 of those pitchers overlapping. This isn’t a coincidence. Although caught foul balls steal opportunities away from every type of batting outcome, the effect is more heavily biased against strikeouts, since foul balls increase the strike count.

Mathematics

The mathematics of the FIP formula may be my biggest problem with FIP, mostly because it’s the easiest to fix and hasn’t been. I’ve seen various reasons for using the (13, 3, -2) coefficients in derivations of the FIP formula. Ratios of linear weights, baserun values, or linear regression coefficients are the most common explanations. However none of these address why the final coefficient values are integers, or why they should remain constant from year to year.
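For reference, the commonly cited form of the formula with these coefficients can be sketched in Python as follows. The function names and the sample stat lines are illustrative assumptions, and the additive constant here is simply whatever value makes league FIP equal league ERA.

```python
def fip_core(hr, bb, hbp, k, ip):
    """Per-inning part of FIP using the (13, 3, -2) integer weights."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def league_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """Additive constant that sets league-wide FIP equal to league ERA."""
    return lg_era - fip_core(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip)


def fip(hr, bb, hbp, k, ip, constant):
    """FIP for one pitcher, given an already-computed league constant."""
    return fip_core(hr, bb, hbp, k, ip) + constant


# Hypothetical pitcher line and constant, for illustration only.
example = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0, constant=3.10)
```

Note that the three weights are hard-coded while only the constant is recomputed, which is exactly the arrangement the rest of this section argues against.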

There is absolutely no reason why the coefficients should be integers. Simplicity is a convenient excuse, but it’s highly unnecessary. No one is sitting around calculating FIP values by hand; it’s all done by computers, which don’t require such simplicity. By changing the coefficients from their actual values to these integers, error and bias are unnecessarily introduced into the final results. Adjusting the additive coefficient to make league ERA equal league FIP does not solve this problem.

The baseball climate also changes yearly. New parks are built and the talent pool changes. This changes the value of baseball outcomes with respect to one another. It’s why wOBA coefficients are recalculated annually. However for some reason FIP coefficients remain constant. The additive constant helps in equating the means of ERA and FIP but there is still error since the ratios of HR, BB, and K should also change each year (or at least over multi-year periods).

I’ve calculated a similar version of FIP, denoted wFIP, for the 2003-2013 seasons using weighted regression of ERA on HR, (HBP+BB), and K, each divided by IP. If we treat each inning pitched as an additional sample, then the variance of the FIP calculation for a pitcher is proportional to the reciprocal of the number of innings pitched. Weighted regression typically uses the reciprocal of the variance as weights. Therefore, in determining FIP coefficients we can use each pitcher’s IP as his respective weight in the regression analysis. The coefficients for the weighted regression compared to their FIP counterparts are shown in the following graph.
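To make the weighted regression concrete, here is a minimal pure-Python sketch of an IP-weighted least squares fit. It is an assumption about how such a fit could be set up, not the code behind wFIP: the function names are hypothetical, the pitcher lines are invented, and the ERAs are generated from made-up "true" coefficients so the fit has something known to recover.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def weighted_fip_coefficients(pitchers):
    """Fit ERA ~ a*HR/IP + b*(BB+HBP)/IP + c*K/IP + d, weighting each pitcher by IP.

    pitchers: iterable of (HR, BB+HBP, K, IP, ERA). Returns [a, b, c, d].
    """
    n = 4
    xtwx = [[0.0] * n for _ in range(n)]
    xtwy = [0.0] * n
    for hr, bb_hbp, k, ip, era in pitchers:
        x = [hr / ip, bb_hbp / ip, k / ip, 1.0]
        for i in range(n):
            xtwy[i] += ip * x[i] * era          # weight = IP
            for j in range(n):
                xtwx[i][j] += ip * x[i] * x[j]
    return solve_linear(xtwx, xtwy)             # weighted normal equations


# Invented stat lines; ERAs are built from hypothetical coefficients.
TRUE = (13.5, 3.2, -2.1, 3.0)
lines = [(20, 55, 180, 200.0), (8, 30, 70, 65.0), (25, 40, 150, 210.0),
         (12, 60, 110, 150.0), (5, 15, 95, 70.0), (18, 70, 200, 190.0)]
pitchers = [(hr, bb, k, ip,
             TRUE[0] * hr / ip + TRUE[1] * bb / ip + TRUE[2] * k / ip + TRUE[3])
            for hr, bb, k, ip in lines]
coefficients = weighted_fip_coefficients(pitchers)
```

Because the synthetic ERAs contain no noise, the fit returns the generating coefficients regardless of the weights. On real data the weights matter: a 200 IP starter pulls the fit much harder than a 50 IP reliever.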

Ignoring the additive constant, since 2003 each of the three stat coefficients has varied by at least 22% from the FIP coefficient values, and all are biased above the FIP integer value almost every year. In 2013 this leads to a weighted absolute average difference of 0.09 per pitcher between the wFIP and FIP values, which is about a 2.3% difference on average. However, there are more extreme cases.

Consider Aroldis Chapman, who had a 2.54 ERA and 2.47 FIP in 2013. At first glance this seems to indicate a pitcher whose ERA was in line with his peripheral statistics and who, if anything, was very slightly unlucky. However, his wFIP came to 2.96. If we saw this as his FIP value, we might be more inclined to believe that he was lucky and his ERA is bound to increase. This difference in opinion would come purely from use of a better regression model, without at all changing the theory behind its formulation. That is a poor reason to swing the future outlook on a player.

However even with current FIP values, no one would draw the conclusions I did in the previous paragraph that quickly. Upon seeing the difference in FIP (or wFIP) and ERA values, one would look to additional stats such as BABIP, HR/FB rate, or strand rate to determine the cause of the difference and what may transpire in the future. This in fact may be the ultimate problem with FIP. On its own it doesn’t give us any information. Even with the most extreme differentials we always have to look to other statistics to draw any conclusions. So why don’t we make things easier and just look at those other statistics to begin with instead of trying to draw conclusions from a flawed stat with incorrect parameters?


The Royals: The AL’s Weirdest Hitters

The MLB season is quickly approaching, and I am running out of ways to entertain myself until real baseball starts again.  One way that I attempted to do so today was to prepare a guide about strengths and weaknesses of offenses by team.  I just worked with the AL because I didn’t feel like adjusting the data for DH and non-DH teams to be in the same pool.  Using FanGraphs’ infallible Depth Charts feature, I gathered every American League team’s projected totals for AVG, OBP, SLG, and FLD, in order to see some basic tendencies for each team coming into the 2014 season.  I plugged some numbers into 4 variables which I thought would give a better-than-nothing estimate of how a team’s offensive roster was set up. Here are the stats I used to define each attribute:

Contact: AVG

Discipline: OBP – AVG

Power: SLG – AVG

Fielding: FLD

These variables are about as perfect as they are creative (which is to say, not very).  However, this was intended to be a fairly simple exercise.  For each variable, I ranked all the teams and assigned a value between -7 and 7.  The best team in the AL received a 7, second best a 6, and so on.  A score of 0 is average and -7 is the worst.  Here are the results:

Dashboard 1

As an inexperienced embedding artist, I feel obligated to include this link, which should work if the above chart is not working in this window.
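The scoring scheme described above (derive the four attributes, rank the teams on each, then hand out 7 down to -7) can be sketched in Python like this. The function names, team labels, and batting averages are all hypothetical, and ties are not handled specially.

```python
def attributes(avg, obp, slg, fld):
    """The four team attributes as defined above."""
    return {
        "Contact": avg,
        "Discipline": obp - avg,
        "Power": slg - avg,
        "Fielding": fld,
    }


def rank_scores(values):
    """Map each team to a score from +k (best) down to -k (worst).

    values: dict of team -> attribute value, higher meaning better.
    With 15 teams k is 7, so the middle team scores 0.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    top = (len(ranked) - 1) // 2
    return {team: top - position for position, team in enumerate(ranked)}


# Made-up projected batting averages for 15 placeholder teams.
contact = {f"Team {i:02d}": 0.240 + 0.002 * i for i in range(15)}
scores = rank_scores(contact)
```

Because the scores come from ranks alone, a team that is barely ahead of second place and one that laps the field both receive the same 7.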

Immediately, one thing popped out at me. The Royals are 1st in Contact. They also are 1st in Fielding. This is good, since they project to be dead last in Discipline and Power. It really is odd for these facts to go together. For the most part, teams fit into more general molds. The White Sox and Twins are below average in everything. The Yankees, Red Sox, and Rangers are below average at nothing. The Rays and A’s are, to no one’s surprise, copying each other with good Discipline and Defense.

In fact, outside of the Royals, there isn’t another team who is 1st or 15th in any 2 categories, and Kansas City did it in all 4. To figure out how they got here, let’s look at some of the ways they stick out from the rest of the league.

In 2013, the American League had a 19.8% strikeout rate. Of all the Royals’ projected starters in 2014, Lorenzo Cain had the highest 2013 K% at 20.4%. Alex Gordon sat at 20.1%, and you won’t find anyone else above 16.1%. Not satisfied with an overall team strikeout rate about 3 points lower than the league average in 2013, the Royals went out and acquired Omar Infante and Nori Aoki this offseason, whose respective rates of 9.1% and 5.9% ranked 8th and 1st among all hitters with 400+ PA last year. It’s obvious why the Royals’ batting average is supposed to be 8 points higher than the 3rd best in the league. They put the ball in play.

Unfortunately for them, putting it in play is about as much as they can do. They’re the least likely team in the AL to be clogging up bases with walks, and they’re the least capable team to drive in runs with power.

In 2013, the American League had an average Isolated Power of .149. Alex Gordon led the Royals with his .156 mark. And that was it for the above average power hitters. Even Designated Hitter Billy Butler couldn’t muster up anything better than a .124. The team’s ISO was .119, which won’t be affected dramatically by the arrival of Aoki and Infante, whose ISOs averaged out to .108, but who fill positions that were weak-hitting for the Royals.

Oh, and for discipline: they don’t walk. They don’t like it. GM Dayton Moore got in trouble for saying something dumb about it, and the data suggest Manager Ned Yost may not have been aware walks existed when he played. To the Royals’ credit, they did acquire Aoki, whose 8.2% rate last year was ever so slightly higher than the AL average of 8.1%. Omar Infante’s rate was just above 4%, though, and their 6.9% team rate probably won’t be much better this year.

Lastly, fielding. Kansas City could flat out field, winning 3 Gold Gloves, and saving a mind-blowing 80 runs according to UZR. That number, more than double (!!!) anyone else in the AL in 2013, was the 2nd highest UZR ever in the AL, trailing only the 2009 Mariners. Those 80 runs are almost sure to decrease in 2014, but there’s little reason to argue that any other team in the AL will be expected to save more runs with the glove this year.

Overall, the Royals offense could be nuts in 2014. They won’t strike out, and will put the ball in play. There won’t be many other ways they get on, and they won’t be hitting the ball out of the park much. If last year is any indication, they should save some runs for their pitchers when they’re out in the field. No matter how they turn out this year, there’s one thing to remember. If you’re watching a team effort from Kansas City, there’s a decent chance that no one in the rest of the league is doing it better. There’s also just as good a possibility that everyone is.


What’s the Value of a Home Run These Days?

Let’s face it, people love the home run. It’s why players like Mark Reynolds can find jobs. These days, we aren’t surprised when we see a couple of home runs in one game. It wasn’t always like this, however. Home runs used to be a rarity among baseball events. In the early 20th century, it wasn’t uncommon for a player to lead his league by hitting 10-15 home runs. This brings me to the question: how has the home run actually changed? Not in terms of its frequency, but in terms of its value. More specifically, its value in runs. To approach a solution to this question without arduously parsing through hundreds of event files, we must find a way to mathematically frame the game of baseball in a way that encourages simplicity but doesn’t lose the most familiar parts of the game.

Markov Chains

The first batter of the game steps to the plate and sees no runners on base with none out. He pops up. The second batter steps to the plate and sees the immediate result of the last at bat: an out. The second batter walks. The third batter then sees the immediate result of the second batter’s at bat: a runner on first base. The stream of batters stepping to the plate and being placed into a state resulting from the previous batter’s at bat exemplifies the nature of a Markov Chain. When a batter steps into the batter’s box, his current state (whether it be an out situation, a base situation, or a base-out situation) is only dependent on the previous batter’s state. This is known as the Markov Property. Using this structure, we can simulate any baseball game we’d like. However, to keep our calculations simple, we should introduce some new rules.

The Rules of the Game

  1. A batter can only attain a BB, HBP, 1B, 2B, 3B, or HR.
  2. Outs only occur via a batter getting himself out.
  3. Anything other than the events from 1) is assumed to be an out.
  4. When a batter gets a hit, the runners on base advance by as many bases as the batter attains (e.g. a double with a runner on second will score the runner on second).

These are the rules of the game. There are no stolen bases, no scoring from second on a single, and no double plays. We have stripped the game down to only its essentials, while implementing certain changes for our own convenience. For our purposes, we don’t care about Mike Trout’s 33 stolen bases, only the fact that he mainly attains his bases through the events we allow.

The Out Chain

We assume that the probability of a batter getting a single at any point during a season is the number of singles he gets for the season divided by his plate appearances. We do this for the probabilities of all our desired events. By doing this, we can construct a simple Markov Chain where players step to the plate and find themselves batting with 0, 1, or 2 outs. We find that this chain is irreducible, meaning that each state (0, 1, or 2 outs) eventually leads to every other state. This, and the fact that we are dealing with a finite number of states, leads us to the existence of a stationary probability distribution on our state space of outs. It so happens that when a batter starts his at bat, he is equally likely to see 0, 1, or 2 outs, i.e. each of those probabilities is 1/3. The knowledge that outs are uniformly distributed over our game allows us to construct probabilities for a more complicated chain that should shed light on our original question.
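A quick way to check the uniform result is to write the Out Chain down and iterate it. The sketch below is only an illustration: the out probability of 0.68 is a hypothetical value, the function names are made up, and power iteration is just one convenient way to find the stationary distribution.

```python
def out_chain(p_out):
    """3x3 transition matrix over the number of outs the next batter sees.

    A non-out keeps the out count; an out adds one, and the third out
    rolls over to 0 outs in the next inning.
    """
    stay = 1.0 - p_out
    return [
        [stay, p_out, 0.0],
        [0.0, stay, p_out],
        [p_out, 0.0, stay],
    ]


def stationary(matrix, iterations=1000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(matrix)
    dist = [1.0] + [0.0] * (n - 1)      # start with certainty of 0 outs
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical out probability; any value strictly between 0 and 1 works.
out_distribution = stationary(out_chain(0.68))
```

The result is 1/3 for each state no matter which out probability is used, because every row and every column of the matrix sums to 1.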

The Base Chain

We now place our focus on the stream of batters who see a certain base situation when they step to the plate. The transitions of base situations are dependent on the out situation, as can be seen when a batter bats with 1 out versus 2 outs. Batting with 1 out, if the player makes another out, then the base situation stays the same for the subsequent batter. If he does this with 2 outs, however, then the inning is over and the base situation reverts to the state where no one is on base in the next inning. Fortunately, we know that the probability of a batter seeing any number of outs when he steps to the plate is 1/3. In a similar manner to the Out Chain, we find that every state in the Base Chain leads to every other state. The “runners on the corners” state eventually leads to the “bases loaded” state, which eventually leads to the “bases empty” state, and so on. Since there are finitely many base situations, we are led to a stationary probability distribution on the state space of base situations. That is to say, there is a probability associated with a batter stepping to the plate and seeing the bases empty, and another for seeing a runner on first, etc.
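The Base Chain can be sketched the same way. The code below is a minimal illustration under stated assumptions, not the exact model behind the figures in this article: the event probabilities are hypothetical rather than real league rates, the names are made up, and because the rules above only spell out runner movement on hits, walks and hit batters are assumed to advance forced runners only.

```python
HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

# Base states are bitmasks: bit 0 = first, bit 1 = second, bit 2 = third.


def after_hit(state, bases):
    """Every runner and the batter advance `bases` bases; bits past third score."""
    return ((state << bases) | (1 << (bases - 1))) & 0b111


def after_walk(state):
    """Assumption: a BB or HBP moves only the runners who are forced."""
    if state & 0b001 == 0:
        return state | 0b001            # first base was open
    if state & 0b010 == 0:
        return state | 0b011            # runner on first is forced to second
    return 0b111                        # first and second occupied: bases end up loaded


def base_chain(probs):
    """8x8 transition matrix over base states, with outs taken as uniform on {0, 1, 2}."""
    p_out = 1.0 - sum(probs.values())
    matrix = [[0.0] * 8 for _ in range(8)]
    for s in range(8):
        matrix[s][s] += p_out * 2.0 / 3.0   # out made with 0 or 1 out: bases unchanged
        matrix[s][0] += p_out / 3.0         # third out: next inning starts empty
        matrix[s][after_walk(s)] += probs["BB"] + probs["HBP"]
        for event, bases in HIT_BASES.items():
            matrix[s][after_hit(s, bases)] += probs[event]
    return matrix


def stationary(matrix, iterations=2000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(matrix)
    dist = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical per-plate-appearance event probabilities.
EVENT_PROBS = {"BB": 0.08, "HBP": 0.01, "1B": 0.155, "2B": 0.045, "3B": 0.005, "HR": 0.025}
P = base_chain(EVENT_PROBS)
pi = stationary(P)
expected_runners = sum(pi[s] * bin(s).count("1") for s in range(8))
hr_value = 1.0 + expected_runners       # the batter plus everyone already on base
```

The last line is the key step: under these rules a home run scores the batter plus every runner, so its expected value is one plus the expected number of runners on base.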

Results

Using this method, a player in our universe who stepped to the plate in 2013 saw the bases empty with an approximate probability of .467. That same batter saw the bases full with a probability of .103 and one runner on first with probability .210. If a team managed to load the bases, they’d find that they generally had to wait about 10 more plate appearances before they next loaded the bases. If they put runners on the corners, they generally had to wait 42 more plate appearances before they did so again. All of this leads us to some of our final conclusions. In the context of our rules, the expected number of runners on base in 2013 was .908, meaning that the expected value of a home run was 1.908 runs. This method generates home run values that are always between 1.8 and 2.2 runs. The following is a table of all of the expected home run values this method generates from the seasons of the last 25 years:

In the last 25 years, we predict that a home run had the greatest value in 1999, at 1.972 runs. This is a reflection of the heavily offensive environment of the season, when big bats such as Sammy Sosa, Mark McGwire, and Barry Bonds were getting on base at staggering rates. The following is a graph of all of the home run values the system predicts from 1884 through 2013:

We see that this system predicts home runs to have been of more value from around 1889 to 1902, when the home run hovers at around 2.00-2.15 runs. While most players of this generation weren’t hitting home runs, they were certainly getting on base often. In 1894, 38 players had on-base percentages greater than or equal to .400, compared to 7 players in 2013. When on-base percentages are higher, more people are on base, and this increases the expected value of the home run. Under our restrictions, however, the home run hasn’t been worth 2.00 runs since 1950, and these days it fluctuates between 1.90 and 1.93 runs. While these estimates are all under the umbrella of rules and assumptions, this framework allows us to more easily generalize the game of baseball while preserving its most important aspects. It’s this framework that gives us the power of estimating that, while Chris Davis’ 53 home runs were probably worth 101 runs in 2013, they may have been worth 114 in 1894.
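The Chris Davis comparison is just the home run count multiplied by the per-homer value for each season. A quick check, using the 2013 value of 1.908 from the Results and backing out the 1894 value implied by the 114-run estimate (the 1894 figure is not quoted directly, so the last line is an inference):

```python
davis_home_runs = 53
hr_value_2013 = 1.908  # runs per home run in 2013, from the Results

runs_2013 = davis_home_runs * hr_value_2013
print(round(runs_2013))  # 101

# The 114-run estimate for 1894 implies a per-homer value of 114 / 53.
runs_1894 = 114
implied_hr_value_1894 = runs_1894 / davis_home_runs
print(round(implied_hr_value_1894, 2))  # about 2.15
```

The implied 1894 value sits at the top of the 2.00-2.15 range described for that era.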


Talkin’ About Playoffs

While watching the playoffs last October, I realized that I had never seen rookies play such a prominent role in the postseason before.  Pitchers like Michael Wacha, Gerrit Cole, Hyun-Jin Ryu, and Sonny Gray propelled their teams into contention during the regular season, and took the hill in multiple elimination games.  The inimitable Yasiel Puig had a similar impact on the Dodgers’ fortunes in 2013.

This observation led me to investigate rookie performance during the 2013 regular season.  Were rookies contributing to the success of their teams more so than in the past?  Were rookie pitchers outperforming rookie hitters?  How about rookies on playoff teams versus non-playoff teams?

Using WAR data from Baseball Reference (sorry, guys) I measured rookies’ contribution to overall team success in 2000-2013, defined as rookie WAR divided by their team’s WAR.  A few definitions before jumping in to the findings:

  • Rookies are players who have accumulated fewer than 130 AB (or 50 IP) and fewer than 45 days on an active roster prior to their rookie season
  • For consistency across time, teams that won the second wild-card slot in 2012 and 2013 are not considered playoff teams (u mad, Reds and Indians fans?)
  • Rookie pitcher WAR = amount of WAR created by a team’s rookie pitchers
  • Rookie pitcher share of WAR = % of a team’s WAR created by rookie pitchers
  • Rookie batter WAR = amount of WAR created by a team’s rookie batters
  • Rookie batter share of WAR = % of a team’s WAR created by rookie batters
  • Rookie total WAR = Rookie batter WAR + Rookie pitcher WAR
  • Rookie share of total WAR = Rookie pitcher share of WAR + Rookie batter share of WAR

In chart 1, rookie share of total WAR for the average team in 2013 (11.3%) is above the long-run average of 8%, and was only exceeded in 2006 (12.7%).  But there was no discernible difference in rookie share of total WAR between the average playoff team (10.9%) and non-playoff team (11.4%) last season.  So far, it would appear as though I need to adjust my TV.

The data becomes more interesting when the average team’s rookie share of total WAR is decomposed into pitcher and batters’ contributions (chart 2).  There is a rapid rise in rookie pitcher share of WAR between 2010 and 2013, peaking last season at 6.7% of the average team’s WAR.  This increase was so strong, it more than made up for a decrease in rookie batter share of WAR during the same timeframe, from 6.5% in 2010 to 4.6% last season.

These trends become starker when the analysis is limited to playoff teams (chart 3).  On the average playoff team in 2013, rookies provided 10.9% of WAR, a step down from the high reached in 2012.  But there is still a huge rise in rookie pitcher share of WAR between 2010 and 2013, to 8.7% last season, and a concurrent decrease in rookie batter share of WAR, to 2.2%.  In other words, 80% of the average 2013 playoff team’s rookie total WAR was generated by pitchers.  If not for a certain Cuban-American hero with a penchant for bat-flipping, that share would have been even higher.
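The 80% figure comes straight from the two shares above. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def rookie_shares(pitcher_share, batter_share):
    """Return (rookie share of total WAR, fraction of rookie WAR from pitchers).

    Both inputs are percentages of the team's total WAR.
    """
    total = pitcher_share + batter_share
    return total, pitcher_share / total


# Average 2013 playoff team, from the text: 8.7% pitchers, 2.2% batters.
total_share, pitcher_fraction = rookie_shares(8.7, 2.2)
print(round(total_share, 1))          # 10.9
print(round(pitcher_fraction * 100))  # 80
```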

But some evidence, as well as anecdotal observation, suggests that pitchers in general have become more dominant over the past few seasons.  Is this trend, observed so far among rookies, true of all pitchers?  Over the past fourteen seasons, the average team has generated between 36-44% of WAR from pitchers (chart 4).  This share has been consistent over time, and has edged up only slightly during the past few seasons.  This suggests that rookie pitchers, especially those on playoff teams, really did excel in 2013.

Now, let’s look at just how good the rookie pitchers on playoff teams were last season (chart 5).  Together, the 54 rookie pitchers on 2013 playoff teams generated 29.6 WAR, which is slightly higher than the 2012 total (29.1 WAR) and much higher than the long-run average (16.0 WAR).  What’s even more impressive is that last season, 57% of all 30 teams’ rookie pitcher WAR was generated by the rookie pitchers on playoff teams, a higher share than in any other season since 2000.  Cumulatively, 54 rookie pitchers on 8 teams outperformed 151 rookie pitchers on 22 teams.  Not bad.

But wait…there’s more.  By focusing on the best rookies on playoff teams (arbitrarily defined here as those who generated 1+ WAR), we see that there were 20 such players last season (chart 6).  Of that number, 16 were pitchers, like Shelby Miller, Hyun-Jin Ryu, and Julio Teheran.  Five of those pitchers were on the Cardinals (Miller, Siegrist, Wacha, Rosenthal, and Maness).  The concentration of top rookie pitchers on playoff teams last year is the highest in at least fourteen seasons.

My initial observation, “Wow, there are lots of rookie pitchers killing it in the 2013 playoffs!” looks to be borne out in the data.  This raises two other interesting questions:

1.  For any of last year’s playoff teams, did rookie pitchers provide enough value to get their team into the playoffs?

2.  Is the rookie pitcher observation a one-time anomaly, or indicative of a larger trend?

The first question is relatively easy to answer.  We can compare each playoff team’s rookie pitcher WAR (essentially, how many more games the team won because of rookie pitchers) to the number of additional games each playoff team could have lost and still made the playoffs without tying a second-place team (let’s call this the buffer). 

For four out of eight playoff teams (again, I exclude the second wild-cards), rookie pitcher WAR is higher than the buffer (chart 7).  But since Detroit and Tampa made the playoffs by one game, and since Pittsburgh’s rookie pitcher WAR is less than one game higher than the buffer, it’s hard to argue that rookie pitchers definitively moved the needle for them. Andy Dirks or Yunel Escobar could have just as easily gotten their teams over the hump, since they also created more than 1 WAR.

The Cardinals are the one team whose rookie pitchers probably got them into the playoffs.  They got 9.7 extra wins from their rookie pitchers (almost 23% of the entire team’s WAR), and made the playoffs by 6 games.
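Here is the buffer comparison as a small sketch, using only the Cardinals figures quoted above. The function names and the one-game threshold are my own shorthand for the reasoning applied to Detroit, Tampa, and Pittsburgh, not a formal rule.

```python
def margin_over_buffer(rookie_pitcher_war, buffer_games):
    """Wins from rookie pitchers beyond what the team could afford to lose."""
    return rookie_pitcher_war - buffer_games


def clearly_decisive(rookie_pitcher_war, buffer_games, threshold=1.0):
    # ASSUMPTION: treat a margin of more than one game as "clearly" decisive.
    return margin_over_buffer(rookie_pitcher_war, buffer_games) > threshold


# Cardinals, from the text: 9.7 rookie pitcher WAR, made the playoffs by 6 games.
print(round(margin_over_buffer(9.7, 6), 1))  # 3.7
print(clearly_decisive(9.7, 6))              # True

# "Almost 23%" of team WAR implies a team total of a little over 42 WAR.
implied_team_war = 9.7 / 0.23
print(round(implied_team_war, 1))
```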

The second question is harder to answer, since the 2014 season hasn’t started yet.  There’s no clear reason why rookie pitchers on playoff teams would suddenly start playing extremely well, especially since it doesn’t look like they’re causing their teams to make the playoffs.  The likeliest explanation is that the top teams in the league happened to have outstanding rookie pitchers last year.  Sometimes, “stuff” happens.

But if you want to prove me wrong, and show that last year’s playoff teams have developed great farm systems capable of producing more top rookie pitchers, pay close attention to what Jameson Taillon (Pirates), Carlos Martinez (Cardinals), Jake Odorizzi (Rays), and Allen Webster (Red Sox) bring to the table in 2014.  All four pitchers are on Baseball America’s list of top 100 prospects, are on last year’s playoff teams, and are projected to crack the majors this season.  If they get off to a hot start, and if they help their teams return to the playoffs, I might have to revisit my conclusion next winter.


Options for Closer in Arizona

As I usually do, I was checking through the headlines on mlb.com and I happened to notice that Kirk Gibson has not made a decision on who will be closing for his team. This should be one of the bigger questions leading up to the regular season, as the Diamondbacks have several options when it comes to closers.

Honorable Mention: Josh Collmenter
He has quietly been one of the Diamondbacks’ best relief pitchers of late. He is a three-pitch pitcher with an 88 mph fastball, a 70 mph curveball, and a 78 mph changeup. With that little velocity, one would expect him to pitch to contact and let the defense take care of him. But he posted a career-low 32.7% groundball rate, which is low by most pitchers’ standards. However, he also does not give up that many homers, allowing .78 HR/9 last season while striking out 8.32 batters and walking 3.23 per nine innings.

Collmenter’s value to the Diamondbacks is as a long reliever and spot starter. He pitched in 49 games last season and threw a total of 92 innings, meaning that he threw nearly 2 innings per appearance. In his minor league career, he pitched all of his outings as a starter with the exception of 2 games in his first year in Low-A ball in 2007. With his strikeout rate, closer could be a good spot for him, but I would like to keep him available in the bullpen for games in which the starter can only throw 2 innings or less.

3. Brad Ziegler
It is no secret that Brad Ziegler is very good at getting groundball outs; that is what makes him successful. He doesn’t really throw an actual sinker per se, but his fastball essentially plays the role of a sinker. Ziegler’s submarine arm action has the pitch rising briefly before dipping down just before it gets to the plate (as shown in the gif below).

By using this heavy sinking action on the fastball, he has produced a career 66.1% ground ball rate (which has risen to 72.9% since the start of the 2012 season), and in front of a great fielding team like the Diamondbacks (team UZR/150 of 8.1, good for second highest in the Majors), that leads to success. But this is why he should be used more as a relief ace than as a closer. If the starter leaves the game in the seventh inning with people on base, I want a pitcher to come in who can get the ground ball double play. Neither Putz nor Reed is as good at getting groundball outs, and only Putz has a higher LOB% (90.9% for Putz as opposed to Ziegler’s 80.7%). If Ziegler is put into the role of closer, then he would be less likely to be put into a situation where a groundball is needed, as the manager would want to hold on to him until the ninth inning.

2. J.J. Putz
J.J. Putz has a very realistic chance of claiming the role of closer at the start of the season. If not for injuries, Putz would have maintained the role of closer last year, but elbow and finger injuries during the season limited him to only 34.1 innings, and when he returned he was more of a situational right-handed pitcher. But since the start of the 2012 season, no pitcher on the Diamondbacks has more saves than Putz’s 38, leading many to believe that he could be a front-runner for the closer spot based on experience alone. He’s been solid for them in the past, but a steady decrease in pitch velocity and an increase in home run rate over the past 3 years should be somewhat concerning for the Diamondbacks. His fastball velocity is still above 90 mph (91.7 mph in 2013, down from 92.8 mph in 2012), and his home run to fly ball rate is still not too high (about 14.8% in 2013 and 8.7% in 2012), but that is a concerning increase from the 6.0% HR/FB rate in 2011.

One interesting thing to think about with regard to J.J. Putz is what effect his injuries had on his performance last year. Putz’s numbers rose dramatically in essentially every category, with two of the more significant increases coming in SIERA, which went from 2.29 in 2012 to 3.24 in 2013, and walk rate, which went from 1.82 BB/9 to 4.46 BB/9. It is tough to tell whether these inflated statistics are a result of the injuries or of wearing down with age. After all, we can’t forget that Putz is now 37, so he does not have age on his side anymore. I don’t see him being as bad as his 2013 stats indicate, but it is certainly something to think about.

1. Addison Reed
One pitcher who definitely has age on his side is Addison Reed, the pitcher who I believe should be given the role of closer without question. He proved that he is one of the best young pitchers in the game, and he did so while playing in front of a terrible defensive team like the White Sox. I believe that his 3.79 ERA is misleading, as it makes him seem worse than he is. Reed strikes out 9.08 batters per nine innings, limits walks with a 2.90 BB/9, and allows only .76 HR/9, all of which are comfortable in the closer’s role. Those are the kind of numbers that a closer should have, and at his young age of 25, there is definitely room for improvement. His other numbers, like his 2013 xFIP of 3.77 and SIERA of 3.19, would indicate that he is going to get better.

There are other things to like about Reed aside from his statistics and potential. Last year, he threw his four-seam fastball at 92.7 mph, his two-seam fastball at 93.5 mph, his slider at 83.8 mph, and his changeup at 83.7 mph. The 8.9 mph difference between his four-seam fastball and slider is very deceptive to a right-handed batter because of the movement away from the batter, and the roughly 9 mph difference between his four-seam fastball and changeup has a devastating effect on left-handed batters, as evidenced by the .266 wOBA he allowed to lefties last season along with 37 strikeouts.

The Diamondbacks are in an enviable position, having multiple options that they could plug in at closer. With the young and fragile rotation that the Diamondbacks have (Corbin has already shown that young starters are good but not invincible), I think that Collmenter will have to avoid getting locked into the closer spot, as he may be needed to make a few starts. Ziegler was good for the Diamondbacks last season, but don’t expect to see him in the closer’s role, as a pitcher of his caliber needs to be free to pitch at any time during the course of a game. But honestly, when it comes down to the choice, the gap between Reed and the other options is substantial enough that there really should not be much debate.


Examining the Prince’s Reign in Texas: Prince Fielder and the 2014 Rangers

One of the offseason’s most talked-about moves was the trade that sent Prince Fielder to the Texas Rangers in exchange for Ian Kinsler and gobs of cash. While universally (and rightfully so) viewed as primarily a salary dump for GM Dave Dombrowski and the Tigers camp, the Rangers have gained a strong bat to place in the middle of their batting order alongside Adrian Beltre and Alex Rios.

Yet unlike the much-theorized David Price trade, the Fielder deal was not a pure salary dump. Fielder stumbled mightily in his production in 2013. In 2012, he posted a robust .313/.412/.528 traditional slash line, with an impressive .940 OPS and 153 wRC+. According to Baseball-Reference’s oWAR calculations, 2012 was Fielder’s third-most valuable year at the plate with a 5.4 mark. All of this stands in stark contrast to Fielder’s 2013.

Last year Fielder posted a much more pedestrian .279/.362/.457, .819 OPS, 125 wRC+ and 2.9 oWAR. While of course those are still above-average numbers, when attached to the name Prince Fielder and his ubercontract, Dave Dombrowski clearly had reason for concern. However, off-the-field issues are widely believed to have contributed to the dip in Fielder’s production, and natural regression may have also contributed to the fall from Fielder’s career-high traditional slash line. Fielder also enjoyed a career-high .321 BABIP in 2012, with his 2013 mark of .307 more in line with his normal marks.

So, the question presents itself: what exactly does Texas GM Jon Daniels have on his hands in the 2014 model year Fielder? There are a number of factors contributing to this answer. Firstly, while the batters ahead of him do not contribute to his slash line, they certainly do help counting stats such as RBIs. While RBIs are naturally an utterly useless stat when evaluating individual performance, men getting on base allow a hitter to create runs, and as runs are ultimately what win games, putting men on ahead of big bats such as Fielder is part of what goes into good team creation. Therefore, I will examine the clip at which we can expect there to be runners on base when Fielder bats for Texas as opposed to his stint in Detroit.

Secondly, I will also examine the impact Arlington itself will have on Fielder’s bat. Arlington has traditionally been a much more hitter-friendly location than Detroit. But how much exactly will Texas raise Fielder’s numbers?

The top of the 2013 Tigers lineup consisted of Austin Jackson, Torii Hunter, and Miguel Cabrera in front of Fielder. Those first three hitters posted OBPs of .337, .334, and .442, respectively. That averages out to a .371 mark, albeit an imperfect one due to Cabrera’s significantly higher individual mark (also, Cabrera hit a lot of home runs last year, and while that counts towards his OBP, that means the bases were empty when Fielder came to bat). We’ll refer to this average of the top of the order as tOBP, or “Top OBP” for the rest of the article for the sake of saving space.

The top of the 2014 Rangers lineup will be made up of Shin-Soo Choo and either Elvis Andrus or Jurickson Profar before Fielder, who will bat third. There are a number of different projection systems we can use to forecast the upcoming season; for this article we’ll be using Steamer. Choo is given a .391 OBP, Andrus a .340, and Profar a .321. With Andrus in the lineup the projected tOBP is .365, with Profar it’s .356. So despite throwing his wallet at Choo and his obscene .423 2013 OBP, Jon Daniels in fact is giving Fielder less to work with in front of him.

Or is he? Part of the smaller (projected) tOBP in Texas is that Fielder simply won’t have the best hitter in the game hitting in front of him anymore. Also, one has to expect Fielder to be better at the plate this year. Steamer awards Fielder a substantial .290/.390/.516 line with a 142 wRC+ and 3.4 WAR, a major uptick over last year’s production. If we factor him into the projected Texas tOBP, with Andrus it’s a .374, and with Profar it’s .367. That’s something you like to see if you’re Adrian Beltre, who led the league in hits last year and launched 30 homers.
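For anyone who wants to rerun the tOBP numbers, here is a minimal sketch. It assumes tOBP is a plain unweighted average of the listed OBPs (which is how the figures above work out), and `top_obp` is a hypothetical helper name.

```python
def top_obp(*obps):
    """Unweighted average OBP of the hitters at the top of the order."""
    return sum(obps) / len(obps)


# 2013 Tigers: Jackson, Hunter, Cabrera.
tigers_2013 = top_obp(0.337, 0.334, 0.442)

# 2014 Rangers (Steamer): Choo plus Andrus or Profar.
rangers_andrus = top_obp(0.391, 0.340)   # exactly .3655, quoted as .365
rangers_profar = top_obp(0.391, 0.321)

# Same lineups with Fielder's projected .390 OBP included.
rangers_andrus_fielder = top_obp(0.391, 0.340, 0.390)
rangers_profar_fielder = top_obp(0.391, 0.321, 0.390)

for name, value in [
    ("Tigers 2013", tigers_2013),
    ("Rangers w/ Andrus", rangers_andrus),
    ("Rangers w/ Profar", rangers_profar),
    ("Rangers w/ Andrus + Fielder", rangers_andrus_fielder),
    ("Rangers w/ Profar + Fielder", rangers_profar_fielder),
]:
    print(f"{name}: {value:.4f}")
```

A plate-appearance-weighted average would be a reasonable refinement, but it would need playing-time projections that are not part of this article.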

And speaking of homers, Fielder’s move to Arlington will help him in that department. The newly named Globe Life Park ranked seventh last year in home runs with a total of 107 being hit there. Comerica Park, where the Tigers play, ranked fourteenth with 99. This helps Steamer award Fielder 29 home runs, up from 25 last year.

However, can we possibly expect Fielder to exceed these projections? As mentioned earlier, a number of off-the-field issues contributed to Fielder’s down year, according to Hunter. A change of scenery will definitely do Fielder good, and he also seems to have lost some weight if the pictures and video coming out of Spring Training are to be believed. For that reason I’m willing to bump up Fielder’s numbers by a few slots, and I expect him to be even better than what Steamer predicts. Because baseball is a fickle mistress I could easily be wrong, but call it a gut feeling. All in all, Jon Daniels may have caught lightning in a bottle here with his rather expensive gamble, and if Texas manages to overcome their pitching woes they should be a very dangerous team with Fielder anchoring their lineup.


Pitcher WAR and the Concept of Value

Whenever one makes any conclusion based off of anything, a bunch of underlying assumptions get shepherded in to the high-level conclusion that they output. Now that’s a didactic opening sentence, but it has a point–because statistics are full of underlying assumptions. Statistics are also, perhaps not coincidentally, full of high-level conclusions. These conclusions can be pretty wrong, though. By about five-hundred runs each and every season, in this case.

Relative player value is likely the most important area of sports analysis, but it’s not always easy. For example, it’s pretty easy to get a decent idea of value in baseball while it’s pretty hard to do the same for football. No one really knows the value of a pro-bowl linebacker compared to a pro-bowl left guard, for one. People have rough ideas, but these ideas are based more on tradition and ego than advanced analysis. Which is why football is still kind of in the dark ages, and baseball isn’t. But just because baseball is out of the dark ages, it doesn’t mean that it’s figured out. It doesn’t even mean that it’s even close to figured out.

Because this question right here still exists: What’s the value of a starting pitcher compared to a relief pitcher? At first glance this is a question we have a pretty good grasp on. We have WAR, which isn’t perfect, yeah, but a lot of the imperfections get filtered out when talking about a position as a whole. You can just compare your average WAR for starters with your average WAR for relievers and get a decent answer. If you want to compare the top guys then just take the top quartile and compare them, etc. Except, well, no, because underlying assumptions are nasty.

FanGraphs uses FIP-WAR as its primary value measure for pitchers, and it’s based on the basic theory that pitchers only really control walks, strikeouts, and home runs–and that everything else is largely randomness and isn’t easily measurable skill. RA9 WAR isn’t a good measure of individual player skill because a lot of it depends upon factors like defense and the randomness of where the ball ends up, etc. This is correct, of course. But when comparing the relative value of entire positions against each other, RA9 WAR is the way to go. Because when you add up all the players on all of the teams and average them, factors like defense and batted balls get averaged together too. We get inherently perfect league average defense and luck, and so RA9 WAR loses its bias. It becomes (almost) as exact as possible.

Is this really a big deal, though? If all of the confounding factors of RA9 WAR get factored together, wouldn’t the confounding factors of FIP-WAR get factored together too? What’s so bad about using FIP-WAR to judge value? Well there’s this: From 1995 onward, starting pitchers have never outperformed their peripherals. Relievers? They’ve outperformed each and every time. And it’s not like the opposite happened in 1994–I just had to pick some date to start my analysis. Here’s a table of RA9-WAR compared to FIP-WAR for starters from 1995 through 2013, followed by the same table for relievers.

Starter RA9-WAR/FIP-WAR Comparisons

Year RA9-WAR FIP-WAR Difference
1995 277.7 305.0 -27.3
1996 323.2 337.1 -13.9
1997 302.5 336.6 -34.1
1998 326.8 357.8 -31.0
1999 328.7 359.7 -31.0
2000 323.0 348.6 -25.6
2001 324.9 353.9 -29.0
2002 331.4 348.6 -17.2
2003 315.0 346.7 -31.7
2004 311.9 343.0 -31.1
2005 314.8 333.0 -18.2
2006 317.0 345.7 -28.7
2007 343.3 361.6 -18.3
2008 325.7 351.9 -26.2
2009 325.1 351.8 -26.7
2010 317.8 353.6 -35.8
2011 337.3 355.6 -18.3
2012 311.1 337.6 -26.5
2013 304.0 332.4 -28.4

Reliever RA9-WAR/FIP-WAR Comparisons

Year RA9-WAR FIP-WAR Difference
1995 78.4 50.3 28.1
1996 73.9 61.8 12.1
1997 98.0 65.4 32.6
1998 101.6 70.4 31.2
1999 99.8 68.9 30.9
2000 106.9 80.2 26.7
2001 103.3 77.6 25.7
2002 91.1 76.6 14.5
2003 112.5 83.4 29.1
2004 117.7 85.1 32.6
2005 115.7 96.7 19.0
2006 112.7 84.0 28.7
2007 86.8 68.2 18.6
2008 104.1 79.7 24.4
2009 103.7 77.7 26.0
2010 109.0 74.9 34.1
2011 91.0 73.6 17.4
2012 116.3 91.3 25.0
2013 126.6 98.5 28.1

Ok, so that’s a lot of numbers. The basis, though, is that FIP thinks that starters are better than they actually are, while it thinks relievers are the converse. And this is true year after year, by margins that rise well above negligible. Starters allow roughly 250 more runs than they should according to FIP every season, while relievers allow about 250 fewer than they should by FIP’s methodologies–in far fewer innings. In more reduced terms this means that starters are over-valued by about 10% as a whole, while relievers are consistently under-valued by about 25% according to FIP-WAR. Now, this isn’t a completely new idea. We’ve known that relievers tend to outperform peripherals for a while, but the truth is this: relievers really outperform peripherals, pretty much all the time always.
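The "roughly 250 runs" figure can be sanity-checked from the Difference columns of the two tables. The sketch below averages those columns and converts wins to runs using a 10-runs-per-win rule of thumb; that conversion factor is my assumption, not something stated in this article.

```python
# Difference columns (RA9-WAR minus FIP-WAR), 1995 through 2013, from the tables.
starter_diff = [-27.3, -13.9, -34.1, -31.0, -31.0, -25.6, -29.0, -17.2, -31.7,
                -31.1, -18.2, -28.7, -18.3, -26.2, -26.7, -35.8, -18.3, -26.5,
                -28.4]
reliever_diff = [28.1, 12.1, 32.6, 31.2, 30.9, 26.7, 25.7, 14.5, 29.1,
                 32.6, 19.0, 28.7, 18.6, 24.4, 26.0, 34.1, 17.4, 25.0,
                 28.1]

RUNS_PER_WIN = 10.0  # ASSUMPTION: common rule-of-thumb conversion

mean_starter = sum(starter_diff) / len(starter_diff)
mean_reliever = sum(reliever_diff) / len(reliever_diff)

print(f"starters:  {mean_starter:+.1f} WAR/season, "
      f"about {mean_starter * RUNS_PER_WIN:+.0f} runs")
print(f"relievers: {mean_reliever:+.1f} WAR/season, "
      f"about {mean_reliever * RUNS_PER_WIN:+.0f} runs")

# The sign never flips in either table.
print(all(d < 0 for d in starter_diff), all(d > 0 for d in reliever_diff))
```

Under that conversion both gaps land a little above 250 runs per season, and the two together give the roughly five hundred runs mentioned at the top.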

Relievers almost get to play a different game than starters. They don’t have to face lineups twice, they don’t have to throw their third or fourth-best pitches, they don’t have to conserve any energy, etc. There’s probably a lot more reasons that relievers are better than starters, too, and these reasons can’t be thrown out as randomness, because they pretty much always happen. Not necessarily on an individual-by-individual basis, but when trying to find the relative value between positions, the advantages of being a reliever are too big to be ignored.

How much better are relievers than starters at getting “lucky”? Well, a few stats that have been widely considered luck stats (especially for pitchers) for a while are BABIP and LOB. FIP assumes that starters and relievers are on even ground, as far as these two numbers are concerned. But are they? Here’s a few tables for comparison, using the same range of years as before.

BABIP Comparisons

Year Starter BABIP Reliever BABIP Difference
1995 0.293 0.290 0.003
1996 0.294 0.299 -0.005
1997 0.298 0.293 0.005
1998 0.298 0.292 0.006
1999 0.297 0.288 0.009
2000 0.289 0.284 0.005
2001 0.290 0.286 0.004
2002 0.295 0.293 0.002
2003 0.294 0.285 0.009
2004 0.298 0.292 0.005
2005 0.300 0.292 0.009
2006 0.293 0.289 0.003
2007 0.291 0.288 0.003
2008 0.297 0.290 0.007
2009 0.296 0.288 0.008
2010 0.292 0.283 0.008
2011 0.292 0.290 0.002
2012 0.294 0.288 0.006
2013 0.293 0.287 0.006

LOB Comparisons

Year Starter LOB% Reliever LOB% Difference
1995 69.9% 73.4% -3.5%
1996 70.9% 73.2% -2.4%
1997 69.5% 72.7% -3.2%
1998 69.9% 73.1% -3.2%
1999 70.6% 73.2% -2.7%
2000 71.4% 74.3% -2.8%
2001 70.9% 74.0% -3.1%
2002 70.2% 72.3% -2.0%
2003 70.7% 73.8% -3.1%
2004 70.4% 74.0% -3.6%
2005 70.6% 72.9% -2.3%
2006 70.9% 74.2% -3.3%
2007 71.5% 74.0% -2.4%
2008 71.3% 73.9% -2.6%
2009 71.7% 74.3% -2.6%
2010 72.0% 75.3% -3.3%
2011 72.0% 74.6% -2.6%
2012 73.1% 76.2% -3.1%
2013 71.9% 75.5% -3.6%

With the exception of BABIP in ’96, relievers always had better luck than starters. Batters simply don’t get on base as often–upon contacting the ball fairly between two white lines–when they’re facing guys that didn’t throw out the first pitch of the game. And when batters do get on, they don’t get home as often. Relievers mean bad news, if good news means scoring more runs.
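The "always, except BABIP in ’96" claim is easy to verify from the two tables. This sketch lists the seasons in which relievers did not beat starters on each measure; the values are copied from the tables above and the helper name is hypothetical.

```python
years = list(range(1995, 2014))

starter_babip = [0.293, 0.294, 0.298, 0.298, 0.297, 0.289, 0.290, 0.295, 0.294,
                 0.298, 0.300, 0.293, 0.291, 0.297, 0.296, 0.292, 0.292, 0.294,
                 0.293]
reliever_babip = [0.290, 0.299, 0.293, 0.292, 0.288, 0.284, 0.286, 0.293, 0.285,
                  0.292, 0.292, 0.289, 0.288, 0.290, 0.288, 0.283, 0.290, 0.288,
                  0.287]

starter_lob = [69.9, 70.9, 69.5, 69.9, 70.6, 71.4, 70.9, 70.2, 70.7,
               70.4, 70.6, 70.9, 71.5, 71.3, 71.7, 72.0, 72.0, 73.1, 71.9]
reliever_lob = [73.4, 73.2, 72.7, 73.1, 73.2, 74.3, 74.0, 72.3, 73.8,
                74.0, 72.9, 74.2, 74.0, 73.9, 74.3, 75.3, 74.6, 76.2, 75.5]


def exception_years(starters, relievers, lower_is_better):
    """Seasons in which relievers were NOT better than starters."""
    out = []
    for year, s, r in zip(years, starters, relievers):
        reliever_better = r < s if lower_is_better else r > s
        if not reliever_better:
            out.append(year)
    return out


babip_exceptions = exception_years(starter_babip, reliever_babip, True)
lob_exceptions = exception_years(starter_lob, reliever_lob, False)
print(babip_exceptions)  # [1996]
print(lob_exceptions)    # []
```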

Which is why we have to be careful when we issue exemptions to the assumptions of our favorite tools. There are a lot of solid methodologies that go into the formulation of FIP, but FIP is handicapped by the forced assumption that everyone is the same at the things that they supposedly can’t control. Value is the big idea–the biggest idea, probably–and it’s entirely influenced by how one chooses to look at something. In this case it’s pitching, and what it means to be a guy that only pitches roughly one inning at a time. Or perhaps it’s about this: What it means to be a guy who looks at a guy that pitches roughly one inning at a time, and then decides the worth of the guy who pitches said innings, assuming that one wishes to win baseball games.

The A’s and Rays just spent a bunch of money on relievers, after all. And we’re pretty sure they’re not dumb, probably.


Projecting Strength of Schedule for Pitchers and Hitters

Friday morning, as I began the tedious process of combining all MLB schedules in one spreadsheet, I noticed that FanGraphs’ resident volcano expert and prolific content generator Jeff Sullivan posted one very similar article, and then another shortly thereafter. He focused on projected WAR, while I planned to look specifically at projected average ERA and wOBA a team must contend with over the 2014 season. So at the risk of writing a similar post, one with drier writing and less cool graphics, I submit to you the following simple table and graphs.

We often look at the strength of a division and make generalizations about the hardest place to pitch (AL East) and hit (NL East). Like park effects, we sometimes jump to conclusions about the effects of dream lineups and weak interdivision rivals. Chad Young’s analysis of Prince Fielder’s move to Arlington is a perfect example of how enthusiasm can be misplaced when we forget that 90 of a club’s 162 games take place outside of their division, with 20 games occurring in a different league.  The table below shows projected mean wOBA and ERA by team, which are weighted by expected plate appearances and innings pitched, respectively. As expected, AL teams generally have a DH-fueled high wOBA and inflated ERA when compared to their NL counterparts. All projections are courtesy of Steamer’s 2014 pre-season projections. Keep in mind that Steamer regresses stats like wOBA and ERA, so there is not as huge a gap between the Red and White Sox (0.333 vs. 0.317) compared to what you might see during the season. However, Steamer has been shown to be one of the best projection systems available when it comes to capturing player-to-player variation in performance (i.e. ranking players by production), which is sufficient for looking at the differences between teams.

2014 Steamer Projections*

Team wOBA ERA
BOS 0.333 3.85
TOR 0.331 4.16
BAL 0.326 4.13
NYY 0.322 3.92
TB 0.318 3.63
DET 0.330 3.64
KAN 0.324 3.95
CLE 0.321 3.91
CHW 0.317 4.35
MIN 0.312 4.33
TEX 0.332 4.09
LAA 0.327 4.00
SEA 0.325 3.84
OAK 0.320 3.81
HOU 0.310 4.41
WAS 0.328 3.58
ATL 0.322 3.66
PHI 0.310 3.72
NYM 0.309 3.85
MIA 0.309 4.04
STL 0.326 3.49
PIT 0.323 3.73
MIL 0.321 4.02
CHC 0.319 3.98
CIN 0.318 3.66
COL 0.347 4.22
LAD 0.329 3.44
ARI 0.329 3.78
SF 0.323 3.72
SD 0.319 3.80

*adjusted for PA and IP

I was surprised by the high ERA attributed to the San Diego Padres, poor enough for 6th worst in the NL. The Reds’ Choo-less offense is also, somewhat surprisingly, projected as the 7th worst in the majors. Let’s take a moment to silently reflect that the Minnesota Twins, despite having a spacious ballpark and a non piss-poor payroll, are still projected to give up more earned runs than the Colorado Rockies.

While the table displays projected wOBA and ERA by team, the charts below illustrate the mean wOBA and ERA faced by each team over 162 games.
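For readers who want to reproduce this kind of number, the mean wOBA a team faces is just a games-weighted average of its opponents’ projected wOBA. The sketch below is a rough illustration, not the exact calculation behind the charts: it covers only the Padres’ division games, the 19-games-per-opponent counts are an assumption on my part, and `mean_opponent_stat` is a hypothetical helper name. The wOBA values come from the table above.

```python
def mean_opponent_stat(games_vs, team_stat):
    """Games-weighted average of an opponent stat over a schedule.

    games_vs:  {opponent: number of games against that opponent}
    team_stat: {team: projected stat, e.g. wOBA or ERA}
    """
    total_games = sum(games_vs.values())
    weighted = sum(n * team_stat[opp] for opp, n in games_vs.items())
    return weighted / total_games


# Projected wOBA from the table (NL West opponents of San Diego).
projected_woba = {"COL": 0.347, "LAD": 0.329, "ARI": 0.329, "SF": 0.323}

# ASSUMPTION: 19 games against each division rival; the other games on the
# schedule are left out of this toy example.
padres_division_games = {"COL": 19, "LAD": 19, "ARI": 19, "SF": 19}

padres_division_woba = mean_opponent_stat(padres_division_games, projected_woba)
print(round(padres_division_woba, 3))  # 0.332
```

Passing the ERA column instead of wOBA gives the hitter-side version of the same calculation.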

 

Projected wOBA

Last September Dave Cameron presented a convincing argument that Chris Sale’s 2013 season was as good if not better than Max Scherzer’s, but was obscured in part because Sale routinely pitched against the Tigers and Scherzer routinely pitched against the White Sox. These projections reinforce the argument in favor of opponent-adjusted measurements—Detroit pitchers are projected to face a wOBA of 0.321 while Chicago pitchers play against teams with a projected wOBA of 0.324.

San Diego and San Francisco are home to some of the most pitching-friendly stadiums in the country. However, in part because they play 28 away games against the Rockies, Diamondbacks, and Dodgers, their opponents’ wOBA is higher than people might expect. However great it is that a flyball pitcher like Ian Kennedy has a home in spacious San Diego, it’s important to note that the Padres are slated to face some tougher-than-average lineups.

Projected ERA

ERA drops off pretty sharply when we get to the NL. Surprisingly, hitters for the Nationals and Dodgers appear to have the easiest schedules in their league, despite playing in divisions better known for sharp pitching than for strong offense. Not having to face the likes of Clayton Kershaw or Stephen Strasburg can do wonders for a lineup.

The heavy-hitting Tigers are slated to face the weakest opposing pitching in the majors. While this is somewhat unfair considering they have the league’s best hitter, it is very unfair that the lowly Marlins will face the best pitchers in the league.

Projections are only predictions, and assuredly some teams will drastically outperform and others will underwhelm by season’s end. However, these data remind us that our preconceptions about who plays in an extreme park or which teams are in difficult divisions should not be overemphasized, nor should we discount the idea that some lineups or pitching staffs will have a significantly more difficult time than others. Over the course of the season, a single team will square off against almost 20 other teams in over a dozen different parks. Whatever the strength of their schedule, position players and pitchers face a wide variety of competition, and no doubt a good many will surprise us all.


Does Pitching Deep into Games Lead to More Wins?

Predicting pitcher wins is a capricious exercise, and few factors have been shown to have any correlation whatsoever with win percentage (W%). To predict wins, one should consider a pitcher’s ERA, offensive support, strength of bullpen, quality of defense behind the mound, and innings pitched (IP) in a season.

In fact, research has shown that IP and ERA are the only two factors that have a correlation above .30, and the two are very close. In a sample of pitchers from 2003-2013, the correlation for both eclipsed .40.

Obviously, pitching more games leads to more wins in a season, but many fantasy experts insist that pitching deep into games is an important part of earning a win as well. The theory, which I’ve seen taken for granted by experts at ESPN, CBS, Baseball Prospectus, and Rotographs, is that a starting pitcher who pitches into the 8th or 9th inning and leaves with a lead intact is more likely to be credited with the W.

However, to earn a win a starter must pitch only 5 innings. Since we know that starters are often less effective after 75 pitches or so, pulling a pitcher early and relying on a fresh bullpen that is at least league average should, in theory, be more effective than keeping a starter in the game. Dave Cameron articulated this point when creating a gameplan for the Pirates’ all-important play-in game in October 2013, when he suggested Liriano be pulled after only 3 innings. The chart below reinforces the obvious point that, except for walk rate, relievers generally eclipse starters in most skill metrics.

Figure 1

In 2013 Shelby Miller started 31 games and came away with the W a total of 15 times, earning a W% that ranked 22nd in the majors, right behind Clayton Kershaw and Anibal Sanchez. That’s impressive, but also consider that the innings-limited rookie pitched an average of 5.5 innings per start. He only racked up 13 quality starts (QS), which ranked 86th in the league. A QS, after all, requires at least 6 innings of work with no more than 3 earned runs allowed, which works out to an ERA of 4.50 or better for the game.
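As a quick check on the quality start threshold, the sketch below encodes the standard definition (at least 6 innings, no more than 3 earned runs) and shows that the worst possible QS line corresponds to a 4.50 ERA for that game. The function names are my own, and innings are assumed to be true decimals (5.5 means five and a half innings, not the box-score shorthand for five and two-thirds).

```python
def is_quality_start(innings_pitched, earned_runs):
    """Standard quality start: at least 6 innings and no more than 3 earned runs."""
    return innings_pitched >= 6 and earned_runs <= 3


def game_era(innings_pitched, earned_runs):
    """Earned runs allowed per 9 innings for a single outing."""
    return 9 * earned_runs / innings_pitched


# The minimum qualifying line: 6 innings, 3 earned runs.
print(is_quality_start(6, 3), game_era(6, 3))
# A scoreless 5.5-inning outing can still earn a W but is not a QS.
print(is_quality_start(5.5, 0))
```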

Why, then, are innings pitched per start (IP/GS) so important, relatively, when considering W%? I hypothesized that pitchers who are given the leeway to pitch deep into games, and hence give their bullpen a rest, were generally better at run prevention than their peers, i.e. sported a lower ERA.

In healthcare research, where we don’t write particularly well, we love simple diagrams to explain hypothesized effects. Below is a diagram showing how one might view the relationship between various factors like ERA, IP, defense, offensive support and bullpen ERA. The perceived link between IP/GS and Pitcher Wins is confounded by ERA, which has an effect on both factors.

DAG

Pitch Efficiency

Before examining the theory that ERA accounts for the correlation between IP and W%, let’s look at another possible explanation. Perhaps pitch efficiency is the key. Jordan Zimmermann was the 3rd most efficient starter (14.5 P/IP) in the majors last year, and was tied for the 8th highest W% (.68). However, the table below shows the correlation between W per game started (GS) and P/IP, ERA, and IP/GS among starters between 2009-2013:

 

W% and…    R²
ERA        0.39
IP/GS      0.36
P/IP       0.08

While ERA and IP/GS appear to be almost equally correlated, the squared correlation coefficient for P/IP was negligible at .08. Variance in pitch efficiency has little to do with variance in W%.
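For readers who want to reproduce this kind of comparison, here is a minimal sketch of the squared correlation coefficient in plain Python. The ERA and W% values are made up for illustration and are not the 2009-2013 sample used above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of the variance in ys associated with xs (squared Pearson r)."""
    return pearson_r(xs, ys) ** 2


# ILLUSTRATIVE DATA ONLY: hypothetical pitcher seasons, not real results.
era = [2.8, 3.2, 3.6, 4.0, 4.5, 5.0]
win_pct = [0.68, 0.60, 0.55, 0.52, 0.45, 0.40]

print(f"r = {pearson_r(era, win_pct):.2f}, R^2 = {r_squared(era, win_pct):.2f}")
```

Note that squaring throws away the sign: ERA is negatively related to W%, but its R² is still reported as a positive number.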

IP/GS: How to Measure a Confounder

There are 2 straightforward ways to determine if the relationship between 2 variables is actually being skewed by a third factor, in this case ERA. The first is to stratify the sample by ERA and see if the relationship between IP/GS and W% still stands. If ERA is not a confounder, we would expect the correlation within each tier to remain relatively stable. As we can see in the chart below, the correlation follows no clear trend across tiers.

Figure 3

Interestingly, only the best tier of pitchers, those with an ERA less than 3.65, show any discernible relationship between W% and IP/GS, supporting the theory that those starters who have demonstrated a strong ability to prevent runs are given the chance to pitch more innings.  Among more middling pitchers, the relationship between pitching deep into games and W% is negligible.
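The stratified check can be sketched in a few lines: group pitchers into ERA tiers, then compute the IP/GS-to-W% correlation separately inside each tier. Only the 3.65 cutoff comes from the analysis above. The other tier boundary, the function names, and the pitcher data are assumptions made purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def tier_label(era):
    """Assign an ERA tier. Only the 3.65 cutoff is from the article; 4.50 is assumed."""
    if era < 3.65:
        return "under 3.65"
    if era < 4.50:
        return "3.65 to 4.49"
    return "4.50 and up"


def correlation_by_tier(pitchers):
    """pitchers: list of (era, ip_per_gs, win_pct). Returns {tier: r of IP/GS vs W%}."""
    tiers = {}
    for era, ip_gs, win_pct in pitchers:
        tiers.setdefault(tier_label(era), []).append((ip_gs, win_pct))
    return {
        label: pearson_r([row[0] for row in rows], [row[1] for row in rows])
        for label, rows in tiers.items()
        if len(rows) >= 3  # too few pitchers makes the correlation meaningless
    }


# ILLUSTRATIVE DATA ONLY: hypothetical pitcher seasons.
sample = [
    (2.90, 6.8, 0.70), (3.20, 6.5, 0.62), (3.50, 6.2, 0.58), (3.60, 6.6, 0.66),
    (3.80, 6.1, 0.50), (4.00, 5.9, 0.55), (4.20, 6.3, 0.48), (4.40, 6.0, 0.52),
    (4.60, 5.6, 0.42), (4.90, 5.8, 0.40), (5.20, 5.4, 0.44),
]
by_tier = correlation_by_tier(sample)
for label, r in by_tier.items():
    print(f"{label}: r = {r:.2f}")
```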

The second way to measure confounding is using a regression model. If you create a model examining how factor X predicts factor Y, introducing factor Z should not change the coefficient for X by more than 10% if Z does not have a strong pull on the relationship. For example, if we run a model that shows that smoking doubles your chance of getting lung cancer, then introducing tea drinking into the equation should not really change that smoking-lung cancer connection by more than 10%, unless we believe that drinking tea can also affect lung cancer and/or smoking.
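Here is a toy version of that change-in-estimate check, using closed-form least squares so it runs without any libraries. The data are constructed by hand so that Y depends on both X and Z, and X is correlated with Z. Nothing here is baseball data, and the function names are my own.

```python
def _centered_sum(a, b):
    """Sum of products of deviations from the mean (n times the covariance)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))


def crude_slope(x, y):
    """Coefficient on x in the one-variable model y ~ x."""
    return _centered_sum(x, y) / _centered_sum(x, x)


def adjusted_slope(x, z, y):
    """Coefficient on x in the two-variable model y ~ x + z."""
    sxx, szz, sxz = _centered_sum(x, x), _centered_sum(z, z), _centered_sum(x, z)
    sxy, szy = _centered_sum(x, y), _centered_sum(z, y)
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)


def percent_change(crude, adjusted):
    """Relative change in the coefficient after adding the third variable."""
    return abs(adjusted - crude) / abs(crude) * 100


# TOY DATA: z drives both x and y, so it confounds the x-y relationship.
z = [0, 0, 1, 1, 2, 2]
x = [-0.5, 0.5, 0.5, 1.5, 1.5, 2.5]          # x = z plus an independent wiggle
y = [xi + 2 * zi for xi, zi in zip(x, z)]    # true effect of x on y is exactly 1

crude = crude_slope(x, y)
adjusted = adjusted_slope(x, z, y)
change = percent_change(crude, adjusted)
is_confounded = change > 10  # the 10% rule of thumb described above
print(f"crude = {crude:.2f}, adjusted = {adjusted:.2f}, change = {change:.0f}%")
```

The one-variable model credits X with more than twice its true effect because it is silently absorbing the effect of Z. Adding Z recovers the true coefficient of 1.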

I’m with MGL that regression is often unnecessary in baseball research, as its results can be difficult to interpret and unnecessarily complicated. I might add that even simple linear regression rests on a series of assumptions that are not always met. With that caveat, the data in this sample are normally distributed and I kept the model as simple as possible. Model 1 examines the relationship between W% and IP/GS. Model 2 adds a third variable, ERA.

Model      Parameter    Coefficient (%)    P-Value
Model 1    IP/GS        11.13              <.01
Model 2    IP/GS        5.71               <.01
Model 2    ERA          -4.77              <.01

All results are statistically significant. Model 1 indicates that for each 1-inning increase in IP/GS, we would expect an 11% increase in W%. Once we control for ERA, the relationship weakens considerably: each 1-inning increase in IP/GS corresponds to an expected increase in W% of only about 6%. The new coefficient, .057, is roughly half of .111, a change far greater than 10%, so we can safely conclude that ERA is confounding this relationship, just as we found in the stratified analysis above.

Predicting Wins?

Here at FanGraphs we might mock the idea of pitcher wins, since they are mostly a byproduct of an era when pitchers did pitch deep into games and bullpens were not utilized as often or as effectively. However, when it comes to predicting wins, Will Larson has shown that projection systems like Steamer and CAIRO do a pretty good job, and are on average within 3.5-4 wins of the actual end-of-season results.
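One way to read "within 3.5-4 wins on average" is as a mean absolute error, though that is my assumption and not necessarily the exact metric Larson used. A minimal sketch, with win totals invented for illustration:

```python
def mean_absolute_error(projected, actual):
    """Average size of the miss between projected and actual values."""
    if len(projected) != len(actual):
        raise ValueError("projected and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(projected)


# ILLUSTRATIVE DATA ONLY: hypothetical projected and actual win totals.
projected_wins = [14, 12, 10, 9, 13]
actual_wins = [17, 8, 11, 5, 15]

mae = mean_absolute_error(projected_wins, actual_wins)
print(f"Average miss: {mae:.1f} wins")
```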

In fact, projection systems across the board are better at capturing player-to-player variation (ranking players) in counting statistics like W and strikeouts than in rate stats like ERA and WHIP.
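How well a system ranks players can be measured with a rank correlation. The sketch below uses Spearman's rho, which is my choice of metric for illustration and may not be the one behind the comparison above. The strikeout totals are invented.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# ILLUSTRATIVE DATA ONLY: hypothetical projected and actual strikeout totals.
projected_k = [220, 180, 150, 200]
actual_k = [210, 190, 140, 230]
rho = spearman_rho(projected_k, actual_k)
print(f"Spearman rho = {rho:.2f}")
```

A rho near 1 means the projection put the players in nearly the right order, even if the individual totals were off.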

Figure 4

While I have previously shown that QS correlate much better than W with pretty much every measure of pitcher skill we have, W% is still somewhat predictable. As long as we have yet to #killthewin, we might as well keep trying to forecast the future.