Archive for Research

J.D. Martinez: Market Value and 2018 Projections

J.D. Martinez had another great year in 2017. With 3.9 sWAR[1] and a .430 wOBA, he was once again well above average. By wOBA, he has been a consistent offensive contributor every year since 2014. He does carry some defensive shortcomings, yet he is an excellent asset in any lineup.

Over the past three years he has reached base at an above-average rate (.364 OBP) while posting an excellent .289 ISO and a .587 SLG. He does carry a career strikeout rate of roughly 25%, but as long as he keeps producing the way he has, he should make an impact in any organization.

In 2018[2], J.D. is projected for a lower wOBA (.395). OPS and ISO are also projected to decline from their 2017 levels (to .955 and .293, respectively); nevertheless, J.D. should still perform as a top-caliber player.

Please find J.D.’s 2018 projections in the table below.

2018 Projections: J.D. Martinez
YEAR AGE sWAR wOBA OBP SLG OPS ISO AVG K% BB%
2015 28 4.7 0.372 0.344 0.535 0.879 0.253 0.282 27.1% 8.1%
2016 29 2.0 0.384 0.373 0.535 0.908 0.228 0.307 24.8% 9.5%
2017 30 3.9 0.430 0.376 0.690 1.066 0.387 0.303 26.2% 10.8%
2018 31 3.6 0.395 0.365 0.591 0.955 0.293 0.298 26.0% 9.6%

Projections: “SEG Projection System” (Including sWAR for 2015-2018)

sWAR = “SEG Projection System” calculation of WAR  

J.D. Martinez’s estimated AAV is around $27M, based on a five-year/$135M contract. J.D. is projected for 14.6 sWAR for the next five years.

Market Value: J.D. Martinez

YEAR AGE sWAR Value $WAR
2018 31 3.6 $30.6 $8.4
2019 32 3.5 $30.7 $8.8
2020 33 3.0 $27.5 $9.2
2021 34 2.5 $24.2 $9.7
2022 35 2.0 $20.3 $10.2
TOTAL 14.6 $133.4

sWAR = “SEG Projection System” calculation of WAR 

$WAR: Adjusted for Inflation (5% per year)
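
The Value column in the table above can be approximated as each season's projected sWAR multiplied by that season's price per win. The following is a minimal sketch, assuming a 2018 price of $8.4M per win that grows 5% per year (both read off the table). The function name is illustrative and is not part of the SEG Projection System, and small differences from the published values are expected because the table's inputs are rounded.

```python
def market_value(swar_by_year, base_dollars_per_war=8.4, inflation=0.05):
    """Return (per-year values, total) in $M for a list of projected sWAR."""
    values = []
    for i, swar in enumerate(swar_by_year):
        # Price of one win in year i, in $M, after compounding inflation
        price = base_dollars_per_war * (1 + inflation) ** i
        values.append(swar * price)
    return values, sum(values)

# J.D. Martinez's projected sWAR for 2018-2022, taken from the table above
values, total = market_value([3.6, 3.5, 3.0, 2.5, 2.0])
print([round(v, 1) for v in values], round(total, 1))
```

The total lands within a few tenths of a million dollars of the table's $133.4M.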

[1] sWAR = “SEG Projection System” calculation of WAR

[2] 2018 Projections: JD Martinez (SEG Projection System)


Eric Hosmer: Market Value and 2018 Projections

Hosmer certainly had his best season so far in 2017, with 4.0 sWAR[1] and a .376 wOBA. Consistency, however, has not been there: his offensive output has fluctuated over the past three years, and the same can be said of his entire career. Looking at his offensive contribution, he seems to have a “quality” season every other year. Nonetheless, Hosmer has reached base at an above-average rate (.359 OBP) over the past three seasons, and he has struck out at an average rate of 17.2% over the same period.

Moving forward, Hosmer’s offensive output is projected[2] to decline slightly in 2018. As previously mentioned, consistency is not his strength, and this should be reflected in his overall contribution next year. A decline in wOBA (to .351) from last year, alongside a higher K% (17.1%), will negatively impact his sWAR (2.6) in 2018.

Below you can find a detailed 2018 projection.

2018 Projections: Eric Hosmer
YEAR AGE sWAR wOBA OBP SLG OPS ISO AVG K% BB%
2015 25 2.7 0.355 0.363 0.459 0.822 0.162 0.297 16.2% 9.1%
2016 26 0.2 0.326 0.328 0.433 0.761 0.167 0.266 19.8% 8.5%
2017 27 4.0 0.376 0.385 0.498 0.883 0.180 0.318 15.5% 9.8%
2018 28 2.6 0.351 0.359 0.467 0.825 0.173 0.294 17.1% 9.2%

Projections: “SEG Projection System” (Including sWAR for 2015-2018)

sWAR = “SEG Projection System” calculation of WAR

Eric Hosmer’s estimated AAV is $21M, based on a five-year/$105M contract. He should be worth about 11.5 sWAR over the next five seasons. There has been a lot of noise regarding dollar amount and duration of contract. Going up to a seven-year agreement, he should be worth no more than $124M.

Market Value: Eric Hosmer

YEAR AGE sWAR Value $WAR
2018 28 2.6 $21.8 $8.4
2019 29 2.6 $22.9 $8.8
2020 30 2.6 $23.9 $9.2
2021 31 2.1 $20.4 $9.7
2022 32 1.6 $16.3 $10.2
2023 33 1.1 $11.8 $10.7
2024 34 0.6 $6.7 $11.2
TOTAL 13.2 $123.8

 

sWAR = “SEG Projection System” calculation of WAR 

$WAR: Adjusted for Inflation (5% per year)

 

[1] sWAR = “SEG Projection System” calculation of WAR

[2] 2018 Projections: Eric Hosmer (SEG Projection System)


Lorenzo Cain: Market Value and 2018 Projections

After a strong 2017 (.347 wOBA, 4.1 sWAR[1]), Lorenzo Cain is one of the top remaining free agents. He is a plus center fielder, and defense is one of his greatest assets. On the other hand, his durability is a big question: he has played more than 140 games in a season only once (2017). Injuries and age are both substantial concerns moving forward.

If he can stay healthy for at least 130 games in 2018, Cain is projected[2] to reach base at an above-average rate (.356 OBP). Based on the projections, Cain should see a slight increase in both SLG and ISO from last year. Nonetheless, his wOBA is projected to fall (to .330) as his K% rises (to 16.9%). This overall decrease in offensive output lowers Cain’s projected sWAR to 3.7 for 2018.

2018 Projections: Lorenzo Cain
YEAR AGE sWAR wOBA OBP SLG OPS ISO AVG K% BB%
2015 29 5.5 0.360 0.361 0.477 0.838 0.170 0.307 16.2% 6.1%
2016 30 2.7 0.322 0.339 0.408 0.747 0.121 0.287 19.4% 7.1%
2017 31 4.1 0.347 0.363 0.440 0.803 0.140 0.300 15.5% 8.4%
2018 32 3.7 0.330 0.356 0.443 0.798 0.145 0.298 16.9% 7.4%

Projections: “SEG Projection System” (Including sWAR for 2015-2018)

sWAR = “SEG Projection System” calculation of WAR  

Lorenzo Cain’s estimated AAV is around $21M per year, based on a four-year/$84M contract. He should be worth about 9.6 sWAR over the next three years. Staying healthy is crucial; as long as his speed does not drop dramatically, he should be able to contribute significantly for the next two to three seasons.

Market Value: Lorenzo Cain
YEAR AGE sWAR Value $WAR
2018 32 3.7 $31.2 $8.4
2019 33 3.2 $28.3 $8.8
2020 34 2.7 $24.9 $9.2
TOTAL 9.6 $84.4  
sWAR = “SEG Projection System” calculation of WAR 
$WAR: Adjusted for Inflation (5% per year)

[1] sWAR = “SEG Projection System” calculation of WAR

[2] 2018 Projections: Lorenzo Cain (SEG Projection System)


On $/WAR, Its Linearity, and Efficient Free-Agent Contracts

The holiday season has come and gone, but fear not: the offseason, the most wonderful time of the year, is still here! Though the “hot” stove has been anything but, it’s still a great time to discuss one of the more popular tools in sabermetrics for evaluating free agent contracts: $/WAR. Love it or hate it, $/WAR is useful if applied properly. It can reveal quite a bit about the state of the free agent market, as well as where the market might be headed. So, let’s jump in like Bartolo Colon doing a cannonball.

On the Calculation of $/WAR

The concept of $/WAR, or as it is otherwise known, “The Cost of a Win,” is simple enough to grasp: MLB teams treat players as bundles of WAR to be had in exchange for money. The unit price of 1 WAR is the cost of a win, or $/WAR.

That’s $/WAR in simplest terms, but the strict calculation is a little trickier, largely because people disagree about how it should be calculated. For example, Dave Cameron used a simple projection of free agents’ true-talent WAR to calculate $/WAR in his series on Win Values, but Matt Swartz (who has written a wealth of articles on $/WAR that I highly recommend) prefers to use retrospective WAR values to determine the cost of a win. In other words, Cameron’s method measures how much teams paid for the production they thought they were buying, while Swartz’s measures how much they paid for the production they actually received.

So which method to use? I personally prefer Cameron’s method, largely because I think teams are paying for the production they expect to get, without any certainty that they will get it.

For this article, I used the Marcel projection system to project fWAR over the length of the contract for every MLB free agent who signed from 2006 through New Year’s Eve 2017, with a modified aging curve based on the one used by the FanGraphs Contract Estimation Tool. I then divided the total monetary value of each contract by the total projected fWAR to get $/WAR. These projections are hardly precise or representative of what teams think a free agent will produce, but they’re good enough to give a rough idea of a player’s production over a contract.
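
To make the arithmetic concrete, here is a rough sketch of the calculation for a single contract. It does not implement Marcel or the FanGraphs aging curve: the aging rule (hold steady before age 30, then lose 0.5 WAR per year) is a placeholder assumption, and the function names and the sample contract are hypothetical.

```python
def project_contract_war(first_year_war, age, years, decline_after_30=0.5):
    """Sum projected WAR over a contract using a placeholder aging curve."""
    total, war = 0.0, first_year_war
    for season in range(years):
        total += war
        # Placeholder aging: once the player is 30 or older in a season,
        # the next season's projection drops by `decline_after_30` WAR.
        if age + season >= 30:
            war -= decline_after_30
    return total

def dollars_per_war(contract_value_m, projected_war):
    """Contract dollars ($M) divided by projected WAR."""
    if projected_war <= 0:
        raise ValueError("projected WAR must be positive")
    return contract_value_m / projected_war

# Hypothetical free agent: 3.0 WAR at age 31, four years, $80M
war = project_contract_war(3.0, age=31, years=4)
print(war, round(dollars_per_war(80.0, war), 2))
```

Note that the division is dollars over wins, so a lower number means a cheaper win.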

On the Linearity of $/WAR

For those unfamiliar with the metric, $/WAR might seem flawed in that it assumes a linear value of $/WAR. It seems unintuitive that a 6 WAR player will cost only twice as much as a 3 WAR player on the free agent market — after all, since 6 WAR players are more scarce than 3 WAR players, it would seem logical that teams would have to pay more for 6 WAR players. Practically, however, this hasn’t been the case.

This is the roughest implementation of a $/WAR scatterplot, but even then, a strictly linear plot emerges. Teams giving out contracts above the line are overpaying based on $/WAR, and teams below are getting a good deal.
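
One simple way to draw such a line is a least-squares fit of contract value on projected WAR. The sketch below forces the line through the origin (zero projected WAR costs zero dollars), which is a simplifying assumption and not necessarily how the chart was fit; the contracts listed are hypothetical.

```python
def fit_dollars_per_war(contracts):
    """Least-squares slope of contract value on projected WAR, through the origin."""
    num = sum(war * value for war, value in contracts)
    den = sum(war * war for war, _ in contracts)
    return num / den

# Hypothetical (projected WAR, contract value in $M) pairs
contracts = [(2.0, 12.0), (5.0, 28.0), (9.0, 50.0), (14.0, 80.0)]
slope = fit_dollars_per_war(contracts)
for war, value in contracts:
    # Contracts above the fitted line cost more than the market rate implies
    label = "above the line" if value > slope * war else "at or below the line"
    print(war, value, label)
```

The fitted slope is the market-wide $/WAR, and each contract's position relative to the line indicates an overpay or a bargain under this model.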

But this $/WAR plot is missing a couple of things — for one, inflation. The purchasing power of a dollar in 2006 is not the same as it is in 2017, so we need to adjust our calculation to take that into account (after all, under the $/WAR model, teams are essentially purchasing a good just as an average American might purchase bread at the grocery store). These values will be put in terms of the value of the dollar in 2017.

We also need to account for the fact that $/WAR is dramatically different for relief pitchers than for starting pitchers or position players. Since 2006, the cost of a win has been $4.2 million for starting pitchers and $5.7 million for position players, but $10.9 million for relief pitchers. Since WAR accumulation for pitchers is based largely on innings pitched, and RPs typically throw only 50-70 innings in a year if healthy, it might be inappropriate to include RPs in our calculation of $/WAR: there is clearly a wide gap between how teams pay for production from RPs and how they pay for production from SPs and position players.

With this in mind, we can now examine the linearity of $/WAR from 2006-2017, with separate charts for SPs/hitters…

… and for RPs.

It’s blindingly obvious why I can’t lump in RPs with the rest of the FA population — RPs have a dramatically different range of projected WAR values and contract sizes, and their $/WAR slope is much steeper than that of the general population.

But in both instances, $/WAR is generally linear. When we reach the “elite player” end of the curve (the players who are being paid more for more production), there is quite a lot of variance, but on average these players are still paid the same rate per win as players elsewhere on the curve. Why is this? Perhaps it is a matter of teams not being pressed for roster space: MLB teams have 25 roster spots and nine starters, so having a single 6 WAR player gives a team only a small efficiency advantage over having two 3 WAR players. Given how few elite players are on the market at any given time, it would be difficult to quantify that advantage and how much teams pay for it, and thus the linear model works well.

If we shrank MLB’s roster size and starting lineup, perhaps we would see scarcity manifest itself, as it would become significantly more advantageous to use roster space efficiently. We can look to the NBA, which has a maximum roster size of 15 and puts only five players per team on the court at any given time. Here is the $/VORP chart for NBA free agents from 2015-2017 (VORP stands for “Value Over Replacement Player,” and if the name alone doesn’t make it obvious enough, it’s similar to WAR but for NBA players).

 

This chart is different from either of the MLB $/WAR charts that I’ve discussed thus far — notice how a majority of replacement to low-level players (0-5 VORP) fall below the $/VORP line, and a majority of middle-tier to elite players (5+ VORP) fall above the line. NBA teams are forced to overpay their best players since roster-space efficiency is more important in the NBA. But since MLB teams have an abundance of roster spaces, the consideration of roster space efficiency doesn’t affect the linear model.

On The Luxury Tax Threshold

The linear model that we’re oh-so-in-love-with might start breaking down soon. As the Cespedes Family BBQ Twitter account pointed out, very few top-tier free agents have signed so far this offseason compared to other offseasons. Only two free agents this offseason have signed contracts of $50 million or more, and only Carlos Santana has landed an AAV of $20 million or more.

Teams are far more reluctant to sign huge free agent contracts than they have been in years, partly because of the increasing prevalence of analytics and partly because of the luxury tax threshold, as Bob Nightengale noted in a column Tuesday. Teams are waiting longer and longer for big-ticket FAs to lower their prices, and as a result, we’ve had a relatively slow FA market for elite players.

As a result, we might see the linearity of $/WAR begin to fail for elite-level players. Simply put, if teams collectively are unable to pay what players feel they are owed for their production because of the luxury tax, players must lower their asking prices and accept deals that fall below the $/WAR line, meaning that the slope of $/WAR will flatten at the top of the market. We will need to see what deals players like J.D. Martinez and Yu Darvish accept to verify this effect, but it appears that we may see $/WAR fall, at the very least in 2017.

On The Efficiency of FA Contracts

$/WAR also lets us judge teams on their ability to make shrewd deals, to get the most bang for their buck, if you will. There is a market price for a win across MLB, so teams that consistently pay less than the market price are making the most of their payroll. Conversely, teams that consistently pay above the market price are making significantly less efficient use of their payroll. I’ll exclude relievers from this analysis on the basis that their contracts don’t fit well into our $/WAR model.

I’ve highlighted the five best teams at making efficient deals since 2006 in green and the five worst in red. Surprisingly, the Padres, who are rumored to be offering Eric Hosmer a seven-year contract that would make him the highest-paid player in team history, have the best history of making efficient deals based on the Marcel projection model. What is hardly surprising is that the historically sabermetrically minded Athletics make the top five, along with small-market teams like the Padres, Pirates, Rays, and Twins.

On the other end of the spectrum, the teams that have been paying the most $/WAR include the Mets, Diamondbacks, White Sox, Angels, and the Rockies. On average, since 2006, the Rockies have paid almost twice as much for a win on the free agent market as the Padres. Ouch.
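
A team-level comparison like this can be sketched by pooling each team's contracts. The version below divides total dollars by total projected WAR per team; whether the article pooled contracts this way or averaged per-contract $/WAR is not stated, so treat the pooling as an assumption. Team labels and figures are hypothetical.

```python
from collections import defaultdict

def team_dollars_per_war(contracts):
    """Total dollars divided by total projected WAR for each team."""
    dollars = defaultdict(float)
    wins = defaultdict(float)
    for team, value_m, projected_war in contracts:
        dollars[team] += value_m
        wins[team] += projected_war
    # Skip teams whose pooled projection is not positive
    return {team: dollars[team] / wins[team] for team in dollars if wins[team] > 0}

# Hypothetical contracts: (team, value in $M, projected WAR)
contracts = [("A", 30.0, 6.0), ("A", 10.0, 4.0), ("B", 45.0, 5.0)]
ranking = sorted(team_dollars_per_war(contracts).items(), key=lambda kv: kv[1])
print(ranking)  # most efficient team first
```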

I’m very careful to avoid making a blanket statement like “The Padres are the shrewdest investors in baseball,” because the Padres aren’t paying for production on the basis of my model. Instead, they’re using their own tools to determine intelligent investments, like every other front office in baseball. Every front office has its own perspective on the future production of players, but under a highly generalized model, the Padres appear to be doing a good job of investing what little money they have in free agency.

Unfortunately, smart investing can only take you so far. Baseball is inherently random, and players can suffer career-ending injuries, fall into slumps, or end up like Pablo Sandoval (Sandoval was projected for about 12.2 fWAR over the course of his contract with the Red Sox, but has instead posted -2.9 fWAR during his first three seasons). And only 98 players signed MLB free agent contracts last season, meaning that the other 652 available MLB roster slots had to be filled by other means. Still, it’s wise to play the FA market and play it efficiently — it’s tough to find wins so easily available elsewhere.


Do Fielders Commit More Errors Playing Out of Position in a Shift?

The shift has taken MLB by storm in recent years. Broadcasters love to criticize the shift, despite its numerous advantages. One potential problem the shift may cause is an increase in fielding errors, which could be a direct result of fielders playing out of their normal position. Using the shift data provided to FanGraphs courtesy of Baseball Info Solutions, as well as batted-ball data courtesy of Baseball Savant, I ran a logistic regression to estimate the likelihood of a batted ball resulting in a fielding error.

The variables included in the regression were release speed, hitter-pitcher matchup (a dummy variable with a value of 1 if the pitcher and hitter were both righties or both lefties), a runners-on-base dummy, launch speed (exit velocity), effective speed, launch angle, and dummy variables for both traditional and non-traditional shifts. The model included only batted balls hit in the infield, as the majority of shifts occur in the infield.

 

[Image: logistic regression results]

Above are the results of the logistic regression used to estimate the probability of a batted ball resulting in an error. The dependent variable is whether or not an error occurred. Two results that make logical sense are the positive coefficient on Exit Velocity (Launch Speed) and the negative coefficient on Launch Angle, both of which are significant at the 1% level. The positive coefficient on Exit Velocity shows that the harder the ball is hit, the harder it is to field. The negative coefficient on Launch Angle means that the lower the angle (a ground ball rather than a fly ball), the more likely the fielder is to commit an error. Both results are consistent with past research. The most interesting result is that both traditional and non-traditional shifts are associated with an increased likelihood of an error. Both variables were statistically significant at the 5% level, which suggests that players struggle more in the field when playing out of their normal position.
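
For readers unfamiliar with how logistic-regression coefficients translate into probabilities, the sketch below applies the standard logistic function to a single batted ball. The coefficient values and intercept are hypothetical, chosen only to match the signs described above; they are not the fitted values from this model, and the feature names are illustrative.

```python
import math

def error_probability(intercept, coefs, features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: positive for exit velocity and shift, negative for launch angle
coefs = {"launch_speed": 0.02, "launch_angle": -0.01, "shift": 0.15}
ball = {"launch_speed": 95.0, "launch_angle": -5.0, "shift": 0}

p_no_shift = error_probability(-6.0, coefs, ball)
p_shift = error_probability(-6.0, coefs, {**ball, "shift": 1})
print(round(p_no_shift, 4), round(p_shift, 4))
```

With a positive shift coefficient, the same batted ball carries a higher estimated error probability when the shift dummy is switched on, which is the pattern the regression reports.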

While teams are unlikely to change their shifting patterns (more good comes out of the shift than bad), they must take into account which fielders are worse when playing out of position.

Despite the increased probability of an error, I still believe that the positives outweigh the negatives when it comes to shifting. In future research, it would be interesting to look at this data at the minor-league level and to see whether fielders who shifted more in the minors are better prepared to field out of position in the majors.


Fastball Velocity and Its Effect on Hitters

Over the past few seasons there has been a definite trend toward harder-throwing pitchers in the big leagues. The league average fastball velocity has gone up every year for the past few years, led by hard-throwing reliever Aroldis Chapman. Whether this increase in velocity is leading to a harder time for hitters at the plate would seem to be a topic of big concern for many of these teams who are investing in these hard-throwing players. Currently we see strikeout rates increasing at a rapid pace, but at the same time a home-run surge is happening. Are hitters just swinging hard and hoping to make good contact with these faster speeds? What kinds of effects are these higher velocities having on offensive performance?

[Figure: AVG vs. pitch velocity]

Taking data from at-bat results in 2015-17, we can see how batting average changes with pitch velocity. Hitters’ average falls from close to .300 at pitch velocities around 90 mph to around .200 at velocities above 100 mph. The intuitive preconception that faster pitches are harder to hit is clearly supported by the data. Average, however, is not the be-all and end-all of hitting metrics; we can look at batting average on balls in play (BABIP) to get an idea of how hitters do when they do make contact with the faster pitches.

[Figure: BABIP vs. pitch velocity]

Here we can see the opposite effect compared to AVG. BABIP tends to increase slightly as pitch velocity goes up. This tells us that the higher speeds aren’t causing batters to make less solid contact, but they are causing hitters to miss the ball completely. In addition, the rise in BABIP at the higher end of pitch velocity suggests that when contact is made at that speed, the ball comes off the bat faster and is therefore more likely to go for a hit. This is in line with what I was taught growing up: the faster a ball gets to the plate, the faster it leaves. That would suggest, however, that a higher percentage of hits will go for home runs off Aroldis Chapman than off Bronson Arroyo, but does that happen?

[Figure: ISO vs. pitch velocity]

A look at isolated power (ISO) says that the assumption does not hold true. While the physics may appear correct in repeated lab tests, conditions are not so predictable in the real world. Clearly the decrease in solid contact at higher velocities is having a major effect on power numbers. It seems that even among the balls that go for hits, a larger share end up as singles than at lower velocities. This is another good sign for teams with hard-throwing pitchers that the money spent on them is worth it compared with a conventional pitcher.
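
The three metrics plotted above can be computed per velocity bucket from at-bat outcomes. The sketch below is illustrative only: the record format, bucket width, and sample at-bats are hypothetical, and BABIP is simplified by ignoring sacrifice flies.

```python
from collections import defaultdict

def rates_by_velocity(at_bats, bucket_mph=2):
    """Return {bucket floor in mph: (AVG, BABIP, ISO)} from at-bat records.

    Each record is (pitch_mph, outcome), with outcome one of
    "out", "strikeout", "single", "double", "triple", "home_run".
    """
    bases = {"single": 1, "double": 2, "triple": 3, "home_run": 4}
    tally = defaultdict(lambda: defaultdict(int))
    for mph, outcome in at_bats:
        bucket = int(mph // bucket_mph) * bucket_mph
        t = tally[bucket]
        t["ab"] += 1
        t[outcome] += 1
        if outcome in bases:
            t["hits"] += 1
            t["extra_bases"] += bases[outcome] - 1  # total bases minus one per hit
    result = {}
    for bucket, t in tally.items():
        in_play = t["ab"] - t["strikeout"] - t["home_run"]
        avg = t["hits"] / t["ab"]
        babip = (t["hits"] - t["home_run"]) / in_play if in_play else float("nan")
        iso = t["extra_bases"] / t["ab"]  # equivalent to SLG minus AVG
        result[bucket] = (avg, babip, iso)
    return result

# Hypothetical at-bats ending on pitches around 90 mph and 100 mph
sample = [(91.3, "single"), (90.5, "out"), (90.1, "home_run"), (91.9, "strikeout"),
          (100.2, "strikeout"), (101.0, "single"), (100.7, "out"), (100.4, "strikeout")]
result = rates_by_velocity(sample)
print(result)
```

In this toy sample the faster bucket has a lower AVG and ISO but the same BABIP, because the extra strikeouts never count as balls in play.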

The numbers presented in this article help to show statistically what was already intuitively known. Harder-throwing pitchers are harder to hit, and when they are hit, the hits are less damaging. Perhaps the one surprising conclusion was that faster pitches do not tend to result in more extra-base hits and home runs. In fact they lead to quite a bit fewer, even when looking at just balls that fall for hits. This all translates into good news for teams such as the Yankees who have invested a good amount of money in hard-throwing pitchers. Overall, although throwing this hard is most likely detrimental to the long-term health of many pitchers’ arms, I predict that with data like this coming out we will continue to see arms going the way of Chapman: hard throwers who can put up a few seasons of good numbers and can be replaced by another hard thrower when they get injured or lose velocity. Speed is an easy trick to pick up and to use, and the data here shows its effectiveness. All of that combined should lead to front offices targeting these types of hurlers for years to come.

 

(All data comes from Statcast and Pitch F/X via Baseball Savant)


A Brief Analysis of Predictive Pitching Metrics

Pitching performance can often be pretty volatile and difficult to predict. Look at Rick Porcello’s 2017 season, for example. After turning in a Cy Young-winning season in 2016, he regressed to have a below-average ERA. His ERA ballooned from 3.15 in 2016 to 4.65 in 2017.

This is where predictive pitching metrics come in. By just looking at Porcello’s ERA from 2016 it may have been hard to predict his 2017 ERA. Thus, we should use different metrics to better predict his performance.

One popular statistic for more accurately quantifying and predicting pitching performance is FIP (Fielding Independent Pitching). FIP attempts to approximate a pitcher’s performance independent of factors which the pitcher cannot directly control himself, such as his defense’s performance. For example, a good pitcher with a weak defense can induce lots of weak contact but still give up lots of runs due to his defense’s inability to successfully field a lot of balls. Additionally, luck may play a significant factor in how many runs a pitcher concedes. A pitcher may be unlucky and give up lots of bloop hits, or weakly hit balls that land away from fielders. Thus, FIP focuses on the factors that pitchers can directly control, such as strikeouts, walks, hit batsmen, and home runs.

The formula for FIP is:

FIP = (13*HR + 3*(BB + HBP) – 2*K) / IP   +   FIP constant

where HR is home runs allowed, BB is walks allowed, HBP is hit batsmen, K is strikeouts, and IP is innings pitched. FIP is scaled to ERA (Earned Run Average) by the FIP constant, and can be read the same way as ERA (i.e., lower FIP corresponds to better performance).

FIP’s formula may look complicated, but all it does is weight certain pitching statistics per inning pitched. Because a favorable FIP is one that is lower, strikeouts are weighted negatively since they contribute to favorable pitching performance, and home runs, walks, and hit batsmen are weighted positively since they contribute to unfavorable pitching performance. Home runs are weighted the most positively (at a coefficient of 13) because they are most detrimental to pitching performance and cause the most runs to be allowed.
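
The formula above translates directly into a small function. This is a minimal sketch: the season line and the FIP constant of 3.10 are placeholders, not real league values, and innings pitched should be passed as a true decimal (one-third of an inning as 0.333, not the box-score ".1").

```python
def fip(hr, bb, hbp, k, ip, fip_constant):
    """FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + FIP constant."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + fip_constant

# Hypothetical season line with a placeholder constant
example_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0, fip_constant=3.10)
print(round(example_fip, 3))
```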

Variability Between FIP and ERA

Figure 1

FIP provides an estimate of pitching performance independent of defensive performance and luck. When it is compared to ERA, the spread between the two statistics gives an estimate of how much defensive performance or luck affects the number of runs a pitcher allows. FIP and ERA can be compared by creating a distribution of FIP – ERA for yearly pitching performance. Figure 1 shows the distribution of FIP – ERA for all single-season starting pitching performances (minimum 162 innings) from 2011 to 2015, built from FanGraphs’ databases. The distribution is fairly symmetrical. The average FIP – ERA is 0.058 runs, meaning that qualified starting pitchers tend to have slightly higher FIPs than ERAs. The standard deviation is 0.498 runs, meaning that a typical starter’s FIP – ERA differs from that average by about half a run. Thus, defensive performance and luck cause a starting pitcher’s ERA to differ from what fielding-independent factors alone would suggest by about half a run.
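
The two summary numbers quoted here are the mean and standard deviation of the per-season differences. A sketch of that calculation follows, using the sample standard deviation (whether the sample or population version was used for the figures above is not stated, so that choice is an assumption) and hypothetical pitcher-seasons.

```python
from statistics import mean, stdev

def fip_minus_era_summary(seasons):
    """Mean and sample standard deviation of FIP - ERA over (FIP, ERA) pairs."""
    diffs = [fip_value - era_value for fip_value, era_value in seasons]
    return mean(diffs), stdev(diffs)

# Hypothetical (FIP, ERA) pitcher-seasons
seasons = [(3.50, 3.20), (4.10, 4.40), (3.80, 3.80), (2.90, 3.30)]
avg_diff, sd_diff = fip_minus_era_summary(seasons)
print(round(avg_diff, 3), round(sd_diff, 3))
```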

Figure 2

Figure 2 shows a distribution of FIP – ERA for all single-season relief pitching performances (minimum 50 innings) from 2011 to 2015. Like the distribution for starting pitchers, the distribution for relief pitchers is fairly symmetrical. However, the average FIP – ERA is 0.253 runs, meaning that on average qualified relief pitchers have noticeably higher FIPs than ERAs. A possible reason is that relief pitchers often throw harder than starters and can induce weaker contact from hitters, allowing the defense to convert more balls in play into outs than it normally would. Additionally, the standard deviation is 0.734 runs, meaning that a typical reliever’s FIP – ERA differs from that average by about three-quarters of a run. Thus, defensive performance and luck cause a relief pitcher’s ERA to differ from what fielding-independent factors alone would suggest by close to one run.

Predicting Future Pitching Performance

FIP is also useful in that it can help predict future pitching performance. Since the fielding-independent statistics that FIP uses in its formula (strikeouts, home runs, walks, hit batsmen) tend to stay more constant year to year than ERA, FIP tends to be more consistent than ERA year to year. Thus, because it varies less, it can be a better estimator of future pitching performance.

Figure 3

Figure 4

To determine how well ERA and FIP predict future pitching performance, I obtained the pitching statistics for the 50 pitchers who pitched at least 162 innings in both 2014 and 2015. 2014 ERA and FIP are tested to see how well they predict 2015 ERA by looking at their correlation with 2015 ERA. Figure 3 tests how well 2014 ERA predicts 2015 ERA. There is a moderate, positive, linear relationship with a correlation coefficient of 0.382. Thus, it can be said that 2014 ERA is a moderately accurate predictor of 2015 ERA. Figure 4 demonstrates how well 2014 FIP predicts 2015 ERA. There is also a moderate, positive, linear relationship, but the correlation coefficient is higher at 0.462. Thus, there is a stronger relationship between 2014 FIP and 2015 ERA, and it can be said that 2014 FIP is a better predictor of 2015 ERA than 2014 ERA is.
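
The correlation coefficient used in these comparisons is the Pearson r between one season's metric and the next season's ERA. A self-contained sketch is shown below; the four pitchers' values are hypothetical and far fewer than the 50 used above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical 2014 FIP and 2015 ERA for the same four pitchers
fip_2014 = [3.0, 3.5, 4.0, 4.5]
era_2015 = [3.2, 3.1, 4.3, 4.0]
r = pearson_r(fip_2014, era_2015)
print(round(r, 3))
```

Running the same function with 2014 ERA, xFIP, or SIERA in place of `fip_2014` gives the other comparisons.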

However, FIP is not the only fielding-independent statistic that is commonly used. xFIP is a variant of FIP that replaces a pitcher’s home runs allowed with his fly balls allowed multiplied by the league home-run-per-fly-ball rate. The logic behind this is that the number of fly balls a pitcher gives up is a strong indicator of how many home runs he will give up in the future, an even better indicator than home runs themselves. The formula for xFIP is:

xFIP = (13*(Fly Balls*League Home Run per Fly Ball Rate) + 3*(BB + HBP) – 2*K) / IP   +   FIP constant
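
As with FIP, the formula above can be written as a short function in which expected home runs stand in for actual home runs. The inputs below, including the 10% league home-run-per-fly-ball rate and the 3.10 constant, are placeholders rather than real league values.

```python
def xfip(fly_balls, league_hr_per_fb, bb, hbp, k, ip, fip_constant):
    """xFIP: FIP with home runs replaced by fly balls times league HR/FB rate."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + fip_constant

# Hypothetical season line: 200 fly balls at a placeholder 10% league HR/FB rate
example_xfip = xfip(fly_balls=200, league_hr_per_fb=0.10, bb=50, hbp=5,
                    k=200, ip=200.0, fip_constant=3.10)
print(round(example_xfip, 3))
```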

Figure 5

Figure 5 demonstrates the relationship between 2014 xFIP and 2015 ERA. As with the relationships above, there is a moderate, positive, linear relationship, but with an even higher correlation coefficient of 0.520. Thus, of ERA, FIP, and xFIP, xFIP is the strongest predictor of the following season’s ERA in this sample.

Figure 6

Skill-Interactive ERA, abbreviated as SIERA, is another fielding-independent statistic. It is a variant of xFIP, but it accounts for various factors that make xFIP less accurate. For example, each walk given up by a pitcher is less detrimental if he generally walks few batters, whereas each walk given up by a pitcher is more detrimental if he generally walks more batters. Thus, SIERA takes this into account. The complete formula of SIERA can be viewed here. Figure 6 shows the relationship between 2014 SIERA and 2015 ERA. There is a moderate, positive, linear relationship with a correlation coefficient of 0.517. This is almost the same as xFIP’s correlation coefficient with 2015 ERA, which was 0.520. Overall, there is likely not a very significant difference in predicting ERA using SIERA or xFIP, but this assertion can be better tested through obtaining more data.

Conclusion

What can be concluded from this piece is how much defensive performance and luck can alter a pitcher’s ERA, and which statistics should be used to predict future performance for pitchers. On average, defensive performance and luck account for about half a run of variation in a starting pitcher’s ERA, and about one run of variation in a relief pitcher’s ERA. Additionally, the statistics that are most effective in predicting future pitching performance are xFIP and SIERA.

Acknowledgments

I want to thank my AP Statistics teacher, Ms. Rachel Congress, for teaching me a lot of the material about statistics that I applied in this paper.

Bibliography

DuPaul, Glenn. “Occam’s Razor and Pitching Statistics.” The Hardball Times. FanGraphs, 26 Sept. 2012. Web. 24 May 2016.

“Fielding Independent Pitching (FIP) Added to Baseball-Reference.com » Sports Reference.” Sports Reference RSS. Sports Reference, 17 Apr. 2014. Web. 24 May 2016.

“A Guide to Sabermetric Research.” Society for American Baseball Research. Society for American Baseball Research, n.d. Web. 24 May 2016.

McCracken, Voros. “Baseball Prospectus | Pitching and Defense.” Baseball Prospectus. N.p., 23 Jan. 2001. Web. 24 May 2016.

Petti, Bill. “How Teams Can Get the Most Out of Analytics.” The Hardball Times. FanGraphs, 27 Jan. 2015. Web. 24 May 2016.

Sawchik, Travis. Big Data Baseball: Math, Miracles, and the End of a 20-year Losing Streak. New York: Flatiron, 2015. Print.

Swartz, Matt. “New SIERA, Part Three (of Five): Differences Between XFIPs and SIERAs.” Baseball Statistics and Analysis. N.p., 20 July 2011. Web. 24 May 2016.

Swartz, Matt. “New SIERA, Part Two (of Five): Unlocking Underrated Pitching Skills.” Baseball Statistics and Analysis. N.p., 19 July 2011. Web. 24 May 2016.


On Jake Arrieta, Aaron Slegers, and Extreme Release Points

Jake Arrieta turning himself from a Baltimore castoff to a Chicago Cy Young Award winner was a fascinating thing to watch, especially considering how it happened. This wasn’t just a guy who benefited from a change of scenery. When Arrieta adopted a new look, it was much more than his jersey color that changed.

The alterations were covered in a great 2014 Jeff Sullivan article titled Building Jake Arrieta. Among the things noted in that piece was his new release point, which was primarily the result of pitching from the third-base side of the rubber.

Sullivan noted changes in Arrieta’s delivery yet again this May, pointing out an even more extreme horizontal release point in a piece titled Jake Arrieta Has Not Been Good. How extreme? Well, he’s throwing like a giant. No, not the kind that play in San Francisco. Arrieta has achieved nearly the exact same release point as Minnesota Twins pitcher Aaron Slegers, who at 6-foot-10 is one of the tallest hurlers to ever grace the mound.

Among the 562 right-handed pitchers Baseball Savant has data on from 2017, only three averaged a release point of at least 6.2 feet vertically and 3.3 feet horizontally: Arrieta, Slegers, and Brewers reliever Taylor Jungmann. Jungmann only threw 0.2 innings for Milwaukee last season, so there’s not much to unpack there. Below is the release point chart for Arrieta, courtesy of Baseball Savant:

And here is the chart for Slegers:

And finally, below is a graph showing how Arrieta’s horizontal release point has evolved over his career. You can see the dramatic dip in his first full season with Chicago in 2014. Things leveled out somewhat from there to 2016, but then there’s another noticeable dive last season.

Arrieta’s horizontal release point was farther toward third base than 98.6 percent of right-handed pitchers last year. It’s easy to see why a pitcher would want to create a unique look, as hitters aren’t accustomed to picking up a ball from that point, but how much does that really matter? Well, by the sound of this Francisco Cervelli quote from an MLB.com article in October 2015, I’m guessing it matters a lot.

“What makes him so tough is he throws the ball from the shortstop,” Cervelli said. “He’s supposed to throw straight. It should be illegal.”

Given Arrieta’s struggles, however, you can’t help but wonder if maybe he has taken this too far. He hit a career-high 10 batters and led the league in wild pitches for the second straight season. Coming into 2017, Arrieta had averaged just 6.2 H/9 and 0.5 HR/9 as a Cub. Last year, those numbers ballooned to 8.0 H/9 and 1.2 HR/9. His quality of pitch average also dipped from a score of 5.31 over his first three seasons with the Cubs to 4.98 last year.

The free agent market has been slow to get moving, but you’d have to figure things will start to pick up once the calendar turns over to 2018. It’ll be interesting to see if Arrieta’s new team tries to tweak some things with his mechanics. If nothing else, he’s shown a great openness to experiment.

Arrieta used his feet to get his arm into an angle that only a much taller pitcher should be able to achieve. Is it possible another set of eyes could get him pointed back in the right direction in 2018?

Tom Froemming is a contributor at Twins Daily and co-author of the 2018 Minnesota Twins Prospect Handbook.


On Drew Smyly, Michael Pineda, and the History of Signing Injured Free-Agent Pitchers

About 12 hours apart, news of two very similar moves broke out of Chicago and Minnesota, as the Cubs agreed to terms with Drew Smyly while the Twins signed Michael Pineda. Both pitchers inked two-year deals with $10-million guarantees and additional incentives based on innings pitched, but the two deals shared an even more important similarity: both pitchers underwent Tommy John surgery this summer and seem unlikely to contribute significantly during the 2018 campaign. Both clubs are clearly betting on a return to health and productivity in 2019 for the two still relatively young pitchers, as evidenced by the financial distribution of the contracts. Pineda is only owed $2 million for the upcoming season but will receive $8 million in 2019, while Smyly will be paid $3 million next year but will pull in $7 million the following year. Since both pitchers underwent surgery around the same time, during the middle of the summer, it seems unlikely that either will throw a pitch in the coming season.

While uncommon, these types of deals certainly aren’t entirely unprecedented. The Kansas City Royals have inked three pitchers in similar situations over the past few years, with varying degrees of success. These contracts, given to Luke Hochevar and Kris Medlen in 2015 and Mike Minor the following season, seem to represent the most relevant examples of such a deal. While Minor was non-tendered by the Braves following repeated shoulder issues, both Medlen and Hochevar underwent Tommy John surgery the previous year. All three pitchers would appear for the Royals in the major leagues over the life of their deals, albeit with differing results. Hochevar would appear in 89 games for the Royals and accumulate only marginal value, as he posted a FIP around 4.00 and tallied only 0.3 WAR combined before undergoing surgery for thoracic outlet syndrome. Kansas City declined its option on Hochevar last winter; he became a free agent and sat out 2017 recovering.

Medlen would also return to pitch in 2015, making eight starts and seven relief appearances for Kansas City. He saw an uptick in walks and a downturn in strikeouts compared to his previous work, but overall pitched his way to a 4.01 ERA with similar peripherals and rang up half a win of value. 2016, however, would not be so kind to Medlen, as he was shelled to the tune of a 7.77 ERA while walking more batters than he struck out and battling a shoulder injury. He would sign a minor-league deal with the Braves after the season, but would not return to the majors. Although he did not appear with the Royals in 2016 after struggling in AAA, Minor marks the largest success story of the three. Over 65 relief appearances, Minor registered a 2.62 FIP and was worth 2.1 WAR out of the bullpen. He recently signed a three-year contract with the Rangers to return to a starting role.

In total, the Royals invested $25.75 million in the three pitchers and saw them accumulate a grand total of 2.9 WAR, with most of it coming from Minor. This works out to a $/WAR figure of $8.88 million per win, which is slightly higher than the $8 million per win value assumed of the free-agent market. Based on these three deals, it would appear that this type of signing is not a bargain, but rather an overpay on average. However, it isn’t fair to make such an assumption without looking at a larger sample of data. If we classify a similar deal as one in which a team signed a pitcher who was injured at the time of the signing, was expected to miss at least part of the following season, and signed either a major-league deal or a two-year minor-league pact, that leaves us with 18 similar signings since 2007. One of these signings, Nate Eovaldi, has yet to return from his injury but should in 2018, so we won’t include him in the sample.

These 17 signings correspond to 25 player seasons following injury, with 24 of those representing guaranteed contract years, as well as one option year (Joakim Soria, 2015). The breakdown of these player seasons by games, innings pitched, strikeouts, walks, earned runs, and WAR is presented in the table below:

Measure G IP K BB ER WAR
Total 447 725.2 606 246 347 6.9
Mean 18 29 24 10 14 0.27
Median 7 20 15 6 10 0
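
As a quick check on the totals row, the rate statistics for the group (ERA, K/9, BB/9) can be recomputed directly from the table. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the 725.2 in the IP column uses standard innings notation, where the digit after the decimal point counts outs (so 725.2 means 725 and two-thirds innings).

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert innings notation ('725.2' = 725 innings plus 2 outs) to a number.

    Assumption: the digit after the decimal point is a count of outs (0, 1, or 2),
    not a decimal fraction.
    """
    whole, _, outs_text = ip_notation.partition(".")
    outs = int(outs_text) if outs_text else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {ip_notation!r}")
    return int(whole) + outs / 3


def per_nine(count: float, innings: float) -> float:
    """Scale a counting stat to a per-nine-innings rate."""
    return 9 * count / innings


# Totals row from the table above.
totals = {"IP": "725.2", "K": 606, "BB": 246, "ER": 347}

innings = innings_to_float(totals["IP"])
era = per_nine(totals["ER"], innings)
k9 = per_nine(totals["K"], innings)
bb9 = per_nine(totals["BB"], innings)

print(f"ERA {era:.2f}, K/9 {k9:.2f}, BB/9 {bb9:.2f}")
# ERA 4.30, K/9 7.52, BB/9 3.05
```

Treating 725.2 as a plain decimal would shift each rate slightly, which is why the conversion is done on outs rather than on the printed number.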

Altogether, when on a big-league mound, the group pitched to a 4.30 ERA to go along with a 7.52 K/9 and a 3.05 BB/9, numbers not entirely dissimilar from, say, Dustin McGowan or Sal Romano in 2017. So even the healthy group put together fairly middling results, but it’s also important to remember that eight of these player seasons wouldn’t see the player throw a single big-league pitch, and therefore provided no value to the club. Let’s plot the distribution of value produced by WAR:

[Figure: distribution of WAR by player season for injured free-agent pitchers]

That 2.1 WAR recorded by Minor last season was the highest figure of any player season in the sample, and besides Mike Pelfrey’s 2013 season, no other player season really comes close. Of the 10 player seasons recorded by primarily starting pitchers, only Pelfrey’s season even came close to average production, as every other starter either wasn’t durable or good enough to rack up any significant value. On the relief side, Minor and 2014 Joakim Soria both excelled, but no other relief season (out of the 15 in the sample) even crossed the 0.5 win threshold.

As with the Royals pitchers earlier, it is important to look at these deals from a value standpoint. We can do this by calculating $ per WAR for the whole sample to find a mean, and for each deal to find a median, and visually represent the distribution. Overall, teams invested a total of $78 million in these 25 player seasons, with $71 million coming in guaranteed money and $7 million in Joakim Soria’s club option. All minor-league deals to MLB veterans were assigned a dollar value of $333,333 for ease of calculation. Bonuses and incentives were excluded from this figure, as it is very difficult to find these details of the player contracts and few of these seasons would reach such incentives. As we saw above, the sample produced a total WAR of 6.9. This means that on average, teams paid $11.3 million per win when committing money to injured pitchers in hopes of a bounceback, well above the market rate of $8 million per win in free agency. Based on some quick calculations, teams paid that $78 million for production worth $55.2 million, for a net loss of $22.8 million. Let’s now look at the value gained/lost for each contract (in millions of $):

[Figure: value gained or lost per contract, in millions of dollars]

As you can see, only five such contracts actually generated positive value (above the market rate of $8 million per win), while the remaining 12 contracts provided their team with below-market value. The mean loss per contract is $1.34 million, while the median is represented by the Phillies’ $700k loss on Chad Billingsley. While neither number is outrageously high, both figures only serve to reinforce the fact that teams have generally lost more often than they have benefited from inking an injured pitcher.
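
The value arithmetic in this piece is simple enough to sketch in a few lines. The function names below are hypothetical, and the only inputs are figures quoted in the text: the $8 million per win market rate, the Royals’ $25.75 million for 2.9 WAR, and the full sample’s $78 million for 6.9 WAR across 17 contracts. It is a rough check, not the exact spreadsheet behind the charts.

```python
MARKET_RATE = 8.0  # $ millions per win, the free-agent rate assumed in the article


def cost_per_win(spend_millions: float, war: float) -> float:
    """Dollars (in millions) paid for each win actually produced."""
    return spend_millions / war


def net_value(spend_millions: float, war: float, rate: float = MARKET_RATE) -> float:
    """Market value of the production minus what was paid; negative means a loss."""
    return war * rate - spend_millions


# Figures quoted in the article.
royals_spend, royals_war = 25.75, 2.9
sample_spend, sample_war, sample_contracts = 78.0, 6.9, 17

royals_rate = cost_per_win(royals_spend, royals_war)
sample_rate = cost_per_win(sample_spend, sample_war)
sample_net = net_value(sample_spend, sample_war)
mean_net_per_contract = sample_net / sample_contracts

print(f"Royals: ${royals_rate:.2f}M per win")
print(f"Sample: ${sample_rate:.1f}M per win, net {sample_net:+.1f}M")
print(f"Mean per contract: {mean_net_per_contract:+.2f}M")
```

Applying `net_value` to each contract separately, rather than to the pooled totals, is what produces the per-contract distribution and its median.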

None of this is necessarily to say that the Pineda, Smyly, and Eovaldi contracts are doomed or that no team should ever make this type of investment, but simply to look at how similar deals have worked out in the past. Admittedly, the sample is hardly big enough to make any sort of definitive conclusion, but the overall trend on these “bargain” signings isn’t pretty. Both Smyly and Pineda are better pitchers than most in the sample, so it is entirely possible that they (along with Eovaldi) could significantly shift the outlook on these types of deals in the future. Whether this trio of pitchers can buck the trend or will follow in the footsteps of their predecessors will certainly be an interesting, if minor (pun intended) storyline to watch over the next few seasons.

FanGraphs.com leaderboards, Baseball-Reference transaction data, and MLBReports Tommy John surgery database were all used extensively for this research.


Relationship Between OBP and Runs Scored in College Baseball

There is a segment of the population of the United States that meets the following criteria: they are between the ages of 18 and 21, are devout FanGraphs readers, and were mesmerized by the movie “Moneyball.” I have read the book and watched the movie a number of times, and I have dedicated time to understanding the guiding principles in the book and how they relate to professional baseball. The relationship between on-base percentage and scoring runs in Major League Baseball is well established, but has anyone ever taken the time to examine the relationship at the collegiate level?

Collegiate baseball is volatile — roster makeups change dramatically each year, no player is around more than five years, not to mention there are hundreds of teams competing against one another. In terms of groundbreaking sabermetric principles, this study is not intended to turn over any new stones, but rather present information which may have been overlooked up to this point, which is the relationship between on-base percentage and runs scored in collegiate baseball.

To conduct this study, I compiled a list of Southeastern Conference team statistics from the 2014-2017 seasons (Runs Scored, On Base Percentage, Runs Against, and Opponents’ On Base Percentage). I then performed linear regression on the distribution by fitting a line of best fit. Some teams’ seasons were excluded because that season’s data could not be accessed, and I removed the 2014 Auburn season on the grounds that it was an outlier affecting the output (235 runs, 0.360 OBP). Below are the resulting predictive equation and R²:

Runs Scored = (3,537 × OBP) – 933.6791

R² = 0.722849

I am by no means a seasoned statistician, but in my interpretation of the R² value, the relationship between Runs Scored and OBP in this sample is moderately strong, with a team’s OBP accounting for roughly 72.3% of the variation in Runs Scored in a season. Simply put, OBP is statistically significant in determining the offensive potency of a team.
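
For readers who want to see where a line of best fit and an R² come from, here is a minimal sketch of ordinary least squares in plain Python. It is not the tool used for this study, and the five (OBP, runs) pairs are made-up toy numbers for illustration only, not SEC data; the function names are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data, invented for this example (NOT the SEC sample).
toy_obp = [0.340, 0.350, 0.360, 0.370, 0.380]
toy_runs = [280, 310, 335, 380, 400]

slope, intercept = fit_line(toy_obp, toy_runs)
r2 = r_squared(toy_obp, toy_runs, slope, intercept)
print(f"runs = {slope:.0f} * OBP + {intercept:.0f}, R^2 = {r2:.3f}")
```

An R² of 1 would mean every team-season sits exactly on the line; a value like the 0.72 reported here means the line captures most, but not all, of the season-to-season differences in runs.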

At the professional level, the R² is found to be around 0.90.  The competitive edge the Oakland A’s used in “Moneyball” was using this correlation to purchase the services of “undervalued” players.  But what about in college?  Colleges certainly cannot purchase their players, but the above information can be useful to college programs.

For example, the average Runs Scored per season of the sample I used was roughly 347.8.  If an SEC team wanted to set the goal of being “above average” offensively, they would be able to determine, roughly, what their target OBP should be by using the resulting predictive equation from the Linear Fit:

Does this mean if an SEC program produces an OBP of .362 they would score 348 runs precisely? Obviously not. Could they end up scoring exactly 348 runs? Yes, but variation exists, and statistics is the study of variation.  Here are a few seasons in which teams posted an OBP at or around 0.362, and the resulting run totals:

The average of those six seasons’ run totals was 347.5, which is pretty darn close to 348, and even closer to the average of 347.8 runs derived from the sample.
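
The target-OBP step above amounts to solving the fitted equation for OBP. The sketch below does that with the coefficients as printed in this article; it assumes the slope is exactly 3,537 as shown, and the function names are hypothetical. It also shows how far the excluded 2014 Auburn season (235 runs, 0.360 OBP) sits from the fitted line.

```python
# Coefficients as printed in the article's predictive equation.
# Assumption: the slope is taken as 3537 exactly as shown.
SLOPE = 3537.0
INTERCEPT = -933.6791


def predicted_runs(obp: float) -> float:
    """Runs the fitted line predicts for a given team OBP."""
    return SLOPE * obp + INTERCEPT


def target_obp(runs: float) -> float:
    """Invert the fitted line: the OBP at which it predicts the given run total."""
    return (runs - INTERCEPT) / SLOPE


sample_average_runs = 347.8
print(f"Target OBP: {target_obp(sample_average_runs):.3f}")
# Target OBP: 0.362

# The excluded 2014 Auburn season, compared with what the line predicts.
auburn_gap = 235 - predicted_runs(0.360)
print(f"Auburn 2014 vs. fitted line: {auburn_gap:+.1f} runs")
# Auburn 2014 vs. fitted line: -104.6 runs
```

Because the fit leaves a good share of the variation unexplained, the target is best read as a rough benchmark rather than a guarantee of any particular run total.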

Another use for this information is lineup construction and tactical strategy in-game.  The people in charge of baseball programs do not need instruction on how to construct their roster and manage their team, but who would disagree with a strategy of maximizing your team’s ability to get on base?

The purpose of this study was to examine the relationship between On Base Percentage and Runs Scored in college baseball, and how the relationship compares to its professional counterpart.  To conclude, the relationship between OBP and runs exists at the collegiate level, and carries considerable weight and value if teams are willing to get creative in utilizing its ability.

 

Disclaimer: I am a beginner-level statistician, and if you have any suggestions or critiques of this article, please feel free to share them with me.

Theodore Hooper is a Student Assistant, Player Video/Scouting, for the University of Tennessee baseball program.  He can be reached at thooper3@vols.utk.edu or on LinkedIn at https://www.linkedin.com/in/theodore-hooper/