Archive for Research

Does Payroll Matter? (Part I)

Money in baseball has been an endless source of criticism. MLB has no salary cap, unlike other major sports, and the luxury tax is relatively recent. The media has made us believe that the small fish (e.g. small-market teams) will always be eaten by the big one (e.g. big-market teams). The Kansas City Royals’ performance during the last couple of years, along with the tricky and often misunderstood Moneyball concept, has brought salary back to the newspaper headlines, even though it is safe to say the Royals were not even a low-end payroll team. In any case, this post is an attempt to see if popular beliefs regarding money, power and on-field performance pass the numerical test.

There are many interesting questions related to this topic. However, I will limit myself to the following four over two posts:

  1. Is there a relationship between payroll and wins? If so, how strong is it?
  2. Has this relationship changed over time? If so, where are the peaks? Where are we now?
  3. Will money buy you a ring or a post-season ticket? If so, how much should we spend?
  4. Are there truly big spenders? If so, who are they? Have they changed over the years?

Let me start off by stating my data sources and laying out my assumptions so that we are on the same page. My sources for salaries are Baseball Chronology (1976-2006), the Sean Lahman database (2007-2014) and Spotrac (2015). For wins and post-season appearances, my references are MLB and the Sean Lahman database. MLB revenue data is from Forbes.

My assumptions and caveats are the following:

  1. Payroll values are not adjusted for inflation. Time value of money has not been taken into account.
  2. The Houston Astros are considered an American League (AL) team. The Milwaukee Brewers are considered to be a National League team.
  3. The strike-shortened 1994 season has no playoff teams or World Series champion.
  4. Payroll is considered to be Opening Day payroll. Payroll is assumed to be constant throughout the season for simplicity. Arguably this may not hold true as winning/better teams will likely be buyers at the trade deadline. Losing teams will likely be sellers.
  5. I have not tested for any confounding effect on the variables studied (payroll and wins).

Without further ado, let’s get to it.

Question 1: Is there a relationship between payroll and wins? If so, how strong is it?

To answer this question, I found the correlation between yearly payroll and winning percentage for every individual season played from 1976 to 2015. Because payroll values have changed so much in 40 years, I used z-scores or standard scores, which allows us to compare different seasons, regardless of payroll differences.  A payroll number on its own does not mean much and should be compared to the pool of teams on a yearly basis i.e. it is the distribution of payroll in the league that matters. Here’s a link in case you are not familiar with the concept of z-scores; please keep in mind that correlation does not imply causation. Check out the correlation here.
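For readers who want to see the mechanics, here is a minimal sketch of the z-score step, assuming payroll is standardized within each season and then correlated with winning percentage. The rows are made-up numbers, not the data behind the graph, and the choice of population standard deviation is my assumption.

```python
from statistics import mean, pstdev

# Hypothetical (season, team, payroll in $M, winning pct) rows: NOT the article's data.
ROWS = [
    (1980, "A", 5.0, 0.600),
    (1980, "B", 3.0, 0.500),
    (1980, "C", 1.0, 0.400),
    (2010, "A", 200.0, 0.580),
    (2010, "B", 100.0, 0.520),
    (2010, "C", 60.0, 0.400),
]


def payroll_z_scores(rows):
    """Standardize payroll within each season; return (z, win_pct) pairs."""
    by_season = {}
    for season, _team, payroll, _wpct in rows:
        by_season.setdefault(season, []).append(payroll)
    # Population standard deviation is an assumption; the article does not say which it used.
    season_stats = {s: (mean(p), pstdev(p)) for s, p in by_season.items()}
    pairs = []
    for season, _team, payroll, wpct in rows:
        mu, sigma = season_stats[season]
        pairs.append(((payroll - mu) / sigma, wpct))
    return pairs


def pearson_r(pairs):
    """Pearson correlation between the two coordinates of each pair."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


PAIRS = payroll_z_scores(ROWS)
R = pearson_r(PAIRS)
print(f"r = {R:.3f}, r^2 = {R * R:.3f}")
```

Because each season is standardized separately, a $5M payroll in 1980 and a $200M payroll in 2010 can land at a similar z-score, which is the whole point of the transformation.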

A few interesting insights can be drawn from this graph. The first one, quite obvious, is that there’s a positive slope, meaning that higher payroll goes along with more wins. The second point, though, is that payroll alone does not wholly explain the total number of wins. We inherently knew that. In 40 years, we are able to find teams that fit every situation: low-payroll teams that were awful (Houston 2013), low-payroll teams that played over a .600 win percentage (Oakland 2001 and 2002), high-payroll teams that underperformed (Boston 2012) and high-payroll teams that exceeded expectations and went on to win 114 games (NYY 1998). There is also a mid-tier team that did extremely well (SEA 2001). These are all outliers, though people can (will?) use every one of these cases to support a preconceived idea e.g. “baseball is a sport and it is attitude and effort that matters,” “money will buy you handshakes at the end of each game,” “big-money teams won’t win because they lack camaraderie,” etc. Therefore, let’s focus on the big picture.

The third point I’d like to highlight is the R-square. The R-square measures how successful the fit line is in explaining the variation of the overall data on a 0-to-1 scale. In this case R-square is 0.1905, so ~19% of the total variation in winning percentage can be explained by the linear relationship with payroll. Also, the slope of the best-fit line is 0.0302. This means that a one-unit increase in payroll z-score goes along with an increase of about 0.030 in winning percentage, or roughly five wins over a 162-game season. Remember that a z-score unit is one standard deviation of that season’s payroll, so the dollars needed to move one unit differ from season to season.
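To make the slope and R-square concrete, here is a small sketch of an ordinary least-squares fit in plain Python, followed by the conversion of the reported slope into wins. The function names and the 162-game assumption are mine; only the 0.0302 slope comes from the fit described above.

```python
from statistics import mean


def ols_fit(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


def slope_to_wins(slope, games=162):
    """Translate a winning-percentage slope into wins over a season of `games`."""
    return slope * games


ARTICLE_SLOPE = 0.0302  # slope reported in the text
print(round(slope_to_wins(ARTICLE_SLOPE), 1))  # about 4.9 wins per z-score unit
```

In other words, under this fit a team one standard deviation above the league's average payroll would be expected to win roughly five more games than an average-payroll team, with a lot of scatter around that line.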

However, the potential drivers behind the total number of wins are complex (injuries, roster construction, plain luck, etc.) and the R-square, along with the F-test and P-value, shows that money matters but seems to be overrated. Again, remember that correlation does not imply causation.

Question 2: Has this relationship changed over time? If so, where are the peaks? Where are we now?

We have established that team payroll predicts win percentage only weakly. However, has that always been the case? Was money more important in the 80s than now? The following graph shows the R-square value for every two-year period from 1976 to 2015. It is important to keep in mind that the higher the R-square value, the stronger the relationship between payroll and winning percentage. Check out the R-square of payroll and winning percentage for every 2-year period.

The answer to our question of whether the relationship has changed over time is definitely yes. There are noticeable peaks and valleys. There have been two periods (which I highlighted in green) when money was a better predictor of winning percentage: from 1976 to 1979 and from 1996 to 1999. The first period corresponds to the first four years of free agency. Team owners flooded the league with new money as they went after key players e.g. Mike Schmidt or Reggie Jackson, and payroll increased drastically (60% in 1977, 34% in 1978), as shown below. These increases have been widely documented (here, here and here). Click here for the payroll growth trend since 1976.

The second period (1996 – 1999) is linked to the successful spending (read: lots of wins) of the Yankees, Orioles (though they dramatically underperformed in 1998), Indians and Braves, and to the lack of Cinderella stories (perhaps only Houston in 1998 and Cincinnati in 1999). This period was also characterized by three things. Firstly, the aftermath of league expansion: Tampa Bay and Arizona joined the league in 1998 and, understandably, underperformed. Secondly, MLB revenues grew 17% a year on average from 1996 to 1999 (not adjusted), so teams probably redirected that surplus to the salary pool. Lastly, in the late 90s, MLB was increasingly becoming a rich-team game. The graph below shows the payroll coefficient of variation for the 1976 – 2015 timeframe. This number, which I will call payroll spread, is simply the standard deviation divided by the mean. It allows us to quickly assess how spread out payroll is across the league over time. Do you see the trend after ~1985? By 1999, this number had increased continuously for almost 15 years and MLB had had enough. As the power of money increased AND the gap widened, MLB commissioned the Blue Ribbon Panel to come up with initiatives to level the field, A.K.A. a revenue-sharing program to increase competition. Interestingly, the correlation of money and winning percentage has decreased steadily since then, but the payroll spread has remained pretty much constant. I am hesitant to attribute the decline in R-square to the Blue Ribbon Panel or to other factors (read: is this coincidence?). Check out the payroll spread here.
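The payroll spread defined above is a one-liner. Here is a minimal sketch, assuming the population standard deviation (the text does not say whether the sample or population version was used) and three made-up payrolls:

```python
from statistics import mean, pstdev


def payroll_spread(payrolls):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(payrolls) / mean(payrolls)


# Hypothetical three-team league, payrolls in $M (illustrative only).
EXAMPLE = [50.0, 100.0, 150.0]
SPREAD = payroll_spread(EXAMPLE)
print(f"{SPREAD:.3f}")
```

Because the spread is a ratio, multiplying every payroll by the same factor leaves it unchanged, which is why it can be compared across seasons without adjusting for inflation.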

If we go back to the yearly payroll and winning-percentage correlation graphs, you’ll notice that I highlighted two periods in red too: from 1982 to 1993 and from 2012 until last season. Those were periods when the correlation between salary power and winning percentage was remarkably low. The first period seems to be closely related to the MLB collusion crisis (check out this link as well). The lowest point was in 1984-1987, when the correlation was only 0.03 and the salary spread was 0.22.

The 2012-onwards period has brought R-square down to a 20-year low (0.06 in 2012-2013). While TV revenue keeps rising, the baseball landscape has changed and new variables are in the mix. There is a redefined revenue-sharing model, and we have analytically-inclined organizations, an extended wild-card system and international signings. All these factors have added more complexity to the winning equation, effectively diminishing the relationship between payroll and winning percentage, even with the salary spread still at ~0.40. We are living in interesting times in baseball indeed: If investing money in players doesn’t lead to better on-field results, where do teams need to invest instead: analytics, managers or the front office?

Note: This analysis is also featured in our emerging blog www.theimperfectgame.com


The Belt Delusion

A preview of Brandon Belt, Giants first baseman

Ever since Brandon Belt tore apart the Eastern League in 2010, hitting .337/.413/.623 over 201 plate appearances in a very pitcher-friendly league, Giants fans have been hyped up on his potential major-league career. When his name first began to circulate, fans and journalists liked to mention Belt’s raw power. 

That’s a dangerous word for Giants fans: power. You say that word, and all of a sudden we enter fever dream hallucinations of riding Barry Bonds home runs like Concordes, waving at our houses as we pass over them, never to land. We’ve been pining for a 40+ home-run hitter since Bonds set the league on fire in 2004. No, that’s not hyperbolic enough. Barry Bonds incinerated baseball history in 2004. Relatively impossible standards for any mortal player, wouldn’t you agree?

So why Belt? How did Belt become the Giants’ next offensive savior, when he doesn’t even play Bonds’ position?

****

BELT AS BONDS

The Giants never had a history of developing hitters well. Will Clark was the one true homegrown star that bridged the Mays and McCovey era to the present one. The post-Bonds years were a concoction of otherworldly young pitching, and Brian Bocock: starting opening-day shortstop. Bengie Molina led the 2008 Giants in home runs….with 16.

We were given a Panda in 2009, swinging at everything for a .330/.387/.556 slash. 23-year-old Pablo Sandoval firmly grasped the hearts of Giants fans, but he wasn’t really heralded for his power. To this day, no Giant has hit 30 or more home runs in a single season since Bonds in 2004.

Then came 2011. In Brandon Belt’s second major-league game, he hit a three-run homer to dead center field at Dodger Stadium against Chad Billingsley. Certainly no easy task, but the way Belt just whipped his bat through the strike zone made it look almost routine. “That’s the guy,” thought the Giants fan. “That’s the team’s new 30-home-run machine.” It was that instantaneous.

But it wasn’t that easy. Belt, like most rookies, struggled to keep pace with major-league-caliber pitching: a 23-year-old kid could be forgiven for facing Clayton Kershaw like he was swinging a fishing rod. Belt bounced from Triple-A to San Francisco, from the bench to the disabled list. Every once in a while he flashed his incredible home-run potential, re-igniting the “Savior Belt” narrative. He just needed more time.

In late 2012 Brandon Belt finally, if unspectacularly, wrested the starting first-base job from Brett Pill, another hitter with serious power. Belt locked himself in during a lost 2013 season, perhaps at last realizing his potential. He came out with dingers blazing in 2014, and was then hit in the wrist with a Paul Maholm fastball. Upon returning, he received a concussion from his own teammate. It was a lost season for Belt, even though he did get to be a postseason hero for one night.

In 2015, he finally put it together. Slashing .280/.356/.478, Belt had his best overall season. He had arrived.

****

So why do people still call into KNBR 680, and bother the poor hosts with poorly-conceived trade proposals that usually involve purging Belt? Is it because he hasn’t unleashed the stupendous slugging ability that we fantasized for him, an unrealistic threshold that is becoming harder for any San Francisco hitter to reach?

In this era of pitching-dominated baseball, in one of the most dramatically home-run-reducing ballparks in the United States, very few left-handed Giants are capable of hitting 30 home runs. Giants hitters, Belt very much included, succeed by hitting .300, maintaining a terrific eye at the plate, hitting to all fields, and playing solid (and sometimes sterling) defense. Park factors have always pegged AT&T Park, with its Grand Canyon outfield gaps, as a doubles and triples park. Therefore, it benefits the team to fill its lineup with contact-first hitters with…you guessed it…doubles and triples power. This is how the Giants have won. This is how the Giants will continue to win.

That said, how do we value Belt? He’ll be 28 for most of 2016, so he’s likely in his prime, or close to it. Via Baseball Prospectus, Belt was worth 4.7 Wins Above Replacement Player in 2015, and 4.4 WARP in 2013, losing 2014 largely to injury. Belt is a plus base-runner, and a very adept fielder (DRS: 8, UZR: 9 in 2015). But how do we know if these numbers are good?

Perhaps we need some context. There are two first basemen in particular whom Belt resembles, both representing existing and theoretical stages of Belt’s development. The first is Joey Votto of the Cincinnati Reds, and the second is Freddie Freeman of the Atlanta Braves. Let’s show a quick comparison of the three players, in 2015.

Name             OPS    wRC+  ISO   K%     BB%
Brandon Belt     .834   135   .197  26.4%  10.6%
Freddie Freeman  .841   133   .195  20.4%  11.6%
Joey Votto       1.000  172   .228  19.4%  20.6%

All three players had great seasons last year, and all three players are similar in different ways. Belt, like Freeman, is young enough to improve. Belt, like Votto, had his best season in 2015, yet remains criminally underrated. Votto and Freeman both survived team rebuilds, and both represent their team’s best player. Both have had to be superstars, whereas Belt has become a role player. All three are left-handed.

But there’s more to these players than their statistics on the surface; all three have unquestioned power, and power hitters are expected to command the strike zone. One quick glance at Barry Bonds’ Baseball-Reference page reveals his unbelievable plate discipline, usually getting one good pitch to hit per game. Sluggers command the zone, just as they command respect.

This table shows the percentage of pitches outside the strike zone at which each player swung (O-Swing%), the percentage of pitches inside the strike zone at which each player swung (Z-Swing%), the percentage of total swings that resulted in contact (Contact%), and the percentage of swings at strikes that resulted in contact (Z-Contact%). The final column shows the percentage of balls put into play that were hit hard. We are using data collected through PITCHf/x, displayed on FanGraphs.

Name             O-Swing%  Z-Swing%  Contact%  Z-Contact%  Hard Hit%
Brandon Belt     31%       74%       74%       79%         40%
Freddie Freeman  29%       76%       77%       83%         38%
Joey Votto       19%       59%       79%       83%         38%
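If the four plate-discipline columns feel abstract, here is a rough sketch of how they could be computed from pitch-by-pitch records. The tuple layout and the twenty pitches are hypothetical; this is not how PITCHf/x or FanGraphs store their data.

```python
def plate_discipline(pitches):
    """Compute swing and contact rates from (in_zone, swung, made_contact) tuples."""
    out_zone = [p for p in pitches if not p[0]]
    in_zone = [p for p in pitches if p[0]]
    swings = [p for p in pitches if p[1]]
    zone_swings = [p for p in in_zone if p[1]]
    return {
        "O-Swing%": sum(p[1] for p in out_zone) / len(out_zone),
        "Z-Swing%": len(zone_swings) / len(in_zone),
        "Contact%": sum(p[2] for p in swings) / len(swings),
        "Z-Contact%": sum(p[2] for p in zone_swings) / len(zone_swings),
    }


# Hypothetical sample: 10 pitches outside the zone, 10 inside (illustrative only).
PITCHES = (
    [(False, True, True)] * 1      # chased and made contact
    + [(False, True, False)] * 2   # chased and missed
    + [(False, False, False)] * 7  # took the ball
    + [(True, True, True)] * 6     # swung at a strike, contact
    + [(True, True, False)] * 1    # swung at a strike, missed
    + [(True, False, False)] * 3   # took the strike
)
RATES = plate_discipline(PITCHES)
for name, value in RATES.items():
    print(f"{name}: {value:.0%}")
```

Note the different denominators: the swing rates divide by pitches seen in each region, while the contact rates divide by swings.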

****

BELT AS VOTTO

Joey Votto, being the best and longest-tenured hitter on this list, doesn’t swing much. He swings at only 19% of balls, 11 percentage points better than league average. Perhaps more importantly, Votto is very selective about which strikes he swings at. Many pitches in the strike zone cut the corners, with nasty movement running down, away, or into a hitter. If a hitter were to attempt a swing at one of these pitches, he would make weak contact, and likely make an out. It’s a blatantly obvious, but crucial reminder: hitters get three strikes, and they don’t have to swing at all of them.

Votto has a spectacular eye; he will only swing at the best strikes he gets. His eye and stubbornly consistent plate discipline have earned him an MVP award, and have helped establish him as one of the smartest hitters in the game.

Votto, much like Belt, has drawn criticism for his approach. He has endured the ire of many impatient Reds fans due to his deliberate approach to hitting. Fans know Votto has special power, and they don’t want to watch him walk 20% of the time. The old-guard sentiment still lives strong, and contends that Votto is wasting his offensive capabilities by just getting on base, leaving the damage to the hitters behind him in the lineup. Votto should be the one doing the damage. But the value of getting on base is undeniable these days, and Votto is too smart to swing when he doesn’t want to.

So Votto sets the ceiling pretty high for Belt. Both hitters use the entire field very well, but they each play in vastly different hitting environments. Belt makes the hard contact necessary to intimidate opposing pitchers, but he may never hit enough home runs at AT&T Park to command the respect that Votto does. Belt also swings and misses a lot (league-average contact rate in 2015 was 80%), and needs to lower his strikeout rate, lest opposing pitchers taunt him with junk.

Belt has improved his offensive prowess every year since 2012, and if he improves further in 2016, he could draw more comparisons to Votto than he does to the next guy.

****

BELT AS FREEMAN

The closest current comparison to Belt is Atlanta Braves first baseman Freddie Freeman. Both players are relatively young, and love to swing. Neither makes as much contact as Votto does, but both hit a higher percentage of balls harder. Both are very solid defenders, and capable baserunners.

Whereas Votto personifies Belt’s future potential, Freeman represents Belt’s present and past. While the similarities are there, one glaring difference exists in Belt’s favor.

Freeman had easily his best season in 2013, and has posted progressively weaker seasons in the two years since. Belt, on the other hand, has gradually improved. Belt, like Freeman, had a terrific 2013 season, boosted by a ridiculous second-half surge. In 2014, Belt was well on his way to career highs in home runs, OPS and RBIs, until he was repeatedly and mercilessly struck by baseballs, from Dodgers and Giants alike. Broken wrists and concussions kept Belt from playing a full season.

Then 2015 came, and Belt started to resemble the hitter Freeman had been in 2013. After several years of doubt, it was becoming clear that Belt was trending up. He was still improving. There was no reason to suspect any deviation from the trend, and Belt would continue the dedicated upward march toward the summit of his own potential.

****

DOCTOR BELTED AND MISTER SLIDE

Except we’re getting ahead of ourselves again. Part of the reason fans are constantly disappointed by Belt is the incessant, hyperbolic expectation that surrounds him, and the unfair duality with which he becomes associated. He’ll go 3-4 with three singles, and we’re wondering where his power went. Then he’ll go 1-5 with a long home run and four strikeouts, and we’ll throw our hands in the air and complain that he’s too reliant on his power. Why can’t he be more consistent? We can’t allow a middle ground for Belt, because he doesn’t present one: Belt truly is an all-or-nothing hitter.

This doesn’t appear to be the case when Belt’s season statistics are viewed as a whole; he puts up solidly above-average offensive numbers. When Belt plays a full season, he’ll hit 18-24 home runs per year, and post a batting average between .270 and .290. Sound familiar? We know better, because we’ve watched him play. We know that Belt is one of the streakiest hitters in the major leagues: Does THAT sound familiar?

In 2015, he didn’t hit his first home run until May 15, six weeks into the season. In the two weeks following, he proceeded to hit seven. Belt managed only three through June and July combined, then hit seven again in the month of August, two of those in the same game. He finished with only one in September.

Belt by Month, 2015  Home Runs  OPS    wRC+
April                0          .613   80
May                  7          1.075  198
June                 1          .586   65
July                 2          .818   133
August               7          .955   170
September/October    1          .738   109

It wouldn’t be so difficult to evaluate him if he spread his 18 home runs equally, one every nine games. If he hit .280 in every month, we would know exactly what Belt’s true value was. But every year, we go through the same cycle:

“What’s wrong with Belt? Are his injuries still bothering him? You know concussions are persistent little things right? Wait, he’s back baby! Damn, Belt for the All-Star Game? Nope, there’s ol’ slumpy again. Why does he always look so sad? Should we trade him to Miami for…wait who’s the Marlins first baseman again? What the hell is a Justin Bour? Yeah okay, sure. Why not. Wait there he is again! Two home runs to right-center at AT&T against a tough lefty, impressive! Can’t believe I ever doubted you Belty. Aaaand he’s gone again. Wonder what Brett Pill is up to these days…”

Every. Damn. Season.

****

BELT AS BELT

It’s increasingly clear to us at this point what type of player Brandon Belt is becoming. He’s a streaky, high-power guy who hits to all fields, strikes out a good amount, plays a mean first base, and will occasionally slump his shoulders. And that’s fine. Because he’s good enough to start, and he fits in perfectly with the rest of the Giants lineup.

Belt doesn’t need to hit like Joey Votto; the Giants already have Buster Posey. Belt doesn’t need to hit like Freddie Freeman; the Giants already have Hunter Pence. With Brandon Crawford’s continued ascent, as well as the dramatic emergence of Joe Panik and Matt Duffy, all Belt really has to do is remain healthy and hit as well as he can.

Even if Belt never blossoms into the next great Giants slugger, even if Belt repeats his 2015 season ad infinitum, during which he was a well above-average baseballer, he’s making the team better by simply showing up.

Perhaps it’s time we leave Brandon Belt alone. He’s doing just fine.

“Belt Out”

****

 You can follow me on Twitter @theabsolute19


How the Shift has Changed the Game

The shift is one of the most discussed changes in baseball in many years. It is probably the biggest purely defensive change in decades (right?). Commissioner Manfred has publicly stated that he dislikes it. Players are actively working with hitting coaches to beat the shift. People are asking, how can we beat the shift? And some are starting to deny we can. FanGraphs comments predict that the shift will be bad for baseball, because less offense is less fun.

But just how big is the shift? Just how much has it changed the league?

Zero.

Okay, “Zero” is too strong. It might have changed something, but if it has we can’t tell.

Okay, that too is too strong. But the number of obvious statistical correlates of an effective shift, seen in terms of league-wide stats, is zero. Maybe we can tell, but if so, only through serious data-mining that goes beyond obvious results, like number of outs, even in splits, since teams started shifting. No evidence exists of a change in the league-wide stats you would expect the shift to change. BABIP is unchanged. Grounder BABIP is unchanged. Left-handed batter BABIP is unchanged. In fact, BABIP is higher today than it was 40 years ago: it inflated about .02 from the 1970s to the 1990s and hasn’t evidently changed since.

The shift is a defensive strategy whose intent is to depress run expectancy on balls in play. The likely effect of the shift, if the strategy works, would be in increasing outs on balls in play. Here is a table of BABIP since 1995, the last 20 years:

Year    BABIP
1995   0.298
1996   0.301
1997   0.301
1998   0.300
1999   0.302
2000   0.300
2001   0.296
2002   0.293
2003   0.294
2004   0.297
2005   0.295
2006   0.301
2007   0.303
2008   0.300
2009   0.299
2010   0.297
2011   0.295
2012   0.297
2013   0.297
2014   0.299
2015   0.299

The apparent trend is obvious, if something can be obviously non-existent.
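To put a number on "obviously non-existent", here is a quick sketch that compares the recent seasons in the table with the earlier ones. Treating 2012 onward as the "shift era" is my assumption for illustration; the table itself does not draw that line.

```python
from statistics import mean

# League BABIP by season, copied from the table above.
BABIP = {
    1995: 0.298, 1996: 0.301, 1997: 0.301, 1998: 0.300, 1999: 0.302,
    2000: 0.300, 2001: 0.296, 2002: 0.293, 2003: 0.294, 2004: 0.297,
    2005: 0.295, 2006: 0.301, 2007: 0.303, 2008: 0.300, 2009: 0.299,
    2010: 0.297, 2011: 0.295, 2012: 0.297, 2013: 0.297, 2014: 0.299,
    2015: 0.299,
}

SHIFT_ERA_START = 2012  # hypothetical cutoff, chosen only for this example

early = [v for y, v in BABIP.items() if y < SHIFT_ERA_START]
late = [v for y, v in BABIP.items() if y >= SHIFT_ERA_START]

EARLY_MEAN = mean(early)
LATE_MEAN = mean(late)
print(f"1995-2011 mean: {EARLY_MEAN:.4f} (range {min(early):.3f} to {max(early):.3f})")
print(f"2012-2015 mean: {LATE_MEAN:.4f}")
print(f"difference:     {LATE_MEAN - EARLY_MEAN:+.4f}")
```

The gap between the two averages is a fraction of a point of BABIP, and every recent season sits inside the range the league already covered between 1995 and 2011.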

We can look deeper: how have lefties, whom the shift allegedly affects more, been hurt by the shift? Well, in 2015 lefty hitters had their highest BABIP (.301) versus lefty pitchers in the last 13 years (as far back as FanGraphs data goes for that split). Against right-handed pitchers, left-handed batters tied their second-worst season (.299) in the last 15 years, for a whopping one hit in 500 fewer than the average during that time (.301).

You see, the problem is that we need to look at grounders: fly balls and line drives aren’t really being affected, but grounders are, so in the long run, the shift is slightly depressing hits. Except the obvious correlate isn’t there either. In 2015, grounders had a .236 BABIP, .004 higher than the 13-year average.

2015 isn’t some sort of outlier. In every easy-to-research split you might choose, BABIP fluctuations in the last 13 years are within the range of random variation. The recent years of the shift era show not even a statistically insignificant decrease in BABIP: in many of those splits, BABIP has by a hair increased. (See tables linked below.)

Another source of evidence that the shift works might be found by comparing defense-independent pitching models with non-defense-independent stats. Maybe BABIP leaves something out, but we would then see that runs are down relative to DIPS predictions. If so, one possible explanation is the shift. FIP, a great DIPS, is equal to (3*BB + 13*HR - 2*K)/IP + C, where C is a constant that makes league-average FIP equal league-average ERA. If C is smaller now, that suggests (but does not prove) that BIP outs have changed. C is bigger now (by just .0053 runs per inning, or .048 runs per nine innings), suggesting that more runs are scored from balls in play. It’s no proof, but if balls in play were a lot more frequently outs, we wouldn’t expect them, overall, to account for more runs, and ERA would be down more than peripherals imply.
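Here is a small sketch of how the constant C falls out of league totals, using the simplified FIP form from the paragraph above (the commonly published version also counts hit-by-pitches alongside walks, which I leave out here). The league totals are invented round numbers, not real seasons.

```python
def fip_core(hr, bb, k, ip):
    """Defense-independent part of FIP, per inning pitched (HBP omitted)."""
    return (13 * hr + 3 * bb - 2 * k) / ip


def fip_constant(lg_era, lg_hr, lg_bb, lg_k, lg_ip):
    """C is whatever makes league-average FIP equal league-average ERA."""
    return lg_era - fip_core(lg_hr, lg_bb, lg_k, lg_ip)


def fip(hr, bb, k, ip, c):
    return fip_core(hr, bb, k, ip) + c


# Hypothetical league totals (illustrative only).
LG = {"era": 4.00, "hr": 5000, "bb": 15000, "k": 35000, "ip": 43000}
C = fip_constant(LG["era"], LG["hr"], LG["bb"], LG["k"], LG["ip"])
print(f"C = {C:.3f}")

# Converting a change in C from runs per nine innings to runs per inning.
PER_NINE = 0.048
PER_INNING = PER_NINE / 9
print(f"{PER_NINE} per nine innings is about {PER_INNING:.4f} per inning")
```

Since C absorbs every run that the three true outcomes do not explain, a rising C means balls in play are accounting for slightly more runs, not fewer.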

We can’t infer from this data that individual hitters are unaffected by the shift. Jeff Sullivan’s recent piece on adjusting to the shift is what brought me to the data (I was seeking to investigate just how badly lefty hitters have been hurt, and discovered something far more interesting), and he mentioned Jimmy Rollins’ attempts to adjust to the shift. I recall a lot of speculation about Mark Teixeira being hurt by the shift. Maybe those guys are. Maybe they aren’t. Maybe they aren’t, but others yet to be named are. Things which don’t have a league-wide effect may interact with particular skillsets in hard-to-identify ways.

It’s possible that the shift has changed things by reducing the value of range up the middle, allowing more offensively-oriented players to man those positions. But that seems more like an effect that we would see in future, not one we have seen, because it should take years of player development for those sorts of changes to have a league-wide effect.

It is possible that the shift increases strikeouts and depresses walks. It would be hard to know this, though. It is also possible that the shift has reduced the value of certain defensive skills (e.g., range) and that the decreased need for range has allowed teams to play more offensively-oriented guys up the middle, effectively cancelling the BABIP effects. It sounds farfetched to suppose that two of eight hitters being more offensively-minded can cancel an effect of a shift that should apply to eight of eight of them, but we haven’t ruled it out.

Overall, league scoring is down. But DIPS suggest this is mostly the result of more strikeouts, with a little home-run and walk noise thrown in. There are some ways in which the shift might be having an effect — please offer further hypotheses below. All the evidence here is correlational and correlation doesn’t imply causation. Even anti-correlation doesn’t imply non-causation (if people who drink more exercise more — both are correlated positively with wealth — drinking might get anti-correlated with bad health because exercise compensates for the health impact of drinking). But when no correlation is found and no obvious counter-effects can be sighted, the lack of a correlation suggests weak influence at best.

References:

League BABIP, 1975 to 2015

LHB v. LHP and LHB v. RHP, all available years

Ground Ball BABIP, all available years


A Different Type of .400 Hitter

You probably know this, but in case you’re new to baseball, the last player to hit .400 for an entire season was Ted Williams, who, in 1941, hit a staggering .406. Since then, only two players have even managed to hit .390 for a season, Tony Gwynn in 1994 and George Brett in 1980. Even then, both those guys accomplished their feats in shortened seasons, with Gwynn only playing in 110 games due to the players’ strike and Brett playing in only 117 due to injury. Needless to say, it’s very unlikely we’ll see a .400 hitter anytime soon.

But only slightly less difficult than managing a .400 batting average is managing a .400 batting average on balls in play. Since strikeouts started to be tracked as an official statistic (1910 for the National League and 1913 for the American League) there have been only 18 .400 BABIP hitters compared to nine .400 hitters. As you would expect, there is some overlap between these two groups: six of those nine .400 hitters had a .400 BABIP as well. As you would also expect, the majority of the .400 BABIP seasons occurred in the 1910s, when fielders wore slightly more dexterous shoes on their catching hands, or in the early 1920s, when your utility infielder was hitting .300. Of those 18 seasons with .400 BABIP, only six have happened since 1925 and only four since Ted Williams hit .400 in 1941 (he did not have a .400 BABIP that year). Those four seasons belonged to the following:

Roberto Clemente in 1967

Rod Carew in 1977

Manny Ramirez in 2000

Jose Hernandez in 2002 wait what

Those first three are not at all surprising. Carew and Clemente are both Hall-of-Famers, and Manny Ramirez certainly had a Hall-of-Fame-caliber career. All three finished their careers with BABIPs of .330 or better, with Carew’s mark coming in at an astonishing .359. All three also had spectacular seasons in the years above, with Clemente and Carew probably having their best seasons, and Manny only falling shy of that mark due to injuries shortening his year.

And then there is Jose Hernandez. Jose had a .404 BABIP to go along with his .288 batting average. That’s not a typo. In 2002 Jose Hernandez struck out 188 times, which was, at the time, one shy of Bobby Bonds’ single-season record. Of course, by modern standards, that doesn’t seem like a truly ridiculous amount — three different players named Chris (or Kris) have done it in the past three seasons alone. But in 2002, that was a really impressive number.
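How does a .288 hitter post a BABIP over .400? Strikeouts count against batting average but drop out of the BABIP denominator entirely. Here is a minimal sketch using the standard BABIP formula and a made-up stat line with the same average and strikeout total; the at-bat, hit, home-run and sacrifice-fly counts are hypothetical, not Hernandez’s actual numbers.

```python
def batting_average(hits, at_bats):
    return hits / at_bats


def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


# Hypothetical season line (illustrative only).
LINE = {"hits": 144, "home_runs": 24, "at_bats": 500, "strikeouts": 188, "sac_flies": 2}

AVG = batting_average(LINE["hits"], LINE["at_bats"])
BIP_AVG = babip(**LINE)
print(f"AVG   = {AVG:.3f}")
print(f"BABIP = {BIP_AVG:.3f}")
```

With 188 strikeouts removed from the denominator, the same 120 non-homer hits are spread over only 290 balls in play instead of 500 at-bats.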

But as we know, strikeouts aren’t much worse for a hitter than any other out. Despite the strikeouts, Hernandez had a career year for the 2002 Brewers, leading the team with 4.5 WAR. Most of the time, a marginal infielder having a better-than-expected season for a really bad team is about as forgettable as a 4-WAR season can be, but in this case, it was a truly fascinating season.

Of course, unlike Ted Williams and his .400 batting average, Hernandez likely won’t be the most recent .400 BABIPer for that long. In 2004, Ichiro had a .399 BABIP, and eight other players have BABIPed over .390 in the 13 seasons since Hernandez joined that exclusive club. Odubel Herrera was one stray grounder a month away from a .400 BABIP just this past year. But for right now, after Jose finishes a long day of teaching Baltimore farmhands how to strike out a ton in Norfolk, he can sit back with his beverage of choice and compare himself to Rod Carew, Roberto Clemente, and Manny Ramirez. Not half bad.


Brandon Phillips Made Baserunning History

Brandon Phillips was a great baserunner this past season. He stole 23 bases and was only caught stealing three times. It wasn’t an all-time great season in terms of stolen bases or baserunning runs overall, and his baserunning is overshadowed by the baserunning greatness of teammate Billy Hamilton, but we can all agree that Phillips put together a very nice season on the basepaths.

Now let’s make things interesting. In contrast to his great 2015, Brandon Phillips was very bad at stealing bases the last few years. In 2013 and 2014 he combined for a grand total of seven stolen bases and six times caught stealing (Phillips in fact had negative net stolen bases in 2014, being caught stealing three times and stealing just two bases), being worth negative runs on the basepaths both years. We now have a rare situation on our hands, where a player was a prolific base-stealer after doing nothing the year before.

Let’s quantify Phillips’ improvement to find some historical comparisons. Here’s the complete list of players who increased their stolen-base total by at least 20 a year after having negative net stolen bases (stolen bases minus times caught stealing):

| Player | Year | Stolen Bases (SB) | Previous Year SB | Previous Year Success Rate |
| --- | --- | --- | --- | --- |
| Brandon Phillips | 2015 | 23 | 2 | 40% |

I know it can be difficult to read through that entire list, so let me summarize it for you: Before Brandon Phillips in 2015, no player had ever, following a season with negative net stolen bases, increased their stolen-base total by over 20 in the following season!
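The two quantities behind that search are simple to compute. Here is a minimal sketch using only the Phillips numbers quoted in this post; the function names and the `qualifies` helper are hypothetical choices of mine, not part of any stats library.

```python
def net_stolen_bases(sb, cs):
    """Stolen bases minus times caught stealing."""
    return sb - cs


def success_rate(sb, cs):
    """Share of steal attempts that succeeded."""
    return sb / (sb + cs)


def qualifies(prev_sb, prev_cs, sb, min_increase=20):
    """Negative net steals one year, then a jump of at least `min_increase`."""
    return net_stolen_bases(prev_sb, prev_cs) < 0 and sb - prev_sb >= min_increase


# Brandon Phillips: 2 SB / 3 CS in 2014, 23 SB / 3 CS in 2015 (figures from the post).
print(net_stolen_bases(2, 3))               # net steals in 2014
print(round(success_rate(2, 3) * 100, 1))   # 2014 success rate, in percent
print(round(success_rate(23, 3) * 100, 1))  # 2015 success rate, in percent
print(qualifies(2, 3, 23))                  # does the 2014-to-2015 jump qualify?
```

The 200-at-bat playing-time cutoff described in the notes is left out of this sketch; it would be one more condition inside `qualifies`.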

Pretty cool, right? It gets even better!

Here’s what makes Brandon Phillips’ 2015 season on the basepaths even more unique. Brandon Phillips was also very old this season, turning 34 in the middle of the summer. While it’s not unheard of for old guys to steal lots of bases (Lou Brock stole 118 at 35), it is a lot rarer than players in their primes stealing lots of bases. What is very rare is for old guys to suddenly make a leap in their stolen-base totals.

Let’s go back to the numbers again to find some historical comparisons. Here is the complete list of players who had a 20-stolen-base increase at Brandon Phillips’ age or older since baseball became integrated:

| Player | Year | Stolen Bases (SB) | Previous Year SB | SB Increase | Success Rate |
| --- | --- | --- | --- | --- | --- |
| Brandon Phillips | 2015 | 23 | 2 | 21 | 88.5% |
| Lou Brock | 1974 | 118 | 70 | 48 | 78.1% |
| Bert Campaneris | 1976 | 52 | 24 | 28 | 81.8% |
| Rickey Henderson | 1998 | 66 | 45 | 21 | 83.5% |
| Maury Wills | 1968 | 52 | 29 | 23 | 71.2% |
| Jose Canseco | 1998 | 29 | 8 | 21 | 63.0% |

Only five other players since integration have had a 20-stolen-base jump at Brandon Phillips’ age or older. And these aren’t any random players — with Brock, Henderson, Wills, and Campaneris on the list, you have the 1st, 2nd, 14th, and 20th career leaders in stolen bases. The 5th is Jose Canseco, which just confirms what we already knew: Jose Canseco is weird. Canseco’s performance late in his career was also famously PED-boosted to defy normal aging curves, but I decided to just present the stats to you and you could make your own judgment on which performances you consider legitimate.

Even compared to the four all-time great base-thieves and Canseco, Phillips’ 2015 season is still unique. Since integration, Brandon Phillips is the only player his age to ever have an increase of 21 in stolen bases while matching his success rate!

If you had predicted before the season that Brandon Phillips would steal fewer than 23 bases, no one would have doubted you. After all, 18,845 players have played major-league baseball before and not a single one had accomplished what Brandon Phillips needed to do.

However, as the saying goes, baseball is played on the field and not on a computer. Against all odds there was old Brandon Phillips, chugging along on the basepaths and making his mark in history while doing it.

Notes:

(1) I used a cutoff of 200 at-bats in each consecutive season for players to qualify for the stolen-base-increase list. This was because I wanted the increases in stolen bases to be due to the player’s actions, and not just more playing time. A season where a rookie is called up and steals two bases in five games, and then steals 50 bases in a full season the next year is obviously against the spirit of seeing which players increased their stolen bases the most. I generously made the cutoff to qualify very low to include as many players as possible and so I couldn’t be accused of cherrypicking an at-bat limit to help Brandon Phillips stand out.

(2) A lot of players in the 1890s and 1900s qualified for the 20+ stolen-base increase at 34 years old or later, but since the game was so different back then I decided to just compare Phillips against players from the modern era.

(3) Dave Roberts came close to making the second cutoff, but was just a bit younger than Brandon Phillips.


xHR%: Questing for a Formula (Part 4)

Apologies for the significant delay between the third post and this one. A little Dostoevsky and the end of the quarter really cramp one’s time. Since it’s been a while, it would probably be helpful for mildly interested readers to refresh themselves on Part 1, Part 2, and Part 3.

As a reminder, I have conceptualized a new statistic, xHR%, from which xHR (expected home runs) can and should be derived. Importantly, xHR% is a descriptive statistic, meaning that it calculates what should have happened in a given season rather than what will happen or what actually happened. In searching for the best formula possible, I came up with three different variations, all pictured below with explanations.

HRD – Average Home Run Distance. The given player’s HRD is calculated with ESPN’s home run tracker.

AHRDH – Average Home Run Distance Home. Using only Y1 data, this is the average distance of all home runs hit at the player’s home stadium.

AHRDL – Average Home Run Distance League. Using only Y1 data, this is the average distance of all home runs hit in both the National League and the American League.

Y3HR – The amount of home runs hit by the player in the oldest of the three years in the sample. Y2HR and Y1HR follow the same idea.

PA – Plate appearances

Now that most everything of importance has been reviewed, it’s time to draw some conclusions. But first, please consider the graphs below.

Expected home runs (in blue) graphed with actual home runs (orange) using the .5 method. I plotted expected home runs and actual home runs instead of xHR% and HR% because it’s easier to see the differences this way.

Expected home runs (in blue) graphed with actual home runs (orange) using the .6 method.

Expected home runs (in blue) graphed with actual home runs (orange) using the .7 method.

Conclusions

Honestly, those graphs look pretty much the same. Yes, as the method increases from .5 through .7, the numbers seem to get more bunched up around the mean, but the differences really aren’t significant between the methods. Nor are the results from those methods particularly different from the actual results. And therein lies the crux of the matter. The formulae suggest that what happened is what should have happened, but I don’t think that’s true.

I know a great deal of luck goes into baseball. I know as a player, as a fan, and as a budding analyst that luck plays a fairly large role in every pitch, every swing, and every flight the ball takes. I don’t know how to quantify it, but I know it’s there and that’s what sites like FanGraphs try to deal with day in and day out. Knowledge is power, and the key to winning sustainably is to know which players need the least amount of luck to play well and acquire them accordingly. Statistics like xFIP, WAR, and xLOB% aid analysts and baseball teams in their lifelong quests for knowledge, whether it be by hobby or trade.

For those reasons, xHR% in its current form is a mostly useless statistic. It fails to tell the tale I want it to tell — that players are occasionally lucky. An average difference of between .6 and 1 home runs per player simply doesn’t cut it because it essentially tells me what really happened. At this juncture it’s basically a glorified version of HR/PA where you have to spend a not insignificant amount of time searching for the right statistics from various sources. But hey, you could use it to impress girls by convincing them you’re smart and know a formula that looks sort of complicated (please don’t do that).

I don’t know how big of a difference there needs to be between what should have happened and what actually happened. Obviously there still has to be a strong relationship between them, but it needs to be weaker than an R² of .95, which is approximately what it was for the three methods.

All statistics that try to project the future and describe the past are educated shots in the dark. They are similar to the American dollar in that nearly all of their value is derived from our belief in them, in addition to some supposedly logical mathematical assumptions about how they work. Even mathematicians need a god, and if that god happens to be WAR, then so be it.

Even though my formula doesn’t do what I want it to do quite yet, I won’t give up. Did King Arthur and Sir Lancelot give up when they searched for the Holy Grail? No, they searched tirelessly until they were arrested by some black-clad British constables with nightsticks and thrown in the back of a van. I will keep working until I find what I’m looking for, or until I get arrested (but there’s really no reason for me to be).

I know that wasn’t particularly mathematical or analytical in the purest sense, and that it was more of a pseudo-philosophical tract than anything else, but please bear with me. Any suggestions would be helpful. I have some ideas, but I’d appreciate yours as well.

Part 5 will arrive as soon as possible, hopefully with a new formula, new results, and better data.


How Much Is a “W” Worth in Major League Baseball?

Moneyball
Looking at the current landscape of Major League Baseball, it seems that the Moneyball concept is still alive and well (as exemplified by the Houston Astros and the Pittsburgh Pirates — two rather successful ball clubs in what are traditionally considered to be small markets)!

Here in Canada, the Toronto Blue Jays’ recent playoff run in 2015 gave us a reminder of how exciting the postseason can be when management, players, and fans all share the same goal and vision. Yet, as thrilling as playoff baseball can be, the true definition of success for a team comes down to it being able to win the last postseason game. Why? All teams that bow out of the playoffs, be it in the League Division Series, the League Championship Series, or the World Series, ultimately lose their last postseason game. Only one team — the World Series Champion — ends its season by winning its last game in the calendar year!

Before we get ahead of ourselves about winning the last game in October/November, however, we must be reminded that a team cannot participate in the playoffs — let alone advance — unless it wins its division or a wild-card spot. Even with the newly expanded postseason format, in which both leagues (American and National) have two (as opposed to one) wild cards, it remains a challenge to secure one of the 10 playoff berths. One only needs to see how many obstacles Toronto overcame in the 2015 season, aided by then-GM Alex Anthopoulos’ flurry of trade-deadline activity (acquiring Troy Tulowitzki, LaTroy Hawkins, David Price, and Ben Revere within a span of four days from July 28th to July 31st) to bring an end to the Blue Jays’ 22-year postseason drought. To this end, the first order of business for a team should be getting into the playoffs.

Toronto Blue Jays Fans
Baseball is once again the talk of the town in Toronto (and even across Canada) after the Toronto Blue Jays ended a 22-year playoff drought by winning the American League East Division in 2015. The trick is can the ball club repeat, if not improve, on their success?

In the simplest form, there are arguably three ways to try to make the postseason. One way is to try to “buy” a championship by signing one or more (if not all) of the elite unrestricted free agents on the open market. Of course, this approach requires an ownership that has deep pockets and is willing to spend (sometimes without limitations). Traditional big spenders that come to mind include but are not limited to the New York Yankees, the Boston Red Sox, and the Los Angeles Dodgers. An alternative approach, put on full display by Pat Gillick when he guided Toronto to four American League East Division titles, two American League pennants, and two World Series championships from 1989 to 1993, is to build the core of the 25-man roster through smart drafting and player development and then bolster the lineup, starting rotation, and/or bullpen through trade-deadline deals (including rentals if the cost of prospect capital is within reason). Perhaps the least popular method (at least from the fans’ perspective due to the long-term patience required) — albeit arguably just as effective as the other two means — is to rely strictly on a continuous and sustainable supply of home-grown talent, much like the Cleveland Indians (who managed to win an impressive six American League Central Division titles and two American League pennants from 1995 to 2001) and the Tampa Bay Rays (who managed to win an American League pennant, two American League East Division titles, and two American League Wild Cards from 2008 to 2013 despite having a very modest payroll).

If money is no object, it would be logical to conclude that most baseball executives would opt for the first route given that it is the shortest avenue to get to the promised land, at least in theory. After all, the Yankees are the owner of 27 World Series championships, by far the most championships of any team in the four major North American sports leagues, i.e., Major League Baseball, the National Basketball Association, the National Football League, and the National Hockey League. The greatest strength of “buying” a championship is two-fold. On one hand, by taking an elite talent off the unrestricted free-agent market and/or the trade market, you can prevent your rivals from acquiring that talent, meaning that you are strengthening yourself while simultaneously weakening your opponent. On the other hand, you can afford to “make mistakes” because if the player that you signed and/or traded for did not pan out as anticipated, you can always go out and sign and/or trade for another elite talent as a replacement until you find the right one!

New York Yankees World Series Trophies
Even with notable elite home-grown talents such as Derek Jeter, Andy Pettitte, Jorge Posada, Mariano Rivera, and Bernie Williams, one can argue that the New York Yankees essentially “bought” 4 World Series Titles (1996, 1998, 1999, and 2000) within a span of 5 years by outspending all 29 other teams in Major League Baseball.

Yet, there is no guarantee that being a big spender would necessarily get you a championship. The eight ball clubs with the highest payrolls in the 2015 season — and I purposely limited the scope of my coverage to eight teams because there are only eight “true” playoff spots — are as follows: (1) Los Angeles Dodgers at $301,735,080; (2) New York Yankees at $221,256,867; (3) Boston Red Sox at $214,789,749; (4) San Francisco Giants at $187,088,630; (5) Washington Nationals at $165,655,095; (6) Detroit Tigers at $162,218,297; (7) Texas Rangers at $152,445,607; and (8) Los Angeles Angels at $151,348,162. As we can observe, among the eight teams with the highest payrolls, all of which have a payroll in excess of $150,000,000, only three (3/8 = 37.5%) of the ball clubs — the Dodgers, the Yankees, and the Rangers — made the cut! In other words, even if you spend money without reservation, it does not necessarily mean that success is guaranteed! In fact, based on this small sample, there is a 5/8 = 62.5% chance that your team will be watching (as opposed to playing) postseason baseball even if your ball club has one of the highest payrolls in all of Major League Baseball.

Table 1: Teams with Highest Payroll in Major League Baseball: 2015 Season
Source of Data: http://www.spotrac.com/mlb/payroll/2015/

Conversely, having a modest or low payroll does not necessarily mean that your team is completely out of the running for the grand prize. Even though the odds may be stacked against you, at least on the surface, recent history suggests that the probability of a low-budget ball club making it to the playoffs is actually not terrible. Below are the eight teams with the lowest payrolls — again, I deliberately limited the range of my coverage to eight ballclubs because there are only eight real playoff spots — in the 2015 season: (1) Miami Marlins at $63,590,525; (2) Tampa Bay Rays at $73,582,652; (3) Arizona Diamondbacks at $76,639,242; (4) Cleveland Indians at $77,404,413; (5) Oakland Athletics at $80,376,830; (6) Houston Astros at $81,450,835; (7) Milwaukee Brewers at $94,010,873; and (8) Pittsburgh Pirates at $99,435,606. As we can see, among the eight teams with the lowest payrolls, all of which have a payroll south of $100,000,000, there are actually two (2/8 = 25%) ballclubs that managed to secure playoff berths: the Astros and the Pirates. Indeed, the difference between the number of “rich” teams from among the eight ballclubs with the highest payroll that made the postseason — three in total — and the number of “poor” teams from among the eight ballclubs with the lowest payroll that made the playoffs — two in total — is only one team.

Hence, in statistical terms, there is not a massive gap in the chances of making the postseason between being one of the “rich” teams from among the eight ballclubs with the highest payroll (37.5%) and being one of the “poor” teams from among the eight ballclubs with the lowest payroll (25%), as the difference is a mere 12.5 percentage points (3/8 – 2/8 = 1/8). As a matter of fact, if we were to take the average payroll of the eight teams with the highest payroll [($301,735,080 + $221,256,867 + $214,789,749 + $187,088,630 + $165,655,095 + $162,218,297 + $152,445,607 + $151,348,162)/8 = $194,567,186] and subtract the average payroll of the eight teams with the lowest payroll [($63,590,525 + $73,582,652 + $76,639,242 + $77,404,413 + $80,376,830 + $81,450,835 + $94,010,873 + $99,435,606)/8 = $80,811,372], which yields ($194,567,186 – $80,811,372 = $113,755,814), and then divide this difference by 12.5, i.e., the gap in percentage points between the playoff chances of the “rich” group and those of the “poor” group, we can deduce that for every additional percentage point by which a team wants to augment its odds of making the playoffs, it would cost that ballclub just less than 10 million dollars ($9,100,465.11). While the math suggests that you are inching closer to the promised land (at a rather slow pace of one percentage point) for each additional nine million ($9,100,465.11 strictly speaking) that you are dishing out, I am not so sure that the trade-off makes sense from a value (or cost-benefit) perspective unless money is no object whatsoever.
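For readers who want to check the arithmetic, here is a minimal sketch that reproduces the calculation above. The payroll figures and playoff counts are the ones quoted in this post; the variable names are my own.

```python
# Opening payrolls quoted in the post (2015 season, Spotrac).
top8 = {
    "Los Angeles Dodgers": 301_735_080,
    "New York Yankees": 221_256_867,
    "Boston Red Sox": 214_789_749,
    "San Francisco Giants": 187_088_630,
    "Washington Nationals": 165_655_095,
    "Detroit Tigers": 162_218_297,
    "Texas Rangers": 152_445_607,
    "Los Angeles Angels": 151_348_162,
}
bottom8 = {
    "Miami Marlins": 63_590_525,
    "Tampa Bay Rays": 73_582_652,
    "Arizona Diamondbacks": 76_639_242,
    "Cleveland Indians": 77_404_413,
    "Oakland Athletics": 80_376_830,
    "Houston Astros": 81_450_835,
    "Milwaukee Brewers": 94_010_873,
    "Pittsburgh Pirates": 99_435_606,
}
top8_playoff_teams = 3      # Dodgers, Yankees, Rangers
bottom8_playoff_teams = 2   # Astros, Pirates

avg_top = sum(top8.values()) / len(top8)
avg_bottom = sum(bottom8.values()) / len(bottom8)

# Gap in playoff share between the two groups, in percentage points.
gap_points = (top8_playoff_teams / len(top8)
              - bottom8_playoff_teams / len(bottom8)) * 100

# Extra average payroll per percentage point of playoff share.
cost_per_point = (avg_top - avg_bottom) / gap_points

print(f"Average top-8 payroll:     ${avg_top:,.0f}")
print(f"Average bottom-8 payroll:  ${avg_bottom:,.0f}")
print(f"Gap in playoff share:      {gap_points:.1f} points")
print(f"Cost per percentage point: ${cost_per_point:,.2f}")
```

This is a one-season comparison of two groups of eight, so the per-point figure is best read as a rough illustration rather than an estimate of what a team would actually have to spend.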

Table 2: Teams with Lowest Payroll in Major League Baseball: 2015 Season
Source of Data: http://www.spotrac.com/mlb/payroll/2015/

If spending money blindly is not the way to go, then it seems logical that the second or third approach (perhaps even a combination of the two) is the preferred option. Recent trends in the baseball industry seem to back this rational strategy as more and more teams are demanding “value” for their investments, meaning that they want to get the most bang for their buck. Below are the eight teams with the lowest average cost per win in Major League Baseball for the 2015 season, as calculated by dividing each team’s total payroll by the number of wins (“W”) it had in the 2015 season and ranking all 30 teams on the result: (1) Miami Marlins at $895,641.20 per “W;” (2) Tampa Bay Rays at $919,783.15 per “W;” (3) Houston Astros at $947,102.73 per “W;” (4) Cleveland Indians at $955,610.04 per “W;” (5) Arizona Diamondbacks at $970,116.99 per “W;” (6) Pittsburgh Pirates at $1,014,649.04 per “W;” (7) Oakland Athletics at $1,182,012.21 per “W;” and (8) Minnesota Twins at $1,282,311.06 per “W.”
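The cost-per-win figure is simply payroll divided by wins. Below is a minimal sketch for the seven listed teams whose payrolls appear earlier in this post. The win totals are 2015 regular-season totals that the post does not list, so treat them as an outside input; the Twins are omitted because their payroll is not quoted here.

```python
# (payroll quoted in the post, 2015 regular-season wins)
# Win totals are not stated in the post; they come from the 2015 final standings.
teams = {
    "Miami Marlins": (63_590_525, 71),
    "Tampa Bay Rays": (73_582_652, 80),
    "Houston Astros": (81_450_835, 86),
    "Cleveland Indians": (77_404_413, 81),
    "Arizona Diamondbacks": (76_639_242, 79),
    "Pittsburgh Pirates": (99_435_606, 98),
    "Oakland Athletics": (80_376_830, 68),
}

# Dollars of payroll spent per regular-season win.
cost_per_win = {name: payroll / wins for name, (payroll, wins) in teams.items()}

# Rank from cheapest to most expensive win.
ranking = sorted(cost_per_win, key=cost_per_win.get)

for rank, name in enumerate(ranking, start=1):
    print(f"({rank}) {name}: ${cost_per_win[name]:,.2f} per W")
```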

Among the eight teams with the lowest average cost per win in Major League Baseball for the 2015 season, there are once again two (2/8 = 25%) ballclubs that managed to secure playoff berths. This means that the probability of teams that emphasize value for their spending making it to the postseason is the same as that of the ballclubs with the lowest payrolls in Major League Baseball for the 2015 season. Better yet, the chances of both groups making it to the playoffs are only slightly worse than those of the teams with the highest payrolls in Major League Baseball for the 2015 season (3/8 – 2/8 = 1/8, or 12.5 percentage points).

Table 3: Teams with Lowest Average Cost Per Win in Major League Baseball: 2015 Season
Source of Payroll Data: http://www.spotrac.com/mlb/payroll/2015/
Source of 2015 MLB standing: http://mlb.mlb.com/mlb/standings/index.jsp?tcid=mm_mlb_standings#20151004

All things taken into account, I would opt for smart drafting and player development rather than going for the shortcut of “buying” a championship if I were a GM, unless my budget were a bottomless pit. Bottom line, not only is there no absolute certainty that having one of the eight highest payrolls would mean a ticket to the playoffs, but as we have witnessed, the odds of making it to the postseason are not really that different for the eight teams with the lowest payrolls and for the eight teams with the lowest average cost per win in Major League Baseball for the 2015 season. Coupled with the unattractive fact that it would cost me nearly 10 million dollars to increase my team’s chance of making the playoffs by a mere one additional percentage point (and each point thereafter), it seems obvious that smart drafting and player development is by far the better plan.


The Truth About Hitting the Ball Hard

I recently presented evidence that power and contact are independent skills.  An increase in power does not have to come at the cost of contact.  Surely intuition disagrees with these findings and when that happens you should be skeptical. I would be skeptical.

One reason a trade-off between power and contact is intuitive is that we are accustomed to speed-accuracy trade-offs for many everyday actions.  For example, we slow down when we pour a fresh cup of coffee because going too fast is dangerous.  Implicitly, we assume there is a speed-accuracy trade-off when we suggest that hitters can cut down on their swing to achieve more contact. Richard A. Schmidt is like the Bill James of my field — motor behaviour — and in 1979 he and his colleagues published the Theory of Accuracy for Rapid Tasks.  According to Google it has been cited over 1200 times.  While speed-accuracy trade-offs for movement are typical, the theory explains that rapid timing tasks like hitting are an exception to this rule.

The theory is a dense 46-pager including equations, but I’ll provide a couple critical graphs to illustrate its implications to hitting.  First, Figure 1 presents the results of an experiment that investigated the effect of movement distance and movement time on spatial error.

Spatial error
Figure 1. Spatial error (“We”–deg.) as a function of movement time (MT–msec) and movement distance (A–deg).

The results indicate that movement time, that is, movement speed, had almost no impact on spatial error. You’ll notice that the movement times tested in this experiment are conveniently reflective of a short and a long MLB swing (per Zepp).  The movement distances in the experiment were shorter than a swing, and the task far simpler, but the results are suggestive nonetheless.

A second experiment explored the effect of movement speed and distance on timing error.  Unlike the experiment above, movement speed did have an effect on timing error.  Figure 2 presents data indicating that faster movements result in significantly less timing error than slower movements, irrespective of movement distance.

Temporal error
Figure 2. Timing error (VEt–msec) as a function of movement time (MT–msec) and distance (A–deg).

In addition to these two examples, there is a substantial empirical and theoretical framework suggesting rapid timing tasks are exempt from a speed-accuracy trade-off.  Swinging slower does not increase a hitter’s chance to make contact.  On the basis of these data and the data I presented previously, it seems that hitters can try to hit the ball as hard as possible, within reason, without sacrificing contact or base-hit skill.

UNDERSTANDING HARD%

Power, contact, speed and discipline account for 66% of variance in hitting production. Power, measured by Hard%, is by far the most important skill. But what does Hard% measure, exactly?  The description of Hard% can be found in the glossary here.  Basically, Hard% describes the proportion of batted balls that meet undisclosed criteria for “hardness,” and depends on hit-type, hang-time, landing-spot, and trajectory.  Importantly, Hard% does not include exit speed in its calculation.

In the plot below, Average Exit Speed for players with a minimum of 190 ABs in 2015 is plotted against their Hard%.  It is pretty clear from Figure 3 that while Hard% doesn’t directly measure exit speed, it does a pretty good job of estimating it.

Exit speed and hard
Figure 3.  Average Exit Speed and Hard%.

Given the tight relationship between Average Exit Speed and Hard%, I wondered if both measures were equally effective at predicting production.  The graphs in Figure 4 and Figure 5 present both power measures plotted against wRC+.

Hard and wRC+
Figure 4. Hard% and wRC+.

Exit speed and wRC+
Figure 5.  Average Exit Speed and wRC+.

Hard% does a better job of predicting production than Average Exit Speed, explaining about 23% more variance.  Since exit speed is a more direct measurement of power than Hard%, it follows that the non-power-related data included in Hard% are relevant to production.  Previous research suggests that hit-type and trajectory are important to the outcome of a batted ball, and since both variables are used to calculate Hard%, it seems likely they contribute to the relationship between Hard% and wRC+.

INTRODUCING LIFT BIAS

Trajectory is tightly linked to outcome and hitters only control the trajectory (or angle) they intend to hit the ball on.  We have no way to measure hitters’ intentions. The only data on vertical launch angle that I’ve been able to access are extremely limited, or incomplete, so we can’t estimate hitters’ intentions based on results.  If we had a database of swing-plane information we could estimate each hitter’s intentions based on his average swing plane relative to the pitch, but we don’t have such a database.  What we do have are data on each hitter’s average exit velocity on ground balls, as well as their average exit velocity on line drives and fly balls.  If we assume that each hitter is trying to hit the ball as forcefully as possible along their intended trajectory, and further assume that over the course of a season exit velocity will be maximal around the force vector intended by the hitter, then we can infer each hitter’s bias toward lower or higher trajectory hits by subtracting their average ground-ball velocity from their average line-drive / fly-ball velocity.  The lower the resultant value, the lower the trajectory we can assume the hitter intended.  I examined the relationship between AvgLD/FB – AvgGB (or, Lift Bias) and Hard% and the results are in Figure 6 below.
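As a minimal sketch of the Lift Bias calculation described above: the two hitters and their exit velocities below are made-up placeholders, not real 2015 data, and the function name is my own.

```python
def lift_bias(avg_ld_fb_mph, avg_gb_mph):
    """Lift Bias = average line-drive/fly-ball exit velocity
    minus average ground-ball exit velocity (both in mph)."""
    return avg_ld_fb_mph - avg_gb_mph


# Hypothetical hitters: (avg LD/FB exit velocity, avg GB exit velocity).
hitters = {
    "Hitter A": (93.0, 86.5),  # hits the ball harder in the air
    "Hitter B": (88.0, 89.2),  # hits the ball harder on the ground
}

for name, (ld_fb, gb) in hitters.items():
    print(f"{name}: Lift Bias = {lift_bias(ld_fb, gb):+.1f} mph")
```

A positive value suggests a bias toward higher trajectories, and a value near or below zero suggests a bias toward lower ones.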

Lift Bias and Hard
Figure 6.  Lift Bias and Hard%.

Almost every hitter in the sample hit the ball harder in the air than on the ground.  Only Melky Cabrera, Jason Heyward, and Nick Markakis hit their ground balls harder than their line drives and fly balls in 2015.  As suspected, almost every hitter appears to be trying to hit the ball in the air.  There is an apparent relationship between Lift Bias and Hard%, suggesting that hitters who intend to hit the ball on a higher angle tend to record more hard hits per contact.  To see if this was due to harder hitters choosing to lift the ball more, I examined the relationship between Average Exit Speed and Lift Bias and the results are presented in Figure 6 below.

Exit speed and Lift Bias
Figure 6.  Average Exit Speed and Lift Bias.

Surprisingly, there is practically no relationship between Average Exit Speed and Lift Bias.  This suggests that Lift Bias is associated with Hard% independent of how forcefully a hitter strikes the ball.  Since Lift Bias and Average Exit Speed are essentially uncorrelated predictors of Hard%, I modeled the effect of both simultaneously with multiple regression.  The model explained 75% of variance in Hard% overall, and the part and partial correlations are reported in Figure 7 below.

Regression coefficients
Figure 7.  Multiple regression coefficients. 

The part correlation value in Figure 7 indicates the unique variance explained by each predictor.  Thus, Average Exit Speed explained 52% of the total variance in Hard%.  The partial correlation value describes the proportion of the remaining variance explained by one predictor after accounting for the other.  Thus, after accounting for Average Exit Speed, Lift Bias explained 26% of the remaining variance in Hard%.
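For readers unfamiliar with these two measures, here is a minimal sketch of how they can be computed for a two-predictor model from the pairwise correlations. The variance shares are the squares of the part and partial correlations. The ten data points are made up for illustration and are not the 2015 sample, and the function names are my own.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def two_predictor_summary(y, x1, x2):
    """R^2 of y on (x1, x2), plus squared part and partial correlations."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    # R^2 of the two-predictor regression, from the pairwise correlations.
    r2 = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return {
        "r2": r2,
        # Squared part (semipartial): share of TOTAL variance unique to a predictor.
        "part_sq_x1": r2 - r_y2**2,
        "part_sq_x2": r2 - r_y1**2,
        # Squared partial: share of the variance LEFT OVER by the other predictor.
        "partial_sq_x1": (r2 - r_y2**2) / (1 - r_y2**2),
        "partial_sq_x2": (r2 - r_y1**2) / (1 - r_y1**2),
    }


# Made-up illustration data (not real players).
exit_speed = [86, 88, 89, 90, 91, 92, 87, 93, 90, 88]
lift = [4, 7, 3, 8, 5, 9, 6, 2, 6, 5]
hard_pct = [24, 31, 29, 36, 33, 40, 28, 34, 33, 29]

summary = two_predictor_summary(hard_pct, exit_speed, lift)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Because the partial version divides by the variance the other predictor left unexplained, it is never smaller than the part version for the same predictor.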

In order to determine how much of the relationship between Hard% and production can be accounted for by Average Exit Speed and Lift Bias, I plotted predicted Hard% against wRC+.  The results indicate that Average Exit Speed and Lift Bias together account for almost, but not quite all of the relationship between Hard% and wRC+. See Figure 8 below.

Predicted Hard% and wRC+
Figure 8.  Predicted Hard% and wRC+.

If you compare Figure 8 and Figure 4, you can see that real Hard% still explains more of wRC+ than predicted Hard%, but the predicted values are getting close.  Since Hard% is based on the result of each hit rather than a tendency to hit balls harder in the air or on the ground, it makes sense that Hard% should be more related to performance.  It is impressive that two variables not directly measured in Hard% explain so much of its variance, as well as such a high percentage of its relationship to wRC+.

DOES LIFT BIAS COME WITH A TRADE-OFF?

One of the most interesting results described above is the null relationship between exit speed and Lift Bias, suggesting that an increase in Lift Bias may be beneficial regardless of power. Yet again, intuition kicks in protesting that while it might be more effective for power hitters to try to lift the ball, when light hitters lift the ball the result is a fly out. Since Lift Bias is unrelated to exit speed, examining the relationship between Lift Bias and BABIP should give a hint as to whether increasing Lift Bias decreases the chances of getting at least a single.

Lift Bias and BABIP
Figure 9.  Lift Bias and Batting Average on Balls in Play (BABIP).

Lift bias apparently has no relationship to BABIP, which seems counterintuitive.  Does lift bias even have an effect on batted-ball type? Not really.  The relationship depicted in Figure 10 below is the strongest of all, and even then Lift Bias only explains 8% of the total variance in GB%.

Lift Bias and GB
Figure 10. Lift Bias and Ground Ball Rate (GB%).

The launch angle of a batted ball depends more on the offset of the ball and bat at contact than on the attack angle of the swing.  Thus, perhaps it shouldn’t be too surprising that an ostensible measure of swing plane has little relationship to batted ball distribution.  While offset largely determines launch angle, swings that have more positive attack angles (to a point) are more optimal for batted ball distance. If Lift Bias is based on a more positive attack angle, we might expect to see a positive relationship between Lift Bias and HR/FB.  In fact, as shown in Figure 11, Lift Bias accounts for 30% of the variance in home runs per fly ball.

Lift Bias and HR/FB
Figure 11.  Lift Bias and Home Runs per Fly Ball (HR/FB).

Lift Bias has a strong relationship to average distance, and a smaller but still significant relationship to maximum recorded distance as well.  These data suggest that swing plane may be responsible for at least part of the observed Lift Bias, since increased Lift Bias seems to optimize batted-ball distance.

If swing plane does drive Lift Bias, one might expect a trade-off between Lift Bias and contact skill.  Since pitches are typically thrown on a negative angle of around 6 degrees, and attack angles exceeding 6 degrees can result in farther hits, it follows that hitters may be using a more severe uppercut than a 6 degree “level” swing to generate Lift Bias.

I used the Real Contact measure from my previous study to estimate contact skill for the hitters who have data in the 2015 sample.  The results indicated that Lift Bias is negatively associated with Real Contact, accounting for about 20% of the variance. This is the first hint of the nuance between slugging and contact, suggesting that hitters may be using steep swing planes to generate lift.  Conversely, Real Contact was unrelated to Average Exit Speed, confirming the absence of a trade-off between force and accuracy.

COMPARISON OF PLAYERS WITH MOST OR LEAST LIFT BIAS

It still seems counterintuitive that all players would benefit from having a Lift Bias in the top range of the sample. Is it possible that players at either end of the Lift Bias distribution are especially powerful or light-hitting, causing the appearance of a true relationship but reflecting only selective sampling? To examine the players with the most extreme Lift Bias (or lack thereof), I divided the sample into two groups with the 50 most Lift Biased and 50 least Lift Biased players. First, I tested for differences in the potential to generate power by comparing the two groups on maximum recorded exit speed. The group with the most Lift Bias had a mean Max Exit Speed of 111 mph, while the low Lift Bias group had a mean of 110 mph. There is little difference in power potential between the most Lift Biased players and the least.

Next, I tested for differences in power production by comparing the groups on HR/FB.  As you can see in Figure 12, the high Lift Bias group (.167) saw their fly balls leave the park over twice as often as the low Lift Bias group (.074).

Group means: Power
Figure 12.  Mean HR/FB for the Low Lift Bias and High Lift Bias groups. Error bars represent 95% confidence intervals.
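
For readers who want to reproduce this kind of group comparison, here is a minimal sketch of a group mean with a 95% confidence interval. It assumes a normal approximation to the sampling distribution, which is reasonable for groups of 50 but too narrow for very small samples; the article does not say which method was used for the error bars. The HR/FB lists are invented for illustration.

```python
from statistics import NormalDist, mean, stdev


def mean_with_ci(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation for the interval."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 values")
    m = mean(values)
    standard_error = stdev(values) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # roughly 1.96 for 95%
    return m, m - z * standard_error, m + z * standard_error


# Hypothetical HR/FB values, invented for illustration only.
# The article's groups contain 50 players each.
low_lift = [0.05, 0.08, 0.06, 0.09, 0.07, 0.10]
high_lift = [0.15, 0.18, 0.14, 0.20, 0.16, 0.17]

for label, group in (("Low Lift Bias", low_lift), ("High Lift Bias", high_lift)):
    m, lower, upper = mean_with_ci(group)
    print(f"{label}: mean={m:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")

# Ratio of the two group means reported in the text (.167 and .074).
print(f"High/Low ratio = {0.167 / 0.074:.2f}")
```

When two intervals built this way do not overlap, the difference between the group means is unlikely to be a sampling accident.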

Finally, I compared the two groups on overall production.  The high Lift Bias group had a mean wRC+ of 117, while the low Lift Bias group had a mean of 93.  The players with the largest Lift Bias are, on average, substantially better than league average.  Conversely, the players with the smallest Lift Bias are somewhat worse than the league average. Figure 13 presents the observed means with error bars representing 95% confidence intervals.

Group means: Production
Figure 13.  Mean wRC+ for the Low Lift Bias and High Lift Bias groups. 

The players with a large Lift Bias have basically the same power potential as the players with the least bias, yet they have much more power production.  The extra power production completely accounts for the difference in overall production between the groups, which is substantial.

CONCLUSION

Over the last two articles, I have been detailing a hierarchy of measurable skills that explain the majority of variance in hitting production.  Further, I have demonstrated that there is little trade-off between skills.  Fast exit velocity does not come at the expense of contact, and Lift Bias does not come at the expense of base hits.  There does appear to be a small trade-off between Lift Bias and contact, suggesting that situational hitting could require adjusting swing plane or intended trajectory.

Power is the most important skill for production and comprises two sub-skills: hitting balls harder on average (measured by Average Exit Speed), and generating more Lift Bias (measured by subtracting AvgGB velocity from AvgLD/FB velocity). The next most important is contact skill, which was estimated by partialing the effect of Fastball% out of True Contact (a location-independent measure of contact) to provide an estimate of real contact ability independent of how a hitter is pitched. Finally, speed and discipline (represented by Spd and O-Swing%) are equally important skills, but much less important than power. Figure 14 depicts the relative importance of each skill in estimating production.

Figure 14.  The relative importance of hitting skills.
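
The two derived measures described above can be sketched in a few lines. Lift Bias follows the subtraction given in the text. For Real Contact, the sketch assumes that removing the effect of Fastball% means taking the residuals from a simple linear regression of True Contact on Fastball%; the exact procedure is described in the earlier article, not here, so treat this as one plausible reading. All names and numbers are hypothetical.

```python
def lift_bias(avg_ld_fb_velocity, avg_gb_velocity):
    """Lift Bias as defined in the text: average LD/FB exit speed minus average GB exit speed."""
    return avg_ld_fb_velocity - avg_gb_velocity


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residualize(xs, ys):
    """Remove the linear effect of xs from ys and return what is left over."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical hitters, invented for illustration only.
fastball_pct = [52.0, 55.0, 58.0, 61.0, 64.0]
true_contact = [0.78, 0.80, 0.79, 0.84, 0.83]

# Assumed reading of "Real Contact": True Contact with the Fastball% trend removed.
real_contact = residualize(fastball_pct, true_contact)
print([round(r, 3) for r in real_contact])

# A hitter averaging 93.5 mph on LD/FB and 86.0 mph on GB (made-up figures).
print(lift_bias(93.5, 86.0))
```

By construction the residuals are uncorrelated with Fastball%, which is what makes them usable as a contact estimate that does not depend on how a hitter is pitched.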

It is tempting to assume this model is causal, when in fact the data are all correlational.  If the data were causal, the conclusions for hitting coaches would be obvious:  a) Optimizing exit speed with efficient mechanics and hard work should be an ongoing goal for every player, b) Players should focus on driving the ball in the air and the hitting coach should help his hitters optimize their Lift Bias, c) Equally important, hitters should practice their contact skills against all pitch types on a situational basis, d) Discipline, which can be trained, should get about half the attention that contact receives, and e) The league is full of underachievers – assuming Lift Bias is a learnable skill.

Science will require experimental evidence before concluding that the skill hierarchy provides a causal explanation of hitting production.  Hitters and coaches may not want to wait around.  Hey, Kevin Pillar! Give me a call…


Using Contact Rates to Evaluate Pitchers

A little over a month ago, I published this piece detailing the methods I had created as an alternative way to assess hitter performance. I highly recommend glancing at that article before reading this one; it will make a whole lot more sense. For the lazy, here is a brief primer: I focused on using rates (contact, Hard%, etc.) to create rough estimates of what would happen on any given pitch. What is the probability that Mike Trout hits a hard line drive on a pitch in the strike zone? And is a player who does that more often more likely to be a successful hitter overall? One of the advantages of this approach is that it helps to separate the actions of a hitter from his circumstances; a hard line drive is a hard line drive, but its placement will greatly affect whether or not the player reaches base. Poor defense, such as one may find in the minor leagues or college ball, becomes less important in judging a player.

One of the questions remaining was whether or not I could apply some of these same methods to evaluating pitching. So far, the answer is a qualified yes. We already have a number of metrics to determine pitching value without regard for circumstance, but these methods still provide useful insights. Using the existing metrics, such as xFIP, we can determine which rate stats are strong indicators of success.

There is one result that emerged above all else: there is no such thing as a weak-contact pitcher. There is a significant amount of talk about pitchers “keeping the ball in the park” or “getting weak ground balls.” However, this method indicates no such thing. By simply multiplying contact rates with “Soft%” for all 2015 qualified pitchers and therefore creating the “SoftXCont” statistic, I was able to search for any correlation between this rate and xFIP. Judge the results for yourself:

[Scatter plot: SoftXCont vs. xFIP, 2015 qualified pitchers]

Clearly, almost no correlation. However, remember that this only examines the aggregate; perhaps some specific pitchers can leverage this so-called skill to great effect. But, it appears that at least on average, generating weak contact is a poor indicator of overall pitching success.

The opposite is true of hard contact. Pitchers who allowed less hard contact, as measured by my “HardXCont” number, posted substantially better (lower) xFIPs.

[Scatter plot: HardXCont vs. xFIP, 2015 qualified pitchers]

The correlation is relatively strong, especially compared to the correlations seen in other baseball metrics. Clearly there is something going on here; pitchers who allow less hard contact per pitch get better results. Duh. For an even more clean-cut view of this, we can look at GoodXCont, which uses a combination of “Hard” and “Medium” contact.

[Scatter plot: GoodXCont vs. xFIP, 2015 qualified pitchers]

That correlation is excellent, and indicates that measuring GoodXCont would be a significantly powerful way of evaluating pitchers.
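
For anyone who wants to try this on their own data, the three rates can be sketched as follows. SoftXCont and HardXCont follow the multiplication described above. For GoodXCont, the sketch assumes that the "combination" of Hard and Medium contact means adding the two shares before multiplying, which the text does not spell out. The pitcher rows are invented for illustration, with rates written as fractions rather than percentages.

```python
from math import sqrt


def x_cont(contact_rate, quality_share):
    """Contact rate multiplied by the share of contact at a given quality level."""
    return contact_rate * quality_share


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical pitchers: (contact rate, Soft%, Med%, Hard%, xFIP).
# Invented for illustration only; these are not 2015 figures.
pitchers = [
    (0.74, 0.21, 0.53, 0.26, 3.10),
    (0.78, 0.18, 0.52, 0.30, 3.65),
    (0.81, 0.20, 0.50, 0.30, 3.90),
    (0.76, 0.17, 0.55, 0.28, 3.40),
    (0.83, 0.19, 0.49, 0.32, 4.25),
]

soft_x_cont = [x_cont(c, soft) for c, soft, med, hard, xfip in pitchers]
hard_x_cont = [x_cont(c, hard) for c, soft, med, hard, xfip in pitchers]
# Assumption: "Good" contact is the Hard and Medium shares added together.
good_x_cont = [x_cont(c, med + hard) for c, soft, med, hard, xfip in pitchers]
xfips = [xfip for c, soft, med, hard, xfip in pitchers]

for name, series in (
    ("SoftXCont", soft_x_cont),
    ("HardXCont", hard_x_cont),
    ("GoodXCont", good_x_cont),
):
    print(f"{name} vs xFIP: r = {pearson_r(series, xfips):.2f}")
```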

So, we see that pitchers who limit hard contact and good contact are more successful than their peers. We also see that allowing a large amount of soft contact is not indicative of overall success. The “weak contact” type pitchers (think Rick Porcello) are not necessarily succeeding thanks to any particular ability to generate soft contact; any corresponding ability comes more from being able to allow less hard contact.

For scouts, this means finding pitchers who both limit total contact and allow only poor contact. By using these metrics, rather than the outdated ERA or a radar gun, they can get a strong impression of future big-league success.

In a future piece, I plan to dive deeper into research on “soft contact” pitchers. While these initial results indicate that soft contact is not a good indicator of overall success, there is further work to be done. Stay tuned.


The Mariners are Finally Using Safeco Field Correctly

It’s no trade secret that playing to the strengths of your ballpark helps your chances to succeed. To gain an advantage, franchises can exploit, and sometimes even manipulate, their home ballpark. If you run the Astros or Reds, who play baseball in a lunchbox, you can succeed by employing otherwise-flawed home-run hitters with little regard for who gets on base ahead of them. When you play half your games in an airplane hangar, however, stubbornly attempting to put the ball over 900-foot fences is foolish. A foolish strategy common to recent Mariners teams. A foolish strategy that wasn’t working.

M’s Team Stats    OBP     ML Rank    SLG Pct.    ML Rank    wOBA    ML Rank
2015              .311    22         .411        12         .313    17
2014              .300    27         .376        21         .299    25
2013              .306    26         .390        20         .307    20
2012              .296    30         .369        30         .291    30
2011              .292    30         .348        30         .283    30
2010              .298    30         .339        30         .285    30

If you have a weak stomach, do not view the last few rows.

The Mariners wrote the Greatest Hits on failing to get on base and, not surprisingly, struggled to win games during those seasons. For years and years, the Mariners tried succeeding with players like Logan Morrison, Michael Morse, and Mark Trumbo, desperately clinging to the home run as the heralded harbinger of scoring runs. Whether this was evidence of a failing regime under general manager Jack Zduriencik remains up for debate, but the front office had seen enough. Around the same time, a wayward GM separated cleanly from the Mariners’ division rival, the Angels, was seeking asylum, armed with his own vision of building a team.

Strategy 1: Get on Base

Jerry Dipoto, presumably having read Moneyball, understood the value of getting baserunners, and how to get players on base.

“Command the strike zone,” Dipoto told Justin Myers and Gee Scott on their ESPN 710 Seattle radio segment. “From the top of the lineup to the bottom, we will command the strike zone.”

Dipoto began addressing the team’s glaring need for baserunners by signing catcher Chris Iannetta, who had played for Dipoto in Anaheim and had posted OBP numbers over .350 in 2011, 2013 and 2014. Dipoto found further help by trading for Adam Lind (.360 OBP in 2015) and signing free agent Norichika Aoki (.353 OBP in 2015, 6.4 K%).

None of these moves were meant to be earth-shattering, but each undoubtedly made the Mariners lineup better. With a solid core of Robinson Cano, Nelson Cruz, and Kyle Seager, Dipoto’s goal was to fill the remaining slots with valuable role players, each of whom is more than capable of getting on base.

Here is a table of several key Mariners offseason additions, with 2015 statistics and 2016 ZiPS projections courtesy of Dan Szymborski. Note that season projections are often more conservative estimates, as they account for a certain level of player regression.

Player            Season          OBP     wOBA    BB%     K%
Chris Iannetta    2015            .293    .281    12.9    26.2
                  2016 (proj.)    .329    .306    14.0    25.8
Adam Lind         2015            .360    .351    11.5    17.5
                  2016 (proj.)    .334    .315    10.1    19.5
Nori Aoki         2015            .353    .326    7.7     6.4
                  2016 (proj.)    .332    .313    7.0     7.8

Strategy 2: Prevent runs, Create runs

Dipoto, addressing the shortcomings of that revolutionary A’s season, also understood the value of defense and speed. “We see ourselves as a run-prevention club. You can create a lot of advantage playing good defense. We also see our overall team defense as our biggest area in need of improvement.”

Dipoto went primarily after well-rounded players, but several moves in particular focused on defense and speed. In November, Dipoto traded closer Tom Wilhelmsen to Texas in exchange for Leonys Martin, a light-hitting center fielder with blazing speed. Martin didn’t quite play enough innings (334) in 2015 to qualify for the CF leaderboard, but his 15.4 Ultimate Zone Rating/150 would have ranked him 5th best among MLB center fielders, just above Lorenzo Cain. Martin, by the FanGraphs arm strength statistic, also had the strongest arm of any center fielder in baseball.

In terms of speed, Martin is as fast as they come. He’s been consistently valuable on the basepaths, posting a 4.3 and 4.2 BRR in 2014 and 2013 respectively (BRR is Baseball Prospectus’s baserunning statistic, where 0 represents an average baserunner). Martin posted a lower total BRR in 2015 (1.5), mostly because his on-base percentage dropped 61 points from 2014, and he appeared at the plate 273 fewer times (generally it’s harder to be a valuable baserunner if you don’t get on base as often).

The second move was to acquire Boog Powell, a young center-field prospect, from Tampa Bay. Powell was part of a larger trade, wherein Seattle received starting pitcher Nate Karns and Powell, and sent Logan Morrison and shortstop Brad Miller to the Rays. We’ll talk about Karns in the last section, but Powell further embodies Dipoto’s vision of commanding the strike zone, getting on base, and playing defense.

Powell’s defensive statistics are less clear than Martin’s, since Powell has never set foot in the major leagues, but he’s consistently graded out in the minor leagues as a plus defender. Powell is 22, and serves as outfield depth should Martin fall down a well in center field.

It’s clear that Dipoto aggressively wanted to improve the outfield defense. In his wild spree of moves, he also made his infield defense better. In trading for Lind, he incrementally made first base a more well-defended position (Lind posted a 3.8 UZR in 2015, compared to Logan Morrison’s -2.9). Brad Miller was a plus defensive shortstop (1.1 UZR, 4.6 dWAR), but with the emergence of talented, young Ketel Marte (1.2 UZR, 2.8 dWAR in 310 fewer innings at SS), Dipoto knew he could afford to trade Miller.

If one looks around at the Mariners in the field, Robinson Cano and Nelson Cruz are currently the only remaining defensive liabilities, and Cruz might not see much right-field time this year. Kyle Seager is a plus defender, Aoki is capable in left, and Seth Smith improved his defense dramatically last season. The team re-signed Franklin Gutierrez (3.4 UZR, 1.9 dWAR) to split right field with Smith and Cruz. At the catcher position, both Iannetta and Mike Zunino are among the 10 best pitch framers in baseball, saving an aggregate 26.8 runs in 2015.

The Mariners were the 5th worst defensive team in 2015, but that looks likely to improve in 2016.

Strategy 3: Taking advantage of Dinger-hitting tendencies

When you play baseball in an extreme pitcher-friendly park, in a sea-level city whose summer nights are cool and humid, home runs are a rare commodity. The Mariners understand they won’t win by hitting home runs, but they also understand that the same difficulty exists for opposing teams. Thus, the Mariners can fill their starting rotation with pitchers with higher than average fly-ball rates. Here are the totals from Mariners starters in 2015. WARP is Baseball Prospectus’s cumulative wins above replacement player statistic.

Name               IP       FB%     GB%     BABIP    WARP
Felix Hernandez    201.2    26.9    56.2    .288     3.3
Taijuan Walker     169.2    39.0    38.6    .291     1.8
Hisashi Iwakuma    129.2    31.1    50.3    .271     2.5
James Paxton       67.0     34.4    48.3    .289     0.0
Roenis Elias       115.1    36.4    44.2    .280     0.9

Normally we’d expect a higher GB rate to correlate with a higher BABIP, since ground balls are more likely than fly balls to find holes and become hits. Felix has the highest GB rate in that table, and still maintained a better-than-average BABIP. That’s because he’s Felix Hernandez, and he’s better than you. Iwakuma, 34, also posted a ground-ball rate of 50%, and he’s never posted a BABIP above .287. After 2000 balls in play, a pitcher’s BABIP will normalize, and Iwakuma is quickly approaching that. Walker has the highest FB rate, so it’s probably good that he pitches where he does.

Before you even get beyond the innings pitched column, however, it’s clear the Mariners were thin on reliable starting pitching depth in 2015. Out of the players above, only Hernandez and Walker eclipsed 130 innings, only those two and Iwakuma provided any sort of positive contribution, and Roenis Elias is now on the Red Sox.  So the offseason began, and Dipoto got to work.

Earlier we mentioned Boog Powell becoming a Mariner, but he came over as a secondary piece in the trade that landed the team starting pitcher Nate Karns from Tampa Bay. Karns had a quasi-breakout season in 2015, posting a 3.67 ERA and 3.90 xFIP in 147.2 innings pitched (xFIP is a Fielding Independent Pitching statistic that takes fly-ball rate into account). This was the first full season for the 27-year-old Karns, who also had a 36.5% fly-ball rate in 2015. Of those fly balls, 12.5% went for home runs, an above-average rate for a starting pitcher. While Tropicana Field is not an especially friendly ballpark for hitters, every other park in the AL East dramatically favors home runs, and Karns’s HR rate was likely hurt by pitching frequently at parks like Yankee Stadium and Camden Yards.

Karns should be aided by the expansive parks of the American League West, where more fly balls will become outs. If Karns matches, or even exceeds, his peripherals in 2016 while maintaining his high fly-ball rate (fly-ball rate normalizes after 70 fly balls, a total Karns exceeded long ago), he should lower both his home-run rate and his BABIP. Karns also has room for regression, as HR/FB doesn’t normalize until after about 500 IP.

There is a question of Karns’s durability, as he has only one major-league season with over 100 innings pitched, but no such question exists with Dipoto’s next trade target. A month after grabbing Karns, Dipoto traded Elias and closer Carson Smith to Boston for Wade Miley, one of the most consistently durable left-handed starters in the game. Smith was a bright spot in a bad Mariners bullpen, so Dipoto had to give up some value to acquire Miley, but the GM took that risk to bolster a shaky rotation. Miley has pitched more than 190 innings in four consecutive seasons: 2015 in Boston, and the previous three in Arizona. All of those seasons featured FIPs below 4, and he improved across many categories in 2015, lowering his home runs per nine innings by .24 despite pitching in the AL East. It’s no stretch of the imagination for Miley to improve even further in 2016, playing in front of an overhauled Mariners defense.

Miley and Karns, 2015 Statistics
Name               IP       FB%     GB%     BABIP    WARP
Nate Karns         147      36.5    41.9    .285     1.6
Wade Miley         193.2    30.5    48.8    .307     2.5

You start to see how exploiting these park advantages becomes mutually beneficial. A speedy outfield defense will turn more of Nate Karns’ fly balls into outs, and a more solid infield defense will help turn Miley’s ground-ball hits into outs as well. On the offensive side, players who don’t strike out will put the ball in play more often, and the increased speed of the lineup will turn more of those balls in play into hits, increasing the number of baserunners. If, with all of these improvements, we still believe in Nelson Cruz’s power, Kyle Seager’s upward trajectory, and continued King Felix domination, we believe in Mariners success.