xHitting (Part 4): 2014 Fantasy Edition!

Welcome to the fourth installment of xHitting!  As always, reader comments and feedback are super encouraged and appreciated.  (Links to parts one, two, and three)

Briefly recapping the method, the gist is to estimate the expected rate of each individual hit type based on a player’s underlying peripherals, and in turn recover all the needed components to compute expected versions of wOBA, OPS, etc.  The only real change to the model since last time is that I now use a “hybrid” predicted home run rate, which is a weighted average of the actual and (raw) predicted home run rates, with the weight given to the actual HR rate increasing with the number of plate appearances.  (This is explained in part three, for those curious.)
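To make the hybrid idea concrete, here is a minimal sketch.  The weighting form PA / (PA + k) and the constant k = 300 are assumptions picked purely for illustration; the real weighting is the one described in part three and may well differ.

```python
def hybrid_hr_rate(actual_rate, predicted_rate, pa, k=300.0):
    """Blend actual and raw predicted HR rates.

    The weight on the actual rate, pa / (pa + k), rises toward 1 as plate
    appearances accumulate. Both the functional form and k are illustrative
    assumptions, not the model's actual weighting.
    """
    if pa < 0:
        raise ValueError("pa must be non-negative")
    weight_actual = pa / (pa + k)
    return weight_actual * actual_rate + (1.0 - weight_actual) * predicted_rate


if __name__ == "__main__":
    # Hypothetical hitter: 4% actual HR rate, 2% raw predicted HR rate.
    for pa in (0, 300, 3000):
        print(pa, round(hybrid_hr_rate(0.04, 0.02, pa), 4))
```

With no plate appearances the blend is just the raw prediction, and it drifts toward the actual rate as the sample grows.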

Perhaps the more exciting change, though, is that this time I actually have results for an ongoing season, which potentially can help for fantasy purposes.  (Not that most readers need my help necessarily.)  Related to fantasy usage, there were a few requests to see a full spreadsheet of past results (2010-2013 seasons), which I have posted here.  Again feel free to take it or leave it at your leisure.

Note: I collected most of these data at the All-Star Break, so numbers may be a few weeks behind, but they’re still mostly true.  Also, for time considerations I only fetched 2014 stats for qualified leaders.  This even leaves out a few big names, but I couldn’t justify time to fetch every player.

So far, I’ve typically posted the biggest “over-” and “under”-achievers for a given season.  And I suppose I’ll continue that tradition today.  But while these lists are useful for highlighting which players seem most likely to regress, they overlook another main use of the model, which is to assess the realness of a player’s apparent “breakout” or “decline,” at least in-sample.  (In some cases, the model may think that a player’s breakout is entirely justified given his peripherals, while in others it may be more skeptical.)  Thus, today I’ll also post a second list, of players who seem to have taken a pronounced step forward or back this season, along with what the model thinks of their season-to-date performance.

Okay, time for results!  I’ll start with the list of “over-” and “underachievers.”

2014 Underachievers and Overachievers (1st half)

| Underachiever | wOBA | xWOBA | Diff | Overachiever | wOBA | xWOBA | Diff |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Jean Segura | 0.256 | 0.305 | -0.049 | Casey McGehee | 0.345 | 0.277 | 0.068 |
| Chris Davis | 0.306 | 0.353 | -0.047 | Yasiel Puig | 0.398 | 0.340 | 0.058 |
| Mark Teixeira | 0.352 | 0.397 | -0.045 | Matt Adams | 0.376 | 0.324 | 0.052 |
| Gerardo Parra | 0.289 | 0.327 | -0.038 | Mike Trout | 0.428 | 0.381 | 0.047 |
| Brian McCann | 0.298 | 0.330 | -0.032 | Marcell Ozuna | 0.343 | 0.300 | 0.043 |
| Torii Hunter | 0.323 | 0.355 | -0.032 | Lonnie Chisenhall | 0.396 | 0.359 | 0.037 |
| Joe Mauer | 0.308 | 0.340 | -0.032 | Scooter Gennett | 0.355 | 0.320 | 0.035 |
| Jimmy Rollins | 0.320 | 0.352 | -0.032 | Marlon Byrd | 0.344 | 0.309 | 0.035 |
| Brian Roberts | 0.304 | 0.334 | -0.030 | Giancarlo Stanton | 0.397 | 0.363 | 0.034 |
| Buster Posey | 0.326 | 0.352 | -0.026 | Hunter Pence | 0.359 | 0.325 | 0.034 |

Having worked with this model for a while now, I notice a general pattern: some players give the model trouble and have a disproportionate tendency to appear on these lists from year to year.  A few of those players appear on this list… more on that later.

Partly for that reason, I wouldn’t necessarily say to “buy low” the guys on the left, nor “sell high” the guys on the right; although you can if you want.  I won’t address every player, but I have some scattered comments:

  • For readers who prefer OPS, .020 wOBA translates to about .050 OPS, on the margin.
  • .397 predicted for Teixeira?  Not sure where that came from…
  • Poor Segura.  All things considered, I think nobody deserves a big second half more than he does.
  • Whatever happened to Casey McGehee’s power?  The guy once hit 23 home runs in a season, but now has an ISO of .073, with surprisingly low fly ball distance.
  • Although Chisenhall’s breakout is not as impressive if you take out what the model thinks is luck, it’s still a pretty impressive improvement.
  • Chris Davis is sort of the reverse of Chisenhall.  Adding back in what the model thinks has been bad luck, he’s still way down from what he did last year, but not nearly as disappointing as he probably has been to many owners thus far.
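For anyone who wants to redo the arithmetic behind these comments, here is a small sketch that computes the wOBA gap and its rough OPS equivalent.  The 2.5 multiplier is just the rule of thumb stated above (.020 wOBA to about .050 OPS, on the margin); the function names are mine, not part of the model.

```python
# Rule of thumb from the comments: .020 wOBA is about .050 OPS on the margin.
WOBA_TO_OPS = 2.5


def woba_gap(woba, xwoba):
    """Actual minus expected wOBA (negative means underachieving)."""
    return woba - xwoba


def approx_ops_gap(woba_diff):
    """Very rough OPS equivalent of a wOBA gap."""
    return woba_diff * WOBA_TO_OPS


if __name__ == "__main__":
    # First-half figures taken from the table above.
    for name, woba, xwoba in [("Jean Segura", 0.256, 0.305),
                              ("Casey McGehee", 0.345, 0.277)]:
        gap = woba_gap(woba, xwoba)
        print(f"{name}: wOBA gap {gap:+.3f}, approx OPS gap {approx_ops_gap(gap):+.3f}")
```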

As mentioned, certain players do seem to be able to over/underperform the model somewhat consistently; the same way we think some pitchers are usually better or worse than their FIP.  With now 4.5 years of data to work with, however, I think I can make educated guesses about which players systematically deviate from the model predictions.  I’ll term this deviation the “player fixed effect.”

(Requiring at least 1000 PA from 2010 through 2014 first half)

| Model loves too much | Player FE estimate (wOBA) | Model loves too little | Player FE estimate (wOBA) |
| --- | --- | --- | --- |
| Brian Roberts | -0.033 | Wilson Betemit | 0.032 |
| Todd Helton | -0.026 | Brandon Moss | 0.032 |
| Jean Segura | -0.026 | Ryan Sweeney | 0.028 |
| Jose Lopez | -0.025 | Mike Trout | 0.027 |
| Mark Teixeira | -0.025 | Peter Bourjos | 0.026 |
| Russell Martin | -0.024 | Matt Carpenter | 0.025 |
| Darwin Barney | -0.023 | Brandon Belt | 0.025 |
| Chris Getz | -0.023 | Melky Cabrera | 0.025 |
| Jimmy Rollins | -0.021 | Carlos Ruiz | 0.024 |
| Jason Bay | -0.020 | Chris Johnson | 0.024 |

Comments:

  • Again, .020 wOBA is equivalent to about .050 OPS, on the margin.
  • Taking out their apparent fixed effects, Teixeira is only underperforming his xWOBA by about .020, and Brian Roberts is actually doing about par.
  • On the reverse side, Mike Trout’s “adjusted” xWOBA jumps up to .408, and it probably doesn’t surprise us that he’s outperforming even that, since he’s Mike Trout.  And although Giancarlo Stanton misses the Top 10 cutoff above, his apparent fixed effect of +.022 would be 11th; so his “adjusted” xWOBA is more like .385.
  • Yasiel Puig (.058) would also be on the list of “positive fixed effects” if we relaxed the PA requirement (he has 826 during this time).  And Matt Adams (~.040) might also be well on his way to that list; although he has fewer plate appearances still than Puig.
  • I don’t really have good explanations/know any common themes for players with negative fixed effects.  Maybe readers can help?
  • For Trout, home runs are pretty clearly the area where the model underestimates him.  In any given season (2010-2014), he hits about twice as many HR as the model thinks he should in the “raw” prediction.
  • And Trout’s not the only “HR rate defier,” either; just the most salient.  In general, the model has never done as well with home runs as it does with singles, doubles, and triples.  It seems there are other important determinants of home run hitting that really should be in the model, but currently are not.  Intuitively, I sort of would like velocity and angle of the ball off the bat, but so far have not found a good data source to actually include these.  (Maybe that will change in the coming years as MLBAM releases “Hit F/X” style data?)  Until then, reader suggestions are also super welcome here.
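The “adjusted” numbers in these comments are just xWOBA plus the player’s estimated fixed effect.  Here is a short sketch that reproduces them from the tables above; the function and variable names are mine and only illustrative.

```python
def adjusted_xwoba(xwoba, fixed_effect):
    """xWOBA shifted by the player's estimated fixed effect."""
    return xwoba + fixed_effect


def remaining_gap(woba, xwoba, fixed_effect):
    """Actual wOBA minus fixed-effect-adjusted xWOBA."""
    return woba - adjusted_xwoba(xwoba, fixed_effect)


# (2014 wOBA, 2014 xWOBA, player FE estimate), all taken from the text above.
PLAYERS = {
    "Mike Trout": (0.428, 0.381, 0.027),
    "Giancarlo Stanton": (0.397, 0.363, 0.022),
    "Mark Teixeira": (0.352, 0.397, -0.025),
    "Brian Roberts": (0.304, 0.334, -0.033),
}

if __name__ == "__main__":
    for name, (woba, xwoba, fe) in PLAYERS.items():
        print(f"{name}: adjusted xWOBA {adjusted_xwoba(xwoba, fe):.3f}, "
              f"remaining gap {remaining_gap(woba, xwoba, fe):+.3f}")
```

After the adjustment, Teixeira’s gap shrinks to about -.020 and Roberts’ to roughly zero, which matches the comments.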

And now, finally, for the other usage: here’s a partial list of players who have taken either a pronounced step forward or back this season, relative to established norms.

2014 “Decliners” and 2014 “Improvers”

| Decliner | Career wOBA | 2014 wOBA | 2014 xWOBA | Improver | Career wOBA | 2014 wOBA | 2014 xWOBA |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Nick Swisher | 0.352 | 0.285 | 0.305 | Michael Brantley | 0.324 | 0.394 | 0.404 |
| Joe Mauer | 0.373 | 0.308 | 0.340 | Lonnie Chisenhall | 0.328 | 0.396 | 0.359 |
| Allen Craig | 0.350 | 0.289 | 0.309 | Seth Smith* | 0.334 | 0.389 | 0.356 |
| Billy Butler | 0.352 | 0.300 | 0.309 | Victor Martinez | 0.362 | 0.416 | 0.422 |
| Evan Longoria | 0.365 | 0.315 | 0.323 | Jonathan Lucroy | 0.342 | 0.383 | 0.354 |
| Domonic Brown | 0.315 | 0.267 | 0.267 | Anthony Rizzo | 0.342 | 0.382 | 0.382 |
| Chris Davis | 0.351 | 0.306 | 0.353 | Nelson Cruz | 0.356 | 0.393 | 0.380 |
| Matt Holliday* | 0.385 | 0.342 | 0.318 | Jose Altuve | 0.319 | 0.356 | 0.325 |
| Jean Segura | 0.299 | 0.256 | 0.305 | Brian Dozier | 0.311 | 0.344 | 0.362 |
| David Wright | 0.377 | 0.335 | 0.305 | Kyle Seager | 0.334 | 0.367 | 0.344 |
| Buster Posey | 0.366 | 0.326 | 0.352 | Dee Gordon | 0.297 | 0.329 | 0.318 |
| Shin-Soo Choo | 0.369 | 0.333 | 0.346 | Alcides Escobar | 0.284 | 0.312 | 0.300 |
| Dustin Pedroia | 0.356 | 0.325 | 0.337 | Casey McGehee | 0.321 | 0.345 | 0.277 |
| Jed Lowrie | 0.327 | 0.297 | 0.305 | | | | |
| Jay Bruce | 0.343 | 0.315 | 0.326 | | | | |

* – To avoid inflation from Coors Field, for these players I’ve taken the total from 2011-13 seasons only

Comments:

  • At least in-sample, Brantley’s breakout seems to be pretty much entirely justified.  Of course this doesn’t mean that he won’t regress somewhat, but if I were to guess, I’m a little more optimistic than ZiPS and Steamer (which currently project .341 and .333 RoS, respectively).  Similar deal for some others.
  • “Yikes” for Billy Butler and Domonic Brown, whose declines this season seem (at least in-sample) to be entirely justified.
  • I’m not sure why the model dislikes Casey McGehee so much.  Obviously his fly ball distance (mentioned earlier) isn’t doing him any favors, and his .369 first-half BABIP is probably unsustainable.  Still, .277 xWOBA?  Seems harsh.

As with any fantasy advice, don’t take any of this too literally…  Take it or leave it as you see fit.

Lastly, although I hyped this piece from a fantasy perspective, the overall goal remains that I would love to see more work done to de-luck hitter stats, the way people do so often for pitchers.  (FIP for pitchers, and xWOBA or xWRC+ for hitters! Is the dream.)

Reader thoughts on how to improve the model, or requests for players not already mentioned?


Projecting the Mariners

At the time of writing, the Seattle Mariners are 54-50 and 0.5 games back in the race for the second wild card spot. With rumors flying as to the upcoming trade deadline, the benefit of selling prospects to improve their odds varies heavily based on how you view their current roster. Dave Cameron rang the warning bell for all of the second wild card contenders, pointing out that these teams are vying for a one game playoff in Anaheim. Not such a great prize.

But. The Mariners haven’t made the playoffs in 13 years and the prospect of a wild card spot has made this the most exciting campaign in a decade. So how good are the Mariners? At 54-50, the Mariners have won at a clip of .519, but FanGraphs’ projection models paint a very different picture of the team. FanGraphs’ playoff odds page believes the Mariners are a .503 team going forward. Not so hot.

But. The Mariners’ season-to-date-stats projection from FanGraphs is a lot more favorable, suggesting the true talent level of the M’s, based on CoolStandings’s version of Pythagenpat adjusted for remaining schedule, is .557. To put that in context, it’s the 4th best in baseball, behind the third-place Angels (.572) and ahead of the 5th-place Dodgers (.549). By this measure, the Mariners are the 3rd best team in both the AL West and the American League.

The CoolStandings model, like that of Baseball Prospectus, relies on base runs and a modified Bill James Pythagorean W-L. As best I understand it, this means that the Mariners, by base runs, have been ‘unlucky’ to the tune of the gap between their projected rest of season W% and their current W%, for a loss of 0.038 W%. 3.8% of the M’s 104 games already played comes out to 4 wins, so the Mariners’ ‘true talent level’ record would be 58-46. That’s still a good margin worse than the Angels, but it’s also well above all the other wild card contenders. By this model, the M’s would be expected to win 86.1 games this year, beating out the next-best Blue Jays by almost 2 wins.
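The arithmetic in the previous paragraph can be checked with a few lines. This is only a back-of-the-envelope sketch: it applies a flat rest-of-season W% to the remaining games of a 162-game season, which is my simplification and not how CoolStandings or FanGraphs actually produce their figures. That is presumably why it lands near, rather than exactly on, the 86.1 quoted above.

```python
def luck_in_wins(talent_wpct, actual_wpct, games_played):
    """Wins 'owed' to a team if talent W% exceeds actual W% to date."""
    return (talent_wpct - actual_wpct) * games_played


def projected_final_wins(current_wins, games_played, ros_wpct, season_games=162):
    """Current wins plus a flat rest-of-season W% over the remaining games."""
    return current_wins + (season_games - games_played) * ros_wpct


if __name__ == "__main__":
    # Figures from the text: 54-50 record, .519 actual, .557 and .503 projections.
    print(round(luck_in_wins(0.557, 0.519, 104), 1))
    print(round(projected_final_wins(54, 104, 0.557), 1))
    print(round(projected_final_wins(54, 104, 0.503), 1))
```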

Conversely, by the FanGraphs ZiPS-Steamer projection system, the M’s should win 83 games and come a game or so short of the Blue Jays. By these numbers, the M’s still finish as the best team not to make the playoffs.

There is a huge gap between these projections, coming out to about 8 wins over a full season and 5 over the roughly two-thirds of the season already played. The rest of this article is an amateur attempt to account for that gap, and to assess how good the Mariners might actually be.

The CoolStandings model relies on past performance and therefore absorbs some mathematical ‘luck.’ For example, the Mariners’ runs allowed this season has almost certainly been helped by a league-high strand rate. For that reason, the figures are probably overly optimistic about the M’s chances.

But that isn’t to say the M’s look like a .503 ballclub either. To account for the difference between the M’s .519 W% and their .503 ZiPS-Steamer talent level, I looked at the ZiPS projections for the rest of the season; because this ignores Steamer, it is only a rough guess at how the projection could be too pessimistic. A gap of .016 in W% over 104 games is about 2 wins, so that’s the extent of the disagreement between the actual record and the projections to date. For the remainder of the season, that .016 is worth about one win.

For example, Michael Saunders has been worth 1.7 fWAR this year in 219 PAs, a pace of approximately 4.3 fWAR over 550 PAs, but ZiPS says he’ll be worth 0.7 fWAR over 161 PAs the rest of this year, a clip of 2.39 fWAR/550. If we say the M’s have played two-thirds of their season already, that says that Saunders should’ve been worth 1.6 fWAR this year, and has overperformed to the tune of .1 of the M’s wins.

Looking a little deeper, we can see that ZiPS doesn’t punish Saunders’ playing-time projection very much despite his injury trouble, in that it sees him basically playing full time the rest of the way. If we take the ZiPS projection and put it over his 219 PAs of 2014 service time, the picture is a little clearer: ZiPS says Saunders should’ve been worth 0.95 fWAR this year. So Saunders has already given the M’s almost a full win more than he should’ve, says ZiPS.
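The Saunders numbers all come from scaling a WAR-per-PA rate to a different number of plate appearances. Here is a quick sketch of that scaling; the helper name is mine, and a constant per-PA rate is a simplifying assumption.

```python
def scale_war(war, pa, target_pa):
    """Scale a WAR total earned over `pa` plate appearances to `target_pa`."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return war / pa * target_pa


if __name__ == "__main__":
    actual_pace = scale_war(1.7, 219, 550)    # season-to-date pace per 550 PA
    zips_pace = scale_war(0.7, 161, 550)      # ZiPS rest-of-season pace per 550 PA
    zips_to_date = scale_war(0.7, 161, 219)   # ZiPS rate over his actual 219 PA
    print(round(actual_pace, 2), round(zips_pace, 2), round(zips_to_date, 2))
    print("over-performance:", round(1.7 - zips_to_date, 2))
```

On these inputs the gap between Saunders’ 1.7 fWAR and the ZiPS-rate figure over the same 219 PAs comes to roughly three-quarters of a win.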

Other contributors to the M’s supposed over-performance include Felix and Ackley.

Ackley’s defense has been fine so far, per UZR, but ZiPS says it should be bad. So despite underperforming at the plate by 9% per wRC+, ZiPS says Ackley’s glove has already given the M’s a full half-win more than can be expected.

A topic of some discussion, Ackley’s fielding is hard to assess. To the naked eye he’s looked ok, not terrible, and UZR seems to agree. There was his spectacular catch the other day, and there’s also his mediocre arm, which possibly has been tested at a below-average rate, but this is all speculation. But if we trust Ackley’s to-date figures and performance, we can give the Mariners a third of a win back over their remaining schedule.

And then there’s Felix, who has already contributed a remarkable 5.5 fWAR. ZiPS expects him to regress quite a bit. If Felix continues pitching at his current level, the M’s again would be expected to win about half a game more than ZiPS suggests. Felix has both reached new highs with his changeup and benefited quite a bit from Zunino’s pitch framing, as have the M’s as a whole, and Zunino’s framing is accounted for neither by the projection systems nor by his season-to-date fWAR.

Then there’s Chris Young. Young has been worth 0.6 fWAR and 2.7 rWAR. Young alone can account for the gap between FanGraphs’ and ZiPS’ perception of the Mariners and their current performance. A lot has been written about Young’s season, and going back to 2009, the last season ZiPS looks at for its projections, Chris Young’s 2014 is the single greatest overperformance of ERA against FIP. Young probably hasn’t been as good as his ERA nor nearly as bad as his FIP, but I can’t speculate at his true talent.

Between Young, Zunino’s framing, Ackley’s defense, Saunders’ somewhat expected improvement, and Felix’s dominance, the Mariners seem like an especially tough team to project. For a counterexample, the Blue Jays are expected to win 84.3 games by ZiPS-Steamer and 84.9 by Cool Standings, versus the Mariners’ 3-game swing between the projections. Clearly ZiPS-Steamer is the more reliable model and clearly it’s missing a significant piece of the picture. But were I a betting man, I’d certainly bet the M’s finish better than 83 wins.

What these numbers suggest is that by both models the Mariners are close enough to be competitive for the wild card, and that acquiring marginal talents like Marlon Byrd or a DHing Matt Kemp (not going to happen) could have a real impact on the team’s chances. By both models, going all in for Ben Zobrist at SS and a right-handed OF might not be such a bad idea, nor would it be so ludicrous to pursue David Price. That said, a bad trade is a bad trade no matter the context of standings, and the M’s suffer from an overpay in any event.

What the holes in the ZiPS projections say, however, is that maybe the M’s recent slide isn’t especially important, and that while this probably isn’t the 4th best squad in baseball, it’s still probably a good team, and a team to be excited about. Because even if the M’s fail to make another roster move, they should be a competitor.


Looking at Attendance after Aces are Dealt

As baseball season and the summer months heat up, so too do the trade rumors. Almost every year, baseball media and fans postulate and prognosticate who might be traded before the annual trading deadline.

This year, the big fish on the market is Rays left-hander David Price. With only one year left on his contract, it is unlikely the Rays can afford to keep the former Cy Young Award Winner. But with the team winning eight in a row and 19 of their last 24, trading their ace doesn’t seem like a sure deal anymore. Most recent reports say the Rays management will wait until the absolute last minute to make a decision on if, where, and for whom the popular lefty will be traded.

Given the Rays’ standing in terms of popularity and market size, some of the talk about trading David Price has turned to attendance. The Rays are currently last in the Major Leagues in attendance, and some are concerned attendance could drop even lower if they traded their best pitcher. There are those who think Rays fans would consider the trade a message from ownership to wait until next year. And if that’s the message, why not wait until next year to buy a ticket?

To estimate how Rays attendance might react to a possible trade of David Price, I looked at 12 prior trades of ace pitchers over the last 37 years. Via Baseball-Reference.com, I looked at attendance before and after each trade. I also looked at winning percentage before and after.

My goal is to see if two maxims hold true:

  1. Attendance goes up when teams win and goes down when teams lose.
  2. A team that trades its best pitcher will have a worse record after the trade.

Hence, if attendance is attached to winning and ace pitchers are attached to winning, attendance should drop after ace pitchers are traded.

Is this really the case? Or is attendance in some cities more sensitive to major trades than others?

Let’s begin by looking at the granddaddy of superstar pitcher trades: the Tom Seaver trade. On June 15, 1977, after a slight tiff with ownership, the Mets shipped the franchise’s first ace to the Reds for Steve Henderson, Doug Flynn, Pat Zachry, and Dan Norman. The Mets were bad before but worse after and attendance followed suit.

Twelve years later, in 1989, two aces were traded during the season. On May 25th, the Mariners moved ace Mark Langston to the Expos for a bevy of prospects headlined by future ace Randy Johnson. Mariners fans reduced their attendance by nearly the same amount Mets fans did in 1977. Although the Mariners were playing .500 baseball prior to the trade, their winning percentage dropped significantly after it.

Two months after the Langston trade, the Minnesota Twins traded 1988 Cy Young Award winner Frank Viola to the Mets for Rick Aguilera, Kevin Tapani, and three other pitchers. The Twins were two games under .500 at the time of the trade, and then played .500 after the trade. Despite their slight improvement, attendance dropped 12.95% after the Viola trade.

We fast-forward to 1998 and another Mariners trade. During the 1998 season, the Mariners dealt the aforementioned Johnson to the Astros for Freddy Garcia, Carlos Guillen, and John Halama. While Johnson immediately did well in Houston, the Mariners played better after his departure, going 28-25 after the trade. Like the 1989 Twins, however, the positive play did not lead to an increase in attendance, as the average per game attendance went down after the trade.

Our next trade is the Bartolo Colon trade in 2002. On June 27, 2002, the Indians shipped Colon and Tim Drew to the Expos for Cliff Lee, Grady Sizemore, Brandon Phillips, and Lee Stevens. The Indians played .467 baseball before the trade and a lesser .447 clip following the deal. Attendance, however, jumped after the trade, up 10.04% over the team’s final 45 games.

We look at Cleveland again in 2008, when the Indians moved CC Sabathia to the Milwaukee Brewers for Michael Brantley, Matt LaPorta, and three other players. After trading Sabathia, the Indians vastly improved their record, finishing the season 44-30. Attendance also went up after the Sabathia trade, from 25,964 to 27,766 per game, an increase of 6.94%.

The 2009 season saw the trade of three high profile pitchers. Two were legitimate aces, and the other was a former ace whose trade might give us insight into a Rays attendance prediction.

The first major pitcher trade in 2009 again involved the Indians. On July 29th, the Tribe shipped Cliff Lee and Ben Francisco to Philadelphia for Jason Knapp, Carlos Carrasco, Jason Donald and Lou Marson. Unlike the Colon or Sabathia trades, following the Lee trade, the Indians’ winning percentage and attendance per game both decreased.

Two days after the Indians traded Lee, the San Diego Padres moved right-hander Jake Peavy to the Chicago White Sox for Clayton Richard and three other players. Like the Twins in 1989 and the Mariners in 1998, the Padres played better after moving their ace, finishing the remaining 59 games with a 34-25 record. Unfortunately, also like the ’89 Twins and ’98 Mariners, fewer fans came out to see their now-winning team.

Our final pitcher trade of 2009 occurred on August 29th, when the Rays moved former ace Scott Kazmir to the Angels for Sean Rodriguez, Alex Torres, and Matthew Sweeney. Kazmir was no longer the Rays’ ace in 2009, having handed over the title to James Shields and the up-and-coming David Price. But Kazmir still had name value in the Tampa Bay area, despite his decreased effectiveness.

After trading Kazmir, the Rays stumbled to a 15-20 finish. They went from being 4.5 games out of the wildcard to finishing 11 games out of the playoffs. Per game attendance following the Kazmir trade also dropped considerably, from 24,169 per game to 19,574 per game. This attendance decrease of 19.01% is the biggest drop of any of our surveyed trades.

The next year, two of our most frequent subjects collided when the Mariners traded Cliff Lee. After signing with Seattle in the offseason, Lee was sent to the Rangers for the stretch run. After the trade, the Mariners, who had played .400 baseball prior to trading Lee, finished the season with a .350 winning percentage and saw attendance drop 4.99% over the last 39 home games.

In 2012, the Brewers were on the dealing side when they sent Zack Greinke to the Angels for Jean Segura and two other players. While the Brewers were 10 games under .500 before the trade, they reversed fortune after the deal, going 39-25, a .609 clip. Attendance also increased after moving Greinke, albeit by 124 fans per game, or only 0.3%.

In our final trade, we look at the Chicago Cubs. Prior to trading Matt Garza on July 22, 2013, the Cubs were 10 games under .500 and averaging exactly 33,000 fans per game. After trading Garza, the Cubs dropped to 30 games under .500 and lost 919 fans per game in the seats, a 2.78% decrease.
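The attendance changes quoted throughout are simple before-and-after percentage changes in per-game attendance. Here is a minimal sketch that reproduces the three cases where both figures are given in the text; the function name is mine.

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Per-game attendance (before, after) as quoted in the text.
TRADES = {
    "Sabathia trade (2008)": (25964, 27766),
    "Kazmir trade (2009)": (24169, 19574),
    "Garza trade (2013)": (33000, 33000 - 919),
}

if __name__ == "__main__":
    for trade, (before, after) in TRADES.items():
        print(f"{trade}: {pct_change(before, after):+.2f}%")
```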

There are many other trades and fanbases I could have looked at (the Ubaldo Jimenez trade in 2011 comes to mind), but this small sample set gives a wide spectrum of possible outcomes resulting from trading an ace pitcher. From what we looked at, we found:

  • 50% of the data set (6 of 12) decreased in both record and attendance
  • 25% (3 of 12) increased in record and decreased in attendance
  • 17% (2 of 12) increased in both record and attendance after trading their ace
  • 8% (1 of 12) decreased in record but increased in attendance
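The breakdown above can be rebuilt by tallying the twelve trades. In the sketch below, the True/False flags are my reading of each trade’s write-up earlier in the article (record up or down, attendance up or down), so treat them as a transcription rather than fresh data.

```python
from collections import Counter

# (trade, record improved after trade?, attendance improved after trade?)
TRADES = [
    ("Seaver, 1977 Mets", False, False),
    ("Langston, 1989 Mariners", False, False),
    ("Viola, 1989 Twins", True, False),
    ("Johnson, 1998 Mariners", True, False),
    ("Colon, 2002 Indians", False, True),
    ("Sabathia, 2008 Indians", True, True),
    ("Lee, 2009 Indians", False, False),
    ("Peavy, 2009 Padres", True, False),
    ("Kazmir, 2009 Rays", False, False),
    ("Lee, 2010 Mariners", False, False),
    ("Greinke, 2012 Brewers", True, True),
    ("Garza, 2013 Cubs", False, False),
]


def tally(trades):
    """Count trades by (record improved, attendance improved)."""
    return Counter((record_up, attendance_up) for _, record_up, attendance_up in trades)


def shares(trades):
    """Share of trades in each outcome bucket, in percent."""
    total = len(trades)
    return {outcome: 100.0 * n / total for outcome, n in tally(trades).items()}


if __name__ == "__main__":
    for (record_up, attendance_up), share in sorted(shares(TRADES).items()):
        print(f"record up={record_up}, attendance up={attendance_up}: {share:.1f}%")
```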

The Indians are particularly interesting, seeing a different outcome each time they traded an ace. The Mariners saw an attendance drop after both the Langston and Johnson trades but played better after trading Johnson and worse after moving Langston. Perhaps Langston had a bigger effect on the team in 1989 than Johnson did in 1998.

So what would happen if the Rays traded David Price? Given their current winning streak and the attendance sensitivity seen after the Kazmir trade, my initial estimate would have them in the same category as the 1989 Twins, 2009 Padres, and 1998 Mariners: an improved winning percentage but lower attendance. A better record post-trade might not be difficult considering the beginning of the Rays season was a disaster marred by injured players who are slowly returning (Alex Cobb, Jeremy Hellickson, David DeJesus, and possibly Wil Myers).

But with the Rays struggling to fill seats, moving fan favorite David Price might be a bad public relations move. From the studies I have done, games David Price has pitched in have drawn 6% more than average. That could be because Joe Maddon sometimes aligns the rotation so Price faces prime opponents such as the Yankees and Red Sox, teams that traditionally draw well at Tropicana Field. But some of Price’s “bump” could be the allure of seeing one of the best pitchers in the American League.

My estimate is the Rays would suffer an initial attendance drop if they traded David Price. Games against the Red Sox and Yankees (especially Jeter’s last series in Tampa Bay) will continue to do well. Bobbleheads and other promotions will also do well (expect a good turnout for the Don Zimmer sno-globe). And if the team plays well enough to contend, attendance may recover, but even then, the Rays won’t average over 20,000 per game.

Then again, doubtful they would draw 20K on average even with David Price in the rotation.


Using Triple-A Stats to Predict Future Performance

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there. This hypothesis may be less true for players at the Triple-A level since such a high proportion of these players make it to the majors, but I still think it provides some insight. To address this issue, in the future I plan to engineer an alternative methodology that takes into account how a player performs in the majors, rather than his just getting there.

For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not a player was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. And walk rate, while not predictive for players in A-ball, added a little bit to the model for Double-A hitters. Today, I’ll look into what KATOH has to say about players in Triple-A leagues. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted relative to his league’s average for that year. I also only considered what happened during or after the sample season. So if a former big leaguer spends the full season in Triple-A, he’s only considered to have “made it to the majors” if he resurfaces again. For those interested, here’s the R output based on all players with at least 400 plate appearances in a season in Triple-A from 1995-2011.

[Figure: R output for the Triple-A model]

This output looks pretty similar to what we saw for Double-A hitters, including the “I(Age^2)” coefficient, which adds a bit of nuance into how a player’s age can predict his future success. But in this version, there’s also an interaction between ISO and age. Basically, this says that the ability to hit for power is much more important for older players than younger players at the Triple-A level.
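For readers who haven’t worked with a probit before, here is a minimal sketch of how a model of this shape turns inputs into a probability. Every coefficient below is HYPOTHETICAL and made up for illustration (the real ones are in the R output); the inputs are assumed to be league-adjusted ratios where 1.0 means league average. The point is only the structure: a linear score with an Age² term and an ISO × Age interaction, passed through the standard normal CDF.

```python
import math

# HYPOTHETICAL coefficients, for illustration only; not KATOH's fitted values.
COEF = {
    "intercept": 12.0,
    "age": -0.6,
    "age_sq": 0.008,
    "iso": -2.0,
    "iso_x_age": 0.12,
    "k_rate": -1.0,
    "bb_rate": 0.3,
    "babip": 1.0,
    "top100": 0.8,
}


def normal_cdf(z):
    """Standard normal CDF, which is the probit link's inverse."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_score(age, iso, k_rate, bb_rate, babip, top100):
    """Linear predictor with an Age^2 term and an ISO x Age interaction."""
    return (COEF["intercept"]
            + COEF["age"] * age
            + COEF["age_sq"] * age ** 2
            + COEF["iso"] * iso
            + COEF["iso_x_age"] * iso * age
            + COEF["k_rate"] * k_rate
            + COEF["bb_rate"] * bb_rate
            + COEF["babip"] * babip
            + COEF["top100"] * (1.0 if top100 else 0.0))


def mlb_probability(age, iso, k_rate, bb_rate, babip, top100):
    """Probit probability of reaching the majors under the toy coefficients."""
    return normal_cdf(linear_score(age, iso, k_rate, bb_rate, babip, top100))


def iso_slope(age):
    """Effect of one unit of ISO on the linear score at a given age."""
    return COEF["iso"] + COEF["iso_x_age"] * age


if __name__ == "__main__":
    print(mlb_probability(age=22, iso=1.2, k_rate=1.0, bb_rate=1.0, babip=1.0, top100=True))
    print(iso_slope(21), iso_slope(27))
```

With a positive interaction coefficient, the slope on ISO grows with age, which is the sense in which power “matters more” for older Triple-A hitters.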

[Figure: R plot]

By clicking here, you can see what KATOH spits out for all players who logged at least 250 PA’s in Triple-A as of July 7th. I also included a few interesting players who missed the 250 PA cutoff, including Mookie Betts, Rob Refsnyder, Ramon Flores, and Kris Bryant. Here’s an excerpt of the top players from Triple-A this year. Joc Pederson tops the charts with an impressive 99.91% probability. Many of these players have already played in the majors, so these values can be interpreted as the odds that said player will play in the majors in the future.

| Player | Organization | Age | MLB Probability |
| --- | --- | --- | --- |
| Joc Pederson | LAD | 22 | 100% |
| Gregory Polanco | PIT | 22 | 100% |
| Kris Bryant | CHC | 22 | 100% |
| Mookie Betts | BOS | 21 | 100% |
| Arismendy Alcantara | CHC | 22 | 100% |
| Oscar Taveras | STL | 22 | 99% |
| Stephen Piscotty | STL | 23 | 98% |
| Steven Souza | WSN | 25 | 98% |
| Javier Baez | CHC | 21 | 98% |
| Maikel Franco | PHI | 21 | 97% |
| Taylor Lindsey | LAA | 22 | 97% |
| Domingo Santana | HOU | 21 | 97% |
| Enrique Hernandez | HOU | 22 | 96% |
| Chris Taylor | SEA | 23 | 95% |
| Jake Marisnick | MIA | 23 | 95% |
| Mikie Mahtook | TBR | 24 | 94% |
| Rob Refsnyder | NYY | 23 | 94% |
| Alfredo Marte | ARI | 25 | 93% |
| Carlos Sanchez | CHW | 22 | 93% |
| Nick Franklin | SEA | 23 | 93% |
| Ramon Flores | NYY | 22 | 92% |
| Ronald Torreyes | HOU | 21 | 92% |
| Joe Panik | SFG | 23 | 91% |
| Tyler Saladino | CHW | 24 | 91% |
| Giovanny Urshela | CLE | 22 | 90% |

Now that I’ve gone through all levels of full-season ball, I’ll start at the bottom and cycle through the short-season leagues. These samples will be pretty small, but perhaps not completely useless now that those players have a few weeks’ worth of games under their belts. At the very least, it will be interesting to see what KATOH’s able to tell us about batters so far away from the big leagues, even if it’s a little premature to ask KATOH about 2014’s players.


World’s Best Pitcher Faces World’s Best Hitter

If you’re an East Coaster like myself and you stayed up late enough to watch the Angels-Mariners game on Saturday night, you were in for a treat. The game, based on the pitching matchup alone, would have been an exciting one to watch: Garrett Richards versus Felix Hernandez. There’s a fair argument that Hernandez is the best pitcher in baseball right now. The same could be said about Clayton Kershaw, but if you’re talking about pitchers who have been healthy for the whole season, then King Felix is your guy.

Richards has been no slouch either, turning himself into the bona fide ace in Anaheim, pitching to an ERA/FIP/xFIP line of 2.55/2.69/3.21, which is good for a WAR of 3.2. This game previewed as more of a pitcher’s duel than a slugfest.

However, this game didn’t just feature the game’s best pitcher.  It also featured arguably the game’s best hitter: Mike Trout. So, how does the world’s best pitcher pitch to the world’s best hitter?

First Pitch

[Image: first pitch to Trout]

This is in the fourth inning, and it’s Trout’s second time facing Hernandez.  Hernandez starts out with a fastball. Sucre sets up for a pitch low and away, and Hernandez misses with a 92 MPH fastball up and in. A slight mistake, but definitely not a costly one. The count is 1-0.

Second Pitch

Like any great pitcher, Hernandez fixes the mistake he made on his first pitch. Sucre sets up for the same location, and Hernandez throws a 92 MPH fastball low and away. Hernandez nails it right on the money. The pitch was well executed, and Trout even thought about going around on it. Trout doesn’t go around, but that doesn’t matter because the pitch is a called strike. Hernandez evens up the count at 1-1.

Third Pitch


Clearly, the Mariners had a plan: pitch Trout low and away. This is the third time that Sucre set up in that location; this time, however, Hernandez decided to flash his signature changeup. This hard changeup probably would have bounced in the dirt had Trout not fouled it off. It seems as though Hernandez plans on going after Trout by first establishing the fastball in a particular location and then attacking with the off-speed stuff. The count is in Hernandez’s favor at 1-2.

Fourth Pitch

Hernandez comes after Trout with another changeup, and Sucre sets up in the same location as the last pitch. Hernandez — like any pitcher — is clearly going for the punch out. Hernandez is trying to execute the same pitch, while hoping for better results. Unfortunately he misses inside for a ball. Trout has worked the count to 2-2.

Fifth Pitch

When you throw a pitch two times in a row, you run the risk of becoming predictable. Hernandez decides to break his streak of throwing changeups and goes for the hard stuff. Sucre sets up in the same location that he always does, but Hernandez floats a 93 MPH fastball up and away for a ball. The count is full. The best thing about a full count is that you know that one of three things will happen: a strikeout, a walk, or contact.

Sixth Pitch

Well, something did happen. Trout made contact, but it was nothing meaningful, as he fouled off a 94 MPH fastball from Hernandez. Nothing is really new from the Hernandez-Sucre side. Sucre sets up in the same location he has been setting up in for the entire at-bat, and Hernandez probably would have hit his spot had Trout not fouled it off. The count is still 3-2.

Seventh Pitch


Hernandez comes back with a 94 MPH fastball, hoping to hit the low and away location that Sucre sets up. However, Hernandez leaves the pitch up just enough to miss the location that Sucre had set up. Trout is one of the best hitters in the game, and the best hitters in the game can take advantage of a pitcher’s slight mistake. Trout almost takes this pitch yard, as it bounces off the wall for a double. This double ended up being nothing significant, as Hernandez managed to work his way out of the inning without Trout scoring.

Crisis averted for Hernandez.

Trout came out on top in that at-bat, getting himself a pretty big hit against Hernandez. Later in the game, however, Hernandez managed to strike Trout out.

Hernandez finally gets Trout on his signature nasty changeup that breaks on the inside part of the plate. If you look at Mike Trout’s heatmap,  you know that going inside on him is risky business. Dave Cameron even wrote a whole article about it. Luckily, Hernandez’s changeup is good enough that he can get away with pitching Trout inside.

One of the best hitters in the game faced off against one of the game’s best pitchers. You could see why these two are the best at what they do. They recognize mistakes, capitalize on those mistakes, and correct their own mistakes. There are a lot of at-bats each year in baseball, and to someone who knows nothing about baseball, this might look ordinary.  However,  if you look closely enough, you can see the intricacies of a particular at-bat. That’s when you start to realize how complex baseball is.


Changes ZiPS Believes In

Mitchel Lichtman’s projection pieces on hitters and pitchers for the rest of the season were discussed quite a lot last month starting with this.  It is hard when you are rooting for a team, and subsequently its players, not to buy in when someone is doing well or poorly.  So let’s look at the heartless projecting system ZiPS to see if it is actually buying into some of the performances of 2014 so far.

To do this I pulled the 2014 pre-season ZiPS wOBA projections and compared them to the ZiPS rest-of-season (RoS) projections. If you take the RoS wOBA minus what ZiPS was expecting prior to 2014, you should be able to see which players are now expected to hit significantly better or worse the rest of the way. Here are the top/bottom-five players:

[Image: top five and bottom five players by change in ZiPS wOBA projection, pre-season to RoS]
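The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the article: the player names and wOBA values are made up, and `woba_changes` is a hypothetical helper. It computes the nominal change (RoS minus pre-season) and the percent change relative to the pre-season projection, then applies the 5% cutoff.

```python
def woba_changes(projections):
    """Return (name, nominal change, percent change), biggest gain first.

    `projections` maps a player name to (pre-season wOBA, RoS wOBA).
    """
    rows = []
    for name, (pre, ros) in projections.items():
        delta = ros - pre
        rows.append((name, round(delta, 3), round(100 * delta / pre, 1)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical projections, for illustration only.
sample = {
    "Hitter A": (0.310, 0.335),
    "Hitter B": (0.340, 0.342),
    "Hitter C": (0.330, 0.305),
}

changes = woba_changes(sample)

# Players whose RoS projection moved at least 5% in either direction.
risers = [row for row in changes if row[2] >= 5.0]
fallers = [row for row in changes if row[2] <= -5.0]

for name, delta, pct in changes:
    print(f"{name}: {delta:+.3f} wOBA ({pct:+.1f}%)")
```

Sorting on the nominal change and filtering on the percent change keeps the two views separate, which matters because a 15-point gain means more for a .300 hitter than for a .380 hitter.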

The bottom five, with the exception of Colvin, have been very disappointing, and their respective teams would love even the RoS numbers at this point. The projection still believes Brown can be an above-average offensive player despite his putrid play to this point of 2014, but it is starting to look like Raburn’s age might be catching up to him and Gyorko’s rookie year might have been a mirage. Schierholtz makes less sense, but he has been so bad that ZiPS can’t ignore it, and he was never a great player to begin with.

Other names of note that are projected to finish the year worse may not be surprising. Raul Ibanez looks done by both the eye test and the statistics, Jean Segura’s lack of plate discipline has really caught up to him, and Brian McCann may not be aging particularly well despite being a lefty with power in the Yankees’ home park.

There are a lot of players on the positive side, and you can see that the nominal and percent wOBA changes are larger for the improvement group too. There are 31 players with a RoS wOBA at least 5% above their pre-season projection, while only 17 are projected to be 5% or more worse than expected. Does this mean that ZiPS is actually an optimist?

The Padres believe in Seth Smith as well, having recently signed him to an extension. He is a righty masher, and they only rarely let him face same-handed pitching. Victor Martinez is 35 years old and decided to have a renaissance, and may end up with his best hitting season ever. Baseball is weird. I’m not sure what to make of Steve Pearce. He has been around since 2007 without ever accumulating more than 200 PAs, but this season he finally has, and the Orioles are making out like bandits. The other two are what you expect on such a list: young players taking a step forward. J.D. Martinez was the player I had in mind when I started this. I have seen him play several times recently, and he seems to put together a quality plate appearance every time up. Mesoraco, like Martinez, is 26 and has had a huge power spike along with a lot more strikeouts, to the point where he seems like a different player altogether.

Two Cleveland Indians just missed the top five improvers: Michael Brantley and Lonnie Chisenhall seem to have finally taken a step forward too.  There were two notable Brewers as well.  ZiPS seems to have finally decided to believe in Carlos Gomez and Jonathan Lucroy.

Yes, believing in projections sometimes means we need to temper our enthusiasm when a player we like breaks out or be patient with someone slumping.  It can also be a good way to see when players are truly locking into higher levels of play.  For the older players here it is likely that they will come back to the pre-season projections again next year because Victor Martinez is probably not going to turn into a much better hitter year after year at this age, but for the younger guys we may be starting to see who is taking a step forward.


Using Double-A Stats to Predict Future Performance

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
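For readers who have not run into a probit before, here is a minimal sketch of how a fitted model turns a player’s inputs into a probability. The intercept, coefficients, and player line below are invented purely for illustration and are not KATOH’s actual estimates; `probit_probability` is a hypothetical helper. The model adds up a weighted index of the inputs and passes it through the standard normal CDF.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(intercept, coefs, features):
    """P(event) = Phi(intercept + sum of coefficient * feature)."""
    index = intercept + sum(coefs[key] * features[key] for key in coefs)
    return normal_cdf(index)


# Hypothetical coefficients (NOT the values from the article's R output).
intercept = 6.0
coefs = {"age": -0.30, "k_pct": -4.0, "iso": 5.0, "top100": 1.0}

# Hypothetical 21-year-old top 100 prospect.
player = {"age": 21, "k_pct": 0.18, "iso": 0.150, "top100": 1}

p = probit_probability(intercept, coefs, player)
print(f"Estimated MLB probability: {p:.1%}")
```

Because the index runs through a CDF, the output always lands between 0 and 1, and the same change in an input moves the probability more for a borderline player than for one who is already near 0% or 100%.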

Things that were predictive for players in low-A and high-A included age, strikeout rate, ISO, BABIP, and whether or not he was deemed a top 100 prospect by Baseball America in the pre-season. However, a player’s walk rate was not significant in predicting a player’s ascension to the majors. Today, I’ll look into what KATOH has to say about players in double-A leagues. For those interested, here’s the R output based on all players with at least 400 plate appearances in a season in double-A from 1995-2010. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted to reflect his league’s average for that year.

AA Output

Unlike in the A-ball iterations of KATOH, a player’s double-A walk rate is predictive — albeit only slightly — of whether or not he’ll make it to the show. While walk rate is statistically significant, it still matters much less than the other stats: it takes 3 or 4 percentage points on a player’s walk rate to match what 1 percentage point of strikeout rate does to a player’s MLB probability.
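The “3 or 4 points of walk rate per point of strikeout rate” comparison is just a ratio of two coefficients. The sketch below shows the arithmetic with made-up coefficients (the real ones live in the R output and are not reproduced here); `offsetting_walk_points` is a hypothetical helper.

```python
def offsetting_walk_points(b_bb, b_k, k_points=1.0):
    """Walk-rate points needed to cancel a `k_points` rise in strikeout rate.

    Setting b_bb * d_bb + b_k * d_k = 0 and solving for d_bb.
    """
    return -b_k * k_points / b_bb


# Hypothetical coefficients: walks help a little, strikeouts hurt a lot.
b_bb, b_k = 2.0, -7.0

needed = offsetting_walk_points(b_bb, b_k)
net_change = b_bb * needed + b_k * 1.0  # change in the probit index

print(f"Walk-rate points needed per strikeout-rate point: {needed}")
print(f"Net change in the index: {net_change}")
```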

This version is also different in that there are a couple of significant interaction terms, signified by the last two coefficients in the above output. The “I(Age^2)” term adds a little bit of nuance to how a player’s age can predict his future success, while the “ISO:BA.Top.100.Prospect” term basically says that if you’re a top 100 prospect, hitting for power is slightly less important than it would be otherwise. Hitting for power and making Baseball America’s top 100 list both make a player much more likely to make it to the majors, but if he does both, he’s a tad less likely to make it than his power output and prospect status would suggest independently. Put another way, a few top 100 prospects hit for power in double-A but never cracked the majors, such as Jason Stokes (.241 ISO), Nick Weglarz (.204 ISO), and Eric Duncan (.173 ISO). But virtually all of the low-power guys made it, including Elvis Andrus (.073 ISO), Luis Castillo (.076 ISO), and Carl Crawford (.078 ISO). For non-top 100 guys, many more punchless hitters topped out in double-A and triple-A.
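One way to read the ISO-by-prospect interaction is that it changes the slope on ISO depending on prospect status. The numbers below are hypothetical stand-ins rather than the fitted coefficients, and `iso_slope` is a name introduced only for this sketch.

```python
def iso_slope(b_iso, b_interaction, top100):
    """Effective weight on ISO in the probit index.

    top100 is 1 for a Baseball America top 100 prospect, else 0.
    """
    return b_iso + b_interaction * top100


# Hypothetical coefficients: power helps, but a bit less for top 100 guys.
b_iso, b_interaction = 5.0, -2.0

slope_other = iso_slope(b_iso, b_interaction, top100=0)
slope_top100 = iso_slope(b_iso, b_interaction, top100=1)

# Effect on the index of adding 50 points of ISO.
bump_other = slope_other * 0.050
bump_top100 = slope_top100 * 0.050

print(f"Non-top 100: slope {slope_other}, +.050 ISO adds {bump_other:.2f}")
print(f"Top 100:     slope {slope_top100}, +.050 ISO adds {bump_top100:.2f}")
```

A negative interaction does not mean power hurts top 100 prospects; the slope stays positive here, it is just flatter than for everyone else.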

By clicking here, you can see what KATOH spits out for all current prospects who logged at least 250 PAs in double-A as of July 7th, as well as a few that fell short of the cutoff, most notably Joey Gallo, Kevin Plawecki, and Robert Refsnyder. Topping the list is Mookie Betts with a probability of 99.95%, and of course the prophecy was fulfilled when the Red Sox called up the 21-year-old last month. Here’s an excerpt of the top players from double-A this year:

Player              Organization  Age  MLB Probability
Mookie Betts        BOS           21   100%
Francisco Lindor    CLE           20   100%
Gary Sanchez        NYY           21   99%
Austin Hedges       SDP           21   99%
Alen Hanson         PIT           21   99%
Jorge Bonifacio     KCR           21   98%
Blake Swihart       BOS           22   98%
Kris Bryant         CHC           22   93%
Ketel Marte         SEA           20   91%
Rangel Ravelo       CHW           22   90%
Robert Refsnyder    NYY           23   86%
Jake Lamb           ARI           23   85%
Jake Hager          TBR           21   84%
Darnell Sweeney     LAD           23   83%
Joey Gallo          TEX           20   82%
Preston Tucker      HOU           23   81%
Scott Schebler      LAD           23   79%
Kevin Plawecki      NYM           23   79%
Cheslor Cuthbert    KCR           21   78%
Kyle Kubitza        ATL           23   77%
Michael Taylor      WSN           23   76%
Christian Walker    BAL           23   76%
Ryan Brett          TBR           22   75%

Keep an eye out for the next installment, which will dive into what KATOH says about hitters at the triple-A level.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.


Do Rookie Hitters Decline in the Second Half?

Do rookies perform worse after the All-Star break?

I can’t claim this idea as my own; it was brought to my attention by Adam Aizer on the CBS Fantasy Baseball Podcast.

Not convinced, I thought it would be worth the effort to look into the validity of the statement.

From the perspective of an offensive player, rookies infrequently make enough of an impact to matter in the league sizes (i.e. 10-team and 12-team leagues) that pedestrian Fantasy Baseball players occupy. In leagues of that size, a rookie hitter who is worth owning is either an elite prospect or a player who has performed beyond his true talent level. The former is rare; the latter is more common, and it would make sense for such a player to regress to his true talent level. The idea that rookie hitters decline throughout the year is just a misevaluation of the player’s true talent level.

To put it another way, the same logic comes into play with a recent event: the Home Run Derby. Players who participate in the Home Run Derby have had exceptional first halves, which are often beyond their true talent level. These players often perform worse in the second half than they did in the first half, not because they participated in the monotonous and dated event that the Home Run Derby has become, but because, just like the rookies who perform worse in the second half of the season than the first, they have regressed toward their true talent level. When the rookies regress, they simply regress to the point where they are no longer ownable.
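The selection effect described above is easy to reproduce with a toy simulation. Everything in this sketch is an assumption made for illustration: the talent and luck spreads are invented, wOBA is used only as a convenient stat, and “owned” simply means the top 10% of first-half performers. No player’s true talent changes between halves, yet the owned group still declines.

```python
import random


def simulate_selection(n_players=2000, top_frac=0.10, seed=0):
    """Pick the best first-half performers and compare their two halves."""
    rng = random.Random(seed)
    talent_mean, talent_sd = 0.320, 0.020  # hypothetical true-talent wOBA
    noise_sd = 0.030                       # hypothetical half-season luck

    first, second = [], []
    for _ in range(n_players):
        talent = rng.gauss(talent_mean, talent_sd)
        # Same talent in both halves; only the luck is redrawn.
        first.append(talent + rng.gauss(0.0, noise_sd))
        second.append(talent + rng.gauss(0.0, noise_sd))

    n_owned = int(n_players * top_frac)
    ranked = sorted(range(n_players), key=lambda i: first[i], reverse=True)
    owned = ranked[:n_owned]

    def mean(values):
        return sum(values) / len(values)

    return {
        "all_first": mean(first),
        "all_second": mean(second),
        "owned_first": mean([first[i] for i in owned]),
        "owned_second": mean([second[i] for i in owned]),
    }


result = simulate_selection()
for label, value in result.items():
    print(f"{label}: {value:.3f}")
```

The population as a whole shows essentially no first half/second half difference, while the group chosen for its hot first half drops back toward, but not all the way to, the overall average.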

The research looks at all player seasons between 1988 and 2013 where a batter was in their first season, had 250 plate appearances in the first half of the season, and had 250 plate appearances in the second half of the season.

[Image: first half/second half splits for the rookie hitters in the sample]
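For anyone who wants to rebuild the sample, the filter might look something like the sketch below. The records are invented, the `HalfSplits` fields are hypothetical, wOBA is an assumed choice of stat, and “had 250 plate appearances” is read here as “at least 250.”

```python
from dataclasses import dataclass


@dataclass
class HalfSplits:
    name: str
    year: int
    rookie: bool
    pa_first: int
    pa_second: int
    woba_first: float
    woba_second: float


def in_sample(season, min_pa=250):
    """First-year batter, 1988-2013, with enough PA in both halves."""
    return (
        season.rookie
        and 1988 <= season.year <= 2013
        and season.pa_first >= min_pa
        and season.pa_second >= min_pa
    )


def weighted_woba(seasons, half):
    """PA-weighted wOBA for one half ('first' or 'second')."""
    total_pa = sum(getattr(s, f"pa_{half}") for s in seasons)
    weighted = sum(
        getattr(s, f"pa_{half}") * getattr(s, f"woba_{half}") for s in seasons
    )
    return weighted / total_pa


# Hypothetical player seasons, for illustration only.
seasons = [
    HalfSplits("Rookie A", 1995, True, 300, 280, 0.330, 0.325),
    HalfSplits("Rookie B", 2004, True, 260, 120, 0.350, 0.290),   # too few 2nd-half PA
    HalfSplits("Veteran C", 2010, False, 320, 310, 0.340, 0.345),  # not a rookie
    HalfSplits("Rookie D", 2012, True, 250, 270, 0.310, 0.318),
]

sample = [s for s in seasons if in_sample(s)]
first_half = weighted_woba(sample, "first")
second_half = weighted_woba(sample, "second")

print([s.name for s in sample])
print(f"First half: {first_half:.3f}, second half: {second_half:.3f}")
```

Requiring 250 plate appearances in both halves means the sample only contains rookies who kept their jobs all year, which is worth keeping in mind when reading the splits.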

The rookie second half decline and the post Home Run Derby slump intuitively make sense, but intuition does not always bear truth. Through cognitive ease we rationalize that “Swinging that hard for that long throws off your timing”; “A rookie is too young to be able to make it through the long hot summer.”

Because most fantasy leagues are small, the only reason that the common rookie was on our teams to begin with is because they had to play beyond their ability in the first half of the season. The rookie who is on our team right now, unless he is a reputable prospect, is probably a safe bet to decline. But as a whole, we can see that there is no decline in rookie performance based on first half/second half splits.

Our desire to perceive a decline is just our desire to hold onto our ability as talent evaluators. We know that Yangervis Solarte is a great player, and the only reason he hasn’t been able to sustain his performance is because he is a rookie who can’t play out the season: common baseball logic. In actuality, Solarte was not as good as some originally thought, and his true talent was never good enough to be in a 10- or 12-team league.

Summary:

Rookie hitters, as a generalization, are not good enough to play in 10 or 12 team leagues, and, as a generalization, those that do play in ten team leagues regress to their true talent level, which is not valuable enough to be ownable.

Devin Jordan is obsessed with statistical analysis, non-fiction literature, and electronic music. If you enjoyed reading him, follow him on Twitter @devinjjordan.


Mike Minor and All the Home Runs

Mike Minor just keeps giving up home runs. To be fair, he’s a fly ball pitcher and home runs will come with that. And actually, he’s given up the long ball a little more frequently than he should (10.5% HR/FB) throughout his career, so maybe this shouldn’t come as such a surprise.

His 1.51 HR/9 this season is 7th among pitchers who have thrown as many innings as Minor has (83.1). But he’s had some bad luck this year – .343 BABIP, 14.9% HR/FB – and he’s been stricken with a… different kind of offseason injury plus shoulder tendinitis in Spring Training, so it’s reasonable to think that’s where the issue starts and ends. But after personally seeing him give up four home runs in a rehab game against the Reds’ double-A squad, Pensacola, it feels like something may be wrong. So I’d like to examine this a little more, if that’s OK.
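As a quick sanity check on those rate stats, here is the arithmetic in Python. The home run total is not quoted above, so the 14 below is only what the 1.51 HR/9 and 83.1 innings imply, and the “career rate” what-if is a rough back-of-the-envelope assumption rather than a projection. Note that 83.1 in baseball notation means 83 and one-third innings.

```python
def innings_to_float(ip_notation):
    """Convert baseball innings notation ('83.1' = 83 1/3) to a number."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def hr_per_nine(home_runs, innings):
    """Home runs allowed per nine innings."""
    return 9 * home_runs / innings


innings = innings_to_float("83.1")

# Home run total implied by the quoted 1.51 HR/9 (not stated in the text).
implied_hr = round(1.51 * innings / 9)
check_rate = round(hr_per_nine(implied_hr, innings), 2)

# Rough what-if: same fly balls, but leaving the yard at his career
# 10.5% HR/FB instead of this season's 14.9%.
implied_fly_balls = implied_hr / 0.149
hr_at_career_rate = implied_fly_balls * 0.105

print(f"Implied HR: {implied_hr}, HR/9 check: {check_rate}")
print(f"HR at career HR/FB: {hr_at_career_rate:.1f}")
```

Under those assumptions, the gap between this year’s HR/FB and his career mark accounts for roughly four of the home runs.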

I imagine that if the problem is something more than just arm trouble or bad luck, it should show up in his numbers somewhere. So I’ll compare his PITCHf/x, pitch type, and heat map data from this season – a not-so-good one – and last season – a quite good one.

First, I just want to show again that he’s been much less lucky this season. It feels to me like there’s something more to it, but luck could be the problem.

babip minor

While that may be so, giving up more home runs could be the result of a change in the amount he’s throwing each of his pitches and the velocity of those pitches.

pitch type

So there’s actually been a small uptick in Minor’s velocity since last season, and he’s been throwing more sliders and fewer changeups. He’s been showing that same trend since his debut and seemed to find a happy medium last year. Those changes from 2013 to this year seem significant, and I think they might be playing a part in his production.

First, we’ll compare how his pitches have been moving and how effective they’ve been the last two years. Rather than show four more tables with a bunch of numbers, here’s a quick summary: 1) His changeup is moving less than it did last year, and it’s getting crushed. 2) His fastball and slider are both moving more than they did last year – but only by a little – and are getting crushed. So those things aren’t great. The BABIP on his changeup is the only one that isn’t outrageous; it’s .281 this year. Opponents’ BABIPs on his fastball and slider are .394 and .350, respectively, which are both pretty crazy. So those are two more points for just a ton of bad luck going Minor’s way, and perhaps some good signs pointing towards better luck in the near future. On to the next thing.

Maybe his issue has been locating the ball. His walk rate is up a little bit from last year, so it could be that he’s having trouble pitching where he did in 2013. I thought showing his heat maps might illustrate that, but, well…

2013 heat map 2014 heat map

They don’t. Not really, anyway. A lot of his pitches this year, like last year, are right around the middle of the plate, though they were spread out a little more last year. I’m not sure what exactly that means, but maybe he’s not locating quite as well this year.

From what I can gather, it seems like Mike Minor has seen several little changes. (A little higher release point turns into less movement on a pitch every now and then, which turns into everyone crushing your slider, etc.) And a lot of little changes can make a big difference – if things aren’t the same, they’ll be different, right?

Now for a little good news – though I hesitate to call it that. Minor’s historically been a “2nd half pitcher.” Hitters go from a .330 wOBA against him in the 1st half to a .300 after the break, and his FIP and xFIP see some drops as well. In addition, his xFIP is 3.61, which is actually a little better than it was last season. A turnaround doesn’t seem terribly far off for Minor. Cut out a little of that horribly bad luck, and Atlanta’s rotation gets better. Those things might not mean much at all, but maybe it can give Braves fans some hope.


Roster Doctor: Los Angeles Dodgers

With a payroll north of $200,000,000, you would expect the Los Angeles Dodgers to field a competitive team, and indeed they have. As we emerge from the All-Star break, they are neck and neck with the hated Giants, heading into a pennant chase that could be one for the ages. The Dodgers have four of the most watchable players in baseball (Kershaw, Greinke, Puig, and Ramirez) and a farm system with enough talent to supply reinforcements either directly or via trades. The team is not without needs, however. Like almost any team, the Dodgers have some bullpen depth issues, but they just alleviated those somewhat by recalling Paco Rodriguez, a non-flamethrower who nevertheless generates a ton of Ks. Catching has been a riddle for manager Don Mattingly as well. He’s had to use four backstops, none of whom have amassed enough appearances to qualify for the batting title, and of whom only the stalwart but venerable A.J. Ellis has provided anything even approaching an offensive contribution. (Well, Miguel Olivo made an offensive contribution of a different kind.)

But the biggest problem has been Matt Kemp, who dug a Tunguskan-size crater in center field before Mattingly more or less permanently shunted him to left. Kemp has the worst WAR (-1.3) for any position player qualifying for the batting title except Domonic Brown. Kemp’s hitting about as well as last year’s (modest) effort, but his defense has gone from bad (-0.6 dWAR) to eye-watering (-2.5). Whether you’re new school (zone rating) or old school (range factor), you will find nothing to like in Kemp’s defensive metrics. The move to left has probably mitigated the defensive damage he’s doing, but mainly by reducing his opportunities to come within proximity of the ball. His range in left is almost as far below the league as his range in center, although he’s making fewer errors. Kemp’s agent thinks he can still play center, and so presumably do Matt and his mom. That about exhausts the list.

In one sense this is a simple problem that the Dodgers can solve without any outside help. They could bench Kemp immediately. Center field prospect Joc Pederson is murdilating the PCL’s beleaguered pitchers to the tune of a 1.045 OPS, and yes, that’s good even in the PCL. Pederson is third in the league in OPS, behind two guys who are at least five years older. To the extent Pederson would struggle against major league lefties, he could be platooned with righty Scott Van Slyke, with Andre Ethier sliding between center and left. This is a rare situation where a manager can (almost) unilaterally boost his team’s playoff chances with a single lineup change.

And yet … Kemp can still hit. His .752 OPS is third on the Dodgers among batting qualifiers, and while that’s over 80 points off his career number, it still represents useful offense. At this stage in his career, Kemp’s value would dramatically increase if he didn’t have to put on a glove. The question is how to allocate that increased value among the Dodgers and their potential trade suitors. There are four playoff-contending AL teams whose DHs are either injured, ineffective, or both:

New York Yankees (Carlos Beltran .698 OPS)

Kansas City Royals (Billy Butler .675)

Cleveland Spiders (Nick Swisher .641)

Seattle Mariners (Corey Hart .611)

Kemp would immediately boost any of these teams’ offenses. The Yankees could take much of Kemp’s anvil-like contract ($20 m/yr through 2019), but have few if any prospects to offer. The Royals and Mariners are in the opposite situation: good talent to trade but limited ability to absorb such a huge financial hit. Cleveland, sadly, can’t really employ either approach, and in any case hitting is not their main need.

Dodgers president Stan Kasten’s general strategy upon assuming command was to throw immense amounts of Guggenheim money at the major league roster first, and then reinforce the farm system to ensure a steady stream of cost-controlled reinforcements for the future. Part I of the plan is working well, and Part II is underway with Corey Seager, Julio Urias and Alex “Van Gogh” Guerrero headlining a good collection of upper-level minor league talent (non-Pederson division). The Dodgers could go either way here: begin their slow march away from the payroll tax penalty by banishing Kemp to the Bronx, or recharge the lower reaches of their farm system with talent from either of the smaller-market franchises who could be in on Kemp. They may not succeed in moving Kemp, but if they can, it would provide at least a small edge in a pennant race that looks sure to go to the wire.