Using High-A Stats to Predict Future Performance
Last week, I looked into how a player’s low-A stats — along with his age and prospect status at the time — can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
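For readers who want to see the mechanics, here's a minimal Python sketch of how a probit model turns a player's inputs into a probability: it multiplies each input by its coefficient, adds them up, and runs that sum through the standard normal CDF. The actual model was fit in R. The intercept and coefficients below are made-up placeholders chosen only for illustration, not KATOH's fitted values, and the function and variable names are hypothetical too.

```python
from statistics import NormalDist


def probit_probability(intercept, coefficients, features):
    """Return Phi(intercept + sum(coef * value)), the probit probability."""
    index = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    # The standard normal CDF maps any real-valued index into (0, 1).
    return NormalDist().cdf(index)


# HYPOTHETICAL values, invented for illustration only.
# These are NOT the coefficients from the KATOH model.
HYPOTHETICAL_INTERCEPT = 2.0
HYPOTHETICAL_COEFS = {
    "age": -0.15,    # older players are penalized
    "k_rate": -3.0,  # strikeouts per plate appearance
    "iso": 4.0,
    "babip": 3.0,
    "top100": 1.0,   # 1 if a preseason top 100 prospect, else 0
}

# A made-up stat line, also for illustration only.
player = {"age": 20, "k_rate": 0.20, "iso": 0.150, "babip": 0.320, "top100": 1}

prob = probit_probability(HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_COEFS, player)
print(f"Estimated MLB probability: {prob:.1%}")
```

Because the normal CDF is S-shaped, the same bump in ISO moves the probability more for a player sitting near 50% than for one already close to 0% or 100%.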
The predictive factors for low-A players were age, strikeout rate, ISO, BABIP, and whether Baseball America named the player a top 100 prospect in the preseason. Walk rate, however, was not a statistically significant predictor of whether a player reached the majors. Today, I’ll analyze what KATOH has to say about players in class-A-advanced (high-A) leagues. Here’s the R output based on all players with at least 400 plate appearances in a season in high-A from 1995-2009:
This looks very similar to what I found for low-A players: Walk rate isn’t significant, and everything else has very similar effects on the final probability. However, the coefficients from this model are all a tad larger in magnitude than those from the low-A version, implying that high-A stats might be a bit more telling of a player’s future. Intuitively, this makes sense: The closer a player is to the big leagues, the more his stats start to reflect his future potential.
By clicking here, you can see what KATOH spits out for all current prospects who had logged at least 250 PA in high-A as of July 7th. I also included a few notable players who fell short of the threshold, namely Joey Gallo (who checks in at a remarkable 99.8%, shown as 100% below after rounding), Peter O’Brien, and Jesse Winker. Here’s an excerpt of the top-ranking players:
Player | Organization | Age | MLB Probability |
---|---|---|---|
Joey Gallo | TEX | 20 | 100% |
Corey Seager | LAD | 20 | 99% |
Carlos Correa | HOU | 19 | 99% |
Albert Almora | CHC | 20 | 93% |
Nick Williams | TEX | 20 | 93% |
D.J. Peterson | SEA | 22 | 93% |
Jesse Winker | CIN | 20 | 91% |
Orlando Arcia | MIL | 19 | 88% |
Jose Peraza | ATL | 20 | 87% |
Colin Moran | MIA | 21 | 87% |
Renato Nunez | OAK | 20 | 86% |
Tyrone Taylor | MIL | 20 | 85% |
Hunter Renfroe | SDP | 22 | 84% |
Josh Bell | PIT | 21 | 84% |
Raul Mondesi | KCR | 18 | 83% |
Daniel Robertson | OAK | 20 | 83% |
Jorge Polanco | MIN | 20 | 81% |
Dilson Herrera | NYM | 20 | 77% |
Breyvic Valera | STL | 21 | 77% |
Peter O’Brien | NYY | 23 | 76% |
Matt Olson | OAK | 20 | 75% |
Jorge Alfaro | TEX | 21 | 75% |
Patrick Leonard | TBR | 21 | 75% |
Dalton Pompey | TOR | 21 | 73% |
Billy McKinney | OAK | 19 | 73% |
Teoscar Hernandez | HOU | 21 | 73% |
Brandon Nimmo | NYM | 21 | 72% |
Jose Rondon | LAA | 20 | 70% |
Rio Ruiz | HOU | 20 | 70% |
Brandon Drury | ARI | 21 | 70% |
Next up will be double-A. Unlike A-ball, double-A tends to be a random mishmash of prospects and minor-league lifers, so it will be interesting to see how KATOH handles this wide array of players. And perhaps double-A is where a player’s walk rate finally starts to tell us something about his future success.
Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; preseason prospect lists courtesy of Baseball America.
Chris works in economic development by day, but spends most of his nights thinking about baseball. He writes for Pinstripe Pundits, FanGraphs and The Hardball Times. He's also on the twitter machine: @_chris_mitchell. None of the views expressed in his articles reflect those of his daytime employer.
McKinney was traded to the Cubs in the Shark/Hammel deal.
Ah yeah good catch. I just pulled the players’ organizations from their 2014 B-ref stats, which were with Oakland in McKinney’s case.
It’d be interesting to see these numbers for some current major leaguers, especially if there are any great players who would’ve had really low percentages.
Yeah that would be pretty cool to see. I’ll take a look at that once I finish going through the minor league levels.
Do you use any sort of compensation for the bizarro offensive environments in High A? Especially if ISO correlates with MLB success. I would imagine that ISO and BABIP from the FSL and Carolina League are vastly more predictive of ML success than those from the Cal League.
I do account for a player’s league. So if a player had an ISO .100 higher than his league’s average, I adjust his ISO to be the 2014 average (an average of all A+ leagues) +.100. So a player with league average stats in the FSL would be treated exactly the same as a player with league average stats in the CAL. However, this does not account for ballpark effects.
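That adjustment boils down to one line of arithmetic. Here's a quick Python sketch of it. The function name and the sample averages are hypothetical, picked only to show the calculation, and are not real league figures.

```python
def league_adjusted(stat, league_avg, baseline_avg):
    """Re-center a stat: keep the player's gap to his own league's average,
    but measure it from a common baseline average instead."""
    return baseline_avg + (stat - league_avg)


# HYPOTHETICAL numbers for illustration only.
baseline_iso = 0.130      # stand-in for the 2014 average across all A+ leagues
hitter_league_iso = 0.150 # stand-in for a hitter-friendly league's average ISO

# A player .100 above his league's average ends up .100 above the baseline.
adjusted = league_adjusted(0.250, hitter_league_iso, baseline_iso)
print(f"Adjusted ISO: {adjusted:.3f}")
```

A league-average hitter in any league comes out at exactly the baseline, which is why two league-average players from different leagues are treated the same. Ballpark effects within a league are not touched by this step.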
Cool, I like it better knowing that. 🙂
Good post.