Using Rookie League Stats to Predict Future Performance
Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there. In the future, I plan to engineer an alternative methodology to go along with this one that takes into account how a player performs in the majors, rather than just whether he gets there.
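For readers curious about the mechanics, a probit model turns a weighted sum of inputs into a probability by passing that sum through the standard normal CDF. Here's a minimal sketch in Python (rather than R) using only the standard library. The coefficients, intercept, and feature values are made-up placeholders chosen for illustration; they are not KATOH's fitted estimates.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(features: dict, coefs: dict, intercept: float) -> float:
    """P(event) = Phi(intercept + sum(coef * feature)) under a probit model."""
    z = intercept + sum(coefs[name] * value for name, value in features.items())
    return normal_cdf(z)

# Hypothetical coefficients, NOT the fitted KATOH values.
EXAMPLE_COEFS = {"age": -0.30, "k_rate": -0.40, "iso": 0.35, "babip": 0.20}
EXAMPLE_INTERCEPT = -0.50

# Hypothetical hitter with arbitrary illustrative feature values.
example_hitter = {"age": -1.0, "k_rate": -0.5, "iso": 1.0, "babip": 0.5}

prob = probit_probability(example_hitter, EXAMPLE_COEFS, EXAMPLE_INTERCEPT)
print(f"Estimated MLB probability: {prob:.0%}")
```

Because the normal CDF flattens out at both ends, each extra bit of performance moves the probability less and less as a player approaches 0% or 100%.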
For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not the player was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. And walk rate, while not predictive for players in A-ball, added a little bit to the model for Double-A and Triple-A hitters. Today, I’ll look into what KATOH has to say about players in Rookie leagues. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted relative to his league’s average for that year. For those interested, here’s the R output based on all players with at least 200 plate appearances in a season in Rookie ball from 1995-2007.
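As an aside on the adjustment step, here's one way the league adjustment and the 200 plate appearance cutoff could be sketched in Python. The exact adjustment behind KATOH isn't spelled out here, so this assumes the simplest version: divide each player's rate by his league's plate-appearance-weighted average for that year. The player rows are hypothetical.

```python
from collections import defaultdict

MIN_PA = 200  # minimum plate appearances for the modeling sample

# Hypothetical rows: (player, league, year, plate appearances, strikeout rate)
rows = [
    ("Player A", "Appalachian", 2005, 250, 0.20),
    ("Player B", "Appalachian", 2005, 250, 0.30),
    ("Player C", "Pioneer", 2005, 300, 0.18),
    ("Player D", "Pioneer", 2005, 200, 0.28),
    ("Player E", "Pioneer", 2005, 100, 0.40),  # counts toward league average only
]

def league_adjust(rows, min_pa=MIN_PA):
    """Return {player: rate / league-year average rate} for qualifying players.

    Assumption: the league average is PA-weighted over every hitter in the
    league-year, including those below the PA cutoff.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (league, year) -> [rate * PA, PA]
    for _, league, year, pa, rate in rows:
        totals[(league, year)][0] += rate * pa
        totals[(league, year)][1] += pa

    adjusted = {}
    for player, league, year, pa, rate in rows:
        if pa < min_pa:
            continue  # too small a sample to include in the model
        weighted_sum, total_pa = totals[(league, year)]
        adjusted[player] = rate / (weighted_sum / total_pa)
    return adjusted

adjusted_k_rate = league_adjust(rows)
for player, value in adjusted_k_rate.items():
    print(f"{player}: {value:.2f}x league-average strikeout rate")
```

A value of 1.00 means exactly league average, so a hitter in a strikeout-heavy league and a hitter in a contact-heavy league can be compared on the same scale.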
Just like we saw with hitters in the A-ball leagues, a player’s walk rate is not at all predictive of whether or not he’ll crack the majors. Unlike at all of the other levels I’ve looked at so far, a player’s Baseball America prospect status couldn’t tell us anything about his future as a big-leaguer. This was entirely due to the scarcity of top-100 prospects in the sample, as only a handful of players spent the year in rookie ball after making BA’s top-100 list.
The season is less than 40 games old for most rookie league teams, which makes it a little premature to start analyzing players’ stats. But just for kicks, here’s a look at what KATOH says about this year’s crop of rookie-ballers with at least 80 plate appearances through July 28th. This only considers players in the American rookie leagues (the Appalachian, Arizona, Gulf Coast, and Pioneer Leagues), meaning it excludes the Dominican and Venezuelan Summer Leagues. The full list of players can be found here, and you’ll find an excerpt of those who broke the 40% barrier below:
Player | Organization | Age | MLB Probability |
---|---|---|---|
Kevin Padlo | COL | 17 | 73% |
Bobby Bradley | CLE | 18 | 67% |
Alex Verdugo | LAD | 18 | 65% |
Luke Dykstra | ATL | 18 | 64% |
Yu-Cheng Chang | CLE | 18 | 59% |
Magneuris Sierra | STL | 18 | 56% |
Juan Santana | HOU | 19 | 54% |
Joshua Morgan | TEX | 18 | 50% |
Jason Martin | HOU | 18 | 49% |
Edmundo Sosa | STL | 18 | 48% |
Oliver Caraballo | TEX | 19 | 46% |
Sthervin Matos | MIL | 20 | 46% |
Alexander Palma | NYY | 18 | 45% |
Eloy Jimenez | CHC | 17 | 45% |
Javier Guerra | BOS | 18 | 44% |
Zach Shepherd | DET | 18 | 44% |
Tito Polo | PIT | 19 | 44% |
Jose Godoy | STL | 19 | 43% |
Henry Castillo | ARI | 19 | 42% |
David Gonzalez | DET | 20 | 42% |
Dan Jansen | TOR | 19 | 42% |
Max George | COL | 18 | 42% |
Gleyber Torres | CHC | 17 | 42% |
Luis Guzman | WSN | 18 | 41% |
Jose Martinez | KCR | 17 | 41% |
Alex Jackson | SEA | 18 | 40% |
Emmanuel Tapia | CLE | 18 | 40% |
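The excerpt above is just the full list cut off at 40% and sorted. Here's a small Python sketch of that kind of cutoff, applied to the first six rows of the table. The stricter 60% cutoff is arbitrary and only there to show the filter doing something.

```python
# First six rows of the table above, in the same pipe-delimited layout.
TABLE_EXCERPT = """\
Kevin Padlo | COL | 17 | 73% |
Bobby Bradley | CLE | 18 | 67% |
Alex Verdugo | LAD | 18 | 65% |
Luke Dykstra | ATL | 18 | 64% |
Yu-Cheng Chang | CLE | 18 | 59% |
Magneuris Sierra | STL | 18 | 56% |
"""

def parse_rows(text):
    """Parse 'Player | Org | Age | NN% |' lines into dictionaries."""
    players = []
    for line in text.strip().splitlines():
        name, org, age, prob = [cell.strip() for cell in line.split("|")[:4]]
        players.append({
            "player": name,
            "org": org,
            "age": int(age),
            "mlb_prob": int(prob.rstrip("%")) / 100,
        })
    return players

def above_cutoff(players, cutoff):
    """Keep players at or above the cutoff, highest probability first."""
    kept = [p for p in players if p["mlb_prob"] >= cutoff]
    return sorted(kept, key=lambda p: p["mlb_prob"], reverse=True)

players = parse_rows(TABLE_EXCERPT)
top = above_cutoff(players, cutoff=0.60)  # arbitrary illustrative cutoff
for p in top:
    print(f'{p["player"]} ({p["org"]}, {p["age"]}): {p["mlb_prob"]:.0%}')
```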
What stands out most is that KATOH doesn’t think any of these players are shoo-ins to make it to the majors. Even those who are hitting the snot out of the ball get probabilities that fall short of what we saw for unremarkable performances in Double-A. Kevin Padlo, for example, gets just 73%, despite hitting a ridiculous .317/.463/.619 as a 17-year-old. It’s hard to do much better than that. I think this really speaks to how little rookie ball stats matter in the grand scheme of things. A good offensive showing is obviously better than a poor one, but numbers from this level need to be taken with a huge grain of salt. A hitter’s performance against pitchers who are fresh out of high school just can’t tell us much about how he’ll fare when matched up against more advanced pitching at the higher levels.
Next up, I’ll complete the series by looking at stats from short-season A-ball. Teams at that level are also only a few weeks into their season, but at the very least, it will be interesting to see how KATOH feels about SS A-ballers in general. Next week, I’ll apply the KATOH model to historical prospects and highlight some of its biggest “hits” and “misses” from the past.
Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; pre-season prospect lists courtesy of Baseball America.