xFantasy, Part III: Can xStats Beat the Projections?

Last month, I introduced the xFantasy system to these venerable electronic pages, in which I attempted to translate Andrew Perpetua’s xStats data for 2016 into fantasy stats. The original idea was just to find a way to do that translation, but I noted back then that the obvious next step was to look at whether xFantasy was predictive. Throughout last season, I frequently found myself looking at players who were performing below their projection, but matching their xStats production, or vice versa, and pondering whether I should trust the xStats or the projections. Could xStats do a better of job of reacting quickly to small sample sizes, and therefore ‘beat’ the projections? Today, I’ll attempt to figure that out. By a few different measures, Steamer reliably shows up at the top of the projection accuracy lists these days, and so in testing out xFantasy, I’m going to pit it against Steamer to see whether we can beat the best there is using xStats.

First, a quick note on the players included in this dataset. The original xFantasy model was trained on 2016 data for all players with >300 PA. For the comparisons made here in ‘Part III’, player seasons are separated into halves, and all players with >50 PA in a half are originally included. Some have been eliminated due to either changing teams, or lack of data somewhere in 2015 or 2016 (for instance, if they missed an entire half due to injury). Some players have inconsistent names, and since I’m a bad person who does things incorrectly, I indexed my data on player names instead of playerID’s. That means everyone’s favorite messed up FanGraphs name, Kike/Enrique/“Kiké” Hernandez, isn’t included, along with a couple others.

To recap from last time, the inputs I use to calculate each of the xFantasy stats are:

HR: xISO
R: xAVG, xISO, SPD*, TeamR+RBI, Batting Order
RBI: xAVG, xISO, SPD*, TeamR+RBI, Batting Order
SB: xOBP, xISO, SPD, TeamSB/PA, Batting Order
AVG: xAVG

(*SPD score has been added to R and RBI calculations since the original xFantasy post)

For both years of xStats data, 2015 and 2016, I’ve separated players into first half (1h) and second half (2h) production. I also have pulled old Steamer projections from the depths of my computer from roughly the All-Star break each year (i.e. early July). All data used today is posted up in a Google spreadsheet here. Anyway, that means our three competitors will be…

  1. Prorated 1h production: Take each player’s 1h pace in the five categories and prorate it to their 2h plate appearances.
  2. 1h xStats (xFantasy): Take each player’s xStats production from the 1h and project the same production over their 2h plate appearances.
  3. Steamer: Take each players Steamer projection and adjust based on actual 2h plate appearances.

Option #1 would be our absolute lowest bar, we should hope xStats can do a better job predicting future performance than the raw ‘real’ stats over that same time period. And I’ll go ahead and say that we’re expecting option #3 is probably the highest bar — Steamer is a much more complex system, using several years of player history (where available), adjusting for park factors, and certainly using many more variables. For xFantasy, it’s just Statcast data, and just over a fairly small sample. This same idea was brought up recently by Andrew:

“Both of these methods use a very, very different process to evaluate players.  xStats uses Statcast data and nothing else, it clings to batted-ball velocity and launch angle. ZiPS is quite different, and there are many resources you can look at to learn more about it.  At the end of the day, though, you see very similar results.  Eerily similar, perhaps.”

– Andrew Perpetua, “Using Statcast to Project Trea Turner”

I hope anyone reading this has already seen that post, as Andrew is using xStats in exactly the way I’m considering here — look at a guy with small major-league sample size, with a recent change in skills (more power for Turner), and see what xStats projects for him.

So first, to set the standard, here are our so-called lower and upper bounds for coefficient of determination (R2) values when predicting second-half (2h) stats:

It’s maybe surprising that using first-half stats does a fairly decent job, but that’s largely due to using the known second-half playing time. Steamer is significantly better across the board, though it’s worth noting that AVG is nearly impossible to predict, with Steamer doing a bad job (R2=.143) but 1h stats doing a far worse job (R2=.067). Before we get to xFantasy, I also wanted to test how my slash-line conversion models were working (i.e. the method used to translate xStats into xFantasy). To do so, I took the rate stats predicted by Steamer (AVG, OBP, ISO) and plugged them into the xFantasy equations to arrive at what I’ll call ‘xSteamer’:

And hey, it looks like very little change. That means Steamer’s relationships between the rate stats and HR, R, and RBI are fairly similar to the ones I’ve come up with. Steamer’s models are still (obviously) better for the most part, though xSteamer somehow beats the original Steamer model when it comes to HR! SB is where we see something completely different, where my model is coming up with significantly worse predictions (R2=.494) than the original Steamer (R2=.671). I would guess that means that historical SB stats are more useful predictors of SB than a player’s current SPD score (actually, a simple check will tell you that’s true, 1h SPD and 2h SPD do not correlate well). In any case, it’s finally time to see where xFantasy falls on this spectrum we’ve set up:

If I’m being honest, I was really hoping to see xFantasy fall closer to Steamer on AVG and HR. But at least for R/RBI, we can definitively say xStats are much more useful for projecting future performance than 1h stats. In the case of SB, it’s a bit of a split decision — xFantasy is doing a poor job, but Steamer does a similarly poor job (both with R2 of approx .49) if using the same inputs as my model.

Now I have to acknowledge an obvious weakness of xFantasy in terms of predictive ability: TeamR+RBI, TeamSB/PA, Batting Order, and SPD…we could likely project each of these much more accurately than just using recent history. Rather than pulling real stats from the first half for each of those, I could have pulled projections or longer historical averages, and likely improved the outcomes significantly. As a shortcut, let’s just eliminate those variables and try again. For this next set of data, I’ve plugged in the *actual* second-half performance for each player in TeamR+RBI, TeamSB/PA, Batting Order, and SPD. For the most direct comparison, I’ll show xFantasy vs. xSteamer:

Now that’s looking pretty good! Gifted with the power to know a few things about actual second-half team performances, xSteamer sets the bar with the highest R2 in each of the five categories. And xFantasy is not far behind! One of the most obvious areas for potential improvement is already a work in progress, with the next version of xStats including park factors. Beyond that, I think this stands as good evidence that xStats could be the basis of a successful projection system, especially if combined with additional historical info or team-level projections. To back that claim up, I’ve come up with one final comparison. Using 2015 xStats, along with the first half of 2016 xStats, we can come up with 1.5 years of xAVG/xOBP/xISO to make predictions of second half 2016. For completeness’ sake, I’ll use a 1.5-year average of all other inputs (i.e. team stats, order, and SPD).

Exciting! It turns out that having more than one half (AKA < 300 PA) of stats leads to much better results. Until we have another year of xStats data to play with, this is the best test we can do for the predictive ability of xStats, but I’m personally quite impressed that this very simple model built on top of xStats is nearly matching the much more complex Steamer system.

At the outset of this whole study, I was hoping to show xFantasy/xStats were at least marginally useful for projecting forward, and I think we’ve seen that. So now I’ll return to the original question: Might xFantasy actually beat Steamer when major-league sample size is small? The easiest possible comparison would be to break down the projection accuracy by player age…

And…yes! xFantasy does a better job projecting the second half for players under 26. Using just Statcast data, 1h SPD score, 1h team stats, and 1h batting order, xFantasy is able to beat the Steamer projection in HR, RBI, and AVG, along with an essential tie in R. The SB model is still quite bad, but I suspect pulling a longer-term average of SPD score (would have to include minors data) would push it up to Steamer’s level. Of course, Steamer is still kicking butt in both the other age ranges. On a mostly unrelated note, both systems do a great job projecting HR/R/RBI for old players, but a surprisingly poor job of projecting SB!

Next time…

So far I’m impressed with how useful xStats and xFantasy can be. I’m looking forward to integrating the further upgrades that Andrew Perpetua has been working on! I’ve also done some initial work on xFantasy for pitchers, using Andrew’s xOBA and xBACON allowed stats, along with Mike Podhorzer’s xK% and xBB% stats. If I can get it to a place of marginal usefulness, I’ll return for a part IV to look at that!

As I said last time, it’s been fun doing this exploration of rudimentary projections using xFantasy and xStats. Hopefully others find it interesting; hit me up in the comments and let me know anything you might have noticed, or if you have any suggestions.





Ryan works in technical consulting and failure analysis but baseball analysis is his side hustle. Twitter @RyanEBrock and https://GetBigBoard.com/

2 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Andrew Dominijannimember
7 years ago

Perhaps it’s been done, but i’d never seen projection efficacy broken down by age range before. I wonder if the dominant effect is simply weeding out a lot of marginal players by the time they reach that age range, or if players truly become more predictable as they age. I suppose it could be studied by controlling for player (excluding “Kiké” Hernandez?).

Anyhow, nice work. I’m curious: How were you able to pull mid-season steamer projections from the past?