Comparing 2010 Hitter Forecasts, Part 2: Creating Better Forecasts
by Will Larson
January 27, 2011

In Part 1 of this article, I looked at the ability of individual projection systems to forecast hitter performance. The six projection systems considered are Zips, CHONE, Marcel, CBS Sportsline, ESPN, and Fangraphs Fans, each of which is freely available online. It turns out that once we control for bias in the forecasts, the forecasting systems are, on average, essentially indistinguishable. In what follows, I show that the Fangraphs Fan projections and the Marcel projections contain the most unique, useful information, and that a weighted average of the six forecasts predicts hitter performance much better than any individual projection.

Forecast encompassing tests can be used to determine which of a set of individual projections contain the most valuable information. Based on the encompassing test results, we can calculate a weighted average of the six forecasts that outperforms any individual forecast. The term "forecast encompassing" sounds complicated, but the idea is simple: if one projection contains no unique information helpful for forecasting beyond what another projection already provides, the first projection is said to be "forecast encompassed" and can be discarded. When we are left with a group of forecasts, none of which encompasses another, then each must contain some unique, relevant information.

Table 1 shows the optimal forecast weights after forecast encompassing tests have eliminated the forecasts with duplicate or irrelevant information. One thing we see is that the Fangraphs Fan projections contain a large amount of unique information relevant for forecasting in every statistical category. Marcel projections are relevant in four categories.
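Before turning to the remaining systems, the encompassing logic just described can be sketched as a regression of outcomes on two competing forecasts: if the estimated weight on the second forecast is near zero, the first forecast encompasses it. The sketch below is a minimal illustration on simulated data, not the article's actual test procedure; the sample size, noise levels, and variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical example: true home-run totals plus two noisy forecasts.
# forecast_b is just forecast_a plus extra noise, so it carries no
# unique information and should be "forecast encompassed".
true_hr = rng.uniform(5, 40, size=n)
forecast_a = true_hr + rng.normal(0, 4, size=n)
forecast_b = forecast_a + rng.normal(0, 2, size=n)

# Encompassing-style regression: actual = b1*f_a + b2*f_b + error.
# A b2 near zero suggests forecast_a encompasses forecast_b.
X = np.column_stack([forecast_a, forecast_b])
coef, *_ = np.linalg.lstsq(X, true_hr, rcond=None)
b1, b2 = coef
print(f"weight on forecast_a: {b1:.2f}, weight on forecast_b: {b2:.2f}")
```

Because forecast_b adds only noise on top of forecast_a, essentially all of the regression weight lands on forecast_a; a forecast with genuinely new information would instead retain a meaningful weight.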
ESPN and CHONE projections are useful in only two categories each, Zips in one, and the CBS projections contain no unique, useful information at all by these metrics.

Table 1. Optimal Forecast Weights

                  Runs   HRs  RBIs   SBs   AVG
Marcel            0.22  0.53  0.25  0.38
Zips              0.30
CHONE                         0.44        0.44
Fangraphs Fans    0.19  0.47  0.31  0.29  0.55
ESPN              0.29              0.33
CBS

Using these weights, we can compute a forecast for each statistic that is a weighted average of these six publicly available forecasts. Table 2 shows the Root Mean Squared Forecasting Error (RMSFE) of this composite forecast alongside the six individual forecasts. The weighted average performs substantially better than any individual forecast.

Table 2. Root Mean Squared Forecasting Error

                   Runs   HRs   RBIs   SBs     AVG
Marcel            24.43  7.14  23.54  7.37  0.0381
Zips              25.59  7.47  26.23  7.63  0.0368
CHONE             25.35  7.35  24.12  7.26  0.0369
Fangraphs Fans    29.24  7.98  32.91  7.61  0.0396
ESPN              26.58  8.20  26.32  7.28  0.0397
CBS               27.43  8.36  27.79  7.55  0.0388
Weighted Average  21.74  6.62  21.71  6.77  0.0338

Even when we correct for the over-optimism of the six base projections, the weighted average still does better in every category, though by a smaller margin.

Table 3. Bias-corrected Root Mean Squared Forecasting Error

                   Runs   HRs   RBIs   SBs     AVG
Marcel            23.36  6.83  22.81  7.28  0.0348
Zips              22.98  7.02  23.52  7.59  0.0341
CHONE             22.96  6.85  22.33  7.24  0.0341
Fangraphs Fans    23.24  6.88  23.53  7.08  0.0340
ESPN              23.03  7.27  23.62  7.14  0.0357
CBS               22.91  7.29  23.90  7.27  0.0347
Weighted Average  21.74  6.62  21.71  6.77  0.0338

So what are the takeaways from this two-part series comparing six freely available sets of hitter forecasts?

1) Without correcting for the over-optimism (bias) of the forecasts, the mechanical systems (Marcel, CHONE, and Zips) outperform the others.

2) Once the biases are corrected, no single set of forecasts is meaningfully better than another.

3) A weighted average of the forecasts performs much better than any individual forecast.
4) Forecast encompassing tests indicate that the Fangraphs Fan projections and the Marcel projections contain the most unique and relevant information of the forecasts considered.
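To make the combination step concrete, the sketch below blends two forecasts with the Table 1 home-run weights (0.53 Marcel, 0.47 Fangraphs Fans) and compares each forecast's RMSFE against the blend. The player totals here are invented purely for illustration, not taken from the actual 2010 data.

```python
import numpy as np

# Made-up HR totals and forecasts for five hypothetical hitters.
actual = np.array([25, 31, 12, 40, 18], dtype=float)
marcel = np.array([22, 28, 15, 35, 20], dtype=float)
fans   = np.array([28, 36, 10, 44, 17], dtype=float)

def rmsfe(forecast, actual):
    """Root mean squared forecasting error."""
    return np.sqrt(np.mean((forecast - actual) ** 2))

# Weighted-average forecast using the Table 1 HR weights.
blend = 0.53 * marcel + 0.47 * fans

for name, f in [("Marcel", marcel), ("Fans", fans), ("Blend", blend)]:
    print(f"{name:>6}: RMSFE = {rmsfe(f, actual):.2f}")
```

In this toy example the two systems err in opposite directions, so the blend's errors largely cancel and its RMSFE falls well below either input, which is the same mechanism behind the gains in Table 2.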