## Comparing 2010 Hitter Forecasts Part 2: Creating Better Forecasts

In Part 1 of this article, I looked at the ability of individual projection systems to forecast hitter performance. The six different projection systems considered are Zips, CHONE, Marcel, CBS Sportsline, ESPN, and Fangraphs Fans, and each is freely available online.  It turns out that when we control for bias in the forecasts, each of the forecasting systems is, on average, pretty much the same.  In what follows here, I show that the Fangraphs Fan projections and the Marcel projections contain the most unique, useful information. Also, I show that a weighted average of the six forecasts predicts hitter performance much better than any individual projection.

Forecast encompassing tests can be used to determine which of a set of individual projections contain the most valuable information. Based on the forecast encompassing test results, we can calculate a forecast that is a weighted average of the six forecasts that will outperform any individual forecast.
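To make the idea of a weighted-average forecast concrete, here is a minimal sketch that combines just two forecasts rather than six. It is not the encompassing-test procedure itself: it simply picks the weight that minimizes squared error on the sample. The names `system_a` and `system_b` and all of the home run totals are made up for illustration.

```python
from math import sqrt

def rmsfe(projected, actual):
    """Root mean squared forecasting error across players."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

def best_weight(f1, f2, actual):
    """Weight on f1 (with 1 - weight on f2) that minimizes in-sample squared error."""
    num = sum((a - y) * (x - y) for x, y, a in zip(f1, f2, actual))
    den = sum((x - y) ** 2 for x, y in zip(f1, f2))
    return num / den

# Made-up home run projections from two hypothetical systems, plus made-up actuals.
system_a = [30, 22, 15, 28, 10]
system_b = [24, 26, 12, 35, 14]
actual_hr = [28, 23, 12, 33, 11]

w = best_weight(system_a, system_b, actual_hr)
combined = [w * x + (1 - w) * y for x, y in zip(system_a, system_b)]

print(f"weight on system_a: {w:.3f}")
print(f"RMSFE a: {rmsfe(system_a, actual_hr):.2f}")
print(f"RMSFE b: {rmsfe(system_b, actual_hr):.2f}")
print(f"RMSFE combined: {rmsfe(combined, actual_hr):.2f}")
```

Because a weight of 0 or 1 reproduces either original forecast, the combined forecast can never do worse than both on the sample used to fit it. Whether it holds up on new seasons is a separate question.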

## Importance of Category Scarcity in Fantasy Baseball

We hear about position scarcity all the time, but category scarcity also plays a role in valuing players. In 2000, 47 players hit at least 30 HR (hmm, wonder why?) as compared to just 18 players in 2010. Mark Reynolds hit 32 HR last year and tied for 10th in baseball. Many fantasy owners continued to start Reynolds every day despite his sub-Mendoza .198 average because his power was so valuable. Had Reynolds hit 32 HR with a .198 average back in 2000, he would have been riding the digital pine. Power wasn’t at a premium back then.

And that’s category scarcity in a nutshell. In fact, position scarcity is really just a function of category scarcity. Shortstop is only considered shallow because there are so few players who can contribute across the board. A quick look at any shortstop rankings shows how rapidly talent plummets at the position.

## Comparing 2010 Hitter Forecasts Part 1: Which System is the Best?

There are a number of published baseball player forecasts that are freely available online. As Dave Allen notes in his article on Fangraphs Fan Projections, and as I find as well, some projections are definitely better than others. Part 1 of this article examines the overall fit of each of six different player forecasts: Zips, CHONE, Marcel, CBS Sportsline, ESPN, and Fangraphs Fans. What I find is that the Marcel projections are the best based on average error, followed by the Zips and CHONE projections. However, if we control for the over-optimism of each of these projection systems, the forecasts are virtually indistinguishable.

This second result is important in that it requires us to dig a little deeper to see how much each of these forecasts is actually helping to predict player performance.  This is addressed in Part 2 of this article.

The tool that is generally used to compare the average fit of a set of forecasts is Root Mean Squared Forecasting Error (RMSFE). This measure is imperfect in that it doesn’t consider the relative value of an over-projection versus an under-projection; for example, in earlier rounds of a fantasy draft we may be drafting to limit risk while in later rounds we may be seeking risk. That being said, RMSFE is pretty easy to understand and is thus the standard for comparing the average fit of a projection.
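For readers who want to see the calculation, here is a minimal sketch of RMSFE: take each player's projection minus his actual total, square it, average across players, and take the square root. The run totals below are made up for four hypothetical hitters.

```python
from math import sqrt

def rmsfe(projected, actual):
    """Root mean squared forecasting error across players."""
    errors = [p - a for p, a in zip(projected, actual)]
    return sqrt(sum(e * e for e in errors) / len(errors))

# Made-up run totals for four hypothetical hitters.
projected_runs = [100, 80, 60, 90]
actual_runs = [90, 85, 40, 95]

print(round(rmsfe(projected_runs, actual_runs), 2))  # prints 11.73
```

Note that squaring treats a 20-run over-projection and a 20-run under-projection identically, which is exactly the limitation described above.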

Table 1 shows the RMSFE of each of the projection systems in each of the five main fantasy categories for hitters. Here, we see that the three “mechanical” projection systems (Marcel, Zips, and CHONE) generally outperform the three “human” projections. Each value can be read as roughly the standard deviation of the error of a particular forecast. In other words, about two-thirds of the time, a player projected by Marcel to score 100 runs will score between 75 and 125 runs.

Table 1. Root Mean Squared Forecasting Error

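| System | Runs | HRs | RBIs | SBs | AVG |
|---|---|---|---|---|---|
| Marcel | 24.43 | 7.14 | 23.54 | 7.37 | 0.0381 |
| Zips | 25.59 | 7.47 | 26.23 | 7.63 | 0.0368 |
| CHONE | 25.35 | 7.35 | 24.12 | 7.26 | 0.0369 |
| Fangraphs Fans | 29.24 | 7.98 | 32.91 | 7.61 | 0.0396 |
| ESPN | 26.58 | 8.20 | 26.32 | 7.28 | 0.0397 |
| CBS | 27.43 | 8.36 | 27.79 | 7.55 | 0.0388 |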
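The “two-thirds of the time” reading of Table 1 leans on an assumption the text does not spell out: that forecast errors are roughly normal and centered on zero. Under that assumption, a quick sketch using Marcel's Runs figure shows where the numbers come from.

```python
from statistics import NormalDist

rmsfe_runs = 24.43  # Marcel, Runs, from Table 1
projection = 100    # projected run total used in the text's example

# Share of a normal distribution within one standard deviation of its mean.
share_within_one_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)

low = projection - rmsfe_runs
high = projection + rmsfe_runs

print(f"{share_within_one_sd:.1%} of outcomes between {low:.0f} and {high:.0f} runs")
```

If a system is biased, the band is not really centered on the projection, so this is only a rule of thumb.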

Another important measure is bias. Bias occurs when a projection consistently over- or under-predicts. Bias inflates the RMSFE, so a simple bias correction may improve a forecast’s fit substantially. In Table 2, we see that the human projection systems exhibit substantially more bias than the mechanical ones.

Table 2. Average Bias

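| System | Runs | HRs | RBIs | SBs | AVG |
|---|---|---|---|---|---|
| Marcel | 7.12 | 2.09 | 5.82 | 1.16 | 0.0155 |
| Zips | 11.24 | 2.55 | 11.62 | 0.73 | 0.0138 |
| CHONE | 10.75 | 2.67 | 9.14 | 0.61 | 0.0140 |
| Fangraphs Fans | 17.75 | 4.03 | 23.01 | 2.80 | 0.0203 |
| ESPN | 13.26 | 3.78 | 11.59 | 1.42 | 0.0173 |
| CBS | 15.09 | 4.08 | 14.17 | 2.05 | 0.0173 |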

We can get a better picture of which forecasting system is best by correcting for bias in the individual forecasts. Table 3 presents the bias-corrected RMSFEs. What we see here is a tightening of the results across the forecasting systems: once bias is removed, each system performs about the same.

Table 3. Bias-corrected Root Mean Squared Forecasting Error

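| System | Runs | HRs | RBIs | SBs | AVG |
|---|---|---|---|---|---|
| Marcel | 23.36 | 6.83 | 22.81 | 7.28 | 0.0348 |
| Zips | 22.98 | 7.02 | 23.52 | 7.59 | 0.0341 |
| CHONE | 22.96 | 6.85 | 22.33 | 7.24 | 0.0341 |
| Fangraphs Fans | 23.24 | 6.88 | 23.53 | 7.08 | 0.0340 |
| ESPN | 23.03 | 7.27 | 23.62 | 7.14 | 0.0357 |
| CBS | 22.91 | 7.29 | 23.90 | 7.27 | 0.0347 |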
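The article does not say exactly how the bias correction was done. Assuming it amounts to subtracting each system's average error from its forecasts, the mean squared error splits into squared bias plus the variance of the errors, so the corrected RMSFE should equal the square root of RMSFE squared minus bias squared. The sketch below checks that against the Runs columns of Tables 1 through 3.

```python
from math import sqrt

# Runs columns copied from Tables 1-3:
# (RMSFE, average bias, published bias-corrected RMSFE).
runs = {
    "Marcel": (24.43, 7.12, 23.36),
    "Zips": (25.59, 11.24, 22.98),
    "CHONE": (25.35, 10.75, 22.96),
    "Fangraphs Fans": (29.24, 17.75, 23.24),
    "ESPN": (26.58, 13.26, 23.03),
    "CBS": (27.43, 15.09, 22.91),
}

def bias_corrected(rmsfe, bias):
    """RMSFE left over after removing a constant bias (assumed correction method)."""
    return sqrt(rmsfe ** 2 - bias ** 2)

for system, (rmsfe, bias, published) in runs.items():
    implied = bias_corrected(rmsfe, bias)
    print(f"{system:15s} implied {implied:5.2f}   Table 3 {published:5.2f}")
```

The implied values land within rounding error of Table 3, which is consistent with that assumption and shows why the most biased system (the Fans) gains the most from the correction.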

So where does this leave us if these six forecasts are basically indistinguishable? As it turns out, evaluating the performance of individual forecasts doesn’t tell the whole story. It may be true that there is useful information in each of the different forecasting systems, so that an average or a weighted average of forecasts may prove to be a better predictor than any individual forecast. Part 2 of this article examines this in some detail. Stay tuned!

## Jeter, Ichiro, And 100 WAR

Recently, David Appelman introduced all of us to the Automated WAR grids. When I clicked into the WAR grids section, the top 25 all-time leaders in recorded MLB history were shown as the sample grid. I took some time to let the awe set in, admiring the absolute dominance of the true legends of the game who seem to transcend even the Hall of Fame.

One of the first things I noticed was that every one of them had at least 100 career WAR. I got to thinking about which players we watch today might someday appear on this elite 100+ WAR list. There were 19 players active in 2010 who had accumulated 50 career WAR or better. At the top we already see ARod at 120, the only current player who we know for certain fits into that super-elite status. After ARod there is Pujols, who has racked up 81 WAR to date and will likely only need 3 more seasons to join the club. The rest of the players on the list are all guys who are at least in their late 30s, and many of them are on the cusp of retirement and/or are in dramatic decline. Realistically, there were only two other players who I thought might have an outside shot at 100 WAR: Jeter and Ichiro.

## Jason Hammel and the Oddity of ERA

ERA can be a weird thing at times. I love it, but it doesn’t always reveal the full story. Jason Hammel is the perfect subject. After six years in the Rays minor league system, and three bad stints with the Rays Major League club, he found himself looking up at a logjam of starting pitchers in Tampa Bay. The Rays traded him to the Rockies after the 2008 season in exchange for Aneury Rodriguez.

With the trade to Colorado, Hammel was given a great opportunity to start in the Majors for a full season. Since his arrival in Colorado two seasons ago, Hammel has been nothing but consistent. Take a look at his stats:

## The Next Jose Bautista

Jose Bautista took the baseball world by storm in 2010 when, after six MLB seasons of doing nothing in particular, he emerged as a candidate for AL MVP. Compare his 54 homers, 124 RBI, and .995 OPS in 161 games last year to the 59 homers, 211 RBI, and .729 OPS he posted in nearly 600 games from 2004-09. Using WAR/PA, Bautista was more than 11 times better in 2010 than he’d been for the rest of his career.

Interestingly, Bautista’s breakout came just a year after Ben Zobrist came out of nowhere to become the second-most valuable player in baseball. After hitting .222/.279/.370 with just 15 homers, 57 RBI and -0.5 WAR in roughly a full season’s worth of games from 2006-08, Zobrist went bananas in 2009, hitting .297/.405/.543 with 27 taters, 91 knocked in, and 8.4 WAR.

Besides the fact that no one expected monster breakouts from either of them, 2009 Zobrist and 2010 Bautista had some interesting things in common. Both had extensive experience in the big leagues but neither had done anything particularly impressive. Both entered their seasons with at least some questions about what their roles would be. And both had enjoyed out-of-nowhere power surges during their respective previous Septembers.

## Another Look at Arod’s 2010 Performance Against Lefties

This post originally appeared on The Captain’s Blog and is a followup to one published at both the Yankeeist and Fangraphs’ Community Forum.

Over at the Yankeeist, Larry Koestler took a look at one of 2010’s most curious mysteries: Alex Rodriguez’ shockingly poor performance against left-handed pitchers. Using pitchFX data, Koestler concludes that the pitch selection of opposing southpaws (i.e., fewer four-seamers and more cutters, two-seamers and sinkers) contributed to Arod’s struggles, while also conceding the limited sample size. But could the answer be much more benign?

## A PitchFX Look at A-Rod’s Bizarre Reverse Platoon Split

This post originally appeared on Yankeeist.

It’s no secret that Alex Rodriguez produced the lowest full-season wOBA of his career in 2010: his .363 mark was fueled by career lows in batting average (.270) and on-base percentage (.341) and the second-lowest full-season SLG of his career (.506). That these numbers were dramatically off not only from his superb 2009 (.286/.402/.532; .405 wOBA) but also from his majestic career triple slash (.303/.387/.571) suggests to me that he should be due for a reasonable bounceback. While it’s not impossible Alex has reached an irreversible decline, he’s been too historically good for me to be willing to write him off just yet. I won’t go so far as to proclaim that the Yankees are going to be getting .400-plus-wOBA A-Rod back, but as I’ve noted on at least one occasion this offseason, all A-Rod needs to do is exercise just a tad more patience and a wOBA in the .380s should be more than doable.

## Hall of Shame: Why BBWAA’s Secret Ballots Matter

This was originally posted on WahooBlues.com.

When the Baseball Writers Association of America announced Wednesday that Roberto Alomar and Bert Blyleven had been elected to the Baseball Hall of Fame, two worthy inductees who had waited too long were granted entrance to Cooperstown. But to judge the voting process solely by the selections of two worthy candidates would be to ignore the massive problems with the way the BBWAA does business.

## Heyward, Stanton, and 20 year-old studs

Eno Sarris’s recent article on Jason Heyward comps got me thinking about comps. It also happens to coincide with the day that I got my Baseball-reference subscription. That I would start looking at seasons from 20 year-olds was inevitable.

It was maybe the third or fourth thing I noticed that 2010 featured another remarkable season from a 20 year-old hitter: Mike Stanton. Here’s a fun fact about Heyward: among 20 year-olds, only two guys walked in more plate appearances than the Braves’ young stud. (Ted Williams and Mel Ott.) Here’s a fun fact about Stanton: the guy closest to him in batted balls for home runs, among 20 year-olds, is Mel Ott, but Mike Stanton sent a greater percentage of batted balls over the fence than any age 20 hitter in the Retrosheet era. (Perhaps less fun: he has the highest K% among 20 year-olds too.)

But who are the players most comparable to Stanton and Heyward? To answer this question, I started focusing on three true outcome rate stats (since those are more stable in small samples than ball-in-play stats) in seasons from 20 year-old hitters (regardless of experience). While it’s tempting to focus on rookies, there are just 102 seasons with 200+ PA from a 20 year-old since 1920, so focusing on similarly young rookies just shrinks an already small group. To expand the group a little, I added 21 year-olds in their first season (also cut off at 200 PA).

To compare these players, I developed z-scores for each player’s BB/PA, K/AB, and HR/batted ball (AB-K). (See a technical section below on these scores.) Then, treating each 20 year-old’s three z-scores as a vector, I found the distance of that vector from Heyward’s and Stanton’s vectors. The smaller the distance, the more comparable the player.
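Here is a minimal sketch of that comparison. It assumes plain Euclidean distance and z-scores computed with the population standard deviation of the group, which may not match the choices in the technical section. The hitters and their rate stats are made up.

```python
from math import dist
from statistics import mean, pstdev

# Made-up rate stats for illustration: (BB/PA, K/AB, HR per batted ball).
seasons = {
    "Target": (0.15, 0.25, 0.05),
    "Hitter A": (0.14, 0.23, 0.05),
    "Hitter B": (0.06, 0.12, 0.02),
    "Hitter C": (0.09, 0.31, 0.09),
}

def z_scores(seasons):
    """Convert each stat to a z-score relative to the whole group."""
    columns = list(zip(*seasons.values()))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]
    return {
        name: tuple((x - m) / s for x, m, s in zip(stats, centers, spreads))
        for name, stats in seasons.items()
    }

def most_comparable(seasons, target):
    """Other players sorted by distance of their z-score vector from the target's."""
    z = z_scores(seasons)
    others = [name for name in z if name != target]
    return sorted(others, key=lambda name: dist(z[name], z[target]))

ranking = most_comparable(seasons, "Target")
print(ranking)
```

Standardizing first keeps one stat with a wide spread (strikeout rate, say) from dominating the distance just because its raw numbers are larger.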