Evaluating 2013 Projections

Welcome to the 3rd annual forecast competition, in which each forecaster who submits projections to bbprojectionproject.com is evaluated on RMSE and model R^2 relative to actuals (see last year’s results here). The categories evaluated are AVG, Runs, HR, RBI, and SB for hitters, and Wins, ERA, WHIP, and Strikeouts for pitchers. RMSE is a popular metric for evaluating forecast accuracy, but I actually prefer R^2. R^2 removes average bias (see here) and effectively evaluates forecasted player-by-player variation, which makes it more useful when attempting to rank players (e.g. for fantasy baseball purposes).
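To make the difference between the two metrics concrete, here is a minimal sketch. It assumes "model R^2" means the R^2 from a simple regression of actuals on forecasts (equivalently, the squared correlation); the post does not spell this out, and the HR totals below are made up for illustration.

```python
import math

def rmse(forecast, actual):
    """Root mean squared error between forecasts and actuals."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)

def r_squared(forecast, actual):
    """R^2 of a simple regression of actuals on forecasts
    (assumed definition: the squared Pearson correlation)."""
    n = len(actual)
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    cov = sum((f - mean_f) * (a - mean_a) for f, a in zip(forecast, actual))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov ** 2 / (var_f * var_a)

# Hypothetical data, not taken from the competition.
actual = [30, 22, 15, 41, 8]
forecast = [27, 25, 12, 35, 11]
biased = [f + 5 for f in forecast]  # same forecast, uniformly too optimistic

print("RMSE:", rmse(forecast, actual), "vs biased:", rmse(biased, actual))
print("R^2: ", r_squared(forecast, actual), "vs biased:", r_squared(biased, actual))
```

Adding a constant to every forecast changes RMSE but leaves R^2 untouched, which is the sense in which R^2 ignores average bias and scores only the player-by-player variation.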

Here are the winners for 2013 by R^2 (more detailed tables are below):

| Place | Forecast System | Hitters | Pitchers | Average |
|---|---|---|---|---|
| 1st | Dan Rosenheck | 2.80 | 2.50 | 2.65 |
| 2nd | Steamer | 1.60 | 6.00 | 3.80 |
| 3rd | FanGraphs Fans | 5.80 | 2.75 | 4.28 |
| 4th | Will Larson | 6.60 | 3.00 | 4.80 |
| 5th | AggPro | 6.40 | 4.25 | 5.33 |
| 6th | CBS Sportsline | 5.40 | 8.00 | 6.70 |
| 7th | ESPN | 6.60 | 7.50 | 7.05 |
| 8th | John Grenci | | 8.00 | 8.00 |
| 9th | ZiPS | 9.80 | 7.25 | 8.53 |
| 10th | Razzball | 6.80 | 10.25 | 8.53 |
| 11th | Rotochamp | 8.60 | 9.00 | 8.80 |
| 12th | Sports Illustrated | 8.80 | 12.00 | 10.40 |
| 13th | Guru | 10.60 | 12.00 | 11.30 |
| 14th | Marcel | 11.20 | 12.50 | 11.85 |
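The Hitters and Pitchers columns appear to be average ranks across the evaluated categories, with Average being the mean of the two. The sketch below checks that reading against one row, using Dan Rosenheck's per-category ranks from the detailed R^2 tables; the function name is just an illustrative helper.

```python
def average_rank(ranks):
    """Mean of a system's per-category ranks (lower is better)."""
    return sum(ranks) / len(ranks)

# Dan Rosenheck's ranks, copied from the detailed R^2 tables.
hitter_ranks = [3, 1, 3, 3, 4]  # runs, HR, RBI, AVG, SB
pitcher_ranks = [1, 2, 2, 5]    # W, ERA, WHIP, SO

hitters = average_rank(hitter_ranks)
pitchers = average_rank(pitcher_ranks)
overall = (hitters + pitchers) / 2

print(f"Hitters {hitters:.2f}, Pitchers {pitchers:.2f}, Average {overall:.2f}")
```

This reproduces the 2.80, 2.50, and 2.65 shown for the first-place row.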

 

And here are the winners for the RMSE portion of the competition:

| Place | Forecast System | Hitters | Pitchers | Average |
|---|---|---|---|---|
| 1st | Dan Rosenheck | 2.60 | 2.00 | 2.30 |
| 2nd | Will Larson | 3.60 | 2.50 | 3.05 |
| 3rd | Steamer | 1.80 | 5.00 | 3.40 |
| 4th | AggPro | 4.00 | 3.00 | 3.50 |
| 5th | ZiPS | 6.00 | 5.75 | 5.88 |
| 6th | Guru | 4.80 | 7.25 | 6.03 |
| 7th | Marcel | 6.20 | 8.50 | 7.35 |
| 8th | John Grenci | | 7.50 | 7.50 |
| 9th | Rotochamp | 9.40 | 9.00 | 9.20 |
| 10th | ESPN | 9.20 | 10.50 | 9.85 |
| 11th | FanGraphs Fans | 11.80 | 8.75 | 10.28 |
| 12th | Razzball | 9.40 | 11.25 | 10.33 |
| 13th | Sports Illustrated | 10.60 | 11.75 | 11.18 |
| 14th | CBS Sportsline | 11.60 | 12.25 | 11.93 |

 

I’m beginning to notice some trends in the results across years.

First, systems that include averaging do particularly well. This is pretty well established by now, but it’s always useful to reflect upon. I have been asked in the past to evaluate forecasts computed by averaging separately from those that do not include information from others’ forecasts (more “structural” forecasts). I decided not to do this because the nature of the baseball forecasting “season” makes it impossible to be sure that a forecast was created without taking others’ forecasts into account. The influence can be direct (forecasting as a weighted average of others’ forecasts), but it can also occur in more subtle ways, such as model selection based on forecasts that others have put forward.

Second, the FanGraphs Fans are always fascinating to me: their forecasts can be so biased and yet contain some of the best unique and relevant information for forecasting player variation. The takeaway from the Fans forecast set is that crowdsourced averaging works, as long as you can remove the bias in some way or ignore it by focusing on ordinal ranks instead.
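As a rough illustration of the two ideas above (averaging across systems, and either removing bias or falling back on ordinal ranks), here is a small sketch. All numbers and function names are hypothetical, and the bias is estimated in-sample purely for simplicity; in practice it would have to come from earlier seasons.

```python
from statistics import mean

def average_forecast(systems):
    """Equal-weight average across systems, player by player."""
    return [mean(values) for values in zip(*systems)]

def remove_mean_bias(forecast, actual):
    """Subtract the average forecast error from every forecast."""
    bias = mean(f - a for f, a in zip(forecast, actual))
    return [f - bias for f in forecast]

def ranks(values):
    """Ordinal ranks, 1 = largest value (ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result

# Made-up HR forecasts for five players.
actual = [30, 22, 15, 41, 8]
system_a = [27, 25, 12, 35, 11]
system_b = [33, 20, 18, 38, 6]
fans = [36, 29, 21, 45, 15]  # optimistic across the board

combined = average_forecast([system_a, system_b])
debiased = remove_mean_bias(fans, actual)

print("combined:", combined)
print("fans ranks:    ", ranks(fans))
print("debiased ranks:", ranks(debiased))
```

Subtracting a constant bias does not change the ordering of players, which is why a biased forecast set can still be useful for ranking.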

Some additional notes: it would be interesting to decompose these aggregate stats into rates multiplied by playing time, but it’s difficult to gather all of this for each projection system, so I focus on top-line output metrics. Also, absolute rankings are presented, but many of these are likely statistically indistinguishable from each other. If you want to run Diebold-Mariano tests, you can download the data used in this comparison from bbprojectionproject.com.
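For anyone who wants to try this, below is a simplified sketch of a Diebold-Mariano style comparison of two systems on one category. It assumes squared-error loss, treats the per-player loss differentials as independent (so no autocorrelation correction), and relies on the normal approximation. The data are made up, and the function is illustrative rather than a reference implementation.

```python
import math
from statistics import NormalDist, mean, stdev

def diebold_mariano(forecast_a, forecast_b, actual):
    """Simplified cross-sectional DM test with squared-error loss.

    Returns (statistic, two-sided p-value). A negative statistic means
    forecast_a had the lower average squared error.
    """
    # Per-player loss differential: squared error of A minus squared error of B.
    d = [(fa - y) ** 2 - (fb - y) ** 2
         for fa, fb, y in zip(forecast_a, forecast_b, actual)]
    n = len(d)
    standard_error = stdev(d) / math.sqrt(n)
    statistic = mean(d) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(statistic)))
    return statistic, p_value

# Hypothetical forecasts for six players.
actual = [30, 22, 15, 41, 8, 19]
system_a = [27, 25, 12, 35, 11, 20]
system_b = [36, 29, 21, 45, 15, 26]

statistic, p_value = diebold_mariano(system_a, system_b, actual)
print(f"DM statistic = {statistic:.2f}, p-value = {p_value:.3f}")
```

With only a handful of players the normal approximation is unreliable; a real comparison would use the full player pool from the downloadable data.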

Thanks for reading, and please submit your projections for next year! Also, as always, I welcome any comments, and I’ll do my best to respond.

R^2 Detailed Tables

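| system | runs | rank | hr | rank | rbi | rank | avg | rank | sb | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AggPro | 0.250 | 6 | 0.42 | 9 | 0.308 | 8 | 0.32 | 1 | 0.538 | 8 | 6.4 |
| Dan Rosenheck | 0.296 | 3 | 0.45 | 1 | 0.340 | 3 | 0.3 | 3 | 0.568 | 4 | 2.8 |
| Steamer | 0.376 | 1 | 0.45 | 2 | 0.393 | 1 | 0.31 | 2 | 0.572 | 2 | 1.6 |
| Will Larson | 0.336 | 2 | 0.43 | 6 | 0.345 | 2 | 0.21 | 13 | 0.509 | 10 | 6.6 |
| Marcel | 0.146 | 12 | 0.36 | 12 | 0.236 | 12 | 0.27 | 8 | 0.477 | 12 | 11.2 |
| ZiPS | 0.118 | 13 | 0.42 | 8 | 0.230 | 13 | 0.3 | 4 | 0.504 | 11 | 9.8 |
| CBS Sportsline | 0.278 | 4 | 0.44 | 3 | 0.320 | 4 | 0.25 | 10 | 0.542 | 6 | 5.4 |
| ESPN | 0.241 | 7 | 0.43 | 5 | 0.317 | 5 | 0.29 | 7 | 0.532 | 9 | 6.6 |
| Razzball | 0.239 | 8 | 0.43 | 4 | 0.314 | 6 | 0.24 | 11 | 0.553 | 5 | 6.8 |
| Rotochamp | 0.234 | 9 | 0.41 | 10 | 0.287 | 9 | 0.23 | 12 | 0.569 | 3 | 8.6 |
| FanGraphs Fans | 0.268 | 5 | 0.42 | 7 | 0.272 | 10 | 0.3 | 6 | 0.574 | 1 | 5.8 |
| Guru | 0.186 | 11 | 0.33 | 13 | 0.263 | 11 | 0.3 | 5 | 0.476 | 13 | 10.6 |
| Sports Illustrated | 0.221 | 10 | 0.4 | 11 | 0.314 | 7 | 0.27 | 9 | 0.541 | 7 | 8.8 |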

 

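| system | W | rank | ERA | rank | WHIP | rank | SO | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|
| AggPro | 0.13 | 3 | 0.15 | 4 | 0.25 | 4 | 0.402 | 6 | 4.25 |
| Dan Rosenheck | 0.17 | 1 | 0.19 | 2 | 0.27 | 2 | 0.406 | 5 | 2.5 |
| Steamer | 0.09 | 6 | 0.15 | 3 | 0.26 | 3 | 0.341 | 12 | 6 |
| Will Larson | 0.16 | 2 | 0.19 | 1 | 0.24 | 5 | 0.413 | 4 | 3 |
| Marcel | 0.05 | 14 | 0.02 | 13 | 0.17 | 9 | 0.293 | 14 | 12.5 |
| ZiPS | 0.09 | 7 | 0.07 | 9 | 0.21 | 6 | 0.375 | 7 | 7.25 |
| CBS Sportsline | 0.1 | 5 | 0.08 | 7 | 0.15 | 10 | 0.359 | 10 | 8 |
| ESPN | 0.08 | 10 | 0.05 | 11 | 0.2 | 7 | 0.43 | 2 | 7.5 |
| Razzball | 0.06 | 13 | 0.07 | 8 | 0.14 | 12 | 0.374 | 8 | 10.3 |
| Rotochamp | 0.08 | 9 | 0.06 | 10 | 0.17 | 8 | 0.359 | 9 | 9 |
| FanGraphs Fans | 0.11 | 4 | 0.08 | 5 | 0.28 | 1 | 0.435 | 1 | 2.75 |
| Guru | 0.07 | 11 | 0.05 | 12 | 0.11 | 14 | 0.343 | 11 | 12 |
| Sports Illustrated | 0.09 | 8 | 0.02 | 14 | 0.14 | 13 | 0.338 | 13 | 12 |
| John Grenci | 0.07 | 12 | 0.08 | 6 | 0.15 | 11 | 0.42 | 3 | 8 |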

 

RMSE Detailed Tables

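| system | runs | rank | hr | rank | rbi | rank | avg | rank | sb | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AggPro | 22.495 | 4 | 7.34 | 4 | 23.217 | 4 | 0.03 | 4 | 7.096 | 4 | 4 |
| Dan Rosenheck | 20.792 | 3 | 6.91 | 1 | 21.867 | 2 | 0.03 | 5 | 6.467 | 2 | 2.6 |
| Steamer | 20.355 | 2 | 7.02 | 2 | 21.817 | 1 | 0.03 | 3 | 6.258 | 1 | 1.8 |
| Will Larson | 20.091 | 1 | 7.2 | 3 | 22.234 | 3 | 0.03 | 8 | 6.864 | 3 | 3.6 |
| Marcel | 23.473 | 6 | 7.51 | 6 | 23.831 | 6 | 0.03 | 7 | 7.334 | 6 | 6.2 |
| ZiPS | 25.380 | 7 | 7.43 | 5 | 25.662 | 7 | 0.03 | 1 | 8.048 | 10 | 6 |
| CBS Sportsline | 25.866 | 10 | 8.63 | 13 | 26.837 | 10 | 0.03 | 12 | 8.527 | 13 | 11.6 |
| ESPN | 25.698 | 8 | 8.37 | 12 | 26.418 | 9 | 0.03 | 6 | 8.120 | 11 | 9.2 |
| Razzball | 25.831 | 9 | 8.01 | 9 | 27.842 | 12 | 0.03 | 9 | 7.920 | 8 | 9.4 |
| Rotochamp | 26.199 | 11 | 8 | 8 | 25.995 | 8 | 0.04 | 13 | 7.686 | 7 | 9.4 |
| FanGraphs Fans | 26.854 | 13 | 8.12 | 10 | 30.804 | 13 | 0.03 | 11 | 8.289 | 12 | 11.8 |
| Guru | 23.187 | 5 | 7.58 | 7 | 23.608 | 5 | 0.03 | 2 | 7.198 | 5 | 4.8 |
| Sports Illustrated | 26.609 | 12 | 8.24 | 11 | 27.173 | 11 | 0.03 | 10 | 8.009 | 9 | 10.6 |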

 

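| system | W | rank | ERA | rank | WHIP | rank | SO | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|
| AggPro | 4.4 | 3 | 1.031 | 4 | 0.17 | 4 | 47.01 | 1 | 3 |
| Dan Rosenheck | 4.25 | 1 | 1.014 | 1 | 0.17 | 1 | 47.9 | 5 | 2 |
| Steamer | 5.02 | 8 | 1.030 | 3 | 0.17 | 2 | 49.45 | 7 | 5 |
| Will Larson | 4.34 | 2 | 1.017 | 2 | 0.17 | 3 | 47.44 | 3 | 2.5 |
| Marcel | 4.62 | 5 | 1.158 | 13 | 0.18 | 8 | 50.84 | 8 | 8.5 |
| ZiPS | 4.78 | 7 | 1.101 | 7 | 0.17 | 5 | 47.85 | 4 | 5.75 |
| CBS Sportsline | 5.56 | 13 | 1.134 | 11 | 0.19 | 11 | 57.14 | 14 | 12.3 |
| ESPN | 5.81 | 14 | 1.126 | 10 | 0.18 | 7 | 53.54 | 11 | 10.5 |
| Razzball | 5.39 | 12 | 1.115 | 8 | 0.19 | 12 | 55.55 | 13 | 11.3 |
| Rotochamp | 4.71 | 6 | 1.138 | 12 | 0.18 | 9 | 51.81 | 9 | 9 |
| FanGraphs Fans | 5.29 | 10 | 1.123 | 9 | 0.17 | 6 | 52.57 | 10 | 8.75 |
| Guru | 4.51 | 4 | 1.093 | 6 | 0.19 | 13 | 48.79 | 6 | 7.25 |
| Sports Illustrated | 5.33 | 11 | 1.176 | 14 | 0.18 | 10 | 55.32 | 12 | 11.8 |
| John Grenci | 5.14 | 9 | 1.080 | 5 | 0.19 | 14 | 47.26 | 2 | 7.5 |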

 





7 Comments
jerusalem-artichoke
10 years ago

thanks for doing this Will yet again it’s one of the most interesting articles on fangraphs all year

sam
10 years ago

which of these projection systems will be publicly available? seems like the highest performing ones, Rosenheck, AggPro, Larson, are not available, at least not yet

gabriel syme
10 years ago

So here’s my question: we know that Fangraphs Fan projections have some systematic biases – at least as of a few years ago, they were too optimistic about performance for almost all players. There may be other systematic biases. What happens to the Fangraphs Fan projections if you correct for the systematic biases?

William Wallace
10 years ago

Should we infer that FanGraph Fans projections are doing a better job of projecting IP for pitchers and thus are pretty good at projecting Ks? Or am I missing something?

Pitcher rate stats look great for Steamer, but Ks look terrible, so IP projection problems jumped to mind for me.

obsessivegiantscompulsive
10 years ago

He’s no ape, Marcel clearly loves RMSE!

Hey, nice analysis Will. I was wondering if there is any easy way for you to test out how well, say, if we combined all the projection systems together, how that composite projection would do with your analysis. Or, say, if you combined the top three public forecasts, your forecast plus Steamer plus ZIPS, how that would have done in the rankings. I assume they would be better, just curious how they would have done.

I was surprised that Oliver was not in the study since their data is in Fangraphs. Why wasn’t it?

Rufus T. Firefly
10 years ago

Any feel for how PECOTA might stack up?