Fantasy Rankings: Why Methodology Matters

by The Stranger

March 10, 2014

By far, the hardest thing about fantasy baseball is the fact that you can’t predict the future. Every year, a Matt Carpenter or a Chris Davis vastly outperforms expectations and wins a fantasy league for somebody, and a Matt Kemp battles injuries all year and makes somebody else tear their hair out. But you learn to deal with that sort of thing, or you take up a less stressful hobby, like Russian roulette. C’est la vie, and all that.

This article, however, is about the second-hardest thing about fantasy baseball: trying to juggle categories. Which is better, Mike Trout’s five-category production, or Miguel Cabrera’s dominance in four categories? How much is it worth to have Billy Hamilton singlehandedly win stolen bases for you while contributing nothing in the other categories? Can you absorb Pedro Alvarez’s batting average hit for the home runs he gives you? Over the years, people have come up with a few different ways to try to answer those questions. Standings Gain Points (SGP) is one popular method. Z-scores are another. There are others, but those are the two I see the most, so they’re the two I’m going to talk about. The point of this article isn’t to compare all of the ranking systems out there and figure out which one is “right.” The point of this article is to call attention to the fact that your choice of ranking system matters, probably more than you think.

Of course, most fantasy ranking systems start with projections. Personally, I like to use composite projections, because I think there’s value in combining projections and smoothing out spots where one system might be exceptionally high or low on a player. You can disagree with the projections – that’s not the point. The point is, you (or your fantasy expert of choice, if you use published rankings) can take the same projections, plug them into different ranking systems, and get substantially different results.

For the purposes of this article, I’m keeping things very simple, perhaps a little too simple. I don’t care about volatility, risk, upside, injuries, etc. I’m assuming that these projections are accurate. And I’m not going to bother with positional adjustment, because I’m lazy and these aren’t the rankings I’m drafting from, and it doesn’t matter anyway. I’m concerned with how using different methods changes players’ rankings relative to each other, not how much to bump Buster Posey up my draft board because I need a catcher. And I’m looking at rankings, not auction values, because that’s another step that I don’t feel like taking right now.

I’m going to look at the shortstop position (specifically the top 14, because I play in a 14-team league) for this article, because I need to narrow things down to a manageable number of players. I’m assuming a standard 5×5 league. And what I’m looking at is SGP (using the formula here), compared to two slightly different ways of calculating z-scores. In all cases, I’m looking at the rankings of each player among shortstops and among all hitters. Really, though, I’m concerned with the overall rankings because I want to see how players move around – the choice to focus on shortstops is just a convenient way to select a handful of players to look at.

Anyway, on to the fun stuff:

SGP shortstop rankings:

Player Name	AB	H	R	HR	RBI	SB	AVG	ORANK	SSRANK
Troy Tulowitzki	525	157	84	28	91	3	0.300	19	1
Hanley Ramirez	510	146	81	23	81	16	0.287	22	2
Jose Reyes	573	169	88	12	54	26	0.295	32	3
Jean Segura	592	164	77	10	51	37	0.277	39	4
Ian Desmond	568	156	72	20	77	19	0.275	41	5
Elvis Andrus	612	168	80	5	60	35	0.275	47	6
Everth Cabrera	575	149	76	4	42	49	0.259	50	7
Ben Zobrist	580	157	82	15	77	11	0.271	71	8
Starlin Castro	636	177	77	12	58	14	0.278	94	9
Asdrubal Cabrera	539	141	70	16	68	11	0.261	108	10
Andrelton Simmons	578	157	73	14	61	9	0.271	115	11
J.J. Hardy	577	151	70	23	69	1	0.262	116	12
Alexei Ramirez	595	161	63	8	57	21	0.270	118	13
Bradley Miller	522	142	71	14	57	11	0.271	119	14

Looks reasonable. I don’t know. We don’t have anything to compare it to yet. So let’s compare it to z-scores. For this example, I’m going to calculate my average and standard deviation for each category using all players projected for over 300 at bats.
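As a rough illustration of that calculation, here is a minimal Python sketch. It is not the exact spreadsheet behind the tables: the field names are made up for the example, it uses the population standard deviation, and it z-scores AVG directly rather than weighting it by at bats, all of which are assumptions. The toy pool is just four projection lines copied from the tables in this article, so the output will not match the overall ranks shown here.

```python
from statistics import mean, pstdev

CATEGORIES = ["R", "HR", "RBI", "SB", "AVG"]

def z_score_totals(players, min_ab=300):
    """Return {name: summed z-score} for players with more than min_ab AB."""
    pool = [p for p in players if p["AB"] > min_ab]
    totals = {p["name"]: 0.0 for p in pool}
    for cat in CATEGORIES:
        values = [p[cat] for p in pool]
        mu = mean(values)
        sigma = pstdev(values)  # population SD; a sample SD is an equally valid choice
        for p in pool:
            # A category with no spread carries no information, so it adds 0.
            totals[p["name"]] += (p[cat] - mu) / sigma if sigma > 0 else 0.0
    return totals

# Toy pool: four projection lines from the tables in this article.
players = [
    {"name": "Troy Tulowitzki", "AB": 525, "R": 84, "HR": 28, "RBI": 91, "SB": 3, "AVG": 0.300},
    {"name": "Jose Reyes", "AB": 573, "R": 88, "HR": 12, "RBI": 54, "SB": 26, "AVG": 0.295},
    {"name": "Everth Cabrera", "AB": 575, "R": 76, "HR": 4, "RBI": 42, "SB": 49, "AVG": 0.259},
    {"name": "J.J. Hardy", "AB": 577, "R": 70, "HR": 23, "RBI": 69, "SB": 1, "AVG": 0.262},
]

totals = z_score_totals(players)
ranking = sorted(totals, key=totals.get, reverse=True)
for name in ranking:
    print(f"{name}: {totals[name]:+.2f}")
```

With a real projection file you would pass in every hitter, not four shortstops; the mean and standard deviation only mean something when the pool looks like the player population you care about.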

Z-score shortstop rankings using all players with >300 AB:

Player Name	AB	H	R	HR	RBI	SB	AVG	ORANK	SSRANK
Troy Tulowitzki	525	157	84	28	91	3	0.300	16	1
Hanley Ramirez	510	146	81	23	81	16	0.287	22	2
Jose Reyes	573	169	88	12	54	26	0.295	36	3
Ian Desmond	568	156	72	20	77	19	0.275	43	4
Jean Segura	592	164	77	10	51	37	0.277	51	5
Elvis Andrus	612	168	80	5	60	35	0.275	57	6
Ben Zobrist	580	157	82	15	77	11	0.271	65	7
Everth Cabrera	575	149	76	4	42	49	0.259	71	8
Starlin Castro	636	177	77	12	58	14	0.278	91	9
Asdrubal Cabrera	539	141	70	16	68	11	0.261	110	10
J.J. Hardy	577	151	70	23	69	1	0.262	112	11
Andrelton Simmons	578	157	73	14	61	9	0.271	114	12
Bradley Miller	522	142	71	14	57	11	0.271	119	13
Alexei Ramirez	595	161	63	8	57	21	0.270	125	14

Comparing those two tables, the methods agree on which 14 shortstops make the list, and for the most part the rankings are pretty similar. But Tulowitzki moves up a few spots in the overall rankings, which isn’t insignificant that early in the draft. Segura drops a round or two, and swaps spots with Desmond in the shortstop rankings. Andrus moves down the overall rankings a bit. Everth Cabrera moves down the overall rankings quite a lot, going from a mid-round steal to a guy who’s probably merely a decent value at his ADP.

So we learned a few things there, maybe. But when I use z-scores, I don’t think it makes sense to calculate them using every player who sees significant playing time – most of those will probably never be rostered in your fantasy league. I want to compare fantasy-relevant players to other fantasy-relevant players, not waiver wire fodder. So let’s take the top 200 hitters, as determined by the initial z-score rankings, recalculate the average and standard deviation for each category using only those players, and try again.
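The two-pass idea can be sketched the same way. This is a minimal Python sketch under a few assumptions: the field names and the `keep` parameter are made up for illustration, the population standard deviation is used, and every player is re-scored against the trimmed pool (the article does not say whether players outside the top 200 are re-scored or dropped). The toy pool below is four projection lines from the tables, with `keep=2` standing in for the top 200.

```python
from statistics import mean, pstdev

CATEGORIES = ["R", "HR", "RBI", "SB", "AVG"]

def z_totals(players, reference):
    """Sum each player's z-scores, taking mean and SD from `reference`."""
    totals = {p["name"]: 0.0 for p in players}
    for cat in CATEGORIES:
        values = [r[cat] for r in reference]
        mu, sigma = mean(values), pstdev(values)
        for p in players:
            totals[p["name"]] += (p[cat] - mu) / sigma if sigma > 0 else 0.0
    return totals

def two_pass_ranking(players, keep=200):
    """Rank on the full pool, then re-score everyone against the top `keep`."""
    first = z_totals(players, reference=players)
    top = sorted(players, key=lambda p: first[p["name"]], reverse=True)[:keep]
    second = z_totals(players, reference=top)
    return sorted(second, key=second.get, reverse=True)

# Toy pool: four projection lines from the tables in this article.
pool = [
    {"name": "Troy Tulowitzki", "R": 84, "HR": 28, "RBI": 91, "SB": 3, "AVG": 0.300},
    {"name": "Jose Reyes", "R": 88, "HR": 12, "RBI": 54, "SB": 26, "AVG": 0.295},
    {"name": "Everth Cabrera", "R": 76, "HR": 4, "RBI": 42, "SB": 49, "AVG": 0.259},
    {"name": "J.J. Hardy", "R": 70, "HR": 23, "RBI": 69, "SB": 1, "AVG": 0.262},
]

ranking = two_pass_ranking(pool, keep=2)  # keep=2 only because the toy pool is tiny
print(ranking)
```

The only thing that changes between the passes is the reference pool used for the mean and standard deviation, which is exactly the knob this section is turning.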

Z-score shortstop rankings using the top 200 players:

Player Name	AB	H	R	HR	RBI	SB	AVG	ORANK	SSRANK
Troy Tulowitzki	525	157	84	28	91	3	0.300	14	1
Hanley Ramirez	510	146	81	23	81	16	0.287	23	2
Jose Reyes	573	169	88	12	54	26	0.295	36	3
Ian Desmond	568	156	72	20	77	19	0.275	49	4
Jean Segura	592	164	77	10	51	37	0.277	59	5
Elvis Andrus	612	168	80	5	60	35	0.275	63	6
Ben Zobrist	580	157	82	15	77	11	0.271	64	7
Everth Cabrera	575	149	76	4	42	49	0.259	93	8
Starlin Castro	636	177	77	12	58	14	0.278	94	9
J.J. Hardy	577	151	70	23	69	1	0.262	108	10
Asdrubal Cabrera	539	141	70	16	68	11	0.261	111	11
Andrelton Simmons	578	157	73	14	61	9	0.271	115	12
Bradley Miller	522	142	71	14	57	11	0.271	119	13
Jed Lowrie	538	145	70	15	65	3	0.269	126	14

Again, everything looks pretty similar at first glance. Alexei Ramirez drops off the list in favor of Jed Lowrie, but that’s no big deal. But Tulowitzki moves up another couple spots – he’s pushing first-round value now, even before positional adjustments. Segura and Andrus drop a little further in the overall rankings. Everth Cabrera, who was already worth less using z-scores, is even worse with a smaller player pool. Remember, that rank of 93 is only among hitters – factor in pitchers, and Cabrera, a mid-round steal using SGP, now looks overvalued at his ADP of 106 (though we can’t say that for sure without applying positional adjustments). All things considered, simply changing the size of the player pool had as much of an effect as changing from SGP to z-scores in the first place.

Depending on which ranking method you use, you’re going to place a pretty different value on some of these players (again, with the caveat that I didn’t do positional adjustments). At the top of the shortstop rankings, Tulowitzki could be anywhere from a late second round pick to a borderline first-rounder. Everth Cabrera’s value swings wildly depending on what system you use – he’s either a player to target fairly early, or borderline undraftable where you’d have to take him. Other players, like Hanley Ramirez or Brad Miller, are remarkably consistent across all three methods, but there’s no way to know how much of that is chance.

The natural thing now is to wonder which of these systems is right. This seems like it should be solvable. I really want there to be an answer to this, a clear way to combine five categories of production into a single overall rank. Unfortunately, I’m not convinced that exists. People smarter than me have come up with a few different ways to reach that goal, and the results don’t agree with each other. Even if they did, the needs of your team are going to evolve as the draft goes on. When you pick whatever method you prefer and compile your pre-draft rankings, the numbers you get are going to look pretty absolute, there in black and white in your spreadsheet. But really, they’re more like ballpark estimates, and they could easily be totally different.

9 Comments

Matthew Zimmermann

12 years ago

Interesting analysis. I have been using the third method for most of my rankings simply because it makes more sense to me than using SGP in a H2H league, but it seems like I need to think about and compare my methodology a bit more.

The Stranger

12 years ago

H2H leagues have a complicating factor that I haven’t seen accounted for anywhere. On a week-to-week basis, luck obviously plays a huge role; the team with an on-paper advantage in any given category probably only wins that category a little over half the time. Where it gets interesting is, not all categories are equally volatile. For instance, a projected advantage in HR probably holds up a little more often than a comparable projected advantage in AVG. Or at least I think so, given that HR rate stabilizes more quickly than AVG.

I’m not sure how to rank the week-to-week volatility of the other categories – in and of itself, that would be an interesting thing to know for H2H leagues. But after you’ve figured that out, do you give more value to the less volatile stat categories, since an advantage there is more likely to pay off and the more volatile categories are a crapshoot anyway? Or do you just try to get a balanced team, figuring that a smaller advantage in a lot of categories maximizes your chances of winning at least half of them?

frivoflava29

12 years ago

Reply to The Stranger

I’ve thought a lot about this over the years. I won my 2010 H2H despite having the worst season stats in nearly every category. I would have been dead last in a roto league.

We often hear about players having better performance in the first half than the second due to a lack of durability. How does that consistency hold up day to day though? There are other factors too, like maybe the player’s team is facing a string of like-handed pitchers that he struggles against. We hear about that a lot too. So many people play in H2H leagues, I’m a little surprised this is largely just ignored. It is difficult to pick out what is and isn’t relevant, and at the same time, it seems really important.

lanceomatic

12 years ago

Reply to The Stranger

I like this discussion. Personally if I were to guess the best (because they are less volatile) categories for a H2H team to excel at they would be R HR RBI on the hitting side and K on the pitching side.

My guess is you want to be really good at the less volatile categories and mediocre in the others.

Elias Walsh

12 years ago

Fun article. I’ve also noticed that seemingly small choices can lead to large differences in player valuations. However, here are two reasons NOT to use any of these methods:

1) Projected stats understate the true dispersion in players’ performance. If you calculate z-scores using the standard deviation of your projected stats, you will overvalue hard-to-predict categories. Instead, use historical standard deviations of actual performance. Doing so will help you avoid paying through the nose for projected pitcher wins that are never realized. (If I understand the SGP method correctly, this issue can lead to the same kind of problems in that method.)

2) Projected plate appearances are irrelevant for setting the cutoff because projected playing time is probably bunk, and because 300 plate appearances of Billy Hamilton might be a lot more valuable than 700 plate appearances of Alexei Ramirez. Instead, use a two-step approach. Calculate z-scores based on the entire population of players with major league projections. Do an initial overall ranking. Then keep only players who are above whatever your league replacement level might be. Now do a second z-score based on that population. Now you have z-scores that measure value relative to the draft-able player population, exactly what you should care about when determining your rankings.

Matthew Zimmermann

12 years ago

Reply to Elias Walsh

Wouldn’t it be easier to just take into account point #1 in your projections by not applying a wide range of values for the categories which are difficult to project? It seems that’s what Steamer has done at least.

More info here:
http://tangotiger.com/index.php/site/article/difference-between-forecasting-results-with-an-without-an-identifier

Elias Walsh

12 years ago

Reply to Matthew Zimmermann

That’s true, and exactly the source of the problem. Good projection systems like Steamer will project a narrower range of values than will actually be realized. However, when you z-score the stats you divide by a too-small standard deviation and get back a wide range of values, essentially undoing all that good mean reversion in the projections. It’s similar to what would happen if you decided to z-score stolen bases using a standard deviation calculated only among catchers. You’d end up paying a ton for a C with 5 projected SB versus another identical C with only 4 SB.

The Stranger

12 years ago

Reply to Elias Walsh

One thing I’ve considered in the past is using a few seasons of data to find the correlation between projected and actual values for each category, then multiplying the z-score for each category by its correlation. In theory, that would give more weight to the stats that can be projected with some accuracy. I hadn’t considered using last year’s actual performance to get standard deviations, but it’s an interesting idea. I’m not sure if you create bias by mixing actual and projected stats, though – you probably do, because bias is everywhere.

Which is really the issue this article was meant to highlight. We have a bunch of ways of ranking players, but they’re mostly based on what seems intuitively logical to the person who came up with them. They give you a bunch of different results, and we don’t really know which one most closely matches the actual value of players in terms of winning a fantasy league.
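The correlation-weighting idea in the comment above could be sketched roughly as follows. This is a minimal Python sketch, not a tested valuation method: the function names are made up for illustration, and the projected and actual numbers are invented placeholders, not real data. It assumes you already have a z-score per category for a player and a few seasons of projected-versus-actual values for each category.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists (0.0 if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def weighted_total(z_by_category, reliability):
    """Scale each category's z-score by how well that category projects."""
    return sum(z * reliability[cat] for cat, z in z_by_category.items())

# Invented placeholder history (projected, actual) purely to show the mechanics.
history = {
    "HR": ([30, 22, 15, 8], [33, 18, 16, 9]),
    "AVG": ([0.300, 0.280, 0.270, 0.255], [0.275, 0.291, 0.262, 0.270]),
}
reliability = {cat: pearson(proj, act) for cat, (proj, act) in history.items()}

# Hypothetical player with z-scores already computed for two categories.
value = weighted_total({"HR": 1.2, "AVG": 0.8}, reliability)
print(reliability, value)
```

Whether the correlation is the right weight is an open question, which is the point of the comment; the sketch only shows that the adjustment itself is a one-line multiplication once you have the history.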

BeanoCook

12 years ago

Stat categories in H2H leagues vary wildly in value due to the aforementioned variability over the short time period of one week. My data work shows batting average is almost all luck over a week. SB is worth almost double HR, though both are very valuable. In pitching, Ks are most valuable, and WHIP is more stable than ERA. In general pitching categories are more stable, thus more valuable over a week, factoring in a min of 30 IP. I love that this critical info is almost completely ignored, it’s allowed me to win league after league, ESP Y!

Pitchers are way under drafted, it plays right into the face of conventional wisdom, it’s perfect. Humans still over rate batting average, even in fantasy…perfect. It’s near useless in a H2H league.
