Fantasy Rankings: Why Methodology Matters

By far, the hardest thing about fantasy baseball is the fact that you can’t predict the future. Every year, a Matt Carpenter or a Chris Davis vastly outperforms expectations and wins a fantasy league for somebody, and a Matt Kemp battles injuries all year and makes somebody else tear their hair out. But you learn to deal with that sort of thing, or you take up a less stressful hobby, like Russian roulette. C’est la vie, and all that.

But this article is about the second-hardest thing about fantasy baseball: juggling categories. Which is better, Mike Trout’s five-category production, or Miguel Cabrera’s dominance in four categories? How much is it worth to have Billy Hamilton singlehandedly win stolen bases for you while contributing nothing in the other categories? Can you absorb Pedro Alvarez’s batting average hit for the home runs he gives you? Over the years, people have come up with a few different ways to try to answer those questions. Standings Gain Points (SGP) is one popular method. Z-scores are another. There are others, but those are the two I see the most, so they’re the two I’m going to talk about. The point of this article isn’t to compare all of the ranking systems out there and figure out which one is “right.” The point is to call attention to the fact that your choice of ranking system matters, probably more than you think.

Of course, most fantasy ranking systems start with projections. Personally, I like to use composite projections, because I think there’s value in combining projections and smoothing out spots where one system might be exceptionally high or low on a player. You can disagree with the projections – that’s not the point. The point is, you (or your fantasy expert of choice, if you use published rankings) can take the same projections, plug them into different ranking systems, and get substantially different results.

For the purposes of this article, I’m keeping things very simple, perhaps a little too simple. I don’t care about volatility, risk, upside, injuries, etc. I’m assuming that these projections are accurate. And I’m not going to bother with positional adjustment, because I’m lazy and these aren’t the rankings I’m drafting from, and it doesn’t matter anyway. I’m concerned with how using different methods changes players’ rankings relative to each other, not how much to bump Buster Posey up my draft board because I need a catcher. And I’m looking at rankings, not auction values, because that’s another step that I don’t feel like taking right now.

I’m going to look at the shortstop position (specifically the top 14, because I play in a 14-team league) for this article, because I need to narrow things down to a manageable number of players. I’m assuming a standard 5×5 league. And what I’m looking at is SGP (using the formula here), compared to two slightly different ways of calculating z-scores. In all cases, I’m looking at the ranking of each player among shortstops and among all hitters. Really, though, I’m concerned with the overall rankings, because I want to see how players move around; the choice to focus on shortstops is just a convenient way to select a handful of players to look at.
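For anyone who hasn’t seen SGP before, here is a rough Python sketch of the general idea, not the exact formula I used. Each counting stat is divided by the amount assumed to be worth one point in the standings, and batting average is handled by measuring how much the player moves a baseline team’s average. The denominators and the baseline team totals below are made-up illustrative values; real ones would come from your own league’s history. The two stat lines are the projections used in this article.

```python
# Hypothetical SGP denominators: how much of each stat is assumed to be worth
# one standings point. These are illustrative values, not from any real league.
SGP_DENOMS = {"R": 20.0, "HR": 8.0, "RBI": 20.0, "SB": 8.0, "AVG": 0.002}

# Assumed at-bats and hits for the rest of an average roster (a .270 average).
REST_AB = 5500
REST_H = 1485

TULOWITZKI = {"AB": 525, "H": 157, "R": 84, "HR": 28, "RBI": 91, "SB": 3}
E_CABRERA = {"AB": 575, "H": 149, "R": 76, "HR": 4, "RBI": 42, "SB": 49}


def sgp(player):
    """Total standings gain points for one hitter under the assumptions above."""
    # Counting categories: stat divided by the per-point denominator.
    total = sum(player[cat] / SGP_DENOMS[cat] for cat in ("R", "HR", "RBI", "SB"))
    # Batting average: change in team average when the player joins the
    # baseline roster, divided by the AVG denominator.
    baseline_avg = REST_H / REST_AB
    team_avg = (REST_H + player["H"]) / (REST_AB + player["AB"])
    total += (team_avg - baseline_avg) / SGP_DENOMS["AVG"]
    return total


for name, line in (("Tulowitzki", TULOWITZKI), ("E. Cabrera", E_CABRERA)):
    print(f"{name}: {sgp(line):.2f} SGP")
```

Because the denominators are in each stat’s own units, a category where a few units swing a standings place (like stolen bases here) rewards specialists heavily, which is one reason a player like Everth Cabrera can rate well under SGP.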

Anyway, on to the fun stuff:

SGP shortstop rankings:

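| Player Name | AB | H | R | HR | RBI | SB | AVG | Overall Rank | SS Rank |
|---|---|---|---|---|---|---|---|---|---|
| Troy Tulowitzki | 525 | 157 | 84 | 28 | 91 | 3 | 0.300 | 19 | 1 |
| Hanley Ramirez | 510 | 146 | 81 | 23 | 81 | 16 | 0.287 | 22 | 2 |
| Jose Reyes | 573 | 169 | 88 | 12 | 54 | 26 | 0.295 | 32 | 3 |
| Jean Segura | 592 | 164 | 77 | 10 | 51 | 37 | 0.277 | 39 | 4 |
| Ian Desmond | 568 | 156 | 72 | 20 | 77 | 19 | 0.275 | 41 | 5 |
| Elvis Andrus | 612 | 168 | 80 | 5 | 60 | 35 | 0.275 | 47 | 6 |
| Everth Cabrera | 575 | 149 | 76 | 4 | 42 | 49 | 0.259 | 50 | 7 |
| Ben Zobrist | 580 | 157 | 82 | 15 | 77 | 11 | 0.271 | 71 | 8 |
| Starlin Castro | 636 | 177 | 77 | 12 | 58 | 14 | 0.278 | 94 | 9 |
| Asdrubal Cabrera | 539 | 141 | 70 | 16 | 68 | 11 | 0.261 | 108 | 10 |
| Andrelton Simmons | 578 | 157 | 73 | 14 | 61 | 9 | 0.271 | 115 | 11 |
| J.J. Hardy | 577 | 151 | 70 | 23 | 69 | 1 | 0.262 | 116 | 12 |
| Alexei Ramirez | 595 | 161 | 63 | 8 | 57 | 21 | 0.270 | 118 | 13 |
| Bradley Miller | 522 | 142 | 71 | 14 | 57 | 11 | 0.271 | 119 | 14 |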

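Looks reasonable. I don’t know. We don’t have anything to compare it to yet. So let’s compare it to z-scores. For this example, I’m going to calculate the average and standard deviation for each category using all players projected for over 300 at-bats. A player’s z-score in a category is his projected stat minus that average, divided by that standard deviation, and his overall value is the sum of his five category z-scores.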
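In code, the z-score approach might look something like the sketch below. It is a minimal version under a few assumptions of mine: the pool here is just five shortstops from the table above plus one hypothetical part-timer (“Bench Bat”, invented to show the at-bat filter), where a real pool would be every projected hitter; batting average is z-scored as a raw rate with no weighting by at-bats; and the standard deviation is the population version.

```python
from statistics import mean, pstdev

CATS = ["R", "HR", "RBI", "SB", "AVG"]

# Projections from this article, plus one hypothetical part-timer ("Bench Bat")
# who exists only to show the at-bat filter at work.
PLAYERS = [
    {"name": "Troy Tulowitzki", "AB": 525, "R": 84, "HR": 28, "RBI": 91, "SB": 3, "AVG": 0.300},
    {"name": "Jose Reyes", "AB": 573, "R": 88, "HR": 12, "RBI": 54, "SB": 26, "AVG": 0.295},
    {"name": "Jean Segura", "AB": 592, "R": 77, "HR": 10, "RBI": 51, "SB": 37, "AVG": 0.277},
    {"name": "Everth Cabrera", "AB": 575, "R": 76, "HR": 4, "RBI": 42, "SB": 49, "AVG": 0.259},
    {"name": "J.J. Hardy", "AB": 577, "R": 70, "HR": 23, "RBI": 69, "SB": 1, "AVG": 0.262},
    {"name": "Bench Bat", "AB": 250, "R": 30, "HR": 6, "RBI": 28, "SB": 2, "AVG": 0.250},
]


def z_score_totals(players, min_ab=300):
    """Sum of per-category z-scores for every player above the at-bat cutoff."""
    pool = [p for p in players if p["AB"] > min_ab]
    totals = {p["name"]: 0.0 for p in pool}
    for cat in CATS:
        values = [p[cat] for p in pool]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a category with no spread cannot separate players
        for p in pool:
            totals[p["name"]] += (p[cat] - mu) / sd
    return totals


TOTALS = z_score_totals(PLAYERS)
RANKING = sorted(TOTALS, key=TOTALS.get, reverse=True)
for name in RANKING:
    print(f"{name}: {TOTALS[name]:+.2f}")
```

Note that the mean and standard deviation depend entirely on who is in the pool, so changing the cutoff changes every player’s score.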

Z-score shortstop rankings using all players with >300 AB:

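| Player Name | AB | H | R | HR | RBI | SB | AVG | Overall Rank | SS Rank |
|---|---|---|---|---|---|---|---|---|---|
| Troy Tulowitzki | 525 | 157 | 84 | 28 | 91 | 3 | 0.300 | 16 | 1 |
| Hanley Ramirez | 510 | 146 | 81 | 23 | 81 | 16 | 0.287 | 22 | 2 |
| Jose Reyes | 573 | 169 | 88 | 12 | 54 | 26 | 0.295 | 36 | 3 |
| Ian Desmond | 568 | 156 | 72 | 20 | 77 | 19 | 0.275 | 43 | 4 |
| Jean Segura | 592 | 164 | 77 | 10 | 51 | 37 | 0.277 | 51 | 5 |
| Elvis Andrus | 612 | 168 | 80 | 5 | 60 | 35 | 0.275 | 57 | 6 |
| Ben Zobrist | 580 | 157 | 82 | 15 | 77 | 11 | 0.271 | 65 | 7 |
| Everth Cabrera | 575 | 149 | 76 | 4 | 42 | 49 | 0.259 | 71 | 8 |
| Starlin Castro | 636 | 177 | 77 | 12 | 58 | 14 | 0.278 | 91 | 9 |
| Asdrubal Cabrera | 539 | 141 | 70 | 16 | 68 | 11 | 0.261 | 110 | 10 |
| J.J. Hardy | 577 | 151 | 70 | 23 | 69 | 1 | 0.262 | 112 | 11 |
| Andrelton Simmons | 578 | 157 | 73 | 14 | 61 | 9 | 0.271 | 114 | 12 |
| Bradley Miller | 522 | 142 | 71 | 14 | 57 | 11 | 0.271 | 119 | 13 |
| Alexei Ramirez | 595 | 161 | 63 | 8 | 57 | 21 | 0.270 | 125 | 14 |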

Comparing those two tables, the methods agree on the top 14 shortstops. For the most part, these rankings are pretty similar. But Tulowitzki moves up a few spots in the overall rankings, which isn’t insignificant that early in the draft. Segura drops a round or two, and swaps spots with Desmond in the shortstop rankings. Andrus moves down the overall rankings a bit. Everth Cabrera moves down the overall rankings quite a lot, going from a mid-round steal to a guy who’s probably merely a decent value at his ADP.

So we learned a few things there, maybe. But when I use z-scores, I don’t think it makes sense to calculate them using every player who sees significant playing time – most of those will probably never be rostered in your fantasy league. I want to compare fantasy-relevant players to other fantasy-relevant players, not waiver wire fodder. So let’s take the top 200 hitters, as determined by the initial z-score rankings, recalculate the average and standard deviation for each category using only those players, and try again.
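The two-pass idea can be sketched like this. Again, this is an illustration rather than my actual spreadsheet: the six stat lines come from the tables in this article, the cutoff of 4 stands in for the top 200, and I’m assuming that every player gets re-scored against the smaller pool’s average and standard deviation, not only the players who made the cut.

```python
from statistics import mean, pstdev

CATS = ["R", "HR", "RBI", "SB", "AVG"]

PLAYERS = [
    {"name": "Troy Tulowitzki", "R": 84, "HR": 28, "RBI": 91, "SB": 3, "AVG": 0.300},
    {"name": "Jose Reyes", "R": 88, "HR": 12, "RBI": 54, "SB": 26, "AVG": 0.295},
    {"name": "Ian Desmond", "R": 72, "HR": 20, "RBI": 77, "SB": 19, "AVG": 0.275},
    {"name": "Jean Segura", "R": 77, "HR": 10, "RBI": 51, "SB": 37, "AVG": 0.277},
    {"name": "Everth Cabrera", "R": 76, "HR": 4, "RBI": 42, "SB": 49, "AVG": 0.259},
    {"name": "J.J. Hardy", "R": 70, "HR": 23, "RBI": 69, "SB": 1, "AVG": 0.262},
]


def pool_stats(pool):
    """Mean and population standard deviation of each category in the pool."""
    return {c: (mean([p[c] for p in pool]), pstdev([p[c] for p in pool])) for c in CATS}


def z_total(player, stats):
    """Sum of a player's category z-scores against the given pool stats."""
    return sum((player[c] - mu) / sd for c, (mu, sd) in stats.items() if sd > 0)


def two_pass_rank(players, keep_n):
    """Rank everyone, keep the top keep_n as the pool, then rank everyone again."""
    first = pool_stats(players)
    ranked = sorted(players, key=lambda p: z_total(p, first), reverse=True)
    relevant = ranked[:keep_n]       # the "fantasy-relevant" pool
    second = pool_stats(relevant)    # recompute mean and SD on that pool only
    return sorted(players, key=lambda p: z_total(p, second), reverse=True)


for rank, player in enumerate(two_pass_rank(PLAYERS, keep_n=4), start=1):
    print(rank, player["name"])
```

The second pass changes nothing about the projections themselves. Only the yardstick moves, which is why it can reshuffle specialists whose value is concentrated in one category.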

Z-score shortstop rankings using the top 200 players:

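| Player Name | AB | H | R | HR | RBI | SB | AVG | Overall Rank | SS Rank |
|---|---|---|---|---|---|---|---|---|---|
| Troy Tulowitzki | 525 | 157 | 84 | 28 | 91 | 3 | 0.300 | 14 | 1 |
| Hanley Ramirez | 510 | 146 | 81 | 23 | 81 | 16 | 0.287 | 23 | 2 |
| Jose Reyes | 573 | 169 | 88 | 12 | 54 | 26 | 0.295 | 36 | 3 |
| Ian Desmond | 568 | 156 | 72 | 20 | 77 | 19 | 0.275 | 49 | 4 |
| Jean Segura | 592 | 164 | 77 | 10 | 51 | 37 | 0.277 | 59 | 5 |
| Elvis Andrus | 612 | 168 | 80 | 5 | 60 | 35 | 0.275 | 63 | 6 |
| Ben Zobrist | 580 | 157 | 82 | 15 | 77 | 11 | 0.271 | 64 | 7 |
| Everth Cabrera | 575 | 149 | 76 | 4 | 42 | 49 | 0.259 | 93 | 8 |
| Starlin Castro | 636 | 177 | 77 | 12 | 58 | 14 | 0.278 | 94 | 9 |
| J.J. Hardy | 577 | 151 | 70 | 23 | 69 | 1 | 0.262 | 108 | 10 |
| Asdrubal Cabrera | 539 | 141 | 70 | 16 | 68 | 11 | 0.261 | 111 | 11 |
| Andrelton Simmons | 578 | 157 | 73 | 14 | 61 | 9 | 0.271 | 115 | 12 |
| Bradley Miller | 522 | 142 | 71 | 14 | 57 | 11 | 0.271 | 119 | 13 |
| Jed Lowrie | 538 | 145 | 70 | 15 | 65 | 3 | 0.269 | 126 | 14 |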

Again, everything looks pretty similar at first glance. Alexei Ramirez drops off the list in favor of Jed Lowrie, but that’s no big deal. But Tulowitzki moves up another couple spots – he’s pushing first-round value now, even before positional adjustments. Segura and Andrus drop a little further in the overall rankings. Cabrera, who was already worth less using z-scores, is even worse with a smaller player pool. Remember, that rank of 93 is only among hitters – factor in pitchers, and Cabrera, a mid-round steal using SGP, now looks overvalued at his ADP of 106 (though we can’t say that for sure without applying positional adjustments). All things considered, simply changing the size of the player pool had as much of an effect as changing from SGP to z-scores in the first place.

Depending on which ranking method you use, you’re going to place a pretty different value on some of these players (again, with the caveat that I didn’t do positional adjustments). At the top of the shortstop rankings, Tulowitzki could be anywhere from a late second round pick to a borderline first-rounder. Cabrera’s value swings wildly depending on what system you use – he’s either a player to target fairly early, or borderline undraftable where you’d have to take him. Other players, like Hanley Ramirez or Brad Miller, are remarkably consistent across all three methods, but there’s no way to know how much of that is chance.

The natural thing now is to wonder which of these systems is right. This seems like it should be solvable. I really want there to be an answer to this, a clear way to combine five categories of production into a single overall rank. Unfortunately, I’m not convinced that exists. People smarter than me have come up with a few different ways to reach that goal, and the results don’t agree with each other. Even if they did, the needs of your team are going to evolve as the draft goes on. When you pick whatever method you prefer and compile your pre-draft rankings, the numbers you get are going to look pretty absolute, there in black and white in your spreadsheet. But really, they’re more like ballpark estimates, and they could easily be totally different.





9 Comments
Matthew Zimmermann
10 years ago

Interesting analysis. I have been using the third method for most of my rankings simply because it makes more sense to me than using SGP in a H2H league, but it seems like I need to think about and compare my methodology a bit more.

frivoflava29
10 years ago
Reply to  The Stranger

I’ve thought a lot about this over the years. I won my 2010 H2H despite having the worst season stats in nearly every category. I would have been dead last in a roto league.

We often hear about players having better performance in the first half than the second due to a lack of durability. How does that consistency hold up day to day though? There are other factors too, like maybe the player’s team is facing a string of like-handed pitchers that he struggles against. We hear about that a lot too. So many people play in H2H leagues, I’m a little surprised this is largely just ignored. It is difficult to pick out what is and isn’t relevant, and at the same time, it seems really important.

lanceomatic
10 years ago
Reply to  The Stranger

I like this discussion. Personally if I were to guess the best (because they are less volatile) categories for a H2H team to excel at they would be R HR RBI on the hitting side and K on the pitching side.

My guess is you want to be really good at the less volatile categories and mediocre in the others.

Elias Walsh
10 years ago

Fun article. I’ve also noticed that seemingly small choices can lead to large differences in player valuations. However, here are two reasons NOT to use any of these methods:

1) Projected stats understate the true dispersion in players’ performance. If you calculate z-scores using the standard deviation of your projected stats, you will overvalue hard-to-predict categories. Instead, use historical standard deviations of actual performance. Doing so will help you avoid paying through the nose for projected pitcher wins that are never realized. (If I understand the SGP method correctly, this issue can lead to the same kind of problems in that method.)

2) Projected plate appearances are irrelevant for setting the cutoff because projected playing time is probably bunk, and because 300 plate appearances of Billy Hamilton might be a lot more valuable than 700 plate appearances of Alexei Ramirez. Instead, use a two-step approach. Calculate z-scores based on the entire population of players with major league projections. Do an initial overall ranking. Then keep only players who are above whatever your league replacement level might be. Now do a second z-score based on that population. Now you have z-scores that measure value relative to the draft-able player population, exactly what you should care about when determining your rankings.

Matthew Zimmermann
10 years ago
Reply to  Elias Walsh

Wouldn’t it be easier to just take into account point #1 in your projections by not applying a wide range of values for the categories which are difficult to project? It seems that’s what Steamer has done at least.

More info here:
http://tangotiger.com/index.php/site/article/difference-between-forecasting-results-with-an-without-an-identifier

Elias Walsh
10 years ago

That’s true, and exactly the source of the problem. Good projection systems like Steamer will project a narrower range of values than will actually be realized. However, when you z-score the stats you divide by a too-small standard deviation and get back a wide range of values, essentially undoing all that good mean reversion in the projections. It’s similar to what would happen if you decided to z-score stolen bases using a standard deviation calculated only among catchers. You’d end up paying a ton for a C with 5 projected SB versus another identical C with only 4 SB.

BeanoCook
10 years ago

Stat categories in H2H leagues vary wildly in value due to the aforementioned variability over the short time period of one week. My data work shows batting average is almost all luck over a week. SB is worth almost double HR; both are very valuable. In pitching, Ks are most valuable, and WHIP is more stable than ERA. In general, pitching categories are more stable, thus more valuable over a week, factoring in a minimum of 30 IP. I love that this critical info is almost completely ignored; it’s allowed me to win league after league, esp. Y!

Pitchers are way under drafted, it plays right into the face of conventional wisdom, it’s perfect. Humans still over rate batting average, even in fantasy…perfect. It’s near useless in a H2H league.