Fantasy Baseball: Are Some Categories More Important Than Others?

by DragonAsh

January 26, 2015

While doing some work on my pre-season projections sheet, I came across a link to complete full-season data from Razzball for 48 12-team 5×5 fantasy baseball leagues[1]. I’ve been using this as a handy cross-reference in doing some SPG (Standings Points Gained) calculations, but I decided to try to use the data for an exercise on something I’d been thinking about: are some categories more important than others?

First, I looked at the by-category scores for all 48 first place teams, then all the second place teams, etc:

	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP	Avg score
1st place teams	10.8	10.4	10.2	9.8	8.3	10.7	10.3	11.1	9.8	9.9	10.11
2nd place teams	9.8	9.0	9.9	8.3	8.2	9.5	9.8	9.9	9.6	9.1	9.31
3rd place teams	9.0	8.4	9.1	8.5	7.6	8.9	8.9	9.1	8.1	7.8	8.56
4th place teams	8.5	8.0	8.2	7.8	7.7	7.7	7.7	7.8	7.6	7.6	7.86
5th place teams	7.9	7.5	6.9	7.4	6.8	7.3	7.2	7.5	7.1	6.8	7.24

The 48 first place teams, on average, scored 10.11 in the 5×5 categories. With 12 points for first in a category, 11 for second and so on, that is basically a top-3 finish in all categories. Not that surprising.
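For readers who want to rebuild category scores like these from raw team totals, here is a minimal Python sketch. The function name `roto_points`, the tie rule and the four-team totals are all assumptions made for illustration; none of it comes from the Razzball data.

```python
def roto_points(values, higher_is_better=True):
    """Convert one category's team totals into roto points.

    The best team gets len(values) points and the worst gets 1.
    Tied teams split the points they span (an assumed tie rule).
    """
    n = len(values)
    # Order team indices from worst to best for this category.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    points = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the group while the next team has the same total.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2  # mean of the points spanned
        for k in range(pos, end + 1):
            points[order[k]] = shared
        pos = end + 1
    return points


# Made-up totals for a 4-team league (illustration only).
runs = [950, 1010, 880, 990]
era = [3.60, 3.95, 3.40, 4.10]

runs_pts = roto_points(runs)                        # more runs is better
era_pts = roto_points(era, higher_is_better=False)  # lower ERA is better
avg_score = [(r + e) / 2 for r, e in zip(runs_pts, era_pts)]
print(runs_pts, era_pts, avg_score)
```

In a 12-team league the same function would hand out 12 points down to 1, and averaging each team's ten category scores gives the kind of "Avg score" shown in the last column of the table.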

Digging a bit deeper, I looked at the average score in each category for 1st place teams, then for 2nd place teams, and so on. I included the standard deviation (a measure of variability) and how often a team was in the top 3 for that category:

1st Place teams	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP
Average score	10.8	10.4	10.2	9.8	8.3	10.7	10.3	11.1	9.8	9.9
Std Dev	1.6	2.1	2.3	2.3	2.9	1.7	1.8	1.2	2.2	2.0
% in top 3	77.1%	72.9%	70.8%	62.5%	41.7%	79.2%	75.0%	87.5%	64.6%	66.7%
2nd place teams	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP
Average score	9.8	9.0	9.9	8.3	8.2	9.5	9.8	9.9	9.6	9.1
Std Dev	2.0	2.6	2.0	3.0	3.2	1.9	2.3	1.9	2.4	2.6
% in top 3	58.3%	52.1%	68.8%	41.7%	43.8%	60.4%	68.8%	66.7%	62.5%	56.3%
3rd place teams	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP
Average score	9.0	8.4	9.1	8.5	7.6	8.9	8.9	9.1	8.1	7.8
Std Dev	2.5	3.1	2.3	2.8	3.2	2.5	2.6	2.1	2.8	2.7
% in top 3	54.2%	47.9%	54.2%	47.9%	33.3%	52.1%	50.0%	50.0%	39.6%	37.5%

A quick glance seems to suggest that the most important categories were Runs on the batting side, and Ks on the pitching side: the average score for the team that won their league was highest – by quite a margin, and also varied less – for those two categories. Winning teams were also more likely to be at least in the top 3 in Runs and Ks compared to any of the other batting and pitching categories, respectively.

Conversely, Batting Average did not appear to be that important – less than half of the teams that won their league were in the top 3 in Batting Average, and it had the lowest average score for champion teams of all the 5×5 categories. It was also the most volatile – with a standard deviation of 2.9, around 67% of teams that won their league would have had a Batting Average score ranging from 11.2 down to as low as 5.3!

What about second-place teams? Ks and Runs were important here as well, but without the gaps seen for winning teams. The highest-scoring category on the pitching side was again Ks, but at 9.9, this was only 0.1 higher than the next-highest category (Saves). On the hitting side, RBIs had the highest average score at 9.9, with Runs at 9.8.

There’s another way to look at the data – if you were the leader in, say, Home Runs, how likely is it that you won your league? Here’s another breakdown:

1st in category
	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP
Avg Finish	2.1	3.0	3.0	3.4	5.2	2.5	3.1	2.2	3.2	3.6
% in top 3	75.0%	58.3%	56.3%	50.0%	31.3%	60.4%	58.3%	75.0%	60.4%	54.2%
2nd in category
	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP
Avg Finish	3.4	4.3	3.3	4.3	4.9	3.5	3.0	3.3	4.5	4.2
% in top 3	39.6%	35.4%	56.3%	31.3%	31.3%	43.8%	41.7%	43.8%	27.1%	35.4%
3rd in category
	R	HR	RBI	SB	Avg	W	Sv	K	ERA	WHIP
Avg Finish	4.3	4.3	4.1	4.7	5.5	4.1	3.8	3.5	4.6	4.9
% in top 3	20.8%	31.3%	25.0%	22.9%	22.9%	31.3%	43.8%	35.4%	39.6%	29.2%

This table tells us, for example, that teams that finished tops in Runs or K’s had an average overall finish of 2.1 and 2.2, respectively: basically, they finished 1st or 2nd overall in their league, and fully 75% of teams that were first in Runs or K’s had a top-3 overall finish. (15 teams were first in both Runs and Ks – of those, 14 won the league; the lone exception came in third).
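The "Avg Finish" and "% in top 3" rows can be computed with a simple filter over team records. The sketch below is only an illustration: the record layout (`finish`, `cat_ranks`), the function name `category_leader_summary` and the four sample teams are hypothetical, not taken from the Razzball file.

```python
def category_leader_summary(teams, category, cat_rank=1):
    """Average overall finish and top-3 share for teams that placed
    `cat_rank` in `category`. Returns None if no team matches."""
    finishes = [t["finish"] for t in teams
                if t["cat_ranks"][category] == cat_rank]
    if not finishes:
        return None
    avg_finish = sum(finishes) / len(finishes)
    top3_share = sum(1 for f in finishes if f <= 3) / len(finishes)
    return avg_finish, top3_share


# Made-up records, one per team, pooled across leagues (illustration only).
teams = [
    {"finish": 1, "cat_ranks": {"R": 1, "Avg": 7}},
    {"finish": 2, "cat_ranks": {"R": 1, "Avg": 1}},
    {"finish": 5, "cat_ranks": {"R": 1, "Avg": 3}},
    {"finish": 8, "cat_ranks": {"R": 4, "Avg": 1}},
]

runs_leaders = category_leader_summary(teams, "R")
avg_leaders = category_leader_summary(teams, "Avg")
print(runs_leaders, avg_leaders)
```

Passing `cat_rank=2` or `cat_rank=3` would give the equivalent of the "2nd in category" and "3rd in category" blocks.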

Conversely, teams that had the best Batting Average only finished 5th on average, and only about 31% of teams with the best batting average were in the top 3.

I’m not showing the data here, but the reverse was also true: of the teams that were in the bottom half in the league in Runs, or in K’s, exactly none of them won the league. None. Only four teams (for both Runs and K’s) even managed a 2nd place overall finish!

On the flip side, there were 26 teams that were in the bottom half in Batting Average but 1st or 2nd overall, including 14 overall winners.

So the data appear to be telling us that we need to focus on Runs and Ks, and not worry quite as much about Batting Average. There may be some logic behind this: players scoring lots of runs are, perhaps, coming to bat more often, which means more opportunities for HRs, SBs and RBIs. Pitchers generating lots of Ks are perhaps more likely to be in position to pick up Wins and Saves and have better ratios.

While I don’t think anyone would recommend ignoring a category altogether – even Batting Average – I think the key takeaway is that in looking at roster construction, you might benefit by paying closer attention to Runs and K’s – for example, by letting those two categories be the tie-breaker if two players appear to be close in value.

Obviously, none of this is particularly new or revolutionary. And of course the usual caveats apply: 48 leagues from one particular year may or may not be a sufficient sample size to draw conclusions from. Results will almost certainly differ in some way or another for leagues with different settings (1 catcher leagues vs 2 catcher leagues, 5 outfielders & 1 util vs 3 OF and 2 util, etc). My knowledge (or lack thereof) of statistics and such could make the entire exercise completely worthless, etc.

But I, at least, found it interesting – that’s all that matters, really – and I am looking to incorporate this as I do my projections this year.

[1] 12-team, standard 5×5, 5 outfielders and one utility spot; max 180 games started for pitchers, and – at least according to Razzball – the Razzball leagues are supposed to be generally more competitive than more casual leagues.

20 Comments

ML610

10 years ago

Interesting. Wonder what the nfbc data would show. For what it’s worth, I entered 8 razz ball leagues two years ago and finished first in 5 and no lower than 3rd in any. I did the same last year, but life events prevented me from keeping up with the teams. I didn’t finish in last in any league and finished 5th and 6th in a couple. Puts the efficacy of that competition level in perspective.

I think it’s tough to focus on a category like runs. I tend to focus on stats that have a little better predictability, or rather the ones where advanced metrics analysis can help with the story of the historical or projected future outputs.

Neil S

10 years ago

Runs and Ks could also be a proxy for player health and/or maximizing games played, though, right? It suggests, to me, that these fantasy players valued keeping slots filled and hitting the max number of games – thus accumulating counting stats – over having the best rates. That might be a more valuable takeaway than ‘pick the guy who scores more runs’.

DragonAsh

10 years ago

Reply to Neil S

> Runs and Ks could also be a proxy for player health and/or maximizing games played, though, right?

Sure, I think that’s another way to look at it, yes – although I think it’s also a matter of trying to avoid guys with ‘empty’ batting averages. Martin Prado last year over 536 at-bats hit .281 but only scored 62 runs with 58 RBIs. Kole Calhoun in 493 at-bats only hit .272 and had exactly the same number of RBIs…but had 90 runs. Calhoun was mainly hitting at the top of the LAA lineup. Prado was mainly batting in the middle or lower part of the Arizona lineup, then moved around a lot in NY but hit 3rd quite a bit; perhaps not surprisingly he produced more Runs and RBIs per at-bat in NY vs when he was in Arizona – most of that could be attributed to where he was batting in the lineup; there was very little difference between NYY and ARI in terms of team batting stats.

stonepie

10 years ago

Reply to Neil S

i’d like to see where PA and IP totals ranked among the winners. im willing to bet winners had the most, or among the most, PA’s in their league.

Blue

10 years ago

Reply to stonepie

This.

Playing time matters. You need lots of PAs.

Rotoholic

10 years ago

There is a danger in looking at a correlation and drawing conclusions about the cause. It seems that health and playing time play a big part in this. The winners didn’t fare too well in the rate stats, particularly batting average. It stands to reason that teams who excel in the rate stats may have not necessarily maximized their innings/plate appearances and thus it wasn’t that they drafted a bad team, but they had poor in-season management or bad luck with injuries and didn’t replace the players soon enough. We all know that guy who doesn’t check his team as often as everybody else and if he had got those extra 400 plate appearances or 200 IP, he would have finished close to the top. And the guy who has a lot of relievers and therefore a good ERA and WHIP but very poor Ks and Ws. I would bet that a team that leads the league in ERA or WHIP and doesn’t use any middle relievers to get there would fare better than the average team who leads the league in ERA and WHIP. Not to mention the wrinkle in this that daily leagues are far different from weekly leagues. I have some NFBC data for last year and it’d be interesting to run this same experiment. Since they are weekly, and the people involved have a lot of money at stake, there should be less noise in the results.

Ryan Brock

10 years ago

Reply to Rotoholic

Yeah, this is the answer. For those unfamiliar with Razzball leagues there is no transaction limit (though there is an IP cap). Teams that stream semi-competently end up with higher counting stats.

DragonAsh

10 years ago

> I’d like to see where PA and IP totals ranked among the winners. im willing to bet
> winners had the most, or among the most, PA’s in their league.

Ask and ye shall, etc etc.

Unfortunately the data set didn’t include PAs, but did have ABs and IPs. Not scoring categories, obviously, but assuming we did assign 12 points for most ABs and 12 for most IP, 11 for second-most and so on, here’s what the data said:

	AB	IP
1st: Avg score	10.4	10.8
Std Dev	2.0	1.4
Avg Finish	1.7	1.7
% top 3	83%	83%

Don’t know if the formatting will work, but basically: the 1st place teams, on average, would have had a score of 10.4 with a standard deviation of 2.0 for ABs, and 10.8 / 1.4 for IP. High scores, yes, but on the batting side the average ‘score’ for ABs was lower than for Runs (10.8) and had a higher degree of variability. Ditto on the pitching side, looking at IP vs Ks.

On the other hand – if you were tops in your league in either ABs or IP, you were far more likely to win your league – the average ‘finish’ for teams that were tops in their league in ABs or IP was 1.7 in both cases, better than the same figure for Runs (2.1) or K’s (2.2). And over 80% of teams that were tops in AB/IP had at least a top-3 finish.

And just for yet another twist, there were four teams that were in the bottom half of either ABs or IP and managed to win their league – and 21 teams overall managed a top-3 finish.

Make of that what you will 🙂

Josh Barnes

10 years ago

Good research, but I disagree with the conclusion. It’s been mentioned already but I’ll reiterate it. Teams who finish near the bottom of the pack are more likely to do well in batting average and less likely to do well in the counting stats.

This doesn’t make runs and K’s the most important, it’s actually the opposite!

Based on years of playing this game, I am certain that Batting Average, ERA, and WHIP are by far the most important categories for you to focus on because you cannot gain on quitters in these categories simply by maximizing your roster moves. If you are winning these three categories, you have a massive leg up on everybody in your league.

Plate Appearances is the number one factor for the counting categories.

Johnny Baseball

10 years ago

Just a quick run through my long standing CBS league 5×5, last three years:
2012 and 2014 winner had both runs and k’s. 2013 winner was a close second in both categories

DragonAsh

10 years ago

I ran a multiple regression using ‘Overall standings’ as the Y (dependent variable) and the normal 5×5 categories, plus AB and IP, as the x (independent) variables. I assigned a value of ‘12’ for an overall first place finish, 11 for a second place finish, etc. The resulting coefficients are below – again, apologies if the formatting is screwed up.

R 0.165830
HR 0.137235
RBI 0.146284
SB 0.139668
Avg 0.156925
W 0.157496
Sv 0.154935
K 0.177355
ERA 0.114153
WHIP 0.147416
AB 0.006296
IP 0.011120

The coefficients measure the mean change in the dependent variable for each one-unit change in an independent variable, holding the others constant. For example, for every additional point you score in SBs, you can expect a 0.140 increase in your overall score (again: 12 for 1st overall, 11 for 2nd overall, etc).

The variables giving the highest boost to overall finish: Runs and K’s. ERA, HRs and SBs had the lowest coefficients among the scoring categories – amazingly, IP and AB had extremely low coefficients, which is surprising enough to me that I may need to double-check that my data is accurate. The p-values for all X variables were essentially zero….except for AB and IP (in other words, the data suggest that AB and IP scores do not do a good job of explaining a team’s overall finish).

I stress again that this is one year of data, and 48 leagues; there’s a non-zero chance that we’re looking at funky results. By and large the Razzball leagues are supposedly somewhat competitive, so perhaps ‘quitters’ weren’t as big of a factor.

And of course we still run into a bit of a problem in figuring out *how* to implement this information even if we assumed it is accurate…
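For anyone who wants to try a regression of this kind on their own league data, here is a rough sketch in plain Python. It is not the tool used for the coefficients above: `fit_ols` is a hypothetical helper that solves the normal equations, and the sample numbers are invented, with the overall score generated from known coefficients so the fit can be checked.

```python
def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of predictor values per team (an intercept is added here).
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented category points for six teams: [Runs points, K points].
category_points = [[12, 3], [10, 7], [8, 11], [5, 4], [3, 9], [1, 1]]
# Overall score built from known coefficients, so the fit should recover them.
overall = [2 + 0.5 * r + 0.25 * k for r, k in category_points]

coefs = fit_ols(category_points, overall)
print(coefs)  # intercept, Runs coefficient, K coefficient
```

With real data each row would hold all ten category scores (plus AB and IP, if wanted) and `overall` would hold the 12-for-first overall score for the same team.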

Blue

10 years ago

Reply to DragonAsh

AB and IP are covariates with the other counting stats. You have a lot of multicollinearity in that model, which means the individual parameter estimates need to be treated with caution.
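One quick way to gauge this is to check how strongly two predictors move together. The sketch below computes a Pearson correlation and, for the simple two-predictor case, the variance inflation factor 1 / (1 - r^2). The helper names and the eight pairs of AB and Runs scores are made up for illustration, not drawn from the Razzball leagues.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def vif_two_predictors(xs, ys):
    """Variance inflation factor when the model has only these two
    predictors: 1 / (1 - r^2). Larger values mean shakier coefficients."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)


# Invented category scores for eight teams (illustration only).
ab_points = [12, 11, 9, 8, 6, 5, 3, 2]
runs_points = [11, 12, 10, 7, 6, 4, 3, 1]

r = pearson(ab_points, runs_points)
vif = vif_two_predictors(ab_points, runs_points)
print(round(r, 3), round(vif, 1))
```

With more than two predictors, the r^2 in the formula is replaced by the R^2 from regressing one predictor on all the others. A common rule of thumb treats values above roughly 5 to 10 as a warning sign.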

Pancito

10 years ago

Another factor skewing the data is the site itself. Presumably a good percentage of those who play in Razzball leagues also read the blogs. Grey, at least, sort of preaches both streaming and high K pitchers. And Rudy is very big on streaming. The site has two tools that make streaming decisions more manageable as well, Hittertron and Stream-O-Nator. I think data is available on the number of moves made as well, which might be interesting to look at.

EnricoPalazzo

10 years ago

I think the explanation for runs is as simple as this: home runs are sexier than runs. The average drafter gets caught up in this feeling and spends more on big-name crushers than little lead-off guys, excessively so. A better drafter knows this, gives runs their proper value, and wins that category for a bargain.

DragonAsh

10 years ago

Reply to EnricoPalazzo

Hmm: Possibly, but I’m a bit doubtful. Mainly because ‘average drafters’ are almost certainly basing draft decisions only on the ‘draft kit’ rankings or mock-draft results from the site their league is using.

EnricoPalazzo

10 years ago

Reply to DragonAsh

Agreed. I was trying to tread lightly, but I believe the issue affects experts as well. Certain categories always get more attention than others, although they all give the same amount of points. One of my main strategies is to dominate these unsung categories, since it is generally cheaper to do so.

Josh Barnes

10 years ago

Reply to EnricoPalazzo

Agree Enrico..

I think your average fantasy drafter is completely unaware of the fact that a guy who hits .300 over 600 PA’s is the equivalent of a guy who hits 26 home runs.

If you have two players:

Player A: 3 HR, 80 R, 70 RBI, 10 SB, .300 AVG
Player B: 26 HR, 70 R, 80 RBI, 10 SB, .250 AVG

They are essentially the same value yet Player A will almost always fall further down the draft board than Player B.

Over the past few years, people have been tricked into believing that power is very rare and hard to obtain in the middle/late rounds but if you read my recent piece on Jose Altuve, I prove this to be completely false. It’s just as hard to find a .300 hitter in these rounds as it is a 26 HR bat. In that Altuve piece, I showed there were actually more above-par HR hitters available from pick 200-300 in NFBC 50-round leagues than stolen base guys.

I am always taking Player A, the guy with the higher batting average, because the batting average category is much harder to gain ground in after owners start quitting your league in June/July.

@jb82mets

buddyglass

10 years ago

Reply to EnricoPalazzo

Based on the formula I arrived at using the method described below I’d rate Player B as worth 1.13 more wins over the course of a season. Agreed that Player A ends up going much later in the draft. In a straight up comparison, though, B wins.

In order for them to have equivalent value B would have to hit only 14 home runs or A would have to bat 0.360.

MustBunique

10 years ago

Great stuff DragonAsh. There are certainly many explanations to consider, as the other posters have stated many of them. However, without your work we would not really be having this conversation at all. Thanks for providing the results of your work. Do you have twitter or contact info? It would be nice if FG community would provide that for authors of posted articles.

buddyglass

10 years ago

I’ve played in a 12-team MLB H2H league at ESPN with mostly the same managers for the past ~10 years. For the last five years I’ve compiled various stats after each season. One thing I like to do is take each team’s regular season totals in each category and do a linear regression against the # of points each team won in that category.

The slope of the resulting trend lines suggest how much each “unit” in a given category is “worth” in terms of eventual fantasy points. What I’ve noticed is that the counting categories usually have a much higher correlation to fantasy points than the “average” categories. Given that, I’m less likely to draft for AVG, ERA and WHIP. I don’t punt them; I just tend not to emphasize them as much.
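A bare-bones version of that per-category trend line could look like the following. The function name `fit_line` and the home run totals and points are invented for illustration, and the points are deliberately perfectly linear, so real league data will be much noisier.

```python
def fit_line(xs, ys):
    """Least-squares line ys ~ intercept + slope * xs.

    Returns (slope, intercept, r), where r is the correlation.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Invented season totals and category points for four teams.
hr_totals = [180, 200, 220, 240]
hr_points = [8, 10, 12, 14]

slope, intercept, r = fit_line(hr_totals, hr_points)
print(slope, intercept, r)  # slope = points gained per extra home run
```

Running the same fit for each category and comparing the correlations is one way to see which categories track most closely with points won.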
