Eno Sarris’s recent article on Jason Heyward comps got me thinking about comps of my own. It also happened to coincide with the day I got my Baseball-Reference subscription, so it was inevitable that I would start looking at seasons from 20-year-olds.
It was maybe the third or fourth thing I noticed: 2010 featured another remarkable season from a 20-year-old hitter, Mike Stanton. Here’s a fun fact about Heyward: among 20-year-olds, only two players (Ted Williams and Mel Ott) walked in a higher percentage of their plate appearances than the Braves’ young stud. Here’s a fun fact about Stanton: the 20-year-old closest to him in home runs per batted ball is Mel Ott, but Stanton sent a greater percentage of his batted balls over the fence than any age-20 hitter in the Retrosheet era. (Perhaps less fun: he also has the highest K% among 20-year-olds.)
But who are the players most comparable to Stanton and Heyward? To answer this question, I focused on the three true outcome rate stats (since those are more stable in small samples than ball-in-play stats) in seasons from 20-year-old hitters, regardless of experience. While it’s tempting to focus on rookies, there are just 102 seasons of 200+ PA from 20-year-olds since 1920, so restricting the pool to similarly young rookies would only shrink an already small group. To expand the group a little, I added 21-year-olds in their first season (also cut off at 200 PA).
To compare these players, I developed z-scores for each player’s BB/PA, K/AB, and HR per batted ball (AB minus K). (See the technical section below on how these scores are computed.) Then, treating each player’s three z-scores as a vector, I found the distance of that vector from Heyward’s and Stanton’s vectors. The smaller the distance, the more comparable the players.
|Pee Wee Reese|0.89|
(Someone will look at this and say “so-and-so isn’t comparable to Stanton/Heyward!” just from the raw stats. Before you do, please note that these scores include a rough adjustment for era, and these rates have varied a lot since 1900; the league-average strikeout rate was about 8% in 1930. I’ve tried to control for these variations in the scores; see the technical discussion below for how.)
Both players comp to some of the luminaries of the game. Heyward is, overall, much more comparable to past young studs than Stanton is. With the exception of Strawberry and Mathews, no player on Stanton’s list is as comparable to him as any player on Heyward’s is to Heyward. You can’t see it here, but Heyward is actually more comparable to Ted Williams (1.38) and Mickey Mantle (1.45) than Stanton is to Jay Bruce. Willie Mays scores a 1.67 against Heyward, so, yeah, this Heyward kid is pretty good.
For those interested in the technical details of how these scores are generated, read on. For those who are happy with everything so far, the rest of this section is just math.
The scores are generated from each player’s BB/PA, K/AB, and HR/(AB-K). I chose these because they’re relatively stable in small samples and fairly independent of one another. Above, I said I generated z-scores, but that isn’t quite right. From each player’s stat I subtracted the league average for the year of that season. This is a quick-and-dirty way to make cross-era comparisons, and it’s critical: when Mel Ott put up his gaudy 1929 season, for example, the league-average K rate was around 8%, and Ott was actually a tick below average in Ks. I then divided this difference by the standard deviation among all seasons from 2006 to 2010. This, by the way, is a serious limitation, and I’m not sure it works: it assumes that the dispersion of major league talent in these three stats stays roughly constant from era to era, even as the averages change a lot.
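To make the score concrete, here is a minimal sketch of the calculation just described: subtract that season’s league average from the player’s rate, then divide by a fixed 2006-2010 standard deviation. The standard deviations and the example rates below are illustrative assumptions, not the actual values used in the article.

```python
# Fixed standard deviations, pooled over 2006-2010 seasons
# (illustrative numbers, not the article's actual values)
SD = {"bb_rate": 0.035, "k_rate": 0.060, "hr_rate": 0.020}

def era_adjusted_score(player_rate, league_avg_rate, stat):
    """Subtract that season's league-average rate, then divide by the
    fixed 2006-2010 standard deviation for the given stat."""
    return (player_rate - league_avg_rate) / SD[stat]

# Example: a hitter with a 14% walk rate in a league averaging 8.5%
score = era_adjusted_score(0.14, 0.085, "bb_rate")
```

Note that the numerator adjusts for the era’s average while the denominator stays fixed, which is exactly the dispersion assumption flagged above.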
These three z-scores form a vector. Similarity is determined by the distance between vectors: where the vectors are <a1,b1,c1> and <a2,b2,c2>, the squared distance is d^2 = (a1-a2)^2 + (b1-b2)^2 + (c1-c2)^2.
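The distance formula above is ordinary Euclidean distance in three dimensions; a short sketch (the z-score vectors below are made-up illustrative values, not real player scores):

```python
import math

def comp_distance(v1, v2):
    """Euclidean distance between two 3-element z-score vectors
    (BB/PA, K/AB, HR/(AB-K) components)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

# Hypothetical z-score vectors for two hitters (illustrative only)
player_a = (1.8, 0.3, 0.9)
player_b = (1.5, 0.1, 0.4)
d = comp_distance(player_a, player_b)  # smaller = more comparable
```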
There are obviously some limitations to the system (besides the one already mentioned about the standard deviations). One is that it weighs all three stats equally, even though these stats will not develop equally. Moreover, the whole point of comparables is that they are supposed to inform our judgment about a player’s development: ideally, we would weight these scores according to how much each tells us about a player’s future value.
I’d be happy to hear thoughts on how to improve the method of these similarity scores, so fire away if you have them.