2015: A Season of Unprecedented Parity In the American League
Background: When the 2015 season ended, I remarked to myself that there seemed to be a great deal of parity in the American League this year. So I decided to check whether that was just a faulty impression or whether the race really was closer than in years past.
Methodology: I decided to use variance in win percentages among teams in each season to define parity, with a lower variance equating to more parity.
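Variance is a measure of the spread of a dataset. It is calculated as follows:
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$
where N is the population size (here, the number of teams in a season), \mu is the population mean, and x_i is an individual data entry (here, one team's win percentage).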
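As a rough illustration of that definition, here is a minimal Python sketch of the population variance applied to win percentages. The function name `win_pct_variance` and the four-team league are made up for the example and are not taken from the actual scripts or data.

```python
from statistics import pvariance


def win_pct_variance(win_pcts):
    """Population variance: the mean squared deviation from the mean."""
    n = len(win_pcts)                    # N, the population size
    mu = sum(win_pcts) / n               # mu, the population mean
    return sum((x - mu) ** 2 for x in win_pcts) / n


# Hypothetical win percentages for a four-team league (not real standings).
toy_league = [0.600, 0.550, 0.450, 0.400]

print(win_pct_variance(toy_league))
# The standard library's pvariance computes the same population form.
print(pvariance(toy_league))
```

This divides by N, as in the definition above. Note that `statistics.variance` in Python divides by N-1 instead, so it would give a slightly larger number on the same data.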
I took my dataset from baseball-reference.com and used Python scripts to modify the raw data into a cleaner .csv format, so that I could run analysis in R.
The 2015 season had the lowest variance in win percentage (0.001836222) of any season in the history of the American League (1901-2015).
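Finding the season with the lowest variance comes down to grouping teams by season, computing each season's variance in win percentage, and taking the minimum. The sketch below shows that idea in Python rather than R. The row layout `(season, team, wins, losses)`, the function name `variance_by_season`, and the numbers are all hypothetical stand-ins, not the real dataset.

```python
from collections import defaultdict
from statistics import pvariance

# Hypothetical rows shaped like (season, team, wins, losses); not real standings.
rows = [
    (2001, "A", 90, 72),
    (2001, "B", 81, 81),
    (2001, "C", 72, 90),
    (2002, "A", 100, 62),
    (2002, "B", 81, 81),
    (2002, "C", 62, 100),
]


def variance_by_season(rows):
    """Map each season to the population variance of its teams' win percentages."""
    pcts = defaultdict(list)
    for season, _team, wins, losses in rows:
        pcts[season].append(wins / (wins + losses))
    return {season: pvariance(values) for season, values in pcts.items()}


variances = variance_by_season(rows)
most_parity = min(variances, key=variances.get)    # lowest variance
least_parity = max(variances, key=variances.get)   # highest variance
print(most_parity, least_parity)
```

In this toy data the 2001 standings are more tightly bunched than the 2002 ones, so 2001 comes out as the season with more parity.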
Here is a time plot of the variances across seasons:
On the x-axis, 0 at the far left is the 2015 season, and each step to the right moves one season further back, ending at 1901.
Conclusion: By this measure, 2015 was in fact the season with the most parity in American League history.
The American League season with the least parity? Go back to 1932, when the Yankees, led by Babe Ruth and Lou Gehrig, won 107 games and the Boston Red Sox lost 111 (variance = 0.01710932).