The Leadoff Hitter: Is Speed the Answer?

Classical baseball lineup construction involves putting your fastest player in the leadoff spot. This is due to the belief that speed generates runs (à la Rickey Henderson). To test this theory I went back to 1998 (the year of the last expansion) and looked at how many runs were scored in each season, then looked at three indicators (OBP, wOBA and stolen bases) to test which would be most useful in predicting runs. Although OBP and wOBA are very similar stats, I decided to include both in the analysis because of differences in how they are calculated. To put it simply, OBP gives a home run the same weight as a single and treats them as equal (which they are not), while wOBA gives each type of hit a different weight (see the OBP and wOBA pages for more information). I’ll admit that I am a huge fan of stolen bases; there is nothing like watching a player steal second or third to try to get a rally started. But the question is, can you expect to score more runs by being fast or by getting on base?
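To make the difference between the two stats concrete, here is a minimal Python sketch. The OBP formula is the standard one. The wOBA weights are illustrative placeholders I am assuming for the example (the real weights are recalculated every season), and the wOBA denominator is simplified by not separating out intentional walks. The `BattingLine` class and both example players are hypothetical.

```python
from dataclasses import dataclass

# Illustrative linear weights only (an assumption for this sketch);
# real wOBA weights are recalculated for every season.
WOBA_WEIGHTS = {
    "bb": 0.69, "hbp": 0.72,
    "single": 0.88, "double": 1.25, "triple": 1.60, "hr": 2.05,
}


@dataclass
class BattingLine:
    """Hypothetical container for one player's season counting stats."""
    ab: int
    singles: int
    doubles: int
    triples: int
    hr: int
    bb: int
    hbp: int = 0
    sf: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.hr


def obp(line: BattingLine) -> float:
    """On-base percentage: every way of reaching base counts the same."""
    return (line.hits + line.bb + line.hbp) / (line.ab + line.bb + line.hbp + line.sf)


def woba(line: BattingLine, weights=WOBA_WEIGHTS) -> float:
    """Weighted on-base average: each event gets its own weight."""
    numerator = (
        weights["bb"] * line.bb
        + weights["hbp"] * line.hbp
        + weights["single"] * line.singles
        + weights["double"] * line.doubles
        + weights["triple"] * line.triples
        + weights["hr"] * line.hr
    )
    # Simplification: intentional walks are not removed from the walk total.
    return numerator / (line.ab + line.bb + line.sf + line.hbp)


# Two made-up players who reach base equally often, one on singles, one on homers.
slap_hitter = BattingLine(ab=500, singles=150, doubles=0, triples=0, hr=0, bb=50)
slugger = BattingLine(ab=500, singles=0, doubles=0, triples=0, hr=150, bb=50)

for name, line in [("slap hitter", slap_hitter), ("slugger", slugger)]:
    print(f"{name}: OBP={obp(line):.3f}  wOBA={woba(line):.3f}")
```

Both players end up with the same OBP, but the slugger's wOBA is much higher, which is exactly the distinction described above.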

To get started, I looked only at data from 2015 and pulled the top 25 players in each stat category to define the “fast” players and the players who get on base the most. I also standardized runs scored to runs per game (RPG) to account for rest days and injuries, which may have kept players out of the lineup for short periods of time. In the plot below, it appears that the leaders in stolen bases scored fewer runs per game than the players who get on base more often. Based on the 95% confidence intervals for the top 25 players, the difference was not significant, but the results are interesting nonetheless.
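For readers who want to try a comparison like this, the sketch below shows one way to compute a 95% confidence interval for the mean RPG of a 25-player group and check whether two intervals overlap. It is not necessarily the exact procedure behind the plot: it assumes a t-based interval, the function names are my own, and the RPG values are synthetic placeholders, not real 2015 data. Overlapping intervals are only a rough check and not a formal significance test.

```python
import random
import statistics

# Two-sided 95% critical value of Student's t with 24 degrees of freedom (n = 25).
T_CRIT_95_DF24 = 2.064


def mean_ci(values, t_crit=T_CRIT_95_DF24):
    """Return (lower, mean, upper) for a t-based confidence interval of the mean."""
    m = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5  # standard error of the mean
    return m - t_crit * sem, m, m + t_crit * sem


def intervals_overlap(a, b):
    """True if two (lower, mean, upper) intervals share any values."""
    return a[0] <= b[2] and b[0] <= a[2]


# Synthetic placeholder RPG values (NOT real data), 25 players per group.
rng = random.Random(0)
sb_leaders_rpg = [rng.gauss(0.55, 0.10) for _ in range(25)]
obp_leaders_rpg = [rng.gauss(0.60, 0.10) for _ in range(25)]

sb_ci = mean_ci(sb_leaders_rpg)
obp_ci = mean_ci(obp_leaders_rpg)
print("SB leaders :", [round(v, 3) for v in sb_ci])
print("OBP leaders:", [round(v, 3) for v in obp_ci])
print("Intervals overlap:", intervals_overlap(sb_ci, obp_ci))
```

If a group has a size other than 25, pass the matching t critical value through `t_crit` instead of relying on the default.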

Now let’s look at some long-term data: how many runs were scored each year since 1998. In the plot below we can see that there was a large spike in runs scored in 1999 and 2000 before scoring evened out. The trend seemed to remain relatively stable from 2001 until around 2006 or 2007, and then we see a dramatic decrease in runs scored up until last year. MLB started testing for steroids in 2003, and perhaps this is why we’ve begun to see that decrease in runs scored, but that is outside the scope of this article, so let’s just focus on runs.

Runs are the most important aspect of baseball, whether that means scoring them or preventing them. In the end, if your team can’t score any runs it can’t win any games, and unless a team has a titan of an offense it needs to prevent runs as well. Here we are going to focus on run generation, so we can forget about run prevention from here on out. Let’s look at the seasonal stats for our indicators and see how they change over time. Note that the OBP and wOBA shown in the plots are league averages, while the stolen bases are league totals for each season. A quick look tells us that OBP and wOBA closely follow the trend we saw in the second figure, while stolen bases vary a lot over time. This seems to give a lot of evidence in favor of getting on base, but let’s go one step further and see if we can develop a linear model to estimate how each predictor affects the expected runs scored in a season.

In the final plot below I’ve put runs per game on the y-axis and each stat on the x-axis. To test how changes in league performance affect runs scored, I predicted the number of runs scored at the 10%, 50% and 90% quantiles of each stat to see how many runs a player would generate over a 162-game season.
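The general approach can be sketched as follows: fit a straight line of runs per game against one stat, then evaluate that line at the stat's 10%, 50% and 90% quantiles and scale to 162 games. This is a minimal sketch and not the exact model behind the plot. It assumes ordinary least squares with a single predictor, the function names are hypothetical, and the season values are made up purely so the code runs.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_at_quantiles(x, y, games=162):
    """Predict season runs at the 10%, 50% and 90% quantiles of the predictor."""
    slope, intercept = fit_line(x, y)
    # Nine cut points for deciles; indexes 0, 4 and 8 are the 10%, 50%, 90% marks.
    deciles = statistics.quantiles(x, n=10, method="inclusive")
    picks = {"10%": deciles[0], "50%": deciles[4], "90%": deciles[8]}
    return {
        label: (q, (intercept + slope * q) * games)
        for label, q in picks.items()
    }


# Made-up league OBP values and runs per game, one pair per season (NOT real data).
league_obp = [0.345, 0.342, 0.336, 0.333, 0.331, 0.325, 0.320, 0.318, 0.317]
runs_per_game = [0.40, 0.39, 0.38, 0.375, 0.37, 0.36, 0.35, 0.345, 0.35]

for label, (stat, runs) in predict_at_quantiles(league_obp, runs_per_game).items():
    print(f"{label} quantile: OBP={stat:.3f} -> expected runs per season={runs:.2f}")
```

The same two functions work for wOBA or stolen bases by swapping in a different predictor list.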

I’ve created a summary table for easy comparison of each stat, and the thing that really jumps out is that stolen bases have almost no effect on runs scored. Based on the model, in a season where players collectively steal almost 700 more bases, they generate less than 1 extra run.

OBP                      Expected Runs (Per Season)
0.319                    56.51
0.333                    60.93
0.340                    63.15

wOBA                     Expected Runs (Per Season)
0.315                    56.64
0.328                    60.77
0.336                    63.31

Stolen Bases (Season)    Expected Runs (Per Season)
2583                     59.74
2918                     60.21
3281                     60.72
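As a quick arithmetic check on the table, the short Python sketch below computes how far the expected runs move between the first and last row for each stat. The numbers are copied directly from the table. The `swing` helper is a name I made up for the example, and I am assuming the three rows of each block are in ascending quantile order (10%, 50%, 90%).

```python
# (predictor value, expected runs per season), copied from the summary table.
table = {
    "OBP": [(0.319, 56.51), (0.333, 60.93), (0.340, 63.15)],
    "wOBA": [(0.315, 56.64), (0.328, 60.77), (0.336, 63.31)],
    "Stolen Bases": [(2583, 59.74), (2918, 60.21), (3281, 60.72)],
}


def swing(rows):
    """Change in predictor and in expected runs from the first row to the last."""
    (x_low, runs_low), *_, (x_high, runs_high) = rows
    return x_high - x_low, runs_high - runs_low


for stat, rows in table.items():
    delta_stat, delta_runs = swing(rows)
    print(f"{stat}: change of {delta_stat:g} -> {delta_runs:.2f} extra expected runs")
```

Going from the low row to the high row is worth 6.64 expected runs for OBP and 6.67 for wOBA, but only 0.98 for stolen bases, even though that gap represents 698 additional steals across the league.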

In the end, getting on base is the most important thing (thanks, Moneyball!). For many, the results should not be unexpected: players who get on base more give their teams more opportunities to score runs. There doesn’t seem to be a significant advantage to using either OBP or wOBA over the other to predict runs, but based on advanced analytics people should probably consider wOBA more useful, since singles, doubles, triples and home runs are all treated differently in the calculation.