Ballpark Attendance and Starting Pitchers

When I am thinking about buying a ticket to a baseball game, often my first question is “Who’s pitching?” I have always felt that the most enjoyable type of game is one in which a great starter is on the mound. Is this feeling common among fans or do they buy tickets regardless of the starting pitcher?

To answer this question, I trained random forest models to predict attendance for games based on situational factors (not including the starting pitcher). Then I considered how the quality of starting pitchers relates to whether the models overestimate or underestimate the attendance. If the models consistently underestimate attendance when star pitchers are on the mound, it would suggest more tickets are sold because of the starter.

Data

Information about each game was collected from Retrosheet’s game logs. In accordance with Retrosheet’s terms of use, please note the following statement: “The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at retrosheet.org.” Pitcher performance data was gathered from FanGraphs. In addition, the people.csv data set was used to match player IDs between Retrosheet and FanGraphs.

Model

A separate random forest model to predict attendance was built for each season from 1938 to 2018. The cutoff point of 1938 was chosen because of sample size constraints.

The independent variables for the model are:

  • Opposing team
  • Month
  • Day-of-week type (Monday-Thursday or Friday-Sunday)
  • Time of day (day or evening)
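
As a rough illustration of the setup described above, the sketch below fits one season's model on the four categorical variables. The post does not say which software or settings were used, so scikit-learn, the helper name fit_season_model, the number of trees, the one-hot encoding, and the synthetic toy season are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder


def fit_season_model(games, attendance, n_trees=200, seed=0):
    """Fit one season's attendance model (hypothetical helper).

    games: rows of (opponent, month, day-of-week type, time of day), all strings.
    """
    model = make_pipeline(
        OneHotEncoder(handle_unknown="ignore"),  # categories -> indicator columns
        RandomForestRegressor(n_estimators=n_trees, random_state=seed),
    )
    model.fit(np.asarray(games, dtype=object), attendance)
    return model


# Synthetic toy season for one home team; these are NOT Retrosheet numbers.
rng = np.random.default_rng(0)
opponents = ["BOS", "NYA", "DET", "CHA"]
months = ["Apr", "May", "Jun", "Jul", "Aug", "Sep"]
games = [
    [
        str(rng.choice(opponents)),
        str(rng.choice(months)),
        str(rng.choice(["Mon-Thu", "Fri-Sun"])),
        str(rng.choice(["day", "evening"])),
    ]
    for _ in range(80)
]
# Made-up attendance: weekend and evening games draw more, plus noise.
attendance = np.array([
    20000 + 8000 * (g[2] == "Fri-Sun") + 3000 * (g[3] == "evening")
    + rng.normal(0, 1500)
    for g in games
])

model = fit_season_model(games, attendance)
predicted = model.predict(np.asarray(games, dtype=object))  # in-sample predictions
print(predicted[:3].round())
```

The sketch predicts on the same games it was trained on. Whether the original analysis used in-sample or held-out predictions is not stated, so treat that choice as an assumption too.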

Analysis

The differential was calculated between the actual attendance and the predicted attendance using the formula:

Differential = 100 × (Actual - Predicted) / Predicted

Note that this is the percent error formula, with the predicted attendance as the reference value and the actual attendance as the value being tested, since actual attendance is what the starting pitcher can influence. A positive differential therefore indicates that actual attendance was higher than predicted. A percentage is used rather than an absolute difference because overall park attendance has increased over time.
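
For readers who prefer code, the formula can be written as a small function. This is only a sketch; the function name and the guard against a non-positive prediction are my own additions, not part of the original analysis.

```python
def differential(actual, predicted):
    """Percent differential between actual and predicted attendance.

    Positive values mean the game drew more fans than the model predicted.
    """
    if predicted <= 0:
        raise ValueError("predicted attendance must be positive")
    return 100.0 * (actual - predicted) / predicted


# Hypothetical games: one over the prediction, one under it.
over = differential(actual=33000, predicted=30000)
under = differential(actual=27000, predicted=30000)
print(over, under)
```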

Next, the average differential was computed for each starting pitcher, considering only pitchers who made at least 10 starts in the given season. To compare differentials to pitcher performance, fWAR was used as a proxy for pitcher quality. It seemed to be the best choice, as it captures both the quantity and quality of a pitcher’s performance. The pitchers were separated into four groups:

  • Best: fWAR at or above the 90th percentile
  • Above_avg: fWAR between the 50th and 90th percentiles
  • Below_avg: fWAR between the 10th and 50th percentiles
  • Worst: fWAR below the 10th percentile
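
The grouping above could be sketched as follows. The function name, the toy fWAR values, the choice of percentile method, and the handling of values that fall exactly on a cutoff (assigned to the higher group) are assumptions, since the post does not spell them out. It also does not say which pool of pitchers the percentiles are taken over.

```python
from statistics import quantiles


def assign_groups(fwar_by_pitcher):
    """Map each pitcher to Best / Above_avg / Below_avg / Worst by fWAR percentile."""
    # Nine decile cut points; positions 0, 4 and 8 are the 10th, 50th and 90th percentiles.
    cuts = quantiles(fwar_by_pitcher.values(), n=10, method="inclusive")
    p10, p50, p90 = cuts[0], cuts[4], cuts[8]

    groups = {}
    for pitcher, fwar in fwar_by_pitcher.items():
        if fwar >= p90:
            groups[pitcher] = "Best"
        elif fwar >= p50:
            groups[pitcher] = "Above_avg"
        elif fwar >= p10:
            groups[pitcher] = "Below_avg"
        else:
            groups[pitcher] = "Worst"
    return groups


# Toy data: eleven hypothetical pitchers with fWAR from 0.0 to 5.0.
toy_fwar = {f"P{i:02d}": 0.5 * i for i in range(11)}
groups = assign_groups(toy_fwar)
print(groups)
```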

The average differential of each group was then compared.
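
A minimal sketch of the averaging steps, assuming the group figure is a simple mean of per-pitcher means (the post does not say whether pitchers were weighted by number of starts). The function names and toy numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

MIN_STARTS = 10  # threshold stated in the article


def pitcher_averages(starts, min_starts=MIN_STARTS):
    """Average differential per pitcher, dropping pitchers with too few starts.

    starts: iterable of (pitcher_id, differential) pairs, one per game started.
    """
    by_pitcher = defaultdict(list)
    for pitcher_id, diff in starts:
        by_pitcher[pitcher_id].append(diff)
    return {p: mean(d) for p, d in by_pitcher.items() if len(d) >= min_starts}


def group_averages(pitcher_avg, groups):
    """Unweighted mean of pitcher averages within each fWAR group."""
    by_group = defaultdict(list)
    for pitcher_id, avg in pitcher_avg.items():
        if pitcher_id in groups:
            by_group[groups[pitcher_id]].append(avg)
    return {g: mean(v) for g, v in by_group.items()}


# Toy season: "callup" has only 3 starts and is excluded.
starts = [("ace", 5.0)] * 10 + [("filler", -4.0)] * 12 + [("callup", 20.0)] * 3
groups = {"ace": "Best", "filler": "Below_avg", "callup": "Worst"}

averages = pitcher_averages(starts)
by_group = group_averages(averages, groups)
print(by_group)
```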

Results

The following graphs show the average differential by group over time.

The Best group consistently has a larger attendance than predicted, the Worst and Below_avg groups have a consistently lower attendance, and the Above_avg group is around the predicted level. All four groups tend to be closer to the predicted level over time. These relationships are also clear in the following summary table that collects the data over decades:

Differentials by Decade
Decade     Best  Above_avg  Below_avg  Worst  R-squared  Pitchers
1938-1949  6.95  -3.50      -4.39      -19.7  0.374      39
1950-1959  2.56  -2.43      -5.23      -13.9  0.456      50
1960-1969  4.07  -3.28      -7.79      -9.15  0.507      72
1970-1979  1.33  -3.25      -6.20      -8.39  0.537      95
1980-1989  6.14  -1.37      -7.21      -6.54  0.600      100
1990-1999  5.97  -1.79      -4.29      -7.02  0.688      108
2000-2009  1.42  -0.108     -4.05      -3.00  0.701      125
2010-2018  2.80  -0.0005    -3.37      -3.73  0.723      126

Although the magnitude ebbs and flows, the sign of the differentials remains consistent within groups. The R-squared value shown is the average R-squared of the models in the decade, and it increases over time as the average number of pitchers who made at least 10 starts increases. Overall, the models do a pretty good job fitting the data, especially with the larger sample sizes in recent years. Also note that the models tend to overestimate attendance more than underestimate it, as every group other than Best (which accounts for just the 90th percentile and above) averages a negative differential. This makes it even more convincing that the pitchers in the Best group are bringing in more fans.

In more concrete terms, the average differential of 2.80 for 2010-2018 means that if a pitcher in the Best group is starting, and the model predicts 30,000 people will come to the game, then about 840 additional fans should be expected.
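
The arithmetic behind that example is just the differential formula solved for the number of additional fans. A short sketch, with a function name of my own choosing:

```python
def extra_fans(predicted, differential_pct):
    """Additional fans implied by a percent differential at a given prediction."""
    return predicted * differential_pct / 100.0


# Best-group differential for 2010-2018 applied to a 30,000 prediction.
additional = extra_fans(predicted=30000, differential_pct=2.80)
expected_actual = 30000 + additional
print(round(additional), round(expected_actual))
```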

Conclusion

It appears that the quality of the starting pitcher does in fact impact ticket sales, and in particular star pitchers starting a game tend to put more fans in seats. This relationship is consistent over time, with no clear linear trend that would suggest starters are having more or less of an impact now than in past years.





4 Comments
Mac
4 years ago

Seems like the most interesting finding is that above average starters only draw average attendance, meaning they under-draw compared to their talent level. Which makes sense with what we know – fans love the top stars way, way more than the good but not great guys.

David Ducksworth
4 years ago

Obviously this is FanGraphs, and fWAR is the home metric, but I wonder whether bWAR would have been a better option for this kind of analysis. We are trying to see how attendance is changed by the pitchers the fans think are best, so a metric based on runs allowed likely tracks better with the opinion of the majority of fans, who for most of the years in the model, and likely even today, do not know what FIP is (for earlier years, because it did not even exist yet).

AHume92 (member)
4 years ago

I would be interested to see controls for team quality (winning percentage) the year before and the given year. We know winning is a major driver of attendance, and having better starting pitchers obviously helps with that, all else equal. I know there are better metrics than raw winning percentage for team quality, but that is the metric fans will react to.

Lucas Kelly (member)
4 years ago

This is great work. Why did you choose the random forest classifier in particular? I would love to see your process. Do you have a GitHub link you would be willing to share? Very interesting read. Thanks!