SEAM Methodology for Player Matchup Evaluations

by Daniel Eck

September 1, 2020

Introducing SEAM Methodology

This article introduces the SEAM (Synthetic Estimated Average Matchup) method for describing batter-versus-pitcher matchups, both numerically and visually. We provide a Shiny app, available here, which you can use to follow along.

This app allows users to visualize synthetic spray chart distributions for any batter-pitcher matchup that has occurred or could have occurred in the past five years (the period for which Statcast data exist). Our app also reports performance metrics that are calculated directly from the displayed synthetic spray chart distribution. These include the expected number of singles, doubles, triples, and home runs, as well as the expected batting average on balls in play (xBABIP) and the expected bases on contact (xBsCON), which can be thought of as slugging percentage with BIP + HR in the denominator instead of AB. These matchup-dependent metrics allow any user to assess the expected performance of batters and pitchers when they face each other.

The SEAM method estimates spray chart distributions in the form of heat maps that are smoothed versions of conventional spray charts. We construct these by combining separate batter spray chart distributions, one for each pitch type that the pitcher throws. The final combination is weighted by the pitcher’s usage of each pitch.

One challenge to this approach is the sparsity of some batter-pitcher matchup data. We alleviate this concern by constructing synthetic batters and pitchers with characteristics similar to those of the batter and pitcher under study. Our synthetic player creation methodology is inspired by the notion of similarity scores like those motivating PECOTA and Bill James’s work. However, unlike the similarity scores presented in the past, we construct similarity scores using a nearest neighbor approach that is based on the underlying batter and pitcher characteristics of the players under study instead of observed statistics.

For pitcher comparisons, players are aggregated on a season and pitch-type basis. The variables considered are velocity, spin rate, horizontal break, horizontal release angle, horizontal release point, vertical break, vertical release angle, vertical release point, and extension. Averages of these variables are taken across each pitcher-pitch type combination. For batter comparisons, players are aggregated on a season, handedness, and pitch-type basis. The variables considered are exit velocity, launch angle, pull%, middle%, and oppo%. Averages of these variables are taken across each batter-pitch type combination.
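To make the nearest neighbor idea concrete, here is a minimal sketch for a single pitch type. It assumes that each characteristic is standardized and that similarity is measured with Euclidean distance; the article does not specify the exact distance or scaling. The pitcher names, the three characteristics used, and all numbers are hypothetical.

```python
import math

# Hypothetical season averages for one pitch type (a four-seam fastball):
# (velocity in mph, spin rate in rpm, horizontal break in inches).
# Names and values are made up for illustration only.
pitchers = {
    "target": (95.0, 2500.0, -7.0),
    "pitcher_a": (94.5, 2450.0, -6.5),
    "pitcher_b": (90.0, 2100.0, -3.0),
    "pitcher_c": (96.0, 2550.0, -8.0),
}


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant to avoid dividing by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds))
            for row in rows]


def nearest_neighbors(data, target, k):
    """Return the k players closest to `target` in standardized space."""
    names = list(data)
    scaled = dict(zip(names, standardize([data[n] for n in names])))
    dists = {n: math.dist(scaled[n], scaled[target])
             for n in names if n != target}
    return sorted(dists, key=dists.get)[:k]


neighbors = nearest_neighbors(pitchers, "target", k=2)
print(neighbors)
```

Standardizing first keeps a variable measured in large units, such as spin rate, from dominating the distance.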

These player characteristics are obtained from Statcast and scraped using functionality in the baseballr R package. These characteristics should reflect the underlying talent and tendencies of players.

For each batter-pitcher matchup we estimate three spray chart densities. The first is the natural spray chart density corresponding to the players under study, the second is the spray chart of the synthetic pitcher versus the original batter, and the third is the spray chart of the original pitcher versus the synthetic batter. We report a synthetic spray chart density, which is a weighted average of these spray chart densities. The weights are chosen with the aim of minimizing estimation uncertainty in the form of mean squared error. These weights reflect the overall fit and sample size of these spray charts and they are computed internally.

Our app’s default matchup is between Mike Trout and Justin Verlander. The spray chart distribution and the stats for this matchup are displayed below.

[Image: synthetic spray chart distribution and stats for Mike Trout vs. Justin Verlander]

The default Trout vs. synthetic Verlander spray distribution is displayed below. This visualization is complemented with the top 10 most similar pitchers to Justin Verlander who have faced Mike Trout over the past five years.

[Image: spray chart distribution for Mike Trout vs. synthetic Justin Verlander, with the 10 most similar pitchers]

Here’s the default synthetic Trout vs. Verlander spray distribution. This visualization is complemented with the 10 batters most similar to Mike Trout who have faced Justin Verlander over the past five years.

[Image: spray chart distribution for synthetic Mike Trout vs. Justin Verlander, with the 10 most similar batters]

Details of the SEAM Method

The complete gory mathematical details are included in our paper, which is available on our main page, but here’s a brief overview.

We will let y denote the location of a batted ball, x denote the collected variables for the batters and pitchers under study, and let f(y|x) be a probability density function corresponding to the batted ball distribution of possible locations y conditional on batters and pitchers possessing characteristics encoded in x.

Our goal is to estimate f(y|x) in the presence of sparse matchup data.

We will estimate f(y|x) with g(y|x), where g(y|x) takes the form:

g(y|x) = l_m * g_m(y|x) + l_sp * g_sp(y|x) + l_sb * g_sb(y|x)

In this equation, l_m, l_sp, and l_sb are non-negative weights that satisfy l_m + l_sp + l_sb = 1, g_m(y|x) is the estimated batted ball density corresponding to the batter and pitcher under study, g_sp(y|x) is the estimated batted ball density between the synthetic pitcher and the batter, and g_sb(y|x) is the estimated batted ball density between the synthetic batter and the pitcher. The weights (l_m, l_sp, l_sb) are chosen to minimize the mean squared error of g(y|x) as an estimator of f(y|x). We now motivate the construction of g_sp(y|x) and g_sb(y|x).
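As a rough illustration of the weighted combination, the sketch below mixes three densities that have already been discretized over the same set of field bins. The bin values and the weights are hypothetical. In the SEAM method the weights are chosen to minimize mean squared error, which this sketch does not attempt; it simply takes them as given.

```python
def mix_densities(g_m, g_sp, g_sb, l_m, l_sp, l_sb):
    """Combine three binned densities using non-negative weights that sum to 1."""
    weights = (l_m, l_sp, l_sb)
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    if not (len(g_m) == len(g_sp) == len(g_sb)):
        raise ValueError("densities must be defined over the same bins")
    return [l_m * a + l_sp * b + l_sb * c for a, b, c in zip(g_m, g_sp, g_sb)]


# Hypothetical densities over three field bins; each list sums to 1.
g_m = [0.50, 0.30, 0.20]   # batter vs. pitcher under study
g_sp = [0.40, 0.40, 0.20]  # batter vs. synthetic pitcher
g_sb = [0.30, 0.30, 0.40]  # synthetic batter vs. pitcher

g = mix_densities(g_m, g_sp, g_sb, l_m=0.2, l_sp=0.5, l_sb=0.3)
print(g)
```

Because the weights sum to 1 and each input sums to 1, the mixture is again a valid density over the bins.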

Suppose that there are J available pitchers and K available batters that form the pool of players that are comparable to the players under study, and suppose that the pitcher under study throws n_type unique pitch types. For each t = 1,…,n_type, let p_t be the proportion of the pitcher’s pitches that are of type t, so that p_1 + … + p_n_type = 1. For each j = 1,…,J, and t = 1,…,n_type, let w_p,j,t be a non-negative weight that reflects the similarity of pitcher j’s pitch t to that of the pitcher under study, where we require that w_p,1,t + … + w_p,J,t = 1 for each t. With these specifications, the estimated density function for the synthetic pitcher is:

g_sp(y|x) = ∑_t p_t ∑_j w_p,j,t g_p,j,t(y|x_p,j,t)

In this equation, x_p,j,t are the pitch type t covariates for pitcher j. The density function g_p,j,t(y|x_p,j,t) is estimated nonparametrically.
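The construction of the synthetic pitcher density can be sketched as follows, assuming the pitch-type mixtures are combined using the usage proportions p_t defined above. The component densities are represented here as lists of bin probabilities rather than nonparametric estimates, and the pitch types, pitcher labels, and numbers are all hypothetical.

```python
def synthetic_pitcher_density(usage, weights, densities, n_bins):
    """Mix comparable pitchers' binned densities by similarity and pitch usage.

    usage[t]        -> p_t, share of pitches of type t thrown by the pitcher
    weights[t][j]   -> w_p,j,t, similarity weight of pitcher j for pitch t
    densities[t][j] -> binned batted ball density against pitcher j's pitch t
    """
    g_sp = [0.0] * n_bins
    for t, p_t in usage.items():
        for j, w in weights[t].items():
            for b, value in enumerate(densities[t][j]):
                g_sp[b] += p_t * w * value
    return g_sp


# Hypothetical inputs: two pitch types, two comparable pitchers, two bins.
usage = {"FF": 0.6, "SL": 0.4}
weights = {
    "FF": {"a": 0.75, "b": 0.25},
    "SL": {"a": 0.5, "b": 0.5},
}
densities = {
    "FF": {"a": [0.6, 0.4], "b": [0.2, 0.8]},
    "SL": {"a": [0.5, 0.5], "b": [0.1, 0.9]},
}

g_sp = synthetic_pitcher_density(usage, weights, densities, n_bins=2)
print(g_sp)
```

Since the usage proportions sum to 1 and the similarity weights sum to 1 within each pitch type, the result sums to 1 across bins.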

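A similar construction gives the estimated density function for the synthetic batter. For each k = 1,…,K, and t = 1,…,n_type, let w_b,k,t be a non-negative weight that reflects the similarity of batter k to the batter under study against pitch t, with w_b,1,t + … + w_b,K,t = 1 for each t, and let x_b,k,t be the pitch type t covariates for batter k. The estimated density takes the form:

g_sb(y|x) = ∑_t p_t ∑_k w_b,k,t g_b,k,t(y|x_b,k,t)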

Performance Metrics

For this system, we developed novel performance metrics that are theoretically computed as expectations with respect to the synthetic spray chart density g(y|x). As noted above, we estimate the expected number of singles, doubles, triples, and home runs that the batter hits versus the pitcher in a particular matchup. We also estimate the xBABIP and xBsCON as additional summary measures. The metric xBsCON is best interpreted as slugging percentage conditional on balls put into play. Our implementation will not estimate these expectations exactly, as there is not enough historical batted ball data.

To theoretically estimate these quantities, we first obtain five years of batted ball data. Next we estimate the proportion of batted balls that were either an out (O), single (1B), double (2B), triple (3B), or home run (HR) at locations y on the baseball field. We denote this vector of estimated proportions at y as:

P(y) = (p_O(y), p_1B(y), p_2B(y), p_3B(y), p_HR(y))’

Next, we obtain E(x) = ∫ P(y) g(y|x) dy, where:

E(x) = (e_O(x), e_1B(x), e_2B(x), e_3B(x), e_HR(x))’

Thus E(x) is the estimated expected vector of outcomes where the expectation is taken with respect to the estimated spray chart distribution. Expected BABIP is then calculated as:

xBABIP = e_1B(x) + e_2B(x) + e_3B(x)

And xBsCON is calculated as:

xBsCON = e_1B(x) + 2e_2B(x) + 3e_3B(x) + 4e_HR(x)

Our app also displays the floor of 100 e_1B(x), 100 e_2B(x), 100 e_3B(x), and 100 e_HR(x). The 100 multiplier is a normalization that is intended for ease of interpretation.

The previous paragraphs outline how we would calculate our performance metrics if we could obtain P(y) for every y. In reality, we do not have enough data available to achieve this task. Therefore, we calculate discretized versions of these performance metrics over 10-foot-by-10-foot bins.
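The discretized calculation can be sketched as a sum over bins in place of the integral. In this sketch, each bin carries a probability from the synthetic spray chart density and a set of estimated outcome proportions. The two bins and all numbers are hypothetical stand-ins for the 10-foot-by-10-foot grid, and the function names are illustrative only.

```python
import math

OUTCOMES = ("O", "1B", "2B", "3B", "HR")


def expected_outcomes(bin_density, bin_outcome_probs):
    """Approximate E(x) by summing P(y) * g(y|x) over bins."""
    e = dict.fromkeys(OUTCOMES, 0.0)
    for g_b, probs in zip(bin_density, bin_outcome_probs):
        for outcome in OUTCOMES:
            e[outcome] += probs[outcome] * g_b
    return e


def summary_metrics(e):
    """Compute xBABIP, xBsCON, and floored per-100 hit counts from E(x)."""
    xbabip = e["1B"] + e["2B"] + e["3B"]
    xbscon = e["1B"] + 2 * e["2B"] + 3 * e["3B"] + 4 * e["HR"]
    per_100 = {k: math.floor(100 * e[k]) for k in ("1B", "2B", "3B", "HR")}
    return xbabip, xbscon, per_100


# Hypothetical two-bin field: a shallow bin and a deep bin.
bin_density = [0.75, 0.25]
bin_outcome_probs = [
    {"O": 0.75, "1B": 0.25, "2B": 0.0, "3B": 0.0, "HR": 0.0},
    {"O": 0.25, "1B": 0.0, "2B": 0.25, "3B": 0.0, "HR": 0.5},
]

e = expected_outcomes(bin_density, bin_outcome_probs)
xbabip, xbscon, per_100 = summary_metrics(e)
print(xbabip, xbscon, per_100)
```

With these made-up inputs, the batter would be credited with 18 singles, 6 doubles, 0 triples, and 12 home runs per 100 batted balls, an xBABIP of 0.25, and an xBsCON of 0.8125.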

Discussion Points

The primary contribution of this work is the development of synthetic spray chart distributions that are calculated under the hood of a Shiny app, which provides users with visual and numeric summary measures of batter-pitcher matchups. Our application shows users batter-versus-pitcher tendencies while providing summaries of their overall success (or lack thereof), which we hope will be of interest to the broad baseball community. The SEAM method greatly improves upon the inferential power of spray charts as a visualization of a batter’s hitting tendencies, since conventional spray charts may be uninformative for individual matchups due to a lack of data. Our synthetic player construction helps alleviate this problem.

We are not the first to incorporate additional players into an analysis via similarity scores with the understanding that doing so improves estimation performance. The PECOTA prediction methodology tries to forecast the ability of players using aggregate estimates obtained from other similar players. To the best of our knowledge, we are the first to base similarity scores exclusively on Statcast data, which we believe provides a truer notion of similarity in the context of individual batter-pitcher matchups.

On average, players who hit the ball harder with a more optimized launch angle will receive better projected stats, which makes sense since these balls tend to produce more home runs (and thus take fielders out of the equation). One should note that these variables will of course not allow us to measure the complete talent profile of baseball players. Tools such as speed or eye at the plate would not be fully captured by our methodology.

Brief Digression on the History of Context-Free vs. Context-Rich

Baseball has had a rich statistical history dating back to the first box score created by Henry Chadwick in 1859. Fans, journalists, and teams have obsessed over baseball statistics and performance metrics ever since. This obsession is best summarized by the existence of Alan Schwarz’s best-selling book “The Numbers Game: Baseball’s Lifelong Fascination with Statistics,” which is devoted entirely to the statistical history of baseball.

Most commonly used player evaluation metrics are functions of context-free box score totals and/or context-free field tracking information. These metrics include, and are far from limited to, adjusted earned run average (ERA+), adjusted on base plus slugging percentage (OPS+), weighted runs created plus (wRC+), and wins above replacement (WAR). This class of metrics is largely touted as context-free, meaning that contextual information such as park and league run environment effects is taken into account while situational information is ignored.

The paper “Challenging Nostalgia and Performance Metrics in Baseball” showed that context-free metrics and the class of metrics that compares a player’s accomplishments directly with that player’s peers are ill-equipped for player comparisons across eras of baseball, although they may perform well over the course of a single season or a few consecutive seasons. That being said, these context-free metrics do not offer any guidance for how any particular batter will perform against a particular pitcher, the most important and relevant outcome in a baseball game.

Furthermore, baseball outcomes have been assumed to be independent and identically distributed (iid) realizations in the statistics literature (Brown 2008 and Jensen, McShane, and Wyner 2009). The iid assumption may be reasonable in the prediction contexts of these cited works, which involve long time frames, but it is not appropriate for short time frames, when the variability in the quality of batters and pitchers can be very large.

Commonly used context-free metrics are grounded in the iid assumption and are therefore essentially meaningless for studying batter-pitcher matchups since seasonal values are aggregated over widely varying matchups that individually have too small a sample to be reliable. Thus, this is the setting in which the context-rich SEAM method shines.

Authors

This project is a collaborative effort between Charles Young, David Dalpiaz, and Daniel J. Eck at the University of Illinois Department of Statistics.

We would like to thank ATLAS Infrastructure and the Illini Analytics group for web hosting and computational resources. We would also like to thank Alan Nathan, Jim Albert, Shane Jensen, James Balamuta, John Marden, and Dave Zhao for helpful comments.
