This article introduces the SEAM (Synthetic Estimated Average Matchup) method for describing batter-versus-pitcher matchups, both numerically and visually. We provide a Shiny app, available here, which you can use to follow along.
This app allows users to visualize synthetic spray chart distributions for any batter-pitcher matchup that has or could have occurred in the past five years (which is when Statcast data exists). Our app also reports performance metrics that are calculated directly from the displayed synthetic spray chart distribution. This includes the expected number of singles, doubles, triples, and home runs, as well as the expected batting average on balls in play (xBABIP) and the expected bases on contact (xBsCON), which can be thought of as slugging percentage except the denominator is BIP + HR instead of AB. These matchup-dependent metrics allow for any user to assess the expected performance of batters and pitchers when they face each other.
The SEAM method estimates spray chart distributions in the form of heat maps that are smoothed versions of conventional spray charts. We construct these by combining separate batter spray chart distributions that are constructed for each of the pitches that the pitcher throws. The final combination is also weighted to the usage for each pitch.
One challenge to this approach is the sparsity of some batter-pitcher matchup data. We alleviate this concern with the development of synthetic batters and pitchers with similar characteristics as the batter and pitcher under study. Our synthetic player creation methodology is inspired by the notion of similarity scores like those motivating PECOTA and Bill James’s work. However, unlike the similarity scores presented in the past, we construct similarity scores using a nearest neighbor approach that is based on the underlying batter and pitcher characteristics of the players under study instead of observed statistics. Read the rest of this entry »