Albert Pujols Bunted Once

One time, Albert Pujols bunted.

If we include minor-league play, he’s bunted twice in his professional career. But in the major leagues, the major leagues where he’s played for 12.5 years and hit (as of July 10) 489 home runs, 523 doubles, and on average 1.198 hits per game, the major leagues where his career batting average is .321 and he hits twice as many doubles as double plays, Albert Pujols has bunted once.

It was in his rookie season, of course. But what exactly happened? Why did he bunt?

Theory #1: Pujols was an untested rookie.

Strike one. Albert Pujols bunted on June 16, 2001. When the baseballing world awoke that day, he was a rookie batting .354/.417/.654, with 20 home runs. He’d already been intentionally walked three times. (Compare to our latest Rookies of the Year: Mike Trout was intentionally walked four times in all of 2012; Bryce Harper, zero.) Pujols had 11 hits in the previous seven games, including four homers.

Now, this was only two and a half months of gameplay, a small track record. But if you’re savvy enough to realize that ten weeks is not enough time to assess a player’s quality, you’re probably also savvy enough to realize that this is not the type of player who should bunt.

Unless, of course, it’s a critical situation in the game.

Theory #2: Pujols was bunting at a time when the Cardinals really needed a bunt.

Strike two. Albert Pujols bunted in the bottom of the seventh inning, with the Cardinals ahead 6-3. In the top of the same inning, the White Sox had scored two runs, but St. Louis’ win probability was a healthy 96% when Pujols came to the plate. After he bunted, their odds of winning were still 96%.

Now, in some ways it was a textbook bunt situation. The Cardinals had two men on base. They also had zero outs. No outs and two on is a good time to bunt. But they also had a three-run lead in the seventh. And Albert Pujols was batting cleanup. He bunted.

Theory #3: Pujols was facing a pitcher against whom he might have trouble.

Strike three. The White Sox did bring in a new pitcher to face Albert Pujols, a thirty-year-old right-hander named Sean Lowe.

Now, Sean Lowe was pretty good against right-handed hitters. In 2001, righties hit .233 off him. They didn’t strike out much, but they didn’t walk much either, and they made unusually weak contact. We can suppose this because when lefties put balls into play against Lowe, their batting average was .308, but righties’ batting average on balls in play against Lowe was only .243.

On the other hand, the Sox didn’t trust Lowe that much. According to Baseball Reference, he was placed into low-leverage situations more than half the time in 2001. In 17 of his 34 relief appearances, the Sox were already losing–as they were on this day, losing by three runs with only six outs left. (That’s 17 of 34 in a year when the team had a winning record.)

Oh, and there’s another thing. Albert Pujols was killing right-handed pitching; when 2001 was over, his AVG/OBP/SLG against righties was .342/.408/.624.

No, the White Sox brought Sean Lowe into the game not as a magic bullet, but as something simpler: a Band-Aid. Ken Vining had allowed two runners to reach base without getting the inning’s first out. They simply needed somebody new.

Theory #4: Bonus Dan Szymborski theory: the element of surprise.

In a FanGraphs chat, I asked Dan Szymborski why he might have Pujols bunt. His reply: “It may be a good surprise play if he’s confident he can get it down and the 3B is super deep or is Mark Reynolds.”

Strike four. Pujols bunted successfully on the second pitch; the first was a foul bunt attempt, terminating the element of surprise and any super-depth on the part of the defense. The third baseman was Joe Crede.

Theory #5: We’re out of theories.

Let’s set the scene, shall we?

The game is in St. Louis. As the fans sit down after their seventh-inning stretch, the Cardinals are winning 6-3. They’re six outs from victory, with odds of 95%, and their 2-3-4 hitters are due up. Chicago reliever Ken Vining starts the inning by walking third baseman Placido Polanco on four pitches. Next J.D. Drew hits a line drive single to right field on a 1-2 pitch, and Polanco advances to second.

This brings up cleanup-hitting right fielder Albert Pujols. The White Sox replace the flailing Ken Vining with Sean Lowe, a middle relief righty who induces weak contact. (Within a month, Vining will pitch his last major-league game.) The Cardinals have their best hitter at the plate: he’s a rookie, but he’s batting fourth, already has 20 homers, and sees two runners on base with no outs.

On the first pitch, Pujols bunts foul. On the second pitch, Pujols bunts fair.

It works, technically. Polanco and Drew advance, and Bobby Bonilla steps up to the plate. This was the 38-year-old Bonilla’s final season, and at the time of this game, his triple slash was a pitiful .217/.321/.391. (It would get worse, but remember, this is who Pujols bunted in front of.) Bonilla has had four home runs all year, one of them the day previous.

Bobby Bonilla is issued the second-to-last intentional walk of his major league career. (Yes, there was another one; he drew three IBBs that year.)

This brings up left fielder Craig Paquette, staring down loaded bases. He delivers a two-run single, putting the Cardinals up 8-3. Sean Lowe gets Edgar Renteria and Mike Matheny out to end the inning. The Cardinals win the ballgame by the same score, and in the ninth inning the last White Sox hitter to go down is a pinch-hitter making his major-league debut, named Aaron Rowand.

So Why Did Pujols Bunt?

Pujols tried to bunt twice, once hitting the ball foul. This suggests that it wasn’t Albert’s idea but his manager’s. If Pujols was the kind of player who liked to bunt spontaneously, he might have done it again by now.

Why did Tony La Russa have Pujols bunting? His team up by three runs, late in the game, two runners, no outs, best hitter at the plate. Perhaps he was overly concerned about Sean Lowe’s ability to get righties out, but there weren’t any outs and a double play would still leave a baserunner. Perhaps he recognized a classic bunting scenario, but Pujols was his best hitter and Bobby Bonilla, with a slugging percentage .263 lower, may have been his worst. Maybe he wanted to spring a surprise, but then came the foul bunt.

The St. Louis Post-Dispatch archives don’t turn up any hits for “Pujols bunt.” One blog post about the bunt groundlessly speculates that Pujols was improvising. Googling “why did Pujols bunt” in quotation marks yields zero hits. And, looking at the evidence we have, there’s no rational explanation. I’ve hand-written Tony La Russa a letter asking about this, but that was over three months ago and there’s not much chance he writes back.

Aaron Rowand played for eleven seasons, was an All-Star, and won two World Series. His entire career has taken place since the last time Albert Pujols bunted. That’s interesting, but not surprising. What’s surprising is that the only time Pujols bunted, there was no reason for him to do so.

Albert Pujols bunted once. We may never know why.


Should pitcher hitting count for Hall of Fame consideration?

The arbitrary cut-off I use for what is to be considered a great season is a minimum of 6 WAR, or 6 wins. This is the cut-off for many. Some others will count, say, a 5.8 as a 6. But I don’t. I use a strict baseline. It benefits some, hurts others. But in reality it does nothing, since I have no vote for any award that Major League Baseball currently has.

Since I wrote about Tom Glavine not quite being great enough to receive my hypothetical Hall of Fame vote,  I received a bunch of feedback.  Readers of the piece said I shouldn’t use FIP, that it is not as relevant over the course of a long career.  A point well-received.  A point that certainly has some validity behind it.

Many chose to use bWAR in Glavine’s defense instead since it takes into account runs allowed, rather than just the three true outcomes a pitcher encounters.

Here are Glavine’s numbers:

Glavine’s pitcher bWAR: 74, with two seasons of 6 or more WAR.

Glavine’s pitcher fWAR: 63.9, with no seasons of 6+ WAR.

But according to Baseball Reference, Glavine added 7.5 wins at the plate.  Yes, his career .454 OPS actually added value.  Adjusted, that is an OPS+ of 22.

At Fangraphs, he added 5.7 wins with his bat, while having his career .214 wOBA.

But the question here is, should we include Glavine’s offensive game? We are comparing one player to another in cases like these, and not every pitcher has the chance to hit in his career. Or at least a consistent chance to hit and accumulate value by hitting.

It’s not like a general manager would try to sign a free agent pitcher that could hit and use lingo like, “You know, you have a pretty good stick for a pitcher.  If you sign with us in the NL, that will probably increase your total WAR when the statistic is invented in the future, and give you a better Hall of Fame case.”

Of course, the general manager probably would use the fact that he could hit as a “selling point.”  But obviously not the way I described the scenario above.

So if you add in Tom Glavine’s hitting, he all of a sudden has four seasons of 6+ bWAR and two seasons of 6+ fWAR.

Neither total is particularly dominant, or truly great, but they definitely help his case a little.

But let’s take a pitcher such as Mike Mussina, whom many people see as a good comp for Glavine.

Mussina pitched in the American League his entire career.  He accrued -0.1 wins as a hitter.  He didn’t hit.  He pitched.

He totaled 82 fWAR with three seasons of 6+ wins.

And totaled 82 bWAR with four seasons of 6+ wins.

He has a better case for the Hall of Fame with or without Glavine’s bat. But that is beside the point.

So I ask the question: should a pitcher, who hits terribly, but based on opportunity and even more terrible hitting by other pitchers, get credit for it in terms of value?  In particular, in terms of Hall of Fame voting?

It’s a legitimate argument.  But it seems to be unfair to American League pitching.  And when we compare Hall of Fame pitchers to one another, we compare them from both leagues.

Glavine still isn’t a sure-fire Hall of Famer, no matter which way you look at it.  He was never nearly as dominant as a Maddux or Randy Johnson.

But then again, he didn’t have to be.  He just had to be good enough to make a strong enough impression on the voters.


Estimating Pitcher Release Point Distance from PITCHf/x Data

For PITCHf/x data, the starting point for pitches, in terms of the location, velocity, and acceleration, is set at 50 feet from the back of home plate. This is effectively the time-zero location of each pitch. However, 55 feet seems to be the consensus for setting an actual release point distance from home plate, and is used for all pitchers. While this is a reasonable estimate to handle the PITCHf/x data en masse, it would be interesting to see if we can calculate this on the level of individual pitchers, since their release point distances will probably vary based on a number of parameters (height, stride, throwing motion, etc.). The goal here is to try to use PITCHf/x data to estimate the average distance from home plate at which each pitcher releases his pitches, conceding that each pitch is going to be released from a slightly different distance. Since we are operating in the blind, we have to first define what it means to find a pitcher’s release point distance based solely on PITCHf/x data. This definition will set the course by which we will go about calculating the release point distance mathematically.

We will define the release point distance as the y-location (the direction from home plate to the pitching mound) at which the pitches from a specific pitcher are “closest together”. This definition makes sense as we would expect the point of origin to be the location where the pitches are closer together than any future point in their trajectory. It also gives us a way to look for this point: treat the pitch locations at a specified distance as a cluster and find the distance at which they are closest. In order to do this, we will make a few assumptions. First, we will assume that the pitches near the release point are from a single bivariate normal (or two-dimensional Gaussian) distribution, from which we can compute a sample mean and covariance. This assumption seems reasonable for most pitchers, but for others we will have to do a little more work.

Next we need to define a metric for measuring this idea of closeness. The previous assumption gives us a possible way to do this: compute the ellipse, based on the data at a fixed distance from home plate, that accounts for two standard deviations in each direction along the principal axes for the cluster. This is a way to provide a two-dimensional figure which encloses most of the data, of which we can calculate an associated area. The one-dimensional analogue to this is finding the distance between two standard deviations of a univariate normal distribution. Such a calculation in two dimensions amounts to finding the sample covariance, which, for this problem, will be a 2×2 matrix, finding its eigenvalues and eigenvectors, and using this to find the area of the ellipse. Here, each eigenvector defines a principal axis and its corresponding eigenvalue the variance along that axis (taking the square root of each eigenvalue gives the standard deviation along that axis). The formula for the area of an ellipse is Area = pi*a*b, where a is half of the length of the major axis and b half of the length of the minor axis. Since a and b are each two standard deviations, the area of the ellipse we are interested in is four times pi times the product of the square roots of the two eigenvalues. Note that since we want to find the distance corresponding to the minimum area, the choice of two standard deviations, in lieu of one or three, is irrelevant since this plays the role of a scale factor and will not affect the location of the minimum, only the value of the functional.
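
As a rough illustration of the area calculation just described, here is a minimal sketch using only the Python standard library. The function names and the toy data are made up for this example and are not part of the original analysis. For a 2×2 symmetric matrix the eigenvalues have a closed form, so no linear algebra package is needed.

```python
import math

def sample_covariance(points):
    """Return (sxx, sxz, szz), the unbiased sample covariance of (x, z) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    mz = sum(z for _, z in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    szz = sum((z - mz) ** 2 for _, z in points) / (n - 1)
    sxz = sum((x - mx) * (z - mz) for x, z in points) / (n - 1)
    return sxx, sxz, szz

def eigenvalues_2x2(sxx, sxz, szz):
    """Eigenvalues (largest first) of the symmetric matrix [[sxx, sxz], [sxz, szz]]."""
    mean = (sxx + szz) / 2.0
    radius = math.hypot((sxx - szz) / 2.0, sxz)
    return mean + radius, max(mean - radius, 0.0)

def ellipse_area(points, n_std=2.0):
    """Area of the ellipse spanning n_std standard deviations along each principal axis."""
    lam1, lam2 = eigenvalues_2x2(*sample_covariance(points))
    a = n_std * math.sqrt(lam1)  # half-length of the major axis
    b = n_std * math.sqrt(lam2)  # half-length of the minor axis
    return math.pi * a * b

# Toy data (invented for illustration): more spread in x than in z.
pts = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
area = ellipse_area(pts)  # eigenvalues are 8/3 and 2/3, so the area is 16*pi/3
```

Changing `n_std` from 2 to 3 multiplies every area by the same factor (9/4), which is why the choice of two standard deviations does not move the location of the minimum.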

With this definition of closeness in order, we can now set up the algorithm. To be safe, we will take a large berth around y=55 to calculate the ellipses. Based on trial and error, y=45 to y=65 seems more than sufficient. Starting at one end, say y=45, we use the PITCHf/x location, velocity, and acceleration data to calculate the x (horizontal) and z (vertical) position of each pitch at 45 feet. We can then compute the sample covariance and then the area of the ellipse. Working in increments, say one inch, we can work toward y=65. This will produce a discrete function with a minimum value. We can then find where the minimum occurs (choosing the smallest value in a finite set) and thus the estimate of the release point distance for the pitcher.
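
The sweep described above could be sketched along the following lines. This is an illustration under stated assumptions, not the author’s actual code: each pitch is assumed to be a dictionary keyed by the usual PITCHf/x initial-condition fields (`x0`, `y0`, `z0`, `vx0`, `vy0`, `vz0`, `ax`, `ay`, `az`), the trajectory is assumed to follow the constant-acceleration model, and the function names are hypothetical.

```python
import math

def time_to_plane(y0, vy0, ay, d):
    """Time (s) at which a pitch crosses the plane y = d; negative when d is behind y0."""
    if abs(ay) < 1e-12:
        return (d - y0) / vy0
    disc = vy0 * vy0 - 2.0 * ay * (y0 - d)
    # Pick the root that equals 0 at d = y0 (vy0 < 0: the ball moves toward the plate).
    return (-vy0 - math.sqrt(disc)) / ay

def position_at_plane(pitch, d):
    """(x, z) of one pitch at distance d, from the constant-acceleration fit."""
    t = time_to_plane(pitch["y0"], pitch["vy0"], pitch["ay"], d)
    x = pitch["x0"] + pitch["vx0"] * t + 0.5 * pitch["ax"] * t * t
    z = pitch["z0"] + pitch["vz0"] * t + 0.5 * pitch["az"] * t * t
    return x, z

def ellipse_area(points):
    """Two-standard-deviation ellipse area: 4*pi*sqrt(det of the sample covariance)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    mz = sum(z for _, z in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    szz = sum((z - mz) ** 2 for _, z in points) / (n - 1)
    sxz = sum((x - mx) * (z - mz) for x, z in points) / (n - 1)
    return 4.0 * math.pi * math.sqrt(max(sxx * szz - sxz * sxz, 0.0))

def release_distance(pitches, near_ft=45, far_ft=65):
    """Distance (feet) on a one-inch grid in [near_ft, far_ft] that minimizes the area."""
    best_inches = min(
        range(near_ft * 12, far_ft * 12 + 1),
        key=lambda inches: ellipse_area(
            [position_at_plane(p, inches / 12.0) for p in pitches]
        ),
    )
    return best_inches / 12.0
```

Distances beyond 50 feet correspond to negative times, so the sketch is extrapolating the fitted trajectory backward from the PITCHf/x starting point.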

Earlier we assumed that the data at a fixed y-location was from a bivariate normal distribution. While this is a reasonable assumption, one can still run into difficulties with noisy/inaccurate data or multiple clusters. This can be for myriad reasons: in-season change in pitching mechanics, change in location on the pitching rubber, etc. Since data sets with these factors present will still produce results via the outlined algorithm despite violating our assumptions, the results may be spurious. To handle this, we will fit the data to a Gaussian mixture model via an incremental k-means algorithm at 55 feet. This will approximate the distribution of the data with a probability density function (pdf) that is the sum of k bivariate normal distributions, referred to as components, weighted by their contribution to the pdf, where the weights sum to unity. The number of components, k, is determined by the algorithm based on the distribution of the data.

With the mixture model in hand, we then are faced with how to assign each data point to a cluster. This is not so much a problem as a choice and there are a few reasonable ways to do it. In the process of determining the pdf, each data point is assigned a conditional probability that it belongs to each component. Based on these probabilities, we can assign each data point to a component, thus forming clusters (from here on, we will use the term “cluster” generically to refer to the number of components in the pdf as well as the groupings of data to simplify the terminology). The easiest way to assign the data would be to associate each point with the cluster that it has the highest probability of belonging to. We could then take the largest cluster and perform the analysis on it. However, this becomes troublesome for cases like overlapping clusters.

A better assumption would be that there is one dominant cluster and to treat the rest as “noise”. Then we would keep only the points that have at least a fixed probability of belonging to the dominant cluster, say five percent. This will throw away less data and fits better with the previous assumption of a single bivariate normal cluster. Both of these methods will also handle the problem of having disjoint clusters by choosing only the one with the most data. In demonstrating the algorithm, we will try these two methods for sorting the data as well as including all data, bivariate normal or not. We will also explore a temporal sorting of the data, as this may do a better job than spatial clustering and is much cheaper to perform.
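
To make the two sorting rules concrete, here is a minimal sketch that assumes the mixture model has already been fitted. The component format (a list of `(weight, mean, covariance)` tuples), the function names, and the toy numbers in the usage note are all assumptions for illustration; the fitting step itself is not shown.

```python
import math

def gaussian_pdf(point, mean, cov):
    """Density of a bivariate normal; cov is ((sxx, sxz), (sxz, szz))."""
    (sxx, sxz), (_, szz) = cov
    det = sxx * szz - sxz * sxz
    dx, dz = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T cov^{-1} d, written out for the 2x2 case.
    quad = (szz * dx * dx - 2.0 * sxz * dx * dz + sxx * dz * dz) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def responsibilities(point, components):
    """Conditional probability that `point` came from each (weight, mean, cov) component."""
    weighted = [w * gaussian_pdf(point, mean, cov) for w, mean, cov in components]
    total = sum(weighted)
    if total == 0.0:  # point is so far from every component that the densities underflow
        return [0.0] * len(components)
    return [v / total for v in weighted]

def select_points(points, components, method="five_percent", threshold=0.05):
    """Keep points by the 'most_likely' rule or the fixed-probability rule."""
    dominant = max(range(len(components)), key=lambda k: components[k][0])
    kept = []
    for p in points:
        r = responsibilities(p, components)
        if method == "most_likely":
            # Keep only points whose highest probability is for the dominant cluster.
            keep = r[dominant] > 0.0 and r[dominant] == max(r)
        else:
            # Keep points with at least `threshold` probability for the dominant cluster.
            keep = r[dominant] >= threshold
        if keep:
            kept.append(p)
    return kept
```

For example, with a dominant component of weight 0.7 centered at (0, 0) and a second component of weight 0.3 centered at (1, 0), both with variance 0.04 in each direction, a point at (0.55, 0) has roughly a 40% probability of belonging to the dominant cluster. The most-likely rule drops it, while the five-percent rule keeps it, which is the softer edge described above.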

To demonstrate this algorithm, we will choose three pitchers with unique data sets from the 2012 season and see how it performs on them: Clayton Kershaw, Lance Lynn, and Cole Hamels.

Case 1: Clayton Kershaw

Kershaw Clusters photo Kershaw_Clusters.jpeg

At 55 feet, the Gaussian mixture model identifies five clusters for Kershaw’s data. The green stars represent the center of each cluster and the red ellipses indicate two standard deviations from center along the principal axes. The largest cluster in this group has a weight of .64, meaning it accounts for 64% of the mixture model’s distribution. This is the cluster around the point (1.56,6.44). We will work off of this cluster and remove the data that has a low probability of coming from it. This will include dispensing with the sparse cluster to the upper-right and some data on the periphery of the main cluster. We can see how Kershaw’s clusters are generated by taking a rolling average of his pitch locations at 55 feet (the standard distance used for release points) over the course of 300 pitches (about three starts).

Kershaw Rolling Average photo Kershaw_Average.jpeg

The green square indicates the average of the first 300 pitches and the red the last 300. From the plot, we can see that Kershaw’s data at 55 feet has very little variation in the vertical direction but, over the course of the season, drifts about 0.4 feet with a large part of the rolling average living between 1.5 and 1.6 feet (measured from the center of home plate). For future reference, we will define a “move” of release point as a 9-inch change in consecutive, disjoint 300-pitch averages (this is the “0 Moves” that shows up in the title of the plot and would have been denoted by a blue square in the plot). The choices of 300 pitches and 9 inches for a move were made to provide a large enough sample and enough distance for the clusters to be noticeably disjoint, but one could choose, for example, 100 pitches and 6 inches or any other reasonable values. So, we can conclude that Kershaw never made a significant change in his release point during 2012 and therefore treating the data as a single cluster is justifiable.
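
One way the “move” definition could be turned into code is sketched below. This is an assumption about the implementation rather than a description of it: at every pitch index the sketch compares the average of the previous 300 pitches with the average of the next 300, treats the 9-inch change as a straight-line distance in the (x, z) plane, and reports one index (the largest gap) for each stretch of pitches where the gap reaches the threshold. The function names are hypothetical.

```python
import math

def prefix_sums(values):
    """Running totals, so any window sum is a difference of two entries."""
    sums = [0.0]
    for v in values:
        sums.append(sums[-1] + v)
    return sums

def find_moves(points, window=300, threshold_ft=0.75):
    """Indices i where the means of points[i-window:i] and points[i:i+window]
    differ by at least threshold_ft; one index (the peak) per contiguous run."""
    px = prefix_sums([x for x, _ in points])
    pz = prefix_sums([z for _, z in points])
    moves, run_best = [], None  # run_best holds (gap, index) for the current run
    for i in range(window, len(points) - window + 1):
        dx = ((px[i + window] - px[i]) - (px[i] - px[i - window])) / window
        dz = ((pz[i + window] - pz[i]) - (pz[i] - pz[i - window])) / window
        gap = math.hypot(dx, dz)
        if gap >= threshold_ft:
            if run_best is None or gap > run_best[0]:
                run_best = (gap, i)
        elif run_best is not None:
            moves.append(run_best[1])
            run_best = None
    if run_best is not None:
        moves.append(run_best[1])
    return moves
```

Points are expected in chronological order as (x, z) pairs in feet at 55 feet from home plate. A pitcher whose averages never differ by 9 inches returns an empty list, matching the “0 Moves” case.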

From the spatial clustering results, the first way we will clean up the data set is to take only the data which is most likely from the dominant cluster (based on the conditional probabilities from the clustering algorithm). We can then take this data and approximate the release point distance via the previously discussed algorithm. The release point for this set is estimated at 54 feet, 5 inches. We can also estimate the arm release angle, the angle a pitcher’s arm would make with a horizontal line when viewed from the catcher’s perspective (0 degrees would be a sidearm delivery and would increase as the arm was raised, up to 90 degrees). This can be accomplished by taking the angle of the eigenvector, from horizontal, which corresponds to the smaller variance. This is working under the assumption that a pitcher’s release point will vary more perpendicular to the arm than parallel to the arm. In this case, the arm angle is estimated at 90 degrees. This is likely because we have blunted the edges of the cluster too much, making it closer to circular than the original data. The blunting happens because the clusters to the left and right of the dominant cluster do not contribute data. It is obvious that this way of sorting the data has the problem of creating sharp transitions at the edge of the cluster.
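
The arm angle calculation can be sketched with the closed-form orientation of a 2×2 covariance ellipse. The function name is hypothetical, and folding the result into the 0 to 90 degree range (so that left- and right-handed deliveries read the same way) is an assumption made for this example.

```python
import math

def arm_angle_degrees(sxx, sxz, szz):
    """Angle from horizontal (0 to 90 degrees) of the covariance eigenvector
    with the smaller variance, for the matrix [[sxx, sxz], [sxz, szz]]."""
    # Orientation of the major (larger-variance) axis, in (-90, 90] degrees.
    major = math.degrees(0.5 * math.atan2(2.0 * sxz, sxx - szz))
    # The minor axis is perpendicular to the major axis.
    minor = major + 90.0
    # Fold into [0, 90]: only the acute angle with the horizontal matters here.
    return 180.0 - minor if minor > 90.0 else minor
```

A cluster that varies mostly horizontally (say `sxx = 1.0`, `sxz = 0.0`, `szz = 0.1`) gives 90 degrees, and one that is tilted 45 degrees gives 45. A perfectly circular cluster has no preferred axis; in that case `atan2(0, 0)` returns 0 in Python and the sketch reports 90 degrees, so the angle for nearly circular data should be read with caution.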

Kershaw Most Likely photo Kershaw_Likely_Final.jpeg

As discussed above, we run the algorithm from 45 to 65 feet, in one-inch increments, and find the location corresponding to the smallest ellipse. We can look at the functional that tracks the area of the ellipses at different distances in the aforementioned case.

Kershaw Most Likely Functional photo Kershaw_Likely_Fcn.jpeg

This area method produces a functional (in our case, it has been discretized to each inch) that can be minimized easily. It is clear from the plot that the minimum occurs at slightly less than 55 feet. Since all of the plots for the functional essentially look parabolic, we will forgo any future plots of this nature.

The next method is to assume that the data is all from one cluster and remove any data points that have a lower than five-percent probability of coming from the dominant cluster. This produces slightly better visual results.

Kershaw Five Percent photo Kershaw_Five_Pct_Final.jpeg

For this choice, we get trimming away at the edges, but it is not as extreme as in the previous case. The release point is at 54 feet, 3 inches, which is very close to our previous estimate. The arm angle is more realistic, since we maintain the elliptical shape of the data, at 82 degrees.

Kershaw Original photo Kershaw_Orig_Final.jpeg

Finally, we will run the algorithm with the data as-is. We get an ellipse that fits the original data well and indicates a release point of 54 feet, 9 inches. The arm angle, for the original data set, is 79 degrees.

Examining the results, the original data set may be the one of choice for running the algorithm. The shape of the data is already elliptic and, for all intents and purposes, one cluster. However, one may still want to manually remove the handful of outliers before performing the estimation.

Case 2: Lance Lynn

Clayton Kershaw’s data set is much cleaner than most, consisting of a single cluster and a few outliers. Lance Lynn’s data has a different structure.

Lynn Clusters photo Lynn_Clusters.jpeg

The algorithm produces three clusters, two of which share some overlap and the third disjoint from the others. Immediately, it is obvious that running the algorithm on the original data will not produce good results because we do not have a single cluster like with Kershaw. One of our other choices will likely do better. Looking at the rolling average of release points, we can get an idea of what is going on with the data set.

Lynn Rolling Average photo Lynn_Average.jpeg

From the rolling average, we see that Lynn’s release point started around -2.3 feet, jumped to -3.4 feet and moved back to -2.3 feet. The moves discussed in the Kershaw section of 9 inches over consecutive, disjoint 300-pitch sequences are indicated by the two blue squares. So around Pitch #1518, Lynn moved about a foot to the left (from the catcher’s perspective) and later moved back, around Pitch #2239. So it makes sense that Lynn might have three clusters since there were two moves. However his first and third clusters could be considered the same since they are very similar in spatial location.

Lynn’s dominant cluster is the middle one, accounting for about 48% of the distribution. Running any sort of analysis on this will likely draw data from the right cluster as well. First up is the most-likely method:

Lynn Most Likely photo Lynn_Likely_Final.jpeg

Since we have two clusters that overlap, this method sharply cuts the data on the right hand side. The release point is at 54 feet, 4 inches and the release angle is 33 degrees. For the five-percent method, the cluster will be better shaped since the transition between clusters will not be so sharp.

Lynn Five Percent photo Lynn_Five_Pct_Final.jpeg

This produces a well-shaped single cluster which is free of all of the data on the left and some of the data from the far right cluster. The release point is at 53 feet, 11 inches and at an angle of 49 degrees.

As opposed to Kershaw, who had a single cluster, Lynn has at least two clusters. Therefore, running this method on the original data set probably will not fare well.

Lynn Original photo Lynn_Orig_Final.jpeg

Having more than one cluster and analyzing it as only one causes a problem with both the release point and the release angle. Since the data has disjoint clusters, it violates our bivariate normal assumption. Also, the angle will likely be incorrect since the ellipse will not properly fit the data (in this instance, it is 82 degrees). Note that the release point distance is not in line with the estimates from the other two methods, being 51 feet, 5 inches instead of around 54 feet.

In this case, as opposed to Kershaw, who only had one pitch cluster, we can temporally sort the data based on the rolling average at the blue square (where the largest difference between the consecutive rolling averages is located).

Lynn Time Clusters photo Lynn_Time_Clusters.jpeg

Since there are two moves in release point, this generates three clusters, two of which overlap, as expected from the analysis of the rolling averages. As before, we can work with the dominant cluster, which is the red data. We will refer to this as the largest method, since it is the largest in terms of number of data points. Note that with spatial clustering, we would pick up some of the green and red data in the dominant cluster. Running the same algorithm for finding the release point distance and angle, we get:

Lynn Largest photo Lynn_Large_Final.jpeg

The distance from home plate of 53 feet, 9 inches matches our other estimates of about 54 feet. The angle in this case is 55 degrees, which is also in agreement. To finish our case study, we will look at another data set that has more than one cluster.

Case 3: Cole Hamels

Hamels Clusters photo Hamels_Clusters.jpeg

For Cole Hamels, we get two dense clusters and two sparse clusters. The two dense clusters appear to have a similar shape and one is shifted a little over a foot away from the other. The middle of the three consecutive clusters only accounts for 14% of the distribution and the long cluster running diagonally through the graph is mostly picking up the handful of outliers, and consists of less than 1% of the distribution. We will work with the cluster with the largest weight, about 0.48, which is the cluster on the far right. If we look at the rolling average for Hamels’ release point, we can see that he switched his release point somewhere around Pitch #1359 last season.

Hamels Rolling Average photo Hamels_Average.jpeg

As in the clustered data, Hamels’ release point moves horizontally by just over a foot to the right during the season. As before, we will start by taking only the data which most likely belongs to the cluster on the right.

Hamels Most Likely photo Hamels_Likely_Final.jpeg

The release point distance is estimated at 52 feet, 11 inches using this method. In this case, the release angle is approximately 71 degrees. Note that on the top and the left the data has been noticeably trimmed away due to assigning data to the most likely cluster. The five-percent method produces:

Hamels Five Percent photo Hamels_Five_Pct_Final.jpeg

For this method of sorting through the data, we get 52 feet, 10 inches for the release point distance. The cluster has a better shape than the most-likely method and gives a release angle of 74 degrees. So far, both estimates are very close. Using just the original data set, we expect that the method will not perform well because there are two disjoint clusters.

Hamels Original photo Hamels_Orig_Final.jpeg

We run into the problem of treating two clusters as one and the angle of release goes to 89 degrees since both clusters are at about the same vertical level and therefore there is a large variation in the data horizontally.

Just like with Lance Lynn, we can do a temporal splitting of the data. In this case, we get two clusters since he changed his release point once.

Hamels Time Clusters photo Hamels_Time_Clusters.jpeg

Working with the dominant cluster, the blue data, we obtain a release point at 53 feet, 2 inches and a release angle of 75 degrees.

Hamels Largest photo Hamels_Large_Final.jpeg

All three methods that sort the data before performing the algorithm lead to similar results.

Conclusions:

Examining the results of these three cases, we can draw a few conclusions. First, regardless of the accuracy of the method, it does produce results within the realm of possibility. We do not get release point distances that are at the boundary of our search space of 45 to 65 feet, or something that would definitely be incorrect, such as 60 feet. So while these release point distances have some error in them, this algorithm can likely be refined to be more accurate. Another interesting result is that, provided that the data is predominantly one cluster, the results do not change dramatically due to how we remove outliers or smaller additional clusters. In most cases, the change is only a few inches. For the release angles, the five-percent method or largest method probably produces the best results because it does not misshape the clusters like the most-likely method does and does not run into the problem of multiple clusters that may plague the original data. Overall, the five-percent method is probably the best bet for running the algorithm and getting decent results for cases of repeated clusters (Lance Lynn) and the largest method will work best for disjoint clusters (Cole Hamels). If just one cluster exists, then working with the original data would seem preferable (Clayton Kershaw).

Moving forward, the goal is to settle on a single method for sorting the data before running the algorithm. The largest method seems the best choice for a robust algorithm since it is inexpensive and, based on limited results, performs on par with the best spatial clustering methods. One problem that comes up in running the simulations that does not show up in the data is the cost of the clustering algorithm. Since the method for finding the clusters is incremental, it can be slow, depending on the number of clusters. One must also iterate to find the covariance matrices and weights for each cluster, which can also be expensive. In addition, the spatial clustering only has the advantages of removing outliers and maintaining repeated clusters, as in Lance Lynn’s case. Given the difference in run time, a few seconds for temporal splitting versus a few hours for spatial clustering, it seems a small price to pay. There are also other approaches that can be taken. The data could be broken down by start and sorted that way as well, with some criteria assigned to determine when data from two starts belong to the same cluster.
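
Part of the appeal of the largest method is how little code the temporal split needs. A minimal sketch, assuming the pitch indices at which a move occurs are already known (the function names are hypothetical):

```python
def split_at_moves(points, move_indices):
    """Split chronologically ordered points into segments at the given move indices."""
    bounds = [0] + sorted(move_indices) + [len(points)]
    return [points[a:b] for a, b in zip(bounds, bounds[1:])]

def largest_segment(points, move_indices):
    """Return the temporal segment containing the most pitches."""
    return max(split_at_moves(points, move_indices), key=len)
```

This sketch does not merge segments that return to an earlier release point, so a repeated cluster would be treated as separate segments unless an extra merging step is added.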

Another problem exists that we may not be able to account for. Since the data for the path of a pitch starts at 50 feet and is for tracking the pitch toward home plate, we are essentially extrapolating to get the position of the pitch before (for larger values than) 50 feet. While this may hold for a small distance, we do not know exactly how far this trajectory is correct. The location of the pitch prior to its individual release point, which we may not know, is essentially hypothetical data since the pitch never existed at that distance from home plate. This is why it might be important to get a good estimate of a pitcher’s release point distance.

There are certainly many other ways to go about estimating release point distance, such as other ways to judge “closeness” of the pitches or sort the data. By mathematizing the problem, and depending on the implementation choices, we have a means to find a distinct release point distance. This is a first attempt at solving this problem which shows some potential. The goal now is to refine it and make it more robust.

Once the algorithm is finalized, it would be interesting to go through video and see how well the results match reality, in terms of release point distance and angle. As it is, we are essentially operating blind since we are using nothing but the PITCHf/x data and some reasonable assumptions. While this worked to produce decent results, it would be best to create a single, robust algorithm that does not require visual inspection of the data for each case. When that is completed, we could then run the algorithm on a large sample of pitchers and compare the results.


Community “Research”: Team COOL Scores

The following is, more or less, useless. It’s meant to be NotGraphsian more than FanGraphsian. It’s meant to be fun, if your definition of fun involves parodying something that’s already incredibly niche (NERD). It’s like if you time travelled to ancient Phoenicia and saw a minstrel play acting as a Hittite. That might not make sense. You will find that COOL does not make much sense in general. Just enough to make you wonder.

COOL scores are to the uninitiated baseball fan as NERD scores are to the statistically-minded baseball fan. They serve a purpose at opposite tails of a made-up bell curve, one with COOL at the tail representing the least baseballsy people and NERD at the other tail for wannabe sabermetricians. NERD is meant for the aspiring baseball savant and COOL is meant for the unaware baseball ignoramus. Someone who’d rather be playing Call of Duty, doing their nails, or eating at Sbarro than watching baseball.

But why have COOL scores at all? What use are they? Well, as baseball zealots it’s our job to brazenly preach our zeal to the unenlightened. Our joy cannot be contained, our cup overfloweth, our fountain runneth over, we are rivers of joy, etc. But our wives, girlfriends, loser younger brothers, and hip co-workers don’t listen to us. Instead they maim our reputations with insults like “nerd”, “loser”, and “wastrel.” Which is why we must resort to craftiness. We must become the Jamie Moyers of proselytism, precisely throwing junk on the corners of life’s strike zone, hoping our feeble heaters and lazy curves are received and not pummeled. All we want is for people to see beauty in the competitive handling of balls on a field (ahem). So as crafty lefties or crafty righties (some of us may be Moyer, others Livan Hernandez), we can use all the tools we can get. COOL is one such tool. It can work like this:

Nerdlet van Nerdinger: Salutations, Cooldred Coolson!

Cooldred Coolson: Hey, nerd.

NvN: Would you love to join me for a baseball viewing?

CC: No.

NvN: But I have a pseudo-scientific way of determining that it might be fun!

CC: Did you say science? I totes trust that shit.

NvN: Great!

CC: Zowie! I can’t wait for homerz, hottiez, and giant racing weinerz!

NvN: And I can’t wait to foster companionship/copulate with you!

There ya go. Sorkin-esque dialogue. Not that we, the baseball loving community, are friendless poon-hounds. I’m just talking about tools, here. Tools at our disposal, like Custom Leaderboards, a wrench, or a Desert Eagle .50.

La-dee-da. COOL stands for the Coefficient Of On-field Lustre. Or how likely it is for a non-fan to think, more or less, “Ooo! Shiny!” when watching the game. The fact that this number isn’t technically a coefficient is not a thing I want to address or think about.  These are the components of COOL, and how they are determined:

TV Announcer Charisma

The Cooldred Coolsons of the world never listen to the radio. Otherwise Bob Uecker alone could swell the baseball fanbase to billions in seconds (seconds!). Alas, holding the attention of a baseball mongrel requires Visual Stimuli, accompanied by Aural Pleasantries. This is why TV Announcer Charisma is included in COOL. To determine this variable, I took Charisma scores from the Broadcast Rankings, and finagled the z-score of each team’s home announcer. I multiplied this factor by 1.5 because: Science.

Variable: zCHAR*1.5

Lineup Attractiveness and/or Virility and/or Youth and/or Sexiness

There is something unbelievably compelling about watching a fine human being being fine, and human. I’m not even talking about sex, though sometimes that’s compelling, too. Watching beautiful people being beautiful is mesmerizing. Unfortunately there’s no easy way to rate the attractiveness of whole teams. One method I considered was using Amazon’s Mechanical Turk to crowdsource ratings of individual players’ headshots. People (Turks, perhaps) would simply rate the face as “attractive” or “not attractive,” and after a few thousand responses we’d have a good idea if a player was good looking. Alas, this was too much work and required money. Instead I took a massive shortcut and figured that, in general, youth=attractiveness, sorted all teams by age, rewarded young teams, and penalized old teams. I divided it in half because my methodology is shitty.

Variable: zSEX/2

Uniform Appeal

What people are wearing while they play sports appears to be very important to my mother. She frequently comments on the “get up” of athletes, while I frequently comment on the “get out” of a fly ball, while you are probably contemplating a “get the f— out” at this stupid article. The outward aesthetics of baseball are hugely important to the uninitiated. As nice as it is to look upon a beautiful human in the buff, even a properly adorned Tom Gorzelanny can hold the eye and make it tremble (with desire, not nystagmus). So to determine the Objective Beauty of a team’s uniform, I took nine 2013 uniform rankings that I found online (science!) created by people of varying bias and credential (Jim Caple, myself, user pittsburghsport16 on sportslogos.net, etc.), averaged the rankings, assumed a normal distribution and pooped out z-scores for each team’s uniform appeal. Simple, easy, and deeply flawed. I multiplied uniform appeal by 2 because my mother holds great sway in the way I form opinions/conduct science.

Variable: zUNI*2

Home Runs

Home runs are the most easily understood event in baseball. Anyone can understand a home run and appreciate it. Home runs are great. They are saffron. They are sex. They are Super Saiyan. I used team HR% for this one. It’s not park adjusted because I am simple, and don’t know how to do that. It’s also accounted for in PARK, which is next. I briefly wondered if I should have used team HR/FB, but I’m betting it would give me a similar result. I also briefly considered halving the zHR% value because while HRs are great, they’re not altogether that common, and hinging your crude buddy’s enjoyment on the doorframe of dingerdom… well that’d be foolish. Better to hinge it on something more reliable, like what people are wearing. Science. But that made the end values less pretty so it remains whole.

Variable: zHR%

Ballpark Appeal

Where a team plays matters. To us it matters because where a park is and how it’s arranged can greatly affect the way baseball happens. To them it matters because they might see people running at full speed dressed as giant pierogies. Baseball is wonderful. I took the average Yelp ratings of each ballpark from Nate Silver’s 2011 article on ballparks, then upgraded the Marlins (based on my own subjective approval of the home run monstrosity in their new park), scaled the scores from 0-2, and then multiplied them by average %attendance to reward well-attended parks, and by each park’s 2013 HR park factor because: I’ve already covered this. Fun!

Variable: PARK

The Invisible, the Intangible, the Unknown, the Ghost in the Fandom Machine

Sometimes something unknowable seems to drive the affection of the masses. Often it’s success, or tragedy, or beauty, or infamy. Sometimes people just love things. Like screaming goats. I wanted to isolate the je ne sais quoi of team appeal, and decided a team’s road attendance best approximated their enigmatic allure. And apparently the Giants are just dripping with Mystery Honey, drawing fans like bees to their away games across the country. Is it because they play in a well-attended division? Because they won the World Series? Because they score runs? Because people still think Barry Bonds is around to boo? Possibly. But I’m not one to dig too hard for the truth. After all, I created COOL scores. This variable is merely, mightily, the z-score of %attendance at road games.

Variable: z???

This is the final formula:

(zSEX/2) + (zCHAR*1.5) + (zUNI*2) + zHR% + PARK + z??? + Constant

The constant ensures an average score of 5. I refused to floor/ceiling the scores at 0 and 10 because I’m not entirely a plagiarist of NERD, and feel like this can be one, small, passive-aggressive way I can assert myself. Also laziness.
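
For anyone who wants to poke at the arithmetic, here is a minimal sketch of the formula, with the constant chosen so the average score comes out to 5. The function names are made up for illustration. The three sample rows are component values from the leaderboard, so the constant printed here averages only those three teams and will not match the one used for the full league.

```python
def cool_raw(z_char, z_age, z_hr, z_uni, park, z_road):
    """Weighted sum of the six components, before the constant is added."""
    return (z_age / 2) + (z_char * 1.5) + (z_uni * 2) + z_hr + park + z_road


def cool_scores(components, target_mean=5.0):
    """Shift every raw score by one constant so the mean equals target_mean."""
    raw = {team: cool_raw(*values) for team, values in components.items()}
    constant = target_mean - sum(raw.values()) / len(raw)
    scores = {team: value + constant for team, value in raw.items()}
    return scores, constant


# Order: z-charisma, z-age, z-HR%, z-unirank, PARK, z-???
SAMPLE = {
    "Dodgers": (2.26, -1.63, -1.07, 1.55, 0.65, 1.17),
    "Rays": (-0.03, -0.52, 0.73, -1.29, 0.10, -1.57),
    "Marlins": (-0.18, 0.59, -2.21, -1.20, 0.62, -1.37),
}

if __name__ == "__main__":
    scores, constant = cool_scores(SAMPLE)
    print(f"constant for this three-team sample: {constant:.3f}")
    for team, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{team:8s} {score:6.2f}")
```

Because the constant is just a shift, it changes nobody's rank. It only moves the whole league up or down the scale.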

The COOL Leaderboard

Team COOL z-charisma z-age z-HR% z-unirank PARK z-???
Dodgers 10.59 2.26 -1.63 -1.07 1.55 0.65 1.17
Red Sox 9.51 1.04 -0.52 0.16 0.52 1.51 1.32
Mets 9.47 1.96 0.59 -0.09 0.92 0.68 -0.35
Giants 8.86 2.11 -0.52 -1.79 0.51 1.38 1.18
Orioles 8.53 0.12 0.59 1.96 0.73 1.2 -0.73
Cardinals 8.15 -1.41 1.7 -0.69 1.82 1.55 0.74
Cubs 7.53 0.58 -0.52 0.43 0.07 1.35 0.83
Tigers 7.4 0.43 -1.63 -0.01 0.7 1 1.02
Yankees 6.8 -0.79 -1.63 0.34 1.4 0.89 0.61
Athletics 6.39 0.12 0.59 0.02 1.23 0.06 -0.79
Reds 6.32 -0.34 0.59 0.08 0.25 1.07 0.72
Pirates 5.87 -0.34 0.59 0.12 0.33 0.71 0.43
Twins 5.77 0.28 0.59 -0.55 -0.12 1.14 0.55
Blue Jays 5.72 -0.79 -1.63 1.54 1.37 0 -0.72
Braves 5.58 -0.79 0.59 1.27 0.43 0.49 -0.29
Angels 5.4 0.12 -0.52 0.19 -0.1 0.72 0.62
Phillies 5.39 -1.1 -1.63 -0.09 0.34 2.2 0.89
Astros 5.11 1.5 1.7 0.07 -0.63 0.72 -1.67
Rangers 4.76 -0.49 0.59 1.09 -0.76 1.02 0.45
Brewers 4.73 0.43 0.59 -0.06 -1.21 1.48 0.63
Nationals 3.35 -0.79 1.7 -0.37 -0.6 0.4 0.71
Rockies 3.05 -1.1 0.59 1.2 -1.26 0.95 0.62
Indians 2.36 -0.64 -0.52 0.71 -1.26 0.54 0.69
Mariners 1.29 -0.03 -0.52 0.67 -0.75 0.43 -2.16
Royals 1.27 -0.34 0.59 -2.37 -0.07 0.64 -0.82
Padres 0.99 0.12 0.59 -0.26 -1.72 0.65 -0.59
White Sox 0.61 -1.87 -0.52 -0.09 0.33 0.59 -1.65
Rays 0.54 -0.03 -0.52 0.73 -1.29 0.1 -1.57
Diamondbacks -0.19 -0.03 -0.52 -0.94 -1.52 0.41 -0.49
Marlins -1.17 -0.18 0.59 -2.21 -1.2 0.62 -1.37

It’s the Los Angeles Yasiel Puigs at the top! Page views! Interestingly, the Rays are beloved by NERD (a 10!) but hated by COOL with a 0.54. That seems true to life. And everyone hates the Marlins (0 NERD, -1.17 COOL). So: this measure passes my smell test. But I have a terrible sense of smell due to allergies. So use your own noses.

Of course COOL is in its infancy. It’s zygotic, even. If my “research” is accepted, there will be time for revisions. I also have a Pitcher COOL score in the works, and there will be an umpire strike call flamboyance factor that can help us calculate game scores.

Despite numerous flaws, I still get the sense that COOL is telling us something. Even if that something is completely useless. Which was the point of this whole exercise from the beginning: To create a watchability measure for the people least likely to ever visit FanGraphs. Useless.

Finally, COOL is entirely inspired by Carson Cistulli’s work on NERD, obviously, without which I am a lost, vagrant, nothing–a malodorous abyss, obviously.

That’s it. Go resume Life.


Cooperstown and Tom Glavine Just Don’t Mix

Normally, I wouldn’t even address a pitcher’s won/loss record.  Won/loss records aren’t useless, they aren’t irrelevant, but they are something that should be overlooked when evaluating a player’s performance.  Front offices don’t look at a pitcher’s wins and losses, so why should we?  Exactly.  They should be nothing more than a fun little stat to add to all the other fun little stats that have use, but are closer to useless than practical.

But 305 wins for a pitcher, well that’s extraordinary.  But an extraordinary number doesn’t necessarily translate into extraordinary performance.

The 305 wins (and 203 losses) HAVE to be looked at, and addressed.  Because in 2014 when Tom Glavine is considered for induction into baseball’s most prestigious sanctuary, those 305 wins are going to be discussed, frequently.  Very frequently.  Nearly every old-school writer, former player and most fans of Glavine’s era, are going to be backing him up, using that number: The number 305.

Just to delve into wins and losses for a second if you happen to have come across this article in an old-school mindset:

A pitcher controls less than half of the outcome of a baseball game.  The offense controls 50 percent.  The fielders control some.  And we can add in that a manager affects some of the game too, we just don’t know how much.  So we will just use a manager’s impact, whatever it may be, and include that in the production of the offense, pitching and defense.

So you can see there why wins and losses should not be looked at when determining the quality of a pitcher.

So what is it that makes a Hall of Famer?  Greatness.  Yes, simply put, greatness makes a Hall of Fame player.  They do great things on a baseball field, for a long enough period of time, to allow us as critics to say, “Wow, that guy was a great player.”  A player can actually go through his career without being exceptional at any one aspect of his game, yet still be an exceptional player, a Hall of Fame player, a great player.

Yet, when it comes to pitchers, the guy kinda has to be great at pitching.  Because pitcher fielding is nearly useless.  And a pitcher’s bat is normally about the equivalent of Jeff Francoeur’s swings against sliders out of the strike zone.

Bad.

Tom Glavine was a very good pitcher.  He accumulated 63 fWAR in his career, 74 bWAR, 118 ERA+, 3.54 base ERA.  Very, very good pitcher.  His WAR totals are right in that threshold where Hall of Famers “on the brink” usually sit.  Players that could be looking in, or looking out, based on a little subjectivity and bias from the writers who induct these guys.

But Tom Glavine had a 3.95 FIP.  And if you believe in FIP, that’s not great.  He pitched in the National League, so that FIP includes the pitchers he faced, who are easier to strike out, less likely to walk, and extremely unlikely to go deep.

Two times in Glavine’s career, he struck out more than seven batters per nine innings.  He kept his walks under control, walking 3 per nine throughout his career.  But that’s not “exceptional.”  Neither that nor his strikeouts per nine innings are.
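
For readers in that old-school mindset, here is a minimal sketch of how these rate stats are usually computed. It uses the commonly cited form of FIP, (13*HR + 3*(BB + HBP) - 2*K) / IP plus a league constant. The constant shifts a little every season, so the 3.10 default below is only an assumed round number, and the stat line in the example is hypothetical, not Glavine’s.

```python
def per_nine(count, innings):
    """Scale a counting stat (strikeouts, walks) to a per-nine-innings rate."""
    return 9.0 * count / innings


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching in its commonly cited form.

    The constant is set each season so league FIP matches league ERA;
    3.10 here is an assumed placeholder, not a value for any given year.
    """
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant


if __name__ == "__main__":
    # Hypothetical season line, for illustration only.
    hr, bb, hbp, k, ip = 20, 60, 5, 150, 200.0
    print(f"K/9  = {per_nine(k, ip):.2f}")
    print(f"BB/9 = {per_nine(bb, ip):.2f}")
    print(f"FIP  = {fip(hr, bb, hbp, k, ip):.2f}")
```

Only home runs, walks, hit batters, and strikeouts enter the formula, which is why facing opposing pitchers at the plate flatters a National League FIP.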

Glavine won two Cy Youngs, and finished in the top-five in voting six! times.  Remarkable, yet equated to the subjective.  I’m not saying he didn’t deserve those awards, I’m just saying that a lot of noise goes into the process of who receives the award.

Dwight Evans was a very good baseball player.  One of the better defenders at the corner and well above average offensively.

Orel Hershiser racked up 204 wins in his career and once went 59 consecutive innings without allowing a run.

As for Tom Glavine, he pitched very well, for a long, long time, on one of the greatest runs by an organization that any sport has ever seen.  He made it to the postseason several times because of his talent and that of his supporting cast.  And during his time in October, he performed incredibly well.  To the tune of a 3.30 ERA in 218 innings.  And that probably meant his opponents were better offenses than the average ones he faced in the regular season, given that they were good enough to qualify for postseason play.

But listen to some of the deserving names for the potential 2014 Hall of Fame ballot:

Craig Biggio, Jeff Bagwell, Mike Piazza, Tim Raines, Curt Schilling, Roger Clemens, Barry Bonds, Edgar Martinez, Alan Trammell, Mark McGwire, Frank Thomas, Mike Mussina and Jeff Kent.

Then you have a few outsiders that aren’t quite in the same caliber: Sammy Sosa, Jack Morris, Rafael Palmeiro, etc.

There are so many more deserving players than Glavine in next year’s class.  But there are clouds overhead with many of them.  And Glavine doesn’t have a cloud following him around wherever he goes.

I expect Glavine to get voted in:  305 wins.  No storm-cloud.  Played for a great, winning organization.  Seemed to be well-liked by anyone that came across him.  Or at least I know of no incidents surrounding him.

This will be why Tom Glavine gets into the Hall of Fame.  Because of very good pitching, along with very well-known variables by anyone that knows anything about Tom Glavine.

But I don’t think he should be inducted.  He was never an exceptional pitcher.  It wouldn’t be an egregious decision by any means.  And he wouldn’t be the worst player in the Hall of Fame.

But the most exceptional thing about Tom Glavine’s career was that he, or anyone for that matter, could pitch that well, for that long.


Breaking Down the Swing: Best Hitters of 2012

Often players will be credited for being very efficient with their swings, or evaluators and coaches will praise a hitter for having tremendous bat speed.  Those who work with hitters and study the art of hitting on a regular basis know that it takes a lot more than being a good athlete or having fast hands to be a successful hitter.  I myself work with many amateur hitters at Carmen Fusco’s Pro Baseball & Softball Academy in New Cumberland, PA.  We use video analysis as an integral part of the learning process, and I spend many hours outside of work devoted to breaking down MLB, MiLB, and draft-eligible players’ swings and pitching deliveries.  In this study I have conducted, I wanted to collect data regarding the best Major League hitters’ swings to discern what actually matters and is worth commenting on from a mechanical perspective of a hitter.

Going into this project, I wanted it to be primarily a data-driven approach to what players do in the batter’s box.  This is a study of hitters’ mechanics at the Major League level, hopefully useful in producing predictive or at least somewhat comparative parameters to be applied to unproven professional or amateur players.  Many criticisms and compliments get heaped on hitters for how their swings work and the correlation to big league success.  However, I have not seen many of these thoughts backed up with hard evidence as proof or even fact-based suggestions that they are truly instrumental to a player’s results on the field.  I will mix in many of my own thoughts here and there as well, but this is meant to be used as an objective analysis of hitters’ mechanical processes.


Rebuilding on a Crash Diet: The Brewers and a Calamitous May

To describe May, 2013 as an awful month for the Milwaukee Brewers would not do it justice.

In fact, the Brewers were downright putrid, winning only six games the entire month.  Their record in May was so bad (6-22) that it tied the worst month in franchise history: the August turned out by the 1969 Seattle Pilots, who ended the following season in bankruptcy, followed by a permanent road trip to become the Milwaukee Brewers.

The Brewers ended the month of April only a half game out of first place.  The Brewers ended the month of May 15 games behind the St. Louis Cardinals, managing the impressive feat of losing 14.5 games in the standings in one month.  Now that is a tailspin.

CoolStandings.Com currently gives the Brewers a 1 in 250 chance of making even the wild-card play-in game.  GM Doug Melvin admitted there is no chance the Brewers will be buyers this year at the trade deadline.  Rather, they will either be in a sell mode, seeking high-ceiling prospects a few years away, or keeping the assets they have, presumably only if they cannot get anything in return.  In short, the Brewers are suddenly rebuilding, and are focusing on stocking up their farm system and developing controllable rotation talent.

But, rebuilding is a complicated topic in small markets like Milwaukee.  As Wendy Thurm has noted, the Brewers, with their limited geographic reach, have one of the smallest television contracts in the league.  Thus, the Brewers rely upon strong attendance to deliver profits for Mark Attanasio and his ownership group.  In recent years, the Brewers’ attendance fortunately has been some of the most impressive in baseball, particularly in comparison to the size of the Milwaukee metropolitan area.  Over the last five years, the Brewers have consistently approached or exceeded three million fans, despite challenging economic times.  So, one thing the Brewers cannot afford is a collapse akin to the mere 1.7 million fans they drew in 2003 during a terrible season — not if they want to make the investments in future talent required to make the franchise a perennial contender.

So, the Brewers face an obvious challenge: the team needs to lose enough games to obtain a prime draft position, and thereby maximize its chances to draft a top-ceiling player with minimum bust potential.  At the same time, the Brewers need to avoid losing in any drawn-out fashion, because a corresponding and sustained decline in attendance could hemorrhage desperately-needed cash from their balance sheet.  As Ryan Topp and others have argued, this need to maintain attendance in the short term seems to be one reason why the Brewers have systematically traded away what previously was an excellent farm system, with the apparent goal of maintaining the aura of a competitive team.

How does one navigate this problem?  Well, the best solution could be to experience a May like the Brewers just suffered.  Doing so addresses two problems: (1) it abruptly puts the team on course to get a top 5 draft pick, and (2) it achieves this result so abruptly, and in this case so early in the season, that the fan base can still, at least in theory, enjoy much-improved baseball for the remainder of the season without jeopardizing that draft slot.  In short, when you can take your medicine over the course of one month, instead of over an entire season, you really ought to do it.

As to the draft:

Thanks to May, the Brewers currently have the fifth-worst record in baseball at 23–37.  As of the morning of June 8, 2013, FanGraphs predicted that the Brewers will end the season tied for baseball’s fourth-worst record with the New York Mets at 73–89.  Provided that 2013’s top five draft picks all reach agreement with their teams, the Brewers are on pace for a top-5 draft slot in 2014.

The Brewers have not had a top-5 pick in the Rule 4 draft since 2005, when they picked some guy named Ryan Braun.  Before 2013, the top five slots in the draft provided, among others, Buster Posey (#5, 2008), Stephen Strasburg (#1, 2009), Manny Machado (#3, 2010), Dylan Bundy (#4, 2011), and Byron Buxton (#2, 2012) — the types of superstar prospects the Brewers have been denied for years, and which they need to anchor their next generation of players.  At the end of April, and before May occurred, the Brewers were on track for yet another mid-round pick slot.

As to the rest of the season:

It is unlikely that the Brewers will continue to suffer the combination of injuries and dreadful rotation pitching that helped ruin their May.  FanGraphs seems to agree, predicting that the current Brewers roster (or something like it) will essentially play .500 baseball for the rest of the season, even while maintaining one of the five worst records in the game.

Average baseball is not contending baseball, but average baseball at least would offer Brewers fans — already pleased with Miller Park’s immunity from rain delays — a reasonable likelihood of seeing a win on any given day.  In 2009, the Brewers were able to bring in over three million fans, despite finishing under .500 overall.  In 2010, the Brewers ended up eight games under .500, but still brought in 2.7 million fans.  It remains to be seen whether playing .500 baseball for the rest of the 2013 season would be sufficient to keep fans coming through the Miller Park turnstiles, but if so, the increasing remoteness of May could be a significant factor, particularly if the team can convince fans that “one bad month” does not represent the current Miller Park experience or true caliber of the team.

Of course, it is also possible that the Brewers will be able to trade significant assets at the deadline in exchange for the prospects Doug Melvin wants.  If so, their projected record could, and probably would, decline.  (This is not necessarily a bad thing, given that 68.5 wins is the average cut-off to secure a top 5 draft spot from 2003 through 2012.)  If that happens, the Brewers will have a further challenge on their hands in trying to provide even average baseball for their fans, and maintain the attendance they need.

That said, the Brewers’ remarkable close to 2012 — an incredible .610 winning percentage from August through October — was accomplished after trading away Zack Greinke and calling up minor league talent to plug gaps in the rotation left by Greinke’s trade and Shaun Marcum’s injuries.  If the Brewers are once again able to make advantageous trades at the deadline, and also able to play even .500 ball for the rest of the year, they are still in a position to do so without hurting their chances to get the impact player they need in the 2014 Rule 4 draft.

If they can pull both of these things off, much of the thanks should be given to the horrible month of May.


The Ten Highest BABIPs Since 1945

Earlier this season I looked at the ten lowest BABIPs since 1945, investigating what, exactly, this statistic can teach us about hitters. The conclusions ranged from clear to not-so-much: your batting average on balls in play will be lower if you’re too slow to beat out infield grounders, if you hit an unusually low number of line drives, if you’re getting poor contact by swinging at bad pitches, and if you’re just plain unlucky. Sometimes players saw their power numbers drop along with their BABIPs, most likely because of an inferior approach at the plate which caused weak hits, but sometimes players saw their power numbers rise sharply: one of the ten lowest BABIPs ever belongs to Roger Maris, because he put 61 balls out of play and over the outfield fences.

Will our high scorers clear things up?

What is BABIP? (Copied from the First Post)

Batting average on balls in play is exactly that: when you hit the ball and it’s not a home run, what’s your batting average? Imagine you’d only ever batted twice; first you hit a single and then you struck out. Your BABIP would be 1.000. If a single and a groundout, .500. After seven games of the 2013 season, Rick Ankiel had two home runs but no singles, doubles, or triples, so his BABIP was .000.

Across any given season, the average BABIP tends to be about .300. All this means is that, when you hit the ball at professional defenders, there’s a 70% chance they’ll get you out.
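
For the arithmetic-minded, here is a minimal sketch of the calculation. It uses the standard formula, (H - HR) / (AB - K - HR + SF), which is a little more precise than the plain-English definition above because it counts sacrifice flies as balls in play. The function name and the Ankiel-style stat line are made up for illustration; only the "two homers, no other hits" part comes from the text.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF).

    Returns None when no ball has been put in play yet.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        return None
    return (hits - home_runs) / balls_in_play


if __name__ == "__main__":
    # A single and a strikeout: one ball in play, one hit.
    print(babip(hits=1, home_runs=0, at_bats=2, strikeouts=1))
    # A single and a groundout: two balls in play, one hit.
    print(babip(hits=1, home_runs=0, at_bats=2, strikeouts=0))
    # Hypothetical line: two homers, no other hits, a few outs in play.
    print(babip(hits=2, home_runs=2, at_bats=15, strikeouts=8))
```

Home runs and strikeouts drop out of both the numerator and the denominator, which is how a hitter with two homers and nothing else ends up at .000.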

The Ten Highest BABIPs Since 1945

Leaderboard

10. Willie McGee, 1985 (.395). McGee’s presence here isn’t surprising, since his hallmarks, aside from excellent hitting skills (and not much power), were speedy outfield defense and quality baserunning. It’s easy to imagine McGee beating infield grounders, hustling out hits, or being above average at driving the ball, even though some of those statistics weren’t tracked at the time.

9. Derek Jeter, 1999 (.396). Jeter’s 2006 ranks 17th on the list, too. Jeter’s 266 infield hits since 2002, when batted-ball data started being counted, ranks second among all hitters in that decade-plus. First place? You’ll find out who that is in a minute (if you don’t know already).

8. Wade Boggs, 1985 (.396). Hey look, two top-ten BABIP seasons in the exact same year! Boggs edges McGee and the whole league with 240 hits in 161 games, 187 (77.9%) of them singles. During all his batting-title years, his BABIP was high, bottoming out at .361. Lucky? No: more like extremely good contact skills.

7. Austin Jackson, 2010 (.396). Jackson’s breakout season in center field for Detroit (that .396 BABIP led him to a .293 average) was followed by a breakup 2011 when his BABIP dropped 56 points (still above average!) and his batting average and on-base percentage fell 54 and 28 points, respectively. So far in 2013 Jackson’s at a career low on balls in play, but he’s also dramatically reduced his previously ugly strikeout rate, which has bolstered his return to the ranks of the truly outstanding.

6. Andres Galarraga, 1993 (.399). Before Galarraga cranked out 47 home runs at the age of 35, he had an also highly improbable 1993. Triple slash, 1989-1992 (509 games): .246/.301/.399. Home runs in those 509 games: 62. Triple slash in 1993: .370/.403/.602.

Three observations. First, Galarraga’s batting average never came within fifty (!) points of that again. Second, this was his first season in Colorado, although it wasn’t a full one, as he only played 120 games. The Coors boost to his power was minimal, at first. Third, the guy could not take a walk.

5. Ichiro Suzuki, 2004 (.399). Will anyone be surprised to see Ichiro here? Speedy, with a near-mythical gift for hitting, Ichiro also has a gift for avoiding fly balls (23.8% flyballs, fourteenth-lowest in baseball since we started counting in 2002). And another thing we’ve been counting since 2002: Ichiro has 463 infield hits, 40% more than second-place Derek Jeter. In 2004, Ichiro had 57 infield hits in 161 games, or about one every series. Since 2002, Mark Sweeney has 12 infield hits in 690 games.

4. Roberto Clemente, 1967 (.403). Clemente was in the middle of a run of six consecutive 6.0+ WAR years. His batting average on balls in play, 40 points above his career average (which was identical to his BABIP the year before), made this one his most valuable of all (7.7). Compared with 1966, Clemente hit six fewer homers and five fewer doubles but 19 more singles, explaining the paradox that his slugging percentage rose while his power actually dropped.

3. Manny Ramirez, 2000 (.403). This is one of seven seasons in which Manny posted a BABIP above .350. I looked at batted ball data, available from 2002 onward, and found that Manny’s 22.6% line drives ranked 31st among the 481 hitters who’ve racked up more than 1,500 plate appearances since. Of course, Manny was inconsistent in that stretch. His .373 BABIP in 2002 coincided (or not!) with a line-drive rate of 25.3%. (Mark Loretta sits at first since ’02, 26.0%, while at second with 25.2% is Joey Votto, more on whom shortly.)

2. Jose Hernandez, 2002 (.404). I was alive and watching baseball in 2002 and I had never heard of Jose Hernandez. The Brewers shortstop had four pretty good seasons (1998-99, 2001, 2004), three terrible ones (1996, 2000, 2003), and a rather miraculous 2002 which found Hernandez riding a tidal wave of good luck on balls in play. His average rose 39 points, and dropped by 63 the next season; he struck out in literally one-third of his at-bats (188 Ks); his power numbers were unchanged. But, aside from luck, there was another big change. This was the first year for which batted-ball data is available, and the only year where Hernandez’ flyball rate was below 30%. Between Hernandez, Ichiro, and Jeter, flyball rate is starting to look like a significant predictor of BABIP.

1. Rod Carew, 1977 (.408). What does it take to have the highest career BABIP of any finished career since 1945? (“Hang on,” you say, “what’s with this ‘finished career’ business?” “Ah,” I say, “Austin Jackson and Joey Votto are in the lead.”) Carew’s career BABIP is .359. Carew’s 1974 ranks 19th on this list (.391). So the guy was a great hitter: but his 1977 was extraordinary. An 8.5 WAR season, it saw a dramatic spike in singles, plus career highs in doubles, triples, and (tied with 1975) home runs. There was also an MVP award.

Conclusions

Again, some of the things we learned are unsurprising: speed is good; being an all-time great contact hitter is good. But there’s a twist: Jose Hernandez benefited from a whole lot of luck, and Rod Carew had the year of his life, but most of the guys here are obviously disposed to high BABIPs based on their skills. We were able to blame a lot of the bottom-ten seasons on hard times and bad breaks, but most of these guys are exceptional hitters with speed and contact ability.

And there’s a new factor begging for our attention.

When we looked at the ten lowest BABIPs, we were unwittingly at a disadvantage, because only one of those low seasons took place while batted-ball data was documented. Three of our ten highest have happened since 2002, though, as well as #13, 14, and 17, which means we have evidence of a new factor.

Hit more line drives, and your batting average on balls in play goes up.

Hit more fly balls, and it goes down–fast.

As a Community Research writer, I can’t insert a chart here; as a lazy person, I don’t have a chart to insert. But the next step in our inquiry is very, very clear. Does fly ball hitting suppress BABIP? Is it because of the increase in home runs, the ease with which defenders catch the ball, both, or neither?

Even More Pertinent Conclusion

We live in the golden age of BABIP. If I had done this “Ten Highest” post including 2013, the present season would have accounted for 40% of the list.

Among the top 20 BABIP guys with more than 700 games played in their careers, there are some retirees: Rod Carew (#2), Ron LeFlore (#7), Wade Boggs, Roberto Clemente, Kirby Puckett, Tony Gwynn, Willie McGee, and John Kruk. But 12 of the top 20 guys are currently active: Joey Votto (#1), Derek Jeter (#3), Shin-Soo Choo (#4), Matt Kemp (#5), Joe Mauer, Miguel Cabrera, Ichiro Suzuki, Matt Holliday, Michael Bourn, Ryan Braun, Wilson Betemit, David Wright.

As commenter Ferd pointed out last time, the league average BABIP was .260 in 1968; when I started the series, I relied on research which assured me that BABIP was consistent over time, but this is clearly not true. This means that there are two more lines of inquiry we should follow.

1. Why are so many BABIP leaders currently active? Is it a change in hitting style? Is it a change in pitching style? Is it a change in the data being used or the calculations being made? Or is it simply because most of them haven’t gotten older, slower, and less talented at the plate, and once they all age and retire order will be restored?

2. Wilson Betemit? How did that happen?


Chris Davis’s Oddly Historic Season So Far

A lot of ink (and pixels) has been spilled about Chris Davis’s great season.  It’s hard to overstate just how great a .337/.432/.721 start through roughly one-third of the season is, especially in this renewed era of depressed offense.  MLB’s .722 OPS so far this year ranks as baseball’s second-lowest since 1992’s .700, behind only 2011’s .720.  Quite simply, Davis is having the best offensive season in the American League of any player whose first name is not some variation of “Michael”.

Here’s yet another data point for you to chew on: Chris Davis is on track to have one of the highest extra-base hit (XBH) to plate appearance (PA) ratios in history.

As of the morning of Memorial Day 2013, Davis has hit an XBH in 16.5% of his PAs.  In conversational terms, he hits an XBH about every six times he steps to the plate.

If Davis were to end the season with this ratio and qualify for a batting championship, it would rank second in history behind this other guy’s pretty good season.

In fact, only nine qualified seasons in modern history have ever produced an XBH-PA ratio greater than 15% (Babe Ruth and Albert Belle each did it twice).  Here is the list, with Davis’s 2013 added for context:

Rk Player Year XBH PA XBH %
1 Babe Ruth 1921 119 693 17.2%
2 Chris Davis 2013 34 206 16.5%
3 Albert Belle 1995 103 631 16.3%
4 Lou Gehrig 1927 117 717 16.3%
5 Barry Bonds 2001 107 664 16.1%
6 Babe Ruth 1920 99 616 16.1%
7 Jeff Bagwell 1994 73 479 15.2%
8 Al Simmons 1930 93 611 15.2%
9 Albert Belle 1994 73 480 15.2%
10 Todd Helton 2001 105 697 15.1%
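
For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the last column from the XBH and PA figures in the table above. The variable and function names are mine, invented for illustration; the numbers are copied straight from the list.

```python
# Rows copied from the table above: (player, year, XBH, PA).
rows = [
    ("Babe Ruth", 1921, 119, 693),
    ("Chris Davis", 2013, 34, 206),
    ("Albert Belle", 1995, 103, 631),
    ("Lou Gehrig", 1927, 117, 717),
    ("Barry Bonds", 2001, 107, 664),
    ("Babe Ruth", 1920, 99, 616),
    ("Jeff Bagwell", 1994, 73, 479),
    ("Al Simmons", 1930, 93, 611),
    ("Albert Belle", 1994, 73, 480),
    ("Todd Helton", 2001, 105, 697),
]


def xbh_rate(xbh, pa):
    """Share of plate appearances that ended in an extra-base hit."""
    return xbh / pa


# Sort by the unrounded ratio so that ties at one decimal place
# (for example Belle 1995 and Gehrig 1927) are broken correctly.
ranked = sorted(rows, key=lambda r: xbh_rate(r[2], r[3]), reverse=True)

for rank, (player, year, xbh, pa) in enumerate(ranked, start=1):
    rate = xbh_rate(xbh, pa)
    # pa / xbh is the "one XBH every N trips to the plate" framing.
    print(f"{rank:>2} {player:<13} {year} {xbh:>3} {pa:>3} "
          f"{rate:6.1%}  (one XBH every {pa / xbh:.1f} PA)")
```

Sorting on the unrounded ratio is the only design choice of note: several seasons round to the same percentage, so the displayed column alone cannot settle the order.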

You may have noticed that 30% of the entries on this list belong to players named either Al or Albert, but none of them are named Pujols.  None of them are named Miguel, either.  In fact, the closest the reigning American League Triple Crown winner has come to cracking this list was in 2010 with a 13.0% XBH-PA ratio, and as of this morning he sits well out of range in 2013 at 12.5%, despite his own empirically otherworldly start.

This is, without a doubt, a most exclusive list of a most consistently slugging nature.  It’s enough to send pitchers into grand mal seizures at the very contemplation of this.  Or perhaps more exactly, it might if they were even aware of it.  This data point has probably not yet been illuminated in quite this way—this here article is the closest I myself have found so far, and Davis is not even the star of the piece.  But that does not mitigate the impressiveness of this feat of his so far.

This is not to say that Chris Davis is a better hitter than Miguel Cabrera, or Albert Pujols or Joey Votto or even Shin-Soo Choo, for that matter.  But even if this does turn out to be a world-class fluke season for him, Davis has a chance to crack an elite list inhabited only by the greatest of the great, even if he never knows it.


The Ten Lowest BABIPs Since 1945

For hitters, BABIP is often an explanation for unusually good or bad seasons. But what causes a great or poor BABIP? And are we right to simply blame BABIP whenever a bizarre season happens? It might help to look at some extreme cases. Even if we don’t learn something about how to interpret hitters’ BABIP, we can at least have fun. Nerdy, nerdy fun.

What is BABIP?

Batting average on balls in play is exactly that: when you hit the ball and it’s not a home run, what’s your batting average? Imagine you’d only ever batted twice; first you hit a single and then you struck out. Your BABIP would be 1.000. If a single and a groundout, .500. After seven games of the 2013 season, Rick Ankiel had two home runs but no singles, doubles, or triples, so his BABIP was .000.
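
If you'd rather see that as arithmetic, here is a minimal sketch using the standard formula, BABIP = (H - HR) / (AB - K - HR + SF). The function name is mine, and the last example uses made-up at-bat and strikeout totals (the text above only tells us about the two homers), so treat that line as hypothetical.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        return None  # undefined: nothing was actually put in play
    return (hits - home_runs) / balls_in_play


# A single, then a strikeout: the strikeout is not a ball in play.
single_and_strikeout = babip(hits=1, home_runs=0, at_bats=2, strikeouts=1)

# A single, then a groundout: two balls in play, one hit.
single_and_groundout = babip(hits=1, home_runs=0, at_bats=2, strikeouts=0)

# Hypothetical line: two homers and no other hits.
# The at-bat and strikeout totals here are invented for illustration.
homers_only = babip(hits=2, home_runs=2, at_bats=20, strikeouts=10)

for label, value in [
    ("single + strikeout", single_and_strikeout),
    ("single + groundout", single_and_groundout),
    ("homers only", homers_only),
]:
    print(f"{label}: {value:.3f}")
```

Home runs come out of both the numerator and the denominator, which is why a hitter whose only hits are homers sits at .000 no matter how far the ball went.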

Across any given season, the average BABIP tends to be about .300. All this means is that, when you hit the ball at professional defenders, there’s a 70% chance they’ll get you out.

What influences BABIP?

The enemy. Defense and to some extent pitching are factors, but over the course of a full year, as you face the entire league, this averages out.

Power. If you hit twenty balls to the warning track, and a lot of them fall for hits, your BABIP will increase. But if they all carry right over the fence for home runs, they will stop counting for this purpose, meaning your BABIP will probably decrease since more of your hits will be excluded from the stat.
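
Here is a rough sketch of that warning-track scenario. Every number in it is hypothetical: I've invented a hitter with 380 ordinary balls in play, 112 of them hits, plus the twenty deep drives, and I've assumed that "a lot of them" means 12 of the 20.

```python
def babip_from_counts(hits_in_play, balls_in_play):
    """BABIP once home runs have already been removed from both counts."""
    return hits_in_play / balls_in_play


# Hypothetical hitter, invented for illustration.
OTHER_BALLS_IN_PLAY = 380   # everything except the twenty deep drives
OTHER_HITS = 112            # hits among those 380
DEEP_DRIVES = 20            # the twenty balls hit to the warning track
DEEP_DRIVES_THAT_FALL = 12  # assumed: "a lot of them fall for hits"

# Scenario A: all twenty stay in the park, and twelve fall for hits.
stay_in_park = babip_from_counts(
    OTHER_HITS + DEEP_DRIVES_THAT_FALL,
    OTHER_BALLS_IN_PLAY + DEEP_DRIVES,
)

# Scenario B: all twenty clear the fence, so they vanish from
# both the numerator and the denominator.
all_home_runs = babip_from_counts(OTHER_HITS, OTHER_BALLS_IN_PLAY)

print(f"deep drives stay in play: {stay_in_park:.3f}")
print(f"deep drives all leave:    {all_home_runs:.3f}")
```

The second hitter is obviously having the better year, but his BABIP is the lower of the two, because his best contact no longer counts.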

Hitting style. There are six fielders stationed in the infield (counting the pitcher and catcher), so more ground balls tend to be fielded; this is why pitchers, who are wimpy at hitting, tend to have low BABIPs. Fly balls are often caught, so the best scores go to line-drive hitters.

Speed. If you’re fast enough to beat throws and bunt for singles, your BABIP will be higher. If you run like I do, probably not so much.

Luck. Maybe the biggest single factor is: are you lucky? We all see hard-hit balls straight at defenders, or guys who go on “hot streaks” where the ball “finds all the holes.” That’s called “luck,” and BABIP can quantify it. Believe it or not, you really can have good or bad luck that lasts an entire year.

Let’s illustrate these principles by looking at some hitters with very low BABIPs.

The Ten Lowest BABIPs Since 1945

10. Roger Maris, 1961 (.209). 38.4% of Roger Maris’ hits that year were home runs. (Stop now to think about that.) If the ball stayed in the park, somebody probably caught it. On the other hand, if the ball had a chance of leaving the park, it did. 61 of them did.

9. Jim King, 1963 (.208). Although somewhat powerful (24 homers), Jim King was also something else: bad. His BABIP never came close to league average, and in partial seasons after ’63 it would be .207 and .209. He was known as a power-hitting bench bat, and only found regular playing time on the miserable Washington Senators (106 losses that year).

8. Dave Kingman, 1982 (.207). Dave Kingman hit homers (37) and struck out a whole lot, and based on his terrible, terrible fielding metrics, he was a mighty slow fellow. There’s also another factor here: he was old. “But he was only 33,” you say. “If there was something to this age thing, he’d get worse as he got even older.” “Aha,” I reply, “that’s why you’re supposed to keep reading!”

7. Dick McAuliffe, 1971 (.206). Here’s our first plausible “bad luck” guy. A career .264 BABIP, and indeed the following year he had a .264 BABIP. A career .247 hitter, and the following year he hit .240. A career .343 OBP, and the following year his OBP was .339. So Dick McAuliffe bounced back just fine, but it’s worth noting two things: first, a career .247 hitter is not that good, and second, for whatever reason his walk rate did decline sharply during his “unlucky” year. Was he swinging more aggressively? If so, he was still striking out less than usual.

6. Roy Cullenbine, 1947 (.206). I mentioned Roy Cullenbine in my first post on these venerable pages: a man who combined all-time bad luck with a truly incredible batting eye, walking 22.6% of the time despite being a distinctly non-intimidating hitter. The only guy in 1947 who walked more was Triple Crown winner Ted Williams, and Williams was frequently being walked on purpose. Cullenbine’s possibly all-time-great ability to take a walk was rewarded with–well, never playing in another major league game.

He did hit 24 homers, but this is another bad luck year. Heck, Cullenbine’s BABIP in 1946 was .347.

5. Dave Kingman, 1986 (.204). Toldja so! Here’s Kingman, age 37, hitting home runs (35) but nothing else. A full-time DH by now, he (like Cullenbine) never played in the big leagues again.

4. Brooks Robinson, 1975 (.204). Only six home runs to his name, still manning third base, Brooks Robinson is another example of what’s becoming a clear trend: he was 38 years old. He played partial seasons after this, but not full ones. This was a truly godawful year: .201/.267/.274, good for a wRC+ of 54.

3. Ted Simmons, 1981 (.200). A catcher and a fairly slow runner turning 32, Simmons saw a small drop in power, which he partially recovered the next year, and a 97-point drop in his BABIP, hard to explain just from the power outage. The traditional explanation for his poor 1981 is that he had just moved to Milwaukee and the American League. Luck might have hurt him, too.

2. Curt Blefary, 1968 (.198). Carson Cistulli previously highlighted Blefary on this site. After winning Rookie of the Year in 1965, the young outfielder posted two more above-average seasons before falling off a metaphorical cliff in 1968. He was being bounced around between positions, and he was never a speedster: his defense inspired the nicknames Clank and Buffalo.

Part of it must be bad luck. The BABIP .045 below his career average bounced back in 1969, when he moved to first base and had a fairly good season for the Astros; a power decline turned out to be real, but his other numbers recovered. And yet Blefary would play his last major league game at age 29, moving on to a career as a “sheriff, bartender, truck driver, and night club owner.”

1. Aaron Hill, 2010 (.196). Aaron Hill’s notoriously lost season is the only one here from the last twenty-five years–and the most dramatic of all. Interestingly, a RotoGraphs article on Hill attributes his 2010 to pure awfulness but his recovery in 2011 to an “inflated” BABIP. But a .196 BABIP, a full hundred points below average, counts as deflated, right? Hill sucked in 2010 despite 26 homers and a slightly increased walk rate.

The advantage of recency is that we have more data. Here the culprit is obvious: he had previously been, and would soon be again, very good at hitting line drives, but in 2010 his line-drive percentage dropped by half (just 10.6%) and more than half of the balls he hit all year became fly balls. Some of those drifted out of the park, but most drifted over a waiting defender. And even though Hill was walking more, he was also swinging more frequently at pitches outside the strike zone. Hill’s new approach in 2010 didn’t hurt his ability to take a walk, but it hurt his ability to drive the ball. Still, to earn the lowest BABIP in modern history, he also suffered from an entire season of some of the worst luck any batter’s ever had.
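
To get a feel for how much a batted-ball mix alone can move BABIP, here is a back-of-the-envelope sketch. Big caveats: the per-type hit rates are round numbers I've assumed for illustration, not measured values; the only figure taken from the text is the 10.6% line-drive rate, with "more than half" fly balls approximated as 54% and the "typical" mix invented by doubling the line-drive share. It also ignores the fact that some fly balls become home runs and leave the BABIP calculation entirely.

```python
# Assumed, illustrative hit rates by batted-ball type (not measured data).
ASSUMED_HIT_RATE = {"line_drive": 0.70, "ground_ball": 0.24, "fly_ball": 0.14}


def make_mix(line_drive, fly_ball):
    """Build a batted-ball mix; ground balls are whatever share is left."""
    return {
        "line_drive": line_drive,
        "fly_ball": fly_ball,
        "ground_ball": 1.0 - line_drive - fly_ball,
    }


def expected_babip(mix):
    """Weighted average of the assumed hit rates over the batted-ball mix."""
    return sum(share * ASSUMED_HIT_RATE[kind] for kind, share in mix.items())


# 10.6% line drives comes from the text; the 54% fly-ball share is assumed.
mix_2010 = make_mix(line_drive=0.106, fly_ball=0.54)

# Hypothetical "typical" year: twice the line drives, fewer fly balls.
mix_typical = make_mix(line_drive=0.212, fly_ball=0.42)

expected_2010 = expected_babip(mix_2010)
expected_typical = expected_babip(mix_typical)

print(f"typical mix: {expected_typical:.3f}")
print(f"2010 mix:    {expected_2010:.3f}")
```

Under these assumptions the mix explains a big chunk of the drop, but nowhere near all the way down to .196, which fits the idea that bad luck was piled on top of a bad approach.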

Conclusion

The BABIP losers here didn’t do badly over their careers: combined, these “bottom 10” earned 42 All-Star appearances (18 by Brooks Robinson), 3 MVP awards, and a Rookie of the Year prize.

This unscientific survey confirms a lot of preconceived ideas:
– slower players don’t create their own luck on balls hit in fair territory
– aging players often lose their speed or power or both
– swinging at balls outside the strike zone means you make inferior contact
– sometimes, good luck isn’t enough to save a terrible hitter
– sometimes, terrible luck is enough to end a good hitter’s career

But there’s an interesting question to be raised here. Some of these guys–Maris, Kingman–hit homers like crazy, thus suppressing their BABIPs. On the other hand, Blefary and Simmons lost home run power in their hard-luck years. Simmons was playing in a new ballpark and Blefary at a new position. Maybe they were the Aaron Hills of their times, adjusting their approaches in deleterious ways (probably swinging at more pitches). Maybe they hit the ball poorly for unknown, reversible reasons. Maybe they had bad luck.

If I were counseling hitters on how to maximize their batting average on balls in play, I would say this: cultivate speed and athleticism, swing at better pitches, and try to hit line drives. I don’t know if BABIP can or should be learned, however. Ultimately, BABIP is the baseball version of a zen koan or hippie bumper sticker. BABIP: Stuff Happens. Or, more accurately, sometimes in baseball you make your own fate, but sometimes your fate makes you.