Grading 2013 AL SP Performance with Attention to the 2-D Direction of Batted Balls

Foreword

Two years ago, I began developing a system for evaluating the performance of minor-league pitchers relative to their minor-league level/league peers. My goals were to use only game data that could be extracted from the MLB Advanced Media Gameday archives for every level of the minors (ruling out any of the pitch-outcome data that is available for AA and AAA games), to ignore whether batted balls went for hits or home runs, and to ignore runs allowed. In brief, the challenge amounts to using whatever other information can be compiled from the game-specific dataset to arrive at the best approximation of the pitcher’s true performance, judged independently of those factors that tend to fall outside the pitcher’s control (defense, park effects, etc.). What eventually follows are the results of applying the latest iteration of this “Fielding and Ballpark Independent Outcomes” method to 2013’s American League starting-biased pitchers.

Basic Steps of Applying the Method to a League

  1. Download the relevant details of every plate appearance (PA) from the league’s season into a spreadsheet/database
  2. Derive a 24-outs-baserunners-state run expectancy matrix à la Tango in The Book
  3. Quantify how each PA of the season impacted the inning’s run expectancy
  4. Exclude all bunts and foulouts, plus every PA taken by a pitcher
  5. Reweight the proportion of line drives (LD), outfielder fly balls (OFFB), ground balls (GB), and infielder flyballs (IFFB) by ballpark to offset any stadium- or stringer-related anomalies in play event classifications
  6. Referencing the run-expectancy value determined for each PA in Step 3, the corresponding basic description of the play (BB vs HBP vs K vs GB vs IFFB vs OFFB vs LD), and the 2 coordinates indicating where the batted ball was fielded (if there was one), quantify what each of the following 12 general PA event types was worth in terms of runs, on average, for the season: 1) walk or hit-by-pitch, 2) strikeout, 3) IFFB, 4) GB to batter’s pull-field-third of the diamond, 5) GB to batter’s center-field-third, 6) GB to batter’s opposite-field-third, 7) LD to pull-third, 8) LD to center-third, 9) LD to opposite-third, 10) OFFB to pull-third, 11) OFFB to center-third, and 12) OFFB to opposite-third.
  7. For each pitcher in the study sample, tally up the number of each of the 12 event types that they allowed and in each instance charge them with the exact number of runs determined in Step 6 for the corresponding event type; divide the resulting sum by the total number of events to arrive at a single number for each pitcher that quantifies how a PA against them that season should have affected the inning’s run expectancy, on average (the more negative this number the better the pitcher should have performed on the year)
  8. Quantify how high or low the pitcher rated on the value in Step 7 versus the mean of the sample on a standard deviation (SD) basis
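Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the study: the tuple layout, the `"1--"` style base encoding, the function names, and the toy data are all invented for the example.

```python
from collections import defaultdict

def run_expectancy_matrix(pa_records):
    """Step 2: average runs scored from each (outs, bases) state to inning's end.

    pa_records: iterable of (outs, bases, runs_rest_of_inning) tuples, where
    bases is a string such as "---", "1--", or "123" (hypothetical encoding).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for outs, bases, runs_rest in pa_records:
        totals[(outs, bases)] += runs_rest
        counts[(outs, bases)] += 1
    return {state: totals[state] / counts[state] for state in totals}

def pa_run_value(re_matrix, before, after, runs_on_play):
    """Step 3: change in run expectancy caused by one PA.

    `after` is None when the PA ends the inning, in which case the
    run expectancy after the play is zero.
    """
    re_after = 0.0 if after is None else re_matrix[after]
    return re_after - re_matrix[before] + runs_on_play

# Toy data, invented purely for illustration (not real 2013 values)
toy = [
    (0, "---", 0), (0, "---", 1), (0, "---", 0), (0, "---", 1),  # mean 0.50
    (0, "1--", 1), (0, "1--", 1), (0, "1--", 0), (0, "1--", 2),  # mean 1.00
    (1, "---", 0), (1, "---", 0), (1, "---", 1), (1, "---", 0),  # mean 0.25
]
re_matrix = run_expectancy_matrix(toy)

# Leadoff walk: bases empty, 0 out -> runner on first, 0 out
walk_value = pa_run_value(re_matrix, (0, "---"), (0, "1--"), 0)
# Leadoff strikeout: bases empty, 0 out -> bases empty, 1 out
strikeout_value = pa_run_value(re_matrix, (0, "---"), (1, "---"), 0)
```

A real matrix would cover all 24 base-out states and be built from a full season of PAs; the toy version only shows the mechanics.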

What were the 12 Event Types Worth in 2013?

The table below shows how the studied event types impacted run expectancy in AL Parks during 2013, on average. The 2-D direction of the batted ball does tend to be rather consequential for LD and even more so for OFFB.

[Table image: 2013 AL parks, effect of PA event type on run expectancies]

So as far as Step 7 described above goes, each pitcher in what follows will be charged +0.29 runs for every BB and HBP, -0.26 runs for every K, … and -0.08 runs for every OFFB to the Opposite-Field-Third, with that sum ultimately divided by the total number of PA events to arrive at a single number that quantifies what an average PA against the pitcher in 2013 was worth in terms of runs (per run expectancies). Think of that as the equation being used to evaluate each pitcher’s performance.
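As a rough sketch of that equation in Python: the function name and dictionary keys are assumptions made for illustration, and only the three run values quoted in the text are used, since the other nine appear only in the table image. The pitcher line is hypothetical and deliberately restricted to those three event types.

```python
def runs_per_pa(event_counts, run_values):
    """Average run-expectancy change per PA event (Step 7)."""
    total_events = sum(event_counts.values())
    total_runs = sum(n * run_values[event] for event, n in event_counts.items())
    return total_runs / total_events

# The three values quoted in the text; the remaining nine are not reproduced here
RUN_VALUES_QUOTED = {"BB_HBP": 0.29, "K": -0.26, "OFFB_OPPO": -0.08}

# Hypothetical pitcher line, limited to the three quoted event types
counts = {"BB_HBP": 10, "K": 40, "OFFB_OPPO": 50}
value = runs_per_pa(counts, RUN_VALUES_QUOTED)  # about -0.115 runs per PA
```

The more negative the result, the better the pitcher should have performed.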

Study Sample

The sample comprises the 101 pitchers who faced more than 200 batters as American Leaguers in 2013 while averaging more than 10 batters faced per game. Data they accumulated as relievers are included in the analysis; data they accumulated as National Leaguers are not. As before, any PA that resulted in a bunt or foulout, or that was taken by a pitcher, was excluded.

Scores Computed

The overall rating number described in Step 8 above is termed Performance Score. Steps 7 and 8 can be repeated with the non-batted-ball events (BB, HBP, K) stricken from the numerator and denominator at Step 7, and this result is termed Batted Ball Subscore (in short, how should the pitcher have rated versus their peers on batted balls?). To further understand how the pitcher achieved their Performance Score, a Control Subscore (how many SDs high or low was the pitcher’s BB+HBP% versus the study population’s mean?) and a Strikeout Subscore (how many SDs high or low was the pitcher’s K%?) are computed. An Age Score is also calculated that quantifies how young the pitcher was versus the population’s mean age, in SDs. Given the method’s minor-league origins, the scores are typically expressed on a 20-to-80 style scouting scale where 50 is league-average, scores above 50 bettered league-average, and any 10 points equates to 1 SD (percentiles will be listed for those who prefer them).
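The conversion to the scouting scale can be sketched as follows. This is an illustration only: the function name and the sample values are hypothetical, and the use of the population SD (`statistics.pstdev`) is an assumption, since the text does not say which SD estimator was used. Scores are not clamped to the 20-to-80 range.

```python
from statistics import mean, pstdev

def scouting_scores(values, lower_is_better=False):
    """Map raw values to a scale where 50 is the sample mean and 10 points = 1 SD.

    Set lower_is_better=True for measures such as runs per PA or BB+HBP%,
    where a smaller raw value should earn a score above 50.
    """
    mu = mean(values)
    sd = pstdev(values)  # assumption: population SD of the study sample
    sign = -1.0 if lower_is_better else 1.0
    return [50.0 + 10.0 * sign * (v - mu) / sd for v in values]

# Hypothetical runs-per-PA values for five pitchers
sample = [-0.02, -0.01, 0.00, 0.01, 0.02]
scores = scouting_scores(sample, lower_is_better=True)
# The pitcher with the most negative runs-per-PA value gets the highest score.
```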

2013 American League Starting Pitcher Results

In the tables to follow, green text indicates a value that beat league-average by at least 1 SD (“very good”) while red text indicates a value that trailed league-average by at least 1 SD. Asterisks indicate left-handed throwers.

Sorting by Performance Score

Here are the Top 33 2013 AL SP per the Performance Score measure. Scherzer edged Darvish for the #1 spot as the top of the list somewhat mimicked the BBWAA’s Cy Young vote.

[Table image: 2013 AL SP scores, Top 33]

Detroit and Cleveland each landed five in the Top 33 while Boston, Oakland, and Tampa Bay each placed four. Perhaps not coincidentally, those clubs were also the playoff teams.

And below are the Middle 34 by Performance Score.

[Table image: 2013 AL SP scores, Middle 34]

And below are the Bottom 34 by Performance Score.

[Table image: 2013 AL SP scores, Bottom 34]

Pedro Hernandez took last place by a comfortable margin as five other Twins joined him on this dubious list of 34. To further corner the market on these sorts of arms, the club has since inked another of the 34 to a three-year free-agent contract.

Sorting by Batted Ball Subscore

Given the system’s unique weighting of batted-ball types by direction, let us examine how the pitchers grade out on this metric. Below are the Top 20 sorted by Batted Ball Subscore. Masterson nosed out Deduno for top honors. Here, the Twins fare better as three besides Deduno crack the Top 20.

[Table image: 2013 AL SP Batted Ball Subscores, Top 20]

One unique angle of this approach is that a pitcher can be a relatively strong batted-balls performer without being a noteworthy groundball-inducer if their outfield flyballs, line drives, and groundballs are skewed optimally to the least dangerous zones of the field per the batter’s handedness. Colon serves as a prime example of such a pitcher.

Below are the laggards who comprise the Bottom 20.

[Table image: 2013 AL SP Batted Ball Subscores, Bottom 20]

Garza’s 29 number as an American Leaguer is somewhat scary for the sort of money he’s likely to command as a free agent (he’d earn about a 35 Batted Ball Subscore if the Cubs NL data were factored in). Salazar’s numbers show how a very high rate of strikeouts and good control can successfully offset a dangerous distribution of batted balls by type and direction.

Admittedly, there is a third dimension to each of these batted balls (launch angle off the bat relative to the plane of the field) that would stand to further improve the batted-balls assessment if such information were available.

Other Directions

A variety of things can be done with these numbers, such as breaking them down further into LHB values and RHB values, identifying comparable pitchers who share similar subscores (MLBers to MLBers, MiLBers to MLBers), studying how these values evolve as the minor leaguer rises through the farm towards the majors and their predictive value as to future MLB performance, and so on. And then there’s also the reverse analysis — evaluating hitter performance under a similar lens.

On Tap

Perhaps the most intriguing research question that application of this system raises is, “Would advanced metrics familiarly used to grade pitcher performance yield better results if their equations included batted-ball directional terms?” As a first attempt to test those waters, I plan to follow this up with a post that shows how these results compare to those obtained by variants of more familiar advanced statistical-evaluation methods (SIERA, FIP, etc.). In the interim, I welcome whatever comments, criticisms, and suggestions this readership has to offer.


Democratic and Fascist Pitchers

As we all know from the movie Bull Durham, strikeouts are fascist and groundballs are democratic.  So, I want to set out to find the most democratic pitchers and the most fascist pitchers out there.  Luckily, FanGraphs offers a custom leaderboard page that includes batted-ball data.

I set the filters to allow a K/9 rate of 5 or less, a groundball percentage of 50% or greater, and a minimum of 500 innings pitched from 2002-2013.  I realize that five strikeouts per nine innings is kind of arbitrary, but I wanted to focus on pitchers who were striking out a batter about every two innings.  You can see the leaderboard for the most democratic pitchers from 2002-2013.

Based on that leaderboard, Aaron Cook should be considered the most democratic pitcher of the twelve-year span, based on his 3.7 K/9 and 57.5% groundball rate.  So, there’s that on his mantel.  Although I still get confused trying to figure out how Cook was successful.  Some other options for most democratic pitcher could be Jake Westbrook and Chien-Ming Wang.  Westbrook had a higher K/9 than Cook but also a higher GB%.  Wang was only slightly higher than Cook on his K/9 but induced groundballs at a slightly higher rate, too.  If you want to say Wang should be more democratic than Cook, far be it from me to stop you.

But I also wanted to look at pitchers who have had democratic seasons during the span.  So, I created another leaderboard.  Not surprisingly, Cook appears near the top of the leaderboard in terms of value for his democratic season.  Tim Hudson had the most valuable democratic season in 2004, with an fWAR of 4.9.  The difference between that 4.9 and the 4.5 fWAR that Cook put up in 2008 is probably not statistically significant.  I feel confident in saying that Aaron Cook is the most democratic pitcher for which we have comprehensive data.

On the flip side of this, I wanted to see who would be considered the most fascist pitchers for which we have data.  To set the parameters, I chose a K/9 of greater than or equal to 10.8 (represents 40% of 27, or how many outs a pitcher can get in a ball game) and a GB% of less than 40% with the same innings requirement as before.  The leaderboard can be found here.
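The two career filters described so far can be written as a small classifier. This is a sketch with hypothetical function names; the thresholds come from the text, and the 500 innings passed for Cook below is only a placeholder that meets the stated minimum, not his actual total.

```python
FASCIST_K9 = 0.40 * 27  # 40% of the 27 outs in a game, i.e. 10.8

def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings."""
    return 9.0 * strikeouts / innings

def classify(k9, gb_pct, innings, min_innings=500):
    """Label a pitcher using the K/9 and GB% cutoffs from the text."""
    if innings < min_innings:
        return "unqualified"
    if k9 <= 5.0 and gb_pct >= 50.0:
        return "democratic"
    if k9 >= 10.8 and gb_pct < 40.0:
        return "fascist"
    return "neither"

# Aaron Cook's 3.7 K/9 and 57.5% GB rate are quoted in the text;
# innings=500 is a placeholder that satisfies the minimum.
cook = classify(3.7, 57.5, innings=500)  # "democratic"
```

For single seasons, the same function works with `min_innings=60` for the fascist list described later.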

Based on the leaderboard, there are only two fascist pitchers out there: Octavio Dotel and Carlos Marmol.  For some baseball fans, they are essentially the same pitcher, and based on the rate stats it is hard to tell them apart.  Dotel was more valuable, somehow registering a lower FIP than Marmol while pitching about 70 more innings.  So, Dotel is probably a little more fascist based on this stat.

Looking at individual seasons, I chose the same rates but with a minimum of 60 innings pitched.  The leaderboard for individual seasons has a handful of seasons registered by starting pitchers.  By and large, though, these types of seasons are usually only put up by relief pitchers.  Max Scherzer, Rich Harden, and Oliver Perez had more or less the same season in terms of value.  But Rich Harden’s season in 2008 is absolutely stunning.  Look at that low GB%, look how fascist it is.  There are a couple of pitchers on the individual-season list who don’t meet the 500-innings mark in Aroldis Chapman and Kenley Jansen who could also be in the running for most fascist pitchers.  Harden’s individual season was the most fascist, for the purpose of this exercise.  It seems unlikely that a starting pitcher can survive with such a low GB% or keep up such a high K/9 over the course of his career, or a number of seasons.


What Makes a Good Pinch-Hitter?

There seems to be quite a bit of disagreement in FanGraphs-land over what skills make for a good pinch-hitter. Some will argue that power is more important while others might say that on-base skills are more important. And while I know that it’s fashionable for the author to take a stance at the start of his article, I’m not going to comply. I’m just going to unsexily dive face-first into Retrosheet.

How can we solve this problem? How do we know what skills are best for pinch-hitters? Well, we can examine the base-out states that pinch-hitters confront and then derive from those base-out states specific pinch-hitter linear weights. We will then compare pinch-hitter linear weights to league-average linear weights to see which skills retain value. Simple.
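To make the idea concrete, a situational linear weight can be thought of as an event's run value in each base-out state, averaged by how often that state comes up. The sketch below assumes that framing; the function name, the two-state setup, and every number in it are invented for illustration and are not taken from Retrosheet.

```python
def situational_linear_weight(state_counts, run_value_by_state):
    """Frequency-weighted average of an event's run value across base-out states."""
    total = sum(state_counts.values())
    weighted = sum(n * run_value_by_state[s] for s, n in state_counts.items())
    return weighted / total

# Invented run values of a walk in two states: (bases, outs)
walk_value = {("---", 0): 0.40, ("---", 2): 0.125}

# Invented state mixes: pinch-hitters see the two-out state more often
league_mix = {("---", 0): 60, ("---", 2): 40}
pinch_mix = {("---", 0): 40, ("---", 2): 60}

league_w = situational_linear_weight(league_mix, walk_value)  # about 0.29
pinch_w = situational_linear_weight(pinch_mix, walk_value)    # about 0.235
```

With the same per-state values, shifting the mix toward later-out states lowers the walk's average value, which is the kind of effect the comparison is meant to detect.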

We’re also going to split the data by league, since pinch-hitting tendencies in the National League are likely going to be different than American League tendencies. I’m going to use the last five years of data, because whim. The table below, then, includes league-average linear weights followed by NL and AL pinch-hitter linear weights (aside: the run values of linear weights are from 1999-2002, per Tango’s work. This won’t make a real difference in the results, however, since we’re examining relative value of different base-out states and not overall run-value of different events).

Relative Linear Weights, 2009-2013

Linear Weight       HR    3B    2B    1B    NIBB   Out      K
League Average      1.41  1.06  0.76  0.47  0.33   -0.300   -0.310
AL Pinch-Hitting    1.45  1.07  0.77  0.49  0.32   -0.305   -0.325
NL Pinch-Hitting    1.42  1.05  0.75  0.48  0.31   -0.290   -0.310

In the National League we can see that the value of home runs has increased slightly while walks have seen a corresponding decrease. This is because pinch-hitters often come to the plate when there are more outs than average. This sensibly decreases the value of walks and increases the importance of hurrying up and sending everyone around the bases already. This note comes with a caveat, however: the differences in linear weights are pretty small. It seems that managers in the National League are often forced to use the pinch-hitter to replace the pitcher, and therefore pinch-hitters are used in a lot of sub-optimal places.

The American League does not condone making everyone hit, however, and the impact upon pinch-hitting situations is pretty clear. The run value of home runs increases by .04 in pinch-hitting situations in the American League compared to the paltry .01 National League increase. In fact, the run values of nearly all events increase: managers in the American League simply have more flexibility on when to use pinch-hitters, and so they are able to deploy them in base/out situations that are strategically favorable.
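The differences quoted above can be checked directly from the table. The snippet below just subtracts the league-average row from each pinch-hitting row; the variable and function names are made up for the example.

```python
EVENTS = ["HR", "3B", "2B", "1B", "NIBB", "Out", "K"]
LEAGUE = [1.41, 1.06, 0.76, 0.47, 0.33, -0.300, -0.310]
AL_PH = [1.45, 1.07, 0.77, 0.49, 0.32, -0.305, -0.325]
NL_PH = [1.42, 1.05, 0.75, 0.48, 0.31, -0.290, -0.310]

def deltas(pinch, league=LEAGUE):
    """Pinch-hitting linear weight minus league-average weight, per event."""
    return {e: round(p - l, 3) for e, p, l in zip(EVENTS, pinch, league)}

al = deltas(AL_PH)  # HR +0.04, NIBB -0.01
nl = deltas(NL_PH)  # HR +0.01, NIBB -0.02
```

In the AL row every hit type gains value and both out types become more costly, while the unintentional walk loses a little in both leagues.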

What does this all mean? Like everything, this simultaneously means quite a bit and not much at all. Home run value increases while walk value decreases during average pinch-hitter situations, but the change isn’t huge. If you’re a general manager looking for a bench bat and there’s a home-run guy available with a 90 wRC+ and a plate-discipline guy with a 95 wRC+, take the plate-discipline guy. What if they both have a 90 wRC+? Then take the home-run guy. The pinch-hitter linear weights here are more of a tie-breaker than a game-changer. Power is more important than walks when it comes to being a pinch-hitter, but being a good hitter is more important than power.

Roster construction is never that simple, though. Ideally a team will have both power and plate-discipline guys available on the bench and then the manager will be able to leverage both of their abilities based upon the base/out state (and also the score/inning situation, which is outside the scope of this article). Managers tend to be kind of strategic dunces, though, so I’m not sure if I see this happening. If I were in charge of anything I would supply my manager with a chart of base/out states that lists the team’s best pinch-hitters in each situation. I’m not in charge, though, and even if I were I would probably be ignored.

I am in charge of this article, however, which means that I can bring it to a close. I’ll note that another valid way to do this study would be to create WPA-based weights rather than run-expectancy weights. There’s a lot more noise in WPA, but it could still create some interesting conclusions. I reckon the conclusion would be pretty much the same though — what makes a good pinch-hitter? Well, a good hitter makes for a good pinch-hitter. And a little power doesn’t hurt.


The Untold Story of Roberto Clemente’s Plane Crash Litigation

The Fatal Crash

Roberto Clemente was both a remarkable ballplayer and genuine folk hero. As an outfielder for the Pittsburgh Pirates, Clemente was a perennial All-Star and Gold Glove recipient. He won four batting titles, was the National League’s MVP in 1966 and the World Series MVP in 1971.


On September 30, 1972, Clemente stroked a double off Mets pitcher Jon Matlack to reach the 3,000-hit milestone in his final regular-season at-bat. After closing out the 1972 season with a playoff series loss to the Cincinnati Reds, Clemente traveled to Nicaragua in November to manage the Puerto Rican All-Stars in the Amateur Baseball World Series.

A 6.2 magnitude earthquake rocked Managua, Nicaragua on December 23, 1972. Some 5,000 people lost their lives, another 20,000 were injured and over 250,000 were displaced from their homes. Swayed by the time he had just spent in Nicaragua, Clemente coordinated an extraordinary effort to provide emergency supplies to the victims. Even after three airplane loads had been sent to Managua, there were still supplies that needed to be flown to Nicaragua.

Clemente was approached by Arthur Rivera, who offered the services of his DC-7 cargo plane to airlift the remaining relief supplies. Clemente inspected the plane and agreed to pay Rivera $4,000 (approximately $22,000 today) upon his return to Puerto Rico.

By law, Rivera was to provide a pilot, co-pilot and flight engineer. Rivera hired a pilot, Jerry Hill, and appointed himself as the co-pilot, despite his lack of certification to co-pilot the DC-7. He was unable to hire a flight engineer for the flight.

Unbeknownst to Clemente, the DC-7 had been involved in an accident on December 2, 1972 when a loss of hydraulic power caused the aircraft to leave the taxiway and crash into a water-filled concrete ditch. After the incident, an airworthiness inspector with the Federal Aviation Administration (F.A.A.) questioned Rivera about intended repairs to the plane. Mr. Rivera confirmed that he intended to repair the plane and the inspector took no further action.

Thereafter, the damaged propellers were replaced and the engines were run for three hours, showing no signs of malfunction. The airplane was returned to service by the repairmen; however, no inspection was conducted by the F.A.A. prior to the ill-fated flight. In fact, the plane had not even been flown since its arrival from Miami in September, 1972.

The loading of Rivera’s DC-7 was completed on December 31, 1972. Clemente decided to personally accompany this flight after having been advised that their prior shipments may not have reached the intended recipients due to governmental interference with the relief efforts.

The flight plan was filed with the F.A.A. on the morning of December 31st. At approximately 9:11 p.m., the flight taxied down Runway 7 and was cleared for takeoff at 9:20 p.m. The weather was good and visibility was at 10 miles.

Upon takeoff, the plane gained very little altitude and at 9:23 p.m. the tower received a message that the plane was turning back around. Unfortunately, the aircraft did not make it, crashing into the Atlantic Ocean about one and a half miles from shore. Everyone aboard the plane, including Roberto Clemente, perished in the crash. He was just 38 years old.

The post-occurrence investigation revealed that there was an engine failure before the crash and that the plane was nearly 4200 pounds over the maximum allowable gross takeoff weight.

Resulting Lawsuit

Vera Zabala Clemente and the next of kin of the other passengers filed a lawsuit against the United States of America under the Federal Tort Claims Act, alleging that F.A.A. employees were negligent and responsible for the resulting crash. (The Federal Tort Claims Act is a limited waiver of sovereign immunity that authorizes parties to sue the United States for tortious conduct.)

Factually, the plaintiffs’ claim was based on the premise that the F.A.A. owed a duty to promote flight safety which was breached by its failure to revoke the airworthiness certificate of the DC-7 after the December 2, 1972 accident; monitor the repair process; and otherwise discover that the plane was not airworthy, had an improper registration number, was not properly weighted and balanced and did not have a qualified crew. It was the plaintiffs’ contention that had the F.A.A. acted in accordance with its own internal procedures (Order SO8430.20C, “Continuous Surveillance of Large and Turbined Powered Aircraft”), the aircraft would have been denied flight clearance, the deceased passengers would have been advised of the deficiencies and the plane crash would never have happened.

The United States countered that the F.A.A. did not have any legal duty towards the decedents to “discover or anticipate acts which might result in a violation of Federal Regulations.” They also claimed that there was no connection between any duty and the fatal crash.

Who won?

The trial court found for Vera Zabala Clemente and the next of kin of the other deceased passengers on the issue of negligence.

Why?

The trial court was convinced by the F.A.A. investigative report that the cause of the crash was “overboosting” of the No. 2 engine at takeoff and the fact that the plane was overloaded by more than two tons. Because the flight crew was inadequate, the situation was such that “…for all practical purposes the Captain was flying solo in emergency conditions.”

Section 6 of Order SO8430.20C called for “continuous surveillance of large and turbine powered aircraft to determine noncompliance of Federal Aviation Regulations.” Furthermore, a “ramp inspection” was required to determine that the crew and operator were in compliance with the safety requirements regarding the airworthiness of the aircraft as to the weight, balance and pilot qualifications. Any indication of an “illegal” flight crew was to be made known to the crew and persons chartering the service. Finally, discovery of such noncompliance was to be given the highest priority, second only to accident investigation.

The trial court found that these provisions of the Continuous Surveillance of Large and Turbined Powered Aircraft order were applicable to Roberto Clemente’s chartered flight and that the decedents were within the class of people sought to be protected under the order. If the required ramp inspection had been completed, the lack of a proper crew and the overloading would have been discovered, Clemente would have been notified and, presumably, he would not have agreed to board the plane and would have avoided his untimely death.

The order was held to be mandatory in nature and because the F.A.A. violated its own orders, a failure to exercise due care was evident. Accordingly, the F.A.A.’s failure to inspect and ground the plane “contributed to the death of the…decedents.”

The appeal

The United States appealed the decision claiming that the trial court erred in its finding of a duty on the part of the Federal Aviation Administration. The critical question the appellate court was asked to address was whether the F.A.A. staff in Puerto Rico had a duty to inspect the subject DC-7 and warn the decedents of “irregularities.”

The appellate court acknowledged that the Federal Aviation Act was enacted to promote air safety but that this “hardly creates a legal duty to provide a particular class of passengers particular protective measures.” Further, the issuance of the Continuous Surveillance of Large and Turbined Powered Aircraft order was done gratuitously and did not create a duty to the decedents or any other passengers.

The court ultimately held that the order created a duty of the local inspectors to “perform their jobs in a certain way as directed by their superiors.” The failure to comply with this order, however, was grounds for internal discipline but did not create a cause of action based on negligent conduct against the F.A.A.

It is well established that the pilot in command has the responsibility to determine that an airplane is safe for flight. There was nothing in this F.A.A. directive that shifted this responsibility to the federal government.

Further, the court found that the failure of the F.A.A. to inspect the plane did not add to the risk of injury to the passengers and there was no evidence that any of the deceased had relied on the F.A.A. to inspect the aircraft prior to takeoff or even knew about Order SO8430.20C.

Who won the appeal?

The United States. The finding of negligence on the part of the Federal Aviation Administration was reversed.

In its opinion, the appellate court concluded, “The passengers on this ill fated flight were acting for the highest of humanitarian motives at the time of the tragic crash. It would certainly be appropriate for a society to honor such conduct by taking those measures necessary to see to it that the families of the victims are adequately provided for in the future. However, making those kinds of decisions is beyond the scope of judicial power and authority. We are bound to apply the law and that duty requires the reversal of the district court’s judgment in favor of the plaintiffs.”

The plaintiffs’ request that the case be heard by the United States Supreme Court was denied.


Billy Hamilton: 2014 Leadoff Hitter?

The signing of Shin-Soo Choo gives the Rangers a player with strong on-base skills, solid power, and decent corner-outfield defense. The signing also left a gaping hole in the outfield for the Reds. Choo was one of three Reds starters that got on base at an above-average clip. He was easily the first- or second-best offensive player for the Reds in 2013. While he was miscast in center field, Choo brought a great deal of value to a team that needed his particular offensive skill set.

Walt Jocketty has stated that Billy Hamilton is the new center fielder and will likely bat leadoff for the 2014 Reds. Hamilton starting in center field should come as no surprise as the Reds do not have many other options. The wisdom of Hamilton batting leadoff is at least up for debate. You can easily go look at his projections for 2014 and draw your own conclusions, but I would like to at least provide some context.

Every baseball fan knows about Hamilton’s speed. He is ferociously fast. He stole 155 bases in the minors in 2012 and successfully stole 13 bases in 14 attempts in limited major league action in 2013. Speed is nice, but it is not close to the most important skill for a player in the leadoff spot. Reds fans may know this best of all from watching Corey Patterson, Willy Taveras, and Drew Stubbs flounder at the plate. Those players were wickedly fast, but as the saying goes, you can’t steal first base. None of them had the on-base skills to bat leadoff, but they found themselves there anyway because of their speed. To avoid joining this list of failed Reds leadoff hitters, Billy Hamilton will need to get on base enough to justify being at the top of the order. That is the obvious question: can Hamilton get on base often enough to use that blinding speed of his to turn singles into doubles and doubles into triples? There are signs that he can, but others suggest that he shouldn’t bat leadoff in 2014.

The 2012 season launched Hamilton into top-20 prospect territory. He obviously broke the stolen-base record, but he also showed some ability with the bat. In 132 games between High-A and AA, Hamilton hit .311/.410/.420. He had 14 triples. His walk rate rose dramatically from the year before. Hamilton looked like a perfect leadoff hitter through two levels.

Then 2013 and AAA came. Hamilton slashed .256/.308/.343. His walk percentage dropped from 16.9% in 50 games in AA (small sample size noted) to 6.9% in 123 games in AAA. It was arguably his worst season as a professional. He looked completely overmatched at times and questions about his ability to get on base resurfaced.

So which is the real Billy Hamilton, and what does it mean for 2014? Hamilton’s ceiling is likely between his 2012 and 2013 minor league performance. In five seasons as a minor leaguer, Hamilton slashed .280/.350/.378. Coupled with his speed and potentially excellent defense in center field, that slash line could make him an All-Star-caliber player. The hope is that 2013 was a product of learning a new position and a significant drop in BABIP from over .370 to .310.

Still, Hamilton was very inconsistent at the plate in 2013 and didn’t prove he could hit AAA pitching for an extended period of time. The major leagues are an obvious step up in competition, and it would be surprising to see him match his .280/.350/.378 minor league career slash line in 2014. Steamer projects him to have a .305 OBP, and after last year, it is easy to see why.

While it is very possible Hamilton could surpass gloomy projections, the Reds probably shouldn’t risk it in 2014, at least at first. It makes much more sense to see how Hamilton adjusts to major-league pitching in a less important part of the lineup (7th for instance). He would get fewer at bats and would not be so heavily scrutinized if he struggled adjusting to the level. If he performs well, he can always move up in the lineup, but the Reds likely have better leadoff options than Hamilton to begin the year.

If Hamilton plays excellent defense in center field and has a good year on the bases, he will provide solid value for the Reds. To fill Choo’s shoes, he will have to hit closer to his career minor league mark as opposed to his 2013 numbers. In 2014, that may be difficult.


The Cascading Bias of ERA

There are so many problems with ERA that it’s unbelievable. I’m not going to sit here and tell you what’s wrong with ERA, though, because you’re probably smart. But there’s a problem with ERA, and it’s a problem that transcends ERA. It’s a problem that trickles down through FIP, xFIP, SIERA, TIPS, etc. etc. name your favorite stat, etc., and it’s something I don’t see talked about much.

All of our advanced pitcher metrics are trying to predict or estimate ERA. They’re trying to figure out what a pitcher’s ERA should be, and herein lies the problem: they could be exactly right and still be a little incorrect, thanks to one little assumption.

This assumption–that pitchers have no control over whether or not the fielders behind them make errors–seems easy to make. Like most assumptions, however, this one is subtly incorrect. Thankfully, the reason is pretty simple. Ground balls are pretty hard to field without making an error, and fly balls aren’t. And the difficulty gap is pretty huge.

How big? Well in 2013 there were precisely 58,388 ground balls, 1,344 of which resulted in errors. On the other hand a mere 98 out of 39,328 fly balls resulted in errors. That means that 2.3% of ground balls result in errors while a tiny 0.25% of fly balls do. It’s time to stop pretending that this gap doesn’t exist, because it does.
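The arithmetic behind those percentages is easy to reproduce. The `expected_errors` helper below is a hypothetical add-on, not a metric from this article: it simply applies the two league-wide rates to a pitcher's batted-ball counts, and the counts in the comparison are invented.

```python
# 2013 league totals quoted in the text
GB, GB_ERRORS = 58_388, 1_344
FB, FB_ERRORS = 39_328, 98

gb_rate = GB_ERRORS / GB    # roughly 2.3% of ground balls
fb_rate = FB_ERRORS / FB    # roughly 0.25% of fly balls
ratio = gb_rate / fb_rate   # ground balls draw errors about nine times as often

def expected_errors(ground_balls, fly_balls):
    """Errors expected behind a pitcher if fielders erred at league rates."""
    return ground_balls * gb_rate + fly_balls * fb_rate

# Two invented pitchers with 500 balls in play each
groundballer = expected_errors(300, 200)
flyballer = expected_errors(200, 300)
```

Even with identical contact totals, the ground-ball-heavy profile comes out with a couple more expected errors per 500 balls in play.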

So now that we know this, what does it mean? Well it means this: ground-ball pitchers will have an ERA that suggests they are better than their actual value, while fly-ball pitchers have the opposite effect. Pitchers who allow contact, additionally, are worse off because every time they allow contact they put pressure on their defense. They’re giving themselves a chance to stockpile unearned runs which nobody will count against them if they’re only looking at ERA derivatives. When it comes to winning baseball games, however, earned runs don’t matter. Runs matter.

I am going to call this the “pressure on the defense” effect, which will cause some pitchers to be more prone to unearned runs than other pitchers. How big is this effect? Well, not huge. The gap between the best pitcher and worst pitcher in the league is roughly three runs over the course of the season. But keep in mind that three runs is about a third of a win, and a third of a win is worth about $2 million. We’re not discussing mere minutiae here.

In order to better quantify this effect I have developed the xUR/180 metric, which will estimate how many unearned runs should have taken place behind each pitcher with an average defense. Below is a table of all qualified starting pitchers from 2013 ranked according to this metric. I have also included how many unearned runs they actually allowed in 2013, scaled to 180 innings for comparative purposes.

# Name xUR/180 UR/180
1 Joe Saunders 7.24 9.84
2 Jeff Locke 7.11 4.33
3 Wily Peralta 6.97 17.7
4 Edwin Jackson 6.88 13.36
5 Edinson Volquez 6.81 6.35
6 Kyle Kendrick 6.77 8.9
7 Justin Masterson 6.66 0.93
8 Doug Fister 6.58 5.19
9 Wade Miley 6.57 7.12
10 Rick Porcello 6.51 2.03
11 Jerome Williams 6.47 7.45
12 Jorge de la Rosa 6.43 5.38
13 Yovani Gallardo 6.42 7.99
14 A.J. Burnett 6.35 8.48
15 Scott Feldman 6.32 8.94
16 Mike Leake 6.26 5.62
17 Andrew Cashner 6.25 8.23
18 Felix Doubront 6.22 6.66
19 Jhoulys Chacin 6.13 5.48
20 Kevin Correia 6.13 2.92
21 Jeremy Guthrie 6.13 3.41
22 Mark Buehrle 6.11 5.31
23 Andy Pettitte 6.05 7.78
24 Hyun-Jin Ryu 6.01 2.81
25 Jeff Samardzija 6.0 5.07
26 C.J. Wilson 5.93 11.03
27 CC Sabathia 5.9 8.53
28 Jon Lester 5.84 4.22
29 Ryan Dempster 5.8 10.52
30 Tim Lincecum 5.77 5.48
31 Hiroki Kuroda 5.72 4.48
32 Bud Norris 5.72 7.15
33 Jordan Zimmermann 5.69 3.38
34 Patrick Corbin 5.68 1.73
35 Dillon Gee 5.67 3.62
36 Ervin Santana 5.67 7.68
37 Kris Medlen 5.66 8.22
38 Bronson Arroyo 5.63 2.67
39 Stephen Strasburg 5.62 9.84
40 Mat Latos 5.62 6.85
41 Ubaldo Jimenez 5.61 7.9
42 Jarrod Parker 5.61 4.57
43 John Lackey 5.6 5.71
44 Gio Gonzalez 5.55 5.53
45 Lance Lynn 5.55 2.68
46 Eric Stults 5.5 7.09
47 Felix Hernandez 5.49 4.41
48 Zack Greinke 5.48 2.03
49 Hisashi Iwakuma 5.47 3.28
50 Jose Quintana 5.46 4.5
51 Ian Kennedy 5.46 8.95
52 Ricky Nolasco 5.45 7.23
53 R.A. Dickey 5.44 6.42
54 Jeremy Hellickson 5.4 3.1
55 Homer Bailey 5.38 3.44
56 Miguel Gonzalez 5.36 9.47
57 Madison Bumgarner 5.34 5.37
58 James Shields 5.32 1.58
59 Adam Wainwright 5.32 2.99
60 Bartolo Colon 5.32 3.79
61 Derek Holland 5.3 7.61
62 Kyle Lohse 5.26 3.63
63 Cole Hamels 5.18 4.91
64 Anibal Sanchez 5.18 3.96
65 David Price 5.18 8.7
66 Chris Sale 5.14 6.73
67 Justin Verlander 5.06 8.25
68 Chris Tillman 5.04 1.75
69 Jose Fernandez 5.03 5.23
70 Shelby Miller 4.98 6.24
71 Matt Cain 4.97 2.93
72 Clayton Kershaw 4.9 5.34
73 Julio Teheran 4.9 2.92
74 Matt Harvey 4.86 1.01
75 Cliff Lee 4.79 4.86
76 Travis Wood 4.78 3.6
77 Dan Haren 4.78 4.26
78 Yu Darvish 4.53 1.72
79 A.J. Griffin 4.46 5.4
80 Mike Minor 4.46 5.29
81 Max Scherzer 4.15 3.36

 

Some notes:

  • Ground balls are still good; they’re just not quite as good as ERA-based metrics suggest.
  • A combination of ground balls and contact leads to more unearned runs. The pitchers at the top of the board demonstrate this.
  • A combination of strikeouts and fly balls will tend to limit the impact of unearned runs, as demonstrated by the bottom of the board.
  • Errors that occur on fly balls tend to be more costly than errors on ground balls. This metric accounts for that gap, but the low likelihood of fly-ball errors makes this bullet point’s effect relatively negligible.
  • Line drives are similar to fly balls in terms of error rate, but line-drive errors tend to be less costly than fly-ball errors.

I’m sure there is more to be gleaned, but the point is this: we need to stop trying to predict ERA, because ERA is not a pure value stat. We should be trying to figure out how many runs a pitcher should/should have given up, because that’s what matters. Runs matter, and who cares if they’re unearned? They’re kind of the pitcher’s fault, anyways.


What Is an Ace? (2013)

After the 2011 season I asked, and attempted to answer, the question, “what is an ace”?

It’s time to do that again.

Kershaw

Ok. While Kershaw is the aciest of aces right now, that’s not really the answer that we’re looking for.

I certainly don’t claim to be the first person to do something like this, nor am I the most rigorous, but I think it’s good to take a look at things like this every now and then just to reset our baselines.

What I did was to take the average of every starter’s fWAR and RA-9 WAR. Then I used that number to group pitchers into groups of (roughly) 30 — 30 aces, 30 number 2’s, etc. Then, I looked at the average performance of the pitchers in each group.
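
For anyone who wants to repeat the grouping, here is a minimal sketch of the procedure just described: average the two WAR figures, rank, and cut the ranking into blocks. The `Starter` type, the sample values, and the exact cut points are assumptions for illustration, since the groups above are only described as roughly 30 each.

```python
from typing import NamedTuple


class Starter(NamedTuple):
    name: str
    fwar: float      # FIP-based WAR
    ra9_war: float   # runs-allowed-based WAR


def blended_war(p: Starter) -> float:
    """Simple average of the two WAR flavors."""
    return (p.fwar + p.ra9_war) / 2


def assign_tiers(starters: list[Starter], tier_size: int = 30) -> dict[str, int]:
    """Rank by blended WAR (best first) and cut the ranking into tiers.

    Tier 1 is the 'aces', tier 2 the number 2's, and so on.
    """
    ranked = sorted(starters, key=blended_war, reverse=True)
    return {p.name: rank // tier_size + 1 for rank, p in enumerate(ranked)}


# Made-up values, with a tier size of 2 so the tiny sample splits visibly.
sample = [
    Starter("Pitcher A", 6.5, 7.1),
    Starter("Pitcher B", 2.0, 1.4),
    Starter("Pitcher C", 4.2, 3.0),
    Starter("Pitcher D", 0.5, 1.1),
]
tiers = assign_tiers(sample, tier_size=2)
print(tiers)
```

With real data you would leave `tier_size` at 30 and then average each tier's rate stats to get one line per group.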

Here’s what I found:

There are a couple of interesting things to note.

One is that the best 30 pitchers in baseball are, far and away, the best group. They strike out the most hitters, they walk the fewest hitters, they give up the fewest home runs, they have the lowest BABIP, they’re the best. That’s not surprising when guys like the above-pictured Kershaw, Cliff Lee, Max Scherzer, Justin Verlander, Matt Harvey and Yu Darvish are in the ranks.

The second interesting thing is how similar the #3, #4 and #5 groups are in terms of performance. Look:

#3 18.2% K, 7.2% BB, 3.85 ERA, 4.06 FIP, 4.04 xFIP, 4.13 SIERA
#4 18.7% K, 8.2% BB, 3.89 ERA, 3.86 FIP, 3.96 xFIP, 4.09 SIERA
#5 17.4% K, 6.9% BB, 4.26 ERA, 4.09 FIP, 4.02 xFIP, 4.12 SIERA

In many ways (strikeouts, FIP, xFIP and SIERA), #4 starters outperformed #3 starters; only their walk rate and a marginally higher ERA say otherwise. Well, that and the number of starts and innings. Number-three starters made about seven more starts and pitched almost 50 more innings than #4 starters. Similarly, #5 starters were a little worse than both #3 and #4 starters, but what really limited them from producing value was that they made 12 fewer starts and pitched half as many innings as #3 starters.

The third point is similar to the above. Starters not in the top five accounted for more starts and more innings than the best pitchers in baseball. That makes sense when you stop to think about it (there are more bad pitchers than elite ones), but we don’t think about just how important it is for the other starters to make their starts so these guys don’t have to.

As I mentioned when I first did this little exercise after the 2010 season:

Next time your team signs a pitcher with a 10 – 8 record and 3.99 ERA in 160 innings realize just what you are getting. One of the top 100 pitchers in the league.

The numbers are a little different now — now the average #3 is 10 – 9 with a 3.85 ERA in 158 innings — but the point remains the same: the average baseball fan vastly underrates pitcher performance.


Another Highly Unimportant Stat: Pitcher Craftiness

In this post on measuring a player’s scrappiness, commenter Eric Garcia said “Next up, measuring a pitchers’ craftiness.” I liked this idea and thought I would give it a shot. Of course, the first problem is deciding what makes a pitcher “crafty”. Eric Garcia gave his suggestions and we will look at them eventually. I, however, thought about pitchers that came to my mind when the word “crafty” is used and looked at what they had in common. Generally, they do not have an overpowering fastball and don’t throw it that often. They usually don’t have that many strikeouts, but also don’t walk that many, so they still have a decent WHIP. The perception is that they are good at pitching out of jams, either by inducing ground-ball double plays or popups.

There were 81 pitchers that qualified for the ERA title in 2013. I found the average of this group in four categories: fastball velocity, strikeout percentage, WHIP, and LOB%. For each player I calculated how many standard deviations from the mean they were in each of these categories. I then summed these up (using the negatives for fastball velocity, strikeout percentage, and WHIP). Though “crafty” often seems to be used as a synonym for “left-handed”, I feel that you should be able to be crafty with either hand, so I did not use handedness at all. I considered using fastball percentage instead of velocity, but felt velocity better captured what we are looking for. Pitchers I think of as crafty seem to often outperform their FIP, so I considered using ERA-FIP, but felt that since the outperformance is often the result of a low strikeout rate and generally good WHIP, that it was already taken into account. The numbers are not league adjusted, so National League pitchers get a slight advantage.  So, using these criteria, here are the 2013 leaders in craftiness:

Name Craftiness Score
Bronson Arroyo 4.70
R.A. Dickey 4.44
Hisashi Iwakuma 4.03
Bartolo Colon 3.80
Kyle Lohse 3.65
Mark Buehrle 3.38
Travis Wood 3.20
Mike Leake 2.55
A.J. Griffin 2.50
Dillon Gee 2.38
Zack Greinke 2.25
Eric Stults 2.03
Kris Medlen 1.94
Clayton Kershaw 1.89
Hyun-Jin Ryu 1.86
Jeremy Guthrie 1.68
Julio Teheran 1.60
Kevin Correia 1.43
Hiroki Kuroda 1.39
Chris Tillman 1.30
Cliff Lee 1.26
Ervin Santana 1.26
Mike Minor 1.24
Jhoulys Chacin 1.22
Andy Pettitte 1.11
Doug Fister 1.04
John Lackey 0.94
Jose Quintana 0.83
Jarrod Parker 0.79
James Shields 0.77
Miguel Gonzalez 0.73
Adam Wainwright 0.72
Madison Bumgarner 0.68
Wade Miley 0.69
Scott Feldman 0.64
Jorge de la Rosa 0.55
Jeff Locke 0.47
Patrick Corbin 0.44
Jordan Zimmermann 0.35
Ricky Nolasco 0.01
Dan Haren -0.03
Matt Cain -0.13
Shelby Miller -0.23
Yu Darvish -0.32
Jose Fernandez -0.35
Chris Sale -0.39
Cole Hamels -0.47
Mat Latos -0.50
Andrew Cashner -0.57
Justin Masterson -0.55
Kyle Kendrick -0.64
Felix Hernandez -0.77
Anibal Sanchez -0.86
Matt Harvey -0.94
C.J. Wilson -0.89
Jon Lester -0.93
Jerome Williams -1.00
Max Scherzer -1.05
David Price -1.05
Rick Porcello -1.04
Ryan Dempster -1.09
Yovani Gallardo -1.10
Gio Gonzalez -1.16
Homer Bailey -1.32
Joe Saunders -1.28
Derek Holland -1.38
Ubaldo Jimenez -1.42
Jeremy Hellickson -1.83
Felix Doubront -1.85
Tim Lincecum -1.88
Ian Kennedy -1.94
Justin Verlander -2.12
Stephen Strasburg -2.17
Bud Norris -2.19
CC Sabathia -2.20
Lance Lynn -2.26
A.J. Burnett -2.35
Jeff Samardzija -3.60
Wily Peralta -3.64
Edwin Jackson -4.26
Edinson Volquez -4.84
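
The scoring behind the table above can be sketched in a few lines. This is only an illustration of the described method (z-scores in four categories, summed, with three of them negated). The three stat lines are made up, and the use of the population standard deviation is an assumption, since the write-up does not say which form was used.

```python
from statistics import mean, pstdev

# Made-up stat lines for illustration: (fastball mph, K%, WHIP, LOB%).
pitchers = {
    "Soft Tosser": (87.0, 15.0, 1.15, 78.0),
    "Average Arm": (91.0, 20.0, 1.25, 73.0),
    "Flamethrower": (95.0, 25.0, 1.35, 68.0),
}

# Sign applied to each z-score: low velocity, low K%, and low WHIP add to
# craftiness, while a high LOB% adds to it.
SIGNS = (-1, -1, -1, +1)


def craftiness_scores(stats: dict[str, tuple[float, ...]]) -> dict[str, float]:
    """Sum of signed z-scores across the four categories for each pitcher."""
    columns = list(zip(*stats.values()))
    means = [mean(col) for col in columns]
    # Assumption: population standard deviation over the qualified group.
    sds = [pstdev(col) for col in columns]
    return {
        name: sum(sign * (x - m) / sd
                  for sign, x, m, sd in zip(SIGNS, row, means, sds))
        for name, row in stats.items()
    }


scores = craftiness_scores(pitchers)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

Dropping or swapping a category only means changing the tuple layout and `SIGNS`, which is handy for trying alternative definitions of "crafty."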

Considering the model used here, Bronson Arroyo being on top is not really a surprise (though I really thought Dickey would probably wind up on top and he would have easily if I had used fastball percentage instead of fastball velocity).  Now some people might protest that a low strikeout rate should not be required.  They would argue that it is certainly possible that a pitcher might still be considered crafty and have a fair number of strikeouts.  If we remove the strikeout percentage from the stat, we get the following:

Name Craftiness Score
Hisashi Iwakuma 4.34
R.A. Dickey 3.99
Bronson Arroyo 3.40
Clayton Kershaw 3.28
Yu Darvish 2.88
A.J. Griffin 2.63
Bartolo Colon 2.56
Travis Wood 2.51
Cliff Lee 2.53
Kyle Lohse 2.47
Zack Greinke 2.37
Mark Buehrle 2.22
Julio Teheran 2.06
Madison Bumgarner 1.83
Hyun-Jin Ryu 1.73
Mike Minor 1.71
Kris Medlen 1.67
Dillon Gee 1.53
Chris Tillman 1.56
Jose Fernandez 1.52
Adam Wainwright 1.39
Mike Leake 1.29
Chris Sale 1.10
Max Scherzer 1.09
John Lackey 1.06
Matt Harvey 0.98
James Shields 0.91
Hiroki Kuroda 0.88
Ervin Santana 0.89
Anibal Sanchez 0.87
Eric Stults 0.74
Felix Hernandez 0.75
Jose Quintana 0.70
Patrick Corbin 0.57
Shelby Miller 0.59
Doug Fister 0.46
Justin Masterson 0.46
Dan Haren 0.14
Andy Pettitte 0.09
Cole Hamels 0.03
Jhoulys Chacin -0.02
Matt Cain -0.01
Wade Miley -0.03
Jordan Zimmermann -0.04
Scott Feldman -0.11
Miguel Gonzalez -0.11
Ricky Nolasco -0.13
Jarrod Parker -0.18
Jeff Locke -0.21
Ubaldo Jimenez -0.25
Mat Latos -0.26
Jeremy Guthrie -0.30
Gio Gonzalez -0.37
Kevin Correia -0.45
Homer Bailey -0.51
Jorge de la Rosa -0.60
Stephen Strasburg -0.67
C.J. Wilson -0.83
A.J. Burnett -0.90
Ryan Dempster -1.01
David Price -1.01
Jon Lester -1.09
Andrew Cashner -1.08
Derek Holland -1.16
Tim Lincecum -1.24
Rick Porcello -1.30
Justin Verlander -1.31
Yovani Gallardo -1.55
Lance Lynn -1.57
Ian Kennedy -1.93
Felix Doubront -2.04
Kyle Kendrick -2.32
Jeremy Hellickson -2.37
Jerome Williams -2.41
CC Sabathia -2.49
Bud Norris -2.53
Jeff Samardzija -2.82
Joe Saunders -3.14
Wily Peralta -4.70
Edwin Jackson -5.03
Edinson Volquez -5.40

 

When the poster Eric Garcia suggested this, his idea of a crafty pitcher was someone with a low velocity, high ERA, and a decent number of wins.   If we use those criteria and the same methodology, we come up with the following list:

Name Craftiness Score
R.A. Dickey 6.809175606
Mark Buehrle 4.7547704381
Bronson Arroyo 3.2944617169
Joe Saunders 2.9423646195
Jeremy Hellickson 2.6685500615
CC Sabathia 2.4966613422
Eric Stults 2.7180128884
A.J. Griffin 2.4076452673
Doug Fister 2.2427676408
Dan Haren 2.1691672291
Adam Wainwright 1.7071128009
Kyle Kendrick 1.7118241134
C.J. Wilson 1.5737716721
Jeremy Guthrie 1.4387583548
Chris Tillman 1.4439760517
Rick Porcello 1.459202682
Edinson Volquez 1.2576799698
Bartolo Colon 1.6239711996
Jorge de la Rosa 1.4181128961
Max Scherzer 1.1306105901
Kris Medlen 1.4880878755
Jhoulys Chacin 1.4133629431
Yovani Gallardo 1.1947402807
Lance Lynn 1.0099538962
Felix Doubront 1.1495573185
Scott Feldman 1.1974157822
Ricky Nolasco 1.1042531489
Dillon Gee 1.1994224084
Ryan Dempster 1.1677938881
Tim Lincecum 1.035870434
Andy Pettitte 1.1821092279
Mike Leake 1.05406572
Jordan Zimmermann 0.5825671124
Jon Lester 0.5408497347
Ian Kennedy 0.6776977315
Jarrod Parker 0.4623781624
Justin Masterson 0.3885353357
Hyun-Jin Ryu 0.4892567298
Mike Minor 0.3742320945
Hisashi Iwakuma 0.4643968873
Kevin Correia 0.2593286667
Patrick Corbin 0.0564390189
Kyle Lohse 0.312739265
Julio Teheran 0.0997486611
Cliff Lee 0.0886564906
Miguel Gonzalez -0.0925431019
Jerome Williams -0.256429514
Edwin Jackson -0.4285314635
Ubaldo Jimenez -0.2221254921
Bud Norris -0.5000332264
Jeff Locke -0.2451919386
Mat Latos -0.5647784102
Zack Greinke -0.4470782733
Wade Miley -0.5363196771
Travis Wood -0.3273302268
James Shields -0.705666201
Justin Verlander -0.8883247518
Shelby Miller -0.9631708917
Matt Cain -0.7250659316
Wily Peralta -1.1640247285
Hiroki Kuroda -0.7950288123
Madison Bumgarner -0.7855967316
John Lackey -0.9654585733
Felix Hernandez -1.0396358378
Jose Quintana -1.1617514899
Gio Gonzalez -1.3356468354
Yu Darvish -1.5340675857
Anibal Sanchez -1.598691562
Cole Hamels -1.4973933151
A.J. Burnett -1.7115883636
Clayton Kershaw -1.6983975443
Homer Bailey -1.9877439854
Jeff Samardzija -2.0853342328
Chris Sale -2.0119349525
Derek Holland -2.1558326826
Ervin Santana -2.0875298917
David Price -2.224336605
Andrew Cashner -3.1088119908
Jose Fernandez -3.8720417313
Stephen Strasburg -4.3734432885
Matt Harvey -5.3067683524

I doubt these numbers have any real value; they are presented here just for entertainment.  What do you think makes a pitcher crafty?  Let me know in the comments.


xHitting (Part 2): Improved Model, Now with 2013 Leaders/Laggards

Happy holidays, all.  It took me a while, but I finally have the second installment of xHitting ready.  First off, thank you to all those who read/commented on the first piece.  For those who didn’t get a chance to read it, the goal here is to devise luck-neutralized versions of popular hitter stats, like OPS or wOBA.  A main extension over existing xBABIP calculators is that this approach offers an empirical basis to recover slugging and ISO, by estimating each individual hit type.

I’ve returned today with an improved version of the model.  Highlights:

  • One more year of data (now 2010-2013)
  • Now includes batted-ball direction (all player-seasons with at least 100 PA)
  • FB distance now recorded for all player-seasons with at least 100 PA

(There’s no theoretical reason for the 100 PA cutoff, only that I was grabbing some of the new data by hand and couldn’t justify the time to fetch literally every single player.)

I have also relaxed the uniformity of peripherals used for each outcome.  At least one reader asked for this, and after thinking about it a while, I decided I agree more than I disagree.  The main advantage of imposing uniformity was that it ensures the predicted rates (when an outs model is also included) sum to 100%.  But it is true that there are certain interactions or non-linearities that are important for some outcomes, but not others.  Including these where they don’t fully belong has a cost to standard errors/precision, and to intuitive interpretation.  To ensure rates still sum to 100%, there’s no longer an explicit ‘outs’ model; outs are simply assumed to be the remainder.
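
As a small illustration of the "outs as the remainder" idea, here is a sketch under stated assumptions. The function name, the `other` bucket, and the sample rates are all hypothetical, and the denominator (per PA, per ball in play, and so on) is whatever the hit-type models use.

```python
def implied_out_rate(single: float, double: float, triple: float,
                     home_run: float, other: float = 0.0) -> float:
    """Outs as whatever share is left once the modeled outcomes are summed.

    `other` stands in for any non-out outcomes handled outside the hit
    models (walks, for example); treating them this way is an assumption.
    """
    modeled = single + double + triple + home_run + other
    if not 0.0 <= modeled <= 1.0:
        raise ValueError("modeled rates must sum to between 0 and 1")
    return 1.0 - modeled


# Made-up predicted rates for one player-season.
rates = {"single": 0.155, "double": 0.045, "triple": 0.005, "home_run": 0.030}
out_rate = implied_out_rate(**rates, other=0.090)
total = sum(rates.values()) + 0.090 + out_rate
print(f"implied out rate: {out_rate:.3f}, total: {total:.3f}")
```

Because outs are defined as the leftover, the total is 100% by construction, no matter which peripherals each hit-type model uses.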

For those curious, below I display regression results for each outcome and its respective peripherals.  You can otherwise skip below if these are not of direct interest.

(The sample includes all player-years with at least 100 plate appearances between the 2010 and 2013 MLB seasons.  Park factors denote outcome-specific park factors available on FanGraphs.  Robust standard errors, clustered by player, are in parentheses; *** p<0.01, ** p<0.05, * p<0.1)

The new variables seem to help, as each outcome is now modeled more accurately than before (by either R² or RMSE).  For comparison, here are the R² values of the original specification:

  • 0.367 for singles rate
  • 0.236 for doubles rate
  • 0.511 for triples rate
  • 0.631 for HR rate

Something else I noticed: for balls that stay “inside the fence,” both pull/opp and actual side of the field matter.  Consider singles: the ball needs to be thrown to 1st base (right side of infield) specifically.  Thus an otherwise-equivalent ball hit to the left side is not the same as one hit to the right side, since the defensive play is harder to make from the left side.  Similarly, hitting the ball to left field is less conducive for triples than hitting the ball to right field.

But hitting the ball to the left side as a lefty is not the same as hitting it there as a righty, since one group is “pulling” while the other group is “slapping.”  The direction x handedness interactions help account for this.

How well do the predicted rates do in forecasting?  For singles, doubles, and triples, the predicted rates do unambiguously better than realized rates in forecasting next season’s rates.  Things are a little less clear for home runs, which I will expand on below.

Although predicted HR rate shows a slight edge in Table 1, the pattern often reverses (for HR only) if you use a different sample restriction — say requiring 300 PA in the preceding season.  (For other outcomes, the qualitative pattern from Table 1 still holds even under alternative sample restrictions.)

So home runs appear to be a potential problem area.  What should we do when we need HR to compute xAVG/xSLG/xOPS/xWOBA, etc.?  Should we:

  1. Use predicted HR anyway?
  2. Use actual HR instead?
  3. Use some combo of actual and predicted HR?

Empirically there is a clear answer for which choice is best.  But before getting to that, let’s take a look at whether predicted home-run rate tells us anything at all in terms of regression.  That is, if you’ve been hitting HR’s above/below your “expected” rate, do you tend to regress toward the prediction?

The answer to this seems to be “yes,” evidenced by the negative coefficient on ‘lagged rate residual’ below.

So, although realized HR rate is sometimes a better standalone forecaster of future home runs, predicted HR rate is still highly useful in predicting regression.  Making use of both, it seems intuitively best to use some combo of actual and predicted HR rate for forecasting.

This does, in fact, seem to be the best option empirically.  And this is true whether your end outcome of interest is AVG, OBP, SLG, ISO, OPS, or wOBA.
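
To show the general shape of a "combo," here is a rough sketch that blends the two rates with a single weight. The weights are not given here, so the 0.5 default is purely a placeholder assumption, as are the function name and the sample rates; in practice the weight would be fit to data.

```python
def blended_hr_rate(actual: float, predicted: float,
                    weight_actual: float = 0.5) -> float:
    """Weighted average of realized and peripheral-based HR rates.

    weight_actual=0.5 is a placeholder, not an estimated value.
    """
    if not 0.0 <= weight_actual <= 1.0:
        raise ValueError("weight_actual must be between 0 and 1")
    return weight_actual * actual + (1.0 - weight_actual) * predicted


# Made-up player who homered more often than his peripherals suggest.
actual_rate, predicted_rate = 0.050, 0.036
forecast_input = blended_hr_rate(actual_rate, predicted_rate)
print(f"blended HR rate: {forecast_input:.3f}")
```

A blend like this always lands between the two inputs, which matches the regression pattern described above: players running ahead of their predicted rate get pulled partway back toward it.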

Observations:

  • (Option 1 = predicted HR only; Option 2 = actual HR only; Option 3 = combo)
  • Whether you use option 1, 2, or 3, xAVG and xOBP make better forecasters than actual past AVG or OBP
  • Option 1 does not do well for SLG, ISO, OPS, or wOBA
  • ^This was not the case in the previous article, but results to that point had sort of a funky sample, having recorded flyball distance only for a partial list of players
  • Option 2 “saves” things for xOPS and xWOBA, but still isn’t best for SLG or ISO
  • Option 3 makes the predicted version better for any of AVG, OBP, SLG, ISO, OPS, or wOBA

End takeaways:

  • The original premise that you can use “expected hitting,” estimated from peripherals, to remove luck effects and better predict future performance seems to be true; but you might need to make a slight HR adjustment.
  • The main reason I estimate each hit type individually is for the flexibility it offers in subsequent computations.  Whether you want xAVG, xOPS, xWOBA, etc., you have the component pieces that you need.  This would not be true if I estimated just a single xWOBA while other users preferred xOPS or xISO.
  • A major extension over existing xBABIP methods is that this offers an empirical basis to recover xSLG.  The previous piece actually provides more commentary on this.
  • Natural next steps are to test partial-season performance, and also whether projection systems like ZiPS can make use of the estimated luck residuals to become more accurate.
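
To illustrate the "component pieces" point, here is a minimal sketch that builds xAVG, xSLG and xISO from expected hit rates by type. It assumes the four rates are expressed per at-bat (rates on another denominator would need converting first), and the sample numbers are made up.

```python
def expected_slash(x1b: float, x2b: float, x3b: float,
                   xhr: float) -> dict[str, float]:
    """Build xAVG, xSLG and xISO from expected hits per at-bat by type."""
    xavg = x1b + x2b + x3b + xhr                  # any hit counts once
    xslg = x1b + 2 * x2b + 3 * x3b + 4 * xhr      # total bases per at-bat
    return {"xAVG": xavg, "xSLG": xslg, "xISO": xslg - xavg}


# Made-up expected rates for one hitter.
line = expected_slash(x1b=0.170, x2b=0.050, x3b=0.005, xhr=0.035)
for stat, value in line.items():
    print(f"{stat}: {value:.3f}")
```

The same four inputs, plus walks and the appropriate linear weights, would feed xOBP, xOPS or xWOBA, which is the flexibility a single-number model would give up.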

Finally, I promised to list the leading over- and underachievers for the 2013 season.  By xWOBA, they are as follows:

Overachievers (250+ PA)
Name 2013 wOBA 2013 xWOBA Difference
Jose Iglesias 0.327 0.259 0.068
Yasiel Puig 0.398 0.338 0.060
Colby Rasmus 0.365 0.315 0.050
Ryan Braun 0.370 0.321 0.049
Ryan Raburn 0.389 0.344 0.045
Mike Trout 0.423 0.379 0.044
Junior Lake 0.335 0.292 0.043
Matt Adams 0.365 0.323 0.042
Justin Maxwell 0.336 0.295 0.041
Chris Johnson 0.354 0.314 0.040

Underachievers (250+ PA)
Name 2013 wOBA 2013 xWOBA Difference
Kevin Frandsen 0.286 0.335 -0.049
Alcides Escobar 0.247 0.296 -0.049
Todd Helton 0.322 0.369 -0.047
Ryan Hanigan 0.252 0.296 -0.044
Darwin Barney 0.252 0.296 -0.044
Edwin Encarnacion 0.388 0.429 -0.041
Josh Rutledge 0.281 0.319 -0.038
Wilson Ramos 0.337 0.374 -0.037
Yuniesky Betancourt 0.257 0.294 -0.037
Brian Roberts 0.309 0.345 -0.036

Comments/suggestions?


A Different Look at the Hall of Fame Standard

I’m writing this as a response to Dave Cameron’s two articles on December 19 and 20 concerning the Hall of Fame.  While I completely understand the point Dave is/was trying to make in both pieces, I felt that his methodology was slightly flawed and perhaps deserved a fresh look.  As mentioned multiple times in the comments section on both articles, the data he used included players that were elected via the Veterans Committee.  Also included were players elected by the Negro Leagues Committee.  The purpose of this post is to look at players elected strictly by the BBWAA.  That list includes 112 inductees, the most recent of whom is Barry Larkin.

Using the data Dave listed in his follow-up article that limits the player pool to either 5000 PA or 2000 IP, we get the following results:

Year of Birth    “Eligible Players”    Elected Players    Percentage
<1900            258                   20                 7.8%
1900-1910        93                    16                 17.2%
1911-1920        66                    10                 15.2%
1921-1930        77                    8                  10.4%
1931-1940        99                    22                 22.2%
1941-1950        168                   15                 8.9%
1951-1960        147                   19                 12.9%
1961-1970        160                   2                  1.3%

If you combine all the data, you get 112 elected players out of 1068 “eligible” players.  That works out to 10.5% of the eligible population being inducted.  If we remove the 1961-1970 births, it’s 110 elected out of 908 eligible, or 12.1%.  If we try and bring the 1961-1970 total up to the overall average, that would mean ~17 inductees.  To reach pre-1961 levels, we need ~19 inductees.  To reach the lowest percentage of induction, we need a total of ~12 inductees.  To reach the highest percentage, we need a total of ~36 inductees.  I think it is safe to assume that, with the scrutiny given by Hall voters to the Steroid Era, the possibility of 36 inductees is nearly zero.
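
For anyone who wants to check the arithmetic, here is a short sketch that recomputes the induction rates and the implied targets for the 1961-1970 group from the table's counts. The variable names are just for illustration.

```python
# (eligible, elected by the BBWAA) per birth cohort, from the table above.
cohorts = {
    "<1900": (258, 20),
    "1900-1910": (93, 16),
    "1911-1920": (66, 10),
    "1921-1930": (77, 8),
    "1931-1940": (99, 22),
    "1941-1950": (168, 15),
    "1951-1960": (147, 19),
    "1961-1970": (160, 2),
}

latest = "1961-1970"
earlier = {k: v for k, v in cohorts.items() if k != latest}

# Pooled induction rates, with and without the most recent cohort.
overall_rate = (sum(e for _, e in cohorts.values())
                / sum(n for n, _ in cohorts.values()))
pre_1961_rate = (sum(e for _, e in earlier.values())
                 / sum(n for n, _ in earlier.values()))
# Lowest and highest single-cohort rates among the earlier cohorts.
earlier_rates = [elected / eligible for eligible, elected in earlier.values()]

eligible_latest = cohorts[latest][0]
targets = {
    "overall average": round(eligible_latest * overall_rate),
    "pre-1961 average": round(eligible_latest * pre_1961_rate),
    "lowest cohort": round(eligible_latest * min(earlier_rates)),
    "highest cohort": round(eligible_latest * max(earlier_rates)),
}

print(f"overall: {overall_rate:.1%}, pre-1961: {pre_1961_rate:.1%}")
print(targets)
```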

Dave also listed six players that he felt would surely get inducted in the coming years.  That list included Greg Maddux, Ken Griffey Jr., Randy Johnson, Mariano Rivera, Tom Glavine, and Craig Biggio.  If we include those six with the two already elected from the era (Barry Larkin and Roberto Alomar), the Hall would only need to elect four more members from the era to reach the current lowest standard.  I would think that John Smoltz has a pretty persuasive case for the Hall of Fame as well, being the only pitcher with 200 wins and 150 saves.  Also, Smoltz is one of the 16 members of the 3000 Strikeout Club.  That list includes 10 current Hall of Famers (all elected by BBWAA).  The other members not currently inducted include Smoltz, Roger Clemens, Randy Johnson, Curt Schilling, Pedro Martinez, and Greg Maddux.  Dave already included Johnson and Maddux on his list of “should be in” Hall of Famers.  Martinez was born in 1971, so he isn’t included in this discussion.  That leaves Smoltz, Schilling, and Clemens.  Clemens’ story doesn’t need to be rehashed at this point, and Schilling received 38.8% of the vote on his first ballot last year.  Also, simply looking at traditional stats, you have to think Frank Thomas has a strong case as well (521 HR, .301 BA).

Another point I wanted to bring up involves the ages of the players elected by the BBWAA.  The average age of a player elected is 49.7 years, with the median age being 48.  The data gets skewed a bit by pre-1900s players (as the first election wasn’t until 1936) and by extremely young inductees like Lou Gehrig, Roberto Clemente, and Sandy Koufax.  Gehrig was elected by a special ballot the year he retired after being diagnosed with ALS.  Clemente was elected a year after his death.  Both were elected before the five-year retirement period required for most players elapsed.  Koufax played only 12 seasons in MLB, a remarkably short time for a Hall of Famer.

If we use the ~50 year average age of election though, anyone born in 1964 or after still “has a decent chance” at election.  If we figure an even distribution of eligible players born each year between 1961-1970, that means 60% of eligible players, or 96, still can make a case.  That becomes 90 if we take out Maddux, Glavine, Griffey, Rivera, Johnson, and Biggio.  As I stated earlier, they only need to elect four more to reach previously seen levels of induction.  4/90 is only 4.4% needed.  That list of 90 players also doesn’t include still eligible players such as Don Mattingly, Roger Clemens, Edgar Martinez, Fred McGriff, and Mark McGwire.
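
The same back-of-the-envelope numbers can be laid out in a few lines. This sketch takes the 60% share and the even-distribution assumption stated above as given. The constant names are only for illustration.

```python
ELIGIBLE_1961_1970 = 160     # eligible players born 1961-1970 (from the table)
SHARE_STILL_IN_PLAY = 0.60   # share stated in the text, assuming even birth years
LIKELY_INDUCTEES = 6         # Maddux, Glavine, Griffey, Rivera, Johnson, Biggio
STILL_NEEDED = 4             # extra inductees needed to match the lowest cohort

still_in_play = round(ELIGIBLE_1961_1970 * SHARE_STILL_IN_PLAY)
remaining_pool = still_in_play - LIKELY_INDUCTEES
needed_rate = STILL_NEEDED / remaining_pool

print(f"{still_in_play} still in play, {remaining_pool} after the likely six")
print(f"induction rate needed from the rest: {needed_rate:.1%}")
```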

I’m not trying to take a stand on either side of the PED Hall of Fame discussion.  I’m just trying to point out that maybe the Hall of Fame isn’t being that much tougher on eligible players than it has been throughout history.  Just something to think about.