Archive for Research

“Stuff” and Father Time

The question of how pitchers age is paramount to players and front offices. “Stuff,” the colloquial term for raw talent throwing the baseball, can really be boiled down to velocity and movement (if we really wanted to oversimplify things). PITCHf/x gives us an opportunity to use big data to estimate “stuff” by looking at measurements of velocity and movement. We can use the copious data collected to estimate what “stuff” we can expect from pitchers as they pass the dreaded 30-years-old mark and beyond.

PITCHf/x reports movement in horizontal and vertical vectors. Horizontal movement (Hmov) is the right or left movement of the pitch compared to the expected trajectory without air resistance. A positive value is away from a right-handed batter. Vertical movement (Vmov) is the amount the ball moves up or down relative to the expected drop in a vacuum. A positive value means the ball dropped less than would be expected without effective spin.

It has been established that fastball velocity tends to decrease with age, but movement trends haven’t been looked at before. Might aging pitchers compensate for their decrease in velocity with an increase in movement? Or does time steal away effective spin as well?

Let’s find out.

Methods:

I collected all PITCHf/x data from every pitcher with at least 300 innings pitched from 2007 to 2018 (n=537). Each season’s data was assigned the pitcher’s age as of April 1st of that season. Velocity, horizontal movement, and vertical movement were averaged for each age and graphed. Horizontal movement for left-handed pitchers was mirrored so that right- and left-handed data could be analyzed together.
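As a rough illustration of the aging and averaging steps, here is a minimal Python sketch. The row layout (`birth_date`, `season`, `throws`, `velo`, `hmov`, `vmov`) and the sample values are assumptions made up for the example, not the actual PITCHf/x export, and the sketch averages over whatever rows it is given, which may differ from how the original averages were weighted.

```python
from collections import defaultdict
from datetime import date


def age_on_april_1(birth_date, season):
    """Return a pitcher's age in whole years on April 1st of `season`."""
    had_birthday = (4, 1) >= (birth_date.month, birth_date.day)
    return season - birth_date.year - (0 if had_birthday else 1)


def average_by_age(rows):
    """Average velocity, Hmov, and Vmov for each age.

    Horizontal movement is mirrored for left-handers so both hands
    share one sign convention.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for row in rows:
        age = age_on_april_1(row["birth_date"], row["season"])
        hmov = -row["hmov"] if row["throws"] == "L" else row["hmov"]
        acc = totals[age]
        acc[0] += row["velo"]
        acc[1] += hmov
        acc[2] += row["vmov"]
        acc[3] += 1
    return {
        age: (v / n, h / n, vm / n)
        for age, (v, h, vm, n) in sorted(totals.items())
    }


# Hypothetical rows, one per pitcher-season (values are invented).
rows = [
    {"birth_date": date(1988, 3, 19), "season": 2018, "throws": "L",
     "velo": 91.0, "hmov": 6.0, "vmov": 9.0},
    {"birth_date": date(1987, 7, 1), "season": 2018, "throws": "R",
     "velo": 93.0, "hmov": -8.0, "vmov": 10.0},
    {"birth_date": date(1990, 5, 5), "season": 2018, "throws": "R",
     "velo": 95.0, "hmov": -7.0, "vmov": 10.5},
]

by_age = average_by_age(rows)
for age, (velo, hmov, vmov) in by_age.items():
    print(age, velo, hmov, vmov)
```

Mirroring the sign before averaging keeps a left-hander's arm-side run from cancelling out a right-hander's.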

I then separated out the top starters by WAR (n=63), according to FanGraphs, from 2007 to 2018 and graphed their data separately.

Results/Discussion: Graphs are available by clicking on the links below, and raw data are available in tables at the end of this post. Read the rest of this entry »


Do Higher Signing Bonuses Help Players Advance?

A lot has been written over the past year about pay at the minor league level and attempts to fix things, and with good reason — it’s a pretty bad situation, and with fundamental decency in mind, it is certainly a good thing that it may be changing.

But alongside that discussion, I’ve been kind of curious about how changing minor league pay would actually change performance. In theory, paying players more could let them focus on baseball, translating to better performance. If that’s the case, it’s even possible that paying players more could actually “pay for itself” if the value of the extra wins players generate outweighs the costs of paying them more. In a perfect world, to test that, you could randomly pay some players more than others and see which group does better.

We don’t live in a perfect world, but we do live in one where signing bonuses are still pretty random. Yes, obviously players drafted higher receive higher bonuses on average, but there’s still pretty significant variation across the board, especially when you get into later rounds. In 2015, for example, there were 105 players drafted who had assigned “slot values” of between $130,000 and $200,000, and their bonuses were anywhere from $2,000 to $1,000,000. While in general higher bonuses should go to more talented prospects, it also stands to reason that two players drafted around the same time with around the same slot values should have around the same talent level and chances to make the majors.

With that in mind, I took a look at a couple of different ways of seeing how well players with much lower bonuses progressed. Using 2014-16 draft data from SBN, I had a set of all players drafted in the first 10 rounds along with their signing bonuses and slot values, which I then matched with FanGraphs’ data on player appearances at either the Triple-A or major league level from 2014 to 2019. In total, this left me with 922 players, of whom 319 (~35%) made a Triple-A or MLB appearance and 144 (~16%) made an MLB appearance. 153 (~17%) had a signing bonus of $50,000 or lower. I looked at two different ways to see how signing bonuses varied with advancement. Read the rest of this entry »
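One simple way to frame this kind of comparison is to split players at the $50,000 bonus mark mentioned above and compare how often each group reached Triple-A or the majors. The Python sketch below does only that. It is not either of the two methods used in the full post, and the field names and sample players are hypothetical.

```python
def advancement_rates(players, low_bonus_cutoff=50_000):
    """Share of players reaching Triple-A or MLB, split by bonus size."""
    counts = {"low_bonus": [0, 0], "higher_bonus": [0, 0]}  # [players, advanced]
    for player in players:
        key = "low_bonus" if player["bonus"] <= low_bonus_cutoff else "higher_bonus"
        counts[key][0] += 1
        counts[key][1] += 1 if player["reached_aaa_or_mlb"] else 0
    return {
        key: (advanced / total if total else None)
        for key, (total, advanced) in counts.items()
    }


# Hypothetical players (bonuses and outcomes are invented for illustration).
players = [
    {"bonus": 10_000, "reached_aaa_or_mlb": True},
    {"bonus": 50_000, "reached_aaa_or_mlb": False},
    {"bonus": 5_000, "reached_aaa_or_mlb": False},
    {"bonus": 150_000, "reached_aaa_or_mlb": True},
    {"bonus": 400_000, "reached_aaa_or_mlb": True},
    {"bonus": 175_000, "reached_aaa_or_mlb": False},
]

rates = advancement_rates(players)
print(rates)
```

A raw split like this ignores draft position, so a fuller comparison would match players with similar slot values before comparing bonuses.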


Does Rule 5 Draft Position Matter?

Orioles fans like myself don’t have a lot of hope. It’s hard to get excited about a starting lineup featuring Austin Wynns, Joey Rickard, and Rio Ruiz. The Orioles’ hope is for the future, and one thing that got some Orioles fans excited this winter was the selection of Richie Martin with the first pick of the Rule 5 draft. Fans can dream about their team unearthing a diamond in the Rule 5 draft, reminding each other that Jose Bautista was a Rule 5 draft pick once. But the likelihood of success remains extremely low. Still, the first shot at a Rule 5 draft pick seems to suggest a better chance at success. The question is, how much does Rule 5 draft position predict the player’s future career value or team contribution?

To answer this, I identified data from the 2003 to 2014 Rule 5 drafts. I included only players selected in the major league portion of the draft, a sample size of 175. I also only included data up until 2014 to give players time to contribute towards their career bWAR and team bWAR values.

First off, the bar for success in the Rule 5 major league draft is fairly low. Take a look at the distribution of total bWAR provided to the team during the selected players’ tenure.

[Image: distribution of team bWAR]

That’s a lot of clustering around 0, with the exception of some highlights like Shane Victorino, Dan Uggla, Joakim Soria, Marwin Gonzalez, and Odubel Herrera, who all rank in the top 10 in team bWAR. The mean team bWAR provided is 0.61 for this sample. Only six players, or 3.4%, provided more than 5 bWAR to their selecting team. In comparison, 25% of them posted a negative team bWAR, including poor Levale Speigner, who posted a -1.7 bWAR in 26 games across two seasons with the Washington Nationals. Read the rest of this entry »
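The summary figures above (the mean, the share above 5 bWAR, and the share below zero) are straightforward to compute from a list of team bWAR values. Here is a minimal Python sketch, assuming the values are already collected in a plain list. The sample list is invented and is not the 175-player data set.

```python
from statistics import mean


def summarize_team_bwar(values):
    """Mean team bWAR plus the shares above 5 and below 0."""
    n = len(values)
    return {
        "mean": mean(values),
        "share_above_5": sum(v > 5 for v in values) / n,
        "share_negative": sum(v < 0 for v in values) / n,
    }


# Hypothetical team bWAR values, clustered near zero with one standout.
sample = [-1.7, -0.2, 0.0, 0.0, 0.1, 0.4, 1.3, 6.2]

summary = summarize_team_bwar(sample)
print(summary)
```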


The Logic Behind Opt-Outs

Opt-outs are complicated to understand. On a basic level, an opt-out allows a player the choice, during a specified offseason, to nullify his current contract and become a free agent again. How an opt-out affects the value of a contract has been written about plenty — despite the differences in methods or dollar-per-WAR values, it is generally accepted that the inclusion of an opt-out lowers the total salary of the contract.

Given the difficulty of calculating an exact value for an opt-out (the two biggest challenges being sparse contract data and the need for a reliable future projection system), I tried to explore opt-outs from a theoretical perspective: why would a player ask for an opt-out, and why would a team write one into a contract? Note: the equations were originally in LaTeX, but they lost formatting through submission. They have been replaced with plain text.

From the Player’s Perspective:

A player would sign a contract with an opt-out if he believed the expected present value of the contract was greater than that of a contract offer without an opt-out.

EPV_opt > EPV_no-opt

The expected present value of the contract without an opt-out (EPV_no-opt) is just the expected present value of the contract itself. The expected present value of the contract with an opt-out (EPV_opt) is more complex.

The expected present value of a contract with an opt-out can be broken down into two components: the expected present value of the pre-opt-out portion of the contract (EPV_pre-opt) and the expected present value of the post-opt-out portion. Regardless of whether the player opts out or not, the pre-opt-out value of the contract is the same. The post-opt-out value differs, depending on three values: the value of the new contract should the player opt out (EPV_opt), the value of staying in the current contract and not opting out (EPV_no-opt), and the probability the player opts out (P_opt-out). Read the rest of this entry »
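One plausible way to combine these pieces is a probability-weighted sum: the pre-opt-out value, plus the new-contract value weighted by the chance of opting out, plus the stay-put value weighted by the chance of not opting out. The Python sketch below works through that reading with made-up numbers. The function name, the parameter names, and the dollar figures are all assumptions, and the full post may set the calculation up differently.

```python
def epv_with_opt_out(epv_pre_opt, epv_new_contract, epv_stay, p_opt_out):
    """Expected present value of a contract that includes an opt-out.

    All inputs are assumed to be already discounted to present value,
    and p_opt_out is treated as a known probability between 0 and 1.
    """
    if not 0.0 <= p_opt_out <= 1.0:
        raise ValueError("p_opt_out must be between 0 and 1")
    epv_post_opt = p_opt_out * epv_new_contract + (1 - p_opt_out) * epv_stay
    return epv_pre_opt + epv_post_opt


# Hypothetical values in millions of dollars.
with_opt_out = epv_with_opt_out(
    epv_pre_opt=60.0,        # value earned before the opt-out date
    epv_new_contract=120.0,  # value of a new deal if the player opts out
    epv_stay=90.0,           # value of the remaining years if he stays
    p_opt_out=0.5,
)
without_opt_out = 160.0      # hypothetical competing offer with no opt-out

print(with_opt_out)                    # 165.0
print(with_opt_out > without_opt_out)  # True: the player prefers the opt-out deal
```

In this toy case the opt-out deal wins even though its guaranteed money (60 + 90 = 150) is lower than the competing offer, which matches the idea that a player may accept less total salary in exchange for the option.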


Lucas Giolito and the Long-Awaited Comeback

Are we finally seeing the Lucas Giolito performance that we waited so long for? Once pegged as a “top-of-the-rotation demigod,” Giolito has struggled to find any consistency in the majors. Through the month of May, he’s got the highest K% of his career at 29.2% and the largest K% increase in MLB from 2018 to 2019 with a 13.1% jump. He’s got an average fastball velocity of 93.4 mph, up exactly one tick from last season, and has also added 148 rpm to his heater. Giolito has been more aggressive in terms of overall zone percentage, with the third-largest MLB increase from 2018 to 2019 at 6.8%. Even while down in a hitter’s count, he’s found ways to battle back in the zone, something he was below league average in last season:

Batters are having a tougher time squaring him up and he’s even added some vertical break on his fastball and curveball: Read the rest of this entry »


How Are Starting Pitchers Affected by Their Previous Start’s Workload?

Pitchers’ workloads are certainly a topic we’re used to hearing about as baseball fans. We live in the pitch count era after all, and every game has a pitch count indicator on the screen showing how many the starter has thrown. We’ve gotten used to starters getting the hook right around 100, even if they’re pitching well. We also know that it is to avoid injury to this most injury-prone of positions. It’s never been shown very clearly that higher pitch counts lead to injury, but there’s enough worry that teams want to play it safe with these prized assets. This is even more true with young pitchers: they often aren’t allowed past 85 or 90 pitches if the team is especially worried about their arm.

We also know the other reason why: pitchers just aren’t going to keep doing as well if you leave them in for that long. Past 100 pitches, pitchers are usually well into their third time through the opposing team’s batting order, if not their fourth. We know that each additional time hitters get to see the same pitcher in the same game, the better the hitters do against him. And we know that, of course, pitchers get tired as they throw more pitches, and their velocity drops, and with it, their effectiveness.

But should there be another consideration here? We know the long-term reasons for limiting pitch counts, as well as the short-term ones. But what about the medium term: how does a starter’s pitch count affect how he’ll do his next time out on the mound?

Over at Baseball Prospectus, Russell Carleton (a.k.a. @pizzacutter4) looked at this question back in 2013. He found that past 100 pitches, every further pitch thrown leads to more home runs and more singles being given up next time out, as well as fewer balls in play meekly falling for outs. But his study was only focused on the extreme upper end of pitch counts, inspired as it was by Tim Lincecum’s brilliant 148-pitch no-hitter. That matters, but I also want to know what happens before a starter gets to 100 pitches. There’s no reason to think the effect of workload only kicks in after 100 pitches have been thrown. Will a pitcher do better next time out if his pitch count is kept significantly below 100? I decided to find out. Read the rest of this entry »


Ulnar Collateral Ligament Reconstruction and Its Effect on Yearly and Career WAR

Tommy John’s Legacy

Tommy John belongs in the Hall of Fame. With 12 more wins to his name, he almost certainly would be in it. However, his record 188 career no-decisions held him back. With more advanced analytics, his case becomes clear. In terms of all-time WAR, Tommy John sits in 22nd among pitchers, sandwiched between John Smoltz and Phil Niekro. His impressive total can be attributed largely to his astounding longevity, pitching 26 seasons in MLB. This becomes even more incredible when his ulnar collateral ligament (UCL) is taken into account. Tommy John underwent the first UCL reconstruction (UCLR) ever performed on a pitcher in 1974. After taking the 1975 season off, he went on to pitch 14 (!) more seasons, essentially putting in an entire career’s worth of work after a still experimental surgery.

Tommy John surgery, as it is now called, is still extraordinarily common in Major League pitchers, and the specter of a UCL tear haunts pitchers and general managers alike. But how does actually undergoing Tommy John surgery affect a player’s ability to perform? It has been suggested that Tommy John surgery actually improves performance, though this assertion is controversial at best.

Brief Review of Current Literature

A 2014 cohort study from Erickson et al. investigated MLB pitchers who underwent UCL reconstruction and compared performance measures between those who underwent surgery and controls that were matched by age, BMI, position, handedness, and MLB experience. Also measured was the rate of the return to pitching after surgery. This study showed that 83% of those who underwent surgery were able to return to pitching. In terms of performance, it was found that performance significantly declined the year before surgery and improved after surgery in the experimental cohort (as measured by losses, losing percentage, ERA, walks, hits allowed, runs, and home runs allowed). The surgical group even improved in some measures after surgery as compared to the controls, specifically in terms of losses, losing percentage, ERA, walks allowed, and hits allowed per inning.1

Another cohort study shortly followed in 2014 from Drs. Jiang and Leland that investigated the velocity of MLB pitchers after UCL reconstruction. In this study, of those who were able to return to pitching at the major league level, the mean velocity they were able to reach was unchanged with respect to the control group. In addition, performance measures of those who received surgery were not affected relative to the control group (in this case ERA, BAA, W/9, K/9, and WHIP).2

Yet another cohort study came in 2015 by Marshall et al., which compared 33 MLB pitchers who received Tommy John surgery to 33 age-matched controls. These groups showed mixed results in terms of performance, with little effect of surgery on ERA and WHIP. Surgery was correlated instead with a decline in innings pitched and BB/9. Of note, those who received surgery had significantly shorter careers after surgery than the control group (a difference of 0.8 years (P<0.1)).3 Read the rest of this entry »


Are Players Learning to Cut Their Strikeout Rate?

Strikeouts are continuing to go up. In 2016, batters struck out 21.1% of the time. It was 21.6% in 2017, and 22.3% in 2018, and now 23.2% in 2019, which would again be a new record.

However, while looking at the leaderboards, it appeared to me that there were some quite spectacular K-rate improvers this year, most notably Matt Chapman and Cody Bellinger. This leads to two questions:

1. Is there an increase in players improving their strikeout rate?
2. Do those improvements stick?

I looked at guys who improved at least five points in strikeout rate in April 2019 vs. 2018.

[Image: 2019 contact gainers]

There have been 20 hitters who have improved by five or more points, with five guys improving by more than 10. Read the rest of this entry »


Projecting Risk in Major League Baseball: A Bayesian Approach

The following is an introduction to a new Bayesian projection system, which can be found here.

Introduction / Motivation

This project was partly inspired by a recent episode of the Driveline Baseball podcast which aired this offseason in which Kyle Boddy, founder of Driveline, and Mike Rathwell, CEO of Driveline, had a conversation about the topic that overwhelmed baseball for many months – whether teams should sign Manny Machado or Bryce Harper. Their initial reaction was to express disappointment that the debate had started in the first place. Machado and Harper, they argued, had almost nothing in common besides the fact they belonged to the same free agent class. They play different positions (implying different replacement levels and, thus, entirely different markets), and more importantly, have entirely different amounts of uncertainty associated with their projections. Machado has been as reliable as they come, playing premium positions (SS and 3B) and improving his defensive abilities every year. Harper, on the other hand, had just come off of what some have called one of the worst defensive seasons by an outfielder in recent history. However, he also had a 10-WAR season in 2015, a ceiling which Machado hasn’t touched. This led into a broader discussion about how to compare contracts from players with different levels of risk. Specifically, they explain how the sabermetric community’s approach to answering the question of valuing risk in Major League Baseball contracts has fallen short in three areas.

First, while many writers make note of the riskiness of certain assets, they fail to define that risk in precise terms. Most public projection systems output point estimates and public researchers suggest that the output is at the upper limit of predictive accuracy, and hence should be treated as a near certainty. It is worth noting that baseball is not the only field which has grown uncomfortable with uncertainty. Whether the decision is to buy a certain stock, to hire a particular company to ship your goods across the country, or to choose our next president, many analysts make the mistake of assuming a binary, discrete outcome is the result of a binary, discrete process. Instead, we posit that once we start to see the world as the outcome of several continuous, probabilistic processes, we can manipulate those processes in ways that give us an extreme competitive advantage (in baseball at the very least). Boddy explains:

“While a tweeter very fairly pointed out that sites like Baseball Prospectus and FanGraphs do mention upside and downside, rarely is it quantitatively actually approached in these articles. Rarely if ever, I should say. It’s very frustrating because just making a note that Tesla’s stock is more volatile than Microsoft’s is not enough. That wouldn’t be enough for a financial planner to be like ‘Oh, ok that’s a very deep analysis.’ It’s also not all downside, which is how a lot of these tweets [go].”

In other words, before describing the optimal mix of risky and safe players on a major league roster, there need to be accurate and reliable methods by which to describe that risk. There are very significant drawbacks to assuming too much downside, so carefully tracking exactly how uncertain you are of your team’s future performance as a whole is imperative. Also, as is the case with all science, precisely measuring levels of uncertainty and tracking resulting performance over time is the most reliable way to gain a deeper understanding of what exactly is uncertain about player projections and perhaps to eliminate some of that ambiguity in the future. Read the rest of this entry »
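To make the difference between a point estimate and a full distribution concrete, the Python sketch below compares two hypothetical players with the same mean projection but different spreads. It is not the Bayesian system introduced in this post. The normal distribution, the means and standard deviations, and the 1-WAR and 6-WAR cutoffs are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def tail_probabilities(mean_war, sd_war, floor=1.0, ceiling=6.0):
    """Chance of finishing below `floor` or above `ceiling` WAR.

    Assumes, purely for illustration, that the projection is normal.
    """
    dist = NormalDist(mu=mean_war, sigma=sd_war)
    return {
        "p_below_floor": dist.cdf(floor),
        "p_above_ceiling": 1 - dist.cdf(ceiling),
    }


# Two hypothetical players with identical point estimates (4 WAR).
steady = tail_probabilities(mean_war=4.0, sd_war=1.0)
volatile = tail_probabilities(mean_war=4.0, sd_war=2.5)

print("steady  ", steady)
print("volatile", volatile)
```

Both players would look identical in a system that reports only the mean, yet the wider distribution carries far more upside and downside at once, which is the point that risk is not all downside.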


Judge and Altuve: A Tale of Two Strike Zone Oddities

Aaron Judge and Jose Altuve seem like they shouldn’t coexist in Major League Baseball, but their mutual success reveals an amazing fact about different paths to pitcher dominance.

Here’s a line of reasoning using the transitive property about short hitters and small strike zones:

  • If a player is shorter, then his shoulders are closer to his knees.
  • If his shoulders are closer to his knees, his strike zone should be smaller.
  • If his strike zone is smaller, then it should be harder for the pitcher to throw strikes.
  • If it is harder to throw strikes, he should get on base more.
  • Shorter players should have higher on-base percentages.

But this isn’t how it actually works. Why not?

Aaron Judge is tall and Jose Altuve is short. That’s analysis. They both are very productive at the plate. That’s deeper analysis. However, Judge gives pitchers a 25% larger strike zone to target due to his 6-foot-7 height, so he must be doing something different and better than Altuve to offset the larger area he gifts to pitchers with every pitch.

Read the rest of this entry »