Top 5 Fantasy Starting Pitching Prospects for 2016

For this list there will be two requirements:

  1. The players under consideration must not have thrown even a single pitch in the major leagues. This throws out notable names such as Steven Matz, Jon Gray, and others who have already done so.
  2. The players under consideration must be projected to graduate from the prospect label in 2016 and have a significant influence on a major-league team. This throws out notable names like Julio Urias, Lucas Giolito, and others who are projected to be unlikely call-ups for the 2016 season.

 

So here we go, my projections for the top five pitching prospects you should keep an eye on for your fantasy team in 2016.

1. Tyler Glasnow

Team: Pittsburgh Pirates

Throws: Right

Height/Weight: 6’8”/225

Age: 22

Projected Path: Opening day rotation

Rundown: Nobody in this year’s projected rookie class boasts a bigger frame or, more importantly, a bigger fastball than Glasnow. Standing at an intimidating 6’8”, Glasnow is known to be able to pound the catcher’s mitt repeatedly with an upper 90s fastball that routinely overpowers hitters. MLB.com gives his fastball a 75 on the 20 to 80 scale, a truly remarkable grade. Add that to an above-average power curveball and an improving changeup, and it is easy to see why scouts rave about this guy’s immense upside.

But really, who cares about what scouts think? Not me, and neither should you. Let’s check out some upper-minor-league numbers Glasnow produced in 2015. His largest body of work came in AA, where he threw an even 63 innings over 12 starts. There he struck out a remarkable 33.1% of batters while walking a respectable 7.7%. His strikeout rate was tied with fellow highly touted right-hander Jose De Leon for tops in all AA leagues among pitchers who threw at least 60 innings. Throw in his respectable walk rate, and he led all AA pitchers with the same innings minimum in K-BB%. In AA he had an ERA of 2.43 while stranding only 66.4% of baserunners, a statistic generally attributed to luck. The average LOB% in 2015 was 72.3%, so one can assume he was a little unfortunate to give up some of those runs. With that said, his ERA could easily have looked more like his FIP, which was an outstanding 1.98.

Either way, Glasnow proved to be dominant in AA and was later called up to the next level. In 43 innings of AAA ball he struck out a similarly great 27.6% of hitters. He walked some extra guys, leading to a high 12.6% BB%. Unsurprisingly, he was able to strand more runners in AAA, 73.3% of them to be exact, and yielded a 2.20 ERA. His FIP was 2.82.
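For readers who want to check the K-BB% figures themselves, here is a minimal sketch of the arithmetic, using only the strikeout and walk rates quoted above. The function name `k_minus_bb` is just an illustrative helper, not something from FanGraphs.

```python
def k_minus_bb(k_pct: float, bb_pct: float) -> float:
    """Return K-BB% in percentage points (strikeout rate minus walk rate)."""
    return round(k_pct - bb_pct, 1)

# Rates quoted in the article for Glasnow's 2015 season.
glasnow_aa = k_minus_bb(33.1, 7.7)    # 63 innings in AA
glasnow_aaa = k_minus_bb(27.6, 12.6)  # 43 innings in AAA

print(f"AA  K-BB%: {glasnow_aa}")
print(f"AAA K-BB%: {glasnow_aaa}")
```

The gap between the two levels comes almost entirely from the walks, which is the weakness flagged in the final take.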

Final take: Go get this guy. If he’s available late for a cheap price, Glasnow could be the ultimate diamond in the rough. His biggest strength is his strikeout numbers, which play really well for fantasy. The only weakness in his game is the walks. If he can find a way to limit walk totals, Glasnow could join the conversation for top young arms in 2016 and beyond.

2. Jose Berrios

Team: Minnesota Twins

Throws: Right

Height/Weight: 6’0”/190

Age: 21

Projected Path: Opening day rotation

Rundown: Perhaps more polished than Glasnow, Jose Berrios is a very strong name to have on your radar. As an undersized righty, Berrios hits the mid-90s with his fastball, but will mainly live in the low 90s. He also throws a slurve-like breaking ball at various velocities as well as an above-average changeup. The best thing about Berrios is his plus command. As a 21-year-old, he walked only 6.5% and 4.7% of batters in 90⅔ innings in AA and 75⅔ innings in AAA respectively. His strikeout numbers were also strong, with a 25.1% mark in AA and a 27.7% effort in AAA. An interesting note on Berrios was his major improvement from AA to AAA. His K-BB% improved by roughly 4.5 percentage points, and his FIP and ERA improved as well.

Final Take: I went back and forth for quite some time trying to decide who was more valuable between Berrios and Glasnow. In the end I chose Glasnow, mainly due to his unprecedented strikeout potential as well as the National League benefit. However, that by no means says that Berrios can’t be better. Given his impressive ability to limit walks, go into your draft with Berrios’ name in mind.

3. Blake Snell

Team: Tampa Bay Rays

Throws: Left

Height/Weight: 6’4”/180

Age: 23

Projected Path: Opening day rotation

Rundown: Blake Snell is one of the most intriguing names in the prospect heap for 2016, partly because he came into the year as a relatively unknown 22-year-old in the Tampa Bay Rays’ A+ affiliate. The other part is that in 2015 he didn’t give up a run until his 50th inning of work. He escaped A+ ball in 21 innings without giving up a run and then rattled off another 28 scoreless innings in AA. Eventually, though, he did prove to be human when he gave up a first-inning home run to the Cubs’ Willson Contreras, ending his scoreless streak at 49 innings. Nevertheless, he put up astounding numbers across three levels of the minor leagues in 2015.

Like Glasnow, Snell’s best tool is his ability to strike hitters out. He does this with a low to mid-90s fastball as well as an above-average slider and changeup. His biggest flaw is the walks, as he walked over 10% of the batters he faced in both AA and AAA. Like Berrios, he posted his best K-BB% numbers in AAA. In 44⅓ innings there he struck out 33.3% of batters and walked only 7.6%, good for an incredible 25.7% K-BB%. Although it is very difficult to project his basic run-prevention skill without the aid of batted-ball type or velocity, he certainly excelled in that area in 2015. In 21, 68⅔, and 44⅓ innings in A+, AA, and AAA his ERA was 0.00, 1.57, and 1.83 respectively.

Final Take: Like I said at the beginning, Blake Snell is intriguing. Walks will hold him down, strikeouts will bring him up. If you like what you see, take a shot and thank me later. He has the potential of an elite starter.

4. Jose De Leon

Team: Los Angeles Dodgers

Throws: Right

Height/Weight: 6’2”/185

Age: 23

Projected Path: Mid/late season call up

Rundown: The only thing holding De Leon back from being closer to the top of this list is the Dodgers’ management. Most likely, he will not make the team out of camp and will head to AAA to start the year.  However, due to the Dodgers’ thin staff and postseason desperation, De Leon is bound to make a splash sometime in 2016. As mentioned earlier, he was tied with Tyler Glasnow in K% in AA during the 2015 season among pitchers with more than 60 innings pitched. De Leon pitched a total of 76⅔ innings at the AA level through 16 starts. Before that, also in 2015, he threw 37⅔ innings at the A+ level. He put up ridiculous numbers there, striking out batters at a rate of a nearly unheard of 40% while only walking 5.4% of hitters. His walk rate increased a little bit in AA, but he still boasts better command than the likes of Glasnow and Snell. De Leon pairs his low to mid-90s fastball with a slider and changeup.

Final Take: Although De Leon is unlikely to make the team out of spring camp, it is worth keeping this guy on your fantasy radar. Pay attention to any news of a potential call-up, and if you find any, don’t waste time adding him to your roster. In deeper formats, De Leon certainly deserves a late-round draft choice.

5. Josh Hader

Team: Milwaukee Brewers

Throws: Left

Height/Weight: 6’3”/160

Age: 21

Projected Path: Mid/late season call up

Rundown: After coming to the Brewers in the Carlos Gomez deal, Hader quickly improved his prospect stock by increasing his K-BB% by almost 10 percentage points with the move from the Astros’ AA affiliate to the AA affiliate of the Brew Crew. Although all 7 of his appearances with Milwaukee were starts, Hader spent time both starting and coming out of the pen in Houston before the deal. Over there he was not nearly as impressive, with a higher BB% as well as a significantly lower K% in 65⅓ innings. Like I promised, things got better in his 38⅔ innings for the Brewers in AA. Hader struck out a robust 32.9% of hitters while walking only 7.2%. Overall, Hader finished sixth in K-BB% among starters under 25 in AA who logged more than 60 innings. Hader pairs his mid-90s fastball with an average changeup and curveball. Given his leap forward with the Brewers, the lack of pitching in the organization, and the likely trades of veterans either during the offseason or before the July trade deadline, Hader could be looking at a midseason call-up where his ability to get strikeouts would be an asset, especially in the NL. On top of this, Hader has better command than most 21-year-olds.

Final Take: Hader’s upside is real. A strong fastball paired with above-average command bodes well for a National League pitcher. Now all he has to do is continue his success in the minor leagues for the Brewers, and he will almost certainly see a call-up to the big-league rotation. If this happens, make sure you remember his name.

If you enjoyed this article be sure to check out our website www.analyticfb.com and our instagram @fantasybaseballanalytics!

Stats and research courtesy of FanGraphs and MLB.com


How Game Theory Is Applied to Pitch Optimization

The timeless struggle between pitcher and batter is one of dominance — who holds it and how. Both players use a repertoire of techniques to adapt to each other’s strategies in order to gain advantage, thereby winning the at-bat and, ultimately, the game.

These strategies can rely on everything from experience to data. In fact, baseball players rely heavily on data analytics in order to tell them how they’re swinging their bats, how well they’ll do in college, how they’ll perform at Wrigley versus Miller.

Big data has been used in baseball for decades, as early as the 60s. Bill James, however, was the first prominent sabermetrician, writing about the field in his Bill James Baseball Abstracts during the 80s. Sabermetrics are used to measure in-game performance and are often used by teams to scout players.

Baseball fans familiar with sabermetrics, the A’s, and Brad Pitt have likely seen Moneyball, the Hollywood adaptation of Michael Lewis’ book. The book told the story of A’s general manager Billy Beane’s use of sabermetrics to assemble a winning team.

Sabermetrics is one way baseball teams use big data to leverage game theory on a team-wide scale. However, by applying their data through the concepts of game theory on a smaller scale, baseball teams can help their men on the mound out-duel those at the plate.

Game theory studies strategic decision making, not just in sports or games, but in any situation in which a decision must be made against another decision maker. In other words, it is the study of conflict.

Game theory uses mathematical models to analyze decisions. Most sports are zero-sum games, in which the decisions of one player (or team) have a direct effect on the opposing player (or team): if a team scores a run, it is usually at the expense of the opposing team, likely through an error by a fielder or a hit off a pitcher. Such games settle into an equilibrium known as the Nash equilibrium, named for the mathematician John Forbes Nash.

In the case of pitching, game theory — especially the use of the Nash equilibrium — can be used to predict pitch optimization for strategic purposes. Neil Paine of FiveThirtyEight advocates using big data and sabermetrics to analyze each pitch in a hurler’s armory, then cultivating the pitcher’s equilibrium — the perfect blend of pitches that will result in the highest number of strikeouts, etc.

Paine has gone so far as to create his own formula, the Nash Score, to predict which pitcher should throw which pitches in order to outwit batters.

In perfect game theory, the Nash equilibrium states that each game player uses a mix of strategies that is so effective, neither has incentive to change strategies. For pitchers, Paine’s Nash Score uses their data to find the optimal combination of pitches to combat batters, including frequency.
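To make the idea of an equilibrium pitch mix concrete, here is a minimal sketch of a two-pitch, zero-sum game solved for its mixed-strategy Nash equilibrium. This is not Paine’s Nash Score, whose formula is not given here; the payoff numbers and the names `PAYOFF` and `mixed_equilibrium` are hypothetical, chosen only to illustrate the indifference logic.

```python
from math import isclose

# Hypothetical batter success rates (NOT real data).
# Key: (pitch thrown, pitch the batter is sitting on).
PAYOFF = {
    ("fastball", "fastball"): 0.30,  # batter guessed right on the fastball
    ("fastball", "changeup"): 0.15,  # batter guessed wrong
    ("changeup", "fastball"): 0.20,  # batter guessed wrong
    ("changeup", "changeup"): 0.45,  # batter guessed right on the changeup
}

def mixed_equilibrium(payoff):
    """Solve a 2x2 zero-sum game by making each side indifferent.

    Assumes the equilibrium is interior (neither side has a pure strategy
    that is always best), which holds for the numbers above.
    """
    a11 = payoff[("fastball", "fastball")]
    a12 = payoff[("fastball", "changeup")]
    a21 = payoff[("changeup", "fastball")]
    a22 = payoff[("changeup", "changeup")]
    denom = a11 - a12 - a21 + a22
    # Pitcher's fastball share that leaves the batter indifferent between guesses.
    p_fastball = (a22 - a21) / denom
    # Batter's "sit fastball" share that leaves the pitcher indifferent between pitches.
    q_sit_fastball = (a22 - a12) / denom
    # Batter's expected success rate at equilibrium.
    value = p_fastball * a11 + (1 - p_fastball) * a21
    return p_fastball, q_sit_fastball, value

p_fastball, q_sit_fastball, value = mixed_equilibrium(PAYOFF)
print(f"Pitcher throws fastball {p_fastball:.1%} of the time")
print(f"Batter sits fastball {q_sit_fastball:.1%} of the time")
print(f"Batter success rate at equilibrium: {value:.4f}")
```

With these made-up numbers the fastball is the better pitch, yet the pitcher still throws the changeup more than a third of the time, because at that mix the batter gains nothing by changing what he sits on.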

Paine does point out that creating this kind of equilibrium in baseball can be detrimental to a pitcher. He is, after all, playing against another human being who is just as capable of using game theory to adapt strategies to upset the equilibrium.

If a pitcher’s fastball is his best, and his Nash Score shows that he should be using it more often, savvy hitters are going to notice. “ . . . In time, the fastball will lose its effectiveness if it’s not balanced against, say, a change-up — even if the fastball is a far better pitch on paper,” writes Paine.

In this case, a mixed strategy is best. In game theory, mixed strategies are used when a player intends to keep his opponent guessing. Though pitch optimization using Paine’s Nash Score could lead to efficiency, allowing pitchers to throw fewer pitches for more innings, it could also lead to batters adapting much more quickly to patterns, thus negating all the work.


Stephen Strasburg Is Better Than You Think

To a casual baseball fan, Stephen Strasburg’s numbers are not pretty. The owner of a 4.76 ERA and a 1.38 WHIP, Strasburg is clearly having the worst season of his career. But how bad has he been, really? Not as bad as you think. Take a look at these 2015 stats:

Player A: 3.48 xFIP, 22.8 K%, 5.5 BB%
Player B: 3.31 xFIP, 24.1 K%, 5.3 BB%
Player C: 3.18 xFIP, 24.9 K%, 6.0 BB%

Player A is none other than Johnny Cueto, recently traded to the Kansas City Royals. 12th in ERA among qualified pitchers, Cueto is widely considered among the best, and perhaps deservedly so with five straight years of a sub-3 ERA. While he has consistently outperformed the above metrics, they are still indicative of general pitcher performance and should not be overlooked when comparing the quality of different pitchers.

Player B actually has the fifth lowest ERA among qualified pitchers and was also traded at the deadline. He’s been one of the most reliable pitchers over the past five years and has been an ace on every staff for which he’s pitched. Player B is David Price.

Player C is obviously Stephen Strasburg, and as you can see, his peripheral stats stack up against the best in the game. In addition to these two players, Strasburg also compares favorably to others like Sonny Gray and Scott Kazmir, both of whom have better ERAs but a worse xFIP, K%, and BB%. Strasburg is pitching like an ace, and xFIP shows that, so why have his results been so poor?

Well, first of all, there’s his .345 BABIP. Not only is this high compared to the league average (.296), it’s well above his career mark of .302. Considering he’s not giving up any more line drives or hard contact than usual, his BABIP should fall back to around the .300 mark and bring his ERA down with it.

Not only is his BABIP at an all-time high, his LOB% is at an all-time low. Currently at 65.3%, it figures to inch back up to his career 73.2% mark, or at least to the league average of 72.4%. Considering his strikeouts have not dropped off, there’s no reason for his drop in LOB%, and it can simply be chalked up to bad luck, something that he’s had plenty of this year.

Looking at these stats, there’s nothing that suggests Strasburg is anything but unlucky. However, as Jeff Sullivan pointed out, Strasburg’s problem could stem from the injury he suffered in the spring. He had apparently adjusted his mechanics to compensate for the discomfort, and even though it appears as though he has fixed this, it’s possible that when pitching from the stretch and in higher-leverage situations, he returns to this altered motion by default. When looking at the difference in Strasburg’s stats between pitching from the windup and the stretch, this is what we see:

                   K%      xFIP
Bases Empty        30.1    2.73
Runners on Base    17.0    3.98

Evidently, this claim has some ground. Strasburg is clearly having some problems with runners on base, particularly in striking batters out. Before we deal with the strikeout numbers, let’s take a look to make sure that he’s not just getting killed during the at bats that don’t end in strikeouts.

                   GB/FB   Batted Ball Velocity (mph)   Hard Hit %   Infield Hit %
Bases Empty        0.98    89                           29.7         4.5
Runners on Base    2.05    88                           28.7         12.2

Strasburg is actually generating more ground balls and slightly weaker contact with runners on base. His infield hit percentage is nearly triple what it is when the bases are empty, something that can be attributed to luck. With such weak contact, it’s safe to say this isn’t the problem. So it must be the strikeouts. If we take a look at his whiff rates, the results are intriguing:

                   2010-2014   2015
Bases Empty        20.1%       17.5%
Runners on Base    17.9%       8.6%

OK, so there’s definitely a problem here. With runners on base, he’s only whiffing batters at half the rate he’s done previously in his career, as well as half the rate that he does with the bases empty. So what’s the issue? Well, it’s not his pitch velocity:

                   4 Seam     2 Seam     Changeup   Curve      Slider
Bases Empty        95.1 mph   95.4 mph   88.4 mph   81.3 mph   86.7 mph
Runners on Base    95.2 mph   94.9 mph   88.0 mph   81.5 mph   87.2 mph

Strasburg’s average velocity with runners on base is 91.5 mph, compared to 91.0 mph with the bases empty, so he’s actually throwing the ball harder when there are runners on base. That can’t be the problem. He’s also not walking significantly more batters when there are runners on base, so it’s not like he’s sacrificing control for increased speed.

Without any numbers to provide a reason, it appears Strasburg’s struggles to strike out batters with runners on base are either based purely in luck or are completely mental. This is not necessarily a good thing, as we have no idea if or when he will sort it out. With his skill, Strasburg has the potential to be one of the best in the game. He just needs to get out of his own head, and maybe get just a little bit luckier.


A Discrete Pitchers Study – Out & Base Runner Situations

(This is Part 4 of a four-part series answering common questions regarding starting pitchers by use of discrete probability models. In Part 1 we explored perfect game and no-hitter probabilities, in Part 2 we further investigated other hit probabilities in a complete game, and in Part 3 we predicted the winner of pitchers’ duels. Here we project the probability of scoring at least one run in various base runner and out scenarios.)

V.  I Don’t Know’s on Third!

Still far from a distant memory, the final out of the 2014 World Series was preceded by an unexpected single and a nerve-racking error that brought Alex Gordon to 3rd base with two outs. Closer Madison Bumgarner, who was on fire throughout the playoffs as a starter, allowed the hit but would be left in the game to finish the job. There is some debate as to whether Gordon should have been sent home rather than stopped at 3rd base, but it would have taken another error overshadowing Bill Buckner’s to get him home; also, next up to bat was Salvador Perez, the only player to ever ding a run off Bumgarner in three World Series. So even though the Royals’ 3rd base coach Mike Jirschele had to make a critical spur-of-the-moment decision to stop Gordon as he approached 3rd base, it was a decision validated by both statistics and common sense. We will show our own evidence, by use of negative multinomial probabilities, of how unlikely the Royals were to score the tying run off of Bumgarner with a runner on 3rd and two outs, and we will also consider other potential game-tying or game-winning situations.

Runs are generally strung together from sequences of hits, walks, and outs; in the situations we will consider, we will only focus on those sequences that lead to at least one run scoring and those that do not. Events not controlled by the batter in the box, such as steals and errors, could also potentially reshape the situation and lead to runs, but we’ll take a very conservative approach and assume a cautious situation where steals are discouraged and errors are extremely unlikely.

Let A and B be random variables for hits and walks, and let P(H) and P(BB) be their respective probabilities for a specific pitcher, such that OBP = P(H) + P(BB) + P(HBP) and (1-OBP) is the probability of an out. Because we excluded hit-by-pitches from our models, yet P(HBP) > 0 against Bumgarner in the 2014 World Series and a hit-by-pitch has the same result on the base paths as a walk, we fold the hit-by-pitch probability into the walk probability, so that P(BB) is really P(BB) + P(HBP). The first negative multinomial probability formula we’ll introduce considers the sequences of hits, walks, and an out that can occur after two outs have been accumulated, setting the hypothetical stage for the last play in Game 7 of the 2014 World Series.

Formula 5.1

In the 2014 World Series, Bumgarner’s dominantly low P(H) and P(BB) were respectively 0.123 and 0.027 and his (1-OBP) was 0.849; by applying these values to the formula above we can generate the probabilities of various hit and walk combinations shown in Table 5.1. The yellow highlighted cells in the table (0 hits with 0, 1, or 2 walks) represent the combinations of hits and walks that would let Bumgarner escape the inning without allowing the tying run (given a runner on 3rd with two outs and a one-run lead). By combining these yellow cells, we see that the odds were overwhelmingly in Bumgarner’s favor (0.873); all he had to do was get Perez out, walk Perez and get the next batter out, or walk two batters and get the third out.

Table 5.1: Probability of Hit and Walk Combinations after 2 Outs

           0 Hits   1 Hit   2 Hits   3 Hits   4 Hits
0 Walks    0.849    0.105   0.013    0.002    0.000
1 Walk     0.023    0.006   0.001    0.000    0.000
2 Walks    0.001    0.000   0.000    0.000    0.000
3 Walks    0.000    0.000   0.000    0.000    0.000
4 Walks    0.000    0.000   0.000    0.000    0.000
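The cells of Table 5.1 can be approximated with a short script. This is a minimal sketch, assuming Formula 5.1 takes the standard negative multinomial form: the number of ways to order the hits, walks, and earlier outs, multiplied by each event’s probability, with the final out always coming last. The function name `sequence_prob` and its `outs_needed` parameter are illustrative choices, and the inputs are the rounded rates quoted above.

```python
from math import factorial

# Bumgarner's 2014 World Series rates as quoted in the text (rounded to 3 places).
P_HIT, P_WALK, P_OUT = 0.123, 0.027, 0.849

def sequence_prob(hits, walks, outs_needed, p_hit=P_HIT, p_walk=P_WALK, p_out=P_OUT):
    """Probability of exactly `hits` hits and `walks` walks before the inning ends,
    given that `outs_needed` outs remain (assumed negative multinomial form)."""
    earlier_events = hits + walks + outs_needed - 1  # everything before the final out
    orderings = factorial(earlier_events) // (
        factorial(hits) * factorial(walks) * factorial(outs_needed - 1)
    )
    return orderings * p_hit**hits * p_walk**walks * p_out**outs_needed

# Rows are walks (0-4), columns are hits (0-4), with one out remaining.
table_5_1 = [
    [round(sequence_prob(h, w, outs_needed=1), 3) for h in range(5)]
    for w in range(5)
]
for walks, row in enumerate(table_5_1):
    print(walks, row)

# Bumgarner escapes if there are no hits and at most two walks.
escape = sum(sequence_prob(0, w, outs_needed=1) for w in range(3))
print(f"P(escape) = {escape:.3f}")
```

Because the inputs are rounded to three decimals, a cell can differ from the published table in the last digit, but the escape probability lands on the same 0.873.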

The Royals, on the other hand, could have tied the game with a simple hit from Perez given the runner on 3rd and two outs, yet this wasn’t the only sequence that would have kept the Royals’ hopes alive. Three consecutive walks, one walk and one hit, or any combination of walks and one hit could have also done the job; examples of these sequences are shown in the graphics below:

Graphic 5.1

Generally, any combination of walks and hits not highlighted yellow in Table 5.1 would have tied or won the World Series for the Royals. This glimmer of hope was a quantifiable 0.127 probability for Kansas City, so it was justified that Gordon was kept at 3rd rather than sent home after shortstop Brandon Crawford had just received the ball. It would have taken an error from Crawford or Buster Posey, with respective 2014 error rates of 0.033 and 0.006, to get Gordon home safely. The 0.127 probability of winning the game from the batter’s box is more than three times greater than the probability of winning it from the base paths (where Crawford and Posey’s joint error probability was 0.039).

We should note that the layout in Table 5.1 is a simplification of what could occur with a runner on 3rd, two outs, and a one-run lead, because it only applies to innings where a walk-off is not possible. In innings where a walk-off can occur, such as the bottom of the 9th, the combinations of walks and hits captured in the red highlighted cells are not possible because they would occur after the winning run has scored and the game has ended. However, Bumgarner was so dominant in the World Series that these probabilities are almost non-existent, so our model is still applicable; we would otherwise exclude these red-celled probabilities for less successful pitchers.

The next probability formula considers the sequences of walks, hits, and outs that can occur after one out has been accumulated, which is a situation definitely worth examining if there is a lone runner on 2nd base.

Formula 5.2

Once again we’ll use Bumgarner’s 2014 World Series statistics to evaluate this formula and insert the probabilities into Table 5.2. According to the sum of the yellow cells, Bumgarner would be able to prevent the tying run from scoring (from 2nd base with one out) with a probability of 0.762 and would otherwise allow the tying run with a probability of 0.238.

Table 5.2: Probability of Hit and Walk Combinations after 1 Out

           0 Hits   1 Hit   2 Hits   3 Hits   4 Hits
0 Walks    0.721    0.178   0.033    0.005    0.001
1 Walk     0.040    0.015   0.004    0.001    0.000
2 Walks    0.002    0.001   0.000    0.000    0.000
3 Walks    0.000    0.000   0.000    0.000    0.000
4 Walks    0.000    0.000   0.000    0.000    0.000
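The entries of Table 5.2 are consistent with a negative multinomial probability in which two outs remain. As a hedged reconstruction (the symbols h, w, and r for the number of hits, walks, and remaining outs are introduced here for illustration), the general form would be:

```latex
P(h, w \mid r) = \frac{(h + w + r - 1)!}{h!\, w!\, (r - 1)!}\; P(H)^{h}\, P(BB)^{w}\, (1 - OBP)^{r}
```

With r = 2 and the rounded inputs P(H) = 0.123, P(BB) = 0.027, and (1-OBP) = 0.849, the 0-hit, 0-walk cell is 0.849^2 ≈ 0.721, and the 1-hit, 0-walk cell is 2 × 0.123 × 0.849^2 ≈ 0.177, which agrees with the table’s 0.178 up to rounding of the inputs. Setting r = 1 or r = 3 gives the same check against Tables 5.1 and 5.3.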

To get out of the inning unscathed, Bumgarner would need to prevent any further hits and allow fewer than 3 walks given a runner on 2nd with 1 out; it would be possible to advance the runner to 3rd with 2 walks and then sacrifice him home in this situation (with no hits), but this probability is insignificantly tiny, especially for a dominant pitcher like Bumgarner. Once again we depict the sequences that could get the tying run home from 2nd with 1 out, with the second out inserted randomly.

Graphic 5.2

A runner on 2nd base with one out is a scenario commonly manufactured in an attempt to tie the game from a runner-on-1st-with-no-outs situation. The logic is that if the hitting team is down by one run and the first batter leads off the inning with a single or walk, the next batter can control getting him into scoring position and hope that either of the next two batters knocks the run in with a hit. However, this method of control, a bunt, sacrifices an out to move the runner from 1st to 2nd. The defense will usually allow the hitting team to move the runner into scoring position for an out, but the out isn’t the only sacrifice made. The inning is truncated for the hitting team with one less batter, and the potential to have more hitters bat and drive in runs is reduced. Indeed, against a pitcher like Bumgarner, the out is likely not worth the meager 0.238 probability of getting that runner home. We’ll see in the next section what exactly gets sacrificed for this chance at tying the game.

We should note that in this “runner on 2nd with 1 out” model we added a few more assumptions to those we made in the prior “runner on 3rd with 2 outs” model, none of which should be farfetched. The first assumption is that with the game close and the manager intent on tying the game rather than piling on runs, he should have a runner on 2nd base fast enough to score on a single. Another assumption is that the base runners will be cautious enough not to cause an out on the base paths, yet aggressive enough not to get doubled up or have the lead runner sacrificed in a fielder’s choice play. Lastly, we assume that the combinations of hits, walks, and outs are random, even though we know the current state of base runners and outs can have a predictive effect on the next outcome and the defensive strategy used. By using these assumptions we simplify the factors and outcomes accounted for in these models and reduce the variability between each model.

The final probability formula considers the sequences of walks, hits, and outs that can occur when we start with no outs accumulated; this will allow us to forge the outcomes from a runner-on-1st-with-no-outs scenario and compare them to a runner-on-2nd-with-1-out scenario.

Formula 5.3

Table 5.3 below uses Bumgarner’s 2014 World Series statistics, the same as before, although in this model we deal with more uncertainty because the sequences captured in each box are not as clear-cut between run scoring or not given a runner on 1st with no outs. The yellow and non-highlighted cells are still the respective probabilities of not allowing and allowing the tying run to score. However, we now introduce the green probabilities to represent the hit and walk combinations that could potentially score a run but are dependent on the hit types, sequences of events, and the use of productive outs. These factors were unnecessary in the prior two models because in those models any hit would have scored the run, the sequence of events was inconsequential, and the use of productive outs was unnecessary with the runner already on 2nd or 3rd base (except when there is a runner on 3rd and a sacrifice fly or fielder’s choice could bring him home).

Table 5.3: Probability of Hit and Walk Combinations after 0 Outs

           0 Hits   1 Hit   2 Hits   3 Hits   4 Hits
0 Walks    0.613    0.227   0.056    0.011    0.002
1 Walk     0.050    0.025   0.008    0.002    0.000
2 Walks    0.003    0.002   0.001    0.000    0.000
3 Walks    0.000    0.000   0.000    0.000    0.000
4 Walks    0.000    0.000   0.000    0.000    0.000

We must break down each green probability into subsets of yellow probabilities representing the specific sequences that would not score the tying run from 1st base with no outs; we depict these sequences below, but for simplicity, not all are depicted.

Graphic 5.3

Now that we know the conditions when a run would not score, we take the probabilities from the green cells in Table 5.3, narrow them down according to the proportion of sequences and the proportion of hit types that would not score the run, and separate them based on the usage of productive and unproductive outs; the results are displayed in Table 5.4. For example, there are 6 possible combinations for 1 hit, 1 walk, and 3 outs and 3 of these 6 combinations would not score the tying run on a single, where P(1B | H) = 0.755, with unproductive outs; yet, the run would score with productive outs, with unproductive outs on a double or better, or with unproductive outs and the other 3 combinations. When we finally sum these yellow cells, they tell us that an aggressive manager would score the tying run against Bumgarner with a 0.370 probability and Bumgarner would escape the inning with a 0.630 probability. Otherwise, a less aggressive manager would score the tying run with a mere 0.154 probability and Bumgarner would leave unscathed with a significant 0.846 probability.

Table 5.4: Probability of No Runs Scoring after 0 Outs

           Productive Outs                   Unproductive Outs
           0 Hits          1 Hit             0 Hits          1 Hit
0 Walks    0.613 x (1/1)   0.227 x (0/3)     0.613 x (1/1)   0.227 x (3/3) x 0.755
1 Walk     0.050 x (1/3)   0.025 x (0/6)     0.050 x (3/3)   0.025 x (3/6) x 0.755
2 Walks    0.003 x (2/6)   N/A               0.003 x (6/6)   N/A
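Summing the cells of Table 5.4 is straightforward to verify. The sketch below uses only the values printed in the table, with P(1B | H) = 0.755; the dictionary names are illustrative, and the cells are keyed as (hits, walks).

```python
P_SINGLE_GIVEN_HIT = 0.755  # P(1B | H) from the text

# Cell probabilities taken from Table 5.3, keyed by (hits, walks).
cells = {(0, 0): 0.613, (1, 0): 0.227, (0, 1): 0.050, (1, 1): 0.025, (0, 2): 0.003}

# Share of each cell in which no run scores, as laid out in Table 5.4.
no_run_share_productive = {
    (0, 0): 1 / 1, (1, 0): 0 / 3, (0, 1): 1 / 3, (1, 1): 0 / 6, (0, 2): 2 / 6,
}
no_run_share_unproductive = {
    (0, 0): 1 / 1,
    (1, 0): (3 / 3) * P_SINGLE_GIVEN_HIT,
    (0, 1): 3 / 3,
    (1, 1): (3 / 6) * P_SINGLE_GIVEN_HIT,
    (0, 2): 6 / 6,
}

escape_productive = sum(cells[k] * no_run_share_productive[k] for k in cells)
escape_unproductive = sum(cells[k] * no_run_share_unproductive[k] for k in cells)

print(f"Aggressive manager:      escape {escape_productive:.3f}, "
      f"score {1 - escape_productive:.3f}")
print(f"Less aggressive manager: escape {escape_unproductive:.3f}, "
      f"score {1 - escape_unproductive:.3f}")
```

Working from the rounded table entries, the totals come out within about a thousandth of the 0.630 and 0.846 escape probabilities quoted above.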

We summarize the results from Tables 5.1-5.4 into Table 5.5 from the perspective of the hitting team.  We compare their chances of success not only against Madison Bumgarner from the 2014 World Series but also against Tim Lincecum, Matt Cain, and Jonathan Sanchez from the 2010 World Series.

Table 5.5: Probability of Allowing at least One Run to Score

                                              2010 Tim Lincecum   2010 Matt Cain   2010 Jonathan Sanchez   2014 Madison Bumgarner
Runner on 1st & 0 Outs w/Unproductive Outs    0.305               0.224            0.531                   0.154
Runner on 1st & 0 Outs w/Productive Outs      0.576               0.475            0.758                   0.370
Runner on 2nd & 1 Out                         0.382               0.288            0.543                   0.238
Runner on 3rd & 2 Outs                        0.212               0.154            0.318                   0.127

Let’s return to the scenario that is the launching point for this study: the hitting team is down by one run and there is a runner on 1st base with no outs. If the game is in its early innings, where it is not mandatory that this runner at 1st gets home, the manager will likely decide against being aggressive and avoid sacrificing outs in order to increase his chances of extending the inning to score more runs; there are several studies supporting this logic. Yet, if the game is in the later innings and base runners are hard to come by, the manager should lean towards utilizing productive outs and intentionally sacrifice the runner from 1st to 2nd base. His shortsighted goal should only be to tie the game. By forcing productive outs rather than being conservative on the base paths, his chances of tying the game increase significantly (by between 0.216 and 0.271) against our four pitchers in a runner-on-1st-with-no-outs scenario.

However, if the manager does successfully orchestrate the runner from 1st to 2nd base with a productive out, he does still lose a little bit of probability of tying the game; between 0.132 and 0.215 of probability is lost against our pitchers. And if he decides to sacrifice the runner further from 2nd to 3rd base with another out, his team’s chances would decrease again by a comparable amount; this decision is ill-advised because a hit is likely going to be needed to tie the game and the hitting team would be sacrificing one of two guaranteed chances to hit in this situation. In general, the probability of scoring at least one run decreases as more outs are accumulated, regardless of the base runners advancing with each out. The manager could instead decide against sacrificing his batter if he has confidence that his batter can hit the pitcher or draw a walk, yet the imperative goal is still to tie the game. The odds of tying the game actually favor an aggressive hitting team that is able to get the runner to 2nd base with one out, by an improvement ranging from 0.012 to 0.084, over a less aggressive team with a runner at 1st with no outs. Thus, even though sacrificing the runner from 1st to 2nd base does decrease the chances of tying the game, it would be worse to approach the game lifelessly when the situation demands otherwise.
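The ranges quoted in the last two paragraphs follow directly from row-by-row differences in Table 5.5. A minimal sketch, using only the table’s values (the short row keys and the `spread` helper are illustrative names):

```python
# Table 5.5: probability of allowing at least one run, by pitcher and situation.
table_5_5 = {
    "2010 Tim Lincecum":      {"1st_0out_unprod": 0.305, "1st_0out_prod": 0.576, "2nd_1out": 0.382, "3rd_2out": 0.212},
    "2010 Matt Cain":         {"1st_0out_unprod": 0.224, "1st_0out_prod": 0.475, "2nd_1out": 0.288, "3rd_2out": 0.154},
    "2010 Jonathan Sanchez":  {"1st_0out_unprod": 0.531, "1st_0out_prod": 0.758, "2nd_1out": 0.543, "3rd_2out": 0.318},
    "2014 Madison Bumgarner": {"1st_0out_unprod": 0.154, "1st_0out_prod": 0.370, "2nd_1out": 0.238, "3rd_2out": 0.127},
}

def spread(row_a, row_b):
    """Return (min, max) of row_a minus row_b across the four pitchers."""
    diffs = [round(p[row_a] - p[row_b], 3) for p in table_5_5.values()]
    return min(diffs), max(diffs)

# Gain from using productive outs with a runner on 1st and no outs.
gain_from_aggression = spread("1st_0out_prod", "1st_0out_unprod")
# Probability given up once the runner has been moved to 2nd at the cost of an out.
cost_of_sacrifice = spread("1st_0out_prod", "2nd_1out")
# Edge of "runner on 2nd, 1 out" over a passive "runner on 1st, 0 outs".
edge_over_passive = spread("2nd_1out", "1st_0out_unprod")

print("Gain from productive outs:", gain_from_aggression)
print("Cost of the sacrifice:    ", cost_of_sacrifice)
print("Edge over passive play:   ", edge_over_passive)
```

Each pair printed is the low and high end of the range across Lincecum, Cain, Sanchez, and Bumgarner.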


A Discrete Pitchers Study – Pitchers’ Duels

(This is Part 3 of a four-part series answering common questions regarding starting pitchers by use of discrete probability models. In Part 1 we explored perfect game and no-hitter probabilities and in Part 2 we further investigated other hit probabilities in a complete game. Here we project the probability of winning a pitchers’ duel for who will allow the first hit.)

IV. Pitchers’ Duels

Bronze statues and folk songs are created to honor legendary feats of strength and stoicism… And Madison Bumgarner is deserving given his performance in the 2014 World Series. On baseball’s biggest stage, Bumgarner not only steamrolled an undefeated Royals team that was firing on all cylinders but he also posted timeless statistics (21 IP, 0.43 ERA, 0.127 BAA) that were beyond Ruthian or Koufaxian. Even as a rookie hidden among the 2010 Giants World Series rotation, Bumgarner’s potential radiated. So what do you do with an athlete who transcends time? You throw him into hypothetical matchups versus other champions. It would be thrilling, unless you like runs, to pit him against a pack of no-hitter-throwing pitchers (his 2010 rotation-mates) and even his 2010 self. We would be treated to great pitchers’ duels comparable to the matchups we would expect from a World Series.

When you oppose an excellent starting pitcher against another (and their hitters), the results will likely not reflect each player’s season averages. Hits and walks will be hard to come by and runs will be even harder. For our duels, we use each pitcher’s World Series probability of a hit, P(H): Bumgarner’s from 2014 and 2010 and the rest from 2010. P(H), defined as hits divided by the same base as on-base percentage (AB+SF+HBP+BB), represents the quality of pitching we want from our duels. Even though 2014 Bumgarner faced a different lineup (the Royals) than the lineup his 2010 rotation-mates faced (the Rangers) to produce their respective averages, we are encapsulating the performances witnessed and assuming they can be recreated for our matchups. If we accept this assumption, then we can construct a probability model that predicts which pitcher will allow the first hit in our hypothetical pitchers’ duels. If interested further, we could also switch the variables to predict which pitcher will allow the first base runner by using on-base percentage (OBP).

The first formula we construct determines the probability that 2010 Pitcher A will allow m hits before 2014 Bumgarner allows his 1st hit; it is possible for the mth hit from A and the 1st hit from Bumgarner to occur after the same number of batters, but in a duel we want a clear winner. Let a be P(H) for 2010 Pitcher A and TAm be a random variable for the total batters faced when he allows his mth hit; similarly, let b be P(H) for 2014 Bumgarner and TB1 be a random variable for the total batters faced when he allows his 1st hit. If 2010 Pitcher A allows his mth hit on the jth batter, he will have a combination of m hits and (j-m) non-hits (outs, walks, sacrifice flies, hit-by-pitches) with the respective probabilities of a and (1-a); meanwhile 2014 Bumgarner will eventually allow his 1st hit on the (j+1)th batter or later and he will have 1 hit and the rest non-hits with the respective probabilities of b and (1-b). We can then sum each jth scenario together for any number of potential batters faced (all j≥m) to create the formula below:

Formula 4.1

If we assume an even pitchers’ duel of who will allow the 1st hit, for m=1, then we have the following intuitive formula for 2010 Pitcher A versus 2014 Bumgarner:

Formula 4.2

This formula takes the probability that 2010 Pitcher A allows a hit minus the probability that both pitchers allow a hit and divides it by the probability that 2010 Pitcher A or 2014 Bumgarner allow a hit. Furthermore, if we let this happen for m hits, we arrive at our deduced formula. We should also note that according to the deduced formula, we should see the probability decrease as m increases. This logic makes sense because the expected span of batters until 2014 Bumgarner allows his 1st hit, TB1, stays the same, but we are trying to squeeze in more hits allowed by 2010 Pitcher A, which makes the probability become less likely.
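
The formula images themselves did not survive in this copy. As a hedged reconstruction (not necessarily the author's exact notation), the summation described above can be written out and collapsed with the negative binomial series. Here T_{A,m} and T_{B,1} stand for TAm and TB1, and batters are assumed independent:

```latex
% Sketch of Formula 4.1: A allows his m-th hit before B allows his 1st
P(T_{A,m} < T_{B,1})
  = \sum_{j=m}^{\infty} \binom{j-1}{m-1} a^{m} (1-a)^{j-m} (1-b)^{j}
  = \left( \frac{a - ab}{a + b - ab} \right)^{m}

% Sketch of Formula 4.2: the even duel, m = 1
P(T_{A,1} < T_{B,1}) = \frac{a - ab}{a + b - ab}
```

The closed form matches the verbal description: a - ab is the probability that Pitcher A allows a hit minus the probability that both do, and a + b - ab is the probability that either does. Because the ratio is below 1, raising it to the power m makes the probability shrink as m grows.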

Table 4.1:  Probability of 2010 Pitcher A Allowing mth Hit Before 2014 Bumgarner Allows 1st

2010 Pitcher A | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
World Series P(H) | 0.196 | 0.143 | 0.273 | 0.111
Allows 1st Hit before Bumgarner’s 1st | 0.583 | 0.504 | 0.660 | 0.441
Allows 2nd Hit before Bumgarner’s 1st | 0.340 | 0.254 | 0.435 | 0.195
Allows 3rd Hit before Bumgarner’s 1st | 0.198 | 0.128 | 0.287 | 0.086

In Table 4.1, we compare 2014 Bumgarner and his 0.123 World Series P(H) versus each starter from the 2010 World Series Giants rotation and their respective P(H). We expect 2014 Bumgarner to have the advantage over 2010 Lincecum, Cain, and Sanchez, given how he dominated the 2014 World Series; clearly he does. In an even pitchers’ duel, he would win with a probability greater than 50% even after the chance of a tie is removed; we could even see 2 hits from the other pitchers before 2014 Bumgarner allows his 1st with a probability greater than 25%. However, against a comparably excellent pitcher, himself in 2010, he would likely lose the duel because 2010 Bumgarner actually has a better P(H). Notice that from Sanchez to Lincecum and from Lincecum to Cain, the P(H) descends steadily each time; consequently, the same pattern of linear decline also follows duel probabilities when transitioning from pitcher to pitcher for each of the different hits allowed. Hence, the distinction between exceptional and below-average pitchers stays relatively constant as we allow more hits by them versus 2014 Bumgarner.
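
For readers who want to check the numbers, here is a minimal Python sketch of the duel probability. It assumes the closed form (a - ab)/(a + b - ab) raised to the power m, which is a reconstruction from the verbal description of the formula rather than the author's published expression. The function name is made up for illustration, and the inputs are the rounded P(H) values quoted in the article, so results may differ from Table 4.1 in the third decimal.

```python
def prob_mth_hit_before_first(a: float, b: float, m: int = 1) -> float:
    """P(Pitcher A allows his m-th hit strictly before Pitcher B allows his 1st).

    a and b are per-batter hit probabilities; batters are assumed independent.
    Assumed closed form: ((a - ab) / (a + b - ab)) ** m.
    """
    either_allows_hit = a + b - a * b  # P(A or B allows a hit to a given batter)
    return (a * (1 - b) / either_allows_hit) ** m


# Rounded World Series P(H) values quoted in the article.
BUMGARNER_2014 = 0.123
P_H_2010 = {"Lincecum": 0.196, "Cain": 0.143, "Sanchez": 0.273, "Bumgarner": 0.111}

table_4_1 = {
    name: [round(prob_mth_hit_before_first(a, BUMGARNER_2014, m), 3) for m in (1, 2, 3)]
    for name, a in P_H_2010.items()
}

for name, row in table_4_1.items():
    print(f"{name:10s} {row}")
```

Swapping in OBP for P(H) gives the analogous "first baserunner" duel.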

We can also construct the converse formula to calculate the probability that 2010 Pitcher A allows 1 hit before 2014 Bumgarner allows his nth hit. We let TBn be a random variable for the total batters faced when 2014 Bumgarner allows his nth hit and TA1 for when 2010 Pitcher A allows his 1st hit. However, instead of directly deducing the probability that 2010 Pitcher A allows 1 hit before 2014 Bumgarner allows his nth hit, we’ll do so indirectly by taking the complement of both the probability that 2014 Bumgarner allows his nth hit before 2010 Pitcher A allows his 1st hit (a variation of our first formula) and the probability that 2014 Bumgarner allows his nth hit and 2010 Pitcher A allows his 1st hit after the same number of batters.

Formula 4.3

The resulting formula takes the complement of the probability that 2014 Bumgarner allows n hits and 2010 Pitcher A does not allow a hit in (n-1) chances and divides it by the probability that 2010 Pitcher A or 2014 Bumgarner allow n hits. In this formula we can contrarily see the probability increase as n increases. By extending the expected span of batters, TBn, to accommodate 2014 Bumgarner’s n hits instead of just 1, we’re granting 2010 Pitcher A more time to allow his 1st hit, resulting in an increased likelihood.

Once again, if we set n=1 for an even matchup, we get the same formula as before:

Formula 4.4
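
Since the formula images are missing here as well, the following is a hedged reconstruction from the description above, using the same independence assumption and writing T_{A,1} and T_{B,n} for TA1 and TBn. It takes the complement of the two events named in the text and simplifies:

```latex
% Sketch of Formula 4.3: A allows his 1st hit before B allows his n-th
P(T_{A,1} < T_{B,n})
  = 1 - P(T_{B,n} < T_{A,1}) - P(T_{B,n} = T_{A,1})
  = 1 - \frac{b^{n} (1-a)^{n-1}}{(a + b - ab)^{n}}

% Sketch of Formula 4.4: the even duel, n = 1
P(T_{A,1} < T_{B,1}) = 1 - \frac{b}{a + b - ab} = \frac{a - ab}{a + b - ab}
```

The numerator b^n (1-a)^(n-1) is the "n hits from Bumgarner and no hit from Pitcher A in (n-1) chances" term from the text, and setting n = 1 recovers the even-duel ratio.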

Table 4.2:  Probability of 2010 Pitcher A Allowing 1st Hit Before 2014 Bumgarner Allows nth

2010 Pitcher A | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
World Series P(H) | 0.196 | 0.143 | 0.273 | 0.111
Allows 1st Hit before Bumgarner’s 1st | 0.583 | 0.504 | 0.660 | 0.441
Allows 1st Hit before Bumgarner’s 2nd | 0.860 | 0.789 | 0.916 | 0.723
Allows 1st Hit before Bumgarner’s 3rd | 0.953 | 0.910 | 0.979 | 0.862

In Table 4.2, we again use 2014 Bumgarner’s 0.123 P(H) versus those displayed in the table above. As expected, the probabilities from the even duels are the same as Table 4.1 because the formulas are the same. This time, however, from Sanchez to Lincecum and from Lincecum to Cain, the difference between each pitcher noticeably decreases as we adjust the scenario to allow 2014 Bumgarner more hits. There is therefore less distinction between exceptional and below-average pitchers if we widen the range of batters, TBn, enough for them to allow their 1st hit versus 2014 Bumgarner.
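
The converse probability can be sanity-checked numerically. The sketch below compares an assumed closed form, 1 - b^n (1-a)^(n-1) / (a + b - ab)^n (a reconstruction from the prose, not a formula confirmed by the source), against a direct batter-by-batter summation. Both function names are hypothetical.

```python
from math import comb


def prob_first_before_nth(a: float, b: float, n: int = 1) -> float:
    """Assumed closed form for P(A allows his 1st hit before B allows his n-th)."""
    return 1 - (b ** n) * (1 - a) ** (n - 1) / (a + b - a * b) ** n


def prob_first_before_nth_series(a: float, b: float, n: int = 1,
                                 max_batters: int = 400) -> float:
    """Same probability by brute force: sum over the batter j on which A's 1st hit falls."""
    total = 0.0
    for j in range(1, max_batters + 1):
        # A retires (j - 1) batters without a hit, then allows a hit to batter j.
        p_a_first_hit_at_j = (1 - a) ** (j - 1) * a
        # B has allowed at most (n - 1) hits through his first j batters.
        p_b_fewer_than_n = sum(
            comb(j, k) * b ** k * (1 - b) ** (j - k) for k in range(min(n, j + 1))
        )
        total += p_a_first_hit_at_j * p_b_fewer_than_n
    return total


# Rounded World Series P(H): 2010 Lincecum versus 2014 Bumgarner.
a, b = 0.196, 0.123
for n in (1, 2, 3):
    print(n, round(prob_first_before_nth(a, b, n), 3),
          round(prob_first_before_nth_series(a, b, n), 3))
```

Agreement between the two columns is a check on the algebra only; it says nothing about whether World Series samples this small describe a pitcher's true talent.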

Madison Bumgarner may have dominated the 2014 World Series as a starter, but he also forcefully shut the door on the Royals to carry his team to the title (by ominously throwing 5 IP, 2 H, 0 BB). Given the momentum he had, he proved himself to be Bruce Bochy’s best option. However, not every game is Game 7 of the World Series, where a manager must decisively bring in the one reliever he trusts the most. A manager needs to assess who is the appropriate reliever for the job and weigh which relievers will be available later. Fortunately, an indirect benefit of the pitchers’ duel model is that it can calculate the relative probability between two relievers for who will allow a hit or baserunner first; this application could be very useful in long relief or in extra innings.

Table 4.3:  Probability of 2010 Pitcher A Allowing mth Baserunner Before 2014 Bumgarner Allows 1st

2010 Pitcher A | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
World Series OBP | 0.268 | 0.214 | 0.409 | 0.185
Allows 1st BR before Bumgarner’s 1st | 0.602 | 0.547 | 0.698 | 0.511
Allows 2nd BR before Bumgarner’s 1st | 0.362 | 0.299 | 0.487 | 0.261
Allows 3rd BR before Bumgarner’s 1st | 0.218 | 0.164 | 0.339 | 0.133

Suppose we’re entering extra innings and the only pitchers available are 2014 Bumgarner and 2010 Bumgarner, Lincecum, Cain, and Sanchez with their respective statistics from Table 4.3 (where we substituted OBP for the P(H) used in Table 4.1). We wouldn’t automatically throw in our best pitcher, 2014 Bumgarner, with his 0.151 OBP; we need to compare how he would perform relative to the other 2010 pitchers and see what the drop off is. Nor is it a priority to know how many innings to expect out of our reliever because we don’t know how long he’ll be needed. What is crucial in this situation is the prevention of baserunners as potential runs. 2010 Bumgarner, Cain, and Lincecum would each be worthy candidates to keep 2014 Bumgarner in the bullpen, because each has a reasonable chance (greater than 40%) of allowing a baserunner by the same batter or later than 2014 Bumgarner. Hence, the risk of using a pitcher with a slightly greater chance of allowing a baserunner sooner may be worth the reward of having 2014 Bumgarner available in a more dire situation. Yet, we would want to avoid bringing in 2010 Sanchez because the risk would be too great; the probability is approximately 49% that he could allow two baserunners before 2014 Bumgarner allows one. Preventing baserunners and using your bullpen appropriately are both high priorities in close game situations where mistakes are magnified.


A Discrete Pitchers Study – Predicting Hits in Complete Games

(This is Part 2 of a four-part series answering common questions regarding starting pitchers by use of discrete probability models.  In Part 1, we dealt with the probability of a perfect game or a no-hitter. Here we deal with the other hit probabilities in a complete game.)

III. Yes! Yes! Yes, Hitters!

Rare game achievements, like a no-hitter, will get a starting pitcher into the record books, but the respect and lucrative contracts are only awarded to starting pitchers who can pitch successfully and consistently. Matt Cain and Madison Bumgarner have had this consistent success and both received contracts that carry the weight of how we expect each pitcher to be hit. Yet, some pitchers are hit more often than others and some are hit harder. Jonathan Sanchez had shown moments of brilliance but pitch control and success were not sustainable for him. Tim Lincecum had proven himself an elite pitcher early in his career, with two Cy Young awards, but he never cashed in on a long term contract before his stuff started to tail off. Yet, regardless of success or failure, we can confidently assume that any pitcher in this rotation or any other will allow a hit when he takes the mound. Hence, we should construct our expectations for a starting pitcher based on how we expect each to get hit.

An inning is a good point to begin dissecting our expectations for each starting pitcher because the game is partitioned by innings and each inning resets. During these independent innings a pitcher’s job is generally to keep the runners off the base paths. We consider him successful if he can consistently produce 1-2-3 innings and we should be concerned if he instead produces innings with an inordinate number of base runners; whether or not the base runners score is a different issue.

Let BR be the base runners we expect in an inning and let OBP be the on-base percentage for a specific starting pitcher, then we can construct the following negative binomial distribution to determine the probabilities of various inning scenarios:

Formula 3.1

If we let br be a random variable for base runners in an inning, we can apply the formula above to deduce how many base runners per inning we should expect from our starting pitcher:

Formula 3.2

The resulting expectation creates a baseline for our pitcher’s performance by inning and allows us to determine if our starting pitcher generally meets or fails our expectations as the game progresses.
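
A minimal sketch of the inning model follows, assuming the standard negative binomial form: each plate appearance is independently a base runner with probability OBP or an out otherwise, and the inning ends on the third out. Double plays and outs on the bases are ignored. The function names and the 0.300 OBP are illustrative assumptions, not values from the article.

```python
from math import comb


def p_base_runners(obp: float, br: int, outs: int = 3) -> float:
    """P(exactly `br` base runners before the final out of the inning)."""
    # The last plate appearance is the final out; the earlier (br + outs - 1)
    # plate appearances hold the base runners in any order.
    return comb(br + outs - 1, br) * obp ** br * (1 - obp) ** outs


def expected_base_runners(obp: float, outs: int = 3) -> float:
    """Mean of the negative binomial: outs * OBP / (1 - OBP)."""
    return outs * obp / (1 - obp)


obp = 0.300  # hypothetical pitcher
p0 = p_base_runners(obp, 0)
p1 = p_base_runners(obp, 1)
print(f"P(0 BR) = {p0:.3f}, P(1 BR) = {p1:.3f}, P(2+ BR) = {1 - p0 - p1:.3f}")
print(f"E(BR)   = {expected_base_runners(obp):.3f}")
```

Under this sketch an OBP of roughly 0.291 gives about 0.356, 0.311, and 1.23 for P(0), P(1), and the expectation, which lines up with the Bumgarner column of Table 3.1.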

Table 3.1: Inning Base Runner Probabilities by Pitcher

Outcome | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
P(0 Base Runners) | 0.333 | 0.352 | 0.280 | 0.356
P(1 Base Runner) | 0.307 | 0.310 | 0.290 | 0.311
P(≥2 Base Runners) | 0.360 | 0.338 | 0.430 | 0.333
E(Base Runners) | 1.326 | 1.250 | 1.586 | 1.233

Based upon career OBPs through the 2013 season, Bumgarner would have the greatest chance (0.356) of retiring the side in order and he would be expected to allow the fewest base runners, 1.233, in an inning; Cain should also have comparable results. The implications are that Bumgarner and Cain represent a top tier of starting pitchers who are more likely to allow 0 base runners than either 1 base runner or 2+ base runners in an inning. A pitcher like Lincecum, expected to allow 1.326 base runners in an inning, represents another tier who would be expected to pitch in the windup (for an entire inning) in approximately ⅓ of innings and pitch from the stretch in ⅔ of innings. Sanchez, on the other hand, represents a lower tier of starting pitchers who are more likely to allow either 1 or 2+ base runners than 0 base runners in an inning. He has the least chance (0.280) of having a 1-2-3 inning and would be expected to allow the most base runners, 1.586, in an inning.

As important as base runners are for turning into runs, the hits and walks that make up the majority of base runners reflect two disparate skills. Hits generally result from pitches in the strike zone and demonstrate an ability to locate pitches; by contrast, walks result from pitches outside the strike zone and show a lack of command. Hence, we’ll create an expectation for hits and another for walks for our starting pitchers to determine if they are generally good at preventing hits and walks or prone to allowing them in an inning.

Let h, bb, and hbp be random variables for hits, walks, and hit-by-pitches and let P(H), P(BB), P(HBP) be their respective probabilities for a specific starting pitcher, such that OBP = P(H) + P(BB) + P(HBP). The probability of Y hits occurring in an inning for a specific pitcher can be constructed from the following negative multinomial distribution:

Formula 3.3

We can further apply the probability distribution above to create an expectation of hits per inning for our starting pitcher:

Formula 3.4

For walks, we do not have to repeat these machinations. If we simply substitute walks for hits, the probability of Z walks occurring in an inning and the expectation for walks per inning for a specific pitcher become similar to the ones we deduced earlier for hits:

Formula 3.5

We could repeat the same substitution for hit-by-pitches, but the corresponding probability distribution and expectation are not significant.
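
To make the negative multinomial step concrete, here is a hedged Python sketch. It assumes each plate appearance is independently a hit, walk, hit-by-pitch, or out, and that an inning ends on the third out. Summing the joint distribution over all walk and hit-by-pitch counts leaves a negative binomial in hits alone, with per-batter parameter P(H) / (P(H) + P(out)). The function names and the example rates are hypothetical.

```python
from math import comb, factorial


def p_hits_in_inning(y: int, p_h: float, p_bb: float, p_hbp: float, outs: int = 3) -> float:
    """Marginal P(exactly y hits in an inning), walks and HBP summed out."""
    p_out = 1 - p_h - p_bb - p_hbp
    q = p_h / (p_h + p_out)  # chance of a hit among plate appearances ending in hit or out
    return comb(y + outs - 1, y) * q ** y * (1 - q) ** outs


def p_hits_by_summation(y: int, p_h: float, p_bb: float, p_hbp: float,
                        outs: int = 3, max_other: int = 40) -> float:
    """Same probability from the joint distribution, summing over walks (e) and HBP (f)."""
    p_out = 1 - p_h - p_bb - p_hbp
    total = 0.0
    for e in range(max_other + 1):
        for f in range(max_other + 1):
            before_last_out = y + e + f + outs - 1
            ways = factorial(before_last_out) // (
                factorial(y) * factorial(e) * factorial(f) * factorial(outs - 1)
            )
            total += ways * p_h ** y * p_bb ** e * p_hbp ** f * p_out ** outs
    return total


def expected_hits_in_inning(p_h: float, p_bb: float, p_hbp: float, outs: int = 3) -> float:
    """Expected hits per inning: outs * P(H) / (1 - OBP)."""
    return outs * p_h / (1 - p_h - p_bb - p_hbp)


rates = dict(p_h=0.22, p_bb=0.07, p_hbp=0.01)  # hypothetical per-batter rates
for y in range(4):
    print(y, round(p_hits_in_inning(y, **rates), 3), round(p_hits_by_summation(y, **rates), 3))
print("E(hits) =", round(expected_hits_in_inning(**rates), 3))
```

Passing the walk rate in the `p_h` slot and the hit rate in the `p_bb` slot gives the walk distribution, which is the substitution the text describes.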

Table 3.2: Inning Hit Probabilities by Pitcher

Outcome | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
P(0 Hits in 1 Inning) | 0.457 | 0.466 | 0.439 | 0.443
P(1 Hit in 1 Inning) | 0.315 | 0.314 | 0.316 | 0.316
P(2 Hits in 1 Inning) | 0.145 | 0.141 | 0.152 | 0.150
P(3 Hits in 1 Inning) | 0.056 | 0.053 | 0.061 | 0.060
E(Hits in 1 Inning) | 0.896 | 0.870 | 0.947 | 0.936

The results of Table 3.2 and Table 3.3 are generated through our formulas using career player statistics through 2013. Cain has the highest probability (0.466) of not allowing a hit in an inning while Sanchez has the lowest probability (0.439) among our starters. However, the actual variation between our pitchers is fairly minimal for each of these hit probabilities. This lack of variation is further reaffirmed by the comparable expectations of hits per inning; each pitcher would be expected to allow approximately 0.9 hits per inning. Yet, we shouldn’t expect the overall population of MLB pitchers to allow hits this consistently; our results only indicate that this particular Giants rotation had a similar consistency in preventing the ball from being hit squarely.

Table 3.3: Inning Walk Probabilities by Pitcher

Outcome | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
P(0 Walks in 1 Inning) | 0.685 | 0.718 | 0.589 | 0.776
P(1 Walk in 1 Inning) | 0.244 | 0.225 | 0.286 | 0.189
P(2 Walks in 1 Inning) | 0.058 | 0.047 | 0.093 | 0.031
P(3 Walks in 1 Inning) | 0.011 | 0.008 | 0.025 | 0.004
E(Walks in 1 Inning) | 0.404 | 0.351 | 0.580 | 0.264

The disparity between our starting pitchers becomes noticeable when we look at the variation among their walk probabilities. Bumgarner has the highest probability (0.776) of getting through an inning without walking a batter and he has the lowest expected walks (0.264) in an inning. Sanchez contrarily has the lowest probability (0.589) of having a 0 walk inning and has more than double the walk expectation (0.580) of Bumgarner. Hence, this Giants rotation had differing abilities targeting balls outside the strike zone or getting hitters to swing at balls outside the strike zone.

Now that we understand how a pitcher’s performance can vary from inning to inning, we can piece these innings together to form a 9 inning complete game. The 9 innings provide a complete depiction of our starting pitcher’s performance because they afford him an inning or two to underperform and the batters he faces each inning vary as he goes through the lineup. At the end of a game our eyes still gravitate to the hits in the box score when evaluating a starting pitcher’s performance.

Let D, E, and F be the respective hits, walks, and hit-by-pitches we expect to occur in a game, then the following negative multinomial distribution represents the probability of this specific 9 inning game occurring:

Formula 3.6

Utilizing the formula above we previously answered, “What is the probability of a no-hitter?”, but we can also use it to answer a more generalized question, “What is the probability of a complete game Y hitter?”, where Y is a random variable for hits. This new formula will not only tell us the probability of a no-hitter (inclusive of a perfect game), but it will also reveal the probability of a one-hitter, three-hitter, etc. Furthermore, we can calculate the probability of allowing Y hits or less or determine the expected hits in a complete game.

Let h, bb, hbp again be random variables for hits, walks, and hit-by-pitches.

Formula 3.7

Formula 3.8

Formula 3.9

The derivations of the complete game formulas above are very similar to their inning counterparts we deduced earlier. We only changed the number of outs from 3 (an inning) to 27 (a complete game), so we did not need to reiterate the entire proofs from earlier; these formulas could also be constructed for an 8 inning (24 outs), a 10 2/3 inning (32 outs), or any other performance with the same logic.
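
The change from 3 outs to 27 can be sketched in a few lines of Python. This version assumes the marginal hit distribution is negative binomial with 27 outs and a per-batter hit probability p taken over plate appearances that end in a hit or an out, which batting average against approximates. The function names and the .240 input are illustrative assumptions.

```python
from math import comb


def p_complete_game_hits(y: int, p: float, outs: int = 27) -> float:
    """P(exactly y hits allowed before the final out is recorded)."""
    return comb(y + outs - 1, y) * p ** y * (1 - p) ** outs


def p_at_most_hits(y: int, p: float, outs: int = 27) -> float:
    """P(y hits or fewer), summing the exact-hit probabilities."""
    return sum(p_complete_game_hits(k, p, outs) for k in range(y + 1))


def expected_hits(p: float, outs: int = 27) -> float:
    """Expected hits over the given number of outs: outs * p / (1 - p)."""
    return outs * p / (1 - p)


ba_against = 0.240  # hypothetical batting average against
for y in range(5):
    print(f"P({y} hits) = {p_complete_game_hits(y, ba_against):.4f}   "
          f"P(<= {y} hits) = {p_at_most_hits(y, ba_against):.4f}")
print(f"E(hits) = {expected_hits(ba_against):.3f}")
```

Setting `outs=24` or `outs=32` covers the 8 inning and 10 2/3 inning cases mentioned above. With a .240 input the expectation is 27 × .240 / .760, which equals the 8.526 listed for Sanchez in Table 3.4.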

Table 3.4: Complete Game Hit Probabilities by Pitcher using BA

Outcome | Tim Lincecum | Matt Cain | Jonathan Sanchez | Madison Bumgarner
P(0 Hits in 9 Innings) | 0.001 | 0.001 | 0.001 | 0.001
P(1 Hit in 9 Innings) | 0.006 | 0.007 | 0.004 | 0.005
P(2 Hits in 9 Innings) | 0.023 | 0.026 | 0.017 | 0.018
P(≤3 Hits in 9 Innings) | 0.060 | 0.067 | 0.046 | 0.049
P(≤4 Hits in 9 Innings) | 0.124 | 0.137 | 0.099 | 0.105
E(Hits in 9 Innings) | 8.062 | 7.833 | 8.526 | 8.420

The results of Table 3.4 were generated from the complete game approximation probabilities that use batting average (against) as an input. Any of the four pitchers from the Giants rotation would be expected to allow 8 or 9 hits in a complete game (or potentially 40 total batters such that 40 = 27 outs + 9 hits + 4 walks), but in reality, if any of them are going to be given a chance to throw a complete game they’ll need to pitch better than that and average less than 3 pitches per batter for their manager to consider the possibility. If we instead establish a limit of 3 hits or less to be eligible for a complete game, regardless of pitch total, walks, or game situation (not realistic), we could witness a complete game in at most 1 or 2 starts per season for a healthy and consistent starting pitcher (approximately 30 starts with a 5% probability). Of course, we would leave open the possibility for our starting pitcher to exceed our expectations by throwing a two-hitter, one-hitter, or even a no-hitter despite the likelihood. There is still a chance! Managers definitely need to know what to expect from their pitchers and should keep these expectations grounded, but it is not impossible for a rare optimal outcome to come within reach.


Why is Bronson Arroyo Still Throwing a Changeup?

I respect the change-up. As a pitcher myself, I know how difficult it is to throw a good one (thus I don’t). It’s not the most glamorous pitch in baseball, but certainly an effective one if executed correctly. Plus, what constitutes a good off-speed offering reads like a laundry list of mechanical and ball path attributes that have to be repeated over and over again. Proper grip on the baseball. Delivery and arm speed must be identical to the fastball. Velocity needs to be lower than the fastball. The ball should move (ideally both horizontally and vertically) and be spotted in a good location. And lastly, there’s the intangible pitching IQ of understanding when to throw it.

The Diamondbacks’ Bronson Arroyo and his change-up seem to be missing a majority of these qualities… but for some reason he continues to throw the darned thing. 16% of the time in 2013, in fact, and already almost 18% of the time this season. I’m baffled.

Now, of course I can’t know what’s going on in his head (although if someone can point me to an all-encompassing Pitching IQ metric I would be more than happy to apply it). And I also can’t measure his arm velocity at release. So I can’t quantify all of his deficiencies. But there is, fortunately, hard numerical and visual data showing he’s lacking the necessary skills to throw a change-up well.

Let’s look at Arroyo compared to pitchers who threw more than 200 change-ups between 2011 and 2013:

Movement:

Since change-ups (especially the circle change) tend to move down and to the right for right-handed pitchers versus down and to the left for southpaws, absolute value of x-Mov and z-Mov is used to standardize axis movement for both.

2011-2013 Abs(x-Mov) Abs(z-Mov)
League Average 7.17 4.30
Arroyo 6.00 3.60

I’ll give him a C- for movement. F’s are left for the likes of a Samuel Deduno, who posted a whopping 0.3″ of lateral and 1.6″ of vertical (ignoring the natural pull of gravity) movement in 2013.

Velocity:

Again, keep in mind this does not include all pitchers, just ones who have thrown 200 or more change-ups between 2011 and 2013.

2011-2013 vFA (pfx) vCH (pfx)
League Average 90.9 82.9
Arroyo 86.6 78.2

When batters are already sitting on a below average fastball, it’s fair to say it won’t take much of an adjustment to catch up to the change. Below average may even be an understatement. There are only 12 guys in this data set of 275 with a lower average vFA. Jamie Moyer is one of them.

D+.

Location:

There are very few pitchers that can have success locating the change-up for called strikes. Fernando Rodney is the freak off-speed guru who catches batters looking, with a career 46.2 Swing%, 48.8 Zone% and 1.51 Val/C on the change. Typically the best change hurlers induce swings. And those swings either result in bad contact or a flat out whiff. But location of the pitch is still overwhelmingly crucial to achieve either.

I’ll use 2013 poor contact master Hyun-Jin Ryu and Braves injured whiff king Kris Medlen for illustration.

Ryu, with his 56.2 Swing% and 70.9 Contact% is looking to get bat on ball with the change. Ending 2013 with a .187 BABIP, the pitch worked beautifully to induce dribbling grounders (54.7 GB%) to an already above average Dodgers defense (3.1 UZR/150). How did he do it? Pin-perfect location (courtesy of Brooks Baseball).

[Image: Ryu’s change-up zone chart, via Brooks Baseball]

Arroyo also induces hitters to get the bat on the ball with the change… at a whopping 85.5 Contact% rate. But is he getting poor contact with the pitch? I somehow don’t think .600+ SLG and 23 HR  over the past three full seasons would constitute bad contact. Let’s compare his zone chart with that of Ryu.

[Image: Arroyo’s change-up zone chart]

Not quite, Bronson.

“But what about whiffs?” you ask. With a 6.8 career SwStr%, batters aren’t swinging and missing Arroyo’s meatballs either.

Let’s look at Medlen who owns a 27.5 career SwStr% on the pitch for comparison.

[Image: Medlen’s change-up zone chart]

Pretty, no?

I’ll give Arroyo a D- for location. At least he’s not hanging them up and in on lefties.

So overall grade: barely passing.

I really don’t know what to say at this point. I’m miffed. Confounded. And who is the culprit to blame in the grand mystery of why he continues to throw this sub-par pitch? Batters have already gone deep on it twice in 2014. Is it the catchers? Do we point the finger at Devin Mesoraco, Ryan Hanigan, and now Miguel Montero for keeping blind faith and confidence? Are these guys cursed with chronic short-term memory loss? Or do we blame Arroyo for stubbornly going out there outing after outing and continuing to shove that ball in the back of his palm and firing away? If that’s the case, I get it. I’m a pitcher. I’ve stood there on the mound and thought, “This next one will be better, guys. I swear!”

So, please, Bronson. In the end, there is really nothing good that has come from you throwing the thing so often. I like you. I really do. I will forever be indebted to you for giving my beloved 2004 Red Sox their first World Series since “tarnation” was a common curse word. But please. Enough change-ups already.


Battle of the Ks: K/9, K/BB and K%

The great debate has been raging for years: which strikeout-related metric is a better predictor of actual pitching success? Some would say there is no right or wrong answer, that each metric has its own unique merit and value, and that one must look at certain strikeout-related metrics in combination with others. Unfortunately, as tragic as it may seem, statistical evidence begs to differ. Statistics tell us there is in fact a right answer, and it’s a whopper.

Let’s start with K/9. Looking at all 2013 pitchers with 80+ innings, the correlation (R²) between strikeouts per 9 and ERA is a solid .1081. This correlation has been consistent, plus or minus a few hundredths, for the past five years, so nothing exciting or anomalous can be found in looking at other seasons. Yu Darvish leads the category with Tony Cingrani, Max Scherzer, Anibal Sanchez, and A.J. Burnett rounding out the top five. Additionally, eight of the top ten K/9 leaders ended up with sub-3.10 ERAs. So a decent indicator all-around.

[Image: K/9 versus ERA, 2013 pitchers with 80+ innings]

K/BB gets a bit more interesting. We see a jump in linear correlation to .1671, more than a 50% increase over K/9. Clayton Kershaw, Cliff Lee, and Adam Wainwright all leap into the top ten of this metric, with Hisashi Iwakuma climbing into the top fifteen; all four were elite hurlers in 2013 who were left off the K/9 leaderboard.

[Image: K/BB versus ERA, 2013 pitchers with 80+ innings]

But the real gem is K%. It shows nearly double the correlation of K/9 (.2089 versus .1081). Plus, the top fifteen in this category all ended the year with sub-3.30 ERAs, whereas Scott Kazmir (4.04) and Josh Johnson (6.20) smeared the good name of the K/9 leaderboard, and Kevin Slowey (4.11) and Dan Haren (4.67) loitered unpleasantly on the K/BB board.

The reason K% is so powerful is that it isolates how effective a pitcher is at striking out each batter he faces. When BABIP gets involved, as it does for K/9, the value of each strikeout is severely reduced: high-BABIP pitchers are rewarded on K/9 because the number of outs remains the same even if they’re giving up, say, 10+ hits per game.
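
A small worked example may help here. The two stat lines below are invented purely for illustration: both pitchers record 9 strikeouts over 27 outs, but the second faces ten more batters because more balls in play fall for hits. The helper names are hypothetical.

```python
def k_per_9(strikeouts: int, outs_recorded: int) -> float:
    """Strikeouts per nine innings; 27 outs make nine innings."""
    return strikeouts * 27 / outs_recorded


def k_pct(strikeouts: int, batters_faced: int) -> float:
    """Share of batters faced who strike out."""
    return strikeouts / batters_faced


# Hypothetical stat lines: same strikeouts and outs, different traffic on the bases.
low_babip = {"strikeouts": 9, "outs_recorded": 27, "batters_faced": 30}
high_babip = {"strikeouts": 9, "outs_recorded": 27, "batters_faced": 40}

for label, line in (("low BABIP", low_babip), ("high BABIP", high_babip)):
    print(f"{label:10s} K/9 = {k_per_9(line['strikeouts'], line['outs_recorded']):.1f}   "
          f"K% = {k_pct(line['strikeouts'], line['batters_faced']):.1%}")
```

K/9 cannot tell the two apart because its denominator counts only outs, while K% drops for the pitcher who needed more batters to get those same outs.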

[Image: K% versus ERA, 2013 pitchers with 80+ innings]

To recap:

2013 R² (correlation to ERA)
K/9 .1081
K/BB .1671
K% .2089

So should we end the debate completely? No. But if you asked me to put money on Tim Lincecum, a career 25.8 K% pitcher with no decline in the stat over the past 2 years, over Tyler Chatwood, a career 13.0 K% who had a breakout year in 2013 with his freakish 76.3% LOB, I would bet on Lincecum every doggone time.


Another Look at Tom Glavine’s Generous Strike Zone

Jeff Sullivan recently suggested that despite his reputation Tom Glavine did not pitch to a significantly more generous strike zone. Sullivan points out Glavine did not get significantly more called strikes than other pitchers, even during the peak of his career. Sullivan’s analysis piqued my interest and made me wonder if Glavine’s reputation for getting a wider strike zone helped him succeed in ways beyond called strikes.

Glavine’s reputation alone likely influenced a batter’s behavior at the plate, encouraging batters who were behind in the count to swing at questionable pitches. Batters believed that if they did not swing, these pitches would be called strikes for Glavine (when a batter swings at a pitch out of the zone while ahead in the count, that has more to do with a pitcher’s stuff than the batter giving the pitcher an expanded zone). So, what would we expect from a pitcher who is getting batters to expand the strike zone? You would expect batters to make poor contact, yielding a lower BABIP. The batter would most likely swing at pitches outside the zone when the batter is behind in the count.

Based on this reasoning, I hypothesize that Tom Glavine will see a greater reduction in quality of contact when he gets ahead of the count than a league-average pitcher. I’m going to look at the time span from 1991 to 2002 because that was the time span Jeff looked at and because I like palindromes.

To measure quality of contact I will be looking at BACON (batting average on contact). BACON is slightly different than BABIP because BACON includes home runs. If batters are expanding the strike zone when Glavine is ahead in the count we should see the quality of contact decrease. To measure the decrease in quality of contact, I will look at the ratio of BACON when Glavine is ahead to BACON when Glavine is behind, scaled so that 100 means no change (the lower the number, the greater the improvement the pitcher experiences by getting ahead in the count). I will refer to this measure as EXP (a lower EXP shows a greater decrease in quality of contact, an EXP above 100 shows an increase in quality of contact). The graph below compares Glavine’s EXP to the league average EXP for each season during the 11-year span.

The league-average EXP is consistent year to year, hovering around 91, which suggests batters expand the strike zone for most pitchers when batters are behind in the count. Glavine’s EXP is not always better than the league-average EXP. In ’94 and ’96 Glavine was actually worse when ahead in the count than when he was behind. This is to be expected because BACON takes a while to stabilize: a single season gives us only about 170 fair balls with Glavine ahead and 280 fair balls with Glavine behind, a small sample subject to a fair amount of random noise. BACON stabilizes at around 2,000 fair balls, more than Glavine allows in a single season. The league-average EXP for a full year, by contrast, is stable, with 3,500 fair balls with the pitcher ahead in the count and 4,600 fair balls with the pitcher behind in the count.

To make sure we are not just attributing skill to random variation, we need to look at a larger sample for Glavine. Over the 11-year span from 1991 to 2002, Glavine induced weaker contact (lower BACON) than the league average both when he was ahead in the count and when he was behind. This is not surprising, as we would expect a good pitcher to be better than average in both situations. What’s interesting is that Glavine has a better-than-league-average EXP (87 vs. 92), which suggests Glavine is better at expanding the strike zone than league-average pitchers. This comes with the caveat that while we have 3,056 fair balls when Glavine is behind in the count, we only have 1,853 fair balls when Glavine is ahead, just shy of the 2,000 at which the measure should stabilize. Even so, the difference between Glavine’s EXP and the league-average EXP is very convincing.

BACON by count, 1991-2002

                 Ahead      Behind     EXP
Glavine          0.266055   0.304319   87.42626
MLB average      0.303134   0.330999   91.58153

To stabilize BACON, I increased the sample by looking at all the balls put in play. I compared balls put in play when the pitcher had two strikes to balls put in play when the pitcher had fewer than two strikes, which led to EXP2: the ratio of BACON when a pitcher has two strikes to BACON when he has fewer than two strikes. The table below compares the quality of contact in two-strike counts to non-two-strike counts.

BACON by strike count, 1991-2002

                 2 Strikes   Not 2 Strikes   EXP2
Glavine          0.275       0.302           91.22
MLB average      0.3118      0.331           94.19

Even with this larger sample size, Glavine’s BACON is still lower than the league average in the respective counts. More importantly, his EXP2 is still better than league average (although higher than his EXP). Pitchers in general try to induce weaker contact when they are ahead in the count, but the data show Glavine is doing something special to induce even weaker contact.

Is Glavine getting batters to give him a wider strike zone? We cannot definitively say what is causing this pattern in the data, but we are seeing the type of numbers we would expect to see if batters were giving him a wider strike zone.

 

All splits numbers are from Baseball-Reference.


The R.A. Dickey Effect – 2013 Edition

Announcers and baseball fans alike often say that knuckleball pitchers can throw hitters off their game and leave them in funks for days. Some managers even sit certain players to avoid this effect. I decided to analyze whether the effect really exists and, if so, what it is worth. R.A. Dickey is the main knuckleballer in the game today, and he is a special breed because of the extra velocity he has.

Most people who try to analyze this Dickey effect lump all the pitchers who follow him into one group with one ERA and compare it to the total ERA of the bullpen or rotation. This is a simplistic and non-descriptive approach, and it does not account for how often those pitchers pitch when not following Dickey.

Dickey’s Dancing Knuckleball (@DShep25)

I decided to determine whether there truly is an effect on the statistics (ERA, WHIP, K%, BB%, HR%, and FIP) of pitchers who follow Dickey in relief and of the starters of the next game against the same team. I went through every game that Dickey has pitched and recorded the stats (IP, TBF, H, ER, BB, K) of each reliever individually and the stats of the next starting pitcher, if the next game was against the same team. I did this for each season. I then took each pitcher’s stats for the whole year and subtracted his after-Dickey stats from them, which leaves his stats when he did not follow Dickey.

I summed the after-Dickey stats and calculated the rate stats from the total. Each pitcher was weighted by the batters he faced after Dickey as a share of the total batters faced after Dickey, and that same weight was then applied to his not-after-Dickey stats. So, for example, if Janssen faced 19.11% of batters after Dickey, his line was adjusted so that he also faced 19.11% of the batters not after Dickey. This puts the two groups on the same footing, so the comparison reflects the situation rather than a different mix of pitchers. The weighted not-after-Dickey stats were then summed and the rate stats calculated as well. The two sets of rate stats were compared using this formula: (afterDickeySTAT - notafterDickeySTAT) / notafterDickeySTAT. This tells me, as a percentage, how much better or worse relievers or starters did when following Dickey.
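The weighting step can be sketched in a few lines of Python. This is an illustration under assumptions, not the spreadsheet actually used: the function name, the data layout, and the two sample pitchers are hypothetical, and it handles only a per-batter rate such as K%. For such rates, rescaling each pitcher's not-after-Dickey line to his after-Dickey share of batters faced works out to a weighted average of his not-after-Dickey rates.

```python
def pct_change_after_dickey(pitchers, stat):
    """Relative change in a per-batter rate when following Dickey.

    Each pitcher is a dict with 'season' and 'after' count lines
    (hypothetical layout); 'tbf' is total batters faced.
    """
    total_after_tbf = sum(p["after"]["tbf"] for p in pitchers)
    after_rate = sum(p["after"][stat] for p in pitchers) / total_after_tbf

    not_after_rate = 0.0
    for p in pitchers:
        # Share of all after-Dickey batters faced by this pitcher.
        weight = p["after"]["tbf"] / total_after_tbf
        # Season line minus after-Dickey line = not-after-Dickey line.
        rest_tbf = p["season"]["tbf"] - p["after"]["tbf"]
        rest_stat = p["season"][stat] - p["after"][stat]
        not_after_rate += weight * (rest_stat / rest_tbf)

    return (after_rate - not_after_rate) / not_after_rate


# Made-up numbers for two relievers, purely for illustration.
sample = [
    {"season": {"tbf": 300, "k": 60}, "after": {"tbf": 60, "k": 15}},
    {"season": {"tbf": 200, "k": 50}, "after": {"tbf": 40, "k": 8}},
]
change = pct_change_after_dickey(sample, "k")
print(f"K% change after Dickey: {change:+.1%}")
```

Innings-based stats such as ERA or FIP would need the rescaled counts summed before dividing by innings, which this sketch leaves out.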

I then combined the after-Dickey and not-after-Dickey stats for starters and relievers from all four years and applied the same weighting. For example, if Niese’12 faced 10.9% of all batters faced by starters following a Dickey start against the same team, his line was adjusted so that he also accounted for 10.9% of the batters faced by starters not after Dickey (counting only the starters who pitched after Dickey that season). The rest of the calculation followed the year-by-year method, and a total percentage change was calculated for each stat.

The most important stat to look at is FIP, which gives a more accurate measure of the effect. Also take note of the BABIP and ERA, and decide for yourself whether the BABIP is just luck or actually better/worse contact. Normally I would regress the results based on BABIP and HR/FB, but FIP does not include BABIP and I do not have the fly-ball numbers.

The sample sizes are also included; aD means after Dickey and naD means not after Dickey. Here are the results for starters following Dickey against the same team.

Dickey Starters

The results show that starters after Dickey see an improvement across the board. Like I said, it is probably better to use FIP rather than ERA. Over the past four years, starters see an approximate 18.9% decrease in their FIP when they follow Dickey. So assuming 130 IP are pitched after Dickey by a league-average set of pitchers (~4.00 FIP), this would decrease their FIP to around 3.25. I chose 130 IP by assuming that ⅔ of a full season of starter innings (200) come after Dickey against the same team. Over 130 IP this would be a 10.8-run difference, or around 1.1 WAR! This is a strikingly large effect, and it appears to come mainly from a reduction in HR%. If we regress the HR% change down to -10% (which seems more than fair), the FIP reduction falls to around 7%. A 7% reduction would take a 4.00 FIP down to 3.72 and save 4.0 runs, or 0.4 WAR.

Here are the numbers for relievers following Dickey in the same game.

Dickey Bullpen

Relievers see a more consistent improvement across the FIP components (K, BB, HR): 11.4, 8.1, and 4.9, respectively. FIP was reduced 10.3%. Assuming 65 IP after Dickey (a figure in between 2012 and 2013) from an average bullpen (or a slightly above-average one, since Dickey will likely have setup men and closers after him) with a 3.75 FIP, FIP would be reduced to 3.36, saving 3 runs or 0.3 WAR.

Combining the un-regressed results, Dickey would contribute around 1.4 WAR over a full season through the pitchers who pitch after him. If you assume the effect is just a 10% reduction in FIP for both groups, this number comes down to around 0.9 WAR, which is not a crazy estimate at all based on the results. I can say with great confidence that if Dickey pitches over 200 innings again next year, he will contribute above 1.0 WAR just from baffling hitters for the next guys. If we take the un-regressed 1.4 WAR and add it to his 2013 WAR (2.0), we get 3.4 WAR; if we add in his defence (7 DRS), we get 4.1 WAR. Even though we all were disappointed with Dickey’s season, with the effect he provides and his defence, he is still all-star calibre.
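For anyone who wants to check the arithmetic, here is a short Python sketch of the runs-saved calculation. The function names are mine, and the 10 runs per win conversion is an assumption implied by the figures above (10.8 runs is treated as roughly 1.1 WAR). Because it uses unrounded FIP values, the starter figure comes out slightly above the 10.8 runs quoted earlier.

```python
RUNS_PER_WIN = 10.0  # assumption implied by "10.8 runs ... around 1.1 WAR"


def runs_saved(fip, pct_reduction, innings):
    """Runs saved when a FIP of `fip` drops by `pct_reduction` over `innings`."""
    return fip * pct_reduction * innings / 9.0


def war_from_runs(runs):
    """Convert runs saved to wins using the assumed runs-per-win rate."""
    return runs / RUNS_PER_WIN


# Un-regressed effect: 18.9% for starters, 10.3% for relievers.
starter_runs = runs_saved(4.00, 0.189, 130)
reliever_runs = runs_saved(3.75, 0.103, 65)
combined_war = war_from_runs(starter_runs + reliever_runs)

# Conservative case: a flat 10% FIP reduction for both groups.
flat_war = war_from_runs(runs_saved(4.00, 0.10, 130) + runs_saved(3.75, 0.10, 65))

print(f"starters: {starter_runs:.1f} runs, relievers: {reliever_runs:.1f} runs")
print(f"combined: {combined_war:.2f} WAR, flat 10% case: {flat_war:.2f} WAR")
```

The combined figure rounds to the 1.4 WAR used above, and the flat 10% case lands between 0.8 and 0.9 WAR.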

Just for fun, let’s apply this to his 2012. He had 4.5 WAR in 2012; add on the 1.4 and his 6 DRS and we get 6.5 WAR. Wow! Using his RA9 WAR (6.2) instead (commonly used for knucklers instead of fWAR), we get 7.6 WAR! That’s Miguel Cabrera value! We can’t include his DRS when using RA9 WAR, though, as it should already be incorporated.

This effect may even extend further: relievers may (and likely do) get a boost the following day, just as starters do. Assuming it is the same boost, that’s around another 2.5 runs or 0.25 WAR. Maybe the second day after Dickey also sees a boost? (That is a much smaller sample, since Dickey would have to pitch the first game of a series.) We could assume the effect is cut in half on that day, and that would still be another 2 runs (90 IP of starters and relievers). So under these assumptions, Dickey could effectively have a 1.8 WAR after-effect over a full season! This WAR is not easy to place, however, and cannot just be added onto the team’s WAR; it is hidden among all the other pitchers’ WARs (just like catcher framing).

You may be disappointed with Dickey’s 2013, but he is still well worth his money. He is projected for 2.8 WAR next year by Steamer, and adding on the 1.4 WAR Dickey Effect and his defence, his true underlying value could be almost 5 WAR. That is well worth the $12.5M he will earn in 2014.

For more of my articles, head over to Breaking Blue where we give a sabermetric view on the Blue Jays, and MLB. Follow on twitter @BreakingBlueMLB and follow me directly @CCBreakingBlue.