Author Archive

Projecting Strength of Schedule for Pitchers and Hitters

Friday morning, as I began the tedious process of combining all MLB schedules in one spreadsheet, I noticed that FanGraphs’ resident volcano expert and prolific content generator Jeff Sullivan posted one very similar article, and then another shortly thereafter. He focused on projected WAR, while I planned to look specifically at projected average ERA and wOBA a team must contend with over the 2014 season. So at the risk of writing a similar post, one with drier writing and less cool graphics, I submit to you the following simple table and graphs.

We often look at the strength of a division and make generalizations about the hardest place to pitch (AL East) and hit (NL East). As with park effects, we sometimes jump to conclusions about the effects of dream lineups and weak intradivision rivals. Chad Young’s analysis of Prince Fielder’s move to Arlington is a perfect example of how enthusiasm can be misplaced when we forget that 90 of a club’s 162 games take place outside of their division, with 20 games occurring in a different league. The table below shows projected mean wOBA and ERA by team, which are weighted by expected plate appearances and innings pitched, respectively. As expected, AL teams generally have a DH-fueled high wOBA and inflated ERA when compared to their NL counterparts. All projections are courtesy of Steamer’s 2014 pre-season projections. Keep in mind that Steamer regresses stats like wOBA and ERA, so the gap between the Red and White Sox (0.333 vs. 0.317) is not as large as what you might see during the season. However, Steamer has been shown to be one of the best projection systems available when it comes to capturing player-to-player variation in performance (i.e. ranking players by production), which is sufficient for looking at the differences between teams.

2014 Steamer Projections*

Team | wOBA | ERA
BOS | 0.333 | 3.85
TOR | 0.331 | 4.16
BAL | 0.326 | 4.13
NYY | 0.322 | 3.92
TB | 0.318 | 3.63
DET | 0.330 | 3.64
KAN | 0.324 | 3.95
CLE | 0.321 | 3.91
CHW | 0.317 | 4.35
MIN | 0.312 | 4.33
TEX | 0.332 | 4.09
LAA | 0.327 | 4.00
SEA | 0.325 | 3.84
OAK | 0.320 | 3.81
HOU | 0.310 | 4.41
WAS | 0.328 | 3.58
ATL | 0.322 | 3.66
PHI | 0.310 | 3.72
NYM | 0.309 | 3.85
MIA | 0.309 | 4.04
STL | 0.326 | 3.49
PIT | 0.323 | 3.73
MIL | 0.321 | 4.02
CHC | 0.319 | 3.98
CIN | 0.318 | 3.66
COL | 0.347 | 4.22
LAD | 0.329 | 3.44
ARI | 0.329 | 3.78
SF | 0.323 | 3.72
SD | 0.319 | 3.80

*adjusted for PA and IP

I was surprised by the high ERA attributed to the San Diego Padres, poor enough for 6th worst in the NL. The Reds’ Choo-less offense is also, somewhat surprisingly, projected as the 7th worst in the majors. Let’s take a moment to silently reflect that the Minnesota Twins, despite having a spacious ballpark and a non piss-poor payroll, are still projected to give up more earned runs than the Colorado Rockies.

While the table displays projected wOBA and ERA by team, the charts below illustrate the mean wOBA and ERA faced by each team over 162 games.
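The article does not spell out how the "faced" averages were computed, but one plausible reading is a games-weighted mean of each opponent's projected wOBA. The sketch below assumes that reading. The wOBA values are copied from the table above, while the game counts in `games_vs` are made up for illustration and are not a real 2014 schedule.

```python
# Projected team wOBA, copied from the table above (subset only).
team_woba = {"COL": 0.347, "LAD": 0.329, "ARI": 0.329, "SF": 0.323, "MIA": 0.309}


def mean_opponent_woba(games_vs, team_woba):
    """Games-weighted mean of opponents' projected wOBA."""
    total_games = sum(games_vs.values())
    weighted_sum = sum(team_woba[opp] * n for opp, n in games_vs.items())
    return weighted_sum / total_games


# Hypothetical slice of a schedule: opponent -> number of games (illustrative only).
games_vs = {"COL": 19, "LAD": 19, "ARI": 19, "SF": 19, "MIA": 6}

faced = mean_opponent_woba(games_vs, team_woba)
print(f"Mean projected wOBA faced: {faced:.3f}")
```

The same function would work for ERA faced by a lineup if the dictionary held opposing staffs' projected ERA instead.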

 

Projected wOBA

Last September Dave Cameron presented a convincing argument that Chris Sale’s 2013 season was as good as, if not better than, Max Scherzer’s, but was obscured in part because Sale routinely pitched against the Tigers and Scherzer routinely pitched against the White Sox. These projections reinforce the argument in favor of opponent-adjusted measurements: Detroit pitchers are projected to face a wOBA of 0.321, while Chicago pitchers play against teams with a projected wOBA of 0.324.

San Diego and San Francisco are home to some of the most pitching-friendly stadiums in the country. However, in part because they play 28 away games against the Rockies, Diamondbacks, and Dodgers, their opponents’ wOBA is higher than people might expect. However great it is that a flyball pitcher like Ian Kennedy has a home in spacious San Diego, it’s important to note that the Padres are slated to face some tougher-than-average lineups.

Projected ERA

ERA drops off pretty sharply when we get to the NL. Surprisingly, hitters for the Nationals and Dodgers appear to have the easiest schedules in their league, despite being in divisions that are better known for sharp pitching than strong offense. Not having to face the likes of Clayton Kershaw or Stephen Strasburg can do wonders for a lineup.

The heavy-hitting Tigers are slated to face the worst pitching staffs in the majors. While this is somewhat unfair considering they have the league’s best hitter, it is very unfair that the lowly Marlins will face the best pitchers in the league.

Projections are only predictions, and assuredly some teams will drastically outperform and others will underwhelm by season’s end. However, these data remind us that our preconceptions about who plays in an extreme park or which teams are in difficult divisions should not be overemphasized, nor should we discount the idea that some lineups or pitching staffs will have a significantly more difficult time than others. Over the course of the season, a single team will square off against almost 20 other teams in over a dozen different parks. Whatever the strength of their schedule, position players and pitchers face a wide variety of competition, and no doubt a good many will surprise us all.


Does Pitching Deep into Games Lead to More Wins?

Predicting pitcher wins is a capricious exercise, and few factors have been shown to have any correlation whatsoever with win percentage (W%). To predict wins, one should consider a pitcher’s ERA, offensive support, strength of bullpen, quality of defense behind the mound, and innings pitched (IP) in a season.

In fact, research has shown that IP and ERA are the only two factors that have a correlation above .30, and the two are very close. In a sample of pitchers from 2003-2013, the correlation for both eclipsed .40.

Obviously, pitching more games leads to more wins in a season, but many fantasy experts insist that pitching deep into games is an important part of earning a win as well. The theory, which I’ve seen taken for granted by experts at ESPN, CBS, Baseball Prospectus, and Rotographs, is that a starting pitcher who pitches into the 8th or 9th inning and leaves with a lead intact is more likely to be credited with the W.

However, to earn a win a starter must pitch only 5 innings. Since we know that starters are often less effective after 75 pitches or so, pulling a pitcher early and relying on a fresh bullpen that is at least league average should, in theory, be more effective than keeping a starter in the game. Dave Cameron articulated this point when creating a gameplan for the Pirates’ all-important play-in game in October 2013, when he suggested Liriano be pulled after only 3 innings. The chart below reinforces the obvious point that, except for walk rate, relievers generally eclipse starters in most skill metrics.

Figure 1

In 2013 Shelby Miller started 31 games and came away with the W a total of 15 times, earning a W% that ranked 22nd in the majors, right behind Clayton Kershaw and Anibal Sanchez. That’s impressive, but also consider that the innings-limited rookie pitched an average of 5.5 innings per start; he racked up only 13 quality starts (QS), which ranked 86th in the league. A QS, after all, requires putting in at least 6 innings of work while allowing no more than 3 earned runs, which works out to an ERA of 4.50 or better.
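As a quick check on the quality-start arithmetic, here is a minimal sketch using the conventional definition (at least 6 innings, no more than 3 earned runs). The function names are illustrative, and innings are assumed to be true decimals (5.5 means five and a half innings), not box-score notation.

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


def is_quality_start(innings, earned_runs):
    """Quality start: at least 6 innings pitched and at most 3 earned runs."""
    return innings >= 6 and earned_runs <= 3


# The weakest line that still counts as a quality start: 6 IP, 3 ER.
worst_qs_era = era(3, 6)

# A strong but short outing (5.5 IP, 1 ER) can earn a win without being a QS.
short_outing_is_qs = is_quality_start(5.5, 1)

print(worst_qs_era, short_outing_is_qs)
```

This is why a pitcher who averages 5.5 innings per start can pile up wins while lagging badly in quality starts.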

Why, then, are innings pitched per start (IP/GS) so important, relatively, when considering W%? I hypothesized that pitchers who are given the leeway to pitch deep into games, and hence give their bullpen a rest, were generally better at run prevention than their peers, i.e. sported a lower ERA.

In healthcare research, where we don’t write particularly well, we love simple diagrams to explain hypothesized effects. Below is a diagram showing how one might view the relationship between various factors like ERA, IP, defense, offensive support and bullpen ERA. The perceived link between IP/GS and Pitcher Wins is confounded by ERA, which has an effect on both factors.

DAG

Pitch Efficiency

Before examining the theory that ERA accounts for the correlation between IP and W%, let’s look at another possible explanation. Perhaps pitch efficiency is the key. Jordan Zimmermann was the 3rd most efficient starter (14.5 P/IP) in the majors last year, and was tied for the 8th highest W% (.68). However, the table below shows the correlation between W per game started (GS) and P/IP, ERA, and IP/GS among starters from 2009 to 2013:

 

W% and… | R²
ERA | 0.39
IP/GS | 0.36
P/IP | 0.08

While ERA and IP/GS appear to be almost equally correlated, the squared correlation coefficient for P/IP was negligible at .08. Variance in pitch efficiency has little to do with variance in W%.

IP/GS: How to Measure a Confounder

There are 2 straightforward ways to determine if the relationship between 2 variables is actually being skewed by a third factor, in this case ERA. The first is to stratify the sample by ERA and see if the relationship between IP/GS and W% still stands. If ERA is not a confounder, we would expect the correlation to remain relatively stable across tiers. As we can see in the chart below, it follows no clear trend.

Figure 3

Interestingly, only the best tier of pitchers, those with an ERA less than 3.65, show any discernible relationship between W% and IP/GS, supporting the theory that those starters who have demonstrated a strong ability to prevent runs are given the chance to pitch more innings.  Among more middling pitchers, the relationship between pitching deep into games and W% is negligible.
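The stratified check can be sketched as follows: split pitchers into ERA tiers, then compute the correlation between IP/GS and W% within each tier. Everything here is an assumption for illustration. The rows are invented, the 3.65 cutoff is taken from the text, and the second cutoff (4.25) is hypothetical because the article does not list its other tier boundaries.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def stratified_r(rows, cutoffs):
    """Group (era, ip_per_gs, win_pct) rows into ERA tiers; return r per tier."""
    tiers = {}
    for era, ip_per_gs, win_pct in rows:
        tier = sum(era >= c for c in cutoffs)  # 0 = lowest-ERA tier
        tiers.setdefault(tier, []).append((ip_per_gs, win_pct))
    return {t: pearson_r(*zip(*pairs)) for t, pairs in sorted(tiers.items())}


# Invented pitcher seasons: (ERA, IP/GS, W%). Not real data.
rows = [
    (3.00, 6.8, 0.70), (3.20, 6.5, 0.62), (3.50, 6.1, 0.55),
    (3.80, 6.2, 0.50), (4.00, 6.0, 0.52), (4.20, 5.8, 0.49),
    (4.50, 5.9, 0.45), (4.80, 5.6, 0.47), (5.10, 5.4, 0.44),
]

by_tier = stratified_r(rows, cutoffs=(3.65, 4.25))
for tier, r in by_tier.items():
    print(f"ERA tier {tier}: r = {r:.2f}")
```

If ERA were not a confounder, the per-tier correlations would be expected to look similar to one another and to the pooled correlation.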

The second way to measure confounding is using a regression model. If you create a model examining how factor X predicts factor Y, introducing factor Z should not change the coefficient for X by more than 10% if Z does not have a strong pull on the relationship. For example, if we run a model that shows that smoking doubles your chance of getting lung cancer, then introducing tea drinking into the equation should not really change that smoking-lung cancer connection by more than 10%, unless we believe that drinking tea can also affect lung cancer and/or smoking.

I’m with MGL that regression is often unnecessary in baseball research, as its results can be difficult to interpret and unnecessarily complicated. I might add that even simple linear regression rests on a series of assumptions that are not always met. With that caveat, the data in this sample are normally distributed and I kept the model as simple as possible. Model 1 examines the relationship between W% and IP/GS. Model 2 adds a third variable, ERA.

Model | Parameter | Coefficient (%) | P-Value
Model 1 | IP/GS | 11.13 | <.01
Model 2 | IP/GS | 5.71 | <.01
Model 2 | ERA | -4.77 | <.01

All results are statistically significant. Model 1 indicates that for each 1-inning increase in IP/GS, we would expect an 11% increase in W%. Once we control for ERA, the relationship weakens: each 1-inning increase would be expected to yield only a 6% increase in W%. The new coefficient, .057, differs from .111 by more than 10%, so we can safely conclude that ERA is confounding this relationship, just as we found in the stratified analysis above.
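The 10% change-in-estimate rule can be checked directly from the two IP/GS coefficients in the table. The helper below is a small sketch of that arithmetic; the function name is illustrative.

```python
def percent_change_in_estimate(crude, adjusted):
    """Relative change in a coefficient after adding a covariate, in percent."""
    return abs(crude - adjusted) / abs(crude) * 100


crude_coef = 11.13     # Model 1: IP/GS alone
adjusted_coef = 5.71   # Model 2: IP/GS with ERA added

change = percent_change_in_estimate(crude_coef, adjusted_coef)
is_confounded = change > 10  # the rule of thumb described above

print(f"Coefficient changed by {change:.1f}%; confounding suspected: {is_confounded}")
```

The IP/GS coefficient shrinks by roughly 49%, well past the 10% threshold.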

Predicting Wins?

Here at FanGraphs we might mock the idea of pitcher wins, since they are mostly a byproduct of an era when pitchers did pitch deep into games and bullpens were not utilized as often or as effectively. However, when it comes to predicting wins, Will Larson has shown that projection systems like Steamer and CAIRO do a pretty good job, and are on average within 3.5-4 wins of the actual end-of-season results.

In fact, projection systems across the board are better at capturing player-to-player variation (ranking players) in counting statistics like W and strikeouts than in rate stats like ERA and WHIP.

Figure 4

While I have previously shown that QS correlate much better than W with pretty much every measure of pitcher skill we have, W% is still somewhat predictable. As long as we have yet to #killthewin, we might as well keep trying to forecast the future. 


Estimating the Advantage of Switch Hitting on BB/K Splits

It is generally a marked advantage for a batter to face an opposite-handed pitcher. Platoon splits across the league are evidence of this well documented phenomenon, and managers are quick to take advantage of matchups.

One of the chief advantages of switch-hitting is that the opposite-handed pitcher’s release point is closer to the center of the hitter’s field of vision. This allows him to get a better look at the ball, and judge whether the pitch is worth swinging at. If a switch-hitter generally gets a better look at the incoming pitch he should, in theory, be better at commanding the strike zone than his one-sided counterparts, walking more and striking out less. Do switch hitters have a better BB/K split than other hitters?

While we are limited by a small sample of switch-hitters who accrue enough at-bats against lefties for their rates to possibly stabilize (according to work done by Russell Carleton), we can calculate their splits and compare them to the average split for batters who always hit from one side.

If we assume that switch-hitters would ‘naturally’ hit from the side with which they throw, we can roughly estimate what their split might be if they were not switch-hitters by calculating the BB/K split for righties when facing left-handed pitchers (LHP) and right-handed pitchers (RHP).

Right-handed batters (RHB), on average, post a healthy BB/K ratio of .63 against LHP and a more dismal .38 against RHP. The table below shows how splits for switch-hitters who throw right-handed compare to those of righties who do not swing from both sides of the plate.

Right-Handed Players

Player | BB/K vs. LHP | BB/K vs. RHP | Difference
Alberto Callaspo | 1.5 | 1.03 | 0.47
Andres Torres | 0.52 | 0.26 | 0.26
Dexter Fowler | 0.82 | 0.57 | 0.25
Kendrys Morales | 0.55 | 0.37 | 0.18
Jarrod Saltalamacchia | 0.44 | 0.26 | 0.18
Jed Lowrie | 0.63 | 0.52 | 0.11
Shane Victorino | 0.38 | 0.31 | 0.07
Nick Franklin | 0.42 | 0.35 | 0.07
Everth Cabrera | 0.63 | 0.58 | 0.05
Emilio Bonifacio | 0.32 | 0.28 | 0.04
Ryan Doumit | 0.5 | 0.48 | 0.02
Pablo Sandoval | 0.6 | 0.59 | 0.01
Eric Young Jr. | 0.45 | 0.46 | -0.01
Asdrubal Cabrera | 0.3 | 0.31 | -0.01
Chase Headley | 0.45 | 0.48 | -0.03
Carlos Santana | 0.77 | 0.88 | -0.11
Jimmy Rollins | 0.53 | 0.68 | -0.15
Matt Wieters | 0.31 | 0.48 | -0.17
Erick Aybar | 0.21 | 0.48 | -0.27
Ben Zobrist | 0.57 | 0.9 | -0.33
Victor Martinez | 0.59 | 1.09 | -0.5
Coco Crisp | 0.68 | 1.18 | -0.5

Left-Handed Players

Player | BB/K vs. RHP | BB/K vs. LHP | Difference
Daniel Nava | 0.64 | 0.38 | 0.26
Carlos Beltran | 0.48 | 0.27 | 0.21
Justin Smoak | 0.57 | 0.46 | 0.11
Nick Swisher | 0.42 | 1.07 | -0.65
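The Difference column is simply the BB/K ratio in the favorable matchup minus the ratio in the unfavorable one. A minimal sketch, using three rows from the right-handed table and the league-average RHB figures quoted earlier (.63 vs. LHP, .38 vs. RHP); the variable and function names are illustrative:

```python
# League-average split for right-handed batters, from the figures quoted above.
LEAGUE_RHB_SPLIT = round(0.63 - 0.38, 2)


def platoon_split(strong_matchup, weak_matchup):
    """BB/K in the platoon-advantaged matchup minus BB/K in the other one."""
    return round(strong_matchup - weak_matchup, 2)


# (BB/K vs. LHP, BB/K vs. RHP) for a few right-handed-throwing switch-hitters.
switch_hitters = {
    "Alberto Callaspo": (1.50, 1.03),
    "Kendrys Morales": (0.55, 0.37),
    "Coco Crisp": (0.68, 1.18),
}

splits = {name: platoon_split(vs_lhp, vs_rhp)
          for name, (vs_lhp, vs_rhp) in switch_hitters.items()}

# Positive but smaller than the league-average split: the pattern the
# switch-hitting hypothesis predicts.
smaller_than_average = [name for name, s in splits.items()
                        if 0 <= s < LEAGUE_RHB_SPLIT]

print(splits, smaller_than_average)
```

The article gives a league baseline only for right-handed batters, so this comparison should not be applied to the left-handed table without a matching LHB baseline.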

 

Or if you prefer to see the splits visually, and compared to the mean for all non-switch hitters:
Difference vs RHP

Difference vs RHP

 

We can see the results are relatively mixed. If switch-hitters really did display a better ability to draw walks and avoid strikeouts, we would expect to see smaller than league-average (below the red line) splits, in the positive direction. Among righties, hitters from Kendrys Morales to Chase Headley in the chart above do not display as severe a split as the average right-handed batter, and may derive a significant benefit from never seeing a same-handed pitcher. However, a surprising number of hitters display reverse splits, improving their ratio considerably when batting from their own weak side.

The extreme negative splits of Coco Crisp, Victor Martinez, and Nick Swisher are all consistent with their recent career numbers. Indeed, these negative splits are even evident when examining their wOBA splits for the last several years.

Alberto Callaspo’s outlier split belies an impressive ability to avoid strikeouts while taking walks at an accelerated pace. Against lefties he posts an outstanding BB/K of 1.5, and his ratio of 1.03 vs. RHP is still impressive. The dropoff from facing LHP to RHP is steep in absolute terms, but his knowledge of the strike zone is still elite.

Jarrod Saltalamacchia and Justin Smoak both see a slight benefit in BB/K ratio from switch-hitting, featuring splits a bit lower than the league average. Justin Smoak, however, suffers from a serious power outage, posting a .218 ISO when hitting from his left side and a miserable .082 ISO from his right. Salty’s power split is not as egregious, but the .128 drop in ISO is troubling for a player whose contact % is only slightly above those of Dan Uggla and Pedro Alvarez. Andres Torres, a natural right-hander, sees a similar decline in his wOBA splits: .318 against LHP but a paltry .249 against RHP. These players enjoy a nonexistent or marginal advantage in BB/K ratio as switch-hitters, and hitting primarily from their strong side might be an experiment worth performing.

The Shane Victorino Experiment

Shane Victorino’s ratio of walks to strikeouts drops by .07 when facing RHP as opposed to LHP. After tweaking his hamstring in the second half of 2013, he decided to at least temporarily abandon switch-hitting for the remainder of the season. Since mid-August he has had almost 50 plate appearances as a RHB vs. RHP, offering a real-life counterfactual case. How does not switch-hitting affect a productive hitter’s BB/K ratio?

From September and into the postseason, Victorino has managed to walk just twice and strike out over 20 times, giving him a minuscule BB/K ratio of just .09, much smaller than his .33 season average. Still, with a wOBA of .356 right in line with his season-long average, his overall production at the plate has not suffered despite the more aggressive and less patient approach.

Victorino’s small sample size of hitting exclusively right-handed fails to reliably estimate the counterfactual scenario. However, his case is interesting because, while switch-hitters like J.T. Snow did abandon their dual approach, most did so because of a decline in production from their weak side. Players who eventually decided the advantages of switch-hitting did not offset the challenges of being ambidextrous were already in decline mode; Victorino, on the other hand, is coming off a great season. While he has officially achieved veteran status, the 32-year-old proved this season that reports of his bat’s death have been greatly exaggerated. If he and his coaches are encouraged by his recent wOBA spike, and he abandons hitting from the left side entirely, his BB/K may continue to steadily decline even if his power improves.

Conclusions

The results seen here do not strongly support the hypothesis that switch-hitters have an inherent advantage over others when considering the ratio of bases on balls to strikeouts. While there is some evidence that switch-hitters do enjoy better splits, it is not overwhelming and may provide only marginal benefit to players like Andres Torres, Dexter Fowler and Justin Smoak. Overall, lefties like Carlos Beltran and Daniel Nava joined Alberto Callaspo as possible examples of the reverse, a larger than average split when going from the strong side to weak side.

There are obvious limitations to this study, starting with a small sample size. We only examined 2013 splits, and the number of left-handed hitters who switch-hit is very low. It may be possible moving forward to use career splits for lefties going back decades to determine if left-handed switch-hitters generally have worse BB/K splits than their counterparts.

Currently, switch-hitters account for slightly less than 15% of major league hitters.  To say that having the platoon advantage is always an advantage for the hitter may not be accurate– players whose weak side bat is significantly less powerful, like Justin Smoak or Jarrod Saltalamacchia, may inadvertently harm their value as a hitter by sticking to switch-hitting in all cases. Baseball is a game of adjustments and gaining incremental advantages, and switch-hitting is no different. Some players use it to gain an upper hand, and others may be wasting their potential.


Is Using Wins + Quality Starts the Answer?

RotoGraphs’ venerable duo Mike Podhorzer and David Wiers recently contemplated aloud a new statistic, formulated by Ron Shandler, that replaces Wins (W) and Quality Starts (QS) by simply adding the two (W+QS). Shandler decided to use this approach in monthly fantasy leagues, and it’s useful to look at how this combination could best be used to solve an implacable problem: the overall crappiness of using wins to evaluate a pitcher’s ability.

W+QS is interesting because it weights QS more than W, since a pitcher usually has considerably more QS than W. With a mean of 19 QS and only 12 W, a starting pitcher is more likely to throw at least six innings with 3 earned runs or fewer than he is to get the W. Wins are capricious and depend greatly on the pitcher’s offensive support. As a way to measure a pitcher’s ability, one might argue that wins are a waste of time. In fantasy baseball, a pitcher is most often valued by his ERA, WHIP, Ks, W, and Saves. Some more progressive leagues use QS in place of the W.

As evidenced by the table below, ranking a pitcher by W+QS instead of wins alone certainly helps many a fine pitcher, especially James Shields, who leads the league in QS but is ranked only 38th in wins, while penalizing others like Shelby Miller, who has more wins (14) than quality starts (12). Cole Hamels and Travis Wood see the greatest percent increase jumping from wins to W+QS, while Jeremy Hellickson’s and Shelby Miller’s totals changed the least.

Conversely, Shelby Miller and Jeremy Hellickson saw the greatest increase from quality starts to W+QS, again showing that Mr. Miller, while pitching well in his first full season, got the W more often than he made a quality start. A quick glance at his game log shows the innings-limited young pitcher often earned the win when pitching fewer than the 6 innings needed to record a quality start.

  Comparing Wins, Quality Starts, and Wins + Quality Starts

Name | W+QS Rank | W Rank | Change in Rank | W | QS | W+QS | % Change from W to W+QS | % Change from QS to W+QS
Max Scherzer | 1 | 1 | 0 | 20 | 24 | 44 | 120 | 83
Adam Wainwright | 2 | 3 | 1 | 18 | 26 | 44 | 144 | 69
Clayton Kershaw | 3 | 8 | 5 | 15 | 26 | 41 | 173 | 58
Jordan Zimmermann | 4 | 2 | -2 | 19 | 21 | 40 | 111 | 90
C.J. Wilson | 5 | 5 | 0 | 17 | 23 | 40 | 135 | 74
Bartolo Colon | 6 | 4 | -2 | 17 | 22 | 39 | 129 | 77
James Shields | 7 | 38 | 31 | 12 | 26 | 38 | 217 | 46
Cliff Lee | 8 | 12 | 4 | 14 | 23 | 37 | 164 | 61
Patrick Corbin | 9 | 17 | 8 | 14 | 23 | 37 | 164 | 61
Chris Tillman | 10 | 7 | -3 | 16 | 20 | 36 | 125 | 80
Bronson Arroyo | 11 | 20 | 9 | 14 | 22 | 36 | 157 | 64
Jon Lester | 12 | 10 | -2 | 15 | 20 | 35 | 133 | 75
Kris Medlen | 13 | 16 | 3 | 14 | 21 | 35 | 150 | 67
Doug Fister | 14 | 21 | 7 | 14 | 21 | 35 | 150 | 67
Hisashi Iwakuma | 15 | 26 | 11 | 13 | 22 | 35 | 169 | 59
Madison Bumgarner | 16 | 27 | 11 | 13 | 22 | 35 | 169 | 59
Mike Minor | 17 | 31 | 14 | 13 | 22 | 35 | 169 | 59
Jarrod Parker | 18 | 42 | 24 | 12 | 23 | 35 | 192 | 52
Anibal Sanchez | 19 | 11 | -8 | 14 | 20 | 34 | 143 | 70
Mat Latos | 20 | 15 | -5 | 14 | 20 | 34 | 143 | 70
Yu Darvish | 21 | 28 | 7 | 13 | 21 | 34 | 162 | 62
Hyun-Jin Ryu | 22 | 29 | 7 | 13 | 21 | 34 | 162 | 62
Justin Verlander | 23 | 33 | 10 | 13 | 21 | 34 | 162 | 62
Chris Sale | 24 | 45 | 21 | 11 | 23 | 34 | 209 | 48
Jorge De La Rosa | 25 | 6 | -19 | 16 | 17 | 33 | 106 | 94
Jhoulys Chacin | 26 | 14 | -12 | 14 | 19 | 33 | 136 | 74
Felix Hernandez | 27 | 37 | 10 | 12 | 21 | 33 | 175 | 57
Travis Wood | 28 | 66 | 38 | 9 | 24 | 33 | 267 | 38
Zack Greinke | 29 | 9 | -20 | 15 | 17 | 32 | 113 | 88
Justin Masterson | 30 | 19 | -11 | 14 | 18 | 32 | 129 | 78
Lance Lynn | 31 | 24 | -7 | 14 | 18 | 32 | 129 | 78
Jose Fernandez | 32 | 36 | 4 | 12 | 20 | 32 | 167 | 60
Derek Holland | 33 | 54 | 21 | 10 | 22 | 32 | 220 | 45
Ervin Santana | 34 | 67 | 33 | 9 | 23 | 32 | 256 | 39
Cole Hamels | 35 | 74 | 39 | 8 | 24 | 32 | 300 | 33
Jeremy Guthrie | 36 | 23 | -13 | 14 | 17 | 31 | 121 | 82
Julio Teheran | 37 | 30 | -7 | 13 | 18 | 31 | 138 | 72
R.A. Dickey | 38 | 34 | -4 | 13 | 18 | 31 | 138 | 72
Rick Porcello | 39 | 35 | -4 | 13 | 18 | 31 | 138 | 72
Gio Gonzalez | 40 | 47 | 7 | 11 | 20 | 31 | 182 | 55
Homer Bailey | 41 | 48 | 7 | 11 | 20 | 31 | 182 | 55
Mike Leake | 42 | 18 | -24 | 14 | 16 | 30 | 114 | 88
CC Sabathia | 43 | 25 | -18 | 14 | 16 | 30 | 114 | 88
Ricky Nolasco | 44 | 32 | -12 | 13 | 17 | 30 | 131 | 76
Mark Buehrle | 45 | 43 | -2 | 12 | 18 | 30 | 150 | 67
Hiroki Kuroda | 46 | 46 | 0 | 11 | 19 | 30 | 173 | 58
Wade Miley | 47 | 58 | 11 | 10 | 20 | 30 | 200 | 50
A.J. Griffin | 48 | 22 | -26 | 14 | 15 | 29 | 107 | 93
Scott Feldman | 49 | 40 | -9 | 12 | 17 | 29 | 142 | 71
Andrew Cashner | 50 | 53 | 3 | 10 | 19 | 29 | 190 | 53
Kyle Lohse | 51 | 55 | 4 | 10 | 19 | 29 | 190 | 53
John Lackey | 52 | 57 | 5 | 10 | 19 | 29 | 190 | 53
Eric Stults | 53 | 60 | 7 | 10 | 19 | 29 | 190 | 53
Matt Harvey | 54 | 65 | 11 | 9 | 20 | 29 | 222 | 45
Dillon Gee | 55 | 41 | -14 | 12 | 16 | 28 | 133 | 75
Wily Peralta | 56 | 51 | -5 | 11 | 17 | 28 | 155 | 65
Andy Pettitte | 57 | 59 | 2 | 10 | 18 | 28 | 180 | 56
Miguel Gonzalez | 58 | 61 | 3 | 10 | 18 | 28 | 180 | 56
Felix Doubront | 59 | 49 | -10 | 11 | 16 | 27 | 145 | 69
Yovani Gallardo | 60 | 50 | -10 | 11 | 16 | 27 | 145 | 69
Kyle Kendrick | 61 | 64 | 3 | 10 | 17 | 27 | 170 | 59
Matt Cain | 62 | 75 | 13 | 8 | 19 | 27 | 238 | 42
Shelby Miller | 63 | 13 | -50 | 14 | 12 | 26 | 86 | 117
Ubaldo Jimenez | 64 | 39 | -25 | 12 | 14 | 26 | 117 | 86
Bud Norris | 65 | 62 | -3 | 10 | 16 | 26 | 160 | 63
A.J. Burnett | 66 | 68 | 2 | 9 | 17 | 26 | 189 | 53
Jose Quintana | 67 | 69 | 2 | 9 | 17 | 26 | 189 | 53
Jeff Samardzija | 68 | 76 | 8 | 8 | 18 | 26 | 225 | 44
Kevin Correia | 69 | 70 | 1 | 9 | 16 | 25 | 178 | 56
Joe Saunders | 70 | 52 | -18 | 11 | 13 | 24 | 118 | 85
Tim Lincecum | 71 | 63 | -8 | 10 | 14 | 24 | 140 | 71
David Price | 72 | 73 | 1 | 8 | 16 | 24 | 200 | 50
Stephen Strasburg | 73 | 79 | 6 | 7 | 17 | 24 | 243 | 41
Jeremy Hellickson | 74 | 44 | -30 | 12 | 11 | 23 | 92 | 109
Jeff Locke | 75 | 56 | -19 | 10 | 13 | 23 | 130 | 77
Dan Haren | 76 | 72 | -4 | 9 | 14 | 23 | 156 | 64
Ryan Dempster | 77 | 77 | 0 | 8 | 14 | 22 | 175 | 57
Edwin Jackson | 78 | 78 | 0 | 8 | 14 | 22 | 175 | 57
Jerome Williams | 79 | 71 | -8 | 9 | 11 | 20 | 122 | 82
Ian Kennedy | 80 | 80 | 0 | 6 | 13 | 19 | 217 | 46
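The two percent-change columns appear to be the relative increase from each component to the combined total, rounded to the nearest whole percent. The sketch below reproduces them for three pitchers from the table; the function name is illustrative, and the formula is inferred from the table values, not stated in the article.

```python
def wqs_summary(wins, quality_starts):
    """Return (W+QS, % change from W to W+QS, % change from QS to W+QS)."""
    total = wins + quality_starts
    pct_from_w = round((total - wins) / wins * 100)
    pct_from_qs = round((total - quality_starts) / quality_starts * 100)
    return total, pct_from_w, pct_from_qs


# (W, QS) pairs taken from the table above.
pitchers = {
    "Max Scherzer": (20, 24),
    "Shelby Miller": (14, 12),
    "Cole Hamels": (8, 24),
}

for name, (w, qs) in pitchers.items():
    total, from_w, from_qs = wqs_summary(w, qs)
    print(f"{name}: W+QS={total}, +{from_w}% from W, +{from_qs}% from QS")
```

A value above 100 in the last column means the pitcher has more wins than quality starts, which is the unusual Shelby Miller case.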

 

In fantasy, the 5 categories are meant to evaluate the overall value of a pitcher, and players that are best able to predict future value can win serious jelly beans. A pitcher accumulates Ks by defeating individual batters, while a low WHIP indicates that he can avoid putting opposing players on base. ERA evaluates a pitcher’s run prevention skill. Saves and wins are meant to measure a pitcher’s ability to dominate opposing teams, whether for an inning or an entire game. However, wins compare poorly with quality starts and W+QS when correlated with commonly used pitching statistics.

The chart below shows the correlation between wins, quality starts, and the combination of the two with other commonly used pitcher evaluation metrics. By calculating the correlation between these 3 categories and other pitcher metrics such as FIP, OPS allowed, batting average against, home runs allowed per 9 innings, and runs above average by the 24 base/out states (RE24), we can measure not only the relationship between the variables, but also how much they differ from each other.
Chart

None of these statistics correlate as well with wins as they do with quality starts and W+QS. In fact, the difference between QS and W+QS is negligible in every case. This result makes sense: since QS make up the majority of the W+QS total, the two are almost identical in the chart. The actual values of each correlation are less important than the overwhelming conclusion that wins do not have much to do with pitcher skill.

Why, then, might it be useful to use W+QS? These results show that it may not be very different from using quality starts, but it is a far more reliable way to judge a pitcher’s performance than wins alone. W+QS double counts the games when a pitcher goes somewhat deep into a game, pitches fairly well (3 ER or fewer), and exits the game while leading his opponent. This scenario might not be much different than the QS by itself, but it does retain an element of “winning the ballgame for your team”, which is what the win category somewhat accurately captures. A winning pitcher is generally on a winning team, although that statement may not mean much.

W+QS may be an unnecessarily complicated way to repeat the same evaluation standards as quality starts, but some players may prefer it simply because it retains the W while relegating it to a position of less importance. Maybe owning a great pitcher like James Shields doesn’t have to be so frustrating after all.


Plate Discipline Correlations, 2008-2013

In fall 2008 FanGraphs was kind enough to release new plate-discipline metrics, including first-pitch strike percentage (F-Strike %), outside-the-zone swing rate (O-Swing %), and inside-the-zone swing rate (Z-Swing %).  At the time, Eric Seidman was even kinder when he investigated the correlation of these plate-discipline statistics with standard pitcher metrics like WHIP, FIP, BB/9, and K/9. Very thoughtful indeed.

Now we have another 4.5 years of plate discipline data, compiled by PITCHf/x rather than Baseball Info Solutions. It may be worthwhile to see how these numbers compare with Seidman’s, as well as to add a measure of uncertainty to the correlations. It is possible for two factors to have a strong relationship, but because of small sample sizes or other forms of variability, the correlation value may not be as precise a measure as a high R-value may suggest.

Bootstrapping

Correlation coefficients, which fall between -1 and 1, allow us to measure the strength of linear dependence between two variables, such as O-Swing % and K %. We can use bootstrapping techniques to obtain 95% confidence intervals for these correlation coefficients. Calculating confidence intervals for correlations adds a measure of uncertainty to the process—narrow intervals indicate we can have greater confidence that the R-value we obtain represents the true correlation between the two metrics.

Bootstrapping is a statistical technique in which we resample our current sample with replacement, in this case 500 times. This repeated process allows us to assign measures of accuracy to sample estimates, such as medians, means, or correlation coefficients. For our purposes here, it is only important to note that we can be 95% confident that the true R-value lies within the interval. If the interval includes 0, meaning absolutely no correlation, we can conclude that there is not enough evidence to indicate any relationship between the two variables.
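For readers who want to see the mechanics, here is a minimal Python sketch of a percentile bootstrap for a correlation coefficient. It is not the R code used for this post. The percentile method is an assumption (the post does not say which bootstrap interval was used), and the data are synthetic, generated only to mimic a negative relationship like the one between F-Strike % and BB%.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(xs, ys, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation of xs and ys."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        # Resample (x, y) pairs with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # skip degenerate resamples with zero variance
        estimates.append(pearson_r(bx, by))
    estimates.sort()
    low = estimates[int((alpha / 2) * len(estimates))]
    high = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return low, high


# Synthetic data only: 60 made-up pitchers with a noisy negative relationship.
data_rng = random.Random(1)
xs = [data_rng.uniform(0.50, 0.70) for _ in range(60)]          # stand-in for F-Strike %
ys = [0.20 - 0.20 * x + data_rng.gauss(0, 0.01) for x in xs]    # stand-in for BB%

ci_low, ci_high = bootstrap_ci(xs, ys)
print(f"r = {pearson_r(xs, ys):.2f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

An interval that stays on one side of 0, as it should here, is the situation described above where there is enough evidence of a relationship.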

First Strike %

These correspond well enough to the values obtained by Seidman, with one exception worth noting. While he used K/9 and BB/9 to correlate with F-Strike %, here we examine the correlation with strikeout and base-on-balls percentages. Our correlation coefficient for K% is similar in magnitude, at .24 versus .19, but its wide confidence interval approaches the null value and suggests the estimate is not very precise. This is worth noting, especially considering that BB% appears to have such a strong correlation with F-Strike %, at -.72, with a relatively narrow confidence interval. Seidman observed a similar pattern: getting into an 0-1 count is more closely tied to avoiding walks than to striking batters out.

First Strike %

Metric | R-Value | (95% CI)
K% | 0.24 | (.024, .455)
BB% | -0.72 | (-.848, -.604)
WHIP | -0.52 | (-.649, -.376)
FIP | -0.41 | (-.576, -.237)

 

O-Swing %

O-Swing % is the percentage of pitches thrown outside the zone that still generated a swing. Think anyone facing Pablo Sandoval. Here we again see relatively moderate correlations with relatively tight confidence intervals ranging from 0.30 to 0.19. Pitchers who induce swings at pitches outside the zone may be especially tricky for hitters to do damage against. So far this season Adam Wainwright and Matt Harvey are both in the top three in O-Swing %, and top two in both WHIP and FIP.

O-Swing %

Metric | R-Value | (95% CI)
K% | 0.39 | (.274, .548)
BB% | -0.44 | (-.637, -.254)
WHIP | -0.50 | (-.677, -.317)
FIP | -0.45 | (-.650, -.283)

Z-Swing %

We can see from the results below that Z-Swing %, the rate of inducing swings at pitches in the zone, bears little relationship with any of these metrics. Seidman’s analysis showed that the correlations were negligible at best. The confidence intervals for all of these metrics include 0, meaning that we cannot be 95% confident that there is any relationship present. A quick glance at the leaderboards shows that Ian Kennedy and Miguel Gonzalez are near the top of the list this season, and these guys aren’t exactly shoving.

Z-Swing %

Metric   R-Value   (95% CI)
K%       -0.17     (-.370, .035)
BB%      -0.17     (-.381, .048)
WHIP     -0.09     (-.276, .111)
FIP       0.10     (-.09, .286)
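
The "interval includes 0" reading can be checked mechanically. This short sketch copies the Z-Swing % bounds from the table above and tests each one; the function name is purely illustrative.

```python
# (lower, upper) 95% bounds copied from the Z-Swing % table above.
z_swing_ci = {
    "K%": (-0.370, 0.035),
    "BB%": (-0.381, 0.048),
    "WHIP": (-0.276, 0.111),
    "FIP": (-0.09, 0.286),
}


def includes_zero(interval):
    """True when 0 lies inside the interval, i.e. no clear relationship."""
    low, high = interval
    return low <= 0.0 <= high


for metric, interval in z_swing_ci.items():
    verdict = "includes 0" if includes_zero(interval) else "excludes 0"
    print(f"{metric}: {verdict}")
```

By the same test, the First Strike % interval for K%, (.024, .455), excludes 0, though only barely.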

All data courtesy of FanGraphs.

 Because I’m a believer in open data, you can find my R code here.


Visualizing Pitcher Consistency

When evaluating starting pitcher performance, fantasy owners and fans alike lament the relative inconsistency of certain pitchers deemed especially volatile (Francisco Liriano will break your heart), while others like Mark Buehrle are workhorses often viewed as among the most steady arms available. A.J. Mass of ESPN has written about the value of calculating “Mulligan ERAs,” in which a pitcher’s three worst outings are excluded from his overall ERA. His colleague Tristan Cockcroft routinely publishes Consistency Ratings to let readers know which pitchers have remained relatively high on ESPN’s player rater from week to week.

While these methods focus on pitcher performance from start to start, it may be useful to evaluate pitcher performance against individual batters. If Tommy Milone gets rocked pitching on the road in Texas, we may be less concerned than if he is routinely unable to get out low-quality hitters. To this end, we can examine how pitchers perform against different levels of batters. How well does a given pitcher avoid putting low OBP batters on base? How does this compare to his rate of putting a high OBP batter on base? We would expect to see a linear relationship: the Emilio Bonifacios of the world should be easier to get out than the Joey Vottos.

Methods

We begin by examining the 31 pitchers with the most innings pitched for the 2012-2013 seasons. After obtaining batter vs. pitcher data for each of these pitchers during the last season and a half, we can calculate the OBP allowed by each pitcher to any batter with at least 5 plate appearances during this time period (arbitrary cutoff alert!). We can now see how Buster Posey fares against the likes of Clayton Kershaw, Ian Kennedy, and any other NL pitcher against whom he has accrued at least 5 PA. It turns out Posey did pretty well for himself.
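
The filtering step can be sketched in a few lines. The post's own analysis was done in R; this Python version is only an illustration, the `Matchup` record and its field names are hypothetical, and the two rows are invented. The OBP formula is the standard one: (H + BB + HBP) / (AB + BB + HBP + SF).

```python
from dataclasses import dataclass

MIN_PA = 5  # the post's admittedly arbitrary cutoff


@dataclass
class Matchup:
    """One batter-vs-pitcher line (hypothetical layout)."""
    batter: str
    pitcher: str
    pa: int   # plate appearances
    ab: int   # at-bats
    h: int    # hits
    bb: int   # walks
    hbp: int  # hit by pitch
    sf: int   # sacrifice flies


def obp(m):
    """Standard on-base percentage for a single matchup line."""
    denom = m.ab + m.bb + m.hbp + m.sf
    return (m.h + m.bb + m.hbp) / denom if denom else 0.0


def qualified_obp(matchups, min_pa=MIN_PA):
    """Map (pitcher, batter) -> OBP allowed, keeping pairs with enough PA."""
    return {(m.pitcher, m.batter): obp(m) for m in matchups if m.pa >= min_pa}


# Invented rows: Batter B falls below the 5 PA cutoff and is dropped.
rows = [
    Matchup("Batter A", "Pitcher X", pa=12, ab=10, h=4, bb=2, hbp=0, sf=0),
    Matchup("Batter B", "Pitcher X", pa=3, ab=3, h=1, bb=0, hbp=0, sf=0),
]
allowed = qualified_obp(rows)
print(allowed)
```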

In order to obtain the OBP of batters in general, not in relation to particular pitchers, we can examine the leaderboards for players with at least 450 PA in 2012-2013. Based on the work of Russell Carleton, we have confidence that after ~450 PA, a batter’s OBP tends to stabilize and represents their long-term OBP skill level.

Batters were then placed in five buckets: lowest, low, medium, high, and highest OBP.

Batter On-Base Percentage Classification

OBP Category   OBP         Player Examples
Lowest         .243-.311   Colby Rasmus, J.J. Hardy, Raul Ibanez
Low            .311-.330   Ruben Tejada, Eric Hosmer, Michael Young
Medium         .330-.338   Elvis Andrus, Jason Heyward, Yoenis Cespedes
High           .338-.349   Brandon Belt, Jason Kipnis, Coco Crisp
Highest        .349-.458   Allen Craig, Andrew McCutchen, Mike Trout
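
Five equal-sized buckets like these are quintiles. A minimal sketch of the bucketing follows, assuming cut points come from Python's `statistics.quantiles` with its default method; the ten batters and their OBP values are made up, so the cut points will not match the table above.

```python
from bisect import bisect_right
from statistics import quantiles

LABELS = ["Lowest", "Low", "Medium", "High", "Highest"]


def quintile_labels(obp_by_batter):
    """Assign each batter one of five labels using quintile cut points."""
    cuts = quantiles(list(obp_by_batter.values()), n=5)  # four cut points
    return {
        name: LABELS[bisect_right(cuts, value)]
        for name, value in obp_by_batter.items()
    }


# Hypothetical qualified batters and their overall OBP.
overall_obp = {
    "Batter 1": 0.280, "Batter 2": 0.300, "Batter 3": 0.315,
    "Batter 4": 0.325, "Batter 5": 0.332, "Batter 6": 0.336,
    "Batter 7": 0.340, "Batter 8": 0.347, "Batter 9": 0.360,
    "Batter 10": 0.400,
}
labels = quintile_labels(overall_obp)
for name, label in labels.items():
    print(name, label)
```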

Each batter, assigned a score of lowest to highest, was then matched with the batter vs. pitcher dataset, allowing us to calculate the mean OBP allowed by individual pitchers to hitters in each of the categories. So, although someone like Zack Cozart sports a .283 OBP in 2012-2013, earning a spot in the lowest category, he does own a .329 OBP against Yovani Gallardo. Maybe this is all the evidence Reds manager Dusty Baker needs to keep batting Cozart second in the lineup.
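
The matching step amounts to a join followed by a grouped mean. Here is a small sketch under stated assumptions: both input dictionaries are hypothetical stand-ins for the real datasets, and the mean is an unweighted average of per-batter OBP, which is one reading of "mean OBP allowed."

```python
from collections import defaultdict
from statistics import mean


def mean_obp_by_category(allowed, category_of):
    """Average OBP allowed per pitcher within each batter category.

    allowed:     {(pitcher, batter): OBP allowed in that matchup}
    category_of: {batter: "Lowest" | "Low" | "Medium" | "High" | "Highest"}
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for (pitcher, batter), value in allowed.items():
        label = category_of.get(batter)
        if label is None:  # batter was never classified, so skip him
            continue
        grouped[pitcher][label].append(value)
    return {
        pitcher: {label: mean(values) for label, values in cats.items()}
        for pitcher, cats in grouped.items()
    }


# Invented matchup results and classifications.
allowed_obp = {
    ("Pitcher X", "Batter A"): 0.250,
    ("Pitcher X", "Batter B"): 0.350,
    ("Pitcher X", "Batter C"): 0.400,
    ("Pitcher X", "Batter D"): 0.500,  # unclassified, ignored below
}
categories = {"Batter A": "Lowest", "Batter B": "Lowest", "Batter C": "Highest"}

result = mean_obp_by_category(allowed_obp, categories)
print(result)
```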

Results

If we examine the performance of pitchers across five categories of OBP skill, we can calculate the squared correlation coefficient (R²) of these five points. R² in this case is a measure of how well the data fit a straight line: if a pitcher allows a low OBP to low OBP hitters, and a correspondingly higher OBP to high OBP hitters, the data points should increase linearly and the value of R² should approach 1. Conversely, pitchers who are inconsistent in their ability to get hitters of a certain skill level out would have an R² much closer to 0.
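
As a rough illustration of that calculation, the sketch below fits a least-squares line through five points and reports R². Coding the tiers as 1 through 5 is an assumption (the post does not say what x-values were used), and both sets of OBP values are invented.

```python
def r_squared(xs, ys):
    """R-squared of a simple least-squares line fit through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:  # perfectly flat line: no variation to explain
        return 0.0
    return 1.0 - ss_res / ss_tot


tiers = [1, 2, 3, 4, 5]  # Lowest .. Highest, coded as ranks (assumption)

# Hypothetical mean OBP allowed per tier for two made-up pitchers.
steady = [0.280, 0.300, 0.320, 0.340, 0.360]   # rises evenly with tier
erratic = [0.330, 0.300, 0.345, 0.295, 0.335]  # no clear pattern

print(f"steady:  {r_squared(tiers, steady):.3f}")
print(f"erratic: {r_squared(tiers, erratic):.3f}")
```

One caveat with this measure: R² ignores the sign of the slope, so a pitcher whose OBP allowed falls steadily as hitters get better would also score near 1.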

Correlation Coefficient for OBP Allowed Among Differently Skilled Batters

Name                 R²
Adam Wainwright      0.798
Jason Vargas         0.793
Max Scherzer         0.771
Ricky Nolasco        0.740
Matt Cain            0.734
Yu Darvish           0.717
Wade Miley           0.705
C.J. Wilson          0.700
Jordan Zimmermann    0.697
Kyle Lohse           0.660
Bronson Arroyo       0.657
Yovani Gallardo      0.638
Justin Verlander     0.619
Mat Latos            0.617
Cliff Lee            0.553
Hiroki Kuroda        0.536
James Shields        0.469
Justin Masterson     0.443
Homer Bailey         0.377
Ian Kennedy          0.353
Clayton Kershaw      0.329
Cole Hamels          0.159
Gio Gonzalez         0.140
Mark Buehrle         0.105
Trevor Cahill        0.083
Felix Hernandez      0.076
Chris Sale           0.031
R.A. Dickey          0.029
CC Sabathia          0.028
Jon Lester           0.028
Madison Bumgarner    0.025

There is a wide range of R² values among this list of starting pitchers. Adam Wainwright takes the grand prize for consistency. He is far more prone to putting elite OBP hitters on base than lowly hitters. Madison Bumgarner, on the other hand, strangely performs worse against low OBP than high OBP hitters, and has the lowest R². And R.A. Dickey, as you might expect, is sort of all over the place.

Below is a visual representation of the OBP against pitchers with high and low R² values. We can see that the pitchers with the highest correlation coefficient have a much more linear relationship overall with OBP allowed than pitchers with low values.

Additional analyses showed that there was no relationship between a starter’s FIP and their correlation coefficient. A quick glance at the names in the two graphs above confirms this. Jason Vargas, with an R² of .793, is a worse pitcher, in pretty much all respects, than Felix Hernandez at .076. Interestingly, Jason Vargas has one of the league’s highest HR/9 at 1.28 during 2012-2013, while King Felix sports one of the lowest ratios at .62.

What, then, does pitcher consistency tell us? While it may not tell us much about the overall skill of a pitcher by itself, we can discern from the data which pitchers are doing a good job getting out poor hitters. Pitchers like Adam Wainwright and Max Scherzer are doing extremely well, and their R² values indicate that they are pitching steadily: they are less likely to blow up against poor hitters. Of course, pitcher performance can differ greatly from start to start, but one can have confidence that Ricky Nolasco will probably dominate his former Marlins teammates (30th in team OBP), because he consistently allows a low OBP to low OBP hitters. Conversely, perhaps it’s a good thing Jason Vargas does not have to pitch against his Angels teammates, who collectively have the 4th highest team OBP in the majors.

Oddly enough, Justin Masterson’s OBP allowed has a small range, from .299 in the middle OBP tier to .371 against the highest tier, indicating that when he’s brought his good stuff, he mostly dominates all batters regardless of their level of skill. We can have less confidence that Masterson will dominate a middling OBP team like Kansas City (against whom he has a 6.39 ERA this year), ranked 20th overall in the majors, while he has repeatedly humiliated the Blue Jays, who just beat out the Royals at 17th overall.

Despite the comically bad timing of his recent piece on batting Raul Ibanez against CC Sabathia, David Cameron was right to point out the relative worthlessness of individual batter vs. pitcher matchups and the danger of drawing conclusions from such small sample sizes. However, we can use aggregated batter vs. pitcher data to learn more about what kinds of players pitchers are more likely to strike out, or serve up the long ball or a base on balls to. While it’s easy to assume that pitcher X will be less likely to strike out Norichika Aoki than Ike Davis, by studying consistency we may be able to see who deviates from this linear pattern. Are some average strikeout pitchers more likely to strike out low-strikeout hitters? We can already see from the data above that R.A. Dickey is as likely to put a low OBP hitter on base as a high OBP hitter. While this fact seems to make little sense, these results indicate that the knuckleball can baffle expert hitters as much as less skilled batsmen. It may be worthwhile to use consistency ratings such as these to determine what kinds of pitchers deviate from the expected patterns.

All data courtesy of FanGraphs and Baseball Reference.

Because I’m a big believer in open data, here is a link to the R code used to find Batter vs. Pitcher OBP percentages by quintile.