Does it matter which side of the pitching rubber a pitcher starts from throwing a sinker?

As we start a new baseball season, I start a new season of my own. This is my first – of many I hope – analysis and write-up on baseball that I am submitting. I am an avid fan, a numbers geek, an aspiring writer and lastly a bored software engineer. I am also very fortunate. I have a close connection with a former major league player and the ability to leverage his vast experience and knowledge of the game. Hopefully, I can parlay the knowledge I have learned from many years of observation along with the knowledge I have gleaned from my connection to realize my goal as a contributor to the sabermetric community and to the enjoyment of baseball fans everywhere. Here we go!

Question

Does the effectiveness of a sinker depend on which side of the rubber the pitcher throws from?

I was in Florida in mid March for spring training, talking with a minor league coach when he mentioned that he and a former all star pitcher were in a disagreement about how to throw a sinker. Their debate centers on where a pitcher should stand on the rubber to throw a sinker most effectively. We all understand that a pitcher should not move all over the rubber to become more effective on a single pitch. This would obviously tip off the hitters as to what type of pitch might be coming. But for argument’s sake, a team might have some newly transformed position players learning to throw different pitches. Wouldn’t a team want to know if, for some pitches, it was more beneficial to stand on one side of the rubber than another?

I consider myself a pretty observant guy, but I have to admit that I never really paid much attention to where a pitcher stood on the rubber. To me the juicy part is watching the ball just after it is released. The dance, dip, duck and dive a pitcher is able to command of the ball is where the action is as far as I am concerned. So watching what a pitcher does before he even starts his motion was asking a little much. Nonetheless, I was certain that, with so many pitchers in the majors, a breakdown of the data would show that there was no single starting point on the rubber. Every pitcher is different, right?

Setup

I started my analysis by downloading the last 4 years (2009-2012) of PitchFx data. Most of us know this already, but PitchFx data comes with some limitations. Unlike Trackman, PitchFx initially records each pitch at 50’ from home plate, not at the actual release point. PitchFx calls this data point “x0”, and for all intents and purposes it is pretty good data: most pitchers stride approximately 5 to 6’ from the rubber, and with arm’s length added in we are talking about a difference of a couple of percentage points from Trackman’s release point metric. But, full disclosure, it is not exactly the release point. Another factor that I didn’t measure is a pitcher’s motion to the plate. Some pitchers throw “across” their bodies rather than down a straight line, and even fewer open up their body to the batter (stepping to the stride leg’s baseline). Also, there is probably a bit to glean from the difference between the stretch and the wind-up, but without a very in-depth study I assume it is not a factor in the analysis. Lastly, arm length is an unmeasured factor. For example, I didn’t check whether any right-handed pitchers with extra-long arms were standing on the first-base side of the rubber and distorting the data.

I started by combining the PitchFx Sinker (SI) and Two-seam fastball (FT) data into a single database. I combined them because the grips for the two pitches are the same, because a two-seam fastball and a sinker break the same way (down and in to a RH batter from a RH pitcher), and because the two names are somewhat synonymous in major league vernacular. Maybe somewhere along the line the pitch was invented twice (north and south), and the name depends on region, like asking for a Coke… it’s a “soda”, a “pop”, or a “tonic” depending on where you are in the States. Maybe in the South it was labeled a sinker and in the North it was taught as a “two-seamer”? Either way it’s the same pitch as far as I am concerned, and the etymology of pitch naming is a different topic for a different time.

Back to the question above about every pitcher being different: I was wrong. Using the 2012 data I created a frequency distribution for right-handed pitchers (figure 1), and as you can see there is a definite focal area at around -2’ from the centerline of the pitching rubber (and home plate).

Image

Figure 1 – Right-handed pitchers in 2012

This shows that most pitchers start from about the same side, which I determined to be the right side of the rubber (3rd base side). I determined this by adding 9” to one-half the length of the pitching rubber (24”), which comes to 21” (9”+12”). Add in arm length and you can see that an x0 of -2’ or less (remember, we are using negatives here) should indicate that the pitcher is throwing from the right side. I would like to add that the 9” used above is half the shoulder width of an average man, which is around 18”. This metric is based on studies of the “biacromial diameter” of male shoulders in 1970 (pg. 28 Vital and Health Statistics – Data from the National Health Survey). I think we can all agree that 18” is probably conservative by today’s growth standards. As I mentioned in the limitations above, I don’t account for arm length or pitcher motion. Therefore I needed to make sure that there really are right-handed pitchers throwing from the left-hand side of the rubber, and not just a bunch of super long-armed, cross-bodied throwers. With the data in hand I was able to identify which pitchers had thrown the ball closer to the centerline of the rubber and therefore would be good candidates for standing on the left side of the rubber. The first pitcher with a higher (>-2) x0 value was Yovani Gallardo of the Milwaukee Brewers. Without knowing Gallardo’s motion I needed to go to the video. From the video, you can clearly see that Gallardo starts on the left side of the rubber and throws fairly conventionally, straight down the line to the batter.

I wanted to keep this as simple as possible, breaking up the pitchers into two categories: Left side or Right side. Without looking at video for each pitcher, I had to come up with a tipping point for classifying the side based on the x0 data I had available. If we simply take what we determined above and mirror it to the left-hand side, we come up with an x0 of 0 for a pitcher starting on the left side of the rubber. But it isn’t quite that simple. The frequency chart shows that fewer than 1,000 pitches were thrown in 2012 with an x0 greater than or equal to 0. Gallardo threw 504 pitches himself in 2012. So we have to increase the scope a bit. By arranging the x0 data into quartiles we see that the upper or lower quartile, depending on handedness, is around -1 or 1 (remember we are using negatives), so for a right-handed pitcher the x0 splits are:

Min       25%       Med       Avg       75%       Max
-5.264    -2.315    -1.868    -1.849    -1.372    2.747

For left handers:

Min       25%       Med       Avg       75%       Max
-3.787    1.455     1.953     1.924     2.401     5.378

Because I am trying to stay conservative, and because these are not true release point numbers, I use 1 and -1 as the cutoffs for classification, depending on the handedness of the pitcher. Using these numbers provided a pretty clean break in the distributions (roughly 90% to 10%).
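The cutoff rule above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the function name, the "R"/"L" handedness codes, and the handling of a pitch that lands exactly on the cutoff are all assumptions.

```python
def classify_rubber_side(x0_feet: float, throws: str, cutoff: float = 1.0) -> str:
    """Label a pitch 'RR' (right rubber) or 'LR' (left rubber) from its x0.

    Assumed convention, following the text: right-handers cluster near
    x0 = -2 (right rubber) and left-handers near x0 = +2 (left rubber).
    """
    if throws == "R":
        # A right-hander releasing inside of -1 is treated as cross-side.
        return "LR" if x0_feet > -cutoff else "RR"
    if throws == "L":
        # A left-hander releasing inside of +1 is treated as cross-side.
        return "RR" if x0_feet < cutoff else "LR"
    raise ValueError("throws must be 'R' or 'L'")


# The median x0 values from the quartile tables fall on the expected side.
print(classify_rubber_side(-1.868, "R"))
print(classify_rubber_side(1.953, "L"))
```

Applied to every pitch, a rule like this yields the same-side and cross-side groups used in the findings.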

Findings

So who was right, the all star pitcher or the minor league pitching coach? Is there an advantage depending on where the pitcher stands on the rubber? Neither – both of them. It’s a tie.

What can I say? My initial analysis is a bit anticlimactic, but not because of lack of effort. The labels used below are:

  • LH or RH (Handedness)
  • RR or LR (Right or Left Rubber)
  • B – Balls
  • K – Strikes
  • P – In play (No Outs)
  • O – In play (Outs)
  • BackK – Called Strikes
  • FT – Two seam fastballs
  • SI – Sinkers
  • Efficiency – O/(P+O)
  • XSide – Cross Side (i.e. RH-LR or LH-RR)
  • Same side – LH-LR or RH-RR

 

LHData: 194487 pitches

Label          Count     Pct       Label          Count    Pct
LH_LR          173145    89.03%    LH_RR          21342    10.97%
LH_LR_B        62957     36.36%    LH_RR_B        7932     37.17%
LH_LR_K        75241     43.46%    LH_RR_K        9067     42.48%
LH_LR_O        22610     13.06%    LH_RR_O        2843     13.32%
LH_LR_P        12335     7.12%     LH_RR_P        1500     7.03%
LH_LR_FT       108600    62.72%    LH_RR_FT       15846    74.25%
LH_LR_SI       64545     37.28%    LH_RR_SI       5496     25.75%
LH_LR_BackK    34932     46.43%    LH_RR_BackK    4406     48.59%

RHData: 473032 pitches

Label          Count     Pct       Label          Count     Pct
RH_LR          48791     10.31%    RH_RR          424241    89.69%
RH_LR_B        18266     37.44%    RH_RR_B        153014    36.07%
RH_LR_K        20486     41.99%    RH_RR_K        180611    42.57%
RH_LR_O        6453      13.23%    RH_RR_O        58895     13.88%
RH_LR_P        3583      7.34%     RH_RR_P        32459     7.65%
RH_LR_FT       21781     44.64%    RH_RR_FT       194582    45.87%
RH_LR_SI       27010     55.36%    RH_RR_SI       229659    54.13%
RH_LR_BackK    10520     51.35%    RH_RR_BackK    82482     45.67%

XSide vs. Same Side: 667519 pitches

XSide             Count    Pct       Same Side         Count     Pct
LH_RR&RH_LR       70133    10.51%    LH_LR&RH_RR       597386    89.49%
LH_RR&RH_LR_B     26198    37.35%    LH_LR&RH_RR_B     215971    36.15%
LH_RR&RH_LR_K     29553    42.14%    LH_LR&RH_RR_K     255852    42.83%
LH_RR&RH_LR_O     9296     13.25%    LH_LR&RH_RR_O     81505     13.64%
LH_RR&RH_LR_P     5083     7.25%     LH_LR&RH_RR_P     44794     7.50%
LH_RR&RH_LR_FT    37627    53.65%    LH_LR&RH_RR_FT    303182    50.75%
LH_RR&RH_LR_SI    32506    46.35%    LH_LR&RH_RR_SI    294204    49.25%
BackK             14926    50.51%    BackK             117414    45.89%
Efficiency                 64.65%    Efficiency                  64.53%

The efficiency is so very close. Twelve-hundredths (.12) of a percent is not a lot (169 outs out of 140678), but give any Chicago Cub fan five of those outs in 2003 and Mr. Bartman would be an afterthought. Which, I am sure, is the way he and all Cub fans around the world would like it. The efficiency is the same, no other way to put it, and that is the beauty of statistics and sabermetrics. Numbers can say so much, even when they are equal.
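As a quick check, the two efficiency figures can be recomputed from the O and P counts in the combined table using the definition Efficiency = O/(P+O). The snippet below is a small sketch; the function and variable names are mine.

```python
def efficiency(outs: int, in_play_no_out: int) -> float:
    """Share of balls in play that became outs: O / (P + O)."""
    return outs / (outs + in_play_no_out)


# O and P counts taken from the combined XSide / Same Side table.
x_side = efficiency(outs=9296, in_play_no_out=5083)
same_side = efficiency(outs=81505, in_play_no_out=44794)

print(f"XSide efficiency:     {x_side:.2%}")
print(f"Same side efficiency: {same_side:.2%}")
print(f"Balls in play overall: {9296 + 5083 + 81505 + 44794}")
```

The last line reproduces the 140678 balls in play quoted above.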

But the analysis wasn’t all for naught; there are some nuggets to glean from the numbers above. As a segue, I am currently watching Derek Lowe of the Texas Rangers pitch on opening night, and from the left side of the rubber he throws a sinker that dips back over the rear part of the plate for a called strike. With all of the similarities within my analysis, the most striking observation is the difference in called strikes depending on the side of the rubber. If a pitcher, coach or manager could get a strike or a strikeout without the fear of a batter getting a hit or moving a runner forward, they would do it every time. A five percent difference in getting a strike without the worry of the ball being put into play would be an interesting thing to know in some tight situations with runners on base. My thought is that the difference comes from the back door being open a little wider when it comes to getting called strikes. With a pitcher throwing X-side you can definitely see a pattern of called strikes on the same side of the plate as the side from which the pitcher throws. Positive numbers in the figures below indicate the right side of the plate (1st base side).

Image

With today’s specialization, where pitchers are matched up to batters based on handedness, the ability of a pitcher to throw a strike as it tails back over the plate or close to the plate (or maybe not even close for some of the pitches above) is essential. It appears that umpires are a little more flexible with their perception of the strike zone for these pitchers as well.
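The called-strike gap discussed above can be re-derived from the K and BackK counts in the combined table. This is a sketch with names of my own choosing; it assumes, as the table percentages suggest, that the BackK share is called strikes divided by all strikes.

```python
def called_strike_share(called: int, strikes: int) -> float:
    """Fraction of all strikes that were called (BackK / K)."""
    return called / strikes


# Counts taken from the combined XSide / Same Side table.
x_side = called_strike_share(called=14926, strikes=29553)
same_side = called_strike_share(called=117414, strikes=255852)
gap_points = (x_side - same_side) * 100  # gap in percentage points

print(f"XSide:     {x_side:.2%}")
print(f"Same side: {same_side:.2%}")
print(f"Gap:       {gap_points:.1f} percentage points")
```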

Closing

I didn’t get the results that I anticipated when I started this analysis, and that is great! As a society we are determined to have a winner! Just as there is “no crying in baseball”, there are no ties in baseball. Even when there is a tie, like on a close play at first, it proverbially goes to the runner. We can’t settle for a tie… hockey reduced ties by adding a shootout after overtime. College football removed the tie by introducing sudden death (hopefully the bowl playoff will help eliminate the subjective BCS tie). The lack of a clear-cut advantage (read: TIE) in my analysis means that a more in-depth analysis could/should be performed to validate it. Maybe expanding the percentage of X-side pitchers to 15-20, or identifying when pitchers are throwing from the stretch and removing those instances, would alter the results and provide a much-needed winner? If, after all analytical and statistical avenues have been exhausted, there’s still not a proven advantage, we can always resort to having the coach and player settle it with a coin flip.


A Case Study in Lineup Construction

Controversy and speculation have surrounded the Texas Rangers’ lineup for the better part of a year. First, Michael Young was a consistent presence in the middle of the Rangers’ order despite lackluster performance. More recently, the departures of Josh Hamilton and Mike Napoli have led many to speculate that the Rangers’ offense would take a step back in 2013. But how did Ron Washington’s lineups compare to an optimized lineup? How will the loss of Hamilton and Napoli affect the Rangers’ run production?

To find out, I wrote a Monte Carlo program which simulated 50 seasons of games for all 362,880 (9!) lineup combinations. It takes as input each batter’s singles, doubles, triples, home runs, walks, and strikeouts as a percentage of his plate appearances. The outcome of each at bat is determined by a random number generator as if each batter faces a league average pitcher, and base runners advance according to the league averages for taking extra bases. While it does not include all the variations of pitcher quality, player speed and defensive quality, it gives an adequate picture of the effectiveness of various lineups.
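To give a feel for the approach, here is a heavily simplified sketch of one simulated game in Python. It is not the program described above: the function names and the sample rates are hypothetical, strikeouts are folded into generic outs, and runners simply advance as many bases as the batter instead of following league-average extra-base rates.

```python
import random

HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}

# Hypothetical per-plate-appearance rates for a roughly average hitter.
AVERAGE_HITTER = {"1B": 0.150, "2B": 0.045, "3B": 0.005, "HR": 0.030, "BB": 0.085}


def plate_appearance(rates, rng):
    """Draw one outcome; probability not assigned to an event is an out."""
    r = rng.random()
    cumulative = 0.0
    for event, p in rates.items():
        cumulative += p
        if r < cumulative:
            return event
    return "OUT"


def simulate_game(lineup, rng, innings=9):
    """Return runs scored by a lineup over a full game of `innings` innings."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, bases = 0, [False, False, False]  # first, second, third
        while outs < 3:
            event = plate_appearance(lineup[batter % len(lineup)], rng)
            batter += 1
            if event == "OUT":
                outs += 1
            elif event == "BB":
                # Only forced runners move up on a walk.
                if all(bases):
                    runs += 1
                elif bases[0] and bases[1]:
                    bases[2] = True
                if bases[0]:
                    bases[1] = True
                bases[0] = True
            else:
                n = HIT_BASES[event]
                advanced = [False, False, False]
                for i, occupied in enumerate(bases):
                    if occupied and i + n >= 3:
                        runs += 1  # runner crosses the plate
                    elif occupied:
                        advanced[i + n] = True
                if n == 4:
                    runs += 1  # the batter scores on a home run
                else:
                    advanced[n - 1] = True
                bases = advanced
    return runs


def runs_per_game(lineup, games, seed=0):
    """Average runs per game over `games` simulated games."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games


lineup = [dict(AVERAGE_HITTER) for _ in range(9)]
print(round(runs_per_game(lineup, games=162), 2))
```

Comparing batting orders would mean calling `runs_per_game` on each permutation of nine real hitter profiles (for example via `itertools.permutations`), which is where the 362,880 combinations come from.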

Let’s first look at the effect of moving Young from the 5th spot to the 9th spot. We’ll start with the most frequently occurring lineup from 2012:

Ian Kinsler
Elvis Andrus
Josh Hamilton
Adrian Beltre
Michael Young
Nelson Cruz
David Murphy
Mike Napoli
Mitch Moreland

We’ll plot a histogram of the runs per game (labeled rpg in the plots, always for full nine-inning games) scored by all 362,880 possible lineup combinations, all 40,320 lineup combinations with Young batting 5th, and all 40,320 lineup combinations with Young batting 9th (the y-axis is frequency of occurrence; note the logarithmic scale).

2012 Lineup distribution, Young in 5 slot vs 9 slot

Most possible lineup combinations produce the same number of runs to within 0.1 runs per game. No matter the lineup combination, the variation in runs scored is around 16 runs a year. For the Rangers’ lineup, lineup optimization is a relatively small effect. Lineups with different hitters may show a greater or lesser dependence of run scoring on lineup construction.

The difference made by moving Michael Young from 5th in the order to 9th in the order is smaller still: 0.02 runs per game, or 3 runs over the course of a year. Given the hitters in the Rangers lineup, batting Young 5th in the order did not make a significant difference. But there was another option: Ron Washington could have substituted Craig Gentry for Michael Young. We again plot a histogram of the runs per game scored for all possible lineup combinations with Gentry batting (red) or Michael Young batting (blue).

Rangers Lineup Distribution, Young vs. Gentry

Again, we find the difference to be minimal; this time roughly 0.01 runs per game, or a mere 1.6 runs per season. While it was painful to watch Young batting 5th in 2012, the increased production at the bottom of the lineup largely offset the loss of production in the middle of the lineup. So what happens now that the Rangers’ lineup has lost Hamilton, Napoli and Young in exchange for AJ Pierzynski, Lance Berkman, and Leonys Martin/Craig Gentry? Based on Ron Washington’s lineups in spring training, a likely common lineup for the Rangers in 2013 is as follows:

Ian Kinsler
Elvis Andrus
Lance Berkman
Adrian Beltre
Nelson Cruz
AJ Pierzynski
David Murphy
Mitch Moreland
Leonys Martin

I ran all possible lineup combinations in which Adrian Beltre batted 2nd, 3rd or 4th for both the 2012 and likely 2013 Rangers’ lineups. For the 2013 Rangers’ lineup, I used projections (ZiPS, Steamer, Oliver, Bill James) for the upcoming season to seed the simulation with the hitters’ likely production. Again, here is a histogram of runs scored per game for all these lineup combinations, with 2012 in blue and 2013 in red.

2013 Rangers Lineup Distribution vs 2012 Lineup Distribution

The fitted peaks predict a 0.22 runs per game increase for the Rangers in 2013, or roughly 36 runs over the course of the year. The non-Gaussian (that is, non-normal) tail of the 2013 distribution indicates it might be possible to improve even more.

We will finish with comparisons of the optimized lineups for 2012 and 2013 to the most usual/expected lineups for those years.

2012 Lineup       2012 Optimized    2013 Lineup       2013 Optimized
5.03 rpg          5.11 rpg          5.29 rpg          5.34 rpg
Ian Kinsler       David Murphy      Ian Kinsler       Ian Kinsler
Elvis Andrus      Adrian Beltre     Elvis Andrus      Lance Berkman
Josh Hamilton     Josh Hamilton     Lance Berkman     Leonys Martin
Adrian Beltre     Mitch Moreland    Adrian Beltre     Adrian Beltre
Michael Young     Nelson Cruz       Nelson Cruz       Nelson Cruz
Nelson Cruz       Mike Napoli       AJ Pierzynski     Mitch Moreland
David Murphy      Ian Kinsler       David Murphy      AJ Pierzynski
Mike Napoli       Michael Young     Mitch Moreland    David Murphy
Mitch Moreland    Elvis Andrus      Leonys Martin     Elvis Andrus

We’ll start with the big picture. While moving/substituting for Michael Young in 2012 would have made little difference in run production, an optimized lineup would have increased the Rangers’ run total by 13 runs over the course of the year. Not much, but it would likely have been enough to have won the division instead of losing to the A’s. Of course, it is much easier to optimize a lineup when you already know how everyone is going to perform; using an optimized lineup based on 2012 projections wouldn’t have netted the 13 run increase. Most notably, leading off with Murphy (in his breakout year) instead of Kinsler (in his down year) to increase production is not a move one could expect an organization to predict before any games had been played in 2012.

Second, the probable lineup for the Rangers in 2013 is projected to score 8 runs a year less than an optimized lineup. Given the large variance in the production of a hitter as compared to his projections, these lineups seem virtually equivalent.
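The season totals quoted throughout come from multiplying a runs-per-game difference by the length of the schedule. A small sketch of that arithmetic, assuming a 162-game season and using names of my own:

```python
GAMES_PER_SEASON = 162  # assumed full regular-season schedule


def season_runs(rpg_difference: float) -> float:
    """Convert a runs-per-game difference into runs over a full season."""
    return rpg_difference * GAMES_PER_SEASON


# rpg values taken from the lineup comparison table.
gain_2012 = season_runs(5.11 - 5.03)  # optimized vs. usual 2012 lineup
gain_2013 = season_runs(5.34 - 5.29)  # optimized vs. expected 2013 lineup
year_over_year = season_runs(0.22)    # 2013 peak vs. 2012 peak

print(round(gain_2012), round(gain_2013), round(year_over_year))
```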

The optimized lineups show different characteristics than the lineups generated by Ron Washington. The optimized lineups forgo batting Elvis Andrus second in favor of a power hitter with a good average. Elvis Andrus is instead relegated to the 9th spot. The 2013 optimized lineup puts a lot of faith in rookie Leonys Martin, due entirely to some very respectable projections for the coming year (and to the simulation not knowing he’s a rookie). Given the uncertainty about how much offense Martin will produce in 2013, having Martin bat at the bottom of the order, as in Ron Washington’s lineup, seems prudent. Finally, Mitch Moreland is preferred in the middle of the lineup in the optimized lineups instead of at the bottom of the order as in Washington’s lineups.

If the Rangers are looking to optimize their lineup for 2013, this simulation indicates two main points to consider: moving Moreland to the middle of the order, and batting Andrus 9th.


Measuring a pitcher’s ability, performance, and contribution

I’d like to share some of my thoughts and research on how we evaluate Major League Baseball pitchers. I think for the most part when we use statistics to discuss a pitcher, we are really looking at the pitcher from one or more of the following three perspectives: 1) ability, 2) performance, and 3) contribution. Before I get into my research, I will take a moment to describe what I mean by each of the three terms.

ABILITY

When I use the word ‘ability’, I am describing the physical and mental skills the pitcher has at his disposal. Some examples of ability are: how hard he can throw, what kind of movement he has on his pitches, how well he can locate, how well he mixes his pitches, etc. With the introduction of PitchFX, we are now capable of measuring ability better than ever before. With that being said, it is still difficult to accurately and meaningfully quantify many aspects of ability. Since a pitcher’s performance is based at least in part upon his ability, performance statistics can sometimes be used as a substitute for direct ability measures.

PERFORMANCE

Performance literally describes how well a pitcher performed. In other words, it refers to the outcome or outcomes resulting from that pitcher throwing pitches. Nearly all baseball statistics describe performance. Some statistics measure a pitcher’s individual performance fairly well, whereas others combine the pitcher’s performance with the performance of his team and other factors. For example, ERA is generally not considered a great measure of a pitcher’s individual performance; however, FIP is considered a better measure of individual performance.

CONTRIBUTION

I have not found much reference to the word ‘contribution’ in the baseball literature, but I do think it is an important concept to consider. Contribution is a word I use to describe a pitcher’s contribution in helping his team win baseball games. By this general definition, I suppose ERA (and other performance measures) could also be considered a contribution measure in some respects, since wins are related to runs allowed. Therefore, I also propose that the relationship between ability, performance, and contribution is not divided by solid lines but is instead a spectrum where each statistic can be considered somewhat a part of each category. However, in an attempt to clear up this somewhat murky discussion, I will offer stats such as W-L, WAR and WPA as the most obvious contribution stats*.

*Note: Contribution stats can be measured directly (i.e. W-L) or derived from performance stats (i.e. FanGraphs WAR is derived from FIP).

RESEARCH

Now on to my research… The hypothesis that drove this work was: pitcher ability measures are more consistent between seasons than performance or contribution. This hypothesis is based on my belief that unlike performance and contribution, which are affected by countless outside factors, a pitcher’s ability is within himself and therefore less likely to dramatically change between seasons.

To test this, I took each pitcher that pitched a minimum of 120 innings in each season from 2008-2011. This gave me a pool of 63 pitchers.

For my ability measure, I took the statistic whiff/swing. I like this measure of ability because to me it is the simplest measure of an isolated part of a pitcher’s ability. Since the batter has already decided he will swing, we are only looking at the pitcher’s ability to throw a ball that will evade a hitter’s bat. I know ability to hit the ball is also heavily dependent on the hitter’s ability, but I think that using pitchers that pitched 120 innings in each season will let me take the individual batter out of the equation and use this as a measure of pitcher ability.

For my performance measures I used ERA and FIP from FanGraphs. I agree ERA is not the best performance measure, and may be considered more of a contribution; however, I have included it nonetheless. Finally, for my contribution measure I decided to use FanGraphs WAR.

I calculated the average whiff/swing, ERA, FIP, and WAR for each pitcher over the four-year period. I also calculated the standard deviation within each pitcher for each stat and the within-pitcher coefficient of variation (stdev/avg). Coefficient of variation is the best way to report the variability of each statistic over the four seasons because it scales each stat’s spread by its own average, so stats reported in different units can be compared directly.
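The within-pitcher coefficient of variation can be sketched as follows. The four season values are made up for illustration, and the use of the sample standard deviation is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(season_values):
    """Within-pitcher CV: standard deviation divided by the average."""
    return stdev(season_values) / mean(season_values)


# Hypothetical whiff/swing rates for one pitcher across four seasons.
whiff_per_swing = [0.20, 0.22, 0.19, 0.21]
cv = coefficient_of_variation(whiff_per_swing)
print(f"{cv:.1%}")
```

Averaging this value across all 63 pitchers, one stat at a time, gives the kind of summary reported below.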

Globally, over the four-season period the 63 pitchers in my group had an average:
whiff/swing = 0.205
ERA = 4.03
FIP = 3.97
WAR = 3.08.

The average within pitcher coefficient of variation was:
9.6% for whiff/swing
18.5% for ERA
12.0% for FIP
and 47.7% for WAR.

TAKE HOME

So what does this mean? Well, I know this is just a start, but based on this I believe my hypothesis was correct. A pitcher’s ability is much more consistent between seasons than his performance and/or contribution. Furthermore, performance is more consistent than contribution. It appears that the further you get from pure ability measures, the more difficult it will be to accurately and reliably predict a pitcher’s future performance and contribution. I’d like to do some further research on performance prediction to confirm this, but my guess is that trying to predict future WAR from past WAR will be extremely difficult. Predicting future WAR from past ability measures may prove to be more effective.


Bill “Moneyball” Veeck

I was sitting on a park bench reading Veeck as in Wreck, the memoir of legendary ballclub owner Bill Veeck, when I came across this passage:

Ken Keltner, our third baseman and one-time power hitter, had a miserable season in 1946. There seemed little doubt that he was on the downgrade. Still, when I signed him for the next year, I gave him the same amount of money and told him that if he had what I considered a good year I’d give him a bonus of $5,000.

The next year, Kenny hit the ball better than anybody on our club, with less luck than anybody in the league. If you walked into the park late and saw somebody making a sensational leaping, diving backhanded catch, you could bet that Keltner had hit the ball.

On the last day of the season, he was hitting under .260 and had driven in around 75 runs. I called down to the locker room, got him on the phone, and said, “Hey, where have you been? Weren’t you supposed to come up and see me at the end of the season?”

“I didn’t win anything,” he said. “I’m having a lousy season.”

I suggested that he wander up anyway. As he came through the door I said, “I’ve got $5,000 for you.”

And he said, “I didn’t earn it, Bill.” And he started to weep.

“You hit the ball better than anybody else on this club,” I told him. “It wasn’t your fault they kept catching it.”

As a loyal FanGraphs reader, I immediately thought: BABIP! For those who need a quick reminder, batting average on balls in play (BABIP) measures just that: batting average on balls hit somewhere the defense can get to them. It’s expected that BABIP will generally hover around .300, modified by such factors as the enemy defense (this averages out over a season), whether the balls you hit go over outfield fences, and, most of all, luck.
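For readers who want the arithmetic, the standard BABIP formula is (H - HR) / (AB - K - HR + SF). Here is a minimal sketch; the stat line is invented for illustration and does not belong to any player mentioned here.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int = 0) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


# Hypothetical season: 150 H, 20 HR, 550 AB, 100 K, 5 SF.
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")
```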

Now, Veeck’s comment that Keltner “hit the ball better than anybody else” was probably a kindness rather than a hypothesis. But his observation that “they kept catching it” checks out. I looked at the leaderboard for the BABIPs of every qualifying hitter in 1947. Sure enough, Ken Keltner’s down near the bottom, ranking 68th of 86 with a BABIP of .264. The median that year was almost thirty points higher: .292.

Ken Keltner had lousy luck, but was still an average hitter (102 wRC+). And the next year was the best of his career (7.9 WAR), so it looks like Bill Veeck saw the Keltner case exactly right. Only there’s a twist. One of Veeck’s 1947 Indians had it even worse. Down there at 74th is the .256 BABIP of Joe Gordon. Joe Gordon slugged 27 doubles, 6 triples, and 29 home runs, so things turned out well for him, but if Veeck’s latecomer had bet that “a sensational leaping, diving backhanded catch” was on a ball hit by Ken Keltner, you’d want to bet against him. Joe Gordon’s luck was worse; he compensated by putting more balls in the outfield bleachers.

There’s weirder to come. Dead last, 86th of 86, is Roy Cullenbine, Tigers first baseman, who paired a grotesque .206 BABIP and .224 average (83rd of 86) with the second-highest walk rate in baseball. His 22.6% walk rate was topped only by Triple Crown winner Ted Williams. (By the way, in the previous year, Williams had been introduced to the defensive shift, as pioneered by, yes, Bill Veeck’s Indians.)

No player in 2012 came close to matching Cullenbine’s bizarre season. The lowest BABIP of any qualifying hitter in 2012 was .242 (Justin Smoak); of all hitters with BABIPs below .256 (fifty points higher than Roy Cullenbine’s), none came within fifty points of Cullenbine’s .401 OBP. The best analogy is this: Cullenbine hit for average like Dan Uggla, had Justin Smoak’s luck, and still drew walks at the rate of Barry Bonds.

Roy Cullenbine was only 33 in 1947, and in past years his offensive numbers were impressive. Had he been on Bill Veeck’s Indians instead of playing for the Tigers, his unlucky 1947 might have ended as Ken Keltner’s did, with a $5,000 bonus. The Tigers, not valuing Cullenbine’s patience, released him, and he never played a major-league game again.

There’s another interesting name among the ten unluckiest batters of 1947. Coming in at sixth-worst, with a BABIP of .247, is a patient slugger who got on base even more than Cullenbine did, with four more walks than he had hits. He too retired after the season. His name was Hank Greenberg, and that winter he accepted a job in a major-league front office, where he was groomed to be the team’s next general manager. The team was the Cleveland Indians. His new boss was Bill Veeck.


Can the WBC be fixed?

While this year’s iteration of the World Baseball Classic has certainly experienced success, it does not have the juggernaut status that the Football World Cup or the Olympics currently hold. While the Classic will probably never approach the success these two international tournaments have, it does have the potential to spread baseball interest and expand the game around the world, particularly in places like Europe or China. In order for baseball to grow, it has to reach new fan bases outside of the United States, which appears to be at the max of its potential. The WBC is a nice touch to baseball’s international growth, but it needs a few modifications to truly reach its potential.

The problem with the current round-robin format is the attendance and interest level for games involving two lesser-known countries. In pool A, three of the six games drew fewer than 5,000 fans, while the other three had more than 10,000 fans each, and two drew more than 25,000. The attendance figures in pool B were even more extreme. Three of the games drew fewer than 2,000 fans, while the other three drew more than 20,000 each. To combat this problem, there have been numerous suggestions about turning the tournament into a single-elimination format, as Dave Cameron suggested in his post “Fixing the WBC”. This format is definitely the best option for the tournament, as it would increase the interest and attendance in each game given the win-or-go-home atmosphere. Hopefully, since all the games would pit a high-seeded team against a low-seeded team, the low-interest games with fewer than 5,000 fans would be eliminated.

The other advantage to the single-elimination tournament is the elimination of the silly WBC rules and tiebreaking procedures. Run differential would no longer be the difference between advancing out of a pool and going home. The pointless games to determine seeding at the end of the second round would also be eliminated. Perhaps the pitch limits would go away as well because teams would play fewer games. The tournament would no doubt gain some relevancy if the silly rules and restrictions were eliminated.

Most of the potential changes to the WBC involve shortening it to a week or so. While most would agree that the current format is too long, MLB might not bite on a change that shortens the tournament to a mere week. The solution: why not expand the number of teams to 32? The current 16 teams would stay, and all the teams that participated in the qualifier would be added as well. That adds up to 28 teams. I wasn’t really sure what the four other teams could be, so I came up with Pakistan, Russia, Belgium, and Austria. I’m sure there might be better teams out there, but let’s proceed with these four teams to make it easy. To determine the format, I divided the tournament into four conferences: Northwest, Euro, East, and South:

East:                       South:

1. Japan                    1. Venezuela
2. South Korea              2. Australia
3. Taiwan                   3. Brazil
4. China                    4. Colombia
5. Israel                   5. South Africa
6. Czech Republic           6. New Zealand
7. Pakistan                 7. Philippines
8. Russia                   8. Thailand

Euro:                       Northwest:

1. Netherlands              1. Dominican Republic
2. Italy                    2. United States
3. Spain                    3. Puerto Rico
4. Germany                  4. Cuba
5. United Kingdom           5. Canada
6. France                   6. Mexico
7. Belgium                  7. Panama
8. Austria                  8. Nicaragua

The current March timing for the WBC works OK, but it’s not perfect. The All-Star break doesn’t work either because MLB would never agree to nix the “beloved” event. That leaves the winter. I’m not sure the middle of the winter makes sense because the offseason is in full swing and free agents wouldn’t want to participate for fear of getting injured. That leaves November and February. Both of these times make sense to me, but I think the players would be less than thrilled to participate right after the postseason. That leaves February. The absence of football is a plus, and players wouldn’t have the excuse of spring training to avoid participation. Assuming that spring training starts March 1, here are some potential dates:

February 14: 4 East First Round Games
February 15: 4 South First Round Games
February 16: 4 Euro First Round Games
February 17: 4 Northwest First Round Games
February 19: 2 East Semifinal Games, 2 South Semifinal Games
February 20: 2 Euro Semifinal Games, 2 Northwest Semifinal Games
February 22: East Final Game, South Final Game
February 23: Euro Final Game, Northwest Final Game
February 25: East Winner vs. South Winner
February 26: Euro Winner vs. Northwest Winner
February 28: Final Game

The close proximity of these games might require them to be played in a single country, as opposed to the international format used now. I’m not really sure how many countries besides Japan and the United States could host the two-week tournament. Perhaps Japan and the US could alternate until other countries become viable alternatives. Or the regional games could be held in their own regions and the winners could meet up for the semis somewhere else, as in the current format. It would be great if European countries or other big countries like India could host the WBC, but currently it doesn’t seem likely.

Overall, this format offers some significant advantages over the current one. This version of the Classic would have 31 games, only eight fewer than the current format, which would appeal to MLB because the new version could generate a comparable amount of revenue. However, individual teams would play fewer games, potentially attracting the big stars currently holding out. Already, we have seen players like Chase Headley, Jurickson Profar, Gio Gonzalez, and Kenley Jansen join the Classic in the later rounds when there are fewer games to play. Additionally, players competing for a job in spring training would be more enticed to join the Classic because it provides another opportunity to showcase their talent to teams. The injury risk would be lower because (1) there are fewer games to play and (2) players would have a longer period of time to recover from injury. Yes, baseball would start earlier, but hopefully this format would attract players the same way the World Cup does for soccer. With increased player participation, more exciting games, more teams involved, and a time frame that doesn’t compete with baseball’s own spring training, these changes make sense for MLB, the players, and most importantly, the fans.
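For anyone who wants to check the game count, here is a minimal sketch of the bracket arithmetic. The function names are made up for illustration, and the only inputs are the 32-team field and the schedule listed above; the current-format total is simply backed out of the "eight fewer" figure.

```python
def single_elimination_games(teams: int) -> int:
    """Every game eliminates exactly one team, so a field of N needs N - 1 games."""
    if teams < 1 or teams & (teams - 1):
        raise ValueError("this sketch assumes a power-of-two field")
    return teams - 1


def rounds_to_title(teams: int) -> int:
    """Number of wins the champion needs in a power-of-two bracket."""
    return teams.bit_length() - 1


proposed_games = single_elimination_games(32)

# Same count built from the proposed schedule: each of the four conferences
# plays 4 first-round games, 2 semifinals and 1 final, followed by two
# cross-conference games and the championship game.
schedule_games = 4 * (4 + 2 + 1) + 2 + 1

max_games_per_team = rounds_to_title(32)

# The article puts the proposal at eight games fewer than the current format.
current_games = proposed_games + 8

print(proposed_games, schedule_games, max_games_per_team, current_games)
```

The point of the second calculation is just that the schedule and the bracket agree, and that no team would play more than five games.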


The True Dickey Effect

Most people who try to analyze this Dickey effect tend to lump all the pitchers who follow him into one group with one ERA and compare it to the total ERA of the bullpen or rotation. This is a simplistic and non-descriptive way of analyzing the effect, and it does not account for how often each pitcher pitches when not following Dickey.

I decided to determine whether there truly is an effect on the statistics (ERA, WHIP, K%, BB%) of pitchers who follow Dickey in relief and of the starters of the next game against the same team. I went through every game that Dickey has pitched and recorded the stats (IP, TBF, H, ER, BB, K) of each reliever individually, and the stats of the next starting pitcher if the next game was against the same team. I did this for each season. I then took each pitcher’s stats for the whole year and subtracted his following-Dickey stats from them to get his stats when he did not follow Dickey. I summed the following-Dickey stats, calculated the rate stats from those totals, and weighted each pitcher by the batters he faced as a share of the total batters faced after Dickey. This weight was then applied to the not-after-Dickey stats. So, for example, if Francisco faced 19.11% of the batters after Dickey, his line was adjusted so that he also faced 19.11% of the batters not after Dickey. This makes the two samples directly comparable, so that a fair relationship can be determined. The not-after-Dickey stats were then summed and the rate stats were calculated as well. The two sets of rate stats, after Dickey and not after Dickey, were compared using the formula (afterDickeySTAT - notafterDickeySTAT)/notafterDickeySTAT. This tells me, as a percentage, how much better or worse relievers or starters did when following Dickey.

I then combined the after-Dickey stats and the not-after-Dickey stats for starters and relievers from all three years and applied the same weighting technique, treating each pitcher-season separately. For example, if Niese ’12 faced 10.9% of all batters faced by starters following a Dickey start against the same team, his line was adjusted so that he faced 10.9% of the batters faced by starters not after Dickey (counting only the starters who pitched after Dickey that season). The rest of the calculation matched the year-by-year one, and a total percentage for each stat was calculated.
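A minimal sketch of the weighting described above, using K% as the example stat. The `Line` class, the function name and the two pitchers' numbers are hypothetical and only illustrate the steps; this is not the original spreadsheet. For a stat with batters faced in the denominator, rescaling each pitcher's not-after-Dickey line to his after-Dickey share works out to a weighted average of his individual rates, which is what the loop computes.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class Line:
    tbf: int  # total batters faced
    k: int    # strikeouts


def dickey_effect_k_rate(season: Mapping[str, Line], after: Mapping[str, Line]) -> float:
    """Relative change in K% after Dickey vs. the reweighted not-after-Dickey sample."""
    total_after_tbf = sum(line.tbf for line in after.values())
    after_rate = sum(line.k for line in after.values()) / total_after_tbf

    not_after_rate = 0.0
    for name, a in after.items():
        # Season totals minus the following-Dickey line = not-after-Dickey line.
        rest_tbf = season[name].tbf - a.tbf
        rest_k = season[name].k - a.k
        # Weight = this pitcher's share of all batters faced after Dickey.
        weight = a.tbf / total_after_tbf
        not_after_rate += weight * (rest_k / rest_tbf)

    # (afterDickeySTAT - notafterDickeySTAT) / notafterDickeySTAT
    return (after_rate - not_after_rate) / not_after_rate


# Hypothetical pitcher lines, for illustration only.
season = {"Pitcher A": Line(tbf=800, k=160), "Pitcher B": Line(tbf=600, k=90)}
after = {"Pitcher A": Line(tbf=100, k=25), "Pitcher B": Line(tbf=50, k=10)}

effect = dickey_effect_k_rate(season, after)
print(f"K% change after Dickey: {effect:+.2%}")
```

Stats like ERA or WHIP would need innings in the denominator instead of batters faced, so the same idea applies but the bookkeeping differs.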

Here is the weighted year by year breakdown of the starters’ statistics following Dickey and a total (- indicates a decrease which is desired for all stats except K%):

2012:
ERA: -46.94%  with 5/5 starters seeing a decrease
WHIP: -16.16% with 4/5 seeing a decrease
K%: 47.04% with 4/5 seeing an increase
BB%: 6.50% with 3/5 seeing a decrease
HR%: -50.53% with 5/5 seeing a decrease
BABIP: -14.08% with 4/5 seeing a decrease
FIP: -25.17% with 5/5 seeing a decrease

2011:
ERA: 17.92%  with 0/3 seeing a decrease
WHIP: -9.63% with 2/3 seeing a decrease
K%: -2.64% with 2/3 seeing an increase
BB%: -15.94% with 2/3 seeing a decrease
HR%: -9.21% with 2/3 seeing a decrease
BABIP: -15.14% with 2/3 seeing a decrease
FIP: -5.58% with 2/3 seeing a decrease

2010:
ERA: -23.82%  with 5/7 seeing a decrease
WHIP: 1.68% with 5/7 seeing a decrease
K%: -22.91% with 1/7 seeing an increase
BB%: -2.34% with 5/7 seeing a decrease
HR%: -43.61% with 5/7 seeing a decrease
BABIP: -3.61% with 4/7 seeing a decrease
FIP: -10.61% with 5/7 seeing a decrease

Total:
ERA: -17.21%  with 10/15 seeing a decrease
WHIP: -8.10% with 11/15 seeing a decrease
K%: -3.38% with 7/15 seeing an increase
BB%: -5.17% with 10/15 seeing a decrease
HR%: -32.96% with 12/15 seeing a decrease
BABIP: -11.04% with 10/15 seeing a decrease
FIP: -13.34% with 12/15 seeing a decrease

So for starters who pitch in games following Dickey against the same team, it can be concluded that there is an effect on ERA, WHIP, BABIP, and FIP, and a slight effect on BB% and K%. There is also a large effect on HR rates, to which we can attribute the ERA effect. This also tells us that batters are making worse contact the day after Dickey.

So a starter (like Morrow) who follows Dickey against the same team can expect to see around a 17.2% reduction in his ERA that game compared to if he was not following Dickey against the same opponent. For example if Morrow had a 3.00 ERA in games not after Dickey he can expect a 2.48 ERA in games after Dickey.

So in a full season in which Morrow follows Dickey against the same team 66% of the time (games 2 and 3 of a series), and in which he would normally have a 3.00 ERA without Dickey ahead of him, he could expect a 2.66 ERA for the season. This seems to be a significant improvement and would equate to a 7.6-run difference (or 0.8 WAR) over 200 innings.
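The arithmetic behind those Morrow numbers can be sketched as follows. The function names are made up for illustration, and the 10-runs-per-win conversion is an assumption implied by the 7.6 runs and 0.8 WAR figures rather than something stated in the analysis.

```python
STARTER_ERA_EFFECT = -0.1721  # total ERA change for starters following Dickey
RUNS_PER_WIN = 10.0           # assumption implied by 7.6 runs ~ 0.8 WAR


def era_with_effect(base_era: float, effect: float, share_of_starts: float = 1.0) -> float:
    """Scale the ERA by the effect, applied only to the share of starts it covers."""
    return base_era * (1.0 + effect * share_of_starts)


def runs_saved(base_era: float, new_era: float, innings: float) -> float:
    """ERA is runs per nine innings, so convert the ERA gap back to runs."""
    return (base_era - new_era) * innings / 9.0


single_game_era = era_with_effect(3.00, STARTER_ERA_EFFECT)          # every start follows Dickey
season_era = era_with_effect(3.00, STARTER_ERA_EFFECT, 0.66)         # 66% of starts follow Dickey
season_runs = runs_saved(3.00, season_era, innings=200)
season_wins = season_runs / RUNS_PER_WIN

print(round(single_game_era, 2), round(season_era, 2),
      round(season_runs, 1), round(season_wins, 1))
```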

Here is a year by year breakdown of relievers after Dickey (these are smaller sample sizes so I will not include how many relievers saw an increase or decrease):

2012:
ERA: -25.51%
WHIP: -1.57%
K%: 27.04%
BB%: -49.25%
HR%: -34.66%
BABIP: 30.23%
FIP: -38.34%

2011:
ERA: -17.43%
WHIP: 8.45%
K%: 6.74%
BB%: -5.14%
HR%: 7.34%
BABIP: 9.75%
FIP: -2.05%

2010:
ERA: -2.55%
WHIP: 7.69%
K%: -9.28%
BB%: 10.84%
HR%: 2.11%
BABIP: 4.23%
FIP: 9.43%

Total:
ERA: -16.61%
WHIP: 5.38%
K%: 7.50%
BB%: -12.65%
HR%: -8.53%
BABIP: 13.38%
FIP: -10.40%

As expected, there was a good effect on the relievers’ ERA, FIP, K%, and BB%, but WHIP and BABIP were affected negatively. This tells me that the batters were more free-swinging after just seeing Dickey (more hits, fewer walks, more strikeouts).

So in a season where there are 55 IP after Dickey in games (like in 2012) there would be a 16.6% reduction in runs given up in those 55 innings. If the bullpen’s ERA is 4.20 without Dickey it can be expected to be 3.50 after Dickey. Over 55 IP this difference would save 4.3 runs (or 0.4 WAR).

Combine this with the saved starter runs and you get 11.9 runs saved (or 1.2 WAR). This is the underlying value Dickey creates for his team by baffling hitters. This 1.2 WAR assumes Morrow normally has a 3.00 ERA and the bullpen a 4.20 ERA. If Morrow normally had a 4.00 ERA, then his ERA would drop to 3.54 over the season, with 10.2 runs saved over 200 innings (1.0 WAR), and if the bullpen normally has a 4.00 ERA as well, 4.1 runs would be saved there, for a total of 14.3 runs saved, or 1.4 WAR, over a season.
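Here is a short sketch of the bullpen side of that calculation and the two combined totals. The function name is made up for illustration, and the starter runs are taken as given from the figures above rather than recomputed.

```python
RELIEVER_ERA_EFFECT = -0.1661  # total ERA change for relievers following Dickey
INNINGS_AFTER_DICKEY = 55      # relief innings behind Dickey, as in 2012


def bullpen_after_dickey(base_era: float) -> tuple[float, float]:
    """Return (adjusted ERA, runs saved) over the relief innings that follow Dickey."""
    new_era = base_era * (1.0 + RELIEVER_ERA_EFFECT)
    saved = (base_era - new_era) * INNINGS_AFTER_DICKEY / 9.0
    return new_era, saved


era_420, saved_420 = bullpen_after_dickey(4.20)
era_400, saved_400 = bullpen_after_dickey(4.00)

# Starter runs saved, copied from the text (3.00 ERA and 4.00 ERA scenarios).
STARTER_RUNS_SAVED = {"3.00 starter": 7.6, "4.00 starter": 10.2}

total_low = STARTER_RUNS_SAVED["3.00 starter"] + saved_420
total_high = STARTER_RUNS_SAVED["4.00 starter"] + saved_400

print(round(era_420, 2), round(saved_420, 1), round(total_low, 1))
print(round(era_400, 2), round(saved_400, 1), round(total_high, 1))
```

The first scenario pairs a 3.00 ERA starter with a 4.20 ERA bullpen and the second pairs a 4.00 ERA starter with a 4.00 ERA bullpen, which is why the two bullpen savings differ slightly.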


Johnny B. Goode

Controlling the run game, pitcher fielding and ERA

Run & Glove

Johnny Cueto has been mocking his peripherals ever since his big league debut.  For the most part FIP serves as a terrific gauge for pitcher performance, but in 2011 Cueto made FIP look like a heart monitor trying to explain the weather.  On what most consider a separate note, base runners have a healthy and robust fear of Cueto’s pickoff move, which is one of the best in the show.

FIP measures outcomes a pitcher can control (home runs, walks and strikeouts) and chalks the rest up to random variation.  Studies have shown that stolen bases contribute relatively little to run creation and perhaps on that basis the ability to control the run game has generally been ignored or deemed overrated.

It is difficult, however, to ignore the six runs Cueto saved the Reds via his contributions to controlling the run game in 2012.  By contrast, A.J. Burnett’s inability to control runners cost the Pirates four runs. The typical scale is that 10 runs amount to one team win – and teams will pay about $5 million per win.

Acknowledging run game control cannot fully explain how Cueto has routinely outperformed his peripherals, just as it cannot wholly explain Pittsburgh’s inability to keep pace with Cincinnati in the NL Central last season.  It does, however, get us closer.

Incorporating a pitcher’s fielding ability proved comparably important in explaining and predicting performance.  Here we’ll turn to Mark Buehrle, whose glove has saved four runs per year since 2004; among fellow hurlers, the fast-working lefty has been one of the decade’s most steadily superb fielders.  FIP underestimated Buehrle in eight of the past nine seasons, slighting his ERA by an average of .30 per year over that span.

Numbers

 The numbers indicate that a pitcher’s defense and ability to control the run game should both be considered in assessing and forecasting the pitcher’s value.

Focusing on seasons in which pitchers hurled 100 same-league (AL or NL) innings from 2003-2012 (n=1400), I ran a multiple linear regression to create a formula (“MBRA”) incorporating run control (rSB) and pitcher fielding (rPM) on top of line drive and infield fly ball percentages (credit to BABIP guru Steve Staude) and a regressed take on FIP.

MBRA = (55.25*HR + 14.05*BB – 8.57*K)/TBF – .041*rPM – .056*rSB + (5.71*LD – 8.27*IFFB)/(LD+GB+FB) + 2.34 
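To make the formula concrete, here is a minimal sketch that evaluates it for one pitcher-season. The function and argument names are my own, the sample season is entirely hypothetical, and the inputs are assumed to be raw counts (with rPM and rSB in runs) exactly as the formula is written.

```python
def mbra(hr, bb, k, tbf, rpm, rsb, ld, iffb, gb, fb):
    """Evaluate the MBRA formula term by term for one pitcher-season."""
    # Regressed take on FIP: home runs, walks and strikeouts per batter faced.
    fip_like = (55.25 * hr + 14.05 * bb - 8.57 * k) / tbf
    # Credit for runs saved with the glove (rPM) and by controlling the run game (rSB).
    fielding_and_run_game = -0.041 * rpm - 0.056 * rsb
    # Batted-ball component: line drives hurt, infield flies help.
    batted_ball = (5.71 * ld - 8.27 * iffb) / (ld + gb + fb)
    return fip_like + fielding_and_run_game + batted_ball + 2.34


# Hypothetical season, for illustration only.
sample = dict(hr=20, bb=50, k=170, tbf=850, rpm=4, rsb=6,
              ld=130, iffb=20, gb=280, fb=200)

with_defense = mbra(**sample)
without_defense = mbra(**{**sample, "rpm": 0, "rsb": 0})

print(round(with_defense, 2), round(without_defense - with_defense, 2))
```

In this made-up example, four runs saved with the glove and six saved against the run game lower the estimate by half a run.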

Estimator    Correlation    Mean Absolute Error

MBRA         .7750          .4570
FIP          .7647          .4697
BERA         .7477          .4922
tERA         .7472          .5616
MBRAT        .7216          .5394
xFIP         .6451          .5649
SIERA        .6290          .5768
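For readers unfamiliar with the two columns, here is a minimal sketch of how each could be computed for one estimator against actual ERA. It assumes the correlation is the ordinary Pearson coefficient, which the table does not state, and the four pitcher-seasons are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_absolute_error(estimates, actuals):
    """Average size of the miss, ignoring its direction."""
    return sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(actuals)


# Hypothetical estimator values and actual ERAs, for illustration only.
estimated = [3.0, 3.5, 4.0, 4.5]
actual_era = [3.2, 3.3, 4.4, 4.3]

r = pearson(estimated, actual_era)
mae = mean_absolute_error(estimated, actual_era)
print(round(r, 4), round(mae, 4))
```

A higher correlation and a lower mean absolute error both indicate a better estimator, which is why the rows are read in opposite directions.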

 

MBRA is engineered to properly credit pitchers who can field and control the run game.  When I subtracted MBRA from FIP to locate the pitcher-seasons that benefited most from my formula, I was encouraged to see Buehrle show up twice in the top ten, and five times in the top 100 (again, that is out of 1400).

Next, I looked at seasons in which pitchers threw 100 same-league innings in consecutive seasons from 2003 to 2012 (n=791).  This time I ran a regression to create a model suited to predict a pitcher’s ERA based on his previous year’s statistics.

MBRAT = (20.12*HR + 7.13*BB – 6.7*K)/TBF -.025*rPM -.034*rSB + 2.37*ZC% + 2.22

 

Estimator    Correlation    Mean Absolute Error

MBRAT        .4526          .6498
SBERA        .4398          .6582
BERA         .4347          .6634
xFIP         .4220          .6803
MBRA         .4198          .6987
FIP          .4162          .7024
ERA          .3630          .7920

MBRAT stands tall on the lofty pinnacle of public forward-looking ERA estimators, and if you factor in the percentage at which pitchers throw over the edge of the plate (Edge%), its correlation jumps even higher (.4621).  Unfortunately, I only have Edge% data from 2008 to 2012 (n=362) and cannot yet justify its inclusion.

On Deck

I will create expectations for pitchers with fewer innings pitched and convert my findings to a WAR measure that may serve as a middle-ground between fWAR and rWAR.  I also stumbled on a potentially significant relationship between pick-off attempts and strand rates that may work its way into future formulas.


Evaluating 2012 Projections

Hello loyal readers.  It’s time for the annual evaluation of last year’s player projections.  Last year saw Gore, Snapp, and Highly’s Aggpro forecasts win among hitter projections (http://www.fangraphs.com/community/comparing-2011-hitter-forecasts/) and Baseball Dope win among pitchers (http://www.fangraphs.com/community/comparing-2011-pitcher-forecasts/).  In general, projections computed using averages or weighted averages tended to perform best among hitters, while for pitchers, structural models computed using “deep” statistics (K/9, HR/FB%, etc.) did better.

2012 Summary

In 2012, there were 12 projections submitted for hitters and 12 for pitchers (11 submitted projections for both).  The evaluation only considers players where every projection system has a projection.



Introducing BERA: Another ERA Estimator to Confuse You All

Coming up with BERA… like its [almost] namesake might say, it was 90% mental, and the other half was physical.  OK, maybe he’d say something more along the lines of “what the hell is this…” but that’s beside the point.    By BERA, I mean BABIP-estimating ERA (or something like that… maybe one of you can come up with something fancier).  It’s an ERA estimator that’s along the lines of SIERA, only it’s simpler, and—dare I say—better.

You know, I started out not knowing where I was going, so I was worried I might not get there.  As you may recall, I’ve been pondering pitcher BABIPs for a little while here (see article 1 and article 2), and whereas my focus thus far had been on explaining big-picture, long-term BABIP stuff in terms of batted ball data, one question that remained was how well this info could be used to predict future BABIPs.  After monkeying around with answering that question, though, I saw that SIERA’s BABIP component could be improved upon, so I set to work in coming up with BERA.  In doing so, I definitely piggybacked off of FIP and a little of what SIERA had already done.  You can observe a lot just by watching, you know.   I’m also a believer in “less is more” (except for when it comes to the size of my articles, obviously), so I tried to go for the best compromise of simplicity and accuracy that I could.



Is Rebuilding Worth It?

Every year the least competitive MLB teams decide whether they will commit to “going for it” the next season, or take a step back and wait for some of their cost-controlled young players to develop into big league contributors, then invest money in the team at that time a year or two down the road.  If the situation is dire, the media and baseball executives alike will start kicking the tires on an organization needing an all-out rebuild.  In this case, teams trade away every expensive, though often productive, veteran for young prospects that can hopefully help form a more competitive and sustainable team in a few years in part due to a higher production to salary ratio.  A judgment is made that investing money into the major league portion of the organization will not yield worthwhile results in the upcoming seasons, leading to declining attendance and television ratings.  That money would be better spent on the draft and developing the players acquired through trades of the more expensive players on the team.  These often publicly announced plans usually have estimated times to completion ranging from 3-5 years, often coinciding with a new baseball executive’s contract length within a year or two.  I set out to measure the results of this strategy as it applies to total revenue, as well as how it works out in terms of return on investment.
