
Another Highly Unimportant Stat: Pitcher Craftiness

December 23, 2013

In this post on measuring a player’s scrappiness, commenter Eric Garcia said “Next up, measuring a pitchers’ craftiness.” I liked this idea and thought I would give it a shot. Of course, the first problem is deciding what makes a pitcher “crafty”. Eric Garcia gave his suggestions and we will look at them eventually. I, however, thought about pitchers that came to my mind when the word “crafty” is used and looked at what they had in common. Generally, they do not have an overpowering fastball and don’t throw it that often. They usually don’t have that many strikeouts, but also don’t walk that many, so they still have a decent WHIP. The perception is that they are good at pitching out of jams, either by inducing ground-ball double plays or popups.

There were 81 pitchers that qualified for the ERA title in 2013. I found the average of this group in four categories: fastball velocity, strikeout percentage, WHIP, and LOB%. For each player I calculated how many standard deviations from the mean they were in each of these categories. I then summed these up (using the negatives for fastball velocity, strikeout percentage, and WHIP). Though “crafty” often seems to be used as a synonym for “left-handed”, I feel that you should be able to be crafty with either hand, so I did not use handedness at all. I considered using fastball percentage instead of velocity, but felt velocity better captured what we are looking for. Pitchers I think of as crafty seem to often outperform their FIP, so I considered using ERA-FIP, but felt that since the outperformance is often the result of a low strikeout rate and generally good WHIP, that it was already taken into account. The numbers are not league adjusted, so National League pitchers get a slight advantage. So, using these criteria, here are the 2013 leaders in craftiness:

Name	Craftiness Score
Bronson Arroyo	4.70
R.A. Dickey	4.44
Hisashi Iwakuma	4.03
Bartolo Colon	3.80
Kyle Lohse	3.65
Mark Buehrle	3.38
Travis Wood	3.20
Mike Leake	2.55
A.J. Griffin	2.50
Dillon Gee	2.38
Zack Greinke	2.25
Eric Stults	2.03
Kris Medlen	1.94
Clayton Kershaw	1.89
Hyun-Jin Ryu	1.86
Jeremy Guthrie	1.68
Julio Teheran	1.60
Kevin Correia	1.43
Hiroki Kuroda	1.39
Chris Tillman	1.30
Cliff Lee	1.26
Ervin Santana	1.26
Mike Minor	1.24
Jhoulys Chacin	1.22
Andy Pettitte	1.11
Doug Fister	1.04
John Lackey	0.94
Jose Quintana	0.83
Jarrod Parker	0.79
James Shields	0.77
Miguel Gonzalez	0.73
Adam Wainwright	0.72
Madison Bumgarner	0.68
Wade Miley	0.69
Scott Feldman	0.64
Jorge de la Rosa	0.55
Jeff Locke	0.47
Patrick Corbin	0.44
Jordan Zimmermann	0.35
Ricky Nolasco	0.01
Dan Haren	-0.03
Matt Cain	-0.13
Shelby Miller	-0.23
Yu Darvish	-0.32
Jose Fernandez	-0.35
Chris Sale	-0.39
Cole Hamels	-0.47
Mat Latos	-0.50
Andrew Cashner	-0.57
Justin Masterson	-0.55
Kyle Kendrick	-0.64
Felix Hernandez	-0.77
Anibal Sanchez	-0.86
Matt Harvey	-0.94
C.J. Wilson	-0.89
Jon Lester	-0.93
Jerome Williams	-1.00
Max Scherzer	-1.05
David Price	-1.05
Rick Porcello	-1.04
Ryan Dempster	-1.09
Yovani Gallardo	-1.10
Gio Gonzalez	-1.16
Homer Bailey	-1.32
Joe Saunders	-1.28
Derek Holland	-1.38
Ubaldo Jimenez	-1.42
Jeremy Hellickson	-1.83
Felix Doubront	-1.85
Tim Lincecum	-1.88
Ian Kennedy	-1.94
Justin Verlander	-2.12
Stephen Strasburg	-2.17
Bud Norris	-2.19
CC Sabathia	-2.20
Lance Lynn	-2.26
A.J. Burnett	-2.35
Jeff Samardzija	-3.60
Wily Peralta	-3.64
Edwin Jackson	-4.26
Edinson Volquez	-4.84
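
The scoring described above (standard deviations from the group mean, summed, with the sign flipped for fastball velocity, strikeout percentage, and WHIP) can be sketched roughly as follows. This is only an illustration: the three pitchers and their numbers are made up, and the use of the population standard deviation is an assumption, since the post does not say which version was used.

```python
from statistics import mean, pstdev

# Hypothetical rows: (name, fastball mph, K%, WHIP, LOB%).
# These values are invented for illustration; they are not the 2013 data.
pitchers = [
    ("Soft Tosser", 86.5, 15.0, 1.15, 78.0),
    ("Average Arm", 91.0, 20.0, 1.25, 73.0),
    ("Flamethrower", 95.5, 26.0, 1.35, 70.0),
]

# -1 means "lower is craftier", +1 means "higher is craftier".
SIGNS = (-1, -1, -1, +1)  # velocity, K%, WHIP, LOB%


def craftiness_scores(rows, signs=SIGNS):
    """Sum signed z-scores across the four categories for each pitcher."""
    columns = list(zip(*[row[1:] for row in rows]))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]  # assumption: population SD
    scores = {}
    for name, *values in rows:
        z = [(v - m) / s for v, m, s in zip(values, means, sds)]
        scores[name] = sum(sign * zi for sign, zi in zip(signs, z))
    return scores


scores = craftiness_scores(pitchers)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}\t{score:.2f}")
```

Dropping a category, as in the no-strikeout version, just means removing that column and its sign from the inputs.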

Considering the model used here, Bronson Arroyo being on top is not really a surprise (though I really thought Dickey would probably wind up on top and he would have easily if I had used fastball percentage instead of fastball velocity). Now some people might protest that a low strikeout rate should not be required. They would argue that it is certainly possible that a pitcher might still be considered crafty and have a fair number of strikeouts. If we remove the strikeout percentage from the stat, we get the following:

Name	Craftiness Score
Hisashi Iwakuma	4.34
R.A. Dickey	3.99
Bronson Arroyo	3.40
Clayton Kershaw	3.28
Yu Darvish	2.88
A.J. Griffin	2.63
Bartolo Colon	2.56
Travis Wood	2.51
Cliff Lee	2.53
Kyle Lohse	2.47
Zack Greinke	2.37
Mark Buehrle	2.22
Julio Teheran	2.06
Madison Bumgarner	1.83
Hyun-Jin Ryu	1.73
Mike Minor	1.71
Kris Medlen	1.67
Dillon Gee	1.53
Chris Tillman	1.56
Jose Fernandez	1.52
Adam Wainwright	1.39
Mike Leake	1.29
Chris Sale	1.10
Max Scherzer	1.09
John Lackey	1.06
Matt Harvey	0.98
James Shields	0.91
Hiroki Kuroda	0.88
Ervin Santana	0.89
Anibal Sanchez	0.87
Eric Stults	0.74
Felix Hernandez	0.75
Jose Quintana	0.70
Patrick Corbin	0.57
Shelby Miller	0.59
Doug Fister	0.46
Justin Masterson	0.46
Dan Haren	0.14
Andy Pettitte	0.09
Cole Hamels	0.03
Jhoulys Chacin	-0.02
Matt Cain	-0.01
Wade Miley	-0.03
Jordan Zimmermann	-0.04
Scott Feldman	-0.11
Miguel Gonzalez	-0.11
Ricky Nolasco	-0.13
Jarrod Parker	-0.18
Jeff Locke	-0.21
Ubaldo Jimenez	-0.25
Mat Latos	-0.26
Jeremy Guthrie	-0.30
Gio Gonzalez	-0.37
Kevin Correia	-0.45
Homer Bailey	-0.51
Jorge de la Rosa	-0.60
Stephen Strasburg	-0.67
C.J. Wilson	-0.83
A.J. Burnett	-0.90
Ryan Dempster	-1.01
David Price	-1.01
Jon Lester	-1.09
Andrew Cashner	-1.08
Derek Holland	-1.16
Tim Lincecum	-1.24
Rick Porcello	-1.30
Justin Verlander	-1.31
Yovani Gallardo	-1.55
Lance Lynn	-1.57
Ian Kennedy	-1.93
Felix Doubront	-2.04
Kyle Kendrick	-2.32
Jeremy Hellickson	-2.37
Jerome Williams	-2.41
CC Sabathia	-2.49
Bud Norris	-2.53
Jeff Samardzija	-2.82
Joe Saunders	-3.14
Wily Peralta	-4.70
Edwin Jackson	-5.03
Edinson Volquez	-5.40

When the poster Eric Garcia suggested this, his idea of a crafty pitcher was someone with a low velocity, high ERA, and a decent number of wins. If we use those criteria and the same methodology, we come up with the following list:

Name	Craftiness Score
R.A. Dickey	6.809175606
Mark Buehrle	4.7547704381
Bronson Arroyo	3.2944617169
Joe Saunders	2.9423646195
Jeremy Hellickson	2.6685500615
CC Sabathia	2.4966613422
Eric Stults	2.7180128884
A.J. Griffin	2.4076452673
Doug Fister	2.2427676408
Dan Haren	2.1691672291
Adam Wainwright	1.7071128009
Kyle Kendrick	1.7118241134
C.J. Wilson	1.5737716721
Jeremy Guthrie	1.4387583548
Chris Tillman	1.4439760517
Rick Porcello	1.459202682
Edinson Volquez	1.2576799698
Bartolo Colon	1.6239711996
Jorge de la Rosa	1.4181128961
Max Scherzer	1.1306105901
Kris Medlen	1.4880878755
Jhoulys Chacin	1.4133629431
Yovani Gallardo	1.1947402807
Lance Lynn	1.0099538962
Felix Doubront	1.1495573185
Scott Feldman	1.1974157822
Ricky Nolasco	1.1042531489
Dillon Gee	1.1994224084
Ryan Dempster	1.1677938881
Tim Lincecum	1.035870434
Andy Pettitte	1.1821092279
Mike Leake	1.05406572
Jordan Zimmermann	0.5825671124
Jon Lester	0.5408497347
Ian Kennedy	0.6776977315
Jarrod Parker	0.4623781624
Justin Masterson	0.3885353357
Hyun-Jin Ryu	0.4892567298
Mike Minor	0.3742320945
Hisashi Iwakuma	0.4643968873
Kevin Correia	0.2593286667
Patrick Corbin	0.0564390189
Kyle Lohse	0.312739265
Julio Teheran	0.0997486611
Cliff Lee	0.0886564906
Miguel Gonzalez	-0.0925431019
Jerome Williams	-0.256429514
Edwin Jackson	-0.4285314635
Ubaldo Jimenez	-0.2221254921
Bud Norris	-0.5000332264
Jeff Locke	-0.2451919386
Mat Latos	-0.5647784102
Zack Greinke	-0.4470782733
Wade Miley	-0.5363196771
Travis Wood	-0.3273302268
James Shields	-0.705666201
Justin Verlander	-0.8883247518
Shelby Miller	-0.9631708917
Matt Cain	-0.7250659316
Wily Peralta	-1.1640247285
Hiroki Kuroda	-0.7950288123
Madison Bumgarner	-0.7855967316
John Lackey	-0.9654585733
Felix Hernandez	-1.0396358378
Jose Quintana	-1.1617514899
Gio Gonzalez	-1.3356468354
Yu Darvish	-1.5340675857
Anibal Sanchez	-1.598691562
Cole Hamels	-1.4973933151
A.J. Burnett	-1.7115883636
Clayton Kershaw	-1.6983975443
Homer Bailey	-1.9877439854
Jeff Samardzija	-2.0853342328
Chris Sale	-2.0119349525
Derek Holland	-2.1558326826
Ervin Santana	-2.0875298917
David Price	-2.224336605
Andrew Cashner	-3.1088119908
Jose Fernandez	-3.8720417313
Stephen Strasburg	-4.3734432885
Matt Harvey	-5.3067683524

I doubt these numbers have any real value; they are presented here just for entertainment. What do you think makes a pitcher crafty? Let me know in the comments.

xHitting (Part 2): Improved Model, Now with 2013 Leaders/Laggards

by samyoung

December 23, 2013

Happy holidays, all. It took me a while, but I finally have the second installment of xHitting ready. First off, thank you to all those who read/commented on the first piece. For those who didn’t get a chance to read it, the goal here is to devise luck-neutralized versions of popular hitter stats, like OPS or wOBA. A main extension over existing xBABIP calculators is that this approach offers an empirical basis to recover slugging and ISO, by estimating each individual hit type.

I’ve returned today with an improved version of the model. Highlights:

One more year of data (now 2010-2013)
Now includes batted-ball direction (all player-seasons with at least 100 PA)
FB distance now recorded for all player-seasons with at least 100 PA

(There’s no theoretical reason for the 100 PA cutoff, only that I was grabbing some of the new data by hand and couldn’t justify the time to fetch literally every single player.)

I have also relaxed the uniformity of peripherals used for each outcome. At least one reader asked for this, and after thinking about it a while, I decided I agree more than I disagree. The main advantage of imposing uniformity was that it ensures the predicted rates (when an outs model is also included) sum to 100%. But it is true that there are certain interactions or non-linearities that are important for some outcomes, but not others. Including these where they don’t fully belong has a cost to standard errors/precision, and to intuitive interpretation. To ensure rates still sum to 100%, there’s no longer an explicit ‘outs’ model; outs are simply assumed to be the remainder.

For those curious, below I display regression results for each outcome and its respective peripherals. You can otherwise skip below if these are not of direct interest.

(The sample includes all player-years with at least 100 plate appearances between the 2010 and 2013 MLB seasons. Park factors denote outcome-specific park factors available on FanGraphs. Robust standard errors, clustered by player, are in parentheses; *** p<0.01, ** p<0.05, * p<0.1)

The new variables seem to help, as each outcome is now modeled more accurately than before (by either R² or RMSE). For comparison, here are the R² values of the original specification:

0.367 for singles rate
0.236 for doubles rate
0.511 for triples rate
0.631 for HR rate

Something else I noticed: for balls that stay “inside the fence,” both pull/opp and actual side of the field matter. Consider singles: the ball needs to be thrown to 1st base (right side of infield) specifically. Thus an otherwise-equivalent ball hit to the left side is not the same as one hit to the right side, since the defensive play is harder to make from the left side. Similarly, hitting the ball to left field is less conducive for triples than hitting the ball to right field.

But hitting the ball to the left side as a lefty is not the same as hitting it there as a righty, since one group is “pulling” while the other group is “slapping.” The direction x handedness interactions help account for this.

How well do the predicted rates do in forecasting? For singles, doubles, and triples, the predicted rates do unambiguously better than realized rates in forecasting next season’s rates. Things are a little less clear for home runs, which I will expand on below.

Although predicted HR rate shows a slight edge in Table 1, the pattern often reverses (for HR only) if you use a different sample restriction — say requiring 300 PA in the preceding season. (For other outcomes, the qualitative pattern from Table 1 still holds even under alternative sample restrictions.)

So home runs appear to be a potential problem area. What should we do when we need HR to compute xAVG/xSLG/xOPS/xWOBA, etc.? Should we:

Use predicted HR anyway?
Use actual HR instead?
Use some combo of actual and predicted HR?

Empirically there is a clear answer for which choice is best. But before getting to that, let’s take a look at whether predicted home-run rate tells us anything at all in terms of regression. That is, if you’ve been hitting HR’s above/below your “expected” rate, do you tend to regress toward the prediction?

The answer to this seems to be “yes,” evidenced by the negative coefficient on ‘lagged rate residual’ below.

So, although realized HR rate is sometimes a better standalone forecaster of future home runs, predicted HR rate is still highly useful in predicting regression. Making use of both, it seems intuitively best to use some combo of actual and predicted HR rate for forecasting.

This does, in fact, seem to be the best option empirically. And this is true whether your end outcome of interest is AVG, OBP, SLG, ISO, OPS, or wOBA.

Observations:

(Option 1 = predicted HR only; Option 2 = actual HR only; Option 3 = combo)
Whether you use option 1, 2, or 3, xAVG and xOBP make better forecasters than actual past AVG or OBP
Option 1 does not do well for SLG, ISO, OPS, or wOBA
^This was not the case in the previous article, but results to that point had sort of a funky sample, having recorded flyball distance only for a partial list of players
Option 2 “saves” things for xOPS and xWOBA, but still isn’t best for SLG or ISO
Option 3 makes the predicted version better for any of AVG, OBP, SLG, ISO, OPS, or wOBA

End takeaways:

The original premise that you can use “expected hitting,” estimated from peripherals, to remove luck effects and better predict future performance seems to be true; but you might need to make a slight HR adjustment.
The main reason I estimate each hit type individually is for the flexibility it offers in subsequent computations. Whether you want xAVG, xOPS, xWOBA, etc., you have the component pieces that you need. This would not be true if I estimated just a single xWOBA, and other users prefer xOPS or xISO.
A major extension over existing xBABIP methods is that this offers an empirical basis to recover xSLG. The previous piece actually provides more commentary on this.
Natural next steps are to test partial-season performance, and also whether projection systems like ZiPS can make use of the estimated luck residuals to become more accurate.
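
To show why having the component pieces is convenient, here is a minimal sketch of turning hit-type rates into xAVG, xSLG, and xISO. It assumes the rates are expressed per at-bat, which the post does not specify, and the `weight` used to mix actual and predicted HR rate is a hypothetical placeholder, not a fitted value from the model. All input numbers are illustrative.

```python
def blend_hr(actual_hr, predicted_hr, weight=0.5):
    """Mix actual and predicted HR rate; the weight is a hypothetical choice."""
    return weight * actual_hr + (1 - weight) * predicted_hr


def expected_line(x1b, x2b, x3b, xhr):
    """Build xAVG, xSLG, and xISO from per-at-bat hit-type rates."""
    xavg = x1b + x2b + x3b + xhr
    xslg = x1b + 2 * x2b + 3 * x3b + 4 * xhr
    return {
        "outs": 1.0 - xavg,  # outs are simply the remainder
        "xAVG": xavg,
        "xSLG": xslg,
        "xISO": xslg - xavg,
    }


# Illustrative rates only, not output from the actual regressions.
hr = blend_hr(actual_hr=0.050, predicted_hr=0.040)
line = expected_line(x1b=0.160, x2b=0.050, x3b=0.005, xhr=hr)
for stat, value in line.items():
    print(f"{stat}: {value:.3f}")
```

Setting `weight=0` corresponds to Option 1 (predicted HR only) and `weight=1` to Option 2 (actual HR only).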

Finally, I promised to list the leading over- and underachievers for the 2013 season. By xWOBA, they are as follows:

Overachievers (250+ PA)				Underachievers (250+ PA)
Name	2013 wOBA	2013 xWOBA	Difference	Name	2013 wOBA	2013 xWOBA	Difference
Jose Iglesias	0.327	0.259	0.068	Kevin Frandsen	0.286	0.335	-0.049
Yasiel Puig	0.398	0.338	0.060	Alcides Escobar	0.247	0.296	-0.049
Colby Rasmus	0.365	0.315	0.050	Todd Helton	0.322	0.369	-0.047
Ryan Braun	0.370	0.321	0.049	Ryan Hanigan	0.252	0.296	-0.044
Ryan Raburn	0.389	0.344	0.045	Darwin Barney	0.252	0.296	-0.044
Mike Trout	0.423	0.379	0.044	Edwin Encarnacion	0.388	0.429	-0.041
Junior Lake	0.335	0.292	0.043	Josh Rutledge	0.281	0.319	-0.038
Matt Adams	0.365	0.323	0.042	Wilson Ramos	0.337	0.374	-0.037
Justin Maxwell	0.336	0.295	0.041	Yuniesky Betancourt	0.257	0.294	-0.037
Chris Johnson	0.354	0.314	0.040	Brian Roberts	0.309	0.345	-0.036

Comments/suggestions?

A Different Look at the Hall of Fame Standard

by Stats All Folks

December 21, 2013

I’m writing this as a response to Dave Cameron’s two articles on December 19 and 20 concerning the Hall of Fame. While I completely understand the point Dave is/was trying to make in both pieces, I felt that his methodology was slightly flawed and perhaps deserved a fresh look. As mentioned multiple times in the comments section on both articles, the data he used included players that were elected via the Veterans Committee. Also included were players elected by the Negro Leagues Committee. The purpose of this post is to look at players elected strictly by the BBWAA. That list includes 112 inductees, the most recent of whom is Barry Larkin.

Using the data Dave listed in his follow-up article that limits the player pool to either 5000 PA or 2000 IP, we get the following results:

Year of Birth	“Eligible Players”	Elected Players	Percentage
<1900	258	20	7.8%
1900-1910	93	16	17.2%
1911-1920	66	10	15.2%
1921-1930	77	8	10.4%
1931-1940	99	22	22.2%
1941-1950	168	15	8.9%
1951-1960	147	19	12.9%
1961-1970	160	2	1.3%

If you combine all the data, you get 112 elected players out of 1068 “eligible” players. That works out to 10.5% of the eligible population being inducted. If we remove the 1961-1970 births, it’s 110 elected out of 908 eligible, or 12.1%. If we try and bring the 1961-1970 total up to the overall average, that would mean ~17 inductees. To reach pre-1961 levels, we need ~19 inductees. To reach the lowest percentage of induction, we need a total of ~12 inductees. To reach the highest percentage, we need a total of ~36 inductees. I think it is safe to assume that, with the scrutiny given by Hall voters to the Steroid Era, the possibility of 36 inductees is nearly zero.
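
The arithmetic in the paragraph above can be reproduced from the table. The sketch below uses the table's counts directly; the variable names are just for illustration, and the "lowest" and "highest" rates are taken over the pre-1961 cohorts only.

```python
# (eligible, elected) by birth cohort, copied from the table above.
cohorts = {
    "<1900": (258, 20),
    "1900-1910": (93, 16),
    "1911-1920": (66, 10),
    "1921-1930": (77, 8),
    "1931-1940": (99, 22),
    "1941-1950": (168, 15),
    "1951-1960": (147, 19),
    "1961-1970": (160, 2),
}

total_eligible = sum(elig for elig, _ in cohorts.values())
total_elected = sum(elec for _, elec in cohorts.values())
overall_rate = total_elected / total_eligible

# Rates for the cohorts born before 1961.
earlier = {k: v for k, v in cohorts.items() if k != "1961-1970"}
pre_1961_rate = sum(e for _, e in earlier.values()) / sum(n for n, _ in earlier.values())
cohort_rates = [elec / elig for elig, elec in earlier.values()]

# Inductees the 1961-1970 cohort would need to match each benchmark.
pool = cohorts["1961-1970"][0]
needed = {
    "overall average": round(pool * overall_rate),
    "pre-1961 average": round(pool * pre_1961_rate),
    "lowest cohort": round(pool * min(cohort_rates)),
    "highest cohort": round(pool * max(cohort_rates)),
}

print(f"overall: {overall_rate:.1%}, pre-1961: {pre_1961_rate:.1%}")
for label, count in needed.items():
    print(f"{label}: ~{count} inductees")
```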

Dave also listed six players that he felt would surely get inducted in the coming years. That list included Greg Maddux, Ken Griffey Jr., Randy Johnson, Mariano Rivera, Tom Glavine, and Craig Biggio. If we include those six with the two already elected from the era (Barry Larkin and Roberto Alomar), the Hall would only need to elect four more members from the era to reach the current lowest standard. I would think that John Smoltz has a pretty persuasive case for the Hall of Fame as well, being the only pitcher with 200 wins and 150 saves. Also, Smoltz is one of the 16 members of the 3000 Strikeout Club. That list includes 10 current Hall of Famers (all elected by BBWAA). The other members not currently inducted include Smoltz, Roger Clemens, Randy Johnson, Curt Schilling, Pedro Martinez, and Greg Maddux. Dave already included Johnson and Maddux on his list of “should be in” Hall of Famers. Martinez was born in 1971, so he isn’t included in this discussion. That leaves Smoltz, Schilling, and Clemens. Clemens’ story doesn’t need to be rehashed at this point, and Schilling received 38.8% of the vote on his first ballot last year. Also, simply looking at traditional stats, you have to think Frank Thomas has a strong case as well (521 HR, .301 BA).

Another point I wanted to bring up involves the ages of the players elected by the BBWAA. The average age of a player elected is 49.7 years, with the median age being 48. The data gets skewed a bit by pre-1900s players (as the first election wasn’t until 1936) and by extremely young inductees like Lou Gehrig, Roberto Clemente, and Sandy Koufax. Gehrig was elected by a special ballot the year he retired after being diagnosed with ALS. Clemente was elected a year after his death. Both were elected before the five-year retirement period required for most players elapsed. Koufax only played 11 years in MLB, a remarkably short time for a Hall of Famer.

If we use the ~50 year average age of election though, anyone born in 1964 or after still “has a decent chance” at election. If we figure an even distribution of eligible players born each year between 1961-1970, that means 60% of eligible players, or 96, still can make a case. That becomes 90 if we take out Maddux, Glavine, Griffey, Rivera, Johnson, and Biggio. As I stated earlier, they only need to elect four more to reach previously seen levels of induction. 4/90 is only 4.4% needed. That list of 90 players also doesn’t include still eligible players such as Don Mattingly, Roger Clemens, Edgar Martinez, Fred McGriff, and Mark McGwire.

I’m not trying to take a stand on either side of the PED Hall of Fame discussion. I’m just trying to point out that maybe the Hall of Fame isn’t being that much more stringent with eligible players than it has been throughout history. Just something to think about.

Mark Trumbo, Pedro Alvarez, and Perception

by ncarrington

December 20, 2013

We have come a long way in evaluating players and yet, perception still clouds our judgment. Perception awarded Derek Jeter several Gold Gloves during years where he was a poor defensive player. Perception will likely award Nelson Cruz a hefty contract this winter. While there is no way to know for sure, I fear that perception may have played a role in the biggest trade so far this offseason: the well-documented Mark Trumbo trade.

Plenty of writers have covered why this trade looks like a poor move for the Diamondbacks so I won’t dive deeply into that. I desire to understand how Trumbo could be valued so highly (assuming the Diamondbacks feel they gave up quality for quality). Dave Cameron wrote an interesting article about how Trumbo was both overrated and underrated. He stated that Trumbo’s one great skill, breathtaking power, is a frequently overvalued skill. Kevin Towers seems to be one of those who overvalues power and made the trade based on that one skill. But is Trumbo’s power the only reason that a team might overvalue him? With this in mind, I decided to find a comparable player and at least speculate about the perception differences that may cause a team to overvalue someone like Trumbo.

That player is Pedro Alvarez. The similarities are actually quite amazing. The following table contains combined information from the 2012 and 2013 seasons, the two years that Trumbo and Alvarez were both full-time players.

2012-2013	HR	RBI	BB%	K%	ISO	BABIP	AVG	OBP	SLG	wOBA	wRC+	WAR
Mark Trumbo	66	195	7.1%	26.7%	.221	.293	.250	.305	.471	.333	114	4.7
Pedro Alvarez	66	185	8.8%	30.5%	.232	.292	.238	.307	.470	.332	112	5.4

Holy smokes! Every time I look at these numbers, I am shocked at how similar these two players were over a two-year span. Trumbo is one year older and right-handed, but that’s where the differences end. Neither gets on base much or is a great defender, but Alvarez wasn’t terrible at third in 2013. They both derive their value almost entirely from their power and strike out way too much. They are the right-handed and left-handed versions of each other from an offensive standpoint.

I’ll admit that if someone had forced me to pick between the two players before doing the research, I may have gone with Trumbo. Why does Trumbo seem to get more attention than Alvarez? Well, the markets are obviously different. Los Angeles draws a lot more attention than the finally revived corpse that is Pittsburgh baseball. What else does Trumbo have that Alvarez doesn’t? Trumbo has one giant first half in 2012 where he flashed skills he probably doesn’t have.

Pedro Alvarez’s best half of baseball was probably the first half of 2013. Alvarez hit .250/.311/.516 with 24 home runs. That is an impressive stat line, but it doesn’t show any growth in other skills outside of Alvarez’s impressive power. He didn’t get on base much more than other stretches of his career, and his average remained similar to his 2012 line of .244. He has never given anyone any reason to believe he is more than a one-trick pony.

During the first half of 2012, Trumbo hit .306/.358/.608 with 22 home runs. He was an All-Star, and some people thought he had taken a big leap forward. It was the kind of first half that can change perceptions, even though it was a small sample size. The second half proved unkind. Trumbo hit .227/.271/.359 with 10 home runs. But what a first half!

I have no idea whether Towers put any stock into Trumbo’s first half in 2012. Probably not. But it isn’t hard to see how teams could talk themselves into thinking that Trumbo has untapped potential based on that half. Regardless, the perception of Mark Trumbo as an above-average player likely comes from his undeniable power and one monster half of baseball that he has never come close to duplicating. It makes me wonder whether Towers would have given up two young players with potential for Alvarez if he had been available. Considering Alvarez is another “100-plus RBI, 30 home run guy”, he may have. But then again, he may secretly be banking on Trumbo as a real impact bat that produces in more ways than one. While there is no definitive answer to that, this comparison is another cautionary tale about overvaluing small sample sizes.

Team Construction, OBP, and the Importance of Variance

by Brandon Reppert

December 19, 2013

A recent article by ncarrington brought up an interesting point, and it’s one that merits further investigation. The basis of the article points out that even though two teams may have similar team average on-base percentages, a lack of consistency within one team will cause them to under-perform their collective numbers when it comes to run production. A balanced team, on the other hand, will score more runs. That’s our hypothesis.

How does the scientific method work again? Er, nevermind, let’s just look at the data.

In order to gain an initial understanding we’re going to start by looking at how teams fared in 2013. We’ll calculate a league average runs/OBP number that will work as a proxy for how many runs a team should be expected to score based on their OBP. And then we’ll calculate the standard deviation of each team’s OBP (weighted to plate appearances), and compare that to the league average standard deviation. If our hypothesis is true, teams with a relatively low OBP deviation will outperform their expected runs scored number.
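
For anyone who wants to follow along, here is a rough sketch of the two team-level numbers described above and of how a team gets sorted into a bucket. The function names and the sample inputs are hypothetical, and the exact weighting used in the study may differ from this plain PA-weighted standard deviation.

```python
from math import sqrt


def weighted_std(obps, plate_appearances):
    """PA-weighted standard deviation of the hitters' OBPs on one team."""
    total_pa = sum(plate_appearances)
    mu = sum(o * pa for o, pa in zip(obps, plate_appearances)) / total_pa
    var = sum(pa * (o - mu) ** 2 for o, pa in zip(obps, plate_appearances)) / total_pa
    return sqrt(var)


def runs_per_relative_obp(runs, team_obp, league_obp):
    """Runs divided by the team's OBP relative to league OBP."""
    return runs / (team_obp / league_obp)


def bucket(team_value, league_value, team_sd, league_sd):
    """Label a team as high/low variance and out/underperforming."""
    variance = "High" if team_sd > league_sd else "Low"
    result = "outperformed" if team_value > league_value else "underperformed"
    return f"{variance} variance, {result} expectations"


# Made-up inputs, only to show the mechanics.
sd = weighted_std([0.300, 0.340], [500, 500])
value = runs_per_relative_obp(runs=700, team_obp=0.330, league_obp=0.320)
print(round(sd, 3), round(value, 2))
print(bucket(value, 668.5, team_sd=0.17, league_sd=0.162))
```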

Of course, there’s a lot more to team production than OBP. We’re going to conquer that later. Bear with me–here’s 2013.

A few things to keep in mind while dissecting this chart: 668.5 is the baseline number for Runs/(OBP/LeagueOBP). Any team number above this means that they are outperforming, while any number below represents underperformance. The league average team OBP standard deviation is .162.

Team	Runs/(OBP/LeagueOBP)	OBP Standard Deviation
Royals	647.71	0.1
Rangers	710.22	0.17
Padres	632.53	0.14
Mariners	642.88	0.15
Angels	700.75	0.17
Twins	618.61	0.16
Tigers	723.95	0.12
Astros	642.5	0.15
Giants	620.1	0.15
Dodgers	627.18	0.21
Reds	673.82	0.19
Mets	638.45	0.18
Diamondbacks	668.02	0.16
Braves	675.02	0.16
Blue Jays	705.27	0.17
White Sox	622.92	0.15
Red Sox	768.53	0.19
Cubs	631.74	0.12
Athletics	738.61	0.15
Nationals	662.76	0.18
Brewers	650.02	0.16
Rays	669.46	0.18
Orioles	749.95	0.19
Rockies	689.93	0.18
Phillies	627.95	0.14
Indians	717.08	0.18
Pirates	637.87	0.17
Cardinals	744.3	0.2
Marlins	552.48	0.14
Yankees	666.17	0.14

That chart’s kind of a bear, so I’m going to break it up into buckets. In 2013 there were 16 teams that exhibited above-average variances. Of those, 11 outperformed expectations while only 5 underperformed expectations. Now for the flipside–of the 14 teams that exhibited below-average variances, only 2 outperformed expectations while a shocking 12(!) teams underperformed.

That absolutely flies in the face of our hypothesis. A startling 23 out of 30 teams suggest that a high variance will actually help a team score more runs while a low variance will cause a team to score less.

Before we get all comfy with our conclusions, however, we’re going to acknowledge how complicated baseball is. It’s so complicated that we have to worry about this thing called sample size, since we have no idea what’s going on until we’ve seen a lot of things go on. So I’m going to open up the floodgates on this particular study, and we’re going to use every team’s season since 1920. League average OBP standard deviation and runs/OBP numbers will be calculated for each year, and we’ll use the aforementioned bucket approach to examine the results.

Team Seasons 1920-2013

Result	Occurrences
High variance, outperformed expectations	504
High variance, underperformed expectations	508
Low variance, outperformed expectations	492
Low variance, underperformed expectations	538

Small sample size strikes again. Will there ever be a sabermetric article that doesn’t talk about sample size? Maybe, but it probably won’t be written by me. Anyways, the point is that variance in team OBP has little to no effect on actual results when you up your sample size to 2000+. As a side note of some interest, I wondered if teams with high variances would tend to have bigger power numbers than their low variance counterparts. High variance teams have averaged an ISO of .132 since 1920. Low variance teams? .131. So, uh, not really.

If you want to examine the ISO numbers a little more, here’s this: outperforming teams had an ISO of .144 while underperforming teams had an ISO of .120. These numbers remain the same for both high and low variance teams. It appears that overachieving/underachieving OBP expectations can be almost entirely explained by ISO.

I’m not satisfied with that answer, though. Was 2013 really just an aberration? What if we limit our samples to only teams that significantly outperformed or underperformed expectations (by 50 runs) while having a significantly large or small team OBP standard deviation?

Team Seasons 1920-2013, significant values only

Result	Occurrences
High variance, outperformed expectations	117
High variance, underperformed expectations	93
Low variance, outperformed expectations	101
Low variance, underperformed expectations	119

The numbers here do point a little bit more towards high variance leading to outperformance. High-variance teams are more likely to strongly outperform their expectations to the tune of about 20%, and the same is true for low-variance teams regarding underperforming. Bear in mind, however, that that is not a huge number, and that is not a huge sample size. If you’re trying to predict whether a team should outperform or underperform their collective means then variance is something to consider, but it isn’t the first place you should look.
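
One quick way to compare the two bucket tables is to look at the share of teams in each variance group that outperformed. The sketch below uses the counts from the tables above; the dictionary layout and function name are just illustrative choices.

```python
# Counts copied from the two tables above.
all_seasons = {
    ("high", "out"): 504, ("high", "under"): 508,
    ("low", "out"): 492, ("low", "under"): 538,
}
significant_only = {
    ("high", "out"): 117, ("high", "under"): 93,
    ("low", "out"): 101, ("low", "under"): 119,
}


def outperform_share(counts, variance):
    """Fraction of teams in a variance group that outperformed expectations."""
    out = counts[(variance, "out")]
    under = counts[(variance, "under")]
    return out / (out + under)


for label, counts in (("all", all_seasons), ("significant", significant_only)):
    high = outperform_share(counts, "high")
    low = outperform_share(counts, "low")
    print(f"{label}: high variance {high:.1%}, low variance {low:.1%}")
```

In the full sample the two shares sit within about two percentage points of each other; in the restricted sample the gap is closer to ten points, on a much smaller number of team seasons.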

Being balanced is nice. Being consistent is nice. It’s something we have a natural inclination towards as humans–it’s why we invented farming, civilization, the light bulb, etc. But when you’re building a baseball team it’s not something that’s going to help you win games. You win games with good players.

What If: The St. Louis Cardinals Were Two Teams

by JU Jazz Hands

December 18, 2013

Much has been made of the Cardinals’ amazing depth and seeming ability to pull All-Star-caliber players from their minor leagues at will.

In today’s FanGraphs After Dark chat with Paul Swydan I asked what place in the NL Central the Cardinals would finish in were they to be forced to field two separate (but equal) teams in 2014.

Swydan’s answer:

Probably third and fourth. They’re not THAT good.
Maybe even lower than that. It’s an interesting question.

Well, I too thought it was interesting and decided to try to find out.

I looked at the Oliver projections for the Cardinals and tried to divide them into equal teams. Then I did my best (well, my most efficient, it is 9 at night) to divide up playing time equally between both teams. Oliver projections assume 600 PA’s for all position players so I prorated each player’s WAR projection for the number of PA’s that I estimated (I tried to stick to 600 PA’s for each position – too much work to do otherwise).

For pitchers I used Oliver’s projected number of starts for starters and innings pitched for relievers to make sure that both teams were equal. I didn’t do any prorating for pitchers. I wanted to, but that started to look like more work than I was willing to put in right now — and I was sort of worried that Paul would do his own post on this, so I wanted to beat him to the punch.

There weren’t quite enough players projected for the Cardinals so for the missing positions I just assumed a replacement-level player.
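
The prorating step is simple enough to sketch. The example below assumes a projection built on 600 PA per position player and treats a replacement-level fill-in as 0 WAR; the player labels, WAR figures, and PA splits are all hypothetical and are not the actual projections used here.

```python
BASELINE_PA = 600  # PA assumed by the projection for every position player


def prorate_war(projected_war, estimated_pa, baseline_pa=BASELINE_PA):
    """Scale a 600-PA WAR projection to the playing time actually assigned."""
    return projected_war * estimated_pa / baseline_pa


def team_war(hitters, pitcher_war):
    """Sum prorated hitter WAR plus pitcher WAR (pitchers are not prorated)."""
    hitter_total = sum(prorate_war(war, pa) for war, pa in hitters)
    return hitter_total + sum(pitcher_war)


# Hypothetical half-roster: (projected WAR per 600 PA, estimated PA).
hitters = [
    (4.0, 450),  # regular sharing time at one position
    (1.0, 150),  # backup covering the rest of that position
    (0.0, 600),  # replacement-level fill-in for a missing position
]
pitchers = [3.0, 1.5, 0.5]

print(round(team_war(hitters, pitchers), 2))
```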

These were the teams and their projected WAR totals that I came up with.

[Table of the two rosters and their projected WAR totals not preserved.]

So each team was at about 25.5 WAR.

How about the rest of the NL Central?

For this I just looked at the STEAMER projections since they already adjust playing time and I didn’t want to have to do it for each team. This is what STEAMER had for the other NL Central teams:

Pirates 34.5 WAR
Reds 30.5 WAR
Brewers 27.6 WAR
Cubs 26.9 WAR

So, our Cardinals teams look like they’d finish just behind the rest of the NL Central, but it’s close enough that we can say that the Cardinals might literally be twice as good as the Cubs and Brewers.
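The comparison above is simple enough to check by hand, but here is a small sketch of it using only the WAR totals quoted in this post. The labels "Cardinals A" and "Cardinals B" are placeholders for the two split rosters.

```python
split_team_war = 25.5  # each half of the Cardinals, per the estimate above

rivals = {"Pirates": 34.5, "Reds": 30.5, "Brewers": 27.6, "Cubs": 26.9}

# Rank all six teams by projected WAR, best first.
all_teams = {**rivals, "Cardinals A": split_team_war, "Cardinals B": split_team_war}
standings = sorted(all_teams.items(), key=lambda item: item[1], reverse=True)

# Compare the combined Cardinals total to each rival's total.
combined_cardinals = 2 * split_team_war
ratios = {team: combined_cardinals / war for team, war in rivals.items()}

for team, war in standings:
    print(f"{team:12s} {war:5.1f}")
for team, ratio in ratios.items():
    print(f"Cardinals / {team}: {ratio:.2f}x")
```

The ratios are ratios of projected WAR only, so they say nothing about wins directly.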

Team On-Base Percentage and a Balanced Lineup

by ncarrington

December 18, 2013

Teams that get on base often score more runs than those that don’t. We know this, and it comes as no surprise. In 2013, the Red Sox had the highest team OBP (.349) and also scored the most runs in MLB. The Tigers had the second-highest team OBP (.346), and they scored the second-most runs. Team OBPs can tell us a lot about the effectiveness of an offense (obviously not everything), but they can also be misleading if proper context isn’t applied.

The Cardinals scored 783 runs in 2013, good enough for third in MLB. The rival Reds scored 698 runs, 85 fewer than the Cardinals. There are many reasons for this gap in runs scored, but I would like to examine just one of them. The Cardinals had a team OBP of .332 while the Reds had a team OBP of .327. On first look, it appears that the Cardinals and Reds got on base at a similar rate. But a major difference exists below the surface. Take a look at the chart below of the top eight hitters by plate appearances for both teams (Chris Heisey gets the nod over Ryan Hanigan so as not to have two Reds catchers on the list).

Reds OBP	Cardinals OBP
Joey Votto .435	Matt Carpenter .392
Shin-Soo Choo .423	Matt Holliday .389
Jay Bruce .329	Allen Craig .373
Todd Frazier .314	Yadier Molina .359
Brandon Phillips .310	Jon Jay .351
Devin Mesoraco .287	David Freese .340
Zack Cozart .284	Carlos Beltran .339
Chris Heisey .279	Pete Kozma .275

The difference is quite evident. The average OBP in 2013 was .318. Seven of the top eight Cardinal hitters got on base at an above-average clip. Besides the pitcher, there is one easy out in that lineup. The Cardinals maintained a ridiculous batting average with RISP, and that mattered all the more because they always had people on base.

On the other hand, the Reds had two on-base Goliaths. Joey Votto and Shin-Soo Choo camped out on the bases. They became one with the bases. The problem was that the Reds had only one more player with an above-average OBP, Jay Bruce at .329. The other five players struggled to get on base consistently. Three of them had OBPs under .300.

So while the Cardinals achieved a high team OBP through balance, the Reds had two hitters who significantly raised the team OBP. Take Votto and Choo away, and the other six Reds on this list have a combined OBP of .305. That is a staggeringly low number for six of the top hitters on a playoff team.
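The counts above can be reproduced from the chart with a few lines of code. This is a rough sketch, assuming the OBP figures in the chart and the .318 league average. Because plate appearance totals are not listed here, the averages below are simple unweighted means, so they will not exactly match a combined figure that weights each hitter by playing time.

```python
from statistics import mean

LEAGUE_OBP = 0.318  # 2013 average cited in this post

reds = {
    "Joey Votto": 0.435, "Shin-Soo Choo": 0.423, "Jay Bruce": 0.329,
    "Todd Frazier": 0.314, "Brandon Phillips": 0.310, "Devin Mesoraco": 0.287,
    "Zack Cozart": 0.284, "Chris Heisey": 0.279,
}
cardinals = {
    "Matt Carpenter": 0.392, "Matt Holliday": 0.389, "Allen Craig": 0.373,
    "Yadier Molina": 0.359, "Jon Jay": 0.351, "David Freese": 0.340,
    "Carlos Beltran": 0.339, "Pete Kozma": 0.275,
}


def above_average(team: dict[str, float]) -> list[str]:
    """Names of hitters whose OBP beats the league average."""
    return [name for name, obp in team.items() if obp > LEAGUE_OBP]


def unweighted_obp(team: dict[str, float], exclude: tuple[str, ...] = ()) -> float:
    """Simple mean of the listed OBPs (not weighted by plate appearances)."""
    return mean(obp for name, obp in team.items() if name not in exclude)


print(len(above_average(cardinals)), "Cardinals above league average")
print(len(above_average(reds)), "Reds above league average")
print(round(unweighted_obp(reds, exclude=("Joey Votto", "Shin-Soo Choo")), 3))
```

Swapping in plate appearance counts as weights would be the natural next step for anyone who wants the true combined OBP.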

What does this teach us? Well, team OBPs do not provide insight into how balanced a lineup a team has. The Reds would be foolish to think they have a lineup that gets on base enough to be an elite offense. With the loss of Choo, the Reds offense may struggle to produce runs at a league-average clip as Votto and Bruce could be stranded on base countless times.

A balanced lineup was a major factor in the Cardinals scoring the most runs in the National League. Their team may have had an excellent .332 OBP, but their top eight hitters by plate appearances had a .355 OBP. As a group they were excellent. The Red Sox were similar in that their top eight hitters by plate appearances all had above-average OBPs, with Stephen Drew coming in eighth at .333. Think about that! The Red Sox’s eighth-best hitter at getting on base was 15 points above league average.

Even though the Reds finished 6th in team OBP in 2013, their on-base skills were lacking. While the Cardinals had only a five-point advantage in team OBP over their rival, they were much more adept at clogging the bases. Team OBPs are great; they just don’t always tell the whole story.

A New Metric of High Unimportance: SCRAP

by Brandon Reppert

December 17, 2013

It’s something we hear all the time: “He’s a scrappy player” or “He’s always trying hard out there, I love his scrappiness.” Maybe chicks don’t dig the long ball anymore; maybe they’re into scrappiness. I’m not really in a position to accurately comment on what chicks dig though, so I don’t know.

Even from a guy’s perspective, scrappiness is great. It’s hard to hate guys that overcome their slim frames by just out-efforting everyone else and getting to the big leagues. It’s not easy to quantify scrappiness, though. Through the years it’s always been a quality that you know when you see, but there’s never been a number to back it up. Until now.

Scrap is scaled like Spd: 5 is average, anything above 5 is above average, and anything below 5 is below average. Here are the components that make it up (each component is put on a Spd-like scale, assigned a weight, and then combined with the other components to give a final number).

Infield hit%: Higher is better.
ISO: Lower is better. Less power means more scrappiness.
Spd: Higher is better. The ability to change a game with legs.
Balls in play%: (PA-BB-K)/PA. Higher is better. Go up there looking to fight.
zSwing%: Higher is better. Measures willingness to defend the zone.
oSwing%: Lower is better. These guys can’t hit the low and away pitch to deep center.
zContact%: Higher is better. These guys swing for contact.
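To make the recipe concrete, here is a rough sketch of how components like these could be combined. The exact scaling and weights behind Scrap are not spelled out in this post, so everything here is an assumption: each component is converted to a z-score across the player pool and shifted so that 5 is average, the sign is flipped for the lower-is-better components, and the weights are equal. The three stat lines are made up for illustration.

```python
from statistics import mean, pstdev

# +1 when a higher value is scrappier, -1 when a lower value is scrappier.
DIRECTIONS = {
    "ifh_pct": +1, "iso": -1, "spd": +1, "bip_pct": +1,
    "z_swing": +1, "o_swing": -1, "z_contact": +1,
}
# Hypothetical equal weights; the real weights are not given in the post.
WEIGHTS = {name: 1 / len(DIRECTIONS) for name in DIRECTIONS}


def balls_in_play_pct(pa: int, bb: int, k: int) -> float:
    """(PA - BB - K) / PA, as defined in the component list."""
    return (pa - bb - k) / pa


def scrap_scores(players: dict[str, dict[str, float]]) -> dict[str, float]:
    """Weighted sum of components, each rescaled so the pool average is 5."""
    scores = {name: 0.0 for name in players}
    for comp, sign in DIRECTIONS.items():
        values = [stats[comp] for stats in players.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, stats in players.items():
            z = 0.0 if sigma == 0 else (stats[comp] - mu) / sigma
            scores[name] += WEIGHTS[comp] * (5.0 + sign * z)
    return scores


# Made-up stat lines, purely for illustration.
sample = {
    "slap_hitter": {"ifh_pct": 0.12, "iso": 0.070, "spd": 7.5, "bip_pct": 0.80,
                    "z_swing": 0.68, "o_swing": 0.25, "z_contact": 0.93},
    "average_joe": {"ifh_pct": 0.07, "iso": 0.150, "spd": 4.5, "bip_pct": 0.70,
                    "z_swing": 0.65, "o_swing": 0.30, "z_contact": 0.87},
    "slugger": {"ifh_pct": 0.03, "iso": 0.260, "spd": 2.5, "bip_pct": 0.58,
                "z_swing": 0.63, "o_swing": 0.33, "z_contact": 0.80},
}
scores = scrap_scores(sample)
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:12s} {score:.2f}")
```

With equal weights and z-scores the pool average always lands on 5. Different weights would reorder players without moving that average, as long as the weights still sum to 1.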

Without further ado, here are the Scrap rankings of all qualified batters in 2013.

#	Name	Scrap
1	Alcides Escobar	6.31
2	Eric Young	6.27
3	Leonys Martin	6.25
4	Jacoby Ellsbury	6.24
5	Starling Marte	6.23
6	Jean Segura	6.19
7	Ichiro Suzuki	6.13
8	Alexei Ramirez	6.13
9	Elvis Andrus	6.08
10	Denard Span	6.08
11	Jose Altuve	6.08
12	Erick Aybar	5.93
13	Adeiny Hechavarria	5.9
14	Daniel Murphy	5.9
15	Brett Gardner	5.89
16	Carlos Gomez	5.89
17	Gregor Blanco	5.87
18	Michael Bourn	5.8
19	Alex Rios	5.76
20	Will Venable	5.72
21	Norichika Aoki	5.7
22	Jimmy Rollins	5.64
23	Shane Victorino	5.63
24	Michael Brantley	5.63
25	Howie Kendrick	5.63
26	Gerardo Parra	5.61
27	Nate McLouth	5.58
28	Nolan Arenado	5.54
29	Torii Hunter	5.53
30	Austin Jackson	5.53
31	Chris Denorfia	5.52
32	Jon Jay	5.52
33	Brandon Phillips	5.5
34	Alejandro De Aza	5.48
35	Dustin Pedroia	5.45
36	Darwin Barney	5.45
37	Ian Desmond	5.42
38	Starlin Castro	5.42
39	A.J. Pierzynski	5.4
40	Eric Hosmer	5.39
41	Asdrubal Cabrera	5.39
42	Josh Hamilton	5.39
43	Alex Gordon	5.39
44	Adam Jones	5.38
45	Coco Crisp	5.35
46	Andrew McCutchen	5.34
47	Marco Scutaro	5.34
48	Ian Kinsler	5.33
49	Andrelton Simmons	5.33
50	Desmond Jennings	5.32
51	Jonathan Lucroy	5.32
52	Chase Utley	5.3
53	Brandon Belt	5.3
54	Hunter Pence	5.26
55	Jason Kipnis	5.22
56	Ben Zobrist	5.21
57	Alfonso Soriano	5.2
58	Pablo Sandoval	5.19
59	Manny Machado	5.18
60	Brian Dozier	5.18
61	Matt Holliday	5.17
62	Brandon Crawford	5.17
63	Allen Craig	5.15
64	Matt Carpenter	5.14
65	Michael Young	5.13
66	Yunel Escobar	5.12
67	Yoenis Cespedes	5.11
68	Yadier Molina	5.11
69	Nick Markakis	5.11
70	Zack Cozart	5.1
71	Mike Trout	5.1
72	Nate Schierholtz	5.08
73	Todd Frazier	5.07
74	Michael Cuddyer	5.07
75	Domonic Brown	5.06
76	Chase Headley	5.03
77	Salvador Perez	5.03
78	Marlon Byrd	5.02
79	James Loney	5.0
80	Neil Walker	5.0
81	Kyle Seager	4.97
82	Andre Ethier	4.97
83	Freddie Freeman	4.96
84	Mike Moustakas	4.95
85	Robinson Cano	4.95
86	Jed Lowrie	4.95
87	David Freese	4.92
88	Shin-Soo Choo	4.91
89	Adam LaRoche	4.91
90	Chris Johnson	4.88
91	Martin Prado	4.87
92	Carlos Beltran	4.86
93	Ryan Zimmerman	4.85
94	Victor Martinez	4.83
95	Justin Morneau	4.81
96	Adrian Gonzalez	4.8
97	Anthony Rizzo	4.79
98	Alberto Callaspo	4.79
99	Trevor Plouffe	4.79
100	Ryan Doumit	4.77
101	Brandon Moss	4.74
102	Mark Trumbo	4.74
103	Matt Wieters	4.7
104	Josh Donaldson	4.69
105	Adrian Beltre	4.69
106	Justin Upton	4.68
107	Daniel Nava	4.67
108	Paul Konerko	4.65
109	Billy Butler	4.65
110	Matt Dominguez	4.64
111	Jayson Werth	4.62
112	Russell Martin	4.62
113	Jay Bruce	4.62
114	J.J. Hardy	4.6
115	Joey Votto	4.59
116	Buster Posey	4.59
117	Dan Uggla	4.57
118	Nick Swisher	4.55
119	Kendrys Morales	4.52
120	Carlos Santana	4.51
121	Pedro Alvarez	4.49
122	Mark Reynolds	4.48
123	Jedd Gyorko	4.48
124	Paul Goldschmidt	4.47
125	Prince Fielder	4.47
126	Edwin Encarnacion	4.45
127	David Ortiz	4.45
128	Adam Lind	4.4
129	Jose Bautista	4.38
130	Justin Smoak	4.37
131	Miguel Cabrera	4.37
132	Mitch Moreland	4.36
133	Joe Mauer	4.34
134	Evan Longoria	4.24
135	Chris Carter	4.23
136	Giancarlo Stanton	4.1
137	Mike Napoli	4.09
138	Troy Tulowitzki	4.07
139	Chris Davis	3.94
140	Adam Dunn	3.81

That’s quite a bit to look at. Here are a few of my takeaways:

The general perception of a player’s scrappiness is pretty close to what this metric spits out.
There are some surprises, such as Tulo being near the bottom. In his case it’s caused by an extremely low speed rating and a low z-swing%.
Little dudes that run hard tend to be scrappy (duh).
Big oafy power guys tend not to be scrappy (duh).
Upon removing the qualified batter restriction the ‘Scrap’ leader is Hernan Perez. Tony Campana is a close second. I think we can all agree that Campana is more or less the definition of scrappiness.

This isn’t a stat that’s going to forever change how we view baseball. But this does give us a way of quantifying, however imperfectly, a skillset that we haven’t been able to before. Now we not only know that Jose Altuve is scrappy, we know just how scrappy he is. I’ll let you decide how important that is.

If you have any suggestions regarding different ways to calculate Scrap let me know in the comments. It’s a metric that requires a good amount of arbitrary significance since, well, what does it even mean to be scrappy? We’ve always had an idea, and now we have a number.

The idea for this metric was spurred on by Dan Szymborski on this episode of the CACast podcast, somewhere around the 75-minute mark.

Baseball’s Most Ridiculous Patented Equipment

by John Racanelli

December 13, 2013

Background – what does a patent get you?

Long ago, governments recognized that protecting inventors’ efforts was essential to encourage technological advancement but realized that limiting the time in which an inventor had the exclusive right to market their invention served the greater good by preventing the inventor from controlling a useful product forever. Patents were first granted in Europe in the late 1400s and the patent system was first enacted in the United States in 1790. To date, there have been thousands of baseball-related patents issued covering everything from game equipment to methods of compressing game broadcasts.

In the United States, a patent is an intellectual property right granted by the government to an inventor that “excludes others from making, using, offering for sale, or selling the invention throughout the United States or importing the invention into the United States” for a limited time in exchange for public disclosure of the invention when the patent is granted. Currently, a utility patent is enforceable for 20 years from the date on which the application was submitted, assuming that periodic maintenance fees are paid as scheduled.

What can be patented?

A utility patent will be granted for a machine, process, article of manufacture, or composition of matter (or any improvement to an existing machine, process, article of manufacture, or composition of matter) as long as it is “new, nonobvious and useful.” There are certain things that cannot be patented, however, such as laws of nature, abstract ideas and inventions that are morally offensive or “not useful.”

The “not useful” component is somewhat interesting in that the patent examiner is charged only with deciding whether an invention will function as expected and otherwise has a “useful purpose.” As you will see below, “useful” does not always mean that the invention will be marketable.

So how did James Bennett hope to change baseball?

While it is not clear whether inventor James E. Bennett of Momence, Illinois is the same James Bennett who played for the Sharon Ironmongers in the 1895 Iron and Oil League, it seems clear that he did not exert any forethought as to whether his inventions would be practical when used under baseball game conditions. Either that or he just really hated catching a ball with the existing baseball glove technology available at the turn of the 20th century.

By the early 1900s, baseball gloves had undergone constant improvement. Starting with George Rawlings in 1885 (Pat. No. 325,968), protective gloves became more accepted as a way to protect fielders’ hands. In 1891, Harry Decker added a thick pad to the front of the glove (Pat. No. 450,355) and Bob Reach added an inflatable chamber (Pat. No. 450,717). By 1895 Elroy Rogers had designed the classic “pillow-style” catcher’s mitt (Pat. No. 528,343) that would be used with little change until Randy Hundley pioneered the one-handed catching technique in the 1960s using a hinged catcher’s mitt.

Regardless of the existence of the baseball glove technology in use at the time, James Bennett tried to think outside the box by eliminating the catcher’s mitt altogether and, instead, attaching that box to the catcher’s chest. Here is 1904’s “Base Ball Catcher” in all of its ill-conceived glory:

Front View

Side View

Bennett apparently envisioned the catcher squatting behind home plate acting as a passive target for the pitcher’s offerings and designed this contraption to accept the pitched ball into the cage such that it would strike the padding and drop through a chute into the catcher’s hand so it could be returned to the mound. As you can see, however, the device would have significant shortcomings should the catcher have to attempt to throw out a would-be base stealer, be required to catch the ball for a play at the plate, attempt to block a wild pitch or especially to field his position on a ball put in play in front of the plate.

But Bennett was not finished yet! In 1905, he patented a two-handed “Base Ball Glove” with an oversized pocket to trap the ball:

Front and Back View

Bennett claims that this poorly imagined glove is easy to use because the fingers on the player’s throwing hand were specially designed to “permit the easy and quick removal of that hand to grasp and throw the ball.” Just as with the “Base Ball Catcher,” however, this design does not offer the player much in the way of a catching radius.

So what happened to James E. Bennett’s inventions?

As of 1918, he was still looking for investors, according to this advertisement he placed in the August and October issues of “Forest and Stream” magazine.

The Rockies’ One Through Eight: the Small Successes and Failures of Lineup Construction

by Eric Garcia McKinley

December 9, 2013

Given the speedy obsolescence of my last blog post, I am left to conclude that Dan O’Dowd and Bill Geivett either don’t read my blog, or they don’t give a shit what an immodest blogger has to say about the Rockies. It’s likely both. Indeed, after the Rockies traded Dexter Fowler and signed Justin Morneau last week, there’s no use rehashing alternatives and possible failures. The task now is to think about what the Rockies can do with the roster that they do have. Last week, I wrote about the construction of the Rockies’ roster in the long-term and on a macro scale. This week, I want to think about what the lineup might—and, yes, should—look like on a micro level. What did the daily lineup look like in 2013? What will the daily lineup look like in 2014? Can it be a recipe for immediate success? What does the structure of the lineup tell us about the organization? Because the pitching staff is the area most likely to go through changes between now and opening day, I’m limiting myself to the position players and their offensive production.

The consensus among those who think about these things is that most managers follow orthodoxies that determine what types of hitters can hit where—speedy guys are lead-off hitters, and power hitters hit in the four or five hole. However, there is evidence that these managerial codes are non-optimal. The big caveat, however, is that research indicates optimizing lineups might only account for a handful of runs a year, and maybe one or two wins. But sometimes one or two wins can be the difference between postseason play and spending October noting the changing leaves. My goal here is not to compare the probable 2014 lineup with a more optimal one and argue that it constitutes the difference between success and failure. Rather, I suggest that a daily glance at the Rockies one through eight in 2014 can illuminate broader directions regarding where the team is going. Or not going, as the case may be.

Here is what I think the Rockies daily lineup will look like come April (for the sake of simplicity, I’ll only consider lineups against right-handed starting pitchers):

1) Charlie Blackmon, LF

2) DJ LeMahieu, 2B

3) Carlos Gonzalez, CF

4) Troy Tulowitzki, SS

5) Michael Cuddyer, RF

6) Wilin Rosario, C

7) Justin Morneau, 1B

8) Nolan Arenado, 3B

9) Pitcher

The immediate result of the Fowler trade is that the Rockies have lost their leadoff hitter. Fowler fit the profile of a conventional leadoff hitter: namely, he is fast. But Fowler was a good fit to hit leadoff not because of his speed but because he was among the best on the team at getting on base. This should be the primary metric for a leadoff hitter because guys need to get on base in order to score runs. Despite hitting just .263, Fowler’s 13% walk rate elevated his OBP to .368. For comparison, Rosario hit .292, but his free-swinging style and 3% walk rate put his OBP at just .315. Even without the threat to steal (Fowler stole 19 bases in 28 attempts), his ability to get on base made him the best candidate on the team to hit in the one hole. Without Fowler, I think Walt Weiss (or Bill Geivett, or whoever the hell makes these clubhouse decisions) is going to go with Blackmon (and sometimes Corey Dickerson) in the leadoff spot, only because Blackmon fits the profile that values speed first. If we assume that Blackmon splits time with Dickerson in left field as well as leading off games, they collectively project (per Steamer) to get on base at a .325 clip in about 700 plate appearances, hardly enough to justify hitting first.

Whereas the decision to bat Fowler first made sense by both conventional and unconventional thinking, the number-two hitter is where the Rockies really made a mistake. I expect it to be repeated in 2014. Over the course of 2013, a mélange of as-of-now below-average hitters were placed in the two spot, mostly whoever happened to be playing second base, meaning either Josh Rutledge or LeMahieu. The total slash line of all two hitters for the 2013 Rockies? .256/.290/.341. Aside from the pitcher’s spot, the collective average and OBP of the two spot were better than only the seven spot, and the slugging percentage was the worst among position players. The Rockies essentially placed their worst hitter between the one and three spots. If the Rockies, as I suspect, go with LeMahieu to hit second, they’re going to repeat the error. The other player I can envision Weiss placing in the two hole is Arenado, who projects to be the only position player with worse offensive numbers than LeMahieu.

What throws this mistaken lineup construction into such stark relief is that research suggests that the two spot is precisely where the team’s best hitter should be placed. Sky Kalkman argues that a team’s three best hitters should be placed in the one, two, and four holes, with high OBP leaning towards the one and two spots and power at the four spot. The next best two should be hitting in the three and five spots, and the worst hitters placed in spots six through eight (in the National League). If the Rockies’ daily lineup looks like what I think it will, then two of the team’s three worst hitters will regularly hit one and two.
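The rule of thumb described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Kalkman’s actual method: hitters are ranked by OPS (my choice of “best”), the top-three hitter with the highest slugging percentage bats fourth, the other two are ordered by OBP, and the better of the next two hitters bats third. The OBP and SLG inputs are the Steamer projections quoted further down in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hitter:
    name: str
    obp: float
    slg: float

    @property
    def ops(self) -> float:
        return self.obp + self.slg


def heuristic_order(hitters: list[Hitter]) -> list[Hitter]:
    """Order eight position players: best three in 1/2/4, next two in 3/5, rest 6-8."""
    if len(hitters) != 8:
        raise ValueError("expected exactly eight position players")
    ranked = sorted(hitters, key=lambda h: h.ops, reverse=True)
    top_three, next_two, rest = ranked[:3], ranked[3:5], ranked[5:]
    # Assumption: the most powerful of the top three hits cleanup.
    cleanup = max(top_three, key=lambda h: h.slg)
    # Assumption: the other two are ordered by OBP in the one and two spots.
    table_setters = sorted((h for h in top_three if h is not cleanup),
                           key=lambda h: h.obp, reverse=True)
    return [table_setters[0], table_setters[1], next_two[0],
            cleanup, next_two[1], *rest]


rockies = [
    Hitter("Gonzalez", 0.376, 0.547), Hitter("Cuddyer", 0.343, 0.474),
    Hitter("Rosario", 0.316, 0.515), Hitter("Tulowitzki", 0.376, 0.534),
    Hitter("Morneau", 0.345, 0.461), Hitter("LeMahieu", 0.328, 0.392),
    Hitter("Arenado", 0.318, 0.446), Hitter("Blackmon/Dickerson", 0.326, 0.455),
]
for spot, hitter in enumerate(heuristic_order(rockies), start=1):
    print(f"{spot}) {hitter.name}")
```

Ranking by OPS undervalues OBP relative to slugging, so a different definition of “best” could easily shuffle the top of this order.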

Then what should the lineup look like? Baseball Musings’ lineup analysis tool allows the interested fan to input a name, OBP, and slugging percentage for each hitter, and it purports to output the optimal team lineup based on runs per game (RPG). The calculus is based on past performance, with data taken either from 1959-2004 or from the steroid-inflated statistics of 1989-2002. As Jack Moore observes, both models are flawed because neither is applicable to the game today and the simulations take place in a vacuum without context. Additionally, the RPG outputs are inflated beyond reason. But regardless of whether the RPG outputs can be taken at face value, the tool has some use because it enables you to see RPG differentials among different lineup constructions. Using the more inclusive 1959-2004 model and 2014 Steamer projections, the supposed optimal lineup, the one that ostensibly would produce just over five runs per game, looks like this:

1) Tulowitzki

2) Gonzalez

3) Blackmon

4) Morneau

5) Cuddyer

6) Arenado

7) LeMahieu

8) Rosario

9) Pitcher

This lineup is enticingly unconventional. It provides for the Rockies’ best hitters to have the most opportunities to get on base and score runs. Still, I wouldn’t follow it. For one, the team’s best hitters at getting on base also happen to be the ones with the most pop. So there is no easy way to favor OBP at the one and two spots and power at the four and five spots. I would love to have an OBP Carlos Gonzalez and a home-run-hitting one, but we have to make do with the fortunate curse that they are the same person (at least we do now, as Fowler reached base about as often as Gonzalez in 2013). This lineup would also be risky because the two through four hitters are all left-handed, which would make it easy for the opposition to marshal its lefty specialist late in a close game. Instead, I would construct the Rockies’ daily lineup as follows, this time with projected slash lines (again, per Steamer):

1) Gonzalez – .297/.376/.547

2) Cuddyer – .281/.343/.474

3) Rosario – .278/.316/.515

4) Tulowitzki – .300/.376/.534

5) Morneau – .276/.345/.461

6) LeMahieu – .289/.328/.392

7) Arenado – .277/.318/.446

8) Blackmon/Dickerson – .276/.326/.455

9) Pitcher (based on 2013 production) – .140/.176/.165

In my mind, this lineup is the one most likely to produce the most runs for the Rockies. Ideally, I would have Gonzalez hitting second rather than first, but the rest of the roster limits this flexibility. The possibility of Gonzalez leading off has been raised, but I don’t think there is much to the talk. Other than Gonzalez’s first half season with the Rockies in 2009, he’s only led off when Jim Tracy thought it could pull him out of a horrid slump. Tulowitzki is certainly a better hitter than Cuddyer, but Tulo’s power coupled with Cuddyer’s ability to get on base (even if he’s in for some serious regression in 2014) makes hitting Cuddyer second and Tulo fourth the best play. The three and five spots will produce more outs than the one, two, and four spots, but the upside of Rosario’s power mitigates the risk of those outs, as would Morneau’s relatively higher OBP and ability to hit about one-fifth of his balls in play as line drives.

Again, this exercise does not identify the path to success and the path to failure for the Rockies in 2014. The team is unlikely to make the playoffs regardless of how the lineup is structured. But what it should do is serve as a reminder to pay attention to the daily details and to think beyond inherited baseball wisdom. If the daily lineup turns out to replicate past mistakes, then I think it points to a much larger organizational problem of resisting even the simplest and most easily integrated baseball analytics. But if Weiss runs out lineups that defy convention, then it might suggest that the franchise has a baseball plan in addition to a business plan.
