Archive for Uncategorized

The “Exceptional” Kyle Lohse

After the 2012 season, Kyle Lohse declined the qualifying offer of the St. Louis Cardinals, and hit the free agent market.  Lohse’s 2012 season was exactly what any starter would want in a contract year: a career-best 2.86 ERA over 211 innings.  It completed a comeback from a rough 2010 in which Lohse battled arm trouble, and had one of his worst seasons. 

Many commentators felt that Lohse’s 2012 campaign was a one-time affair.  Lohse’s ERA benefited from an unusually low .262 Batting Average on Balls in Play (BABIP), and the usually reliable pitching statistic of Fielding Independent Pitching (FIP) dinged him for it, pegging his real performance at 3.51, roughly two-thirds of a run higher.  Furthermore, Lohse spent 2012 at Busch Stadium, a pitcher’s park, and got to have his pitches called by Yadier Molina, perhaps the best catcher in the game.
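For readers who want to see what sits behind those two numbers, here is a minimal sketch of the standard BABIP and FIP formulas. The stat line is hypothetical and is not Lohse’s actual 2012 totals; the FIP constant changes every season, so the 3.10 used here is only an assumed placeholder. The helper also shows how the “174.2” style of innings notation used in the tables in this post (174 innings plus two outs) converts to a decimal.

```python
def innings_to_float(ip: str) -> float:
    """Convert baseball innings notation ("174.2" = 174 innings + 2 outs) to a decimal."""
    whole, _, outs = ip.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: non-homer hits per ball put in play."""
    return (h - hr) / (ab - k - hr + sf)


def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    The constant is a league-wide value that varies by season;
    3.10 is an assumed placeholder, not the official 2012 figure.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, for illustration only (not Lohse's real totals).
line = dict(h=190, hr=20, ab=790, k=140, sf=5, bb=40, hbp=5)
ip = innings_to_float("211")

print(round(babip(line["h"], line["hr"], line["ab"], line["k"], line["sf"]), 3))
print(round(fip(line["hr"], line["bb"], line["hbp"], line["k"], ip), 2))
```

Note that hits on balls in play appear only in BABIP; FIP ignores them entirely, which is why a pitcher with a low BABIP can post an ERA well below his FIP.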

But was Lohse’s low BABIP in 2012 truly a fluke? 

Let’s start by comparing Lohse to other Cardinals starters with at least 150 IP that year.  Like Lohse, they pitched their home games in the same pitcher’s park, and also took their signs from Yadier Molina:

Name IP BABIP ERA FIP
Kyle Lohse 211 0.262 2.86 3.51
Jake Westbrook 174.2 0.312 3.97 3.8
Adam Wainwright 198.2 0.315 3.94 3.1
Lance Lynn 169 0.316 3.67 3.47

Of all Cardinals starters that year, Kyle Lohse had the best BABIP by 50 points, and was the only one below the league BABIP average.  Interesting.  But one season proves nothing.  So, let’s look at 2011, again for Cardinals starters with at least 150 IP:

Name IP BABIP ERA FIP
Kyle Lohse 188.1 0.269 3.39 3.67
Chris Carpenter 237.1 0.312 3.45 3.06
Jake Westbrook 183.1 0.313 4.66 4.25
Jaime Garcia 194.2 0.318 3.56 3.23

In 2011, Kyle Lohse’s BABIP was a mere seven points higher than his 2012 BABIP, and still absurdly low.  Once again, Lohse’s BABIP was far better than that of any other Cardinals starter, and well below league average.  Is this still a fluke?  Does Yadi just save his best calls for his friend Kyle?

Perhaps the key is to get Lohse away from Molina and Busch Stadium.  Fortunately for our purposes, the Milwaukee Brewers indulged this notion, signing Lohse at the conclusion of 2013 Spring Training.  Miller Park, where the Brewers play, is a hitter’s park where the fly balls go a long way and batters get more hits.  Furthermore, in 2012, the Brewers had one of the worst defenses in baseball.  The stage seemed to be set for a substantial BABIP regression.

The 2013 season is now almost complete for the Brewers.  Yet, as of the time this article was written, here are the statistics for Brewers starters with at least 150 IP in 2013:

Name IP BABIP ERA FIP
Kyle Lohse 184.2 0.284 3.46 4.1
Yovani Gallardo 161.2 0.299 4.18 3.95
Wily Peralta 172.1 0.292 4.49 4.28

Lohse’s BABIP did regress a bit.  Yet, Lohse’s BABIP is not only the lowest of the three qualifying Brewers starters, but still notably below the .294 BABIP average of baseball. 

One last comparison: other NL Central starters play in many of the same stadiums that Kyle Lohse does.  How does his BABIP compare to starters who have also spent the last three years pitching at least 450 innings exclusively for NL Central teams?

Name BABIP ERA FIP
Kyle Lohse 0.271 3.22 3.75
Bronson Arroyo 0.278 4.13 4.63
Mike Leake 0.284 3.87 4.21
Homer Bailey 0.292 3.76 3.67
Yovani Gallardo 0.293 3.79 3.83
Jake Westbrook 0.307 4.23 4.15

There he is again.  The lowest BABIP in the NL Central for starters over the last three years belongs to Kyle Lohse.

What is going on?  Does Kyle Lohse simply possess The Will to Pitch? 

Certainly, many of you might claim Kyle Lohse is the beneficiary of nothing more than good luck.  It is almost an article of faith among observers that BABIP is essentially a random attribute beyond the pitcher’s control, benefiting substantially from defense.  One could also argue I am using arbitrary endpoints.  While Kyle Lohse had a terrific pitching BABIP from 2011–2013, his major league BABIP was .364 in 2010.  Move the goalposts, some would say, and get a different result.  Finally, Derek Carty suggests that BABIP can take as long as 8 years (~3729 batters) to stabilize into a predictable indicator of a pitcher’s ability, which is another way of saying that it never really stabilizes at all, and is therefore indicative of nothing.

As to Kyle Lohse, that view may be correct.  But I suspect it is not.  Rather, I suspect that Kyle Lohse’s career renaissance has actually been driven in part by his ability to limit the damage caused by balls put into play.  To explain why, I’ll first address the arguments I just made in favor of his performance being unsustainable.

First, let’s talk about BABIP.  Although it is common to attribute BABIP entirely to luck, it is more complicated than that.  Tom Tango and his colleagues found, for example, that BABIP was 44% luck.  The remainder (majority) of BABIP was attributed to a combination of the pitcher, the park, and fielding.  The pitcher was given 28% of the credit for his BABIP, but that is just an average; many observers suspect that a small class of pitchers has a unique ability to control their BABIP by inducing less effective contact.  Strikeout pitchers are one example.  So, while it is common to dismiss good BABIPs as flukes, it is intellectually lazy to do so, particularly if a pitcher is generating low BABIPs on a consistent basis.

Second, let’s address arbitrary endpoints.  Am I excluding Kyle Lohse’s dreadful 2010 season from my endpoints?  Yes.  Why?  A few reasons.  First, because Lohse was injured that year and dealing with arm trouble that he finally was able to resolve.  In fact, the 2010 season was the culmination of a few injury-plagued seasons for Lohse.  But since the 2011 season that followed, Lohse has consistently pitched at least 180 innings per year and also consistently been effective, more so than he ever was before.  Since 2011, his walk rates have been the best of his career, as has the ratio of his strikeouts to walks, both attributes that everyone agrees are controlled primarily by the pitcher’s ability.  Also, as Russell Carleton has found, a pitcher’s recent BABIP performance tends to be more predictive of their BABIP going forward.  So, what some would call an arbitrary endpoint (the beginning of Lohse’s 2011 season), I would call appropriate, and indicative.

Finally, there is the issue of sample size.  Although I have no quarrel with the method Derek Carty used to conclude that a pitcher’s BABIP can take 3729 batters to stabilize, Kyle Lohse has faced over 2400 batters in the past three years.  That is not a trivial sample, particularly when it spans home stadiums at opposite ends of the park factor spectrum.
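To put rough numbers on that sample, here is a small sketch. The first figure is simple division. The second rests on an assumption this post does not confirm: that the stabilization point can be used as the constant in a simple regression-to-the-mean weight, n / (n + k), a common reading of such studies. The .271 observed BABIP and the .294 league average are the figures quoted in this post; treating batters faced as the sample unit is a simplification.

```python
def regress_to_mean(observed, league_avg, sample, stabilization):
    """Shrink an observed rate toward the league average.

    Assumes (not confirmed by the source) that the stabilization point
    acts as the regression constant, so the weight on the observed
    rate is sample / (sample + stabilization).
    """
    weight = sample / (sample + stabilization)
    return league_avg + weight * (observed - league_avg)


batters_faced = 2400          # "over 2400 batters" in the past three years
stabilization_point = 3729    # Carty's figure, as quoted above

share_of_sample = batters_faced / stabilization_point
estimate = regress_to_mean(0.271, 0.294, batters_faced, stabilization_point)

print(f"share of stabilization sample: {share_of_sample:.0%}")
print(f"regressed BABIP estimate: {estimate:.3f}")
```

Under those assumptions the estimate lands between Lohse’s observed figure and the league average: the sample is large enough to carry real weight, but not enough to take the observed number at face value.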

My suspicions about Lohse are further confirmed when you consider the differential between his RA9-WAR and his fWAR.  FanGraphs bases fWAR for pitchers entirely on their FIP.  However, FanGraphs also recognizes that FIP, while effective in evaluating most pitchers, does not properly evaluate pitchers who actually possess the skill to limit the damage on balls put into play.  Rather than toss FIP and fWAR aside, FanGraphs last year began publishing RA9-WAR as an alternative metric to allow a comparison between the number of runs that actually come across the plate while a pitcher is on the mound, versus those that FIP is willing to credit to the pitcher as having personally prevented.  The differential between a pitcher’s RA9-WAR and fWAR tells you how much of that pitcher’s run prevention cannot be explained by the three “true” outcomes of home runs, walks, and strikeouts.  Niftily, FanGraphs also estimates how the other runs were prevented — through BABIP (BIP-Wins) and by runners stranded (LOB-Wins).  Both RA9-WAR and fWAR are also park-adjusted.

Let’s start with the entire time period of 2011-2013.  For starters with 450 IP, Lohse’s RA9-WAR / fWAR differential is one of the top 10% in the game.

Name RA9-WAR BIP-Wins LOB-Wins FDP-Wins RAR fWAR RA9-WAR minus fWAR
Jered Weaver 17 6.1 -0.1 6 102 10.9 6.1
Jeremy Hellickson 9 4.6 0.6 5.2 37.2 3.8 5.2
Hiroki Kuroda 14 1.7 2.6 4.3 90.4 9.7 4.3
Clayton Kershaw 21.9 5.6 -1.5 4.1 152.9 17.8 4.1
Bronson Arroyo 6.6 2.3 1.7 3.9 23.3 2.6 4
Kyle Lohse 11 3.6 0.2 3.8 66.1 7.2 3.8
Ervin Santana 8.2 4.6 -0.9 3.7 41.9 4.5 3.7
R.A. Dickey 11.8 3.2 0 3.2 80 8.6 3.2
James Shields 15.5 2 1 3 117.3 12.5 3
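The relationship between these columns can be checked with a few lines of code. This is only a sketch using four rows copied from the table above: the differential is RA9-WAR minus the WAR (fWAR) column, and it should roughly equal BIP-Wins plus LOB-Wins. Because the published figures are rounded to one decimal, the two can differ by a tenth of a win.

```python
# (RA9-WAR, BIP-Wins, LOB-Wins, fWAR) for 2011-2013, copied from the table above.
rows = {
    "Jered Weaver": (17.0, 6.1, -0.1, 10.9),
    "Hiroki Kuroda": (14.0, 1.7, 2.6, 9.7),
    "Clayton Kershaw": (21.9, 5.6, -1.5, 17.8),
    "Kyle Lohse": (11.0, 3.6, 0.2, 7.2),
}

results = {}
for name, (ra9_war, bip_wins, lob_wins, fwar) in rows.items():
    differential = round(ra9_war - fwar, 1)      # runs FIP does not explain, in wins
    from_components = round(bip_wins + lob_wins, 1)  # BABIP wins plus strand wins
    results[name] = (differential, from_components)
    print(f"{name}: RA9-WAR - fWAR = {differential}, BIP + LOB = {from_components}")
```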

Lohse’s differential has intensified in 2012-2013.  Over the last two years, among those with at least 300 IP, only one starter in baseball had a larger RA9-WAR / fWAR differential (last column) than Kyle Lohse:

Name RA9-WAR BIP-Wins LOB-Wins FDP-Wins fWAR RA9-WAR minus fWAR
Clayton Kershaw 14.6 4.3 -0.9 3.4 11.2 3.4
Kyle Lohse 8.3 2.4 0.9 3.3 5 3.3
Hiroki Kuroda 10.3 1.6 1.1 2.7 7.6 2.7
Bronson Arroyo 6.7 1.3 1.1 2.5 4.2 2.5
Jarrod Parker 7.2 2.1 0.2 2.3 5 2.2
Jordan Zimmermann 8.3 1.1 0.8 2 6.4 1.9
Ervin Santana 3.5 3.4 -1.6 1.9 1.7 1.8
R.A. Dickey 8.2 2.3 -0.4 1.9 6.4 1.8
Chris Sale 11.3 0.8 0.8 1.6 9.7 1.6

That guy’s name is Clayton Kershaw, and he is pretty good.  In fact, Kershaw and Lohse have beaten their FIP by basically the same amount over the past two years.  Unlike Kershaw, Lohse has pitched one of those seasons at home in Miller Park.

Overall, it is safe to say Lohse is showing a strong and consistent ability to beat his FIP, and over the last few years, is doing so better than almost any starter in baseball.  He is doing so by generating balls in play that are uniquely unsuccessful at becoming hits, and which his defense seems unusually able to field for outs.

How is he doing this?  It certainly is not his strikeout rate.  Lohse is not anybody’s idea of a strikeout pitcher.

What Lohse does do, however, is control the count, minimize walks, and consistently pitch from ahead.  This quality makes Lohse an extremely enjoyable pitcher to watch: despite topping out at 90 mph, he pounds the zone and challenges hitters.  His BB/9 over the last three years has ranged from 1.62 to 2.01.  During that same time frame, only Cliff Lee is more likely than Kyle Lohse to throw a first-pitch strike, which Lohse did 67.5% of the time.  The fact that Lohse is throwing first-pitch strikes against 2/3 of the batters he faces without getting killed suggests that he is putting those strikes in locations where batters want no part of them.  In short, Lohse has terrific control and consistently finds himself in counts where he and his catcher have the luxury of choosing their pitch.

Does Lohse’s control affect the quality of the ball being put into play against him?  It very well may.  Although his sample size could have been larger, Russell Carleton found that pitcher BABIPs correlated with the pitch counts the hitters were facing when they put the bat on the ball.  The more favorable the count to the pitcher, the less likely the hitter will get on base from his hit.  Kyle Lohse’s three best counts for limiting batter wOBA this year?  Why, those would be 0-2, 1-2, and 0-1.  And the three counts Kyle Lohse faces far less than any others?  Those would be 3-0, 3-1, and 3-2. 

The bottom line is that Kyle Lohse is an exception among aging starters: a pitcher who has gained effectiveness in his mid-thirties through terrific control that not only forces hitters to beat him, but also apparently limits the damage even when batters do hit the ball.  Should the Brewers make Lohse available at the trade deadline next year, contenders would be foolish not to give him a close look, particularly with Lohse under control through 2015.  When the difference between collecting a pennant and going home can be a batted ball just out of reach, it makes sense to have a pitcher with a demonstrated knack for putting the ball in the defender’s glove.  


Evaluating Players in the Dark and Scooter Gennett

To make it as a big-league ballplayer, you have to do very hard things well, like hitting a very fast-moving baseball.  You also have to be able to do some reasonably easy things well, like see the baseball.  Why, then, couldn’t Brewers second baseman Scooter Gennett see the ball in the minor leagues?

In a recent Brewers broadcast, TV announcer Brian Anderson relayed a story about Scooter Gennett and his somewhat surprising performance in the majors (149 wRC+ and 1.4 WAR in 49 games so far).  Gennett claimed that he was just seeing the ball so much better in the majors due to poorly-lighted minor-league ballparks.

While minor league plate discipline data may not be a reliable comparison, if he was able to see the ball better in the majors, you would expect certain things to happen.  He’d make contact frequently, and probably solid contact.  Take a look at his contact numbers now 51 games into the majors:

Team PA OCon% ZCon%
Milwaukee 162 77.1% 94.6%

His contact numbers so far are comparable to Matt Carpenter’s.  What we don’t see in Scooter’s major league data, however, is a really solid line drive rate to indicate he’s better able to put the barrel of the bat on the ball.  In fact, he ranks just 28th out of all second basemen this season with at least 150 plate appearances, with a slightly-above-league-average 22%.  He doesn’t appear to actually be recognizing pitches any better: his walk rate is actually down from his time in the minors, and his strikeout rate is up.  But there seems to be something to indicate that he’s seeing the ball well: he’s swinging and making contact on plenty of pitches, and he figures out where the ball is and puts his bat on it.

Which brings us to the question:  What the hell is going on in minor-league ballparks, if in fact Scooter Gennett’s contact rates are really closer to Matt Carpenter’s and he feels the ball was harder to see in Nashville?

If you’re the Brewers, or any team really, wouldn’t you want to know that difference?  Especially when your other second base options this year have been Rickie Weeks (86 wRC+), Jeff Bianchi (57 wRC+), and Yuniesky Betancourt (Yuniesky Betancourt)?  I don’t know much (anything) about exterior lighting, but I would think that if there was a possibility that field conditions were affecting a team’s player evaluations, teams could reasonably justify investing some money into the lights for the minor-league affiliates.

“Seeing the baseball” seems like it’s discussed in well over half of players’ and managers’ attributions of a hitting streak or an unexpected jump in power, and this may account for Scooter Gennett’s explanation of his success with the Brewers in 2013. But with the margins for error, and for gaining a competitive advantage, so small in the majors, these kinds of anomalies may be well worth the attention of baseball ownership and their affiliated clubs.


Free Agent Case Study: Jarrod Saltalamacchia

Dennis Eckersley filled in for Jerry Remy during the Red Sox road trip to play the Giants and Dodgers and has remained on board for the Orioles series. Eckersley’s analysis, cluttered with lingo like “cut the moss,” “throwing cheese,” and “Hello?!”, is also often insightful and informative. Such was the case when he praised Jarrod Saltalamacchia for his consistent season behind the plate for the Sox in 2013.

[Image: Jarrod Saltalamacchia]
It’s hard to imagine another Sox catcher at this point in time.

Saltalamacchia has seemingly overcome the developmental issues that persisted during the early part of his career on the defensive side of the plate. His swing has always provided power and, perhaps most importantly, he has become a trusted game-caller by Boston’s pitching staff. Salty is playing in a contract year in 2013. In this post, I’ll take a look at the market for catchers, analyze Salty’s true value to the Sox, and give a prediction for whether I see him re-upping with Boston this offseason.

The Numbers

Saltalamacchia was once a top prospect in the Braves system and was the centerpiece in a 2007 trade for Mark Teixeira. Much of the promise scouts saw in Salty arose from the power he generated from his uppercut swing from the left side of the plate. Like most young players with a long powerful stroke, Salty struggled with strikeouts and inconsistencies in his approach. Salty’s status as a star prospect diminished during his time in Texas due to his inability to put the ball in play, and the Sox took a flier on him at the end of the 2010 campaign. The numbers show the type of hitter that he has been from 2011-2013 with the Sox, but also point to a fundamental change in his approach in his most recent campaign.

From 2011-2013, Salty hit the 6th-most homers amongst catchers in Major League Baseball with 51 (Mike Napoli leads all catchers with 69, despite playing his entire 2013 campaign at first base). Salty is also last among all Major League catchers with a 69.8% contact rate and leads the group with a 30.6% strikeout rate. These are numbers that reflect a hitter who tries to hit the ball out of the park on each and every swing he takes.

Despite the strikeouts (which have been prevalent throughout Salty’s career), it is clear that he has gone through a fundamental change in hitting philosophy during the 2013 campaign. The graph below helps us visualize Salty’s trend during his time with the Sox:

[Graph: Saltalamacchia’s batted balls by type (fly balls, ground balls, line drives), by season]

The graph breaks down Salty’s batted balls by fly balls, ground balls, or line drives. When he arrived with the Sox in 2010, Salty was in the worst spot, batted ball-wise, of his entire career. His line drive percentage hovered around 5%, whereas his fly ball (and pop up) percentage was at the highest of his career. Since he has been with the Sox, Salty has reversed this trend, culminating in the highest line-drive (and lowest fly-ball) percentages of his career in 2013. This is one reason for Salty’s apparent decrease in power, as ZiPS projects him to hit 14 dingers this year following seasons of 16 homers in 2011 and a career-high 25 in 2012. Despite the slight dip in power, the change in approach has made Salty a more productive overall hitter: his greater propensity to hit line drives has caused his BABIP to rise dramatically from .265 in 2012 to a whopping .379 in 2013. Moreover, it has caused his overall average and OBP to rise to .270 and .341, respectively (up from .222 and .288 in 2012). The only surprising stat after noticing Salty’s decrease in FB% and increase in LD% is that his slugging percentage has not changed at all from 2012 to 2013, despite the fact that his homer rate is down. But a quick review of Salty’s counting stats reveals that this is due to the fact that he ranks eighth in the Major Leagues with 34 doubles. We can again attribute this to his greater propensity to hit line drives, as many of the long fly balls that stayed up for just too long may be dropping for Salty in 2013.

As his batted ball trends and overall stats suggest, Salty has been on an upward slope as a hitter during his time with the Red Sox.

The Intangibles

While we have just examined the ways in which Salty helps his club with the bat, he holds arguably even more value to the Sox as their primary backstop. This is where the intangibles come into play, which might be the single biggest factor as to why Salty gets a major pay-day (or why he doesn’t) on the open market. Simply put, there is much more to calculating player value for a catcher than offensive and defensive stats alone.

Catchers can improve a pitching staff with their daily preparation and ability to call a game. In an effort to quantify Salty’s game-calling ability, I’ll reference an article called “Salty’s Defense/Game-Calling Impact” on the Pro Sports Daily forums. As of August 5th, the chart below gives pitchers’ ERA during Salty’s starts as compared to his back-ups’ starts over the past three years:

[Chart: pitchers’ ERA in Saltalamacchia’s starts versus his back-ups’ starts over the past three years]

While there is much more to calling a game than simply “pitcher ERA”, the trend is a bit alarming when estimating Salty’s value. Numbers don’t tell the whole picture, of course, but they certainly wouldn’t support a claim that Salty improves a pitching staff through his game-calling. One thing is clear: pitchers are doing better in 2013 while Salty is behind the plate, but it remains a mystery whether this is because he’s calling a better game or simply because the pitchers he’s catching are better.

Josh Beckett was one pitcher who spoke out about Salty’s inability to be on the same page as the starter, but many have spoken in defense of the backstop’s ability during the 2013 campaign. Jake Peavy, for one, has commended Salty’s approach to game-day: “I can’t say enough about his willingness. Salty has got some time here, some time in the big leagues. For him to be so humble in his approach, to not say, ‘This is how we do things here’; it was him saying, ‘Hey, man, what do you need to win tonight? What do you need me to do?’” In any case, his familiarity with the Sox pitching staff likely makes Salty more immediately valuable to the Red Sox than to any other team.

On another note, Salty has been very durable during his time in Boston. He has missed just 4 games due to injury from 2011-2013 and has not been placed on the disabled list once. While durability is always valuable, it is especially valuable for a catcher, a position in which the day-to-day bumps and bruises are far more prevalent. This should make teams more comfortable offering him a long-term deal.

The Market

Of the 18 catchers on the open market following the season, Salty is the youngest at age 29. He will likely be the second-most coveted free agent catcher (behind Atlanta backstop Brian McCann), though that could change if Salty gets hot or McCann gets hurt again (he missed time in 2011 due to an oblique injury and missed time in 2013 due to offseason shoulder surgery). There have been very few free agent catchers over the past 3 years, likely due to the fact that familiarity with a team’s pitchers is very important to front offices. Thus, we notice that there have been many contract extensions for catchers, but very few catchers who actually hit the open market. Salty is, in fact, in a unique position as a productive free agent catcher who will likely fetch a deal for more than 3 years. In any case, here is how the free agent/extension market has looked over the past two years:

[Table: free agent signings and contract extensions for catchers over the past two years]

Miguel Montero seems to be the most similar comparable by age (29) and overall production (using ZiPS projections, Salty will have a 2.1 average fWAR over the past 3 seasons when he hits the open market). Montero’s signing was actually an extension, so even though his overall production was a bit higher when he signed, it’s certainly not far-fetched to believe that Salty will get a 5-year, $60 million contract when teams are bidding for his services.

The Suitors

I expect the White Sox, Angels, Athletics, Yankees, Braves, Rangers, Rays, and the Red Sox to be in the running for Salty’s services based on their need behind the plate for next season. I doubt the Rays or the White Sox would spend the money on the current Sox backstop, and a signing by the Angels and Athletics seems equally unlikely due to the Angels’ payroll and Oakland’s frugality. This leaves the Red Sox, Yankees, Braves, and Rangers as Salty’s primary suitors, and their free-spending tendencies should make Salty’s eyes light up in free agency.

Prediction

This is a really tough call. I think that if Atlanta does sign a catcher to a long-term deal, they will simply retain Brian McCann (familiarity, as discussed previously, is likely very important to teams when evaluating catchers). A look at the remaining teams’ organizational depth charts could provide insight into Salty’s destination. According to mlb.com, catcher Gary Sanchez is the Yankees’ top prospect with an ETA of 2015. He’s also the second-highest ranked catching prospect in baseball behind Travis d’Arnaud. I expect the Yankees to hold off on signing Salty. While the Rangers’ top prospect is also a catcher, Jorge Alfaro is not expected to arrive until at least 2016 and their 40-man catching depth is weak. The Red Sox have multiple catching prospects in their system (#10 Blake Swihart, #13 Jon Denney, and #15 Christian Vazquez) and have depth on their 40-man roster with Dan Butler and Ryan Lavarnway. If the price does indeed rise to my predicted contract of 5 years and $60 million, it’s hard to see Salty re-upping with Sox GM Ben Cherington. Instead, I see him jumping town for Texas, an organization where his prospects once faded, but one that now might make him a very rich man.

Vince D’Andrea is a rising senior at the Massachusetts Institute of Technology. His blog, Dave Roberts’ Dive, can be found here.


Baseball’s Most Extreme Pitches from Starters, So Far

Introduction

After reading Jeff Sullivan’s piece entitled “Identifying Baseball’s Most Unhittable Pitches, So Far” on August 21, I found his methodology to be quite interesting.  It was suggested in the comments that, rather than looking at whiff rate, we should consider who has allowed the weakest contact.  Now, there are a couple of different ways to look at weakest contact.  First, you could look at batted ball velocity.  You could also look at batted ball distance.  Both of these techniques would provide some measure of the severity of contact allowed by a pitcher.  At the end of the day though, a warning track fly ball is still as effective for a pitcher as a pop up.  I thought it would be better to look at who got hurt the least with their pitches.

In saying that, I mean to look at what pitchers are theoretically giving up nothing but singles on a pitch versus what pitchers are theoretically giving up nothing but home runs.  A quick calculation to quantify this value is total bases per hit allowed (TB/H).  This is the same as the ratio between slugging percentage and batting average (SLG/AVG).  Values have to be between one and four.  A value of 1.00 corresponds to only singles.  A value of 4.00 corresponds to only home runs.  Any value in between could represent a combination of all hit types.
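As a quick illustration of the metric just described, here is a minimal sketch of the TB/H calculation. The function name and the sample hit totals are made up for the example; the last lines check that TB/H matches SLG/AVG for an assumed at-bat total.

```python
import math


def total_bases_per_hit(singles, doubles, triples, home_runs):
    """Total bases per hit allowed: 1.0 means all singles, 4.0 means all home runs."""
    hits = singles + doubles + triples + home_runs
    if hits == 0:
        raise ValueError("TB/H is undefined when no hits were allowed")
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / hits


only_singles = total_bases_per_hit(12, 0, 0, 0)   # lower bound of the scale
only_homers = total_bases_per_hit(0, 0, 0, 7)     # upper bound of the scale
mixed = total_bases_per_hit(6, 2, 0, 2)           # 18 total bases on 10 hits

# Same value via SLG / AVG, using a hypothetical 40 at-bats.
at_bats = 40
avg = 10 / at_bats
slg = 18 / at_bats

print(only_singles, only_homers, mixed)
print(math.isclose(slg / avg, mixed))
```

The at-bat total cancels out of the ratio, which is why TB/H can be read straight off a leaderboard that lists only SLG and AVG.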

Baseball Prospectus provides PitchF/X leaderboards for eight different pitch types: four-seam fastball, sinker, cutter, splitter, changeup, curveball, slider, and knuckleball.  I chose to look at only starting pitchers in this study.  Also, to be considered, a pitcher had to have thrown at least 200 of the pitch of interest.  The league leaders in games started are just above 25.  If we are conservative and estimate 80 pitches per start, that allows for 2000 pitches thrown, so 200 would represent roughly 10% of the pitcher’s arsenal.  With that background information now covered, let’s look at the best and worst pitchers in each pitch type.  All data is accurate through August 22.
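The arithmetic behind the 200-pitch cutoff is short enough to write out. This sketch simply restates the numbers in the paragraph above; the `qualifies` helper is a hypothetical name for the filter, not something taken from Baseball Prospectus.

```python
starts = 25              # roughly the league lead in games started
pitches_per_start = 80   # deliberately conservative estimate
min_pitches = 200        # cutoff for a pitch type to be considered

season_pitches = starts * pitches_per_start
usage_share = min_pitches / season_pitches


def qualifies(pitch_count, minimum=min_pitches):
    """Return True when a pitcher has thrown enough of a pitch type to be ranked."""
    return pitch_count >= minimum


print(season_pitches, f"{usage_share:.0%}")
print(qualifies(180), qualifies(200))
```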

Data

Four-Seam Fastball

Pitcher Team TB/H Pitcher Team TB/H
Jarred Cosart HOU 1.20 Lucas Harrell HOU 2.33
Tyler Chatwood COL 1.20 Todd Redmond TOR 2.20
Stephen Fife LAD 1.22 Allen Webster BOS 2.20
Bartolo Colon OAK 1.26 Tyler Skaggs ARI 2.15
Joe Kelly STL 1.26 Erik Bedard HOU 2.10

Sinker

Pitcher Team TB/H Pitcher Team TB/H
Brandon Cumpton PIT 1.10 Yu Darvish TEX 2.27
Taylor Jordan WSH 1.10 Bud Norris BAL 2.12
John Lackey BOS 1.21 Aaron Harang SEA 1.96
Gerrit Cole PIT 1.22 Scott Kazmir CLE 1.93
Jonathan Pettibone PHI 1.22 Jon Lester BOS 1.92
Wade Davis KCR 1.22

Cutter

Pitcher Team TB/H Pitcher Team TB/H
Clay Buchholz BOS 1.11 Jeff Samardzija CHC 2.00
Jenrry Mejia NYM 1.17 Jerome Williams LAA 1.95
Lucas Harrell HOU 1.20 Cole Hamels PHI 1.90
Jonathon Niese NYM 1.21 A.J. Griffin OAK 1.86
Mike Pelfrey MIN 1.31 Yu Darvish TEX 1.85

Splitter

Pitcher Team TB/H Pitcher Team TB/H
Hiroki Kuroda NYY 1.22 Ubaldo Jimenez CLE 1.72
Jake Westbrook STL 1.31 Tim Hudson ATL 1.70
Jorge de la Rosa COL 1.32 Dan Haren WSH 1.69
Doug Fister DET 1.33 Tim Lincecum SFG 1.61
Hisashi Iwakuma SEA 1.33 Jason Marquis SDP 1.58

Changeup

Pitcher Team TB/H Pitcher Team TB/H
Stephen Strasburg WSH 1.00 John Danks CHW 2.21
Matt Harvey NYM 1.06 Jeremy Hefner NYM 1.96
Gio Gonzalez WSH 1.10 Dan Straily OAK 1.91
Francisco Liriano PIT 1.14 Randall Delgado ARI 1.89
Bud Norris BAL 1.22 Edinson Volquez SDP 1.87

Curveball

Pitcher Team TB/H Pitcher Team TB/H
Clayton Kershaw LAD 1.00 Homer Bailey CIN 2.33
Jason Hammel BAL 1.00 Zack Greinke LAD 2.09
C.J. Wilson LAA 1.07 Wandy Rodriguez PIT 2.00
Dillon Gee NYM 1.14 Tim Hudson ATL 2.00
Max Scherzer DET 1.17 John Lackey BOS 2.00

Slider

Pitcher Team TB/H Pitcher Team TB/H
Tyson Ross SDP 1.00 Jordan Zimmermann WSH 2.24
Jorge de la Rosa COL 1.17 Wade Miley ARI 2.07
Bartolo Colon OAK 1.18 Dallas Keuchel HOU 2.06
Jeremy Hefner NYM 1.24 Carlos Villanueva CHC 2.06
C.J. Wilson LAA 1.24 Hisashi Iwakuma SEA 1.96

And for completeness,

Knuckleball

Pitcher Team TB/H
R.A. Dickey TOR 1.68

Combining all that data together, we get the following five pitches as the best in baseball so far.

Pitcher Team Pitch TB/H
Stephen Strasburg WSH Changeup 1.00
Clayton Kershaw LAD Curveball 1.00
Jason Hammel BAL Curveball 1.00
Tyson Ross SDP Slider 1.00
Matt Harvey NYM Changeup 1.06

Also, to complete the picture, here are the worst five pitches in baseball so far.

Pitcher Team Pitch TB/H
Lucas Harrell HOU Four-Seam 2.33
Homer Bailey CIN Curveball 2.33
Yu Darvish TEX Sinker 2.27
Jordan Zimmermann WSH Slider 2.24
John Danks CHW Changeup 2.21
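For anyone who wants to reproduce the two combined lists, the step is just a sort across every pitch type at once. Here is a small sketch using a subset of the rows from the tables above (the entries nearest each extreme, plus the lone knuckleball); it is not the full data set, and the variable names are my own.

```python
import heapq

# (pitcher, team, pitch, TB/H): a subset of rows copied from the tables above.
rows = [
    ("Stephen Strasburg", "WSH", "Changeup", 1.00),
    ("Matt Harvey", "NYM", "Changeup", 1.06),
    ("Gio Gonzalez", "WSH", "Changeup", 1.10),
    ("Clayton Kershaw", "LAD", "Curveball", 1.00),
    ("Jason Hammel", "BAL", "Curveball", 1.00),
    ("C.J. Wilson", "LAA", "Curveball", 1.07),
    ("Tyson Ross", "SDP", "Slider", 1.00),
    ("Brandon Cumpton", "PIT", "Sinker", 1.10),
    ("Clay Buchholz", "BOS", "Cutter", 1.11),
    ("R.A. Dickey", "TOR", "Knuckleball", 1.68),
    ("Todd Redmond", "TOR", "Four-Seam", 2.20),
    ("John Danks", "CHW", "Changeup", 2.21),
    ("Jordan Zimmermann", "WSH", "Slider", 2.24),
    ("Yu Darvish", "TEX", "Sinker", 2.27),
    ("Homer Bailey", "CIN", "Curveball", 2.33),
    ("Lucas Harrell", "HOU", "Four-Seam", 2.33),
]


def tbh(row):
    """Sort key: the TB/H value in the last column."""
    return row[3]


best_five = heapq.nsmallest(5, rows, key=tbh)   # least damage per hit
worst_five = heapq.nlargest(5, rows, key=tbh)   # most damage per hit

for label, group in (("Best", best_five), ("Worst", worst_five)):
    print(label)
    for pitcher, team, pitch, value in group:
        print(f"  {pitcher} {team} {pitch} {value:.2f}")
```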

Analysis

As you can see, there are a lot of “good” pitchers that throw “lousy” pitches.  This metric is far from perfect.  For example, Yu Darvish appears in the bottom five in two different categories.  Does that mean Darvish should stop throwing his sinker and cutter?  No, it most certainly does not.  It just shows that when Darvish makes an (albeit rare) mistake with either pitch, hitters are mashing it.  I found this a fun exercise that yielded results that may not be the most meaningful but that are interesting for discussion nonetheless.


Major League Baseball Should be All Over the Quantified Self Movement

This post originally appeared in slightly different form on my blog: Biotech, Baseball, Big Data, Business, Biology…

Baseball players break down.  Their performances fluctuate.  As a group there are some interesting generalities with respect to how pitching, hitting and fielding change with age.  But the error bars are huge.  There are many things we still don’t know about baseball players, about why one prospect hits the ground running and another flames out.  And we also don’t know if there is any way to know, since the task of putting together the skills needed to play major league baseball may be one of the most complex of the major sports, and understanding complexity is hard.

But it seems worthwhile to give it a try.

The Mystery of the Missing Ligament

Let’s talk about R.A. Dickey for a minute.  Not because he’s a highly interesting human being, although he is.  And not because he’s a knuckleballer, which is fun and interesting due to rarity and the entertaining sight of six-foot athletes flailing at baseballs traveling with the flight path of a drunken small-nosed bat.  But rather because he was drafted in 1996 in the 1st round by the Texas Rangers, and only during his physical workup was it discovered that he was missing a key ligament in his arm.  The Ulnar Collateral Ligament (UCL) to be exact.  Without which, it is assumed, a pitcher cannot pitch.

Well, except that he did.  This shouldn’t be under-emphasized.  Pitching without a UCL is thought to be akin to trying to play tailback for the Seahawks without an Anterior Cruciate Ligament (ACL) in your knee.  And yet he pitched and pitched well for years without a UCL.  R.A. Dickey got his UCL replaced and then knocked around the major and minor leagues for several years, eventually learned how to throw a knuckleball, and now has pitched successfully in the majors for several years more.

A story like this illustrates two points.  One, we may be making assumptions that aren’t always supported by the data—for example, that the UCL is required for pitching.  And two, you can learn a lot just by looking and measuring.

Measure by Measure

What should be measured and how?  I think an area to look into might be the tools being developed now to support self-measurement.  The quantified-self movement has gained enough prominence that magazines like Newsweek are running profiles.  For people in the movement, the motivation for participation stems from a desire to better understand themselves; to have data that will give them a data-driven view of what is going on in their bodies and minds.  The goals are often better health, losing weight, tracking mood, athletic prowess, increasing the levels of good indicators and decreasing the levels of the bad.

One of the distinguishing elements of how this is being done is granularity.  Apps on a smartphone, portable electronic devices, and logging tools can capture data in intervals ranging from several times a day up to a more or less continuous stream.  Even tests and procedures that might normally be performed once a year at an annual physical become fair game for more frequent monitoring, as long as you have the money to pay for the testing.  The open question is whether collecting all of this data will reveal new insights.  Or, to put it graphically, if you tested a metric infrequently, and got this graph:

graph1

Would the result of more frequent testing look like this?

graph2

Or like this?

graph3

This example is borrowed from the site of Ginger IO, a company that is developing tools for continual measurements of health related metrics, among other things.

Baseball comes into this because I believe MLB teams are continually searching for new ways to gain an advantage in building a quality team.  You know, that extra 2%.  A baseball team has vast resources, and those resources are focused on getting the most out of the several hundred baseball players that comprise the major and minor league talent of the team.  There are trainers, and doctors, and team dieticians, and masseuses, and coaches.  What would it take to add an additional technological and analytical group dedicated to gathering data on the players and seeing whether any of this information provides additional retrospective or prospective insight into individual performance?

Here is where an enterprising team could probably reach out to a couple of different groups for help in setting this up.  One would be device and software manufacturers who are building tools in this space.  I’ve written before about EmotionSense and have also learned recently about GingerIO (HT to @Dshaywitz).  Another highly interested party would be the nearest medical school and those researchers looking into patient reported outcome (PRO) techniques and patient monitoring efforts.  If an MLB team doesn’t already have its own high-powered statistical analysis group (or even if it does), it could reach out to suppliers of software tools for analyzing large scale datasets and finding patterns, like Ayasdi or Google.

I could also see a viable group for a partnership being other professional sports teams.  Many MLB teams are in the same city as NFL, NBA, NHL, and/or MLS franchises.  To spread the investment costs and provide control groups for each other, it would be useful to collaborate with these other franchises to learn more about the effect of sports training in general.

A speculative area for data collection and analysis could be in genomics, transcriptomics and proteomics.  Michael Snyder of Stanford University has been demonstrating for some years now how a program of monitoring personal molecular information about one’s health, along with other more conventional measures, provides new insights into health and disease.

The metrics should also include the conventional.  Going back to the example of R.A. Dickey, wouldn’t it be useful to perform elbow and shoulder scans for every player on major and minor league rosters on at least a yearly basis?  So often in sports you hear the term “typical wear and tear” when describing an elbow or shoulder or knee.  My question is, how do you know it’s typical?  Until you have a large, well-defined baseline that you follow for years under the rigorous conditions that baseball players are subjected to, how can you know what real wear and tear is?  And if you did know, wouldn’t that help you in making decisions about training and protecting your own players, to say nothing of evaluating free agents?  One of the truisms of baseball is that every team knows more about their own players than anyone else, leading to information asymmetry in trading and signing.  It seems an imperative for each team to reduce or reverse that asymmetry if at all possible.

An additional area that personal monitoring could help in is understanding on-field performance.  I’ve already touched on how MLB could use various kinds of GPS and positioning sensors to more accurately measure defense, for example, so I won’t elaborate further except to point out Chip Kelly is bringing this approach to the Philadelphia Eagles, and it will be interesting to see if we get reports on the effectiveness of using GPS to monitor his NFL players’ movements.

Biological passports

Another benefit of building a baseline for different kinds of metrics in your team would be helping to detect the possibility of doping.  This seems to be in the news right now for some reason, so let me just say that if a team began collecting, analyzing and storing biological samples on a regular basis, this would help in detecting those who are taking performance-enhancing substances.  This isn’t a new idea; the World Anti-Doping Agency is advocating this approach already.  However, I think MLB could take it to a high level of rigor and quality.  Would this have to be negotiated?  Sure.  But there is probably no better time than now to see if such an agreement can be forged between the union and the MLB owners.

Essentially, by taking samples from enough players over time, as well as from healthy, age- and ethnicity-matched volunteers as a control group, an MLB team could build up a comprehensive profile of what normal is with respect to the known indicators of performance enhancement such as hematocrit levels, not just as an average, but on an individual basis.  With this kind of data, a rapid, unusual change in specific metabolites could provide grounds for more intensive investigation.  When an athlete comes up with a positive test, a standard argument has been that he or she has always had an unusually high level of the tested substance.  Well, you know, the only way to know that for sure is to have a record dating back years that demonstrates outlier status or not for that athlete and that test.  Continual sampling is almost certain to deter many would-be attempts to use performance-enhancing substances.
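To make the idea of an individual baseline concrete, here is a minimal sketch of how a new reading might be compared against a player's own history.  Everything in it is an assumption for illustration: the function name, the z-score approach, the threshold of 3, and the hematocrit readings are all made up, and this is not how any anti-doping body actually evaluates samples.

```python
from statistics import mean, stdev

def is_unusual(history, new_value, z_threshold=3.0):
    """Flag a reading that sits far outside a player's own past readings.

    history: the player's earlier readings (hypothetical values).
    z_threshold: arbitrary cut-off chosen for this sketch only.
    """
    if len(history) < 2:
        raise ValueError("need at least two past readings for a baseline")
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Every past reading was identical, so any change is a departure.
        return new_value != baseline
    z_score = abs(new_value - baseline) / spread
    return z_score > z_threshold

# Hypothetical hematocrit readings (percent) for one player over time
past_readings = [44.1, 45.0, 44.6, 45.3, 44.8]
flag_spike = is_unusual(past_readings, 50.2)   # far above this player's norm
flag_normal = is_unusual(past_readings, 45.1)  # within this player's norm
```

The point of the sketch is only that the comparison is against the individual's own record rather than a league-wide average; a flag would be grounds for a closer look, not a verdict.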

This would be invasive.  No doubt about it.  Which is why there should also be stringent controls on data and better maintenance of privacy than we’ve seen so far in the Biogenesis saga.  However, there is also probably no better time to negotiate these kinds of tests as baseball strives to clean its image again.

Too much data?

Of course, collecting all this data provides no guarantee of actually finding out something specifically useful and actionable for any given MLB team.  As Nate Silver has pointed out many times in his columns and book, given enough data you can find a correlation for almost anything.  However, one thing is certain: you can’t find new things when you don’t look, and trying to apply concepts of the quantified self to MLB teams will lead to a whole lot of cross-discipline interactions and innovative thinking, which a forward-looking team might be able to parlay into the next big market inefficiency in baseball.


Does Your Team Have a Winning Core? Profiling Sustainable Roster Construction

Thanks to an atrocious month of May, the 2013 Milwaukee Brewers were abruptly transformed from a fringe contender into a rebuilding baseball club.

Most people agree that the Brewers need to build a new core, but what does that mean? Many teams have young players in the midst of an above-average season, but that doesn’t necessarily translate to sustainable success for the roster as a whole. And the opinions expressed about so-called core players are usually subjective and not expressed in a way that allows direct comparisons between teams.

We could really use a metric to compare the rosters of teams that are developing potentially sustainable talent with those that aren’t. My effort to do this is called Core Wins, which summarizes the extent to which a team’s success is being driven by players most likely to constitute core talent, as opposed to players on their way out the door, probably in decline, or both.

To do this, we need to define what it means to be a core player, and specifically the factors by which we evaluate a core player’s respective contributions to the team.

The Core Player

In my view, core players do three things: (1) contribute significantly to their team’s success, (2) do so while under extended team control, and (3) do so at or before they reach their peak ages of likely productivity. Each of those attributes needs to be mathematically summarized to reduce these contributions to a measurable value.

The first factor is the easiest: a core player is expected to contribute, and to do so above what could be found in an entry-level minor-league call-up. A major league player’s ability to do so over the course of a season is commonly summarized in some version of the wins above replacement (WAR) metric, which attempts to combine the player’s batting, fielding, and if applicable, pitching contributions. A counting statistic also fits our needs best, since we are looking for aggregate contributions over the course of a single season. So, we’ll use WAR, as calculated by Fangraphs (fWAR).

The second factor, team control, is more complicated. Player control comes in two primary forms: (1) players under club control due to the terms of baseball’s collective bargaining agreement, and (2) players who have signed freely-negotiated contracts. The collective bargaining agreement keeps players under club control for at least six major league years. Free agent contracts range from one-year stop-gaps to those lasting a decade or longer. Most ballclubs are a collection of young players under sustained club control, long-term (and typically expensive) free agents, and stopgap players on value contracts. But teams with a sustainable core should be drawing significant production from players who will actually be around in future years. If too much production is coming from departing or declining players, the club is asking for trouble.

The third factor — player age — is less significant, but still important. Younger players are cheaper than older players, and thus easier to afford and keep around. Younger players are less frequently injured, meaning they will be in the lineup more often. Younger players who have not yet reached their peak production age will also probably continue to improve, whereas players beyond their peak age will probably decline.

However, age can be overemphasized. The primary advantage of youth — extended club control — is already being considered. Moreover, mature players signed to long-term contracts tend to be some of the most valuable players in the game — Joey Votto, Felix Hernandez, and their peers. And while prospects are important, most ballclubs would strongly prefer Joey Votto over a 22-year-old prospect who may, but probably won’t, someday turn into Joey Votto. So while age matters, it is not as important as control.

So to summarize: we need to weigh player value, but do it in a way that primarily emphasizes team control while still placing some value on a player’s age.

Method

Player Contributions

All WAR figures were drawn from Fangraphs. The figures for batting fWAR (which incorporates fielding) and pitching fWAR were combined into one spreadsheet for each team year. When a player generated values for both batting (plus fielding) and pitching WAR, those values were summed, including the effect of any negative values. Once a net value was obtained for all players on a team roster for the year, all zero or net negative WAR values were disregarded.
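As a rough sketch of the netting step just described, the function below sums batting (plus fielding) and pitching fWAR per player and then drops anyone whose net value is zero or negative. The function name and the sample figures are hypothetical, chosen only to show the mechanics.

```python
def net_fwar(batting_fwar, pitching_fwar):
    """Combine batting (incl. fielding) and pitching fWAR for each player.

    Negative components count toward the sum; players whose net value
    is zero or below are then disregarded.
    """
    players = set(batting_fwar) | set(pitching_fwar)
    totals = {
        name: batting_fwar.get(name, 0.0) + pitching_fwar.get(name, 0.0)
        for name in players
    }
    return {name: war for name, war in totals.items() if war > 0}

# Made-up figures for illustration only
batting = {"Player A": 2.0, "Player B": 0.3, "Player C": -0.1}
pitching = {"Player B": -0.5, "Player C": 1.4}
roster = net_fwar(batting, pitching)
# Player B nets to about -0.2 and is dropped; Player C nets to about 1.3.
```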

Player Control Index

Player control numbers were drawn primarily from Cot’s Contracts, and cross-checked with Baseball Reference, other sources, and common sense as needed. Cot’s provides individual player contract data from 2009 onward, so only data from 2009 through 2012 was used. Control years were weighted identically, regardless of whether they arose from the CBA or a free agent contract. A player subject to a club option was considered to be under club control for that year. The author’s best estimate of remaining club control was necessary in a few cases when contract details were unclear, but not surprisingly, most of those players were fringe contributors that would not constitute core talent anyway.

A player was assigned one control year if his contract expired after the current season, two control years if his contract expired after the following season, and so on. For practical reasons — including the frequent shuffling from the minors experienced by young players, and the oft-diminishing returns of the longest contracts — the maximum number of control years considered for a player was 5. A Control Index was then calculated for each player in each roster year, with the number of control years as numerator, and an assigned denominator of 2 — for the minimum years that would constitute extended organizational control. So, for example, a player with an expiring contract would have a Control Index of 0.5 (1 season left divided by 2), and a typical player in their final pre-arbitration year would have a Control Index of 2.0 (4 seasons of control divided by 2). The maximum Control Index is 2.5.
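The Control Index described above can be sketched in a few lines. The constant and function names are my own labels for this illustration; the cap of 5 years and the denominator of 2 come from the description.

```python
MAX_CONTROL_YEARS = 5       # longest control period considered
EXTENDED_CONTROL_YEARS = 2  # minimum years that count as extended control

def control_index(control_years):
    """Control years (capped at 5) divided by the 2-year benchmark."""
    capped = min(control_years, MAX_CONTROL_YEARS)
    return capped / EXTENDED_CONTROL_YEARS

expiring = control_index(1)      # contract ends after this season
pre_arb = control_index(4)       # final pre-arbitration year
long_contract = control_index(8) # capped at 5 years before dividing
```

So an expiring contract yields 0.5, four seasons of control yields 2.0, and anything at or beyond five years yields the maximum of 2.5.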

Age Index

A player’s “baseball age” — their age on July 1 of a given season — was drawn from Fangraphs. An Age Index was then calculated for each player using an assigned value for a typical peak performance age as the numerator and the player’s baseball age for each season as the denominator. There has been some debate on the overall peak performance age for ball players, but, taking a strong hint from one of my reviewers, I used 27. To give some sense of the value range, the Age Index in 2012 for Mike Trout would have been 1.35 (27/20) and for Livan Hernandez would have been 0.73 (27/37).
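In code, the Age Index is a single division. This is a minimal sketch with hypothetical names; the peak age of 27 is the value assumed in the text.

```python
PEAK_AGE = 27  # assumed typical peak performance age

def age_index(baseball_age, peak_age=PEAK_AGE):
    """Peak age divided by the player's age on July 1 of the season."""
    return peak_age / baseball_age

young = round(age_index(20), 2)    # the Mike Trout example, age 20
veteran = round(age_index(37), 2)  # the Livan Hernandez example, age 37
```

Players younger than 27 get an index above 1.0 and players older than 27 get one below 1.0, which is what makes the index a mild bonus or discount.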

Determining Core Win Value

In my formula, Core Win value is a weighting exercise. To calculate a player’s Core Win value to a roster, I multiplied the player’s net fWAR for each season by the Control Index and the Age Index. The Control Index has a greater range (0.5 to 2.5) and thus a greater potential weight than the Age Index, which seems appropriate for the reasons stated above. The combined effect of these indices means young prospects that produce at a level of 2 fWAR or higher are weighted the most heavily. This makes sense: players who promptly adjust to the difficulty of the major leagues, yet still have years of probable improvement ahead of them, all while under extended team control, are those most likely to constitute a sustainable core of talent for the ballclub.
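Putting the three pieces together, a Core Win value is just the product of net fWAR, the Control Index, and the Age Index. The sketch below uses hypothetical function and parameter names and rounds to a whole number, as the tables in this article appear to do; the two sample inputs are taken from the Rays tables.

```python
def core_wins(fwar, control_years, baseball_age, peak_age=27):
    """Net fWAR weighted by the Control Index and the Age Index."""
    control_index = min(control_years, 5) / 2
    age_index = peak_age / baseball_age
    return fwar * control_index * age_index

# 2009 Evan Longoria: 7.5 fWAR, age 23, 5 control years
longoria_2009 = round(core_wins(7.5, 5, 23))
# 2010 Carl Crawford: 7.4 fWAR, age 28, 1 control year
crawford_2010 = round(core_wins(7.4, 1, 28))
```

The contrast between the two is the whole point of the weighting: nearly identical fWAR, but the expiring contract cuts Crawford's value to a fraction of Longoria's.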

Discussion

Now that we have a formula for Core Win Value, we need to decide what it means to have a winning core. That cut-off is ultimately in the eye of the beholder, but I looked to the gold standard: the Tampa Bay Rays. The Rays are widely acclaimed for their ability to acquire and maintain control of young talent, often through early buy-outs of free agent years, combined with club options that retain team flexibility. This has been particularly true over the years covered by this study: 2009 through 2012.

To provide some contrast with the Rays, we will also consider the roster construction during that same time period of the New York Mets and the Oakland Athletics.

The Gold Standard: The Rays

Not surprisingly, the Core Wins formula likes the Rays very much. Indeed, three characteristics of the Rays between 2009 and 2012 suggest a working definition of a team with a strong, sustainable core: (1) the Rays consistently feature five or more players producing a Core Win Value of 5 or higher per season, which is my working definition of a “Core Player”; (2) they have accomplished this feat in multiple consecutive years (all four years I studied, in fact) and (3) at least two of these Core Players were usually pitchers.

Let’s start with 2009. For ease of viewing, in each of these tables, I’ve bolded wins figures for potential Core Players (five or more Core Wins). I’ve also italicized the names of pitchers who cross the Core Wins threshold, to distinguish them from position players.

2009 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Evan Longoria 7.5 23 5 2.50 1.17 22
Ben Zobrist 8.5 28 5 2.50 0.96 20
James Shields 3.5 27 5 2.50 1.00 9
Matt Garza 2.9 25 5 2.50 1.08 8
Jason Bartlett 5.3 29 3 1.50 0.93 7
Carl Crawford 5.6 27 2 1.00 1.00 6
B.J. Upton 2.1 24 4 2.00 1.13 5
David Price 1.3 23 5 2.50 1.17 4

In 2009, the Rays won 84 games, featuring seven players that delivered 5 Core Wins or more. This depth, plus MVP-level performances from Evan Longoria and Ben Zobrist, prepared the Rays for the eventual departure of Carl Crawford, whose dwindling team control was removing him from the team’s core. Note that the team’s two best pitchers in 2009, James Shields and Matt Garza, were both under team control for 5 more years. David Price generated only 1.3 fWAR in 2009, and thus barely missed the Core Wins cut, but he was on the upswing.
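For readers who want to reproduce the roster count, here is a sketch that recomputes Core Wins from the 2009 Rays table and applies the threshold of 5. It assumes the threshold is applied to the rounded value, which is how the table appears to treat B.J. Upton, and it recomputes the indices from age and control years rather than using the rounded table columns. The names and data layout are my own.

```python
CORE_THRESHOLD = 5

def core_wins(fwar, control_years, baseball_age, peak_age=27):
    """Net fWAR weighted by the Control Index and the Age Index."""
    return fwar * (min(control_years, 5) / 2) * (peak_age / baseball_age)

# (name, fWAR, age, control years, is_pitcher) from the 2009 table
rays_2009 = [
    ("Evan Longoria", 7.5, 23, 5, False),
    ("Ben Zobrist", 8.5, 28, 5, False),
    ("James Shields", 3.5, 27, 5, True),
    ("Matt Garza", 2.9, 25, 5, True),
    ("Jason Bartlett", 5.3, 29, 3, False),
    ("Carl Crawford", 5.6, 27, 2, False),
    ("B.J. Upton", 2.1, 24, 4, False),
    ("David Price", 1.3, 23, 5, True),
]

core_players = [
    (name, is_pitcher)
    for name, fwar, age, years, is_pitcher in rays_2009
    if round(core_wins(fwar, years, age)) >= CORE_THRESHOLD
]
core_count = len(core_players)
core_pitchers = sum(1 for _, is_pitcher in core_players if is_pitcher)
```

This yields seven Core Players, two of them pitchers, with David Price falling just short.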

2010 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Evan Longoria 7.6 24 5 2.50 1.13 21
David Price 3.9 24 5 2.50 1.13 11
Ben Zobrist 3.7 29 5 2.50 0.93 9
B.J. Upton 3.8 25 3 1.50 1.08 6
John Jaso 2.3 26 5 2.50 1.04 6
Sean Rodriguez 2.1 25 5 2.50 1.08 6
Matt Joyce 1.7 25 5 2.50 1.08 5
James Shields 1.7 28 5 2.50 0.96 4
Carl Crawford 7.4 28 1 0.50 0.96 4
Matt Garza 1.5 26 4 2.00 1.04 3

In 2010, the Rays maintained 7 players at a Core Win level of 5 or more, culminating in 96 team wins and a first-place finish in the AL East. Only one pitcher (David Price) made the Core Win cut-off of 5 this time, but James Shields just missed it. Matt Garza regressed a bit (and was promptly traded to the Cubs for more prospects, without any negative effect). Carl Crawford, despite an MVP-level year of 7.4 fWAR, is discounted out of the team core by the Core Wins formula, due to his team control ending that year.

2011 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Ben Zobrist 6.2 30 5 2.50 0.90 14
Evan Longoria 6.2 25 4 2.00 1.08 13
David Price 4.3 25 5 2.50 1.08 12
Matt Joyce 3.5 26 5 2.50 1.04 9
James Shields 4.4 29 4 2.00 0.93 8
Desmond Jennings 2.3 24 5 2.50 1.13 6

2011 featured more of the same. Carl Crawford was gone, but the Rays did not miss him, as the formula anticipated. Six Rays met the Core Win threshold, two of them pitchers (Price, Shields). Superstar contributions by Zobrist and Longoria, combined with ascending contributions from four others — including Price and Shields — resulted in a highly-successful season from Tampa Bay’s controlled talent, and others. The Rays won 91 games and made a wild-card playoff appearance.

2012 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Ben Zobrist 5.8 31 4 2.00 0.87 10
David Price 4.8 26 4 2.00 1.04 10
Desmond Jennings 3.3 25 5 2.50 1.08 9
Matt Moore 2.4 23 5 2.50 1.17 7
Evan Longoria 2.5 26 5 2.50 1.04 6
Alex Cobb 2.0 24 5 2.50 1.13 6
Jake McGee 2.0 25 5 2.50 1.08 5
James Shields 3.9 30 3 1.50 0.90 5

By 2012, the Rays had developed an astonishing eight players that crossed our Core Win threshold. An incredible five of these players — over half the team’s core, under our formula — were starting pitchers with at least four years of team control remaining. This means that the Rays’ entire starting rotation was under long-term control. Despite a hamstring injury that kept him out for over three months, Evan Longoria still contributed 2.5 fWAR to the effort, and his new contract provided the team with the long-term control to keep him in the team’s core. The 2012 Rays won 90 games: not enough for even a wildcard in the American League that year, but a terrific season nonetheless.

Before the 2013 season, the Rays dealt James Shields to Kansas City for the bat of Wil Myers and other prospects. As of the publication of this article, Fangraphs projects them to win 93 games in 2013, on a payroll of only $62 million. In sum, the Rays have been, and continue to be, the prototypical team that demonstrates what it means to have a sustainable core of controlled talent.

By Stark Contrast, the New York Mets

The Mets have been bad for years, and the Core Wins formula identifies major flaws in roster construction as a possible culprit.

2009 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
David Wright 3.4 26 5 2.50 1.04 9
Johan Santana 3.2 30 5 2.50 0.90 7
Angel Pagan 2.8 27 4 2.00 1.00 6

Dreadful: there is no other way to describe the 2009 Mets. That year, the Mets spent $140 million for 70 team wins, generating only three Core Players under our formula. Even those players gave only ok performances. From a Core Wins perspective, this roster was terrible. One of the three players to meet the Core Wins threshold, and the only starting pitcher — Johan Santana — is heading past his probable prime.

2010 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
Ike Davis 3.1 23 5 2.50 1.17 9
Johan Santana 3.6 31 5 2.50 0.87 8
Angel Pagan 5.1 28 3 1.50 0.96 7
David Wright 3.5 27 4 2.00 1.00 7
Jon Niese 2.1 23 5 2.50 1.17 6
Mike Pelfrey 2.2 26 4 2.00 1.04 5

The results for the Mets weren’t much better in 2010 — 79 wins — but their roster at least improved. Six players made Core Player-type contributions, and two of those players were starting pitchers. If these performances proved to be sustainable over multiple years, or at least into 2011, the Mets had some reason for optimism.

2011 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
Daniel Murphy 2.8 26 5 2.50 1.04 7
Jon Niese 2.1 24 5 2.50 1.13 6
Ruben Tejada 1.6 21 5 2.50 1.29 5
Ike Davis 1.3 24 5 2.50 1.13 4
Jose Reyes 5.8 28 1 0.50 0.96 3
David Wright 1.7 28 3 1.50 0.96 3

But it didn’t work out. In 2011, the Mets were right back to a pathetic three Core Player performances, with only one starting pitcher among them. In fact, the Mets’ strongest core performance in 2011 came from 2.8-win Daniel Murphy. Not good. Ike Davis promptly regressed out of the core, David Wright fought injuries, and Johan Santana didn’t play all year, which is why Core Wins discounts the value of aging players. Although Jose Reyes provided a superstar WAR of 5.8 and a batting title, as a departing free agent, that performance provided no ongoing value to the team, and the Core Wins formula discounts it accordingly. It all amounted to 77 wins, and low expectations for the following season.

2012 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
Jon Niese 2.7 25 5 2.50 1.08 7
David Wright 7.4 29 2 1.00 0.93 7
Ruben Tejada 1.7 22 5 2.50 1.23 5
Matt Harvey 1.5 23 5 2.50 1.17 4
R.A. Dickey 4.4 37 2 1.00 0.73 3

Validating this expectation, the 2012 Mets did even worse, winning only 74 games. Only three players could pass the Core Wins threshold, and one of their best players — R.A. Dickey — could not even qualify as a Core Player, despite 4.4 fWAR. The Core Wins formula discounts the going-forward value of 37-year-old performances, and Dickey’s 2013 performance with the Blue Jays has validated that skepticism.

But, the Mets get enough bad news, so let’s focus on some positive aspects. In 2012, David Wright performed at an MVP level. And while the Mets had only four Core Win players that year, two of them are starting pitchers, which is an important positive from our study of the Rays. In fact, one starter, Jon Niese, was signed to an early long-term contract, a very Rays thing to do, putting a competent starter under extended team control. Matt Harvey also looks to be a championship-caliber ace, and remains under maximum team control.

So far, 2013 is not being kind to the Mets either — Fangraphs currently projects them to finish with 76 wins — but there are hints that things may soon be looking up, particularly if their farm system can continue to develop strong rotation talent, as many project that it will.

Trending in the Right Direction: The Oakland Athletics

Finally, let’s conclude with what turns out to be a Goldilocks example: the team that, like the Mets, tried and failed to improve its core, but stuck with it and seems to have gotten the hang of it lately: the Oakland Athletics.

2009 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Brett Anderson 3.6 21 5 2.50 1.29 12
Ryan Sweeney 3.9 24 5 2.50 1.13 11
Rajai Davis 3.7 28 5 2.50 0.96 9
Kurt Suzuki 3.1 25 5 2.50 1.08 8
Dallas Braden 2.7 25 5 2.50 1.08 7
Andrew Bailey 2.3 25 5 2.50 1.08 6

In terms of roster-building, the 2009 Athletics took a fairly solid approach: they ended up with six potential Core Players, and three of them are starting pitchers. All these players offered at least five years of team control. However, the 2009 Athletics also underscore that just because your wins are coming from the right place does not mean you are getting enough of them. The best performance in this group is still only 3.9 fWAR — good, not great. The 2009 Athletics won only 74 games, although at least they didn’t have to pay Mets prices to get there.

2010 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Daric Barton 4.8 24 5 2.50 1.13 14
Cliff Pennington 3.4 26 5 2.50 1.04 9
Gio Gonzalez 2.9 24 5 2.50 1.13 8
Brett Anderson 2.4 22 5 2.50 1.23 7
Dallas Braden 3.3 26 4 2.00 1.04 7
Trevor Cahill 1.6 22 5 2.50 1.23 5

In 2010, the Athletics were better. Leveraging some of the previous year’s young talent, they ended up 81-81. There were six core-type player performances, and four of them pitchers: ordinarily, a good thing. But notably, there was not a significant amount of improvement from 2009’s core contributors. In fact, the strongest core contributors in 2010, Daric Barton and Cliff Pennington, were marginal contributors the year before, raising the possibility of fluke performances. And, only two core performances came from position players, which didn’t leave much room for error going forward in the scoring department. So, the 2010 Athletics showed hints of a developing core, but a fragile one.

2011 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Gio Gonzalez 3.2 25 5 2.50 1.08 9
Jemile Weeks 1.7 24 5 2.50 1.13 5
Trevor Cahill 2 23 4 2.00 1.17 5

And indeed it was. The Athletics rotation was devastated by injuries in 2011: Dallas Braden needed shoulder surgery, and Brett Anderson needed Tommy John surgery. That would be a tough blow for any team, but particularly for Oakland, which did not have much behind them. What was left of the rotation (and roster) collapsed to three core-type players. The two core bats of consequence in 2010, Daric Barton and Cliff Pennington, immediately regressed and revealed themselves to be one-year wonders. The only developing bat remaining was an average, but unspectacular debut by Jemile Weeks, whose own performance later proved unsustainable.

Although two out of the three core players were starting pitchers, there was little to support it. Brandon McCarthy actually had a very good year (4.5 fWAR), but since he was completing a 1-year-deal at the time, he offered the A’s no core value.

Things looked bleak. Fortunately, the A’s stuck to their guns and kept developing young talent. Then, 2012 happened.

2012 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Josh Reddick 4.5 25 5 2.50 1.08 12
Jarrod Parker 3.4 23 5 2.50 1.17 10
Tommy Milone 2.8 25 5 2.50 1.08 8
Yoenis Cespedes 2.9 26 4 2.00 1.04 6
Brandon Moss 2.3 28 5 2.50 0.96 6
Sean Doolittle 1.6 25 5 2.50 1.08 4

2012 found the Athletics again having restocked their core, this time with a balance of bats and pitching talent. Five core players are represented, and their values are not all projection, either: Josh Reddick produced 4.5 fWAR, Jarrod Parker generated 3.4 fWAR, and two other controlled players produced close to 3 fWAR. Two core players are starting pitchers. Furthermore, in 2012, the A’s finally enjoyed a little luck. They outplayed their Pythagorean expectation by a few wins, got 2+ win performances from non-core starters on short-term deals — Brandon McCarthy and Bartolo Colon — and ended up with 94 wins and an AL West title, on top of what appeared to be a developing core.

If you thought that the Athletics were finally getting the hang of this roster-building thing, you may be right. The Athletics have spent much of 2013 on top of the AL West, and Fangraphs currently projects them to finish with 91 wins — on a budget of $62 million. A very Rays-like experience all around, which corresponds with quality roster construction.

Conclusion

The Core Wins metric profiles the extent to which team performances are being delivered by so-called Core Players, and also tracks the progression of players in and out of the club’s core over time. Even herculean performances by impending free agents (see Carl Crawford, 2010) tend to wash out of the metric, while young players who initially impress, but fail to sustain (see Ike Davis, 2011) also fall out of the measured core, despite their built-in advantages of youth and team control. As such, Core Wins strikes me as useful and, if nothing else, an improvement over the prevailing practice of eyeballing the roster and cherry-picking performances by younger players.

Because it is based on WAR (a counting statistic), Core Wins is primarily backward-looking. But, the general method can also be used prospectively. For example, if you input projections from your preferred player projection system, you could forecast the extent to which your team is likely to get future contributions from sustainable sources — a useful thing to know when deciding between trades, farm system call-ups, or free agent signings. Similarly, if you want to focus on particular positions of concern (third base, starting rotation) or skill sets (batter OBP, pitcher FIP), you can adjust the Age Index to account for the peak performance ages corresponding with those particular positions or skills. Those analyses can be retrospective or prospective.
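A prospective version needs only two changes to the calculation: feed in a projected WAR instead of an actual one, and let the peak age vary. The sketch below shows one way that might look. The projected WAR figure and the alternative peak age of 29 are placeholders invented for the example, not values from any projection system or aging study.

```python
def projected_core_wins(projected_war, control_years, baseball_age, peak_age=27):
    """Core Wins computed from a projection, with an adjustable peak age."""
    control_index = min(control_years, 5) / 2
    age_index = peak_age / baseball_age
    return projected_war * control_index * age_index

# Hypothetical 25-year-old projected for 3.0 WAR with 5 control years
default_peak = projected_core_wins(3.0, 5, 25)              # peak age 27
later_peak = projected_core_wins(3.0, 5, 25, peak_age=29)   # placeholder peak
```

Raising the assumed peak age increases the weight on the same projection (about 8.7 versus 8.1 here), which is the lever you would pull for a position or skill believed to peak later.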

Of course, superior roster construction does not guarantee superior performance, as the Oakland A’s can attest. Previously healthy players can be felled by injury, and promising talents too often fail to sustain early achievements. But in general, developing Core Players makes good sense, and certainly seems to be delivering results for the league’s most efficient ballclubs. So if your favorite team seems incapable of stacking success, you might check to see how good of a job the front office has been doing in generating Core Wins.

Special thanks to Paul Noonan and Tom Tango, who both offered helpful comments on the general direction of this article. All errors are entirely my own, including some table pasting errors in the original version. Thanks to Andrew Yuskaitis for pointing those out. They have now been corrected.


Trade Ian Kinsler

The 2014 Rangers have an interesting predicament.  The same predicament they currently have, but it will be more pronounced, more necessary to solve in the off-season.

They have two shortstops and a second baseman.  One shortstop, Elvis Andrus, is locked up for a long, long time.  And the other shortstop, Jurickson Profar, is most likely going to move over to second base permanently, giving the Rangers what should be  a very good and young middle infield, for many years.

I’m assuming the Rangers keep Profar at 2b, rather than move him to the outfield or trade him for another top prospect.  It would mean Ian Kinsler either must change positions, or more logically, be traded to a team who will value him more highly since he can man second base for said team.

Kinsler has been on a decline the past few years, whether due to injury, diminishing skills, or a combination of both.  For example: the league-average wOBA in the American League this season is .318.  Over the past two seasons, Kinsler’s wOBAs have been .327 and .330, respectively.  The .330 has come over the course of 97 games in 2013, so he has some room to improve upon that.  But there is only so much he can do with only a month and a half of the season remaining.

Best option for 2014?  Trade Ian Kinsler.  There are certainly obstacles.  He is going to turn 32 next season.  He, as I mentioned, isn’t hitting like he used to hit, as just two years ago, he posted a 7-WAR season with a .364 wOBA.  He is guaranteed four more seasons, and $62 million on his current contract (including the 2018 option which has a $5 million buyout).  So most teams will be wary of committing that kind of money to a player who is past his prime, and probably past the point of “good” nowadays.  Above-average, maybe.  But I can’t see Kinsler being worth much more than 3 wins in a season moving forward, and he might be worth even less than that.

There is one team that could use a 2B next season though, and has a fairly new obsession with throwing around money: the Los Angeles Dodgers.  Mark Ellis has a $1 million buyout on his 2014 option and is going to be turning 37 next summer.  There is no doubt that Ian Kinsler will be an upgrade at 2B for the Dodgers over Ellis (And at $5 million, Ellis might even be worth a utility role).  If the Dodgers don’t bring home a championship this season after spending an absurd amount of money in 2013 (and beyond), there will be even more pressure to win next year.

That is where the Dodgers could come in, potentially accepting all or most of the remaining Ian Kinsler money without having to give up much.  Maybe a prospect with some upside.  But they definitely won’t have to surrender a bona fide prospect of any kind.

The Rangers COULD decide to just move Kinsler to 1B or a corner outfield spot.  But a .330-ish wOBA at first base would be below the league average at the position.  And even though .330 would be a little above average in left or right field, he would be learning a new position.  That might not go well.  There is a not-minuscule chance Ian Kinsler is a below-average player in 2014 if he is moved off of 2B, especially if it is to the outfield.

The Rangers would probably be just as good bringing back David Murphy as one of the outfielders, rather than moving Ian Kinsler out there.  Murphy is a solid defender, and even though he’s been terrible at the plate in 2013, he should be very cheap next year and regress back closer to his normal offensive numbers.

The other outfield spot could be solved with a platoon, potentially a minor leaguer, depending on who is ready (if anyone), a stop-gap, maybe even Nelson Cruz.  Although, knowing that Cruz was just suspended, I would simply let him walk.

They can solve their outfield situation in a better manner than using Ian Kinsler to fill one of the two voids.

And they can find a 1B for a year that’ll hit like Kinsler probably will in 2014.

Overall, the best bet for the Rangers is to move on from Kinsler, assuming there is a team that wants or needs a 2B badly enough.

 


CarGo and the Value of Plate Discipline

Note: I have no idea if I’m the first to do this, but quite frankly I don’t care.

As you’ve probably grown sick of hearing¹, Brewers center fielder Carlos Gomez is undeterred by his team’s general shittiness and is having a terrific season–his 5.7 WAR² is a very close third in the NL, and one of the players he’s behind may be out for a while. While he’s always been an excellent defender (50.8 career UZR prior to this season), his bat has never been particularly good (his career-best wRC+ prior to this season was 105, last year).

A massive improvement on offense has been the driving force behind his MVP-type numbers, as his wRC+ of 133 this year sits at 16th in the NL; this can be attributed to an increase in power (.235 ISO, compared to .150 for his career) and an uptick in BABIP (.350, compared to .311 for his career). Many of the articles listed in the first footnote cite these as reasons behind his improvement. One element of his game that has not improved, however, and is getting startlingly little coverage from the media, is his plate discipline; his walk and strikeout rates sit at 6% and 25.1%, respectively, meaning his BB/K of 0.24 is 6th-worst in the NL.

Now, should Gomez end up leading the league³ in WAR with that kind of plate discipline, how revolutionary would that be? I decided to find out. I looked up every NL WAR leader going back to 1910 (when strikeouts for batters⁴ were first recorded) and recorded their strikeouts and walks, then calculated each batter’s K/BB⁵ and ranked them from highest to lowest (i.e. worst to best); the top 10 are listed below.

Year Player K BB K/BB
2013 Carlos Gomez* 144 34 4.24
1988 Andy Van Slyke 126 57 2.21
2011 Matt Kemp 159 74 2.15
2012 Ryan Braun 128 63 2.03
1984 Ryne Sandberg 101 52 1.94
1971 Willie Stargell 154 83 1.86
2005 Andruw Jones 112 64 1.75
1970 Tony Perez 134 83 1.61
1978 Dave Parker 92 57 1.61
1941 Pete Reiser 71 46 1.54
*ZiPS Projection

The average K/BB was 0.85, meaning Gomez’s⁶ is nearly 400% worse.

Now, any fan of baseball, sabermetrically inclined or otherwise, knows that this year (and in recent years), plate discipline has been at an all-time low. Knowing this, I decided to measure each player differently. I gathered up all the league-average K/BB’s for every year going back to 1910, then divided each WAR leader’s K/BB by the league-average K/BB for the respective year and multiplied by 100 to create K/BB-, in the style of ERA- (100 is league average, and higher is worse). I then ranked each batter’s K/BB- from highest to lowest (i.e. worst to best); the top 10 are listed below.

Year Player K/BB lgK/BB K/BB-
2013 Carlos Gomez* 4.24 2.51 169
1941 Pete Reiser 1.54 0.99 156
1988 Andy Van Slyke 2.21 1.8 123
1984 Ryne Sandberg 1.94 1.69 115
1937 Joe Medwick 1.22 1.07 114
1971 Willie Stargell 1.86 1.67 111
1978 Dave Parker 1.61 1.48 109
1970 Tony Perez 1.61 1.63 99
1926 Hack Wilson 0.88 0.89 99
2011 Matt Kemp 2.15 2.3 93
*ZiPS Projection

The average K/BB- was 58, meaning Gomez’s was almost 200% worse.
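The K/BB- calculation can be sketched in a few lines. This is only an illustration: the function name `kbb_minus` is made up here, and the scaling by 100 is inferred from the ERA- comparison and the values in the table. The strikeout, walk, and league-average figures are copied from the two tables above.

```python
def kbb_minus(strikeouts, walks, league_kbb):
    """K/BB-: a hitter's K/BB as a percentage of the league-average K/BB.

    100 means league average; higher means worse plate discipline.
    """
    kbb = strikeouts / walks
    return round(100 * kbb / league_kbb)


# (label, K, BB, league-average K/BB), taken from the tables above.
leaders = [
    ("2013 Carlos Gomez (ZiPS)", 144, 34, 2.51),
    ("1941 Pete Reiser", 71, 46, 0.99),
    ("1988 Andy Van Slyke", 126, 57, 1.80),
    ("2011 Matt Kemp", 159, 74, 2.30),
]

for label, k, bb, lg in leaders:
    print(label, kbb_minus(k, bb, lg))
```

Working from raw strikeouts and walks rather than the rounded K/BB column gives the same K/BB- values as the table for these four rows.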

The closest match to Gomez’s season (at least in terms of plate discipline) was Pete Reiser in 1941. That year, his K/BB was a very solid (by our standards) 1.54, but the league-average was below 1, meaning he was actually pretty bad by league-adjusted standards.⁷

Even when we adjust for the era, Gomez’s plate discipline is historically bad. People may argue about the value of plate discipline to a hitter, but you can’t dispute the facts: the average K/BB for a WAR leader is 42% better than league-average, and Gomez’s is 69% worse than league average, and yet he is contending for the WAR lead.

So, what does this mean? Obviously, as I mentioned in the introduction, a large part of Gomez’s value comes from his defense, and thus his offense is probably behind that of many others on this list. Gomez’s season has come out of nowhere, at least to some degree, meaning that it may be a fluke; for that to be determined, we’ll just have to wait and see. Though Brewers fans may be discouraged to hear it, history suggests it probably is.

——————————————————————————————————————————-

¹You know, from here, and here, and here, and here, and here, and here, and here. Also, I’ve now started doing footnotes a la Grantland, although there isn’t any linking yet.

²All stats are as of Tuesday, August 13th, in case this takes some time to get published.

³I’m really getting sick of people using “the league” to refer to MLB as a whole; it’s misleading and it’s wrong. This isn’t the NFL–there are two leagues, not one. When you’re referring to MLB, say “the majors”, not “the league”.

⁴Strikeouts for pitchers go back all the way to 1876 (i.e. when all pitcher stats go back to). Why’d it take 34 years to record strikeouts for batters?

⁵I’ve always hated BB/K–it returns numbers that are much too minuscule. I prefer the larger form of K/BB.

⁶Is that correct, or should there be no “s”?

⁷Reiser’s success that year–166 wRC+–was mainly motivated by a .377 BABIP, 97 points higher than the MLB average that year, and by far the highest of his career.


The Most Predictable Hitters of 2013

I was watching the Twins game a few weeks ago when veteran Jamey Carroll effortlessly took an outside pitch to right field, as one might hope he would. The announcers were quick to praise his ability to “go with the pitch”. I’ve seen this play out time after time, often followed by praise for “going with the pitch” and “not trying to do too much”. That got me thinking, do some hitters go with the pitch better than others? Is this a desirable skill or does it leave the hitter vulnerable? Can a defense exploit this trait with a defensive shift much like we see shifts on straight pull hitters?

To dive into this I captured the angle of each hit ball since 2010 and compared it against the angle at which I expected the pitch to be hit. For example, an inside pitch on a right-handed batter could be expected to be hit near the left field line, while an outside pitch could be expected to be hit near the right field line. Everything in between would be evenly spread across the field, relative to the pitch’s location across the plate.

To make it a little more accurate for right-handed hitters vs left-handed hitters, I analyzed the actual pitch placement for pitches that become hit balls. As you can see below, all hitters prefer the ball just a touch on the outside part of the plate. I took two standard deviations of the hit pitches and considered that the spectrum that we’ll map to the field, with unique values for right or left handed hitters. We’ll call this our hit zone.
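To make the mapping concrete, here is a minimal sketch of how a pitch's horizontal location could be turned into an expected hit angle. It is not the code used for this post. The function name, the angle convention (0 degrees is the left field line, 90 is the right field line), the assumption that `px` increases from the third-base side to the first-base side of the plate, and the clamping of pitches outside the hit zone are all assumptions made for illustration. The hit zone is the mean plus or minus two standard deviations, with separate values supplied for right- and left-handed hitters.

```python
def expected_angle(px, zone_mean, zone_sd):
    """Map a pitch's horizontal location onto the 90-degree field.

    px        -- horizontal pitch location (assumed to increase toward first base)
    zone_mean -- mean location of pitches that were put in play
    zone_sd   -- standard deviation of those locations
    Returns degrees from the left field line (0) to the right field line (90).
    """
    low = zone_mean - 2 * zone_sd    # third-base edge of the hit zone
    high = zone_mean + 2 * zone_sd   # first-base edge of the hit zone
    fraction = (px - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))  # assumption: clamp pitches outside the zone
    return 90.0 * fraction


# Hypothetical hit zone centered on the middle of the plate.
print(expected_angle(0.0, zone_mean=0.0, zone_sd=0.5))   # middle of the zone
print(expected_angle(-1.0, zone_mean=0.0, zone_sd=0.5))  # third-base edge of the zone
```

Under this convention the same formula serves both kinds of hitter: a pitch on the third-base side is inside to a right-handed hitter and outside to a left-handed hitter, and in both cases going with the pitch sends it toward left field. Only `zone_mean` and `zone_sd` change with handedness.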

The players that made it to the top of the data below are the ones that tend to go with the pitch. That is, they take the outside pitch to the opposite field, they pull an inside pitch, and they take a pitch down the middle of the plate straight through center field. They are less random and more predictable.

With that, here are the most predictable hitters of 2013 through August 10th.

Batter | Average Absolute Angle Difference | Mean Angle Difference (Pull Tendency) | Standard Deviation | Hit Balls
Melky Cabrera | 17.77 | 2.59 | 22.10 | 291
Pete Kozma | 18.43 | 2.08 | 22.13 | 253
Marco Scutaro | 18.43 | 0.25 | 23.05 | 361
Everth Cabrera | 18.43 | -0.32 | 23.25 | 319
Chris Stewart | 18.76 | 6.81 | 21.97 | 182
Jamey Carroll | 19.06 | -2.93 | 23.02 | 153
Martin Prado | 19.17 | -5.35 | 23.90 | 392
Elvis Andrus | 19.19 | -4.59 | 24.42 | 387
Lorenzo Cain | 19.21 | 0.09 | 24.23 | 266

 

For comparison’s sake, here are the 10 least predictable hitters.

Batter | Average Absolute Angle Difference | Mean Angle Difference (Pull Tendency) | Standard Deviation | Hit Balls
Carlos Santana | 26.73 | 17.38 | 27.47 | 299
Howie Kendrick | 26.73 | -10.40 | 31.03 | 347
Juan Francisco | 26.74 | 7.08 | 31.39 | 169
Yasiel Puig | 27.08 | -1.71 | 32.62 | 167
Jimmy Rollins | 27.11 | 16.51 | 26.98 | 369
Ryan Flaherty | 27.12 | 14.14 | 28.81 | 143
Pedro Alvarez | 27.20 | 13.90 | 29.69 | 254
Ryan Howard | 27.42 | 7.23 | 32.52 | 197
Chris Young | 29.81 | 16.39 | 31.33 | 165
Chris Heisey | 31.02 | 19.52 | 30.40 | 119

Let’s explain this data before we go any further.

First off, the field is 90 degrees and thus, the values are all in degrees.

  • Average Absolute Angle Difference: If a pitch was on the inside of the plate to a right-handed hitter and was expected to be “properly” hit somewhere near the left field line, but was actually hit 20 degrees to the right of that expected spot, the difference is 20 degrees. This number is that difference, averaged across all hit balls.
  • Mean Angle Difference: Some balls are pulled relative to their expected spot, others are not. Pulled balls show up as a positive angle (for both left- and right-handed hitters), while negative angles indicate the batter was behind the pitch. The Average Absolute Angle Difference does not care which direction the miss went, while this metric does. A larger positive value here indicates a pull tendency, while a negative value indicates that a batter is more often than not behind the pitch. Batters whose values sit farther from 0 could be a little more predictable to pull or push.
  • Standard Deviation: This indicates how tightly a batter’s hit balls cluster around where you expect them to be; roughly ⅔ of them should land within this many degrees of the expected angle. For example, Chris Stewart has a standard deviation of 21.97 degrees. Given a very outside pitch that you’d expect to be hit down the right field line, you can expect that Stewart will usually hit that ball down the line, at most about 22 degrees to the left, or foul.
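The three metrics above can be sketched as follows. This is a rough illustration under stated assumptions, not the programs used for the post: the function names are hypothetical, angles run from 0 degrees (left field line) to 90 degrees (right field line), and the standard deviation is taken over the signed differences using the population formula, since the post does not say which variant was used.

```python
from statistics import mean, pstdev


def pull_differences(actual, expected, bats):
    """Signed angle differences in degrees; positive means the ball was pulled.

    For a right-handed hitter, pulling means hitting it closer to left field
    (a smaller angle) than expected. For a left-handed hitter it is the reverse.
    """
    sign = 1 if bats == "R" else -1
    return [sign * (e - a) for a, e in zip(actual, expected)]


def predictability(actual, expected, bats):
    """Summarize how closely a hitter's batted balls follow the expected angle."""
    diffs = pull_differences(actual, expected, bats)
    return {
        "avg_abs_diff": mean([abs(d) for d in diffs]),  # direction ignored
        "mean_diff": mean(diffs),                        # pull (+) or push (-) tendency
        "std_dev": pstdev(diffs),                        # assumption: population std dev
        "hit_balls": len(diffs),
    }


# Three made-up batted balls for a right-handed hitter.
print(predictability(actual=[10, 50, 80], expected=[20, 45, 70], bats="R"))
```

A hitter near the top of the first table would show a small `avg_abs_diff` together with a `mean_diff` near zero, while a dead-pull hitter would show a large positive `mean_diff`.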

Looking at the data back to 2010 I found these players continually near the top. It seems for them, they have always hit this way, and can be expected to continue to hit this way.

  • Marco Scutaro
  • Ryan Hanigan
  • Jamey Carroll
  • Denard Span
  • Elvis Andrus

Now, what can we do with this knowledge? Can a defense use the left-handed shift on a right-handed hitter? To look at this we’ll look at spray charts, but with a very important distinction from a standard spray chart – we’ll limit the hit balls to those hit on pitches on the outside of the hit zone.

I’ll start you off with a spray chart for someone not on our list – Jose Bautista. This chart shows where he hits outside pitches. He looks like a good spray hitter when you look at only the outside pitches. As a defense, you wouldn’t shift on Bautista AND pitch him outside.

Let’s move on to someone who was continually at the top of our list, Marco Scutaro. You’ll see Scutaro reliably hits balls on the outer third of the hit zone to the right side. He still hits a fair number of ground balls across the infield, so an infield shift wouldn’t be advised. But liners and fly balls in the outfield are heavily weighted to the right. Using a control pitcher, pitching on the outside ⅓ of the hit zone, you could reliably shade the outfield to right field.

The same applies for Jamey Carroll, another player who, like Scutaro, shows up on our list year after year.

Takeaways
I’ve found the tendency to push the ball on outside pitches to be much more predictable among our leaders than the tendency to pull the ball on inside pitches. There’s surely more to be gleaned from this data, but the outfield shift on these predictable push hitters is definitely the most interesting.

Data Collection & Mining Techniques
The metrics for all hitters, year-by-year back to 2010 can be found here: https://docs.google.com/spreadsheet/ccc?key=0AtERgAQ83pATdDItUzAxXzhMZm41cGFPRjgxOEdZa0E&usp=sharing

All of the data used in this post was loaded from MLB’s gameday servers into a MongoDB database using my atbat-mongodb project. This project is open source code that anybody can use, modify, contribute to, etc. Fork me please!
https://github.com/kruser/atbat-mongodb

The following programs were used to mine and plot the data from the mlbatbat MongoDB database.


The Tale of Two Drews

The Red Sox employed outfielder J.D. Drew from 2007-2011 and signed his brother, Stephen, to a one-year contract prior to the 2013 campaign. The Drews are ballplayers who go about their business in similar ways — they’d prefer to avoid the limelight and just hit the baseball. It’s an admirable quality, but not one that’s so cooperative with the Boston media or fans. For some inexplicable reason, Boston is enamored with players whose highs are raucous and whose lows are dismal. This was never the case with J.D., and doesn’t appear to be the case with Stephen, but the numbers say that they’re some of the best Sox contributors in recent history.

The Background

J.D. and Stephen were high profile prospects in their respective draft classes and both went to Florida State University.* Prior to signing with the Sox, the two had established themselves in the National League. The brothers, however, followed completely different paths to their contracts with the Boston Red Sox. In 2007, the Sox signed J.D. at the pinnacle of his career to a 5-year, $70 million contract. Stephen signed a low-risk, high-reward deal with the Sox for 1 year at $10 million prior to 2013. He’s the shortstop for now; Xander Bogaerts is the future. Boston fans can’t help but notice the similarities between the two brothers, which extend beyond their striking resemblance to one another and the shared uniform number (#7). Stephen plays the game much the same way as J.D. did, with a smooth and dispassionate style that makes hitting and fielding a baseball seem as simple as driving a tractor (because this is all I like to imagine J.D. does now that he’s stepped away from the game). The two have nearly identical left-handed swings and are known around baseball to share one elite quality: their approach to an at-bat and their knowledge of the strike-zone.

Batter’s Eye

J.D. Drew was heralded as one of the most disciplined hitters in baseball when he signed with the Red Sox in 2007. This means he had an excellent understanding of the strike-zone and had the ability to take close pitches for balls to reach base. Less was known about Stephen when he arrived in Boston, as he was a lower-profile signing. But after his first 84 games, it’s clear that he possesses the same skill. The skill can be quantified by using a PITCHf/x statistic called O-Swing%. The stat measures the percentage of pitches a batter swings at outside the strike-zone. If you need more info on O-Swing%, FanGraphs has a good summary. But suffice it to say that the lower a hitter’s O-Swing%, the better handle he has on the strike-zone (there are a few exceptions; for example, Miguel Cabrera does not see very many pitches in the zone, but is still skilled enough to square up balls that are off the plate. He has one of the highest O-Swing% in the MLB). I’ve plotted BB% (a hitter’s rate of drawing walks) vs. O-Swing% for each hitter with at least 300 plate appearances this season and super-imposed J.D.’s numbers he racked up with the Sox (2007-2011):

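[Figure: BB% vs. O-Swing% for hitters with at least 300 plate appearances in 2013, with J.D. Drew’s 2007-2011 Red Sox numbers superimposed and a linear trend line]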

We can make a couple of observations. First off, BB% clearly trends with O-Swing% — this makes sense: those who swing less often at pitches outside the zone are more likely to walk. Second, we see that Stephen possesses the same plate discipline as J.D., ranking around the 15th percentile in O-Swing%. In fact, both brothers’ BB% is slightly higher than we might expect based on the linear regression (i.e. the data points lie above the trend line). Finally, we notice that if J.D. played in 2013, he would lead the league in O-Swing%. That’s right: J.D. Drew would have the best eye in Major League Baseball if he strapped on the spikes and decided to have another go. Players who are more likely to walk (i.e. who have a high BB%) are more likely to have a higher OBP, one of the fundamental stats for determining a player’s value. It’s not difficult to see why the Drews got the big bucks from Boston.
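For readers who want to reproduce the “above the trend line” comparison, here is a minimal sketch of an ordinary least-squares fit of BB% on O-Swing%. The numbers are invented for illustration and are not the data behind the plot; the function name `fit_line` is likewise hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up league sample: O-Swing% and BB% for five hitters.
o_swing = [20.0, 25.0, 30.0, 35.0, 40.0]
bb_pct = [13.0, 10.5, 9.0, 6.5, 5.0]
slope, intercept = fit_line(o_swing, bb_pct)

# A hypothetical patient hitter: how far above the trend line does he sit?
hitter_o_swing, hitter_bb = 22.0, 13.5
residual = hitter_bb - (slope * hitter_o_swing + intercept)
print(slope, intercept, residual)
```

A negative slope matches the trend described above (fewer chases, more walks), and a positive residual means the hitter walks more often than his O-Swing% alone would predict.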

Fans (including myself) were under the assumption that if you have a great eye, you strike out less. This is not such a ridiculous proposition: if you have an elite knowledge of the strike-zone, then surely you should utilize it with two strikes. But a simple plot of K% (the rate at which a hitter strikes out) vs. O-Swing% demonstrates otherwise:

[Figure: K% vs. O-Swing%]

A blob. The two statistics are not correlated in the slightest. To Sox fans, it seemed that J.D. Drew often took the third strike with the bat on his shoulder — the “Master of the Backwards K”. Since Sox fans knew he had a great eye, it seemed as though this happened at an alarming rate, as the expectation was that a lower O-Swing% should also lead to a lower K%. The two stats are not correlated and Drew did not strike out at an alarming rate at all — if he decided to step into the batter’s box in 2013 he’d be right around the league average in K%. Because J.D.’s eye was touted (for good reason) as one of the best in the league, many fans unfairly jumped to conclusions about how often he should strike out. Also, if we take a look at where Stephen lies in the data spread, we see that he strikes out at a much greater rate than his brother, but seems to take less heat from Red Sox Nation. This might be because Sox fans love players with a flair for the dramatic — something Stephen has shown he possesses whereas J.D. never did.

The “Anti-Clutch”

The biggest hit I remember from J.D. Drew was a grand slam in Game 6 of the 2007 ALCS, which turned the tide of the series. As for walk-offs, I remember one biggie: a line-drive over the head of the right-fielder in Game 5 of the 2008 ALCS against the Rays to cap a massive Sox comeback. Gordon Edes of ESPNBoston reminds us that there was, in fact, one more, but goes on to summarize J.D.’s reputation brilliantly: “Mr. Excitement, he was not.”

“The Anti-Clutch” was the nickname bestowed on J.D. Drew by my dad, who was often frustrated with his performance in tight spots. But my dad’s a stubborn guy and may have been swayed by one strikeout (he also championed the nickname “Master of the Backwards-K”). Certainly he hasn’t done a fair analysis of the relevant statistics, so I’ll do it here. CLUTCH is a complicated statistic that attempts to quantify a player’s performance in high-pressure situations. It utilizes WPA (win probability added) and LI (leverage index, a measure of just how “high pressure” the situation truly is) and normalizes the league-average player to zero. You can read more about CLUTCH here, but the number generally ranges from -1 to 1. Thus, a player with a positive CLUTCH can be considered just that (clutch) but a player with a negative CLUTCH often chokes in the tight spots. So how did J.D.’s numbers look during his time in Boston?

[Figure: J.D. Drew’s CLUTCH during his time in Boston]

Yikes. That’s all there really is to say about that, except that it likely validates the opinion of Dr. D’Andrea. For reference, Stephen Drew’s CLUTCH is 0.64 during his first season in Boston, which checks in at well above average. Nonetheless, J.D. Drew has had a tremendous, all-star career, similar to the likes of Eric Davis, Raul Mondesi, and Kirk Gibson.

Stephen’s Trend

Jose Iglesias started the season as the Red Sox shortstop when Drew missed much of spring training due to a concussion. When Drew returned, Iglesias was optioned to Pawtucket, but was recalled when Stephen missed time in July with a hamstring injury. Iglesias was traded to the Tigers in the deal that brought Jake Peavy to the Sox, clearing the way for Drew to re-assume the everyday job on the left side of second base. Drew’s season trend, especially as it pertains to his batting average, was likely a main reason why GM Ben Cherington felt comfortable giving up Iglesias, a defensive wizard:

[Figure: Stephen Drew’s 2013 batting average trend]

While Drew’s not even half the fielder that Iglesias is, he has the potential to carry a team for weeks at a time with his bat. Fitting his season trend to a third-degree polynomial (this is not a “random” choice — he has clearly had two critical points over the course of the year), we can see that Drew is heating up as the season turns to August. In the best-case scenario (the one in which Drew continues or surpasses his current surge), he could be hitting .300 by September 1st. In a more realistic scenario, Drew will continue his current hot streak, and then regress to his career average of .264 by the time September rolls around. In any case, the remainder of the season is looking promising for the Red Sox shortstop, which is a good sign for a team that’s in desperate need of production from the position. In the wake of the Peavy deal, my favorite Globe writer Chad Finn had this to say about the brothers: “And yes, I’m kind of chuckling at the thought that the unfairly maligned Stephen Drew is still here while Iglesias has moved on. The Drews, they’re survivors, man.”
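The third-degree fit and its two critical points can be sketched with NumPy. This is an illustration only: the data below is synthetic, generated from a known cubic so the result can be checked, and is not Drew’s actual game log. The helper name `cubic_critical_points` is hypothetical.

```python
import numpy as np


def cubic_critical_points(x, y):
    """Fit a cubic to (x, y) and return the coefficients and real critical points."""
    coeffs = np.polyfit(x, y, deg=3)   # coefficients, highest power first
    slope = np.polyder(coeffs)         # derivative of the cubic (a quadratic)
    roots = np.roots(slope)            # where the fitted curve is flat
    real_roots = np.sort(roots[np.isreal(roots)].real)
    return coeffs, real_roots


# Synthetic season: a batting average that peaks, dips, then climbs again.
games = np.arange(1, 85, dtype=float)
avg = 1e-6 * (games**3 - 135 * games**2 + 5400 * games) + 0.200

coeffs, turning_points = cubic_critical_points(games, avg)
print(turning_points)             # game numbers where the trend changes direction
print(np.polyval(coeffs, 84.0))   # fitted average at game 84
```

Because the synthetic curve was built with turning points at games 30 and 60, the fit should recover values close to those. A cubic is the lowest-degree polynomial that can have two turning points, which is the reasoning behind the choice of degree in the text. Extrapolating the fitted curve past the last game played is best treated with caution, since a cubic keeps rising or falling without bound.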

*J.D. Drew was drafted by the Phillies second overall in 1997, but failed to sign a contract. He and agent Scott Boras demanded $10 million whereas the Phils were only willing to offer $2.6 million. He played with an independent league team for one year, then was drafted fifth overall by the Cardinals in 1998, signing for $7 million. Phillies fans booed him for the entirety of his career.

Vince D’Andrea is a rising senior at the Massachusetts Institute of Technology. He is an avid Red Sox fan and his blog, Dave Roberts’ Dive, can be found here.