Archive for Player Analysis

What the Red Sox are Getting in Grady Sizemore

The Red Sox have made news by signing Grady Sizemore to a Major League contract worth $750k ($6 million if all incentives are reached).  This got me thinking about what happened to Sizemore.  He made the game of baseball seem simple with his defensive prowess, his above-average power, and his lightning speed.  Then his body stopped tolerating the aggressive style he played with and began giving in to the strain he put it under.  He had hernia surgery as well as several knee and back surgeries between 2010 and 2013, which seemed to end his baseball career.  Then, out of the blue, he was picked up by the Red Sox in a deal with high potential for both sides: the Red Sox could get a great player cheaply, and Sizemore could resurrect his career.  The deal could also end poorly for Sizemore, who might finally have to accept that his career is over because of injury or performance, while the Red Sox would view it as a failed experiment that doesn’t hurt them financially now or in the long term.

The signing of Grady Sizemore is an indication that the Red Sox are ready to hand Jacoby Ellsbury’s former job to Jackie Bradley Jr., but it also shows that they have a backup plan in case that doesn’t pan out.  Some would say that Sizemore is hardly a backup plan because he could very well end up injured, which is absolutely true, and that an outfield of Gomes in left, Victorino in center, and Nava in right would be the result if Bradley turns out to be a dud.  But with Sizemore comes a tremendous amount of upside.  Five years ago he was a 30-30 player and a Gold Glover in the outfield.  He was healthy, he was starting to walk more and strike out less, and then everything stopped for him.  He was forced to undergo elbow surgery in 2009 for an injury that had plagued him all season long, and from that point on, if it wasn’t one problem, it was another.  Left knee surgery, a right knee contusion, hernia surgery, back surgery, and then right knee surgery all came in a span of three years, which can leave a player asking whether his career is over.

Another question about Sizemore is what his plate discipline will look like.  Sizemore’s BB/K peaked at .75 in 2008, dropped to .65 in 2009, and then plummeted to .245 over the combined 104 games he played in 2010 and 2011.  For most of his career, Sizemore was above average at drawing walks and avoiding strikeouts: his career BB/K was .53, .05 above the Major League average of .48 during that period.  Did the drop come as a result of the injuries he suffered from 2009 to 2011, or was he simply losing his ability?  The interesting thing about such a drastic change in BB/K is that his swing rate barely changed.  His career Swing% is 43.4%, which is fairly patient considering that the average swing rate among players between 2004 and 2011 was 47.6%.  Sizemore also made contact at an 81.1% rate over the course of his career, with the contact rate dropping barely below that number in the three shortest seasons of his career (2004, 2010, and 2011).  Those were also the only years in which his swinging-strike rate exceeded 10%.  Perhaps what this shows is that pitchers weren’t afraid to attack him and he wound up taking a lot of called strikes.

The other facets of his game that deserve close attention are his power and speed.  I truly believe that if Sizemore can stay healthy, we will see a resurgence of his power.  His power numbers have always been impressive, with a career ISO of .204 and a career SLG of .473.  Even in 2011, when he was limited to 71 games and was coming off microfracture surgery on his left knee, he produced an ISO of .198.  I don’t think power will be an issue for him.  The part of his game that will likely never return is his speed.  In those 71 games in 2011, he stole 0 bases after averaging 19 steals per season and stealing at least 22 bases in 4 of his 7 seasons prior to 2011.  With the second knee surgery having occurred in 2012, my guess is that he will show little to no speed in 2014.

As with everything in baseball, there are the intangibles that must come into consideration when discussing the future of Grady Sizemore.  For starters, he has not set foot onto a baseball field in 2 years.  It is a possibility that he will be incredibly rusty and might struggle to perform again at the big league level.  For some players, that would be less of a concern but for a player who last played baseball in his twenties and who is now playing in his thirties (granted, it was his late twenties and it will be his early thirties), it could pose a greater challenge.  It’s possible that he could shake all rust in Spring Training and come out in April and prove all the doubters wrong although it is impossible to know for sure.

The most optimistic yet realistic scenario for Sizemore is that he comes back to the majors, is solid defensively, and puts up great power numbers for the Red Sox.  My guess is that after the knee and back surgeries, his base-stealing days are over and he will not be able to cover as much ground in the outfield as someone else might.  If I were Red Sox management, I would not give him the role of backup center fielder until I knew for sure how much speed he has left.  I would give that job to Shane Victorino, who has remained healthy and still has the speed to cover that ground.  When Victorino plays center, Sizemore would be in left field and Gomes/Nava would be in right field.  If Bradley fits into the Red Sox’ plans, then Sizemore simply becomes a spot starter/platoon player in left, center, and right field, giving people a break when they need it.  So to answer the question that the title of this article poses: the Red Sox are getting a wild card, a player with the potential to be a power bat off the bench or even in the everyday lineup, or a player who has played his last days in the bigs.


Comparing Kershaw and Tanaka’s Opt-Out Clauses

This post is going to examine the value of the opt-out clause in both the Clayton Kershaw and Masahiro Tanaka contracts. I think this is interesting because the Yankees gave Tanaka an opt-out one year earlier, and gave that option to a commodity with a much more uncertain value.  As we will see, the opt-out clause for Tanaka is going to be a lot more costly to the Yankees than the clause was for Kershaw and the Dodgers.

Let’s start with the projections for each player. ZiPS and Steamer don’t have anything for Tanaka, but we can guess from the contract he was given that he’s expected to be worth a lot of wins over the next several years.  Since he’s the same age as Kershaw, it seems approximately fair to start him at 5 wins and reduce that in the same pattern as Kershaw’s projection.  I’ll use the values from Dave Cameron’s excellent article the other day for Clayton Kershaw, and I’ll also take the $/WAR from his projected inflation.  Excess value is the value of that player’s WAR, minus salary.  All dollar figures in the tables below are in millions.

Tanaka                                             |  Kershaw
Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)  |  Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)
25   5.0   6.0     22           8.0                |  25   5.5   6.0     30           3.0
26   5.0   6.3     22           9.5                |  26   5.5   6.3     30           4.7
27   4.5   6.6     22           7.7                |  27   5.0   6.6     30           3.0
28   4.5   6.9     22           9.1                |  28   5.0   6.9     30           4.5
29   4.5   7.3     22           10.9               |  29   5.0   7.3     30           6.5
30   4.0   7.7     22           8.8                |  30   4.5   7.7     30           4.7
31   4.0   8.0     22           10.0               |  31   4.5   8.0     30           6.0
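
As a quick check on the table, the excess-value column can be recomputed from the other three columns. The sketch below is only an illustration in Python; the function and variable names are made up for this example, and the inputs are the projections listed above (all dollar amounts in millions).

```python
def excess_value(war, dollars_per_war, salary):
    """Excess value in $M: market value of the wins minus the salary paid."""
    return war * dollars_per_war - salary


# Projected $M per WAR for ages 25 through 31, as listed in the table.
DOLLARS_PER_WAR = [6.0, 6.3, 6.6, 6.9, 7.3, 7.7, 8.0]

# Projected WAR by age for each pitcher, as listed in the table.
tanaka_war = [5.0, 5.0, 4.5, 4.5, 4.5, 4.0, 4.0]
kershaw_war = [5.5, 5.5, 5.0, 5.0, 5.0, 4.5, 4.5]

# Salaries are flat in the table: $22M per year for Tanaka, $30M for Kershaw.
tanaka_ev = [excess_value(w, d, 22) for w, d in zip(tanaka_war, DOLLARS_PER_WAR)]
kershaw_ev = [excess_value(w, d, 30) for w, d in zip(kershaw_war, DOLLARS_PER_WAR)]

for age, t, k in zip(range(25, 32), tanaka_ev, kershaw_ev):
    print(f"age {age}: Tanaka {t:5.1f}  Kershaw {k:5.1f}")

print(f"seven-year totals: Tanaka {sum(tanaka_ev):.1f}, Kershaw {sum(kershaw_ev):.1f}")
```

Summed over the seven seasons, this comes to roughly $63.9M of excess value for Tanaka and $32.3M for Kershaw before any opt-out is considered.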

The key here is not going to be the expected value; it’s going to be the possible variation. Kershaw is projected for only 5.5 wins next year because of the ever-present risk of injury. There are probably Dodgers fans going nuts over that projection, because they know that a healthy Kershaw, pitching like he can, is going to be worth closer to 7 wins.  There are certainly scenarios where he manages that, but also scenarios where he tears his rotator cuff and is worthless.  While there is a continuum of possibilities, let’s break the world into two scenarios for each pitcher, an up and a down. The only requirement is that the probability-weighted average of the two scenarios has to equal the projection.  I’ve made up some basic numbers here, and you might think they’re reasonable or you might not, but the point of this article is to illustrate how one extra year and some extra volatility can affect the value of an opt-out clause.

In each scenario, I make the downside a mirror image of the upside. For Tanaka, because he is an unproven commodity, I’ve added 2 WAR to the upside, and subtracted 2 for the downside. For Kershaw, I’ve just added/subtracted 1 for each. I gave each scenario a 50-50 chance of happening.

GOOD Tanaka (50%)                                  |  GOOD Kershaw (50%)
Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)  |  Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)
25   7.0   6.0     22           20.0               |  25   6.5   6.0     30           9.0
26   7.0   6.3     22           22.1               |  26   6.5   6.3     30           11.0
27   6.5   6.6     22           20.9               |  27   6.0   6.6     30           9.6
28   6.5   6.9     22           22.9               |  28   6.0   6.9     30           11.4
29   6.5   7.3     22           25.5               |  29   6.0   7.3     30           13.8
30   6.0   7.7     22           24.2               |  30   5.5   7.7     30           12.4
31   6.0   8.0     22           26.0               |  31   5.5   8.0     30           14.0

BAD Tanaka (50%)                                   |  BAD Kershaw (50%)
Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)  |  Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)
25   3.0   6.0     22           -4.0               |  25   4.5   6.0     30           -3.0
26   3.0   6.3     22           -3.1               |  26   4.5   6.3     30           -1.7
27   2.5   6.6     22           -5.5               |  27   4.0   6.6     30           -3.6
28   2.5   6.9     22           -4.8               |  28   4.0   6.9     30           -2.4
29   2.5   7.3     22           -3.8               |  29   4.0   7.3     30           -0.8
30   2.0   7.7     22           -6.6               |  30   3.5   7.7     30           -3.1
31   2.0   8.0     22           -6.0               |  31   3.5   8.0     30           -2.0


Let’s think about what happens in each scenario when it comes time to exercise the opt-out clause.  Shockingly, GOOD Kershaw and GOOD Tanaka each exercise the clause. We can see why in the positive “excess value” column of each chart from the opt-out point onward (age 29 for Tanaka and age 30 for Kershaw). They could get more on the free market, so they will. BAD Kershaw and BAD Tanaka both stick with their contracts, because they’re being paid more than market value.  Let’s re-do the charts from the teams’ perspectives, reflecting the opt-out clauses now:

GOOD Tanaka (50%)                                  |  GOOD Kershaw (50%)
Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)  |  Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)
25   7.0   6.0     22           20.0               |  25   6.5   6.0     30           9.0
26   7.0   6.3     22           22.1               |  26   6.5   6.3     30           11.0
27   6.5   6.6     22           20.9               |  27   6.0   6.6     30           9.6
28   6.5   6.9     22           22.9               |  28   6.0   6.9     30           11.4
29   0.0   7.3     0            0.0                |  29   6.0   7.3     30           13.8
30   0.0   7.7     0            0.0                |  30   0.0   7.7     0            0.0
31   0.0   8.0     0            0.0                |  31   0.0   8.0     0            0.0

BAD Tanaka (50%)                                   |  BAD Kershaw (50%)
Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)  |  Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)
25   3.0   6.0     22           -4.0               |  25   4.5   6.0     30           -3.0
26   3.0   6.3     22           -3.1               |  26   4.5   6.3     30           -1.7
27   2.5   6.6     22           -5.5               |  27   4.0   6.6     30           -3.6
28   2.5   6.9     22           -4.8               |  28   4.0   6.9     30           -2.4
29   2.5   7.3     22           -3.8               |  29   4.0   7.3     30           -0.8
30   2.0   7.7     22           -6.6               |  30   3.5   7.7     30           -3.1
31   2.0   8.0     22           -6.0               |  31   3.5   8.0     30           -2.0

Now let’s take the expected value of these two scenarios, which is in this case a simple average:

Expected Value Tanaka                              |  Expected Value Kershaw
Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)  |  Age  WAR   $M/WAR  Salary ($M)  Excess Value ($M)
25   5.0   6.0     22           8.0                |  25   5.5   6.0     30           3.0
26   5.0   6.3     22           9.5                |  26   5.5   6.3     30           4.7
27   4.5   6.6     22           7.7                |  27   5.0   6.6     30           3.0
28   4.5   6.9     22           9.1                |  28   5.0   6.9     30           4.5
29   1.3   7.3     11           -1.9               |  29   5.0   7.3     30           6.5
30   1.0   7.7     11           -3.3               |  30   1.75  7.7     15           -1.5
31   1.0   8.0     11           -3.0               |  31   1.75  8.0     15           -1.0

We can see that in both cases, the post-option years of the contract become negative propositions for the teams — in fact, they would have to be, by how we’ve implicitly stated the conditions under which the players opt out: if the player were expected to provide positive value to his team, he would opt out.

So how much is the option worth? Ignoring the $20 million posting fee, the Tanaka contract, sans opt-out, was expected to produce $63.9M in excess value for the Yankees. With the option, the expected excess value drops down to $26.1M.  That’s a drop of $37.8M. This could be thought of as the extra money Tanaka puts into his pocket from years 5 onward, if he comes into the league and becomes Justin Verlander.  Kershaw, on the other hand, would be expected to generate $32.3M for the Dodgers, without the opt-out. Now his contract is only worth $19.1M to them. That’s a reduction in value, but because we’ve made him less uncertain, and because the option occurs after year 5, not year 4, the reduction is only $13.2M.  So the extra year and the double variability make Tanaka’s option worth $24.6M more than Kershaw’s.
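
For anyone who wants to rerun the numbers, here is a minimal Python sketch of the two-scenario model described above. It is an illustration under the same made-up assumptions (a 50-50 split, a mirror-image WAR swing, flat salaries), and the function and variable names are hypothetical. One simplification to note: the player is assumed to opt out whenever the remaining years, taken together, carry positive excess value for the team.

```python
# Projected $M per WAR for ages 25 through 31, copied from the tables above.
DOLLARS_PER_WAR = [6.0, 6.3, 6.6, 6.9, 7.3, 7.7, 8.0]


def contract_values(base_war, salary, swing, years_before_opt_out):
    """Return (value without opt-out, value with opt-out) in $M for the team.

    base_war: projected WAR by season. swing: WAR added in the GOOD world and
    subtracted in the BAD world. Each world is given a 50% chance.
    """
    def excess(shift):
        return [(w + shift) * d - salary
                for w, d in zip(base_war, DOLLARS_PER_WAR)]

    good, bad = excess(+swing), excess(-swing)
    without = 0.5 * sum(good) + 0.5 * sum(bad)

    def team_keeps(world):
        # The player leaves only if the remaining years are a bargain for the
        # team (positive excess value); otherwise he plays out the contract.
        early = sum(world[:years_before_opt_out])
        late = sum(world[years_before_opt_out:])
        return early + min(late, 0.0)

    with_opt_out = 0.5 * team_keeps(good) + 0.5 * team_keeps(bad)
    return without, with_opt_out


tanaka = contract_values([5.0, 5.0, 4.5, 4.5, 4.5, 4.0, 4.0], 22, 2.0, 4)
kershaw = contract_values([5.5, 5.5, 5.0, 5.0, 5.0, 4.5, 4.5], 30, 1.0, 5)

for name, (without, with_opt_out) in [("Tanaka", tanaka), ("Kershaw", kershaw)]:
    cost = without - with_opt_out
    print(f"{name}: {without:.1f} without, {with_opt_out:.1f} with, "
          f"opt-out costs the team {cost:.1f}")
```

Changing `swing` or `years_before_opt_out` for either pitcher is an easy way to see how much of the gap comes from the extra volatility and how much from the earlier opt-out date.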

Again, this depends largely on the choices I’ve made for the range of possible outcomes, and I’ve kind of picked Tanaka’s projection out of thin air (since the excess value of the contract with the opt-out is only $6.1M, considering the $20M posting fee, I would argue that I’m not that far off). I could have made more possible outcomes, or maybe even defined a probability distribution function and integrated over that, if I knew how to do that sort of thing. The only lesson we’re going to be able to take from this is how one year and some extra variability affect the value of the opt-out clause.


The Top Five Yankee Second Basemen

Something very strange happened this offseason: the Yankees were outbid for a player they have a clear need for (although all teams need players of this caliber). This player is the best second baseman, and one of the top 10 position players, in all of baseball. Of course this player is Robinson Cano, perennial All-Star, Silver Slugger, Gold Glover and MVP candidate. I do not need to tell you that Robinson Cano is a great baseball player. But I thought it would be interesting, as a way to appreciate Cano’s talent (or to be slightly depressed watching him rack up his numbers in Seattle), to rank the best second basemen in Yankee history and to determine where Cano fits in.

First, I think it is important to put the five players to be discussed in some historical context. When one thinks about the great “Yankee positions,” second base does not particularly stand out, at least to me. Like most Yankee fans (I imagine), I immediately think of center field (Mantle, DiMaggio), catcher (Berra, Dickey, Posada, Munson), first base (Gehrig) and right field (Ruth). But is this justified? Let’s look at the top five fWAR (FanGraphs’ WAR) totals for each position in Yankee history:

Position      Top 5 Total fWAR  Rank
First Base    231.6             4th
Second Base   216.7             5th
Third Base    178.9             7th
Shortstop     194.9             6th
Catcher       237.5             3rd
Left Field    170.2             8th
Center Field  310.7             1st
Right Field   269.8             2nd

*NOTES: (1) Babe Ruth was counted as a right fielder (2) Stats courtesy of FanGraphs.

As we can see, second base places 5th behind the four positions I think Yankee fans most associate with greatness. However, no other team in history has had at least five second basemen accumulate at least 37.1 fWAR, and only one team’s top five (the Reds) beat the Yankees’ top five in total fWAR, albeit barely (220.3 to 216.7). Of course not all teams have been around as long as the Yankees have (and some have been around longer) but you get the idea. Suffice to say, second base has been an excellent position in the history of an organization that has had several excellent positions. So while second base places right around where we would expect in terms of other Yankee positions, it is important to reiterate that (1) the four Yankee positions ahead of second basemen on the aforementioned list are insanely good and include some of the greatest players of all time, and (2) the top five Yankee second basemen, compared to other teams’ top second basemen, are among the best ever.

That being said, here are some stats for my top five Yankee second basemen of all time, in no particular order:

Player     Games  HR   BsR    AVG   OBP   SLG   wRC+  Def    fWAR
Gordon     1000   153  -7.8   .271  .358  .467  121   140.1  40.1
Cano       1374   204  -4.9   .309  .355  .504  126   -10.4  37.1
Randolph   1694   48   17.6   .275  .374  .357  110   143.9  51.4
Lazzeri    1659   169  -8.2   .293  .379  .467  121   48.6   48.4
McDougald  1336   112  -4.5   .276  .356  .410  114   128.6  39.7

*NOTES: (1) Stats courtesy of FanGraphs; (2) These stats are what each player accumulated as a Yankee only.

Like I said before, this is more or less as good a list of top-five second basemen as any team has. Every player on this list was an above-average hitter who played exceptional defense (except for Cano). The one glaring weakness, with the exception of Randolph, is baserunning. This strikes me as a bit odd because second basemen are typically solid in this aspect of the game. Even so, these are five very, very good ballplayers. Now to the top five:

5. Gil McDougald

Gil McDougald’s inclusion on this list is somewhat dicey because he played all over the infield save for first base (he appeared in 599 games at second, 508 at third, and 284 at short as a Yankee). McDougald is included because 1) he did in fact play more games at second than at any other position, and 2) in my opinion, he is one of the most underrated players in Yankee history. The Rookie of the Year in 1951 (his best season with the bat, with a 142 wRC+), McDougald was a five-time All-Star and a member of five Yankee World Series championship teams. A player with his versatility is extremely valuable to any team, and the fact that he was making his contributions to an organization in the midst of the greatest dynasty in sports history (1949-1964) is all the more impressive. Throw in his above-average bat and you have one great ballplayer. McDougald does not rank 1st in any of the aforementioned categories, but he is the definition of a “jack of all trades” player: he played multiple positions and did everything well.

4. Willie Randolph

Millennials like myself remember Randolph mostly (and quite fondly) from his time as the Yankee third-base coach during the most recent dynasty years (and less fondly as the manager of the Mets), but he had a fantastic playing career in pinstripes as well. Representing the Yankees in four All-Star games (including in 1977, the year of the Yankees’ first World Series title since 1962), Randolph had a reputation as a defensive wizard. The statistics back that assertion up nicely, as his 143.9 Def rating is best among second basemen in franchise history (and his career Def rating of 168.2 is ninth all time among second basemen). Randolph is easily the best baserunner of the five, with a 17.6 BsR (no other player is above -4.5). Randolph was no slouch with the bat either, although his power pales in comparison to the other four players on the list. However, on-base ability is generally considered more valuable than power, and Randolph’s .374 OBP ranks second on the list. McDougald and Randolph are strikingly similar players (even their fWAR/game is an identical .030), but I decided to rank Randolph higher due to his superior on-base ability.

3. Robinson Cano

The inspiration for this post, Robinson Cano checks in as the third-greatest second baseman in Yankee history. A five-time All-Star and five-time Silver Slugger, Cano began his Yankee career somewhat randomly during the team’s terrible start to the 2005 season, and he never looked back.  His 126 wRC+ is tops on the list. He also leads in home runs, batting average, and slugging. However, his Def rating of -10.4 is easily the worst on the list (acknowledging that defensive metrics are far less reliable than offensive and baserunning metrics). Cano has been one of the very best players in baseball the past several years. Neither McDougald nor Randolph could claim as much during their playing days. Cano has been top-five in all of baseball in bWAR (Baseball-Reference WAR) in four different seasons, whereas McDougald had two such seasons and Randolph none. Had Cano re-signed with the Yankees this offseason, he most likely would have ended up #1 on this list.

2. Tony Lazzeri

Hall of Famer Tony Lazzeri checks in at #2. In his 12 seasons as a Yankee from 1926-1937, Lazzeri played fewer than 123 games only once, hit at least 10 home runs in every season but two (in those two seasons, 1930 and 1931, he hit 9 and 8 home runs, respectively) and had a wRC+ greater than 100 in 11 straight seasons. He also accumulated at least 2 fWAR every year he was with the Yankees. Suffice to say, Lazzeri was a very consistent ballplayer on some great Yankee clubs (including arguably the greatest of all time, the 1927 squad). His 48.4 fWAR is second on the list. Unlike Cano, Lazzeri was not one of the best players in all of baseball during his playing career, but he was with the Yankees longer and his counting stats reflect as much, giving him a slight edge over Cano.

1. Joe Gordon

Completely disregarding my reasoning for ranking Lazzeri ahead of Cano, I decided to rank Joe Gordon, another of the most underrated Yankees of all time, as the best second baseman in the team’s history. He, like many big leaguers in the 1940s, missed time (in Gordon’s case, the 1944 and 1945 seasons) to serve in WWII. In 1942 and 1943, Gordon put up 8.8 fWAR and 6.8 fWAR, respectively, and save for a 2.1 fWAR season in 1946, he bounced right back and put up 6.9 fWAR in 1947 and 7.1 fWAR in 1948. The point of all of this is that Gordon would have, in all likelihood, continued to dominate in the two seasons he missed, but we’ll never know.

Even though his time in pinstripes, and in baseball for that matter, was shorter than it could have been, Gordon did not disappoint when he was on the field. A Yankee for seven seasons, he was an All-Star in six of them (although his 1946 selection is a bit odd; check out his numbers that year). In those seven seasons he accumulated 40.1 fWAR, an average of 5.7 fWAR per season. This is easily the highest per-season average of any player on this list (Cano is second at 4.1 with the other three each at 4.0). On an fWAR/game basis, Gordon’s .040 is well ahead of the others (McDougald and Randolph are tied for second at .030). He, like Cano, could claim to be one of the best ballplayers of his time, having placed in the top 10 in overall bWAR five times as a member of the Yankees. Gordon was an elite defender, rating second all-time in Def for a second baseman. Randolph barely has him beat in terms of what they did as Yankees, but Gordon’s per-season average of 20.0 Def easily eclipses Randolph’s 11.1. Couple his historic defensive abilities with his great bat (his 121 wRC+ trails only Cano) and you have a fantastic ballplayer and the best second baseman in the team’s storied history.
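
The per-game rates are easy to reproduce from the games and fWAR columns of the stats table earlier in the post. The snippet below is just an illustrative Python sketch; the dictionary and function names are my own, and the numbers are the as-a-Yankee totals from that table.

```python
# (games, fWAR) as a Yankee, taken from the stats table earlier in the post.
yankee_2b = {
    "Gordon": (1000, 40.1),
    "Cano": (1374, 37.1),
    "Randolph": (1694, 51.4),
    "Lazzeri": (1659, 48.4),
    "McDougald": (1336, 39.7),
}


def fwar_per_game(games, fwar):
    """Rate stat: fWAR accumulated per game played."""
    return fwar / games


rates = {name: fwar_per_game(g, w) for name, (g, w) in yankee_2b.items()}

# List the five from the highest rate to the lowest.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<10} {rate:.3f} fWAR/game")

# The same five fWAR totals add up to the second-base figure in the
# position table near the top of the post.
top_five_total = sum(fwar for _, fwar in yankee_2b.values())
print(f"Top five total: {top_five_total:.1f} fWAR")
```

Gordon comes out on top at about .040, with Randolph and McDougald both rounding to .030, and the five totals sum to the 216.7 fWAR listed for second base.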

So there are my top five Yankee second basemen of all time. What sets Gordon apart from the rest are his per-season averages, but if you place a higher value on longer-term consistency I suppose Lazzeri would be your guy. But no other player did more in a shorter amount of time than Gordon, hence my ranking of him as #1. Honestly, I could be talked into changing this list around in a number of different ways (excluding McDougald and including Stirnweiss, or flipping Lazzeri and Gordon, just to name a couple), but I think the purpose of a post like this is to try to initiate some interesting debate while admiring the careers of past Yankee greats. Like I previously stated, I think second base is an under-appreciated Yankee position, but the organization has had some truly great second basemen in its history.


Joey Votto: 6-WAR Player

In spite of the ridiculous scrutiny regarding his 2013 season, Joey Votto is one of the best hitters in baseball. He also has a large contract that begins in 2014. Big contracts have seemingly brought bad baseball voodoo to some of baseball’s best (injury to Pujols noted). As Votto begins his mega-deal, both ZiPS and Steamer show him declining by over a win in WAR in 2014.  Because of how good he has been for the last four years, I found this projected decline somewhat surprising. I decided to dive into the numbers and found some really interesting things.

Votto has had four straight years of excellence. Since 2010, he has the third-best WAR among position players (25.1), the second-best wRC+ (164), the second-best wOBA (418) and the best OBP (434). Votto is at least in the conversation as the best hitter in baseball over this four-year stretch. But for some more perspective, let’s look at his last four years in detail.

Year  AVG  OBP  SLG  BB%    ISO  BABIP  wRC+  wOBA  Def    WAR
2010  324  424  600  14%    276  361    172   438   -10.1  6.8
2011  309  416  531  15.3%  222  349    157   406   -5.2   6.5
2012  337  474  567  19.8%  230  404    178   438   -2.1   5.6
2013  305  435  491  18.6%  186  360    156   400   -10.1  6.2

Quite impressive. Votto has earned over six WAR every year except 2012, the year he injured his knee and played in only 111 games. He played many of those games injured as well. The only reasonable complaints are that Votto’s power dropped some in 2013, and his defense was poor (decent for a first baseman). The extent to which Votto’s knee surgery affected his power in 2013 remains to be seen, but it isn’t like his power numbers fell off a cliff.  So what do our beloved projection systems say about Votto’s age-30 season in 2014? They say he will be good but not quite as elite as he has been.

Projection System  AVG  OBP  SLG  BB%    ISO  BABIP  wRC+            wOBA  Def   WAR
ZiPS               289  416  506  17.5%  217  334    N/A (149 OPS+)  386   3     4.8
Steamer            296  424  507  17.8%  211  341    156             400   -9.7  5.0

The encouraging part for Reds fans is that both Steamer and ZiPS think Votto’s power will tick up a little. The hope is that Votto’s knee will be healed, and he will return to the doubles machine he once was, with a few more home runs as well. But Votto is also entering his age-30 season. His best power days may very well be behind him. Or maybe not. I’ll get to that shortly.

The other numbers are similar to the four previous years with one noticeable difference. Both projection systems predict that Votto’s batting average will drop below 300 for the first time since his rookie season in 2008, when he struggled to a 297 average.  The cause of this decline in batting average is our old friend BABIP. ZiPS has Votto’s BABIP dropping to 334 even though Votto has averaged a 368.5 BABIP over the last four years. BABIP can fluctuate wildly from year to year, but Votto has shown the ability to maintain a high BABIP throughout his career. We can expect him to do a little better than these projections. If every 10 points of BABIP equals about 0.3 WAR, Votto is likely to gain between half a win and one full win over the projections.

But the equation 10 points of BABIP=0.3 WAR is with all other stats being equal. If Votto’s BABIP is higher than these projections, he is likely to also have some more extra-base hits, including a couple more home runs. This added power would raise his value even more. Both projection systems already have his power rising from last year. Votto could have a few years of solid power left, especially if his knee is fully healthy.
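
To put rough numbers on that rule of thumb, here is a small Python sketch. It assumes the relationship is linear and that everything else in the projection stays fixed, which is exactly the caveat above; the function and variable names are made up for the illustration. Full reversion to his four-year BABIP average is treated as the optimistic end of the range.

```python
WAR_PER_10_POINTS = 0.3  # the rule of thumb above, all other stats being equal


def war_adjustment(projected_babip, assumed_babip):
    """WAR change if BABIP lands at assumed_babip instead of the projection.

    BABIP is given in points (334 means .334), matching the tables above.
    """
    return (assumed_babip - projected_babip) / 10 * WAR_PER_10_POINTS


recent_babip = [361, 349, 404, 360]  # 2010 through 2013
four_year_avg = sum(recent_babip) / len(recent_babip)

# (projected BABIP, projected WAR) from the projection table.
projections = {"ZiPS": (334, 4.8), "Steamer": (341, 5.0)}

for system, (babip, war) in projections.items():
    bump = war_adjustment(babip, four_year_avg)
    print(f"{system}: +{bump:.1f} WAR if BABIP returns to {four_year_avg}, "
          f"about {war + bump:.1f} WAR in total")
```

Under that optimistic assumption both systems land at roughly 5.8 WAR, and a partial rebound of half a win would leave him a little above 5.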

This puts Votto around six-WAR territory. The other important factor will be his defense. Votto’s defense was poor in 2013 compared to his previous two seasons. He was a top-five defensive first baseman in 2011 according to FanGraphs’ Def and would have been top-three in 2012 had he played enough games to qualify. To remain a six-WAR player, Votto will likely need to return to being an above-average defensive first baseman. Steamer has his defense at about the same level as 2013. I do not believe the ZiPS Def figure adjusts for position, but it appears ZiPS thinks he will be average to slightly above average defensively for a first baseman.

The Reds have much bigger problems than their superstar first baseman. They lack the ability to get on base consistently. They have serious question marks in left field and center field. The reality is that the Reds are a borderline playoff team right now and need Votto to be an elite player to have a legitimate chance of returning to the postseason. After looking at the numbers, and with the prospect of a fully healthy knee, it is easy to see Votto continuing his run of excellence.


Analyzing Yoenis Cespedes

Yoenis Cespedes struggled at the plate this year for reasons unknown to most. Understanding why he struggled in 2013 after an excellent 2012 comes down to sabermetrics. Cespedes’ biggest enemy was actually… himself. The statistics show that Cespedes swings at too many pitches inside the strike zone (“inside pitches” below) in an attempt to hit more home runs. The pressure from his overshadowed rookie season may have come back to haunt him this past year. His batting average dropped from .292 to .240 and his OPS fell from .861 to .737, all because of a few changes Cespedes made at the plate. The statistics point to the causes of Cespedes’ struggles and to how he might be able to fix them for next season. Even though it may seem that Cespedes was a much worse batter in 2013, that is not entirely the case. He was actually better at making contact with pitches thrown outside the strike zone, with his contact rate on those pitches rising from 59.5% to 63.7%.


1.  Swinging at Inside Pitches Too Often & Taking Too Many Strikes
Cespedes swung at far too many pitches inside the strike zone this season: his swing rate on them rose from 65.3% in 2012 to 71.8% in 2013. For comparison, when Adrian Beltre swung at 71.6% of inside pitches in 2005, he hit .255 with just a .716 OPS. While swinging at so many inside pitches, Cespedes also made contact with a lower percentage of them, going from 84.0% in 2012 to 80.4% in 2013. As a result of his tendency to swing more often at inside pitches, the share of pitches to him that went for strikes rose by about 2.6 percentage points (1233 of 1979 in 2012 to 1407 of 2169 in 2013). More strikes lead to more strikeouts and a lower batting average. His strikeout rate increased from 18.9% to 23.9% over the course of a single season.
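
The strike-rate change can be checked directly from the raw pitch counts quoted above. This is only a small illustrative Python calculation, with names chosen for the example.

```python
def strike_rate(strikes, pitches):
    """Share of all pitches seen that went for strikes."""
    return strikes / pitches


rate_2012 = strike_rate(1233, 1979)
rate_2013 = strike_rate(1407, 2169)
change_in_points = (rate_2013 - rate_2012) * 100

print(f"2012: {rate_2012:.1%}  2013: {rate_2013:.1%}  "
      f"change: {change_in_points:.1f} percentage points")
```

That works out to a strike rate of a little over 62% in 2012 and just under 65% in 2013, a gain of slightly more than 2.5 percentage points.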

Because he swung at so many inside pitches, power pitchers took full advantage, holding him to a .196 batting average. Against finesse pitchers, Cespedes hit .263. (Power pitchers are defined as the top third of pitchers in combined strikeouts and walks; finesse pitchers are the bottom third.) When Cespedes fell behind in the count, he proved to be an easy out: with two strikes and any number of balls, he batted a horrifying .130. Also, Cespedes is often too eager to swing at the first pitch of a plate appearance, before he has a read on the pitcher’s style or location. Plate appearances in which he swung at the first pitch resulted in a .209 batting average, whilst those in which he took the first pitch resulted in a .252 batting average.


2.  Pulling Too Hard for Home Runs
Cespedes certainly tried to hit as many home runs as possible this season; he passed the previous year’s total of 23 by three, and his power assuredly grew. As evidenced by his spectacle at the Home Run Derby, Cespedes possesses strength like few others in MLB. However, he often tried too hard to get the ball over the wall, resulting in an increase in fly ball rate from 39.9% in 2012 to 45.6% in 2013. The pressure to improve from critics and fans alike might have pushed Cespedes into trying to hit more home runs than he possibly could. Given his time on the DL due to nagging hand injuries, it surprised most that he even hit this many home runs, whether because of the lost time or the wrist pain.


[Picture: The power is definitely still there]

3.  BABIP (Batting Average on Balls In Play)
Cespedes was also just plain unlucky in 2013. BABIP measures the percentage of balls in play that end up as hits, whether because of defense, luck, or positioning. Roughly speaking, it is two-thirds outside the batter’s control and one-third placement of the bat. Cespedes ended the 2013 season with a .274 BABIP, whereas the league average nowadays hovers around .300. In 2012, he finished with a .326 BABIP, a lot luckier than this past year. The second reason (pulling for home runs) most likely accounts for a moderate amount of the regression too. Unfortunately, BABIP can disguise a player’s true skill level behind solid defense, timing, and bad luck.

The real Yoenis Cespedes is most likely somewhere in between his two major-league seasons but much closer to his rookie season than 2013. Yoenis Cespedes thrived in the spotlight but collapsed under pressure in 2013. His statistics in the 2013 playoffs alone describe his love of the spotlight (.381/.409/.667). Not only does he play well in the playoffs, but he also crushed everyone else in the home run derby this year. Expect Cespedes to be a big bounce-back candidate in 2014 after he can look at why he struggled at the plate. Upon arriving in America from Cuba as a free agent, Cespedes was hailed as a five-tool player and “arguably the best all-around player to come out of Cuba in a generation.” Don’t give up hope on the Athletics’ outfielder just yet.

For more articles like this, visit my baseball analysis and news website: The Wild Pitch

All statistics courtesy of baseball-reference and FanGraphs:
http://www.baseball-reference.com/players/c/cespeyo01.shtml
http://www.fangraphs.com/statss.aspx?playerid=13110&position=OF


Grading 2013 AL SP Performance with Attention to the 2-D Direction of Batted Balls

Foreword

Two years ago, I began developing a system for evaluating the performance of minor-league pitchers relative to their minor-league level/league peers. My goals were to use only game data that could be extracted from the MLB Advanced Media Gameday archives for every level of the minors (ruling out any of the pitch-outcome data that is available for AA and AAA games), to ignore whether batted balls went for hits or home runs, and to ignore runs allowed. In brief, the challenge amounts to using whatever other information can be compiled from the game-specific dataset to arrive at the best approximation of the pitcher’s true performance, as judged independently of those factors which tend to fall outside the pitcher’s control (defense, park effects, etc.). What eventually follows are the results of applying the latest iteration of this “Fielding and Ballpark Independent Outcomes” method to 2013’s American League starting-biased pitchers.

Basic Steps of Applying the Method to a League

  1. Download the relevant details of every plate appearance (PA) from the league’s season into a spreadsheet/database
  2. Derive a 24-outs-baserunners-state run expectancy matrix à la Tango in The Book
  3. Quantify how each PA of the season impacted the inning’s run expectancy
  4. Exclude all bunts and foulouts, plus every PA taken by a pitcher
  5. Reweight the proportion of line drives (LD), outfielder fly balls (OFFB), ground balls (GB), and infielder flyballs (IFFB) by ballpark to offset any stadium- or stringer-related anomalies in play event classifications
  6. Referencing the run-expectancy value determined for each PA in Step 3, the corresponding basic description of the play (BB vs HBP vs K vs GB vs IFFB vs OFFB vs LD), and the 2 coordinates indicating where the batted ball was fielded (if there was one), quantify what each of the following 12 general PA event types was worth in terms of runs, on average, for the season: 1) walk or hit-by-pitch, 2) strikeout, 3) IFFB, 4) GB to batter’s pull-field-third of the diamond, 5) GB to batter’s center-field-third, 6) GB to batter’s opposite-field-third, 7) LD to pull-third, 8) LD to center-third, 9) LD to opposite-third, 10) OFFB to pull-third, 11) OFFB to center-third, and 12) OFFB to opposite-third.
  7. For each pitcher in the study sample, tally up the number of each of the 12 event types that they allowed and in each instance charge them with the exact number of runs determined in Step 6 for the corresponding event type; divide the resulting sum by the total number of events to arrive at a single number for each pitcher that quantifies how a PA against them that season should have affected the inning’s run expectancy, on average (the more negative this number the better the pitcher should have performed on the year)
  8. Quantify how high or low the pitcher rated on the value in Step 7 versus the mean of the sample on a standard deviation (SD) basis
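
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration in Python, not the spreadsheet actually used: the record layout (`outs`, `bases`, `runs_rest_of_inning`) is a hypothetical one, and the toy data exists only to make the example run.

```python
from collections import defaultdict


def run_expectancy(plate_appearances):
    """Average runs scored from each (outs, bases) state to the inning's end.

    Each PA is a dict with hypothetical keys:
      outs, bases          -> state before the PA (bases as a string like "1--")
      runs_rest_of_inning  -> runs scored from this PA through the end of the inning
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pa in plate_appearances:
        state = (pa["outs"], pa["bases"])
        totals[state] += pa["runs_rest_of_inning"]
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


def pa_run_value(re_matrix, before, after, runs_on_play):
    """Change in run expectancy caused by one PA (Step 3).

    `after` is None when the PA ends the inning, where expectancy is zero.
    """
    re_after = 0.0 if after is None else re_matrix[after]
    return re_after - re_matrix[before] + runs_on_play


# Toy data, for illustration only.
toy = [
    {"outs": 0, "bases": "---", "runs_rest_of_inning": 1},
    {"outs": 0, "bases": "---", "runs_rest_of_inning": 0},
    {"outs": 0, "bases": "1--", "runs_rest_of_inning": 1},
    {"outs": 1, "bases": "---", "runs_rest_of_inning": 0},
]
re_matrix = run_expectancy(toy)
# A leadoff single moves the state from (0, "---") to (0, "1--").
print(pa_run_value(re_matrix, (0, "---"), (0, "1--"), runs_on_play=0))
```

With a full season of data the dictionary would hold all 24 outs/baserunner states, and averaging `pa_run_value` within each of the 12 event types gives the values used in Step 6.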

What were the 12 Event Types Worth in 2013?

The table below shows how the studied event types impacted run expectancy in AL Parks during 2013, on average. The 2-D direction of the batted ball does tend to be rather consequential for LD and even more so for OFFB.

 photo 2013ALParksPAEventType-EffectofRunExpectancies2_zps4e1054de.jpg

So as far as Step 7 described above goes, each pitcher in what follows will be charged +0.29 runs for every BB and HBP, -0.26 runs for every K, … and -0.08 runs for every OFFB to the Opposite-Field-Third, with that sum ultimately divided by the total number of PA events to arrive at a single number that quantifies what an average PA against the pitcher in 2013 was worth in terms of runs (per run expectancies). Think of that as the equation being used to evaluate each pitcher’s performance.
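
As a rough sketch of that equation in Python: only the three run values quoted in the paragraph above are filled in, the pitcher’s event counts are hypothetical, and the function name is my own for illustration.

```python
# Run values quoted in the text. The other nine event-type values appear only
# in the table and are deliberately left out here.
RUN_VALUES = {
    "BB_HBP": 0.29,
    "K": -0.26,
    "OFFB_OPP": -0.08,
}


def runs_per_pa(event_counts, run_values=RUN_VALUES):
    """Average run-expectancy change per PA event (Step 7)."""
    total_events = sum(event_counts.values())
    if total_events == 0:
        raise ValueError("pitcher has no PA events")
    total_runs = sum(run_values[event] * n for event, n in event_counts.items())
    return total_runs / total_events


# Hypothetical pitcher whose PAs all fall in the three event types above.
toy_pitcher = {"BB_HBP": 50, "K": 200, "OFFB_OPP": 50}
print(round(runs_per_pa(toy_pitcher), 3))  # -0.138
```

A real application would carry all 12 event types in `RUN_VALUES`; the more negative the result, the better the pitcher should have performed.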

Study Sample

The sample comprises the 101 pitchers who faced more than 200 batters as an American Leaguer in 2013 while averaging more than 10 batters faced per game. Data they accumulated as relievers is included in the analysis. Data they accumulated as National Leaguers is not. As before, any PA that resulted in a bunt or foulout or that was taken by a pitcher was excluded.

Scores Computed

The overall rating number described in Step 8 above is termed Performance Score. Steps 7 and 8 can be repeated with the non-batted-ball events (BB, HBP, K) stricken from the numerator and denominator at Step 7, and this result is termed Batted Ball Subscore (in short, how should the pitcher have rated versus their peers on batted balls?). To further understand how the pitcher achieved their Performance Score, a Control Subscore (how many SDs high or low was the pitcher’s BB+HBP% versus the study population’s mean?) and a Strikeout Subscore (how many SDs high or low was the pitcher’s K%?) are computed. An Age Score is also calculated that quantifies how young the pitcher was versus the population’s mean age, per SDs. Given the method’s minor-league origins, the scores are typically expressed on a 20-to-80 style scouting scale where 50 is league-average, scores above 50 bettered league-average, and any 10 points equates to 1 SD (percentiles will be listed for those who prefer them).
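
The conversion to the scouting scale can be sketched as below. This is a minimal Python illustration under two assumptions of mine: that the population standard deviation is the one used, and that scores are not capped at 20 or 80. The input values are hypothetical.

```python
from statistics import mean, pstdev


def scouting_scores(values, lower_is_better=True):
    """Map raw values to a scale where 50 is the sample mean and 10 points = 1 SD.

    For runs per PA (Step 7) lower is better, so the sign is flipped to make
    scores above 50 mean better than average.
    """
    mu = mean(values)
    sd = pstdev(values)  # assumption: population SD of the study sample
    if sd == 0:
        raise ValueError("all values are identical")
    sign = -1.0 if lower_is_better else 1.0
    return [50 + 10 * sign * (v - mu) / sd for v in values]


# Hypothetical runs-per-PA values for three pitchers.
print([round(s, 1) for s in scouting_scores([-0.10, 0.00, 0.10])])
```

The same function would serve the Strikeout Subscore with `lower_is_better=False`, since a higher K% is the desirable direction there.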

2013 American League Starting Pitcher Results

In the tables to follow, green text indicates a value that beat league-average by at least 1 SD (“very good”) while red text indicates a value that trailed league-average by at least 1 SD. Asterisks indicate left-handed throwers.

Sorting by Performance Score

Here are the Top 33 2013 AL SP per the Performance Score measure. Scherzer edged Darvish for the #1 spot as the top of the list somewhat mimicked the BBWAA’s Cy Young vote.

 photo FG-2013ALSPScoresTop33_zps7510f67e.jpg

Detroit and Cleveland each landed five in the Top 33 while Boston, Oakland, and Tampa Bay each placed four. Perhaps not coincidentally, those clubs were also the playoff teams.

And below are the Middle 34 by Performance Score.

 photo FG-2013ALSPScoresMid34_zps31e63487.jpg

And below are the Bottom 34 by Performance Score.

 photo FG-2013ALSPScoresBot34_zps37b53dab.jpg

Pedro Hernandez took last place by a comfortable margin as five other Twins joined him on this dubious list of 34. To further corner the market on these sorts of arms, the club has since inked another of the 34 to a three-year free-agent contract.

Sorting by Batted Ball Subscore

Given the system’s unique weighting of batted-ball types by direction, let us examine how the pitchers grade out on this metric. Below are the Top 20 sorted by Batted Ball Subscore. Masterson nosed out Deduno for top honors. Here, the Twins fare better as three besides Deduno crack the Top 20.

 photo FG-2013ALSPBattedBallSubscoresTop20_zps83100793.jpg

 One unique angle of this approach is that a pitcher can be a relatively strong batted-balls performer without being a noteworthy groundball-inducer if their outfield flyballs, line drives, and groundballs are skewed optimally to the least dangerous zones of the field per the batter’s handedness. Colon serves as a prime example of such a pitcher.

Below are the laggards who comprise the Bottom 20.

 photo FG-2013ALSPBattedBallSubscores2Bot20_zps7c4024d0.jpg

Garza’s 29 number as an American Leaguer is somewhat scary for the sort of money he’s likely to command as a free agent (he’d earn about a 35 Batted Ball Subscore if the Cubs NL data were factored in). Salazar’s numbers show how a very high rate of strikeouts and good control can successfully offset a dangerous distribution of batted balls by type and direction.

Admittedly, there is a third dimension to each of these batted balls (launch angle off the bat relative to the plane of the field) that would stand to further improve the batted-balls assessment if such information were available.

Other Directions

A variety of things can be done with these numbers, such as breaking them down further into LHB values and RHB values, identifying comparable pitchers who share similar subscores (MLBers to MLBers, MiLBers to MLBers), studying how these values evolve as the minor leaguer rises through the farm towards the majors and their predictive value as to future MLB performance, and so on. And then there’s also the reverse analysis — evaluating hitter performance under a similar lens.

On Tap

Perhaps the most intriguing research question that application of this system raises is, “Would advanced metrics familiarly used to grade pitcher performance yield better results if their equations included batted-ball directional terms?” As a first attempt to test those waters, I plan to follow this up with a post that shows how these results compare to those obtained by variants of more familiar advanced statistical-evaluation methods (SIERA, FIP, etc.). In the interim, I welcome whatever comments, criticisms, and suggestions this readership has to offer.


What Makes a Good Pinch-Hitter?

There seems to be quite a bit of disagreement in FanGraphs-land over what skills make for a good pinch-hitter. Some will argue that power is more important while others might say that on-base skills are more important. And while I know that it’s fashionable for the author to take a stance at the start of his article, I’m not going to comply. I’m just going to unsexily dive face-first into Retrosheet.

How can we solve this problem? How do we know what skills are best for pinch-hitters? Well, we can examine the base-out states that pinch-hitters confront and then derive from those base-out states specific pinch-hitter linear weights. We will then compare pinch-hitter linear weights to league-average linear weights to see which skills retain value. Simple.

We’re also going to split the data by league, since pinch-hitting tendencies in the National League are likely going to be different from American League tendencies. I’m going to use the last five years of data, because whim. The table below, then, includes league-average linear weights followed by AL and NL pinch-hitter linear weights (aside: the run values of linear weights are from 1999-2002, per Tango’s work. This won’t make a real difference in the results, however, since we’re examining relative value of different base-out states and not overall run-value of different events).

Relative Linear Weights, 2009-2013

Linear Weight       HR     3B     2B     1B     NIBB   Out      K
League Average      1.41   1.06   0.76   0.47   0.33   -0.300   -0.310
AL Pinch-Hitting    1.45   1.07   0.77   0.49   0.32   -0.305   -0.325
NL Pinch-Hitting    1.42   1.05   0.75   0.48   0.31   -0.290   -0.310

In the National League we can see that the value of home runs has increased slightly while walks have seen a corresponding decrease. This is because pinch-hitters often come to the plate when there are more outs than average. This sensibly decreases the value of walks and increases the importance of hurrying up and sending everyone around the bases already. This note comes with a caveat, however: the differences in linear weights are pretty small. It seems that managers in the National League are often forced to use the pinch-hitter to replace the pitcher, and therefore pinch-hitters are used in a lot of sub-optimal places.

The American League does not condone making everyone hit, however, and the impact upon pinch-hitting situations is pretty clear. The run value of home runs increases by .04 in pinch-hitting situations in the American League compared to the paltry .01 National League increase. In fact the run values of nearly all events increase: managers in the American League simply have more flexibility on when to use pinch-hitters, and so they are able to deploy their pinch-hitters in base/out situations that are strategically favorable.
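
The differences quoted above can be reproduced straight from the table. A small Python sketch, with the numbers copied from the table and variable names of my own choosing:

```python
EVENTS = ["HR", "3B", "2B", "1B", "NIBB", "Out", "K"]
LEAGUE_AVG = [1.41, 1.06, 0.76, 0.47, 0.33, -0.300, -0.310]
AL_PINCH = [1.45, 1.07, 0.77, 0.49, 0.32, -0.305, -0.325]
NL_PINCH = [1.42, 1.05, 0.75, 0.48, 0.31, -0.290, -0.310]


def deltas(pinch, baseline=LEAGUE_AVG):
    """Pinch-hitting run value minus league-average run value, per event."""
    return {e: round(p - b, 3) for e, p, b in zip(EVENTS, pinch, baseline)}


al = deltas(AL_PINCH)
nl = deltas(NL_PINCH)
for event in EVENTS:
    print(f"{event:>4}  AL {al[event]:+.3f}  NL {nl[event]:+.3f}")
```

The HR row shows the +.04 (AL) and +.01 (NL) gaps discussed in the text, while the NIBB row is negative in both leagues.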

What does this all mean? Like everything, this simultaneously means quite a bit and not much at all. Home run value increases while walk value decreases during average pinch-hitter situations, but the change isn’t huge. If you’re a general manager looking for a bench bat and there’s a home-run guy available with a 90 wRC+ and a plate-discipline guy with a 95 wRC+, take the plate-discipline guy. What if they both have a 90 wRC+? Then take the home-run guy. The pinch-hitter linear weights here are more of a tie-breaker than a game-changer. Power is more important than walks when it comes to being a pinch-hitter, but being a good hitter is more important than power.

Roster construction is never that simple, though. Ideally a team will have both power and plate-discipline guys available on the bench and then the manager will be able to leverage both of their abilities based upon the base/out state (and also the score/inning situation, which is outside the scope of this article). Managers tend to be kind of strategic dunces, though, so I’m not sure if I see this happening. If I were in charge of anything I would supply my manager with a chart of base/out states that list the team’s best pinch-hitters in each situation. I’m not in charge, though, and even if I were I would probably be ignored.

I am in charge of this article, however, which means that I can bring it to a close. I’ll note that another valid way to do this study would be to create WPA-based weights rather than run-expectancy weights. There’s a lot more noise in WPA, but it could still create some interesting conclusions. I reckon the conclusion would be pretty much the same though — what makes a good pinch-hitter? Well, a good hitter makes for a good pinch-hitter. And a little power doesn’t hurt.


Billy Hamilton: 2014 Leadoff Hitter?

The signing of Shin-Soo Choo gives the Rangers a player with strong on-base skills, solid power, and decent corner-outfield defense. The signing also left a gaping hole in the outfield for the Reds. Choo was one of three Reds starters that got on base at an above-average clip. He was easily the first- or second-best offensive player for the Reds in 2013. While he was miscast in center field, Choo brought a great deal of value to a team that needed his particular offensive skill set.

Walt Jocketty has stated that Billy Hamilton is the new center fielder and will likely bat leadoff for the 2014 Reds. Hamilton starting in center field should come as no surprise as the Reds do not have many other options. The wisdom of Hamilton batting leadoff is at least up for debate. You can easily go look at his projections for 2014 and draw your own conclusions, but I would like to at least provide some context.

Every baseball fan knows about Hamilton’s speed. He is ferociously fast. He stole 155 bases in the minors in 2012 and successfully stole 13 bases in 14 attempts in limited major league action in 2013. Speed is nice, but it is certainly not close to the most important skill for a player in the leadoff spot. Reds fans may know this best of all from watching Corey Patterson, Willy Taveras, and Drew Stubbs flounder at the plate. Those players were wickedly fast, but as the saying goes, you can’t steal first base. None of them had the on-base skills to bat leadoff, but they found themselves there anyway because of their speed. To avoid this list of failed Reds leadoff hitters, Billy Hamilton will need to get on base enough to justify being at the top of the order. That is the obvious question: can Hamilton get on base to use that blinding speed of his to turn singles into doubles and doubles into triples? There are signs that he can, but there are others suggesting he shouldn’t bat leadoff in 2014.

The 2012 season launched Hamilton into top-20 prospect territory. He obviously broke the stolen-base record, but he also showed some ability with the bat. In 132 games between high A and AA, Hamilton hit .311/.410/.420. He had 14 triples. His walk rate rose dramatically from the year before. Hamilton looked like a perfect leadoff hitter through two levels.

Then 2013 and AAA came. Hamilton slashed .256/.308/.343. His walk percentage dropped from 16.9% in 50 games in AA (small sample size noted) to 6.9% in 123 games in AAA. It was arguably his worst season as a professional. He looked completely overmatched at times and questions about his ability to get on base resurfaced.

So which is the real Billy Hamilton, and what does it mean for 2014? Hamilton’s ceiling is likely between his 2012 and 2013 minor league performance. In five seasons as a minor leaguer, Hamilton slashed .280/.350/.378. Coupled with his speed and potentially excellent defense in center field, that slash line could make him an All-Star-caliber player. The hope is that 2013 was a product of learning a new position and a significant drop in BABIP from over .370 to .310.

Still, Hamilton was very inconsistent at the plate in 2013 and didn’t prove he could hit AAA pitching for an extended period of time. The major leagues are an obvious step up in competition, and it would be surprising to see him match his .280/.350/.378 minor league career slash line in 2014. Steamer projects him to have a .305 OBP, and after last year, it is easy to see why.

While it is very possible Hamilton could surpass gloomy projections, the Reds probably shouldn’t risk it in 2014, at least at first. It makes much more sense to see how Hamilton adjusts to major-league pitching in a less important part of the lineup (7th for instance). He would get fewer at bats and would not be so heavily scrutinized if he struggled adjusting to the level. If he performs well, he can always move up in the lineup, but the Reds likely have better leadoff options than Hamilton to begin the year.

If Hamilton plays excellent defense in center field and has a good year on the bases, he will provide solid value for the Reds. To fill Choo’s shoes, he will have to hit closer to his career minor league mark as opposed to his 2013 numbers. In 2014, that may be difficult.


The Cascading Bias of ERA

There are so many problems with ERA that it’s unbelievable. I’m not going to sit here and tell you what’s wrong with ERA, though, because you’re probably smart. But there’s a problem with ERA, and it’s a problem that transcends ERA. It’s a problem that trickles down through FIP, xFIP, SIERA, TIPS, etc. etc. name your favorite stat, etc., and it’s something I don’t see talked about much.

All of our advanced pitcher metrics are trying to predict or estimate ERA. They’re trying to figure out what a pitcher’s ERA should be, and herein lies the problem: they could be exactly right and still be a little incorrect, due to one little assumption.

This assumption–that pitchers have no control over whether or not the fielders behind them make errors–seems easy to make. Like most assumptions, however, this one is subtly incorrect. Thankfully, the reason is pretty simple. Ground balls are pretty hard to field without making an error, and fly balls aren’t. And the difficulty gap is pretty huge.

How big? Well in 2013 there were precisely 58,388 ground balls, 1,344 of which resulted in errors. On the other hand a mere 98 out of 39,328 fly balls resulted in errors. That means that 2.3% of ground balls result in errors while a tiny 0.25% of fly balls do. It’s time to stop pretending that this gap doesn’t exist, because it does.
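
The arithmetic behind those percentages, as a quick Python check using only the counts quoted above:

```python
# 2013 league-wide counts quoted in the paragraph above.
ground_balls, ground_ball_errors = 58_388, 1_344
fly_balls, fly_ball_errors = 39_328, 98

gb_error_rate = ground_ball_errors / ground_balls
fb_error_rate = fly_ball_errors / fly_balls

print(f"GB error rate: {gb_error_rate:.2%}")
print(f"FB error rate: {fb_error_rate:.2%}")
print(f"GB errors are about {gb_error_rate / fb_error_rate:.1f}x as likely")
```

Put as a ratio, a ground ball was roughly nine times as likely as a fly ball to end in an error.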

So now that we know this, what does it mean? Well it means this: ground-ball pitchers will have an ERA that suggests they are better than their actual value, while fly-ball pitchers have the opposite effect. Pitchers who allow contact, additionally, are worse off because every time they allow contact they put pressure on their defense. They’re giving themselves a chance to stockpile unearned runs which nobody will count against them if they’re only looking at ERA derivatives. When it comes to winning baseball games, however, earned runs don’t matter. Runs matter.

I am going to call this the “pressure on the defense” effect, which will cause some pitchers to be more prone to unearned runs than other pitchers. How big is this effect? Well, not huge. The gap between the best pitcher and worst pitcher in the league is roughly three runs over the course of the season. But keep in mind that three runs is about a third of a win, and a third of a win is worth about $2 million. We’re not discussing mere minutiae here.

In order to better quantify this effect I have developed the xUR/180 metric, which will estimate how many unearned runs should have taken place behind each pitcher with an average defense. Below is a table of all qualified starting pitchers from 2013 ranked according to this metric. I have also included how many unearned runs they actually allowed in 2013, scaled to 180 innings for comparative purposes.

# Name xUR/180 UR/180
1 Joe Saunders 7.24 9.84
2 Jeff Locke 7.11 4.33
3 Wily Peralta 6.97 17.7
4 Edwin Jackson 6.88 13.36
5 Edinson Volquez 6.81 6.35
6 Kyle Kendrick 6.77 8.9
7 Justin Masterson 6.66 0.93
8 Doug Fister 6.58 5.19
9 Wade Miley 6.57 7.12
10 Rick Porcello 6.51 2.03
11 Jerome Williams 6.47 7.45
12 Jorge de la Rosa 6.43 5.38
13 Yovani Gallardo 6.42 7.99
14 A.J. Burnett 6.35 8.48
15 Scott Feldman 6.32 8.94
16 Mike Leake 6.26 5.62
17 Andrew Cashner 6.25 8.23
18 Felix Doubront 6.22 6.66
19 Jhoulys Chacin 6.13 5.48
20 Kevin Correia 6.13 2.92
21 Jeremy Guthrie 6.13 3.41
22 Mark Buehrle 6.11 5.31
23 Andy Pettitte 6.05 7.78
24 Hyun-Jin Ryu 6.01 2.81
25 Jeff Samardzija 6.0 5.07
26 C.J. Wilson 5.93 11.03
27 CC Sabathia 5.9 8.53
28 Jon Lester 5.84 4.22
29 Ryan Dempster 5.8 10.52
30 Tim Lincecum 5.77 5.48
31 Hiroki Kuroda 5.72 4.48
32 Bud Norris 5.72 7.15
33 Jordan Zimmermann 5.69 3.38
34 Patrick Corbin 5.68 1.73
35 Dillon Gee 5.67 3.62
36 Ervin Santana 5.67 7.68
37 Kris Medlen 5.66 8.22
38 Bronson Arroyo 5.63 2.67
39 Stephen Strasburg 5.62 9.84
40 Mat Latos 5.62 6.85
41 Ubaldo Jimenez 5.61 7.9
42 Jarrod Parker 5.61 4.57
43 John Lackey 5.6 5.71
44 Gio Gonzalez 5.55 5.53
45 Lance Lynn 5.55 2.68
46 Eric Stults 5.5 7.09
47 Felix Hernandez 5.49 4.41
48 Zack Greinke 5.48 2.03
49 Hisashi Iwakuma 5.47 3.28
50 Jose Quintana 5.46 4.5
51 Ian Kennedy 5.46 8.95
52 Ricky Nolasco 5.45 7.23
53 R.A. Dickey 5.44 6.42
54 Jeremy Hellickson 5.4 3.1
55 Homer Bailey 5.38 3.44
56 Miguel Gonzalez 5.36 9.47
57 Madison Bumgarner 5.34 5.37
58 James Shields 5.32 1.58
59 Adam Wainwright 5.32 2.99
60 Bartolo Colon 5.32 3.79
61 Derek Holland 5.3 7.61
62 Kyle Lohse 5.26 3.63
63 Cole Hamels 5.18 4.91
64 Anibal Sanchez 5.18 3.96
65 David Price 5.18 8.7
66 Chris Sale 5.14 6.73
67 Justin Verlander 5.06 8.25
68 Chris Tillman 5.04 1.75
69 Jose Fernandez 5.03 5.23
70 Shelby Miller 4.98 6.24
71 Matt Cain 4.97 2.93
72 Clayton Kershaw 4.9 5.34
73 Julio Teheran 4.9 2.92
74 Matt Harvey 4.86 1.01
75 Cliff Lee 4.79 4.86
76 Travis Wood 4.78 3.6
77 Dan Haren 4.78 4.26
78 Yu Darvish 4.53 1.72
79 A.J. Griffin 4.46 5.4
80 Mike Minor 4.46 5.29
81 Max Scherzer 4.15 3.36
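
To make the two columns a little more concrete, here is a rough Python sketch. The 180-inning scaling follows the description above; `expected_errors` is only a crude stand-in of my own that applies the league-wide 2013 error rates to a batted-ball mix. It ignores line drives and the run cost of each error, so it is not the xUR/180 formula itself, and the pitchers in the example are hypothetical.

```python
# League-wide 2013 counts quoted earlier in the article.
GB_TOTAL, GB_ERRORS = 58_388, 1_344
FB_TOTAL, FB_ERRORS = 39_328, 98


def per_180(unearned_runs, innings):
    """Scale an unearned-run total to a 180-inning workload."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return unearned_runs * 180 / innings


def expected_errors(ground_balls, fly_balls):
    """Errors an average defense would commit behind a given batted-ball mix.

    Assumption: league-average error rates apply uniformly to every pitcher.
    """
    return (ground_balls * GB_ERRORS / GB_TOTAL
            + fly_balls * FB_ERRORS / FB_TOTAL)


# Two hypothetical pitchers with the same number of batted balls.
print(round(expected_errors(ground_balls=300, fly_balls=200), 2))
print(round(expected_errors(ground_balls=200, fly_balls=300), 2))
print(round(per_180(unearned_runs=1, innings=193), 2))
```

Even this simplified version shows the direction of the effect: shifting 100 batted balls from the air to the ground adds about two expected errors.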

 

Some notes:

  • Groundballs are still good, they’re just not as good.
  • A combination of groundballs and contact leads to more unearned runs. The pitchers at the top of the board demonstrate this.
  • A combination of strikeouts and fly balls will tend to limit the impact of unearned runs, as demonstrated by the bottom of the board.
  • Errors that occur on fly balls tend to be more costly than errors on ground balls. This metric accounts for that gap, but the low likelihood of fly-ball errors makes this bullet point’s effect relatively negligible.
  • Line drives are similar to fly balls in terms of error rate, but line-drive errors tend to be less costly than fly-ball errors.

I’m sure there is more to be gleaned, but the point is this: we need to stop trying to predict ERA, because ERA is not a pure value stat. We should be trying to figure out how many runs a pitcher should/should have given up, because that’s what matters. Runs matter, and who cares if they’re unearned? They’re kind of the pitcher’s fault, anyways.


xHitting (Part 2): Improved Model, Now with 2013 Leaders/Laggards

Happy holidays, all.  It took me a while, but I finally have the second installment of xHitting ready.  First off, thank you to all those who read/commented on the first piece.  For those who didn’t get a chance to read it, the goal here is to devise luck-neutralized versions of popular hitter stats, like OPS or wOBA.  A main extension over existing xBABIP calculators is that this approach offers an empirical basis to recover slugging and ISO, by estimating each individual hit type.

I’ve returned today with an improved version of the model.  Highlights:

  • One more year of data (now 2010-2013)
  • Now includes batted-ball direction (all player-seasons with at least 100 PA)
  • FB distance now recorded for all player-seasons with at least 100 PA

(There’s no theoretical reason for the 100 PA cutoff, only that I was grabbing some of the new data by hand and couldn’t justify the time to fetch literally every single player.)

I have also relaxed the uniformity of peripherals used for each outcome.  At least one reader asked for this, and after thinking about it a while, I decided I agree more than I disagree.  The main advantage of imposing uniformity was that it ensures the predicted rates (when an outs model is also included) sum to 100%.  But it is true that there are certain interactions or non-linearities that are important for some outcomes, but not others.  Including these where they don’t fully belong has a cost to standard errors/precision, and to intuitive interpretation.  To ensure rates still sum to 100%, there’s no longer an explicit ‘outs’ model; outs are simply assumed to be the remainder.
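
In code terms, treating outs as the remainder looks something like the sketch below (Python). The predicted rates are hypothetical numbers for one player-season, and I am assuming here that only the four hit types are modeled directly.

```python
def outs_rate(predicted_rates):
    """Outs are whatever probability mass the modeled outcomes leave over."""
    modeled = sum(predicted_rates.values())
    if not 0.0 <= modeled <= 1.0:
        raise ValueError("predicted rates must sum to between 0 and 1")
    return 1.0 - modeled


# Hypothetical predicted rates for one player-season.
predicted = {"1B": 0.150, "2B": 0.045, "3B": 0.005, "HR": 0.030}
print(round(outs_rate(predicted), 3))
```

Because outs are never estimated on their own, the full set of rates sums to 100% by construction, whichever peripherals each hit-type model uses.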

For those curious, below I display regression results for each outcome and its respective peripherals.  You can otherwise skip below if these are not of direct interest.

(The sample includes all player-years with at least 100 plate appearances between the 2010 and 2013 MLB seasons.  Park factors denote outcome-specific park factors available on FanGraphs.  Robust standard errors, clustered by player, are in parentheses; *** p<0.01, ** p<0.05, * p<0.1)

The new variables seem to help, as each outcome is now modeled more accurately than before (by either R² or RMSE).  For comparison, here are the R² values of the original specification:

  • 0.367 for singles rate
  • 0.236 for doubles rate
  • 0.511 for triples rate
  • 0.631 for HR rate

Something else I noticed: for balls that stay “inside the fence,” both pull/opp and actual side of the field matter.  Consider singles: the ball needs to be thrown to 1st base (right side of infield) specifically.  Thus an otherwise-equivalent ball hit to the left side is not the same as one hit to the right side, since the defensive play is harder to make from the left side.  Similarly, hitting the ball to left field is less conducive for triples than hitting the ball to right field.

But hitting the ball to the left side as a lefty is not the same as hitting it there as a righty, since one group is “pulling” while the other group is “slapping.”  The direction x handedness interactions help account for this.

How well do the predicted rates do in forecasting?  For singles, doubles, and triples, the predicted rates do unambiguously better than realized rates in forecasting next season’s rates.  Things are a little less clear for home runs, which I will expand on below.

Although predicted HR rate shows a slight edge in Table 1, the pattern often reverses (for HR only) if you use a different sample restriction — say requiring 300 PA in the preceding season.  (For other outcomes, the qualitative pattern from Table 1 still holds even under alternative sample restrictions.)

So home runs appear to be a potential problem area.  What should we do when we need HR to compute xAVG/xSLG/xOPS/xWOBA, etc.?  Should we:

  1. Use predicted HR anyway?
  2. Use actual HR instead?
  3. Use some combo of actual and predicted HR?

Empirically there is a clear answer for which choice is best.  But before getting to that, let’s take a look at whether predicted home-run rate tells us anything at all in terms of regression.  That is, if you’ve been hitting HR’s above/below your “expected” rate, do you tend to regress toward the prediction?

The answer to this seems to be “yes,” evidenced by the negative coefficient on ‘lagged rate residual’ below.

So, although realized HR rate is sometimes a better standalone forecaster of future home runs, predicted HR rate is still highly useful in predicting regression.  Making use of both, it seems intuitively best to use some combo of actual and predicted HR rate for forecasting.

This does, in fact, seem to be the best option empirically.  And this is true whether your end outcome of interest is AVG, OBP, SLG, ISO, OPS, or wOBA.
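
A combo of this kind can be sketched as a simple weighted average. This is a Python illustration only: the 0.5 weight is a placeholder of mine, not the estimated value, and the rates in the example are hypothetical.

```python
def blended_hr_rate(actual, predicted, weight_on_actual=0.5):
    """Weighted average of realized and predicted HR rate.

    The default weight is a placeholder; in practice it would be estimated
    from the data.
    """
    if not 0.0 <= weight_on_actual <= 1.0:
        raise ValueError("weight_on_actual must be between 0 and 1")
    return weight_on_actual * actual + (1.0 - weight_on_actual) * predicted


# Hypothetical hitter who out-homered his peripherals last season.
print(round(blended_hr_rate(actual=0.050, predicted=0.030), 3))
```

Written another way, the blend equals the actual rate minus (1 - weight) times the residual (actual minus predicted), which is the same regress-toward-the-prediction pattern that the negative coefficient on the lagged residual points to.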

Observations:

  • (Option 1 = predicted HR only; Option 2 = actual HR only; Option 3 = combo)
  • Whether you use option 1, 2, or 3, xAVG and xOBP make better forecasters than actual past AVG or OBP
  • Option 1 does not do well for SLG, ISO, OPS, or wOBA
  • ^This was not the case in the previous article, but results to that point had sort of a funky sample, having recorded flyball distance only for a partial list of players
  • Option 2 “saves” things for xOPS and xWOBA, but still isn’t best for SLG or ISO
  • Option 3 makes the predicted version better for any of AVG, OBP, SLG, ISO, OPS, or wOBA

End takeaways:

  • The original premise that you can use “expected hitting,” estimated from peripherals, to remove luck effects and better predict future performance seems to be true; but you might need to make a slight HR adjustment.
  • The main reason I estimate each hit type individually is for the flexibility it offers in subsequent computations.  Whether you want xAVG, xOPS, xWOBA, etc., you have the component pieces that you need.  This would not be true if I estimated just a single xWOBA while other users preferred xOPS or xISO.
  • A major extension over existing xBABIP methods is that this offers an empirical basis to recover xSLG.  The previous piece actually provides more commentary on this.
  • Natural next steps are to test partial-season performance, and also whether projection systems like ZiPS can make use of the estimated luck residuals to become more accurate.

Finally, I promised to list the leading over- and underachievers for the 2013 season.  By xWOBA, they are as follows:

Overachievers (250+ PA)
Name                  2013 wOBA   2013 xWOBA   Difference
Jose Iglesias         0.327       0.259        0.068
Yasiel Puig           0.398       0.338        0.060
Colby Rasmus          0.365       0.315        0.050
Ryan Braun            0.370       0.321        0.049
Ryan Raburn           0.389       0.344        0.045
Mike Trout            0.423       0.379        0.044
Junior Lake           0.335       0.292        0.043
Matt Adams            0.365       0.323        0.042
Justin Maxwell        0.336       0.295        0.041
Chris Johnson         0.354       0.314        0.040

Underachievers (250+ PA)
Name                  2013 wOBA   2013 xWOBA   Difference
Kevin Frandsen        0.286       0.335        -0.049
Alcides Escobar       0.247       0.296        -0.049
Todd Helton           0.322       0.369        -0.047
Ryan Hanigan          0.252       0.296        -0.044
Darwin Barney         0.252       0.296        -0.044
Edwin Encarnacion     0.388       0.429        -0.041
Josh Rutledge         0.281       0.319        -0.038
Wilson Ramos          0.337       0.374        -0.037
Yuniesky Betancourt   0.257       0.294        -0.037
Brian Roberts         0.309       0.345        -0.036

Comments/suggestions?