Mike Minor and All the Home Runs

Mike Minor just keeps giving up home runs. To be fair, he’s a fly ball pitcher and home runs will come with that. And actually, he’s given up the long ball a little more frequently than he should (10.5% HR/FB) throughout his career, so maybe this shouldn’t come as such a surprise.

His 1.51 HR/9 this season is 7th among pitchers who have thrown as many innings as Minor has (83.1). But he’s had some bad luck this year – .343 BABIP, 14.9% HR/FB – and he’s been stricken with a… different kind of offseason injury plus shoulder tendinitis in Spring Training, so it’s reasonable to think that’s where the issue starts and ends. But after personally seeing him give up four home runs in a rehab game against Reds double-A squad Pensacola, it feels like something may be wrong. So I’d like to examine this a little more, if that’s ok.

I imagine that if the problem is something more than just arm trouble or bad luck, it should show up in his numbers somewhere. So I’ll compare his PITCHf/x, pitch type, and heat map data from this season – a not-so-good one – and last season – a quite good one.

First, I just want to show again that he’s been much less lucky this season. It feels to me like there’s something more to it, but luck could be the problem.

babip minor

While that may be so, giving up more home runs could be the result of a change in the amount he’s throwing each of his pitches and the velocity of those pitches.

pitch type

So there’s actually been a small uptick in Minor’s velocity since last season, and he’s been throwing more sliders and fewer changeups. He’s been showing that same trend since his debut and seemed to find a happy medium last year. Those changes from 2013 to this year seem significant, and I think they might be playing a part in his production.

First, we’ll compare how his pitches have been moving and how effective they’ve been the last two years. Rather than show four more tables with a bunch of numbers, here’s a quick summary: 1) His changeup is moving less than it did last year, and it’s getting crushed. 2) His fastball and slider are both moving more than they did last year – but only by a little – and are getting crushed. So those things aren’t great. The BABIP on his changeup is the only one that isn’t outrageous; it’s .281 this year. Opponents’ BABIPs on his fastball and slider are .394 and .350, respectively, which are both pretty crazy. So those are two more points in favor of a ton of bad luck going Minor’s way, and perhaps good signs of better luck in the near future. On to the next thing.

Maybe his issue has been locating the ball. His walk rate is up a little bit from last year, so it could be that he’s having trouble pitching where he did in 2013. I thought showing his heat maps might illustrate that, but, well…

2013 heat map 2014 heat map

They don’t. Not really, anyway. A lot of his pitches this year, like last year, are right around the middle of the plate, though they were spread out a little more last year. I’m not sure what exactly that means, but maybe he’s not locating quite as well this year.

From what I can gather, it seems like Mike Minor has seen several little changes. (A little higher release point turns into less movement on a pitch every now and then, which turns into everyone crushing your slider, etc.) And a lot of little changes can make a big difference – if things aren’t the same, they’ll be different, right?

Now for a little good news – though I hesitate to call it that. Minor’s historically been a “2nd half pitcher.” Hitters go from a .330 wOBA against him in the 1st half to a .300 after the break, and his FIP and xFIP see some drops as well. In addition, his xFIP is 3.61, which is actually a little better than it was last season. A turnaround doesn’t seem terribly far off for Minor. Cut out a little of that horribly bad luck, and Atlanta’s rotation gets better. Those things might not mean much at all, but maybe it can give Braves fans some hope.


Roster Doctor: Los Angeles Dodgers

With a payroll north of $200,000,000, you would expect the Los Angeles Dodgers to field a competitive team, and indeed they have. As we emerge from the All-Star break, they are neck and neck with the hated Giants, heading into a pennant chase that could be one for the ages. The Dodgers have four of the most watchable players in baseball (Kershaw, Greinke, Puig, and Ramirez) and a farm system with enough talent to supply reinforcements either directly or via trades. The team is not without needs, however. Like almost any team, the Dodgers have some bullpen depth issues, but they just alleviated those somewhat by recalling Paco Rodriguez, a non-flamethrower who nevertheless generates a ton of Ks. Catching has been a riddle for manager Don Mattingly as well. He’s had to use four backstops, none of whom has amassed enough appearances to qualify for the batting title, and of whom only the stalwart but venerable A.J. Ellis has provided anything even approaching an offensive contribution. (Well, Miguel Olivo made an offensive contribution of a different kind.)

But the biggest problem has been Matt Kemp, who dug a Tunguskan-size crater in center field before Mattingly more or less permanently shunted him to left. Kemp has the worst WAR (-1.3) for any position player qualifying for the batting title except Domonic Brown. Kemp’s hitting about as well as last year’s (modest) effort, but his defense has gone from bad (-0.6 dWAR) to eye-watering (-2.5). Whether you’re new school (zone rating) or old school (range factor), you will find nothing to like in Kemp’s defensive metrics. The move to left has probably mitigated the defensive damage he’s doing, but mainly by reducing his opportunities to come within proximity of the ball. His range in left is almost as far below the league as his range in center, although he’s making fewer errors. Kemp’s agent thinks he can still play center, and so presumably do Matt and his mom. That about exhausts the list.

In one sense this is a simple problem that the Dodgers can solve without any outside help. They could bench Kemp immediately. Center field prospect Joc Pederson is murdilating the PCL’s beleaguered pitchers to the tune of a 1.045 OPS, and yes, that’s good even in the PCL. Pederson is third in the league in OPS, behind two guys who are at least five years older. To the extent Pederson would struggle against major league lefties, he could be platooned with righty Scott Van Slyke, with Andre Ethier sliding between center and left. This is a rare situation where a manager can (almost) unilaterally boost his team’s playoff chances with a single lineup change.

And yet … Kemp can still hit. His .752 OPS is third on the Dodgers among batting qualifiers, and while that’s over 80 points off his career number, it still represents useful offense. At this stage in his career, Kemp’s value would dramatically increase if he didn’t have to put on a glove. The question is how to allocate that increased value among the Dodgers and their potential trade suitors. There are four playoff-contending AL teams whose DHs are either injured, ineffective, or both:

New York Yankees (Carlos Beltran .698 OPS)

Kansas City Royals (Billy Butler .675)

Cleveland Spiders (Nick Swisher .641)

Seattle Mariners (Corey Hart .611)

Kemp would immediately boost any of these teams’ offenses. The Yankees could take much of Kemp’s anvil-like contract ($20 m/yr through 2019), but have few if any prospects to offer. The Royals and Mariners are in the opposite situation: good talent to trade but limited ability to absorb such a huge financial hit. Cleveland, sadly, can’t really employ either approach, and in any case hitting is not their main need.

Dodgers president Stan Kasten’s general strategy upon assuming command was to throw immense amounts of Guggenheim money at the major league roster first, and then reinforce the farm system to ensure a steady stream of cost-controlled reinforcements for the future. Part I of the plan is working well, and Part II is underway with Corey Seager, Julio Urias and Alex “Van Gogh” Guerrero headlining a good collection of upper level minor league talent (non-Pederson division). The Dodgers could go either way here: begin their slow march away from the payroll tax penalty by banishing Kemp to the Bronx, or recharge the lower reaches of their farm system with talent from either of the smaller market franchises who could be in on Kemp. They may not succeed in moving Kemp, but if they can it would provide at least a small edge in a pennant race that looks sure to go to the wire.


Bringing Bill James’ Famous Arbitration Case to 2014

“I helped prepare arbitration cases for George three straight years in the 1980’s… George had led the American League in errors the first year that we prepared a case for him. We were wondering what to do about that, so I drew up an exhibit entitled ‘What Was the Cost of George Bell’s Errors?’ The exhibit showed that while Bell had led the league in errors with 11, none of the errors had actually cost his team anything. Of the 11 errors, only about three led to unearned runs, all had occurred in games which Toronto had won anyway, and in those three games, Bell had driven in something like seven runs.”

Bill James, The New Bill James Historical Abstract

 

The case that Bill James made for George Bell in 1985, and later informed his readers about when he released his Historical Abstract, always fascinated me. As someone who is a big believer that fielding metrics have a long way to go (especially behind the plate), this arbitration case was my Zihuatanejo, that far away place that always gave me hope that errors were really as pointless a statistic as they seemed.

However, as Bill James points out in the rest of George Bell’s player ranking, the fact that nothing came of Bell’s errors in 1985 (his first arbitration year), as well as 1986 and 1987, when James used the same exhibit, was rather noteworthy. Although errors are definitely not the be all and end all of fielding statistics, one would have to imagine that some ill had to come of them, at some point, right?

With the All-Star break upon us, and sadly no real baseball for the last four days, the chance finally presented itself to look into how much errors actually cost the erring player’s team. At the halfway point, there were exactly 20 players who had committed 10 or more errors in 2014. Since there was time to kill without baseball on, I decided to pore over some box scores and figure out just how much each of those leading “error-men” had cost their teams. Using Baseball-Reference’s fielding game logs, it was easy to find the games in which each player had made his errors, and going through the play-by-play made it (usually) straightforward to tell whether an error led to a run or not.

For this study, I created a chart with columns for all of the parts mentioned in Bill James’ arbitration case: total errors, unearned runs as a result of those errors, games that the team lost when that player committed an error, and RBI in those games that were lost. The final column (RBI in games lost) was tweaked a tiny bit due to the inclusion of one other column, called “true losses.” This measures how many games the team lost by a margin equal to or smaller than the number of runs the player’s error cost the team. For example, if Pedro Alvarez made an error that cost his team three runs, and the Pirates lost 4-3, that would be a true loss. Or, if Derek Dietrich made an error that cost his team one run, and the Marlins lost 3-2, that would also be a true loss. Finally, if the game went to extra innings and was a loss, any error worth one run or more was counted as a true loss. Therefore, if Josh Donaldson committed an error that cost his team only one run and the A’s then lost 10-8 in extra innings, that would still count as a true loss, because (hypothetically) the extra innings would never have occurred.
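The “true loss” rule above can be written down as a small function. This is only a sketch of the rule as described in the paragraph; the function and parameter names are made up for illustration, and the three checks at the bottom are the Alvarez, Dietrich, and Donaldson examples from the text.

```python
def is_true_loss(runs_for, runs_against, runs_cost_by_error, extra_innings=False):
    """Return True if a loss counts as a "true loss" under the rule described above.

    runs_cost_by_error is the number of unearned runs traced to the player's
    error(s) in that game. All names here are illustrative, not from the study.
    """
    if runs_for >= runs_against:
        return False  # the team did not lose
    if runs_cost_by_error < 1:
        return False  # the error cost nothing on the scoreboard
    if extra_innings:
        return True   # any costly error counts if the loss came in extra innings
    margin = runs_against - runs_for
    return margin <= runs_cost_by_error


# The three examples from the text
print(is_true_loss(3, 4, 3))                       # Alvarez: lost 4-3, error cost three runs
print(is_true_loss(2, 3, 1))                       # Dietrich: lost 3-2, error cost one run
print(is_true_loss(8, 10, 1, extra_innings=True))  # Donaldson: lost 10-8 in extras
```

Note that the same 10-8 loss in nine innings would not qualify, since a one-run error cannot cover a two-run margin.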

Now this is obviously not a foolproof study. There is no way to say for sure that the error committed for one run was any more the cause of the loss than the pitcher who gave up the home run the next inning. It is also starting to get into a bit of a messy “Butterfly Effect” situation, meaning that there is no way of knowing how the rest of the game (or our lives, bro) would be different if Jose Reyes hadn’t booted that grounder in the fifth inning.

However, it was a fun study to put together, and it can reveal how little (or in poor Starlin Castro’s case, how much) errors truly change a game. Here’s the official chart:

What Was the Cost of Player X’s Errors?

Name Pos Errors UER from E Team L’s True L’s RBI in True L’s
Pedro Alvarez 3B 20 11 11 4 4
Josh Donaldson 3B 15 6 5 1 0
Ian Desmond SS 15 10 8 2 2
Asdrubal Cabrera SS 14 12 9 1 0
Jose Reyes SS 13 7 9 2 0
Brandon Crawford SS 13 6 5 0 0
Lonnie Chisenhall 3B 13 6 5 0 0
Everth Cabrera SS 13 7 6 0 0
Brad Miller SS 13 7 5 1 0
Martin Prado 3B 12 13 8 2 2
Jonathan Villar SS 12 14 8 0 0
David Wright 3B 11 5 4 1 0
Starlin Castro SS 11 12 6 5 0
Jean Segura SS 11 8 1 0 0
Elvis Andrus SS 11 7 8 0 0
Yan Gomes C 11 4 6 0 0
Chris Owings SS 11 8 7 2 1
Derek Dietrich 2B 11 6 5 1 0
Jarrod Saltalamacchia C 10 5 7 1 0
Hanley Ramirez SS 10 7 7 1 0

Key: UER from E – unearned runs from errors; Team L’s – team losses; True L’s – true losses (described above); RBI in True L’s – how many RBIs the player had in said True Loss games

 

Let’s tackle this table column by column.

Well, I don’t think a historiography of each player’s name is necessary in today’s article, so let’s skip over to the position column. It is interesting to note how many players from the left side of the infield sit atop the error leaderboard. There’s nobody from the outfield to be found (the “top” outfielder by errors is Sports Illustrated cover boy George Springer, with seven), and there are only three players whose main position isn’t third base or shortstop. One interesting branch of this study would be to look at whether there is a correlation between a player’s position on the diamond and how frequently an error leads to runs or “true losses.” My gut instinct would be to guess no, but maybe errors in the outfield are often for more bases, and therefore more likely to lead to a run – just a hypothesis.

Jumping over to the errors column, Alvarez’s 20 errors stood out, as the difference between his total and the second-place total is the same as the difference between the second-place total and the bottom of our table. In fact, seeing that high total made me curious as to just how many errors it would take to get into the record books. Well, if you’re including the entire history of baseball, the answer is: like a bajillion. Obviously the game was entirely different, but it’s hard to imagine that Herman Long’s 122 errors in 1889 weren’t embarrassing even back then. The record for errors in a single season since 1952 is 44 by Robin Yount in 1975, and the record since 1980 is Jose Offerman with 42 in 1992. So while Alvarez’s 20 errors may be pacing the league by a good margin now, it’s fair to say he won’t be joining even the modern record books this season.

The next column looks at unearned runs derived from each player’s errors, and the variance is quite extreme. With a range from only four runs (it’s interesting to note that the catchers have the two lowest unearned runs tallies; maybe that positional study would provide some analysis after all) all the way up to 14, there doesn’t seem to be too close of a connection between the number of errors and the number of unearned runs. For instance, Josh Donaldson has committed three more errors than Jonathan Villar in 2014, but Villar’s errors have led to eight more runs. This brings up the question of whether unearned run prevention is simply luck, or whether some teams (and pitchers) respond better after an error is committed in the field.

The A’s are one of baseball’s best teams, and have an excellent pitching staff, so it isn’t too surprising that Donaldson’s unearned runs are among the lowest, especially in comparison to how many errors he has committed. On the other end of the spectrum are players like Villar and Castro, who play on rebuilding teams, and it is unsurprising to see their names next to some of the highest unearned run totals. However, there is most certainly a lot to be said for luck playing a role in how many unearned runs come along after an error. For example, teammates Asdrubal Cabrera and Lonnie Chisenhall find themselves on opposite ends of the spectrum in terms of unearned runs after errors, a definite sign of the role random chance plays in unearned run prevention.

One other note on the extreme variance in unearned runs tied to errors: the variance could also come from what kind of error was made. A bobbled ball that never even gets thrown across the infield does only one base of harm, whereas an overthrow (like many of Alvarez’s errors) may lead to two bases of harm. One could also try to really dig deep into this data and see if younger, more inexperienced players were more likely to commit errors late in games, when the pressure was ratcheted up, and maybe those errors were more likely to be costly. However, with this study, the idea is simply to get a feel for another way of looking at errors, and the main point that remains here is that there is a lot of luck in whether a player’s error costs his team a run or not.

There isn’t a whole lot to be said about the team losses column, as committing an error does indeed swing the pendulum (or WPA chart) towards a loss, but so minimally that it wouldn’t even bother one of Poe’s victims. For instance, implying that Jean Segura (only one team loss in games in which he committed an error) timed his errors better than Elvis Andrus (eight team losses in games in which he committed an error) is really just saying that the Brewers are better than the Rangers, which they are, but that doesn’t reflect on the individual player at all. That comparison is especially interesting given that Andrus’ errors have actually led to fewer unearned runs than Segura’s.

The next column, the “true losses” column, is where the fallacy of the error as a statistic truly shows its colors. The only players who cost their teams more than two wins in the first half (with teams having played well over 90 games in 2014, so far) were the league leader, Alvarez, and the incredibly unlucky Starlin Castro. Castro’s case could be an entire article itself, and the poor timing of his errors is remarkable. The fact that the Cubs have only lost six games in which he has committed an error, and five of those can be considered “true losses” is very much a statistical anomaly. Consider that in this chart there are 124 team losses outside of Castro’s Cubs. Of those 124 losses, 19 were true losses, or just over 15 percent. In Castro’s case, over 83 percent of his team losses were true losses, such a far outlier it warrants special attention.

Even when including Castro’s remarkable true loss numbers, the percent of losses that could be considered, even hypothetically, the erring player’s fault is merely 18.5 percent, and that’s not even accounting for all the games that the teams still won in which one of the listed players committed an error. This is a good time to point out that this study obviously does not take into account any of the good, run-saving plays that these fielders make, and even still the total impact on a team is minimal. As seen in Pedro Alvarez’s row, he drove in plenty of runs in those games in which he cost his team, and with his strong range, some of those errors he made likely would have been singles, with the majority of third basemen failing to even get to the ball. Josh Donaldson and David Wright stand out as particularly strong cases of top-notch fielders who, because of their strong range, get to more groundballs, but get to them in difficult positions, thus increasing the likelihood of an error.
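For anyone who wants to check the true loss percentages quoted above, they can be re-derived from the Team L’s and True L’s columns of the chart. The snippet below is just that arithmetic; the numbers are copied from the chart, and the variable names are illustrative.

```python
# (player, team losses, true losses), copied from the chart above
rows = [
    ("Pedro Alvarez", 11, 4), ("Josh Donaldson", 5, 1), ("Ian Desmond", 8, 2),
    ("Asdrubal Cabrera", 9, 1), ("Jose Reyes", 9, 2), ("Brandon Crawford", 5, 0),
    ("Lonnie Chisenhall", 5, 0), ("Everth Cabrera", 6, 0), ("Brad Miller", 5, 1),
    ("Martin Prado", 8, 2), ("Jonathan Villar", 8, 0), ("David Wright", 4, 1),
    ("Starlin Castro", 6, 5), ("Jean Segura", 1, 0), ("Elvis Andrus", 8, 0),
    ("Yan Gomes", 6, 0), ("Chris Owings", 7, 2), ("Derek Dietrich", 5, 1),
    ("Jarrod Saltalamacchia", 7, 1), ("Hanley Ramirez", 7, 1),
]

def true_loss_share(subset):
    """Return (team losses, true losses, true losses as a percent of team losses)."""
    losses = sum(team_l for _, team_l, _ in subset)
    true_losses = sum(true_l for _, _, true_l in subset)
    return losses, true_losses, 100.0 * true_losses / losses

everyone = true_loss_share(rows)
without_castro = true_loss_share([r for r in rows if r[0] != "Starlin Castro"])
castro_only = true_loss_share([r for r in rows if r[0] == "Starlin Castro"])

print("All 20 players:  %d losses, %d true losses, %.1f%%" % everyone)
print("Without Castro:  %d losses, %d true losses, %.1f%%" % without_castro)
print("Castro only:     %d losses, %d true losses, %.1f%%" % castro_only)
```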

All of this being said, let’s not take too much away from the potential impact of an error. It is indeed a mistake, and can have a negative impact on the team in ways that go beyond the scoreboard. For instance, every error means an extra batter that the pitcher has to face, and therefore more pitches on his final pitch count. If the bases were clear before the error, the pitcher has to pitch out of the stretch now, and the threat of a potential steal is in play. If a certain player is prone to errors, it may also lead to his pitcher not having confidence in the defense behind him, and therefore getting himself in trouble by trying to do too much on the mound. Other fielders may feel that they have to cheat in the commonly erring fielder’s direction if there is likely to be a mistake made, which can mess up a team’s defensive positioning. Finally, there’s the fact that for all of us here at FanGraphs who realize the harm in relying on errors too much as a statistic, there are still those in baseball who do rely on it, and committing enough errors in the field may lead to a player riding the pine for a few days.

In the end, it’s fair to say that errors are one metric out of many. They have historically been overused, and hopefully the chart above has made it clear that frequently an error won’t really cost the team anything.

And if your error did cost your team, well, you’re probably Starlin Castro.


Using High-A Stats to Predict Future Performance

Last week, I looked into how a player’s low-A stats — along with his age and prospect status at the time — can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
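To make the idea of a probit model a bit more concrete, here is a minimal sketch of how one turns inputs into a probability: the inputs are combined into a single score, and the standard normal CDF maps that score to a number between 0 and 1. The intercept, coefficients, and player lines below are invented purely for illustration and are not KATOH’s fitted values; the function names are also hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(intercept, coefficients, features):
    """Probability of the event under a probit model: Phi(intercept + sum of coef * input)."""
    score = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return normal_cdf(score)

# HYPOTHETICAL numbers, chosen only to show the mechanics (not KATOH's output)
intercept = 3.0
coefficients = {"age": -0.2, "k_rate": -4.0, "iso": 5.0}

younger = {"age": 20, "k_rate": 0.18, "iso": 0.180}
older = {"age": 24, "k_rate": 0.18, "iso": 0.180}

print(round(probit_probability(intercept, coefficients, younger), 3))
print(round(probit_probability(intercept, coefficients, older), 3))
```

With a negative coefficient on age, the younger player comes out with the higher probability even though the two stat lines are otherwise identical, which is the sort of effect the regression output is summarizing.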

Things that were predictive for players in low-A included: age, strikeout rate, ISO, BABIP, and whether or not he was deemed a top 100 prospect by Baseball America in the pre-season. However, a player’s walk rate was not significant in predicting a player’s ascension to the majors. Today, I’ll analyze what KATOH has to say about players in class-A-advanced leagues. Here’s the R output based on all players with at least 400 plate appearances in a season in high-A from 1995-2009:

High-A Output

This looks very similar to what I found for low-A players: Walk rate isn’t significant, and everything else has very similar effects on the final probability. However, the coefficients from this model are all a tad bigger than those from the low-A version, implying that high-A stats might be a bit more telling of a player’s future. Intuitively, this makes sense: The closer a player is to the big leagues, the more his stats start to reflect his future potential.

By clicking here, you can see what KATOH spits out for all current prospects who logged at least 250 PA’s in high-A as of July 7th. I also included a few notable players who fell short of the threshold, namely Joey Gallo (who checks in at a remarkable 99.8%), Peter O’Brien, and Jesse Winker. Here’s an excerpt of the top-ranking players:

Player Organization Age MLB Probability
Joey Gallo TEX 20 100%
Corey Seager LAD 20 99%
Carlos Correa HOU 19 99%
Albert Almora CHC 20 93%
Nick Williams TEX 20 93%
D.J. Peterson SEA 22 93%
Jesse Winker CIN 20 91%
Orlando Arcia MIL 19 88%
Jose Peraza ATL 20 87%
Colin Moran MIA 21 87%
Renato Nunez OAK 20 86%
Tyrone Taylor MIL 20 85%
Hunter Renfroe SDP 22 84%
Josh Bell PIT 21 84%
Raul Mondesi KCR 18 83%
Daniel Robertson OAK 20 83%
Jorge Polanco MIN 20 81%
Dilson Herrera NYM 20 77%
Breyvic Valera STL 21 77%
Peter O’Brien NYY 23 76%
Matt Olson OAK 20 75%
Jorge Alfaro TEX 21 75%
Patrick Leonard TBR 21 75%
Dalton Pompey TOR 21 73%
Billy McKinney OAK 19 73%
Teoscar Hernandez HOU 21 73%
Brandon Nimmo NYM 21 72%
Jose Rondon LAA 20 70%
Rio Ruiz HOU 20 70%
Brandon Drury ARI 21 70%

Next up will be double-A. Unlike A-ball, double-A tends to be a random mishmash of prospects and minor-league lifers, so it will be interesting to see how KATOH handles this wide array of players. And perhaps double-A is where a player’s walk rate finally starts to tell us something about his future success.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; pre-season prospect lists courtesy of Baseball America.


Pitch Movement Benchmarks

There are many variables that influence the effectiveness of a pitch. Of these many variables, the way in which the pitch moves contributes to the overall story of a pitcher. And because we all want to know how and why a pitcher is successful, determining benchmarks for movement can be a useful measurement when evaluating pitchers. 

Using 2011-2013 data, with inclusion criteria of at least 50 innings and the pitch being thrown at least 4% of the time, we determined the average horizontal movement, vertical movement, and overall movement (which we refer to using the Z-axis) of each pitch. Horizontal movement is affected by handedness; to account for that, we used the absolute value of average movement. All of this doesn’t mean much yet, because there are so many factors that go into what makes a pitch effective. But we can at least look at it to get some idea of how a pitcher’s pitch moves relative to that of others throwing it.

Here they are:

FA% vFA FA-X (inches) FA-Y (inches) FA-Z (inches)
38.99% 91.48 4.82 7.73 9.49
FT% vFT FT-X (inches) FT-Y (inches) FT-Z (inches)
21.06% 91.31 8.50 6.76 11.04
FC% vFC FC-X (inches) FC-Y (inches) FC-Z (inches)
19.07% 88.44 1.27 5.47 5.83
SI% vSI SI-X (inches) SI-Y (inches) SI-Z (inches)
36.84% 90.23 8.57 4.91 10.31
FS% vFS FS-X (inches) FS-Y (inches) FS-Z (inches)
14.66% 84.09 5.31 2.82 6.43
SL% vSL SL-X (inches) SL-Y (inches) SL-Z (inches)
21.27% 83.21 2.67 0.44 3.64
CU% vCU CU-X (inches) CU-Y (inches) CU-Z (inches)
14.56% 76.89 5.01 -5.86 8.02
CH% vCH CH-X (inches) CH-Y (inches) CH-Z (inches)
12.82% 83.10 7.01 4.14 8.55

You may find it odd that some of these pitches appear to rise, judging by their vertical movement. However, these measurements do not account for gravity. If gravity were factored into the measurement, then yes, the slider (and all the rest of these pitches) would appear to drop many more inches.

Let’s see how Clayton Kershaw’s curveball matches up…

CU% (pfx) vCU (pfx) CU-X (pfx) CU-Y (pfx) CU-Z (pfx)
14.52% 74.94 2.7 -8.58 8.99

Versus MLB average.

CU% (pfx) vCU (pfx) CU-X (pfx) CU-Y (pfx) CU-Z (pfx)
14.56% 76.89 5.01 -5.86 8.02

Kershaw’s curveball, although the horizontal movement is less than average, moves much more than league average vertically and more overall.
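As a quick sketch of that comparison, the snippet below treats overall (Z) movement for a single pitcher as the straight-line combination of horizontal and vertical movement. That definition is an assumption on our part for illustration, though it does reproduce Kershaw’s listed 8.99 from his X and Y values. The league Z is taken directly from the table rather than recomputed, since an average of individual Z values need not equal the Z of the average X and Y. Function and variable names are illustrative.

```python
import math

def overall_movement(x_inches, y_inches):
    """Overall movement, assumed here to be the magnitude of (horizontal, vertical) movement."""
    return math.hypot(x_inches, y_inches)

# Values taken from the tables above
kershaw_cu = {"x": 2.7, "y": -8.58}
league_cu = {"x": 5.01, "y": -5.86, "z": 8.02}

kershaw_z = overall_movement(kershaw_cu["x"], kershaw_cu["y"])

# Differences versus the league benchmark, in inches
horizontal_diff = kershaw_cu["x"] - league_cu["x"]          # negative: less run than average
vertical_diff = abs(kershaw_cu["y"]) - abs(league_cu["y"])  # positive: more drop than average
overall_diff = kershaw_z - league_cu["z"]

print("Kershaw CU overall movement: %.2f in." % kershaw_z)
print("Horizontal vs. league: %+.2f in." % horizontal_diff)
print("Vertical vs. league:   %+.2f in." % vertical_diff)
print("Overall vs. league:    %+.2f in." % overall_diff)
```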

Work in collaboration with Douglas Wills.


Wait, They’re Good Now?

In the 2008 season the Yankees started the year with two young pitching prospects in their rotation: Phil Hughes and Ian Kennedy. These two pitchers were expected to be the future of the Yankees rotation. That didn’t really go as planned. The two pitchers struggled, and they both earned demotions as they combined for an ERA of 7.44. Hughes and Kennedy were simply not ready for major-league action. They gave up too many walks, didn’t strike out enough guys, and didn’t keep the ball in the ballpark. That’s a recipe for disaster when it comes to trying to succeed as a pitcher at the major-league level.

Nonetheless, these two pitchers showed enough promise as prospects for the Yankees to actually wait on them. In fact, after their demotions, Hughes and Kennedy spent most of their 2008 season in the minors due to mediocrity and injuries. The Yankees were patient for a year with their young talent; however, only so much time can go by before you go from being a developing prospect to a struggling major leaguer. The Yankees quickly gave up on Kennedy, and traded him to Arizona, where he showed decent success as a starter. In three seasons with Arizona, Kennedy compiled a WAR of 10.2.

The Yankees saw something in Phil Hughes. Hughes showed some promise in 2009 as a reliever, and then in 2010 as a starter who compiled a WAR of 2.5. However, there was the problem of Yankee Stadium not suiting Hughes’s skill set. Hughes was a fly-ball pitcher in a stadium that was known for being a hitter’s haven. Hughes always struggled as a Yankee when it came to keeping the ball in the park. The lowest home run rate that Hughes posted as a full season Yankee starter was 1.28 in 2010.

Both Kennedy and Hughes had some success over the years; one could even argue that Kennedy was one of the best pitchers in the league in 2011. However, for the most part their careers have been a mixed bag. But times have now changed. Kennedy is now with his third team, the Padres, and Hughes is with his second team, the Twins. After mediocre 2013 seasons, the two pitchers are actually performing well.

2014 Season K/9 BB/9 HR/9 ERA FIP xFIP WAR
Hughes 7.99 0.81 0.67 3.92 2.62 3.22 3.7
Kennedy 9.67 2.46 0.72 3.47 2.93 3.17 2.3

As of right now, Hughes is fourth in the league in FIP among qualified pitchers. The only pitchers who have been better are Jon Lester, Adam Wainwright, and Felix Hernandez. Hughes is third in the league in WAR, right behind Lester and Hernandez. For the first half of the season Hughes has pitched like an ace.

Hughes has had the second-best walk rate among qualified starters. Any walk rate below two is considered to be good, and Hughes’s rate right now is downright ridiculous. We can’t expect Hughes to remain this good at not walking people; however, the ZiPS/Steamer projections have him finishing the year with a walk rate between 1.31 and 1.38. That’s a pretty good projection, considering Hughes has never had a walk rate lower than 2.16. Hughes has also improved his home run problem, as he isn’t letting an egregious number of baseballs leave the park. The main change in Hughes’s approach has been his implementation of the cutter. Between 2012 and 2013, Hughes had dropped his cutter. This year, he reintroduced the pitch, throwing it 23% of the time, and dropped his usage of a slider. The change has proven to be useful for Hughes, and he no longer needs to rely on his fastball.

Then there is Kennedy. Kennedy has turned himself into the ace of the Padres staff this year. The main difference in Kennedy is that he has actually gained velocity on his pitches. Throughout his career he has always been a soft tosser. For most of his career, Kennedy averaged 89-90 MPH on his fastball. In 2013 he was up to 90 MPH. This year he is averaging 92 MPH.

Not only does Kennedy’s fastball have more velocity, but he’s also throwing the pitch more than he has at any point since 2009. He has thrown his fastball 48% of the time this year. The last time he threw it more than 40% of the time was 2010.

While it may be good to have more velocity, it could also be a bit of a concern when it comes to Kennedy, because his secondary offerings don’t appear to be very good. In fact, all of his pitches have negative wRAA values except for his fastball, which has a wRAA of 12.8. Most of Kennedy’s strikeouts have come off of his fastball. Having a good fastball is nice, but when Kennedy gets older, and his velocity starts to decline, he’s going to have a hard time being successful if he doesn’t have good secondary offerings.

Overall, the changes for these pitchers seem to have worked. They’re succeeding in their own environments. While the Yankees never were able to see their prized prospects come to fruition, these two pitchers have found success away from New York. Learning to pitch at the major-league level involves a learning curve. Some pitchers dominate right away. Other pitchers struggle for their first couple of years, and then things somehow start to click for them. I’m not suggesting that Kennedy and Hughes have figured out pitching, or that they are the best pitchers in the majors. However, they have proved that they are at least average starters, or maybe even above-average major-league pitchers. Only time will tell.


The Luckiest and Un-Luckiest Pitchers According To Base Runs

On June 3rd, Marlins pitcher Henderson Alvarez threw an 88-pitch shutout against the Rays, scattering eight hits while not issuing a walk. On July 11th, Alvarez again gave up eight hits while not issuing a walk, but only made it five innings after surrendering six runs. While the circumstances surrounding these two starts aren’t completely the same, they do a good job of illustrating the phenomenon of cluster luck.

Cluster luck, a term coined by Joe Peta in his book Trading Bases, essentially tells us how lucky teams have been. It measures the difference between the number of runs a team would be expected to score, based on its power (total bases) and base runners (hits/walks), and the number of runs it actually scored. In Alvarez’s July start above, he was a victim of poor sequencing, allowing his hits in bunches rather than spreading them out over the course of his start. For a more complete (and easier to understand) definition and some real-world examples, check out this and this.

What I will be attempting to do in this article is figure out a way to accurately estimate how many runs a pitcher should have allowed, and subsequently what his run average should look like, and then pinpoint certain pitchers who have been lucky or unlucky so far this season. Basically I am trying to normalize a pitcher’s RA by adjusting for sequencing and cluster luck.

Fortunately for me, the heavy lifting for part one has already been done, thanks to David Smyth. His metric, Base Runs (BsR), was developed and popularized in the early 1990s and is an extraordinarily simple yet accurate way of estimating runs allowed using standard box score statistics. Base Runs for pitchers takes four inputs (innings pitched, hits, walks, and home runs), which are converted into four factors: A, B, C, and D. The final formula looks like A*B/(B+C)+D. For a lengthier piece on Base Runs, its properties, and its pros and cons, consult this and this.
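To make the A*B/(B+C)+D formula concrete, here is a minimal Python sketch. The article doesn't spell out how the four factors are built, so the coefficients below are an assumption: they follow one commonly cited pitcher version of Base Runs that estimates total bases from hits and home runs. The function names and the sample stat line are made up for illustration, and the numbers in the tables may have been computed with slightly different weights.

```python
def innings_to_float(ip: float) -> float:
    """Convert box-score innings (126.1 means 126 and 1/3) to true innings."""
    whole = int(ip)
    outs = round((ip - whole) * 10)  # .1 -> 1 out, .2 -> 2 outs
    return whole + outs / 3


def base_runs(ip: float, hits: int, walks: int, home_runs: int) -> float:
    """Estimate runs allowed with Base Runs: A*B/(B+C) + D.

    ASSUMPTION: these factor definitions are one commonly cited pitcher
    version of the formula, not necessarily the one used in this article.
    """
    a = hits + walks - home_runs                 # A: base runners
    est_tb = 1.12 * hits + 4 * home_runs         # estimated total bases
    b = (1.4 * est_tb - 0.6 * hits - 3 * home_runs + 0.1 * walks) * 1.1  # B: advancement
    c = 3 * innings_to_float(ip)                 # C: outs
    d = home_runs                                # D: guaranteed runs
    return a * b / (b + c) + d


def base_run_average(ip: float, hits: int, walks: int, home_runs: int) -> float:
    """Scale expected runs to a per-nine-innings rate (BsRA)."""
    return 9 * base_runs(ip, hits, walks, home_runs) / innings_to_float(ip)


# Hypothetical stat line: 100 IP, 90 H, 30 BB, 10 HR
print(round(base_runs(100.0, 90, 30, 10), 2))
print(round(base_run_average(100.0, 90, 30, 10), 2))
```

The B/(B+C) piece is the share of base runners expected to score, which is why sequencing drops out: only the totals matter, not the order in which the hits arrived.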

I took these statistics, along with run average, for every pitcher in the majors through July 12th and figured each pitcher’s expected runs allowed by Base Runs. I then converted that to a Base Run Average, or BsRA, and took the difference between BsRA and his actual RA. I also calculated each pitcher’s RA- and BsRA- by dividing his RA or BsRA by the league RA or BsRA and multiplying by 100 (for reference, the league RA is 4.14 and the league BsRA is 4.19). The difference between the two, (BsRA-) - (RA-), tells us how many more or fewer runs, as a percentage of league average, the pitcher should have allowed.

In the tables below you’ll see I’ve given this stat the name Luck%. It’s admittedly a poor name, since we’re dealing with percentages and I’m sure the differences aren’t completely due to luck, but it will have to do until I think of something better. For example, Max Scherzer’s RA- is 80.92 (RA of 3.35/league RA of 4.14), meaning he has allowed runs at around 81% of the league average, but his BsRA- is 88.62 (BsRA of 3.71/league BsRA of 4.19), meaning he should have allowed runs at around 89% of the league average. We then get a Luck% of roughly 7.71 (88.62 - 80.92, give or take rounding), so Scherzer should have allowed about 7.71% more runs compared to league average.
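The arithmetic above is easy to check with a few lines of Python. This is just a sketch of the calculation as described: the league averages are the ones quoted in the text, the two sample rows are taken from the tables below, and the function names are my own.

```python
LEAGUE_RA = 4.14    # league run average quoted in the article
LEAGUE_BSRA = 4.19  # league Base Run Average quoted in the article


def index_vs_league(value: float, league: float) -> float:
    """Express a rate as a percentage of league average (100 = average)."""
    return 100 * value / league


def luck_pct(ra: float, bsra: float) -> float:
    """Luck% = (BsRA-) - (RA-); positive means sequencing has helped the pitcher."""
    return index_vs_league(bsra, LEAGUE_BSRA) - index_vs_league(ra, LEAGUE_RA)


# (name, RA, BsRA) rows copied from the tables in this article
for name, ra, bsra in [("Mark Buehrle", 2.92, 3.95), ("Anibal Sanchez", 3.52, 2.44)]:
    print(f"{name}: RA- {index_vs_league(ra, LEAGUE_RA):.1f}, "
          f"BsRA- {index_vs_league(bsra, LEAGUE_BSRA):.1f}, "
          f"Luck% {luck_pct(ra, bsra):+.1f}")
```

Because the inputs here are already rounded to two decimals, the results can differ from the published figures by a tenth or so.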

Whew. Now we can get to the names.

First, the top ten qualified pitchers who have had their numbers most positively affected by cluster luck.

Name               IP     RA    BsRA  BsRA-  RA-    Luck%
Mark Buehrle       126.1  2.92  3.95  94.3   70.5   23.7
Wei-Yin Chen       104    4.24  5.19  123.8  102.4  21.4
Jason Vargas       125    3.38  4.23  101    81.6   19.4
Zack Greinke       118.2  3.11  3.91  93.4   75.1   18.2
Alfredo Simon      116.2  2.78  3.50  83.5   67.1   16.3
Josh Beckett       103.2  2.6   3.30  78.9   62.8   16.1
Masahiro Tanaka    129.1  2.71  3.41  81.5   65.5   16
Yordano Ventura    101.2  3.36  4.03  96.2   81.2   15
Chris Young        105.1  3.16  3.81  91     76.3   14.7
Henderson Alvarez  120    3.23  3.85  91.8   78     13.8

I like this list since it is very diverse. We have pitchers who have been pleasant surprises this season but who we all know aren’t really that good (Vargas and Simon), older pitchers experiencing a late-career resurgence (Beckett and Buehrle), great pitchers (Greinke and Tanaka) and not-so-great pitchers (Chen), hard throwers (Alvarez) and soft throwers (Young), high-strikeout and low-strikeout guys, and so on. It’s good to see that not just one type of pitcher is affected, which gives me confidence that cluster luck really can influence a pitcher’s numbers to this degree, even this late in the season.

Now on to the top ten pitchers who have had their numbers most negatively affected by cluster luck.

Name               IP     RA    BsRA  BsRA-  RA-    Luck%
Anibal Sanchez     94.2   3.52  2.44  58.2   85     -26.8
Matt Garza         124.1  4.42  3.37  80.4   106.8  -26.3
Justin Masterson   98     6.06  5.09  121.4  146.4  -25
Tyler Skaggs       91     4.65  3.78  90.2   112.3  -22.2
Charlie Morton     119.1  4.15  3.36  80.1   100.2  -20.1
Roenis Elias       112    4.94  4.33  103.2  119.3  -16.1
Jorge De La Rosa   102.2  4.91  4.32  103.2  118.6  -15.4
Edwin Jackson      105.1  6.07  5.53  132    146.6  -14.7
Jose Quintana      119.1  3.85  3.31  79.1   93     -13.9
Hiroki Kuroda      116.1  4.64  4.19  100    112.1  -12.1

This is a slightly less diverse list. Most of these guys are having disappointing seasons, but perhaps they haven’t been as bad as we think. Four of these guys have a worse-than-average RA but a better-than-average BsRA (or perfectly average, in the case of Kuroda). Then there’s Anibal Sanchez, who might just be one of the most underrated pitchers in baseball, as his BsRA is seventh in all of baseball.

So what does Luck% end up telling us about a pitcher? We know that pitchers have little control over what happens after a ball is put in play, but what we’re doing here is figuring out which pitchers have been victimized by poor sequencing. Perhaps we can look at Luck% the same way we look at BABIP. If the measure is abnormally high compared to a pitcher’s career rate and the pitcher hasn’t made a substantial improvement in his mechanics or pitch repertoire perhaps some regression is in order.

So is Anibal Sanchez due for a spectacular second half? Maybe not. A myriad of factors could be influencing his low Luck%. We know that in general offense goes up when runners are on base, and Sanchez could be especially susceptible to allowing runs to score in bunches. He has a slow move to the plate, potentially allowing more runners to steal and get into scoring position. Perhaps his stuff is less effective from the stretch due to a breakdown in mechanics. Maybe he focuses too much attention on the runners on base and not enough on the one at the plate. I really don’t know.

I only have half a season of data on 100 or so pitchers so obviously more research is needed. One could find the correlation between Luck% and peripheral stats such as K% and BB%, or find year to year correlations for Luck% to find out how much variation is actually luck and how much is skill. I’d definitely be intrigued by those results and I’ll likely revisit these numbers when the season ends.

I’m still relatively new to performing this kind of analysis, so any constructive criticism would be greatly appreciated, as would a pointer if you’ve seen something like this done elsewhere on the internet. If you have suggestions for any improvements (especially the name) or further research, I’d love to hear them. If you think I majorly screwed up somehow, I’d love to hear about that too.


Using low-A Stats to Predict Future Performance

For a piece I wrote a couple of weeks ago, I used historical minor league stats to construct a model that predicts how likely it is that a teenager in A-ball will make it to the major leagues. While this method produced some interesting results, it also had some flaws, most notably that it didn’t take scouting or defense into account. This basically meant that a great defensive player (or a raw, toolsy player) could easily get an undeservedly low rating if he had a poor year at the plate. Another drawback was that it only applied to teenaged players in low-A, who represent a pretty small portion of players at the level, and just a sliver of the prospect population.

With these shortcomings in mind, I’ve taken another stab at predicting which players from the South Atlantic and Midwest leagues are most and least likely to make it to the show. Like last time, I ran a probit regression, which tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. But instead of limiting my analysis to players under the age of 20, I considered all players and included age as a variable in my model. I also attempted to quantify scouting by taking into account whether or not a player made Baseball America’s pre-season prospect rankings. The model still relies heavily on offensive performance, but isn’t entirely guilty of “scouting the stat line.”
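For readers who haven’t run into a probit model before, here is a rough Python sketch of how one turns a player’s inputs into a probability: it takes a weighted sum of the inputs and passes it through the standard normal CDF. To be clear, every coefficient below is made up purely for illustration, as are the variable names; none of these are the fitted values from my model. The only thing borrowed from the actual results is the sign of each weight (and a zero for walk rate).

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, the link function a probit model uses."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# HYPOTHETICAL coefficients, invented for illustration only.
HYPOTHETICAL_INTERCEPT = 5.0
HYPOTHETICAL_COEFS = {
    "age": -0.30,       # older for the level -> less likely
    "ba_ranked": 1.00,  # 1 if on Baseball America's pre-season list, else 0
    "k_pct": -3.00,     # more strikeouts -> less likely
    "iso": 4.00,        # more power -> more likely
    "babip": 2.00,      # higher BABIP -> more likely
    "bb_pct": 0.00,     # walk rate carries no weight in this sketch
}


def mlb_probability(player: dict) -> float:
    """Probit prediction: P(reaches MLB) = Phi(intercept + sum(coef * input))."""
    z = HYPOTHETICAL_INTERCEPT + sum(
        coef * player[name] for name, coef in HYPOTHETICAL_COEFS.items()
    )
    return normal_cdf(z)


# A made-up 19-year-old ranked prospect
example = {"age": 19, "ba_ranked": 1, "k_pct": 0.20,
           "iso": 0.150, "babip": 0.320, "bb_pct": 0.08}
print(f"{mlb_probability(example):.0%}")
```

Fitting the weights is the part the regression software handles; this only shows what happens once you have them.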

It’s come to my attention that Chris St. John of Beyond the Box Score is doing something very similar with his JAVIER projection system, and it will be interesting to see where his model and mine agree and disagree once I repeat this exercise for all minor leaguers. Chris named his system after Chicago Cubs prospect Javier Baez, so I’ll follow suit and also name mine after a prospect. Yankees prospect Gosuke Katoh was my original inspiration for this idea, so I’ll call my methodology KATOH. Without further ado, here’s the resulting R output if you’re into that kind of stuff:

Low-A Output
All hitting stats were taken relative to league average and then scaled to 2014 low-A league averages.

A player’s age, prospect status, strikeout rate, ISO, and even BABIP all proved to be predictive in the direction you’d expect. But the show-stopper here is that a player’s walk rate isn’t at all predictive of whether or not he’ll make it to the majors. One possible explanation is that — unlike power or speed — plate discipline is a skill that can be learned, and many players in low-A are still developing their batting eye and learning to lay off pitches. As one example, Brian McCann walked less than 5% of the time as a 19-year-old in the Sally League, but still developed into a relatively patient big leaguer.

Another possibility is that you don’t have to be a particularly good hitter to run a high walk rate in low-A. Pitchers at that level often have little idea where the ball’s going, which enables hitters to take an ultra-passive approach in the hopes that they’ll see four balls before they see three strikes. That strategy might work in the low minors, but can lose its effectiveness in the upper levels, where pitchers have a better handle on their control. I’ve included an excerpt of what KATOH spits out for current low-A players who logged at least 250 plate appearances through July 7th. The full list of qualifying players can be seen here.

Player Name          Organization  Age  MLB Probability
David Dahl           COL           20   89%
Jake Bauers          SDP           18   89%
J.P. Crawford        PHI           19   87%
Dominic Smith        NYM           19   79%
Willy Adames         DET           18   78%
Chance Sisco         BAL           19   74%
Reese McGuire        PIT           19   73%
Andrew Velazquez     ARI           19   70%
Manuel Margot        BOS           19   69%
Ryan McMahon         COL           19   68%
Franmil Reyes        SDP           18   66%
Brett Phillips       HOU           20   65%
Wendell Rijo         BOS           18   64%
Carson Kelly         STL           19   63%
Kean Wong            TBR           19   63%
Trey Michalczewski   CHW           19   62%
Clint Frazier        CLE           19   62%
Clint Coulter        MIL           20   62%
Evan Van Hoosier     TEX           20   59%
Austin Dean          MIA           20   59%
Drew Ward            WSN           19   58%
Raimel Tapia         COL           20   56%
Tanner Rahier        CIN           20   55%
Correlle Prime       COL           20   55%
Carlos Asuaje        BOS           22   54%
Dustin Peterson      SDP           19   54%
Jesmuel Valentin     LAD           20   54%
Dawel Lugo           TOR           19   54%
Avery Romero         MIA           21   53%
Chad Wallach         MIA           22   53%
Nomar Mazara         TEX           19   52%

Over the next couple of weeks, I plan to repeat this exercise for all levels of minor league play. As I climb the minor league ladder, it will be interesting to see when — or even if — a hitter’s walk rate starts to be predictive of whether or not he’ll make it to the majors. Keep an eye out for the next iteration, which will look at high-A stats and slap probabilities on current high-A players.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.


Pitch(er)’s F/x

MLB is not facing a crisis yet, but it may be soon. In an age of instant gratification and the desire to see the biggest, loudest, and longest of highlights, baseball is getting slower and lower scoring. Although picking up the pace would be a simple task for the Commissioner’s Office, picking up the scoring would be much, much more difficult. The reason for the decline in runs per game is not obvious at first glance. But, like all things in MLB these days, the key lies in the data.

At the turn of the century, the Steroid Era was going strong. Even after league-wide PED testing was implemented in 2003, runs per game increased from 2003 (4.73) to 2006 (4.86). Since then, runs have dropped significantly, hovering just above four. Rather than reaching for the usual explanations, such as PED use, I think the real proof lies in observation. The major change from 2006 to now is the use of PitchF/x data. PitchF/x debuted in 2006 and within a couple of seasons was a staple in every MLB ballpark. The applications for the system are endless, but the focus for scouting hitters is Hot Zones.

Nearly every hitter has a “hole” in their swing. Even Mike Trout struggles hitting balls up in the zone. Miguel Cabrera has (some, limited) trouble with balls on the outer edge. Pitchers, meanwhile, dictate the zone. Although they may prefer to throw to one side of the plate or a certain elevation, elite pitchers have no problem working the ball to all parts of the zone and outside it. The game’s most dominant pitcher this year (not up for argument) has scattered pitches everywhere, especially to lefties. For Kershaw, of course, the Heat Map does little justice to his ability to locate the ball. Most hitters have a similar hole, so he is more likely to keep throwing to that spot than to spread pitches all over the heat map. What it does show is that he can put the ball in a given spot more reliably than a hitter can make good contact there. Let’s take a peek at an example.

Paul Goldschmidt is a really, really good hitter of white balls with red laces. If you don’t believe me, ask Tim Lincecum. First, let’s take a look at Goldschmidt’s Heat Map over his career. Nothing too surprising, he likes his baseballs on the inner half of the zone. Once you get out of the zone on the inside though, he becomes not-so-amazing. Now if we take a peek at DJ Pauly G (I will never call him this to his face because I like my current face structure) vs Kershaw, you can see that Kershaw has been pretty good at targeting his cooler zones. The result of this has been a batting average of just over the Mendoza Line. When you look at him against Lincecum, you see something a lot different. This is probably why Lincecum typically has a sore neck the day after he faces the Diamondbacks. While Kershaw has been able to get it out of the zone low and in, Lincecum has tended to leave them over the plate, resulting in the ball coming to rest in the stands.

At first glance, it may be a pretty simple difference that one pitcher is hitting his spots and one is not. At second glance, it might still look the same. If you really squint though, you can see that conventional wisdom would say very rarely throw it inside to Goldschmidt. Goldy would have been pitched around 10 years ago, and almost all the balls would have been dotting the lefty batter’s box. Prior to the installation of PitchF/x, pitchers would likely have been scared to throw it inside to the slugger. Advanced data available via Heat Maps can show something different, which Kershaw has capitalized on.

From a hitter’s perspective, you probably have a decent idea of what you can and cannot do at the plate. Prior to PitchF/x, hitters kind of knew what to expect. There was once a hitter whom pitchers really didn’t know how to attack, so they walked him. His name was Barry. It got to the point where a 2001 USA Today article asked, “How do you pitch to Bonds?” Bonds had no holes, or so it was thought. I would venture to guess that Bonds, and other greats, would have hit far fewer home runs in an age where pitchers knew the specific places hitters could and could not put the ball over the wall.

Now, hitters are faced with more of a dilemma due to the hyper-advanced scouting. Back when it was a simple “he likes to chase sliders outside the zone late in the count”, hitters had some expectations of what they would likely face. Now, their approach has changed to, “I better look for the low and away slider, but he might try to get me with the high heat since I have a high whiff rate there. Or maybe he’ll go for the change since I have trouble when I am behind in the count and I have fouled off two pitches after seeing one or more sinkers on the outer half of the zone during night games played on the West Coast.” The moral is, pitchers have so much data they can know a hitter better than he can know himself. A hitter’s guess on what he may face is much less educated than it was prior to PitchF/x, making it a lot harder to put the barrel on the ball.

Although there are surely outside causes, PitchF/x is a large part of the reason that runs are on the decline. Pitchers have control over where the ball will end up 60’6” later, and if they are able to put it in a place where the hitter is poor, there will be fewer runs. The new data available has helped pitchers much more than hitters thus far, and until something changes in hitters’ approaches or new data comes along favoring batters, we can expect more of the same. Unfortunately for fans like myself who loved watching Barry knock them into the bay, it looks like the Steroid Era’s high-scoring affairs are long gone. Low-scoring baseball is here to stay.


In an Imperfect World, Chase Utley is a Hall-of-Famer

“Criminally underrated” is now an overused phrase, but it means exactly what I want it to mean with regard to Chase Utley.

Overshadowed by inferiors, Utley has flown under the radar for the most part because of the common fan’s obsession with statistics that, while not useless, are very much flawed.

“Inferior” does not mean bad.  Ryan Howard was a good baseball player for a number of years.  Ditto for Jimmy Rollins.  The two players fall somewhere between above-average and just plain good.

But neither player can touch Utley in either peak seasons or cumulative value.

But this isn’t written to compare Utley to non-Hall of Famers.  And it’s not written to compare him to Hall of Famers that are probably not deserving of the honor, either.

Utley stands up well to the actual Hall of Famers.  The players who already have their plaques enshrined in Cooperstown.  And the guys that aren’t there yet, but should be eventually (not voted in yet/not eligible).  He is one of the all-time greats and he still has some mediocre to good baseball left, especially since he is currently on pace to exceed five wins again this year, assuming good health.  With Utley, though, that is not necessarily a safe assumption.

He knocked out five consecutive seasons of 7-7.9 wins from 2005-2009.  My normal loose threshold for a Hall of Fame caliber season is 6 wins.  Utley eclipsed that by at least a win in every one of those five seasons.

I get that 58 wins is generally perceived to be a borderline Hall of Famer.  And Utley has not reached the counting stats that so many of the current Hall of Fame voters have grown to love (and, apparently, adopted permanently).  So if an observer of baseball does not consider advanced statistics and/or sabermetrics, then the case for Utley seems less apparent.

But with that said, the right to vote should at least be exercised by observers of the game who realize that playing a certain position, and playing it well, matter greatly.  It’s not necessarily the case, but it should be.  You don’t have to be infatuated with WAR and WARP to know that a guy who can handle second base defensively has more value than a guy that can only handle first base.

Utley could obviously handle 2B.  But he wasn’t just an adequate “handler” of the position; he was one of the better handlers of the position of all time.  Perennially a good defender, perennially a 2B, perennially one of the best-hitting 2B ever…and what we have is a guy that might just end up getting lost in an extremely crowded ballot.

58 wins may not be enough.  But if he ages with any kind of grace, I don’t see how 65 is out of the realm of possibility.

The one thing Utley has going for him is that sabermetrics is growing.  There will still be hard-headed voters when Utley’s case ultimately rolls around, but there should be fewer stubborn, “set-in-their-ways” voters than we currently have to deal with.  And most likely, there will be guys that just don’t view Utley as a Hall of Famer without some kind of superhero-like finish to his career.

That’s their right.

But Chase Utley was — at his best — better than Whitaker.  He was better than Biggio.  And he was better than Alomar.

If he retired after this season, he’d get my vote.  But since it is likely he stays healthy enough to produce at a decent-enough level for a few more seasons, he may get a lot of other people’s votes as well.

In retrospect, Chase Utley will look better to the voters when the ballot rolls around than he does to them now.  Even his peak years will.