Author Archive

Finding Comps for Brandon Finnegan Using PITCHf/x

Twenty-one-year-old Brandon Finnegan put his name on the map in Tuesday night’s epic wildcard game, when he tossed scoreless 10th and 11th innings before leading off the 12th with a walk to Josh Reddick, who would eventually come around to score against Jason Frasor. Drafted by the Royals with this year’s 17th overall pick, Finnegan made quick work of the minor leagues, making his big league debut on September 6th at Yankee Stadium, just 81 days (and 27 minor league innings) removed from his last appearance with TCU in this year’s College World Series.

To find comps for Finnegan, I first looked for pitchers with a similar arsenal of pitches. Using a minimum of 1,000 pitches, I sought out left-handed pitchers who threw fastballs, sliders, and changeups (Finnegan’s three pitches) at least 90% of the time since 2008, and threw each of these pitches at least 5% of the time. From there, I turned to the PITCHf/x database to find out how often these pitchers’ pitches fell within Finnegan’s middle 50% of values for velocity, break angle, break length, spin rate, and spin direction from his eight big-league games. These are the pitchers who threw the highest ratio of pitches comparable to what Finnegan threw. The similarity percentage was calculated by dividing each pitcher’s share of pitches meeting these criteria by the share of Finnegan’s own pitches that met them. The ERAs were calculated over the last seven years: 2008-2014.
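For anyone who wants to tinker with this, here is a rough Python sketch of the similarity calculation described above. It is not the actual code behind the table: the field names, the helper functions, and the choice to apply all five criteria to every pitch at once (rather than pitch type by pitch type) are assumptions made for illustration, and the toy data is made up.

```python
from statistics import quantiles

# Hypothetical field names for the five PITCHf/x measurements named in the text.
FEATURES = ("velocity", "break_angle", "break_length", "spin_rate", "spin_dir")


def middle_50_bounds(pitches):
    """Return the 25th and 75th percentile of each feature for a set of pitches."""
    bounds = {}
    for feature in FEATURES:
        q1, _median, q3 = quantiles([p[feature] for p in pitches], n=4)
        bounds[feature] = (q1, q3)
    return bounds


def share_in_box(pitches, bounds):
    """Share of pitches whose every feature falls inside the given bounds."""
    hits = sum(
        all(low <= p[feature] <= high for feature, (low, high) in bounds.items())
        for p in pitches
    )
    return hits / len(pitches)


def similarity(candidate, reference):
    """Candidate's in-box share divided by the reference pitcher's own share."""
    bounds = middle_50_bounds(reference)
    return share_in_box(candidate, bounds) / share_in_box(reference, bounds)


def toy_pitch(value):
    # Made-up pitch whose five features all share one value, purely for illustration.
    return {feature: value for feature in FEATURES}


reference = [toy_pitch(v) for v in range(8)]        # stand-in for Finnegan's pitches
candidate = [toy_pitch(v) for v in (3, 3, 10, 10)]  # stand-in for another lefty
print(f"{similarity(candidate, reference):.0%}")
```

Dividing by the reference pitcher's own share is what makes the percentages readable: requiring all five measurements to land in the middle 50% at the same time means even Finnegan's own pitches qualify far less than half the time, so the raw shares on their own would all look tiny.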

Pitcher Similarity ERA
Tony Watson 13% 2.63
Chris Sale 7% 2.76
Patrick Corbin 6% 3.80
Tony Cingrani 6% 3.49
Derek Holland 6% 4.23
Francisco Liriano 5% 4.26
Oliver Perez 4% 4.50
Ross Detwiler 4% 3.83
Tim Byrdak 3% 3.78
J.C. Romero 3% 3.68
Martin Perez 3% 4.13
Luis Perez 3% 4.50
Jordan Norberto 3% 4.00
Brian Duensing 3% 4.12
Michael Kirkman 3% 4.98
Andrew Miller 3% 4.78
Wil Ledezma 3% 5.82
Jonathan Sanchez 3% 4.60
CC Sabathia 3% 3.43
Zach Britton 2% 4.05

Finnegan looks to have a bright future ahead of him, as his top two comps are two of the most dominant pitchers in baseball: one a starter (Sale) and one a reliever (Watson). It remains to be seen which path the Royals will choose for their hard-throwing lefty going forward. While it’s tempting to slot him into a relief role next season, the wiser decision might be to stretch him out as a starter, where he would be able to take full advantage of his three-pitch arsenal. But either way, until the Royals’ playoff run comes to an end, Brandon Finnegan will be allowed to air it out for just an inning or two at a time on easily the biggest stage he’s ever seen. And given his lights-out stuff, he might just end up being this year’s Francisco Rodriguez.

This article originally appeared on Pinstripe Pundits.


Could It Be Time to Update WAR’s Positional Adjustments?

It’s been quite a week for the WAR stat. Since Jeff Passan dropped his highly controversial piece on the metric on Sunday night, the interwebs have been abuzz with arguments both for and against the all-encompassing value stat. One criticism in particular that caught my eye came from Mike Newman, who writes for ROTOscouting. Newman’s qualm had to do with a piece of WAR that’s often taken for granted: the positional adjustment. He made the argument that current WAR models underrate players who play premium defensive positions, pointing out that it would be “laughable” for Jason Heyward to replace Andrelton Simmons at shortstop, but not at all hard to envision Simmons being an excellent right fielder.

This got me thinking about positional adjustments. Newman’s certainly right to question them, as they’re a pretty big piece of the WAR stat, and one most of us seem to take for granted. Plus, as far as I’m aware, none of the major baseball websites regularly update the amount they credit (or debit) a player for playing a certain position. They just keep the values constant over time. I’m sure that whoever created these adjustments took steps to ensure they accurately represented the value of a player’s position, but maybe they’ve since gone stale. It’s certainly not hard to imagine that the landscape of talent distribution by position may have changed over time. For example, perhaps the “true” replacement level for shortstops is much different than it was a decade or so ago when Alex Rodriguez, Derek Jeter, Nomar Garciaparra, and Miguel Tejada were all in their primes.

I decided to try and figure out if something like this might be happening. If the current positional adjustments were in fact misrepresenting replacement level at certain positions, we’d expect the number of players above replacement level to vary by position. For example, there might be something like 50 above-replacement third basemen, but only 35 shortstops. Luckily, the FanGraphs leaderboard gives you the ability to query player stats by position played, which proved especially useful for what I was trying to do. For each position, I counted the number of plate appearances accumulated by players with a positive WAR and then divided that number by the total plate appearances logged at that position. Here are the results broken out by position for all games since 2002.
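The calculation itself is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic described above, not the spreadsheet I actually used: the row layout and the sample numbers are made up, and a real run would start from a positional-split export of the leaderboard.

```python
from collections import defaultdict


def above_replacement_share(rows):
    """For each position, the share of PAs logged by players with positive WAR.

    `rows` is an iterable of (position, plate_appearances, war) tuples, one per
    player at that position. The layout is an assumption for this example.
    """
    total_pa = defaultdict(int)
    positive_pa = defaultdict(int)
    for position, pa, war in rows:
        total_pa[position] += pa
        if war > 0:
            positive_pa[position] += pa
    return {pos: positive_pa[pos] / total_pa[pos] for pos in total_pa}


# Made-up player lines, purely to show the shape of the input.
sample_rows = [
    ("SS", 600, 3.1),
    ("SS", 200, -0.4),
    ("DH", 500, 1.2),
    ("DH", 500, -0.2),
]

for position, share in above_replacement_share(sample_rows).items():
    print(f"{position}: {share:.0%} of PAs from above-replacement players")
```

Weighting by plate appearances rather than counting players keeps a handful of September call-ups from swamping the everyday regulars.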

Ch1

Based on this data, it seems like the opposite of Newman’s hypothesis may be true. A significantly higher portion of plate appearances have gone to positive-WAR players at the tougher end of the defensive spectrum, which implies that teams don’t have too difficult a time finding shortstops and center fielders who are capable of logging WARs above zero. Less than 13% of all SS and CF plate appearances have gone to sub-replacement players. But finding a replacement-level designated hitter seems to be slightly more difficult, as teams have filled their DH with sub-replacement-level players nearly 30% of the time. Either teams are really bad at finding DH types (or at putting them in the lineup), or the positional adjustments aren’t quite right. The disparities are even more pronounced when you look at what’s taken place from 2002 to 2014.

Ch2

The share of above-replacement PAs logged by shortstops and center fielders hasn’t changed much over the years, but the numbers have plummeted for first basemen, corner outfielders, and DHs. From Billy Butler and Eric Hosmer, to Jay Bruce and Domonic Brown, this year’s lineups have been riddled with sub-replacement hitters manning positions at the lower end of the defensive spectrum. Meanwhile, even low-end shortstops and center fielders, like Derek Jeter and Austin Jackson, have managed to clear the replacement level hurdle this season if we only count games at their primary positions.

The waning share of above-replacement PAs coming from 1B, LF, RF, and DH has caused the overall share to drop as well, with a particularly big drop coming this year. Here’s a look at the overall trend.

Ch3

And here it is broken down by position…

Ch4

And just between this year and last…

Ch5

Frankly, I’m not sure what to make of all of this. I’m hesitant to call it evidence that the positional adjustments are broken. There could be some obvious flaw in my methodology that I’m not considering, but I find it extremely interesting that there’s been such a shift between this year and last. We’re talking an 8 percentage point jump in the share of PAs that have gone to sub-replacement-level players. Maybe it’s been spurred by the rise of the shift or maybe year-round interleague play has something to do with it, but it seems to me that something’s going on here. And I’m interested to hear other people’s thoughts on these trends.


Corey Dickerson Doesn’t Care About Your Stupid Strike Zone

Rockies outfielder Corey Dickerson is quietly having an excellent season at the plate. Believe it or not, the 25-year-old is hitting an impressive .315/.371/.577, which even after adjusting for the effects of Coors Field, is still good for a 144 wRC+ — 13th highest among players with at least 400 plate appearances. Dickerson’s batted pretty sparingly against lefties, which has certainly played a role in his gaudy stat line, but platoon or no platoon, a .405 wOBA is certainly nothing to sneeze at.

While Dickerson’s out-of-the-blue breakout is interesting, the approach he’s used to get there is what makes him truly unusual. Since debuting last season, he’s swung at 62% of pitches inside the strike zone and 42% of pitches outside of it, making him about 1.5 times (62%/42%) as likely to swing at a strike as a ball. This is the lowest such ratio of any player with at least 600 PAs these last two years. Dickerson’s not a free swinger, per se (his overall swing rate of 51% is 38th out of 251 players with at least 600 PAs), but he just doesn’t discriminate based on whether or not a pitch is in the strike zone. Here’s a look at the hitters with the lowest Z-Swing%/O-Swing% these last two seasons:

Name O-Swing% Z-Swing% Z/O-Swing%
Corey Dickerson 42% 62% 1.47
A.J. Pierzynski 47% 74% 1.58
Salvador Perez 40% 65% 1.60
Dee Gordon 33% 53% 1.61
Shane Victorino 32% 53% 1.66
Alfonso Soriano 42% 69% 1.67
Scooter Gennett 40% 68% 1.68
Charlie Blackmon 39% 66% 1.70
Oswaldo Arcia 39% 66% 1.71
Juan Lagares 35% 59% 1.71
Evan Gattis 41% 70% 1.72
Pablo Sandoval 44% 76% 1.73
Ryan Zimmerman 31% 53% 1.73
Howie Kendrick 38% 65% 1.73
Chris Johnson 40% 70% 1.74

Dickerson’s contact rates tell a similar story. Just like his overall swing rate, Dickerson’s contact rate of 81% isn’t all that interesting. Here, he checks in at 151 out of 251. But also like his swing rate, it doesn’t change very much depending on a pitch’s location. He’s put wood on 83% of pitches he’s offered at in the zone, compared to 74% outside of it, making him 1.1 times as likely to connect on a pitch within the zone — fourth lowest out of 251.

Name O-Contact% Z-Contact% Z/O-Contact%
Victor Martinez 87% 93% 1.07
Pablo Sandoval 80% 87% 1.09
Dustin Pedroia 82% 92% 1.12
Corey Dickerson 74% 83% 1.12
Nick Markakis 83% 94% 1.13
Alexi Amarista 78% 90% 1.14
Brian Roberts 80% 92% 1.14
Eduardo Escobar 74% 85% 1.14
Dee Gordon 80% 91% 1.15
Adrian Beltre 78% 90% 1.15
Ichiro Suzuki 78% 90% 1.15
Yadier Molina 78% 91% 1.16
Denard Span 83% 96% 1.16
Jed Lowrie 77% 90% 1.17
Norichika Aoki 81% 95% 1.17

Multiplying these two metrics (Contact% x Swing%) gives us Dickerson’s contact rate over all pitches seen, regardless of that pitch’s location. Let’s call this AllContact% to distinguish it from the traditional Contact%. This number shows just how much of an outlier he really is. For the average major league hitter, a pitch thrown in the strike zone results in contact 2.9 times as often as one outside of it, but for Dickerson, a pitch in the zone is less than 1.7 times as likely to result in contact. Even if we set the bar as low as 70 plate appearances to include 577 players, this is still the lowest in baseball since the start of 2013.
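To make the arithmetic concrete, here is a quick Python sketch that reproduces the three ratios from the rounded percentages quoted above (62% and 42% for swings, 83% and 74% for contact). The variable names are just for this example, and because the inputs are rounded, the results can differ from the tables in the second decimal place.

```python
# Dickerson's rates since his debut, as quoted (rounded) in the text.
z_swing, o_swing = 0.62, 0.42      # swing rate in / out of the zone
z_contact, o_contact = 0.83, 0.74  # contact rate per swing in / out of the zone

# AllContact% = Swing% x Contact%: contact per pitch seen, not per swing.
z_all_contact = z_swing * z_contact
o_all_contact = o_swing * o_contact

zo_swing = z_swing / o_swing
zo_contact = z_contact / o_contact
zo_all_contact = z_all_contact / o_all_contact

print(f"Z/O-Swing%:      {zo_swing:.2f}")
print(f"Z/O-Contact%:    {zo_contact:.2f}")
print(f"Z/O-AllContact%: {zo_all_contact:.2f}")
```

Note that Z/O-AllContact% is simply the product of the other two ratios, which is why a hitter needs to be unusual on both the swing side and the contact side to land at the top of this list.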

Name Z/O-Swing% Z/O-Contact% Z/O-AllContact%
Corey Dickerson 1.47 1.12 1.66
Luis Sardinas 1.39 1.24 1.73
Reed Johnson 1.34 1.33 1.79
Alexi Casilla 1.67 1.10 1.83
Dee Gordon 1.61 1.15 1.84
Pablo Sandoval 1.73 1.09 1.88
Ramiro Pena 1.55 1.22 1.89
Jose Iglesias 1.54 1.24 1.90
Salvador Perez 1.60 1.19 1.90
C.J. Cron 1.52 1.25 1.90
Jeff Francoeur 1.63 1.18 1.93
Endy Chavez 1.61 1.20 1.94
Joaquin Arias 1.60 1.22 1.95
A.J. Pierzynski 1.58 1.24 1.96
Ryan Goins 1.46 1.34 1.96

And unsurprisingly, he’s also the all-time leader since 2007 (the earliest year with PITCHf/x data). Dickerson had the lowest among all players with 100 PAs here, but I set the threshold to 600 PAs to avoid having the leaderboard filled with obscure players like Jesus Feliciano and Jordan Brown. In case you were wondering, Vladimir Guerrero checked in at 2.13.

Name Z/O-Swing% Z/O-Contact% Z/O-AllContact%
Corey Dickerson 1.47 1.12 1.66
Tony Pena 1.47 1.22 1.80
Dee Gordon 1.56 1.15 1.80
Salvador Perez 1.61 1.17 1.88
Garret Anderson 1.51 1.25 1.89
Pablo Sandoval 1.74 1.09 1.90
Joaquin Arias 1.58 1.22 1.92
Alexi Amarista 1.73 1.14 1.97
David Eckstein 1.71 1.15 1.97
Bengie Molina 1.74 1.13 1.97
Ichiro Suzuki 1.75 1.12 1.97
Erick Aybar 1.69 1.18 2.00
A.J. Pierzynski 1.68 1.20 2.03
Reed Johnson 1.50 1.36 2.03

Dickerson’s indifference to a pitch’s location means it’s probably only a matter of time before pitchers just stop throwing the ball in the strike zone, especially if he keeps slugging well above .500. So far this year, opposing pitchers have thrown Dickerson a strike just over 45% of the time. This is lower than the league average of 49%, but isn’t exceptionally low, especially for a free-swinging power hitter. Guys like Jose Abreu, Carlos Gomez, and Pablo Sandoval see strikes around 42% of the time, so pitchers could almost certainly get away with throwing Dickerson a few more balls. Sure, he’s shown that he’s able to hit those pitches, but even for a player like Dickerson, chasing after bad pitches is still a recipe for lots of swings and misses. His 74% O-Contact% is well above the league average of 63%, yet still lower than the overall Contact% of 80%.

Dickerson’s one-size-fits-all approach to swinging has worked well so far, but it remains to be seen what will happen when pitchers start exploiting it by throwing more balls out of the zone. Maybe he’ll be unfazed and keep on raking. Maybe he’ll turn into a strikeout machine, who needs to refine his approach to even stay in the big leagues. Either way, Corey Dickerson’s a fascinating player, who’s unlike any we’ve seen in recent years, and it’ll be interesting to see if he’s able to keep succeeding going forward.


How Brett Gardner’s Plate Discipline Made Him Great

At the start of the 2013 season, Brett Gardner adopted a new, more aggressive approach at the plate in the hopes of barreling more hittable pitches. Up to that point, the slap-hitting outfielder had been one of the most patient hitters in baseball. Gardner sat out most of 2012 due to injury, but swung at just 32.7% of all pitches seen between 2010 and 2011, the fewest of any player with at least 300 plate appearances. Last year, his swing rate jumped to 40.1%, with most of his new-found aggressiveness focused on pitches located within the strike zone. While his zone swing rate rose by 13 percentage points from 2010 to 2013, his rate for pitches out of the zone only increased by seven.

The change seemed to pay off. Gardner posted a career high .143 ISO last season — much better than his career mark of .103 — on his way to a very respectable 108 wRC+. He’s carried that success over to this season as well. With 16 homers, he’s doubled his total from last season — which was already a career high — and with a 119 wRC+, he’s developed into one of the better-hitting outfielders in all of baseball.

But unlike last season, he’s no longer sporting a swing percentage north of 40%. Instead, it’s fallen back to 36.6%, just a tad higher than his 35% mark from 2011. So if Gardner’s back to his old ways of watching two-thirds of all pitches go by, how has he managed to keep hitting for power? The answer has everything to do with plate discipline. Gardner’s continued to take advantage of hittable pitches, but has also gotten much better at laying off pitches outside of the strike zone. First, let’s look at how often he’s swung at pitches inside of the strike zone.

Zone

Since adopting his more aggressive approach two springs ago, Gardner’s behavior on pitches in the zone hasn’t changed much. Maybe he’s gotten a little less aggressive over the past couple of years, but for the most part, his swing rates have been pretty consistent. It’s probably safe to say that Gardner’s a guy who swings at about 50-55% of pitches in the strike zone. We see a different story, however, when it comes to pitches outside of the zone.

Outside

At least initially, Gardner’s swing rate on balls out of the zone also spiked. He seemingly became more aggressive on all pitches, without discriminating based on location. But that’s changed over the past couple of seasons, as he’s swung at fewer and fewer pitches out of the zone. His O-Swing% dipped below 18% in both July and August — down from around 25% in early 2013 — putting him on par with what he was doing back in 2010 and 2011. Today, Gardner’s been nearly three times more likely to swing at a strike than a ball, up from two times as likely in April of 2013.

Gardner’s improved plate discipline is nothing new. Although his change in approach puts a kink in the trend, Gardner’s been getting better at deciding whether or not to swing since his first days in the big leagues, and probably even longer. Even before he re-evaluated his approach before the 2013 season, he was already starting to transition from a “guy who doesn’t swing at anything” to a “guy who doesn’t swing at balls”.

ZoneOut

Coming up through the minors, Gardner didn’t impress many scouts with his tools, and barely even made his college team as a walk-on. Sure, he’s always had plus-plus speed, but that only gets you so far when you’re an outfielder with little power to speak of. Rather than relying on his pure hitting skills, Gardner makes it work with his zen-like plate discipline. By swinging at so few balls out of the zone, Gardner practically forces pitchers to leave the occasional pitch over the heart of the plate, and has just enough pop in his bat to make them pay for it. But most importantly, he’s learned how to take advantage of those mistake pitches, while simultaneously laying off of the bad ones.

Statistics courtesy of FanGraphs.


Brandon Moss has Become a Little Too Patient

Brandon Moss has wielded an immensely potent bat since joining the Athletics’ lineup in June of 2012. Between 2012 and 2013, he put up a remarkable 146 wRC+, and clubbed a homer once every 15.7 PAs, placing him third in baseball behind Chris Davis and Miguel Cabrera over that span. Moss kept up the hot hitting to start the 2014 season, as well. The 30-year-old 1B/OF/DH posted a 162 wRC+ in the season’s first two months, further establishing himself as a key cog in one of baseball’s most potent lineups.

But Brandon Moss hasn’t been himself lately. Since his last home run on July 24th, he’s only managed three extra-base hits, resulting in a laughable .168/.317/.198 batting line. Moss’s slump has also coincided with a change in his hitting approach. Moss appears to have gotten a bit more passive at the plate, swinging at way fewer pitches both inside and outside of the strike zone. This new-found passivity took a turn for the extreme once the calendar turned to August, when his O-Swing% and Z-Swing% fell to 27% and 65%, respectively — both around six percentage points lower than his career norms.

Swing

Moss’s decision to lay off more pitches has unsurprisingly led to a spike in both his walk and strikeout numbers, but it’s also resulted in his power completely flat-lining. Moss has basically been Adam Dunn without the power these last couple of months. That’s a pretty terrible hitter, and is part of the reason why the A’s went out and got the real Adam Dunn to help their sputtering offense.

BBK

ISOO

The new swing profile is something that’s recently changed, making it the obvious culprit for Moss’s drop-off in production, but we shouldn’t immediately rule out the possibility that pitchers have changed the way they’re approaching him. It could just be that he’s swinging at fewer pitches because he’s getting fewer pitches to hit. That doesn’t seem to be the case, though, as Moss’s zone breakdown from August looks nearly identical to what it was over the season’s first four months. For whatever reason, Moss just isn’t swinging as often as he used to.

Untitled

It’s not entirely clear what’s spurred Moss’s sudden reluctance to swing the bat, but all indications are that it’s done a number on his offensive performance. Unlike the Brandon Moss who, up until recently, could be counted on for a wRC+ north of 130, this latest iteration seems to be letting a few too many hittable pitches float down the heart of the plate. And based on what’s transpired over the last month or two, Moss’s best bet is probably to re-discover the more aggressive approach that’s worked so well for him in the past.

Statistics courtesy of FanGraphs; Zone breakdowns courtesy of Baseball Savant.


Why Haven’t the A’s had Any Good Pitch-Framers?

The ability to quantify the value of catcher framing has been one of the biggest sabermetric breakthroughs of the last decade. By parsing through PITCHf/x data, analysts like Mike Fast, Max Marchi, Dan Brooks, and Harry Pavlidis have managed to shed light on which catchers are adept at turning balls into strikes, uncovering hidden value in otherwise unremarkable players, including Rene Rivera, Chris Stewart, and of course, Jose Molina.

MLB front offices have taken notice. Several teams, including the Yankees, Rays, Red Sox, Pirates, Padres, and Brewers, have begun hoarding good-framing catchers over the past few years. But one team that’s missing from this list is the Oakland Athletics, who have historically been among the early adopters of sabermetric principles. One would think that the A’s would be all over the Jose Molinas and Chris Stewarts of the world, yet Billy Beane and co. seem to have missed the memo on acquiring good framers. In fact, they’ve made a habit of employing poor ones. According to Baseball Prospectus’ model, A’s catchers rank fourth from last in framing runs saved this season. This isn’t a one-year anomaly, either. Here’s a look at all of the catchers the A’s have used since 2010, along with their career framing numbers.

Catcher Innings Share of A’s Innings FR Runs per 7,000
Kurt Suzuki 2,929 42% -9
Derek Norris 1,854 27% -1
John Jaso 755 11% -16
Landon Powell 540 8% -10
Stephen Vogt 421 6% -4
George Kottaras 217 3% -8
Anthony Recker 125 2% -17
Josh Donaldson 71 1% -9
Jake Fox 59 1% -15

That right there is a pretty sorry group of framers. There’s not a single catcher in the group who’s even above average. So what gives? Why has Billy Beane — who’s nearly synonymous with the term “market inefficiency” — been so reluctant to exploit the latest market inefficiency?

As far as I can tell, there are two possible explanations, and the real answer is probably some combination of the two:

1) The A’s have chosen to employ catchers who excel in areas other than pitch-framing.

2) The A’s aren’t completely buying into all of this pitch-framing stuff.

Let’s start with the first explanation. Since 2010, A’s catchers have accumulated 12.1 fWAR (which doesn’t account for framing), putting them 15th out of 30 MLB organizations. But since 2012, the year after Mike Fast’s research first brought the value of pitch framing to the public’s eye, the A’s rank 10th. The average wRC+ from a catcher is 93, but the A’s have done much better than that of late by employing guys like John Jaso (136 wRC+) and Derek Norris (110 wRC+). Even if you were to dock Oakland’s catchers for their poor framing skills, they’d still fall somewhere in the middle of the pack in terms of total value. Basically, the A’s have managed to find good, cheap catchers, who generate value in ways other than framing pitches. Plus, for all we know, the A’s might have reason to believe these guys excel in other overlooked areas. They could be superb game callers, for example.

But that can’t be all that’s going on. Sure, the A’s have done a decent enough job of finding catching talent without prioritizing framing, but it’s not like they’ve had Mike Piazza or Johnny Bench behind the plate. Jaso and Norris are fine players, but aren’t exactly superstars. Plus, it should tell us something that they haven’t even brought in any bottom-of-the-barrel framing specialists. Erik Kratz and Chris Stewart were both traded for warm bodies last winter, but the A’s instead chose to roll with Vogt as their primary catching depth.

Perhaps the A’s have reason to believe that publicly available framing models overstate the value-add of a framed pitch? As Dave Cameron recently pointed out, it’s not entirely clear if the full value of a framed pitch should be attributed to the catcher, with none of the credit going to the pitcher. Current models don’t account for how a pitcher might change his approach based on the framing abilities of his catcher, and research shows that pitchers do in fact change their approach based on who’s catching, throwing a few more pitches outside of the strike zone:

Framing

Oakland’s brain trust is about as progressive as they come, and has a proven penchant for unearthing value from unlikely places. When a team like that zigs while others zag, it probably makes sense to ask why. This isn’t to say that the publicly available framing data is useless, as having a good framer undeniably adds some value, even if it’s only a few runs. But the fact that the A’s have yet to employ a single plus framer should lead us to wonder if there’s a piece of the puzzle we might be missing.

Statistics courtesy of FanGraphs and Baseball Prospectus.


Searching for the Existence of Team Clutch as a Repeatable Skill

As you’ve probably heard by now, the Baltimore Orioles have made a habit of outperforming their run differential these last three years. In 2012, they finished the year 93-69, but their +7 run differential suggested they didn’t play much better than a .500 team. This year, they’re at it again. They currently sit atop the American League East with a 73-52 record, but their peripheral stats suggest they’ve lucked into a few wins along the way.

This has inevitably led to some disagreement over the true talent of recent Orioles teams. On the one hand, it’s been well established that things like BaseRuns and Pythagorean records do a pretty good job of predicting a team’s win-loss record. But at the same time, Buck Showalter’s Orioles have been pulling this off for a while now. Even if you understand and accept the concept of random variation, it’s a little hard to believe that the Orioles’ run has been entirely due to luck.

Jeff Sullivan recently penned a convincing article, dispelling the myth that clutch teams remain clutch over an extended period of time. He compared teams’ first-half clutch scores to their second-half scores, finding no correlation between the two, concluding that “team clutch” is not a repeatable skill.

Sullivan’s argument is pretty persuasive, but Major League teams today are sort of like a Ship of Theseus: They experience lots of turnover over the course of a year, and come September, many look completely different than they did on opening day. Perhaps a comparison of half-seasons might not be picking up on the “magic” that often exists for only part of a year, when a team has the right combination of players on its roster.

To test whether this might be the case, I looked at month-to-month correlations for all consecutive months from 2009 to 2013. I also broke things up by hitting clutch and pitching clutch to see if there might be a phenomenon that exists on only one side of the ball.
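For those curious about the mechanics, here is a minimal Python sketch of a month-to-month correlation check along these lines. The actual plots came out of R, so this is only an illustration: the data layout, the function names, and the toy clutch scores are all assumptions, and real inputs would come from the monthly splits.

```python
from math import sqrt


def consecutive_pairs(monthly_scores):
    """Pair each month's clutch score with the following month's, per team-season.

    `monthly_scores` maps a (team, year) key to that team's clutch scores in
    calendar order. Pairs never cross from one season into the next.
    """
    pairs = []
    for scores in monthly_scores.values():
        pairs.extend(zip(scores, scores[1:]))
    return pairs


def pearson_r(pairs):
    """Plain Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


# Made-up clutch scores for two hypothetical team-seasons.
toy_scores = {
    ("Team A", 2012): [0.8, -0.3, 0.5, 0.1, -0.6, 0.4],
    ("Team B", 2012): [-0.2, 0.7, -0.9, 0.3, 0.2, -0.1],
}

pairs = consecutive_pairs(toy_scores)
print(f"{len(pairs)} month pairs, r = {pearson_r(pairs):.2f}")
```

If clutch were a repeatable skill, teams that were clutch in one month would tend to be clutch in the next, and the coefficient would sit well above zero.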

Rplot04Rplot Rplot01

There isn’t much going on here, as all three trend lines are pretty darn close to flat. But we do see a slight upward slope to the trend line for pitchers. It’s not enough to be statistically significant (P-Value=.27), but maybe it could be picking up on something. For instance, it doesn’t seem far-fetched that some managers might be better than others at deploying relievers in situations where they’re likely to succeed. The 2012 Orioles’ bullpen, after all, was more clutch than average in all six months of the season. So maybe their success had something to do with the way Buck Showalter managed his bullpen? Let’s see if we see anything more definitive by breaking the correlations up by starters and relievers.

Rplot02 Rplot03

Nada. Both rotation and bullpen clutch show no signs of correlation, which implies that the hint of a relationship for month-to-month pitching clutch was purely statistical noise. Pretty much any way you slice it, there’s just no evidence suggesting that team clutch is in any way a repeatable skill, even over very short periods of time. Some teams, like the Orioles, do manage to string together consecutive months of clutch performance. But the overall lack of correlation between consecutive months shows that a team’s clutch performance is about as random as a coin flip. If you flip a coin enough times, you’ll eventually get 10 heads in a row. By that same logic, you’re bound to find a stretch as extreme as the Orioles’ if you string together enough three-year stretches.

All statistics courtesy of FanGraphs and their infinitely useful splits data.


What Types of Hitters Have Large Platoon Splits?

Big-league teams today employ a myriad of data-driven strategies to eke every last drop of value from the players on their rosters. Many of these strategies consist of matching up hitters and pitchers based on their handedness. Between lineup platoons and highly-specialized bullpens, managers today go to great lengths to ensure they’re putting their players in the best possible situation to succeed.

It’s easy to see why. With very few exceptions, Major League hitters hit much better against opposite-handed pitching. In terms of wOBA (vs. opposite-handed – vs. same-handed), lefties perform about .031 better against righties, while righties hit .043 better against lefties. Yet not all platoon splits are created equal. Players like Shin-Soo Choo, David Wright, and Jonny Gomes are notorious for their drastic splits, while others put up comparable numbers no matter who’s on the mound. Ichiro Suzuki and Alex Rodriguez are a couple of the no-platoon-split poster boys.

OK, so some batters have bigger platoon splits than others, but is there any particular reason for this? Take Choo for example. Is there something inherent to his skill set or approach that causes him to struggle against lefties?

Hoping to find an answer, I ran some regressions in search of attributes that might make a player more likely to have an exaggerated platoon split. I tested all sorts of things out there — from walk rate and swing% to a player’s height and throwing arm — but didn’t come away with much. Aside from a hitter’s handedness, attributes that proved statistically significant included: a hitter’s overall wOBA, his line drive rate, his strikeout rate, and his contact rate on pitches out of the zone, but even those relationships are extremely weak. It takes .100 points on a batter’s wOBA, or a 10% increase in K% or LD%, to move a batter’s platoon split by just .010 points. This tells us something, but not a ton, and at the end of the day, these variables account for a nearly negligible 4% of the variation in hitters’ platoon splits. Here’s the resulting R output. My sample included all batter seasons from 2007-2013 with at least 100 plate appearances against both lefties and righties, excluding switch hitters:

Platoons
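The regressions themselves were run in R, but the sample construction described above is easy to sketch in Python. Everything below is illustrative: the field names and the two sample seasons are made up, and the split is defined as in the article (wOBA against opposite-handed pitching minus wOBA against same-handed pitching).

```python
def platoon_split(season):
    """wOBA vs. opposite-handed pitching minus wOBA vs. same-handed pitching."""
    if season["bats"] == "L":
        return season["woba_vs_rhp"] - season["woba_vs_lhp"]
    return season["woba_vs_lhp"] - season["woba_vs_rhp"]


def eligible(season, min_pa=100):
    """Keep non-switch hitters with enough PAs against both lefties and righties."""
    return (
        season["bats"] in ("L", "R")
        and season["pa_vs_lhp"] >= min_pa
        and season["pa_vs_rhp"] >= min_pa
    )


# Made-up batter seasons, only to show the expected input shape.
seasons = [
    {"name": "Lefty A", "bats": "L", "pa_vs_lhp": 180, "pa_vs_rhp": 450,
     "woba_vs_lhp": 0.290, "woba_vs_rhp": 0.380},
    {"name": "Switch B", "bats": "S", "pa_vs_lhp": 200, "pa_vs_rhp": 400,
     "woba_vs_lhp": 0.330, "woba_vs_rhp": 0.340},
    {"name": "Righty C", "bats": "R", "pa_vs_lhp": 60, "pa_vs_rhp": 300,
     "woba_vs_lhp": 0.400, "woba_vs_rhp": 0.310},
]

sample = [s for s in seasons if eligible(s)]
for s in sample:
    print(f'{s["name"]}: platoon split {platoon_split(s):+.3f}')
```

The split computed this way would be the dependent variable, with things like overall wOBA, K%, and LD% as the inputs.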

Good hitters or guys who strike out frequently might be a little more prone to having large platoon splits. But for all practical purposes, a player’s ability to hit one type of pitching better than the other seems to be a skill that’s independent of all others. Aside from going by a player’s platoon stats, which can take years to become reliable, there’s little we can do to anticipate which hitters might fare particularly badly against same-handed pitching. And with the exception of players with long track records of unusual platoon splits, like Choo and Ichiro, it’s generally safe to assume that any given hitter’s true-talent platoon split is within shouting distance of the average: .043 for lefties and .031 for righties.


Applying KATOH to Historical Prospects

Over the last few weeks, I have written a series of posts looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I analyzed hitters in Rookie leagues, Short-Season A, Low-A, High-A, Double-A, and Triple-A using a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
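For readers unfamiliar with probit models, here is a bare-bones Python sketch of how one turns a set of inputs into a probability: the inputs are combined into a single score, and that score is run through the standard normal CDF. The coefficients and the player line below are invented purely for illustration and are not KATOH's actual values or variables.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Probability that a standard normal variable falls at or below z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def probit_probability(intercept, coefficients, inputs):
    """Probit prediction: the normal CDF of intercept + sum(coefficient * input)."""
    z = intercept + sum(coefficients[name] * value for name, value in inputs.items())
    return standard_normal_cdf(z)


# Hypothetical coefficients, NOT the fitted KATOH model.
toy_coefficients = {"age": -0.25, "iso": 4.0, "bb_rate": 2.0, "k_rate": -3.0}
toy_intercept = 5.0

# Hypothetical minor league stat line.
player = {"age": 20, "iso": 0.180, "bb_rate": 0.10, "k_rate": 0.20}

print(f"MLB probability: {probit_probability(toy_intercept, toy_coefficients, player):.1%}")
```

Because the normal CDF flattens out at both ends, the same statistical improvement moves a fringe prospect's probability far more than it moves that of a player who is already near 0% or 100%.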

After receiving a few requests, I decided to apply the model to players of years past. In what follows, I dive into what KATOH would have said about recent top prospects, look at the highest KATOH scores of the last 20 years, and highlight some instances where KATOH missed the boat on a prospect. If you’re feeling really ambitious, here’s a giant Google doc of KATOH scores for all 40,051 player seasons since 1995 (minimum 100 plate appearances in a short-season league or 200 in full-season ball).

Before I delve into the parade of lists, I want to point out one disclaimer to what I’m doing here. KATOH was derived from the performances of historical players, so applying the model to those same players might make it look a little better than it is. Take a player like Jason Stokes, for example. Although he was a very well-regarded prospect in the early 2000s (#15 and #51 per Baseball America in 2003 and 2004), KATOH consistently gave him probabilities in the 70s and 80s. But part of that is likely because Stokes’ data points were incorporated into the model. If I had created KATOH in 2005, Stokes’ MLB% might have been a few percentage points higher. Even so, a few data points generally aren’t enough to substantially change a model that incorporates thousands. In other words, it’s probably safe to assume that a player’s MLB% using today’s KATOH is roughly in line with what he would have received at the time.

Now, onto the results. Here’s what KATOH thought about some of the most recent top 100 prospects:

2013 Top 100 Prospects

Player Year Age Level MLB Probability
Xander Bogaerts 2013 20 AA 99.888%
Xander Bogaerts 2013 20 AAA 99.869%
George Springer 2013 23 AAA 99.816%
Gregory Polanco 2013 21 AA 99.614%
Nick Castellanos 2013 21 AAA 99.608%
Kolten Wong 2013 22 AAA 99.428%
Wil Myers 2013 22 AAA 99.418%
Miguel Sano 2013 20 A+ 99.335%
Tyler Austin 2013 21 AA 99.194%
Jackie Bradley 2013 23 AAA 99.079%
Kaleb Cowart 2013 21 AA 99%
Byron Buxton 2013 19 A+ 98%
Francisco Lindor 2013 19 A+ 98%
Christian Yelich 2013 21 AA 97%
Byron Buxton 2013 19 A 97%
Addison Russell 2013 19 A+ 97%
Billy Hamilton 2013 22 AAA 96%
Brian Goodwin 2013 22 AA 96%
Carlos Correa 2013 18 A 96%
Slade Heathcott 2013 22 AA 96%
Javier Baez 2013 20 A+ 95%
Jake Marisnick 2013 22 AA 95%
Albert Almora 2013 19 A 95%
Jonathan Singleton 2013 21 AAA 94%
Mike Zunino 2013 22 AAA 94%
Alen Hanson 2013 20 A+ 94%
Gregory Polanco 2013 21 A+ 92%
Javier Baez 2013 20 AA 91%
Jorge Soler 2013 21 A+ 90%
Gary Sanchez 2013 20 A+ 89%
Austin Hedges 2013 20 A+ 89%
Mike Olt 2013 24 AAA 87%
Miguel Sano 2013 20 AA 83%
George Springer 2013 23 AA 82%
Mason Williams 2013 21 A+ 78%
Trevor Story 2013 20 A+ 61%
Bubba Starling 2013 20 A 61%
Courtney Hawkins 2013 19 A+ 58%
Roman Quinn 2013 20 A 58%

2012 Top 100 Prospects

Player Year Age Level MLB Probability
Jurickson Profar 2012 19 AA 99.975%
Anthony Rizzo 2012 22 AAA 99.947%
Manny Machado 2012 19 AA 99.937%
Billy Hamilton 2012 21 AA 99.856%
Oscar Taveras 2012 20 AA 99.827%
Kolten Wong 2012 21 AA 99.824%
Nolan Arenado 2012 21 AA 99.759%
Leonys Martin 2012 24 AAA 99.737%
Nick Franklin 2012 21 AA 99.737%
Yasmani Grandal 2012 23 AAA 99.714%
Wil Myers 2012 21 AAA 99.659%
Andrelton Simmons 2012 22 AA 99.566%
Travis D’Arnaud 2012 23 AAA 99.512%
Jedd Gyorko 2012 23 AAA 99.493%
Hak-Ju Lee 2012 21 AA 99.492%
Jonathan Singleton 2012 20 AA 99.482%
Nick Castellanos 2012 20 AA 99.465%
Jonathan Schoop 2012 20 AA 99.443%
Jean Segura 2012 22 AA 99.423%
Nick Castellanos 2012 20 A+ 99.051%
Starling Marte 2012 23 AAA 99.015%
Anthony Gose 2012 21 AAA 99%
Rymer Liriano 2012 21 AA 99%
Jake Marisnick 2012 21 AA 99%
Xander Bogaerts 2012 19 A+ 98%
Michael Choice 2012 22 AA 98%
Gary Brown 2012 23 AA 98%
Christian Yelich 2012 20 A+ 98%
Nick Franklin 2012 21 AAA 97%
Javier Baez 2012 19 A 97%
Brett Jackson 2012 23 AAA 96%
Zack Cox 2012 23 AAA 92%
Mason Williams 2012 20 A 91%
Gary Sanchez 2012 19 A 89%
Jake Marisnick 2012 21 A+ 88%
Francisco Lindor 2012 18 A 88%
Cheslor Cuthbert 2012 19 A+ 87%
Miguel Sano 2012 19 A 86%
Billy Hamilton 2012 21 A+ 83%
George Springer 2012 22 A+ 80%
Christian Villanueva 2012 21 A+ 80%
Mike Olt 2012 23 AA 79%
Matt Szczur 2012 22 A+ 78%
Rymer Liriano 2012 21 A+ 76%
Blake Swihart 2012 20 A 66%
Cory Spangenberg 2012 21 A+ 64%
Bubba Starling 2012 19 R 17%

2011 Top 100 Prospects

Player Year Age Level MLB Probability
Mike Trout 2011 19 AA 99.973%
Brett Lawrie 2011 21 AAA 99.969%
Anthony Rizzo 2011 21 AAA 99.911%
Wil Myers 2011 20 AA 99.654%
Christian Colon 2011 22 AA 99.495%
Brandon Belt 2011 23 AAA 99.414%
Austin Romine 2011 22 AA 99.393%
Jesus Montero 2011 21 AAA 99.379%
Devin Mesoraco 2011 23 AAA 99.205%
Brett Jackson 2011 22 AAA 99.199%
Dustin Ackley 2011 23 AAA 99.196%
Yonder Alonso 2011 24 AAA 99%
Lonnie Chisenhall 2011 22 AAA 99%
Zack Cox 2011 22 AA 98%
Jason Kipnis 2011 24 AAA 98%
Mike Moustakas 2011 22 AAA 98%
Desmond Jennings 2011 24 AAA 98%
Jonathan Villar 2011 20 AA 98%
Matt Dominguez 2011 21 AAA 98%
Jurickson Profar 2011 18 A 97%
Bryce Harper 2011 18 A 97%
Tony Sanchez 2011 23 AA 97%
Dee Gordon 2011 23 AAA 97%
Grant Green 2011 23 AA 97%
Manny Machado 2011 18 A+ 97%
Nolan Arenado 2011 20 A+ 96%
Chris Carter 2011 24 AAA 96%
Travis D’Arnaud 2011 22 AA 96%
Wilmer Flores 2011 19 A+ 95%
Jose Iglesias 2011 21 AAA 95%
Hak-Ju Lee 2011 20 A+ 94%
Brett Jackson 2011 22 AA 93%
Jonathan Singleton 2011 19 A+ 92%
Joe Benson 2011 23 AA 91%
Gary Sanchez 2011 18 A 86%
Wilin Rosario 2011 22 AA 86%
Nick Castellanos 2011 19 A 85%
Nick Franklin 2011 20 A+ 83%
Jean Segura 2011 21 A+ 82%
Cesar Puello 2011 20 A+ 82%
Derek Norris 2011 22 AA 76%
Jonathan Villar 2011 20 A+ 73%
Aaron Hicks 2011 21 A+ 68%
Billy Hamilton 2011 20 A 61%
Miguel Sano 2011 18 R 44%
Josh Sale 2011 19 R 15%

Next, let’s take a look at some of the highest KATOH scores of all time, namely those who received a score of at least 99.9%. There aren’t any complete busts among these players, as virtually all of them went on to play in the majors.

All-Time Top KATOH Scores

Player Year Age Level MLB Probability
Sean Burroughs 2000 19 AA 99.998%
Luis Castillo 1996 20 AA 99.995%
Fernando Martinez 2007 18 AA 99.994%
Daric Barton 2005 19 AA 99.992%
Alex Rodriguez 1995 19 AAA 99.992%
Carl Crawford 2001 19 AA 99.992%
Elvis Andrus 2008 19 AA 99.992%
Adam Dunn 2001 21 AAA 99.990%
Joe Mauer 2003 20 AA 99.989%
Ryan Sweeney 2005 20 AA 99.984%
Nick Johnson 1999 20 AA 99.984%
Jose Tabata 2009 20 AA 99.983%
Jose Tabata 2008 19 AA 99.983%
Travis Snider 2009 21 AAA 99.981%
Joaquin Arias 2005 20 AA 99.980%
Matt Kemp 2006 21 AAA 99.979%
Jose Reyes 2002 19 AA 99.979%
Jurickson Profar 2012 19 AA 99.975%
Mike Trout 2011 19 AA 99.973%
Jay Bruce 2008 21 AAA 99.971%
Brett Lawrie 2011 21 AAA 99.969%
B.J. Upton 2004 19 AAA 99.959%
Howie Kendrick 2006 22 AAA 99.951%
Ryan Howard 2005 25 AAA 99.951%
Dioner Navarro 2004 20 AA 99.950%
Luis Rivas 1999 19 AA 99.949%
Lastings Milledge 2005 20 AA 99.948%
Anthony Rizzo 2012 22 AAA 99.947%
Billy Butler 2006 20 AA 99.946%
Fernando Martinez 2008 19 AA 99.944%
Alberto Callaspo 2004 21 AA 99.944%
Jose Lopez 2003 19 AA 99.939%
Freddie Freeman 2010 20 AAA 99.939%
Manny Machado 2012 19 AA 99.937%
Rickie Weeks 2005 22 AAA 99.935%
Casey Kotchman 2004 21 AAA 99.932%
Eric Chavez 1998 20 AAA 99.930%
Adrian Beltre 1998 19 AA 99.927%
Shannon Stewart 1995 21 AA 99.917%
Anthony Rizzo 2011 21 AAA 99.911%
Karim Garcia 1995 19 AAA 99.910%
Jay Bruce 2007 20 AAA 99.907%
Jeff Clement 2008 24 AAA 99.902%
Miguel Cabrera 2003 20 AA 99.900%

All of the players who registered a KATOH score of at least 99.9% did so while playing in either Double- or Triple-A. This isn’t all that surprising since these are the levels closest to the big leagues. But what about the lower levels? As we saw in Double- and Triple-A, there weren’t any complete busts among the highest-ranking hitters from full-season A-ball. For both full-season leagues, each of the 20 top-ranked players has either made it to the majors or, in the case of Carlos Correa, is young enough that he still has an excellent chance to do so. But on the bottom two rungs of the minor league ladder, we come across a few instances where KATOH whiffed, most notably in Garrett Guzman (74%), Richard Stuart (72%), and Pat Manning (72%).

Top KATOH Scores for Seasons in High-A

Player Year Age Level MLB Probability
Adrian Beltre 1997 18 A+ 99.863%
Andruw Jones 1996 19 A+ 99.568%
Giancarlo Stanton 2009 19 A+ 99.405%
Billy Butler 2005 19 A+ 99.348%
Miguel Sano 2013 20 A+ 99.335%
Chris Snelling 2001 19 A+ 99.241%
Jason Heyward 2009 19 A+ 99.097%
Andy LaRoche 2005 21 A+ 99.091%
Wilmer Flores 2010 18 A+ 99.075%
Nick Castellanos 2012 20 A+ 99.051%
Jose Reyes 2002 19 A+ 99%
Casey Kotchman 2003 20 A+ 99%
Vernon Wells 1999 20 A+ 99%
Travis Lee 1997 22 A+ 99%
Brandon Wood 2005 20 A+ 98%
Xander Bogaerts 2012 19 A+ 98%
Justin Huber 2003 20 A+ 98%
Aramis Ramirez 1997 19 A+ 98%
Jay Bruce 2007 20 A+ 98%
Byron Buxton 2013 19 A+ 98%

Top KATOH Scores for Seasons in Low-A

Player Year Age Level MLB Probability
Mike Trout 2010 18 A 99%
Adrian Beltre 1996 17 A 98%
Jurickson Profar 2011 18 A 97%
Bryce Harper 2011 18 A 97%
Sean Burroughs 1999 18 A 97%
Andruw Jones 1995 18 A 97%
Byron Buxton 2013 19 A 97%
Jason Heyward 2008 18 A 97%
Corey Patterson 1999 19 A 97%
Vladimir Guerrero 1995 20 A 97%
Javier Baez 2012 19 A 97%
Ian Stewart 2004 19 A 96%
Lastings Milledge 2004 19 A 96%
Carlos Correa 2013 18 A 96%
Prince Fielder 2003 19 A 96%
Delmon Young 2004 18 A 96%
Josh Vitters 2009 19 A 96%
Chad Hermansen 1996 18 A 95%
Wilmer Flores 2010 18 A 95%
B.J. Upton 2003 18 A 95%

Top KATOH Scores for Seasons in Short-Season A

Player Year Age Level MLB Probability Played in Majors
Chris Snelling 1999 17 A- 82% 1
Richard Stuart 1996 19 A- 72% 0
Aramis Ramirez 1996 18 A- 71% 1
Ryan Kalish 2007 19 A- 71% 1
Cory Spangenberg 2011 20 A- 66% 0
Hanley Ramirez 2002 18 A- 66% 1
Wilson Betemit 2000 18 A- 65% 1
Ismael Castro 2002 18 A- 65% 0
Vernon Wells 1997 18 A- 64% 1
Carlos Figueroa 2000 17 A- 61% 0
Carson Kelly 2013 18 A- 61% 0
Pablo Sandoval 2005 18 A- 60% 1
Dan Vogelbach 2012 19 A- 59% 0
Manny Ravelo 2000 18 A- 57% 0
Chip Ambres 1999 19 A- 57% 1
Maikel Franco 2011 18 A- 55% 0
Jurickson Profar 2010 17 A- 55% 1
Derek Norris 2008 19 A- 54% 1
Cesar Saba 1999 17 A- 54% 0
Edinson Rincon 2009 18 A- 52% 0

Top KATOH Scores for Seasons in Rookie ball

Player Year Age Level MLB Probability Played in Majors
Jeff Bianchi 2005 18 R 76% 1
Justin Morneau 2000 19 R 74% 1
Addison Russell 2012 18 R 74% 0
Garrett Guzman 2001 18 R 74% 0
James Loney 2002 18 R 74% 1
Prince Fielder 2002 18 R 73% 1
Pat Manning 1999 19 R 72% 0
Wilmer Flores 2008 16 R 70% 1
Alex Fernandez 1998 17 R 70% 0
Dorssys Paulino 2012 17 R 69% 0
Tony Blanco 2000 18 R 69% 1
Hank Blalock 1999 18 R 69% 1
Joe Mauer 2001 18 R 69% 1
Hanley Ramirez 2002 18 R 69% 1
Ramon Hernandez 1995 19 R 68% 1
Angel Salome 2005 19 R 68% 1
Marcos Vechionacci 2004 17 R 67% 0
Gary Sanchez 2010 17 R 66% 0
Scott Heard 2000 18 R 65% 0
Jose Tabata 2005 16 R 65% 1

Now for KATOH’s biggest whiffs. Looking at seasons prior to 2011, the following players had very high KATOH ratings, but never made it to baseball’s highest level. The biggest miss was Cesar King, a defensive-minded catcher from the Rangers organization. Though to KATOH’s credit, King did spend five days on the Kansas City Royals’ roster in 2001 without getting into a game. Following King are a couple of busted Yankees prospects in Jackson Melian and Eric Duncan. Not to make excuses for KATOH, but these guys’ high scores may have had something to do with the way the Yankees over-hyped their prospects back then. If those two hadn’t been on Baseball America’s top 100 list, KATOH would have pegged them in the 70s, rather than in the high 90s.

KATOH’s Biggest Misses

Player Year Age Level MLB Probability
Cesar King 1998 20 AA 99.427%
Jackson Melian 2000 20 AA 99%
Eric Duncan 2005 20 AA 98%
Matt Moses 2006 21 AA 98%
Juan Williams 1995 21 AA 98%
Jeff Natale 2005 22 AA 97%
Eric Duncan 2006 21 AA 97%
Nick Weglarz 2010 22 AAA 96%
Nick Weglarz 2009 21 AA 96%
Tony Mota 1999 21 AA 95%
Micah Franklin 1998 26 AAA 94%
Billy Martin 2003 27 AAA 94%
Bill McCarthy 2004 24 AAA 94%
Jackson Melian 1999 19 A+ 94%
Tagg Bozied 2004 24 AAA 94%
Kevin Grijak 1995 23 AAA 93%
Angel Villalona 2008 17 A 93%
Danny Dorn 2010 25 AAA 93%
Nic Jackson 2003 23 AAA 92%
Pat Cline 1997 22 AA 92%

And here are the major leaguers who KATOH deemed least likely to make it when they were in the minors. It’s worth noting that a couple of them (Jorge Sosa and Jason Roach) made it as pitchers.

Worst KATOH Scores Who Made it to the Majors

Player Year Age Level MLB Probability
Justin Christian 2004 24 A- 0.017%
Jorge Sosa 1999 21 A- 0.027%
Tyler Graham 2006 22 A- 0.087%
Gary Johnson 1999 23 A- 0.136%
Bo Hart 1999 22 A- 0.155%
Tommy Manzella 2005 22 A- 0.181%
Michael Martinez 2006 23 A- 0.185%
Eddy Rodriguez 2012 26 A+ 0.194%
Kevin Mahar 2004 23 A- 0.215%
Will Venable 2005 22 A- 0.232%
Brent Dlugach 2004 21 A- 0.268%
Sean Barker 2002 22 A- 0.270%
Steve Holm 2002 22 A- 0.301%
Edgar V. Gonzalez 2000 22 A- 0.315%
Peter Zoccolillo 1999 22 A- 0.328%
Konrad Schmidt 2007 22 A- 0.337%
Tommy Medica 2010 22 A- 0.365%
Brian Esposito 2008 29 AA 0.392%
Jason Roach 1997 21 A- 0.396%
Jorge Sosa 2000 22 A- 0.439%

KATOH’s far from perfect, but overall, I think it does a pretty decent job of forecasting which players will make it to the majors. That being said, it’s still a work in progress, and I have a few ideas rolling around in my head to improve on the model. Furthermore, I’m working to develop something that will forecast how a minor leaguer will perform upon reaching the majors, to complement his MLB%. I’ll be dropping these new and improved KATOH projections (for both hitters and pitchers) after this year’s World Series, when we’ll all be desperate for something baseball-related to get us through the winter.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.


Using Short-Season A Stats to Predict Future Performance

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. So far, I’ve analyzed hitters in Rookie leagues, Low-A, High-A, Double-A and Triple-A using a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.

For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not a player was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. And walk rate, while not predictive for players in Rookie ball, Low-A, or High-A, added a little bit to the model for Double-A and Triple-A hitters. Today, I’ll look into what KATOH has to say about players in Short-Season A-ball. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted to reflect his league’s average for that year. For those interested, here’s the R output based on all players with at least 200 plate appearances in a season in SS A-ball from 1995-2007.

[Image: Short Season Output (R regression output)]
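
The post doesn't spell out exactly how the league adjustment mentioned above was done, so the Python sketch below shows just one common approach as an assumption: dividing a player's stat by his league's average for that season, so that 1.0 means exactly league average. The league averages and the function name are hypothetical.

```python
def league_adjust(player_stat, league_avg):
    """Express a player's stat relative to his league-season average
    (1.0 = exactly league average)."""
    if league_avg <= 0:
        raise ValueError("league average must be positive")
    return player_stat / league_avg


# Hypothetical league-season averages for ISO, keyed by (league, year).
league_iso = {("NYP", 2006): 0.095, ("NWL", 2006): 0.120}

# The same raw .150 ISO is more impressive in the lower-offense league.
nyp = league_adjust(0.150, league_iso[("NYP", 2006)])
nwl = league_adjust(0.150, league_iso[("NWL", 2006)])
print(round(nyp, 2), round(nwl, 2))
```

Whatever the exact formula, the point is the same: a raw stat line only means something once it's measured against the run environment it came from.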

Just like we saw with hitters in Rookie ball, a player’s Baseball America prospect status couldn’t tell us anything about his future as a big leaguer. This was entirely due to the scarcity of top 100 prospects in the sample, as only a handful of players spent the year in SS A-ball after making BA’s top 100 list. Somewhat surprisingly, walk rate is predictive for players in SS-A, despite being statistically insignificant for hitters in Rookie ball and the more advanced A-ball levels. Another interesting wrinkle is the “Strikeout_Rate:Age” variable. Basically, this says that strikeout rate matters more for younger players than for older players at this level, although frequent strikeouts are obviously a bad thing no matter how old you are:

[Image: Rplot]
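
To make the “Strikeout_Rate:Age” interaction a little more concrete, here is a small Python sketch of how an interaction term lets the penalty for strikeouts depend on a player's age. The two coefficients are hypothetical numbers picked so that the pattern matches the description above (a bigger penalty for younger players, but a penalty at every age); they are not the values from the R output.

```python
# Hypothetical coefficients for illustration only; not KATOH's fitted values.
B_K_RATE = -20.0       # main effect of strikeout rate
B_K_RATE_X_AGE = 0.8   # coefficient on the Strikeout_Rate:Age interaction


def k_rate_effect(age):
    """Change in the model's linear score per unit of strikeout rate
    for a hitter of the given age."""
    return B_K_RATE + B_K_RATE_X_AGE * age


for age in (18, 20, 22):
    print(age, round(k_rate_effect(age), 1))
```

With these made-up numbers, the strikeout penalty is more than twice as steep for an 18-year-old as for a 22-year-old, yet it stays negative across the ages you'd typically see at this level.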

The season is less than 50 games old for most teams in the New York-Penn and Northwest Leagues, which makes it a little premature to start analyzing players’ stats. But just for kicks, here’s a look at what KATOH says about this year’s crop of players with at least 100 plate appearances through July 28th. The full list of players can be found here, and you’ll find an excerpt of those who broke the 40% barrier below:

Player Organization Age MLB Probability
Rowan Wick STL 21 82%
Eduard Pinto TEX 19 68%
Marcus Greene TEX 19 60%
Mauricio Dubon BOS 19 59%
Franklin Barreto TOR 18 57%
Christian Arroyo SFG 19 57%
Skyler Ewing SFG 21 56%
Taylor Gushue PIT 20 55%
Domingo Leyba DET 18 55%
Raudy Read WSN 20 53%
Nick Longhi BOS 18 52%
Andrew Reed HOU 21 52%
Danny Mars BOS 20 51%
Amed Rosario NYM 18 49%
Yairo Munoz OAK 19 48%
Seth Spivey TEX 21 47%
Mike Gerber DET 21 47%
Mark Zagunis CHC 21 47%
Kevin Krause PIT 21 46%
Leo Castillo CLE 20 45%
Jordan Luplow PIT 20 45%
Mason Davis MIA 21 40%
Kevin Ross PIT 20 40%
Franklin Navarro DET 19 40%

As we saw with Rookie league hitters, KATOH doesn’t think any of these players are shoo-ins to make it to the majors. Even Rowan Wick, who hit a Bondsian .378/.475/.815 before getting promoted, gets just 82%. This goes to show that SS A-ball stats just aren’t all that meaningful.

Once the season’s over, I’ll re-run everything using the final 2014 stats, which will give us a better sense of which prospects had the most promising years statistically. I also plan to engineer an alternative methodology — to supplement this one — that will take into account how a player performs in the majors, rather than his just getting there. Additionally, I hope to create something similar for projecting pitchers based on their statistical performance. In the meantime, I’ll apply the KATOH model to historical prospects and highlight some of its biggest “hits” and “misses” from years past. Keep an eye out for the next post in the coming days.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.