
Archive for Research

How Eric Sogard Made History

August 14, 2014

Last Saturday’s Athletics vs. Twins game turned out just about as everyone expected, with Oakland winning the game 9-4 in what was a noncompetitive contest after the 5th inning. Minnesota chose to debut their #8 prospect, starting pitcher Trevor May, in a road game during a lost season against a team with the best run differential in the majors. They did this, one can only suspect, because they are the Minnesota Twins. From experience we may infer the answer to the question “what happens when we give a young man his debut against a pitching meat grinder?”, and so it was that Trevor May struggled, and struggled mightily. May lasted only two innings, gave up four earned runs, walked seven batters, and did not record a single strikeout. This is strange for many reasons, the main ones being that Minnesota is a pitch-to-contact team in its approach, and that no one has walked at least seven batters and not struck anyone out in their debut since Ricky Romero for the Blue Jays in August of 2012.

However, we’re not here to talk exclusively about Trevor May, even though his wild performance on Saturday night partially allowed this article’s existence. We’re here to talk about Eric Sogard, who quietly had a strangely historic night during an otherwise fairly pedestrian Oakland win. Eric Sogard is known, if he is truly known outside of the Oakland fan base, for two things: coming in second during this past off-season’s “Face of MLB” contest, and for his prowess with the glove.

Sogard doesn’t really hit: he’s currently slashing .216/.305/.271, and he hit his first home run of the year last week, a 349-foot missile down the right field line. Eric Sogard is one of those major leaguers who is in the league because he does one thing very well, and because he plays for a team that has the luxury of being able to carry a player whose value is determined almost entirely by defense. Make no mistake, Eric Sogard is a very good defensive second baseman: he has a UZR/150 of 8.5 that puts him 8th among active 2B this year with a minimum of 500 innings played. He is not, however, great with the bat.

That being said, let’s look at Eric Sogard’s batting line from Saturday night against the Twins, when he batted in the 9th position:

0-1, 1 R, 4 BBs, 1 SB

Eric Sogard walked four times while batting in the 9th position in the lineup on Saturday night. Take a moment to let that fact sink in, because it’s crazy. How rare is it for a batter in the 9th position in the lineup to walk four times? Since 1914, it has only happened 14 times including Sogard this past weekend. He’s the first member of the Oakland Athletics to ever do it. Only two other players since 1914 have accomplished this and also stolen a base in the same game: Desi Relaford (2002) & Brady Anderson (1990). On top of all of that, Sogard also made an error – because in a game when weird things are happening to a defensive second baseman, of course he did. He’s now the only player in baseball history to have walked four times in the 9th spot, stolen a base, and made an error in the same game. That’s reaching a little bit, but hey, baseball history!

Just pointing out the rarity of this phenomenon isn’t really interesting enough, though. Let’s go a little deeper. Specifically, let’s ask ourselves this question: “how many pitches did Eric Sogard ‘get to hit’ on Saturday night?” By “get to hit” I mean pitches in the strike zone that have a high likelihood of good contact – i.e., not “pitcher’s pitches” on the corner low and away or nasty breaking pitches located perfectly. Yes, this is subjective, as every hitter is different in their preference of locations to swing at and hit thrown pitches, but we’re more generally going to look at pitches that were over the plate and hittable. We know the answer to this question isn’t going to be a lot of pitches, given the four walks. However, for a light-hitting second baseman batting in the 9th spot, who should expect to be challenged over the plate in almost every at-bat, it’s a fun question to ask. It also allows us to look at some GIFs.

I’ve gone ahead and split up every at-bat that Eric Sogard had on Saturday into different GIFs and overlaid them with circles: green for balls and red for strikes. Sogard saw 22 pitches on Saturday, which tied him for the team lead with Derek Norris. Let’s dig in.

1st AB, 2nd inning, 2 out, none on: P Trevor May:

Sogard saw four pitches, all four of which were balls. Only the first pitch of the at-bat was close to the strike zone, and Sogard was either taking all the way or correctly identified the pitch as a changeup and laid off.

Pitches to hit tally: 0

2nd AB, 3rd inning, 1 out, 1 on: P Samuel Deduno:

This was the biggest battle of the game for our nerd power-harnessing second baseman, as he saw seven pitches and went to a full count. The 2-0 strike might have been the most hittable pitch Sogard saw all night, even though it was low in the zone and breaking slightly toward the outer half of the plate. The 3-1 pitch, called a strike, probably could’ve been called either way, and Sogard pulled a 3-2 liner foul off of an inside fastball off the plate before taking his second walk.

Pitches to hit tally: 1

3rd AB, 5th inning, 1 out, none on: P Samuel Deduno:

Deduno had just given up a two-run homer to Stephen Vogt in the previous at-bat, so he might’ve been a little rattled when facing our young hero. Sogard walked on five pitches, with the 1-0 high fastball strike the best pitch to hit. However, when you’ve already walked twice, why start swinging now?

Pitches to hit tally: 2

4th AB, 6th inning, 2 out, 1 on: P Ryan Pressly:

Nothing to swing at. The two low and away pitches were the closest to strikes, but they were also easy takes after two pitches high and outside that weren’t close. Four walks achieved.

Pitches to hit tally: 2

5th AB, 8th inning, 2 out, 1 on: P Anthony Swarzak:

At last Eric Sogard is bested. After a high and outside pitch was taken for a ball, allowing us to dream of the first-ever five walk night out of a hitter in the 9th spot, Sogard swung at a nasty low and away pitch on the corner and meekly chopped out to the pitcher. Not even Sogard’s blazing speed could rescue him this time. Unfortunately, that was not a pitch to hit/swing at.

Alas, poor Sogard, we knew him well.

Final pitches to hit tally: 2

To wrap it all up, here we have a GIF of all of the pitches Sogard saw on Saturday (from the catcher’s perspective). I’m viewing the strike zone that was generated by the system with a healthy bit of skepticism, as it’s not adjusted for the batter or the umpire. Still, it gives us a concrete idea of how many pitches Sogard saw that were worth swinging at:

The answer is two or three at most, which is insane, because the Minnesota Twins have the fifth-lowest BB/9 in the majors, and Eric Sogard is hitting around .215 with one home run.

Baseball is great because Giancarlo Stanton hits majestic 500 foot moon shots, but it’s also great because guys who are 5′ 10″ (on a good day) defensive specialists who platoon at second base draw four walks against a team that is known for pitching to contact. Players like Eric Sogard aren’t barred from the history books, even though they’re often overlooked in favor of the mashers chasing home run titles; they simply make their history in a very different and sometimes more interesting way. Maybe we have to dig for it a little. Or maybe Eric Sogard just needs to not swing for a whole game and let the stars align.

This post is dedicated to my good friend Adam Sax, who is nice enough to help me out with the deep stats (and the last GIF) and is the biggest Sogard fan I know.

Biogenesis Players: Then vs. Now

by Danny Sader

August 13, 2014

Watching Nelson Cruz this year and all the noise he has been making, on top of a recent report by Buster Olney stating, “The average distance of the fly balls pulled by Ryan Braun this season is down 42 feet, from 302 to 260…”, inspired me to look up the numbers for players suspended in the Biogenesis case. The big four suspended were Alex Rodriguez, Ryan Braun, Nelson Cruz, and Jhonny Peralta. Other position players involved and suspended were Everth Cabrera, Jesus Montero, Francisco Cervelli, and Jordany Valdespin.

This article will focus on the big four with the exception of A-Rod, because he has been suspended all season. Obviously, this is a small sample size, so take heed. I will be making a couple of assumptions, the main one being that these players had been using steroids for at least 3 years (2010-2012) prior to being caught and suspended. The other assumption is that enough time has passed for the effects of the steroids to have worn off and that their bodies/abilities are back to their more natural state.

Ryan Braun	2010	2011	2012	2014	2014 (ZiPSU)
HR/FB	14.00%	18.80%	22.80%	15.10%
Slug%	0.501	0.597	0.595	0.496	0.505
ISO	0.197	0.265	0.276	0.211	0.231
wRC+	134	171	160	129	133
OFF	32.5	58.8	52	12.5	21
True Distance (ft)	408.2	406.7	406.9	387.9
Average Speed Off Bat (mph)	105.1	104.7	104.2	102.1

Nelson Cruz	2010	2011	2012	2014	2014 (ZiPSU)
HR/FB	15.20%	18.70%	13.10%	20.00%
Slug%	0.576	0.509	0.460	0.513	0.505
ISO	0.258	0.246	0.200	0.253	0.246
wRC+	147	116	105	130	127
OFF	26.6	7.7	0.8	14.9	18.5
True Distance (ft)	405.2	411.6	418.6	398.9
Average Speed Off Bat (mph)	105.2	106.4	106.8	104.2

Jhonny Peralta	2010	2011	2012	2014	2014 (ZiPSU)
HR/FB	7.50%	10.80%	8.30%	12.50%
Slug%	0.392	0.478	0.384	0.447	0.441
ISO	0.143	0.179	0.145	0.187	0.180
wRC+	91	122	85	122	120
OFF	-12.7	11.2	-13.8	8.4	10.3
True Distance (ft)	392.5	388.4	391.9	397
Average Speed Off Bat (mph)	101.2	102.3	101.7	102.8

The main thing that jumps out is that Cruz and Peralta are statistically putting up some of the best numbers of their careers (without a doubt, top 3)! Braun, however, is having the worst season of the four shown above. Cruz’s and Peralta’s HR/FB rates are each at their highest of the four seasons, and Peralta’s ISO is as well (Cruz’s .253 is just short of his .258 from 2010). Braun’s slugging, wRC+, and OFF are all at their lowest of the four seasons, although his HR/FB rate and ISO remain a bit above his 2010 marks. Looking at wRC+ and OFF, Peralta is having his 4th best season ever, Cruz is having his 2nd best ever, and Braun is having the worst season of his career to date (with the possible exception of 2008).

Using ESPN’s hittrackeronline.com, I looked up each player’s True Distance on home runs for each season, as well as the average speed off the bat of those home runs. Ryan Braun has lost about 3 mph from his 2010 peak, which has come with a loss of about 20 feet on his home runs. Nelson Cruz has lost about 2.5 mph and 20 feet off his home run balls from his peak of the four years. Jhonny Peralta, on the other hand, is showing his best numbers this year.
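The declines quoted above can be checked directly against the tables. The sketch below is only arithmetic on the figures already listed (2010, 2011, 2012, 2014); the dictionary layout and the helper name `change_from_prior_peak` are illustrative choices, not part of the original analysis.

```python
# Home run tracking figures copied from the tables above, in season order
# (2010, 2011, 2012, 2014). The structure and names here are illustrative.
HOME_RUN_DATA = {
    "Ryan Braun": {
        "distance_ft": [408.2, 406.7, 406.9, 387.9],
        "speed_mph": [105.1, 104.7, 104.2, 102.1],
    },
    "Nelson Cruz": {
        "distance_ft": [405.2, 411.6, 418.6, 398.9],
        "speed_mph": [105.2, 106.4, 106.8, 104.2],
    },
    "Jhonny Peralta": {
        "distance_ft": [392.5, 388.4, 391.9, 397.0],
        "speed_mph": [101.2, 102.3, 101.7, 102.8],
    },
}


def change_from_prior_peak(values):
    """Return the 2014 value minus the best value from 2010-2012."""
    *prior_seasons, current = values
    return round(current - max(prior_seasons), 1)


changes = {
    player: {stat: change_from_prior_peak(vals) for stat, vals in stats.items()}
    for player, stats in HOME_RUN_DATA.items()
}

for player, delta in changes.items():
    print(f"{player}: {delta['distance_ft']:+.1f} ft, {delta['speed_mph']:+.1f} mph")
```

Measured this way, Braun and Cruz are each down roughly 20 feet and 2.5 to 3 mph from their best prior season, while Peralta is up slightly in both.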

So what does all this mean? In summary, I believe the main thing we can take away from this is that each player who used steroids should be assessed on a case-by-case basis. Every player is affected differently. We cannot group all steroid users together. Using the above statistics as evidence, after being charged in the Biogenesis case, 2 players are having among the best seasons of their careers while another is having his worst. In addition, the best all-around athlete and youngest of the 3 (and therefore closest to his prime) is the one who is struggling most, Ryan Braun! Whether it is the HOF vote or evaluating the future value of perceived steroid users, we can’t lump them all into the same group and assume that they will automatically decline. Yes, using steroids is absolutely cheating; however, it doesn’t necessarily mean that those players wouldn’t have been just as productive had they chosen legal supplements or nothing at all.

What Types of Hitters Have Large Platoon Splits?

by Chris Mitchell

August 11, 2014

Big-league teams today employ a myriad of data-driven strategies to eke every last drop of value from the players on their rosters. Many of these strategies consist of matching up hitters and pitchers based on their handedness. Between lineup platoons and highly-specialized bullpens, managers today go to great lengths to ensure they’re putting their players in the best possible situation to succeed.

It’s easy to see why. With very few exceptions, Major League hitters hit much better against opposite-handed pitching. In terms of wOBA (vs. opposite-handed minus vs. same-handed), lefties perform about .043 better against righties, while righties hit .031 better against lefties. Yet not all platoon splits are created equal. Players like Shin-Soo Choo, David Wright, and Jonny Gomes are notorious for their drastic splits, while others put up comparable numbers no matter who’s on the mound. Ichiro Suzuki and Alex Rodriguez are a couple of the no-platoon-split poster boys.

Ok, so some batters have bigger platoon splits than others, but is there any particular reason for this? Take Choo for example. Is there something inherent to his skill set or approach that causes him to struggle against lefties?

Hoping to find an answer, I ran some regressions in search of attributes that might make a player more likely to have an exaggerated platoon split. I tested all sorts of things, from walk rate and swing% to a player’s height and throwing arm, but didn’t come away with much. Aside from a hitter’s handedness, attributes that proved statistically significant included a hitter’s overall wOBA, his line drive rate, his strikeout rate, and his contact rate on pitches out of the zone, but even those relationships are extremely weak. It takes a .100 change in a batter’s wOBA, or a 10% increase in K% or LD%, to move a batter’s platoon split by just .010. This tells us something, but not a ton, and at the end of the day, these variables account for a nearly negligible 4% of the variation in hitters’ platoon splits. Here’s the resulting R output. My sample included all batter seasons from 2007-2013 with at least 100 plate appearances against both lefties and righties, excluding switch hitters:

Good hitters or guys who strike out frequently might be a little more prone to having large platoon splits. But for all practical purposes, a player’s ability to hit one type of pitching better than the other seems to be a skill that’s independent of all others. Aside from going by a player’s platoon stats, which can take years to become reliable, there’s little we can do to anticipate which hitters might fare particularly badly against same-handed pitching. And with the exception of players with long track records of unusual platoon splits, like Choo and Ichiro, it’s generally safe to assume that any given hitter’s true-talent platoon split is within shouting distance of the average: .043 for lefties and .031 for righties.
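To get a feel for what a statistically real but practically weak relationship looks like, here is a minimal sketch of a one-predictor least-squares fit. It is not the regression described above: the data are synthetic, only wOBA is used as a predictor, and the slope (.010 of split per .100 of wOBA) and noise level are assumptions chosen so that wOBA explains only a few percent of the variance.

```python
import random

random.seed(42)

# SYNTHETIC data for illustration only; these are not real hitter seasons.
# Assumed relationship: each .100 of wOBA adds .010 of platoon split (slope 0.1),
# buried in noise large enough that wOBA explains only a few percent of the variance.
N_SEASONS = 5000
TRUE_SLOPE = 0.1
woba = [random.gauss(0.320, 0.030) for _ in range(N_SEASONS)]
split = [0.035 + TRUE_SLOPE * (w - 0.320) + random.gauss(0.0, 0.015) for w in woba]


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = simple_ols(woba, split)
print(f"slope: {slope:.3f}  intercept: {intercept:.3f}  R^2: {r_squared:.3f}")
```

With thousands of seasons, the fitted slope lands close to the assumed one even though R^2 stays small, which is why a variable can be statistically significant and still nearly useless for predicting an individual hitter’s split.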

Ruben Amaro Jr. Says Teams “Over-Covet” Prospects; Is He Right?

by Nick Rabasco

August 9, 2014

Many are questioning the thought process behind Ruben Amaro Jr. standing pat at the non-waiver trade deadline. The Phillies have a lot of veterans under fairly large contracts. According to Philly.com, when asked why he didn’t move some of his veterans, Amaro stated:

“In this day and age, I think one of the most over-coveted elements of baseball are prospects,” Amaro said. “I don’t know how many prospects that have been dealt over the last several years have really come to bite people in the a**. I think what’s happened is, I think teams are really kind of overvaluing in some regards.”

I thought it would be fun to actually go back and see how many prospects or minor league players who were traded at the deadline panned out. I went back to 2005 and used every single transaction that involved both an MLB player and a prospect (I considered a prospect a guy who had never been in the MLB, or a guy who had been in the MLB but had yet to achieve rookie status). I also strictly used trades that were done on July 31, in each year from 2005-2011. I skipped 2012 and 2013 because it’s harder to get a gauge on whether or not prospects traded will make it or have any success. Also, from 2011 until now, prospects have had about three years to get to the big leagues and I felt that was a good place to end.

There were 53 transactions in that time, some very minor, some very major, and some in between. I took each transaction and compiled each player’s WAR after the trade (WARAT). I applied the same criterion when a player was traded on two different July 31s. For example, Jake Peavy was traded twice, so his WARAT will be different from one trade to the next. Some players appear both as prospects and as MLB guys, like Jarrod Saltalamacchia, who was traded once as a prospect and again later on, once he was no longer considered a rookie.

I will look at the percentage of prospects that never made it, the percentage that made it but provided negative WAR, and the percentage that made it and provided positive WAR. I will then look at the MLB guys who were traded and the percentage of guys who provided positive and negative WAR for the remainder of their careers.

The data I found was very interesting. There were 85 “prospects” traded and 66 MLB guys traded. Below is a table with each trade. In parentheses, I noted whether each player was a prospect (P) or an MLB guy at the time, followed by their WARAT, or WAR after trade. If a prospect never made it to the show, I use the abbreviation “NMI.”

TEAM A	TEAM B
2005
Kyle Bono (P, NMI) & Kenny Perez (P, NMI)	Jose Cruz Jr. (MLB, 3.2)
Kyle Farnsworth (MLB, 3.2)	Zach Miner (P, 2.7) & Roman Colon (P, NMI)
Geoff Blum (MLB, 3.2)	Ryan Meaux (P, NMI)
Ron Villone (MLB, -0.6)	Yorman Bazardo (P, 0.2) & Michael Flannery (P, NMI)
Miguel Olivo (MLB, 7.7)	Miguel Ojeda (MLB, -0.3) & Nathaneal Mateo (P, NMI)
2006
Rich Scalamandre (P, NMI)	Jorge Sosa (MLB, -0.1)
Todd Walker (MLB, 0.7)	Jose Ceda (P, 0)
Rheal Cormier (MLB, -0.3)	Justin Germano (P, 0.4)
Kyle Lohse (MLB, 17.6)	Zach Ward (P, NMI)
Jeremy Affeldt (MLB, 2.5) & Denny Bautista (MLB, -0.2)	Ryan Shealy (P, 0.7) & Scott Dohmann (P, -0.4)
Sean Casey (MLB, -0.8)	Brian Rogers (P, -0.3)
Jose Diaz (P, NMI)	Matt Stairs (MLB, 0.9)
Julio Lugo (MLB, -0.8)	Joel Guzman (P, -0.2) & Sergio Pedroza (P, NMI)
Jesse Chavez (P, 0.9)	Kip Wells (MLB, 0.2)
2007
Mark Teixeira (MLB, 24.7) & Ron Mahay (MLB, 0.6)	Jarrod Saltalamacchia (P, 8.2) & Elvis Andrus (P, 17.6) & Neftali Feliz (P, 4.8) & Matt Harrison (P, 8.8) & Beau Jones (P, NMI)
Eric Gagne (MLB, -0.8)	Kason Gabbard (P, 0.4) & David Murphy (P, 10.4) & Engel Beltre (P, NMI)
Jon Link (P, 0)	Rob Mackowiak (MLB, -0.7)
Julio Mateo (MLB, 0.2)	Jesus Merchen (P, NMI)
Matt Morris (MLB, 0.1)	Rajai Davis (P, 8.4)
Wilfredo Ledezma (MLB, 0) & Will Startup (P, NMI)	Royce Ring (P, 0)
2008
Jason Bay (MLB, 6.1)	Manny Ramirez (MLB, 6) & Craig Hansen (P, -0.5) & Brandon Moss (P, 6.3)
Ken Griffey Jr. (MLB, -1.1)	Nick Masset (P, 2.4) & Danny Richar (P, -0.2)
Arthur Rhodes (MLB, 1.7)	Gaby Hernandez (P, NMI)
Manny Ramirez (^)	Andy LaRoche (P, 0.3) & Bryan Morris (P, -1.4)
2009
Aaron Poreda (P, 0.1) & Adam Russell (P, 0) & Clayton Richard (P, 0.7)	Jake Peavy (MLB, 13.2)
Jarrod Washburn (MLB, -0.4) & Mauricio Robles (P, 0.1)	Luke French (P, -0.5)
Vinny Rottino (P, 0.1)	Claudio Vargas (MLB, 0.1)
Orlando Cabrera (MLB, 0.3)	Tyler Ladendorf (P, NMI)
Edwin Encarnacion (MLB, 13.8) & Josh Roenicke (P, 0.1)	Scott Rolen (MLB, 7.4) & Zach Stewart (P, -0.4)
Joe Beimel (MLB, -0.3)	Ryan Mattheus (P, -0.3) & Robinson Fabian (P, NMI)
Nick Johnson (MLB, 0.5)	Aaron Thompson (P, -0.2)
Victor Martinez (MLB, 10.9)	Justin Masterson (P, 13.7) & Bryan Price (P, NMI) & Nick Hagadone (P, 0)
Chase Weems (P, NMI)	Jerry Hairston (MLB, 3.1)
2010
Bobby Crosby (MLB, -0.1) & DJ Carrasco (MLB, -0.5) & Ryan Church (MLB, 0.5)	Chris Snyder (MLB, -0.1) & Pedro Ciriaco (P, 0.1)
Lance Berkman (MLB, 4.5)	Jimmy Paredes (P, -1.6) & Mark Melancon (P, 3.3)
Ramon Ramirez (MLB, 0.6)	Daniel Turpen (P, NMI)
Cristian Guzman (MLB, -0.7)	Ryan Tatusko (P, NMI) & Tanner Roark (P, 3.6)
Jarrod Saltalamacchia (MLB, 8.7)	Roman Mendez (P, 0.1) & Chris McGuiness (P, -0.4)
Javier Lopez (MLB, 2.8)	Joe Martinez (P, 0.2) & John Bowker (MLB, -1)
Octavio Dotel (MLB, 2.4)	James McDonald (MLB, 2.9) & Andrew Lambo (P, -0.2)
Rick Ankiel (MLB, 1) & Kyle Farnsworth (MLB, 1)	Tim Collins (P, 1.4) & Gregor Blanco (MLB, 6.2) & Jesse Chavez (MLB, 1.5)
Corey Kluber (P, 8.4)	Jake Westbrook (MLB, 3.8)
Nick Greenwood (P, 0)	Ryan Ludwick (MLB, 1.4)
Ted Lilly (MLB, 2.8) & Ryan Theriot (MLB, 0.5)	Blake DeWitt (MLB, -0.5) & Kyle Smit (P, NMI) & Brett Wallach (P, NMI)
2011
Orlando Cabrera (MLB, -0.7)	Thomas Neal (P, -0.6)
Derrek Lee (MLB, 1.7)	Aaron Baker (P, NMI)
Michael Bourn (MLB, 9.1)	Jordan Schafer (MLB, 0.1) & Juan Abreu (P, 0) & Paul Clemens (P, -1.4) & Brett Oberholtzer (P, 2.9)
Alex Castellanos (P, -0.6)	Rafael Furcal (MLB, 1.2)
Brad Ziegler (MLB, 2.1)	Brandon Allen (P, -0.4) & Jordan Norberto (P, 0.3)
Mike Adams (MLB, 1.2)	Robbie Erlin (P, 1.1) & Joe Wieland (P, -0.1)
Erik Bedard (MLB, 3.4)	Josh Fields (P, 0.9) & Trayvon Robinson (P, -0.7) & Chih-Hsien Chiang (P, NMI)
Ubaldo Jimenez (MLB, 4.8)	Alex White (P, -0.2) & Joe Gardner (P, NMI) & Matt McBride (P, -1.2)

As you can see, some trades worked out better than others. Of the 85 prospects, 62 (72.9%) made it to the big leagues. That means 23 prospects, or 27.1% of those traded, never stepped on a big league field. Of the 62 that made it, 32 were good for positive WAR after the trade, 21 were worth negative WAR, and 9 were at 0 WAR. The WAR of all the prospects that made it adds up to 97.8. That’s an average of about 1.2 WAR per prospect traded (97.8/85), or about 1.6 per prospect who reached the majors (97.8/62).

Now we can analyze the MLB guys. There is a wide variety of ages in the group of 66 MLB players. Some were traded fairly early in their MLB careers; some were traded as their careers were winding down. I found that 69.7% of these players (46) were good for positive WAR after they were traded. 19 players (28.8%) were worth negative WAR, and 1 player was worth zero WAR after the trade. When you add their WAR together, you get 178.8, averaging 2.7 WAR per MLB player traded.

So, on average, teams were trading an MLB guy that would be worth 2.7 WAR for the rest of their career, for a prospect that would turn out to be worth 1.2 WAR in that same time period.
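The averages and percentages above are straightforward to reproduce from the reported counts. The sketch below uses only the totals quoted in the text; the dictionary layout and the `summarize` helper are illustrative names, not part of the original spreadsheet work.

```python
# Counts and WAR totals as reported in the text above.
GROUPS = {
    "prospects": {"traded": 85, "positive": 32, "negative": 21, "zero": 9, "total_war": 97.8},
    "mlb": {"traded": 66, "positive": 46, "negative": 19, "zero": 1, "total_war": 178.8},
}


def summarize(group):
    """Compute the share of traded players with post-trade WAR and the average WAR."""
    traded = group["traded"]
    # Players with any recorded WAR after the trade (for prospects: reached the majors).
    with_war = group["positive"] + group["negative"] + group["zero"]
    return {
        "with_war": with_war,
        "pct_with_war": round(100 * with_war / traded, 1),
        "pct_positive_of_traded": round(100 * group["positive"] / traded, 1),
        "war_per_player_traded": round(group["total_war"] / traded, 2),
    }


summary = {name: summarize(group) for name, group in GROUPS.items()}

for name, stats in summary.items():
    print(name, stats)
```

Dividing by everyone traded, rather than only by those who reached the majors, is what produces the 1.2 versus 2.7 comparison; it treats a prospect who never made it as contributing zero WAR.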

In addition, if you add up the total WARAT for each individual trade, the MLB player’s WARAT was higher than the prospect’s WARAT in 32 of the 53 trades (60.4%). The prospect’s WARAT was higher in 17 of 53 trades (32%). Finally, there were three trades that cancelled each other out, and were neutral.
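One way to sketch the per-trade comparison is to sum WARAT separately for the MLB players and the prospects in each deal. The three trades below are copied from the table; the `TradedPlayer` class, the `trade_winner` helper, and the choice to count an NMI prospect as 0 WAR are assumptions made for illustration, and the original tally may have handled edge cases differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TradedPlayer:
    name: str
    is_prospect: bool
    warat: Optional[float]  # None means the prospect never made it (NMI)


def trade_winner(players):
    """Return which group produced more WAR after the trade.

    Assumption: an NMI prospect counts as 0 WAR.
    """
    mlb_total = sum(p.warat or 0.0 for p in players if not p.is_prospect)
    prospect_total = sum(p.warat or 0.0 for p in players if p.is_prospect)
    if mlb_total > prospect_total:
        return "mlb"
    if prospect_total > mlb_total:
        return "prospects"
    return "even"


# Three trades copied from the table above.
teixeira_trade = [
    TradedPlayer("Mark Teixeira", False, 24.7),
    TradedPlayer("Ron Mahay", False, 0.6),
    TradedPlayer("Jarrod Saltalamacchia", True, 8.2),
    TradedPlayer("Elvis Andrus", True, 17.6),
    TradedPlayer("Neftali Feliz", True, 4.8),
    TradedPlayer("Matt Harrison", True, 8.8),
    TradedPlayer("Beau Jones", True, None),
]
kluber_trade = [
    TradedPlayer("Corey Kluber", True, 8.4),
    TradedPlayer("Jake Westbrook", False, 3.8),
]
lohse_trade = [
    TradedPlayer("Kyle Lohse", False, 17.6),
    TradedPlayer("Zach Ward", True, None),
]

results = {
    "Teixeira": trade_winner(teixeira_trade),
    "Kluber": trade_winner(kluber_trade),
    "Lohse": trade_winner(lohse_trade),
}
print(results)
```

Under these assumptions the prospects come out ahead in the Teixeira and Kluber deals, while the MLB side wins the Lohse deal.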

There are many ways to look at this and some things to keep in mind. It may seem like trading an established big leaguer is not smart from these numbers. However, it depends on the situation a team is in. Also, most of these “prospects” have yet to finish their MLB careers, so they are still in the process of racking up WAR. Good examples include Kluber, Masterson, Moss, Murphy, Andrus, Davis, and Feliz. On the other hand, some of the MLB guys were traded when they were still pretty young. Saltalamacchia, Martinez, Teixeira and Encarnacion are examples, but they are still older than most right now. These guys are providing most of the WARAT for the MLB guys. Also, some of the MLB guys were so old that they only lasted another couple years in the MLB.

You have to take money into account as well. For some trades, teams are not only getting prospects in return, but also dumping salary, leaving them money to spend elsewhere in the off-season. One example of a trade that worked out really well for one team and not so well for the other was the huge Braves-Rangers trade. The Braves received Mark Teixeira and traded five prospects, four of whom have turned out well. Teixeira was great for Atlanta, but he was only there for half of 2007 and half of 2008, and the Braves did not even advance to the postseason with him. The Rangers, however, got guys who helped them reach the World Series in 2010 and 2011. Be careful with the prospects you trade away.

Since I am relating this article to Ruben Amaro Jr., I will connect this data to the Phillies’ current situation. The evidence suggests it probably would have been smart for them to move their older, more expensive players for prospects, even if those prospects aren’t considered top prospects. Amaro stated that he doesn’t know how many prospects in past years have come back to bite teams. Yes, not every prospect is going to pan out. And yes, some of them could come back to bite. However, as mentioned before, over 70% of prospects dealt at the deadline from 2005-2011 at least made it to the major leagues, and just over half of those who made it (32 of 62) contributed positive WAR after the trade. That’s a pretty good turnout.

Hamels, Utley, Rollins, Papelbon, Howard, Burnett, and Byrd will all be north of 30 years old next year, with some over 35. So the Phillies do not have young, already-established guys like the Martinez, Encarnacion, and Teixeira I talked about earlier. They are old. The current Phillies team has shown it’s not going to win, so why wouldn’t they trade off some of their assets and take a chance on some prospects panning out, while at the same time freeing up money for future off-seasons? They are most likely not going to win in 2015 or 2016, so even if their current players still provide positive WAR in the next two years, what’s the point in keeping them around? Go out, blow the roster up, and completely reload. With the number of guys they could trade, or could have traded, they’re bound to have some of the prospects they get in return pan out, as the data above suggests. Stock up the minor league system, and take the hit at the major league level for a couple years. Add that to the money they will be saving, and they will be well-equipped to contend in three years.

Prospects are not “over-coveted” in baseball. The problem for Amaro and the Phillies is that they do not have the right people in charge of evaluating and developing prospects. They have traded for prospects in the past, such as the Pence and Victorino trades in 2012 (not included above) and have not gotten good returns. So, maybe Ruben Amaro Jr. just isn’t very good at what he does, and wants to believe that giving up major-league veterans for prospects when your team is completely out of it is not a good idea.

Applying KATOH to Historical Prospects

by Chris Mitchell

August 9, 2014

Over the last few weeks, I have written a series of posts looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I analyzed hitters in Rookie leagues, Short-Season A, Low-A, High-A, Double-A, and Triple-A using a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
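To make the probit idea concrete: the model forms a weighted sum of a player’s inputs and passes it through the standard normal CDF to get a probability between 0 and 1. The sketch below shows only that prediction step. The coefficients, the intercept, and the feature names are entirely hypothetical and are not KATOH’s fitted values; a real model would estimate them from historical data.

```python
from statistics import NormalDist

# HYPOTHETICAL coefficients for illustration only; these are NOT KATOH's values.
INTERCEPT = 3.0
COEFFICIENTS = {
    "age": -0.15,            # older players at a given level score lower
    "iso": 4.0,              # more power raises the score
    "strikeout_rate": -3.0,  # more strikeouts lower the score
}


def probit_probability(features):
    """Map a player's features to a probability via the standard normal CDF."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in features.items())
    return NormalDist().cdf(z)


# Two made-up players with identical stats at the same level, differing only in age.
young = probit_probability({"age": 19, "iso": 0.150, "strikeout_rate": 0.20})
old = probit_probability({"age": 23, "iso": 0.150, "strikeout_rate": 0.20})
print(f"19-year-old: {young:.1%}   23-year-old: {old:.1%}")
```

Because the normal CDF flattens out at both ends, a player who already grades out near 99% gains very little from further improvement, while the same improvement matters much more for a player near 50%.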

After receiving a few requests, I decided to apply the model to players of years past. In what follows, I dive into what KATOH would have said about recent top prospects, look at the highest KATOH scores of the last 20 years, and highlight some instances where KATOH missed the boat on a prospect. If you’re feeling really ambitious, here’s a giant Google doc of KATOH scores for all 40,051 player seasons since 1995 (minimum 100 plate appearances in a short-season league or 200 in full-season ball).

Before I delve into the parade of lists, I want to point out one disclaimer to what I’m doing here. KATOH was derived from the performances of historical players, so applying the model to those same players might make it look a little better than it is. Take a player like Jason Stokes for example. Although he was a very well-regarded prospect in the early 2000s (#15 and #51 per Baseball America in 2003 and 2004), KATOH consistently gave him probabilities in the 70s and 80s. But part of that is likely because Stokes’ data points were incorporated into the model. If I had created KATOH in 2005, Stokes’ MLB% might have been a few percentage points higher. Even so, a few data points generally aren’t enough to substantially change a model that incorporates thousands. In other words, it’s probably safe to assume that a player’s MLB% using today’s KATOH is roughly in line with what he would have received at the time.

Now, on to the results. Here’s what KATOH thought about some of the most recent top 100 prospects:

2013 Top 100 Prospects

Player	Year	Age	Level	MLB Probability
Xander Bogaerts	2013	20	AA	99.888%
Xander Bogaerts	2013	20	AAA	99.869%
George Springer	2013	23	AAA	99.816%
Gregory Polanco	2013	21	AA	99.614%
Nick Castellanos	2013	21	AAA	99.608%
Kolten Wong	2013	22	AAA	99.428%
Wil Myers	2013	22	AAA	99.418%
Miguel Sano	2013	20	A+	99.335%
Tyler Austin	2013	21	AA	99.194%
Jackie Bradley	2013	23	AAA	99.079%
Kaleb Cowart	2013	21	AA	99%
Byron Buxton	2013	19	A+	98%
Francisco Lindor	2013	19	A+	98%
Christian Yelich	2013	21	AA	97%
Byron Buxton	2013	19	A	97%
Addison Russell	2013	19	A+	97%
Billy Hamilton	2013	22	AAA	96%
Brian Goodwin	2013	22	AA	96%
Carlos Correa	2013	18	A	96%
Slade Heathcott	2013	22	AA	96%
Javier Baez	2013	20	A+	95%
Jake Marisnick	2013	22	AA	95%
Albert Almora	2013	19	A	95%
Jonathan Singleton	2013	21	AAA	94%
Mike Zunino	2013	22	AAA	94%
Alen Hanson	2013	20	A+	94%
Gregory Polanco	2013	21	A+	92%
Javier Baez	2013	20	AA	91%
Jorge Soler	2013	21	A+	90%
Gary Sanchez	2013	20	A+	89%
Austin Hedges	2013	20	A+	89%
Mike Olt	2013	24	AAA	87%
Miguel Sano	2013	20	AA	83%
George Springer	2013	23	AA	82%
Mason Williams	2013	21	A+	78%
Trevor Story	2013	20	A+	61%
Bubba Starling	2013	20	A	61%
Courtney Hawkins	2013	19	A+	58%
Roman Quinn	2013	20	A	58%

2012 Top 100 Prospects

Player	Year	Age	Level	MLB Probability
Jurickson Profar	2012	19	AA	99.975%
Anthony Rizzo	2012	22	AAA	99.947%
Manny Machado	2012	19	AA	99.937%
Billy Hamilton	2012	21	AA	99.856%
Oscar Taveras	2012	20	AA	99.827%
Kolten Wong	2012	21	AA	99.824%
Nolan Arenado	2012	21	AA	99.759%
Leonys Martin	2012	24	AAA	99.737%
Nick Franklin	2012	21	AA	99.737%
Yasmani Grandal	2012	23	AAA	99.714%
Wil Myers	2012	21	AAA	99.659%
Andrelton Simmons	2012	22	AA	99.566%
Travis D’Arnaud	2012	23	AAA	99.512%
Jedd Gyorko	2012	23	AAA	99.493%
Hak-Ju Lee	2012	21	AA	99.492%
Jonathan Singleton	2012	20	AA	99.482%
Nick Castellanos	2012	20	AA	99.465%
Jonathan Schoop	2012	20	AA	99.443%
Jean Segura	2012	22	AA	99.423%
Nick Castellanos	2012	20	A+	99.051%
Starling Marte	2012	23	AAA	99.015%
Anthony Gose	2012	21	AAA	99%
Rymer Liriano	2012	21	AA	99%
Jake Marisnick	2012	21	AA	99%
Xander Bogaerts	2012	19	A+	98%
Michael Choice	2012	22	AA	98%
Gary Brown	2012	23	AA	98%
Christian Yelich	2012	20	A+	98%
Nick Franklin	2012	21	AAA	97%
Javier Baez	2012	19	A	97%
Brett Jackson	2012	23	AAA	96%
Zack Cox	2012	23	AAA	92%
Mason Williams	2012	20	A	91%
Gary Sanchez	2012	19	A	89%
Jake Marisnick	2012	21	A+	88%
Francisco Lindor	2012	18	A	88%
Cheslor Cuthbert	2012	19	A+	87%
Miguel Sano	2012	19	A	86%
Billy Hamilton	2012	21	A+	83%
George Springer	2012	22	A+	80%
Christian Villanueva	2012	21	A+	80%
Mike Olt	2012	23	AA	79%
Matt Szczur	2012	22	A+	78%
Rymer Liriano	2012	21	A+	76%
Blake Swihart	2012	20	A	66%
Cory Spangenberg	2012	21	A+	64%
Bubba Starling	2012	19	R	17%

2011 Top 100 Prospects

Player	Year	Age	Level	MLB Probability
Mike Trout	2011	19	AA	99.973%
Brett Lawrie	2011	21	AAA	99.969%
Anthony Rizzo	2011	21	AAA	99.911%
Wil Myers	2011	20	AA	99.654%
Christian Colon	2011	22	AA	99.495%
Brandon Belt	2011	23	AAA	99.414%
Austin Romine	2011	22	AA	99.393%
Jesus Montero	2011	21	AAA	99.379%
Devin Mesoraco	2011	23	AAA	99.205%
Brett Jackson	2011	22	AAA	99.199%
Dustin Ackley	2011	23	AAA	99.196%
Yonder Alonso	2011	24	AAA	99%
Lonnie Chisenhall	2011	22	AAA	99%
Zack Cox	2011	22	AA	98%
Jason Kipnis	2011	24	AAA	98%
Mike Moustakas	2011	22	AAA	98%
Desmond Jennings	2011	24	AAA	98%
Jonathan Villar	2011	20	AA	98%
Matt Dominguez	2011	21	AAA	98%
Jurickson Profar	2011	18	A	97%
Bryce Harper	2011	18	A	97%
Tony Sanchez	2011	23	AA	97%
Dee Gordon	2011	23	AAA	97%
Grant Green	2011	23	AA	97%
Manny Machado	2011	18	A+	97%
Nolan Arenado	2011	20	A+	96%
Chris Carter	2011	24	AAA	96%
Travis D’Arnaud	2011	22	AA	96%
Wilmer Flores	2011	19	A+	95%
Jose Iglesias	2011	21	AAA	95%
Hak-Ju Lee	2011	20	A+	94%
Brett Jackson	2011	22	AA	93%
Jonathan Singleton	2011	19	A+	92%
Joe Benson	2011	23	AA	91%
Gary Sanchez	2011	18	A	86%
Wilin Rosario	2011	22	AA	86%
Nick Castellanos	2011	19	A	85%
Nick Franklin	2011	20	A+	83%
Jean Segura	2011	21	A+	82%
Cesar Puello	2011	20	A+	82%
Derek Norris	2011	22	AA	76%
Jonathan Villar	2011	20	A+	73%
Aaron Hicks	2011	21	A+	68%
Billy Hamilton	2011	20	A	61%
Miguel Sano	2011	18	R	44%
Josh Sale	2011	19	R	15%

Next, let’s take a look at some of the highest KATOH scores of all time, namely those of players who received a score of at least 99.9%. There aren’t any complete busts among these players, as virtually all of them went on to play in the majors.

All-Time Top KATOH Scores

Player	Year	Age	Level	MLB Probability
Sean Burroughs	2000	19	AA	99.998%
Luis Castillo	1996	20	AA	99.995%
Fernando Martinez	2007	18	AA	99.994%
Daric Barton	2005	19	AA	99.992%
Alex Rodriguez	1995	19	AAA	99.992%
Carl Crawford	2001	19	AA	99.992%
Elvis Andrus	2008	19	AA	99.992%
Adam Dunn	2001	21	AAA	99.990%
Joe Mauer	2003	20	AA	99.989%
Ryan Sweeney	2005	20	AA	99.984%
Nick Johnson	1999	20	AA	99.984%
Jose Tabata	2009	20	AA	99.983%
Jose Tabata	2008	19	AA	99.983%
Travis Snider	2009	21	AAA	99.981%
Joaquin Arias	2005	20	AA	99.980%
Matt Kemp	2006	21	AAA	99.979%
Jose Reyes	2002	19	AA	99.979%
Jurickson Profar	2012	19	AA	99.975%
Mike Trout	2011	19	AA	99.973%
Jay Bruce	2008	21	AAA	99.971%
Brett Lawrie	2011	21	AAA	99.969%
B.J. Upton	2004	19	AAA	99.959%
Howie Kendrick	2006	22	AAA	99.951%
Ryan Howard	2005	25	AAA	99.951%
Dioner Navarro	2004	20	AA	99.950%
Luis Rivas	1999	19	AA	99.949%
Lastings Milledge	2005	20	AA	99.948%
Anthony Rizzo	2012	22	AAA	99.947%
Billy Butler	2006	20	AA	99.946%
Fernando Martinez	2008	19	AA	99.944%
Alberto Callaspo	2004	21	AA	99.944%
Jose Lopez	2003	19	AA	99.939%
Freddie Freeman	2010	20	AAA	99.939%
Manny Machado	2012	19	AA	99.937%
Rickie Weeks	2005	22	AAA	99.935%
Casey Kotchman	2004	21	AAA	99.932%
Eric Chavez	1998	20	AAA	99.930%
Adrian Beltre	1998	19	AA	99.927%
Shannon Stewart	1995	21	AA	99.917%
Anthony Rizzo	2011	21	AAA	99.911%
Karim Garcia	1995	19	AAA	99.910%
Jay Bruce	2007	20	AAA	99.907%
Jeff Clement	2008	24	AAA	99.902%
Miguel Cabrera	2003	20	AA	99.900%

All of the players who registered a KATOH score of at least 99.9% did so while playing in either Double- or Triple-A. This isn’t all that surprising since these are the levels closest to the big leagues. But what about the lower levels? As we saw in Double- and Triple-A, there weren’t any complete busts among the highest-ranking hitters from full-season A-ball. For both full-season levels, each of the 20 top-ranked players has either made it to the majors or, in the case of Carlos Correa, is young enough to still have an excellent chance to do so. But on the bottom two rungs of the minor league ladder, we come across a few instances where KATOH whiffed, most notably in Garrett Guzman (74%), Richard Stuart (72%), and Pat Manning (72%).

Top KATOH Scores for Seasons in High-A

Player	Year	Age	Level	MLB Probability
Adrian Beltre	1997	18	A+	99.863%
Andruw Jones	1996	19	A+	99.568%
Giancarlo Stanton	2009	19	A+	99.405%
Billy Butler	2005	19	A+	99.348%
Miguel Sano	2013	20	A+	99.335%
Chris Snelling	2001	19	A+	99.241%
Jason Heyward	2009	19	A+	99.097%
Andy LaRoche	2005	21	A+	99.091%
Wilmer Flores	2010	18	A+	99.075%
Nick Castellanos	2012	20	A+	99.051%
Jose Reyes	2002	19	A+	99%
Casey Kotchman	2003	20	A+	99%
Vernon Wells	1999	20	A+	99%
Travis Lee	1997	22	A+	99%
Brandon Wood	2005	20	A+	98%
Xander Bogaerts	2012	19	A+	98%
Justin Huber	2003	20	A+	98%
Aramis Ramirez	1997	19	A+	98%
Jay Bruce	2007	20	A+	98%
Byron Buxton	2013	19	A+	98%

Top KATOH Scores for Seasons in Low-A

Player	Year	Age	Level	MLB Probability
Mike Trout	2010	18	A	99%
Adrian Beltre	1996	17	A	98%
Jurickson Profar	2011	18	A	97%
Bryce Harper	2011	18	A	97%
Sean Burroughs	1999	18	A	97%
Andruw Jones	1995	18	A	97%
Byron Buxton	2013	19	A	97%
Jason Heyward	2008	18	A	97%
Corey Patterson	1999	19	A	97%
Vladimir Guerrero	1995	20	A	97%
Javier Baez	2012	19	A	97%
Ian Stewart	2004	19	A	96%
Lastings Milledge	2004	19	A	96%
Carlos Correa	2013	18	A	96%
Prince Fielder	2003	19	A	96%
Delmon Young	2004	18	A	96%
Josh Vitters	2009	19	A	96%
Chad Hermansen	1996	18	A	95%
Wilmer Flores	2010	18	A	95%
B.J. Upton	2003	18	A	95%

Top KATOH Scores for Seasons in Short-Season A

Player	Year	Age	Level	MLB Probability	Played in Majors
Chris Snelling	1999	17	A-	82%	1
Richard Stuart	1996	19	A-	72%	0
Aramis Ramirez	1996	18	A-	71%	1
Ryan Kalish	2007	19	A-	71%	1
Cory Spangenberg	2011	20	A-	66%	0
Hanley Ramirez	2002	18	A-	66%	1
Wilson Betemit	2000	18	A-	65%	1
Ismael Castro	2002	18	A-	65%	0
Vernon Wells	1997	18	A-	64%	1
Carlos Figueroa	2000	17	A-	61%	0
Carson Kelly	2013	18	A-	61%	0
Pablo Sandoval	2005	18	A-	60%	1
Dan Vogelbach	2012	19	A-	59%	0
Manny Ravelo	2000	18	A-	57%	0
Chip Ambres	1999	19	A-	57%	1
Maikel Franco	2011	18	A-	55%	0
Jurickson Profar	2010	17	A-	55%	1
Derek Norris	2008	19	A-	54%	1
Cesar Saba	1999	17	A-	54%	0
Edinson Rincon	2009	18	A-	52%	0

Top KATOH Scores for Seasons in Rookie ball

Player	Year	Age	Level	MLB Probability	Played in Majors
Jeff Bianchi	2005	18	R	76%	1
Justin Morneau	2000	19	R	74%	1
Addison Russell	2012	18	R	74%	0
Garrett Guzman	2001	18	R	74%	0
James Loney	2002	18	R	74%	1
Prince Fielder	2002	18	R	73%	1
Pat Manning	1999	19	R	72%	0
Wilmer Flores	2008	16	R	70%	1
Alex Fernandez	1998	17	R	70%	0
Dorssys Paulino	2012	17	R	69%	0
Tony Blanco	2000	18	R	69%	1
Hank Blalock	1999	18	R	69%	1
Joe Mauer	2001	18	R	69%	1
Hanley Ramirez	2002	18	R	69%	1
Ramon Hernandez	1995	19	R	68%	1
Angel Salome	2005	19	R	68%	1
Marcos Vechionacci	2004	17	R	67%	0
Gary Sanchez	2010	17	R	66%	0
Scott Heard	2000	18	R	65%	0
Jose Tabata	2005	16	R	65%	1

Now for KATOH’s biggest whiffs. Looking at seasons prior to 2011, the following players had very high KATOH ratings, but never made it to baseball’s highest level. The biggest miss was Cesar King, a defensive-minded catcher from the Rangers organization. Though to KATOH’s credit, King did spend five days on the Kansas City Royals’ roster in 2001 without getting into a game. Following King are a couple of busted Yankees prospects in Jackson Melian and Eric Duncan. Not to make excuses for KATOH, but these guys’ high scores may have had something to do with the way the Yankees over-hyped their prospects back then. If those two weren’t on Baseball America’s top 100 list, KATOH would have pegged them in the 70’s, rather than in the high-90’s.

KATOH’s Biggest Misses

Player	Year	Age	Level	MLB Probability
Cesar King	1998	20	AA	99.427%
Jackson Melian	2000	20	AA	99%
Eric Duncan	2005	20	AA	98%
Matt Moses	2006	21	AA	98%
Juan Williams	1995	21	AA	98%
Jeff Natale	2005	22	AA	97%
Eric Duncan	2006	21	AA	97%
Nick Weglarz	2010	22	AAA	96%
Nick Weglarz	2009	21	AA	96%
Tony Mota	1999	21	AA	95%
Micah Franklin	1998	26	AAA	94%
Billy Martin	2003	27	AAA	94%
Bill McCarthy	2004	24	AAA	94%
Jackson Melian	1999	19	A+	94%
Tagg Bozied	2004	24	AAA	94%
Kevin Grijak	1995	23	AAA	93%
Angel Villalona	2008	17	A	93%
Danny Dorn	2010	25	AAA	93%
Nic Jackson	2003	23	AAA	92%
Pat Cline	1997	22	AA	92%

And here are the major leaguers who KATOH deemed least likely to make it when they were in the minors. It’s worth noting that a couple of them, Jorge Sosa and Jason Roach, made it as pitchers.

Worst KATOH Scores Who Made it to the Majors

Player	Year	Age	Level	MLB Probability
Justin Christian	2004	24	A-	0.017%
Jorge Sosa	1999	21	A-	0.027%
Tyler Graham	2006	22	A-	0.087%
Gary Johnson	1999	23	A-	0.136%
Bo Hart	1999	22	A-	0.155%
Tommy Manzella	2005	22	A-	0.181%
Michael Martinez	2006	23	A-	0.185%
Eddy Rodriguez	2012	26	A+	0.194%
Kevin Mahar	2004	23	A-	0.215%
Will Venable	2005	22	A-	0.232%
Brent Dlugach	2004	21	A-	0.268%
Sean Barker	2002	22	A-	0.270%
Steve Holm	2002	22	A-	0.301%
Edgar V. Gonzalez	2000	22	A-	0.315%
Peter Zoccolillo	1999	22	A-	0.328%
Konrad Schmidt	2007	22	A-	0.337%
Tommy Medica	2010	22	A-	0.365%
Brian Esposito	2008	29	AA	0.392%
Jason Roach	1997	21	A-	0.396%
Jorge Sosa	2000	22	A-	0.439%

KATOH’s far from perfect, but overall, I think it does a pretty decent job of forecasting which players will make it to the majors. That being said, it’s still a work in progress, and I have a few ideas rolling around in my head to improve on the model. Furthermore, I’m working to develop something that will forecast how a minor leaguer will perform upon reaching the majors, to complement his MLB%. I’ll be dropping these new and improved KATOH projections (for both hitters and pitchers) after this year’s World Series, when we’ll all be desperate for something baseball-related to get us through the winter.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.

Using xBABIP to Examine the Offensive End of the Mets’ Shortstop Dilemma

by Senor_Met

August 8, 2014

It’s no secret that a vast majority of Mets fans want Wilmer Flores to be playing shortstop every day. It’s also no secret that manager Terry Collins has some strange infatuation with Ruben Tejada, opting again and again to give him starts at shortstop.

Although Collins hasn’t given the media any clear reasoning as to why this is, we can speculate about a few reasons. The biggest one is defense: Ruben Tejada has made major strides at shortstop this season, posting the highest DRS of his career. Flores, on the other hand, is a second baseman, and even his defense at second is questionable; he really profiles more as a corner infielder. However, with the other three infield positions being blocked by Daniel Murphy, David Wright, and the new-and-improved Lucas Duda, Flores is the odd man out.

The other side of the coin is the one I’m going to be focusing on: offense. When Tejada started getting regular playing time as a 21-year-old in 2011, he showed some legitimate offensive potential, hitting line drives at an extremely impressive 28.1% rate (which would have ranked 2nd among qualified batters), good for .287/.345/.345 in 877 PAs between 2011 and 2012. Then, in 2013, he came to spring training out of shape, hit .202, got sent down, got hurt a couple times, and basically threw yet another monkey wrench into the Mets’ rebuild. The job became his to lose in 2014, and he’s hit a measly .228/.348/.280, with even the OBP inflated by the number of intentional walks he has received in the 8 hole. His 0.4 fWAR this season cancels out his -0.4 last season, making him a perfect replacement-level player.

Meanwhile, Wilmer Flores has been a top offensive prospect in the Mets system since he was signed out of Venezuela as a 16-year-old in 2007. His numbers finally started to reflect his talent in 2012, when he hit .300/.349/.479 between high A and AA. In 2013, he exploded in AAA, and over the past two seasons he has hit .321/.360/.543 with 28 home runs and 47 doubles in exactly 162 games. Sure, he plays in Vegas, one of the most hitter-friendly parks in AAA, but these are still numbers that demand attention, attention that he hasn’t yet seemed to receive from Terry Collins. Despite Tejada’s offensive struggles, he has still started 86 games at short this season, as opposed to Flores’ 20. One of the reasons a few Mets fans have been pointing to is the fact that Flores has yet to actually produce at the major league level, hitting only .220/.254/.304 in his 201 big league plate appearances. But is that slash line an accurate reflection of his talent? And, for that matter, is Tejada’s?

For this mini-evaluation, we’ll use slash12’s xBABIP formula. It’s never a perfect system, but it will give us a good estimation of what these players’ slash lines should look like (or at least their average and OBP).

After inserting Ruben Tejada’s batted ball profile, we get that his xBABIP for 2014 is .329, much higher than his actual BABIP of .288. We can then plug that backwards into the BABIP formula to determine how many hits he theoretically should have. Since the formula is (H-HR)/(AB-HR-K+SF), we can plug in everything except for hits to get (H-2)/(289-2-65+0)=.329, simplify that to (H-2)/(222)=.329, and multiply both sides by 222 to get H-2=73. Adding the two home runs back gives H=75, so Ruben Tejada should have 75 hits on the year, instead of the 66 he has. This would make his batting average .260 (75/289). Holding his walks and hit-by-pitches constant, the nine extra hits would lift his OBP from .348 to about .374 (although, keep in mind that that’s being inflated by the 10 intentional walks he’s had while hitting 8th in the order. If we decided to remove those, his OBP would drop to about .355).

Now, doing the same to Wilmer Flores is slightly tricky, as we don’t have nearly as large a sample of batted ball data to use. In the interest of accuracy, we’ll use his career profile, so we can at least get a sample of 201 PAs instead of his 100 this year. Plugging his batted ball profile into the xBABIP calculator, we get a result of .333, compared to his actual career BABIP of .268. Doing the same backwards math we did with Tejada, this brings his expected career batting average up to .272, and his career OBP up to .304.

Now, these are only two stats, and they only tell us so much. Flores seems to be a better hitter, but his career 4.5% BB rate is clearly outmatched by Tejada’s. There isn’t a formula out there for expected slugging percentage (at least, as far as I know), so we can’t really determine what that would be (and subsequently, what their OPS would be). We could assume the same ISO, which would not be entirely accurate, but it would give us a .312 slugging percentage and .686 OPS for Tejada and a .355 slugging percentage and .659 OPS for Flores. Still, I think it’s clear, both from my biased perspective as a Mets fan and my objective perspective as a baseball fan, that Flores has the brighter future offensively, but it’s up to the Mets to decide how to capitalize on it.
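The back-calculation used for both hitters can be sketched in a few lines of Python. This is only a minimal sketch of the algebra, not the xBABIP calculator itself; the function name `expected_hits` is an assumption made for illustration, and the inputs are Tejada’s 2014 figures as quoted above. Because the numerator of BABIP is H-HR, the home runs have to be added back after multiplying through by the denominator.

```python
def expected_hits(xbabip, ab, hr, k, sf):
    """Invert BABIP = (H - HR) / (AB - HR - K + SF) to solve for H."""
    balls_in_play = ab - hr - k + sf   # denominator of the BABIP formula
    return xbabip * balls_in_play + hr  # add the home runs back to get total hits

# Tejada's 2014 inputs as quoted in the text (289 AB, 2 HR, 65 K, 0 SF, .329 xBABIP)
at_bats = 289
hits = expected_hits(xbabip=0.329, ab=at_bats, hr=2, k=65, sf=0)
expected_avg = hits / at_bats

print(f"expected hits: {hits:.1f}, expected average: {expected_avg:.3f}")
```

The same function applies to Flores once his career at-bats, home runs, strikeouts, and sacrifice flies are plugged in.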

Foundations of Batting Analysis: Part 4 — Storytelling with Context

by Colin Dew-Becker

August 7, 2014

Examining the foundations of batting analysis began in Part 1 with an historical examination of the earliest statistics designed to examine the performance of batters. In Part 2, I presented a new method for calculating basic averages reflecting the “real and indisputable” rate at which batters reached base. In Part 3, I examined the development of run estimation techniques over the last century, culminating with the linear weights system. I will employ that system now as I reconstruct run estimation from the bottom up.

We use statistics in baseball to tell stories. Statistics describe the action of the game or the performance of players over a period of time. Statistics inform us of how much value a player provided or how much skill a player showed in comparison to other players. To tell such stories successfully, we must understand how the statistics we use are constructed and what they actually represent.

A single, for instance, seems simple enough at first glance. However, there are details in its definition that we sometimes gloss over. In general, a single is any event in which the batter puts the ball into play without causing an out, while showing an accepted form of batting effectiveness (reaching on a hit), and ultimately advancing to first base due to the primary action of the event (before any secondary fielding errors or advancement on throws to other bases). Though this is specific in many regards, it is still quite a broad definition for a batting event. The event could occur in any inning, following any number of outs, and with any number of runners on the bases. The ball could be hit in any direction, with any speed and trajectory, and result in any number of baserunners advancing any number of bases.

These kinds of details form the contextual backdrop that characterizes all batting events. When we construct a statistic to evaluate these events, we choose what level of contextual detail we want to consider. These choices define our analysis and are critical in developing the story we want to tell. For instance, most statistics built to measure batting effectiveness—from the simple counting statistics like hits and walks, to advanced run estimators like Batter Runs or weighted On Base Average (wOBA)—are constructed to be independent of the “situational context” in which the events occur. That is, it doesn’t matter when during the game a hit is made or if there are any outs or any runners on the bases at the time it happens. As George Lindsey noted in 1963, “the measure of the batting effectiveness of an individual should not depend on the situations that faced him when he came to the plate.”

Situational context is the most commonly cited form of contextual detail. When a statistic is described as “context neutral,” the context being removed is very often the one describing the out/base state before and after the event and the inning in which it occurred. However, there are other contextual details that characterize the circumstances and conditions in which batting events occur that also tend to be removed from consideration when analyzing their value. Historically, where the ball was hit, as well as the speed and trajectory which it took to reach that location, have also not been considered when judging the effectiveness of batters. This has partly been due to the complexity of tracking such things, especially in the century of baseball recordkeeping before the advent of computers. Also, most historical batting analyses focus exclusively on the outcome for the batter, independent of the effect on other baserunners. If the batter hits the ball four feet or 400 feet but still only reaches first base, there is no difference in the personal outcome that he achieved.

If the value of a hit was limited to only how far the batter advances, then there would be no need to consider the “batted-ball context,” but as F.C. Lane observed in 1916, part of the value of making a hit is in the effect on the “runner who may already be upon the bases.” By removing the batted-ball context when considering types of events in which the ball is put into play, we’re assuming that a four-foot single and 400-foot single have the same general effect on other baserunners. For some analyses, this level of contextual detail describing an event may be irrelevant or insignificant, but for others—particularly when estimating run production—such a level of detail is paramount.

Let’s employ the linear weights method for estimating run production, but allow the estimation to vary from one completely independent of any contextual detail to one as detailed as we can make it. In this way, we’ll be able to observe how various details impact our valuation of events. Also, in situations where we are only given a limited amount of information about batting events, it will allow us to make cursory estimations of how much they caused their team’s run expectancy to change.

To begin, let’s define the run-scoring environment for 2013.[i] While we have focused on context concerning how events transpired on the field, the run scoring environment is another kind of contextual detail that characterizes how we evaluate those events. The exact same event in 2013 may not have caused the same change in run expectancy as it would have in 2000 when runs were scored at a different rate. We will define the run scoring environment for 2013 as the average number of runs that scored in an inning following a plate appearance in each of the 24 out/base states – a 2013-specific form of George Lindsey’s run expectancy matrix:

Base State	0 OUT	1 OUT	2 OUT
0	0.47	0.24	0.09
1	0.82	0.50	0.21
2	1.09	0.62	0.30
3	1.30	0.92	0.34
1-2	1.39	0.84	0.41
1-3	1.80	1.11	0.46
2-3	2.00	1.39	0.56
1-2-3	2.21	1.57	0.71

While we will focus on examining various levels of contextual detail concerning the events themselves, the run-scoring environment can also be varied based on contextual details concerning the scoring of runs. The matrix we will employ, as defined by Lindsey, reflects the average number of runs scored across the entire league. If we wanted, we could differentiate environments by league or park, among other things, to try and reflect a more specific estimate of the number of runs produced. As the work I’m going to present is meant to provide a general framework for run estimation, and these adjustments are not trivial, I’m going to stick with the basic model provided by Lindsey.
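To make the matrix concrete, here is a small Python sketch of how a single event’s change in run expectancy can be read off it. It assumes the usual linear weights bookkeeping (run expectancy after the event, minus run expectancy before it, plus any runs that scored); the names `RE_2013`, `run_expectancy`, and `delta_re` are illustrative and do not come from the article.

```python
# 2013 run expectancy by base state and outs, copied from the table above.
# Key "0" means bases empty; tuple index is the number of outs (0, 1, 2).
RE_2013 = {
    "0":     (0.47, 0.24, 0.09),
    "1":     (0.82, 0.50, 0.21),
    "2":     (1.09, 0.62, 0.30),
    "3":     (1.30, 0.92, 0.34),
    "1-2":   (1.39, 0.84, 0.41),
    "1-3":   (1.80, 1.11, 0.46),
    "2-3":   (2.00, 1.39, 0.56),
    "1-2-3": (2.21, 1.57, 0.71),
}

def run_expectancy(bases, outs):
    """Average runs scored in the rest of the inning; a third out ends it."""
    return 0.0 if outs >= 3 else RE_2013[bases][outs]

def delta_re(before, after, runs_scored=0):
    """Change in run expectancy for one event, including runs that scored on it."""
    return run_expectancy(*after) - run_expectancy(*before) + runs_scored

leadoff_single = delta_re(("0", 0), ("1", 0))                 # 0.82 - 0.47
solo_home_run = delta_re(("0", 0), ("0", 0), runs_scored=1)   # state unchanged, one run in
inning_ending_out = delta_re(("1-2-3", 2), ("0", 3))          # bases loaded, third out

print(round(leadoff_single, 2), round(solo_home_run, 2), round(inning_ending_out, 2))
```

Averaging these changes over every occurrence of an event type is what produces a single run value for that event.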

With Lindsey’s tool, we can define a pair of statistics for general analysis of run production. Expected Runs (xR) reflect the estimated change in a team’s run expectancy caused by a batter’s plate appearances independent of the situational context in which they occur. A batter’s expected Run Average (xRA) is the rate per plate appearance at which he produces xR.

xRA = Expected Runs / Plate Appearances = xR / PA

xR and xRA create a framework for estimating situation-neutral run production. Based on the contextual specificity that is used to describe the action of a plate appearance, xR and xRA will yield various estimations. The base case for calculating expected runs, xR₀, is calculated independently of any contextual detail, considering only that a plate appearance occurred. By definition, an average plate appearance will cause no change in a team’s run expectancy. Consequently, no matter a player’s total number of plate appearances, his xR₀ and, by extension, his xRA₀, will be 0.0.

This is completely uninformative of course, as base cases often are. So let’s add our first layer of contextual specificity by noting whether an out occurred due to the action of the plate appearance. This is the most significant contextual detail that we consider when evaluating batting events – it is the only factor that determines whether a plate appearance increases or decreases a team’s run expectancy. In 2013, 67.5 percent of all plate appearances resulted in at least one out occurring. On average, those events caused a team’s run expectancy to decrease by .252 runs. The 32.5 percent of plate appearances in which an out did not occur caused a team’s run expectancy to increase by .524 runs on average. We’ll define xR₁ as the estimated change in run expectancy based exclusively on whether the batter reached base without causing an out; xRA₁ is the rate at which a batter produced xR₁ per plate appearance.
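As a quick check on these figures, the sketch below computes xR₁ and xRA₁ from nothing more than counts of reaches and outs. The function names and the second, better-than-average batter are hypothetical; the two run values are the 2013 figures quoted above.

```python
OUT_VALUE = -0.252    # average change in run expectancy when an out occurs (2013)
REACH_VALUE = 0.524   # average change when the batter reaches without an out (2013)

def xr1(reaches, outs):
    """Expected runs using only whether each plate appearance produced an out."""
    return reaches * REACH_VALUE + outs * OUT_VALUE

def xra1(reaches, outs):
    """xR1 per plate appearance."""
    return xr1(reaches, outs) / (reaches + outs)

# League-average mix per 1,000 plate appearances: 32.5% reaches, 67.5% outs.
league = xra1(reaches=325, outs=675)

# Hypothetical batter who reaches in 40% of his plate appearances.
hypothetical = xra1(reaches=400, outs=600)

print(f"league xRA1: {league:.4f}, hypothetical xRA1: {hypothetical:.4f}")
```

The league figure comes out at essentially zero, as it must, since an average plate appearance causes no change in run expectancy; the small residual is rounding in the published values.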

You’ll notice that the components that construct xRA₁ can only take on two values—.524 and -.252—in the same way that the components that construct effective On Base Average (eOBA) (as defined in Part 2) can only take on two values—1 and 0. These statistics—xRA₁ and eOBA—have a direct linear correlation:
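Written out from the two 2013 run values above, the relationship takes the following form. This is a reconstruction from the figures quoted in the text, and it assumes eOBA is the share of plate appearances in which the batter reaches base without causing an out.

```latex
\begin{aligned}
\mathrm{xRA}_1 &= 0.524\,\mathrm{eOBA} - 0.252\,(1 - \mathrm{eOBA}) \\
               &= 0.776\,\mathrm{eOBA} - 0.252
\end{aligned}
```

Setting xRA₁ to zero gives an eOBA of .252/.776, or about .325, which matches the 32.5 percent of plate appearances in which no out occurred.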

In effect, xRA₁ is a weighted version of eOBA, incorporating the same contextual details but on a different scale. This estimation provides us with an association between reaching base safely and producing runs. However, the lack of detail would suggest that all players that reach base at the same rate produce the same value, which is oversimplified. It’s why you wouldn’t just use eOBA, or eBA, or any other basic statistic that reflects the rate at which a batter reaches base, when judging the performance of a batter. Let’s add another layer of contextual detail to account for the different kinds of value a batter provides when he reaches base.

xR₂ will represent the estimated change in run expectancy based on whether the batter safely reached base and the number of bases to which he advanced due to the action of the plate appearance; xRA₂ will be the rate at which a batter produces xR₂ per plate appearance. While xR₁ and xRA₁ were built with just two components to estimate run production, xR₂ and xRA₂ require five components: one to define the value of an out, and four to define the value of safely reaching each base.

In 2013, a batter safely reaching first base during a plate appearance caused an average increase of .389 runs to his team’s run expectancy. Reaching second base was worth .748 runs, third base was worth 1.026 runs, and reaching home was worth 1.377 runs on average. Where xRA₁ provided a run estimation analog to eOBA, xRA₂ is built with very similar components to effective Total Bases Average (eTBA), though it’s not quite a direct linear correlation:

The reason xRA₂ and eTBA do not correlate with each other perfectly, like xRA₁ and eOBA, is because the way in which a batter advances bases is significant in determining how valuable his plate appearances were. Consider two players that each had two plate appearances: Player A hit a home run and made an out, Player B reached second base twice. Their eTBA would be identical—2.000—as they each reached four bases in two plate appearances. However, from the run values associated with reaching those bases, Player A would record 1.125 xR₂ from his home run and out, while Player B would record 1.496 xR₂ from the two plate appearances leaving him on second base. Consequently, Player A would have produced a lower xRA₂ (.5625) than Player B (.7480), despite their having the same eTBA. These effects tend to average out over a large enough sample of plate appearances, but they will still cause variations in xRA₂ among players with the same eTBA.
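The two-player comparison can be reproduced with a short Python sketch. It assumes, as the example does, that eTBA is simply bases reached per plate appearance; the dictionary and function names are illustrative, and the run values are the five 2013 components given above (with 0 standing for an out).

```python
# 2013 run values keyed by the base the batter reached (0 = he made an out).
RUN_VALUES_2013 = {0: -0.252, 1: 0.389, 2: 0.748, 3: 1.026, 4: 1.377}

def etba(bases_reached):
    """Bases reached per plate appearance."""
    return sum(bases_reached) / len(bases_reached)

def xr2(bases_reached):
    """Expected runs from outs and the base reached in each plate appearance."""
    return sum(RUN_VALUES_2013[b] for b in bases_reached)

def xra2(bases_reached):
    """xR2 per plate appearance."""
    return xr2(bases_reached) / len(bases_reached)

player_a = [4, 0]   # a home run and an out
player_b = [2, 2]   # reached second base twice

for name, pas in (("A", player_a), ("B", player_b)):
    print(name, f"eTBA={etba(pas):.3f}", f"xR2={xr2(pas):.3f}", f"xRA2={xra2(pas):.4f}")
```

Both players show an eTBA of 2.000, but the home run plus the out is worth less than two trips to second base, which is why the xRA₂ figures diverge.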

As stated in Part 2, the two main objectives of batters are to not cause an out and to advance as many bases as possible. If the only value that batters produced came from accomplishing these objectives, then we would be done – xR₂ and xRA₂ would reflect the perfect estimations of situation-neutral run production. As I hope is clear, though, the value of a batting event is dependent not only on the outcome for the batter but on the impact the event had on all other runners on base at the time it occurred. Different types of events that result in the batter reaching the same base can have different average effects on other baserunners. For instance, a single and a walk both leave the batter on first base, but the former creates the opportunity for baserunners to advance further on average than the latter. To address this, the next layer of contextual detail will bring the official scorer into the fray. xR₃ will represent the estimated change in run expectancy produced during a batter’s plate appearance based on:

(1) whether the batter safely reached base,

(2) the number of bases, if any, to which the batter advanced due to the action of the plate appearance, and

(3) the type of event, as defined by the official scorer, that caused him to reach base or cause an out.

xRA₃ will, as always, be the rate at which a batter produces xR₃ per plate appearance.

Each of the run estimators that were examined in Part 3, from F.C. Lane’s methods through wOBA, is a subset of this level of xR. Expected runs incorporate estimations of the value produced during every event in which the batter was involved, including those which may be considered “unskilled.” The run estimators examined in Part 3 consider only those events that reflected a batter’s “effectiveness,” and either disregard the “ineffective” events or treat them as failures. xR₃ provides the total value produced by a batter, independent of the effectiveness he showed while producing it, based solely on how the official scorer defines the events. Consequently, some events, like strikeouts, sacrifice bunts, reaches on catcher’s interference, and failed fielder’s choices, among other more obscure occurrences, are examined independently in xR₃. From the two components of xR₁ and the five of xR₂, we build xR₃ with 18 components: five types of outs and 13 types of reaches.

To help illustrate how xR has progressed from level to level, here is a chart reflecting the run values for 2013 as estimated by xR based on the contextual detail provided thus far.

xR Progression

Beyond any consideration of skilled or unskilled production, xR₃ is the level at which most run estimators are constructed. It incorporates events that are well defined in the Official Rules of the game, and have been for at least the last few decades, and in some cases for over a century. While we still define most of a batter’s production by his accomplishing these events, we live in an era where we can differentiate between events on the field in more specific ways. Not all singles are identical events. We weaken our estimation of run production if we don’t account for the different kinds of singles, among other events, that can occur. xR₃ brought the official scorer into action; xR₄ will do the same with the stat stringer.

While the scorer is concerned with the result of an event, a stringer pays attention to the action in between the results. They chart the type, speed, and location of every pitch, and note the batted ball type (bunt, groundball, line drive, flyball, pop up) [ii] and the location to which the ball travels when put into play. While we don’t have this data as far back in time as we have result data, we do have decades worth of information concerning these details. By differentiating events based on these details, we will begin to unravel the “batted-ball context.” Ideally, we would know every detail of the flight of the ball, and use this to group together the most similar possible type of events for comparison.[iii] At present, we’re limited to what the scorers and stringers provide, but that’s still quite a lot of information.

xR₄ will represent the estimated change in run expectancy produced during a batter’s plate appearance based on:

(1) whether the batter safely reached base,

(2) the number of bases, if any, to which the batter advanced due to the action of the plate appearance,

(3) the type of event, as defined by the official scorer, that caused him to reach base or make an out,

(4) the type of batted ball, if there was one, as defined by the stat stringer, that resulted from the plate appearance,

(5) the direction in which the ball travelled, and

(6) whether the ball was fielded in the infield or outfield.

xRA₄ will be the rate at which a batter produces xR₄ per plate appearance.

There are 18 components in xR₃ which describe the assorted types of general events a batter can create. When you add in these details concerning the batted-ball context, the number of components increases to 145 for xR₄. With such specific details being considered, we can no longer rely on a single season of data to accurately inform us on the average situation in which each type of event occurs; the sample sizes for some events are just too small. To address this, there are two steps required in evaluating events for xR₄. The first is to gather a large sample of each event to build an accurate picture of their relative frequency in each out/base state. I’ve done this by using a sample covering the ten seasons up to and including the one in which the estimations are being made. Once this step is completed, the run-scoring environment in the season being analyzed is applied to these frequencies, in the same way it is when looking at single season frequencies for basic events.
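One way to read the two steps is sketched below in Python. This is an interpretation rather than the author’s actual procedure: the transitions in `hypothetical_sample` are invented purely for illustration, the function names are assumptions, and only the run expectancy matrix comes from the article.

```python
from collections import Counter

# 2013 run expectancy by base state and outs (from the matrix earlier in the article).
RE_2013 = {
    "0":     (0.47, 0.24, 0.09),
    "1":     (0.82, 0.50, 0.21),
    "2":     (1.09, 0.62, 0.30),
    "3":     (1.30, 0.92, 0.34),
    "1-2":   (1.39, 0.84, 0.41),
    "1-3":   (1.80, 1.11, 0.46),
    "2-3":   (2.00, 1.39, 0.56),
    "1-2-3": (2.21, 1.57, 0.71),
}

def run_expectancy(bases, outs):
    """Average runs scored in the rest of the inning; a third out ends it."""
    return 0.0 if outs >= 3 else RE_2013[bases][outs]

def event_run_value(transitions):
    """Average change in run expectancy for one narrowly defined event type."""
    # Step 1: relative frequency of each (before, after, runs) transition
    # in the multi-season sample.
    counts = Counter(transitions)
    total = sum(counts.values())
    # Step 2: value each transition in the target season's run environment
    # and weight it by that frequency.
    value = 0.0
    for (before, after, runs), n in counts.items():
        change = run_expectancy(*after) - run_expectancy(*before) + runs
        value += (n / total) * change
    return value

# HYPOTHETICAL multi-season sample for a single event type:
# (state before, state after, runs scored on the play)
hypothetical_sample = [
    (("0", 0), ("1", 0), 0),
    (("0", 0), ("1", 0), 0),
    (("1", 1), ("1-2", 1), 0),
    (("2", 2), ("1", 2), 1),
]

value = event_run_value(hypothetical_sample)
print(f"estimated run value: {value:.4f}")
```

Separating the frequencies from the run environment is what lets a rare event borrow sample size from earlier seasons while still being valued in the scoring conditions of the season under study.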

For instance, the single, which is traditionally treated as just one type of event, is broken into 24 parts based on the contextual details listed above. By observing the rate at which each of these 24 variations of singles occurred in each out/base state from 2004 through 2013, and applying the 2013 run-scoring environment, we get the following breakdown for the estimated value of singles in 2013:

Single	Left	Center	Right	All
Bunt, Infield	.418	.451	.436	.427
Groundball, Infield	.358	.361	.384	.363
Pop Up, Infield	.391	.359	.398	.369
Line Drive, Infield	.343	.369	.441	.369
Groundball, Outfield	.463	.464	.499	.474
Pop Up, Outfield	.483	.480	.498	.488
Line Drive, Outfield	.444	.463	.471	.460
Flyball, Outfield	.481	.479	.490	.482

This process is repeated for every type of batting event in which the ball is put into play. One of the ways we can use this information is to consider the run value based not on the result of the event, but on the batted-ball context that describes the event. Here are those values in the 2013 run-scoring environment:

	Popups	Groundballs	Fly Balls	Line Drives	All Swinging BIP
All Outs	-.261	-.257	-.226	-.257	-.249
Infield Out	-.260	-.257	——-	-.297	-.260
Outfield Out	-.269	——-	-.226	-.233	-.229
Left Out	-.262	-.260	-.230	-.251	-.253
Center Out	-.262	-.281	-.223	-.257	-.257
Right Out	-.260	-.229	-.227	-.262	-.237
All Reaches	.514	.468	1.108	.571	.629
Infield Reach	.436	.381	——-	.390	.382
Outfield Reach	.517	.503	1.108	.572	.659
Left Reach	.516	.463	1.172	.577	.632
Center Reach	.535	.443	1.006	.546	.593
Right Reach	.483	.510	1.166	.593	.672
All Infield	-.257	-.199	——-	-.267	-.211
All Outfield	-.003	.503	.093	.402	.262
All Left	-.219	-.058	.161	.332	.054
All Center	-.205	-.078	.030	.312	.030
All Right	-.191	-.069	.123	.326	.045
All	-.207	-.068	.093	.323	.042

Similarly, we can break down each player’s xR₄ by the value produced on each type of batted ball. Here are graphs for xR₄ produced on each of the four types of batted balls resulting from a swing, with respect to the number of batted balls of that type hit by the player. For simplicity, from this point on, whenever I drop the subscript in describing a batter’s expected run total, I’m referring to xR₄.

Line drives are the best result for a batter. The first objective of batters is to reach base safely, and they did that on 67.0 percent of line drives last season. No batter who hit at least eight line drives in 2013 caused a net decrease in his team’s run expectancy during those events. For most batters, hitting the ball into the outfield in the air is the ideal way to produce value, as fly ball production tends to create a positive change in a team’s run expectancy. However, fly balls have the most variance of any of the batted ball types, and there are certainly batters who hurt their teams more when hitting the ball at a high launch angle than a low one. Here are the players who produced the lowest xRA on fly balls last season (minimum 50 fly balls):

Lowest xRA on Fly Balls, MLB – 2013
(minimum 50 fly balls)

Pete Kozma, StL	-.1626
Ruben Tejada, NYM	-.1546
Cliff Pennington, Ari	-.1513
Andres Torres, SF	-.1465
Placido Polanco, Mia	-.1224

For each of these batters, groundballs and line drives were far better results on average.

xRA by Batted Ball Type – 2013

	FB	GB	LD
Pete Kozma, StL	-.1626	-.0738	.2496
Ruben Tejada, NYM	-.1546	-.0961	.1227
Cliff Pennington, Ari	-.1513	-.0421	.3907
Andres Torres, SF	-.1465	-.0155	.4269
Placido Polanco, Mia	-.1224	-.0981	.1889

While groundballs may be a preferable result for some batters when compared to fly balls, they are still effectively batting failures for the team. There were 840 batters in 2013 to hit at least one groundball and only 44 produced a net positive change in their team’s run expectancy. Of those 44 players, only 11 hit more than 10 groundballs, and only two (Mike Trout and Juan Francisco) hit at least 100 groundballs. Here are the players with the highest xRA on groundballs in 2013 who hit at least 100 groundballs:

Highest xRA on Groundballs, MLB – 2013
(minimum 100 groundballs)

Mike Trout, LAA	.0187
Juan Francisco, Atl-Mil	.0123
Brandon Barnes, Hou	-.0076
Andrew McCutchen, Pit	-.0081
Marlon Byrd, NYM-Pit	-.0093

xR₄ allows us to tell the most detailed story concerning the type of value a batter produced, independent of the situational context at the time the plate appearance occurred. Because we gradually added layers of detail to our estimation, we can compare how each level of expected runs correlates to this most detailed level. In this way, we can judge how much information each level provides with respect to our most detailed estimation. Here is a graph that charts a batter’s xR₄ with respect to his xR₁, xR₂, and xR₃ estimations:

The line that cuts through the data reflects the xR₄ values charted against themselves. For each xR_n, we can calculate how well it correlates with xR₄ and, consequently, how much of xR₄ it can explain. Remember that we have already shown that xR₁ has a direct linear correlation with eOBA and xR₂ has a very high, though not quite direct, correlation with eTBA. For the xR₁ values, we observe a correlation, r, with xR₄ of .912, and an r² of .832, meaning that knowing the rate at which a batter reaches base explains over four-fifths of our estimation of xR₄. For the xR₂ values, r² increases to .986; for the xR₃ values, r² increases slightly higher to .990.[iv]
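As a rough illustration of how r and r² relate, here is a minimal Python sketch of the Pearson correlation. The function name `pearson_r` is an illustrative choice, and the inputs are only the ten xR₃/xR₄ pairs from the "largest increase" and "largest decrease" tables later in this section. Because that is a tiny, hand-picked sample, the result is not expected to reproduce the full-population figures quoted above; the point is just the mechanics.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# xR3 and xR4 for the ten players in the increase/decrease tables
# (a small, non-representative sample used only to show the calculation).
xr3 = [44.1, 11.8, 57.2, 36.6, 38.6, -27.2, 9.7, 4.5, -8.6, -1.9]
xr4 = [48.2, 15.9, 61.0, 40.3, 41.9, -32.9, 4.2, -0.1, -12.9, -5.8]

r = pearson_r(xr3, xr4)
r_squared = r ** 2  # share of the variance in xR4 explained by xR3
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```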

The takeaway from this is that when considering the whole population of players, there is little difference in a run estimator that considers the batted-ball context and one that does not; you can still explain 99 percent of the value estimated by xR₄ by stopping at xR₃. In fact, if all you know is the rate at which a batter accomplishes his two main objectives—reaching base and advancing as far as possible—you can explain well over 90 percent of the value estimated by xR₄. However, on an individual level, there is enough variation that observing the batted-ball context can be beneficial. Here are the five players with the largest positive and negative differences between their xR₃ and xR₄ estimations:

Largest Increase from xR₃ to xR₄, MLB – 2013

Player	xR₃	xR₄	Diff
David Ortiz, Bos	44.1	48.2	+4.1
Kyle Seager, Sea	11.8	15.9	+4.1
Chris Davis, Bal	57.2	61.0	+3.8
Matt Carpenter, StL	36.6	40.3	+3.7
Freddie Freeman, Atl	38.6	41.9	+3.3

Largest Decrease from xR₃ to xR₄, MLB – 2013

Player	xR₃	xR₄	Diff
Adeiny Hechavarria, Mia	-27.2	-32.9	-5.7
Jean Segura, Mil	9.7	4.2	-5.5
Jose Iglesias, Bos-Det	4.5	-0.1	-4.7
Elvis Andrus, Tex	-8.6	-12.9	-4.3
Alexei Ramirez, CWS	-1.9	-5.8	-3.9

These changes are not massive, and these are the extreme cases for 2013, but they are certainly large enough that ignoring them will weaken specific analyses of batting production. Incorporating batted ball details into our analysis adds a significant layer of complexity to our calculation, but it must be considered if we want to tell the most accurate story of the value a batter produced.

If this work seems at all familiar, you may have read this article that I wrote last year on a statistic that I called Offensive Value Added (OVA). For all intents and purposes, OVA and xR are identical. I decided that the name change to xR would help me differentiate estimations more simply, as I could avoid naming four separate statistics for each level of contextual detail, but there was also a secondary reason for changing the presentation of the data. OVAr was the rate statistic associated with OVA, and it was scaled to look like a batting average, much in the same way that wOBA is scaled to look like an on base average. At the time, I chose to do this to make it easier to appreciate how a batter performed, since many baseball enthusiasts are comfortable interpreting the relative significance of a batting average.

After thinking on the subject, though, I came to decide that I prefer statistics that actually “mean” something to those that give a general, unit-less rating. For instance, try to explain what wOBA actually reflects. It starts as a run estimator, but then it’s transformed into a number that looks like a statistic with specific units (OBA), while not actually using those units. Once that transformation occurs, it no longer reflects anything specific and only serves as a way to rate batters. The same principle applies to other statistics as well, most notably OPS, which is arguably the most meaningless of all baseball statistics, perhaps all statistics ever (don’t get me started).

xR and xRA estimate the change in a team’s run expectancy caused by a batter’s plate appearances. They are measured in runs and runs per plate appearance, respectively. xRA may not look like a number you’ve seen before, and generally needs to be written out to four decimal places instead of three, unlike basic averages, but it’s linguistically very simple to use and understand. I’d rather sacrifice the comfort of having a statistic merely look familiar and instead have it actually reflect something tangible. This doesn’t take away from the value of a statistic like wOBA, which is a great run estimator no matter what scale it is on; a lack of meaning certainly does not imply a lack of value. Introducing an unscaled run average, xRA, will hopefully create a different perspective on how to talk about batting production.

There is one final expected run estimation that I want to consider that could easily cover an entire new part on its own, but I’ll limit myself to just a few paragraphs. The xR estimations we have built have been constructed independent of the situational context at the time of the batter’s plate appearance. Since we want to cover the entire spectrum of context-neutral run estimation to context-specific run estimation, we will conclude by considering xR_s, which is an estimate of the change in a team’s run expectancy based on the out/base state before and after the action of the plate appearance. This is very nearly the same thing as RE24 but it only considers runs produced due to the primary action of plate appearances and not baserunning events.

In many respects, xR_s is the simplest run estimator to construct of all that we have built thus far. There are only three pieces of information you need to know in a given plate appearance to construct xR_s: the run-scoring environment, the out/base state at the start of the action of the plate appearance, and the out/base state at the end of the action of the plate appearance. Next time you go to a baseball game, bring along a copy of a run expectancy matrix, like the one provided earlier. On a scorecard, at the start of every plate appearance, take note of the value assigned to the out/base state, making adjustments if any runners move while the batter is still in the batter’s box. Once the plate appearance is over, note the value of the new out/base state, separating out any advancement on secondary fielding errors or throws to other bases. Subtract the first value from the second value, and add in any RBIs on the play, and write the number in the box associated with the batter’s plate appearance; you just calculated xR_s. Do this for a whole game, and you will have a picture of the total value produced by every batter based on the out/base state context in which they performed.
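The scorecard arithmetic described above can be sketched in a few lines of Python. The run expectancy values below are made-up placeholders, not the matrix provided earlier, and the state labels, dictionary name, and function name are only illustrative assumptions.

```python
# Hypothetical run expectancy values keyed by (outs, bases occupied).
# These numbers are placeholders for illustration only; substitute the
# matrix for the run-scoring environment you are analyzing.
RUN_EXPECTANCY = {
    (0, "---"): 0.50,
    (0, "1--"): 0.85,
    (1, "---"): 0.25,
    (1, "1--"): 0.50,
    (3, "---"): 0.00,  # inning over: no further runs expected
}

def xr_s(state_before, state_after, rbis=0):
    """Run expectancy after the plate appearance, minus the value before,
    plus any RBIs on the play."""
    return RUN_EXPECTANCY[state_after] - RUN_EXPECTANCY[state_before] + rbis

# Leadoff single: bases empty, no outs -> runner on first, no outs.
single = xr_s((0, "---"), (0, "1--"))
# Leadoff out: bases empty, no outs -> bases empty, one out.
out = xr_s((0, "---"), (1, "---"))
# Solo home run with one out: the state is unchanged and one run scores.
homer = xr_s((1, "---"), (1, "---"), rbis=1)

print(single, out, homer)
```

Summing these values over every plate appearance by a batter gives his xR_s total for the game under whatever matrix is supplied.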

The effective averages and expected run estimations provide a foundation on which batting analysis can be performed. They combine both “real and indisputable facts” with detailed estimations of the runs produced in every event in which a batter participates. Any story that aims to describe the value that a batter provides to his team must consider these statistics, as they are the only ones which account for all value produced. 147 years ago, Henry Chadwick suggested that batters should be judged on whether they passed a “test of skill.” I think they should be judged on whether they passed a “test of value.”

Thanks to Benjamin H Byron for editorial assistance, as well as the staff at the Library of Congress for assistance in locating original copies of the 19th-century newspaper articles included in Part 1.

Here is data on eOBA, eTBA, and each level of xR and xRA estimation, for each batter in 2013.

Bibliography

[i] I’ll be focusing on 2013 because the full season is complete. All the work described here could easily be applied to 2014, or any other season; I just don’t want to use incomplete information.

[ii] While these terms are used a lot, there aren’t any specific definitions commonly accepted that differentiate each type of batted ball. For terms used so commonly, it doesn’t make much sense to me that they are not well defined. It won’t apply to the data used in this research, but here is my attempt at defining them.

A bunt is a batted ball not swung at but intentionally met with the bat. A groundball is a batted ball swung at that lands anywhere between home plate and the outer edge of the infield dirt and would be classified as a line drive if it made contact with a fielder in the air. A line drive is a batted ball swung at that leaves the bat at an angle of at most 20° above parallel to the ground (the launch angle), and either lands in the outfield or makes contact with any fielder before landing (generally through a catch, but sometimes a deflection). A fly ball is a batted ball swung at, with a launch angle between 20° and 60° above parallel (not inclusive), that either lands in the outfield or is caught in the air by a player in the outfield. A popup is a batted ball swung at that either (a) leaves the bat at an angle of 60° or greater above parallel and lands or is caught in the air in the outfield, or (b) leaves the bat at an angle greater than 30° and lands or is caught in the air in the infield.

This would result in some balls being classified differently than they currently are, and not just because differentiating between a line drive and a fly ball is somewhat difficult with just a pair of eyes. If the defense were to play an infield shift, and the batter were to hit a line drive into the outfield grass into that shift, subsequently being thrown out at first base, it would likely be called a groundout by current standards. Batted balls should not be defined based on defensive success or failure, but by the general path which they take when leaving the bat. It may be unusual to credit a batter with making a line out despite the ball hitting the ground, but it more accurately reflects the type of ball put into play by the batter.

I don’t know that these are the “correct” ways to group together these events, but as we now are using technology that tracks the flight of the baseball from the moment it is released by the pitcher through the end of the play, we should probably have better definitions for types of batted balls than those currently provided by MLB. I don’t expect a human stringer to be able to differentiate between a ball hit with a 15° launch angle or a 25° launch angle, but that doesn’t mean we shouldn’t have some standard definition for which they should aim.

[iii] In theory, xR₅ would attempt to consider details that are even more specific, perhaps the initial velocity of the ball off the bat, the launch angle, and whatever other information can be gleaned from technology like HIT F/X. The xR framework leaves room to consider any further amount of detail that a researcher wants to consider.

[iv] Though not charted here, the r² value based on the correlation between wRAA, the “counting” version of wOBA, and xR₄ is .984. As wRAA is nearly identical to xR₃ but excludes a few of the more rare events from its calculation, it’s not surprising that the r² value between wRAA and xR₄ is just slightly smaller than the r² between xR₃ and xR₄.

Leadoff Rating 2.0

by Brad Oremland

August 6, 2014

It feels icky to create a statistical formula based on what “feels right”.

Last month, I introduced a stat called Leadoff Rating, or LOR. The idea was that most systems to identify great leadoff hitters tab players like Ted Williams and Mickey Mantle, who would always hit closer to the middle of the order. I wanted to distinguish players specially suited to batting leadoff. The formula was simple: OBP minus ISO. By subtracting isolated power, we identified players who get on base a lot but aren’t true sluggers. It’s an easy calculation, and it produced fairly reasonable results. Two particular things bothered me:

1. Bad hitters occasionally had good leadoff ratings because of their very low ISO.

2. Rickey Henderson ranked 45th.

We know that leadoff is one of the two or three most important positions in the batting order. As little impact as lineup construction has on winning percentage, leadoff hitters are important. But LOR saw high OBP and low ISO as equally meaningful, so players with no power sometimes rated as desirable leadoff hitters. That seemed like something to correct.

Rickey Henderson is generally recognized as the greatest leadoff man of all time. LOR did not show this, for two main reasons. One was that the formula did not include baserunning. The other was that the all-time list slanted heavily towards Deadball players. Before Babe Ruth, everyone had low isolated power. Ty Cobb was a terrific power hitter, who led the AL in slugging eight times. Cobb’s career ISO (.146) is basically the same as Rickey’s (.140). Henderson only ranked among the top 10 in slugging twice. The game has changed.

Based on the feedback of FanGraphs readers and on my own muddlings, I’ve reworked the leadoff rating formula. The new system is more complicated — it’s annoying to do without a spreadsheet — and it’s kind of haphazard. OBP – ISO was a nice system because of its simplicity. With the updated formula, I’m guessing, choosing numbers that seem right. If someone better than I am at math would care to suggest revisions, please do so. I am fully prepared to give this stat away to smart people.

The formula I’m using now is — wait. There’s another calculation I abandoned, but it’s important for explaining how we arrived at the current iteration, and that middle step looked like this: OBP – ( .75 * ISO ) + ( ( .005 * BsR ) / ( PA / 600 ) )

On-base percentage is the heart of leadoff rating. A good hitter, and especially a good leadoff hitter, must get on base. But I only subtracted 3/4 of ISO, because (1) low ISO is not as important as high OBP, and (2) the original formula was probably a little too hard on doubles hitters. Guys like Rickey and Tim Raines ranked too low because they had more power than players like Jason Kendall and Ozzie Smith.

Commenter foxinsox suggested adding (Constant * BsR) to the calculation, which was a fine idea I should have seen earlier. The hitch was turning BsR into a rate stat. By using BsR/PA or BsR/G, we can incorporate that element smoothly.

When I ran the numbers, the historical lists looked great (Rickey Henderson in the top 10!), but for active players, there were hits and misses. Elvis Andrus came back as the ideal leadoff hitter in 2013, and Craig Gentry (.264/.326/.299) ran away with 2014 to date. Even with the adjustments, LOR rewarded low ISO. While a .250 ISO isn’t really the right fit for the top of the batting order, neither is a sub-.050 ISO. We don’t want a guy who only hits singles, we just don’t want a cleanup hitter. Looking at the historical lists, I found that most of the top players had an ISO right around .100, so I created a Goldilocks formula, preferring a minimal absolute difference from .100 ISO. Rather than simply treating low ISO as desirable, we’re looking for the sweet spot between singles and slugging. The new formula is:

OBP – .75 * | .100 – ISO | + ( .005 * BsR ) / ( PA / 600 )

That’s on-base percentage, minus 3/4 of the absolute difference between ISO and .100, plus .005 times BsR per 600 plate appearances. Now very low isolated power is punished just as much as very high ISO.
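For anyone who would rather not build the spreadsheet, the three versions of the formula discussed above can be sketched in Python. The function names are my own, and the BsR and PA inputs in the example are neutral placeholders (BsR of 0 over 600 PA) rather than real figures; only the .264/.326/.299 slash line comes from the text, with ISO taken as SLG minus AVG.

```python
def lor_original(obp, iso):
    """First version: OBP minus ISO."""
    return obp - iso

def lor_intermediate(obp, iso, bsr, pa):
    """Abandoned middle step: OBP - .75 * ISO + .005 * BsR per 600 PA."""
    return obp - 0.75 * iso + (0.005 * bsr) / (pa / 600)

def lor_v2(obp, iso, bsr, pa):
    """Current version: penalize distance from a .100 ISO in either direction."""
    return obp - 0.75 * abs(0.100 - iso) + (0.005 * bsr) / (pa / 600)

# Slash line from the text: .264/.326/.299, so ISO = .299 - .264 = .035.
obp, iso = 0.326, 0.299 - 0.264
# Placeholder baserunning inputs (assumed, not real): BsR 0 over 600 PA.
bsr, pa = 0.0, 600

print(round(lor_original(obp, iso), 4))
print(round(lor_intermediate(obp, iso, bsr, pa), 4))
print(round(lor_v2(obp, iso, bsr, pa), 4))
```

With the same OBP and baserunning, a .050 ISO and a .150 ISO receive the same rating under the current version, which is the Goldilocks behavior described above.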

Hopefully you want to see some lists. I’ll show you five: the all-time list, the post-Jackie Robinson list, the leaders for the 2013 season, 2014 to date (through July 31), and 2014 rest-of-season projections (ZiPS). We’ll also look at the 2014 leaders (both to date and projected) for every team in the major leagues.

Using Short-Season A Stats to Predict Future Performance

by Chris Mitchell

August 3, 2014

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. So far, I’ve analyzed hitters in Rookie leagues, Low-A, High-A, Double-A and Triple-A using a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes — such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
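To make the probit idea concrete, here is a minimal Python sketch of how such a model turns inputs into a probability: a weighted sum of the inputs is passed through the standard normal CDF. The coefficients and feature names below are invented purely for illustration and are not KATOH's fitted values.

```python
from statistics import NormalDist

# Made-up coefficients for illustration only; NOT the fitted KATOH model.
HYPOTHETICAL_COEFS = {
    "intercept": 1.0,
    "age": -0.10,
    "strikeout_rate": -2.0,
    "iso": 3.0,
}

def probit_probability(features, coefs=HYPOTHETICAL_COEFS):
    """P(event) = Phi(intercept + sum of coef * feature),
    where Phi is the standard normal CDF."""
    z = coefs["intercept"] + sum(
        coefs[name] * value for name, value in features.items()
    )
    return NormalDist().cdf(z)

# Two hypothetical hitters with identical stats but different ages.
young = probit_probability({"age": 18, "strikeout_rate": 0.20, "iso": 0.150})
older = probit_probability({"age": 21, "strikeout_rate": 0.20, "iso": 0.150})
print(f"{young:.3f} {older:.3f}")
```

Because the age coefficient in this toy setup is negative, the younger hitter comes out with the higher probability; a real model's signs and magnitudes come from the regression fit.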

For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not he was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. And walk rate, while not predictive for players in Rookie ball, Low-A, or High-A, added a little bit to the model for Double-A and Triple-A hitters. Today, I’ll look into what KATOH has to say about players in Short-Season A-ball. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted to reflect his league’s average for that year. For those interested, here’s the R output based on all players with at least 200 plate appearances in a season in SS A-ball from 1995-2007.

Just like we saw with hitters in Rookie ball, a player’s Baseball America prospect status couldn’t tell us anything about his future as a big leaguer. This was entirely due to the scarcity of top 100 prospects in the sample, as only a handful of players spent the year in SS A-ball after making BA’s top 100 list. Somewhat surprisingly, walk rate is predictive for players in SS-A, despite being statistically insignificant for hitters in Rookie ball and the more advanced A-ball levels. Another interesting wrinkle is the “Strikeout_Rate:Age” variable. Basically, this says that strikeout rate matters more for younger players than for older players at this level. Although frequent strikeouts are obviously a bad thing no matter how old you are:

The season is less than 50 games old for most teams in the New York-Penn and Northwest Leagues, which makes it a little premature to start analyzing players’ stats. But just for kicks, here’s a look at what KATOH says about this year’s crop of players with at least 100 plate appearances through July 28th. The full list of players can be found here, and you’ll find an excerpt of those who broke the 40% barrier below:

Player	Organization	Age	MLB Probability
Rowan Wick	STL	21	82%
Eduard Pinto	TEX	19	68%
Marcus Greene	TEX	19	60%
Mauricio Dubon	BOS	19	59%
Franklin Barreto	TOR	18	57%
Christian Arroyo	SFG	19	57%
Skyler Ewing	SFG	21	56%
Taylor Gushue	PIT	20	55%
Domingo Leyba	DET	18	55%
Raudy Read	WSN	20	53%
Nick Longhi	BOS	18	52%
Andrew Reed	HOU	21	52%
Danny Mars	BOS	20	51%
Amed Rosario	NYM	18	49%
Yairo Munoz	OAK	19	48%
Seth Spivey	TEX	21	47%
Mike Gerber	DET	21	47%
Mark Zagunis	CHC	21	47%
Kevin Krause	PIT	21	46%
Leo Castillo	CLE	20	45%
Jordan Luplow	PIT	20	45%
Mason Davis	MIA	21	40%
Kevin Ross	PIT	20	40%
Franklin Navarro	DET	19	40%

As we saw with Rookie league hitters, KATOH doesn’t think any of these players are shoo-ins to make it to the majors. Even Rowan Wick, who hit a Bondsian .378/.475/.815 before getting promoted, gets just 82%. This goes to show that SS A-ball stats just aren’t all that meaningful.

Once the season’s over, I’ll re-run everything using the final 2014 stats, which will give us a better sense of which prospects had the most promising years statistically. I also plan to engineer an alternative methodology — to supplement this one — that will take into account how a player performs in the majors, rather than his just getting there. Additionally, I hope to create something similar for projecting pitchers based on their statistical performance. In the meantime, I’ll apply the KATOH model to historical prospects and highlight some of its biggest “hits” and “misses” from years past. Keep an eye out for the next post in the coming days.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.

Pitch Win Values for Starting Pitchers – July 2014

by Stats All Folks

August 2, 2014

Introduction

A couple months back, I introduced a new method of calculating pitch values using a FIP-based WAR methodology. That post details the basic framework of these calculations and can be found here. The May and June updates can be found here and here respectively. This post is simply the July 2014 update of the same data. What follows is predominantly data-heavy but should still provide useful talking points for discussion. Let’s dive in and see what we can find. Please note that the same caveats apply as in previous months. We’re at the mercy of pitch classification. I’m sure your favorite pitcher doesn’t throw that pitch that has been rated as incredibly below average, but we have to go off of the data that is available. Also, Baseball Prospectus’s PitchF/x leaderboards list only nine pitches (Four-Seam Fastball, Sinker, Cutter, Splitter, Curveball, Slider, Changeup, Screwball, and Knuckleball). Anything that may be classified outside of these categories is not included. Also, anything classified as a “slow curve” is not included in Baseball Prospectus’s curveball data.

Constants

Before we begin, we must first update the constants used in calculation for July. As a refresher, we need three different constants for calculation: strikes per strikeout, balls per walk, and a FIP constant to bring the values onto the right scale. We will tackle them each individually.

First, let’s discuss the strikeout constant. In July, there were 47,449 strikes thrown by starting pitchers. Of these 47,449 strikes, 4,585 were turned into hits and 13,750 outs were recorded. Of these 13,750 outs, 3,725 were converted via the strikeout, leaving us with 10,025 ball-in-play outs. 10,025 ball-in-play outs and 4,585 hits sum to 14,610 balls-in-play. Subtracting 14,610 balls-in-play from our original 47,449 strikes leaves us with 32,839 strikes to distribute over our 3,725 strikeouts. That’s a ratio of 8.82 strikes per strikeout. This is exactly the same as the 8.82 strikes per strikeout in June.

The next two constants are much easier to ascertain. In July, there were 26,244 balls thrown by starters and 1,328 walked batters. That’s a ratio of 19.76 balls per walk, up from 19.36 balls per walk in June. This data would suggest that hitters were slightly less likely to walk in July than previously. The FIP subtotal for all pitches in July was 0.52. The MLB Run Average for July was 4.17, meaning our FIP constant for July is 3.65.

Constant	Value
Strikes/K	8.82
Balls/BB	19.76
cFIP	3.65
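The arithmetic behind these three constants can be reproduced with a short Python sketch. The inputs are the July totals quoted in the text; the variable names are my own.

```python
# July 2014 totals for starting pitchers, as quoted in the text.
strikes = 47_449
hits = 4_585
outs = 13_750
strikeouts = 3_725
balls = 26_244
walks = 1_328
fip_subtotal = 0.52
mlb_run_average = 4.17

# Remove strikes that ended as balls in play (outs or hits).
ball_in_play_outs = outs - strikeouts        # 10,025
balls_in_play = ball_in_play_outs + hits     # 14,610
remaining_strikes = strikes - balls_in_play  # 32,839

strikes_per_k = remaining_strikes / strikeouts
balls_per_bb = balls / walks
c_fip = mlb_run_average - fip_subtotal

print(round(strikes_per_k, 2), round(balls_per_bb, 2), round(c_fip, 2))
```

Rounded to two decimals, these match the 8.82, 19.76, and 3.65 shown in the table.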

The following table details how the constants have changed month-to-month.

Month	K	BB	cFIP
March/April	8.47	18.50	3.68
May	8.88	18.77	3.58
June	8.82	19.36	3.59
July	8.82	19.76	3.65

Pitch Values – July 2014

For reference, the following table details the FIP for each pitch type in the month of July.

Pitch	FIP
Four-Seam	4.06
Sinker	4.20
Cutter	4.42
Splitter	3.50
Curveball	4.08
Slider	3.87
Changeup	4.79
Screwball	3.58
Knuckleball	3.97
MLB RA	4.16
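One way to read the table above is to compare each pitch type's FIP against the MLB RA row, treating a higher FIP as below average. A quick Python sketch of that comparison, with the values copied from the table and the variable names chosen only for illustration:

```python
# July 2014 FIP by pitch type, copied from the table above.
july_fip = {
    "Four-Seam": 4.06,
    "Sinker": 4.20,
    "Cutter": 4.42,
    "Splitter": 3.50,
    "Curveball": 4.08,
    "Slider": 3.87,
    "Changeup": 4.79,
    "Screwball": 3.58,
    "Knuckleball": 3.97,
}
mlb_ra = 4.16  # MLB RA row from the same table

# Assumption: a pitch type is "below average" when its FIP exceeds MLB RA.
below_average = [pitch for pitch, fip in july_fip.items() if fip > mlb_ra]

# Gap to league average, sorted from best (most negative) to worst.
gaps = sorted(((fip - mlb_ra, pitch) for pitch, fip in july_fip.items()))

print(below_average)
```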

As we can see, only three pitches would be classified as below average for the month of July: sinkers, cutters, and changeups. Four-Seam Fastballs and curveballs also came in right around league average. Pitchers that were able to stand out in these categories tended to have better overall months than pitchers who excelled at the other pitches. Now, let’s proceed to the data for the month of July.

Four-Seam Fastball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Ian Kennedy	0.6	180	Brad Peacock	-0.3
2	Clayton Kershaw	0.6	181	Jake Odorizzi	-0.3
3	Jose Quintana	0.6	182	Jason Hammel	-0.3
4	Drew Hutchison	0.5	183	Edwin Jackson	-0.3
5	Jacob deGrom	0.5	184	Chris Young	-0.3

Sinker

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Brandon McCarthy	0.4	167	Chase Whitley	-0.2
2	Roberto Hernandez	0.4	168	Andrew Heaney	-0.2
3	Doug Fister	0.4	169	Jon Niese	-0.2
4	Hisashi Iwakuma	0.4	170	David Buchanan	-0.2
5	Wade Miley	0.3	171	Nick Tepesch	-0.3

Cutter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Josh Collmenter	0.3	77	Brandon McCarthy	-0.2
2	Jon Lester	0.3	78	Drew Smyly	-0.2
3	Kevin Correia	0.2	79	Brandon Workman	-0.2
4	Jarred Cosart	0.2	80	Dan Haren	-0.3
5	Adam Wainwright	0.2	81	Hector Noesi	-0.4

Splitter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Hisashi Iwakuma	0.3	27	Daisuke Matsuzaka	0.0
2	Hiroki Kuroda	0.3	28	Ubaldo Jimenez	0.0
3	Jake Odorizzi	0.2	29	Tim Lincecum	-0.1
4	Alex Cobb	0.2	30	Doug Fister	-0.1
5	Tim Hudson	0.2	31	Clay Buchholz	-0.1

Curveball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Sonny Gray	0.3	155	Hiroki Kuroda	-0.1
2	Clay Buchholz	0.2	156	Josh Tomlin	-0.2
3	Jesse Hahn	0.2	157	Kevin Correia	-0.2
4	Adam Wainwright	0.2	158	Eric Stults	-0.3
5	Jose Quintana	0.2	159	Josh Beckett	-0.3

Slider

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Garrett Richards	0.5	125	Jair Jurrjens	-0.1
2	Tyson Ross	0.4	126	Jason Lane	-0.1
3	Jake Arrieta	0.3	127	Jake Buchanan	-0.1
4	Brett Anderson	0.3	128	Matt Cain	-0.1
5	Kyle Lohse	0.3	129	C.J. Wilson	-0.1

Changeup

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Cole Hamels	0.3	156	Rubby de la Rosa	-0.2
2	David Price	0.3	157	David Holmberg	-0.2
3	Chris Sale	0.2	158	Mike Minor	-0.2
4	Zack Greinke	0.2	159	Jeff Locke	-0.3
5	James Shields	0.2	160	Drew Hutchison	-0.4

Screwball

Rank	Pitcher	Pitch Value
1	Trevor Bauer	0.0
2	Julio Teheran	0.0
3	Hector Santiago	0.0

Knuckleball

Rank	Pitcher	Pitch Value
1	R.A. Dickey	0.4

Overall

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Cole Hamels	1.0	187	Jair Jurrjens	-0.4
2	Jacob deGrom	0.9	188	Erik Bedard	-0.4
3	Tyson Ross	0.9	189	Jason Hammel	-0.4
4	Jose Quintana	0.9	190	Brad Peacock	-0.4
5	Chris Sale	0.9	191	Nick Tepesch	-0.4

Pitch Ratings – July 2014

Four-Seam Fastball

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Drew Hutchison	59	83	Jake Odorizzi	38
2	Jose Quintana	59	84	Jake Peavy	38
3	Cole Hamels	58	85	Josh Tomlin	36
4	Mark Buehrle	58	86	Brad Peacock	35
5	Tim Lincecum	58	87	Jason Hammel	34

Sinker

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Travis Wood	58	73	Kevin Correia	36
2	Scott Kazmir	57	74	John Danks	36
3	Matt Garza	57	75	Jeff Samardzija	35
4	Brandon McCarthy	57	76	Dan Haren	32
5	Doug Fister	57	77	Nick Tepesch	25

Cutter

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Marcus Stroman	58	32	Mike Minor	33
2	Jon Lester	58	33	Tim Hudson	33
3	Daisuke Matsuzaka	57	34	Brandon McCarthy	32
4	Phil Hughes	57	35	Dan Haren	28
5	Franklin Morales	57	36	Hector Noesi	20

Splitter

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Tim Hudson	57	8	Jorge de la Rosa	53
2	Kyle Kendrick	56	9	Alfredo Simon	53
3	Hisashi Iwakuma	56	10	Jeff Samardzija	53
4	Kevin Gausman	56	11	Alex Cobb	52
5	Hiroki Kuroda	56	12	Tim Lincecum	42

Curveball

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Jacob deGrom	59	65	Franklin Morales	38
2	Felix Hernandez	59	66	Chase Anderson	38
3	Clay Buchholz	58	67	Jered Weaver	37
4	Brandon McCarthy	58	68	Kevin Correia	26
5	David Phelps	58	69	Josh Beckett	20

Slider

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Jordan Zimmermann	59	55	Zack Wheeler	44
2	Brett Anderson	59	56	Miles Mikolas	43
3	Wei-Yin Chen	58	57	Miguel Gonzalez	42
4	Kyle Lohse	58	58	Carlos Martinez	40
5	Corey Kluber	58	59	Yu Darvish	39

Changeup

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Chase Whitley	60	65	Jeff Locke	30
2	Cole Hamels	59	66	Joe Kelly	27
3	Chase Anderson	59	67	Rubby de la Rosa	26
4	Hector Santiago	58	68(t)	Drew Hutchison	20
5	Jered Weaver	57	68(t)	Mike Minor	20

Screwball

Rank	Pitcher	Pitch Rating
1	Trevor Bauer	52

Knuckleball

Rank	Pitcher	Pitch Rating
1	R.A. Dickey	52

Monthly Discussion

As we can see, Cole Hamels takes the top spot for this month on the strength of his overall repertoire. Hamels was classified as throwing five different pitches in July (four-seam fastball, sinker, cutter, curveball, and changeup) and earned at least 0.1 WAR from each of the five. The most valuable pitch overall in July was Ian Kennedy’s four-seam fastball. The least valuable was Drew Hutchison’s changeup. Among offspeed pitches, Garrett Richards’s slider led the way at 0.5 WAR. The least valuable fastball was Hector Noesi’s cutter.

In our 20-80 scale pitch ratings, the highest-rated qualifying pitch was Chase Whitley’s changeup. The lowest-rated pitches were the changeups thrown by Drew Hutchison and Mike Minor, Hector Noesi’s cutter, and Josh Beckett’s curveball. The highest-rated fastball was Drew Hutchison’s four-seam fastball.

Pitch Values – 2014 Season

Four-Seam Fastball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Ian Kennedy	1.9	247	Masahiro Tanaka	-0.4
2	Jose Quintana	1.7	248	Dan Straily	-0.4
3	Phil Hughes	1.6	249	Nick Martinez	-0.4
4	Jordan Zimmermann	1.6	250	Juan Nicasio	-0.4
5	Clayton Kershaw	1.5	251	Marco Estrada	-0.7

Sinker

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Charlie Morton	1.5	236	John Danks	-0.3
2	Felix Hernandez	1.3	237	Wandy Rodriguez	-0.3
3	David Price	1.1	238	Vidal Nuno	-0.3
4	Chris Archer	1.1	239	Nick Tepesch	-0.4
5	Cliff Lee	1.1	240	Andrew Heaney	-0.4

Cutter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Madison Bumgarner	1.2	110	Dan Haren	-0.2
2	Adam Wainwright	1.2	111	Felipe Paulino	-0.2
3	Corey Kluber	1.2	112	Hector Noesi	-0.3
4	Jarred Cosart	1.2	113	C.J. Wilson	-0.3
5	Josh Collmenter	1.0	114	Brandon McCarthy	-0.5

Splitter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Masahiro Tanaka	0.8	32	Jake Peavy	-0.1
2	Alex Cobb	0.6	33	Franklin Morales	-0.2
3	Hisashi Iwakuma	0.6	34	Miguel Gonzalez	-0.2
4	Hiroki Kuroda	0.6	35	Danny Salazar	-0.2
5	Tim Hudson	0.4	36	Clay Buchholz	-0.4

Curveball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Sonny Gray	1.1	210	Homer Bailey	-0.2
2	A.J. Burnett	0.9	211	Alfredo Simon	-0.2
3	Brandon McCarthy	0.8	212	Felipe Paulino	-0.3
4	Adam Wainwright	0.7	213	Franklin Morales	-0.3
5	Jose Fernandez	0.6	214	Eric Stults	-0.4

Slider

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Garrett Richards	1.3	179	Roberto Hernandez	-0.2
2	Tyson Ross	1.1	180	Liam Hendriks	-0.2
3	Kyle Lohse	0.8	181	Erasmo Ramirez	-0.3
4	Corey Kluber	0.8	182	Danny Salazar	-0.3
5	Ervin Santana	0.8	183	Travis Wood	-0.4

Changeup

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Felix Hernandez	0.9	232	Wandy Rodriguez	-0.4
2	Stephen Strasburg	0.6	233	Matt Cain	-0.4
3	Cole Hamels	0.6	234	Jordan Zimmermann	-0.5
4	Chris Sale	0.5	235	Drew Hutchison	-0.6
5	Roberto Hernandez	0.5	236	Marco Estrada	-0.6

Screwball

Rank	Pitcher	Pitch Value
1	Trevor Bauer	0.1
2	Alfredo Simon	0.0
3	Hector Santiago	0.0
4	Julio Teheran	0.0

Knuckleball

Rank	Pitcher	Pitch Value
1	R.A. Dickey	1.2
2	C.J. Wilson	0.0

Overall

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Felix Hernandez	3.5	254	Felipe Paulino	-0.5
2	Adam Wainwright	3.2	255	Juan Nicasio	-0.5
3	Garrett Richards	2.9	256	Nick Martinez	-0.6
4	Corey Kluber	2.9	257	Wandy Rodriguez	-0.8
5	Jose Quintana	2.7	258	Marco Estrada	-1.2

Year-to-Date Discussion

If we look at the year-to-date numbers, AL FIP and MLB WAR leader Felix Hernandez still sits in the top spot. Current MLB FIP leader Clayton Kershaw ranks ninth. The least valuable starter has been Marco Estrada. On a per-pitch basis, the most valuable pitch has been Ian Kennedy’s four-seam fastball. The most valuable offspeed pitch has been Garrett Richards’s slider. The least valuable pitch has been Marco Estrada’s four-seam fastball. The least valuable offspeed pitch has been Marco Estrada’s changeup. Needless to say, it’s been a rough year for Marco. Qualitatively, I feel fairly encouraged by the year-to-date results so far. The leaderboard is topped by two no-doubt aces, both of whom are currently in the top two in their respective leagues in FIP, and Marco Estrada comes in at the bottom after posting the highest FIP among qualified starters so far. For reference, the top five in the year-to-date overall rankings are currently 1st, 12th, 10th, 2nd, and 9th on the FanGraphs WAR leaderboards, respectively.
