Archive for Research

When Should I Steal?

by Nick

September 11, 2014

The Stolen Base

Some consider the stolen base a “lost art.” Gone are the days of Vince Coleman’s back-to-back-to-back 100+ stolen base seasons of Whitey-ball folklore. Teams are stealing at the lowest rates (per game) since the 1950s.

Stolen Bases by Year

Aside from the 2011 outlier, stolen base rates have trended steadily downward. Stolen bases still have their place in the game, especially as run environments shrink. But at what point is the value added by a stolen base worth the risk of an out?

Run Expectancy

Tom Tango’s handy-dandy run expectancy chart can give us this answer. His run expectancy matrix shows how run expectancy changes from one base-out state to another as events unfold. The basic guide that saberists abide by is that you need to steal successfully about twice as often as you get caught to break even in expected runs, but every situation is different. With runners on first and third and two outs, you would actually have to succeed at an almost 6:1 ratio to break even.

This is because of three factors: you are not adding any value to the runner that is already on third, making an out takes the bat out of someone’s hands, and making an out with someone already in scoring position is the most detrimental kind of out. Also, in any given situation, you are facing a battery with different characteristics. Stealing a base off of Kyle Lohse and Yadier Molina was nearly impossible back in 2011. On the other hand, stealing a base off of John Lackey and Jarrod Saltalamacchia would have been a lot easier. Accounting for the risk of your own baserunner, the defense, league rates, and base-out situation will lead to the most informed decision.
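The break-even idea can be sketched in a few lines. This is a minimal illustration, assuming the attempt has only two outcomes (safe or caught) and that the run expectancy of each resulting state is known. The function names and the run expectancy numbers are hypothetical round values chosen for illustration; they are not read from Tango's chart.

```python
def break_even_rate(re_now, re_success, re_caught):
    """Success probability at which attempting the steal leaves expected runs unchanged.

    Solves p * re_success + (1 - p) * re_caught = re_now for p.
    """
    return (re_now - re_caught) / (re_success - re_caught)


def steals_per_caught(p):
    """Convert a success probability into a steals-to-caught-stealing ratio."""
    return p / (1.0 - p)


# Hypothetical run expectancies for illustration only (assumed, not from the chart).
# First and third, two outs -> second and third, two outs; caught stealing ends the inning.
re_first_third_two_out = 0.54
re_second_third_two_out = 0.63
re_inning_over = 0.0

p = break_even_rate(re_first_third_two_out, re_second_third_two_out, re_inning_over)
print(f"break-even success rate: {p:.3f}")
print(f"steals needed per caught stealing: {steals_per_caught(p):.1f}")
```

With these assumed inputs the ratio comes out near 6:1, because the successful steal adds little while the failed one wipes out the whole inning. A state where success adds more, or failure costs less, pushes the break-even point back toward the familiar 2:1 guide.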

In the tool below, begin by picking your situation. The strings go: outs, first base, second base, third base, where “x” means no runner and a number means a runner occupies that base (e.g. 0x2x means no outs and a runner on second base). Then enter your baserunner’s steal rate against an average opponent (Steamer’s updated projection gives Kolten Wong a 21/24 chance of stealing a base). After that, enter the steal rate your opponent allows (consider whether the pitcher is a lefty or righty and whether the catcher is strong-armed). Then plug in the league average steal rate, and you should have an expected stolen base percentage for your given situation and the corresponding change in run expectancy (RE24).

LINK
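The post does not say how the tool blends the three rates, so the following is only one plausible sketch: the odds ratio method, which scales the runner's odds by how the battery compares with the league. The function names, the battery rate, and the league rate below are all assumptions made for illustration; only the 21/24 projection comes from the text.

```python
def odds(p):
    """Convert a probability into odds in favor."""
    return p / (1.0 - p)


def combined_steal_probability(runner_rate, battery_rate, league_rate):
    """Odds ratio blend: runner odds times battery odds, divided by league odds."""
    combined_odds = odds(runner_rate) * odds(battery_rate) / odds(league_rate)
    return combined_odds / (1.0 + combined_odds)


runner_rate = 21 / 24   # Kolten Wong's projected success rate, from the text
battery_rate = 0.60     # hypothetical success rate allowed by a tough battery
league_rate = 0.73      # hypothetical league average success rate

expected = combined_steal_probability(runner_rate, battery_rate, league_rate)
print(f"expected success rate in this matchup: {expected:.3f}")
```

A useful property of this choice is that a league-average battery leaves the runner's own rate unchanged, while a tougher battery pulls it down.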

Billy Butler In: The Good, The Slightly Above Average, And The Ugly

by Danny Sader

September 10, 2014

For the past two years or so, Kansas City has been torn about breakfast… Billy “Big Country Breakfast” Butler, that is. During this past offseason there were many rumors that the Royals were going to trade him, and it seemed inevitable once they entered talks with then-free agent Carlos Beltran. Billy Butler is part of the home-grown youth movement in Kansas City along with Alex Gordon, later followed by Salvy Perez, Mike Moustakas, Eric Hosmer, and company. From 2009 through 2013, Billy Butler was above average offensively, and even great! However, after failing to meet expectations last year, and in some opinions already being in decline at the age of 28, Billy came out and struggled mightily to start the 2014 season.

But he has turned it around somewhat, and with the Royals making headlines this August, Big Country played a big part. So I wanted to look at what he did differently, comparing his April dud to his career average and to his return to stud form in August. We will measure his overall offensive prowess with wRC+, which in this study was 50 in March/April, 118 for his career, and 126 in August. So let’s look at the more telling process stats.

Split	BB%	K%	BB/K	BABIP	GB/FB	LD%	GB%	FB%	HR/FB
April	8.3%	18.3%	0.45	0.275	2.82	18.8%	60.0%	21.3%	0.0%
Career Average	8.9%	14.4%	0.62	0.325	1.51	19.9%	48.3%	31.9%	11.1%
August	5.8%	13.2%	0.44	0.308	1.35	23.2%	44.2%	32.6%	12.9%

One of the first things to pop out at you is the BB/K ratio. While under his career mark (and by a decent margin too), his BB/K ratio is nearly identical in April and August. A lot of times credit for a hitter’s success is given to an increase in BB% and a decrease in K%, but in August Butler cut down on both, increasing the number of balls he put into play, which brings us to BABIP. Both his April and August marks are below his career norm. Perhaps he was dealing with a little bad luck? Or just weak contact? Fact is, in August he still created above-average production even with his BABIP down and his home run rate close to his career rate.

Now comes the most telling rate, which is the type of balls that he hits. As an AL DH, Billy Butler is expected not only to hit, but to slug. That big goose egg for HRs in April is just an absolute killer, and the culprit is the GB%. It is no wonder that a big, SLOW (we all know about his base running and uncanny attraction to double plays), gap-to-gap power hitter has one of the worst months of his career when his GB% is up almost 12 percentage points and his FB% is down more than 10. Billy Butler will never be Aoki. He has to get the ball in the air. He lives on hitting doubles into the deep gaps at Kauffman Stadium, and with ratios such as those it is no surprise he put up a wRC+ of 50.

When his BB/K ratio is nearly identical and yet he puts up such drastically different numbers, not to mention the fluctuations in his BABIP, it has to come back to his swing mechanics and getting to a consistently good contact position where he can drive the ball.

Split	O-Swing%	Z-Swing%	Swing%	O-Contact%	Z-Contact%	Contact%	Zone%	F-Strike %	SwStr%
April	30.0%	58.6%	43.7%	77.4%	92.9%	87.4%	48.0%	57.8%	5.5%
Career Average	28.0%	63.0%	44.3%	69.4%	90.0%	83.1%	46.7%	56.0%	7.2%
August	37.8%	62.1%	49.5%	70.1%	91.5%	83.1%	48.2%	71.9%	8.5%

Billy’s discipline at the plate has been waning. But the month he really lacked discipline is the same month he did so well in: August. In April he was within his career norms for all of his discipline stats except O-Contact%. Overall he was swinging less and missing less. And that is where the problem may lie! It is not so much that he was struggling with pitch selection, because clearly he was even worse with discipline in August, but the fact that he didn’t miss when he swung.

In a sense Butler was too good at making contact! With his swinging percentage up along with increasingly bad pitch selection, the higher his swinging strike percentage, the better! And perhaps with his swing percentage, his first pitch strike percentage, and his O-Swing percentage all up, he has changed to a more aggressive approach? Again all of this can lead back to the assumption of Butler making poor contact in April. Which leads to the question of what has he done differently, if anything, with his swing?

Split	Fastball %	Slider %	Cutter %	Curveball %	Changeup %	Splitfinger %
April	52.5%	19.5%	8.5%	10.3%	8.8%	0.5%
Career Average	56.3%	18.1%	5.6%	8.6%	9.9%	1.0%
August	50.4%	22.9%	8.3%	9.5%	8.5%	0.7%

Split	Fastball %	wFB/c	wSL/c	wCT/c	wCB/c	wCH/c	wSF/c
April	52.5%	-2.45	-0.92	0.56	1.96	0.86	-11.47
Career Average	56.3%	1.09	-0.81	0.16	0.29	0.16	-1.45
August	50.4%	2.00	1.89	-1.74	-5.10	-2.11	25.04

Now the main reason I bring these stats up is that I am a huge believer in fastball hunting. These charts may not be the most reliable in telling of pitch selection, but they do tell you if he has been seeing certain pitches better and the rates at which he has been seeing pitches. So I wanted to look closely at his fastball rate in particular just to see if there was anything funky going on. And what was so funky is that in August he was crushing it! The more fastballs you see the better chance you have to hit well. While I am not sure of the exact quantity of fastballs he faced, for the most part he has been seeing the same consistent rate of different pitches he always has and he definitely has done one of his better jobs of taking advantage of the fastballs he has seen. Can a correlation be made between his April failures and August success against fastballs to a possible new approach and/or adjustment in his swing mechanics? Or just unlucky, bad contact?

After searching through the KC Star (hometown newspaper) as well as other media report outlets, I have not been able to find much of anything indicating adjustments being made. There was some talk of just his timing being off, but other than that there are not many clues. I wish I knew how to make video clips of swings and find a couple angles of Billy Butler’s swing in April compared to his swing in August and dissect them both. I would like to see what, if anything, is different. If we could see his timing and especially his bat path, I believe we can tell a lot about what he is doing wrong or right. If anyone can provide those, or teach how to make them, please do and send to me!

However, going off of what I have seen here, everything to me points back to weak contact consistently being made. Whether it is due to timing or mechanics, I am not sure. Normally I would say this is due to poor pitch selection, but as I showed above, he had even worse discipline and pitch selection in August than in April and still put up stellar numbers. To be clear, hard contact alone is not good enough for a player of Billy Butler’s style. He NEEDS to get air under the ball. Now they say that this is a game of adjustments. I would love to know what, if any, adjustments Billy “Big Country Breakfast” Butler has made. After all, could it really have just been a string of bad luck?

Run Distribution Using the Negative Binomial Distribution

by Sean Dolinar

September 9, 2014

In this post I use the negative binomial distribution to better model how MLB teams score runs in an inning or in a game. I wrote a primer on the math of the different distributions mentioned in the post for reference, and this post is divided into a baseball-centric section and a math-centric section.

The Baseball Side

A team in the American League will average .4830 runs per inning, but does this mean they will score a run every two innings? This seems intuitive if you apply math from Algebra I [1 run / 2 innings ~ .4830 runs/inning]. However, if you attend a baseball game, the vast majority of innings you’ll watch will be scoreless. This large number of scoreless innings can be described by discrete probability distributions that account for teams scoring none, one, or multiple runs in one inning.

Runs in baseball are considered rare events and count data, so they will follow a discrete probability distribution if they are random. The overall goal of this post is to describe the random process that arises with scoring runs in baseball. Previously, I’ve used the Poisson distribution (PD) to describe the probability of getting a certain number of runs within an inning. The Poisson distribution describes count data like car crashes or earthquakes over a given period of time and defined space. This worked reasonably well to get the general shape of the distribution, but it didn’t capture all the variance that the real data set contained. It predicted fewer scoreless innings and many more 1-run innings than what really occurred. The PD assumes that the mean and variance are equal. In both runs per inning and runs per game, the variance is about twice the mean, so the real data will ‘spread out’ more than a PD predicts.

The graph above shows an example of the application of count data distributions. The actual data is in gray and the Poisson distribution is in yellow. It’s not a terrible way to approximate the data or to conceptually understand the randomness behind baseball scoring, but the negative binomial distribution (NBD) works much better. The NBD is also a discrete probability distribution, but it finds the probability of a certain number of failures occurring before a certain number of successes. It would answer the question, what’s the probability that I get 3 TAILS before I get 5 HEADS when I continue to flip a coin. This doesn’t at first intuitively seem like it relates to a baseball game or an inning, but that will be explained later.

From a conceptual standpoint, the two distributions are closely related. So if you are trying to describe why 73% of all MLB innings are scoreless to a friend over a beer, either will work. I’ve plotted both distributions for comparison throughout the post. The second section of the post will discuss the specific equations and their application to baseball.

Runs per Inning

Because of the difference in rules regarding the designated hitter between the two leagues, there will be a different expected value [average] and variance of runs/inning for each league. I separated the two leagues to get a better fit for the data. Using data from 2011-2013, the American League had an expected value of 0.4830 runs/inning with a 1.0136 variance, while the National League had an expected value of 0.4468 runs/inning with a 0.9037 variance. [So NL games are shorter and more boring to watch.] Using only the expected value and the variance, the negative binomial distribution [the red line in the graph] approximates the distribution of runs per inning more accurately than the Poisson distribution.

It’s clear that there are a lot of scoreless innings, and very few innings with multiple runs scored. The NBD allows someone to calculate the probability of an MLB team scoring more than 7 runs in an inning or the probability that the home team, down by a run in the bottom of the 9th, forces extra innings. Using a pitcher’s expected runs/inning, the NBD could be used to approximate the pitcher’s chances of throwing a shutout, assuming he pitches all 9 innings.

Runs Per Game

The NBD and PD can be used to describe the runs scored in a game by a team as well. Once again, I separated the AL and NL, because the AL had an expected run value of 4.4995 runs/game and a 9.9989 variance, and the NL had 4.2577 runs/game expected value and 9.1394 variance. This data is taken from 2008-2013. I used a larger span of years to increase the total number of games.

Even though MLB teams average more than 4 runs in a game, the single most likely run total for one team in a game is actually 3 runs. The negative binomial distribution once again modeled the empirical distribution well, but the PD had a terrible fit when compared to the previous graph. Both models, however, underestimate the shut-out rate. A remedy for this is to adjust for zero-inflation. This would increase the likelihood of getting a shut out in the model and adjust the rest of the probabilities accordingly. An inference of needing zero-inflation is that baseball scoring isn’t completely random. A manager is more likely to use his best pitchers to continue a shut out rather than randomly assign pitchers from the bullpen.
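The zero-inflation adjustment described above can be sketched as a simple mixture: with some extra probability the team is shut out outright, and otherwise runs follow the NBD. This is a minimal sketch under assumptions of my own. The mean and variance are the AL per-game figures quoted above, but they are used here as the parameters of the NBD component only, and the `extra_zero` weight of 0.01 is a hypothetical value, not a fitted one.

```python
from math import exp, lgamma, log


def nbd_pmf(k, mean, variance):
    """Negative binomial probability of exactly k runs, from mean and variance."""
    p = mean / variance                     # probability of "success"
    r = mean * mean / (variance - mean)     # (non-integer) number of successes
    log_pmf = (lgamma(r + k) - lgamma(r) - lgamma(k + 1)
               + r * log(p) + k * log(1.0 - p))
    return exp(log_pmf)


def zero_inflated_pmf(k, mean, variance, extra_zero):
    """Mix a point mass at zero (weight extra_zero) with the NBD (weight 1 - extra_zero)."""
    base = (1.0 - extra_zero) * nbd_pmf(k, mean, variance)
    return extra_zero + base if k == 0 else base


AL_MEAN, AL_VAR = 4.4995, 9.9989   # AL runs per game, 2008-2013, from the text
EXTRA_ZERO = 0.01                  # hypothetical inflation weight, for illustration

for runs in range(4):
    plain = nbd_pmf(runs, AL_MEAN, AL_VAR)
    inflated = zero_inflated_pmf(runs, AL_MEAN, AL_VAR, EXTRA_ZERO)
    print(f"{runs} runs: NBD {plain:.4f}, zero-inflated {inflated:.4f}")
```

The shutout probability goes up and every other run total is scaled down by the same factor, so the probabilities still sum to one. In practice the weight and the NBD parameters would be fitted together, since adding zeros changes the mean and variance of the mixture.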

Hits Per Inning

It turns out the NBD/PD are useful with many other baseball statistics like hits per inning.

The distribution for hits per inning is somewhat similar to runs per inning, except the expected value is higher and the variance is lower relative to it. [AL: .9769 hits/inning, 1.2847 variance | NL: .9677 hits/inning, 1.2579 variance (2011-2013)] Since the variance is much closer to the expected value, hits per inning has more values in the middle and fewer at the extremes than the runs per inning distribution.

I could spend all day finding more applications of the NBD and PD, because there are really a lot of examples within baseball. Understanding these discrete probability distributions will help you understand how the game works, and they could be used to model outcomes within baseball.

The Math Side

Hopefully, you skipped down to this section right away if you are curious about the math behind this. I’ve compiled the numbers used in the graphs for the American League for those curious enough to look at examples of the actual values.

The Poisson distribution is given by the equation:

$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

There are two parameters for this equation: expected value [λ] and the number of runs you are looking to calculate [x]. To determine the probability of a team scoring exactly three runs in a game, you would set x = 3 and, using the AL expected runs per game, you’d calculate:

$$P(X = 3) = \frac{4.4995^{3}\, e^{-4.4995}}{3!} \approx 0.169$$

This is repeated for the entire set of x = {0, 1, 2, 3, 4, 5, 6, … } to get the Poisson distribution used throughout the post.

One of the assumptions the PD makes is that the mean and the variance are equal. For these examples, this assumption doesn’t hold true, so the empirical data from actual baseball results doesn’t quite fit the PD and is overdispersed. The NBD accounts for the variance by including it in the parameters.

The negative binomial distribution is usually symbolized by the following equation:

$$P(X = k) = \binom{k + r - 1}{k}\, p^{r} (1 - p)^{k}$$

where r is the number of successes, k is the number of failures, and p is the probability of success. A key restriction is that a success has to be the last event in the series of successes and failures.

Unfortunately, we don’t have a clear value for p or a clear concept on what will be measured, because the NBD measures the probability of binary, Bernoulli trials. It’s helpful to view this problem from the vantage point of the fielding team or pitcher, because a SUCCESS will be defined as getting out of the inning or game, and a FAILURE will be allowing 1 run to score. This will conform to the restriction by having a success [getting out of the inning/game] being the ultimate event of the series.

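In order to make this work the NBD needs to be parameterized differently for mean, variance, and number of runs allowed [failures]. The NBD can be written as

$$P(X = k) = \frac{\Gamma(r + k)}{\Gamma(r)\, k!}\, p^{r} (1 - p)^{k}$$

where

$$p = \frac{\mu}{\sigma^{2}}, \qquad r = \frac{\mu^{2}}{\sigma^{2} - \mu}$$

with μ the mean and σ² the variance. So using the same example as the PD, this would yield:

$$p = \frac{4.4995}{9.9989} \approx 0.450, \qquad r = \frac{4.4995^{2}}{9.9989 - 4.4995} \approx 3.681$$

$$P(X = 3) = \frac{\Gamma(3.681 + 3)}{\Gamma(3.681)\, 3!}\,(0.450)^{3.681}\,(0.550)^{3} \approx 0.144$$

The above equations are adapted from this blog about negative binomials and this one about applying the distribution to baseball. The Γ function is used in the equation instead of a combination operator because the combination operator can’t handle the non-whole numbers we are using to describe the number of successes.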
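As a rough check on the math in this section, both distributions can be computed directly from a mean and a variance. This is a sketch rather than the code behind the graphs: it assumes the standard mean-variance form of the NBD (p = mean/variance, r = mean²/(variance − mean)), and the function names are my own. The log-gamma function stands in for the Γ terms so that non-integer r is handled.

```python
from math import exp, factorial, lgamma, log


def poisson_pmf(x, lam):
    """Poisson probability of exactly x runs given expected value lam."""
    return lam ** x * exp(-lam) / factorial(x)


def nbd_pmf(k, mean, variance):
    """Negative binomial probability of exactly k runs (failures), from mean and variance."""
    p = mean / variance                     # probability of success (ending the game/inning)
    r = mean * mean / (variance - mean)     # number of successes, generally not a whole number
    log_pmf = (lgamma(r + k) - lgamma(r) - lgamma(k + 1)
               + r * log(p) + k * log(1.0 - p))
    return exp(log_pmf)


AL_MEAN, AL_VAR = 4.4995, 9.9989   # AL runs per game, 2008-2013, from the text

print("runs  Poisson  NBD")
for runs in range(9):
    print(f"{runs:>4}  {poisson_pmf(runs, AL_MEAN):.4f}   {nbd_pmf(runs, AL_MEAN, AL_VAR):.4f}")
```

Run on the AL per-game figures, the NBD column peaks at 3 runs while the Poisson column peaks at 4, which lines up with the earlier observation that 3 is the single most likely run total even though teams average more than 4.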

Conclusion

The negative binomial distribution is really useful in modeling the distribution of discrete count data from baseball for a given inning or game. The most interesting aspect of the NBD is that a success is considered getting out of the inning/game, while a failure would be letting a run score. This is a little counterintuitive if you approach modeling the distribution from the perspective of the batting team. While the NBD has a better fit, the Poisson distribution has a simpler concept to explain: the count of discrete events over a given period of time, which might make it better to discuss over beers with your friends.

The fit of the NBD suggests that run scoring is a negative binomial process, but inconsistencies, especially with shut outs, indicate elements of the game aren’t completely random. I’m explaining the underestimation of the number of shut outs as the increased use of the best relievers in shut out games over other games, which increases the total number of shut outs and subsequently decreases the frequency of other run-total games.

All MLB data is from retrosheet.org. It’s available free of charge from there. So please check it out, because it’s a great data set. If there are any errors or if you have questions, comments, or want to grab a beer to talk about the Poisson distribution please feel free to tweet me @seandolinar.

Pitch Win Values for Starting Pitchers — August 2014

by Stats All Folks

September 8, 2014

Introduction

A couple months back, I introduced a new method of calculating pitch values using a FIP-based WAR methodology. That post details the basic framework of these calculations and can be found here. The May, June, and July updates can be found here, here, and here respectively. This post is simply the August 2014 update of the same data. What follows is predominantly data-heavy but should still provide useful talking points for discussion. Let’s dive in and see what we can find. Please note that the same caveats apply as in previous months. We’re at the mercy of pitch classification. I’m sure your favorite pitcher doesn’t throw that pitch that has been rated as incredibly below average, but we have to go off of the data that is available. Also, Baseball Prospectus’s PitchF/x leaderboards list only nine pitches (Four-Seam Fastball, Sinker, Cutter, Splitter, Curveball, Slider, Changeup, Screwball, and Knuckleball). Anything that may be classified outside of these categories is not included. Also, anything classified as a “slow curve” is not included in Baseball Prospectus’s curveball data.

Constants

Before we begin, we must first update the constants used in calculation for August. As a refresher, we need three different constants for calculation: strikes per strikeout, balls per walk, and a FIP constant to bring the values onto the right scale. We will tackle them each individually.

First, let’s discuss the strikeout constant. In August, there were 52,238 strikes thrown by starting pitchers. Of these 52,238 strikes, 4,887 were turned into hits and 15,293 outs were recorded. Of these 15,293 outs, 4,118 were converted via the strikeout, leaving us with 11,175 ball-in-play outs. 11,175 ball-in-play outs and 4,887 hits sum to 16,062 balls in play. Subtracting 16,062 balls in play from our original 52,238 strikes leaves us with 36,176 strikes to distribute over our 4,118 strikeouts. That’s a ratio of 8.78 strikes per strikeout. This is slightly lower than the 8.82 strikes per strikeout in June and July, meaning batters were slightly easier to strike out in August.

The next two constants are much easier to ascertain. In August, there were 28,957 balls thrown by starters and 1,521 walked batters. That’s a ratio of 19.04 balls per walk, down from 19.76 balls per walk in July. This data would suggest that hitters were more likely to walk in August than previously. The FIP subtotal for all pitches in August was 0.48. The MLB Run Average for August was 4.12, meaning our FIP constant is 3.65.

Constant	Value
Strikes/K	8.78
Balls/BB	19.04
cFIP	3.65
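The arithmetic behind these three constants can be laid out as a short sketch. The counts are the August figures quoted above; the variable and function names are my own. The FIP constant is assumed here to be the run average minus the FIP subtotal, which is how the text describes it.

```python
def strikes_per_strikeout(strikes, hits, outs, strikeouts):
    """Strikes not ending in a ball in play, spread over the strikeouts recorded."""
    balls_in_play = (outs - strikeouts) + hits
    return (strikes - balls_in_play) / strikeouts


def balls_per_walk(balls, walks):
    """Balls thrown per walk issued."""
    return balls / walks


# August 2014 totals for starting pitchers, as quoted in the text
strikes, hits, outs, strikeouts = 52_238, 4_887, 15_293, 4_118
balls, walks = 28_957, 1_521
fip_subtotal, mlb_run_average = 0.48, 4.12

k_constant = strikes_per_strikeout(strikes, hits, outs, strikeouts)
bb_constant = balls_per_walk(balls, walks)
fip_constant = mlb_run_average - fip_subtotal

print(f"Strikes/K: {k_constant:.2f}")
print(f"Balls/BB:  {bb_constant:.2f}")
print(f"cFIP:      {fip_constant:.2f}")
```

The first two values match the table. The last comes out at 3.64 from the rounded inputs, a hundredth below the 3.65 listed, which presumably reflects rounding in the quoted subtotal and run average.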

The following table details how the constants have changed month-to-month.

Month	K	BB	cFIP
March/April	8.47	18.50	3.68
May	8.88	18.77	3.58
June	8.82	19.36	3.59
July	8.82	19.76	3.65
August	8.78	19.04	3.65

Pitch Values – August 2014

For reference, the following table details the FIP for each pitch type in the month of August.

Pitch	FIP
Four-Seam	4.03
Sinker	4.17
Cutter	4.14
Splitter	4.48
Curveball	4.21
Slider	4.15
Changeup	4.47
Screwball	2.22
Knuckleball	4.56
MLB RA	4.12

As we can see, only two pitches would be classified as above average for the month of August: four-seam fastballs and screwballs. Sinkers, cutters, and sliders also came in right around league average. Pitchers that were able to stand out in other categories tended to have better overall months than pitchers who excelled at these pitches. Now, let’s proceed to the data for the month of August.

Four-Seam Fastball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Chris Tillman	0.7	183	Sean O’Sullivan	-0.2
2	Jose Quintana	0.6	184	John Danks	-0.2
3	Phil Hughes	0.6	185	Anthony Ranaudo	-0.3
4	Max Scherzer	0.6	186	Jason Hammel	-0.3
5	Madison Bumgarner	0.5	187	Stephen Strasburg	-0.4

Sinker

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Mike Leake	0.5	169	Shelby Miller	-0.2
2	Rick Porcello	0.4	170	Travis Wood	-0.2
3	Kyle Hendricks	0.4	171	Mat Latos	-0.3
4	Dallas Keuchel	0.3	172	Tsuyoshi Wada	-0.3
5	Jimmy Nelson	0.3	173	Kyle Kendrick	-0.3

Cutter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Jarred Cosart	0.6	74	Scott Carroll	-0.1
2	Josh Collmenter	0.4	75	Jorge de la Rosa	-0.1
3	Corey Kluber	0.3	76	J.A. Happ	-0.1
4	James Shields	0.3	77	Kevin Correia	-0.2
5	Jerome Williams	0.2	78	Dan Haren	-0.2

Splitter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Alex Cobb	0.4	26	Miguel Gonzalez	-0.1
2	Mat Latos	0.2	27	Hisashi Iwakuma	-0.1
3	Alfredo Simon	0.1	28	Felix Hernandez	-0.1
4	Hiroki Kuroda	0.1	29	Jorge de la Rosa	-0.1
5	Kyle Kendrick	0.1	30	Tim Hudson	-0.2

Curveball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Alex Wood	0.3	157	James Shields	-0.2
2	Brandon McCarthy	0.3	158	Jesse Hahn	-0.2
3	Adam Wainwright	0.3	159	Max Scherzer	-0.2
4	Clay Buchholz	0.2	160	Zack Greinke	-0.3
5	Scott Feldman	0.2	161	Nick Martinez	-0.3

Slider

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Clayton Kershaw	0.4	123	Dallas Keuchel	-0.2
2	Chris Archer	0.3	124	Scott Baker	-0.2
3	Tyler Matzek	0.3	125	Rubby de la Rosa	-0.2
4	Collin McHugh	0.3	126	Bartolo Colon	-0.2
5	Kyle Gibson	0.2	127	Rafael Montero	-0.2

Changeup

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Chris Capuano	0.4	154	Jon Niese	-0.2
2	Jeremy Guthrie	0.3	155	Henderson Alvarez	-0.2
3	Roberto Hernandez	0.2	156	Zack Greinke	-0.2
4	David Price	0.2	157	Brad Peacock	-0.3
5	Max Scherzer	0.2	158	Brad Hand	-0.4

Screwball

Rank	Pitcher	Pitch Value
1	Trevor Bauer	0.0

Knuckleball

Rank	Pitcher	Pitch Value
1	R.A. Dickey	0.1

Overall

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Alex Cobb	0.9	186	Jason Hammel	-0.2
2	Jordan Zimmermann	0.8	187	Justin Masterson	-0.2
3	Corey Kluber	0.8	188	Sean O’Sullivan	-0.3
4	Jarred Cosart	0.8	189	Kyle Lohse	-0.4
5	Collin McHugh	0.8	190	Brad Hand	-0.4

Pitch Ratings – August 2014

Four-Seam Fastball

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Jose Quintana	59	87	Vance Worley	39
2	Brad Peacock	59	88	Stephen Strasburg	37
3	Michael Pineda	59	89	Justin Masterson	36
4	Phil Hughes	58	90	Anthony Ranaudo	35
5	Franklin Morales	58	91	John Danks	35

Sinker

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Rick Porcello	58	68	Travis Wood	37
2	Jake Arrieta	58	69	Kyle Kendrick	36
3	Gio Gonzalez	57	70	John Lackey	35
4	J.A. Happ	57	71	Mat Latos	35
5	Marcus Stroman	57	72	Tsuyoshi Wada	33

Cutter

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Franklin Morales	58	27	Brandon McCarthy	43
2	Corey Kluber	58	28	Jake Peavy	40
3	James Shields	58	29	Ryan Vogelsong	39
4	Jerome Williams	57	30	Dan Haren	38
5	Tim Hudson	56	31	Kevin Correia	33

Splitter

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Mat Latos	58	7	Matt Shoemaker	50
2	Alex Cobb	56	8	Jake Odorizzi	49
3	Kyle Kendrick	55	9	Jorge de la Rosa	45
4	Tsuyoshi Wada	54	10	Kevin Gausman	42
5	Alfredo Simon	54	11	Hisashi Iwakuma	41

Curveball

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Felix Hernandez	60	66	Dillon Gee	37
2	Brandon McCarthy	58	67	Scott Carroll	37
3	Jacob deGrom	58	68	James Shields	33
4	Brandon Workman	57	69	Jesse Hahn	24
5	Jeremy Hellickson	57	70	Max Scherzer	22

Slider

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Max Scherzer	59	54	Tanner Roark	40
2	Wei-Yin Chen	59	55	Kyle Lohse	38
3	Jordan Zimmermann	59	56	Vance Worley	37
4	Corey Kluber	59	57	Dallas Keuchel	35
5	Tyler Matzek	58	58	Tim Lincecum	27

Changeup

Rank	Pitcher	Pitch Rating	Rank	Pitcher	Pitch Rating
1	Chris Capuano	58	59	Wade Miley	38
2	Roberto Hernandez	58	60	Robbie Ray	36
3	Allen Webster	57	61	Trevor May	32
4	Yohan Flande	57	62	Zack Greinke	28
5	Jeremy Guthrie	57	63	Jon Niese	28

Screwball

Rank	Pitcher	Pitch Rating
1	Trevor Bauer	59

Knuckleball

Rank	Pitcher	Pitch Rating
1	R.A. Dickey	49

Monthly Discussion

As we can see, Alex Cobb takes the top spot for this month mainly due to the strength of his sinker and splitter. Cobb was classified as throwing four different pitches in August (Four-Seam, Sinker, Splitter, and Curveball) and managed to earn at least 0.1 WAR from all four. The most valuable pitch overall in August was Chris Tillman’s Four-Seam Fastball. The least valuable was Stephen Strasburg’s Four-Seam Fastball. As far as offspeed pitches, Chris Capuano’s 0.4 WAR from his changeup led the way. The least valuable offspeed pitch was Brad Hand’s changeup.

On our 20-80 scale pitch ratings, the highest rated qualifying pitch was Felix Hernandez’s curveball. The lowest rated pitch was the curveball thrown by Max Scherzer. The highest rated fastball was Jose Quintana’s four-seam fastball. The lowest rated fastball was Tsuyoshi Wada’s sinker.

Pitch Values – 2014 Season

Four-Seam Fastball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Jose Quintana	2.4	262	Dan Straily	-0.3
2	Ian Kennedy	2.4	263	Edwin Jackson	-0.3
3	Phil Hughes	2.2	264	Masahiro Tanaka	-0.4
4	Jordan Zimmermann	2.1	265	Juan Nicasio	-0.4
5	Chris Tillman	1.9	266	Marco Estrada	-0.7

Sinker

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Charlie Morton	1.7	251	Mike Pelfrey	-0.3
2	Dallas Keuchel	1.4	252	Dan Straily	-0.3
3	Chris Archer	1.3	253	John Danks	-0.3
4	Mike Leake	1.3	254	Wandy Rodriguez	-0.3
5	Felix Hernandez	1.2	255	Andrew Heaney	-0.4

Cutter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Jarred Cosart	1.8	118	Felipe Paulino	-0.2
2	Corey Kluber	1.5	119	C.J. Wilson	-0.3
3	Madison Bumgarner	1.4	120	Dan Haren	-0.3
4	Josh Collmenter	1.4	121	Hector Noesi	-0.4
5	Adam Wainwright	1.3	122	Brandon McCarthy	-0.6

Splitter

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Alex Cobb	1.0	35	Jake Peavy	-0.1
2	Masahiro Tanaka	0.8	36	Franklin Morales	-0.2
3	Hiroki Kuroda	0.7	37	Danny Salazar	-0.2
4	Hisashi Iwakuma	0.5	38	Miguel Gonzalez	-0.3
5	Kyle Kendrick	0.4	39	Clay Buchholz	-0.3

Curveball

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Sonny Gray	1.1	225	Homer Bailey	-0.2
2	A.J. Burnett	1.1	226	Josh Collmenter	-0.2
3	Brandon McCarthy	1.0	227	Franklin Morales	-0.3
4	Adam Wainwright	1.0	228	Felipe Paulino	-0.3
5	Felix Hernandez	0.8	229	Eric Stults	-0.5

Slider

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Garrett Richards	1.5	192	Liam Hendriks	-0.2
2	Tyson Ross	1.2	193	Rafael Montero	-0.3
3	Chris Archer	1.0	194	Danny Salazar	-0.3
4	Corey Kluber	1.0	195	Erasmo Ramirez	-0.4
5	Jordan Zimmermann	1.0	196	Travis Wood	-0.5

Changeup

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Felix Hernandez	0.8	245	Wandy Rodriguez	-0.4
2	Stephen Strasburg	0.8	246	Jordan Zimmermann	-0.4
3	Roberto Hernandez	0.7	247	Matt Cain	-0.4
4	Cole Hamels	0.7	248	Marco Estrada	-0.6
5	Chris Sale	0.6	249	Drew Hutchison	-0.7

Screwball

Rank	Pitcher	Pitch Value
1	Trevor Bauer	0.1
2	Alfredo Simon	0.0
3	Hector Santiago	0.0
4	Julio Teheran	0.0

Knuckleball

Rank	Pitcher	Pitch Value
1	R.A. Dickey	1.3
2	C.J. Wilson	0.0

Overall

Rank	Pitcher	Pitch Value	Rank	Pitcher	Pitch Value
1	Corey Kluber	3.7	270	David Holmberg	-0.4
2	Adam Wainwright	3.6	271	Felipe Paulino	-0.5
3	Garrett Richards	3.5	272	Juan Nicasio	-0.5
4	Jose Quintana	3.4	273	Wandy Rodriguez	-0.8
5	Felix Hernandez	3.3	274	Marco Estrada	-1.2

Year-to-Date Discussion

If we look at the year-to-date numbers, Indians ace and Cistulli favorite Corey Kluber has claimed the top spot. Current MLB FIP and WAR leader Clayton Kershaw ranks eighth, with every pitcher ranked above him having made at least three more starts. The least valuable starter has been Marco Estrada. On a per-pitch basis, the most valuable pitch has been Jose Quintana’s four-seam fastball. The most valuable offspeed pitch has been Garrett Richards’s slider. The least valuable pitch has been Marco Estrada’s four-seam fastball. The least valuable offspeed pitch has been Drew Hutchison’s changeup.

Cat Days of Summer: The Tigers and Schedule Effects

by A. R.

September 8, 2014

If you’ve been on the internet in the last few weeks (or within earshot of a Michigander) you may have heard about the Tigers. Specifically, you may have heard about how the odds in favor of a Detroit appearance in the 2014 ALDS dropped from 21-to-1 on July 25 to under break-even by August 23 before a slight rebound to finish out the month. Even more specifically, you may have read Mike Petriello’s article about that on this very website. Or at the very least, you may have heard their struggles described in a less quantitative fashion. Regardless, the month of August was not kind to the Bengals.

As Petriello pointed out, this has been less of a Tigers collapse than a Royals surge. But there’s still something to the idea that the Tigers were playing worse in August than they had been previously. Let’s start with the basics:

2014	First Half	August
R/G	4.80	4.58
RA/G	4.25	4.74
W%	.582	.516
Pythagenpat	.557	.484

In August, the Tigers scored fewer runs, allowed more runs, and won fewer games than in the first half. On some level, that’s all that really matters. On another level, something else is different about August for these Tigers.
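
For readers who want to check the Pythagenpat row, here is a minimal sketch in Python. It assumes the common form of the formula, where the exponent is total runs per game raised to a constant; the constant 0.287 is an assumption (the table does not say which variant was used), and the function name is just for illustration.

```python
def pythagenpat(runs_per_game, runs_allowed_per_game, k=0.287):
    """Expected winning percentage from per-game run rates.

    k=0.287 is an assumed constant; published variants differ slightly.
    """
    # The exponent grows with the run environment (total runs per game).
    exponent = (runs_per_game + runs_allowed_per_game) ** k
    scored = runs_per_game ** exponent
    allowed = runs_allowed_per_game ** exponent
    return scored / (scored + allowed)


# R/G and RA/G taken from the table above.
first_half = pythagenpat(4.80, 4.25)
august = pythagenpat(4.58, 4.74)
print(f"First half: {first_half:.3f}")
print(f"August:     {august:.3f}")
```

Under that assumption the two values should land very close to the .557 and .484 shown in the table.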

Back on July 14, Buster Olney and Jeff Sullivan both wrote articles about schedule strength. Olney called the Tigers’ schedule the second-most difficult of 17 “contending” teams (paywall), while Sullivan said it was the easiest in all of MLB. One of the key reasons for the discrepancy was that Sullivan was using projections to determine the difficulty of a particular opponent, while Olney was using actual results. Score one for Sullivan. Another key difference was that as of July 14, the Tigers were about to play 55 games in 56 days, which did not factor into Sullivan’s analysis.

A point for Olney? Perhaps. But first, what would we expect to see if this were a result of schedule fatigue? Or put another way, which groups of players might be hurt most or least by not having a day off? Based on conventional wisdom, the bullpen would probably be the most affected, and the starters the least. So how does this match up to the Tigers?

Mike Trout and the MVP

by Mike Pozar

September 4, 2014

In 2012 and 2013, Mike Trout was considered by most in the sabermetric community to be the most valuable player in the American League. That Miguel Cabrera ended up winning in both years was the source of much debate and consternation, to say the least. Analytically-inclined fans and writers were fed up, frustrated, and outright angry with the “old school” writers voting for Cabrera based on a different set of values. Now, in an amusing twist, it appears that this year Trout has his best chance yet to wind up with the award, in large part by having a season that is less aligned with what the sabermetric community values, and more aligned with what the majority of the voting population values. I took a look at the changes in various aspects of Trout’s game and analyzed how the regressions/improvements will impact his candidacy, based on what voters traditionally have cared about.

Defense

A large part of Trout’s previous MVP candidacy (particularly in 2012) centered on his defense — an area that traditionally has had fewer metrics to quantify a player’s value (as compared to say, hitting). In 2012, DRS had Trout as worth 21 runs above average; UZR had him at 13.3.

In 2013, Trout’s defensive value declined to the point where he was worth -9 runs by DRS and +4.4 runs by UZR. This discrepancy was a major reason why Baseball-Reference’s DRS-based WAR for Trout was 8.9 while FanGraphs’ UZR-based WAR was 10.5.

This year, Trout’s worth -6 by DRS and -7.2 by UZR.

In actuality, it didn’t take a rocket scientist to predict this regression; Trout’s arm has been consistently slightly below average, and his range ended up over-contributing in 2012 thanks to a handful of plays that broke his way. Interestingly enough, the sabermetric crowd didn’t call any attention to this detail in 2012, choosing instead to use Trout’s defensive numbers to bolster their MVP case; now this year they’re bending over backwards to try to discredit Alex Gordon’s defensive numbers so they can justify giving the MVP to Trout as they’ve hoped to be able to do all season long…but that’s a post for a different day.

Baserunning

Likewise in 2012, Trout’s baserunning was valued at 12 runs above average, which included his other-worldly 49 SB and 5 CS. In 2013, his baserunning added 8.1 runs, including 33 SB and 7 CS — still a great 82.5% success rate.

This year, Trout’s been worth all of 1.5 runs on the bases, with just 13 SB and 2 CS.

Hitting

Trout’s offense is down slightly, but not nearly to the extent that his defense and baserunning have been. Like his defense, this regression was fairly predictable, given Trout’s unsustainably high BABIP in 2012 and 2013. His OPS is down to 0.934 compared to 0.963 and 0.988 in 2012 and 2013, but he still has plenty else to hang his hat on: he leads the league in total bases; he’s already hit 30 homers, a total he hasn’t surpassed before; and, with 94 RBIs, he’ll easily pass that magical/meaningless 100 threshold soon as well. The voters as a whole still like HRs, RBIs, and round numbers.

Clutch Hitting

In previous years, Trout was criticized (at least by me!) for not getting hits in key situations. Here are Trout’s offensive splits with Bases Empty versus with Runners on Base:

Year	Split	BABIP	OPS	tOPS+
2012	Empty	0.403	0.985
2012	RoB	0.343	0.917	90
2013	Empty	0.399	1.023
2013	RoB	0.339	0.934	90
2014	Empty	0.343	0.916
2014	RoB	0.348	0.944	104

In 2012-2013, he performed significantly worse with runners on. Most folks here would no doubt cling to the notion that this is entirely luck, and that sequencing like this is entirely unpredictable and out of players’ control. I argue that even if so, if we’re talking about how much value a player added to his team in a given year, he’s adding more value in years when he gets clutch hits than in years when he doesn’t. And this year, he’s actually reversed the trend. His 2014 WPA of 5.52 has already exceeded his 2012 and 2013 marks of 5.32 and 4.60.

The Field

Fortunately for Trout this year, there haven’t been many other position players giving him a run for his money. Josh Donaldson has cooled off as expected after a hot start. Alex Gordon’s case is even more heavily dependent on defensive metrics than Trout’s was in 2012, and I don’t see many voters slotting him above Trout. After that, I just don’t see the award going to Robinson Cano or Kyle Seager (the only other 2 AL players in the top 10 for position player WAR as of this writing), unless Cano truly catches fire in September and leads the Mariners to the playoffs. In fact Trout’s best competition for the MVP may well end up being a pitcher (another Mariner, no less!), Felix Hernandez. And we know how hard it is for a pitcher to win the MVP even when his WAR outpaces that of position players (“They only pitch every 5 days!”).

Playoffs?!

Last and perhaps most importantly, I present the Angels’ records and division finishes over the past 3 seasons:

2012: 89-73, 3rd

2013: 78-84, 3rd

2014: 81-53, 1st (through 8/30)

FanGraphs gives the Angels a 99.9% chance of making the playoffs. In fact, as of this writing, no other team in baseball has more than 78 wins, while the Angels have 81. This should finally appease the “MVPs should lead their team to the playoffs” voters.

The Vote

So Trout’s hitting is slightly down and his defense and baserunning are way down from when he had his previous “MVP-caliber” seasons. Fortunately for Trout, the voters by and large don’t value defense and baserunning as much as they probably should (though that’s starting to change, albeit slowly). And as for hitting being down, 2014 Trout is doing more of what they value: hitting homers and driving in runs. The only thing that might work against him is if he doesn’t bat .300 (he’s at .290 as of now), and the voters like nice round numbers (and they value BA over newfangled mumbo-jumbo like OBP and OPS). Overall though, with the Angels in line for their first playoff spot since 2009 and no other traditional MVP-makeup players in the field, Trout seems like a shoo-in.

Criteria	As Compared to 2012-2013	Do Voters care?
Defense	Way Down	Not much
Baserunning	Way Down	Not much
Overall Hitting	Somewhat down	Somewhat
HRs, RBIs	Up	Yes
Playoffs	Angels in much better position	Yes
Field	Not as many standouts as 2012-2013 (Alex Gordon != Miguel Cabrera)	Yes

So there you have it: Trout will win the AL MVP award for all the wrong reasons.

The Search for a Good Approach

by Andrew Patrick

September 3, 2014

Last week I explored the strategic effect of seeing more pitches per plate appearance. I love the ten-pitch walk as much as the next guy, but what I love even more is seeing a guy be able to change that approach to beat a scouting report. Let’s take a look at June 5, 2014, when the A’s went to see Masahiro Tanaka for the first time. The first batter is Coco Crisp:

Pitcher
M. Tanaka
Batter
C. Crisp
	Speed	Pitch	Result
1	91	Sinker	Ball
2	90	Sinker	Ball
3	91	Fastball (Four-seam)	Ball
4	90	Fastball (Four-seam)	Called Strike
5	91	Fastball (Four-seam)	Foul
6	92	Fastball (Four-seam)	In play, out(s)

So Crisp doesn’t get the best of Tanaka, but he makes Tanaka labor a bit through six pitches. If you’re going to make an out to start the game, it might as well be a long one. For the next batter, John Jaso, Tanaka decides to go right after him:

Pitcher
M. Tanaka
Batter
J. Jaso
	Speed	Pitch	Result
1	90	Sinker	In play, run(s)

I may be looking too deeply into the narrative here, but I love to imagine Tanaka getting a bit frustrated here. Perhaps the scouting report said that Coco is aggressive early, while Jaso’s 15% walk rates in 2012 and 2013 suggest that he’s more patient. Tanaka has to throw six pitches in order to get Crisp out, but after deciding to go right after Jaso, he gets taken deep.

So I wondered if there are players who are able to fulfill both ends of this spectrum. Are there any players that are capable of prolonging their time at the plate until they see the pitch they want, but are also aggressive and willing enough to hit the gas on the first pitch? I used FanGraphs for the pitches/plate appearance data, but used baseball-reference’s play index to look up all instances of first-pitch hits this season. Originally I was going to use first-pitch swings, but I decided to just stick to times when the pitcher gets punished for trying to get ahead early. After all, if your decision is to get ahead early in the count, and the guy swings but all he does is foul it off or hit into an out, then that doesn’t change your approach as a pitcher. I wanted to see guys whom the book isn’t written on yet. Advance Warning: These stats will be about a week old by the time you see them, as I am a slow, slow man.

Best P/PA Rank + FPH Rank (I have no idea how to pitch to them)	FPH%	P/PA	FPHR	PPAR	FPHR + PPAR	wOBA
Scott Van Slyke	5.940594059	4.143564356	26	45	71	0.385
Eric Campbell	4.2424242424	4.248520710	117	18	135	0.326
Jesus Guzman	4.294478528	4.17791411	111	33	144	0.247
Daniel Murphy	4.577464789	4.111842105	87	58	145	0.305
Joey Votto	4.044117647	4.334558824	135	12	147	0.359
Mark Reynolds	5.037783375	4.0375	59	91	150	0.307

(For Reference: FPH% = First Pitch Hit Percentage, or how often a batter gets a hit on the first pitch they see. P/PA = Pitches per Plate Appearance. FPHR = First Pitch Hit Ranking, or how they rank in this category compared to the rest of the league. PPAR = Pitches per Plate Appearance Ranking. FPHR + PPAR = The addition of these two numbers.)
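
The combined score in the last rank column is just the two league ranks added together, with lower totals meaning a hitter is good at both things. A minimal sketch in Python, using a few (FPHR, PPAR) pairs copied from the table above; the function and variable names are made up for illustration:

```python
# (FPHR, PPAR) pairs copied from the table above; a lower rank is better.
ranks = {
    "Scott Van Slyke": (26, 45),
    "Jesus Guzman": (111, 33),
    "Daniel Murphy": (87, 58),
    "Joey Votto": (135, 12),
    "Mark Reynolds": (59, 91),
}


def combined_rank(fphr, ppar):
    """Sum of the first-pitch-hit rank and the pitches-per-PA rank."""
    return fphr + ppar


# Sort hitters from the best (lowest) combined rank to the worst.
leaderboard = sorted(ranks, key=lambda name: combined_rank(*ranks[name]))
for name in leaderboard:
    print(f"{name:16s} {combined_rank(*ranks[name])}")
```

Adding ranks treats a one-spot gap the same everywhere on the list, which is crude but keeps the two categories on equal footing.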

I like this table! I have wondered at times what has caused Scott Van Slyke’s resurgence this year. Perhaps this table gives us a bit of a clue. Van Slyke is the only person in MLB to rank in the top 50 in both FPHR and PPAR. That’s pretty neat. Daniel Murphy is also quite balanced, but he’s been much more consistent over the last few years. He’s particularly interesting in that he doesn’t have a particularly high walk rate or strikeout rate. I guess he’s just selective at times. Jesus Guzman’s presence on this list goes to show that a good approach doesn’t necessarily mean success; it just means that he may not head back to the bench in any predictable fashion. I stretched out the table one spot to include Mark Reynolds, because his name on this table makes me feel better about drafting him in Fantasy Baseball for the past five years.

I also wanted to look at the flip-side. Who are the guys who don’t tend to take a lot of pitches, but also don’t tend to make any decent contact on first pitches?

Highest P/PA Rank + FPH Rank (Pick your poison)	FPH%	P/PA	PPAR	FPHR	FPHR+PPAR	wOBA
Joaquin Arias	0.6451612903	3.55483871	370	400	770	0.221
Ben Revere	1.629327902	3.563636364	365	368	733	0.307
Endy Chavez	0.9345794393	3.674311927	321	393	714	0.301
Conor Gillaspie	2.168674699	3.587112172	359	329	688	0.353
Jean Segura	2.564102564	3.42462845	396	289	685	0.262

Here we have a much less impressive list. Joaquin Arias has been one of the worst hitters in the majors this year, and his dominance atop this leaderboard makes a bit of sense. However, Conor Gillaspie is having an excellent season for the Pale Hose, despite the fact that he doesn’t seem to excel in either of the areas this article is interested in. One peculiar note is that this group is pretty poor at hitting for power in general; these 5 guys have 13 home runs between them on the year, and six of those are Gillaspie’s.

So now let’s look at the weird ones. I would think it stands to reason that if there are certain players who tend to take a lot of pitches and who also never seem to square up the first pitch, then we know our game plan: get ahead early on these batters. We can try to view that by simply looking at each player’s FPH ranking minus their PPA ranking. This is the same as looking at the absolute value of their PPAR minus their FPHR. Here are the top five in that respect:

Worst in FPHR, Best in PPAR (Groove it Early)	FPH%	P/PA	FPHR	PPAR	FPHR-PPAR	wOBA
Jason Kubel	1.136363636	4.471590909	387	4	383	0.278
Aaron Hicks	0.641025641	4.224358974	401	21	380	0.286
Mike Trout	1.217391304	4.418965517	385	6	379	0.401
Matt Carpenter	1.376936317	4.357264957	380	8	372	0.343
A.J. Ellis	1.181102362	4.255813953	386	17	369	0.264

Golly; I’ve figured out Mike Trout! Mike Trout ranks very highly on our list of PPAR but unfortunately sits near the bottom of the league (385th in FPHR) when it comes to the first-pitch punish. All of these guys actually fit this mold. We have three relatively poor hitters accompanied by the best player in baseball and an above average infielder on a winning team. So we can tell that being patient isn’t necessarily a good or bad thing; it’s just that hitter’s style. Now let’s take a look at the reverse:

Best in FPHR, Worst in PPAR (Don’t throw it in the zone early)	FPH%	P/PA	FPHR	PPAR	PPAR-FPHR	wOBA
Jose Altuve	8.159722222	3.175862069	5	407	402	0.355
Wilson Ramos	7.169811321	3.293680297	6	405	399	0.327
Erick Aybar	6.628787879	3.347091932	12	401	389	0.312
Ender Inciarte	8.360128617	3.471518987	3	391	388	0.284
A.J. Pierzynski	6.413994169	3.391930836	16	399	383	0.283

It’s always satisfying when the data shows what you expect it to. I imagined Jose Altuve as being among the more aggressive hitters, and this shows that at least. Altuve ranks 5th in the league in FPH% and sits near the very bottom (407th) in the PPA category. Interesting to see that this top five is also sorted by wOBA; Altuve is the best hitter on the list, and Pierzynski is the worst. So there’s nothing necessarily wrong with an aggressive approach, but it does give us a clue as to a possible plan of attack.
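
The two "weird" tables above boil down to the gap between a hitter's two ranks. Here is a rough Python sketch of that idea, using rank pairs copied from the tables in this post. The 300-spot threshold and the label wording are assumptions chosen only for illustration, not cutoffs used in the tables.

```python
def rank_gap(fphr, ppar):
    """Signed gap: positive means better at seeing pitches than at first-pitch hits."""
    return fphr - ppar


def approach(fphr, ppar, threshold=300):
    """Label a hitter by the gap between his two ranks.

    threshold is a hypothetical cutoff, picked only for this example.
    """
    gap = rank_gap(fphr, ppar)
    if gap >= threshold:
        return "patient: get ahead early"
    if gap <= -threshold:
        return "aggressive: stay out of the zone early"
    return "balanced"


# (FPHR, PPAR) pairs copied from the tables in this post.
hitters = {
    "Jason Kubel": (387, 4),
    "Mike Trout": (385, 6),
    "Jose Altuve": (5, 407),
    "Scott Van Slyke": (26, 45),
}

for name, (fphr, ppar) in hitters.items():
    print(f"{name:16s} gap={abs(rank_gap(fphr, ppar)):3d}  {approach(fphr, ppar)}")
```

Keeping the sign of the gap, rather than only its absolute value, is what separates the Trout type from the Altuve type.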

So all this is to say, like my last article, that no particular approach is best. One can look to swing at the first pitch, or one can be patient and wait for their pitch to come. That said, everybody does have an approach, and that means they’ve got something they’re not looking for. Stats like FPH and PPAR may just give us more clues as fans as to what teams put together with scouting reports.

So to conclude by going back to our first example, perhaps Tanaka should have read this data before his start against the A’s. Coco ranks 266th in the league in FPHR, but a respectable 76th in PPAR. Conversely, Jaso ranks 80th in the league in FPHR, but just 225th in PPAR. Tanaka might have been better served by going after the aging Crisp and saving his energy for the somewhat aggressive Jaso.

Is Nolan Ryan Overrated by FIP?

by Brad Oremland

August 31, 2014

Nolan Ryan was a singular pitcher. He’s unique in baseball history, so distinct that it’s hard to know where to start. I’m going to begin with the obvious: strikeouts. Nolan Ryan struck out 5,714 batters, 17% more than second-place Randy Johnson. Only 16 pitchers in history recorded half as many strikeouts as Nolan Ryan. He led his league in strikeouts 11 times, the most since Walter Johnson (12).

Ryan also walked the most batters in history — 2,795. Steve Carlton is second on that list, with 1,833. Ryan averaged 4.67 BB/9 and 12.4 BB%. Both figures are higher than anyone else who pitched even half as many innings. Ryan led his league in walks eight times.

Ryan also threw 277 wild pitches, most since 1900. He allowed 757 stolen bases, almost 40% more than second-place Greg Maddux. Ryan led AL pitchers in errors four times, and retired with a ghastly .895 fielding percentage. Joe Posnanski summed up Ryan’s career, “He’s the most extraordinary pitcher who ever lived, I think. But I also think he’s not especially close to the best.”

Nolan Ryan is unique, and it makes him hard to evaluate. Casual fans and the old-school crowd have always worshiped Nolan Ryan. His uniform number was retired by three different teams, and he was the leading vote-getter, among pitchers, for the MLB All-Century Team. He got more than twice as many votes as Walter Johnson. But when you really look at his stats, Ryan doesn’t come off well.

Take wins. Yes, the pitcher win, because this is surprising. In a career that spanned 26 seasons (not including 1966, when he had only one decision), Ryan only led his team in wins 7 times. Actually, it’s 5 times outright — 7 counts two years he tied for the lead. In 11 of his 27 seasons (41%), Ryan had a lower winning percentage than the team. He lost more games (292) than anyone but Cy Young and Walter Johnson. What about ERA? Ryan led his league in ERA twice, but in one of those years, he went 8-16. The other year, strike-shortened 1981, he didn’t lead the league in strikeouts, but did lead the majors in wild pitches (16). His 1.25 WHIP ranks 278th all-time. Ryan never won a Cy Young Award and never finished among the top 10 in MVP voting.

They say a little knowledge is a dangerous thing. When you look at stats like wins and ERA, Ryan looks more like a good pitcher than a great one. He’s almost a compiler, just a guy who played forever, rather than a true standout. Then you look at FIP. Ryan had a FIP of 2.97 (84 FIP-), and he pitched 5,386 innings, giving him 106.6 WAR. By FIP, Nolan Ryan is the 6th-most valuable pitcher of all time: Roger Clemens, Cy Young, Walter Johnson, Greg Maddux, Randy Johnson, Nolan Ryan.

I suspect the percentage of FanGraphs readers who believe Nolan Ryan was one of the six best pitchers ever is south of 5%, maybe less than 1%. He rates considerably worse by RA9-WAR, 89.5 instead of 106.6, 25th all-time. Even that would seem high to many stat-oriented fans. It’s better than Bob Feller, basically equal to Pedro Martinez. Ryan also ranks 20th in rWAR (83.8), again much lower than when judged by FIP.

I gave this post a stupid title, with an obvious answer. Is Nolan Ryan overrated by FIP? Yes, clearly. His ERA was 20 points higher — in a 28-year, 807-game, 5,400-inning career. I think the numbers stabilize before 5,000 innings. Ryan’s RA9-WAR is 17 points lower than his fWAR, the biggest deficit of any pitcher in history. Ryan is overrated by FIP. That’s not a major revelation. The interesting question is why Nolan Ryan is overrated by FIP — and whether he is underrated by RA and ERA.

The A’s and Hitting With Men On Base

by Mike Pozar

August 30, 2014

Earlier this month I wrote about how the A’s front office is currently outpacing their competition when it comes to roster construction. I focused primarily on how they’ve taken the platoon advantage to another level, loading up on defensively versatile players to allow for day-to-day lineup construction that maximizes the number of plate appearances where their hitters have the platoon advantage. As a result of this, they get 70% of their PAs with the platoon advantage, as compared to the league average of 55%. As part of my investigation into the platoon splits of A’s players, I also noticed another split of interest: offensive performance with runners on base as compared to with the bases empty. After investigation, I’ve concluded that the A’s have identified and targeted players that have higher offensive production with runners on base.

League-wide trends
First, it should be noted that in general, everyone hits better with runners on base. There are two primary reasons for this. The first is sampling bias: if runners are on base, you’re more likely to be facing an inferior pitcher, as such pitchers allow more baserunners and hence face proportionally more batters with runners already on base. Second, the defense is concerned with more than just the current batter. With the bases empty, the defense presumably aligns itself to maximize the chances of getting the batter out (or, more precisely, to minimize the overall output of the batter). With runners on, there are other considerations – ensuring that the runners don’t steal, for example – that change the defensive alignment. As a result, a given ball in play is more likely to be a hit if there are runners on base. League-wide in 2014, the numbers look like this:

	PA	OPS	BAbip	tOPS+
Bases Empty	80375	0.687	0.296	95
Runners on Base	61905	0.725	0.302	106

tOPS+ is a measure of the split, relative to average. Roughly speaking, the above numbers mean that on average, hitters’ OPS is 6% higher (tOPS+ = 106) with runners on base compared to OPS in all scenarios.

Some teams have been better than others when it comes to hitting with runners on base:

Team	OPS (Empty)	OPS (RoB)	OPS Diff	BAbip (Empty)	BAbip (RoB)	BAbip Diff	tOPS+
OAK	0.672	0.789	0.117	0.264	0.306	0.042	118
SEA	0.633	0.740	0.107	0.281	0.312	0.031	118
NYM	0.622	0.713	0.091	0.284	0.288	0.004	116
COL	0.740	0.820	0.080	0.319	0.332	0.013	112
CIN	0.648	0.719	0.071	0.291	0.288	-0.003	112
CLE	0.688	0.756	0.068	0.288	0.304	0.016	111
BAL	0.705	0.771	0.066	0.288	0.310	0.022	111
ATL	0.662	0.716	0.054	0.296	0.317	0.021	109
BOS	0.664	0.713	0.049	0.294	0.297	0.003	108
MIA	0.675	0.724	0.049	0.313	0.318	0.005	108
PHI	0.650	0.694	0.044	0.294	0.290	-0.004	107
CHW	0.700	0.743	0.043	0.308	0.311	0.003	107
LAA	0.717	0.752	0.035	0.290	0.327	0.037	106
PIT	0.710	0.744	0.034	0.302	0.313	0.011	106
CHC	0.666	0.700	0.034	0.300	0.279	-0.021	106
KCR	0.681	0.715	0.034	0.306	0.297	-0.009	106
MIL	0.710	0.740	0.030	0.299	0.295	-0.004	106
ARI	0.677	0.709	0.032	0.293	0.298	0.005	105
SFG	0.670	0.698	0.028	0.283	0.310	0.027	105
WSN	0.691	0.718	0.027	0.303	0.302	-0.001	104
MIN	0.691	0.710	0.019	0.293	0.308	0.015	103
HOU	0.696	0.711	0.015	0.292	0.294	0.002	102
NYY	0.686	0.697	0.011	0.282	0.293	0.011	102
DET	0.750	0.760	0.010	0.309	0.319	0.010	102
TBR	0.698	0.707	0.009	0.298	0.287	-0.011	102
SDP	0.637	0.644	0.007	0.278	0.274	-0.004	102
TEX	0.694	0.689	-0.005	0.308	0.299	-0.009	99
TOR	0.746	0.740	-0.006	0.303	0.291	-0.012	99
LAD	0.726	0.715	-0.011	0.313	0.310	-0.003	99
STL	0.699	0.688	-0.011	0.307	0.290	-0.017	98

Here, tOPS+ is the measure of the split relative to that team’s average. So for example, the Tigers’ OPS with Runners on Base (RoB) is 0.760, vs. 0.750 with Bases Empty for a tOPS+ of 102. The Reds on the other hand have a split of 0.648 vs. 0.719 for a tOPS+ of 112. The Tigers are a better offensive team overall than the Reds, but the Reds’ split with runners on base is larger.

The A’s
The A’s and Mariners top the list as having the largest split with runners on base. Let’s take a look at the A’s individual players and how they perform with RoB:

Name	PA	OPS	BAbip	tOPS+
Josh Donaldson	242	0.953	0.318	138
Brandon Moss	239	0.933	0.348	130
Yoenis Cespedes	208	0.798	0.310	114
Jed Lowrie	202	0.563	0.250	69
Alberto Callaspo	183	0.656	0.264	116
Derek Norris	147	0.878	0.316	109
John Jaso	144	0.842	0.351	120
Coco Crisp	143	0.857	0.333	130
Josh Reddick	134	0.837	0.283	122
Eric Sogard	115	0.587	0.258	108
Stephen Vogt	96	0.887	0.338	106
Nick Punto	94	0.679	0.368	135
Craig Gentry	85	0.676	0.333	116

Again, the tOPS+ column represents how well the player performs with runners on base relative to that player’s average performance. We can see that across the board, with the notable exception of Jed Lowrie, all the A’s have been performing better with runners on this year.

Now typically this is where you’d say the A’s are just getting lucky, and expect them to regress to the mean. Certainly some regression is expected, but I’m not sold on the idea that this is entirely luck-driven. We know that there are some players who routinely and consistently perform better with runners on base – sometimes dramatically so. Let’s take a look at these players’ career numbers to see if they might be such players:

Name	PA	OPS	BAbip	tOPS+
Donaldson – Empty	861	0.701	0.259	74
Donaldson – RoB	675	0.945	0.351	134
Moss – Empty	1084	0.737	0.263	85
Moss – RoB	944	0.864	0.348	117
Cespedes – Empty	844	0.746	0.277	90
Cespedes – RoB	768	0.824	0.304	111
Lowrie – Empty	1338	0.732	0.283	98
Lowrie – RoB	1096	0.756	0.299	104
Callaspo – Empty	2045	0.678	0.281	92
Callaspo – RoB	1580	0.741	0.287	110
Norris – Empty	471	0.694	0.292	87
Norris – RoB	390	0.813	0.309	116
Jaso – Empty	940	0.702	0.275	85
Jaso – RoB	697	0.835	0.308	120
Crisp – Empty	3609	0.742	0.298	100
Crisp – RoB	2237	0.739	0.291	100
Reddick – Empty	992	0.761	0.291	109
Reddick – RoB	820	0.692	0.249	89
Sogard – Empty	488	0.591	0.253	91
Sogard – RoB	362	0.654	0.274	112
Vogt – Empty	206	0.716	0.288	93
Vogt – RoB	183	0.773	0.300	107
Punto – Empty	2087	0.633	0.298	96
Punto – RoB	1627	0.664	0.298	106
Gentry – Empty	549	0.692	0.350	98
Gentry – RoB	432	0.709	0.325	103

Almost all of them have put up large splits with runners on. Of course, it can take upwards of 1000 PAs for something like BABIP to stabilize (and even then you still need to account for regression to the mean), and many of these players aren’t at that threshold. Nevertheless, taking these players’ careers in aggregate gives us 27,000 plate appearances; across these, the players show an increase of 14 points of BABIP and 53 points of OPS with runners aboard. When compared to league average (6 points of BABIP and 38 points of OPS), it really looks like the A’s are targeting players that have some inherent, non-random ability to perform better with runners on base (to a greater extent than average).
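
The aggregate OPS figure can be roughly reproduced from the career table above. The sketch below is in Python and assumes the aggregate is a PA-weighted average of each player's OPS; that is an approximation (OPS is not strictly a per-PA stat), and the post does not say exactly how the aggregate was computed. The names `CAREER` and `pa_weighted` are made up for this example.

```python
# (player, empty PA, empty OPS, RoB PA, RoB OPS), copied from the career table above.
CAREER = [
    ("Donaldson", 861, 0.701, 675, 0.945),
    ("Moss", 1084, 0.737, 944, 0.864),
    ("Cespedes", 844, 0.746, 768, 0.824),
    ("Lowrie", 1338, 0.732, 1096, 0.756),
    ("Callaspo", 2045, 0.678, 1580, 0.741),
    ("Norris", 471, 0.694, 390, 0.813),
    ("Jaso", 940, 0.702, 697, 0.835),
    ("Crisp", 3609, 0.742, 2237, 0.739),
    ("Reddick", 992, 0.761, 820, 0.692),
    ("Sogard", 488, 0.591, 362, 0.654),
    ("Vogt", 206, 0.716, 183, 0.773),
    ("Punto", 2087, 0.633, 1627, 0.664),
    ("Gentry", 549, 0.692, 432, 0.709),
]


def pa_weighted(rows, pa_idx, value_idx):
    """Average of a rate stat across players, weighted by plate appearances."""
    total_pa = sum(row[pa_idx] for row in rows)
    return sum(row[pa_idx] * row[value_idx] for row in rows) / total_pa


total_pa = sum(row[1] + row[3] for row in CAREER)
empty_ops = pa_weighted(CAREER, 1, 2)
rob_ops = pa_weighted(CAREER, 3, 4)
gap_points = round((rob_ops - empty_ops) * 1000)

print(f"Total PA: {total_pa}")
print(f"Empty OPS: {empty_ops:.3f}  RoB OPS: {rob_ops:.3f}  gap: {gap_points} points")
```

Weighting by PA keeps a small-sample player like Vogt from counting as much as Crisp, which matters given the BABIP stabilization caveat above.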

A quick look at the Mariners
The other team leading the league in the split is the Mariners. What’s going on there? A look at the individual players’ splits shows:

Name	PA	OPS	BAbip	tOPS+
Robinson Cano	221	1.032	0.327	137
Kyle Seager	219	0.905	0.336	120
Dustin Ackley	177	0.702	0.310	104
Mike Zunino	167	0.640	0.247	88
Brad Miller	139	0.619	0.293	108
Justin Smoak	117	0.697	0.268	119
James Jones	115	0.634	0.366	112
Logan Morrison	106	0.671	0.244	106
Corey Hart	101	0.580	0.269	97

The two biggest contributors, by far, are Cano and Seager. If a genie were to give you one very specific wish which was, you get to pick 2 players on your team to magically perform dramatically better with runners on base, you’d want to pick the 2 guys who a) are clearly the best hitters on your team and b) get the most plate appearances. For the Mariners, that’s Cano and Seager.

Here, I absolutely expect regression to the mean. I don’t think the Mariners keep this up. In fact, looking at Cano’s career numbers (over 6000 PAs), he’s actually been better with the bases empty: OPS of 0.873 vs. 0.845, and BABIP of 0.335 vs. 0.313, but for some reason so far this year he’s been far better with runners on.

What does it all mean?
The A’s have figured it out. The Mariners have been lucky. The Mariners will regress heavily to the mean for the remainder of the season. The A’s might regress somewhat, but they’re on to something. By building a roster of players that are more productive with runners on base, they score more runs.

This explains why the A’s are outperforming their Expected Runs, or BaseRuns. BaseRuns predicts how many runs a team scores based purely on their aggregate totals (hits, homers, total bases, etc.), removing all sequencing from the picture entirely. Based on BaseRuns, FanGraphs says they “should have” only scored 4.54 runs per game, when they’ve actually been scoring 4.82 runs per game. If we can do a better job quantifying how much of this sequencing is luck-based versus skill-based, we can do a better job projecting run scoring, and by extension, win percentages.

Baseball’s 10 Most Unusual Hitters

by Foster Honeck

August 30, 2014

Baseball, more than any other major team sport, has the reputation for having the least athletic athletes. Jose Molina is obligated to, at times, sprint. Jorge de la Rosa must swing a baseball bat. David Ortiz sometimes has to play in the field. Having skills like catcher defense, pitching, and hitting with power will earn you playing time, and many players have such elite strengths that it’s worth it just to deal with those weaknesses. So many of baseball’s skills are unrelated that players have to spend a lot of time doing things they aren’t good at, at least relative to other MLB talent. A good way to make anyone look unathletic is to make them perform a long list of skills that have little to do with one another and compare them to the best in the world at those tasks.

I wanted to assemble a list of players who experienced something like this phenomenon the most frequently. Essentially, I wanted to see which players’ strengths and weaknesses were the farthest apart. To determine those players whose skills varied the most, I gathered what I consider to be the six stats that best describe a player’s strengths and weaknesses: BABIP and K% for contact, BB% for discipline, ISO for power, and Fielding and Baserunning values. I then gathered stats from 2011-2014 to better control for less reliable fielding metrics, assigned each player’s stats a percentile rank, and calculated the standard deviation of those six percentile ranks for each player.
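
As a rough illustration of that method, here is a short Python sketch. Everything in it is an assumption made for the example: the three players and their numbers are invented, K% is treated as "lower is better" when ranking, and the population standard deviation is used because the post does not say which form was calculated.

```python
from statistics import pstdev

STATS = ["BABIP", "K%", "BB%", "ISO", "Fld", "BsR"]
LOWER_IS_BETTER = {"K%"}  # assumption: fewer strikeouts earn a higher percentile


def percentile_ranks(values, lower_is_better=False):
    """Percentile rank (0-100) of each value within the list."""
    n = len(values)
    ranks = []
    for x in values:
        if lower_is_better:
            beaten = sum(v > x for v in values)
        else:
            beaten = sum(v < x for v in values)
        ranks.append(100.0 * beaten / (n - 1))
    return ranks


def skill_spread(players):
    """Standard deviation of each player's six percentile ranks."""
    names = list(players)
    per_player = {name: [] for name in names}
    for stat in STATS:
        column = [players[name][stat] for name in names]
        pct = percentile_ranks(column, stat in LOWER_IS_BETTER)
        for name, p in zip(names, pct):
            per_player[name].append(p)
    return {name: pstdev(p) for name, p in per_player.items()}


# Hypothetical players, invented purely to exercise the functions.
players = {
    "Slugger": {"BABIP": 0.300, "K%": 28.0, "BB%": 12.0, "ISO": 0.250, "Fld": -10.0, "BsR": -5.0},
    "Speedster": {"BABIP": 0.320, "K%": 12.0, "BB%": 5.0, "ISO": 0.080, "Fld": 10.0, "BsR": 8.0},
    "Average Joe": {"BABIP": 0.310, "K%": 20.0, "BB%": 8.0, "ISO": 0.150, "Fld": 0.0, "BsR": 0.0},
}

spread = skill_spread(players)
for name, value in sorted(spread.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {value:5.1f}")
```

A player who sits at the same percentile in every category gets a spread of zero, while one who is elite in some skills and poor in others gets a large one.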

For instance, Mike Trout’s attributes look like this:

Mike Trout