Archive for Research

Vertical Command – Or Lack Thereof

I read a great book by Mike Stadler called The Psychology of Baseball. In it he notes that it is far more difficult for humans to control where a ball ends up vertically (due to the need for advanced spatial reasoning) than horizontally. You can find his discussion starting on page 86. Amazon Link

I’m going to show you three pictures which will illustrate this quite well. Data is inclusive of all pitches thrown in regular season games since 2010. The first is a heat map of sorts which maps vertical distance from the center of the zone (from PITCHf/x data sz_top and sz_bottom) on the y axis and velocity on the x axis. What we see quite clearly is that it is *much* better to throw a four-seam fastball up in the zone than down in the zone, almost irrespective of velocity. In fact, a 92 MPH four-seam fastball thrown 0.8 feet above the center of the zone will get about 13% swings and misses; a 98 mph four-seam fastball thrown below the center of the zone will get 12% swings and misses. Behold the graph, from a fan:

Four-Seam Fastball, Depth x Velocity
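For anyone who wants to rebuild this kind of grid, here is a minimal sketch of the binning, assuming each pitch is a dict with `pz` (height at the plate), `sz_top`, `sz_bottom`, `start_speed` and a `swinging_strike` flag. Those key names and the 0.2-foot / 1-MPH bin sizes are my assumptions for illustration, not necessarily what the chart above used, and the sample rows are made up.

```python
import math
from collections import defaultdict

def zone_relative_height(pz, sz_top, sz_bottom):
    """Height of the pitch in feet above the vertical center of the strike zone."""
    return pz - (sz_top + sz_bottom) / 2.0

def whiff_rate_by_bin(pitches, height_step=0.2, velo_step=1.0):
    """Swinging strikes / total pitches for each (height bin, velocity bin)."""
    totals = defaultdict(int)
    whiffs = defaultdict(int)
    for p in pitches:
        rel = zone_relative_height(p["pz"], p["sz_top"], p["sz_bottom"])
        # Integer bin indices: height bin 4 covers 0.8 to 1.0 ft above center.
        key = (math.floor(rel / height_step),
               math.floor(p["start_speed"] / velo_step))
        totals[key] += 1
        if p["swinging_strike"]:
            whiffs[key] += 1
    return {key: whiffs[key] / totals[key] for key in totals}

# Made-up sample rows, only to show the shape of the input.
sample = [
    {"pz": 3.35, "sz_top": 3.5, "sz_bottom": 1.5, "start_speed": 92.3, "swinging_strike": True},
    {"pz": 3.35, "sz_top": 3.5, "sz_bottom": 1.5, "start_speed": 92.8, "swinging_strike": False},
    {"pz": 2.00, "sz_top": 3.5, "sz_bottom": 1.5, "start_speed": 98.1, "swinging_strike": False},
]
rates = whiff_rate_by_bin(sample)
```

The same function gives HR% if you swap the flag for a home-run indicator, since both rates use total pitches as the denominator.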

The question then becomes, if a pitcher throws the ball up in the zone, how will the probability of a HR change? This brings us to picture #2, where we have the same x and y axes (apparently that’s the plural of axis, thanks, Google), but instead we have HR% (# of HRs/Total Pitches). I’ve removed pitches of 99+ MPH from the graph as they were displaying small-sample-size (SSS) noise.

HR% by Depth and Velocity

So, interestingly, the totals on the right show that HRs are NOT hit on high fastballs, but rather on fastballs closer to the heart of the zone (vertically). In fact (and a story for another day), distance from the center of the zone and HR% have an R-squared of 97%. As an aside, this also reproduces other research which indicates that faster fastballs yield fewer home runs. The trend is also quite linear (don’t have a computed R2 for that, but that’s old news anyway).

Now, if throwing up in the zone makes you far more likely to get a swinging strike without putting you at risk for a home run, a distribution of four-seam fastballs should show a higher proportion of four-seamers up in the zone, ideally right at the top, 0.8 to 1.0 feet above the center of the zone, where whiffs are plentiful and HRs are scarce. Beware SSS in some of the higher velocities, but note that a 95 MPH fastball only .4 feet above the center of the zone will yield more HRs than an 88 MPH fastball thrown at the top of the zone (the 95 MPH fastball will still yield more whiffs, which just goes to show how important command is). This is what we actually see:

A nearly uniform distribution across all velocities, slightly skewed to below the center of the zone. I’m not ready to conclude that pitchers are not capable of pitching up in the zone with four-seam fastballs, it may just be old school “pitch down in the zone” thinking. I still find it astonishing how consistent the data is across the velocity spectrum. It almost appears to me that if a pitcher can simply pitch higher in the zone with a four-seam fastball, they can make their stuff play up a lot, sort of like MadBum:

Still not pitching at the top end of the zone, but definitely skewed higher, with his distribution centered around .3 feet above the heart of the zone.


GB% by Pitch Type and Location

Red = High GB% rate (ground balls / total pitches)
Yellow = Medium ; Green = Low

The size of the circle also represents the magnitude.

Numbers are in Feet, with -X being inside (handedness neutral) and Z being height in feet above the center of the strike zone (as per PITCHf/x strike zone top and bottom). The X is flipped for left handed batters. After I’ve published a few of these, I’ll work on publishing a version to Tableau Public, though not sure how it will perform given the huge underlying data set.
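The handedness flip and the GB% definition can be sketched as below. It assumes the raw horizontal coordinate `px` is negative on the inside for right-handed batters, which is my reading of the convention described above; `stand` and `ground_ball` are hypothetical field names.

```python
def neutral_x(px, stand):
    """Mirror the horizontal location for left-handed batters so that
    negative values are always inside (assumed raw convention: negative px
    is inside to a right-handed batter)."""
    if stand not in ("L", "R"):
        raise ValueError("stand must be 'L' or 'R'")
    return px if stand == "R" else -px

def gb_rate(pitches):
    """Ground balls / total pitches, matching the chart's definition."""
    if not pitches:
        raise ValueError("need at least one pitch")
    return sum(1 for p in pitches if p["ground_ball"]) / len(pitches)

# Made-up rows: the same inside pitch seen by a righty and a lefty.
inside_to_righty = neutral_x(-0.4, "R")
inside_to_lefty = neutral_x(0.4, "L")
example_rate = gb_rate([{"ground_ball": True}, {"ground_ball": False},
                        {"ground_ball": False}, {"ground_ball": False}])
```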

Some observations:

1) The cutter, which appeared to have two hot zones for swings and misses, appears to have only one hot zone for groundballs, of about .5 feet to 1 foot below the center of the zone and between .4 feet away and .4 feet in from the center of the plate. In the previous post we saw that as you went farther away from the plate horizontally and about .5 foot lower, you get swinging strikes.

2) Changeups down and away get groundballs. They also get swings and misses. Groundbreaking stuff here…

3) Two-seamers and sinkers have a very large area that get groundballs (another shocker), though what surprises me is how high it starts (almost at the center of the plate). It makes me wonder if I need to double-check my methodology. As you get lower in the zone, you get fewer swings and more takes, so the GB% goes down dramatically.

4) Curveballs only get groundballs if they are in the strike zone when crossing the plate (down and away). If you bury it, you basically trade the GB for a swing and a miss. I’m thinking I need to rebuild this chart with fewer grids, but a bunch of pie charts, to somehow visualize how results morph based on location.

Finally figured out how to get PITCHf/x data into Tableau (used Alteryx to scrape MLB) — having lots of fun and appreciate the feedback!


Can we Calculate MVP with a CPA?

No, this isn’t a piece for accountants, so please don’t give up on it or go to sleep!  It is an MVP discussion, and there is always a lot to talk about with the MVP, the very definition of which is vague, entitling anyone to interpret it how they wish.  There are perpetual questions: is it for an outstanding player or one who can meet some criteria of clutch?  Can a pitcher be more valuable than an everyday player?  Must a candidate play on a contender?

This article lays out a framework for quantifying these issues.   As described, a definitive answer requires a little more data than we now have, but it’s possible to have this data for an interesting quantitative measure of the MVP.

Let’s start with principles:

  • The objective is to win a championship. I don’t expect this to be controversial, is it?  As we’ll get into, this doesn’t mean a non-contender can’t win, but it will be more difficult for them to do so.
  • Context and chances matter. We aren’t trying to pick the best player, we’re trying to pick the most valuable.  We’re not trying to forecast the future, we’re looking back at the past.  Whether a player benefits from the luck of situations or of opportunities, the player who capitalizes upon his luck seems to this author to have been more valuable than an unlucky player who doesn’t have as many such opportunities.  Dave Studeman and Dave Cameron have written well on this topic.  If you don’t agree, take it up with them.  (For future research – must an author’s first name begin with the letter “D” to believe this?)  Further, the context of a player’s team matters – clinching a pennant on the last day is more valuable based upon context than an April rout or a meaningless September game between call-ups.

Introducing CPA

Accordingly, we’ll take something old, Win Probability Added (WPA), and dust off and tweak something else old, Championship Leverage Index (CLI), to make up a new statistic to measure value: Championship Probability Added (CPA).  Our formula is CPA = sum over all days of (daily WPA x daily CLI).

The facets of WPA are discussed thoroughly in another Studeman article.  Suffice to say, it captures a hitter’s or pitcher’s contribution to the probability of his team winning a game, which we can take as a player’s value to his team in the particular game.

As for the importance of the game to the team, Studeman and Sky Andrecheck have developed the Championship Leverage Index, which measures how the outcome of a game affects a team’s championship probability.  As Studeman pointed out in his WPA article, though, the new wild-card format makes calculation of CLI difficult.

Fortunately, FanGraphs has a big part of the answer in their playoff probability table, which daily measures a team’s playoff and championship probabilities.  The day to day changes in these probabilities are indicative of each game’s importance, although a full measure of a game’s importance would require running the simulations 15 more times to determine the change in probability for each game’s alternative outcome.

There are different measures of championship probability in these tables, based either upon projections or upon random (coin toss) probabilities for a season’s balance.  The projection-based probabilities may be more accurate, but for our purpose, measuring the value of each game, the coin toss probabilities are more useful, for two reasons.  1) The projection-based probabilities are more volatile early in the season, as they vary not only with the game’s outcome but also with players’ individual performances, which in turn affect their teams’ projections.  Thus early-season games are weighted more highly than late games.  2) A player’s individual impact can be diminished because it already has been factored into his team’s projections.

The 2015 MVP Race by CPA

For now, without the complete probabilistic simulations, we’ll try to approximate the value of a game by taking the absolute value of a daily change in a team’s championship probabilities.  We use the absolute value of the daily change since it measures the game’s importance whether or not a team wins.  Without this, a player would be penalized if his team loses a game, even if he has a big (valuable) game (high WPA).
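As a rough sketch of the approximation just described, CPA can be accumulated by multiplying each day’s WPA by the absolute day-to-day change in the team’s championship probability. The function name, the input layout and the numbers are illustrative assumptions, not FanGraphs data.

```python
def championship_probability_added(daily_wpa, champ_prob):
    """Approximate CPA = sum over days of WPA x |change in championship probability|.

    daily_wpa  : the player's WPA for each game day (length n)
    champ_prob : the team's championship probability before the first game
                 and after each game day (length n + 1)
    """
    if len(champ_prob) != len(daily_wpa) + 1:
        raise ValueError("champ_prob needs one more entry than daily_wpa")
    total = 0.0
    for day, wpa in enumerate(daily_wpa):
        # The absolute change stands in for the game's importance (CLI),
        # so a big game in a loss is not penalized.
        importance = abs(champ_prob[day + 1] - champ_prob[day])
        total += wpa * importance
    return total

# Made-up three-game stretch.
example_cpa = championship_probability_added(
    daily_wpa=[0.2, -0.1, 0.3],
    champ_prob=[0.05, 0.06, 0.055, 0.07],
)
```

With a fully simulated CLI, the `importance` line is the only piece that would need to change.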

For now, the daily changes must be recorded from FanGraphs by hand, so we’ll run with an illustrative example rather than a definitive analysis.  Let’s start with the top two players by WAR in each league:

American            National
Player WAR          Player WAR
Trout 9.0           Harper 9.5
Donaldson 8.7       Kershaw 8.6

In the AL, both the Angels and Jays were in contention, although, the Jays’ chances became markedly better later in the season.

Championship Probability
All Star Break September 1 September 30
Angels 6.4% 1.0% 2.4%
Jays 2.1% 10.0% 12.6%

While the Angels had a low probability, there was still a lot of opportunity for Mike Trout to benefit from swings in their chances in the end, but he couldn’t make up all the ground on Josh Donaldson’s high WPA during the Jays’ high-CLI second-half run.

Cumulative Championship Probability Added
All Star Break September 1 September 30
Trout 0.9% 1.1% 1.3%
Donaldson 1.1% 2.1% 2.5%

On the NL side, WAR leader Bryce Harper had his CPA affected by the Nats dropping out of playoff contention.

Championship Probability
All Star Break September 1 September 30
Dodgers 8.9% 11.7% 12.8%
Nationals 8.1%   0.8%   0.0%

Harper’s early-season lead fell by the wayside as Kershaw’s performance improved from its negative start and the Dodgers remained in the championship hunt.

Cumulative Championship Probability Added

All Star Break September 1 September 30
Kershaw 0.5% 1.2% 1.5%
Harper 1.1% 1.1% 1.4%

So, definitive MVP stat?  Not yet, but hopefully a step in that direction.  Calculating a probabilistic CLI would be a big help.  Improvements to WPA to incorporate base running and fielding would help too.

Thoughts?


How to Get a Swinging Strike by Pitch Type and Location

Red = High swinging-strike rate (swing and a miss / total pitches)
Yellow = Medium ; Green = Low

The size of the circle also represents how high the whiff rate is

Numbers are in Feet, with -X being inside (handedness neutral) and Z being height in feet above the center of the strike zone (as per PITCHf/x strike zone top and bottom)

Some observations (and probably repetition of prior research):

1) Four-seam fastballs are great between 0.8 and 1.4 feet above the middle of the zone and between -.5 and .5 across the plate (i.e., if you want to get a swing and a miss on a four-seamer, throw it high and right down the middle). Will have similar views for GB% and HR% soon.

2) Sliders, changeups and curveballs all need to be thrown low in the zone; doesn’t appear to matter inside or outside, though changeups need to be around the plate (or they don’t get swings).

3) There is almost nowhere you can throw a two-seamer to get swings and misses, though down and in and basically high appear to be the best places to get strikes.

 

More to come if you think this is interesting!


The Risk and Reward of Attempting to Pick Runners Off

Recently, Dave Cameron examined a planned back-pick by Russell Martin and the Blue Jays in Game 1 of the ALDS.  The play didn’t have a chance to happen because Delino DeShields put a 2-1 changeup in play.  Not just in play, but on the ground, directly to where second baseman Ryan Goins would have been had he not been breaking for second in anticipation of the pick.  Dave wrote a great article that covered the play in depth, so feel free to go read it here.  In this article, I analyze the strategy of calling for a set pickoff attempt. What I found not only vindicates Martin and the Jays, but also questions one of my longest-held beliefs about pickoffs.

My strategy for evaluating the set pickoff was to calculate the break-even point (BEP) for a pickoff attempt using Run Expectancy (RE), similar to previous analyses on bunting and stealing. To calculate the BEP for a given pickoff attempt, I calculated the RE benefit (to the defense) of an out and the weighted RE cost of a safe call or an error.  This sounds simple enough, but calculating the RE after an error involved some guesswork.

Although errors can result in multiple outcomes, I chose to pick one outcome for each base to simplify the analysis. Thus, I assumed 2 bases for all runners on an errant throw to first, 1 base for all runners on an error to second, and, after much thought, 2 bases for runners on second and 1 base for runners on the corners on an error to third. If you have data that can replace these assumptions, please let me know.  Otherwise, be cognizant of my assumptions when you attempt to make use of the findings.  For example, if there is a slow runner on second, the BEP for a pickoff attempt to a corner will be overly conservative (inflated).  Additionally, I didn’t differentiate between pickoff attempts from the pitcher and the catcher.  The pitcher has a shorter, unobstructed throw, and favorable balk rules when picking to second or third, but still has to deal with the risk of a balk, especially to first, along with the added difficulty of throwing off the mound.  Finally, while calling for a back-pick from the catcher can put a defender out of position, I chose to ignore this factor because a) I assume it is rare for a hitter to find the vacated hole, and b) the defense can choose to avoid contact.

In order to weight the cost of a failed pickoff attempt appropriately, I had to estimate what the error rate would be on attempts.  While we do have data on pitcher error rates on pickoff attempts (around 0.95%), the data are only from throws to first.  Set pickoff plays are more challenging for the defense, so the error rate should be higher than on typical attempts to first.  My solution, in lieu of empirical data from actual set pickoff attempts, was to estimate catchers’ throwing error rates from the 2015 season.  I chose this strategy for two reasons: First, catchers are one of the primary players who can attempt a set pickoff, so it made sense to sample from their performance.  And second, catchers accumulate a large portion of their assists under similar conditions to the pickoff attempt (for example, in 2015 nearly 40% of all catcher assists came from caught stealing).  Thus, I expected catcher throwing error rates to approximate the error rates we would observe on set pickoff plays.

While not a perfect method, I estimated catcher throwing error rate as Throwing Errors / (Assists + Throwing Errors + Stolen Bases).  The mean throwing error rate in a sample of catchers (n = 38) who played at least 500 innings in 2015 was 3.6%.  Do you accept that set pickoff plays will result in 3.8 times more errors than typical pickoff throws to first? If not, adjust your own estimates accordingly.
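For readers who want to plug in their own estimates, here is a small sketch of that rate and of where the 3.8 multiplier comes from (3.6% versus the 0.95% error rate on pitcher throws to first). The function name and the sample catcher line are made up for illustration.

```python
def throwing_error_rate(throwing_errors, assists, stolen_bases):
    """Throwing errors as a share of (assists + throwing errors + stolen bases)."""
    opportunities = assists + throwing_errors + stolen_bases
    if opportunities == 0:
        raise ValueError("no throwing opportunities")
    return throwing_errors / opportunities

# Made-up catcher season, only to show the calculation.
example_rate = throwing_error_rate(throwing_errors=5, assists=60, stolen_bases=75)

# Rates quoted in the article.
CATCHER_ERROR_RATE = 0.036        # mean of the 38-catcher sample
PITCHER_TO_FIRST_RATE = 0.0095    # pitcher pickoff throws to first
multiplier = CATCHER_ERROR_RATE / PITCHER_TO_FIRST_RATE  # roughly 3.8
```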

Using the estimated throwing error rate for catchers, the formula for estimating the BEP on a set pickoff attempt is RE cost / (RE cost – RE benefit). In this equation, RE benefit = RE after a pickoff – RE before a pickoff; RE cost = RE before a failed attempt – RE with a failed attempt, and RE with a failed attempt = (RE of a safe call *.964) + (RE of an error *.036).  Using the RE tables found here, I generated Table 1 below.
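Here is a minimal sketch of the break-even calculation for anyone who wants to try other assumptions. It treats the benefit and the cost as positive run-expectancy magnitudes, so the break-even rate is cost / (cost + benefit); that is algebraically the same as RE cost / (RE cost – RE benefit) when the benefit is written as a negative change in run expectancy and the cost as a positive one. The sign convention and the RE values below are my placeholders, not the RE table used for Table 1, so the output will not reproduce the table.

```python
def break_even_success_rate(re_before, re_after_pickoff, re_after_safe,
                            re_after_error, error_rate=0.036):
    """Minimum pickoff success rate at which an attempt breaks even for the defense."""
    # Expected run expectancy when the attempt fails: usually a safe call,
    # occasionally a throwing error that advances the runners.
    re_failed = (1 - error_rate) * re_after_safe + error_rate * re_after_error
    benefit = re_before - re_after_pickoff  # runs saved by getting the out
    cost = re_failed - re_before            # runs given up by a failed attempt
    return cost / (cost + benefit)

# Placeholder values: runner on first, nobody out, error puts him on third.
example_bep = break_even_success_rate(
    re_before=0.86,
    re_after_pickoff=0.25,   # bases empty, one out
    re_after_safe=0.86,      # nothing changes on a safe call
    re_after_error=1.35,     # runner on third, nobody out
)
```

Raising `error_rate` or the damage done by an error pushes the break-even point up, which is why the assumptions about errant throws matter so much.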

 

Runners  Outs  First   Second  Third
1 _ _    0     3.51%   -       -
1 _ _    1     3.32%   -       -
1 _ _    2     3.24%   -       -
1 2 _    0     3.32%   2.18%   -
1 2 _    1     4.21%   1.93%   -
1 2 _    2     9.17%   2.33%   -
1 _ 3    0     2.37%   -       0.74%
1 _ 3    1     3.47%   -       1.92%
1 _ 3    2     6.72%   -       5.99%
_ 2 3    0     -       1.70%   1.41%
_ 2 3    1     -       1.93%   1.73%
_ 2 3    2     -       5.06%   5.06%
1 2 3    0     10.21%  1.97%   1.64%
1 2 3    1     4.85%   2.78%   2.48%
1 2 3    2     7.58%   3.92%   3.92%
_ 2 _    0     -       1.54%   -
_ 2 _    1     -       1.43%   -
_ 2 _    2     -       1.26%   -
_ _ 3    0     -       -       0.11%
_ _ 3    1     -       -       1.74%
_ _ 3    2     -       -       5.61%

Table 1.  Success rate required to attempt a pick at each base.

Table 1 presents the BEP for the defense of (successful pickoffs / attempts) X 100.  In other words, Table 1 provides the minimum expectation of success required for the defense to attempt a set pickoff and it be a break-even strategy. Unfortunately, it is difficult to guess how successful set pickoff attempts typically are.  In Dan Malkiel’s study of pickoffs to first, he found that righties and lefties were successful about 2% and 4% of the time, respectively.  However, Malkiel’s study sampled situations with base-stealers on first, so the stolen-base rate was between 17% and 21%.  It’s impossible to know what percentage of successful pickoffs occurred when the runner intended to steal, but it’s safe to say 2% and 4% success rates are a little high if the runner on first isn’t planning on going. Set pickoffs usually work differently than throws to first, since neither the pickoff nor the steal are always expected. Therefore, the data on picks to first can only serve as a point of reference, helping to calibrate expectations rather than serving as predictions themselves.

One way to assess if teams are over- or under-utilizing set pickoffs is to compare their pickoff to error ratios with the BEPs for that metric. Unfortunately, I could only find data for one special case of the set pickoff: a catcher back-pick to first.  In the Malkiel study, successful back-picks were 96% of back-picks plus errors.  If we assume an error puts the runner on third, the BEP for pickoffs / (pickoffs + errors) is 50%, suggesting that catchers have room to get much more aggressive in attempting to pick runners off first.  Without more data, it’s difficult to comment further on current MLB behaviour regarding set pickoff plays. Nevertheless, the estimates in Table 1 provide interesting insights into the risks and rewards of pickoff plays. Below, I list six lessons that can be gleaned from Table 1.  At least two of these lessons fly directly in the face of my own long-held beliefs, and maybe yours too!

Lesson 1

If, at any time, the defense notices that it has better than a 15% chance of picking off a runner, they should attempt the pickoff.

Lesson 2

Pickoff attempts require greater confidence with two outs, with three exceptions.  Often, the required success rate is over 5%, requiring a fairly egregious mistake by the runner to warrant a throw. The exceptions to this rule are with a runner on first, a runner on second, or a pick to second with runners on first and second.

Lesson 3

A runner on second with no runner ahead of him should probably be targeted frequently.  The BEPs are consistently low for attempting the pickoff to second, while the runner is motivated to be aggressive by the chance to score a run or steal third. Even failed attempts have the favorable by-product of keeping the runner close, a factor not considered in Table 1.

Lesson 4

Throwing behind the runner on first with runners on first and second or the bases loaded is dangerous.  This doesn’t mean it’s a bad play if the runner on first opens the door, but the defense should be really confident to make the throw.

Now for the lessons that go against everything I thought I knew…

Lesson 5

Pitchers should throw over to third with runners on 1st and 3rd in a steal situation.  Ever since MLB outlawed the fake-to-third move, pitchers haven’t been allowed to bluff the throw in hopes of catching the runner breaking from first.  Based on Table 1, it seems strange that pitchers ever faked the throw to begin with.  With no one out, the defense would only need to pick the runner off third 8 times per 1000 attempts, or nail the runner stealing second 3 times per 100 attempts, or a combination of the two to break even.  Additionally, if the runner on first breaks for second it’s an easier throw from third than from first, which was often the result with the fake-to-third move.  While many old-school baseball people will object to throwing over to third, the common refrain “he’s not going anywhere!” doesn’t necessarily apply to the 1st and 3rd steal situation.  The runner could be trying to get closer to home so he can steal on the catcher’s throw to second, making it the perfect time to throw over.  Although the third baseman’s positioning will sometimes make a true pickoff attempt at third difficult, the rules do not require the pitcher to throw directly to third.  Thus, teams can make legitimate efforts to get the runner on third when the situation allows it, while other times making throws away from the base solely to catch the runner on first breaking for second.

Lesson 6

The situation that requires the lowest probability of success to attempt a pickoff is when there is a runner on third with no one out.  The defence needs to nab merely 2 runners out of every 1000 attempts to break even. And get this, the BEP on pickoff attempts to third with 0 out is lower than the BEP for typical throws to first, even with the much lower error rate on throws to first (0.95%), and even after adjusting the assumed cost of an error to one base.  Holding probability of success constant, the pickoff attempt to get a runner on third with 0 out is the least risky pickoff attempt possible. The LEAST risky.

Of course, a runner who is on third with no one out should be taking no chances.  But that doesn’t mean a pickoff will never work…

 


An Introduction to Determining Arbitration Salaries: Starting Pitchers

My name is Rich Rieders and I am a 2015 graduate of Rutgers Law School. Over the winter, I participated in Tulane University’s 9th Annual Baseball Arbitration Competition and we finished in 2nd place overall out of 40 teams.

In order to prepare for the competition, I created a database (going back to 2008) consisting of all arbitration awards and players who signed 1-year contracts avoiding arbitration along with their respective statistics. Using regression analysis, I was able to determine which statistics correlate most with salary. In turn, I have created a projection system that can accurately predict arbitration salaries. My projections are more accurate than the ones featured on MLBTradeRumors.

I will be releasing my 2016 projections once the season is over and all awards are announced.

The goal of this article is to properly explain how arbitration salaries are determined and how to choose the best comparative baseball salaries (comps) as outlined in Article VI, Section E, Part 10(a) of the CBA. You can think of the comps as legal precedent. The closer the comps are to the player’s stats, the more comps you have and the more recent those comps are, the stronger your argument.

First and foremost, the purpose of the arbitration process is to compensate the player for his actual results on the field, not to give him a salary based on what we expect he will produce in the upcoming season. We concern ourselves with only the traditional stats. I know this is a complete departure from the way we normally think here on FanGraphs, but salary arbitration is a completely different animal. In essence, arbitration salaries are determined by the accumulation of traditional counting stats.

For our purposes, there are six types of players who are up for arbitration in a given offseason and each type has its own separate valuation. The six types of players are:

(1) First-year-eligible SP

(2) SP who have previously been through the arbitration process

(3) First-year-eligible RP

(4) RP who have previously been through the arbitration process

(5) First-year-eligible position player

(6) Position players who have previously been through the arbitration process.

I will explain, in detail, how to properly choose player comps for each of the six groups of players. In this segment, we will focus just on the starting pitchers.

For a SP who is arbitration eligible for the first time, here are the statistics that correlate most with eventual salary:

Platform IP: 60.83%

Platform GS: 57.59%

Platform SO: 54.41%

Platform W: 53.12%

Career IP: 50.56%

Career SO: 47.45%

Career W: 42.76%

Career GS: 37.10%

When initially looking for player comps, these are the statistics we are going to focus on. Keep in mind that although ERA is not listed, it is nonetheless important, as ERA is still one of the default statistics during a hearing and the first basis for comparison. Note that rate stats almost always have a very low correlation since rate stats do not take into account playing time.

Let’s use Atlanta Braves starter, Shelby Miller, as an example of a first-year-eligible SP.

Shelby Miller is arbitration-eligible for the first time going into 2016 with 3 years and 30 days of service time (3.030). In his platform season (2015), Miller made 33 starts recording 6 wins, 171 SO with a 3.02 ERA in 205.1 IP. Over his career, Miller has compiled 575 IP, 32 W, 483 SO with a 3.22 ERA in 96 GS. The objective here is to find the players who avoided arbitration by signing a 1-year contract with statistics that are most similar to Miller’s. The more recent, the better. The best way to do that is to set a floor and a ceiling and then work your way towards the middle.

From Miller’s perspective, let’s look at Miguel Gonzalez’s 2014 platform season. Like Miller, Gonzalez posted a low win total despite a very strong ERA. Gonzalez made 26 starts, recorded 10 wins, 111 SO with a 3.23 ERA in 159 IP. Over his career, Gonzalez compiled 69 starts, 30 wins, 308 SO with a 3.45 ERA in 435.2 IP. Although their ERA and win totals are extremely close, Miller bests Gonzalez in all the most important categories and has significantly more playing time and strikeouts. Therefore, we can definitively state Miller should receive more than Gonzalez did. As such, Gonzalez’s 2015 salary of $3.45 million should be the floor.

From Atlanta’s perspective, let’s look at Chris Tillman’s 2014 platform season. Like Miller, Tillman pitched a similar number of innings and games with a pretty low ERA. In his platform season, Tillman made 34 starts recording 13 wins, 150 SO and a 3.34 ERA in 207.1 IP. Over his career, Tillman compiled 45 W, 680.1 IP, 511 SO with a 4.00 ERA in 118 GS. Although Miller has the better ERA, Tillman is superior in all the other major categories. Hence, we can conclude that Miller will receive less than Tillman. We can use Tillman’s 2015 salary of $4.315 million as the ceiling.

Given the above, Shelby Miller is likely to receive somewhere between $3.45 million and $4.315 million. Now that we have a range, let’s find someone towards the middle.

In 2011, Justin Masterson made 33 starts with 12 W, 158 SO, 3.21 ERA in 216 IP. Over his career he made 87 starts, with 28 W, 485 SO, 3.92 ERA in 613.2 IP. Those numbers are quite similar across the board with Miller having a better ERA, but fewer IP. Masterson’s 2012 salary was $3.825 million. Alex Cobb ($4.0 million in 2015), Travis Wood ($3.9 million in 2014) and Stephen Strasburg ($3.975 million in 2014) are all good comps as well.

As for my model, Miller projects to receive $3,859,816 +/- $145,351 which is perfectly in line with the comps above. MLBTradeRumors projects him at $4.9 million, which is not only significantly higher than the above comps, but would beat the record for a first-year player by nearly 600K.
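The floor-and-ceiling approach can be sketched in a few lines using the comp salaries quoted above. This is only an illustration of the bracketing logic; the simple average of the middle comps is my stand-in for a sanity check and is not the regression model behind the projection.

```python
def bracket_from_comps(floor_salary, ceiling_salary, middle_comps):
    """Return (floor, ceiling, average of the middle comps inside the bracket)."""
    if floor_salary > ceiling_salary:
        raise ValueError("floor must not exceed ceiling")
    inside = [s for s in middle_comps.values()
              if floor_salary <= s <= ceiling_salary]
    if not inside:
        raise ValueError("no middle comps fall inside the bracket")
    return floor_salary, ceiling_salary, sum(inside) / len(inside)

# Salaries in dollars, as quoted in the article.
miller_floor = 3_450_000      # Miguel Gonzalez, 2015
miller_ceiling = 4_315_000    # Chris Tillman, 2015
miller_middle = {
    "Masterson 2012": 3_825_000,
    "Cobb 2015": 4_000_000,
    "Wood 2014": 3_900_000,
    "Strasburg 2014": 3_975_000,
}
low, high, comp_average = bracket_from_comps(miller_floor, miller_ceiling, miller_middle)

model_projection = 3_859_816
projection_in_bracket = low <= model_projection <= high
```

The middle comps average $3.925 million, and the projection of $3,859,816 sits inside the bracket.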

For a player who has already been through the arbitration process, the valuation is completely different, as career statistics are no longer used the 2nd, 3rd, 4th, etc. time around (except in a few rare cases). This group of players is the most difficult to project, both because excluding career stats leaves fewer variables and because there are fewer SP across the league than relievers or position players. Nonetheless, we can still get a pretty good idea of what their eventual salary will be.

For an SP who has previously been through the arbitration process, the stats that correlate most with eventual salary are:

(1) Platform W: 69.12%

(2) Platform RA9-WAR: 64.04%

(3) Platform SO: 60.97%

(4) Platform fWAR: 58.93%

(5) Platform IP: 58.34%

(6) Platform GS: 49.75%

For example, let’s look at Angels SP Garrett Richards who is arbitration eligible for the second time going into 2016. As a Super-2 going into 2015, Richards received a $3.2 million salary. That figure includes everything he had done in his career up to that point. Thus, when determining his 2016 salary, we don’t need to focus on previous seasons. We need only determine what his 2015 season was worth and give him a raise. In his platform season (2015), Richards made 32 starts recording 15 wins, 176 SO, 3.65 ERA, 2.5 fWAR and 2.8 RA9-WAR in 207.1 IP. We want to find the players whose stats are most similar to Richards.

First let’s discuss Matt Garza’s 2010 platform season (a bit old, but still useful) where he made 32 starts recording 15 wins, 150 SO, 3.91 ERA, 1.9 fWAR and 2.8 RA9-WAR in 204.2 IP. Other than the strikeout numbers, we have a virtually identical season. As such, Richards is likely to receive a raise higher than Garza’s $2.6 million raise going into 2011. We can consider a raise of $2.6 million to be his floor.

Next let’s look at C.J. Wilson’s 2010 platform season (again old, but still useful) where he made 33 starts recording 15 wins, 170 SO, 3.35 ERA, 4.1 fWAR and 5.1 RA9-WAR in 204 IP. Wilson has the same number of wins and virtually the same number of SO, but he has a clear advantage in fWAR and RA9-WAR and a slightly better ERA, so it’s pretty safe to say that Richards is likely to get a raise lower than Wilson’s $3.9 million raise. The $3.9 million should be the ceiling.

Homer Bailey’s 2012 platform season is a great final comparison. Bailey made 33 starts recording 13 wins, 168 SO, 3.68 ERA, 2.7 fWAR and 2.8 RA9-WAR in 208 IP. Both players are virtually identical statistically. Bailey received a raise of $2.925 million so Richards is likely to receive a very similar raise himself. Shaun Marcum ($3.1 million in 2011), Jordan Zimmermann ($3.050 million in 2011) and Max Scherzer ($2.975 million in 2013) are all good comps as well.

Therefore, we can be certain that Richards will receive a raise somewhere between $2.6 million and $3.9 million. As for my model, Richards projects to receive a raise of $2,923,484 for a total salary of $6,123,484 +/- $336,500 and, unsurprisingly, that is perfectly in line with the comps above. MLBTradeRumors is projecting a raise of $3.6 million for a total salary of $6.8 million, which I think is a bit generous given the comps we have at our disposal, but not unreasonable.
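Because a returning player is valued by his raise rather than his total salary, the arithmetic is worth spelling out. The sketch below uses only the figures quoted above; the helper name is hypothetical.

```python
def salary_after_raise(previous_salary, raise_amount):
    """Total salary = previous salary + raise earned by the platform season."""
    return previous_salary + raise_amount

richards_2015 = 3_200_000          # Super-2 salary going into 2015

# Raise bracket from the comps (Garza floor, Wilson ceiling).
floor_total = salary_after_raise(richards_2015, 2_600_000)
ceiling_total = salary_after_raise(richards_2015, 3_900_000)

# Model projection with its stated margin.
projected_total = salary_after_raise(richards_2015, 2_923_484)
margin = 336_500
projected_range = (projected_total - margin, projected_total + margin)

# MLBTradeRumors' projected raise, for comparison.
mlbtr_total = salary_after_raise(richards_2015, 3_600_000)
mlbtr_within_comps = floor_total <= mlbtr_total <= ceiling_total
```

On these numbers the comps bracket runs from $5.8 million to $7.1 million in total salary, so the $6.8 million figure falls inside the bracket even though it is above the model’s range.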

Next up: Relief Pitchers.


Hardball Retrospective – The “Original” 1977 Pittsburgh Pirates

In “Hardball Retrospective: Evaluating Scouting and Development Outcomes for the Modern-Era Franchises”, I placed every ballplayer in the modern era (from 1901-present) on their original team. Therefore, Dan Quisenberry is listed on the Royals roster for the duration of his career while the Tigers declare Charlie Gehringer and the Senators claim Goose Goslin. I calculated revised standings for every season based entirely on the performance of each team’s “original” players. I discuss every team’s “original” players and seasons at length along with organizational performance with respect to the Amateur Draft (or First-Year Player Draft), amateur free agent signings and other methods of player acquisition.  Season standings, WAR and Win Shares totals for the “original” teams are compared against the “actual” team results to assess each franchise’s scouting, development and general management skills.

Expanding on my research for the book, the following series of articles will reveal the finest single-season rosters for every Major League organization based on overall rankings in OWAR and OWS along with the general managers and scouting directors that constructed the teams. “Hardball Retrospective” is available in digital format on Amazon, Barnes and Noble, GooglePlay, iTunes and KoboBooks. The paperback edition is available on Amazon, Barnes and Noble and CreateSpace. Supplemental Statistics, Charts and Graphs along with a discussion forum are offered at TuataraSoftware.com.

Don Daglow (Intellivision World Series Major League Baseball, Earl Weaver Baseball, Tony LaRussa Baseball) contributed the foreword for Hardball Retrospective. The foreword and preview of my book are accessible here.

Terminology

OWAR – Wins Above Replacement for players on “original” teams

OWS – Win Shares for players on “original” teams

OPW% – Pythagorean Won-Loss record for the “original” teams
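For readers unfamiliar with the Pythagorean record, here is a minimal sketch of the general idea. It assumes the classic Bill James form with an exponent of 2; this article does not spell out the exact exponent or inputs used for OPW%, so treat the function names, the default exponent and the run totals below as illustrative assumptions only.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; other exponents are in common use.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Convert the expected winning percentage into a (wins, losses) record."""
    wins = round(pythagorean_pct(runs_scored, runs_allowed, exponent) * games)
    return wins, games - wins


# Made-up run totals, purely for illustration.
pct = pythagorean_pct(700, 650)
record = pythagorean_record(700, 650)
print(f"{pct:.3f}", record)
```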

Assessment

The 1977 Pittsburgh Pirates          OWAR: 53.6     OWS: 347     OPW%: .524

GM Joe Brown acquired all of the ballplayers on the 1977 Pirates roster. Based on the revised standings the “Original” 1977 Pirates tied for second place with the Phillies, one game behind the Cardinals. Pittsburgh topped the Senior Circuit in OWS during consecutive campaigns (1977-78).

Dave Parker (.338/21/88) collected his first batting title and paced the League with 215 base knocks and 44 two-baggers. “Cobra” merited his first All-Star nomination and Gold Glove Award while placing third in the NL MVP balloting. Mitchell Page (.307/21/85) pilfered 42 bags and finished runner-up in the Rookie of the Year vote. Don Money cracked a career-best 25 circuit clouts and received his third All-Star nod. Al “Scoop” Oliver contributed a .308 BA with 19 round-trippers. Richie Zisk clubbed 30 long balls and knocked in 101 baserunners. Willie Randolph laced 11 triples and tallied 91 runs.

Willie Stargell swatted 13 big-flies despite missing almost two-thirds of the 1977 campaign. “Pops” ranks ninth among left fielders according to Bill James in “The New Bill James Historical Baseball Abstract.” Teammates enumerated in the “NBJHBA” top 100 rankings include Parker (14th-RF), Randolph (17th-2B), Oliver (31st-CF), Manny Sanguillen (42nd-C), Dave Cash (50th-2B), Money (55th-3B), Richie Hebner (56th-3B), Zisk (69th-RF), Freddie Patek (73rd-SS), Bob Bailey (79th-3B), Tony Armas (89th-RF) and Rennie Stennett (90th-2B).

LINEUP            POS     WAR     WS
Willie Randolph   2B      4.69    20.08
Mitchell Page     LF      6.07    29.41
Dave Parker       RF      5.02    32.6
Don Money         3B/2B   4.11    21.87
Al Oliver         CF/LF   2.04    20.19
Richie Hebner     1B      2.7     16.01
Milt May          C       1.66    9.63
Freddie Patek     SS      1.55    14.76
BENCH             POS     WAR     WS
Rennie Stennett   2B      3.57    17.98
Art Howe          2B      1.83    13.91
Richie Zisk       RF      1.82    20.15
Tony Armas        CF      1.49    7.76
Dave Cash         2B      1.11    17.12
Frank Taveras     SS      1.01    13.78
Ed Ott            C       0.67    10.69
Willie Stargell   1B      0.63    8.47
Omar Moreno       CF      0.47    12.62
Craig Reynolds    SS      0.1     6.74
Gene Clines       LF      0.04    5.48
Bob Bailey                0.02    1.61
Jimmy Sexton      SS      -0.05   0.58
Mike Edwards      2B      -0.12   0.16
Miguel Dilone     LF      -0.33   0.32
Dale Berra        3B      -0.41   0.42
Manny Sanguillen  C       -0.44   10.27
Ken Macha         3B      -0.55   0.66
Mario Mendoza     SS      -0.58   1.22
Bobby Tolan       1B      -0.68   0.26

John “Candy Man” Candelaria earned his lone All-Star appearance with a 20-5 record along with a League-best 2.34 ERA. Dock Ellis supplied 12 victories and an ERA of 3.63. Rick Langford surpassed the 200-innings mark while losing 19 of 27 decisions. The bullpen subdued late-inning rallies by the opposition, co-anchored by Gene Garber (2.35, 19 saves) and Kent Tekulve (10-1, 3.06).

ROTATION          POS     WAR     WS
John Candelaria   SP      8.07    24.69
Dock Ellis        SP      2.37    16.07
Rick Langford     SP      1.02    8.44
Bruce Kison       SP      0.39    5.71
Timothy Jones     SP      0.64    1.73
BULLPEN           POS     WAR     WS
Gene Garber       RP      2.19    15.16
Kent Tekulve      RP      0.95    11.48
Bruce Dal Canton  RP      0.25    1.6
Doug Bair         RP      0.21    5.46
Al Holland        RP      -0.09   0
Ed Whitson        SP      0.27    1.21
Woodie Fryman     SP      0.08    1.95
Rick Honeycutt    SP      0.02    1.18
Silvio Martinez   RP      -0.22   0.07
Odell Jones       SP      -0.28   2.24
Bill Laxton       RP      -0.64   3.32
Ramon Hernandez   RP      -0.67   0.1
Larry Demery      SW      -1.04   2.06

The “Original” 1977 Pittsburgh Pirates roster

NAME              POS     WAR     WS      General Manager  Scouting Director
John Candelaria   SP      8.07    24.69   Joe Brown        Harding Peterson
Mitchell Page     LF      6.07    29.41   Joe Brown        Harding Peterson
Dave Parker       RF      5.02    32.6    Joe Brown        Harding Peterson
Willie Randolph   2B      4.69    20.08   Joe Brown        Harding Peterson
Don Money         2B      4.11    21.87   Joe Brown
Rennie Stennett   2B      3.57    17.98   Joe Brown        Harding Peterson
Richie Hebner     1B      2.7     16.01   Joe Brown
Dock Ellis        SP      2.37    16.07   Joe Brown
Gene Garber       RP      2.19    15.16   Joe Brown
Al Oliver         LF      2.04    20.19   Joe Brown
Art Howe          2B      1.83    13.91   Joe Brown        Harding Peterson
Richie Zisk       RF      1.82    20.15   Joe Brown
Milt May          C       1.66    9.63    Joe Brown
Freddie Patek     SS      1.55    14.76   Joe Brown
Tony Armas        CF      1.49    7.76    Joe Brown        Harding Peterson
Dave Cash         2B      1.11    17.12   Joe Brown
Rick Langford     SP      1.02    8.44    Joe Brown        Harding Peterson
Frank Taveras     SS      1.01    13.78   Joe Brown
Kent Tekulve      RP      0.95    11.48   Joe Brown        Harding Peterson
Ed Ott            C       0.67    10.69   Joe Brown        Harding Peterson
Timothy Jones     SP      0.64    1.73    Joe Brown        Harding Peterson
Willie Stargell   1B      0.63    8.47    Joe Brown
Omar Moreno       CF      0.47    12.62   Joe Brown        Harding Peterson
Bruce Kison       SP      0.39    5.71    Joe Brown
Ed Whitson        SP      0.27    1.21    Joe Brown        Harding Peterson
Bruce Dal Canton  RP      0.25    1.6     Joe Brown
Doug Bair         RP      0.21    5.46    Joe Brown        Harding Peterson
Craig Reynolds    SS      0.1     6.74    Joe Brown        Harding Peterson
Woodie Fryman     SP      0.08    1.95    Joe Brown
Gene Clines       LF      0.04    5.48    Joe Brown
Bob Bailey                0.02    1.61    Joe Brown        Rex Bowen
Rick Honeycutt    SP      0.02    1.18    Joe Brown        Harding Peterson
Jimmy Sexton      SS      -0.05   0.58    Joe Brown        Harding Peterson
Al Holland        RP      -0.09   0       Joe Brown        Harding Peterson
Mike Edwards      2B      -0.12   0.16    Joe Brown        Harding Peterson
Silvio Martinez   RP      -0.22   0.07    Joe Brown        Harding Peterson
Odell Jones       SP      -0.28   2.24    Joe Brown        Harding Peterson
Miguel Dilone     LF      -0.33   0.32    Joe Brown        Harding Peterson
Dale Berra        3B      -0.41   0.42    Joe Brown        Harding Peterson
Manny Sanguillen  C       -0.44   10.27   Joe Brown
Ken Macha         3B      -0.55   0.66    Joe Brown        Harding Peterson
Mario Mendoza     SS      -0.58   1.22    Joe Brown        Harding Peterson
Bill Laxton       RP      -0.64   3.32    Joe Brown
Ramon Hernandez   RP      -0.67   0.1     Joe Brown
Bobby Tolan       1B      -0.68   0.26    Joe Brown        Rex Bowen
Larry Demery      SW      -1.04   2.06    Joe Brown        Harding Peterson

 

Honorable Mention

The “Original” 2012 Pirates    OWAR: 46.1     OWS: 303     OPW%: .597

The Bucs seized the National League pennant with 97 victories and topped the circuit in OWS. Andrew McCutchen (.327/31/96) led the League with 194 base hits, earned a Gold Glove Award and finished third in the 2012 National League MVP balloting. Aramis Ramirez (.300/27/105) drilled a League-leading 50 doubles. Pedro Alvarez went yard 30 times while Jose A. Bautista launched 27 bombs in an injury-shortened campaign. Jeff Keppinger delivered a .325 BA in a utility role. Paul Maholm posted a record of 13-11 with a 3.67 ERA and fellow hurler Bronson Arroyo accrued 12 wins with a 3.74 ERA in 202 innings.

On Deck

The “Original” 1931 Athletics

References and Resources

Baseball America – Executive Database

Baseball-Reference

James, Bill. The New Bill James Historical Baseball Abstract. New York, NY: The Free Press, 2001. Print.

James, Bill, with Jim Henzler. Win Shares. Morton Grove, Ill.: STATS, 2002. Print.

Retrosheet – Transactions Database

Seamheads – Baseball Gauge

Sean Lahman Baseball Archive


OK, the American League Really IS Sweden

Last month, I wrote about the two leagues, noting that

  1. The American League, perceived as being bad this year, was actually a good deal better than the National League overall, and
  2. The perception of the American League’s weakness was due to a near-record level of parity, with neither great nor bad teams.

Let’s start with the second point. At the time of the post, through games of September 5, the standard deviation of winning percentages among American League clubs was the lowest it has been in the 30-team era. Projected onto a 162-game season, the standard deviation of wins for American League teams was 7.8, barely eking out 2007’s 7.9 as the most egalitarian distribution of wins since 1998.

Since September 5, a .500 record has become a black hole, exerting irresistible gravity throughout the American League galaxy:

  • Of the teams with the six best records in the league on that date–the Royals, Blue Jays, Yankees, Astros, Rangers, and Twins–only Toronto and Texas had a winning record the rest of the season.
  • Baltimore, the sixth-worst team in the league as of the morning of September 6, tied the Jays for the best record in the East thereafter. Boston, then the third-worst team, went 15-12 the rest of the way.
  • Cleveland, four games below .500 at the time, scrambled to finish 81-80.

Overall, parity in the already-equality-loving Junior Circuit increased by so much that I looked beyond the post-1998 30-team era. I calculated the standard deviation of winning percentages for every league-season since 1901. I then multiplied the standard deviations by 162 to arrive at the standard deviation of wins over a 162-game season. Yes, I know, most of those seasons were shorter than 162 games, but that’s OK; I’m just looking to turn the standard deviation of winning percentages, which is not an intuitive figure (e.g., American League, 1930, 0.1107), into something that is recognizable (17.9 wins). Here are the ten seasons in baseball history with the highest parity, that is, the lowest standard deviation of wins:

The 2015 American League is the most egalitarian, populist, tax the rich/feed the poor, Kumbaya-singing league in baseball history. As I suggested in September, it’s the Sweden of leagues.

(The National League finished 2015 with a standard deviation of 13.1 wins, ranking it 102 out of 230 league-seasons in terms of parity. It was the ninth-most unequal among 36 league-seasons since the expansion to 30 teams in 1998. For Gini coefficient detractors, the most unequal league ever was the 1909 National League, which featured the 110-42 Pirates, 104-49 Cubs, and 92-61 Giants, along with the 55-98 Dodgers, 54-98 Cardinals (Yadi was hurt), and 45-108 Braves.)
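The conversion used throughout this post, from a standard deviation of winning percentages to a standard deviation of wins over 162 games, can be sketched as follows. This is a minimal illustration rather than the exact calculation behind the rankings: it assumes the population standard deviation (the post does not say whether population or sample was used), and the function names and the two-team example league are hypothetical.

```python
from statistics import pstdev


def win_pct(wins, losses):
    """Winning percentage for a single team."""
    return wins / (wins + losses)


def scale_to_wins(sd_of_win_pct, games=162):
    """Turn a standard deviation of winning percentages into wins per season."""
    return sd_of_win_pct * games


def sd_of_wins(records, games=162):
    """Standard deviation of wins, scaled to a fixed season length.

    records is a list of (wins, losses) tuples, one per team in the league.
    Uses the population standard deviation, which is an assumption here.
    """
    pcts = [win_pct(w, l) for w, l in records]
    return scale_to_wins(pstdev(pcts), games)


# The 1930 American League figure quoted in the text: 0.1107 -> 17.9 wins.
print(round(scale_to_wins(0.1107), 1))

# A made-up two-team league, just to show the full calculation.
print(round(sd_of_wins([(100, 62), (62, 100)]), 1))
```

Scaling every season to 162 games is what lets a 154-game season from 1930 sit on the same list as a modern one.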

Now, as to the other point, the American League’s superiority over the National League despite its group hug ethic, here’s a chart.

Twelve years and running.


The Pittsburgh Pirates and Two Missed Opportunities

1. The Pirates finished the year with a 98-64 record, the second best in all of baseball. That ties them with the 1979 and 1908 clubs for the third most wins in franchise history. (The 1909 Pirates won 110 and the 1902 club won 103.) The Pirates’ record, however, included a losing record against two of the worst teams in the game, the Cincinnati Reds (8-11) and the Milwaukee Brewers (9-10).

Let’s break that down. In games in which the Reds didn’t play the Pirates, they were 53-90. In games in which the Brewers didn’t play the Pirates, they were 58-85. So in their non-Pirates games, the two clubs combined for a 111-175 record, a .388 winning percentage. Had they played at that pace in their 38 games against the Pirates, they would have won .388 x 38 games = 15 games, losing 23. Turned around, the Pirates would have gone 23-15 against the Reds and Brewers.

The Pirates were 81-43 in their games that weren’t against Cincinnati or Milwaukee. Had they gone 23-15 against the two clubs–that is, had they been as successful as the rest of the teams in the majors were–their record would have been 104-58. That would have given the Pirates the best record in baseball. They would be enjoying four off days, looking forward to Wednesday’s wild-card game between the Cardinals and Cubs to see whom they’d face at home to kick off the Division Series on Friday.
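Here is the same what-if worked through step by step. It is just a sketch of the arithmetic above, using only the records quoted in this post; the helper functions and variable names are made up for the example.

```python
def add(a, b):
    """Add two (wins, losses) records."""
    return a[0] + b[0], a[1] + b[1]


def sub(a, b):
    """Subtract one (wins, losses) record from another."""
    return a[0] - b[0], a[1] - b[1]


# Records quoted in the post.
PIRATES_SEASON = (98, 64)
PIRATES_VS_REDS = (8, 11)
PIRATES_VS_BREWERS = (9, 10)
REDS_VS_OTHERS = (53, 90)      # Reds in non-Pirates games
BREWERS_VS_OTHERS = (58, 85)   # Brewers in non-Pirates games

# Pirates' actual record against the two clubs, and the number of games.
head_to_head = add(PIRATES_VS_REDS, PIRATES_VS_BREWERS)
games = sum(head_to_head)

# How the Reds and Brewers fared against everyone else.
opponents = add(REDS_VS_OTHERS, BREWERS_VS_OTHERS)
opponents_pct = opponents[0] / sum(opponents)

# If they had won at that pace against the Pirates too.
opponents_expected_wins = round(opponents_pct * games)
pirates_expected = (games - opponents_expected_wins, opponents_expected_wins)

# Swap the expected record in for the actual one.
pirates_vs_others = sub(PIRATES_SEASON, head_to_head)
what_if = add(pirates_vs_others, pirates_expected)

print(head_to_head, games)
print(opponents, round(opponents_pct, 3))
print(pirates_expected, pirates_vs_others, what_if)
```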

2. The Pirates had four relief pitchers who pitched at least 60 innings: Mark Melancon, Tony Watson, Jared Hughes, and Arquimedes Caminero. Of the four, the pitcher with the lowest average leverage index when entering a game was Caminero, wasting his namesake’s leverage expertise.


Hardball Retrospective – The “Original” 1979 Montreal Expos

In “Hardball Retrospective: Evaluating Scouting and Development Outcomes for the Modern-Era Franchises”, I placed every ballplayer in the modern era (from 1901-present) on their original team. Therefore, Tony Perez is listed on the Reds roster for the duration of his career while the Red Sox declare Wade Boggs and the Rockies claim Troy Tulowitzki. I calculated revised standings for every season based entirely on the performance of each team’s “original” players. I discuss every team’s “original” players and seasons at length along with organizational performance with respect to the Amateur Draft (or First-Year Player Draft), amateur free agent signings and other methods of player acquisition.  Season standings, WAR and Win Shares totals for the “original” teams are compared against the “actual” team results to assess each franchise’s scouting, development and general management skills.

Expanding on my research for the book, the following series of articles will reveal the finest single-season rosters for every Major League organization based on overall rankings in OWAR and OWS along with the general managers and scouting directors that constructed the teams. “Hardball Retrospective” is available in digital format on Amazon, Barnes and Noble, GooglePlay, iTunes and KoboBooks. The paperback edition is available on Amazon, Barnes and Noble and CreateSpace. Supplemental Statistics, Charts and Graphs along with a discussion forum are offered at TuataraSoftware.com.

Don Daglow (Intellivision World Series Major League Baseball, Earl Weaver Baseball, Tony LaRussa Baseball) contributed the foreword for Hardball Retrospective. The foreword and preview of my book are accessible here.

Terminology

OWAR – Wins Above Replacement for players on “original” teams

OWS – Win Shares for players on “original” teams

OPW% – Pythagorean Won-Loss record for the “original” teams

Assessment

The 1979 Montreal Expos          OWAR: 53.9     OWS: 327     OPW%: .572

GM Jim Fanning acquired 88% (23/26) of the ballplayers on the 1979 Expos roster. Based on the revised standings the “Original” 1979 Expos captured the first pennant in franchise history with 93 victories while topping the National League in OWAR and OWS.

Gary “Kid” Carter paced Montreal with 28 Win Shares and 5.2 WAR. The Hall of Fame backstop slugged 22 round-trippers and commenced a run of 10 consecutive All-Star appearances. Third-sacker Larry Parrish (.307/30/82) clubbed 39 two-baggers en route to a fourth-place finish in the N.L. MVP balloting. Andre “The Hawk” Dawson displayed his five-tool talent, blasting 25 long balls and nabbing 35 bags. Gary Roenicke swatted 25 big-flies while platooning in left field. Warren Cromartie delivered career-highs with 181 base knocks and 46 doubles. Ellis Valentine contributed 21 jacks and Tony Scott swiped 37 bases.

Tim Raines received the proverbial “cup of coffee” in 1979 with six pinch-running appearances. “Rock” pilfered 808 bases during a career that spanned 23 seasons. He ranks eighth among left fielders according to Bill James in “The New Bill James Historical Baseball Abstract.” Teammates listed in the “NBJHBA” top 100 rankings include Carter (8th-C), Dawson (19th-RF) and Parrish (53rd-3B).

LINEUP            POS     WAR     WS
Tony Scott        RF/CF   1.21    13.93
Warren Cromartie  1B/LF   3.28    17.18
Andre Dawson      CF      2.74    24.01
Gary Carter       C       5.25    28.95
Larry Parrish     3B      4.07    27.34
Gary Roenicke     LF      3.33    18.9
Tony Bernazard    2B      0.6     2.56
                  SS
BENCH             POS     WAR     WS
Jerry White       RF      0.79    6.09
Barry Foote       C       1.58    12.2
Bombo Rivera      LF      0.55    5.17
Ellis Valentine   RF      0.4     14.41
Tim Raines                0       0
Terry Humphrey    C       -0.22   0.21

Steve Rogers (13-12, 3.00), the Expos’ first-round selection in the June 1971 Amateur Draft, hurled a League-leading 5 shutouts and earned his third All-Star invitation. Dan Schatzeder posted a 10-5 mark with a 2.83 ERA. David Palmer fashioned a 2.64 ERA with a record of 10-2 in his rookie campaign. Scott Sanderson contributed 9 victories along with a 3.43 ERA. Byron McLaughlin collected 7 wins and 14 saves working in a variety of roles while portsider Shane Rawley saved 11 contests.

ROTATION          POS     WAR     WS
Steve Rogers      SP      3.78    16.61
Dan Schatzeder    SP      3.31    13.13
David Palmer      SP      2.25    11.23
Scott Sanderson   SP      1.89    10.21
Balor Moore       SP      0.01    5.7
BULLPEN           POS     WAR     WS
Byron McLaughlin  SW      1.29    11.04
Shane Rawley      RP      0.78    7.84
Bill Atkinson     RP      0.22    2.08
Dale Murray       RP      -1.09   3.15
Bill Gullickson   RP      0.02    0.14
Gerry Hannahs     SP      0.03    0.74
Bob James         RP      -0.21   0
Craig Minetto     SP      -2      0.47

The “Original” 1979 Montreal Expos roster

NAME              POS     WAR     WS      General Manager  Scouting Director
Gary Carter       C       5.25    28.95   Jim Fanning      Mel Didier
Larry Parrish     3B      4.07    27.34   Jim Fanning      Mel Didier
Steve Rogers      SP      3.78    16.61   Jim Fanning      Mel Didier
Gary Roenicke     LF      3.33    18.9    Jim Fanning      Mel Didier
Dan Schatzeder    SP      3.31    13.13   Jim Fanning      Danny Menendez
Warren Cromartie  LF      3.28    17.18   Jim Fanning      Mel Didier
Andre Dawson      CF      2.74    24.01   Jim Fanning      Mel Didier
David Palmer      SP      2.25    11.23   Jim Fanning      Danny Menendez
Scott Sanderson   SP      1.89    10.21   Charlie Fox      Danny Menendez
Barry Foote       C       1.58    12.2    Jim Fanning      Mel Didier
Byron McLaughlin  SW      1.29    11.04   Jim Fanning      Mel Didier
Tony Scott        CF      1.21    13.93   Jim Fanning
Jerry White       RF      0.79    6.09    Jim Fanning      Mel Didier
Shane Rawley      RP      0.78    7.84    Jim Fanning      Mel Didier
Tony Bernazard    2B      0.6     2.56    Jim Fanning      Mel Didier
Bombo Rivera      LF      0.55    5.17    Jim Fanning      Mel Didier
Ellis Valentine   RF      0.4     14.41   Jim Fanning      Mel Didier
Bill Atkinson     RP      0.22    2.08    Jim Fanning      Mel Didier
Gerry Hannahs     SP      0.03    0.74    Jim Fanning      Mel Didier
Bill Gullickson   RP      0.02    0.14    Charlie Fox      Danny Menendez
Balor Moore       SP      0.01    5.7     Jim Fanning
Tim Raines                0       0       Charlie Fox      Danny Menendez
Bob James         RP      -0.21   0       Jim Fanning      Danny Menendez
Terry Humphrey    C       -0.22   0.21    Jim Fanning
Dale Murray       RP      -1.09   3.15    Jim Fanning      Mel Didier
Craig Minetto     SP      -2      0.47    Jim Fanning      Mel Didier

Honorable Mention

The “Original” 1985 Expos     OWAR: 55.8     OWS: 320     OPW%: .556

Montreal claimed the National League East division title by a five-game margin over New York while pacing the Senior Circuit in OWAR and OWS. Tim Raines stole 70 bases in 79 tries and batted .320 with 115 runs scored. Raines (35 WS) and Gary Carter (33 WS) surpassed the 30 Win Share plateau as the “Kid” blasted 32 moon-shots. Tim Wallach dialed long-distance 22 times and earned his first Gold Glove Award. Tony Bernazard supplied career-bests with a .301 BA, 169 hits, 17 home runs and 73 RBI. Andre Dawson collected his sixth consecutive Gold Glove Award and drove in 91 runs. Bob James anchored the bullpen staff with 32 saves, 8 victories and a 2.13 ERA. Shane Rawley provided 13 wins with a 3.31 ERA in 31 starts while Joe Hesketh delivered a 2.49 ERA and a record of 10-5 in his freshman year.

On Deck

The “Original” 1977 Pirates

References and Resources

Baseball America – Executive Database

Baseball-Reference

James, Bill. The New Bill James Historical Baseball Abstract. New York, NY: The Free Press, 2001. Print.

James, Bill, with Jim Henzler. Win Shares. Morton Grove, Ill.: STATS, 2002. Print.

Retrosheet – Transactions Database

Seamheads – Baseball Gauge

Sean Lahman Baseball Archive