
Z-Scores in Sports (a Supporting Argument for zDefense)

This is Part 3 of the Player Evaluator and Calculated Expectancy (PEACE) series, which presents an alternative to Wins Above Replacement.  This article will introduce evidence that z-scores can be converted into runs (or points in other sports) with accuracy and reliability, and will analyze the results that zDefense has produced.

Recall that zDefense is broken down into 4 components: zFielding, zRange, zOuts, and zDoublePlays.  The fielding and range components depend on the accuracy of Calculated Runs Expectancy, which I introduced in Part 1.  Outs and double plays, though, use a different technique: they take z-scores for the relevant rate statistics, then multiply them by factors of playing time.  Here are the equations:

  • zOuts = [(Player O/BIZ – Positional O/BIZ) / Positional O/BIZ Standard Deviation] * (Player Innings / Team Innings) * (√ Player BIZ / 2)
  • zDoublePlays = [(Player DP/BIZ – Positional DP/BIZ) / Positional DP/BIZ Standard Deviation] * (Player Innings / Team Innings) * (√ Player BIZ / 2) * Positional DP/BIZ

 

We can set up models in other sports that estimate point differentials using very similar techniques.  I’ve developed one for college football and another for the NBA.

For the first model, I used data for every Division I FBS football team from 2000-2014 (1,802 teams), and I defined the relevant statistics and their “weights” as follows:

  • zPassing = [[Completion Percentage z-score * Completions per Game] + [Passing Yards per Attempt z-score * Passing Attempts per Game]] / 10
  • zRushing = [Rushing Yards per Attempt z-score * Rushing Attempts per Game] / 10
  • zTurnovers = [Turnovers per Game z-score]
  • zPlays = [Number of Offensive Plays per Game z-score] 

 

The sum of these 4 components makes up zOffense, while running the same calculations on each team’s opponents produces zDefense.

What I found after summing the different components was that the resulting number, when divided by the number of games played, was a very accurate estimator for a team’s average point differential.
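To make that arithmetic concrete, here is a minimal sketch of the calculation in Python.  It assumes you already have per-game team averages and league-wide lists of those averages to take z-scores against; the dictionary keys and function names are mine, not part of any published tooling, and the sign convention for “bad” statistics like turnovers follows however the z-scores are oriented in the data.

```python
from statistics import mean, stdev

def z(value, population):
    """Z-score of one team's per-game figure against all FBS teams."""
    return (value - mean(population)) / stdev(population)

def z_offense_cfb(team, league):
    """team: per-game averages for one team; league: lists of that stat for every team."""
    z_passing = (z(team["comp_pct"], league["comp_pct"]) * team["completions_pg"]
                 + z(team["pass_ypa"], league["pass_ypa"]) * team["pass_att_pg"]) / 10
    z_rushing = z(team["rush_ypa"], league["rush_ypa"]) * team["rush_att_pg"] / 10
    z_turnovers = z(team["to_pg"], league["to_pg"])
    z_plays = z(team["plays_pg"], league["plays_pg"])
    return z_passing + z_rushing + z_turnovers + z_plays

def z_points_cfb(team, opponents, league, games):
    """zDefense is the same calculation run on what the team's opponents did against it;
    the summed components divided by games played estimate average margin of victory."""
    return (z_offense_cfb(team, league) + z_offense_cfb(opponents, league)) / games
```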

Among the nearly 2,000 college football teams, the average difference between zPoints (calculated margin of victory) and actual MOV was just 3.21 points, with a median of 2.77 and a maximum difference of 13.97 points.  About 20% of teams’ MOV were calculated to within 1 point, 53% to within 3 points, 79% to within 5 points, and 99% to within 10 points.  The regression model for this dataset can be seen below:

 

The NBA model has similar results using 6 parts:

  • z3P (3-point shots) = [3P FG% z-score * 3-point attempts * 3] / 10
  • z2P (2-point shots) = [2P FG% z-score * 2-point attempts * 2] / 10
  • zFreeThrows = [FT% z-score * free throw attempts] / 10
  • zTurnovers = [Turnovers per Minute z-score * League Average Points per Possession] * 2
  • zORB (offensive rebounds) = [Offensive Rebounds per Minute z-score * League Average Points per Possession]
  • zDRB (defensive rebounds) = [Defensive Rebounds per Minute z-score * League Average Points per Possession] 

 

Similar to the football model, these 6 components make up zOffense, while each team’s opponents’ calculations make up zDefense.  I particularly like z3P, z2P, and zFreeThrows because they multiply the z-score by the “weight” of the shot: 1, 2, or 3 points.  Recall that zRange is multiplied by the IF/OF Constant, which is just the average difference in runs between balls hit to the outfield and balls that remain in the infield.

I’ve only done the calculations for the 2013-2014 season, where teams averaged 1.033 points per possession.  To convert to zPoints in this model, add zOffense and zDefense, then divide by 5.
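A matching sketch for the NBA components, reusing the z() helper from the football sketch above; again, the dictionary keys are my own, and the 1.033 points per possession is the figure from this article.

```python
LEAGUE_PPP_2014 = 1.033  # 2013-14 league average points per possession (from the article)

def z_offense_nba(team, league, ppp=LEAGUE_PPP_2014):
    z3p = z(team["fg3_pct"], league["fg3_pct"]) * team["fg3_att"] * 3 / 10
    z2p = z(team["fg2_pct"], league["fg2_pct"]) * team["fg2_att"] * 2 / 10
    zft = z(team["ft_pct"], league["ft_pct"]) * team["ft_att"] / 10
    zto = z(team["tov_per_min"], league["tov_per_min"]) * ppp * 2
    zorb = z(team["orb_per_min"], league["orb_per_min"]) * ppp
    zdrb = z(team["drb_per_min"], league["drb_per_min"]) * ppp
    return z3p + z2p + zft + zto + zorb + zdrb

def z_points_nba(team, opponents, league):
    """zOffense plus the opponents' calculation (zDefense), divided by 5 per the article."""
    return (z_offense_nba(team, league) + z_offense_nba(opponents, league)) / 5
```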

In most seasons, elite teams will have an average point differential of around +10, while terrible ones will hover around -10.  The NBA model’s average difference between the calculated and actual differential was just 1.331 points, with a median of 0.800.  Seventeen of the 30 teams were calculated to within 1 point, 25 to within 2, and 29 of 30 to within 5 points per game.

The fact that these models can be created using the same general principle (rate-statistic z-scores multiplied by a factor of playing time translate into relative points) provides some evidence that similar results are calculable in baseball.  This is the basis for zDefense in PEACE.  Let’s look at the results.

Most sabermetricians would turn to the Fielding Bible Awards for a list of the best fielders by position in any given year, so we’ll use those results to compare.  If we assume that the Fielding Bible is accurate, then we would expect zDefense to produce similar conclusions.  Comparing the 2014 winners to the players ranked as the best at their position by zDefense, we can see some overlap.  The number in parentheses is the positional ranking of the Fielding Bible Award winner by zDefense.

  • Position: Fielding Bible Winner (#)…zDefense Winner
  • C: Jonathan Lucroy (12)…Yadier Molina
  • 1B: Adrian Gonzalez (1)…Adrian Gonzalez
  • 2B: Dustin Pedroia (2)…Ian Kinsler
  • 3B: Josh Donaldson (2)…Kyle Seager
  • SS: Andrelton Simmons (8)…Zack Cozart
  • LF: Alex Gordon (1)…Alex Gordon
  • CF: Juan Lagares (3)…Jacoby Ellsbury
  • RF: Jason Heyward (1)…Jason Heyward
  • P: Dallas Keuchel (5)…Hisashi Iwakuma

The multi-position winner, Lorenzo Cain, was also rated very favorably by zDefense.  While most positions don’t have a perfect match, every single Fielding Bible winner was near the very top of his position by zDefense.  This holds in almost every case, which isn’t surprising: if there were drastic disagreements about who is truly elite, we would suspect one of the metrics of being egregiously inaccurate.  Instead, we see many similarities at the top, which provides solid evidence that zDefense is a valid measure.

As always, feel free to comment with any questions, thoughts, or concerns.


A zDefense Primer

This is installment 2 of the Player Evaluator and Calculated Expectancy (PEACE) system, which will culminate in a completely independent calculation of wins relative to replacement-level players.  Part 1 can be found here: http://www.fangraphs.com/community/an-introduction-to-calculated-runs-expectancy/

I reference Calculated Runs Expectancy a lot, so I highly recommend reading that article to gain some understanding of what I’m talking about.  Today I’m going to introduce my own defensive metric, zDefense, which operates under the same aggregate sum logic as UZR, but utilizes completely different arrangements of its components.

zDefense has 3 different methods of calculation: one for pitchers and catchers, one for infield positions, and one for outfielders.  I’ll explain how all three forms work to calculate each player’s defensive contribution in terms of runs relative to average (which for fielding is also considered “replacement-level”).  For this report, the seasons 2012-2014 have been calculated and will be compared throughout.

For pitchers and catchers, where Ball in Zone (BIZ) data isn’t available, the only calculation is zFielding, which measures how many runs a player allowed relative to average according to Calculated Runs Expectancy (CRE).  Pitchers are measured on stolen bases, caught stealing, pickoffs, errors, and balks; catchers on stolen bases, caught stealing, wild pitches and passed balls, pickoffs, and errors.  To isolate each player’s individual contribution, each team’s “Base CRE” is calculated by taking their opponents’ offensive numbers and zeroing out all baserunning/fielding statistics.  Each player’s defensive numbers are then plugged in as the offensive counterparts, and the difference between the new CRE calculation and the Base CRE indicates the runs charged to that player defensively.  For example, in 2014 the St. Louis Cardinals had a Base CRE of 491 runs.  When analyzing Yadier Molina, his statistics (21 Stolen Bases, 23 Caught Stealing, 6 Pickoffs, 27 Bases Taken) are included in the equation and produce a new CRE value of 500, which means he was responsible for about 9 runs allowed defensively.  This is done for all players and then compared to the positional average, which is where pitchers and catchers deviate from the other positions.

Without BIZ data, pitchers and catchers are evaluated against the positional average number of innings played per defensive run allowed.  All other positions, however, are evaluated relative to the average number of runs allowed per ball in zone.  These numbers are almost constant year to year, with only minuscule variations (for example, the runs per BIZ for outfielders from 2012-2014 were 0.079, 0.079, and 0.078).

So in order to calculate Yadier Molina’s 2014 zDefense, his numbers would be plugged into the equation:

  • zDefense (Pitchers/Catchers) = (Innings Played / Positional Innings per Run) – Player Defensive Runs Allowed
  • zDefense (Molina, 2014) = (931.7 / 38.9) – 9.1 = +14.820

 

In 2014, catchers averaged one defensive run allowed every 38.9 innings, which means that an average catcher would be expected to allow about 24 runs in the number of innings that Molina caught.  Instead, he allowed only about 9, saving the Cardinals nearly 15 runs in 2014.  This is all it takes to calculate the defensive contribution of pitchers and catchers.
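As a sketch (function and variable names are mine), the pitcher/catcher calculation reduces to a couple of lines:

```python
def z_defense_battery(innings, positional_innings_per_run, runs_allowed):
    """Expected runs allowed at the positional rate over the player's innings,
    minus the runs CRE actually charged to the player."""
    return innings / positional_innings_per_run - runs_allowed

# Molina, 2014, with the rounded figures from the article:
# z_defense_battery(931.7, 38.9, 9.1) -> about +14.85 (the article reports +14.820,
# presumably from unrounded inputs)
```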

For infielders and outfielders, zFielding is just one component; one that essentially tells how well fielders handled balls hit to them in terms of errors and preventing baserunner advancement.  It’s calculated slightly differently than for pitchers and catchers, but the first few steps are the same: find the team Base CRE, include player defensive stats, find the difference between the two CRE calculations, compare to positional rate.  Let’s use the Royals’ Alex Gordon in 2014 as an example.  The Royals as a team had a Base CRE of 519, and Gordon’s defensive contribution resulted in a new CRE of 528 (a difference of 9.1).  From here, just plug in the variables:

  • zFielding (Infielders/Outfielders) = (Positional Runs per BIZ * Player BIZ) – Player Defensive Runs Allowed
  • zFielding (Gordon, 2014) = (0.064 * 261) – 9.1 = +7.724
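In code form, assuming the positional runs-per-BIZ rate and the CRE difference have already been computed (names are mine):

```python
def z_fielding(positional_runs_per_biz, player_biz, runs_allowed):
    """Runs an average fielder would allow on this many balls in zone,
    minus the runs CRE charges to the player."""
    return positional_runs_per_biz * player_biz - runs_allowed

# Gordon, 2014: z_fielding(0.064, 261, 9.1) -> about +7.6 (the article reports +7.724,
# presumably from an unrounded positional rate)
```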

 

Considering the number of balls in Gordon’s zone in 2014, he saved the Royals nearly 8 runs just by preventing errors and baserunner advancement.  But there are still a few other considerations for position players: zRange, zOuts, and zDoublePlays.

zRange attempts to quantify the number of runs saved simply by reaching balls in play, using BIZ data and the runs-per-BIZ rates from above.  It has 2 forms, one each for infielders and outfielders, but both begin the same way.  The first step is to find each position’s Revised Zone Rating (RZR), which measures the percentage of BIZ successfully fielded.  These numbers are more dynamic than the previous rates, and the general trend has been toward higher RZR at all positions as offensive production has dwindled over the past decade.

The next step is essentially the same as zFielding, except instead of finding relative runs allowed, we are looking for relative plays made.  For example, Alex Gordon in 2014 fielded 235 of his 261 BIZ (0.900 RZR), which was better than his positional average of 0.884.  Multiplying 261 by 0.884 shows that Gordon reached about 4 more balls than the average left fielder would have.  From there, the relative number of plays is multiplied by the appropriate constant.  This is where one of the alterations to zDefense occurred.

For infielders, the idea is that by reaching a ball in play, the fielder has prevented the ball from reaching the outfield.  In theory, this reduces the average number of runs that batted ball would be worth.  This is known as the IF (infield) Constant, and it is the difference in average runs per BIZ between outfield and infield balls in play.  In 2014 this constant was 0.068 (0.078 – 0.010), and it has been nearly identical in each of the past three seasons.

For outfielders, the ball in play will almost always be classified as an outfield ball regardless of whether the fielder reaches it or not, so the OF (outfield) constant is just the average number of runs per BIZ for the outfield as a whole.  In 2014 this was 0.078, which would be multiplied by Gordon’s 4 relative plays above average.

Additionally, each player fields a number of balls outside of their zone (OOZ).  The number of OOZ plays is halved because they aren’t necessarily run-saving plays: when a shortstop catches a popup on the pitcher’s mound, or when the first baseman ranges to his right rather than letting the second baseman handle the play, those count as OOZ plays without providing much marginal benefit.  Half of the OOZ plays is also multiplied by the appropriate constant and added onto the previous product, producing zRange.

  • zRange = {[Player Plays Made – (Player BIZ * Positional RZR)] + (Player OOZ Plays Made / 2)} * IF/OF Constant
  • zRange (Gordon, 2014) = {[235 – (261 * 0.884)] + (106 / 2)} * 0.078 = +4.436
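The same step as a sketch, with my own function and parameter names:

```python
def z_range(plays_made, player_biz, positional_rzr, ooz_plays, if_of_constant):
    """Plays above the positional expectation on balls in zone, plus half of the
    out-of-zone plays, converted to runs with the IF or OF constant."""
    plays_above_average = plays_made - player_biz * positional_rzr
    return (plays_above_average + ooz_plays / 2) * if_of_constant

# Gordon, 2014: z_range(235, 261, 0.884, 106, 0.078) -> about +4.47
# (the article reports +4.436; the small difference comes from rounding the inputs)
```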

 

On top of saving the Royals 8 runs with his arm and glove, Gordon also saved them over 4 runs with his legs and eyes.  This is where the biggest change to the formula happened; before, zRange was being calculated nearly identically to zOuts, which resulted in players essentially being credited twice with their relative RZR.  Instead, zRange just multiplies relative plays by the appropriate constant and recognizes that zOuts is a reflection of range and ability to convert balls into outs.

zOuts uses a very different approach from the previous 2 components; rather than finding relative run values by conventional means, a rate-statistic z-score is found and then multiplied by “playing time.”  It will be shown in the next installment that this works remarkably well, but for now we are just looking at the derivation.  For zOuts, 2 different numbers are required for each player: their Revised Zone Rating and their Field-to-Out Percentage (F2O%).  These 2 numbers combine to form outs per BIZ, which is the comparative average each player is evaluated against.  Like the previous numbers, these also remain fairly consistent, with a general trend negatively related to scoring.

Also required for the z-scores is the standard deviation.  For these calculations, I have been using the standard deviation of only those players with at least 100 innings played at the position, to eliminate outliers.

Taking the z-score of outs per BIZ is simple enough, but what defines “playing time”?  Well, there are 2 factors that work well: the first is the percentage of total innings played at that position by that player.  If a team plays 1,400 innings in the field over the course of the year, there are 1,400 defensive innings available at each position, so a player who played 1,000 of them would have played about 71% of the defensive innings at that position.  The second factor recognizes that while players may have played an equal number of innings, they may not have had an equal number of balls to field.  This factor is one-half the square root of each player’s BIZ.

  • zOuts = [(Player O/BIZ – Positional O/BIZ) / Positional O/BIZ Standard Deviation] * (Player Innings / Team Innings) * (√ Player BIZ / 2)
  • zOuts (Gordon, 2014) = [(0.450 – 0.417) / 0.068] * (1372.7 / 1450.7) * (√ 261 / 2) = +3.741
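A minimal sketch of that formula (names are mine), which reproduces Gordon’s figure to within rounding:

```python
import math

def z_outs(player_o_per_biz, pos_o_per_biz, pos_o_per_biz_sd,
           player_innings, team_innings, player_biz):
    """Outs-per-BIZ z-score scaled by the two playing-time factors: share of the
    team's innings at the position, and half the square root of the player's BIZ."""
    z_score = (player_o_per_biz - pos_o_per_biz) / pos_o_per_biz_sd
    return z_score * (player_innings / team_innings) * (math.sqrt(player_biz) / 2)

# Gordon, 2014: z_outs(0.450, 0.417, 0.068, 1372.7, 1450.7, 261) -> about +3.71
# (the article reports +3.741)
```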

 

zOuts is a blended statistic; it measures how well players convert balls into outs by considering their range and out-producing ability.  Alex Gordon saved the Royals another 4 runs this way, which brings his total zDefense to:

  • zDefense (Outfielders) = zFielding + zRange + zOuts
  • zDefense (Gordon, 2014) = +7.724 + 4.436 + 3.741 = +15.900

 

This is all it takes to calculate the defensive contribution of outfielders, but infielders still have one more factor to consider: double-play ability.  zDoublePlays is nearly identical to zOuts, except the positional average required is double plays per BIZ.

From there, the calculation is almost the same as zOuts:

  • zDoublePlays = [(Player DP/BIZ – Positional DP/BIZ) / Positional DP/BIZ Standard Deviation] * (Player Innings / Team Innings) * (√ Player BIZ / 2) * Positional DP/BIZ

 

The final multiplier affects the weight of zDoublePlays in the overall zDefense equation.  The ability to turn double plays isn’t really a selling point for corner infielders because of the relative rarity of those plays.  Double-play ability is much more relevant to middle infielders, and multiplying by the positional average helps bring this disparity into the equation.  J.J. Hardy consistently ranks as elite in terms of double-play ability, so we’ll use him as the example here:

  • zDoublePlays (Hardy, 2014) = [(0.313 – 0.236) / 0.091] * (1257.0 / 1461.3) * (√ 316 / 2) * 0.236 = +1.540

 

And if we want the entire infielder formula written out:

  • zDefense (Infielders) = zFielding + zRange + zOuts + zDoublePlays
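And a sketch of the last piece plus the infielder total, again with my own names; it matches Hardy’s figure to within rounding:

```python
import math

def z_double_plays(player_dp_per_biz, pos_dp_per_biz, pos_dp_per_biz_sd,
                   player_innings, team_innings, player_biz):
    """Same shape as zOuts, with a final multiplication by the positional DP/BIZ
    rate so the component carries more weight up the middle than at the corners."""
    z_score = (player_dp_per_biz - pos_dp_per_biz) / pos_dp_per_biz_sd
    playing_time = (player_innings / team_innings) * (math.sqrt(player_biz) / 2)
    return z_score * playing_time * pos_dp_per_biz

def z_defense_infield(zfielding, zrange, zouts, zdp):
    """Total zDefense for infielders; outfielders simply drop the double-play term."""
    return zfielding + zrange + zouts + zdp

# Hardy, 2014: z_double_plays(0.313, 0.236, 0.091, 1257.0, 1461.3, 316) -> about +1.53
# (the article reports +1.540)
```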

 

Like the previous post, there is a lot of new information to take in here, so feel free to ask any questions or leave comments with feedback, thoughts, or concerns about the work I’ve presented.  The next installment will be an exploration of z-scores in sports and how they correspond to actual points/runs, which I’ll use to provide credibility for zDefense.


An Introduction to Calculated Runs Expectancy

Introduction first: my name is Walter King, and over the next few weeks I plan on sharing my alternative to Wins Above Replacement, which I call PEACE: Player Evaluator and Calculated Expectancy.  The engine behind PEACE is Calculated Runs Expectancy, which is what this article will cover.

Calculated Runs Expectancy (CRE) is an analytical model that estimates runs produced by a player, team, or league for any number of games.  CRE operates under the assumption that every single play on the field is relevant to output and thus can be translated into a statistical measure.

In its general form, the Calculated Runs Expectancy formula looks like this:

  • CRE = (√ {[(Bases Acquired) * [(Potential Runs) * (Quantified Advancement) / (Total Opportunities)]] / Outs Made²} * (Total Opportunities) + (Hit and Run Plays) + Home Runs) / Runs Divisor, relative to the league

 

This formula was reached by following a particular line of logical reasoning, which starts with the assumption that the singular objective of baseball is to win every game (well, duh!).  Winning every game mathematically requires one of two scenarios: either a team allows zero runs, or it scores an infinite number of runs; both result in one team scoring 100% of the runs and thereby assuring 100% of the wins.  Because the objective is to win the game, and the only way to assure victory is to score the most runs, the only two ways players can contribute to winning are by scoring runs or by preventing the opponent from doing so.  This sounds painfully simple, but we have to establish that metrics are limited in usefulness if there is no clear link to runs, and therefore wins.  This assumption forces us to define what makes a run in terms of statistics.

With so many different statistics to represent the happenings on the field, it can be tough to form a clear definition.  Keep it simple.  Break down what a run is in the simplest way possible: a run scored is when a player safely touches all four bases, ending by touching home plate.  That’s it.  A team must acquire at least 4 bases in order to score 1 run, so the first formula we can use in our analysis is Bases Acquired:

  •  Bases Acquired = TB + BB + HBP + ROE + XI + SH + SF + SB + BT (bases taken)

 

This is a complete representation of the number of individual bases a hitter acquired, which is often overlooked as valuable information.

My second definition of a run comes directly from Bill James’ Runs Created statistic: to score a run, a batter needs to first reach base and then advance around the bases until they reach home plate.  This view treats offensive production as the completion of those two smaller goals.  James captured these concepts with three basic factors (On-Base, Advancement, and Opportunity) that combine to calculate Runs Created.

But what composes these factors?  Well, this is where I venture slightly away from James, attempting to encompass a more complete representation of a hitter in my calculations.  I’ve altered them a bit and given them new names:

  • Potential Runs = TOB (times on base) – CS – GIDP – BPO (basepath outs)
  • Quantified Advancement = TB + SB + SH + SF + BT
  • Total Opportunities = PA + SB + CS + BT + BPO

 

With these now defined, my modified Runs Created formula looks like this:

  •  Modified Runs Created = [(TOB – CS – GIDP – BPO) * (TB + SB + SH + SF + BT)] / (PA + SB + CS + BT + BPO)

 

Bases Acquired and Runs Created are counting statistics, but we want rate statistics.  I believe strongly in the principle behind VORP, which asserts that production must always be measured relative to its cost in outs.  To amalgamate our measures of offensive production and outs made, we simply divide each by outs made to create two “per out” statistics.

So what we have now are two different measures of a batter’s efficiency; one that calculates bases acquired per out made and another that finds calculated runs scored per out made.  By multiplying the two, we can incorporate two different statistics of efficiency in our evaluation of hitters.  Conceptually, this represents a reconciliation of two different philosophies on how runs are produced.  We’ll call the resulting quantity Offensive Efficiency.

  • Offensive Efficiency = (Bases Acquired * Runs Created) / Outs Made²
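A sketch of the offensive building blocks defined so far, using Baseball-Reference-style abbreviations as dictionary keys (the stats dictionary and its key names are my own framing, not an official input format):

```python
def bases_acquired(s):
    """Total bases plus every other individual base a hitter picked up."""
    return (s["TB"] + s["BB"] + s["HBP"] + s["ROE"] + s["XI"]
            + s["SH"] + s["SF"] + s["SB"] + s["BT"])

def modified_runs_created(s):
    """Potential Runs times Quantified Advancement, divided by Total Opportunities."""
    potential_runs = s["TOB"] - s["CS"] - s["GIDP"] - s["BPO"]
    advancement = s["TB"] + s["SB"] + s["SH"] + s["SF"] + s["BT"]
    opportunities = s["PA"] + s["SB"] + s["CS"] + s["BT"] + s["BPO"]
    return potential_runs * advancement / opportunities

def offensive_efficiency(s):
    """Bases acquired per out times modified runs created per out."""
    return bases_acquired(s) * modified_runs_created(s) / s["outs"] ** 2
```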

 

I particularly like this formula because the two key components that comprise it are largely considered obsolete by modern sabermetrics.  Both Total Average (bases/outs) and Runs Created are from the 1970s and are throwbacks to better uniforms and simpler ways of thinking.  If you were to approach a stathead today championing total average or runs created as “the answers,” they would first dismiss you, and then suggest more modern metrics.  Much like the struggle sabermetrics saw when first attempting to become a respected pursuit, modern sabermetrics seems to scoff at the idea that older, simpler calculations can be valuable.  But both Total Average and Runs Created per Out are logically sound in their function; they break down the aspects of hitting into real-life objectives that correspond to real-life results.  Offensive Efficiency will definitely tell you which batters performed most efficiently, but it is sensitive to outliers.  To counter this, recall the general CRE equation:

  • CRE = (√ {[(Bases Acquired) * [(Potential Runs) * (Quantified Advancement) / (Total Opportunities)]] / Outs Made²} * (Total Opportunities) + (Hit and Run Plays) + Home Runs) / Runs Divisor, relative to the league

 

Multiplying Offensive Efficiency by Total Opportunities creates a balance between efficient and high-volume performers.  The next step, inspired by Base Runs, is to add “Hit and Run Plays” along with Home Runs to the equation, because those are instances when a run is guaranteed to score.  Hit and Run Plays is my name for situational baserunning plays (found on Baseball-Reference) that result in a runner advancing more bases than the ball in play would suggest.  For example, when a batter hits a single with a runner on first, the runner would definitely be expected to reach second base.  Reaching third or scoring, however, would indicate a skillful play (or a hit and run) by an opportunistic baserunner.  Three stats make up Hit and Run Plays: 1s3/4 (reaching third or home from first on a single), 2s4 (scoring from second on a single), and 1d4 (scoring from first on a double).

At this point, all that’s left is the Runs Divisor.  If you’re following along at home, an individual batter’s season without a Runs Divisor would land somewhere between 200 and 500, while a single team season would typically fall between 2,000 and 3,000.  The Runs Divisor is specific to each season and league (so the 2014 AL and NL each have a unique divisor): it is the average of each team’s optimal divisor, the one that would make its raw CRE equal its actual runs scored.  Let’s use a 2-team league as an example.  Team A has a raw CRE of 2,500 while scoring 700 actual runs, so its optimal divisor would be 3.57.  Team B, on the other hand, has a raw CRE of 2,250 and scored 600, a divisor of 3.75.  The league’s Runs Divisor would be the average of the two: 3.66.  This divisor is used for every individual player in that league as well.  Divisors vary every year, but they always remain very similar.
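Putting the pieces together, here is a hedged sketch of raw CRE, the league Runs Divisor, and the final scaled value.  It reuses the helpers sketched above, and the "HNR" key (hit-and-run plays) is my own shorthand for the 1s3/4, 2s4, and 1d4 counts.

```python
def raw_cre(s):
    """Square root of Offensive Efficiency, scaled by Total Opportunities,
    plus the guaranteed-run events (hit-and-run plays and home runs)."""
    opportunities = s["PA"] + s["SB"] + s["CS"] + s["BT"] + s["BPO"]
    return (offensive_efficiency(s) ** 0.5) * opportunities + s["HNR"] + s["HR"]

def league_runs_divisor(team_raw_cres, team_actual_runs):
    """Average of each team's optimal divisor (raw CRE / actual runs) for one league-season."""
    return sum(c / r for c, r in zip(team_raw_cres, team_actual_runs)) / len(team_raw_cres)

def calculated_runs_expectancy(s, runs_divisor):
    return raw_cre(s) / runs_divisor

# Worked example from the article: league_runs_divisor([2500, 2250], [700, 600]) -> about 3.66
```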

A full list of Runs Divisors from the seasons 1975-2014 can be seen here:

[Table: Runs Divisors by league and season, 1975-2014]

The average divisor across that time span was 3.7631, with a standard deviation of just 0.0268.  This provides strong evidence of the relationship between CRE and runs; the two are related in the same way across generations of ballplayers.  When we graph the results of CRE against actual runs for all 1,114 teams in that timespan, we see some very convincing results:

[Figure: team CRE vs. actual runs scored, 1975-2014]

The R² value (0.9682) corresponds to an average difference between actual and calculated runs of 14.02 runs per team season.  When compared to other run estimators, the differences are significant:

Run Estimator (Creator): Average Difference (runs), R²

  • Base Runs (David Smyth): 18.77, 0.9441
  • Estimated Runs Produced (Paul Johnson): 18.15, 0.9480
  • Extrapolated Runs (Jim Furtado): 18.33, 0.9515
  • Runs Created (Bill James): 20.01, 0.9383
  • Weighted Runs Created (Tom Tango): 19.37, 0.9443

 

The gap between CRE and the 5 other estimators is consistent across the entire span of 40 seasons.

There is a lot of new information to take in here, so feel free to comment below with any questions or feedback.  Part 2 will be uploaded in a few days.