Archive for Research

The Homer Numbers of a Hypothetically-Healthy Giancarlo Stanton

Giancarlo Stanton has missed significant playing time since his MLB debut in 2010 and has never played more than 150 games of a 162-game season (145 and 123 games being his next two highest totals). In spite of his injury-shortened seasons, Stanton has still been among the league home-run leaders in 2011, 2012, and 2014 (his 150, 123, and 145-game seasons, respectively).

Giancarlo Stanton Since Debut (June 2010)
Season Games PA HR HR MLB Rank Injury Report
2010 100 396 22 T-55 ——
2011 150 601 34 9 Hamstring issues limited time
2012 123 501 37 7 15-day DL: Arthroscopic knee surgery
2013 116 504 24 T-31 15-day DL: Strained right hamstring
2014* 145 638 37 2 Season-ending facial fracture
2015 74 318 27 T-25 15-day DL: Season-ending hamate (hand) fracture
2016 119 470 27 48 15-day DL: Strained left groin
* Finished second in NL MVP voting (winner: Clayton Kershaw)

Career-wise, Stanton has amassed a total of 208 home runs, good enough for 16th-most of any player through their age-26 season and among the likes of Miguel Cabrera and Jose Canseco.

HR-leaders through Age-26 season
Rank Player HR
1 Alex Rodriguez 298
2 Jimmie Foxx 266
3 Eddie Mathews 253
4 Albert Pujols 250
5 Mickey Mantle 249
6 Mel Ott 242
7 Frank Robinson 241
8 Ken Griffey, Jr. 238
9 Orlando Cepeda 222
10 Andruw Jones 221
11 Hank Aaron 219
12 Juan Gonzalez 214
13 Johnny Bench 212
T-14 Miguel Cabrera 209
T-14 Jose Canseco 209
16 Giancarlo Stanton 208

Given Stanton’s injury-plagued career, his career home-run numbers are a lower bound on what he might have accomplished had he played full, injury-free seasons following his debut. To quantify how much his injuries have suppressed his career power numbers thus far, I extrapolated the home-run totals of his injury-shortened seasons into full-season hypothetical home-run totals (hHR) using the formula below:

hHR = FLOOR(HR/G * 162)

The formula simply assumes that Stanton maintains his HR/G rate through a whole 162-game season and then conservatively rounds down. We can now compare home-run totals between the real Giancarlo Stanton and our hypothetical Giancarlo Stanton. I excluded his 2010 season from the extrapolation, since he debuted that June and his 100 games were not shortened by injury.
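As a quick check, the extrapolation can be reproduced in a few lines of Python. This is only a minimal sketch: the function and variable names are my own, the games and home-run figures are copied from the first table above, and integer arithmetic stands in for FLOOR so that floating-point rounding cannot change a result.

```python
FULL_SEASON_GAMES = 162

# (games, home runs) by season, copied from the table above.
STANTON = {
    2010: (100, 22),
    2011: (150, 34),
    2012: (123, 37),
    2013: (116, 24),
    2014: (145, 37),
    2015: (74, 27),
    2016: (119, 27),
}


def hypothetical_hr(games, hr):
    """FLOOR(HR / G * 162), computed in integers to avoid float rounding."""
    return hr * FULL_SEASON_GAMES // games


def hypothetical_totals(seasons, skip=(2010,)):
    """Extrapolate each season; seasons listed in `skip` keep their real HR."""
    return {
        year: hr if year in skip else hypothetical_hr(games, hr)
        for year, (games, hr) in seasons.items()
    }


if __name__ == "__main__":
    hhr = hypothetical_totals(STANTON)
    for year, total in hhr.items():
        print(year, total)
    print("real career HR:", sum(hr for _, hr in STANTON.values()))
    print("hypothetical career HR:", sum(hhr.values()))
```

Leaving 2010 in `skip` keeps the debut season at its real total, which matches how the comparison is set up.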

Real Giancarlo Stanton vs. Hypothetical Giancarlo Stanton
Season Games HR HR MLB Rank hGames hHR hHR MLB Rank
2010 100 22 T-55 100 22 T-55
2011 150 34 9 162 36 8
2012 123 37 7 162 48 1
2013 116 24 T-31 162 33 T-9
2014 145 37 2 162 41 1
2015 74 27 T-25 162 59 1
2016 119 27 48 162 36 T-16

The real Stanton never led MLB in home runs, but our hypothetical Stanton climbs into the MLB lead in three of his hypothetical seasons (2012, 2014, and 2015).

Career-wise, our hypothetical Stanton would have hit 275 total home runs. That adds 67 home runs to his real total and moves him from 16th to second place on the age-26 leaderboard, only 23 home runs behind the leader, Alex Rodriguez.

HR-leaders through Age-26 season
Rank Player HR
1 Alex Rodriguez 298
2 Giancarlo Stanton (hypothetical) 275
3 Jimmie Foxx 266
4 Eddie Mathews 253
5 Albert Pujols 250
6 Mickey Mantle 249
7 Mel Ott 242
8 Frank Robinson 241
9 Ken Griffey, Jr. 238
10 Orlando Cepeda 222
11 Andruw Jones 221
12 Hank Aaron 219
13 Juan Gonzalez 214
14 Johnny Bench 212
T-15 Miguel Cabrera 209
T-15 Jose Canseco 209
17 Giancarlo Stanton (real) 208

Of note, using the same formula to calculate Stanton’s career strikeout totals predicts a whopping 1,271 strikeouts for our hypothetical Stanton. His real total of 977 strikeouts through age 26 (second-highest) balloons past Justin Upton’s age-26-leading 1,026 and takes clear command of first place.

In reality, Stanton is a three-time All-Star, a Silver Slugger (2014), and a Home Run Derby champion (2016), and he historically ranks among the best in home-run totals for his age, all while facing injury issues in all of his first six full big-league seasons. Our hypothetically-healthy Giancarlo Stanton greatly improves his career numbers and garners himself a few MLB home-run crowns, giving a glimpse into how much larger his career numbers could be today had his first six full seasons been injury-free. As Stanton’s career progresses, it will be interesting to see where his home-run totals end up, and, unfortunately, how much greater they could have been.

Credit to Baseball-Reference for all publicly available data.


The Reds Have a Spin Rate Problem

With baseball’s annual winter meetings taking place this past week near Washington, D.C., I want to look at the Cincinnati Reds and one potential way to improve a historically bad 2016 pitching staff. While the Reds did just post the worst WAR by a pitching staff since 1900, they were completely average in another respect, which likely helped put them on a path to history no team wants to make. The Reds threw more average-spin four-seam fastballs than any other team in 2016.

We are just scratching the surface on spin-rate research. While we can’t say much for sure about ways to improve spin rate or why it differs from pitcher to pitcher, we do have a pretty good idea that it’s good to be different. The ultimate goal of pitching is to disrupt timing, create mis-hits and generate swings and misses. The more deception a pitcher can create by being farther away from average spin, on either the high end or the low end of the spectrum, the better off they appear to be. This was a major problem for the Reds last season, as they threw a whole bunch of average towards the plate.

Taking spin-rate data from baseballsavant.com, I looked at all 30 teams and their four-seam fastball data.  I set a minimum of 50 four-seams thrown by a pitcher to be included in the data set.  Team-by-team totals show that the Reds threw the fifth-most four-seam fastballs in 2016:

  1. Rays: 10823
  2. Diamondbacks: 10667
  3. Marlins: 10606
  4. Rockies: 10102
  5. Reds: 9991

The average spin rate for the four-seam fastball in 2016 was 2241 revolutions per minute (RPM). This season, the Reds pitching staff was pretty close to the MLB mean at 2232 RPM. Only the Astros, Athletics and Mets were closer to the mean (2240, 2245 and 2248, respectively). Now, let’s create a bucket we will call “four-seams around average” and see what we collect. This bucket will include pitches within 50 RPM of 2241 on either side, for a 100-RPM range of 2191-2291. Next, I’ll use data from the 10 teams closest to the MLB mean, the most “average” spin teams, to determine who threw the most “average fastballs.” Here are the top five totals:

  1. Reds: 3165
  2. Mets: 2674
  3. Athletics: 2072
  4. Angels: 2056
  5. Braves: 1973
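The bucket count itself is simple to express in code. The sketch below is in Python rather than the R used for the simulation, and it makes two assumptions of my own: the function name is illustrative, and both ends of the 2191-2291 range are treated as inside the bucket. The spin rates in the example are made up, not Statcast data.

```python
MLB_MEAN_RPM = 2241   # 2016 league-average four-seam spin rate
HALF_WIDTH_RPM = 50   # the bucket reaches 50 RPM either side of the mean


def count_average_fastballs(spin_rates, mean=MLB_MEAN_RPM, half_width=HALF_WIDTH_RPM):
    """Count pitches whose spin lies in [mean - half_width, mean + half_width]."""
    low, high = mean - half_width, mean + half_width
    return sum(1 for rpm in spin_rates if low <= rpm <= high)


if __name__ == "__main__":
    # Made-up spin rates for illustration only.
    sample = [2105, 2191, 2230, 2241, 2291, 2292, 2410]
    print(count_average_fastballs(sample))
```

Run against each team’s list of four-seam spin rates, a count like this would give the totals in the list above.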

As you can see, the Reds ran away with what we have designated as “average fastballs,” with nearly 500 more than the Mets and over 1,000 more than the third-place A’s. You could be saying to yourself that the Reds may have thrown so many average-spin fastballs because they threw the fifth-most four-seams in the majors this past season. And you would be right, since a larger sample obviously affords more chances for average pitches to be thrown (especially if the data follows a normal distribution, like ours does). So I’ll bring in another measurement to further support that the Reds were very average in 2016: standard deviation.

I’m sure most people are familiar with standard deviation (SD), so I won’t waste time going into the formula, but an easy explanation is that it’s one way of measuring dispersion in a given data set. The lower the SD, the closer the data points sit to the mean. Looking again at our 10 average spin-rate teams and the standard deviation of each team’s data set, here are the five teams with the lowest SD:

  1. Reds: 123.99
  2. Mets: 138.56
  3. Angels: 142.838
  4. Astros: 153.105
  5. Cardinals: 157.645

There are the Reds leading the way again! Let’s attempt to put all 10 teams on an even playing field by taking a sample of 1,000 four-seam fastballs from each team. The mean of this sample is our random variable. In R, we will use the replicate function to generate 10,000 of these random variables to learn about their distribution. After running the simulation, the random variables follow a normal distribution, which is something we already knew. What I was interested in is whether the team with the lowest standard deviation would change once each team had the same sample size. Here are the five lowest teams in SD after 10,000 simulations:

  1. Reds: 3.68
  2. Mets: 4.106
  3. Angels: 4.126
  4. Astros: 4.472
  5. Cardinals: 4.637
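For readers who want to try the resampling step, here is a rough Python equivalent of the idea. It is not the R code behind the numbers above. It assumes each sample of 1,000 pitches is drawn without replacement (the default behavior of R’s sample function), the helper names are my own, and the two “teams” are synthetic, normally distributed spin rates rather than real data.

```python
import random
import statistics


def sd_of_sample_means(spin_rates, sample_size=1000, n_reps=10_000, seed=0):
    """SD of the mean spin rate across repeated random samples of one team's pitches."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(spin_rates, sample_size))  # drawn without replacement
        for _ in range(n_reps)
    ]
    return statistics.stdev(means)


def synthetic_team(mean_rpm, sd_rpm, n_pitches, seed):
    """Made-up, normally distributed spin rates standing in for real pitch data."""
    rng = random.Random(seed)
    return [rng.gauss(mean_rpm, sd_rpm) for _ in range(n_pitches)]


if __name__ == "__main__":
    tight = synthetic_team(2232, 124, 5000, seed=1)  # low-spread staff
    loose = synthetic_team(2245, 158, 5000, seed=2)  # higher-spread staff
    for name, team in (("tight", tight), ("loose", loose)):
        print(name,
              round(statistics.stdev(team), 1),
              round(sd_of_sample_means(team, n_reps=500), 2))
```

With samples of 1,000, the SD of the sample means should land near the team’s SD divided by the square root of 1,000, and a little lower when sampling without replacement from a finite set of pitches. That is in line with the Reds going from 123.99 to 3.68, and it is why the ordering of the teams does not change.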

No change. The Reds had the lowest SD in the group deemed closest to the MLB mean in four-seam spin, and they still had the lowest after a random sample of 1,000 pitches was simulated 10,000 times. This further supports that the Reds pitching staff has a spin-rate problem, and that the result is not just a product of a larger sample size. In fact, the Reds had the lowest standard deviation of all 30 teams!

So where can the Reds look over the rest of the offseason to improve a pitching staff in need of upgrades in spin rate? Well, a lot of the work in finding spin value from this year’s crop of free agents was done a few weeks ago on this site. While Cincinnati won’t be in on the top-tier free agents, there are more than a few options that shouldn’t cost more than $5-6 million in annual value, which the Reds can afford, to not only improve the bullpen but also move farther away from the average spin that may have caused them problems all season.


Hardball Retrospective – What Might Have Been – The “Original” 2013 Marlins

In “Hardball Retrospective: Evaluating Scouting and Development Outcomes for the Modern-Era Franchises”, I placed every ballplayer in the modern era (from 1901-present) on their original team. I calculated revised standings for every season based entirely on the performance of each team’s “original” players. I discuss every team’s “original” players and seasons at length along with organizational performance with respect to the Amateur Draft (or First-Year Player Draft), amateur free agent signings and other methods of player acquisition.  Season standings, WAR and Win Shares totals for the “original” teams are compared against the “actual” team results to assess each franchise’s scouting, development and general management skills.

Expanding on my research for the book, the following series of articles will reveal the teams with the biggest single-season difference in the WAR and Win Shares for the “Original” vs. “Actual” rosters for every Major League organization. “Hardball Retrospective” is available in digital format on Amazon, Barnes and Noble, GooglePlay, iTunes and KoboBooks. The paperback edition is available on Amazon, Barnes and Noble and CreateSpace. Supplemental Statistics, Charts and Graphs along with a discussion forum are offered at TuataraSoftware.com.

Don Daglow (Intellivision World Series Major League Baseball, Earl Weaver Baseball, Tony LaRussa Baseball) contributed the foreword for Hardball Retrospective. The foreword and preview of my book are accessible here.

Terminology

OWAR – Wins Above Replacement for players on “original” teams

OWS – Win Shares for players on “original” teams

OPW% – Pythagorean Won-Loss record for the “original” teams

AWAR – Wins Above Replacement for players on “actual” teams

AWS – Win Shares for players on “actual” teams

APW% – Pythagorean Won-Loss record for the “actual” teams

Assessment

The 2013 Miami Marlins 

OWAR: 33.0     OWS: 255     OPW%: .468     (76-86)

AWAR: 18.5      AWS: 185     APW%: .383     (62-100)

WARdiff: 14.5                        WSdiff: 70  
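The summary lines above are straightforward arithmetic, sketched here in Python. The names are illustrative, and turning a Pythagorean winning percentage into a record by rounding to the nearest whole win over 162 games is my assumption about how the parenthetical records were produced.

```python
SEASON_GAMES = 162

# Summary figures for the 2013 Marlins, copied from the lines above.
ORIGINAL = {"WAR": 33.0, "WS": 255, "PW%": 0.468}
ACTUAL = {"WAR": 18.5, "WS": 185, "PW%": 0.383}


def record_from_pct(win_pct, games=SEASON_GAMES):
    """Convert a winning percentage to (wins, losses), rounding to whole wins."""
    wins = round(win_pct * games)
    return wins, games - wins


if __name__ == "__main__":
    print("WARdiff:", ORIGINAL["WAR"] - ACTUAL["WAR"])
    print("WSdiff:", ORIGINAL["WS"] - ACTUAL["WS"])
    print("Original record:", record_from_pct(ORIGINAL["PW%"]))
    print("Actual record:", record_from_pct(ACTUAL["PW%"]))
```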

 

The “Original” 2013 Marlins tied with the Phillies for last place, yet the ball club managed to school the “Actuals” by a 14-game margin. Miguel Cabrera seized MVP honors for the second consecutive season and notched his third straight batting title. “Miggy” produced a .348 BA, dialed long-distance 44 times and knocked in 137 baserunners. Adrian Gonzalez swatted 22 big-flies and reached the century mark in RBI for the sixth time in his career. Matt Dominguez drilled 25 two-base hits and blasted 21 round-trippers. Giancarlo Stanton supplied 26 doubles and 24 four-baggers as a member of the “Originals” and “Actuals”.

Original 2013 Marlins

STARTING LINEUP POS OWAR OWS
Josh Willingham LF 0.23 9
Marcell Ozuna CF/RF 0.16 6.68
Giancarlo Stanton RF 3.14 16.66
Adrian Gonzalez 1B 4.12 21.17
Josh Wilson 2B -0.11 0.54
Robert Andino SS -0.26 0.82
Miguel Cabrera 3B 6.8 33.13
Brett Hayes C 0.17 1.01
BENCH POS OWAR OWS
Matt Dominguez 3B 0.84 11.34
Gaby Sanchez 1B 1.91 10.36
Christian Yelich LF 1.34 8.34
Logan Morrison 1B 0.32 6.16
Chris Coghlan LF 0.32 5.35
Jim Adduci LF 0.03 0.59
Alex Gonzalez 1B -0.94 0.32
Mark Kotsay LF -1 0.17
Kyle Skipworth C -0.05 0.01
Scott Cousins LF -0.06 0

Actual 2013 Marlins

STARTING LINEUP POS AWAR AWS
Christian Yelich LF 1.34 8.34
Justin Ruggiano CF 1.11 9.23
Giancarlo Stanton RF 3.14 16.66
Logan Morrison 1B 0.32 6.16
Donovan Solano 2B 0.44 6.95
Adeiny Hechavarria SS -2.33 4.28
Ed Lucas 3B 0.42 7.2
Jeff Mathis C -0.17 3.22
BENCH POS AWAR AWS
Marcell Ozuna RF 0.16 6.68
Placido Polanco 3B -0.35 5.41
Chris Coghlan LF 0.32 5.35
Derek Dietrich 2B 0.63 5.29
Juan Pierre LF -0.27 4.38
Rob Brantly C -0.98 2.61
Greg Dobbs 1B -0.6 2.5
Jake Marisnick CF 0.13 1.54
Miguel Olivo C 0.17 1.17
Nick Green SS -0.01 1.05
Chris Valaika 2B -0.13 0.58
Joe Mahoney 1B -0.04 0.54
Koyie Hill C -0.55 0.54
Austin Kearns RF -0.13 0.25
Matt Diaz LF -0.14 0.15
Casey Kotchman 1B -0.25 0.06
Kyle Skipworth C -0.05 0.01
Jordan Brown DH -0.06 0
Gil Velazquez 3B -0.01 0

Jose D. Fernandez (12-6, 2.19) merited 2013 NL Rookie of the Year honors and an All-Star invitation while placing third in the NL Cy Young balloting. Portsider Jason Vargas contributed 9 victories with a 4.02 ERA to the “Originals” rotation and Henderson “The Entertainer” Alvarez fashioned a 3.59 ERA and 1.140 WHIP for the “Actuals” in 17 starts. The Marlins’ bullpen featured Steve Cishek (2.33, 34 SV). A.J. Ramos whiffed 86 batsmen in 68 relief appearances.

Original 2013 Marlins

ROTATION POS OWAR OWS
Jose D. Fernandez SP 5.57 16.22
Jason Vargas SP 2 7.04
Tom Koehler SP 0.46 3.96
Brad Hand SP 0.4 1.43
Alex Sanabia SP -0.33 0.6
BULLPEN POS OWAR OWS
Steve Cishek RP 1.62 12.99
A. J. Ramos RP 0.34 5.23
Ronald Belisario RP -0.9 2.61
Sandy Rosario RP 0.24 2.53
Dan Jennings RP 0.08 1.95
Ross Wolf SW 0.14 1.92
Arquimedes Caminero RP 0.16 0.95
Logan Kensing RP 0.02 0.1
Josh Johnson SP -1.25 0.04
Josh Beckett SP -0.81 0
Chris Hatcher RP -0.93 0
Chris Leroux RP -0.17 0
Edgar Olmos RP -0.68 0
Chris Resop RP -0.6 0
Chris Volstad RP -0.49 0

Actual 2013 Marlins

ROTATION POS AWAR AWS
Jose D. Fernandez SP 5.57 16.22
Henderson Alvarez SP 1.89 6.19
Nathan Eovaldi SP 1.39 5.63
Ricky Nolasco SP 1.13 4.92
Jacob Turner SP 0.87 4.56
BULLPEN POS AWAR AWS
Steve Cishek RP 1.62 12.99
Mike Dunn RP 1.06 6.64
Chad Qualls RP 1.22 6.22
A. J. Ramos RP 0.34 5.23
Ryan Webb RP 0.6 5.02
Tom Koehler SP 0.46 3.96
Kevin Slowey SP 0.46 3.15
Dan Jennings RP 0.08 1.95
Brad Hand SP 0.4 1.43
Arquimedes Caminero RP 0.16 0.95
Alex Sanabia SP -0.33 0.6
Brian Flynn SP -0.59 0.14
Steve Ames RP -0.02 0.02
Duane Below RP -0.19 0
Sam Dyson SP -0.59 0
Chris Hatcher RP -0.93 0
Wade LeBlanc SP -0.41 0
John Maine RP -0.66 0
Edgar Olmos RP -0.68 0
Zach Phillips RP -0.03 0
Jon Rauch RP -0.71 0

Notable Transactions

Miguel Cabrera 

December 4, 2007: Traded by the Florida Marlins with Dontrelle Willis to the Detroit Tigers for Dallas Trahern (minors), Burke Badenhop, Frankie De La Cruz, Cameron Maybin, Andrew Miller and Mike Rabelo. 

Adrian Gonzalez 

July 11, 2003: Traded by the Florida Marlins with Will Smith (minors) and Ryan Snare to the Texas Rangers for Ugueth Urbina.

January 6, 2006: Traded by the Texas Rangers with Terrmel Sledge and Chris Young to the San Diego Padres for Billy Killian (minors), Adam Eaton and Akinori Otsuka.

December 6, 2010: Traded by the San Diego Padres to the Boston Red Sox for a player to be named later, Reymond Fuentes, Casey Kelly and Anthony Rizzo. The Boston Red Sox sent Eric Patterson (December 16, 2010) to the San Diego Padres to complete the trade.

August 25, 2012: Traded by the Boston Red Sox with Josh Beckett, Carl Crawford, Nick Punto and cash to the Los Angeles Dodgers for players to be named later, Ivan De Jesus, James Loney and Allen Webster. The Los Angeles Dodgers sent Rubby De La Rosa (October 4, 2012) and Jerry Sands (October 4, 2012) to the Boston Red Sox to complete the trade. 

Matt Dominguez

July 4, 2012: Traded by the Miami Marlins with Rob Rasmussen to the Houston Astros for Carlos Lee.

Gaby Sanchez

July 31, 2012: Traded by the Miami Marlins with Kyle Kaminska (minors) to the Pittsburgh Pirates for Gorkys Hernandez.

On Deck

What Might Have Been – The “Original” 1985 Expos

References and Resources

Baseball America – Executive Database

Baseball-Reference

James, Bill. The New Bill James Historical Baseball Abstract. New York, NY.: The Free Press, 2001. Print.

James, Bill, with Jim Henzler. Win Shares. Morton Grove, Ill.: STATS, 2002. Print.

Retrosheet – Transactions Database

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at “www.retrosheet.org”.

Seamheads – Baseball Gauge

Sean Lahman Baseball Archive


What Will Bryce Harper Really Be Worth in 2018?

It was recently reported that the Nationals would not meet the hefty demands of Bryce Harper. The report comes from Bob Nightengale of USA Today and puts the demand at $400 million for 10 years or more. That is beside the point, though. After the report, I was browsing around on Facebook when I saw someone point out that because of Harper’s defense, he isn’t even worth $300 million. This got me thinking: what is Bryce Harper really worth?

At first glance, I believe that Harper is worth at least $300 million. As a matter of fact, I won’t even make a final decision until the end of this article. I’m discovering his value with you. We’ll first look at his defense, since that is the claim against Harper. For continuity and consistency, I will use FanGraphs’ version of defensive, offensive, and base-running values.

When it comes to Harper’s defense, his values have been up and down for his career. Last year they were up. And down. But up, since I’m using FanGraphs stats, and thus UZR (Ultimate Zone Rating) will be used for my determination. The person from Facebook was likely using DRS (Defensive Runs Saved) because that was -3 while UZR was 8.7 for 2016.

Obviously defensive metrics like these are taken with a grain of salt because they have yet to be perfected. An 8.7 UZR is good. It isn’t top-tier, but it is definitely good. Plus, right field isn’t the most inconsequential position. To make an impact in right field, a good arm is usually needed, and Harper had that in 2016. 2015 was different, though, as both his arm rating and UZR were in the negative. Other than that season and his 2014 UZR, everything else has been positive. His career totals are 17.4 for UZR and 16.3 for his arm. Like I said before, neither is necessarily Gold Glove caliber, but he is definitely no scrub in the outfield. Even DRS, the metric I presume the Facebook man was using, has a total of 24 defensive runs saved for Harper in his career. By DRS, 2016 was his only year in the negative, and that was only -3.

Since I don’t want to only look at UZR and FanGraphs’ Arm ratings, I’ll also take a look at his Inside Edge fielding. All that does is show how often Harper executes on plays considered routine, likely, even, unlikely, remote, and impossible in descending order of probability. Except for routine plays, the rest have relatively small sample sizes on a season-wide basis for Harper. Each category has at least 30 samples for his career though, so the minimum number of samples to accurately represent the population is met. For routine plays, Harper performed as one would expect. He converted 99.6% of the plays in 2016 and 99.1% for his career, easily within the range of 90%-100% for the category. The next category, likely, has a range of 60%-90%. Harper was smack dab in the middle at 75% in 2016, but there were only 16 instances. Of the 70 in his career, he made 78.6% of the plays, well above the minimum expected of 60%. He performed even better in the even and unlikely categories. Remote plays were his only downside as he hasn’t made any of those plays in his career, but given the 39 instances it is hardly representative of his defensive play as a whole. He isn’t known as a burner and has been told by his coaches to tamp down the aggressiveness.

As a whole, his defense isn’t in question. Is it elite? No. He isn’t Jason Heyward or Mookie Betts in right field, but he was still fourth in the MLB in UZR for right fielders, so I don’t think his fielding is holding back his earning potential. If anything it may even be boosting it. Who wouldn’t want one of the premier hitting threats who can play a solid right field?

Because I want to save the more debatable part of Harper for last, we’re going to look at his base-running ability now. FanGraphs has the BsR (Base-Running Runs above average) stat, which sums up a player’s runs above average in terms of stolen bases, caught stealing, extra bases taken on hits, and double plays hit into. That gets boiled down to how many runs a player adds on the base paths. Harper’s BsR in 2016 stood at 2.4, or 2.4 runs added above the average player. He has 11.2 for his career.

To break it down, we will look at Harper’s wSB (weighted stolen bases), UBR (Ultimate Base Running), and wGDP (weighted ground into double plays). The wSB stat basically calculates how much a player helps by successfully stealing a base or hurts by being caught stealing. The Book gives the success rates necessary for a base-stealer to add positive value in different situations. wSB simply adds together all the successes and failures and their weighted values (after all, a caught stealing is more costly than a stolen base is rewarding). In Harper’s case, he stole 21 bases in 2016, his highest total. He was also caught stealing 10 times. In all, he cost his team 0.3 runs trying to steal bases last year. It is an inconsequential amount, and for his career his wSB sits at -1.0. That is still too small to matter, but he is probably better off staying put unless he is sure he can make it to the next base. UBR and wGDP are higher on Harper. They are 5.5 and 6.7 for his career, respectively. Overall, Harper is a good base-runner. Still not elite, but he isn’t costing his team when running.
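To make the bookkeeping concrete, here is a small Python sketch of how the three components add up to career BsR, along with Harper’s 2016 steal success rate. The names are illustrative; the inputs are the career component values and 2016 steal totals quoted above. It does not attempt the actual wSB run weights, which need league-level inputs not given here.

```python
# Harper's career base-running components, in runs above average.
CAREER = {"wSB": -1.0, "UBR": 5.5, "wGDP": 6.7}


def bsr(components):
    """BsR as the sum of wSB, UBR and wGDP, rounded to one decimal."""
    return round(sum(components.values()), 1)


def steal_success_rate(stolen, caught):
    """Share of steal attempts that succeeded."""
    return stolen / (stolen + caught)


if __name__ == "__main__":
    print("career BsR:", bsr(CAREER))
    print("2016 steal success rate:", round(steal_success_rate(21, 10), 3))
```

The three components sum to the 11.2 career BsR cited earlier, and 21 steals in 31 attempts is a success rate of a bit under 68%.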

So far, Harper has graded well in both fielding and base running. In neither aspect of the game is Harper an elite player (though he’s arguably pretty close in the field). For Harper, and pretty much every player that makes big money in the MLB that isn’t a pitcher, the hitting is what will make or break him. The last two years have shown both sides of the spectrum of what Harper may turn out to be. In 2015, he was one of the two best players in baseball. Okay, he was the best. He flat-out outperformed Mike Trout (the true 2015 AL MVP, but that’s a debate for another time). Harper dominated in every form at the plate two years ago. If it weren’t for his negative defensive grade for the year, he would have broken the 10 fWAR barrier that only Trout has broken since 2004. He hit 42 home runs with 118 runs scored and 99 RBI. If you don’t like those raw stats, he went and hit a batting line of .330/.460/.649. If you prefer metric stats, he went out and led in every iteration of runs created as well as wOBA. That stat line alone is worth $400 million.

But, we aren’t looking at one year of production. His 9.5 fWAR of 2015 is an anomaly so far. His second-highest is 4.6 in his rookie year. Last year it was 3.5. A 3.5-win player is not worth $400 million. A 4.6-win player is not worth $400 million. A 9.5-win player is. So, what is Harper really worth? Some (most) point to a reported injury that Harper had this past year that he played through anyway. This injury would have held him back. How much, though, we don’t know. We also don’t know if he will rebound to the 2015 version of him. Was that year a breakout year put on pause or was it in fact an anomaly?

To answer those questions, we need to dig a bit deeper than just his metric stats. In terms of exit velocity, Harper took a large step back, from 91.4 mph in 2015 to 89.5 mph in 2016. In terms of home runs, Harper hit 19 off of fastballs in 2015 while regressing to eight last year. If it is a matter of catching up to fastballs, an injury definitely makes sense. 23-year-olds don’t suddenly lose their bat speed. That begins to happen at 33. When it comes to Harper’s batted balls, he increased the number of fly balls he hit and decreased in line drives. That usually translates to more home runs, but a drop in exit velocity answers that. Harper did hit more infield flies than in 2015. It was only a 3.1% change, but it does suggest he was just missing a bit more than the year prior.

Looking at the differences between the two years and what changed, I’m going to believe that he was injured. When reading online, most analysts believe that, and Harper even said he was injured. Only the Nationals said he wasn’t. With an injury, I have to believe that Harper was hampered by that rather than just a complete regression in skill. Harper has his hitting, and with the offseason to rest and heal he should come back and mash again.

One more tidbit about Harper’s hitting before we’re done here, though. His batting average on balls in play (BABIP) sat at a measly .264. That is well below the average of .300. One could look at Harper’s diminished exit velocity and how often he hit the ball soft, medium, and hard. Well, his average exit velocity is right around league average. He also was under league average for soft hits and above in hard hits. So that should translate to a bit above a .300 BABIP. Because of this, I’m going to factor in that Harper was pretty unlucky last year and his stats would look better if more balls had fallen in for hits like they should have.

Unfortunately, we aren’t quite done in determining Harper’s value. Since I’m going to believe that Harper was injured last year, that just adds to a pretty lengthy injury history. Lengthy injury histories aren’t something that teams like, but most of his have come from his aggressiveness on defense in his first few years. He eased off the pedal in 2015 and it translated to on-field success. If he continues to do that, I think he should be able to stay on the field.

Harper will also be 26 years old when he hits the open market in the 2018-2019 offseason. That is quite a bit younger than most free agents and it gives enough time for teams to lower their payrolls in time for a bidding war of great magnitude if they so choose (looking at you, Yankees). He will still have about six more years in his prime after he signs his potential mega-deal.

In prior years, teams have spent about $8 million per win above replacement. Obviously some players produce more than what they are being compensated for. No one is going to pay Mike Trout $80 million for one year. But, $40 million for a year isn’t out of the question, especially for someone of Trout’s caliber. This isn’t about Trout though, this is about Harper and what teams will pay him. He is said to be demanding $400 million for 10+ years. Is it conceivable that a team will pay him $40 million per year for 10 years if they expect similar success to 2015? Yes. He outperformed Trout and I think we can agree teams would hand him that amount of money in a pinch. It’s just a matter of whether or not it will happen.
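The dollars-per-win logic can be sketched in Python as follows. This is a deliberate simplification with names of my own choosing: it applies the flat rate of roughly $8 million per win and ignores aging, inflation and anything else a front office would actually model. The fWAR inputs are the three seasons discussed earlier.

```python
DOLLARS_PER_WAR_M = 8.0  # roughly $8 million per win above replacement

# Harper's fWAR in the three seasons singled out earlier.
HARPER_FWAR = {2012: 4.6, 2015: 9.5, 2016: 3.5}


def season_value_m(war, rate=DOLLARS_PER_WAR_M):
    """Open-market value of one season, in millions of dollars."""
    return round(war * rate, 1)


def breakeven_war(total_m, years, rate=DOLLARS_PER_WAR_M):
    """Average WAR per season needed for a contract to pay for itself."""
    return round(total_m / years / rate, 2)


if __name__ == "__main__":
    for year, war in HARPER_FWAR.items():
        print(year, season_value_m(war))
    print("10 years, $400M:", breakeven_war(400, 10))
    print("13 years, $400M:", breakeven_war(400, 13))
```

Under that flat rate, a 10-year, $400 million deal pays for itself at an average of 5 wins per season, which sits between Harper’s 2012 and 2015 levels.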

Because I think Harper had an injury that didn’t allow him to play to his standard last year and he was unlucky with his hits, I do believe he can again reach his 2015 production. And because I believe he can get there again, I then have to believe a team will pay him at least $40 million for at least 10 years. I wouldn’t be surprised to see a contract similar to Giancarlo Stanton’s in terms of length: 13 years. Over 13 years, a $400 million deal would only require an annual average of about $31 million, which is much, much easier to come by. So yes, when Bryce Harper reaches free agency where teams can bid as much as they can, some team will pay him that much. Of course, Harper could underperform again this coming season, and then it would be hard for him to command that kind of contract. I don’t think that will happen. Based on what he showed in 2015 and why he didn’t do as well last year, he is more than likely to ramp up production in 2017.


Finding the Real Eric Thames

On Tuesday (11/29), the Brewers signed former failed prospect Eric Thames to a three-year, $16-million contract. In doing so, they also DFA’d the co-leader for home runs in the National League, Chris Carter. Now, there has been some speculation that the Brewers made this move to save money, but regardless of what you think the motives behind the move may be, it certainly is an interesting one that deserves a closer look.

Thames came up with the Blue Jays after being drafted in the 7th round of the 2008 draft. He showed good power in the minors, belting 27 homers at AA to the tune of a .238 ISO in 2010. He continued this surge into 2011 and did a decent job with the Jays at the major-league level, but struggled to hit lefties. Then, in 2012, it fell apart. His ISO dropped nearly 30 points from the year before, and his strikeout rate increased to an even 30% from 22%. After bouncing around in the minors in 2013, he then went overseas to the Korea Baseball Organization (KBO) and signed with the NC Dinos, where he almost immediately ascended to god status, hitting 124 home runs in 388 games with a .371 ISO in three years. Not only that, but he won a Gold Glove in Korea and stole 40 bases in 2015.

Now, of course, it’s never that easy. You don’t get a 40/40 guy with decent defense in the MLB for $5 million a year. The KBO is notorious for being a hitter’s paradise, as the skill level isn’t nearly that of the MLB. Think of the KBO as essentially being AA, where any major-league-caliber player will thrive, just like Thames did. But does that mean Thames has actually improved? If you look at some former KBO stars like Jung-Ho Kang and Hyun-Soo Kim, you can see that both have had success in the majors, even though they haven’t come close to matching their numbers in Korea. Thames’ Davenport translations (per Eno Sarris) suggest he’ll be a beast, slashing .333/.389/.628. Looking at those numbers, you could easily argue that Thames would be a bargain for the Brewers, essentially matching Carter’s output while even adding more value on the base paths and in the field.

That being said, Thames is a rare case. We have his stats from when he flopped in the big leagues, and we also have his stats from when he tore up the KBO. Barring some sort of complete technical and mental overhaul, one could also easily argue that Thames’ weaknesses the first time around will be his downfall the second time around. Let’s take a look at some stats from the KBO and compare them to his time in the MLB.

As stated before, one of the issues Thames had was that when he made contact, the balls didn’t go anywhere worthwhile (like the stands). He slugged .431 with a .182 ISO from 2011-2012, which does not look good if you’re a major-league first baseman. In the KBO, he put that issue to rest, where he slugged .718 with a .371 ISO, which is essentially unheard of in the MLB. Let’s check that problem with power off the list. However, there still stands the issue of his strikeouts and walks. He struck out 26% of the time during his time in the bigs while walking only 6% of the time, which is a recipe for disaster. In Korea, he struck out 18% of the time and walked a whopping 14% of the time. Other KBO imports have shown that both strikeout and walk rates regress when moving from Korea to the majors. So, Thames solved that second problem, although based on available data, we can assume he’ll regress in both categories. Thames improved in both areas that he needed to, but was this only because he was facing lesser pitching in a hitter’s paradise, or did he make technical changes to his swing in addition to improving his plate discipline?

Below are two screen shots: the top is Thames getting ready to take Ryan Dempster yard in 2013, the bottom is Thames hitting one of his 47 home runs in 2015.

mlbthames

kbothames

Look at the hands. In the top picture, Thames keeps his hands roughly around his ears right before his swing, while in Korea, he appears to load his swing lower, near his shoulders. This allows Thames to stay in the zone with his bat longer and have a bit of an upswing, which leads to higher exit velocity and an improved launch angle. Both of these qualities translate into more power and more strikeouts. Ted Williams pioneered this idea, saying that a slight upswing leads to extended contact on the ball, while a level swing leads to a smaller impact zone.

ted

This is a change many players have made, such as Josh Donaldson, Jake Lamb, and Ryon Healy. Eno Sarris wrote an excellent article on the changes Ryon Healy made to his swing. It looks like this is something Thames is trying to emulate and will hopefully carry over to the MLB.

It looks like Thames has made the adjustments he needed to become a successful player. Trying to project what kind of player he’ll be is a bit difficult. Personally, I look at the Davenport projections and I’m a little hesitant to say Thames will hit .333 and slug .628, seeing as how his strikeout rate will almost certainly regress to levels close to his former major-league self. I don’t see his walk rate regressing down to that level, mainly because plate discipline is a skill that accrues over time, and pitchers will have to be more careful with Thames and his new approach at the plate.

Let’s look at his slash line from his time in the MLB: in 633 at-bats, Thames hit .250/.296/.431 with 21 homers, a walk rate of 6%, and a strikeout rate of 26%. Assuming regression from Korea, let’s keep the strikeouts at 25%, up from 18% in Korea, and let’s raise the walk rate to 10% to account for added patience and power. With the technical changes in his swing, we can also assume his batted balls will go farther and get hit harder, so let’s bump the slugging up to .500, which translates into something like 30-35 HR. This puts his ISO right at .250 (a .500 slugging percentage minus a .250 average), a step up from what we saw earlier in his career. We’re now looking at a slash line of roughly .250/.350/.500 with an above-average glove at first and 10 steals (the Brewers love to let their players run). That’s good. In fact, that’s better than Chris Carter, and the Brewers are getting this at half the price of what Chris Carter would cost. I think there are plenty of reasons to be excited about Eric Thames in 2017.


What Reducing the DL from 15 to 10 Days Could Mean

Wednesday night in the 11th hour, MLB owners and players agreed to a new collective bargaining agreement that will cover five seasons through 2021.  While many of the items eventually agreed upon were tweaks and not major overhauls, one of the items that was of interest to me was the reduction of the disabled list from 15 days to 10 days.  On the surface, this could look like a win-win for both the players and the owners.  After all, players get to come off the DL and back on to the playing field five days sooner than they would have in past seasons, and owners can save coaches and fans from having to watch replacement-level players play while a most likely better player is on the shelf.

Using DL data compiled by baseballheatmaps.com, I took a look at length of stay on the DL by all players who landed on the list from 2010-2016. Since 2010, 319 players have spent exactly 15 days on the DL. In total, this is 4785 days spent on the DL in seven seasons. Now, for fun, let’s assume those same 319 players were ready to go after the new minimum of 10 days on the DL. Simple math here will tell you those players would have spent 3190 days on the DL. So in theory, over the course of seven seasons, reducing the DL to 10 days could save players 1595 days on the DL and owners the same number of days using most likely replacement-level players. On a per-team average basis, reducing the DL by five days could save a team about 7.6 days of DL time per season.
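The arithmetic above can be checked in a few lines of R. This is only a sketch: the 319-player count and the 15- and 10-day minimums come from the text, while the 30-team and seven-season divisors are my assumption about how the 7.6 figure was reached.

```r
# Inputs taken from the text
players_at_min <- 319   # players who spent exactly 15 days on the DL, 2010-2016
old_min <- 15           # old DL minimum, in days
new_min <- 10           # new DL minimum, in days

# Assumptions (not stated in the text): 30 MLB teams, seven seasons
teams <- 30
seasons <- 7

days_old <- players_at_min * old_min      # days actually spent on the DL
days_new <- players_at_min * new_min      # days under the new minimum
days_saved <- days_old - days_new
saved_per_team_season <- days_saved / teams / seasons

print(c(days_old, days_new, days_saved))  # 4785 3190 1595
print(round(saved_per_team_season, 1))    # 7.6
```

Dividing by teams and seasons is what turns the seven-year total into a per-team, per-season number.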

Seems like a win-win, right? Again, players come back sooner, GMs don’t have to call up as many players from the minors and burn options, and owners save money by not having as many extra players come up from the minors accumulating MLB service time. Not so fast. In the same seven-season stretch, 3324 players spent 15 days or more on the DL and only 319 came off after 15 days. So fewer than 10% of all players on the DL spent the minimum amount of time out of action. Why would this be? Well, the obvious answer is if a player is hurt, they are hurt. No one knows a player’s body better than the players themselves and they will return to action when they feel they are ready.

But the other answer is it pays to be on the DL in the majors. There is protection. Players still earn their salary and collect service time, so why rush back from an injury? In the minors it is a different story. If you get hurt it becomes the next man up for a promotion to the big leagues. There’s a reason there is a saying in the minors: “you can’t make a club in the tub.” Now, just because there is protection doesn’t mean players want to spend time on the DL. If they could, they would spend no time on the DL, as time away from the playing field can hurt future earning potential. Injuries are an inevitable part of the game but most seem to prevent players from feeling they are healthy enough to come back sooner than 15 days to compete at their best. By reducing the DL to 10 days, I can see increased pressure from fans and media to come back quicker. What we have to remember is this is the new minimum. Players will return when they and the medical staff feel they’re ready. I wouldn’t get your hopes up to see players return from the DL sooner than they have in the past.


BatCast the Bat Flip Tracker: Oh, How the Wood Was Chucked

“Make baseball fun again” is Bryce Harper’s outcry against baseball fundamentalists who continue to police emotions and enforce baseball’s expressionless professionalism. “Shut up and play the game right” might be something you’d hear uttered from the fundamentalist’s side (ideally through tobacco-glazed teeth), and maybe by Brian McCann. The discourse is of course more involved than that, covering everything from retaliatory plunk balls to bat flips, and anytime something marginally inflammatory happens, it’s beaten so hard that we’re reminded how boring our lives are that we have to discuss the same things over and over and over. I know you can picture the media package that accompanies the discourse: a young, brash, exquisitely coiffed, generational talent, who was hit in the ribs in his first ever plate appearance (then proceeded to steal home), is unabashedly passionate about a “fun” revolution in baseball. His eye black is adorned like war paint; he has emojis on the bottom of his bats; his helmet never stays on his head when he runs the bases; and yes, he “pimps” his home runs. Cut to Joey Bats’ ALDS bat flip and the ensuing brawl and then connect it with Rougned Odor’s haymaker; cut to Brian McCann standing at home plate waiting for Jose Fernandez after his first career home run; then enter the commentator: “Is this wrong?”

While baseball’s moral code on gaining an edge is unpredictable, there’s always been the idea that individuals conform to the game, not the other way around. Harper’s sermon won’t shatter the code of conduct, but it might move the needle, if it hasn’t already. For example, I can’t think of a standout incident this season because of a bat flip. That’s good! Because bat flips are really fun! There’s really no need to overthink it. There were plenty of memorable bat flips this year, and in an effort to make some fun out of baseball when there is no baseball being played, I’m breaking out my bat flip tracking equipment (a ruler, a stopwatch, and a parabolic trajectory calculator) that I introduced last year, and booting up BatCast for a look back at the year’s most memorable wood-chucking moments.

A brief recap: arriving at these numbers is a sloppy and wildly imprecise affair.  I pull videos, gifs, and stills of a bat flip and start by measuring the height of the player as he appears on my screen.  I convert that measurement into the player’s real-life size and reference this ratio, as well as measurements on the baseball field, and rough estimates, to arrive at some of the data I present to you in meters and feet: initial height, apex, and distance.  Using a stopwatch or the time stamp on YouTube, I can declare a fairly accurate hang time of the bat.  Angles are roughly noted using the batter and the ground to form a 90-degree angle and are adjusted in the parabolic trajectory calculator.

Let’s kick this off:

Exhibit A – The one that’s probably at the forefront of your mind:

Asdrubal Cabrera

Date Inning Leverage Index ΔWE% Implication
09/22/16 11th 4.42 82.5% 0.5 gm ld in WC

Statcast

Exit Velocity Launch Angle Distance
102 mph 28.50 393 ft

Le Flip

asdrubalbatflip092216

How about in slow motion?

092216_asdrubal_walkoff_slomo_med_m9up6w4p 

Ejaculatory!

How many of his teammates do you think saw that flip? They may have seen the tail end of it, but I’m willing to bet zero saw the flip in its entirety because everyone in the dugout was gazing at the ball in flight. But this was a no-doubter. Edubray Ramos was resigned to the outcome likely before the ball had reached its apex. The Phillies weren’t playing for anything at this point, but the Mets? Before this pitch, the Mets were tied with the Giants and Cardinals for the top wild-card spot. Before this pitch, in the 9th inning, Jose Reyes erased a two-run deficit with a home run of his own, only to see the Mets fall behind again when Jeurys Familia and Jim Henderson allowed two runs to score in the top of the 11th. After this pitch, the game was over and the Mets had a half-game lead on every other team in the National League for the first wild-card spot. That bat flip is a team effort. There’s some “I did it” in there, but the way he looks towards the dugout and offers his bat up towards his teammates makes this feel like “We did it!”

The numbers:

Cabrera is listed as 6′ tall. On the freeze frame I measured, he’s 1.9″ tall. So our key tells us that 1″ on the screen is 37.9″ in real life. When he releases the bat, he does so from about shoulder height, which we’ll call 5′ (1.52 m) in real life. The acme is, it appears, not a great deal lower than the top of Asdrubal’s head, so we’ll tally that down at 5′-7″ (1.71 m). To me, the launch angle looks to be right around 30 degrees, and we’ll refine this number once we get the measurements into the parabolic trajectory calculator. The duration of flight I’m using is the average I came up with by timing the video 10 times: 0.79 seconds.

Parabolic Trajectory Calculator:

ptraj

BatCast

Exit Velocity Launch Angle Acme Distance
8.7 mph 30 Deg 5’-7” 8’-9”
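
For anyone who wants to retrace the Cabrera numbers without the calculator, here is a rough R sketch. It assumes the bat behaves like a drag-free point mass that leaves the hand at 5′ and lands at ground level, and it solves for launch speed from the hang time; the function name `bat_flip` and that solving approach are my assumptions, not necessarily how the calculator works internally.

```r
g <- 9.81            # gravitational acceleration, m/s^2
in_to_m <- 0.0254    # metres per inch

# Screen-to-life scale: a 6' (72") player who measures 1.9" on screen
scale_real_per_screen <- 72 / 1.9    # about 37.9 real inches per screen inch

# Hypothetical helper: launch speed, apex and distance from release height,
# launch angle and hang time, assuming the bat lands at ground level
bat_flip <- function(h0_m, angle_deg, hang_time_s) {
  theta <- angle_deg * pi / 180
  # Solve 0 = h0 + v*sin(theta)*t - g*t^2/2 for the launch speed v
  v <- (0.5 * g * hang_time_s^2 - h0_m) / (hang_time_s * sin(theta))
  apex_m <- h0_m + (v * sin(theta))^2 / (2 * g)
  dist_m <- v * cos(theta) * hang_time_s
  list(speed_mph = v / 0.44704,
       apex_ft   = apex_m / 0.3048,
       dist_ft   = dist_m / 0.3048)
}

cabrera <- bat_flip(h0_m = 60 * in_to_m, angle_deg = 30, hang_time_s = 0.79)
print(round(unlist(cabrera), 2))
```

Under those assumptions the sketch lands on roughly 8.7 mph, an apex a little over 5′-7″, and a distance of about 8′-9″, in line with the BatCast table above.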

Exhibit B – A Man Possessed:

Matt Adams

Date Inning Leverage Index ΔWE% Implication
07/22/16 16th 1.71 42.7% 2nd straight walk-off for Cardinals

Statcast

Exit Velocity Launch Angle Distance
105.8 mph 28.34 444 ft

636048353090779282-gty-579171664-83514488_1469294083291_4281277_ver1-0

If this picture were part of an emotional intelligence quiz, I’m sure the answers given as to what facial expression is being displayed would vary greatly. To accurately assess the information in this picture it may behoove one to understand that, in baseball, home teams wear white and that the man in the background is most likely a fan of the home team and that his hands are held high in jubilation. If you’re only looking at the horrifying ogre in the foreground who appears to be screaming at 67 kHz+, the pitch only a dog can hear, you’d be hard-pressed to say that is a happy man. In fact, he may not be happy yet; he’s likely evoking a form of relief, having just exorcised the demons one faces when up to bat in the 16th inning of a tie baseball game; he looks like pure adrenaline. Most of us don’t get to experience a moment like this in our lifetime so we don’t have a really strong reference point for what he’s feeling, but luckily you know what this article is about and there’s a gif:

giphy

PUMP! PUMP! PUMP IT UP!

That’s all lizard brain right there.  It’s a little undignified, but that’s the beauty of it.  Matt Adams is a dense, hulking man, and that makes it a little scarier and a little sillier.  Look:

matt-adams-b809f422f7cc9370

Sassy.

The numbers:

This one is especially hard to measure because of Adams’ primitive (yet graceful) movements.  I extracted these numbers using the still image and the video:

screen-shot-2016-11-28-at-10-04-38-pm

BatCast

Exit Velocity Launch Angle Acme Distance
20.6 mph 10 Deg 4’-11” 22’-1”

Exhibit C – Into the Batosphere

Yoenis Cespedes

Date Inning Leverage Index ΔWE% Implication
08/29/16 10th 1.23 47.0% The first baseball bat in outer space (for America – Korea has several).

Statcast

Exit Velocity Launch Angle Distance
101.9 mph 28.33 416 ft

Yoenis Cespedes made it into my BatCast segment last year with his nifty flip in the NLDS.  This flip follows a similar trajectory but he varies his look this time with a cross-bodied toss.  It’s rude:

082916_cespedes_bat_toss_med_k3thrcyn (1).gif

“Hold my drink, bitch.”

While the lesson here is obvious, the mistake is not as easily avoided: get the fastball UP and in on Cespedes.

plot_h_profile

Because of the evidence we have, the numbers for this bat flip will be even rougher than the others (by the way, I hope you’re not a mathematician, and I apologize if you are). The data we can gather is the launch angle and the time stamp at which the bat reaches its highest point. Here’s a better view of the angle:

USP MLB: MIAMI MARLINS AT NEW YORK METS S BBN USA NY

Can we agree on shoulder height for the initial launch height to make things easier? Let’s call it 5′ since Cespedes is 5′-10″. We’ll say the bat was launched at a 70-degree angle, and in the gif the bat appears to reach its apex at just before 0.4 seconds.

BatCast

Exit Velocity Launch Angle Acme Distance
9.2 mph 70 deg 12’-6” 4’-11”

Exhibit D – The “I probably didn’t even need this bat to hit this home run” flip

Bryce Harper

Date Inning Leverage Index ΔWE% Implication
09/10/16 8th 3.63 30.5% Bryce’s helmet probably won’t fall off when he’s running the bases.

Statcast

Exit Velocity Launch Angle Distance
99.7 mph 26.39 377 ft

After my long-winded intro it’s fitting to get to feature Bryce Harper in this piece. He probably didn’t have as much fun this year as he did in 2015, but he appears to have gotten some enjoyment out of this shot.

wp-1480462655679.gif

Correct me if I’m wrong, but I believe that is what the kids call “Swagadoscious.”  I’ll just get right to the point this time.

bharpflipp

BatCast

Exit Velocity Launch Angle Acme Distance
6.3 mph 50 deg 6’-8” 5’-1”

Those are the ones that stuck out to me as the best flips of the year and I hope you were able to move past the rough estimates and get some enjoyment out of that as well.  I should note that Joc Pederson’s bat flip in the NLDS is omitted because I cannot find substantial evidence of an acme or distance.  And while a lefty going across his body like he did is pretty exotic, the uncertainty he exudes, combined with his panicked sashay, makes this effort pretty uncool.

pedersonbattoss_echl1ngh_il9khrdi

(Scherzer looks superimposed here)

So what can we pretend to glean from this?  Based on WPA, it’s probably not surprising that Harper had the most disproportionate bat flip.  Looking at the Statcast data, Harper’s home run was also the “weakest” out of the group.  So I guess even if Bryce Harper says what he says just so he can get away with being a little douchey, he’s holding up his part of the deal.  Of course, bat flips aren’t what make baseball fun.  Baseball is fun because we can see so much of our own lives in the game — it’s the humanity.  It provokes endless curiosity and it will reward you if you know where to look.  It’s the only game that can end, not because of time, but with one swing, and flip, of the bat.

Don’t be afraid to clue me in to bat flips in the future — my Twitter handle is in my bio (below).


Hitting Stat Correlation Remix

We all love baseball. And, since we’re on FanGraphs, odds are we also love baseball stats. A stat is always intended to measure one thing: how many home runs did Miguel Cabrera hit, how many bases did Rickey Henderson steal, how often does Joey Votto get on base, and so on.

The savvy fan knows that no one stat tells the whole story. Even WAR, which is our best estimator for how much value a player brings, requires us to dig further into how the player got there. Was it defense? Offense? If it was offense, how’d he get there — lots of walks, lots of home runs, or did he have a high BABIP? And which of these are most likely to repeat?

That’s where correlation comes into play. If there’s a high correlation season to season, then odds are what we’re measuring* is repeatable, and we can expect more of the same going forward. Otherwise, we should expect a regression (positive or negative) toward the league mean the next season.

Or maybe a player has a high line-drive rate. How’d they get there? Do they swing a lot, do they tend to be power hitters, etc. There are a lot of relationships within a season as well.

*Even at a seasonal level, we’re not necessarily measuring true talent so much as performance. Estimating true talent involves a lot of regression, and that’s a fun and important study, but not what I chose to focus on for this tool.
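
To make the season-to-season idea concrete, here is a minimal base-R sketch of a year-to-year correlation with a plate-appearance floor. The player-season table, its column names, and the helper `year_to_year_cor` are all made up for illustration; this is not how the Tableau tool is built.

```r
# Hypothetical player-season table; the values are illustrative only
seasons <- data.frame(
  player = rep(c("A", "B", "C", "D", "E"), each = 2),
  year   = rep(c(2015, 2016), times = 5),
  pa     = c(600, 580, 450, 120, 510, 530, 320, 400, 640, 610),
  bb_pct = c(0.12, 0.11, 0.06, 0.09, 0.09, 0.10, 0.05, 0.06, 0.15, 0.14)
)

# Pair each qualifying season with the same player's next season, then correlate
year_to_year_cor <- function(df, stat, min_pa = 300) {
  df <- df[df$pa >= min_pa, ]       # the PA floor applies to both seasons
  nxt <- df
  nxt$year <- nxt$year - 1          # shift so year N+1 lines up with year N
  pairs <- merge(df, nxt, by = c("player", "year"),
                 suffixes = c("_now", "_next"))
  list(n = nrow(pairs),
       r = cor(pairs[[paste0(stat, "_now")]], pairs[[paste0(stat, "_next")]]))
}

res <- year_to_year_cor(seasons, "bb_pct", min_pa = 300)
print(res)
```

Player B drops out here because his second season falls under the 300 PA floor, which is the same filtering trade-off described below: lower the floor and you get more pairs but noisier ones.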

A few years back, Steve Staude published a hitting stat correlation tool that let anybody explore these correlations at their leisure. It was a fun way to explore the data, not to mention a neat piece of Excel engineering. I wanted to bring it up to date a bit, and in the process switch from Excel to Tableau. I can’t embed the view in this post, but you can view it by clicking through to the view on Tableau Public.

I decided to include every season in the FanGraphs database (I’m sure I owe a database admin somewhere an apology). By default, it filters on 300 PA for both metrics (either intra-season or next season), but you may drop the floor to 1 PA. It’s a terrible idea and you really shouldn’t do it, but I’m not here to tell you how to live your life.

I also added a yearly trend of the correlation. For most stats, this doesn’t add a lot, but there are some interesting stories. For instance, the yearly correlation between BABIP and AVG for players with 300 PA has been slowly dropping in the last 20 years. Reflective of more emphasis on walks, or perhaps defensive positioning?

The player with the highest swing rate and lowest strikeout rate? Randall Simon in 2002, with a 63.6% swing rate and a 5.9% K rate, which seems like some sort of joke. All that swinging meant a low 2.9% walk rate, so it’s not like he got away with anything.

The usual caveats about correlation not equaling causation apply. Just because you get a high r-squared doesn’t mean there’s a causal relationship; one always has to apply a common-sense analysis as well. That said, dive in and have some fun.


Looking into Differences in Exit Velocity

Statcast has revolutionized the way we look at batted-ball data. We have been spoiled with exit velocity, launch angle and so much more. After looking into this treasure trove of data, I began to wonder, how closely is a hitter’s overall production tied to their exit velocity? More specifically, I wanted to uncover whether production was tied to differences between Air EV and Ground EV. First, I calculated the difference between Air EV and Ground EV from Baseball Savant. Next, I filtered the list to only include those with at least 100 batted-ball events to not skew the sample. I also calculated AIR% by adding together LD% and FB% to see who is maximizing their contact and see who may need a change in approach.
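
As a rough sketch of that preparation step, the R below computes the EV difference and AIR% and applies the 100 batted-ball floor. The data frame, its column names, and every value in it are hypothetical stand-ins for a Baseball Savant export.

```r
# Hypothetical input; real data would come from a Baseball Savant download
batted <- data.frame(
  player    = c("P1", "P2", "P3", "P4"),
  air_ev    = c(95.0, 88.0, 97.0, 92.0),   # average EV on balls in the air, mph
  ground_ev = c(82.0, 89.5, 80.0, 86.0),   # average EV on ground balls, mph
  ld_pct    = c(20, 22, 18, 21),           # line-drive rate, %
  fb_pct    = c(40, 25, 45, 34),           # fly-ball rate, %
  bbe       = c(250, 300, 60, 150)         # batted-ball events
)

# Keep hitters with at least 100 batted-ball events
qualified <- batted[batted$bbe >= 100, ]

# Difference between Air EV and Ground EV, and AIR% = LD% + FB%
qualified$ev_diff <- qualified$air_ev - qualified$ground_ev
qualified$air_pct <- qualified$ld_pct + qualified$fb_pct

# Largest differences first
qualified <- qualified[order(-qualified$ev_diff), ]
print(qualified[, c("player", "ev_diff", "air_pct")])
```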

This first chart illustrates which players have the largest difference between Air and Ground EV:

Player Difference, Air EV and Ground EV (MPH) AIR%
Byung-ho Park 17.3 58.7%
Nick Castellanos 13.6 68.6%
Brett Eibner 13.0 57.6%
Ryan Schimpf 12.9 80.4%
Mike Napoli 12.9 63.6%
Oswaldo Arcia 12.9 58.2%
Adam Duvall 12.5 66.1%
Brian Dozier 12.3 63.6%
Sean Rodriguez 12.3 60.2%
Brandon Belt 12.1 73.8%

Byung-ho Park leads the way by nearly 4 MPH, with a difference of 17.3 MPH. With the exception of Eibner, Park and Arcia, this is a list of hitters whose primary BIP type is FBs. Each of these hitters has an AIR% over 60%, with Ryan Schimpf pacing the group at an incredible 80%. With such a stark difference in EV, each of these players should focus on hitting the ball in the air to maximize their overall production. For Park, Eibner and Arcia, putting the ball on the ground severely limits how often they can make harder contact. All things equal, hard contact is better than soft contact and these players should adjust their approach accordingly to maximize hard contact, which could help their overall production.

As we move on, the next chart displays players with the smallest differences in Air and Ground EV:

Player Difference, Air EV and Ground EV (MPH) AIR%
Billy Burns -2.4 46.8%
Melky Cabrera -2.2 56.9%
Max Kepler -1.5 52.8%
Matt Szczur -1.4 57.4%
Martin Prado -1.2 52.6%
Jose Peraza -1.0 56.5%
Lorenzo Cain -1.0 52.7%
Ryan Rua -0.9 47.9%
Miguel Rojas -0.7 46.0%
Tyler Holt -0.5 48.0%

The speedy Billy Burns tops this list, complemented by a group of players no one will mistake for sluggers. This group comes with considerably less ceiling and overall production. Of this group, only three guys managed to post a league-average or better wRC+ (Prado, Cabrera and Peraza). Lorenzo Cain has been better in the past but was hampered by injuries this past year. Cain and the three previously mentioned provide the blueprint for how this profile can work. By spraying the ball and making enough contact, these guys maximize their limited power but have a razor-thin line between their bats being productive and unplayable.

As an aside, there was only one player who had zero difference in his EVs. The culprit? Nick Markakis, which for some reason makes perfect sense. Anyways!

So now that I have shown the extremes we can begin to answer the original question: does EV difference even matter for overall production? To find out, I ran a couple different tests. First, I took the data and divided them evenly into quarters. The results look like this:

Group Average EV Difference Average wRC+ Best Hitter
Top 25% 9.4 102 Joey Votto
25-50% 6.3 100 Mike Trout
50-75% 4.5 95 Miguel Cabrera
Bottom 25% 1.8 94 Daniel Murphy

The top 50% of hitters with large differences in EVs hit average or slightly better. Meanwhile, hitters in the bottom 50% produced slightly below average. To give each group a face, I took the best hitter by wRC+ and here we have four elite hitters. So far we have a very minor indication that says players with larger EV differences hit better than those with smaller differences. What we do not have is a concrete reason to disqualify a hitter from being elite based on their EV differences.

Next, I took the data and plotted players’ EV Differences and wRC+ to see if there was any correlation. The graph is about as random as it gets, with an R-squared value of .022. In other words, EV difference explains only about 2% of the variation in wRC+, so any relationship between EV differences and overall offensive production is too weak to be meaningful.
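
Both checks, the quartile split and the R-squared, take only a few lines of R. The sketch below runs on simulated hitters, so the column names and every number it prints are illustrative assumptions, not the article's data.

```r
# Simulated stand-in data: 200 hitters with a weak EV-difference effect
set.seed(1)
n <- 200
hitters <- data.frame(ev_diff = rnorm(n, mean = 5.5, sd = 3))
hitters$wrc_plus <- 95 + 0.5 * hitters$ev_diff + rnorm(n, sd = 25)

# Split hitters into four equal groups by EV difference
breaks <- quantile(hitters$ev_diff, probs = seq(0, 1, 0.25))
hitters$group <- cut(hitters$ev_diff, breaks = breaks, include.lowest = TRUE,
                     labels = c("Bottom 25%", "50-75%", "25-50%", "Top 25%"))

# Average EV difference and wRC+ within each group
group_means <- aggregate(cbind(ev_diff, wrc_plus) ~ group,
                         data = hitters, FUN = mean)
print(group_means)

# R-squared from a simple linear fit of wRC+ on EV difference
fit <- lm(wrc_plus ~ ev_diff, data = hitters)
r_squared <- summary(fit)$r.squared
print(r_squared)
```

With a single predictor, R-squared is just the squared correlation, which is why a value near .02 means the scatter looks almost random.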

All things equal, you probably take the guy with the larger differences but that does not guarantee any kind of success. We now know that differences in how hard hitters hit balls in the air versus on the ground do not preclude them from being elite. Hitting is both art and science and what we have learned today only reinforces that hitters can have very different profiles and still have excellent results.


The Season’s Least Likely Home Run

Jeff recently ran two articles about the season’s worst and best home runs, as measured by exit velocity.  As a small addendum to that, I’d like to include both exit velocity and launch angle to try to determine the season’s least likely home run.  So how do we do such a thing?  Warning!  I’m going to spend a bunch of time talking about R code and machine learning.  If you want to skip all that, feel free to scroll down a bit.  If, on the other hand, you’d like a more in-depth look at running machine learning on Statcast data, hit me up in the comments and I’ll write some more fleshed-out pieces.

As usual, we’re going to rely heavily on Baseball Savant.  Thanks to their Statcast tool, we can download enough information to blindly feed into a machine-learning model to see how exit velocity and launch angle affect the probability of getting a home run.  For instance, if we wanted to make a simple decision tree, we could do something like this.

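# Read the data (one row per batted ball; HR is the outcome column)
my_csv <- 'hr_data.csv'
data_raw <- read.csv(my_csv)
# Assumption: HR is a class label, so make it a factor for classification
data_raw$HR <- as.factor(data_raw$HR)
# Make training and test sets (70/30 split; seed fixed for reproducibility)
library(caret)
set.seed(42)
inTrain <- createDataPartition(data_raw$HR, p=0.7, list=FALSE)
training <- data_raw[inTrain,]
testing <- data_raw[-inTrain,]
# rpart == decision tree
method <- 'rpart'
# Train the model
modelFit <- train(HR ~ ., method=method, data=training)
# Show the decision tree
library(rattle)
fancyRpartPlot(modelFit$finalModel)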

 

That looks like what we would expect.  To hit a home run, you want to hit the ball really hard (over 100 MPH) and at the right angle (between 20 and 40 degrees).  So far so good.

Now, decision trees are pretty and easy to interpret but they’re no good for what we want to do because (a) they’re not as accurate as other, more sophisticated methods and (b) they don’t give meaningful probability values.  Let’s instead use boosting and see how well we did on our test set.

method <- 'gbm' # boosting
modelFit <- train(HR ~ ., method=method, data=training)
# How did this work on the test set?
predicted <- predict(modelFit, newdata=testing)
# Accuracy, precision, recall, F1 score
# (caret treats the first factor level as the positive class unless positive= is given)
accuracy <- sum(predicted == testing$HR)/length(predicted)
precision <- posPredValue(predicted, testing$HR)
recall <- sensitivity(predicted, testing$HR)
F1 <- (2 * precision * recall)/(precision + recall)

print(accuracy) # 0.973
print(precision) # 0.792
print(recall) # 0.657
print(F1) # 0.718

The accuracy number looks nice, but the precision and recall show that this is far from an amazingly predictive algorithm. Still, it’s decent, and all we really want is a starting point for the conversation I started in the title, so let’s apply this prediction to all home runs hit in 2016.
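
To see why accuracy can look good while precision and recall lag, it helps to compute the three by hand. The sketch below uses a dozen made-up predictions (not output from the model above) and plain base R in place of caret's helpers.

```r
# Made-up labels for 12 batted balls; "HR" is the positive class
lv <- c("HR", "out")
actual    <- factor(c("HR", "HR", "HR", "HR", "HR", "out",
                      "out", "out", "out", "out", "out", "out"), levels = lv)
predicted <- factor(c("HR", "HR", "HR", "out", "out", "HR",
                      "out", "out", "out", "out", "out", "out"), levels = lv)

tp <- sum(predicted == "HR"  & actual == "HR")    # predicted HR, was HR
fp <- sum(predicted == "HR"  & actual == "out")   # predicted HR, was not
fn <- sum(predicted == "out" & actual == "HR")    # missed home runs

accuracy  <- mean(predicted == actual)
precision <- tp / (tp + fp)    # share of predicted HRs that were real
recall    <- tp / (tp + fn)    # share of real HRs that were predicted
f1 <- 2 * precision * recall / (precision + recall)

print(c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1))
```

Because most batted balls are not home runs, the easy "out" calls prop up accuracy, while recall only counts how many of the actual home runs the model found.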

Once you throw out some fairly clear blips in the Statcast data, the “winner”, with a 0.3% chance of turning into a home run, is this beauty from Darwin Barney.*  This baby had an exit velocity of 91 MPH and launch angle of 40.7 degrees.  For fun, let’s look at where similarly-struck balls in the Rogers Centre ended up this year.

* I’m no bat-flip expert, but I believe you can see more of a flip of “I’m disgusted” than “yay” in that clip.
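
For readers who want to try the "least likely home run" lookup themselves, here is a self-contained sketch. It swaps in a plain logistic regression (base R's `glm`) for the boosted model and runs on simulated batted balls, so the column names, the coefficients, and the result are assumptions for illustration only. With the caret model from earlier, class probabilities should be available through `predict(modelFit, newdata = ..., type = "prob")`.

```r
# Simulated batted balls: HR odds rise with exit velocity and peak near 28 degrees
set.seed(2016)
n <- 2000
balls <- data.frame(
  exit_velo    = runif(n, 80, 115),   # mph
  launch_angle = runif(n, 0, 50)      # degrees
)
true_logit <- -2 + 0.25 * (balls$exit_velo - 100) -
  0.02 * (balls$launch_angle - 28)^2
balls$hr <- rbinom(n, 1, plogis(true_logit))

# Stand-in model: logistic regression with a quadratic launch-angle term
fit <- glm(hr ~ exit_velo + launch_angle + I(launch_angle^2),
           family = binomial, data = balls)
balls$p_hr <- predict(fit, type = "response")

# Among balls that actually left the park, find the lowest predicted probability
home_runs <- balls[balls$hr == 1, ]
least_likely <- home_runs[which.min(home_runs$p_hr), ]
print(least_likely)
```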

Congrats Darwin Barney!  There are no-doubters, then there are maybes, and then there are wall-scrapers.  They all look the same in the box score, but you can’t fool Statcast.