Preller’s Impressive Rebuild

Back in 2014, the Padres had a really good farm system. It featured Austin Hedges, Matt Wisler, Trea Turner, and a few other good prospects, and Baseball America had them ranked sixth.

However, then came 2015, and A.J. Preller made an ill-advised attempt to go all-in. We don’t know whether the owners demanded that he do so, but we can say for sure that it didn’t work. The Padres finished with just 74 wins, three fewer than in 2014, and came nowhere close to a wild-card slot, and they sent away Yasmani Grandal, Max Fried, Mallex Smith, Trea Turner, and others.

Suddenly, BA then had their farm system ranked only 24th, and the talent in the majors wasn’t great either.

Those actions really were bad for the organization.

But then came 2016, and Preller made a complete 180-degree turn. Most notably, he traded Craig Kimbrel for Javier Guerra, Manuel Margot and Logan Allen. He also claimed Dan Straily and Brad Hand off waivers, and he made a lot of Rule 5 picks.

Later, he traded James Shields for eventual top prospect Fernando Tatis, and swung the infamous Drew Pomeranz for Anderson Espinoza trade, after which Preller was rightfully criticized for not being honest about the health of his player. Preller got punished and was despised by the league and fans, but that didn’t stop him in his quest. He drafted Jacob Nix in 2015, Cal Quantrill in 2016, and MacKenzie Gore in 2017. He also signed a lot of guys on the international market, most notably Cuban Adrian Morejon.

The bottom line is that in little more than two years he built up a farm system that contains some risk but has high upside and a lot of depth. MLB.com has seven Padres prospects in its top 100, and ranked the farm system third at midseason.

You can rightfully criticize Preller’s actions as a human being and professional, but there is little doubt about the results he got for his organization in the last two years.


In Remembrance of Roy “Doc” Halladay

Harry Leroy Halladay III was drafted by the Toronto Blue Jays with the 17th overall selection in the 1995 MLB First-Year Player Draft.  He was barely 18 years old at the time.  Throughout his time in the minor leagues, the pitcher, who now went simply by Roy Halladay, was a coveted prospect, reaching as high as #12 on the Baseball America Top 100 prior to the 1999 season.  Halladay surpassed rookie limits during the 1999 season, but the following year is generally more remembered as the anecdotal beginning of an eventual Hall-of-Fame-caliber career.  Among all pitching seasons with at least 50 IP, Halladay’s 10.64 ERA (48 ERA+) in 2000 was, and still is, the worst of all time.

The next season was much kinder to Halladay, as he posted a 145 ERA+; in 2002 he made his first AL All-Star team.  The first of two Cy Young awards “Doc” would receive came in 2003, when he pitched 266 innings and had a 3.23 FIP, along with an rWAR of 7.55.  The next two seasons were injury-plagued for Halladay, and he pitched a mere — by his standards — 274.2 innings in them combined, while running a 142 ERA+.  Fully healthy over the next four years, Halladay averaged 233 IP, never contributing fewer than 220 in a season.  In that stretch, only CC Sabathia produced a higher fWAR than Halladay, who was also first in IP, sixth in ERA, and eighth in FIP among all qualified pitchers in that span.  Halladay was performing at an elite level over a huge volume of work.  Doc Halladay was traded to the Philadelphia Phillies after the 2009 season.

This is where everything becomes more personal to me.  As a Phillies fan, I can clearly remember my middle-school self watching Halladay start many games for my favorite team.  The fondest of these memories is from the 2010 season, May 29 to be exact.  On that night Halladay took the mound opposing then Florida Marlins ace Josh Johnson.  Johnson was excellent that season, leading the NL in ERA and MLB in FIP.  My 10-year-old self knew the game would be something spectacular.  Indeed, the game was spectacular.  The Phillies won 1-0 behind a complete game with 11 Ks from Halladay.  He had pitched a perfect game.  Later that season came an even more famous performance from Halladay.  He tossed a no-hitter against the Cincinnati Reds in Game 1 of the NLDS, only the second postseason no-hitter in history.  Sure, the Phillies would later fall in the NLCS, but the magic of Halladay’s season was never forgotten.  He won his second Cy Young that year.

However great 2010 was, my clearest memory of Roy Halladay pitching comes from 2011.  United with Cliff Lee, Cole Hamels, and Roy Oswalt, Halladay led the Phillies to 102 wins that season.  Unfortunately, what I remember best is Game 5 of the NLDS.  With the series tied at 2 games apiece, the Phillies handed the ball to Halladay for the deciding Game 5.  The Cardinals countered with one of Doc’s best friends, Chris Carpenter.  In total, just one run was scored in that game.  Rafael Furcal led off the game with a triple, and scored when the next batter, Skip Schumaker, doubled.  No more runners would cross the plate.  All told, both pitchers had incredible games.  Halladay had a game score of 72, 44% better than a league-average start.  The bitter portions of the memory are linked not to Halladay, but to the futility of the Phillies offense.  Roy Halladay could transcend even the bitterest of memories.

Time and age eventually caught up to Doc, and he did not pitch well in 2012 or 2013, seasons that were riddled with DL stints.  He retired following the 2013 season, and the consensus in the industry was that he would be standing in Cooperstown giving a speech five years later.  Additionally, some predicted that he would return to the game in some manner, as a pitching coach or something of the like.  First, however, he would take a few years to himself to pursue other interests.  Unfortunately, one of those interests was piloting, and, as fate would have it, he will never give a Hall of Fame speech.  Halladay loved flying planes, often tweeting about it.  Hauntingly following the advice of a quote attributed to several people, what he loved killed him.

In a 16-year MLB career, Roy Halladay compiled 2749.1 IP, 2117 K, a 65.4 rWAR, a 3.38 ERA, and a 3.39 FIP.  But does that really matter?  What matters is how Doc touched the lives of people around him.  It is cliché to say someone was a better person than they were a player, but he really was, and that’s saying something with his résumé.  Whether it was taking care of his family, being a good friend, providing a strong role model, or going to the Philadelphia Zoo with a persistent fan, Halladay improved the lives of those around him.

Goodbye, Roy “Doc” Halladay.  You truly did make the game better for all of us.  We are all so lucky to have been witnesses to your career and life.  You will be sorely missed.


If the Marlins Trade Stanton, They Need to Trade Everyone Else

The Jeter Group hasn’t been lazy and has made a lot of moves already. Now the rumor is that the payroll should be cut back to $90M and Stanton (and Gordon and Prado) should be traded, but the other stars like Yelich and Ozuna should be kept.

Now, I do think trading Stanton is a good idea. He has been great but also injury-prone, and he has a huge salary and opt-outs to make it worse. However, Stanton still is about a 4-5 win player and those wins have to be replaced. The Marlins are already a top-heavy team, with only six hitters and one pitcher with a WAR of 2 or better, and thus losing one star would hurt a lot. To make it worse, they only have three players between 1 and 2 WAR; the rest are below 1 or negative. Also, little help can be expected from the farm, which most sources rank among the worst in MLB.

So realistically, where does trading Stanton, Gordon, and Prado lead you? Gordon had a good season, but it was heavily fueled by BABIP; he isn’t really a good hitter and his trade value is limited. Stanton has trade value obviously, but the contract and opt-outs make him less appealing. Prado has zero value. So realistically trading the three gives you two top-100 prospects and maybe 2-3 more decent ones (40-45s). That is a good return, but we are talking about a terrible farm here, one that according to Eric Longenhagen had only one top-100 guy pre-2017. So you lose about five wins from Stanton and maybe two from Gordon, and your team still is top-heavy and the farm is slightly better but still below average.

If the Marlins try to retool by trading Stanton and Gordon and keeping everyone else, they are honestly in the same situation as the White Sox were before last year: a stars-and-scrubs team. Now, Yelich and Ozuna have long contracts, so they don’t necessarily need to go immediately, but without a farm system and trade chips it will be hard to build around them.

If the Marlins are serious about competing anytime soon, they need to either keep Stanton and spend big (which IMO is stupid because stars-and-scrubs teams rarely work anymore), or sell everyone and try to build a top-5 farm system as fast as possible. The Marlins aren’t in a bad spot to do that, although unfortunately their value is mostly in hitters, and not in ace pitchers, for whom the market currently is better.

But still, if you trade Stanton, Gordon, Yelich, Ozuna, Realmuto, Straily, and one or two of the relievers, you should easily be able to get back like seven top-100s, plus 6-7 more 40+ prospects, and that would immediately make them a top-5 farm system. Now, that would be a huge sell-off, but if you take Stanton away from such a top-heavy team, IMO that is the best thing that you can do.

I really hope Jeter is not just a popular face fronting for an even greedier owner. If the Marlins trade the expensive guys and then try to retool around the cheap guys, that would be a very bad signal, because with that farm system, it would likely mean they stay stuck in between. So unless the new group wants to spend $180M+, they had better trade the high-surplus-value guys too.

Now, if there isn’t a good offer for Yelich and Ozuna, they can afford to wait a little like the White Sox did with Quintana, but ultimately the two need to be traded if the Marlins want to rebuild the team. Halfway rebuilds rarely work, at least if you don’t already have a good farm system and good depth in the back end of the roster.


A Few Candidates To Be The Nats’ Fifth Starter

The Nationals head into the offseason without a fifth starter. The plan was for Joe Ross to be a member of the rotation for 2018 and years to come, but he is likely to be out for the season after Tommy John surgery. The next-best in-house option is A.J. Cole, but he is generally ineffective. In addition, both of these pitchers’ struggles against left-handed hitters may indicate that they are better suited to relief. Erick Fedde is the best pitching prospect on the farm. Although he may be ready to produce come June or July, it is probably better for his development for him to start the season on the farm, and then get called up when Stephen Strasburg inevitably has to sit out for a month.

The Nats have a good rotation headlined by Max Scherzer and Strasburg, so there is no need to go after the best free agents like Jake Arrieta or Yu Darvish, or even mediocre options such as Lance Lynn or Alex Cobb. The Nats don’t need high-quality innings; they need a high quantity of innings.

Even though Jayson Werth’s hundred-million-dollar contract is coming off the books this offseason, much of that money will be devoted to arbitration raises to Bryce Harper, Anthony Rendon and Tanner Roark. The Nats will be searching for a cheap workhorse, and those don’t get exchanged in trades. Mike Rizzo will have to explore the free-agent market to find 160 innings, and I would like to highlight a few candidates.

THE RELIABLES

Jhoulys Chacin

Chacin is quietly coming off a great year for a back-of-the-rotation arm. He pitched 180 innings and managed to keep his ERA below 4. Behind that 3.89 ERA, though, was a 1.79 home ERA and a 6.53 away ERA. That may be off-putting to any non-Padre suitors, but there is no way that Chacin can be Clayton Kershaw at Petco Park and Anibal Sanchez everywhere else. Those two numbers are bound to converge somewhere around 4.0 in 2018. Plus, those splits may scare away a number of rivals for Chacin’s services, making his price palatable.

John Lackey

You may be initially repulsed at this name because his ERA increased by a whole run this season, but the reality is that he will probably be too expensive for the Nats. He pitched 170 innings this season, and the last time he pitched fewer than 160 innings was 15 years ago. (That is, aside from missing all of 2012.) The quality of those innings decreased drastically, but the baseline for that comparison was his career ERA+ of 110.

R.A. Dickey

The knuckleballing vet continues to produce. He made good on his one-year, $8-million deal with the Braves by pitching 190 innings of roughly league-average production. His FIP was right in line with his ERA. In addition, since velocity isn’t critical to his success, age hasn’t rendered him ineffective and shouldn’t. That being said, he could regress a bit and still produce 180 innings at around a 4.5 ERA next season. The caveat here is that the Braves have an option on him for next season at the same price as 2017, so he might not reach free agency.

All of these players are probably in line for contracts like the ones Bartolo Colon and Dickey received last offseason: one year, about $10 million. If Big Daddy Lerner wants to splash the cash, then the Nats can probably sign one of these players on a one-year deal. However, that is unlikely. If the Lerners don’t increase payroll, the Nats may be forced to truly scavenge the scrapheap.

WILD CARDS

Bartolo Colon

Speaking of veterans…Colon may have been DFA-worthy for the Braves last season, but I think he’s still got it. The manifestation of Jabba the Hutt on the pitcher’s mound rebounded with a 3.4 ERA in August with the Twins, and didn’t have the ominous drop in velocity that many veterans undergo. There is a worry that NL East hitters will have seen him enough to know how to destroy the 90 mph fastballs he could throw in a game, but given the drastic roster turnover in Philadelphia and New York over the past year, and the pending teardown in Miami, the only worry is Freddie Freeman, and he seems to destroy any pitcher wearing a curly W on his chest. The 44-year-old may be able to squeeze a little more magic out of his arm.

Ubaldo Jimenez

I know. I know. The Orioles’ rotation was trash, so why would the Nats want to sign someone from the rivals just up 95? Jimenez could easily rebound from his disastrous 2017. His ERA was 6.8, but his 4.5 xFIP paints a different picture. His ERA was bloated because of an almost impossibly high 2.08 HR/9 — well above the normal HR/9 of 1 that he sported for most of his time in Baltimore. That number is bound to fall much closer to his career average, and a move from Camden Yards to Nats Park would only help that. In addition, both his strikeout and walk rates were better last season than his career averages. Maybe a reunion with Matt Wieters would cause Ubaldo to return to what he was from 2014-2016 — an inconsistent, but capable, back-of-the-rotation arm. If his price tag is low enough, he could be a steal.

Chris Tillman

Let’s just pretend 2017 never happened. Tillman was the staff ace on a team that made the playoffs two out of the last three years. His 2017 season was ugly. Negative WAR ugly. However, that lackluster performance was likely due to the shoulder issues that forced him to miss half the season. Tillman proved from 2011 to 2016 that he is a more than capable pitcher. His FIP- was basically league average every year during that span, and he cracked 200 innings twice and 170 on two other occasions. The Nats should take a peek at Chris Tillman.

Clay Buchholz

Buchholz does not fit the mold I described in my introduction, but the Nats should be intrigued by his upside. He was a quality number two worth 3.2 WAR as recently as 2015, and provided 2.8 WAR to the Red Sox in 2013. Buchholz has been nothing but mediocre in his other seasons. He has the highest potential of any pitcher on this list, but he has proven NOT to be durable. He hit 189 innings in 2012, and 170 in the preceding and following even years; however, for the most part, Buchholz has thrown between 100 and 150 innings per season for his career. He basically missed this entire past season, but he is on track to be ready for spring training. If his right arm looks good, the Nats should give him a hard look.


The Deciding Play of the World Series That Nobody Is Talking About

Much like ESPN’s Tim Kurkjian, I am a baseball nerd.  I grew up clipping box scores out of The Sporting News and used them to compile season-long handwritten tables of statistical data (manually calculated) for my favorite team.  I collected baseball cards and put a few of them in the spokes of my bicycle.  I devoured the Bill James Baseball Abstracts.  I’ve had a lifelong love affair with the game of baseball and especially the statistics.  Whereas Mr. Kurkjian has a strange fascination with the sacrifice fly and even wrote a book about it, I am fascinated by baserunning and wrote a two-part blog series about it.

Part 1

Part 2

It is through this lens that I often view baseball games and especially baserunning decisions.  Our respective interests intersected in the incredible drama of World Series Game 5 between the Astros and Dodgers.  In the top of the eighth inning, LA trailed 11-9 with one out but with runners on second and third base.  According to FanGraphs’ play log, the win probability was 72.2% in favor of the Astros.  What happened next could very well have been the determining factor in the outcome of the entire World Series.  I couldn’t find a GIF of the play but it’s at the 3:17 mark of this re-broadcast if you want to view it online.

Justin Turner hit a line drive to right field, where Josh Reddick caught it cleanly for the second out of the inning.  With some forward momentum, he fired a throw to home plate in an attempt to gun down the speedy Chris Taylor tagging from third.  Taylor started sprinting down the line, then inexplicably stopped.  Reddick’s throw was well up the third-base line and revealed to the entire viewing world that Taylor probably would have been safe if he hadn’t stopped.  After a pitching change, the Fox broadcast showed a replay of third base coach Chris Woodward telling Taylor, “Gotta go!  Gotta go!  Gotta go!” followed by Taylor explaining to Woodward that he thought he was being given the stop sign.  The Astros’ win probability went up to 84.1% after that play, and up to 94.3% after Andre Ethier grounded out to end the inning.

Let’s examine that play a little closer.  The first question to ponder is whether or not it was the right decision to send the baserunner.  According to my prior analysis referenced above, the breakeven point for that situation is around 43%, meaning that if there’s a 43% chance or less of getting thrown out, the runner should attempt to score.  From the article:

“The break-even analysis indicates that coaches should send runners from 3rd almost every time on a fly ball with one out. Even if they’re thrown out a majority of the time, the net result will be positive.  Basically the risk of sending a dead duck to the plate is worth it compared to relying on the next batter to knock the run in.”

Chris Taylor is probably the fastest runner on the Dodgers.  But Josh Reddick is also known to have an exceptionally strong arm.  With Reddick coming forward and at medium depth, he probably wouldn’t need a perfect throw to gun down Taylor, but he would need a very good throw.  In real time, my thought was that Taylor should absolutely try to score based on my armchair opinion and knowledge of the odds of success.  If the play were repeated 100 times, would Reddick be able to throw out a running Taylor more than 43 times?  Given all the things that can go wrong, such as a throw off line (as this one was), the catcher not fielding it cleanly (which also happened in this case), or the catcher missing the tag, in my assessment Woodward made the right decision.  That opinion is certainly up for debate, but I think it was the appropriate choice given the circumstances.
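The same break-even logic can be sketched in win-probability terms. This is a minimal sketch, not the run-expectancy analysis referenced above: the function name `break_even_success` is made up for illustration, and the Dodgers’ win probability if Taylor scores (`WP_SAFE`) is a placeholder assumption. The play log only tells us what happened when he held (Astros 84.1%) and what an inning ending at 11-9 looked like (Astros 94.3%), which is the same state the game would have been in had Taylor been thrown out.

```python
def break_even_success(wp_hold, wp_safe, wp_out):
    """Minimum chance of scoring at which sending the runner beats holding.

    All inputs are the batting team's win probability (0 to 1):
    wp_hold if the runner stays, wp_safe if he scores,
    wp_out if he is thrown out at the plate.
    """
    if not wp_out < wp_hold < wp_safe:
        raise ValueError("expected wp_out < wp_hold < wp_safe")
    # Sending is worth it when p * wp_safe + (1 - p) * wp_out >= wp_hold.
    return (wp_hold - wp_out) / (wp_safe - wp_out)


# Astros win probabilities from the play log, converted to the Dodgers' side.
WP_HOLD = 1 - 0.841  # Taylor stays at third, two outs
WP_OUT = 1 - 0.943   # inning over, still 11-9
WP_SAFE = 0.25       # ASSUMPTION: placeholder value, not from the play log

needed = break_even_success(WP_HOLD, WP_SAFE, WP_OUT)
print(f"Taylor must be safe at least {needed:.0%} of the time to justify the send")
```

Swapping in the actual win probability for an 11-10 game with two outs would give the real threshold; the structure of the calculation stays the same.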

Given that the decision was optimal, the second question is, what could Woodward have done differently to avoid miscommunication with the baserunner?  In a prior life, I used to coach intercollegiate volleyball.  Communication is a critical part of the game to both prevent collisions and to clearly identify who is responsible for playing the ball.  The natural tendency for a volleyball player is to say either “I got it” or “you got it” to call for the ball.  But I coached our players to call “mine” or “yours” instead.  The reason is that “I got it” and “you got it” are too similar and can easily be confused, especially if someone only hears the “got it” part.  I often wonder if dropped pop-ups in baseball are the result of the “got it” phenomenon.  Regardless, the same concept applies to this baserunning situation.  “Go” and “no” are too similar, especially in the presence of 43,300 screaming fans during Game 5 of the World Series.  I would advise Woodward to restrict his lexicon to simple “stop” and “yes” commands or perhaps “run!” in the future to avoid any confusion.  It could make a world of difference.

By now, you know the rest of the story.  The Dodgers went on to lose that game 13-12 in 10 innings, but rebounded in Game 6 to tie the series, only to lose Game 7 and the World Series title.  But what if…?  What if Taylor didn’t abort his attempt and instead scored on a sacrifice fly?  And what if all the other events unfolded in an identical fashion?  The Dodgers would have only trailed 11-10 at that point and would have gone ahead 13-12 with their improbable three-run outburst in the top of the ninth inning.  They would have won Game 5 with Kenley Jansen closing it out in the bottom of the ninth, and they would have won the World Series in six games.  What if, indeed!  Certainly, nobody can say for sure how the subsequent events would have unfolded in this alternate reality, but the best guess we can make is to assume what happened after that play would have still happened, but with an extra run on the scoreboard for the Dodgers.  And if that were the case, the Dodgers would be World Series champions today instead of the Astros.  It’s incredible to imagine that the entire World Series may have been decided by a third-base coach who should have simply said “yes” instead of “go.”


Ross Roley is a baseball analysis hobbyist and former Professor of Mathematics at the U.S. Air Force Academy.  He’s also partially responsible for instant replay in MLB, having raised awareness of the issue in 2006.  http://baseballanalysts.com/archives/2006/05/instant_replay_1.php


The Luckiest and Unluckiest Batters by xwOBA

Last week I posted an article about Chris Taylor and how I expected him to regress. I won’t get into that much detail here but I just want to look at the luckiest and unluckiest players of 2017.

The under-achiever leaderboard looks like this (copied from Baseball Savant):

Rank  Player            xwOBA   wOBA    xwOBA-wOBA
1     Miguel Cabrera    0.382   0.322   0.060
2     Mitch Moreland    0.371   0.335   0.036
3     Victor Martinez   0.344   0.311   0.033
4     Alex Avila        0.401   0.368   0.033
5     Albert Pujols     0.326   0.294   0.032
6     Kendrys Morales   0.358   0.326   0.032
7     Brandon Moss      0.336   0.305   0.031
8     Taylor Motter     0.288   0.259   0.029
9     Alex Gordon       0.300   0.275   0.025
10    Jose Martinez     0.411   0.386   0.025

And here is the over-achiever leaderboard. Also in the top 30 are the aforementioned Taylor, Jose Ramirez, Nolan Arenado and Javier Baez, among other notable players:

Rank  Player            xwOBA   wOBA    xwOBA-wOBA
1     Eduardo Nunez     0.275   0.348   -0.073
2     Marwin Gonzalez   0.320   0.387   -0.067
3     Zack Cozart       0.332   0.399   -0.067
4     Mallex Smith      0.239   0.305   -0.066
5     Jose Altuve       0.349   0.413   -0.064
6     Dee Gordon        0.254   0.318   -0.064
7     Scooter Gennett   0.312   0.374   -0.062
8     Kevin Kiermaier   0.279   0.341   -0.062
9     Charlie Blackmon  0.364   0.424   -0.060
10    Ronald Torreyes   0.241   0.299   -0.058

Now the question is whether it is really all luck. If you look at the unlucky leaderboard, it is pretty easy to see that many of them are slow as dirt. The over-achiever group has some average-speed players (for example Marwin Gonzalez), but also speedsters like Altuve, Smith, Gordon and Kiermaier.
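For anyone who wants to reproduce the differential column, here is a minimal sketch. It assumes the first number in each leaderboard row is xwOBA and the second is wOBA, which is consistent with the listed differences; the function names are made up for illustration.

```python
# Rows copied from the two leaderboards: (player, xwOBA, wOBA)
rows = [
    ("Miguel Cabrera", 0.382, 0.322),
    ("Mitch Moreland", 0.371, 0.335),
    ("Eduardo Nunez", 0.275, 0.348),
    ("Jose Altuve", 0.349, 0.413),
]


def differential(xwoba, woba):
    """xwOBA minus wOBA, rounded to three decimals like the leaderboard."""
    return round(xwoba - woba, 3)


def label(diff):
    """Positive means the results lagged the contact quality."""
    if diff > 0:
        return "under-performer"
    if diff < 0:
        return "over-performer"
    return "even"


for player, xwoba, woba in rows:
    diff = differential(xwoba, woba)
    print(f"{player:15s} {diff:+.3f}  {label(diff)}")
```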

Overall, the under-performers had a higher launch angle, higher exit velo, and a slightly but not significantly higher pull rate (I thought that might be a factor due to the shift, but it really wasn’t).

                   sprint speed (ft/s)   exit velo (mph)   launch angle (deg)   pull%
under-performers   25.66                 89.24             12.59                42.1
over-performers    27.98                 84.04             8.49                 40.89

Of course, we don’t know whether factors like low LA and low power, which are generally associated with worse hitting, are themselves correlated with sprint speed. To test that, I looked at some sub-groups. When searching for harder hitters at lower LAs, I took EVs of over 89, paired with LAs under 9 (just eight players fulfilled that BTW). You get a slightly positive differential, which means slight under-performance, but only by about 18 wOBA points. Looking at soft hitters (below 85) with high LAs (>12 degrees), it does get more significant at a wOBA difference of 30 points.

Hard hitters with high LAs, however, only under-perform a tiny bit (about 8 points), so LA alone doesn’t really seem to make a difference. Hitting fly balls soft might be a factor that affects the under-performing, and very clearly speed does.
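The sub-group check can be sketched roughly like this. The thresholds (EV over 89 or below 85, LA under 9 or over 12) come from the text; everything else is an assumption. In particular, the player rows are made-up toy values for illustration, not real Statcast data, and `mean_diff` is a hypothetical helper.

```python
from statistics import mean

# TOY DATA, not real players: (exit_velo_mph, launch_angle_deg, xwOBA - wOBA)
players = [
    (91.0, 7.5, 0.020),
    (90.2, 8.0, 0.010),
    (84.0, 14.0, 0.035),
    (83.5, 13.0, 0.025),
    (90.5, 15.0, 0.008),
]


def mean_diff(rows, ev_min=None, ev_max=None, la_min=None, la_max=None):
    """Average differential for players inside the given EV/LA bounds.

    Bounds are exclusive; None means no bound. Returns None if nobody qualifies.
    """
    picked = [
        diff
        for ev, la, diff in rows
        if (ev_min is None or ev > ev_min)
        and (ev_max is None or ev < ev_max)
        and (la_min is None or la > la_min)
        and (la_max is None or la < la_max)
    ]
    return mean(picked) if picked else None


hard_low = mean_diff(players, ev_min=89, la_max=9)    # hard hitters, low LA
soft_high = mean_diff(players, ev_max=85, la_min=12)  # soft hitters, high LA
hard_high = mean_diff(players, ev_min=89, la_min=12)  # hard hitters, high LA
print(hard_low, soft_high, hard_high)
```

With the full Statcast leaderboard loaded in place of the toy rows, the same three calls would give the sub-group differentials discussed above.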

What we need to find out is how much of that is sustainable year to year. We do know that some pitchers have the skill to outperform their FIP, but for the most part pitchers who outperform their FIP will regress. Under or over-performing xwOBA might not be pure luck; there are factors which likely have an influence on that. Some of that might be holes in the xwOBA stat that can be fixed over time, and others might be caused by the player type. I think we need to do more analysis on the predictive value of xwOBA and the factors that influence it.

But of course one last thing needs to be said: Over-performing your xwOBA is nice, but still, the overall production counts. Some of the over-performing hitters are still not good hitters (Gordon, Torreyes, Gennett), while some under-performers are good.


Thinking Like an MLB MVP Voter

Photo: Yi-Chin Lee/Houston Chronicle

Baseball season is coming to a close and the Baseball Writers’ Association of America (BBWAA) will soon unveil its votes for AL and NL MVP. The much-anticipated vote is consistently under the public microscope, and in recent years has drawn criticism for neglecting a clear winner *cough* Mike Trout *cough*. This being one of the closest all-around races in years, voters certainly have some tough decisions to make. This might be the first year since 2012 where it’s not wrong to pick someone other than Mike Trout for AL MVP.

Of course, wrong is subjective. The whole MVP vote is subjective. Voter guidelines are vague and leave much room for interpretation. The rules on the BBWAA website read:

There is no clear-cut definition of what Most Valuable means. It is up to the individual voter to decide who was the Most Valuable Player in each league to his team. The MVP need not come from a division winner or other playoff qualifier. The rules of the voting remain the same as they were written on the first ballot in 1931:

1.  Actual value of a player to his team, that is, strength of offense and defense.

2.  Number of games played.

3.  General character, disposition, loyalty and effort.

4.  Former winners are eligible.

5.  Members of the committee may vote for more than one member of a team.

It won’t do any good for me to saturate the web with another opinion piece on who deserves to win. It won’t change the vote, and I don’t think I could choose. My goal is rather to illustrate how BBWAA voters have interpreted these rules over time. Have modern sabermetrics driven any shifts in voter consideration? Do voters actually consider team success? Do voters unconsciously vote for players with a better second half?

I thought the best (and most entertaining) way to answer these questions would be to create a model that would act as an MVP voter bot. Let’s call the voter bot Jarvis. Jarvis is a follower.

  1. Jarvis votes with all the other voters.
  2. It detects when the other voters start changing their voting behavior.
  3. It evaluates how fast the voters are changing behavior and at what speed it should start considering specific factors more heavily.
  4. It learns by predicting the vote in subsequent years.

I created two different sides to Jarvis: one that is skilled at predicting the winners, and one that is skilled at ordering the players in the top 3 and top 5 of total votes. The name Jarvis just gives some personality to the model in the background: a combination of the fused lasso and linear programming. And it also saves me some keystrokes. If you are interested in the specifics, skip to the end, but for those of you who’ve already had enough math, I will spare you the lecture.
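For a rough feel of what the fused lasso contributes, here is a minimal sketch of a generic fused-lasso penalty applied to year-by-year factor weights. It is not Jarvis’s actual formulation, which is described in the methodology; the function name and the `sparsity` and `fusion` parameters are assumptions for illustration. The idea is that one term discourages large weights and the other discourages weights that jump around from one year to the next, which is what lets the model follow slow shifts in voter behavior.

```python
def fused_lasso_penalty(weights_by_year, sparsity=1.0, fusion=1.0):
    """Generic fused-lasso penalty on a list of per-year weight lists.

    sparsity scales the sum of absolute weights;
    fusion scales the sum of absolute year-over-year changes per factor.
    """
    size = sum(abs(w) for year in weights_by_year for w in year)
    drift = sum(
        abs(cur - prev)
        for prev_year, cur_year in zip(weights_by_year, weights_by_year[1:])
        for prev, cur in zip(prev_year, cur_year)
    )
    return sparsity * size + fusion * drift


# Two hypothetical factors (say, WAR and RBI) tracked over three years.
steady = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
jumpy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

print(fused_lasso_penalty(steady))  # only the size term contributes
print(fused_lasso_penalty(jumpy))   # same size term plus a drift term
```

Both weight histories have the same total size, so the jumpy one is penalized more purely because its weights change from year to year.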

Jarvis needs historical data from which to learn. I concentrated on the past four-plus decades of MVP votes, spanning 1974 to 2016 (1974 was the first year FanGraphs provided specific data splits I needed). I considered both performance stats and figures that served as a proxy for anecdotal reasons voters may value specific players (e.g., played on a playoff-bound team). For all performance-based stats, I adjusted each relative to league average, if it wasn’t already, to enable comparison across years (skip to adjustments here).  Below are some stats that appeared in the final model.

Position player specific stats: AVG, OBP, HR, R, RBI

Starting pitcher (SP) specific stats: ERA, K, WHIP, Wins (W)

Relief pitcher (RP) specific stats: ERA, K, WHIP, Saves (SV)

Other statistics for both position players and pitchers:

Wins Above Replacement (WAR) – Average of FanGraphs and Baseball Reference WAR

Clutch – FanGraphs’ measure of how well a player performs in high-leverage situations

2nd Half Production – Percent of positive FanGraphs WAR in 2nd half of season

Team Win % – Player’s team winning percentage

Playoff Berth – Player’s team reaches the postseason
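To make the league-average adjustment and the 2nd-half figure concrete, here is a minimal sketch. Both helpers are hypothetical and reflect one plausible reading of the descriptions above, not the exact adjustments used in the model: stats are expressed as a percentage of league average, and the second-half share ignores negative WAR in either half. The sample numbers are made up.

```python
def league_index(value, league_avg, lower_is_better=False):
    """Express a stat as a percentage of league average (100 = average)."""
    if league_avg <= 0:
        raise ValueError("league_avg must be positive")
    if lower_is_better:
        # For stats like ERA or WHIP, flip the ratio so higher is still better.
        if value <= 0:
            raise ValueError("value must be positive when lower is better")
        return 100 * league_avg / value
    return 100 * value / league_avg


def second_half_share(first_half_war, second_half_war):
    """Share of a player's positive WAR that came in the second half."""
    first = max(first_half_war, 0.0)
    second = max(second_half_war, 0.0)
    total = first + second
    return second / total if total > 0 else 0.0


# Made-up inputs for illustration only.
print(league_index(0.300, 0.250))                       # AVG vs. league AVG
print(league_index(3.00, 4.50, lower_is_better=True))   # ERA vs. league ERA
print(second_half_share(2.0, 6.0))
```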

Visualizing the way Jarvis considers different factors (i.e. how the model’s weights change) over time for position players reveals trends in voter behavior.

Immediately obvious is the recent dominance of WAR. As WAR becomes socialized and accepted, it seems voters are increasingly factoring WAR into their voting decisions. What I’ll call the WAR era started in 2013 with Andrew McCutchen leading the Pirates to their first winning season since the early 90s. He dominated Paul Goldschmidt in the NL race despite having 15 fewer bombs, 41 fewer RBI, and a lower SLG and OPS. While Trout got snubbed once or twice since 2013, depending on how you see it, his monstrous WAR totals in ’14 and ’16 were not overlooked.

As voters have recognized the value of WAR, they have slowly discounted R and RBI, acknowledging the somewhat circumstantial nature of the two stats. The “No Context” era from ’74 to ’88 can be characterized perfectly by the 1985 AL MVP vote. George Brett (8.3 WAR), Rickey Henderson (9.8), and Wade Boggs (9.0) were all beaten out by Don Mattingly (6.3), likely because of his gaudy 145 RBI total.

Per the voting rules, winners don’t need to come from playoff-bound teams, yet this topic always surfaces during the MVP discussion. Postseason certainly factored in when Miggy beat out Mike Trout two years in a row, starting in 2012. See that playoff-berth bump in 2012 on the graph below? Yeah, that’s Mike Trout. What the model doesn’t consider, however, are the storylines, the character, pre-season expectations: all the details that are difficult for a bot to quantify. For example, I’ve seen a couple of arguments for Paul Goldschmidt as the front-runner to win NL MVP after leading a Diamondbacks team with low expectations to the playoffs. I’ll admit, sometimes the storylines matter, and in a year with such a close NL MVP race, it could push any one player to the top.

What can I say about AVG and HR? AVG is a useless stat by itself when it comes to assessing player value, but it’s ingrained in everyone’s mind. It’s the one stat everyone knows. Hasn’t everyone used the analogy about batting .300 at least once? Home runs…they are sexy. Let’s leave it at that. These two always seem to be on the minds of MVP voters, and that is not likely to change any time soon.

I’m sure some of you are already thinking, “What about pitchers!?” Don’t worry, I haven’t forgotten, although it seems MVP voters have. Only three SP and three RP have won the MVP award since 1974, and pitchers account for only about 7.5% of all top-5 finishers. As you can see in the factor-weight graph below, their sparsity in the historical data results in little influence on the model; voter opinions don’t change often, and their raw weights tend to be lower than those of position players. Overall, it seems as though wins continue to dominate the SP discussion, along with ERA and team success. While I would expect saves to have some influence on the RP side, voters tend to be swayed by recency bias and clutch performance along with WHIP and WAR.

What would an MVP article be without a prediction? Using the model geared to predict the winners, here are your 2017 MLB MVPs:

AL MVP: Jose Altuve    Runner Up: Aaron Judge

NL MVP: Joey Votto   Runner Up: Charlie Blackmon

Here are the results from the model tuned to return the best top-3 and top-5 finisher order:

It’s apparent that I adjusted rate and counting stats for league and not park effects, given that both Rockies place in the top two. Certainly, if voters are sensitive to park effects, Stanton and Turner get big bumps, and Rockies players likely don’t have a chance. Larry Walker is the only Colorado player to have won the MVP since the franchise’s inception in 1993, but in a close 2017 race park effects might make the difference.

Continue reading below for the complete methodology, and check out the code on GitHub.

A previous version of this article was published at sharpestats.com.


Statistical Adjustments

Note: lgStat = league (AL/NL) average for that stat; qStat = league average for qualified players. None of the adjusted stats are park adjusted.

There were two different adjustments needed for position player rate stats and count stats.

Rate stat adjustment:  AVG+ =  AVG/lgAVG  

Count stats: HR, R, RBI

Count stat adjustment:  HR Above Average =  PA*(HR/PA – lgHR/PA)

There were three different adjustments needed for starting pitcher (SP) and relief pitcher (RP) rate stats and count stats.

Rate stats: ERA, WHIP

Rate stat adjustment:  ERA+ =  ERA/lgERA  

Count stats I: K

Count stat I adjustment:  K Above Average =  IP*(K/IP – lgK/IP)

Count stats II: Wins (W), Saves (SV)

Count stat II adjustment:  Wins Above Average = GS*(W/GS – qW/GS)
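As a minimal sketch of the adjustments listed above, the two formula shapes (a ratio to league average for rate stats, and opportunities times the difference from the league rate for count stats) can be written as small helpers. The function names and all sample numbers here are hypothetical illustrations, not values from the data set.

```python
def rate_plus(stat, lg_stat):
    """Rate-stat adjustment as written above, e.g. AVG+ = AVG / lgAVG."""
    return stat / lg_stat


def count_above_average(count, opportunities, lg_rate):
    """Count-stat adjustment: opportunities * (count/opportunities - league rate).

    Covers HR Above Average (opportunities = PA), K Above Average
    (opportunities = IP) and Wins Above Average (opportunities = GS,
    with lg_rate taken from qualified players).
    """
    return opportunities * (count / opportunities - lg_rate)


# Hypothetical inputs, chosen only to show the arithmetic.
avg_plus = rate_plus(0.300, 0.250)                       # hitter vs. league AVG
hr_above_avg = count_above_average(30, 600, 0.03)        # 30 HR in 600 PA
wins_above_avg = count_above_average(18, 32, 0.375)      # 18 W in 32 GS

print(round(avg_plus, 3), round(hr_above_avg, 1), round(wins_above_avg, 1))
```

Because the count-stat form multiplies back by the player's own opportunities, a part-time player with a great rate earns less credit than a full-time player with the same rate.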


Fused Lasso Linear Program

I combined two different approaches to create a model I thought would work best for the purpose of predicting winners and illustrating change in voter opinions over time. Stephen Ockerman and Matthew Nabity’s approach to predicting Cy Young winners was the inspiration for my framework for scoring and ordering players. A player’s score is the dot product of the weights (consideration by the voters) and the player’s stats.

The constraints in the optimization require the scores of the first place player to be higher than the second place, and so on and so on. This approach, however, doesn’t allow for violation of constraints. I add an error term for violation of these constraints, and minimize the amount by which they are violated.
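A rough sketch of the scoring and the error term described above: each player's score is a dot product, and the error for a given year is the total amount by which any lower finisher outscores the player placed directly ahead of him. The function names, weights, and stat vectors are hypothetical, and this only evaluates a fixed set of weights; it does not perform the optimization itself.

```python
def score(weights, stats):
    """Player score: dot product of voter weights and (scaled) player stats."""
    return sum(w * x for w, x in zip(weights, stats))


def total_violation(weights, ranked_players):
    """Sum of ordering violations for one vote.

    ranked_players holds stat vectors ordered by actual finish (winner first).
    A pair contributes only when the lower finisher outscores the higher one.
    """
    scores = [score(weights, p) for p in ranked_players]
    return sum(max(0.0, lower - higher)
               for higher, lower in zip(scores, scores[1:]))


# Hypothetical two-stat example: the runner-up outscores the winner.
weights = [0.6, 0.4]
ranked = [[1.0, 0.5],   # actual winner
          [1.5, 0.0],   # actual runner-up
          [0.2, 0.1]]   # actual third place
violation = total_violation(weights, ranked)
print(round(violation, 3))
```

With these weights the winner scores 0.8 and the runner-up 0.9, so the only violation is the 0.1 gap between them; the optimization would search for weights that shrink that kind of gap across all years.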

Instead of constraining the weights to sum to 1, I applied concepts from Robert Tibshirani’s fused lasso, which simultaneously applies shrinkage penalties to the absolute value of the weights themselves and to the difference between weights for the same stat in consecutive years. This accomplishes two things: 1) it helps perform variable selection on statistics within years, helping combat collinearity between some performance statistics, and 2) it ensures that weights don’t change too quickly by overreacting to a single vote in one year.

However, this approach and formulation cannot be solved by traditional linear optimization methods since absolute value functions are non-linear. The optimization can be reformulated as follows:
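The reformulated program itself does not appear in this text. As a hedged sketch only, one standard way to linearize absolute values is to bound each one by an auxiliary variable. All notation below is assumed rather than taken from the original: w_{s,t} is the weight on stat s in year t, x_{(k),t} is the stat vector of the k-th place finisher in year t, e_{k,t} is the ordering error, u and v are auxiliary variables, λ1 and λ2 are the two penalty parameters, and ε is a small margin.

```latex
\begin{aligned}
\min_{w,\,e,\,u,\,v}\quad
  & \sum_{t}\sum_{k} e_{k,t}
    \;+\; \lambda_1 \sum_{t}\sum_{s} u_{s,t}
    \;+\; \lambda_2 \sum_{t \ge 2}\sum_{s} v_{s,t} \\
\text{s.t.}\quad
  & w_t^{\top} x_{(k),t} - w_t^{\top} x_{(k+1),t} + e_{k,t} \;\ge\; \varepsilon,
    \qquad e_{k,t} \ge 0, \\
  & -u_{s,t} \;\le\; w_{s,t} \;\le\; u_{s,t}, \\
  & -v_{s,t} \;\le\; w_{s,t} - w_{s,t-1} \;\le\; v_{s,t}.
\end{aligned}
```

Because u and v are minimized, at the optimum they equal |w_{s,t}| and |w_{s,t} − w_{s,t−1}|, so the objective and every constraint are linear and an ordinary LP solver applies.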

To select the lambda parameters, I trained the model using the first 10 seasons of scaled data, increasing the training set by one season each time, and tested with the subsequent year’s vote. After the in-season statistical adjustments, I scaled the stats by the mean and standard deviation of the training data to enable comparison across coefficients. All position player stats were replaced with 0 for pitchers and vice versa.
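The expanding-window scheme described above can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, the 13-season range is just an example, and the population standard deviation is used because the text does not say which variant was applied.

```python
from statistics import mean, pstdev


def expanding_splits(seasons, initial=10):
    """Yield (training seasons, test season) pairs.

    Training starts with the first `initial` seasons and grows by one
    season per step; the test season is always the next one.
    """
    for end in range(initial, len(seasons)):
        yield seasons[:end], seasons[end]


def scale(values, train_values):
    """Standardize values with the mean and std. dev. of the training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]


seasons = list(range(1974, 1987))          # example range of 13 seasons
splits = list(expanding_splits(seasons))
for train, test in splits:
    print(f"train {train[0]}-{train[-1]} ({len(train)} seasons) -> test {test}")

scaled = scale([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Scaling with training-set statistics only keeps information from the test year out of the fit, which matters when the goal is to judge how well a lambda setting predicts the next vote.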

References:

1. Ockerman, Stephen and Nabity, Matthew (2014) “Predicting the Cy Young Award Winner,” PURE Insights: Vol. 3, Article 9.

2. R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society Series B, 67(1):91–108, 2005.

 


The Three True Outfits

A new trend has emerged in baseball over the past few decades, one that reflects a big change in the game: the three true outfits.

Until the 1940s, there was essentially one way that players wore their uniforms: baggy pants, tucked into socks. Everyone on the team looked the same; there was no variation in how the players wore their uniforms.

Then things started to tighten up. By the 1970s, the whole team was wearing their pants tight, tucked into socks. The style was dramatically different from before, but there was little difference between players. All uniforms were tight.

In the past decades, something different has occurred. Players on the same team technically wear the same uniform, but how the uniform is worn varies dramatically from player to player.

Some players wear their pants big and baggy, covering not only their socks but their shoes as well. Other players wear their pants tight. Still other players are somewhere in between, wearing their pants in a more standard size, slacks-style.

Other pieces of the uniform add even more variation across players. Socks are worn in or out. Hats are flat or curved, sitting straight on the head or off-center. Jerseys can be buttoned up or unbuttoned. Shirts can be worn under the jersey or not, and vary in sleeve length.

What was once one outfit has become at least three true outfits, if not more. Players wear their uniforms in a variety of ways, reflecting the increasing diversity of the players and the game.

People have been critical of some of the styles, particularly the baggy pants. A writer for The New York Times, for example, discussed the fashion criticism of baseball style in a 2013 article, “Baseball Pants, a Sore Sight for Eyes.” “The World Series is a showcase for not only the finest teams in the game,” he wrote, “but also, for about the 15th year running, the regrettable fashion trend of the baggy, pajama-pant look.” One member of the fashion industry concluded, “What was once a stylish game has gotten depressingly schlubby.”

But the styles being criticized are actually far older than 15 years; they are the styles of the past, the pants of Babe Ruth. Baseball players today are making traditional styles modern and choosing how they want to look.

This trend reflects trends in fashion overall. Rather than walking around in top-down-imposed, creativity-crushing, cookie-cutter versions of clothing and ourselves, we want to be freer today to express our individuality in our lives and in our clothing. This diversity and individuality is reflected not only in the clothes worn on the field, but in the style of play. Baseball players are making the uniform, and the game, their own. This is good for baseball.


Just How Valuable Was Chad Green?

For all the surprises the 2017 season had to offer, one of the more pleasant ones had to be the rise of Chad Green.

Once a starting pitcher, Chad Green was converted to a relief pitcher this season. It was, for technical reasons, his first full season. His repertoire consists of a four-seam fastball, a slider, and a cutter, and the three make for a nasty mix. He threw his fastball 69.4% of the time this past season, with the slider at a 22.1% usage rate and the cutter at just 7.8%.

Some contributing factors to Green’s remarkable season could be that he was able to change his approach from a guy preparing to go five, maybe six innings, to one who could use his best stuff for two or so. That allows him to throw harder, and riskier, for shorter amounts of time.

Green’s average velocity on his FB:

  • 2016- 94.3 mph
  • 2017- 95.8 mph

The fact that Green was able to focus in on his pitches more led to his posting a fantastic season. Green improved in every category possible, and put himself in the elite group of relief pitchers that baseball has to offer.

2016: (MLB: 45.2 IP/ 8 GS)

  • 10.25 K/9 to a 2.96 BB/9, with a 2.36 HR/9
  • 41.3% GB rate / 25.0% HR-FB rate
  • .269 BAA & 1.40 WHIP

2017: (MLB: 69.0 IP/ 1 GS)

  • 13.43 K/9 to a 2.22 BB/9, with a 0.52 HR/9
  • 26.4% GB rate / 6.7% HR-FB rate
  • .145 BAA & 0.74 WHIP

What that indicates is that Green’s pitches were better utilized when he was able to throw them at their maximum ability. Naturally, with shorter outings there is also less room for error.

The biggest improvement for Green was his HR allowance. In 2016, over the 45.2 IP, he gave up 12 HR. In 2017, over 69.0 IP, he gave up a mere four. Green was able to strike out dramatically more batters, while lowering his GB rate from 41.3% in 2016 to 26.4% in 2017. Essentially speaking, Green just didn’t allow people to get on base.

In 2016, he faced 198 batters. He gave up 49 hits, walked 15 batters, and hit another. That is 65 batters allowed on base, leading to 26 earned runs, or approximately 33% of the batters he faced.

In 2017, however, he faced 253 batters. He surrendered 34 hits, walked 17, and hit two. That is 53 batters allowed on base, leading to 14 earned runs, or roughly 21% of the batters he faced.
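As a quick check on the arithmetic, the rate stats can be recomputed from the raw counts quoted in the text. The helper names are hypothetical. The one non-obvious step is that baseball innings notation counts outs after the decimal point, so 45.2 IP means 45 innings plus two outs.

```python
def innings(ip_notation):
    """Convert baseball IP notation ('45.2' = 45 innings + 2 outs) to true innings."""
    whole, outs = ip_notation.split(".")
    return int(whole) + int(outs) / 3


def whip(hits, walks, ip):
    """Walks plus hits per inning pitched."""
    return (hits + walks) / ip


def per_nine(count, ip):
    """Scale a counting stat to a nine-inning rate (e.g. HR/9)."""
    return 9 * count / ip


def on_base_rate(hits, walks, hbp, batters_faced):
    """Share of batters faced who reached via hit, walk, or hit-by-pitch."""
    return (hits + walks + hbp) / batters_faced


# Counts as quoted in the article for Green's two seasons.
green = {
    "2016": dict(ip="45.2", bf=198, h=49, bb=15, hbp=1, hr=12),
    "2017": dict(ip="69.0", bf=253, h=34, bb=17, hbp=2, hr=4),
}

for year, s in green.items():
    ip = innings(s["ip"])
    print(year,
          f"WHIP {whip(s['h'], s['bb'], ip):.2f}",
          f"HR/9 {per_nine(s['hr'], ip):.2f}",
          f"on-base {on_base_rate(s['h'], s['bb'], s['hbp'], s['bf']):.1%}")
```

Treating 45.2 as a plain decimal would understate the innings and inflate every per-inning rate, which is why the conversion comes first.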

At only 26 years old, and with the Yankees holding team control for the next four full seasons, Green has a bright future.

Green’s value was astounding this season. His ability to pitch multiple innings let the Yankees use him for extended appearances while giving him the same rest as the other arms. What separates him from his teammate David Robertson is that Green, thanks to his background as a starter, could go multiple innings on command.

GREEN 2017 value & breakdown:

  • Green appeared in 40 games, throwing 69.0 IP (1.7 IP per)
  • His RAR (Runs Above Replacement) was 23.5
  • 2.4 WAR
  • BABIP Wins- 0.6

ROBERTSON 2017 value & breakdown:

  • D-Rob appeared in 61 games (CWS/NYY) throwing 68.1 IP (1.1 IP per)
  • His RAR was 18.5
  • 1.9 WAR
  • BABIP Wins- 0.8

While David Robertson is nominated for AL Reliever of the Year (along with Craig Kimbrel and Ken Giles), Chad Green is seemingly receiving no love from MLB.

Kimbrel has to be most deserving of the three, posting an amazing 32.1 RAR and a 3.3 WAR.

Giles, on the other hand, is the worst of the three, and he posted numbers worse than Green. (18.1 RAR with a 1.8 WAR)

Chad Green was the Yankees’ go-to, or so it seemed. Whenever they were in a jam, Green would be brought in. Initially used as a sixth starter, or the “fifth-inning guy,” Green established himself as a huge piece in their bullpen.

Back in December of 2015, when the Yankees sent Justin Wilson over to Detroit for a pair of prospects, the word around MLB was that it was a rather lopsided trade in the Tigers’ favor. Wilson was coming off a great season for the Yankees, in which he posted a 3.10 ERA over 74 appearances. Despite this, Cashman stuck to his guns and insisted that the Yankees needed SP help more than another elite closer. When the trade was completed, the Yankees received the No. 6 and No. 19 prospects in Detroit’s system. (Green was No. 19.)

Chad Green may not be the most exciting player the Yankees have traded for, but he sure may be the best valued. With him being under team control for the next four seasons, and the fact that he is still young and working on his offspeed pitches, it opens the way for future improvements. Can Green be better next season? We all saw what happened with Luis Severino when he improved his secondary pitches.

In Severino’s second season (2016):

  • 55.9% Fastball
  • 9.9% Changeup

2017:

  • 51.4% Fastball
  • 13.5% Changeup

Although it’s not a drastic change, the fact that he was able to regain command and control of his changeup let him catch batters off guard more often and change their eye levels. Green’s pitch mix is not nearly as spread out as Severino’s, and because he is a relief pitcher, it doesn’t have to be. Green’s fastball is thrown, again, 69.4% of the time.

If he can work on his cut fastball a little bit more, the possibilities can expand for Green. The swings and misses out of the zone would be greater, and the contact percentages against lefties would go down, because he would jam them in on the hands. Having the luxury of playing alongside Robertson, who throws a mean cutter (inherited from Mariano Rivera), and being able to surround himself with the amazing ensemble of the bullpen crew the Yankees have put together bodes well for Chad Green.

During the months of September and October, Green posted a 0.74 ERA, allowing just one earned run (not counting the postseason). He faced 44 batters, and gave up just seven hits, against his 17 strikeouts. Green never gave up more than four earned runs in any month of the entire season, and surrendered seven runs in both the first- and second-half splits. It is clear that Green was focused from the moment he came out of the pen.

While he pitched only 2.1 IP in “high-leverage” situations (27.0 in “medium” and 39.2 in “low”), I wouldn’t hold that against his numbers. Look for that to change this upcoming season, especially with Betances’ mental struggles. If I were to speak blindly right now, Chad Green would be my seventh-inning guy, with Robo in the eighth, and Chapman in the ninth. There should be a dramatic change in terms of “high-leverage” innings pitched for Chad.

Needless to say, Chad Green is a remarkable story, given that he was considered an add-on in a trade headlined by Luis Cessa. When the trade initially happened, Green was said to be the guy who “bridged the gap” for Bryan Mitchell if he were to struggle.

Chad Green will look to build on his remarkable first full season in the “Pinstripe Pen of Doom,” and help guide New York to that AL pennant next year.


Do Switch-Hitters Always Need to Switch?

Switch-hitting is a rare yet valuable trait for hitters. It gives a player a certain versatility that eliminates the necessity for platooning. But it is not uncommon that a player is markedly more successful from one side of the plate. Take Lance Berkman, for example, one of the best switch-hitters of all time. Here are his career splits from the right side vs. the left side:

Handedness   AVG    ISO    BB%     K%      GB%     FB%     wRC+
Left         .301   .265   16.7%   16.8%   40.3%   39.9%   155
Right        .259   .158   12.6%   15.0%   47.4%   33.2%   105

Berkman was better in every facet from the left side, hitting for better average, more power, and lifting the ball more while showing a better eye. How many of those lefty plate appearances came against lefties? Zero. It makes sense that all his plate appearances would be L v R and R v L, because pitchers are generally better when facing hitters of the same handedness. But that is not always the case.

There are always reverse-split guys, among pitchers and hitters alike. We even had one in World Series Game 2. Rich Hill’s splits are not aggressively reverse, but for his career his wOBA and xFIP vs. lefties are .305 and 4.39, respectively. He’s posted .305 and 4.02 against righties in the same categories. The clear difference is his 16.0% K-BB% vs. righties and 11.5% vs. lefties. The numbers were much more reverse in 2017, albeit in a small sample. This year, against lefties and righties respectively, Hill’s wOBA allowed was .374 vs. .253, his xFIP was 6.08 vs. 3.36, and his K-BB% was 7.3% vs. 25.2%.

So, let’s take an example from that World Series game. Marwin Gonzalez, the Houston Astros’ switch-hitting utility man, had four plate appearances (not counting an intentional walk). In his first one, he faced Hill, striking out swinging. In his second one, he faced Hill, striking out looking. Both came as a righty. The next two came as a lefty. In his third, he drew a walk from Ross Stripling. And everyone knows what he did in his fourth appearance.

Gonzalez’s success and failures in those appearances did not stray from what he has done all year. Here are his left and right splits:

Handedness   AVG    ISO    BB%     K%      LD%     wRC+
Left         .322   .230   10.0%   19.7%   22.0%   154
Right        .250   .217   8.2%    17.9%   14.6%   115

He clearly displayed that he drove the ball better from the left side of the plate. So, Hill is worse against lefties and Gonzalez plays like an All-Star as a lefty. Wouldn’t it make sense to have him hit lefty?

Obviously, it’s not that simple, and you aren’t going to try an experiment in the World Series and have him hit lefty. Gonzalez’s eye isn’t trained to hit left-handed pitchers from the left side. And all his success from the left side may be because he sees right-handed pitching really well there. It also may disrupt a hitter if he mostly hits lefty versus righties, but then infrequently goes left on left for the occasional reverse-split guy. It could make hitters completely uncomfortable, and a hitter is highly unlikely to perform if he is uncomfortable. In truth, most factors point to it being a bad idea, despite what the numbers might say.

However, experimenting with the idea during inconsequential situations may be a good idea. I looked at some of the switch-hitters of the past decade with clearly more success from one side to see if any had toyed with hitting LvL or RvR. The group included Aaron Hicks, Mark Teixeira, Jose Reyes, Jed Lowrie, Chipper Jones, Pablo Sandoval, Justin Smoak, and Dexter Fowler.

One guy stood out: Sandoval. He’s accumulated 114 plate appearances as a LvL in his career. Still a tiny sample, but a clear demonstration that he has tried. Sandoval’s struggles as a righty are well-known, as his career wRC+ as a righty is an 80, vs. 124 as a lefty. In 2015, it appears he shelved the idea of hitting from the right side; 112 of those 114 LvL appearances came that season. In that one stretch, he was still poor, posting a 59 wRC+ against lefties. So he went back to hitting from the right side.

There are only two other guys, Teixeira and Reyes, who seemed to even have experimented with it. Teixeira hit 48 times as a RvR, and Reyes did the same 43 times. Teixeira seems to have messed around with it his entire career, having 4-5 such appearances nearly every year. While the sample is essentially nothing, his 138 RvR wRC+ is higher than both his LvR and RvL. On the other hand, Reyes’ appearances randomly popped up in 2010 and 2015, with 13 and 20 those years, respectively. He showed no sign of significant struggle as a lefty those years, so the randomness is strange. His -8 RvR wRC+ spoke for itself, anyway.

No one in the last decade, at least to my knowledge, has fully employed the tactic. The few that did fool around with it had mixed results. However, the success of Teixeira points to the fact that if the hitter feels comfortable, it may be a smart decision. Given the right matchup, of course. I attempted to find the pitching matchups for Teixeira as a RvR, but Statcast returned no results. It’s a strategy that probably many have thought about, but none have really used. Throwing in another situation to be accustomed to for a hitter may just be too difficult. But if a switch-hitter feels comfortable, it could be a helpful ability to have in their back pocket.