Starter or Reliever: The Josh Hader Story

I’ve always wondered if certain players are aware of the comparisons floated with their names.

For one, it could be valuable to observe and learn from a player with similar mechanics. Struggle can be an unexpected teacher, and if their look-alike possesses a career with peaks and valleys, those turning points make invaluable late-night research material for a baseball nut. On the other hand, comparing can create unrealistic expectations.

Because I have not had the pleasure of speaking to Brewers pitcher Josh Hader, knowing whether he sees value in comparisons eludes me. What I do know is the most frequent comparison attached to Hader immediately creates those lofty expectations: Chris Sale.

Hader is not as lanky, or as elite, but his sidearm-lefty slot creates Sale-like deception.

David Laurila of FanGraphs spoke with Hader about mechanics, and a few points resonated with me.

Hader is cognizant of the value biomechanical analysis can have, disclosing his run-in with motion-capture cotton balls affixing themselves to his body as he pops a glove with 95-mph heat. His max-effort delivery may cause worry for some, but reading about Hader’s confidence in his concoction of a motion is settling, even if it’s coming from the horse’s mouth. If you subscribe to the theory that past injury predicts future injury, Hader eclipsing 100 innings every year since 2013 should ease your concerns. (Thanks to Laurila for getting Hader’s thoughts in the column linked above.)

Hader also confirmed his awareness of the deception he creates when talking with Laurila. The less time a hitter has to pick up the ball out of his hand, the better. Left-handed hitters, in particular, have been decimated by Hader’s fastball-slider combo.

Lefties combined for a .158 slugging percentage against Hader last season. That was the second-lowest mark in baseball, behind Pittsburgh Pirates closer Felipe Rivero (minimum 70 total batters faced), and firmly inside the 99th percentile. When you drill down to how effective Hader’s slider was, I fear for any lefty who had to deal with this release point and horizontal bite (see gif above). Hader threw his slider 77 times last year to left-handed hitters, and the resulting slugging percentage was .071. When they swung at this slider, they missed 44% of the time. Both marks are comfortably better than the league-average slugging percentage and whiff rate, adding statistical backing to Hader’s dominance.

Unique about Hader is not only this slider, his hair, and his effectiveness, but his role heading into the offseason.

Since his move to Milwaukee from the Houston Astros in 2015’s Carlos Gomez swap, Hader was a starting pitcher for every one of his minor-league appearances. Craig Counsell & Co. entertained the reliever role for Hader only upon his promotion to the major leagues on June 10. Culprits for the switch could be situational — the Brewers were contending, and needed bullpen arms — but you could also convince me they were performance-based. A 13.6% walk rate over 52 Triple-A innings doesn’t inspire confidence.

This isn’t breaking news to Brewers fans.

Control issues have always been a problem for Hader, but as a reliever, the Wayne’s World look-alike had a good enough fastball to utilize it 75 percent of the time to lefties, upwards of 85 percent to righties, and net himself a shiny 36 percent strikeout rate (47 2/3 innings). In the process, Hader cut his walk rate to 11.7 percent in the majors, from north of 13 percent at Triple-A.

Unfortunately for Hader, even that improvement shouldn’t inspire confidence. We haven’t had a qualified pitcher at the major-league level, with a walk rate greater than 11.6%, since Francisco Liriano in 2014. I wouldn’t fault Hader for making a deal with the devil and taking Liriano’s 1,500-inning career, but my intentions are to consider a pitch vital to determining Hader’s 2018 role.

***

“Considering everything” headlines an MLB.com column from Brewers beat writer Adam McCalvy just over a week ago.

The vocalist of that quote was Craig Counsell, and the topic was our very own Josh Hader.

Indifference exists because Hader pitched so well in his 35 relief appearances and because of the smattering of question marks, the biggest of which is emerging ace Jimmy Nelson’s shoulder health. One depth chart has Hader as Corey Knebel’s set-up man, with an individual named “B. Suter” in the Brewers’ 2018 rotation. (Not “Bruce” Suter, just to confirm. Sorry, Brent.)

One question mark Hader can control is the development of his changeup. Stop me if you’ve heard this before, but a developed third pitch — so often the changeup — is how many minor-league arms get a chance to work for five-plus innings in the upper levels.

One of my favorite finds from 2017 has been the scout Chris Kusiolek (@CaliKusiolek on Twitter). In regards to changeups, Kusiolek mentioned on the Fantrax Baseball Show how much of a feel pitch it truly is. He detailed how he looks not at the present state of a pitcher’s changeup when determining the viability of the pitch’s future, but the athleticism of the pitcher, his arm action, fastball, and other aesthetics, to make that call. I’m nowhere near as seasoned of a scout as Kusiolek, but Hader hits a few of those points.

Even Hader will admit changeups are a feel pitch, and found in that same McCalvy column, the Brewers beat writer tweeted out the grip Hader was working on back in March of 2017.

“Messed up” can often prime one to think inconsistent, but that may apply to the resulting action Hader achieved on the pitch, rather than the results.

FanGraphs has Hader’s changeup just below 86 mph. This average velocity was the more common action on the pitch I observed watching tape of Hader. Other times, however, I’ve seen Hader’s change kick up to 88 mph. From my crude observation, the harder changeup only came spontaneously and later in counts. You’re about to see an 88-mph changeup on a two-strike pitch to Adam Duvall.

Harry Pavlidis has conducted extensive research on why some changeups are effective, noting those who generate elevated levels of ground balls and swinging strikes with the pitch are ideal (Stephen Strasburg is the poster-child).

Hader’s changeup hits one of those two criteria. Among starters and relievers with 50 or more changeups thrown, when Hader’s is put in play, it generates grounders at a 75-percent clip, sixth-highest in all of baseball (320 total starters and relievers). I understand it’s a pipe dream to ask Hader to replicate the arm action or grip that leads to the harder offering — if it is spontaneous — but if the structure of his general changeup leads to an elevated level of ground balls, this harder changeup might push him further into worm-killer territory.

Given Hader’s changeup has a sub-par whiff-per-swing rate in the bottom quarter of the league, playing to his strengths and embracing the harder version could make an interesting case for change.

You could argue Hader needs to continue mixing the two, but if the hittable, 86-mph changeup is thrown more as an early-count offering to righties, exploiting Hader’s attempt to pitch backwards could become a game plan. Or, in a perfect world, Hader can refine the swinging-strike rate on the slightly softer offering and turn into a two-changeup lefty. (A boy can dream, right?)

***

Considering Hader for a rotation spot is not a spontaneous decision, especially with Hader’s talent and polished, 23-year-old arm.

Both of his raw pitch count season-highs throwing his changeup came in consecutive appearances during late September. His usage with the pitch crept towards 19 percent, and both outings lasted north of two innings.

Hader can survive as a starting pitcher if his changeup becomes a legitimate weapon to right-handed hitters, especially if opposing managers understand Hader’s dominance against lefties and stack against his natural platoon split.

While Hader’s changeup is often knocked for being inconsistent, I counter that sentiment by saying he has a substantially better feel for the pitch than most, especially given the tendency of hitters to pound it into the ground, regardless of the velocity.

My gut tells me Hader will be utilized as a multi-inning reliever, and dominate both sides of the plate in 2018. My heart tells me to give Hader starts to further refine his feel for a pitch he’ll have to use effectively the second and third time through major-league lineups in order to survive.

In Craig Counsell and Derek Johnson I trust.

A version of this post can be found on my website, BigThreeSports.com

Statistics all from BrooksBaseball, BaseballSavant, Baseball Prospectus, and FanGraphs, unless otherwise noted.


Ichiro Shot the Moon

Ichiro is one of the most bizarre players of the past 20 seasons. While many hitters have come over from Japan to the MLB, Ichiro has stuck in North America like no one else. The NPB is famous for its ground-ball-heavy approach — per DeltaGraphs, the NPB ran a GB% of 48% compared to 44% for the MLB last season — but that approach usually doesn’t work that well across the pond. That wasn’t the case for Ichiro. He made it work, and he made it work all the way to capturing the single-season hit record. And he did it in a really, really weird way.

How to Hit In Japan

To explain why it was so weird that Ichiro did what he did, we have to go all the way back to the beginning, back to Ichiro’s home country of Japan. Nippon Pro Baseball is the highest level of professional competition in Japan, and it’s where MLB superstars (and future superstars) like Ichiro, Shohei Ohtani, and Hideki Matsui started their careers.

The NPB is traditionally referred to as a ‘AAAA league’: its level of competition is below that of the MLB, but above that of a typical AAA team, which is why players who could mash in AAA but couldn’t hang on in the majors usually end up in the land of the rising sun (guys like Álex Guerrero and Casey McGehee were among the best hitters in the NPB in 2017).

The NPB’s style of baseball, however, is unique. It exists as some strange mesh of dead-ball play and modern baseball, where ground ball machines can thrive.

Earlier this year, Ben Lindbergh took a look at the biggest ground-ball-machine in the world, Nippon-Ham Fighter Takuya Nakashima, who ran an astonishing 74.4% GB% in 2016. Nakashima’s batted-ball profile looks like something of a caricature of the rest of the league, a gross exaggeration of the way the rest of the league plays.

NPB vs. MLB GB%

League-wide, the NPB GB% year to year falls between 47% and 48%, which is quite a bit more than the 44%-45% that the MLB posts every season. Japanese players also traditionally reach base more frequently on grounders, posting a BABIP of .245 on ground balls in 2017 compared to the MLB’s .241 figure.

NPB vs. MLB wRC+ on GB

But the biggest difference between MLB and NPB grounders? Ground balls are generally worth 30% more in Japan than they are in North America. MLB batters posted a 29 wRC+ on grounders, but NPB grounders were worth 42 wRC+. That’s a huge difference, especially for a league-wide figure. While it’s still not technically beneficial to hit ground balls, in Japan, hitters are rewarded for doing so more frequently than their North American counterparts.

How does such a huge difference exist between NPB and the MLB? Lindbergh, in the above article, suggests that the spongy Japanese turf is to blame, causing ground balls to have more life on them. In addition, Lindbergh suggests that the NPB, which has been slow to adopt many sabermetric and modern ideas, is shift averse, meaning many pull-happy hitters can run higher BABIPs. It’s also possible that since NPB has a lower skill level than the MLB, NPB infield defense could allow more hits than MLB infields.

Whatever the reason, hitters who came to the MLB from the NPB while relying on the ground ball as a means of production generally saw their production suffer. Tsuyoshi Nishioka, for example, hit .346/.423/.482 the season before coming to the MLB, but managed only a paltry .215/.267/.236 with the Twins in two seasons. Nishioka relied heavily upon the ground ball in both leagues but was punished more heavily for doing so in the MLB than in the NPB, and that, coupled with the difficulty of facing MLB pitchers, doomed him to mediocrity.

Ichiro was much the same — a ground-ball production machine. When he came over from Japan, perhaps in hindsight, he should have flopped for the same reasons that Nishioka, Kensuke Tanaka, Munenori Kawasaki, and Akinori Iwamura flopped. He fit the profile — speedy, high-contact ground-ball hitter coming over from Japan. Hell, Ichiro’s best-case scenario should have been what Nori Aoki turned out to be.

Instead, he thrived.

Ichiro Breaks the Mold

When Ichiro arrived in America, he was nothing short of a revelation, and a key factor in the Seattle Mariners posting the best record of the modern era in 2001 — and he was arguably the face of the franchise for close to a decade.

Ichiro’s high-contact, low walk/strikeout approach shouldn’t have worked. I ran Ichiro’s 2003 season through my similarity tool, and the best comps I generated were Jose Vizcaino’s 2004 season, Warren Morris’ 2003 season, and Brad Ausmus’ 2004 season (yes, that Brad Ausmus). None of these guys posted a wRC+ over 90 in those years, but Ichiro was at 112. How did Ichiro get by using a strategy that had failed so many hitters before him?

Career BABIP leaders by SLG

On paper, the answer is BABIP. For his first four seasons, Ichiro never posted a BABIP below .333. While the league average for BABIP is around .300, elite players generally have a BABIP skill above .300 as a result of making elite contact. If we make a rough and naive assumption that a high SLG means that a player made good contact, we see that among the top 15 career BABIP leaders (with 10,000 PA), most of them made good contact, except for Lou Brock … and Ichiro.

It gets weirder. Remember all that talk about ground balls? Ichiro hit a lot of them — since 2002, the earliest season for which we have batted-ball data, Ichiro has hit the most ground balls in the majors, almost 800 more than 2nd place (Derek Jeter). Here is a scatterplot of GB% versus BABIP for qualified single seasons since 2002.

GB% vs. BABIP, 2002-2017

There exists a weak, but roughly positive correlation between BABIP and GB%. Most everyone is hanging out somewhere around the 35%-50% GB% and .250-.350 range, but then there’s Ichiro, who consistently posts BABIPs well above what he should be getting. Ready? It gets even weirder.

GB% vs. BABIP vs. Age, 2002-2017

Here’s that same chart, but I’ve thrown in the ages of each hitter in a gradient color scale. There’s a good spread around here, but I’ve highlighted Ichiro’s 2004 season, and it should stand out in three big ways. First, he posted one of the highest GB% since 2002 (63.1%). Second, he posted the second-highest single-season BABIP since 2002 (.399). And third, he was 30 when he did this! Many of the light blue values in the upper right of the chart belong to Ichiro, which is really unusual, since many of them came when he was older than the median MLB player (29 years old).

GB vs. BABIP vs. Older or Younger than 29

In this chart, the red dots represent hitters 29 years old or younger, and the blue dots represent hitters 30 years old or older. Notice how there’s a roughly even mix in the middle, but older hitters tend towards the bottom left, and younger hitters tend towards the upper right (though there are exceptions to each).

GB vs. BABIP vs. Older or Younger than 29 without Ichiro

Here’s that same chart, but I’ve removed Ichiro’s seasons — look at the far upper right. See the difference?

Ichiro’s specialty is defying all aging curves and all logic by consistently posting these ridiculous BABIPs while acting like a ground-ball machine, and making contact that most hitters would be ashamed of.

Legs Don’t Fail Me Now

We’ve already identified that Ichiro makes sub-par contact, hits a lot of ground balls (not exactly a recipe for production), and doesn’t strike out or walk much. No, the biggest tool for Ichiro, as anyone who watched him play could tell you, was his speed.

August Fagerstrom previously found that Ichiro had elite speed in his younger days, estimating his time-to-first in his prime as just under 3.75 seconds, which would blow Billy Hamilton (3.95 seconds) out of the water. It’s no exaggeration to say that Ichiro could be one of the fastest men in MLB history.
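To put that two-tenths-of-a-second gap in perspective, here is a rough sketch of the arithmetic. It assumes the regulation 90-foot distance from home to first and treats each runner as moving at a constant average speed, which is a simplification; the two times are the estimates quoted above.

```python
# Home-to-first distance on a regulation diamond, in feet.
BASE_PATH_FT = 90.0


def average_speed(time_to_first_s: float) -> float:
    """Average speed in feet per second over the 90 feet to first base."""
    return BASE_PATH_FT / time_to_first_s


# Times quoted in the text (Ichiro's is "just under" this figure).
ichiro_time_s = 3.75
hamilton_time_s = 3.95

ichiro_speed = average_speed(ichiro_time_s)
hamilton_speed = average_speed(hamilton_time_s)

# How far from the bag a 3.95-second runner still is at the moment
# a 3.75-second runner arrives (constant-speed simplification).
gap_ft = BASE_PATH_FT - hamilton_speed * ichiro_time_s

print(f"Ichiro:   {ichiro_speed:.2f} ft/s")
print(f"Hamilton: {hamilton_speed:.2f} ft/s")
print(f"Gap at the bag: {gap_ft:.2f} ft")
```

Under those assumptions the slower runner is still roughly four and a half feet short of the bag, which is the difference between a routine out and an infield hit on a lot of ground balls.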

So many hitters came over from Japan with profiles similar to Ichiro — speedy ground-ball hitters who make a lot of contact. But none of them had Ichiro’s generational speed, and so, none of them found the type of sustained success that he did.

One cannot help but feel a sense of wonder in looking at Ichiro’s career. Because his production relies almost solely on his ability to make contact and his speed, tools that decay slowly with age (I’m aware that speed tends to decrease with age, but exceptionally speedy runners such as Chase Utley and Rajai Davis can retain their prowess on the basepaths well into their late 30s), he was able to defy what we might expect from someone of his age and with his batted-ball profile.

Ichiro was shooting the moon with his approach at the plate, in a way. Sabermetric wisdom tells hitters to elevate, draw walks, not be afraid to strike out, make solid contact, and not worry about speed. Ichiro did the exact opposite and was handsomely rewarded for it. I can think of no more unique player with such a storied career and legacy. Here’s hoping 2017 won’t be Ichiro’s last hurrah.


The Giants’ Not-So-Shiny New Toy

The Giants made a big splash by acquiring Evan Longoria, owner of three All-Star nominations, three Gold Gloves, a Silver Slugger, and the 2008 AL Rookie of the Year. I will come right out and admit that I have hardly spent any time thinking about Longoria at all through his 10-year career. As a fan of an NL West team, the Rays are about as far away from my realm of focus as you can get. Throw in the fact that they are a small-market team dwarfed by the Yankees and Red Sox, and Longoria simply hasn’t made a huge impression on me.

After reacquainting myself with his player page, I realized how much I have been missing. Longoria has amassed almost 50 WAR in his career so far, placing him on the bubble of many Hall of Fame stats despite being only 32 years old. He has avoided any disastrous seasons, as his lowest WAR total was 2.2, and that came in a 2012 season when he only played in 74 games. Almost as impressive as his WAR totals – that 2012 season has been the only season in 10 years that he missed significant time due to injury. In the past five years, he has played in more games (798) than anyone in the MLB. He has been the epitome of health and consistency for a decade.

Longoria has earned his value by being very well-rounded. He provides significant value with his bat, as his career wRC+ mark of 123 matches up with the likes of Yoenis Cespedes, Jose Altuve, and Mookie Betts, all extremely accomplished hitters that have yet to enter their late-career decline phases. As the three Gold Gloves imply, Longoria is also an impressive fielder, with career marks of 75 DRS and 89.1 UZR. While not a massive base-stealing threat, he has shown enough speed and baserunning intelligence to provide slightly above-average baserunning value. Simply put: the dude is good at playing baseball, and he’s been proving it for an entire decade now.

As impressive as that resume is, the Giants don’t get to enjoy any of his past accomplishments. They didn’t trade for 2008-2017 Evan Longoria, they traded for 2018-2022 Evan Longoria. So now the question becomes: Is Evan Longoria still good? Jeff briefly touched on this immediately after the trade, but I wanted to take a deeper look.

At 32 years old, he is past the typical peak years for most baseball players, and in Longoria’s case, he already sustained a pretty clear peak over his first six seasons (ages 22-27). As Jeff noted, he put up a wRC+ of 135 during this time; compare that to his four seasons since then (ages 28-31), when his wRC+ has dropped to 108. Don’t get me wrong – 108 is still good! It’s just not the elite All-Star player we saw at the front of his career. His defense has followed the same trajectory, as he put up +79 DRS and +78.4 UZR over his first six seasons, then dropped to -4 DRS and +10.7 UZR over his last four seasons.

This is a familiar story: good baseball player gets older, becomes worse baseball player. But it’s so familiar that it can also be a trap – Longoria might end up following the Adrian Beltre career path, who posted a 6-WAR season at 37 years old. Looking at the numbers, though, I just can’t make myself believe that Longoria is anything more than a useful starter right now, and one that will shortly become a below-average player.

Longoria’s strikeout rate immediately jumped out at me, as he only struck out 16.1% of the time last year, setting a new career low, almost four percentage points below his career average. This is promising! In an era of increasing strikeouts, Longoria is figuring out how to put more balls in play, giving him more chances of getting on base. Of course, this line of thinking requires that he is trading strikeouts for quality batted balls, and considering his ISO last year sat 50 points below his career average, it didn’t look like this was the case. After digging deeper into some plate discipline numbers, it became very obvious to me what was happening.

2013 was Longoria’s last star-caliber season. The following year, his wRC+ dropped from 132 to 105, with a corresponding spike in Swing%. All of a sudden, Longoria was much more aggressive, swinging at more pitches both inside and outside of the zone. And especially in 2017, he seemed to be focusing intently on putting the ball in play, with a large spike in Contact% despite seeing the 2nd lowest Zone% in his career. Some people are able to cut strikeouts by controlling the strike zone better, but it looks like Longoria was cutting strikeouts by swinging more often and making poor contact on bad pitches. Consider his batted-ball distribution:

The first big red flag here is the red line along the bottom. Once again, starting in 2014, we start seeing a worrisome trend as he began hitting more and more infield flies. All his improvements in strikeout rate are erased here, as infield flies are essentially automatic outs and are just as bad as strikeouts. The other interesting tidbit in this graphic is the interplay between his GB% and FB% the past two years. Longoria had a mini-offensive resurgence in 2016, and it looks like that can be attributed to him lifting the ball more often. In 2017, he lost all of his FB% gains and then some, driving more balls into the ground than ever before.

Jeff also touched on the relevant Statcast data. Longoria’s exit velocity dropped significantly last year, as did his rate of barrels and xwOBA. There was nothing fluky going on for Longo in 2017 – he was swinging more often but making worse contact, and more of his batted balls were either going into the ground or popped up in the infield.

Is a turnaround completely out of the question? Of course not, nothing is out of the question. Perhaps a change of scenery will provide a spark for the 32-year-old. Perhaps he will be motivated to prove to the baseball world that the Giants made a good trade, and he will work harder than ever to make it back to All-Star levels. Even if he simply sustains his current production, he is still a 2-3 win player right now. But the Giants need more than that, and we’re already four years into a significant decline for Longoria. Both his bat and his glove are on the wrong side of the age curve, and it looks like the Giants just added another expensive, aging veteran to throw onto the pile.


Giants, Rays Make Strange Trade

On December 20th, the Rays shipped Evan “Career Ray” Longoria to the Giants for Christian Arroyo, Denard Span, Matt Krook and Stephen Woods. On the surface, this seems like a deal that fits the needs of both teams. The Rays have initiated yet another rebuild that Longoria didn’t want to be a part of, and got some young players in return. Arroyo is a former top-100 prospect who, despite destroying lower levels in 2017, struggled in his debut with the Giants. The two arms are the classic “pitching prospects,” and, well, Denard Span is Denard Span. On the other side, the Giants filled an absolutely gaping hole at third base. They no longer have to play Pablo Sandoval, and that should be a win for any team.  However, this trade has left me scratching my head, and there’re a few reasons why. Let’s look at some statlines.

Player A – 96 wRC+ / 11 DRS

Player B – 108 wRC+ / 10 DRS

Which one do you think the Giants just gave Christian Arroyo up for? The answer is A, Evan Longoria, a 32-year-old who is making 13.5 million dollars a year. Player B is Todd Frazier, a 31-year-old (almost 32-year-old) free agent who will more than likely sign a contract in the 10-12 million dollar range. Now, Longoria did have a down year at the plate in 2017 and is the better defender of the two, but looking at these numbers raises some questions. Why did the Giants give up a talented young prospect for someone they could have just signed in the free-agent market? It’s understandable that you can look at Longoria’s track record and expect him to bounce back from a down year, but there are a few other things to consider before jumping to that conclusion. First of all, Longoria is moving from Tropicana Field to AT&T Park, one of the most pitcher-friendly parks in the majors. According to Baseball Reference, Tropicana was also stifling, but still, moving to AT&T is not a welcome change for any hitter. Secondly, Longoria is 32 years old, and we all know what side of 30 that’s on (it’s the bad side). Thinking Longoria can bounce back during his age-32 season is a tough sell for anyone who believes in the aging curve.

Let’s consider what the Giants could have done differently. If they had signed Todd Frazier, they would have gotten a cheaper contract for a player with essentially the same skill set as Longoria: a power right-handed bat with a plus glove at third base. They’re practically the same age, and the Giants could have kept Christian Arroyo around, giving him more time to develop in the minors or exposure at the major-league level if Joe Panik continues to struggle. Yes, they did offload Denard Span’s contract, so technically, Longoria is cheaper than Frazier would be, but I’ll address that in just a bit. They also had the option to not sign anyone at all and hope Arroyo develops into some sort of Matt Duffy 2.0. To make it clear, I don’t think Arroyo will ever be as good as Longoria, but I have no problem believing he could be a 2-3 win player a few years down the road.

Now, in terms of the big picture, only one of these moves keeps the Giants’ hopes for the future alive. If you haven’t noticed, all the stars on the Giants are going to be on the wrong side of the aging curve soon. Signing Frazier only contributes to that problem. Arroyo could have been a piece the Giants could have built their team around in the future when all of their other superstars are decrepit skeletons. Remember what I said about Denard Span being Denard Span and his contract being offloaded? That’s another problem that the Giants have to fix now. Denard Span isn’t good, but he’s essentially league-average at playing center field. Who do the Giants stick out there now? Steven Duggar? Mac Williamson? Both of those options represent a downgrade to Span. Instead, we can expect the Giants to throw a bunch of money at a free-agent outfielder, perhaps someone like Lorenzo Cain. Cain would represent a huge upgrade over Span, but Cain is still another 31-year-old who is projected to decline in his production while making close to 20 million dollars a year, which cancels out the effect of getting rid of Span’s contract in the first place.

If they do sign Cain, the Giants will then be spending more money than they were before they traded for Longoria in the first place. If they don’t, then they’ll have to expect lackluster production out of center field, somehow even more lackluster than Span already was. Finally, you have to consider if this move actually makes the Giants better than the rest of the NL West. It doesn’t. The Dodgers are still a super team, the D-Backs are still very good, and the Rockies, despite having some question marks about their rotation, are a good team as well. Well, okay, the Giants won’t finish behind the Padres, but you still have to be better than the best team in baseball last year to win your division. This is a lot to ask for a Giants team that has only added something like 1.5 wins this offseason and was the worst team in the National League last year. Don’t get me wrong; getting Longoria, a good player who makes way, way less than his market value is a great move, but I don’t think it is in the context of where the Giants are as a team, what they gave up, and the holes they still have to fill.

As for the Rays, this is a move that was going to happen eventually. They see that the Yankees and Red Sox are going to dominate the AL East for a while, so they decided that now is as good a time as any to tear it down and start again. The Rays will continue to do the Rays thing we all know and love, stockpiling as many Matt Duffy-type players as they can while consistently pumping out awesome pitchers from their farm system, then trading those pitchers for more clones of Matt Duffy. Arroyo will more than likely take over the second-base job in Tampa at some point during 2018, and will be a fun player to watch in the Rays lineup. Span might end up taking some time from Mallex Smith in left field, but Smith is definitely the more exciting and interesting player of the two. The real Matt Duffy will end up playing third, and the Rays will finish 3rd or 4th in the AL East like they do almost every year. Again, this is a trade that was going to happen, but it’s just surprising to see how it ended up going down.


Does Lifting the Ball Have a Ceiling?

Elevating is en vogue; everyone wants to do it, and it seems like every hitter who does it can become a power hitter, especially with rumors about a new ball. There have been many examples of successful hitters of that mold: Daniel Murphy, Justin Turner and Jose Altuve, among others. Is there a limit to this? Could we see hitters with a 25% GB rate in the future? 20%? 15%?

One thing that seems to cap this is BABIP. There is a pretty positive correlation of BABIP and GB rate, i.e. GB hitters tend to have a higher BABIP. That seems logical since FBs tend to have a lower average, and even if they are hits they often don’t count for BABIP as they are often home runs.
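The point about home runs falls straight out of the standard BABIP definition, (H - HR) / (AB - K - HR + SF). A minimal sketch follows; the stat line in it is hypothetical and only there to show how homers drop out of both the numerator and the denominator.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical fly-ball hitter: 150 hits, 35 of them home runs.
slugger = babip(hits=150, home_runs=35, at_bats=550, strikeouts=130, sac_flies=5)

# Same hypothetical line, but with those 35 hits staying in the park.
# They now count as hits on balls in play, so BABIP rises.
no_homers = babip(hits=150, home_runs=0, at_bats=550, strikeouts=130, sac_flies=5)

print(f"With 35 HR:  {slugger:.3f}")
print(f"With 0 HR:   {no_homers:.3f}")
```

The total number of hits is identical in both cases; only the share that leaves the yard changes, which is why a fly-ball hitter can be productive while carrying a modest BABIP.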

This table shows the relation of BABIP and GB rate between 2008 and 2017. You can see that BABIP does go down with lower GB rates, but wRC+ is actually better with lower GB rates. Still, you could see a point being reached where the lower BABIP eats up the advantages.

GB rate   <0.35   0.35-0.40   0.40-0.45   0.45-0.50   0.50-0.55   >0.55
BABIP     0.287   0.290       0.299       0.304       0.314       0.320
wRC+      106     102         101         95          90          93
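As a quick sanity check on the table, the figures can be read in as plain lists and the two trends compared directly. This is only a sketch over the six bucket averages shown above (with the lowest bucket read as a GB rate below 0.35), not a re-run of the underlying player data.

```python
# Bucket averages copied from the table above, lowest GB rate first.
gb_buckets = ["<0.35", "0.35-0.40", "0.40-0.45", "0.45-0.50", "0.50-0.55", ">0.55"]
babip_by_bucket = [0.287, 0.290, 0.299, 0.304, 0.314, 0.320]
wrc_plus_by_bucket = [106, 102, 101, 95, 90, 93]

# Does BABIP rise with every step up in ground-ball rate?
babip_rises = all(a < b for a, b in zip(babip_by_bucket, babip_by_bucket[1:]))

# Size of the BABIP gap between the extreme buckets.
babip_spread = babip_by_bucket[-1] - babip_by_bucket[0]

# Gap in wRC+ between the lowest-GB bucket and the worst bucket.
wrc_gap = wrc_plus_by_bucket[0] - min(wrc_plus_by_bucket)

for bucket, b, w in zip(gb_buckets, babip_by_bucket, wrc_plus_by_bucket):
    print(f"{bucket:>10}  BABIP {b:.3f}  wRC+ {w}")
print(f"BABIP rises in every bucket: {babip_rises}")
print(f"BABIP spread: {babip_spread:.3f}, wRC+ gap: {wrc_gap}")
```

The two columns move in opposite directions: about 33 points of BABIP are given up going from the heaviest ground-ball bucket to the lightest, while wRC+ improves by double digits.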

Average launch angle shows a similar picture:

 

Avg. LA (deg)   <8      8 to 10   10 to 12   12 to 14   14 to 16   16 to 18   >18
BABIP           0.318   0.314     0.305      0.298      0.300      0.289      0.274

It seems that once you get past a certain launch angle or GB rate, a drop in BABIP is inevitable. However, an exception might be possible. I looked up guys with a lower than 35% GB rate and a FB rate of lower than 45%, and their BABIP was 0.304. Those guys were pretty rare between 2008 and 2017, but it is possible. You just need to get the ball off the ground and avoid both pop-ups and high outfield fly balls above 25 degrees. Not an easy thing to do, though, as the bat is a round object, and batted balls will always be distributed rather normally around the average LA, meaning that a higher average LA usually will mean more high outfield fly balls.
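The "distributed rather normally" argument can be sketched numerically. The snippet below treats a hitter's launch angles as a normal distribution and asks what share lands above the 25-degree mark mentioned above. The spread values (25 and 15 degrees) are assumptions picked purely for illustration, not measured figures.

```python
import math


def share_above(threshold_deg, mean_la_deg, spread_deg):
    """Share of a normal launch-angle distribution above the threshold."""
    z = (threshold_deg - mean_la_deg) / (spread_deg * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


HIGH_FLY_DEG = 25.0        # cutoff for "high" fly balls used in the text
ASSUMED_SPREAD_DEG = 25.0  # hypothetical standard deviation of launch angle
NARROW_SPREAD_DEG = 15.0   # hypothetical hitter with tighter bat control

for mean_la in (8, 12, 16, 20):
    wide = share_above(HIGH_FLY_DEG, mean_la, ASSUMED_SPREAD_DEG)
    narrow = share_above(HIGH_FLY_DEG, mean_la, NARROW_SPREAD_DEG)
    print(f"avg LA {mean_la:>2} deg: {wide:.1%} above 25 deg "
          f"(narrow band: {narrow:.1%})")
```

With the spread held fixed, every extra degree of average launch angle pushes more batted balls past the cutoff. Only a narrower band lets a hitter raise the average without paying for it in high fly balls.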

However, it is possible to imagine a super-hitter who has such good bat control that his band is very narrow. The best example of this might be Daniel Murphy, who managed to have a 34% GB rate with just a 40% FB rate (meaning a very high LD rate), and subsequently a very high (.345) BABIP over the last three years.

So we could indeed imagine a kind of “super Murphy” who hits 25% grounders with lower than 45% FBs. However, to date, we have not seen a guy sustaining such high LD rates; that guy would probably have to have superhuman bat control (which probably eliminates almost all >25% K rate guys). But with modern training methods, who knows what might happen.


Should You Even Draft a Catcher in Fantasy Baseball?

If you play in a traditional 12-team 5×5 roto auction league with 25-man rosters and a $200 FA budget per season, you might constantly feel like there is solid waiver-wire talent out there, but your roster is too stacked to cut anyone. So, you offer your league-mates a trade of two or three mediocre players for one of their better players, but they are facing a similar roster crunch and immediately see right through your pernicious plan. It can be tempting to cut the lowest-production, lowest-upside player on your roster, which in many cases is the $1 catcher you drafted. But is that catcher really providing value to your roster? Let’s break it down.

Let’s say you draft Realmuto this year for $10 and expect a line of 13 HR, 53 R, 58 RBI, 7 SB, .275 AVG (Steamer projected line, ~500 PA).  The other cost of drafting Realmuto is the opportunity cost of his roster spot. In a typical fantasy week, there are three or four days where your typical starting lineup is not intact. Whether it’s because a team is having an off-day or one of your regular starters is DTD with a bruised toe, holes in your lineup are bound to happen. A smart streamer can look for good matchups and plug those holes. If you have unlimited pickups allowed in your league, then there is no cost to picking up a player if you have an open roster spot. In my league, I can pick up players for $1 on free-agent days (M/W/F).

This raises the question: if you are streaming to fill in holes four times per week over 26 weeks of the regular season, and each game you plug in a streaming player you get 4 PA, then that is going to equal just over 200 PA and cost you around $78 FAAB (assuming three pickups per week * 26 weeks, and one of your streamed pickups fills holes twice in one week for a total of four fill-ins). What does a slash line of 200 PA for a waiver-wire bat look like?

Kevin Pillar screams waiver-wire bat. His Steamer projection reduced to 200 PA looks like: 5 HR, 25 R, 20 RBI, 5 SB, .270 AVG. That’s considerably worse than Realmuto’s line in every way except AVG. It amounts to a little less than 50% of Realmuto’s line at the cost of $78 FAAB. Now you could argue that maybe amidst all your streaming you end up picking up a Jonathan Villar 2016 breakout type of bat and end up sticking with him and getting immense value, but that’s easier said than done. Maybe you are also going to research pitcher vs. batter matchups on a daily basis and you get an edge there, but that is also easier said than done.

How does the 200 PA of Kevin Pillar compare to a $1 draft day, bottom of the barrel catcher’s line? Even poor Jonathan Lucroy is projected by Steamer to beat this line: 10 HR, 44 R, 46 RBI, 2 SB, .268 AVG. Other such luminaries projected to outshine it include Tucker Barnhart, Christian Vazquez, and Tyler Flowers. Pretty much any catcher who is a starter and can bat .250+ for a season will put up much better counting stats than the Pillar line.

Long story short — even though your catcher’s line may look meek, and they don’t play every day, making your roster look thin, it will still likely be better than waiver-wire lineup hole streaming. Better to save your FAAB cash for other needs. If you play in an unlimited transaction league, you would still need about 500 PAs of Pillar to exceed the Realmuto line. That’s a lot of transactions, and you might not have time to get all the necessary PAs in. Punting C is like heeding the siren calls — it can be very tempting, but also a dangerous and costly exercise. Staying the course with the catcher you drafted is usually the best call in terms of value per FAAB dollar spent.


Juan Nicasio Has a New Slider, and He Needs His Old One Back

The Mariners recently inked Juan Nicasio to a 2-year/$17-million deal in their first significant addition to their pitching staff this offseason. After years as a middling starter, Nicasio emerged as a rock-solid relief option with the Rockies in 2014 before the Dodgers fully bought into his potential as a reliever the following year. The Pirates then acquired him and shifted him into the rotation a bit in 2016; however, he had more success in their bullpen and moved there full-time in 2017. He was again on the move last year, though — this time playing for two new teams — but he never started a game, posting a cumulative 2.61 ERA over 72.1 IP in 76 appearances.

He’s on the wrong side of 30, and breakout relievers tend to pop up and decline quickly, but it can be argued that Nicasio has done nothing but improve since moving into the bullpen.

Juan Nicasio as RP IP ERA AVG OBP SLG wOBA
2014 20.2 3.48 .227 .275 .400 .300
2015 56.1 3.83 .257 .359 .381 .320
2016 55.2 3.88 .249 .328 .387 .308
2017 72.1 2.61 .216 .277 .333 .265

As a reliever, Nicasio is largely a two-pitch pitcher, primarily throwing a four-seam fastball and a slider. He had occasionally mixed in a sinker and changeup in previous years, but 2017 saw Nicasio throw a four-seam fastball or slider 98.31% of the time. This pitch mix in combination with his K/9 dipping from slightly over 10 to just under 9 may raise a couple eyebrows, but Nicasio also improved his command considerably.

His 6.9% BB% in 2017 was his lowest since his debut season and marked a second straight year of improvement, and his 24.7% K% compares well to previous years. This would suggest that Nicasio is only getting more efficient with his outs, not striking guys out at a lesser rate. And sure enough, his 1.08 WHIP last year was by far the lowest it’s ever been.

A quick look at his splits from 2017 showed a distinct improvement against left-handed batters compared to previous years.

Juan Nicasio vs. LHH IP AVG OBP SLG wOBA
2015 14.1 .359 .494 .516 .427
2016 21.0 .241 .351 .476 .350
2017 33.0 .205 .252 .292 .235

In his largest sample yet, Nicasio made huge strides.

Since improvements against opposite-handed batters tend to suggest an improvement in a pitcher’s changeup or breaking ball, and given that Nicasio essentially throws just two pitches, his slider seemed like a good starting point. I found that (per Brooks Baseball) it had an entirely different shape in 2017.

Juan Nicasio Sliders Velocity HMov VMov
2015 86.92 1.94 1.86
2016 87.11 1.49 2.80
2017 88.92 0.47 4.04

While Nicasio’s slider was laterally less impressive in 2017, it made up for that with reduced drop.

Here is his slider in 2016 with a little frisbee action.

Slider 2016.gif

And here it is in 2017 a bit more tightly wound.

Slider 2017.gif

Nicasio’s slider was devastating to right-handers in 2015 and 2016 (cumulative .218 wOBA/.221 xwOBA), but it seemingly fell into the swing path of lefties, as they smashed it for a .369 wOBA/.272 xwOBA in the same period. In 2017, lefties floundered against it for the first time, posting just a .194 wOBA/.175 xwOBA. But his other slider disappeared.

Using this somewhat cutter-like breaking ball against RHB in 2017 yielded a .302 wOBA and .320 xwOBA. Considering the fastball didn’t play up (.298 wOBA/.334 xwOBA), that kind of performance is a slight concern, but righties’ triple slash against him was still an encouraging .225/.296/.367 (.287 wOBA).

On the surface, the Mariners seem to have gotten a quality reliever at about market rate for his talent, but I think there is still some upside here. Certainly, in this new slider, Nicasio has found a legitimate weapon against LHB, but the Mariners must hope his natural slider is not lost. In order to remain a high-quality, high-leverage setup man — the kind that posts sub-3 ERAs — he’s going to have to bring out both.


The Modern Eras Committee Just Elected Bartolo Colon to the Hall of Fame

Jack Morris pitched 18 seasons while Bartolo Colon has now pitched 20. They both have a career winning percentage of .577. Morris has 2478 career strikeouts while Colon has 2454. Morris had 254 wins while Colon has 240 in an era where they are harder to obtain. Colon won a Cy Young while Morris’s highest finish was third. Morris has an ERA+ of 105, compared to Colon’s career ERA+ of 107. In fact, if you only looked at their first 15 years, Colon’s ERA+ of 114 outperforms Morris’s ERA+ of 109 even more!

Perhaps you strongly believe that, despite their statistical similarities, Jack Morris was significantly better than Bartolo Colon. Still, the fact that an argument could be made that Colon is as good a pitcher as Morris shows just how big a mistake the Modern Eras Committee made in electing Jack Morris to the Baseball Hall of Fame this weekend.

The list of players who were once viewed as “obviously not Hall of Famers” does not stop at Colon, either. In his article on ESPN.com, David Schoenfield said that it would be foolish to treat Morris as a benchmark for Hall of Fame induction. This argument is in defense of the Hall of Fame’s level of “rigor”: many think that without maintaining a certain level, the Hall of Fame may lose its significance. However, I believe there is another characteristic that the Hall of Fame must preserve even more so than rigor in order to maintain its credibility, and that is justice. If the Hall of Fame exposes itself as being discretionary in its election of members, it will quickly lose its relevance.

By electing Jack Morris to the Baseball Hall of Fame, voters both lowered the level of rigor previously required for election and left the Hall of Fame in a current state of injustice until the following eligible players are also elected: Curt Schilling, Mike Mussina, Andy Pettitte, Dave Stieb, Rick Reuschel, Orel Hershiser, David Cone, Sam McDowell, Luis Tiant, Kevin Brown, Vida Blue, Bret Saberhagen, and Kevin Appier.

And the following cases are re-opened for election: Dwight Gooden, David Wells, Jim Kaat, Tommy John, Wilbur Wood, Ron Guidry, Jimmy Key, Frank Tanana, Dennis Martinez, Mark Langston, Chuck Finley, Mark Buehrle, Frank Viola and Jose Rijo.

Every single one of these 30 pitchers had a higher career ERA+ than Jack Morris and has either a higher career value, a higher peak value, or both.

Looks like Colon may be able to hang up his cleats a little more confidently this off-season now that Morris is in the Hall.


Another Weird Charlie Blackmon-ism

Charlie Blackmon is an atypical human being.

For one thing, he is a professional baseball player, meaning he is in the extreme upper echelon of athletic ability. But he is atypical even in his personal life, and his recent success has only highlighted his eccentric personality. He still drives a 2004 Jeep Grand Cherokee that he got in high school. He once had to be rescued on the side of the highway by DJ LeMahieu when he ran out of gas. He buys his clothes from Amazon. And of course, he is easily recognized by his impressive beard-and-mullet combo (the latter of which is pronounced “mu-lay” according to Blackmon).

Based on all his quirks, it should be no surprise just how unique his major-league career has been. He didn’t see regular playing time until his age-28 season, an age when some guys are already entering free agency. Despite this late start, he has steadily grown into an MVP candidate. In 2014, his first full season, Blackmon posted 2.0 fWAR, the exact threshold for a starting caliber player. In the three subsequent seasons, he posted an fWAR of 2.3, 4.1, and 6.5. I thought it seemed rather strange to have back-to-back seasons with ~2 WAR improvement, so I went to the leaderboards.

I searched for all batters with a minimum of 400 PAs in each of the past three seasons, producing a sample of 111 players. Then, I calculated the difference between each player’s 2015 WAR and 2016 WAR, and did the same for 2016 to 2017. This gave me two year-to-year improvements for each player, and I threw both values onto the scatter plot below, with Blackmon highlighted in purple.

2015-2017 WAR Improvements
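The year-to-year calculation behind that plot can be sketched in a few lines of Python. This is only an illustration of the method described above, not the actual leaderboard query: Blackmon's WAR values are the ones quoted in this article, while the other two players and their numbers are made up so the comparison has something to work with.

```python
# Season WAR by player. Only Blackmon's values come from the article;
# "Player A" and "Player B" are hypothetical filler for illustration.
war_by_season = {
    "Charlie Blackmon": {2015: 2.3, 2016: 4.1, 2017: 6.5},
    "Player A (made up)": {2015: 3.0, 2016: 2.1, 2017: 3.4},
    "Player B (made up)": {2015: 1.2, 2016: 2.9, 2017: 2.5},
}


def yearly_improvements(seasons):
    """Return (2015->2016 change, 2016->2017 change) for one player."""
    first = round(seasons[2016] - seasons[2015], 1)
    second = round(seasons[2017] - seasons[2016], 1)
    return first, second


# One (x, y) point per player, as on the scatter plot.
improvements = {
    name: yearly_improvements(seasons)
    for name, seasons in war_by_season.items()
}

# Players who improved in both year-over-year windows (top-right quadrant).
back_to_back = [
    name for name, (first, second) in improvements.items()
    if first > 0 and second > 0
]

for name, point in improvements.items():
    print(name, point)
print("Improved twice in a row:", back_to_back)
```

Swapping wRC+ in for WAR in the same dictionary would give the points for the second plot.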

Players generally don’t see improvements like this in back-to-back seasons; Blackmon is about as far to the top-right as you can get in this plot. Of course, value can come from many different places, and a player might make large defensive improvements one year and large offensive improvements the next. While Blackmon did see some improvement in his defensive metrics this season, the bulk of his improvements have come while batting. To get the following plot, I followed the same method as above, this time for wRC+.

2015-2017 wRC+ Improvements

Again, we see Blackmon floating towards the top right. Baseball is a game of adjustments, and if a batter enjoys a period of success, pitchers will generally approach him differently to gain an advantage. This is why players generally go through cycles, following the push and pull of the game. The past few years, Blackmon seems to be part of a small group of players who have been immune to this tug-of-war effect. He has stayed one step ahead of the pitchers, not only maintaining his gains but improving upon them as time goes on.

How has he found these improvements? Between 2015 and 2016, his walk rate and strikeout rate remained fairly constant, so he must have been getting much better results on balls in play. Sure enough, his batting average increased by 37 points and his ISO increased by 65 points, giving him 49 extra points of wOBA overall. At the time, Jeff Sullivan looked under the hood and found that Blackmon’s GB% was trending downward, and he had been attacking the low strike more so than ever before. Presumably, he realized that his swing path was conducive to driving low pitches into the air, and that balls in the air are more valuable, so he made the adjustment and enjoyed a power spike.

That all makes sense, but it raises the question: how did he improve even more in 2017? If he doubled down on the fly-ball revolution, he risked becoming Ryan Schimpf or Trevor Story.

Much to my surprise, the opposite happened – his GB% actually returned to his career average. He increased his rate of ground balls, but he still managed to raise his ISO by another 42 points. Before you cry BABIP or Coors Field, I’ll briefly note that in both years, his wOBA and xwOBA increased by approximately the same amount, so something real is going on here. In this case, I think he was finding more success on batted balls based on the pitches he didn’t put in play. Stay with me here.

In 2017, Blackmon’s strikeout rate rose by about 2.5%. This is what people in the industry call “not good,” but hold on, his walk rate also rose by…about 2.5%. This isn’t a player who suddenly developed a swing-and-miss problem to sell out for power; this is a player who is intentionally going deeper into counts. When a batter is more selective about the pitches he goes after, he is putting fewer balls in play in early counts, which leads to an increase in both walks and strikeouts simultaneously.

Let’s look at it a different way: Z-swing% measures the percentage of pitches inside the zone that a player swings at, and O-swing% measures the percentage of pitches outside the zone that a player swings at. Generally speaking, you want to swing at strikes and take balls, so you want your Z-swing% to be higher than your O-swing%; the larger the difference, the better your plate discipline.

In 2016, the difference between Blackmon’s Z-swing% and O-swing% ranked in the 9th percentile – he’s always been a bit of a free-swinging leadoff hitter. But in 2017, that difference increased by 4.7%, pushing him into the 26th percentile. While he’s still more aggressive than average, he has become decidedly less so, being more selective about the pitches he attacks and remaining comfortable in deep counts. By swinging at the right pitches, he’s able to avoid the easy outs that result from poor contact on pitches outside of the zone.

We have every reason to believe that Charlie Blackmon just had a career year, and he will never sniff an MVP race again in his career. But then again, we had every reason to believe the same thing last year. When it comes to Charlie, I have some advice: if you expect him to do something, he’s probably getting ready to do the exact opposite. It’s about time we stop trying to figure him out.


Improving WPS

“All happy families are alike; each unhappy family is unhappy in its own way.”  — L. Tolstoy

You can say something similar about baseball games. All boring games are alike; but exciting games are interesting in their own ways. Every boring game has one team building up a big early lead, which is never threatened. But there are many ways to have an exciting game: the pitcher’s duel, the slugfest, the late-inning comeback, extra innings, all in various combinations. And in between them are the bulk of games that are simply ordinary.

All of which makes ranking exciting games a tricky process, at least compared to ranking how boring they are. How does one compare Game 7 of the 1991 WS (1-0 in 10 innings) to Game 4 of the 1993 WS (15-14 in 9 innings) on the same scale? They’re great in different ways.

Back in 2005 I created a system to do just that, a rating system based simply on the runs scored in the line score. I may have been the Christopher Columbus of that new world. And ranking the games allows you to rate post-season series-es.

The line-score system did work in the sense that it could tell the difference between a great game and a good one, and between a good one and an ordinary one. But while the line score gives you the basic outline of the game, it is blind to the details of what happens DURING each inning. Zero runs scored in the top of the 1st rates exactly the same whether there were three pop-ups or three singles followed by a triple play.

Eventually I realized that Baseball-Reference.com (ALL HAIL BBREF) has the play-by-play data for all playoff games, which includes a probability of victory after each play (anything that changes the outs, baserunners or score). Plotted, you can easily see if a game was good: it looks like an earthquake. If it was bad, it looks like the EKG of a corpse. Using those probabilities, we can create a much more accurate game rating. I fiddled with many rating schemes over the last 10 years before settling on one that seems both conceptually simple and that yields reasonable results.

Of course, by then I had been beaten to the basic concept by Dave Studeman (WPA) and Shane Tourtellotte (WPS). Twelve years is too long for laurels resting.

WPA = Sum(change in probability between plays)

Modified WPS = Sum(change in probability between plays) + top three plays + Final play
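As a rough sketch, the two formulas above could be computed like this in Python. Two things here are my assumptions rather than anything taken from the original data: each change is taken as an absolute value (otherwise the ups and downs would cancel), and the win-probability series is invented purely to exercise the functions.

```python
def play_changes(win_probs):
    """Absolute change in win probability from each play to the next."""
    return [abs(after - before)
            for before, after in zip(win_probs, win_probs[1:])]


def base_sum(win_probs):
    """The plain sum of changes (the figure labelled WPA in this article)."""
    return sum(play_changes(win_probs))


def modified_wps(win_probs):
    """Base sum, plus the three biggest plays again, plus the final play again."""
    changes = play_changes(win_probs)
    if not changes:
        return 0.0
    top_three = sum(sorted(changes, reverse=True)[:3])
    last_play = changes[-1]
    return sum(changes) + top_three + last_play


# Hypothetical home-team win probabilities after each play of a tiny "game".
home_win_prob = [0.50, 0.55, 0.45, 0.70, 0.30, 1.00]

print(round(base_sum(home_win_prob), 2))      # sum of the five swings
print(round(modified_wps(home_win_prob), 2))  # with top-three and last-play bonuses
```

Note that when the final play is also one of the three biggest, it ends up counted three times in total, which is exactly the behavior discussed for walk-off plays.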

What I have developed is similar to their work, but I think it has some small advantages. Generally, my ratings will be quite close to Shane’s (R-squared > 99.5%). He correctly realized that simply summing the probabilities doesn’t quite work, which is why he modified it. An example…

There are seven post-season games with a WPA of exactly 4.52. Among them are:

1995 NLCS Game 2

Braves beat the Reds 6-2 in ten innings.

95 Plays, 13 plays changed the odds by at least 10%

Top play: a Mark Portugal bases-loaded wild pitch, +18%

70 plays with the odds in the 30% to 70% range

compared to

1960 WS Game 7

Pirates 10 Yankees 9 in nine innings

77 plays, 15 plays changed the odds by at least 10%

Of those 4 changed the odds by at least 20%

Of those 3 changed the odds by at least 30%

Of those 1 changed the odds by more than 50%

25 plays with the odds in the 30% to 70% range

 

There is simply no way those games are equal. The 1960 game has five different plays better than any play in the 1995 game. The 1995 game makes up the ground by (1) having 18 more plays (2) having fewer plays where nothing happened because the game was usually within one run.

1960 is still better because one +40% play is more exciting than two +20% plays. Bill Mazeroski’s game-ending homer rates as +37%. Bobby Richardson’s game-starting line-out rates at +2%. Simple summing makes a walk-off homer the equal of about 3 ½ innings with zero hits. NOPE. WRONG.

Shane accounted for this with his modified method. By counting the top three plays twice and Mazeroski’s walk-off homer three times, the ratings are now

1960: 6.49

1995: 5.19

And science prevails.

Of course, there is nothing magical about TOP THREE plays or LAST play. You could try using the top five plays and last five plays (believe me, I did).  But I do think that using Top-3 + Last can sometimes lead you astray. I will now present exhibits A and B to demonstrate where it can swing and miss.

Exhibit A: 1988 WS Game 1

Exhibit B: 1985 NLCS Game 6

I expect you to know them. The two biggest home runs in terms of changing the odds in post-season history courtesy of Mr. Clark and Mr. Gibson.

1985: WPA 4.48 in 83 plays and 9 innings

1988: WPA 3.98 in 82 plays and 9 innings

The 1985 game had more action with roughly the same number of plays, which you can easily see in the line scores:

StL          0              0              1              0              0              0              3              0              3              (7)

LA           1              1              0              0              2              0              0              1              0              (5)

 

Compared to

 

Oak        0              4              0              0              0              0              0              0              0              (4)

LA           2              0              0              0              0              1              0              0              2              (5)

 

The ‘85 game has a game tie in the 7th, broken tie in the 8th and lead change in the 9th

The ‘88 game has a lead change in the 2nd and a lead change in the 9th

Modified WPS says

1985: 4.48 + 1.34 + 0.01 = 5.83 (Tied for 94th best game)

1988: 3.98 + 1.43 + 0.87 = 6.28 (Tied for 58th best game)

I don’t think you can argue that the 1988 game is much better than the 1985 game; I don’t think it’s a better game at all. And it’s the last-play bonus that is to blame. Had the 1985 game been played in St. Louis then Clark’s homer would have been a walk-off and the game would have rated 6.56, well ahead of the 1988 game.
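The arithmetic here is easy to check. The sketch below rebuilds both Modified WPS scores from their three components and then swaps in Clark's homer (.74) as the last play to reproduce the hypothetical St. Louis walk-off. It uses the 3.98 base figure listed for the 1988 game in the ranking tables further down, since that is the value that sums to 6.28; the dictionary layout and function name are just my own scaffolding.

```python
# Components quoted in the article: base sum, top-three bonus, last-play bonus.
games = {
    "1985 NLCS G6": {"base": 4.48, "top_three": 1.34, "last_play": 0.01},
    "1988 WS G1": {"base": 3.98, "top_three": 1.43, "last_play": 0.87},
}


def modified_wps_from_parts(parts):
    """Add the three components and round to two decimals, as in the text."""
    return round(parts["base"] + parts["top_three"] + parts["last_play"], 2)


scores = {name: modified_wps_from_parts(parts) for name, parts in games.items()}

# What-if: the 1985 game is played in St. Louis, so Clark's homer (.74)
# becomes the final play instead of a routine last out (.01).
walk_off_1985 = dict(games["1985 NLCS G6"], last_play=0.74)
what_if_score = modified_wps_from_parts(walk_off_1985)

print(scores)
print("1985 with a walk-off:", what_if_score)
```

The whole 0.45 gap between the two games comes from the last-play term, and it flips as soon as the venue changes.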

If you think about it, a last-play bonus is biased towards games won by the home team. If the home team loses, the last play will rarely amount to anything.

When the home team loses, the last play has been worth at least 20% only 23 times. When the home team wins, it is at least 20% 122 times.

When the home team loses, it has been worth at least 30% only 11 times. When the home team wins, it is at least 30% 96 times.

I also know this because I tried last play, last five plays, and last ten plays in trying to construct a rating system. I also tried top five plays, top ten plays, all plays over 10%, WPA – .03 per play (yielding the bizarre result of games with negative excitement).

Eventually I tried a simple power transformation on EVERY play. First, I tried summing the squares of the probability changes, like any good statistician would.

When I did that, the 1985 game rated 10th and the 1988 game rated 5th. Which is the wrong order, and both games are just rated too high. Then I tried other powers…the Goldilocks approach, looking for the one that was just right.

 

Power    1985 game rank    1988 game rank
2.0      10th              5th
1.9      12th              8th
1.8      15th              20th
1.7      23rd              25th
1.6      32nd              36th
1.5      38th              51st
1.4      53rd              76th
1.3      61st              104th
1.2      79th              133rd
1.1      100th             158th
1.0      116th             185th

Everything above 1.7 was eliminated since it rated 1988 better than 1985

 

Here’s some shorthand I’m going to use:

Game 6 of the 1985 NLCS: STL 7, LA 5 in 9 innings — WPA 4.48 (9-4-2-1)

Game 1 of the 1988 WS: LA 5, OAK 4 in 9 innings — WPA 3.98 (5-2-2-1)

The 1985 game had 9 plays rated>= 0.1, 4 plays rated>=0.2, 2 plays rated>=0.3 and 1 play rated >=0.5

The 1988 game had 5 plays rated>= 0.1, 2 plays rated>=0.2, 2 plays rated>=0.3 and 1 play rated >=0.5

For a sense of scale, the average game is WPA 2.67 (4.89-0.88-0.33-0.03)

(You can check the examples listed below on BBRef to get more detail on each game)

 

Checking 1.7, both exhibits rated higher than

Game 2 of the 2017 WS: HOU 7, LA 6 in 11 innings — WPA 5.30 (10-5-3-0)

Game 1 of the 2015 WS: KC 5, NYM 4 in 14 innings — WPA 6.36 (16-3-1-0)

1.7 weights the big plays too much

 

Checking 1.6, both test games rated higher than

Game 6 of the 1986 WS: NYM 6, BOS 5 in 10 innings — WPA 5.14 (16-3-3-0)

Game 6 of the 1986 NLCS: NYM 7, HOU 6 in 16 innings — WPA 5.80 (11-3-2-0)

1.6 weights the big plays too much

 

Checking 1.5,

the 1985 game rated higher than

Game 6 of the 1986 WS: NYM 6, BOS 5 in 10 innings — WPA 5.14 (16-3-3-0)

The 1988 game rated higher than

Game 4 of the 2001 WS: NYY 4, ARI 3 in 10 innings — WPA 4.58 (10-3-2-0)

1.5 weights the big plays too much, but it’s getting hard to find clear mistakes

 

Checking 1.4,

the 1985 game rated higher than

Game 3 of the 1976 NLCS: CIN 7, PHI 6 in 9 innings — WPA 4.72 (14-3-2-0)

Lead changes in the 7th, 8th and 9th innings.

The 1988 game rated higher than

Game 4 of the 1986 ALCS: CAL 4, BOS 3 in 11 innings — WPA 4.64 (7-4-2-0)

1.4 weights the big plays too much, but I’m now splitting hairs

 

Checking 1.3, I like this one. Let me check 1.2

 

Checking 1.2,

the 1985 game rated lower than

Game 2 of the 1996 ALDS: NYY 5, TEX 4 in 12 innings — WPA 5.02 (8-2-0-0)

Game 2 of the 1990 WS: CIN 5, OAK 4 in 10 innings — WPA 4.50 (10-2-0-0)

1.2 weights the big plays too little. Famous games are losing to games without any highlights.

 

So, I think 1.3 is the sweet spot.

My rating score is = Sum((change in probability between plays)^1.3) *2

The *2 at the end is purely cosmetic. It allows the very best game to score close to ten.
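Here is a minimal Python sketch of that rating. The function name, the default arguments, and the two toy "games" are mine, not part of the real data set. The toy games have the same plain sum, so the only thing separating them is how the swings are distributed.

```python
def power_rating(changes, power=1.3, scale=2.0):
    """Sum each play's absolute win-probability change raised to `power`,
    then multiply by the cosmetic `scale` factor."""
    return scale * sum(abs(change) ** power for change in changes)


# Two made-up stretches of play with the same plain sum of 0.40.
one_big_play = [0.40]
two_small_plays = [0.20, 0.20]

print(round(power_rating(one_big_play), 3))
print(round(power_rating(two_small_plays), 3))

# With power=1.0 the rating collapses back to twice the plain sum,
# and the two stretches are indistinguishable again.
print(power_rating(one_big_play, power=1.0),
      power_rating(two_small_plays, power=1.0))
```

Any power above 1 rewards concentration of excitement into fewer plays; 1.3 is simply the value that gave the most sensible orderings in the checks above.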

 

With base WPA, Gibson’s homer (.87) is worth about 25x a normal play (.035). With WPS it’s worth about 75x a normal play. Raising all the plays to the 1.3 power means that Gibson’s homer is now worth about 65x a typical play.

With base WPA, Clark’s homer (.74) is worth about 21x a normal play (.035). With WPS it’s worth about 42x a normal play. Raising all the plays to the 1.3 power means that Clark’s homer is now worth about 53x a typical play.

With a little algebra,

WPA:  Gibson = 1.18 * Clark

WPS: Gibson = 1.76 * Clark

Power 1.3: Gibson = 1.23 * Clark

A nice property of the transformation is that when the change in odds doubles, the play is worth about two and a half times as much (2.46x)
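For anyone who wants to check the multipliers, the short Python sketch below reproduces them from the three numbers quoted above (.87, .74 and a typical play of .035). The one assumption baked in is how Modified WPS counts each homer: Gibson's is counted three times (base, top three, last play) and Clark's twice (base, top three), which matches the 75x and 42x figures.

```python
POWER = 1.3
TYPICAL_PLAY = 0.035  # "normal play" size used in the article
GIBSON = 0.87         # 1988 WS G1 homer
CLARK = 0.74          # 1985 NLCS G6 homer


def multiple(play, baseline, power=1.0):
    """How many times more a play is worth than a baseline play."""
    return (play / baseline) ** power


# Worth of each homer relative to a typical play, under each scheme.
gibson_base = multiple(GIBSON, TYPICAL_PLAY)
gibson_wps = 3 * GIBSON / TYPICAL_PLAY      # counted three times
gibson_power = multiple(GIBSON, TYPICAL_PLAY, POWER)

clark_base = multiple(CLARK, TYPICAL_PLAY)
clark_wps = 2 * CLARK / TYPICAL_PLAY        # counted twice
clark_power = multiple(CLARK, TYPICAL_PLAY, POWER)

# Gibson relative to Clark, and the doubling factor.
ratio_base = multiple(GIBSON, CLARK)
ratio_wps = gibson_wps / clark_wps
ratio_power = multiple(GIBSON, CLARK, POWER)
doubling_factor = 2 ** POWER

print(round(gibson_base), round(gibson_wps), round(gibson_power))
print(round(clark_base), round(clark_wps), round(clark_power))
print(round(ratio_base, 2), round(ratio_wps, 2), round(ratio_power, 2))
print(round(doubling_factor, 2))
```

The doubling factor does not depend on the size of the play, which is why the same 2.46 applies at every level.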

 

EXCITEMENT IS NOT LINEAR

 

A 10% play is now worth 2.46 times as much as a 5% play

A 20% play is now worth 2.46 times as much as a 10% play

A 50% play is now worth 2.46 times as much as a 25% play

The system has a single parameter applied to ALL plays, so a game isn’t screwed if it has four great plays or the best play comes in the 8th inning. Ranking games this way, here are the five games better than, and worse than, my two test cases.

 

Series         Road team      Home team      Inn.  (WPA^1.3)*2  WPA   Top Play  # Plays  P>=.1  P>=.2  P>=.3  P>=.5
2014 ALCS G1   Royals 8       Orioles 6      10    5.00         5.14  35.0%     96       13     3      2      –
1935 WS G3     Tigers 6       Cubs 5         11    4.97         5.02  36.0%     96       15     5      1      –
1976 NLCS G3   Phillies 6     Reds 7         9     4.95         4.72  46.0%     82       14     3      2      –
2015 ALDS2 G2  Rangers 6      Blue Jays 4    14    4.93         5.46  37.0%     115      7      2      1      –
1997 ALCS G4   Orioles 7      Indians 8      9     4.92         4.92  38.0%     88       16     4      1      –
1985 NLCS G6   Cardinals 7    Dodgers 5      9     4.92         4.48  74.0%     83       9      4      2      1
1975 NLCS G3   Reds 5         Pirates 3      10    4.88         4.52  55.0%     81       14     3      3      1
1933 WS G4     Giants 2       Senators 1     11    4.87         4.94  55.0%     92       9      3      1      1
2011 ALCS G2   Tigers 3       Rangers 7      11    4.86         5.10  34.0%     92       13     3      1      –
2012 ALDS2 G2  Athletics 4    Tigers 5       9     4.86         4.86  41.0%     85       11     4      1      –
1999 NLCS G6   Mets 9         Braves 10      11    4.85         5.12  26.0%     108      14     3      –      –

 

 

Series         Road team      Home team      Inn.  (WPA^1.3)*2  WPA   Top Play  # Plays  P>=.1  P>=.2  P>=.3  P>=.5
1952 WS G5     Dodgers 6      Yankees 5      10    4.51         4.70  44.0%     92       10     4      1      –
1923 WS G1     Giants 5       Yankees 4      9     4.51         4.54  40.0%     78       12     2      2      –
1984 NLCS G4   Cubs 5         Padres 7       9     4.51         4.54  37.0%     83       10     4      2      –
1992 WS G2     Blue Jays 5    Braves 4       9     4.50         4.40  65.0%     85       11     1      1      1
1998 ALCS G2   Indians 4      Yankees 1      12    4.48         4.78  33.0%     96       11     3      1      –
1988 WS G1     Athletics 4    Dodgers 5      9     4.47         3.98  87.0%     82       5      2      2      1
2000 NLCS G2   Mets 6         Cardinals 5    9     4.46         4.66  32.0%     91       13     3      2      –
2016 NLDS2 G5  Dodgers 4      Nationals 3    9     4.46         4.66  21.0%     84       14     1      –      –
1977 WS G1     Dodgers 3      Yankees 4      12    4.45         4.80  30.0%     97       11     2      1      –
1954 WS G1     Indians 2      Giants 5       10    4.43         4.74  29.0%     89       11     1      –      –
1958 WS G1     Yankees 3      Braves 4       10    4.43         4.56  40.0%     88       10     3      2      –

 

 

I hope you’ll look at these and see that while they have different shapes, they all contain a similar ‘volume’ of excitement.

Another way to evaluate the method is to look at games with the same WPA. Going back to where I began in this article, here are the seven games with a base WPA of 4.52 (No promises that BBRef has not revised the scores since I captured the data…). They are each tied for the 108th highest WPA. But after using the 1.3 power factoring, you get this:

Game           Outcome                  Rank  (WPA^1.3)*2  WPA   # Plays  Top 5 Plays  # plays 30-70%  P>=.1  P>=.2  P>=.3  P>=.5
1960 WS G7     Pit 10 NYY 9 in 9        52    5.10         4.52  77       1.74         25              15     4      3      1
1975 NLCS G3   Cin 5 Pit 3 in 10        63    4.88         4.52  81       1.60         49              14     3      3      1
1911 WS G3     A’s 3 Giants 2 in 11     110   4.41         4.52  86       1.10         58              15     3      1      –
1998 NLCS G1   SD 3 Atl 2 in 10         117   4.36         4.52  84       1.10         59              11     2      1      –
2011 NLDS2 G5  Ari 3 Mil 2 in 10        119   4.35         4.52  85       1.05         71              13     2      1      –
1926 WS G5     NYY 3 StL 2 in 10        130   4.20         4.52  86       0.84         66              16     1      –      –
1995 NLCS G2   Atl 6 Cin 2 in 10        139   4.12         4.52  95       0.75         70              13     –      –      –

 

1960 gets the love it deserves, moving up 56 spots to the 52nd best game. That is despite having the fewest plays in the 30%-70% victory range. Games with more plays do worse since that means they have smaller impact plays on average. Think of the Top 5 plays as the highlight reel for the game. 1995 NLCS Game 2 has no play >0.2 and therefore drops 31 spots in the rankings.

Adjusted WPS? Weighted WPS? Power WPS? I really do need to give it a proper name.

 

A final example, from among the greatest playoff games ever.

2000 NLDS G3: Mets 3, Giants 2 in 13 innings — ModWPS Rank = 11, PowerWPS Rank = 22

1986 ALCS G5: Red Sox 7, Angels 6 in 11 innings — ModWPS Rank = 22, PowerWPS Rank = 12

1980 NLCS G5: Phillies 8, Astros 7 in 10 innings — ModWPS Rank = 25, PowerWPS Rank = 14

 

The 2000 game had the higher WPS, partly because it had more plays. ModWPS likes it more due to the additional action and the walk-off homer, which the better top-three plays in 1980 and 1986 could not overcome.

 

Year   WPS   Plays  Last Play  Top-3  ModWPS
2000   6.34    109       0.42   0.98    7.74
1986   5.86     97       0.05   1.42    7.33
1980   6.06     93       0.04   1.11    7.21
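As a quick check on the table, the three rows are consistent with ModWPS being WPS plus the last play plus the top-three sum. That relationship is inferred from these numbers only, not a definition given here, so treat the sketch below as a sanity check and not as the official formula.

```python
# (WPS, last play, top-3 sum, listed ModWPS), copied from the table above.
games = {
    2000: (6.34, 0.42, 0.98, 7.74),
    1986: (5.86, 0.05, 1.42, 7.33),
    1980: (6.06, 0.04, 1.11, 7.21),
}


def mod_wps(wps, last_play, top3):
    """Assumed form: base WPS plus bonuses for the final play and the top three plays."""
    return wps + last_play + top3


for year, (wps, last_play, top3, listed) in games.items():
    rebuilt = round(mod_wps(wps, last_play, top3), 2)
    print(year, rebuilt, "matches listed value" if rebuilt == listed else "differs")
```

The 0.42 walk-off counts twice under this reading, once as the last play and once inside the top three, which goes a long way toward explaining why the 2000 game comes out on top.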

 

So why do I think 1986/1980 are better?

Because the deeper you go beyond the top three, the better the other two are revealed to be.

 

                        2000            1986            1980
Sum of Top-5 Plays      1.28            1.94            1.61
Top-5 Plays             42-31-25-16-14  73-35-34-32-20  40-38-35-26-24
Sum of Top-10 Plays     1.88            2.77            2.43
10%-20%-30%-50% plays   16-3-2-0        14-5-4-1        17-6-3-0
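The comparison above boils down to two small operations on a game's list of play swings: summing the biggest N, and counting how many clear a threshold. Here is a rough sketch using only the five top plays listed for each game. The helper names are mine, and because the listed plays are rounded to whole percentage points, a rebuilt sum may differ from the listed one by a point or two.

```python
def top_n_sum(swings, n):
    """Sum of the n largest win-probability swings."""
    return sum(sorted(swings, reverse=True)[:n])


def count_at_least(swings, threshold):
    """Number of plays whose swing is at least the threshold."""
    return sum(1 for swing in swings if swing >= threshold)


# Top-5 plays per game, from the table above, as fractions of a win.
top5 = {
    2000: [0.42, 0.31, 0.25, 0.16, 0.14],
    1986: [0.73, 0.35, 0.34, 0.32, 0.20],
    1980: [0.40, 0.38, 0.35, 0.26, 0.24],
}

for year, swings in top5.items():
    print(year,
          "top-3:", round(top_n_sum(swings, 3), 2),
          "top-5:", round(top_n_sum(swings, 5), 2),
          "plays >= .3:", count_at_least(swings, 0.3),
          "plays >= .5:", count_at_least(swings, 0.5))
```

With only five plays per game, the 30% and 50% counts are complete because the fifth-best play is already below 30% in every case. The 10% and 20% counts would need the full play-by-play.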

 

Or simply check the line scores.

2000

0 0 0 2 0 0 0 0 0 0 0 0 0 (2) Giants

0 0 0 0 0 1 0 1 0 0 0 0 1 (3) Mets

1986

0 2 0 0 0 0 0 0 4 0 1         (7) Red Sox

0 0 1 0 0 2 2 0 1 0 0         (6) Angels

1980

0 2 0 0 0 0 0 5 0 1             (8) Phillies

1 0 0 0 0 1 3 2 0 0             (7) Astros

 

The 2000 game IS a fabulous game. But the 1986 and 1980 games are more epic, with all the late-inning heroics. The 2000 game has exactly the required three big plays and the walk-off. It checks all the boxes.

I do kinda feel bad writing this. It sounds like I’m just picking on modified WPS here. LOOK AT WHAT ELSE IT GOT WRONG…

But as I said before, Power WPS is barely better. And to show that it’s better at all, I need to show those rare cases where it makes a better call. Modified WPS was also an excellent benchmark: comparing the differences between it and my sixty-eleven schemes helped me identify the flaws in sixty-ten of them.

Of course, even this is not the perfect system. Any play-by-play method will still fail to capture the in-play action. A bases-empty foul pop-out rates exactly the same as a bases-empty thrown-out-at-home-trying-to-stretch-a-triple. But it is the best we can do for now.

Whereas I used to guess my line score method captured maybe 70% of the excitement of a game, PBP ratings must be capturing upwards of 90%. Which means greater confidence in game rankings and playoff series ratings.

Anyway, if anyone has any thoughts, feedback, or questions, I’d love to hear them. If no one can shoot the idea full of holes, or even one hole, then next come the rankings and lists of games and series.