Archive for Uncategorized

Can Ohtani Optimize His Hitting Value in the AL?

When Shohei Ohtani, AKA the Japanese Babe Ruth, was deciding which MLB team he wanted to join, the bet was that it would be an AL team, because of the possibility of batting as a DH when he wasn’t pitching. While he would get some guaranteed PA as a pitcher in the NL, plus probably some more as a pinch hitter, the total would likely have been no more than about 200 in a season. Hitting as a DH, he could theoretically bat every game except the ones he was pitching (because if he were removed from the mound, his replacement would have to hit or be taken out for a pinch hitter). While in practice it’s expected he won’t play every day between pitching starts, he has the possibility of getting, say, 300-400 PA as a DH.

Sure enough, Ohtani chose the AL Angels. No one knows exactly how much he will bat this year, but Steamer projections give him 65 games and 259 PA. Since he’s projected to start 24 games as a pitcher, his batting projections represent about half of the games he’s expected to have available when he isn’t starting (162 – 24 = 138). The hope, obviously, is that Ohtani can provide value with his bat as well as his arm.

But based on Steamer, the bat will not be nearly as productive. While he’s projected to produce 3.1 WAR as a pitcher, as a hitter the expectation is only .5 WAR. After all the talk about how good a hitter Ohtani may potentially be, this seems disappointing. Of course, 259 PA is less than a full season’s worth of hitting, but even if he were able to hit for about a full season — say, 650 PA — and maintained the same rate stats, he would be worth only about 1.3 WAR. That would actually make him a below average player.

Even allowing for the fact that he is a pitcher, his projected WAR value still doesn’t seem that impressive. To put it in perspective, Madison Bumgarner, widely recognized as one of the best hitting pitchers currently playing, had 0.5 WAR last year, despite missing about half the season with an injury. In fact, he produced that 0.5 WAR with just 36 PA, less than 15% of Ohtani’s projected total. And last year was not even Bumgarner’s best as a hitter. His wRC+ was 86, excellent for a pitcher, but in 2014 his wRC+ was 114, and he produced 1.3 WAR in just 78 PA. As we’ve just seen, that’s as much WAR as Ohtani would be projected to achieve in a full season of 650 PA.

Pitcher Hitting is Far More Valuable than DH Hitting

Why is Ohtani’s projected WAR as a hitter so low? It’s not because he’s expected to perform poorly with the bat. His projected wRC+ is 113, just about the same as Bumgarner’s best, and historically good. Since WWII, only 32 pitchers have exceeded that value for a season (minimum 70 PA). And those were career years, whereas Ohtani if anything would be expected to improve his hitting as he matures. In fact, since the live ball era began, only one pitcher has reached a career wRC+ of 100 (minimum 1000 PA): Wes Ferrell, who hit that number exactly. Since WWII, the highest career wRC+ by a pitcher is 81 by Bob Lemon, or 87 by Don Newcombe, who barely misses the 1000 PA minimum; only one other pitcher has even reached 60. Among active pitchers (minimum PA: 300), only Zack Greinke (54) and Bumgarner (51) are > 50, though Bumgarner has been a little over 90 for the past four years.

So Ohtani is projected to be an exceptionally good-hitting pitcher. His WAR problem is the result of playing the DH position. WAR, of course, measures a player’s production relative to other players at that position. The DH generally is one of the best hitters on the team, since any player who is a good hitter can fill that role; it doesn’t matter if he’s a disaster at any defensive position. Pitchers, in contrast, are almost always by far the worst hitters on the team.

Calculating WAR involves summing four values: batting runs + positional runs + replacement runs + league runs. The total is then divided by runs/win, which is currently very close to 10.0. Different amounts of positional runs are assigned to different positions, with pitchers getting by far the largest benefit, and DHs the worst. As of 2017, the positional run value for pitchers was about .119 R/PA*. This is about the same as the league average R/PA, reflecting the view that a replacement level pitcher will produce essentially no runs at all.

Thanks to the positional runs, Bumgarner got a big boost this past season, despite being a below league average hitter with just 36 PA:

36 PA x -.017 = – 0.7 batting runs

36 PA x .119 = 4.3 positional runs

36 PA x .0305 = 1.1 replacement runs

36 PA x .0015 = 0.1 league runs

Total = 4.8 runs (.5 WAR)

The value of about -.017 R/PA for batting runs is based on a league average R/PA value of .122, and Bumgarner’s wRC+ of 86, or 14% less than average: .122 x -.14 = -.017. The other values can be determined by dividing total replacement runs or league runs by total PA for any hitter with a large number of PA (the larger the PA, the more accurate the calculation). Though Bumgarner was a below-average hitter, his hitting was far above average for a pitcher, and that produces value that is recognized in the very large positional run adjustment.
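As a rough check on the arithmetic, the four-component sum can be sketched in a few lines of Python. The function and constant names here are illustrative assumptions, and the per-PA rates are the rounded values quoted in this article rather than official constants.

```python
# Sketch of the four-component sum described above. All rates are runs per PA
# and are the article's rounded 2017 values (assumptions, not official constants).
LEAGUE_R_PER_PA = 0.122   # league-average runs per PA
RUNS_PER_WIN = 10.0       # approximate runs-per-win divisor

def batting_rate(wrc_plus):
    """Batting runs above average per PA implied by a wRC+ value."""
    return LEAGUE_R_PER_PA * (wrc_plus - 100) / 100

def runs_above_replacement(pa, wrc_plus, positional, replacement, league):
    """Sum batting, positional, replacement and league runs over `pa` PA."""
    per_pa = batting_rate(wrc_plus) + positional + replacement + league
    return pa * per_pa

# Bumgarner, 2017: 36 PA, 86 wRC+, pitcher positional rate, NL league rate.
bumgarner = runs_above_replacement(
    36, 86, positional=0.119, replacement=0.0305, league=0.0015
)
print(round(bumgarner, 1), round(bumgarner / RUNS_PER_WIN, 1))
```

With these rounded inputs the batting line comes out a little nearer -0.6 than the -0.7 shown above, presumably because the published figure was built from unrounded rates, but the total still rounds to 4.8 runs, or about .5 WAR.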

In contrast, the DH has a very large negative positional run value; as of 2017, it was about – .029 R/PA. Ohtani’s projected WAR for 2018 can thus be calculated as follows:

259 PA x .016 = 4.1 batting runs

259 PA x -.029 = – 7.6 positional runs

259 PA x .0305 = 7.9 replacement runs

259 PA x .0035 = 0.9 league runs

Total = 5.3 runs (.5 WAR)

The value of .016 RAA/PA for batting runs is based on a 113 wRC+ and a league R/PA value of .122: .122 x .13 = .016.

How Much WAR Would Ohtani’s Hitting be Worth as a National League Pitcher?

So from a WAR point of view, Ohtani is at a considerable disadvantage hitting as a DH, rather than as a pitcher. In fact, the positional disadvantage is so great that it considerably outweighs the fact that he will get many more PA as a DH in the AL than he would as a pitcher in the NL. Assuming his wRC+ is 113, how much value would he produce as a hitting pitcher in the NL? Steamer projects him to throw 148 innings. Assuming he pitched the same total in the NL, and that he batted fairly high in the order (at least, say, fifth or sixth; based on his AL hitting projections of 259 PA/65 games, this should indeed be the case), he might come to the plate as often as 65 times.

65 PA x .016 = 1.04 batting runs

65 PA x .119 = 7.74 positional runs

65 PA x .0305 = 1.98 replacement runs

65 PA x .0015 = 0.1 league runs (note league runs/PA are less in the NL)

Total = 10.86 runs (1.1 WAR)

So Ohtani, assuming he was the same hitter, would be worth more than twice as much WAR as a hitting pitcher in the NL as he would as a DH in the AL, even though he would come to the plate only about 25% as often (and we haven’t even considered the possibility that he could add further value in the NL as a pinch hitter). This raises two closely related questions: 1) how many more PA would Ohtani need as a DH to produce the same 1.1 WAR he would produce as an NL pitcher? 2) how high a wRC+ would he need as a DH with 259 projected PA to match that 1.1 WAR?

In both cases, Ohtani would need to produce about 5.6 more runs above replacement. To do that while maintaining his projected 113 wRC+, he would need about 266 more PA, or a total of about 525. To do that while maintaining his projected 259 PA, he would have to elevate his wRC+ to 131.
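Both break-even figures come from rearranging the same per-PA sum. The sketch below is one way to do that; the helper names are hypothetical and the rates are the rounded values used in this article, so the results land within a few PA, and about a point of wRC+, of the figures quoted above.

```python
# Per-PA rates are the article's rounded values (assumptions, not official constants).
LEAGUE_R_PER_PA = 0.122
DH_FIXED = -0.029 + 0.0305 + 0.0035   # positional + replacement + AL league runs per PA

def pa_needed(target_runs, wrc_plus):
    """PA a DH needs at a given wRC+ to reach target runs above replacement."""
    per_pa = LEAGUE_R_PER_PA * (wrc_plus - 100) / 100 + DH_FIXED
    return target_runs / per_pa

def wrc_plus_needed(target_runs, pa):
    """wRC+ a DH needs over a fixed number of PA to reach the target."""
    batting_per_pa = target_runs / pa - DH_FIXED
    return 100 + 100 * batting_per_pa / LEAGUE_R_PER_PA

target = 10.86  # runs above replacement for the 65-PA NL pitcher scenario
print(round(pa_needed(target, 113)))        # total PA needed at a 113 wRC+
print(round(wrc_plus_needed(target, 259)))  # wRC+ needed over 259 PA
```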

Of course, if he were able to produce a 131 wRC+ in the AL, he could presumably do it in the NL, too, which would increase his value there. It would not increase it as much, though, because he would have far fewer PA. So a better question to ask would be: how high does his wRC+ have to be to match his NL WAR, given the projected PA of 259 as a DH, vs. 65 as an NL pitcher? It turns out his wRC+ would need to be about 137. Above this value, he would produce more WAR as a DH, while below it he would produce more WAR as a pitcher. This value is close to what is usually considered the mark of an elite hitter, 140.
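The crossover point can be found by setting the two run totals equal and solving for the batting rate. This is a minimal sketch under the article's rounded per-PA rates; the function name is an assumption, and it only makes sense when the DH gets more PA than the pitcher.

```python
# Rounded per-PA rates from the article (assumptions, not official constants).
LEAGUE_R_PER_PA = 0.122
PITCHER_FIXED = 0.119 + 0.0305 + 0.0015   # NL pitcher: positional + replacement + league
DH_FIXED = -0.029 + 0.0305 + 0.0035       # AL DH: positional + replacement + league

def crossover_wrc_plus(dh_pa, pitcher_pa):
    """wRC+ at which DH and pitcher runs above replacement are equal.

    Solves dh_pa * (b + DH_FIXED) == pitcher_pa * (b + PITCHER_FIXED) for the
    batting rate b, then converts b back to wRC+. Requires dh_pa > pitcher_pa.
    """
    b = (pitcher_pa * PITCHER_FIXED - dh_pa * DH_FIXED) / (dh_pa - pitcher_pa)
    return 100 + 100 * b / LEAGUE_R_PER_PA

print(round(crossover_wrc_plus(259, 65)))   # projected 2018 playing time
print(round(crossover_wrc_plus(400, 65)))   # a heavier DH workload
```

With the rounded rates the first value comes out a point or so below the 137 quoted above, and raising the DH workload to 400 PA pulls the crossover down to roughly 120.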

However, the projected PA values that we’re working with may be low if we want to consider Ohtani’s potential in years beyond his rookie MLB season. If he proves to be a good hitter, he may get more PA. We might project a maximum of 400 PA. To get this many, he would have to play as a DH in about 100 games. Of the remaining 62 games, he would pitch in 24 and rest in 38. In order to rest both on the day before and the day after he pitches, he would need a total of 48 rest days, but the remaining ten might come on the team’s days off.

With regard to pitching, if Ohtani were in the NL and became an ace, let’s assume he would start a little more often, and log a total of 90 PA as a pitcher. This is pretty close to a maximum value in the current environment; in the past decade, only seven pitchers have had more PA in a season. In addition, let’s assume he appears as a pinch-hitter 110 times, giving him a total of 200 PA. In the 90 PA as a pitcher, his positional run value would be .119 R/PA, as explained before. In his 110 PA as a PH, we assume his positional run value is -.029 R/PA, the same as for a DH in the AL.

Using these values, we can estimate the number of runs above replacement Ohtani would be worth in the NL, compared to the AL, for various values of wRC+:

wRC+    NL Pitcher (1)    AL DH (2)
120     19.0              11.8
130     21.4              16.6
140     23.9              21.5
150     25.7              26.4

(1) Assumes 90 PA as a pitcher + 110 PA as a pinch-hitter

(2) Assumes 400 PA as a DH

Because of the larger number of PA as a pitcher, plus the additional PA as a PH, Ohtani now produces more run value in the NL even at wRC+ values above 140. He would have to have a wRC+ of nearly 150 before he would produce more value as a DH.
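The scenario comparison can be approximated with the same per-PA sum. This is a sketch under stated assumptions: the rates are the rounded values from this article, the pinch-hit PA are given the NL league rate, and the function names are illustrative. Because of the rounding, the NL totals it prints differ by a few tenths of a run from those listed above, while the AL totals match.

```python
# Rounded per-PA rates from the article (assumptions, not official constants).
LEAGUE_R_PER_PA = 0.122
REPLACEMENT = 0.0305

def runs(pa, wrc_plus, positional, league):
    """Runs above replacement for `pa` PA at one position."""
    batting = LEAGUE_R_PER_PA * (wrc_plus - 100) / 100
    return pa * (batting + positional + REPLACEMENT + league)

def nl_total(wrc_plus, pitcher_pa=90, ph_pa=110):
    """NL scenario: PA as a pitcher plus PA as a pinch-hitter (DH positional rate)."""
    return (runs(pitcher_pa, wrc_plus, 0.119, 0.0015)
            + runs(ph_pa, wrc_plus, -0.029, 0.0015))

def al_total(wrc_plus, dh_pa=400):
    """AL scenario: all PA as a DH."""
    return runs(dh_pa, wrc_plus, -0.029, 0.0035)

for w in (120, 130, 140, 150):
    print(w, round(nl_total(w), 1), round(al_total(w), 1))
```

Under these inputs the DH scenario overtakes the NL scenario only just below a 150 wRC+, which agrees with the conclusion drawn above.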

Has Ohtani’s Decision Eliminated Some of His Potential Value?

Will Ohtani be as valuable a hitter in the AL as he could have been in the NL? Probably not. If we start with his projected stats for 2018, he will produce only about .5 WAR as a DH. Assuming the same wRC+ of 113, and the 65 PA likely to accompany his projected 148 IP, he would produce more than twice that total, about 1.1 WAR, as a pitcher. This is because the positional advantage for a pitcher is huge, while there is a large positional disadvantage for the DH.

Some of this value gap may be reduced if Ohtani becomes a much better hitter than is projected for 2018, because the much greater number of PA available as a DH allows him to take greater advantage of better hitting. But he would have to hit considerably better. Still assuming 259 PA as a DH vs. 65 PA as a pitcher, his wRC+ would have to be nearly 140, an elite level, for his WAR as a DH to match that of a pitcher in the NL. That break-even wRC+ could fall to about 120 if Ohtani were to log as many as 400 PA, which seems close to the maximum compatible with his pitching program. But we might also argue that were he in the NL, he would perhaps pitch a little more often, and thus receive more PA as a pitcher, plus appear as a PH in most games in which he didn’t pitch. Making what I think are some reasonable assumptions about total PA under these conditions, Ohtani would have to produce at nearly a 150 wRC+ clip to produce as much value as a DH. Only seven qualified hitters managed that this past season.

Considering the reputation as a hitter that accompanies Ohtani as he comes to the U.S., this seems a little deflating. Even if he managed to produce a 150 wRC+, which seems quite unlikely, his total hitting WAR would be about 2.6. That would just about equal Wes Ferrell’s mark in 1935, the highest single season WAR for a pitcher in the live ball era, which certainly would be a major accomplishment. But it would not be that much more than a good hitting pitcher like Bumgarner manages even without pinch-hitting, nor would it add so much to Ohtani’s total WAR as a two-way player that his combined value would likely reach historic levels. If he is to finish anywhere near the top of the WAR leaderboard, it will have to be mostly through his arm, not his bat.

But even if Ohtani produces relatively little WAR as a hitter, this should serve as a reminder that there are different ways to understand value. The prospect of a pitcher who can hit well enough to DH even just part of the time has another kind of value to a team. Ohtani’s presence as an option at DH may open up a roster spot for another player, much as Ben Zobrist has had value beyond his WAR because of his ability to play multiple defensive positions. Surely the Angels are aware of this, and won’t be put off by his actual WAR totals.

*Though pitchers are not usually included in discussions of positional runs, this value can be calculated from the values table for the batting data of any pitcher. It corresponds roughly to 80 total runs for a whole season, batting every game, though of course pitchers never even approach this.


Summarizing My Findings on Launch Angle

Over the last year I made a series of studies on Statcast and I thought it would be interesting to write a little overview article to summarize my findings.

In June I looked at the launch angle profile of the league. The average went up, of course, but it rose faster at the top than at the bottom, so we have not yet reached a stage of consolidation where the league is moving closer together in launch angle, which ultimately should be expected (launch angle is increasing at the bottom, but by less than at the top).

That means there still is room for more growth in elevating but mostly in the bottom half of launch angle.

In the above I found that there are limits to elevating. I found the top guys usually average 11-16 degrees of launch angle. Below that players definitely can benefit from elevating more.

Then I was looking at the cost of too much elevation. A common theory is that swinging up more leads to more Ks because you are not really matching the plane of the pitch. I found a small effect there but nothing really big.

However, I did find that there is a BABIP cost, especially when elevation comes with pulling the ball, and further research confirmed it. Elevating more without a BABIP cost is possible if you get the ball off the ground while limiting pop-ups and high outfield fly balls above 30 degrees, as Daniel Murphy does very well. By contrast, hitters with fly-ball rates of 50% or more and an average launch angle of 20 degrees or more tend to have low BABIPs, especially when they also pull the ball a lot to sell out for power.

I also looked at the relationship of EV and LA and unsurprisingly found out that between roughly 8 and 20 degrees, exit velo doesn’t matter much, while above 20 degrees almost all production comes from homers. Balls above 20 degrees and below 95 MPH are basically worthless, so you need a certain minimum power to make elevation work. Off the ground is always good, but for some it might make sense to stay between 5 and 20 degrees.

Not quite related to that topic, I also created a formula for the relationship between power, patience, and K rate. An old argument between sabermetric and traditional writers was whether Ks matter. We know that Ks are not worse than other outs and high-K hitters do not perform worse, but that is also because there is a selection bias against high-K, low-power guys. Everything being equal, low Ks is better, and I found a pretty linear relationship between K, BB, and ISO.

If production is equal, of course, Ks don’t matter.


The Home Run Explosion, Home Runs, and Winning

I wondered how the power revolution changes the impact of power on winning. Does the abundance of HR mean that HRs are less valuable? Or are they even more necessary?

For that I compared 2017 and 2008. 2008 is kind of an arbitrary cutoff; I used it because it was 10 seasons ago and not a completely different game.

In 2008 the top-10 HR-hitting teams averaged 86 wins, and in 2017 just 82 wins. Also in the top 10 in HRs in 2008, three teams had losing seasons, and in 2017 it was a whopping five teams. So it seems being a top-HR team helps less.

However, when looking at the bottom 10 HR-hitting teams, it is 74 wins for both years. Three teams of the bottom 10 in HRs had winning seasons in 2008 versus just two in 2017. So it didn’t become easier to succeed as a no-power team.

The league also got closer together in HRs. In 2008 the bottom-10 average was 127, and it was 1.6 times as much for the top 10 (197). In 2017 it was 172 for the bottom and just 1.3 times as much for the top (230).

Of course park factors and year-to-year variations play a role, but last season Colorado wasn’t even in the top 10 for example.

So it seems power is at least as much needed to win as it used to be, but it isn’t really much of a difference maker anymore; it is more a baseline needed to win. But teams like the Rays and A’s who hit tons of homers in a pitcher’s park show that you can’t really build around power as a main skill; you need to make sure you don’t suck at power, but since you can’t really separate anymore with power, you need other primary skills.

I would probably say make sure to be in the top third in power, but once you are there, don’t sacrifice other stuff to get even more power.

That is especially true for defense. The A’s led the league in average launch angle and were fourth in HRs. Since they were only seven HR behind the Yankees and four behind the Astros in a vastly less hitter-friendly park, we can probably say they were the top HR-hitting team.

They tried to sell out for power and it clearly wasn’t enough to make up for historically bad defense and other flaws.

So teams definitely shouldn’t sacrifice in other regards; there is enough power around to not put bad defenders or super low OBPs in the field to get more power.

Power is as important as it ever was, but it is not possible to dominate with it anymore like the 1927 Yankees did. It is now one necessary skill of many, and well-roundedness is the name of the game in 2017. The same can be said for contact hitting. People said after 2015 that contact was the future. However, low-power slap hitting didn’t prove to be successful. With power now so easily available, though, teams might be able to cut back on the Ks a little without sacrificing power, as the Astros did, because super high Ks can suppress on-base percentage when they don’t come with Adam Dunn-like walks.


A Different Sort of Debate on WAR

Last month, the sabermetrics community descended into complete and utter anarchy over the latest and greatest debate on WAR. Industry heavyweights like Bill James, Tom Tango, and our own Dave Cameron all weighed in on the merits of baseball’s premier metric. After the dust settled, Sam Miller published an article on ESPN igniting a different sort of debate on WAR.

Miller’s piece noted that aside from the possible flaws behind WAR itself, each corner of the internet is calculating it a different way. For pitching specifically, FanGraphs (fWAR), Baseball Reference (rWAR), and Baseball Prospectus (WARP) all publish measures of WAR that oftentimes have significant disagreements. But that’s by design.

These three metrics were brilliantly characterized by Miller as so:

  • rWAR – “What Happened WAR”
  • fWAR – “What Should Have Happened WAR”
  • WARP – “What Should Have Should Have Happened WAR”

The rest of the piece is outstanding, and comes highly recommended by this author. In the aftermath, though, Tom Tango of MLB Advanced Media responded with the following challenge:

Given that I humbly consider myself to be an aspiring saberist, I took that challenge. Well, I first took the challenge of college final exams, but then the pitching WAR challenge!

The dataset I worked from included 1165 qualified individual pitching seasons spanning 2000-2016. For each season, I collected the player’s fWAR, rWAR, WARP, RA9-WAR, and RA9-WAR in the subsequent year. As Tango suggested, using RA9-WAR to look retrospectively at our three competing pitching metrics will be the most effective way to measure the differences among the metrics themselves.

For those interested in the raw data, feel free to check it out here, and make a copy if you’d like to play around with it yourself.

Given the nature of the dataset, a logical first place to start was with a straightforward correlation table and go from there. That correlation table is displayed below.


As expected, small differences do exist between the various metrics in their abilities to predict future performance. In the sample, fWAR leads both WARP and rWAR by slight margins. For all you statheads out there, a linear regression on the data returns statistically significant p-values for fWAR and WARP, but not rWAR.
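For readers who want to reproduce this kind of comparison on their own copy of the data, a correlation between each metric and the following year’s RA9-WAR can be sketched as below. The rows are made-up placeholder values, not taken from the dataset, and the function name is an assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical rows: (fWAR, rWAR, WARP, next-year RA9-WAR). Placeholder values only.
seasons = [
    (4.1, 5.0, 3.6, 3.8),
    (2.0, 1.1, 2.9, 2.2),
    (5.5, 6.2, 4.8, 4.9),
    (1.2, 2.5, 0.9, 1.0),
    (3.3, 2.0, 3.9, 3.1),
]

next_year = [row[3] for row in seasons]
for name, col in (("fWAR", 0), ("rWAR", 1), ("WARP", 2)):
    r = pearson_r([row[col] for row in seasons], next_year)
    print(name, round(r, 3))
```

A correlation coefficient alone does not give the p-values mentioned above; those would require a regression with a significance test on each coefficient.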

So that was fun, wasn’t it? With all of the nitty gritty math out of the way, let’s dive into a few examples. Miller already highlighted Teheran’s strange 2017 season, but as it turns out, there are far more extreme instances of metric disagreement.

Take Felix Hernandez’s 2006 season for example. His first full season in the bigs culminated in an underwhelming 4.52 ERA, but a 3.91 FIP and a 3.37 xFIP were promising signs of future success. Similarly, the WAR metrics were unable to come to any sort of consensus.


By WARP, the 20-year-old Hernandez was the 14th best pitcher in 2006. He was surrounded on the leaderboard by names like Roy Halladay, Randy Johnson, and Greg Maddux. By rWAR, his 2006 season ranked 135th alongside Jose Mesa, Cory Lidle, and interestingly enough, Greg Maddux.

fWAR, on the other hand, seems to have found a happy medium between the other two metrics. Sure enough, it was also the most accurate predictor of Hernandez’s RA9-WAR in 2007.

Taking a step back, I now wanted to determine which of the three metrics was the most accurate predictor of a pitcher’s future RA9-WAR. Just as Tango does, we’ll call the current season “Year T” and the next “Year T+1.” The results of this exercise are displayed below.
Yet again, we see a slight victory for the FanGraphs WAR metric. However, with over 1100 seasons in our sample, no single metric stands apart from the others. After all, they are designed with the same goal in mind: measure pitcher value. As you’ll see below, each metric usually ends up with a similar result to the others.


What happens, though, in instances like Teheran’s? When the metrics have stark disagreements with each other, which metric remains most reliable? To answer this question, I dug up the 10 most significant head-to-head disagreements among each of the metrics, and again looked at which version of WAR best predicted the RA9-WAR in Year T+1. Those results are listed below.

What stands out to me here is not only that fWAR still appears to be the best forward-looking metric, but also that in nine of its ten most significant disagreements with rWAR, the DIPS approach to WAR won out.
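The head-to-head check can be sketched as follows. The article does not spell out how “best predicted” was scored, so this sketch assumes the metric with the smaller absolute error against Year T+1 RA9-WAR wins each season; the field names and sample rows are hypothetical.

```python
def top_disagreements(seasons, a, b, n=10):
    """Return the n seasons where metrics a and b differ the most."""
    return sorted(seasons, key=lambda s: abs(s[a] - s[b]), reverse=True)[:n]

def closer_metric_counts(rows, a, b, target="ra9_next"):
    """Count how often each metric lands closer to next year's RA9-WAR."""
    wins = {a: 0, b: 0}
    for s in rows:
        err_a = abs(s[a] - s[target])
        err_b = abs(s[b] - s[target])
        if err_a < err_b:
            wins[a] += 1
        elif err_b < err_a:
            wins[b] += 1
    return wins

# Hypothetical seasons; placeholder values, not taken from the dataset.
seasons = [
    {"fWAR": 4.0, "rWAR": 1.0, "ra9_next": 3.5},
    {"fWAR": 2.0, "rWAR": 2.1, "ra9_next": 2.0},
    {"fWAR": 1.5, "rWAR": 4.0, "ra9_next": 3.8},
    {"fWAR": 5.0, "rWAR": 3.0, "ra9_next": 4.6},
]

biggest = top_disagreements(seasons, "fWAR", "rWAR", n=3)
print(closer_metric_counts(biggest, "fWAR", "rWAR"))
```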

Just as in “The Great WAR Debate of 2017,” this discussion too is entirely dependent on what one intends to use WAR for. Here, we’ve established fWAR as an excellent forward-looking metric. Depending on who you ask, rWAR likely serves its best purpose illustrating, as Miller put it, what did happen. WARP may either be many years ahead of its time, or could still use a fair amount of tweaking. Or both. No matter, each version of pitching WAR comes with its own purpose, and each purpose has its own theoretical use.


Are We Overvaluing Power Hitters?

Aaron Judge and Jose Altuve were seemingly neck and neck in MVP voting this year (even if they are neck and belly button when standing next to each other). Judge had the edge in FanGraphs WAR, while Altuve held an edge according to Baseball Reference. Altuve had gaudy batting average and stolen-base totals, while Judge reached the coveted 50-home-run plateau to go along with his jaw-dropping Statcast numbers. Heading into awards season, the American League MVP was hyped as a two-man race that could go either way.

But then the voting happened, and Jose Altuve got 27 first-place votes to Aaron Judge’s two. There were a lot of reasons for this, from storyline to traditional numbers to team record. One of the most prominent among them for sabermetric voters was Aaron Judge’s clutch performance. According to the Clutch metric found on this site, he was the least clutch player in baseball this year. Actually, he was the least clutch player in baseball this entire millennium. Wait no, actually, he had the single least clutch season in the history of the metric (since 1972).

Up until now, Clutch has not been shown to have predictive value, even if it is important in deciding things like MVP races which are based on things that have already happened. But, as you may have guessed from the article title, I think there may be evidence to suggest otherwise. Here is a list of the least clutch players in history for their entire careers, according to the Clutch metric.

Rank  Name            Games  HR   BB%    K%     Clutch  WAR
1     Sammy Sosa      2354   609  9.4%   23.3%  -14.67  60.1
2     Mike Schmidt    2404   548  15.0%  18.7%  -13.45  106.5
3     Lance Parrish   1988   324  7.8%   19.6%  -12.90  43.4
4     Jim Thome       2543   612  16.9%  24.7%  -11.66  69.0
5     Chet Lemon      1988   215  9.5%   13.0%  -11.03  52.0
6     Jermaine Dye    1763   325  8.3%   18.1%  -9.67   14.5
7     Alex Rodriguez  2784   696  11.0%  18.7%  -9.53   112.9
8     Andre Dawson    2627   438  5.5%   14.0%  -9.49   59.5
9     Gary Carter     2295   324  9.4%   11.1%  -9.25   69.4
10    Barry Bonds     2986   762  20.3%  12.2%  -9.13   164.4

This list is populated by a certain type of player: good ones. The gap in career WAR between Jermaine Dye, who has the lowest total on the list, and Lance Parrish, who ranks ninth of the ten, would be a fantastic career for almost anyone. But, more importantly, it is populated by high-strikeout, power-hitting sluggers. Every single player on this list has a double-digit strikeout rate and everyone but Chet Lemon has at least 300 career home runs. The list of the most clutch players in history, on the other hand, is not made up of power hitters.

Rank  Name            Games  HR   BB%    K%     Clutch  WAR
1     Tony Gwynn      2440   135  7.7%   4.2%   9.49    65.0
2     Pete Rose       2179   57   10.6%  5.8%   9.07    43.5
3     Scott Fletcher  1612   34   8.6%   9.1%   8.61    24.9
4     Mark McLemore   1832   53   12.1%  13.6%  8.51    17.4
5     Ichiro Suzuki   2636   117  6.0%   10.0%  8.25    58.2
6     Dave Parker     2466   339  6.7%   15.1%  7.64    41.1
7     Omar Vizquel    2968   80   8.6%   9.0%   7.54    42.6
8     Ozzie Guillen   1993   28   3.4%   7.2%   7.48    13.1
9     Lance Johnson   1447   34   6.1%   6.6%   6.89    26.4
10    Jose Lind       1044   9    5.4%   9.2%   6.71    3.3
11    Mark Grace      2245   173  11.6%  6.9%   6.58    45.5

This list is made up of a very different kind of hitter. Tony Gwynn, Pete Rose and Ichiro are perhaps the three most well-known contact hitters of all time. Only three players on this list have double-digit strikeout rates, and only one has 300 career home runs. Chet Lemon, dead last in home runs on the other list, would rank second on this one.

Aaron Judge fits right into the pattern of these lists, as one of four qualified players with a 30% strikeout rate or higher. If you sum the clutch score of the top 10 players in strikeout rate this year, you get -9.05, or nearly one win per player lost due to clutch performance. If you remove Aaron Judge, the sum is a still gaudy total of -5.41.

I charted strikeout rate against Clutch score for all players qualified in 2017, and there is a small but definite trend. Below the chart, you can see the regression equation along with the P value for the coefficient and the R^2.

Ultimately I don’t have the tools or the time to fully explore this idea, but it would appear that there is an actual relationship here. The effect may be minuscule as the R^2 indicates, but the general trend seems to indicate that clutch players are more contact-oriented. This makes sense, because the most clutch situations in a game happen with men in scoring position, where the difference between a strikeout and a fly out or ground out can be an entire run. Further work needs to be done, but I would not be surprised to find that batted-ball type or walk rate also has an impact. For example, hitters with higher fly-ball rates may be more clutch because, with runners on base, a fly ball avoids a double play with a man on first, and may drive in a run with a man on third. With nobody on base and nobody out, the way a batter gets out does not make a difference. But in clutch situations, all outs are not created equal.
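For anyone who wants to pick up where this leaves off, a one-variable least-squares fit of Clutch score on strikeout rate can be sketched without any external libraries. The data points below are invented placeholders rather than 2017 player values, and the function name is an assumption; the sketch returns the slope, intercept, and R^2 but does not compute a p-value.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical (K%, Clutch) pairs; placeholder values only.
k_rate = [10.5, 14.0, 18.2, 22.7, 26.1, 30.7]
clutch = [0.6, 0.2, -0.1, 0.3, -0.8, -1.2]

slope, intercept, r_squared = fit_line(k_rate, clutch)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

A negative slope in a fit like this would correspond to the trend described above, with higher strikeout rates going along with lower Clutch scores.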


The Giants’ Not-So-Shiny New Toy

The Giants made a big splash by acquiring Evan Longoria, owner of three All-Star nominations, three Gold Gloves, a Silver Slugger, and the 2008 AL Rookie of the Year award. I will come right out and admit that I have hardly spent any time thinking about Longoria at all through his 10-year career. As I am a fan of an NL West team, the Rays are about as far away from my realm of focus as you can get. Throw in the fact that they are a small-market team dwarfed by the Yankees and Red Sox, and Longoria simply hasn’t made a huge impression on me.

After reacquainting myself with his player page, I realized how much I have been missing. Longoria has amassed almost 50 WAR in his career so far, placing him on the bubble of many Hall of Fame stats despite being only 32 years old. He has avoided any disastrous seasons, as his lowest WAR total was 2.2, and that came in a 2012 season when he only played in 74 games. Almost as impressive as his WAR totals: 2012 is the only season in 10 years in which he missed significant time due to injury. In the past five years, he has played in more games (798) than anyone in MLB. He has been the epitome of health and consistency for a decade.

Longoria has earned his value by being very well-rounded. He provides significant value with his bat, as his career wRC+ mark of 123 matches up with the likes of Yoenis Cespedes, Jose Altuve, and Mookie Betts, all extremely accomplished hitters that have yet to enter their late-career decline phases. As the three Gold Gloves imply, Longoria is also an impressive fielder, with career marks of 75 DRS and 89.1 UZR. While not a massive base-stealing threat, he has shown enough speed and baserunning intelligence to provide slightly above-average baserunning value. Simply put: the dude is good at playing baseball, and he’s been proving it for an entire decade now.

As impressive as that resume is, the Giants don’t get to enjoy any of his past accomplishments. They didn’t trade for 2008-2017 Evan Longoria, they traded for 2018-2022 Evan Longoria. So now the question becomes: Is Evan Longoria still good? Jeff briefly touched on this immediately after the trade, but I wanted to take a deeper look.

At 32 years old, he is past the typical peak years for most baseball players, and in Longoria’s case, he already sustained a pretty clear peak over his first six seasons (ages 22-27). As Jeff noted, he put up a wRC+ of 135 during this time; compare that to his four seasons since then (ages 28-31), when his wRC+ has dropped to 108. Don’t get me wrong – 108 is still good! It’s just not the elite All-Star player we saw at the front of his career. His defense has followed the same trajectory, as he put up +79 DRS and +78.4 UZR over his first six seasons, then dropped to -4 DRS and +10.7 UZR over his last four seasons.

This is a familiar story: good baseball player gets older, becomes worse baseball player. But it’s so familiar that it can also be a trap – Longoria might end up following the career path of Adrian Beltre, who posted a 6-WAR season at 37 years old. Looking at the numbers, though, I just can’t make myself believe that Longoria is anything more than a useful starter right now, and one that will shortly become a below-average player.

Longoria’s strikeout rate immediately jumped out to me, as he only struck out 16.1% of the time last year, setting a new career low, almost 4 percentage points below his career average. This is promising! In an era of increasing strikeouts, Longoria is figuring out how to put more balls in play, giving him more chances of getting on base. Of course, this line of thinking requires that he is trading strikeouts for quality batted balls, and considering his ISO last year sat 50 points below his career average, it didn’t look like this was the case. After digging deeper into some plate discipline numbers, it became very obvious to me what was happening.

2013 was Longoria’s last star-caliber season. The following year, his wRC+ dropped from 132 to 105, with a corresponding spike in Swing%. All of a sudden, Longoria was much more aggressive, swinging at more pitches both inside and outside of the zone. And especially in 2017, he seemed to be focusing intently on putting the ball in play, with a large spike in Contact% despite seeing the 2nd lowest Zone% in his career. Some people are able to cut strikeouts by controlling the strike zone better, but it looks like Longoria was cutting strikeouts by swinging more often and making poor contact on bad pitches. Consider his batted-ball distribution:

The first big red flag here is the red line along the bottom. Once again, starting in 2014, we start seeing a worrisome trend as he began hitting more and more infield flies. All his improvements in strikeout rate are erased here, as infield flies are essentially automatic outs and are just as bad. The other interesting tidbit in this graphic is the interplay between his GB% and FB% the past two years. Longoria had a mini-offensive resurgence in 2016, and it looks like that can be attributed to him lifting the ball more often. In 2017, he lost all of his FB% gains and then some, driving more balls into the ground than ever before.

Jeff also touched on the relevant Statcast data. Longoria’s exit velocity dropped significantly last year, as did his rate of barrels and xwOBA. There was nothing fluky going on for Longo in 2017 – he was swinging more often but making worse contact, and more of his batted balls were either going into the ground or popped up in the infield.

Is a turnaround completely out of the question? Of course not, nothing is out of the question. Perhaps a change of scenery will provide a spark for the 32-year-old. Perhaps he will be motivated to prove to the baseball world that the Giants made a good trade, and he will work harder than ever to make it back to All-Star levels. Even if he simply sustains his current production, he is still a 2-3 win player right now. But the Giants need more than that, and we’re already four years into a significant decline for Longoria. Both his bat and his glove are on the wrong side of the age curve, and it looks like the Giants just added another expensive, aging veteran to throw onto the pile.


Giants, Rays Make Strange Trade

On December 20th, the Rays shipped Evan “Career Ray” Longoria to the Giants for Christian Arroyo, Denard Span, Matt Krook and Stephen Woods. On the surface, this seems like a deal that fits the needs of both teams. The Rays have initiated yet another rebuild that Longoria didn’t want to be a part of, and got some young players in return. Arroyo is a former top-100 prospect who, despite destroying lower levels in 2017, struggled in his debut with the Giants. The two arms are the classic “pitching prospects,” and, well, Denard Span is Denard Span. On the other side, the Giants filled an absolutely gaping hole at third base. They no longer have to play Pablo Sandoval, and that should be a win for any team.  However, this trade has left me scratching my head, and there’re a few reasons why. Let’s look at some statlines.

Player A – 96 wRC+ / 11 DRS

Player B – 108 wRC+ / 10 DRS

Which one do you think the Giants just gave Christian Arroyo up for? The answer is A, Evan Longoria, a 32-year-old who is making 13.5 million dollars a year. Player B is Todd Frazier, a 31-year-old (almost 32-year-old) free agent who will more than likely sign a contract in the 10-12 million dollar range. Now, Longoria did have a down year at the plate in 2017 and is the better defender of the two, but looking at these numbers raises some questions. Why did the Giants give up a talented young prospect for someone they could have just signed in the free-agent market? It’s understandable that you can look at Longoria’s track record and expect him to bounce back from a down year, but there are a few other things to consider before jumping to that conclusion. First of all, Longoria is moving from Tropicana Field to AT&T Park, one of the most pitcher-friendly parks in the majors. According to Baseball Reference, Tropicana was also stifling, but still, moving to AT&T is not a welcome change for any hitter. Secondly, Longoria is 32 years old, and we all know what side of 30 that’s on (it’s the bad side). Thinking Longoria can bounce back during his age-32 season is a tough sell for anyone who believes in the aging curve.

Let’s consider what the Giants could have done differently. If they had signed Todd Frazier, they would have been getting a cheaper contract for a player with essentially the same skill-set as Longoria: a power right-handed bat with a plus glove at third base. They’re practically the same age, and the Giants could have kept Christian Arroyo around and given him some more time to develop in the minors, or given him exposure at the major-league level if Joe Panik continued to struggle. Yes, they did offload Denard Span’s contract, so technically, Longoria is cheaper than Frazier would be, but I’ll address that in just a bit. They also had the option to not sign anyone at all and hope Arroyo develops into some sort of Matt Duffy 2.0. To make it clear, I don’t think Arroyo will ever be as good as Longoria, but I have no problem believing he could be a 2-3 win player a few years down the road.

Now, in terms of the big picture, only one of these moves keeps the Giants’ hopes for the future alive. If you haven’t noticed, all the stars on the Giants are going to be on the wrong side of the aging curve soon. Signing Frazier only contributes to that problem. Arroyo could have been a piece the Giants could have built their team around in the future when all of their other superstars are decrepit skeletons. Remember what I said about Denard Span being Denard Span and his contract being offloaded? That’s another problem that the Giants have to fix now. Denard Span isn’t good, but he’s essentially league-average at playing center field. Who do the Giants stick out there now? Steven Duggar? Mac Williamson? Both of those options represent a downgrade to Span. Instead, we can expect the Giants to throw a bunch of money at a free-agent outfielder, perhaps someone like Lorenzo Cain. Cain would represent a huge upgrade over Span, but Cain is still another 31-year-old who is projected to decline in his production while making close to 20 million dollars a year, which cancels out the effect of getting rid of Span’s contract in the first place.

If they do sign Cain, the Giants will then be spending more money than they were before they traded for Longoria in the first place. If they don’t, then they’ll have to expect lackluster production out of center field, somehow even more lackluster than Span already was. Finally, you have to consider if this move actually makes the Giants better than the rest of the NL West. It doesn’t. The Dodgers are still a super team, the D-Backs are still very good, and the Rockies, despite having some question marks about their rotation, are a good team as well. Well, okay, the Giants won’t finish behind the Padres, but you still have to be better than the best team in baseball last year to win your division. This is a lot to ask for a Giants team that has only added something like 1.5 wins this offseason and was the worst team in the National League last year. Don’t get me wrong; getting Longoria, a good player who makes way, way less than his market value is a great move, but I don’t think it is in the context of where the Giants are as a team, what they gave up, and the holes they still have to fill.

As for the Rays, this is a move that was going to happen eventually. They see that the Yankees and Red Sox are going to dominate the AL East for a while, so they decided that now is as good a time as any to tear it down and start again. The Rays will continue to do the Rays thing we all know and love, stockpiling as many Matt Duffy-type players as they can while consistently pumping out awesome pitchers from their farm system, then trading those pitchers for more clones of Matt Duffy. Arroyo will more than likely take over the second-base job in Tampa at some point during 2018, and will be a fun player to watch in the Rays lineup. Span might end up taking some time from Mallex Smith in left field, but Smith is definitely the more exciting and interesting player of the two. The real Matt Duffy will end up playing third, and the Rays will finish 3rd or 4th in the AL East like they do almost every year. Again, this is a trade that was going to happen, but it’s just surprising to see how it ended up going down.


Does Lifting the Ball Have a Ceiling?

Elevating is en vogue; everyone wants to do it and it seems like every hitter who does it can become a power hitter, especially with rumors about a new ball. There have been many examples of successful hitters of that mold: Daniel Murphy, Justin Turner and Jose Altuve, among others. Is there a limit to this? Could we see hitters with a 25% GB rate in the future? 20%? 15%?

One thing that seems to cap this is BABIP. There is a pretty positive correlation of BABIP and GB rate, i.e. GB hitters tend to have a higher BABIP. That seems logical since FBs tend to have a lower average, and even if they are hits they often don’t count for BABIP as they are often home runs.

This table shows the relation of BABIP and GB rate between 2008 and 2017. You can see that BABIP does go down with lower GB rates, but wRC+ is actually better with lower GB rates. Still, you could see a point being reached where the lower BABIP eats up the advantages.

GB rate  <0.35  0.35-0.4  0.4-0.45  0.45-0.5  0.5-0.55  >0.55
BABIP    0.287  0.290     0.299     0.304     0.314     0.320
wRC+     106    102       101       95        90        93

Average launch angle shows a similar picture:

 

av. LA  <8     8 to 10  10 to 12  12 to 14  14 to 16  16 to 18  >18
BABIP   0.318  0.314    0.305     0.298     0.300     0.289     0.274

It seems that once you get past a certain launch angle or GB rate, a drop in BABIP is inevitable. However, an exception might be possible. I looked up guys with a lower than 35% GB rate and a FB rate of lower than 45%, and their BABIP was 0.304. Those guys were pretty rare between 2008 and 2017, but it is possible. You just need to get the ball off the ground and avoid both pop-ups and high outfield fly balls above 25 degrees. Not an easy thing to do, though, as the bat is a round object, and batted balls will always be distributed rather normally around the average LA, meaning that a higher average LA usually will mean more high outfield fly balls.
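To put rough numbers on that last point, here is a minimal sketch that takes the "distributed rather normally" idea literally. The 25-degree cutoff comes from the text above; the two standard deviations (25 and 15 degrees) are made-up values chosen only to contrast a wide spread with a tight one, not measured figures.

```python
import math

def share_above(threshold_deg, mean_la, sd_la):
    """Fraction of a normal(mean_la, sd_la) launch-angle distribution above threshold_deg."""
    z = (threshold_deg - mean_la) / (sd_la * math.sqrt(2))
    return 0.5 * math.erfc(z)

# ASSUMPTION: illustrative spreads only (wide vs. tight), not real Statcast values
for sd in (25.0, 15.0):
    for mean in (10.0, 14.0, 18.0):
        share = share_above(25.0, mean, sd)
        print(f"average LA {mean:4.0f}, spread {sd:4.0f}: {share:.1%} hit above 25 degrees")
```

Under these assumptions, raising the average launch angle always pushes more balls above 25 degrees, while a tighter spread pulls that share back down.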

However, it is possible to imagine a super-hitter who has such good bat control that his band is very narrow. The best example of this might be Daniel Murphy, who managed to have a 34% GB rate with just a 40% FB rate (meaning a very high LD rate), and subsequently a very high (.345) BABIP over the last three years.

So we could indeed imagine a kind of “super Murphy” who hits 25% grounders with lower than 45% FBs. However, to date, we have not seen a guy sustaining such high LD rates; that guy would probably have to have superhuman bat control (which probably eliminates almost all >25% K rate guys). But with modern training methods, who knows what might happen.


Improving WPS

“All happy families are alike; each unhappy family is unhappy in its own way.”  — L. Tolstoy

You can say something similar about baseball games. All boring games are alike; but exciting games are interesting in their own ways. Every boring game has one team building up a big early lead, which is never threatened. But there are many ways to have an exciting game: the pitcher’s duel, the slugfest, the late-inning comeback, extra innings, all in various combinations. And in between them are the bulk of games that are simply ordinary.

All of which makes ranking exciting games a tricky process, at least compared to ranking how boring they are. How does one compare Game 7 of the 1991 WS (1-0 in 10 innings) to Game 4 of the 1993 WS (15-14 in 9 innings) on the same scale? They’re great in different ways.

Back in 2005 I created a system to do just that, a rating system based simply on the runs scored in line score. I may have been the Christopher Columbus of that new world. And ranking the games allows you to rate post-season series-es.

The line-score system did work in the sense that it could tell the difference between a great game and a good one, and between a good one and an ordinary one. But while the line score gives you the basic outline of the game, it is blind to the details of what happens DURING each inning. Zero runs scored in the top of the 1st rates exactly the same whether there were three pop-ups or three singles followed by a triple play.

Eventually I realized that Baseball-Reference.com (ALL HAIL BBREF) has the play-by-play data for all playoff games, which includes a probability of victory after each play (anything that changes the outs, baserunners or score). Plotted, you can easily see if a game was good: it looks like an earthquake. If it was bad, it looks like the EKG of a corpse. Using those probabilities, we can create a much more accurate game rating. I fiddled with many rating schemes over the last 10 years before settling on one that seems both conceptually simple and that yields reasonable results.

Of course, by then I had been beaten to the basic concept by Dave Studeman (WPA) and Shane Tourtellotte (WPS). Twelve years is too long for laurels resting.

WPA = Sum(change in probability between plays)

Modified WPS = Sum(change in probability between plays) + top three plays + Final play
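As a minimal sketch of those two formulas, assuming "change in probability" means the absolute size of the swing on each play and that the input is simply a list of per-play changes (the function names and the sample game are made up for illustration):

```python
def wpa(swings):
    """Base rating: sum of the absolute win-probability change on every play."""
    return sum(abs(s) for s in swings)

def modified_wps(swings):
    """Base rating, plus the three biggest plays again, plus the final play again."""
    sizes = [abs(s) for s in swings]
    top_three = sum(sorted(sizes, reverse=True)[:3])
    last_play = sizes[-1] if sizes else 0.0
    return sum(sizes) + top_three + last_play

# Hypothetical game: a quiet start, a couple of big swings, and a walk-off
swings = [0.02, -0.03, 0.04, -0.25, 0.05, 0.10, -0.12, 0.37]
print(round(wpa(swings), 2))
print(round(modified_wps(swings), 2))
```

Note that in this sketch a walk-off that is also one of the three biggest plays ends up counted three times in total.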

What I have developed is similar to their work, but I think it has some small advantages. Generally, my ratings will be quite close to Shane’s (R-squared > 99.5%). He correctly realized that simply summing the probabilities doesn’t quite work, which is why he modified it. An example…

There are seven post-season games with a WPA of exactly 4.52. Among them are:

1995 NLCS Game 2

Braves beat the Reds 6-2 in ten innings.

95 Plays, 13 plays changed the odds by at least 10%

Top play: a Mark Portugal bases-loaded wild pitch, +18%

70 plays with the odds in the 30% to 70% range

compared to

1960 WS Game 7

Pirates 10 Yankees 9 in nine innings

77 plays, 15 plays changed the odds by at least 10%

Of those 4 changed the odds by at least 20%

Of those 3 changed the odds by at least 30%

Of those 1 changed the odds by more than 50%

25 plays with the odds in the 30% to 70% range

 

There is simply no way those games are equal. The 1960 game has five different plays better than any play in the 1995 game. The 1995 game makes up the ground by (1) having 18 more plays and (2) having fewer plays where nothing happened, because the game was usually within one run.

1960 is still better because a +40% play is more than twice as exciting as a +20% play. Bill Mazeroski’s game-ending homer rates as +37%. Bobby Richardson’s game-starting line-out rates at +2%. That makes a walk-off homer the equal of about 3 ½ innings with zero hits. NOPE. WRONG.

Shane accounted for this with his modified method. By counting the top three plays twice and Mazeroski’s walk-off homer three times, the ratings are now

1960: 6.49

1995: 5.19

And science prevails.

Of course, there is nothing magical about TOP THREE plays or LAST play. You could try using the top five plays and last five plays (believe me, I did).  But I do think that using Top-3 + Last can sometimes lead you astray. I will now present exhibits A and B to demonstrate where it can swing and miss.

Exhibit A: 1988 WS Game 1

Exhibit B: 1985 NLCS Game 6

I expect you to know them. The two biggest home runs in terms of changing the odds in post-season history courtesy of Mr. Clark and Mr. Gibson.

1985: WPA 4.48 in 83 plays and 9 innings

1988: WPA 3.98 in 82 plays and 9 innings

The 1985 game had more action with the same number of plays, which you can easily see in the line scores

StL          0              0              1              0              0              0              3              0              3              (7)

LA           1              1              0              0              2              0              0              1              0              (5)

 

Compared to

 

Oak        0              4              0              0              0              0              0              0              0              (4)

LA           2              0              0              0              0              1              0              0              2              (5)

 

The ‘85 game has a game tie in the 7th, broken tie in the 8th and lead change in the 9th

The ‘88 game has a lead change in the 2nd and a lead change in the 9th

Modified WPS says

1985: 4.48 + 1.34 + 0.01 = 5.83 (Tied for 94th best game)

1988: 3.98 + 1.43 + 0.87 = 6.28 (Tied for 58th best game)

I don’t think you can argue that the 1988 game is much better than the 1985 game; I don’t think it’s a better game at all. And it’s the last-play bonus that is to blame. Had the 1985 game been played in St. Louis then Clark’s homer would have been a walk-off and the game would have rated 6.56, well ahead of the 1988 game.

If you think about it, a last-play bonus is biased towards games won by the home team. If the home team loses, the last play will rarely amount to anything.

When the home team loses, the last play has been worth at least 20% only 23 times. When the home team wins, it has been worth at least 20% 122 times.

When the home team loses, it has been worth at least 30% only 11 times. When the home team wins, it has been worth at least 30% 96 times.

I also know this because I tried last play, last five plays, and last ten plays in trying to construct a rating system. I also tried top five plays, top ten plays, all plays over 10%, WPA – .03 per play (yielding the bizarre result of games with negative excitement).

Eventually I tried a simple power transformation on EVERY play. First, I tried summing the squares of the probability changes, like any good statistician would.

When I did that, the 1985 game rated 10th and the 1988 game rated 5th. That is the wrong order, and both games are just rated too high. Then I tried other powers…the Goldilocks approach, looking for the one that was just right.

 

Power  1985 game rank  1988 game rank
2.0    10th            5th
1.9    12th            8th
1.8    15th            20th
1.7    23rd            25th
1.6    32nd            36th
1.5    38th            51st
1.4    53rd            76th
1.3    61st            104th
1.2    79th            133rd
1.1    100th           158th
1.0    116th           185th

Everything above 1.7 was eliminated since it rated 1988 better than 1985

 

Here’s some shorthand I’m going to use:

Game 6 of the 1985 NLCS: STL 7, LA 5 in 9 innings — WPA 4.48 (9-4-2-1)

Game 1 of the 1988 WS: LA 5, OAK 4 in 9 innings — WPA 3.98 (5-2-2-1)

The 1985 game had 9 plays rated >= 0.1, 4 plays rated >= 0.2, 2 plays rated >= 0.3 and 1 play rated >= 0.5

The 1988 game had 5 plays rated >= 0.1, 2 plays rated >= 0.2, 2 plays rated >= 0.3 and 1 play rated >= 0.5

For a sense of scale, the average game is WPA 2.67 (4.89-0.88-0.33-0.03)

(You can check the examples listed below on BBRef to get more detail on each game)

 

Checking 1.7, both exhibits rated higher than

Game 2 of the 2017 WS: HOU 7, LA 6 in 11 innings — WPA 5.30 (10-5-3-0)

Game 1 of the 2015 WS: KC 5, NYM 4 in 14 innings — WPA 6.36 (16-3-1-0)

1.7 weights the big plays too much

 

Checking 1.6, both test games rated higher than

Game 6 of the 1986 WS: NYM 6, BOS 5 in 10 innings — WPA 5.14 (16-3-3-0)

Game 6 of the 1986 NLCS: NYM 7, HOU 6 in 16 innings — WPA 5.80 (11-3-2-0)

1.6 weights the big plays too much

 

Checking 1.5,

the 1985 game rated higher than

Game 6 of the 1986 WS: NYM 6, BOS 5 in 10 innings — WPA 5.14 (16-3-3-0)

The 1988 game rated higher than

Game 4 of the 2001 WS: NYY 4, ARI 3 in 10 innings — WPA 4.58 (10-3-2-0)

1.5 weights the big plays too much, but it’s getting hard to find clear mistakes

 

Checking 1.4,

the 1985 game rated higher than

Game 3 of the 1976 NLCS: CIN 7, PHI 6 in 9 innings — WPA 4.72 (14-3-2-0)

Lead changes in the 7th, 8th and 9th innings.

The 1988 game rated higher than

Game 4 of the 1986 ALCS: CAL 4, BOS 3 in 11 innings — WPA 4.64 (7-4-2-0)

1.4 weights the big plays too much, but I’m now splitting hairs

 

Checking 1.3, I like this one. Let me check 1.2

 

Checking 1.2,

the 1985 game rated lower than

Game 2 of the 1996 ALDS: NYY 5, TEX 4 in 12 innings — WPA 5.02 (8-2-0-0)

Game 2 of the 1990 WS: CIN 5, OAK 4 in 10 innings — WPA 4.50 (10-2-0-0)

1.2 weights the big plays too little. Famous games are losing to games without any highlights.

 

So, I think 1.3 is the sweet spot.

My rating score is = Sum((change in probability between plays)^1.3) *2

The *2 at the end is purely cosmetic. It allows the very best game to score close to ten.
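A minimal sketch of that rating, again assuming the input is a list of per-play changes in win probability and that each change is taken as an absolute value. The function name and the two sample games are hypothetical, built only to show two games with the same base WPA separating once the power is applied:

```python
def power_rating(swings, power=1.3, scale=2.0):
    """Sum of |change| ** power over all plays, times a cosmetic scale factor."""
    return scale * sum(abs(s) ** power for s in swings)

# Two made-up games with the same base WPA of 2.0:
steady_game = [0.05] * 40                  # forty small swings, no highlights
highlight_game = [0.04] * 30 + [0.4, 0.4]  # quieter overall, but two huge plays

print(round(power_rating(steady_game), 2))
print(round(power_rating(highlight_game), 2))
```

With power=1.0 and scale=1.0 the function collapses back to the plain sum, which is a quick way to sanity-check it against base WPA.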

 

With base WPA, Gibson’s homer (.87) is worth about 25x a normal play (.035). With WPS it’s worth about 75x a normal play. Raising all the plays to the 1.3 power means that Gibson’s homer is now worth about 65x a typical play.

With base WPA, Clark’s homer (.74) is worth about 21x a normal play (.035). With WPS it’s worth about 42x a normal play. Raising all the plays to the 1.3 power means that Clark’s homer is now worth about 53x a typical play.

With a little algebra,

WPA:  Gibson = 1.18 * Clark

WPS: Gibson = 1.76 * Clark

Power 1.3: Gibson = 1.23 * Clark
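The arithmetic behind those three ratios can be reproduced in a few lines. This sketch assumes the modified-WPS multipliers are 3x for Gibson (a top-three play that was also the last play) and 2x for Clark (a top-three play only), which is how the 75x and 42x figures above work out:

```python
GIBSON, CLARK, TYPICAL = 0.87, 0.74, 0.035  # win-probability swings quoted in the text
POWER = 1.3

wpa_ratio = GIBSON / CLARK
# ASSUMPTION: Gibson counted 3x (top-three + last play), Clark 2x (top-three only)
wps_ratio = (3 * GIBSON) / (2 * CLARK)
power_ratio = (GIBSON ** POWER) / (CLARK ** POWER)

print(f"WPA:       Gibson = {wpa_ratio:.2f} * Clark")
print(f"WPS:       Gibson = {wps_ratio:.2f} * Clark")
print(f"Power 1.3: Gibson = {power_ratio:.2f} * Clark")

# Size of each homer relative to a typical play under the power transformation
gibson_vs_typical = (GIBSON / TYPICAL) ** POWER
clark_vs_typical = (CLARK / TYPICAL) ** POWER
print(f"Gibson: {gibson_vs_typical:.0f}x typical, Clark: {clark_vs_typical:.0f}x typical")
```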

A nice property of the transformation is that when the change in odds doubles, the play is worth about two and a half times as much (2.46x)

 

EXCITEMENT IS NOT LINEAR

 

A 10% play is now worth 2.46 times as much as a 5% play

A 20% play is now worth 2.46 times as much as a 10% play

A 50% play is now worth 2.46 times as much as a 25% play
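The ratio is the same at every size because the starting value cancels out. As a one-line check, for any swing of size x:

```latex
\frac{(2x)^{1.3}}{x^{1.3}} = 2^{1.3} \approx 2.46
```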

The system has a single parameter applied to ALL plays, so a game isn’t screwed if it has four great plays or the best play comes in the 8th inning. Ranking games this way, here are the five games better than, and worse than, my two test cases.

 

Series         Road team     Home team     Inn.  (WPA^1.3)*2  WPA   Top play  # Plays  P>=.1  P>=.2  P>=.3  P>=.5
2014 ALCS G1   Royals 8      Orioles 6     10    5.00         5.14  35.0%     96       13     3      2      –
1935 WS G3     Tigers 6      Cubs 5        11    4.97         5.02  36.0%     96       15     5      1      –
1976 NLCS G3   Phillies 6    Reds 7        9     4.95         4.72  46.0%     82       14     3      2      –
2015 ALDS2 G2  Rangers 6     Blue Jays 4   14    4.93         5.46  37.0%     115      7      2      1      –
1997 ALCS G4   Orioles 7     Indians 8     9     4.92         4.92  38.0%     88       16     4      1      –
1985 NLCS G6   Cardinals 7   Dodgers 5     9     4.92         4.48  74.0%     83       9      4      2      1
1975 NLCS G3   Reds 5        Pirates 3     10    4.88         4.52  55.0%     81       14     3      3      1
1933 WS G4     Giants 2      Senators 1    11    4.87         4.94  55.0%     92       9      3      1      1
2011 ALCS G2   Tigers 3      Rangers 7     11    4.86         5.10  34.0%     92       13     3      1      –
2012 ALDS2 G2  Athletics 4   Tigers 5      9     4.86         4.86  41.0%     85       11     4      1      –
1999 NLCS G6   Mets 9        Braves 10     11    4.85         5.12  26.0%     108      14     3      –      –

 

 

Series         Road team     Home team     Inn.  (WPA^1.3)*2  WPA   Top play  # Plays  P>=.1  P>=.2  P>=.3  P>=.5
1952 WS G5     Dodgers 6     Yankees 5     10    4.51         4.70  44.0%     92       10     4      1      –
1923 WS G1     Giants 5      Yankees 4     9     4.51         4.54  40.0%     78       12     2      2      –
1984 NLCS G4   Cubs 5        Padres 7      9     4.51         4.54  37.0%     83       10     4      2      –
1992 WS G2     Blue Jays 5   Braves 4      9     4.50         4.40  65.0%     85       11     1      1      1
1998 ALCS G2   Indians 4     Yankees 1     12    4.48         4.78  33.0%     96       11     3      1      –
1988 WS G1     Athletics 4   Dodgers 5     9     4.47         3.98  87.0%     82       5      2      2      1
2000 NLCS G2   Mets 6        Cardinals 5   9     4.46         4.66  32.0%     91       13     3      2      –
2016 NLDS2 G5  Dodgers 4     Nationals 3   9     4.46         4.66  21.0%     84       14     1      –      –
1977 WS G1     Dodgers 3     Yankees 4     12    4.45         4.80  30.0%     97       11     2      1      –
1954 WS G1     Indians 2     Giants 5      10    4.43         4.74  29.0%     89       11     1      –      –
1958 WS G1     Yankees 3     Braves 4      10    4.43         4.56  40.0%     88       10     3      2      –

 

 

I hope you’ll look at these and see that while they have different shapes, they all contain a similar ‘volume’ of excitement.

Another way to evaluate the method is to look at games with the same WPA. Going back to where I began in this article, here are the seven games with a base WPA of 4.52 (No promises that BBRef has not revised the scores since I captured the data…). They are each tied for the 108th highest WPA. But after using the 1.3 power factoring, you get this:

Game           Outcome                Rank  (WPA^1.3)*2  WPA   # Plays  Top 5 plays  # Plays 30-70%  P>=.1  P>=.2  P>=.3  P>=.5
1960 WS G7     Pit 10 NYY 9 in 9      52    5.10         4.52  77       1.74         25              15     4      3      1
1975 NLCS G3   Cin 5 Pit 3 in 10      63    4.88         4.52  81       1.60         49              14     3      3      1
1911 WS G3     A’s 3 Giants 2 in 11   110   4.41         4.52  86       1.10         58              15     3      1      –
1998 NLCS G1   SD 3 Atl 2 in 10       117   4.36         4.52  84       1.10         59              11     2      1      –
2011 NLDS2 G5  Ari 3 Mil 2 in 10      119   4.35         4.52  85       1.05         71              13     2      1      –
1926 WS G5     NYY 3 StL 2 in 10      130   4.20         4.52  86       0.84         66              16     1      –      –
1995 NLCS G2   Atl 6 Cin 2 in 10      139   4.12         4.52  95       0.75         70              13     –      –      –

 

1960 gets the love it deserves, moving up 56 spots to the 52nd best game. That is despite having the fewest plays in the 30%-70% victory range. Games with more plays do worse, since that means they have smaller-impact plays on average. Think of the Top 5 plays as the highlight reel for the game. 1995 NLCS Game 2 has no play >0.2 and therefore drops 31 spots in the rankings.

Adjusted WPS? Weighted WPS? Power WPS? I really do need to give it a proper name.

 

A Final example, from among the greatest Playoff games ever.

2000 NLDS G3: Mets 3, Giants 2 in 13 innings — ModWPS Rank = 11, PowerWPS Rank = 22

1986 ALCS G5: Red Sox 7, Angels 6 in 11 innings — ModWPS Rank = 22, PowerWPS Rank = 12

1980 NLCS G5: Phillies 8, Astros 7 in 10 innings — ModWPS Rank = 25, PowerWPS Rank = 14

 

The 2000 game had the higher WPS, partly because it had more plays. ModWPS likes it more due to the additional action and walk-off homer, which the better top-three plays in 80/86 could not overcome.

 

year        WPS      Plays      Last Play    Top-3     ModWPS

2000       6.34        109         0.42          0.98                 7.74

1986       5.86       97           0.05             1.42                 7.33

1980       6.06        93           0.04            1.11                 7.21

 

So why do I think 1986/1980 are better?

Because, the deeper you go beyond the top three, the better the other two are revealed to be.

 

                        2000             1986             1980
Sum of Top-5 Plays      1.28             1.94             1.61
Top-5 Plays             42-31-25-16-14   73-35-34-32-20   40-38-35-26-24
Sum of Top-10 Plays     1.88             2.77             2.43
10%-20%-30%-50% plays   16-3-2-0         14-5-4-1         17-6-3-0

 

Or simply check the line scores.

2000

0 0 0 2 0 0 0 0 0 0 0 0 0 (2) Giants

0 0 0 0 0 1 0 1 0 0 0 0 1 (3) Mets

1986

0 2 0 0 0 0 0 0 4 0 1         (7) Red Sox

0 0 1 0 0 2 2 0 1 0 0         (6) Angels

1980

0 2 0 0 0 0 0 5 0 1             (8) Phillies

1 0 0 0 0 1 3 2 0 0             (7) Astros

 

The 2000 game IS a fabulous game. But the 1986 and 1980 games are more epic, with all the late-inning heroics. The 2000 game has exactly the required three big plays and the walk-off. It checks all the boxes.

I do kinda feel bad writing this. It sounds like I’m just picking on modified WPS here. LOOK AT WHAT ELSE IT GOT WRONG…

But as I said before, Power WPS is barely better. And to show that it’s better at all, I need to show those rare cases where it makes a better call. And it was an excellent benchmark: comparing differences between it and my sixty-eleven schemes helped me identify the flaws in sixty-ten of them.

Of course, even this is not the perfect system. Any play-by-play method will still fail to capture the in-play action. A bases-empty foul pop-out rates exactly the same as a bases-empty thrown-out-at-home-trying-to-stretch-a-triple. But it is the best we can do for now.

Whereas I used to guess my line score method captured maybe 70% of the excitement of a game, PBP ratings must be capturing upwards of 90%. Which means greater confidence in game rankings and playoff series ratings.

Anyway, if anyone has any thoughts, feedback, or questions I’d love to hear them. If no one can shoot the idea full of holes, or even one hole, then comes ranking and lists of games and series.


Let’s Find the Giants 88 Wins

We find ourselves in the midst of an exceptionally intriguing offseason. Rarely is there an opportunity to acquire a prior year’s MVP and remain in position to nab the number-two asset on the market: Shohei Ohtani. Given Ohtani’s decision to forego a contract that syncs up with his open-market value when he turns 25, he’ll hold a Black Friday-esque price-tag when posted. Virtually any team in baseball can make a play to acquire the former star from the Hokkaido Nippon-Ham Fighters, regardless of wallet size. That makes this particular campaign for a generational talent so intriguing.

Whether your team meets Ohtani’s duo of wants — independent of a passing grade on his questionnaire — is another story.

The San Francisco Giants are in a precarious position heading into 2018. Coming off a 64-win season, the lowest win total for their franchise since 1994, and the lowest of Bruce Bochy’s tenure by seven games, a rebound seems imminent. The current state of their roster, however, casts doubt on how relevant a rebound can make their team.

So, I sent out a tweet entertaining the possibility that one team lands the two biggest names of the offseason.

A little bit of mental math brought my over/under to 87.5 wins. Imprecise? Sure, but only three times since 2014 has one team improved on their prior year win total by 24 or more games: the Minnesota Twins (2016 to 2017, +26 wins), Arizona Diamondbacks (2016 to 2017, +24 wins), and Chicago Cubs (2014 to 2015, +25 wins). Whether a signal or mere noise, each of those improvements came without lavish acquisitions during winter (I used my subjective definition of “lavish”). Each was propelled to relevance by internal talent (Buxton/Sano, Ray/Godley, Arrieta/Bryant, etc.), superb management, and other favorable nods from the Baseball Gods. Each of the 29 responses to my poll came with three elements of consideration: Ohtani, Stanton, and everything else.

Ohtani

The pitching side of Ohtani’s value is interesting. ZiPS and Dan Szymborski were the first to throw their hat in the ring, giving Ohtani a 3.55 ERA over 139 innings of work, with 161 strikeouts, and a walk rate of 3.9 BB/9. It’s lukewarm, considering the hype around Ohtani and knowledge of his sub-1.1 WHIP over in the NPB. Do I agree with it? Not from a control standpoint, but we can work with it and my disagreement isn’t dismissal of a labor-intensive statistical model’s projection.

Taking the three essential components of FIP (walks, strikeouts, and homers), and our knowledge that pitcher fWAR is derived from FIP, we can backtrack from Ohtani’s ZiPS projection in an anti-statistician kind of way. By comparing Ohtani’s per-nine peripherals to last year’s performers, we can infer his fWAR might be around 3.0 as a pitcher in 2018 (139 IP, 10.4 K/9, 3.9 BB/9, 1.0 HR/9). This ZiPS and fWAR magic says he’ll be slightly worse than 2017 Brad Peacock (that was a weird sentence to write).
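For anyone who wants to see the backtracking spelled out, here is a rough sketch of how the per-nine peripherals turn into a FIP-style number. It uses the standard public FIP formula; the constant of 3.16 is my assumption for something close to the 2017 league value, hit batters are ignored because the projection line above doesn't list them, and the function names are made up for illustration. It is not how ZiPS or FanGraphs actually computes anything.

```python
# Rough FIP sketch from the projected line quoted above.
# Assumptions: FIP_CONSTANT is an approximate league constant, and
# hit-by-pitches are treated as zero because the projection omits them.
FIP_CONSTANT = 3.16


def per_nine(count, innings):
    """Convert a raw count into a per-nine-innings rate."""
    return 9.0 * count / innings


def from_per_nine(rate, innings):
    """Convert a per-nine-innings rate back into a raw count."""
    return rate * innings / 9.0


def fip(homers, walks, strikeouts, innings, hbp=0.0, constant=FIP_CONSTANT):
    """Standard FIP formula: (13*HR + 3*(BB + HBP) - 2*K) / IP + constant."""
    return (13.0 * homers + 3.0 * (walks + hbp) - 2.0 * strikeouts) / innings + constant


innings = 139.0
strikeouts = 161.0
walks = from_per_nine(3.9, innings)   # 3.9 BB/9 from the projection
homers = from_per_nine(1.0, innings)  # 1.0 HR/9 from the projection

k9 = per_nine(strikeouts, innings)
estimate = fip(homers, walks, strikeouts, innings)

print(f"K/9: {k9:.1f}")
print(f"FIP estimate: {estimate:.2f}")
```

Under those assumptions the estimate lands in the same neighborhood as the projected 3.55 ERA, which is all the eyeball comparison to last year’s pitchers needs.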

Ohtani’s potential 3.0 fWAR is backed up when you look at his 2016 in the NPB. The righty posted 137 1/3 innings of work, with a 9.2 K/9, 2.9 BB/9, and a HR/9 just north of 1.0. This gives Ohtani something slightly better than Jose Berrios’ 2.8 fWAR 2017 campaign (an equally weird sentence to write).

Value for Ohtani with his bat on the Giants, a team obviously without a DH, is where confusion starts.

I want to keep this as simple as possible. It’s unlikely that he goes to the NL if contributing significantly on the mound and in the box are his main goals. The inherent risk for the lottery-winning club would be too high, and uncertainty around whether Ohtani would prefer such a role is an equally large factor. Travis Sawchik breaks Ohtani’s NL hitting value down better than I ever could, so I’ll only give you the product of his analysis.

Ohtani could have about 1.6 fWAR as a hitter. This is composed of 1.1 fWAR in his standard pitcher plate appearances, plus another .5 fWAR from regular pinch-hitting chances (emphasis on the word “regular”).

In total, we have a 4.6 fWAR player in Shohei Ohtani in the National League: our 3.0 fWAR on the mound and an aggressive, but feasible, 1.6 fWAR in the box.

To find 88 wins for the Giants that my poll responders believe in, we need to start somewhere. It’s too easy to begin at a projection already circulating for the Giants’ 2018 win total, so I’ll make this hard for myself to execute, and likely, for you to rationalize. Let’s start with those 64 hard-fought wins Bochy’s squad scratched and clawed their way to. We’ll work backwards from there.

64 wins, plus the roughly five we’re going to attribute to Ohtani, brings us to 69.

Stanton

Now onto Stanton.

Eno Sarris, a familiar name to many, looked through the surplus value on a trade that would send Stanton to the Bay Area. The names included in that analysis revolve around the following:

To SF: Stanton, Dee Gordon

To MIA: Joe Panik, Tyler Beede, Chris Shaw

We don’t have confirmation this would be the package, but I remain adamant Miami wants contract relief more than anything. Centering an offer around the eight FanGraphs wins above replacement (fWAR) Panik has accumulated in his career feels like a proper balancing of sides, given how much money the Giants would take on in a scenario like this. Whether Stanton opts out or stays through the length of his contract muddies just how much money the Giants, or any team, will tie up through 2027. Although it seems like a risk teams are willing to take, how that opt-out risk factors into offerings is another confounding input.

However, Stanton’s value to teams from a performance standpoint is less cloudy than his monetary value. He’s good. Very good. Completing two 6-fWAR seasons before turning 28 is a desirable trait for any player. One of the first projections kicking around, FanGraphs’ Steamer, holds Stanton somewhat steady with his torrid 2017: 5.3 fWAR, buoyed by another 45+ homer season and a wRC+ that holds up to his career standard. I have little objection to this, even if worry consumes you that a healthy season for Stanton was an anomaly.

Ohtani brought us to 69 wins and now Stanton will take us north of the only number above 15 anybody is ever excited to see. We’re at 74 for the Giants by taking WAR and interpreting it as literal wins, something I probably shouldn’t do given the debate the industry just had, but I’ll test my luck.

Everything else

This subheading encompasses a lot of assumptions. When my tweet asked my loyal followers to quickly gauge whether the Giants could get above 87.5 wins, “everything else” covered anything from a (hopefully) full season of good Madison Bumgarner and paying a priest to rid the Giants’ clubhouse of bad juju, to a minor investment in separate baseballs juiced specifically for AT&T Park.

We could venture another 1,000 words on the improvements of San Fran, but there are far more qualified Giants fans on this website and others (shoutout to Grant Brisbee at McCovey Chronicles) who have surely detailed this difference with more care and a deeper knowledge of the Giants’ issues and internal fixes.

Cutting to the chase, let’s make a simple push to the 88-win mark. FanGraphs’ depth-chart projections currently have the Giants as a 78-win team. That’s 14 wins better than 2017. It is also exactly what we need to go from 74 wins to 88.

Sometimes, things work out better than anybody could have ever planned.

We found our 88 wins.
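If you’d like to retrace the running tally, here is a minimal sketch of the arithmetic used in this post. Every input is a number quoted above; the variable names are mine, and rounding each player’s fWAR to whole wins is my reading of the “roughly five” shortcut, not a rigorous method.

```python
# Back-of-the-envelope tally, treating fWAR as literal wins.
# All inputs are figures quoted in the article; the names are illustrative.
BASELINE_WINS_2017 = 64
OHTANI_PITCHING_WAR = 3.0
OHTANI_HITTING_WAR = 1.1 + 0.5   # standard pitcher PA plus regular pinch-hitting
STANTON_WAR = 5.3                # Steamer projection
DEPTH_CHART_WINS = 78            # FanGraphs depth-chart projection
OVER_UNDER = 87.5

ohtani_war = OHTANI_PITCHING_WAR + OHTANI_HITTING_WAR

# Round each player's fWAR to whole wins, as the running tally does.
after_ohtani = BASELINE_WINS_2017 + round(ohtani_war)
after_stanton = after_ohtani + round(STANTON_WAR)

# "Everything else" is the gap between the projection and the 2017 record.
everything_else = DEPTH_CHART_WINS - BASELINE_WINS_2017
total_wins = after_stanton + everything_else

print(f"After Ohtani: {after_ohtani}")
print(f"After Stanton: {after_stanton}")
print(f"Everything else: +{everything_else}")
print(f"Total: {total_wins} (clears the over/under: {total_wins > OVER_UNDER})")
```

Swapping in a different projection for either player, or a different baseline, only takes changing the constants at the top.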

The only thing I’m left wondering is whether my tweet and over/under projection at 87.5 inspired hopes of 90-plus-win seasons in voters’ minds. If Bochy & Co. can accomplish that feat without even one of Ohtani or Stanton, I commit to paying the shipping fee for Bochy’s Manager of the Year Award.

A version of this post can be found on my site, BigThreeSports.com.