Archive for August, 2017

The Kia Tigers Are Doing Everything Right — Except on the Weekends

The Kia Tigers are doing a lot of things right. At 64-34-1, they are in first place in the Korea Baseball Organization, with a comfortable five-game lead over the second-place NC Dinos. As a team, they are slashing a cumulative .306/.375/.479, and they rank first or second in the KBO in virtually every offensive category.

category     H     2B   3B  HR   R    K    BB   AVG    OBP    SLG    wRC+
2017 Kia     1092  213  24  120  658  620  356  0.306  0.375  0.479  116.9
league rank  1     1    2   3    1    1    2    1      1      1      2

But the emphasis on offensive firepower has not come at the expense of pitching; while Kia’s hurlers are not dominating the league the way their hitters are, their pitching staff ranks first in the KBO in WAR (15.8), and has above-average marks in ERA+ (105) and FIP+ (105.6). This is a solid, well-rounded team.

However, Kia has had one major flaw throughout the season: They play significantly worse on the weekends.

The KBO schedule is set up such that each team plays two three-game series per week, one from Tuesday to Thursday and one from Friday to Sunday. Throughout the 2017 season, Kia’s pitchers and batters alike have performed significantly worse in the second series. The effect is most noticeable on the hitting side, where production drops precipitously in Friday-to-Sunday games.

The table below shows the batting splits for the top-10 Kia hitters (by plate appearances), as well as the team as a whole, and clearly shows the distinction between the mid-week and weekend series. From Tuesday to Thursday, Kia hits like, well, Kia. But from Friday to Sunday, Kia’s cumulative batting line is comparable to that of the Lotte Giants and Samsung Lions, who are in seventh and eighth place, respectively.

Kia Tigers 2017 time of week batting splits, descending by △OPS

                        weekday        weekend        difference
pos   hitter            AVG    OPS     AVG    OPS     △AVG    △OPS
LF    Choi Hyoung-woo   0.440  1.373   0.290  0.883   -0.150  -0.490
SS    Kim Seon-bin      0.475  1.135   0.284  0.701   -0.191  -0.434
1B    Kim Ju-chan       0.361  0.986   0.192  0.555   -0.169  -0.431
3B    Lee Beom-ho       0.308  0.979   0.250  0.781   -0.058  -0.198
CF    Roger Bernadina   0.341  1.004   0.301  0.865   -0.040  -0.139
2B    An Chi-hong       0.333  0.953   0.317  0.822   -0.016  -0.131
DH    Na Ji-hwan        0.327  0.925   0.284  0.954   -0.043   0.029
1B    Seo Dong-wook     0.286  0.778   0.311  0.863    0.025   0.085
RF    Lee Myeong-gi     0.303  0.797   0.370  0.884    0.067   0.087
team  Kia Tigers        0.335  0.935   0.273  0.768   -0.062  -0.167

This stark difference in team performance has been borne out in the team’s record. In Tuesday-to-Thursday games, Kia is 41-9, an .820 winning percentage, or a 118-win pace over a full 144-game season. For comparison, the KBO single-season wins record is 93, set by the 2016 Doosan Bears, and the 90-win mark has only been eclipsed one other time, when the now-defunct Hyundai Unicorns won 91 in 2000.

In Friday-to-Sunday games, however, Kia is 23-25-1, a .469 winning percentage, or a 68-win pace. If Kia had played .469 ball all season, they would slot in at eighth in the standings between, guess who, Lotte and Samsung.
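For readers who want to check the pace numbers, here is a minimal sketch of the arithmetic. It assumes ties count as games played but not as wins, which is what reproduces the .469 figure for the 23-25-1 weekend record. The function names are mine, not anything official.

```python
# Winning percentage and full-season pace from a win-loss-tie record.
# Assumption: ties count in the denominator but not as wins.
SEASON_GAMES = 144  # KBO regular-season length cited in the text


def win_pct(wins, losses, ties=0):
    """Share of games won, with ties counted as games played."""
    return wins / (wins + losses + ties)


def season_pace(wins, losses, ties=0, games=SEASON_GAMES):
    """Wins over a full season if the current winning percentage held."""
    return win_pct(wins, losses, ties) * games


weekday = (41, 9, 0)   # Tuesday-to-Thursday record
weekend = (23, 25, 1)  # Friday-to-Sunday record

for label, record in (("Tue-Thu", weekday), ("Fri-Sun", weekend)):
    print(f"{label}: {win_pct(*record):.3f} pct, {season_pace(*record):.0f}-win pace")
```

The two records also add up to the overall 64-34-1 mark, which is a handy sanity check on the split.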

There are no clear reasons for this drop-off. Kia’s schedule has been fairly balanced between the weekday and weekend series, and they have faced good and bad teams alike. Other teams have some variation between weekday and weekend, but there is no league-wide trend toward weaker weekends, and especially no performance gaps as severe as Kia’s.

However, as Kia is still well in control of the 2017 KBO standings, and performing well overall, this weekend drop-off stands as more of a curiosity than an actual problem. Perhaps it actually makes the team even scarier; despite running roughshod over the rest of the league, the Kia Tigers still have room to improve.


Understanding Roger Bernadina’s KBO Rebirth

A lot of things have clicked for the Kia Tigers this season, chief among them their offense’s record production. Kia’s fearsome lineup features three of the Korea Baseball Organization’s top-10 hitters by batting average and five of the top 20 by wRC+, and it is a driving force behind the team’s domination of the standings: Kia sits in a comfortable first place at 64-34-1, five games up on the second-place NC Dinos.

A major force behind the dominance of the Kia offense has been the unexpected emergence of their new center fielder Roger Bernadina, in his first season in the KBO. Just a season ago, Bernadina was toiling in the minor leagues, playing with the Las Vegas 51s, the New York Mets’ Triple-A affiliate.

The difference between the old Bernadina, a failed prospect who played seven partial seasons in Major League Baseball, mostly with the Washington Nationals, and the current Bernadina, who hits leadoff for the Kia Tigers’ offensive juggernaut, is stark.

Roger Bernadina career stats, 2008-2017

league  years    G    AVG    OBP    SLG    wRC+  WAR
MLB     2008-14  548  0.236  0.307  0.354  81    1.2
KBO     2017     95   0.320  0.383  0.551  135   3.9

In less than a fifth of the games played, Bernadina has already accumulated over three times his MLB WAR and hit over half as many home runs (19 in the KBO to 28 in MLB). By wRC+ he has been the 16th most productive player in the KBO this season, and by WAR, he has been the 6th best position player in the league. On Thursday night he hit for the cycle, becoming only the third foreign player to do so in the KBO. Quite a jump for someone who was a career 81 wRC+ hitter in MLB.

Which of course raises the question: What’s changed? In less than a season, how has Roger Bernadina improved this much?

It isn’t plate discipline; Bernadina is actually walking slightly less (7.7 percent in the KBO versus 8.2 percent in the MLB) and swinging more (50.3 percent vs 42.1 percent). His strikeouts are down from 21.3 percent in the MLB to 17.4 percent in the KBO, but that change may be more a function of the leagues themselves (the MLB’s higher overall K% means Bernadina’s mark is about league average in both leagues) than any adjustment Bernadina himself has made.

Bernadina also still profiles as the same type of hitter, hitting a majority of his batted balls on the ground, with a moderate preference to pull. He never displayed particularly drastic platoon splits, hitting roughly the same against lefties and righties, and this tendency is also unchanged. Though his batted-ball characteristics would have made him a reasonable shift candidate, shifts were almost never employed against him in the MLB, so his increased numbers in the KBO are also not the result of the KBO’s relative lack of defensive shifts.

The biggest difference is the change in Bernadina’s batting average on balls in play. His current KBO BABIP is .353, a drastic increase from his career MLB BABIP of .288.
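For anyone unfamiliar with the stat, BABIP is usually computed as (H - HR) / (AB - K - HR + SF). Below is a small sketch of that standard formula. The counting stats in the example are made up for illustration and are not Bernadina’s actual line.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, for illustration only.
example = babip(hits=120, home_runs=19, at_bats=380, strikeouts=70, sac_flies=4)
print(f"BABIP: {example:.3f}")
```

Home runs and strikeouts drop out of both sides of the ratio, which is why a hitter can post a big BABIP swing without changing his power or contact rates.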

On one hand, Bernadina profiles as the type of hitter that might naturally run a higher BABIP. He runs well, having rated as a positive baserunner and base-stealer in both his time in MLB (59 steals, 83% success rate, 8.9 BsR) and the KBO (21 steals, 81% success), and the fact that he is primarily a ground-ball hitter should give him ample opportunity to take infield hits and run a higher BABIP.

However, his track record suggests otherwise. BABIP is a statistic that takes a long time to stabilize, and as such his career average is more indicative of him as a player than his current 2017 outlier mark. With no other changes in batted-ball profile or batting approach, Bernadina’s increased BABIP, and by extension his increased offensive production, is more likely the result of fortunate circumstances and luck than of any real change in skill.

That being said, simply acknowledging that Bernadina has been lucky this season does not diminish his performance. Regardless of whether he is performing to his expected outcomes or not, he has been a productive member at the top of the Kia Tigers’ lineup and, perhaps even more interestingly, has hit better as the season has progressed.


Analyzing the Big Boys’ Team Peripherals

There are a few really good teams this year. The Dodgers and Astros are destroying their divisions, although the Astros have been slowed a little by injuries. The Nats are also really good despite their bullpen struggles, which they tried to fix with a few trades. There are also the Red Sox and the Yankees, who started really well. I will also include last year’s World Series finalists, the Cubs and Indians, who had a mediocre first half but really turned it on in July and on paper have very strong teams that should compete with the other top teams.

Let’s start with hitting. I used wRC+, K%, BB%, ISO, my own K%-BB%-ISO, xwOBA and BABIP.

team     wRC+  K%    BB%   ISO    K%-BB%-ISO  BABIP  xwOBA
Astros   129   17.3  8.2   0.211  -0.120      0.317  0.336
Dodgers  111   22.5  10.6  0.194  -0.075      0.307  0.332
Nats     108   20.4  8.8   0.199  -0.083      0.315  0.329
Yankees  108   22.6  9.8   0.183  -0.055      0.308  0.328
Indians  107   18.3  9.8   0.178  -0.093      0.300  0.330
Cubs     98    21.8  9.8   0.188  -0.068      0.286  0.318
Boston   92    18.6  9.1   0.144  -0.049      0.304  0.314
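The combined stat is not spelled out in the text, but the table values are consistent with taking K% minus BB% as a fraction and then subtracting ISO, so lower (more negative) is better. Here is a minimal sketch under that assumption. The formula is inferred from the numbers above, not a published definition.

```python
# (K%, BB%, ISO) per team, copied from the hitting table.
teams = {
    "Astros": (17.3, 8.2, 0.211),
    "Dodgers": (22.5, 10.6, 0.194),
    "Nats": (20.4, 8.8, 0.199),
    "Yankees": (22.6, 9.8, 0.183),
    "Indians": (18.3, 9.8, 0.178),
    "Cubs": (21.8, 9.8, 0.188),
    "Boston": (18.6, 9.1, 0.144),
}


def k_bb_iso(k_pct, bb_pct, iso):
    """Inferred formula: (K% - BB%) as a fraction, minus ISO. Lower is better."""
    return (k_pct - bb_pct) / 100 - iso


# Print teams from best (most negative) to worst.
for team, stats in sorted(teams.items(), key=lambda kv: k_bb_iso(*kv[1])):
    print(f"{team:8s} {k_bb_iso(*stats):+.3f}")
```

The appeal of a stat like this is that it rewards contact, walks and power in one number while ignoring BABIP entirely.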

The Astros have clearly been the best hitting team. Their BABIP might regress some, but they also lead in xwOBA, K-BB-ISO, ISO and contact. Behind them, the Dodgers, Yankees, Nats and Indians form a group that is pretty close together by all the stats. The Cubs are clearly behind in wRC+, but they also have the lowest BABIP at .286, which may be due for some positive regression. The Statcast-based xwOBA suggests that it was not all bad luck, but in ISO, contact and my combined stat, only the Astros are clearly superior to them. The Cubs also have a July wRC+ of 113, which is fourth of the group behind the Astros (152), Dodgers (120) and Indians (115), with a not outrageous .307 BABIP.

Boston, however, clearly is the last team out of the bunch. They are last by xwOBA, wRC+, K-BB-ISO and ISO. Contact, walks and BABIP are OK, but Boston this year simply doesn’t have power; their ISO is actually second-last in MLB.

If we want to create tiers of hitting, the Astros are alone in their own tier. Of course, half of their lineup is currently on the DL, but hopefully that changes and then they should be number one again. After them, there is a tier made up of the Nats, Dodgers, Indians and also the Cubs. The Cubs have been clearly worse in results and are closer to Boston in that regard, but their lowish BABIP, their preseason projections and their very good last month, as well as their peripherals, make me grade them on par with the other non-Astros teams. All those teams are pretty close in talent and should be projected for about a 105-110 wRC+ the rest of the way.

And then there is Boston, clearly the worst out of the pack no matter how you slice it.

Then there is pitching:

team        xwOBA  FIP   K-BB
Dodgers     0.273  3.42  18.7
Houston     0.295  3.88  18.3
Yankees     0.297  3.86  17.0
Washington  0.301  4.10  15.8
Boston      0.306  3.80  17.5
Cubs        0.309  4.21  14.1
Indians     0.304  3.61  19.2

Here the Dodgers clearly lead the field, being first in xwOBA against and FIP and second in K-BB. After them, we have the Indians, who are weaker in xwOBA but second in FIP and first in K-BB, and Houston, who is second in xwOBA, third in K-BB and fifth in FIP. Boston is third in FIP and fourth in K-BB, and the Yankees are in the same tier. Washington and the Cubs are a bit weaker in that regard, but overall there are not huge differences, plus the Yankees and Cubs both made really big upgrades through recent trades.

Tiers here would be the Dodgers and Indians first, then Red Sox and Yankees and third the Nats and Cubs, but the recent trades might have moved the Cubs into the second tier and the Yankees into the first tier.

By xwOBA differential you have:

Dodgers  0.059
Astros   0.041
Yankees  0.031
Nats     0.028
Indians  0.026
Cubs     0.009
Red Sox  0.008

That puts the Dodgers ahead of the Astros, Yankees, Nats and Indians, with the Cubs and Red Sox last. I think, however, that with the recent trades and their hitters getting hot, the Cubs are pretty close to at least the Yankees, Indians and Nats and not that far off the big two from LA and Houston, especially because they excel in defense, which was not included in my analysis. Boston, unfortunately, I don’t see quite in that tier, especially if Price doesn’t come back strong.
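The differential is simply each team’s hitting xwOBA minus its xwOBA allowed, both taken from the two tables above. A short sketch that recomputes and ranks it (the dictionary names are mine):

```python
# xwOBA for hitters and xwOBA allowed by pitchers, copied from the tables above.
hitting_xwoba = {
    "Dodgers": 0.332, "Astros": 0.336, "Yankees": 0.328, "Nats": 0.329,
    "Red Sox": 0.314, "Indians": 0.330, "Cubs": 0.318,
}
pitching_xwoba = {
    "Dodgers": 0.273, "Astros": 0.295, "Yankees": 0.297, "Nats": 0.301,
    "Red Sox": 0.306, "Indians": 0.304, "Cubs": 0.309,
}

# Differential = what the hitters produce minus what the pitchers allow.
differential = {
    team: round(hitting_xwoba[team] - pitching_xwoba[team], 3)
    for team in hitting_xwoba
}

# Rank from largest to smallest differential.
ranked = sorted(differential.items(), key=lambda kv: kv[1], reverse=True)
for team, diff in ranked:
    print(f"{team:8s} {diff:.3f}")
```

Keep in mind that the gaps in the middle of the list are only a few thousandths, so the exact order there should not be read too strongly.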


Stealing Bases and Splitting the Rewards

The contextual revolution (don’t really know if that’s a thing, but it sounds official) emerged in the MLB the past few years, attempting to control for more situational effects than current sabermetric-driven baseball stats. These models build upon Bill James’s work, Tom Tango’s all-important linear weights, and similar metrics that account for league, park, and positional production.

Baseball Prospectus (BP) writers developed baseball statistics that further quantify performance using mixed models. You can find a good introduction to mixed models in this article written by Jonathan Judge, Harry Pavlidis and Dan Brooks of BP. In short, if you are familiar with linear or logistic regression, a mixed model estimates the average performance over the course of the season (the fixed part of the model) and uses the residuals (or error) to simultaneously quantify the contributions of the “random” participants in any given play. Now, why do I say random? It isn’t so much that these participants are random, but that the players involved are always changing and the number of “random” interactions they have throughout a season is endless, while the effect of, say, an 0-2 count on run production stays relatively consistent, or fixed, throughout a whole season.

Some existing baseball stats based on mixed models include:

  1. Called Strikes Above Average (CSAA) — defensive statistic that measures catcher framing skills controlling for the batter, pitcher, catcher, and umpire
  2. Swipe Rate Above Average (SRAA) — base running metric that attempts to quantify base stealing ability for batters, and stolen base prevention for pitchers and catchers
  3. Take Off Rate Above Average (TRAA) — player specific effects on base stealing attempts
  4. cFIP — a new version of Fielding Independent Pitching (FIP) taking into account many aspects of a plate appearance. Read more about it here.

By the title, you can probably guess this article is about stolen bases, and you are correct. Specifically, I will be discussing Swipe Rate Above Average, or SRAA for short. SRAA is derived from a mixed model that attempts to account for the inning, the stadium, the quality of the pitcher, and the pitcher, catcher, and lead runner involved. SRAA comes directly from a player’s random effect and is a single number, generally ranging from -10% to 10%, describing the additional probability a player contributes to a successful steal. For example, Mike Trout had a 4% SRAA in 2016: given the average stolen-base situation, Trout was 4 percentage points more likely to steal successfully than the average baserunner in 2016.

While SRAA accounts for pitcher skill using cFIP (See above link for more information), the quality of a pitcher can’t necessarily control for all variation in a pitcher’s pitch sequence or the occasional mistake in the dirt. Pitches in the dirt, pitch-outs, off-speed, and fastballs are treated equally in SRAA. Consequently, SRAA values may be lacking for runners that disproportionately get thrown out on pitch-outs or for catchers that consistently block balls in the dirt while still throwing out the runner.

Let’s explore some evidence of these effects before we include them in the pitch-adjusted (pSRAA) model. I started by subsetting Retrosheet play-by-play data from the 2016 season to only stolen-base attempts by lead runners. For example, events with a steal of second base with a man on third were not included. I only included situations where a pitch preceded a stolen-base attempt. I supplemented the play-by-play data with PITCHf/x data, which tracks the trajectory of every pitch in MLB. I aligned the pitch data with each stolen base, with minimal missing connections between the two data sets. Only three stolen bases did not have PITCHf/x data, since there technically wasn’t a pitch that occurred (e.g., steal of third, then steal of home on a passed ball). An additional eight did not have valid trajectory readings in PITCHf/x. I ended up with 2,809 total attempts. Excluding some of these stolen bases means, for those who are familiar with SRAA, that my SRAA numbers will not match up directly with BP’s numbers.

I first examined pitch speed and its effects on stolen-base percentage. It’s no surprise that, in 2016, runners succeeded more often on slower pitches.

Notice a slightly higher success rate for pitch speeds that fall above 95 mph. This phenomenon is not unique to 2016, and Jeff Sullivan hypothesized that good base-stealers are the ones stealing against fireballers. Indeed, while only 8% of stolen bases occur during a pitch that is 95 mph or higher, speedsters Billy Hamilton and Starling Marte attempted over 12% of their stolen bases in these situations. These situations tend to arise later (about one inning later on average) in closer games (stealing team is only .39 runs ahead rather than .46 runs ahead on average), meaning base-stealers ought to be more certain of success before attempting to steal.

In addition to pitch speed, we also have access to pitch location data through PITCHf/x. As you can see in the figure below, SB probability varies more drastically by location than by speed, so location is the more meaningful of the two pitch metrics. The results below mirror what I would expect. High SB probability along the right side of the plate for left-handed hitters confirms that most catchers (if not all) are right-handed, which makes it hard to throw over left-handed hitters. Similarly, catchers have more success with right-handed hitters and pitches closer to their throwing shoulder. And finally, the most obvious of all: It’s hard to throw a runner out when the ball hits the ground.

I also included the PITCHf/x pitch descriptions since they help improve the model slightly. Some descriptions occurred only a few times, so I combined them into larger categories:

  • Dirt: Ball in Dirt, Swinging Strike (Blocked)
  • Pitch-out: Pitch-out, Swinging Pitch-out
  • Strike/Ball: Ball, Called Strike
  • Swinging Strike: Foul Tip, Missed Bunt, Swinging Strike
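As a rough illustration of the grouping above, the sketch below maps each pitch description to its larger category. The description strings are copied from the list as written here, and the exact spellings in the raw PITCHf/x feed may differ. The "Other" fallback is my own assumption for descriptions that are not listed.

```python
# Category -> raw pitch descriptions, as listed above.
PITCH_GROUPS = {
    "Dirt": ["Ball in Dirt", "Swinging Strike (Blocked)"],
    "Pitch-out": ["Pitch-out", "Swinging Pitch-out"],
    "Strike/Ball": ["Ball", "Called Strike"],
    "Swinging Strike": ["Foul Tip", "Missed Bunt", "Swinging Strike"],
}

# Invert to a flat lookup: raw description -> category.
DESCRIPTION_TO_GROUP = {
    description: group
    for group, descriptions in PITCH_GROUPS.items()
    for description in descriptions
}


def pitch_group(description):
    """Return the combined category, or "Other" (assumed) if unlisted."""
    return DESCRIPTION_TO_GROUP.get(description, "Other")


print(pitch_group("Swinging Strike (Blocked)"))
print(pitch_group("Foul Tip"))
```

Collapsing rare descriptions this way keeps each category large enough for the model to estimate a stable effect.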

Below is a table detailing the SB success rates in each of the four groups. Dirt and Pitch-out are the most extreme categories, with “normal” pitches falling in between. Something that jumped out at me was the lower success rate on swinging strikes, as I would expect a swing to distract the catcher. Two explanations I can come up with are: 1) catchers tend to hold the no-swing pitches a split second longer to get the call from the ump, or 2) swinging pitches occur during a hit-and-run play, where runners tend to be less skilled at stealing bases.

Controlling for the lead runner’s base is the last addition I made to the original SRAA model. Adding this effect improved the model (AIC, to be specific), indicating runners stealing third were more likely on average to be successful than runners attempting to steal second and especially home. A likely explanation is that runners stealing third need to be more confident in their ability to steal in the current situation and have a right-handed hitter obstructing the catcher’s throw about 65% of the time.

So now that we have this new metric, pSRAA, let’s take a look at how it deviates from SRAA. As you can see in the figure below, the distributions of the two metrics are fairly similar.

pSRAA has a slightly tighter distribution for pitchers and runners, meaning pSRAA has absorbed some of the expected SB probability in these new variables and pushed pitcher and runner SB skills closer to the mean. This phenomenon occurs most likely because the variables we are trying to control for are largely out of control for these players and are not rectifiable or exploitable. By that, I mean pitchers can’t control whether the one pitch they throw in the dirt happens to coincide with a runner taking off, but catchers can use this event to prove their skill. While a pitcher “loses control” of the SB situation when the ball is released, a catcher can make a brilliant play, saving a potential wild pitch and converting it into an out. Thus, we see a wider variation in pSRAA for catchers, as pSRAA identifies the increasingly elite talent and the replacement players that struggle to nab runners on pitch-outs.

Examining how players’ metrics improved or worsened after controlling for these additional effects reveals some drastic changes, but mostly small adjustments. The figure below illustrates the change from the old metric to the new metric. The closer a player is to the dotted line (pSRAA = SRAA), the less that player deviated from the original SRAA measure. If a player ends up above this line, it means that pSRAA is higher than SRAA, so when controlling for pitches, pSRAA attributes more success (for runners — less success for pitchers and catchers) to their ability rather than luck.

How does this new pSRAA model help us as baseball fans or analysts? pSRAA can identify where SRAA was under- or overvaluing players’ skills. For example, SRAA undervalues catcher Chris Iannetta at a 0.86% SRAA, while pSRAA pegs him at a whopping -4.19% (negative is good for catchers)! In other words, Iannetta jumps from the 43rd percentile of catchers to the 70th percentile!

To give you an idea of the kind of adjustments pSRAA makes, here is a sample stolen-base attempt against Iannetta (video has no sound for those of you who are watching at work; for sound go to 1:51:40 here), specifically a SB attempt that the model predicts will happen 85.5% of the time. Actually, it is more like 88.4% if you account for the runner, Lorenzo Cain, the 15th-fastest baseball player according to Statcast’s speed measure.

Now let’s just freeze that frame. The ball is almost on the ground, and not to mention, only thrown at 80 mph, giving Cain almost an extra tenth of a second to get to second base. Regardless, Iannetta guns him out with an impeccable throw.

Not only can we use pSRAA to uncover insights such as the one above, but we can also abuse pSRAA to easily find awesome plays like this top-5 play. J.T. Realmuto, known for his unbelievable pop time, throws out Ben Revere on this gem of a play. The pSRAA model gives Realmuto a 10% chance of throwing out Revere, but Realmuto pops up in a staggering 1.78 seconds (via Statcast) and throws a perfect 85 mph toss to second.

Or this scenario, which had a 92% stolen-base probability. A.J. Pierzynski picks a throw off the ground, then navigates around Brandon Phillips to beat Suarez by a mile.

And finally, here is an example of a successful stolen base the model predicts will happen 15% of the time — not a surprise when you see where the pitch is thrown (actually 43% when you account for the speedy Rajai Davis and the way below average Kurt Suzuki).

pSRAA does well for these purposes, but may not illustrate the total value a player adds to his team’s success. A runner with a high pSRAA value with only a couple stolen-base attempts hasn’t added much value to his team since he didn’t utilize his skill often enough. We can leverage pSRAA and stolen base/caught stealing (CS) run values to come up with a more useful metric, which I have aptly named Pitch Adjusted Swipe Rate Runs Above Average (pSRrAA) —a mouthful, I know. I based pSRrAA upon linear-weights metrics like FanGraphs’ Weighted Stolen Base Runs (wSB). The term linear weights, often used in the world of baseball statistics, translates to the average run value of a certain action and its effect on run scoring over the course of an inning. For example, let’s say there is a man on first base with no outs. The average number of runs scored in an inning in 2016 starting with this exact situation is 0.8744 runs. He gets caught stealing, and now the situation is nobody on and 1 out. Starting in this situation, the run expectancy drops to 0.2737. Thus, the value of this specific play was about -0.6 runs. Examining these situations over the course of the whole season leaves us with average run values that we can assign to SB and CS. Combining the run values for SB (runSB = .2 runs) and CS (runCS = -.41 runs) produced by FanGraphs for the 2016 season, we can use pSRAA to attribute the run values more accurately:

pSRrAA = pSRAA × (runSB − runCS) × Attempts

This method for calculating pSRrAA works because of the following:
  1. pSRAA already determines the probability a certain player adds to a SB above average.
  2. If a player adds 10% probability to a SB, they are contributing runSB 10% more than the average player and runCS 10% less.
  3. pSRAA × (runSB − runCS) quantifies the average attempt value, so then we just multiply by attempts to get a full run value over the course of the season.
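To make the arithmetic concrete, here is a small sketch of both steps: the run value of a single caught stealing from the run-expectancy numbers quoted above, and the pSRrAA formula itself. The run values (.2 and -.41) and run expectancies come from the text. The 4% pSRAA runner with 30 attempts is a hypothetical player, and the function names are mine.

```python
RUN_SB = 0.20   # average run value of a stolen base (2016, per the text)
RUN_CS = -0.41  # average run value of a caught stealing (2016, per the text)


def play_run_value(run_exp_before, run_exp_after):
    """Change in run expectancy caused by a single play."""
    return run_exp_after - run_exp_before


def steal_runs_above_average(psraa, attempts, run_sb=RUN_SB, run_cs=RUN_CS):
    """pSRrAA = pSRAA x (runSB - runCS) x attempts."""
    return psraa * (run_sb - run_cs) * attempts


# Man on first, no outs (0.8744) -> nobody on, one out (0.2737).
caught_stealing = play_run_value(0.8744, 0.2737)
print(f"Caught stealing in that spot: {caught_stealing:+.4f} runs")

# Hypothetical runner: +4% pSRAA over 30 pitch-involved attempts.
runs = steal_runs_above_average(psraa=0.04, attempts=30)
print(f"pSRrAA: {runs:+.3f} runs")
```

Note that runSB − runCS works out to 0.61 runs, the full swing between a successful steal and a caught stealing, which is why even a few points of pSRAA add up only slowly unless a player runs a lot.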

Of course, as I alluded to in the beginning, pSRAA doesn’t account for all types of stolen bases, only ones with pitches involved. Consequently, pSRrAA doesn’t capture the total value runners and pitchers contribute to their teams, because attempts in which the catcher isn’t involved are excluded. Finally, to take a look at the top 10 and bottom 10 performers for each position according to pSRrAA, see my original article here. And as always, you can find the code associated with pSRAA/pSRrAA and the analysis on my GitHub page here. Check out my new Facebook page to stay up to date on new articles.

A previous version of this article was published at sharpestats.com.


Newcomers Find Their Way at Home

The Boston Red Sox have been closely associated with highly touted prospects over the past months and even years. A quick look at MLB.com’s Top 100 Prospects rankings from 2015 to 2017 turns up two names fairly consistently: infielders Yoan Moncada and Rafael Devers. The former entered the 2015 rankings as “the best teenage prospect to come out of Cuba since Jorge Soler in 2011” and signed with Boston for $31.5 million, smashing the previous high set by the Reds’ signing of Aroldis Chapman for $16.25 million. While Devers’ price ($1.5 million) was nowhere close to Moncada’s, he was also praised as “the best left-handed bat on the 2013 international market.”

Several names from the 2015 class of prospects have already seen significant major-league playing time (Byron Buxton, Corey Seager, Joey Gallo and Aaron Judge), and the time has come for Moncada and Devers to start writing their full-time MLB stories. Boston opted to trade Moncada to the White Sox for Chris Sale this past off-season while keeping Devers in town. As things have turned out, both have been called up practically in parallel this season, though for quite different reasons. In the midst of a complete rebuild, Chicago will count on Moncada to take on the third-base position from now on. Boston, on the other hand, wanted to improve their infield a hair and seem to have opted for Devers as an in-house solution to their woes.

As of this writing, Tuesday, July 25 (better known as National Rafael Devers Day, given his major-league debut with the Red Sox), Moncada will have the chance to play as many as 65 games and Devers 60. They will probably not reach those numbers, at least not Devers, given Boston’s contender status and probable use of platoon hitters during the rest of the season. Another fact of interest is that Yoan Moncada is 22 years old and Rafael Devers is just 20. Those numbers set a baseline for the rest of this article, which will focus on how call-ups perform in their debut seasons, both home and away.

Prospects make huge jumps just going from the minors to the majors: they change cities and clubhouses, meet new teammates, and much more. Still, you would guess that after settling in they’d produce more at home than away from it. To find out whether this holds true, I ran a set of queries on Baseball-Reference.com. I’ll be looking at rookie-season splits from 2000 to 2017 in which the debuting players were between 20 and 22 years of age (like Moncada and Devers). A total of 87 players within those parameters have seen major-league action during the selected time span, so we’ll be working with 174 home/away splits to see whether rookies of ages 20-22 have historically played better at home, as we might expect, or away.

First of all, I looked at “playing time” stats, that is: games, games started and plate appearances. Just as we might expect players to perform better at home than away over their first few games, we might expect teams to “protect” their rookies and deploy them more frequently at home than on the road. As it turns out, though, the home and away splits are virtually the same in all three categories. First myth debunked.

Moving on to what really matters, production, we can try to see how well players have hit in their own ballparks compared to other venues, and whether or not there are big differences between the two.

Subtle differences start to appear between the games played at home and those played away in terms of runs scored and hitting. There are no big differences between the splits, to be sure, but it seems that home performances have edged away ones by a hair during the past 17 years on average. The biggest difference in any of the studied statistics comes in both the doubles and home-run categories, at 0.3 points each in favour of the home split.

Another interesting set of statistics to look at is the one related to base-stealing. Logically, players would be expected to feel more comfortable, confident and willing to steal bases at home rather than in other parks. Again, that preconception seems to be wrong. Among the 87 players studied, the average number of steal attempts was higher away than at home, and even the success rate was five points higher when stealing in other ballparks rather than in their own.

Finally, we must turn our attention to the game of percentages and look at the slash line of the analyzed players in terms of BA, OBP and SLG. On top of that, I included the average tOPS+ and sOPS+ values. The former of those last two is meant to represent the player’s OPS in the split relative to that player’s total OPS during the full season (not accounting for the home/away split), with a value greater than 100 indicating that he did better than usual in the split. The second one is the OPS in the split relative to the league’s split OPS (again, a value greater than 100 indicates the player did better than the league in this split).
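As a rough illustration of how these two indices behave, here is a sketch that assumes both follow the usual OPS+-style construction, 100 × (split OBP / baseline OBP + split SLG / baseline SLG − 1), with the player’s full-season line as the baseline for tOPS+ and the league’s line in the same split as the baseline for sOPS+. That formula is my assumption about how the numbers are built, so check Baseball-Reference’s glossary before relying on it. The rookie and league lines below are made up.

```python
def relative_ops_plus(split_obp, split_slg, base_obp, base_slg):
    """OPS+-style index of a split against a baseline (100 = equal to baseline).

    Assumed form: 100 * (OBP / base OBP + SLG / base SLG - 1).
    """
    return 100 * (split_obp / base_obp + split_slg / base_slg - 1)


# Hypothetical rookie and league lines, for illustration only.
season = {"obp": 0.320, "slg": 0.420}       # player's full season
home = {"obp": 0.330, "slg": 0.440}         # player's home split
league_home = {"obp": 0.325, "slg": 0.425}  # league-wide home split

# tOPS+: home split against the player's own season.
tops_plus_home = relative_ops_plus(home["obp"], home["slg"], season["obp"], season["slg"])
# sOPS+: home split against the league's home split.
sops_plus_home = relative_ops_plus(home["obp"], home["slg"], league_home["obp"], league_home["slg"])

print(f"tOPS+ at home: {tops_plus_home:.0f}")
print(f"sOPS+ at home: {sops_plus_home:.0f}")
```

The useful property is that tOPS+ only compares a player to himself, so it isolates the home/away gap even for rookies who are below league average overall.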

And here is where our home/away splits, once and for all, truly separate themselves. Not one, not two, not three, but every percentage value posted at home by the average 20-to-22-year-old rookie from 2000 to 2017 has been better than the number registered away from it, and not by a little. The difference is 15 points in BA, 23 in OBP, 23 in SLG, 13 in tOPS+ and 4 in sOPS+. That yields an average difference of 20.3 points in the slash line and of 8.5 in the OPS+ metrics, which is huge. It is interesting to see that the average rookie performs below the league-average level (under 100 sOPS+) both at home and away, yet puts up much better numbers at home (106 tOPS+) than away (93 tOPS+).

Just in case the rest of the data didn’t make it clear, which it actually didn’t, this leaves no doubt or case for equity open. After all, rookies probably prefer to play at home, sweet home.

But now that we know that newcomers who are no older than 22 when they play their first major-league games tend to perform better at home, it is worth exploring, out of curiosity, some of the unique cases among the 87 players of our study over the past 17 seasons. We have been looking at the average rookie during the past few paragraphs, but as expected, each case is unique in itself and would make for a complete study on its own. Next is a table containing the rookies with a 45+ point differential in tOPS+ (with at least 60 games played), so we can measure how different their production was at home and on the road. Players are ordered by the absolute difference, with negative values meaning their production away was better than that at their home ballpark.

As it turns out, only 16 of 72 players had differences of 45+ points in tOPS+ between their games at home and those played away. Of those 16, though, seven were better far from their team’s stadium, something not really expected, much less in the case of Stanton and his minus-94 differential.

Just for fun, let’s look at Giancarlo’s case: his split numbers are radically different even though he played almost the same number of games at home and away during his rookie season. In 180 PA at home he collected 29 hits, including 7 home runs, for a BA/OBP/OPS line of .182/.272/.599 and 52 total bases. In 216 PA away he had 64 hits with 15 home runs, posting a .320/.370/1.020 line and 130 total bases. What could be seen as a terrible entry year by looking at just the production at home (league-relative sOPS+ of 60) turns into a monster season when considering what Stanton was able to do outside of Miami (183 sOPS+). Something similar happened to Jay Bruce, Logan Morrison or more recently Miguel Sano, only in opposite venues.

As a final note, only six of the 16 players in the table above had a big differential while debuting prior to 2010. The other 10 players made their debuts from 2010 on, which could mean that the trend is for rookies to have much more variable production across venues than the average historical newcomer.

We still don’t know how Moncada and Devers will perform during the rest of the season, but if that last supposition holds true, then White Sox and Red Sox fans can only hope that their players at least do more damage at home than away, so they get to watch their jewels explode in front of their own eyes instead of in different ballparks around the nation.


What Is a Pitcher? What Is a Batter?

When we consider individual baseball players, we think that we understand how to divide them into pitchers and hitters. Clayton Kershaw is a pitcher, we say confidently, and the recently-traded Nori Aoki is a hitter. But from the perspective of the statistical record, the question can be a little harder to answer. Kershaw, after all, had appeared in six more games as a pinch-hitter or pinch-runner than he had as a pitcher through 2016, and Mr. Aoki stood on the mound and induced a fly out from Aaron Judge earlier this season (among other less satisfactory results). Is there a programmatic way to divide baseball players into hitters and pitchers from the perspective of the statistical record?

For me, the question isn’t merely academic: I am building a baseball trivia game, and it is very important for the rules of the game that I be able to programmatically divide baseball players into hitters and pitchers. In particular, I need to divide players into pitchers and hitters over the course of their careers, not merely from the standpoint of a particular season or game. And I need to do so definitively: a player can’t be both a pitcher and a hitter. The data that I am working with for my baseball trivia game comes from Sean Lahman’s database, and includes batter seasons and pitcher seasons back to the 1870s.

The Lahman database does not attempt to disambiguate between hitters and pitchers, merely including hitting seasons and pitching seasons. If a player only hit, that’s a hitter; if he only pitched, that’s a pitcher. Easy enough, but there are of course complications. Pitchers bat in real baseball, so there are lots of hitting seasons by pitchers in the data. And sometimes, as noted above, hitters pitch in blowouts, so there are pitching seasons by batters included as well.

Then there’s Babe Ruth, who really was both a pitcher and a hitter, you might say, throwing lots of innings in the 1910s before becoming a full-time hitter in the ‘20s. What does it mean to pitch and hit “a lot”? Carlos Zambrano was a pitcher, informed baseball fans presumably agree. He was also a decent hitter and was used as a pinch-hitter fairly often. He’s not a batter, though. Right?

Here’s the programmatic metric that I’ve decided on and used to divide players in my game:

According to the Lahman database, there have been 5,277,522 batter games and 1,064,580 pitcher games in baseball history through 2016. That’s a ratio of about 4.95 batter games to 1 pitcher game. That ratio is my dividing line: any player whose ratio of “Games appeared in as a batter” to “Games appeared in as a pitcher” is higher than the league-wide figure is a batter, and otherwise the player is a pitcher. Some data points that fall out of this classification:

Ruth: 2503 hitter games, 193 pitcher games: 12.9 ratio: Hitter

Kershaw (through 2016): 288 hitter, 282 pitcher: 1.02 ratio: Pitcher (Kershaw has been used as a pinch-hitter and pinch-runner, stupidly, from time to time)

Zambrano: 384 hitter, 354 pitcher, 1.08 ratio: Pitcher

Rick Ankiel: 653 hitter, 51 pitcher: 12.8 ratio: Hitter
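The rule is simple enough to sketch in a few lines. This is a minimal illustration rather than the code behind the game: the function name is hypothetical, the career totals are typed in by hand from the examples above instead of being read from the Lahman tables, and treating a player with no pitching appearances as a hitter is my reading of "if a player only hit, that's a hitter."

```python
# League-wide totals quoted above (Lahman database, through 2016).
LEAGUE_BATTER_GAMES = 5_277_522
LEAGUE_PITCHER_GAMES = 1_064_580
THRESHOLD = LEAGUE_BATTER_GAMES / LEAGUE_PITCHER_GAMES  # just under 4.96


def classify(batter_games: int, pitcher_games: int) -> str:
    """Return "hitter" or "pitcher" using the league-ratio rule."""
    if pitcher_games == 0:
        return "hitter"  # never pitched, so no ratio to compute
    ratio = batter_games / pitcher_games
    return "hitter" if ratio > THRESHOLD else "pitcher"


# Career totals from the examples above: (batter games, pitcher games).
examples = {
    "Ruth": (2503, 193),
    "Kershaw": (288, 282),
    "Zambrano": (384, 354),
    "Ankiel": (653, 51),
}
for name, (bat, pit) in examples.items():
    print(name, round(bat / pit, 2), classify(bat, pit))
```

A single threshold keeps the split definitive: every player lands on exactly one side of it, which is what the game's rules require.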

We might be interested in “hittery” pitchers or “pitchery” hitters: players whose ratio of batter games to pitcher games approaches the dividing ratio of 4.95 to 1. By this metric, the “hittery-est pitcher” with a career of any length is Jimmy “Nixey” Callahan, who pitched and played left field for various Chicago teams and the Phillies in the late 1890s and early 1900s.

The “Pitchery-est hitter” is John Ward, who was mostly a pitcher for the Providence Grays for seven years and then a middle infielder for various New York teams for a decade. He’s about twice as “pitchery” of a hitter as Ruth.

Most of the real double-duty guys played in the dead-ball era. A man named Hal Jeffcoat played CF and often provided relief innings for some lousy Cubs and Reds teams in the 1950s. Eno Sarris mentioned him in the context of an article on two-way players earlier this year. In our modern era of extreme specialization, not-too-good OF turned not-too-good pitcher Brooks Kieschnick is about as close as it gets.

It might be slightly more precise to use Innings Pitched and Innings As A Batter Or Fielder, but that would introduce some problems (that I am eliding here) and probably wouldn’t move the ratio very much. What do you think? How would you programmatically and consistently divide players into batters and pitchers?

If you’d like to be a beta tester for the trivia game, or be kept in the loop for when the game is released, sign up here.


Challenging Conventional Wisdom About the Trade Deadline

The MLB trade deadline has passed, and you may be happy or disappointed that your favorite team is going to be stuck with the players they now have until the end of the season. Actually, that’s not true. Trades can be made until August 31, but any player swap after the deadline invokes the waiver-wire process, which allows any other team to block a trade or claim a waiver player for themselves. So, deals that will have any sort of impact will usually happen just before the deadline.

This year’s trade deadline mostly involved pitchers: Sonny Gray, Yu Darvish, Jaime Garcia, David Robertson, Sean Doolittle, Addison Reed, Francisco Liriano, and others were all traded at or near the deadline.

The Dodgers, whose pitching staff so far has led the league in FIP, ERA, and WHIP, and ranks third in K/BB ratio, added Yu Darvish, a pitcher who hasn’t been his best this season, but who can certainly turn a great rotation into a nearly unbeatable one in a five-game or seven-game series. The Cubs, whose bullpen ranks 10th in fWAR, brought in left-handed reliever Justin Wilson from the Tigers, who presumably will fill the role of set-up man for Wade Davis. The Yankees supplemented a bullpen that ranks fourth in ERA and WHIP, and second in K/9, with Sonny Gray and David Robertson. Sean Doolittle and Brandon Kintzler were sent to the Nationals to help solve their bullpen issues, which have resulted in the second-worst ERA in the league. On the same day that Lance McCullers was placed on the 10-day DL, the Astros traded for Francisco Liriano to add some stability to their rotation/bullpen as they are all but guaranteed a postseason spot.

But every year, we hear talk about which teams will buy or sell. The teams that have little to no shot of making the postseason are obviously more likely to sell. The decision-making gets interesting when looking at teams that are “on the bubble.” Front offices must decide whether to go all-in for the current year, possibly giving up young prospects for proven stars to fill the needs they see in their team, or to take the seemingly less risky route of keeping their prospects, filling needs with lesser players on the trade market, and hoping that it’s enough to make a run in the postseason. And if it doesn’t work out, at least you didn’t give up your future stars.

This is the conventional wisdom that’s being challenged by some teams, and needs to be examined more. The truth about the postseason in professional baseball is that you don’t know when you’ll have that chance again, no matter how many top-100 prospects you have. The Washington Nationals infamously shut down Stephen Strasburg in 2012 following the logic that it would be better to save their starter for future postseasons rather than “risk it” that year. And of course, the Nationals have not won a postseason series since. Had they managed Strasburg so that he could have pitched into October, who knows what would have happened. Win probabilities show that it is far easier to predict who will make the playoffs than what will happen once those teams get there. So if you have a chance to make the playoffs, you should go all-in for it.

This is exemplified by the win probabilities calculated at FiveThirtyEight.com. As of this article’s writing, the Dodgers have a greater-than 99% chance of winning their division, and a 23% chance of winning the World Series, and the same can be said about the Astros. The Nationals, Indians, and Cubs have a ten, nine and eight percent chance of winning the Fall Classic, respectively. But all three of those teams have an 84% chance or better of making the playoffs. The point is, you could be the Dodgers or the Astros and be having a historic season, and still “only” have a 23% chance of winning the World Series. Now, this year is unusual. Typically, even when baseball teams are really good, their World Series chances are less than 20%. Comparing this to basketball, the Warriors, dominating the NBA in a fashion similar to how the Dodgers and Astros are dominating MLB, had a 48% chance of winning the title at a similar point in their season. So even when there seems to be a lack of parity in the game, baseball’s postseason still has a relatively higher level of unpredictability. These win probabilities are the data that should be driving the decisions of teams as they near the deadline, particularly if they have even a small chance of getting to the playoffs. Because you can never have enough talent to guarantee a chance to win the pennant or the World Series.

Obviously, these decisions are limited by payroll and the contracts of the players you have at the time. But the overall idea that a team who has a small chance should wait and build even more so that they have an even better chance of making the playoffs the next year or some other year down the road — it needs to go. The Dodgers were smart to add a great starting pitcher in Yu Darvish despite already having arguably the best staff in baseball. And the Yankees and Cubs were smart to bolster their previously strong bullpens. What is interesting is that, once again, the Nationals, who have one of the worst bullpens in the league, did not push harder for Sonny Gray or Justin Wilson. They got Sean Doolittle, who is good (4.10 ERA and 0.3 WAR in 2017 according to bbref.com) and Brandon Kintzler, who has been slightly better (2.78 ERA and 1.2 WAR in 2017). It’s also interesting to see that the Red Sox, whose offense ranks 23rd in wRC+, did not go after more hitters close to the deadline, and settled for reliever Addison Reed from the Mets. The Red Sox currently have a 6% chance of winning the World Series according to FiveThirtyEight. If their offense doesn’t pick up, their reluctance to find that power bat could be the difference.

But the Rockies, who currently have a 2% chance of winning it all and whose relievers’ ERA ranks 23rd at 4.52, acquired Pat Neshek and his 2.1 WAR from the Phillies. The Diamondbacks added J.D. Martinez to a powerful lineup that likely has more in them than they’ve shown recently, seeing that they are fifth in hard-contact percentage, but 16th in wRC+. Both of these are smart moves by the front office; on the other hand, Mike Rizzo of the Nationals and Dave Dombrowski of the Red Sox will have some questions to answer if their teams don’t make decent runs into the postseason.

Hopefully, we continue to see more teams who have at least a 2 or 3% chance of winning the World Series go all-in at the trade deadline. I’m not claiming that other teams weren’t more aggressive at the trade deadline because they’re concerned about losing prospects, but it is worth noting that teams often make the mistake of not going all-in because they don’t believe they have a high enough chance of winning it all, when the reality is that you don’t need one. You just need a somewhat reasonable path to the playoffs, where the x-factor of unpredictability comes into play and anything can happen.


Even Without Brad, the Padres’ Pen Will Be in Good Hands

As with most rebuilding teams, the San Diego Padres aren’t in any particular need of a strong bullpen, and they’ve handled this season’s trade deadline accordingly. As of July 30, they’ve already traded away Ryan Buchter and first-half closer Brandon Maurer, and relief ace Brad Hand is expected to follow this offseason. The rest of San Diego’s bullpen is, for the most part, unexceptional; not including Hand, the most-used relievers still on the team are Craig Stammen and Jose Torres, neither of whom has a positive WAR or a FIP under 4.50.

It’s fortunate for San Diego, then, that Kirby Yates has quickly become their most reliable non-Hand option in relief. The team plucked Yates, a relatively unknown 30-year-old Hawaiian right-hander, from the waiver wire in late April, prior to which he’d spent time as a Ray, a Yankee, and, for one inning in 2017, an Angel. Minus a disastrous 2015 season, due in part to a HR/FB ratio of over 30%, both Yates’s FIP and xFIP have consistently been below 4.00. He’s also demonstrated an impressive strikeout ability over the past few years; his K rates in ’14 and ’16 were both approximately 27%, and in 2015, his worst season, he still managed to strike out nearly 23% of batters faced.

Since his move down the California coast in April, though, Yates has emerged as the Padres reliever perhaps most likely to take over the closer role, assuming Hand is dealt as expected (ed. note: oh well), and has been one of the more unexpectedly impressive relievers of 2017. In prior years, Yates’s terrific strikeout rate was often coupled with a walk rate that was passable at best (7.6% in 2015) and dreadful at worst (10.3% last season). This season has seen progress in both areas: his BB% is down to 6.3%, and he’s struck out over 38% of the batters he’s faced. Yates’s improvements in strikeout and walk percentage have been sufficient to land him among the league leaders in both K%, where he ranks seventh among qualified relievers, and K-BB%, where he ranks fifth, at 31.9%. For reference, Andrew Miller ranks sixth at 31.0%, and the rest of the top five comprises arguably the best relievers in the game, including Craig Kimbrel and Kenley Jansen.

Of course, it’s a bit premature to tout Yates as a Kimbrel-quality option out of the Padres’ bullpen. He doesn’t have the same electric stuff, or anything near the track record, of his peers on the league leaderboards, and he’s been the beneficiary of a strand rate of almost 91%. At 3.09 and 3.01, his FIP and xFIP, respectively, are also significantly higher than his 2.23 ERA, so there’s a fair bit of evidence to suggest that Yates isn’t as good as his basic stats indicate. With that being said, though, there’s a lot to like about Yates’s performance this year. There’s nothing fluky about a 38% strikeout rate, and his SIERA score, at 2.24, has been far more bullish on Yates than have his FIP and xFIP. So while Yates isn’t necessarily becoming the next great San Diego closer, his improvements this year are far too drastic to be chalked up entirely to luck.

Instead, I believe there are a couple of interrelated reasons for Yates’s recent success. In June, Jeff Sullivan wrote about Brewers starter Chase Anderson’s 2017 breakout, noting that Anderson had started shifting his location on the rubber. Against right-handed hitters, Anderson began his wind-up from the far right side of the rubber; this was, as Sullivan explained, about “playing the angles,” adding that Anderson could get his pitches “sweeping away” from these batters.

Yates, it appears, has followed the same line of thinking. Compare the starting point of Yates’s delivery between the past two seasons:

[Figure: Yates’s starting position on the rubber, 2016 vs. 2017]

We can also see how much his pitches’ respective routes to home, as illustrated by PITCHf/x, have changed since last season:

[Figure: PITCHf/x pitch paths to home plate, 2016 vs. 2017]

Compared with a .283/.372/.457 slash line in 2016, righties are hitting just .171/.227/.305 against him this year, with a .227 wOBA and .224 xwOBA. With Yates’s new starting point on the rubber, his pitches have been able to more effectively “sweep away” from right-handed batters, since they start significantly farther to the right, and he’s seen excellent results against righties in particular. This effect, I believe, has been a significant contributor to Yates’s success. As the above graph indicates, his fastball and slider travel most toward the outer section of the plate, which may be giving right-handed hitters more difficulty in the batter’s box.

However, that’s not the only interesting development regarding Yates’s slider. According to PITCHf/x, he’s throwing roughly four percent more sliders against right-handers, and his fastball usage has declined by roughly the same amount. His slider hasn’t spun the same this year as it has in the past, either: according to PITCHf/x, the pitch’s spin rate has risen from 594 to 1,962 RPM this season. (I should note that Baseball Savant sees a negligible difference in the average spin rate of Yates’s slider, so there may be an error in the data.) Regardless, it’s hard to deny that the pitch’s movement has changed:

[Figure: slider movement, 2014-17]

As evidenced by the wide spread in 2017, Yates’s slider still seems like a work in progress, but it’s clear that the pitch has taken on some new movement. FanGraphs, through PITCHf/x, scores his slider’s xMov as having shifted from 1.4 to -2.2, indicating that the pitch has actually begun moving toward right-handed batters. This doesn’t invalidate the merits of Yates’s shift on the mound, though — the new angle might still be affecting how righties pick up his pitches, and the majority of his sliders do tend toward the outer half of the plate, thus still “sweeping away” from the batters.

Yates briefly spoke on his slider in a May interview with Jeff Sanders of the San Diego Union-Tribune, saying the pitch was “getting back to where it used to be.” I found this a curious phrase for Yates to use, seeing as how the pitch has done anything but revert to its old movement. His next sentence, however, may answer this question. Yates says he’s “incorporated a splitter that [he] feels pretty confident in,” and later mentions that over the offseason, he developed the pitch as a sort of contingency plan against an occasionally less-than-trustworthy slider.

I’m not very familiar with the inner workings of PITCHf/x, but it seems possible that the system could be classifying some of Yates’s new splitters as sliders. Not only would this account for the change in his slider’s horizontal movement, but it’d also explain Yates’s description of the pitch. Overall, though, I believe Yates’s newfound success can largely be attributed to the above adjustments he made over the offseason. He may not become the next Trevor Hoffman, but Yates has shown the Padres more than enough to feel a bit more comfortable with their bullpen, even after Brad Hand is dealt this winter.


Where Are Anthony Rizzo’s Missing Hits?

Anthony Rizzo is hitting just .257 this year with a .242 BABIP. A fantasy-league mate of mine proclaimed “Rizzo sucks this year” after a recent trade. However, the only thing I can see that’s changed is his BABIP. He’s on pace for 106 RBI and 95 runs after totaling 109 RBI and 94 runs last season. His ISO is an identical .252. For all intents and purposes, he’s the same hitter, except he’s missing some hits. My league-mate chalked this up to “he’s getting shifted more” or “he’s worse hitting against the shift.”

One of those two things is correct. Rizzo has faced a shift in 85.7% of his plate appearances this year, which isn’t different from the 85.5% he faced last season. Rizzo is, however, hitting only .247 on balls in play when facing the shift (.214 when not shifted) this year. This is a 54-point swing in BABIP from a year ago (.301 while shifted; .359 when not shifted). This amounts to 13 missing hits thus far this year against the shift (and six more when not shifted). For the purposes of this article I want to focus on the missing hits against the shift.

What we have here is the symptom of something that’s going on when Rizzo is hitting this year that wasn’t happening as much last year, so I started sniffing around for other major changes in the Rizzo data. One thing that popped into my head was that the Cubs offense, especially the top of the order, has been getting on base much less this year than last year. That led me to thinking about what the defense looks like when there are runners on base versus when the bases are empty.

For a reference point, this is a typical shift against Rizzo with no one on base:

[Figure: typical shift against Rizzo with no one on base]

With only a runner on first base, the shift is the same, but with the obvious addition of the 1B holding the runner on.

In 2016 Rizzo batted with runners on base in ~55% of his plate appearances and ~32% of the time with runners in scoring position. In 2017 those numbers have dropped to ~45% and ~24%.

While I don’t have Rizzo-specific defensive placements for all his batted balls in play, I did compare his spray charts from last year and this year and noticed two very empty spots.

The first spot is just behind the second-base bag, where the SS typically lines up in the over-shift against Rizzo. In 102 games this year, Rizzo has yet to collect a hit to this part of the field, while he had six hits within this area of the field last year and a few more just behind it to the opposite-field side. Using the FG splits tool we can see Rizzo has a .054 AVG in this area of the field this year vs. .333 from a year ago.

[Figure: Rizzo]

The second empty spot is where you’d find line drives to the opposite field falling in before the left fielder. This led me to look into Rizzo’s batted-ball distribution to the pull side and opposite-field side for both ground balls and line drives. As you’ll see, Rizzo is going to the opposite field ~5% less on his ground balls, and non-oppo GBs turn into outs more frequently for Rizzo due to the shift.

Rizzo Batted Ball Distribution 2016 & 2017
type   PULL 2016   PULL 2017   OPPO 2016   OPPO 2017
GB     63.9%       60.8%       10.4%       4.8%
LD     39.2%       50.9%       25.8%       22.8%

Rizzo is hitting .140 this year on ground balls to the left or up the middle, against a .345 mark from a year ago. This accounts for eight of his 13 missing hits. Another three hits are accounted for from luck against the shift on the pull side. The remaining two missing hits are from a slight change in batted-ball distribution on line drives to the opposite field. At the end of the day, I don’t think anything has changed with Rizzo outside normal variance in various batted-ball outcomes.


A Surprisingly Close 18-4 Game

On July 19, 2017, the Colorado Rockies beat the San Diego Padres by a score of 18-4. Padres starter Clayton Richard left the game after 3 2/3 innings, having given up 14 hits and with his team down 11-0. After the game, Richard took responsibility for his rough outing, but also pointed out that the Rockies may have benefited from some luck. “It just seemed like mis-hit balls found the right spots,” said Richard. Let’s see if Richard is right; let’s try to eliminate the effects of luck and see how this game should have turned out.

Because the score of the game affects how teams play, I am only going to predict what the score should have been after four innings, at which point the Rockies had a 12-0 lead. In lopsided games, teams often rest their everyday players (as the Padres did with Wil Myers) and don’t bring in their top relievers (Kevin Quackenbush, who gave up six runs, relieved Richard with two outs in the 4th), so it would be unfair to use what happened after the 4th inning to estimate what the score of the game should have been.

I looked at Baseball Savant’s hit probability and expected wOBA (xwOBA) of every plate appearance in the first four innings of the game. These stats only consider a batted ball’s exit velocity and launch angle. Although I will generally refer to the difference between xwOBA and wOBA as luck, keep in mind that defensive positioning and defensive ability are also factors that can affect this difference (the Rockies are, in fact, an above-average defensive team, while the Padres are one of the worst in the National League). In the first four innings, the Padres had 16 hitters come up to the plate, and they averaged a .254 xwOBA, compared to an actual wOBA of .281, for a difference of .027 per hitter. I gave Manuel Margot’s first-inning plate appearance, in which he walked but was later picked off, an xwOBA and wOBA of 0. Meanwhile, the Rockies’ 29 hitters averaged an xwOBA of .420 and a wOBA of .664, for a difference of .244 per hitter. Two things are immediately clear. First, the Rockies certainly out-hit the Padres in the first four innings of the game. Second, as Richard noted, the Rockies’ hitters benefited from a lot of luck.

First, I will calculate the number of runs each team would have had through four innings if their wOBA was exactly their xwOBA (this estimate will be a little low for both teams, as xwOBA does not take into account that the game was played at Coors Field). To do this, I will find their weighted runs above average (wRAA), and then add that to four times the average number of runs per inning in the National League.

wRAA = ((wOBA – league wOBA) / wOBA scale) × PA

league wOBA = .320

wOBA scale = 1.25

When calculating wRAA, we run into a problem: we can’t use the actual number of PAs each team had because this number depends on the number of baserunners they had, which should change when we convert wOBA to xwOBA. To come up with an expected number of baserunners, I summed the hit probability of all balls put in play and added 1.000 for each walk and hit-by-pitch (with the exception of Margot’s 1st-inning walk). Strikeouts, as you might expect, were worth 0 points. The Padres had 3.24 expected baserunners (.203 xOBP) while the Rockies had 11.70 (.404 xOBP). With a .203 OBP, it would take roughly 15 hitters to get through four innings (15 x .203 = 3.045 baserunners; 15 hitters – 3 baserunners = 12 outs). With a .404 OBP, it would take roughly 20 hitters to get through four innings (20 x .404 = 8.08 baserunners, 20 hitters – 8 baserunners = 12 outs). Therefore, we use 15 PAs for the Padres and 20 PAs for the Rockies (notice that reducing the number of hitters doesn’t ignore what happened to the Padres’ last hitter or the Rockies’ last nine, as I use the average xwOBA of all the hitters that came up and simply apply that to a smaller sample).
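The "how many hitters does it take" step can also be written as a one-line formula instead of trial and error. The sketch below is a closed-form shortcut, not necessarily the exact procedure used above: it assumes every plate appearance is either an out or a baserunner, ignores outs made on the bases, and uses a hypothetical function name.

```python
OUTS_PER_INNING = 3


def hitters_needed(obp: float, innings: int = 4) -> float:
    """Expected plate appearances needed to record innings * 3 outs.

    Each hitter makes an out with probability (1 - obp), so on average
    it takes outs / (1 - obp) hitters to use up the outs.
    """
    outs = innings * OUTS_PER_INNING
    return outs / (1 - obp)


padres_pa = hitters_needed(0.203)   # a little over 15
rockies_pa = hitters_needed(0.404)  # a little over 20
print(round(padres_pa), round(rockies_pa))
```

The result is not always so close to a whole number. For an OBP of .385, for instance, the formula lands roughly halfway between 19 and 20, so some judgment is needed when picking a PA count.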

The Padres’ expected wRAA through four innings is then -.79 while that of the Rockies is 1.60. The National League averages .5533 runs per inning, which comes out to 2.21 runs per four innings. Add each team’s wRAA to this number and a reasonable score of this game through four innings would be 1.42 to 3.81 in favor of the Rockies. It is still the Rockies’ lead, but nowhere near the 12-run difference that actually took place.
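For anyone who wants to check the arithmetic, here is a short sketch of the calculation using the constants given above. The function names are mine, and the inputs are the rounded figures quoted in the text.

```python
LEAGUE_WOBA = 0.320
WOBA_SCALE = 1.25
NL_RUNS_PER_INNING = 0.5533


def wraa(woba: float, pa: int) -> float:
    """Weighted runs above average for a given wOBA over pa plate appearances."""
    return (woba - LEAGUE_WOBA) / WOBA_SCALE * pa


def expected_runs(woba: float, pa: int, innings: int = 4) -> float:
    """wRAA added to the league-average run total for the same number of innings."""
    return wraa(woba, pa) + innings * NL_RUNS_PER_INNING


padres_runs = expected_runs(0.254, 15)   # xwOBA .254 over 15 PA
rockies_runs = expected_runs(0.420, 20)  # xwOBA .420 over 20 PA
print(round(padres_runs, 2), round(rockies_runs, 2))
```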

Of course, we know that luck and defense do exist. Let’s say that in one of the oddest trades in MLB history, the Padres and the Rockies decided to swap their luck and their defenses before the game. I will add to the Padres’ xwOBA the difference between the Rockies’ xwOBA and wOBA and vice versa (I will call this new number “swapped wOBA”). I will do the same with the teams’ xOBP and OBP to determine the number of hitters that would have come up through four innings in this scenario. Here’s a chart summarizing all the numbers:

stat           Padres   Rockies
xwOBA          0.254    0.420
wOBA           0.281    0.664
wOBA – xwOBA   0.027    0.244
swapped wOBA   0.498    0.447
xOBP           0.203    0.404
OBP            0.250    0.586
OBP – xOBP     0.047    0.182
swapped OBP    0.385    0.451
PA             19       22

Using the same process as before, we use the teams’ swapped wOBA to calculate their wRAA through four innings and add 2.21 to each. With the Rockies’ luck, the Padres would have been expected to score 4.92 runs (2.71 wRAA + 2.21) through four innings. Meanwhile, with the Padres’ luck, the Rockies would have been expected to score 4.45 runs (2.24 wRAA + 2.21) through four innings. Not only was the game not as lopsided as it appeared, but with the teams’ luck and defense swapped, the Padres would have held the lead (if you round to the nearest whole number) through four innings. That is a 13-run difference solely due to luck and defense!
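The swap itself is easy to express in code. This sketch uses hypothetical function names, takes the PA counts (19 and 22) straight from the chart rather than deriving them, and uses the rounded 2.21-run baseline from the text.

```python
LEAGUE_WOBA = 0.320
WOBA_SCALE = 1.25
BASELINE_RUNS = 2.21  # league-average runs through four innings, as quoted above


def swap_luck(xwoba_a, woba_a, xwoba_b, woba_b):
    """Give each team the other team's (wOBA - xwOBA) gap."""
    luck_a = woba_a - xwoba_a
    luck_b = woba_b - xwoba_b
    return xwoba_a + luck_b, xwoba_b + luck_a


def expected_runs(woba, pa):
    """wRAA over pa plate appearances, added to the four-inning baseline."""
    return (woba - LEAGUE_WOBA) / WOBA_SCALE * pa + BASELINE_RUNS


# Padres: xwOBA .254, wOBA .281; Rockies: xwOBA .420, wOBA .664.
padres_swapped, rockies_swapped = swap_luck(0.254, 0.281, 0.420, 0.664)
padres_runs = expected_runs(padres_swapped, 19)
rockies_runs = expected_runs(rockies_swapped, 22)
print(round(padres_runs, 2), round(rockies_runs, 2))
```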

Now, there is a slight issue with the calculation I performed above. I took data from only 16 Padres hitters and then applied it to 19, assuming the extra three performed at the same level as the first 16. To fix this, we can look instead at the Padres’ expected run value for only the first 16 hitters. We end up with a wRAA of 2.28. Using their swapped OBP of .385, roughly six hitters would have reached base, meaning that these 16 hitters would have come up in 3 1/3 innings. So through only 3 1/3 innings, the Padres would have had basically the same wRAA as the Rockies would have had through four. This is amazing. Had the Padres been given the luck that the Rockies received on this day, they would have at least been tied through four innings, a far cry from the 12-run deficit they unfortunately had to face.
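The 16-hitter version of the check can be sketched the same way. Rounding the expected baserunners to a whole number mirrors the "roughly six hitters" step above; the variable names are only for illustration.

```python
LEAGUE_WOBA = 0.320
WOBA_SCALE = 1.25

hitters = 16
swapped_woba = 0.498  # Padres' xwOBA plus the Rockies' luck
swapped_obp = 0.385   # Padres' xOBP plus the Rockies' luck

# wRAA for just the 16 hitters who actually came to the plate.
wraa_16 = (swapped_woba - LEAGUE_WOBA) / WOBA_SCALE * hitters

# Expected baserunners, rounded to a whole number of hitters.
baserunners = round(hitters * swapped_obp)
outs = hitters - baserunners
full_innings, extra_outs = divmod(outs, 3)

print(round(wraa_16, 2), baserunners, full_innings, extra_outs)
```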