A player’s take on xwOBA

by Nate Freiman

May 24, 2018

When I was playing in the Arizona Fall League in 2012, I led the league in lineouts. At least it seemed like it. It was the fall before I was Rule 5 eligible and I was hoping to show the Padres I could hit high-level pitching. Unfortunately, a .726 OPS in the desert wasn’t going to have them breaking down my door with a team-friendly extension in hand.

If only there were x-stats! xwOBA is the shiny new eight-figure toy that we hitters can play with after an 0-for-15 slump. “But I was hitting the ball hard. See, look!” Back in the pre-Statcast dark ages, a lineout might have had some anecdotal benefit buried in the bottom of a report. Now we have the data.

There’s been a lot written about xwOBA this week. Craig Edwards, Tom Tango and Jonathan Judge have all weighed in. I was especially interested in the ways they addressed its predictive capabilities.

Judge’s study compared pitchers’ season xwOBA with their following season. Tango explored how well xwOBA in small samples correlates with a larger sample.

I looked at this through the lens of a player. When a guy is getting lots of hits but they are bloopers and seeing-eye grounders (remember when ground balls went through the infield?), it’s a soft hot streak. Likewise, a guy might be hitting the loudest .220 in the history of the PCL.

If you’re hitting the ball hard, they’ll start falling. Right? I wanted to test this theory by measuring xwOBA’s predictive capability month-to-month.

Methodology

(All data from Baseball Savant)

I started by getting data for each month of the regular season in the Statcast Era (2015 onward) for players with at least 50 PA in that month. I then did a series of inner joins in R to get what I’ll call “double-months.” A double-month is when a player has at least 50 PA in each of two consecutive months. So Aaron Judge in April-May 2017 is one player-double-month.
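For readers who want to see the join spelled out, here is a minimal sketch of the double-month idea. It is written in Python rather than the R I actually used, the rows are made up for illustration (they are not Baseball Savant data), and the numeric month coding and the choice to pair months only within the same season are assumptions of the sketch.

```python
# Hypothetical monthly rows: (player, season, month, PA, wOBA, xwOBA).
# Values are invented for illustration; they are not Baseball Savant data.
MIN_PA = 50

monthly = [
    ("Player A", 2017, 4, 95, 0.410, 0.395),
    ("Player A", 2017, 5, 102, 0.380, 0.402),
    ("Player A", 2017, 6, 40, 0.300, 0.310),  # under 50 PA, so it is dropped
    ("Player B", 2017, 4, 60, 0.290, 0.335),
    ("Player B", 2017, 6, 88, 0.350, 0.340),  # no qualifying May, so no pair
]


def double_months(rows, min_pa=MIN_PA):
    """Pair each qualifying player-month with the same player's next month.

    This mimics an inner join of the monthly table with itself on
    (player, season, month + 1): a row survives only if both months qualify.
    """
    qualified = {
        (player, season, month): (woba, xwoba)
        for player, season, month, pa, woba, xwoba in rows
        if pa >= min_pa
    }
    pairs = []
    for (player, season, month), (woba, xwoba) in sorted(qualified.items()):
        following = qualified.get((player, season, month + 1))
        if following is None:
            continue  # no qualifying next month: the inner join drops the row
        pairs.append({
            "player": player,
            "season": season,
            "month": month,
            "wOBA": woba,
            "xwOBA": xwoba,
            "next_month_wOBA": following[0],
        })
    return pairs


pairs = double_months(monthly)
for row in pairs:
    print(row)
```

Only Player A’s April-May survives here: his June falls short of 50 PA, and Player B has no qualifying May to bridge April and June.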

The column labels in the Double Month data frame were: “wOBA,” “xwOBA,” and “Next month wOBA.” I ended up with 3,173 data points. Running these correlations gives us an idea of how your month might predict your next month.

I also wanted to see whether you’d be better off using your entire previous season to predict the next month. For this I got full-season data (min 200 PA) for 2015 and 2016 and did another series of inner joins to get a data frame representing the previous full-season metrics and the current month metric. These columns would look like this:

“Previous season wOBA,” “Previous Season xwOBA,” “Current season month wOBA.”

I got 2,311 of these data points.
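The previous-season join works the same way, except the key is the player and the season before. Here is a rough Python sketch with invented rows; reusing the 50 PA monthly minimum alongside the 200 PA season minimum is an assumption of the sketch, and the function and column names are hypothetical.

```python
# Hypothetical inputs; values are invented, not Baseball Savant data.
MIN_SEASON_PA = 200
MIN_MONTH_PA = 50

# (player, season, PA, wOBA, xwOBA)
season_rows = [
    ("Player A", 2016, 610, 0.360, 0.372),
    ("Player B", 2016, 150, 0.310, 0.325),  # under 200 PA, so it is dropped
]
# (player, season, month, PA, wOBA)
month_rows = [
    ("Player A", 2017, 4, 95, 0.410),
    ("Player A", 2017, 5, 45, 0.280),  # under 50 PA, so it is dropped
    ("Player B", 2017, 4, 70, 0.330),  # no qualifying 2016 season to join to
]


def season_to_month(seasons, months):
    """Inner-join each qualifying month to the player's previous full season."""
    previous = {
        (player, season): (woba, xwoba)
        for player, season, pa, woba, xwoba in seasons
        if pa >= MIN_SEASON_PA
    }
    joined = []
    for player, season, month, pa, woba in months:
        if pa < MIN_MONTH_PA:
            continue
        match = previous.get((player, season - 1))
        if match is None:
            continue  # no qualifying previous season: row is dropped
        joined.append({
            "player": player,
            "season": season,
            "month": month,
            "previous_season_wOBA": match[0],
            "previous_season_xwOBA": match[1],
            "current_month_wOBA": woba,
        })
    return joined


season_month = season_to_month(season_rows, month_rows)
for row in season_month:
    print(row)
```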

For good measure, I also created a data frame for double-seasons. If you had 200 PA in two consecutive seasons, congratulations: you just got a double-season. There ended up being 532 of them.

Finally, I ran all the correlations.
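For anyone following along at home, this is roughly what “running a correlation” means. The sketch assumes Pearson’s r, which is what R’s cor() computes by default; it is in Python instead of R, and the sample numbers are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Invented double-month columns, for illustration only.
xwoba = [0.395, 0.335, 0.310, 0.360, 0.402]
next_month_woba = [0.380, 0.350, 0.295, 0.340, 0.371]
print(round(pearson_r(xwoba, next_month_woba), 3))
```

Feed it the “xwOBA” and “Next month wOBA” columns of a double-month table and you get the kind of r value reported below.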

Results

Double-Months

wOBA to Next Month wOBA: r=0.203

xwOBA to Next Month wOBA: r=0.274

Previous Season to Current Month

wOBA to wOBA: r=0.238

xwOBA to wOBA: r=0.25

Double-Seasons

wOBA to wOBA: r=0.403

xwOBA to wOBA: r=0.451

The differences are small, but they are consistent. xwOBA appears to be a better short-term predictor than wOBA. What interested me the most was that while wOBA predicts your next month better when you use a larger sample (the full previous season), the opposite is true for xwOBA. If you want to use xwOBA, you’re (slightly) better off using the most recent data.
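One way to put “small” in perspective is to square each r, which under a simple linear fit gives the share of variance in the later wOBA that the earlier number accounts for. The snippet below just does that arithmetic on the correlations reported above; the labels are mine, and reading r squared this way is an interpretation I’m adding, not part of the original analysis.

```python
# Correlations exactly as reported in the Results section.
reported_r = {
    "month wOBA -> next month wOBA": 0.203,
    "month xwOBA -> next month wOBA": 0.274,
    "previous season wOBA -> month wOBA": 0.238,
    "previous season xwOBA -> month wOBA": 0.25,
    "season wOBA -> next season wOBA": 0.403,
    "season xwOBA -> next season wOBA": 0.451,
}

# r squared: share of variance explained by a simple linear fit.
r_squared = {label: r * r for label, r in reported_r.items()}

for label, value in r_squared.items():
    print(f"{label}: r={reported_r[label]:.3f}, r^2={value:.4f}")
```

For the double-months, that is roughly 4% of next-month variance for wOBA against roughly 7.5% for xwOBA. Most of what happens next month is still unexplained either way.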

Let’s talk about this in baseball terms. Baseball is so complex that a couple of broken-bat bloopers here and there can give you a really good month. Maybe you’re getting shifted but the pitcher doesn’t execute his spot and misses away and you shoot the wide-open side of the infield a couple of times. Maybe you made the mistake of hitting the ball hard in the middle of the field against the Cubs. Stats like wOBA practically scream regression to the mean.

But there’s no hiding from Statcast. If you’re hitting the ball hard it probably means you’re seeing the ball well and are consistently on time. Plate appearances aren’t independent events; we feel things in the cage one day that might get us locked in for a week. Or the other way around.

5 Comments

bensnider94

7 years ago

This is neat. I’d love to see more things like this from the direct perspective of a player. Great insight.

Dominikk85 (Member since 2020)

Definitely the most talented player who ever wrote for fangraphs:). Thanks for stopping by nate.

Btw does xwoba consider batted ball direction or just ev and la?

Regarding woba I think you definitely also have to factor in speed. It is not just the infield hits but also infield depth. Against a hard hitting big guy like nate the defense will play back more meaning more range on grounders and even some extra liners will be snatched while against a speedster they will play in more because of the speed to take time away which means a few more grounders go through and some more bloopers fall in.

Liam Stevenson (Member since 2016)

This is cool Nate! I’m currently working on related analysis that I’m planning on submitting to the Community page in the coming couple of weeks. Thanks for your take and insight.

Michael (Member since 2017)

Good stuff. Perhaps FG should promote to the top of the front page?
