Was wOBA Actually Invented Nearly 100 Years Ago?

With apologies to Michael Lewis, what if everything you thought you knew about baseball was wrong? As our collective understanding of advanced statistical analysis in baseball grows exponentially with each passing day, we are now among a generation of baseball fans that has done more critical thinking about and retained more esoteric knowledge of the game than our parents could ever have dreamed of. Anyone who has seen MLB Network’s show on the evolution of statistics would think that between Henry Chadwick’s invention of the box score and Branch Rickey’s hiring of Alan Roth as a statistician, baseball fans in the 20th century consumed baseball metrics in only the most rudimentary of ways — via the dreaded batting average, home runs and RBI triumvirate.

However, what if I told you that one of the most advanced analytical discoveries, one that sabermetricians hold near and dear to their hearts, was actually made before Babe Ruth ever became an everyday hitter?

My dad gave me Michael Lewis’ “Moneyball” when I was 13 years old. Up until that point, I had every statistic on the back of every baseball card memorized. I would spend hours organizing and reorganizing my seemingly infinite collection of cards, always checking the numbers before placing a card into a new position. Michael Lewis ruined this childhood passion. Knowing that Bernie Williams hit .339 in 1998, narrowly beating out Mo Vaughn (.337) for the batting title, just didn’t seem that important anymore. Long story short, I had to know everything about baseball I didn’t already know. My progression began with on-base percentage, grew to OPS, and I eventually stumbled upon the likes of FIP and wOBA. As a teenager, I simply believed these metrics were all invented by Bill James, whom I imagined as Vito Corleone, with Tom Tango as Tom Hagen and Voros McCracken as Michael Corleone.

Returning to the present day, I’m currently in the midst of writing my college thesis (yes, it’s on baseball), and I recently came across a piece of research that sounded my “wow, everything I thought I knew about baseball is wrong” alarm to klaxon-like levels. In the 1915 edition of Baseball Magazine (distributed from 1908-1957), there’s an article written by F.C. Lane that would make even Tom Tango take notice (assuming he isn’t already aware of its existence): “Why the System of Batting Averages Should Be Changed: Statistics Lie at the Foundation of Baseball Popularity — Batting Records Are the Favorite — And Yet Batting Records Are Unnecessarily Inaccurate.”

Lane opens his discussion with a question: “Suppose you asked a close personal friend how much change he had in his pocket and he replied, ‘Twelve coins,’ would you think you had learned much about the precise state of his exchequer?” He goes on to compare two men’s respective financial situations: Man A, with “twelve coins” consisting of a combination of quarters, nickels, and dimes; and Man B, with twelve silver dollars. Saying both men have equal financial means is equivalent to the system of tracking batting averages, he explains. “One batter, we may say, made twelve singles, three or four of them of the scratchiest possible variety. The other also made twelve hits, but all of them were good ringing drives, clean cut and decisive, three of them were doubles, one a triple, and one a home run…Is there no way to separate the dimes from the nickels and give each its proper value?” Sound familiar?

“If these averages mislead or give mistaken ideas of batting ability they forfeit their only excuse in being?”

Lane was not alone in noticing the problem. John Heydler, secretary and future president of the National League, added, “that the system of giving as much credit to singles as to home runs is inaccurate to that extent. But it has never seemed practicable to use any other system. How, for instance, are you going to give the comparative values of home runs and singles?”

Lane wasn’t satisfied with Heydler’s admission that even though the system was broken, it couldn’t be fixed. To prove Heydler wrong, the question Lane would attempt to answer was simple: “What constitutes the value of a hit?” “A hit,” Lane says, “is valuable in so far as it results in a score. The entire aim of a baseball team at bat is to score runs. Hits, stolen bases, taking advantage of errors — in short, all the departments of play — are but details in the process of scoring runs.”

Lane continues to outline what appears to be a very early version of weighted on-base average (wOBA). Before he concludes his argument, he makes another discovery that took the rest of us about 80 years to figure out. Lane compares Jake Daubert — who hit for a high batting average — and Gavvy Cravath, who Lane claims is a much better player, even with his sub-.300 batting average.

To make this comparison, Lane looks at the league average figures for singles, doubles, triples and home runs (77.44%, 14.80%, 5.51%, and 2.24%, respectively) and compares those numbers to each player’s numbers. Daubert’s hit breakdown was as follows: 79.47% singles, 13.90% doubles, 5.29% triples and 1.33% home runs. “In other words,” explains Lane, “Jake made more singles and fewer extra base hits than the general average right down the line. Jake had a lot of coins in his pockets, but many of them were nickels and dimes.” Cravath, on the other hand, had the following breakdown: 58.39% singles, 20.80% doubles, 4.69% triples and 16.12% home runs. Lane breaks down the numbers further, assigning the proper (his idea of the correct) values to each hit, thus creating a weighted batting average. Comparing a player’s weighted figures to the league averages seems quite similar to what we know as wRC+ today, wouldn’t you say?
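To make the comparison concrete, here is a minimal Python sketch built only from the extra-base percentages quoted above (singles are simply whatever share is left over). The article does not give Lane's own hit values, so the weights below are an assumption: plain total-base weights, as in slugging percentage. The function names and the league-equals-100 index are hypothetical illustrations, not Lane's method and not the real wRC+ formula.

```python
# Shares of each batter's hits by type, in percent, as quoted in the article.
# Singles are omitted: they are the remainder after extra-base hits.
EXTRA_BASE_SHARES = {
    "League":  {"2B": 14.80, "3B": 5.51, "HR": 2.24},
    "Daubert": {"2B": 13.90, "3B": 5.29, "HR": 1.33},
    "Cravath": {"2B": 20.80, "3B": 4.69, "HR": 16.12},
}

# ASSUMPTION: total-base weights (as in slugging), NOT Lane's values.
# Each entry is the number of bases gained beyond the first.
EXTRA_BASES = {"2B": 1, "3B": 2, "HR": 3}


def extra_base_share(shares):
    """Percent of hits that went for extra bases."""
    return sum(shares.values())


def bases_per_hit(shares):
    """Average total bases per hit under the assumed weights."""
    return 1 + sum(EXTRA_BASES[kind] * pct / 100 for kind, pct in shares.items())


def relative_to_league(name):
    """Bases per hit, scaled so the league average equals 100."""
    league = bases_per_hit(EXTRA_BASE_SHARES["League"])
    return 100 * bases_per_hit(EXTRA_BASE_SHARES[name]) / league


if __name__ == "__main__":
    for name, shares in EXTRA_BASE_SHARES.items():
        print(
            f"{name:8s} extra-base share {extra_base_share(shares):5.2f}%  "
            f"bases per hit {bases_per_hit(shares):.3f}  "
            f"index {relative_to_league(name):.0f}"
        )
```

Under these assumed weights Cravath lands well above the league mark and Daubert slightly below it, which is the point of Lane's coin metaphor. Note that this measures value per hit, not per at-bat, so it says nothing about how often either man got a hit in the first place.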

Clearly the baseball universe did not end up adopting these types of analyses back then. Even today, most fans are only beginning to realize how one-dimensional batting average is. MLB Network’s aforementioned special on statistics called OPS the gateway drug, and noted that fans are starting to see OBP and SLG as better metrics than AVG. While the more advanced figures like wOBA and wRC+ are still relatively unknown to the baseball masses, even they seem to be slowly seeping into the wider baseball zeitgeist.

“Let it be hoped that 1916, the dawn of a new day in baseball affairs, will witness as well the dawn of a new day in the outworn method of keeping batting averages. The time has passed when the public will any longer swallow the palpable falsehood that a home run is no better than a scratch single. It knows better, instinctively feels better, and should be told the truth by a presentation of the season’s statistics founded upon a sane, workmanlike basis.”

If only Lane could’ve seen just how far his theories have come.





Sam graduated from Swarthmore College in 2012.

52 Comments
Dan
12 years ago

This is fascinating.

Adam K.
12 years ago

This is really amazing. Too bad nobody paid any attention to him when he was writing…

Alex
12 years ago

very interesting article

baycommuter
12 years ago

Very interesting, I hope some ballclub gave F.C. Lane a job in the front office. It’s surprising to me that anyone in the dead ball era could have had 16.12% of his hits be home runs. Cravath must have been the Rob Deer of his time.

Bobby Ayala
12 years ago

Interesting article, but isn’t what he’s describing just SLG% ?

Ron
12 years ago

It’s time baseball cards and, more importantly, baseball game broadcast analysts opened up the public’s knowledge of info like this.

Kyle
12 years ago

Awesome article! Thanks for sharing your extremely interesting finding.

Paul Thomas
12 years ago

Lane went quite a bit further than this. He essentially went on to figure out the linear-weights run values of particular batting events, and was only percentage-points off from the values people eventually calculated sixty years later.

If that’s not impressive enough yet, remember that he did this without a calculator, or computers to store all the data.

Lane’s work is, IMO, one of the most amazing accomplishments of sports fanaticism in history.

Oliver
12 years ago

Wow. It had always bugged me to think that no one bothered to ponder these sorts of questions until recently…

Leslie cavrell
12 years ago

Sam – You wrote an unbelievable article. Great job. Very, very interesting.

Husker
12 years ago

Lane is the Archimedes of baseball. Archimedes invented the rudiments of calculus and proved physical laws with it, but all of this was forgotten after his death and rediscovered independently by Newton and Leibniz nearly 2000 years later.
A similar thing happened with Lane’s work with a lesser time lapse.
Amazing!

MNzach
12 years ago

Terrific article, very fascinating.

Scott Lindholm
12 years ago

Excellent article, and thanks for the link to the original article–I haven’t had the chance to read that yet, but I look forward to seeing his analysis in a pre-computer statistical world.

filihok
12 years ago

This is absolutely superb.

filihok
12 years ago

“It is grotesqueries such as this that bring
the whole foundation of baseball statistics into disrepute”

This quote from page 47 where he is comparing Daubert and Cravath is pretty good.

Nathan
12 years ago

Didn’t Babe Ruth play his first MLB game in 1914?

PeteH
12 years ago

You young whippersnappers! Snicker. That was pretty terrific and it’s a real pity Lane wasn’t recognized in his time. Back then, sports writers were basically the hackiest of the hacks, paid basically to hang around saloons with ballplayers to get scuttlebutt or phone in a play-by-play. This guy was obviously really thinking about the sport and you did a great job of explaining what he was up to. I’ve been a fan for 50 years and had no idea someone was thinking that way until Rickey came along.

BTW, my daughter is in the midst of writing her thesis this semester. Good luck to both of you and congrats on a really interesting article.

buddaley
12 years ago

I don’t recall it dealing with this specific example-although it is a fabulous instance of early analytical thinking-but Alan Schwarz’s book, “The Numbers Game,” does a wonderful job of tracing the development of statistical analysis from the 19th century origins of baseball. And one of the salient points of the book is that efforts to deepen our understanding of the game via stats have always been around, and there have always been debates about which numbers are most meaningful.

Schwarz notes, for example, that for a time in the 19th century walks were counted as hits, essentially making BA really OBP. He also has a lengthy discussion of Lane’s efforts, noting that Lane also remarked on park effects as important in evaluating stats. Other early 20th century researchers kept records that anticipate sabermetric ideas.

buddaley
12 years ago

Oops, I wrote the first paragraph without checking the Schwarz book and the second after looking at it again. Obviously he did deal, and at length, with the specific example of Lane’s evaluation of BA.

Jameson
12 years ago

Somebody at Fangraphs hire this guy

Kenos
12 years ago

Great stuff. Thanks!!

Scott Lindholm
12 years ago

I forgot to add this in my earlier comment: it’s impressive that Lane was able to compile those stats at all, in a day when I’m not sure there were annual baseball statistical compilations. I doubt he went to the trouble to tabulate a year’s worth of box scores, but he also didn’t have FanGraphs and baseball-reference.com at his disposal. Using B-R, I was able to replicate his work in five minutes, and found minor variations due to updates in the records that have occurred over the years. These detract from neither Lane’s original work nor the author’s outstanding article. It’s a rare feat to write something that generates nothing but positive comments.
1915 NL    Lane             B-R
Singles    7786 (77.44%)    7788 (76.80%)
Doubles    1488 (14.80%)    1555 (15.34%)
Triples     554  (5.51%)     572  (5.64%)
Homers      226  (2.24%)     225  (2.22%)

Babe Ruth
12 years ago

Well, Cravath was a bit better than Daubert during his prime of 1912-1919 (38 WAR to 29 WAR), but Daubert had the better career (49-40 WAR), reaffirming the sabermetrics-era maxim that big, slow guys don’t age well.

Nate Ader
12 years ago

This is so cool! Thanks for sharing, Sam.

Best of luck on your thesis.

bill
12 years ago

anyone else middle-click “Why the System of Batting Averages Should Be Changed:?”

Jonathan C. Mitchell
12 years ago

Great article!

Bradley Woodrum
12 years ago

Too cool.

Jacob
12 years ago

Here’s the SABR bio on F.C. Lane, if you want to learn more about him:

http://bioproj.sabr.org/bioproj.cfm?a=v&v=l&bid=781&pid=16911

You can read all 175 Lane-written articles for Baseball Magazine (1909-1918) by searching here:

http://www.la84foundation.org/5va/baseballmagazine_frmst.htm

Some of these are quite fascinating, as Sam discovered.

Joe
12 years ago

Just a note: You spelled Vaughan wraunghan – it’s Vaughn.

dencimm
12 years ago

“the outworn method of keeping batting averages” he calls our record keeping today. IMO, his method is flawed too because he doesn’t take into account situations.

williams .482
12 years ago

That was amazing. Un-believable.

rwinter58
12 years ago

Very cool, very glad this was posted. loved it

reillocity
12 years ago

Here’s one where Lane questions whether the statistical batting records of the day (1918) are giving a crack Red Sox twirler who recently took up outfielding his proper due …

http://www.la84foundation.org/SportsLibrary/BBM/1918/bbm216m.pdf

seth coren
12 years ago

interesting analysis. puts a different slant on how we value players.may change arbitration approach.

Marissa C
12 years ago

Very insightful

Steve Carey
12 years ago

You hit a home run on this article! Truly enjoyed it. I always knew M.M. was better than .298 lifetime!!!

Nathan
12 years ago

Wow, nice find! Very interesting stuff. I second the idea that FG hires you. 🙂

TimmyT
12 years ago

This is nuts!
Proud to be a fan of the sport with the most analysis, statistics, and truth. Baseball exposes shams and uncovers gems like no other sport.
I bet that hockey and other sports could benefit from similar statistical revolutions

saberbythebay
12 years ago

Incredibly interesting stuff. Well researched and beautifully written. Bravo.

Also love the Clubhouse Confidential shout out. A great foil to Intentional Talk’s mindless blather.

Matt B
12 years ago

Great Job Samuel

sheath1976
12 years ago

@Babe Ruth

If you look more closely you will see that Cravath actually peaked between 31-38! He posted 3.2 WAR in only 255 plate apps at age 38. It seems like he aged quite well for a big lumbering slugger. He is a remarkable man in baseball history. Daubert had 4.4 WAR at age 38 in 700 plate appearances. Neither did much in the bigs after 38 so I would call it a wash. Cravath aged just as well. The discrepancy in career WAR is due to Cravath wasting away in the PCL and Minors prior to his age 31 season.

Jon L.
12 years ago

This is a terrific article! Of course, some of the credit for articles like this goes to baseball history for having amazing things happen in it.

juan pierres mustache
12 years ago

In his weekly column the following Thursday, Murray Chass responded that while he might be a bit old to be keeping up with newfangled statistics, he could see that the analysis was flawed because Lane failed to assign a coin to the sacrifice bunt in his change metaphor.

steve
12 years ago

Great job, and thanks for linking to the original, super-fascinating article. I certainly agree with Lane’s opinion that “statistics are the most important part of baseball.”

Garrett Hawk
12 years ago

Menzin has written an outstanding article; cogent, articulate, and fascinating.

(minor quibble: by the time Lane’s 1915-1916 article was published, the Sultan had already acquired one of his 7 rings.)

mikeNicoletti
12 years ago

Just want to give high praise to this article, what a find! Of course, I’m at work today working on implementing an algorithm that was invented in 1975 on a processor that was invented in 2011, so I’m sure this would be a far less isolated event if we had all of these articles properly cataloged!

AA
12 years ago

Would be interesting to see Lane’s view on OBP. Rickey himself valued it highly well before it became a standard.

Kreg
12 years ago

It’s cool. What surprises me is that nobody paid attention to it for such a long time. Why?