Archive for Outside the Box

The Baseball Fan’s Guide to Baby Naming

I’ve often wondered if some sort of bizarre connection exists between names and athletic ability, specifically when it comes to the sport of baseball. Having grown up in the ’90s, I will always associate certain names with supreme baseball talent. Names like Ken (Griffey Jr.), Mike (Piazza), Randy (Johnson), Greg (Maddux) and Frank (Thomas) are just a few examples. With a wealth of statistical information available, I thought I’d investigate the possibility of an abnormal association between names and baseball skill.

I began digging up the most popular given names, by decade, using the 1970’s, 80’s & 90’s as focal points. This information was easily accessible on the official website of the U.S. Social Security Administration, as they provide the 200 most popular given names for male and female babies born during each decade. After scouring through all of the names listed, the records revealed there were 278 unique names appearing during that timespan.

Having narrowed down the most popular names for the timeframe, I wandered over to FanGraphs.com to begin compiling the “skill” data. I used the statistic known as WAR (Wins Above Replacement) as my objective guide for evaluating talent. Sorting through all qualified players from 1970-1999, the data revealed 2,554 players eligible for inclusion. After combining all full names with their corresponding nicknames (e.g., Michael and Mike), the list was condensed down to 507 unique names.

By comparing the 278 unique names identified via the Social Security Administration’s most popular names data, with the 507 qualified ballplayer names collected through FanGraphs, it was discovered that 193 of the names were present on both lists. The following tables point out some of the more intriguing findings the research was able to provide.
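
For anyone who wants to retrace the matching step, here is a minimal Python sketch of the two operations described above: folding nicknames into full names, then intersecting the two lists. The `NICKNAMES` map and the sample lists are hypothetical stand-ins, since the full mapping behind the 507 names isn’t given here.

```python
# Hypothetical nickname map for illustration; the real mapping is not listed in the post.
NICKNAMES = {"Mike": "Michael", "Dave": "David", "Ken": "Kenneth"}


def canonical(first_name: str) -> str:
    """Map a nickname to its full given name; unknown names pass through unchanged."""
    return NICKNAMES.get(first_name, first_name)


def shared_names(popular_names, player_first_names):
    """Return the set of names present on both lists after merging nicknames."""
    popular = {canonical(name) for name in popular_names}
    players = {canonical(name) for name in player_first_names}
    return popular & players


# Made-up sample inputs standing in for the SSA list and the FanGraphs list.
ssa_names = ["Michael", "David", "Kenneth", "Jason"]
mlb_names = ["Mike", "Dave", "Ken", "Barry"]
print(sorted(shared_names(ssa_names, mlb_names)))
```

A set intersection keeps the comparison order-independent, which matters because one list is ranked by births and the other by playing time.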

The first table (Table 1), below, comprises the 25 most frequent birth names from 1970-1999. The second table (Table 2) consists of the 25 WAR leaders by name, meaning the highest aggregate WAR totals collected by all players with that name. Naturally, many of the names that appear in the 25 most common names list reappear here as well. Ken, Gary, Ron, Greg, Don, Frank, Chuck, Lawrence, George and Pete are the exceptions. It’s interesting to see that these names seem to have a higher AVG WAR per 1,000 births (as seen in the final table), perhaps indicative of those names’ supremacy as baseball names? The last table (Table 3) contains the top 25 names by AVG WAR per 1,000 births; here we see some less common names finally begin to appear. These names provide the most proverbial bang (WAR) for your buck (name). Yes, some names, like Barry and Reggie, are inflated in the rankings, probably due to the dominant play of Barry Bonds and Reggie Jackson, but could it not also mean these players were just byproducts of their birth names?!? Probably not, but it’s interesting, nonetheless.
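
The rate stat is simple division, sketched below in Python on the assumption that it is total WAR divided by births in thousands. The function name is mine, and the sample rows use the rounded WAR totals from the tables, so the results can differ slightly from the published column.

```python
def war_per_1000_births(total_war: float, total_births: int) -> float:
    """Total WAR accumulated by a name, per 1,000 babies given that name."""
    if total_births <= 0:
        raise ValueError("total_births must be positive")
    return total_war / (total_births / 1000)


# (total births, rounded total WAR) copied from the tables in this post.
names = {
    "Michael/Mike": (2_203_167, 1138),
    "Gary": (176_811, 353),
    "Barry": (34_534, 175),
}

# Rank names from most to least WAR per 1,000 births.
ranked = sorted(
    names,
    key=lambda name: war_per_1000_births(names[name][1], names[name][0]),
    reverse=True,
)
for name in ranked:
    births, war = names[name]
    print(f"{name}: {war_per_1000_births(war, births):.3f}")
```

Dividing by births is what lets a rare name like Barry outrank Michael despite a far smaller WAR total.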

So if you’re looking to increase the chances your child will make it professionally as a baseball player, then you might want to take a look at the names toward the top of the AVG WAR per 1,000 births table, choose your favorite, and hope for the best…OR, you could always just have a daughter.

Please post comments with your thoughts or questions. Charts can be found below.

Table 1: 25 Most Common Birth Names, 1970-1999

Rank | Name | Total Births | Total WAR | WAR per 1,000 Births
1 | Michael/Mike | 2,203,167 | 1,138 | 0.516529
2 | Christopher/Chris | 1,555,705 | 184 | 0.11821
3 | John | 1,374,102 | 799 | 0.581252
4 | James/Jim | 1,319,849 | 678 | 0.513316
5 | David/Dave | 1,275,295 | 859 | 0.673491
6 | Robert/Rob/Bob | 1,244,602 | 873 | 0.70175
7 | Jason | 1,217,737 | 77 | 0.062904
8 | Joseph/Joe | 1,074,683 | 616 | 0.573006
9 | Matthew/Matt | 1,033,326 | 95 | 0.091646
10 | William/Will/Bill | 967,204 | 838 | 0.866415
11 | Steve (Steven/Stephen) | 916,304 | 535 | 0.583649
12 | Daniel/Dane | 912,098 | 233 | 0.255674
13 | Brian | 879,592 | 154 | 0.174967
14 | Anthony/Tony | 765,460 | 314 | 0.409819
15 | Jeffrey/Jeff | 693,934 | 298 | 0.430012
16 | Richard/Rich/Rick/Dick | 683,124 | 888 | 1.29991
17 | Joshua | 677,224 | 0 | 0
18 | Eric | 627,323 | 122 | 0.194637
19 | Kevin | 613,357 | 305 | 0.497426
20 | Thomas/Tom | 583,811 | 505 | 0.86552
21 | Andrew/Andy | 566,653 | 184 | 0.325243
22 | Ryan | 558,252 | 17 | 0.030094
23 | Jon/Jonathan | 540,500 | 61 | 0.112118
24 | Timothy/Tim | 535,434 | 253 | 0.473074
25 | Mark | 518,108 | 397 | 0.765477

Table 2: 25 Highest Cumulative WAR, by Name, 1970-1999

Rank | Name | Total Births | Total WAR | WAR per 1,000 Births
1 | Michael/Mike | 2,203,167 | 1,138 | 0.516529
2 | Richard/Rich/Rick/Dick | 683,124 | 888 | 1.29991
3 | Robert/Rob/Bob | 1,244,602 | 873 | 0.70175
4 | David/Dave | 1,275,295 | 859 | 0.673491
5 | William/Will/Bill | 967,204 | 838 | 0.866415
6 | John | 1,374,102 | 799 | 0.581252
7 | James/Jim | 1,319,849 | 678 | 0.513316
8 | Joseph/Joe | 1,074,683 | 616 | 0.573006
9 | Steve (Steven/Stephen) | 916,304 | 535 | 0.583649
10 | Thomas/Tom | 583,811 | 505 | 0.86552
11 | Kenneth/Ken | 312,170 | 439 | 1.405644
12 | Mark | 518,108 | 397 | 0.765477
13 | Gary | 176,811 | 353 | 1.998179
14 | Ronald/Ron | 246,721 | 342 | 1.38456
15 | Anthony/Tony | 765,460 | 314 | 0.409819
16 | Kevin | 613,357 | 305 | 0.497426
17 | Gregory/Greg | 324,880 | 303 | 0.931729
18 | Jeffrey/Jeff | 693,934 | 298 | 0.430012
19 | Donald | 215,772 | 298 | 1.380161
20 | Frank | 176,720 | 298 | 1.687415
21 | Charles/Chuck | 458,032 | 262 | 0.571357
22 | Timothy/Tim | 535,434 | 253 | 0.473074
23 | Lawrence | 220,557 | 248 | 1.126239
24 | George | 226,108 | 246 | 1.090187
25 | Peter | 181,358 | 246 | 1.357536

Table 3: 25 Highest WAR per 1,000 Births, by Name, 1970-1999

Rank | Name | Total Births | Total WAR | WAR per 1,000 Births
1 | Barry | 34,534 | 175 | 5.079053
2 | Leonard | 31,626 | 123 | 3.895529
3 | Omar | 13,656 | 53 | 3.873755
4 | Fernando | 13,180 | 47 | 3.543247
5 | Theodore/Ted | 27,144 | 93 | 3.444592
6 | Jack | 53,079 | 176 | 3.323348
7 | Reginald/Reggie | 47,883 | 157 | 3.283002
8 | Frederick/Fred | 54,529 | 146 | 2.681142
9 | Bruce | 56,609 | 141 | 2.487237
10 | Calvin | 43,412 | 107 | 2.453239
11 | Gary | 176,811 | 353 | 1.998179
12 | Roger | 77,458 | 151 | 1.948153
13 | Glenn | 33,794 | 65 | 1.929337
14 | Darrell | 53,317 | 102 | 1.920588
15 | Frank | 176,720 | 298 | 1.687415
16 | Dennis | 131,577 | 218 | 1.653024
17 | Jerry | 122,465 | 201 | 1.638019
18 | Dale | 36,162 | 54 | 1.48775
19 | Lee | 62,922 | 89 | 1.406503
20 | Kenneth/Ken | 312,170 | 439 | 1.405644
21 | Louis/Lou | 142,969 | 200 | 1.400304
22 | Ronald/Ron | 246,721 | 342 | 1.38456
23 | Roy | 59,004 | 82 | 1.382957
24 | Donald | 215,772 | 298 | 1.380161
25 | Jay | 63,795 | 87 | 1.368446


MLB 2014 All-Loser Team

I’m mostly an NFL writer. For years, I’ve been naming an NFL All-Loser Team at the end of each regular season. It’s an all-star team composed exclusively of players whose teams missed the postseason. You can view it as a celebration of players who may be underrated or underappreciated because their teams aren’t very good, or you can view it as a shot at people who insist you can’t be that great if your team didn’t make the playoffs. Up to you. It’s a fun project, and it’s easy to apply to MLB as well as football.

Here’s what you’re getting after the jump:

* Four teams. We’ll do an American League All-Loser Team, National League All-Loser Team, MLB All-Loser Team, and an all-star team taken exclusively from the six clubs that finished last in their divisions.

* For each list, we’ll do nine position players (the NL gets a pinch-hitter instead of a DH), and I’ll show my imaginary batting order. Each team will also feature a five-man rotation, a right-handed reliever, and a left-handed reliever. So, 16 players per team.

* I’ll offer some minimal commentary on the teams, with a paragraph or two for each team to discuss surprising selections and close calls. For the MLB team, I’ll list the top three in fWAR at each position and explain my selections. There’s nothing earth-shattering here, unless you think we can’t make a wicked lineup out of players from losing teams.


Could It Be Time to Update WAR’s Positional Adjustments?

It’s been quite a week for the WAR stat. Since Jeff Passan dropped his highly controversial piece on the metric on Sunday night, the interwebs have been abuzz with arguments both for and against the all-encompassing value stat. One criticism in particular that caught my eye came from Mike Newman, who writes for ROTOscouting. Newman’s qualm had to do with a piece of WAR that’s often taken for granted: the positional adjustment. He made the argument that current WAR models underrate players who play premium defensive positions, pointing out that it would be “laughable” for Jason Heyward to replace Andrelton Simmons at shortstop, but not at all hard to envision Simmons being an excellent right fielder.

This got me thinking about positional adjustments. Newman’s certainly right to question them, as they’re a pretty big piece of the WAR stat, and one most of us seem to take for granted. Plus, as far as I’m aware, none of the major baseball websites regularly update the amount they credit (or debit) a player for playing a certain position. They just keep the values constant over time. I’m sure that whoever created these adjustments took steps to ensure they accurately represented the value of a player’s position, but maybe they’ve since gone stale. It’s certainly not hard to imagine that the landscape of talent distribution by position may have changed over time. For example, perhaps the “true” replacement level for shortstops is much different than it was a decade or so ago when Alex Rodriguez, Derek Jeter, Nomar Garciaparra, and Miguel Tejada were all in their primes.

I decided to try and figure out if something like this might be happening. If the current positional adjustments were in fact misrepresenting replacement level at certain positions, we’d expect the number of players above replacement level to vary by position. For example, there might be something like 50 above-replacement third basemen, but only 35 shortstops. Luckily, the FanGraphs leaderboard gives you the ability to query player stats by position played, which proved especially useful for what I was trying to do. For each position, I counted the number of plate appearances accumulated by players with a positive WAR and then divided that number by the total plate appearances logged at that position. Here are the results broken out by position for all games since 2002.

[Chart 1]
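
For readers who want to reproduce the calculation, here is a rough Python sketch of the share described above: plate appearances by positive-WAR players divided by all plate appearances at the position. The row layout `(position, plate_appearances, war)` and the sample numbers are made up for illustration and are not a FanGraphs export format.

```python
from collections import defaultdict


def positive_war_pa_share(rows):
    """Share of PAs at each position taken by players with WAR above zero.

    rows: iterable of (position, plate_appearances, war) tuples, one per player-position line.
    """
    total = defaultdict(int)
    positive = defaultdict(int)
    for position, plate_appearances, war in rows:
        total[position] += plate_appearances
        if war > 0:
            positive[position] += plate_appearances
    return {pos: positive[pos] / total[pos] for pos in total if total[pos] > 0}


# Made-up sample: two shortstops and two designated hitters.
sample = [("SS", 600, 2.5), ("SS", 100, -0.3), ("DH", 500, 1.0), ("DH", 200, -0.5)]
shares = positive_war_pa_share(sample)
for position, share in shares.items():
    print(f"{position}: {share:.1%}")
```

Weighting by plate appearances rather than counting players keeps a 20-PA call-up from counting as much as an everyday starter.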

Based on this data, it seems like the opposite of Newman’s hypothesis may be true. A significantly higher share of plate appearances has gone to positive-WAR players at the tougher end of the defensive spectrum, which implies that teams don’t have too difficult a time finding shortstops and center fielders who are capable of logging WARs above zero. Less than 13% of all SS and CF plate appearances have gone to sub-replacement players. But finding a replacement-level designated hitter seems to be slightly more difficult, as teams have filled their DH slot with sub-replacement-level players nearly 30% of the time. Either teams are really bad at finding DH types (or at putting them in the lineup), or the positional adjustments aren’t quite right. The disparities are even more pronounced when you look at what’s taken place from 2002 to 2014.

[Chart 2]

The share of PAs logged by above-replacement shortstops and center fielders hasn’t changed much over the years, but the numbers have plummeted for first basemen, corner outfielders, and DHs. From Billy Butler and Eric Hosmer to Jay Bruce and Domonic Brown, this year’s lineups have been riddled with sub-replacement hitters manning positions at the lower end of the defensive spectrum. Meanwhile, even low-end shortstops and center fielders, like Derek Jeter and Austin Jackson, have managed to clear the replacement-level hurdle this season if we only count games at their primary positions.

The waning share of above-replacement PAs coming from 1B, LF, RF, and DH has caused the overall share to drop as well, with a particularly big drop coming this year. Here’s a look at the overall trend.

[Chart 3]

And here it is broken down by position…

[Chart 4]

And just between this year and last…

[Chart 5]

Frankly, I’m not sure what to make of all of this. I’m hesitant to call it evidence that the positional adjustments are broken. There could be some obvious flaw in my methodology that I’m not considering, but I find it extremely interesting that there’s been such a shift between this year and last. We’re talking an 8 percentage point jump in the share of PAs that have gone to sub-replacement-level players. Maybe it’s been spurred by the rise of the shift, or maybe year-round interleague play has something to do with it, but it seems to me that something’s going on here. I’m interested to hear other people’s thoughts on these trends.


Cat Days of Summer: The Tigers and Schedule Effects

If you’ve been on the internet in the last few weeks (or within earshot of a Michigander) you may have heard about the Tigers. Specifically, you may have heard about how the odds in favor of a Detroit appearance in the 2014 ALDS dropped from 21-to-1 on July 25 to under break-even by August 23 before a slight rebound to finish out the month. Even more specifically, you may have read Mike Petriello’s article about that on this very website. Or at the very least, you may have heard their struggles described in a less quantitative fashion. Regardless, the month of August was not kind to the Bengals.

As Petriello pointed out, this has been less of a Tigers collapse than a Royals surge. But there’s still something to the idea that the Tigers were playing worse in August than they had been previously. Let’s start with the basics:

2014 | First Half | August
R/G | 4.80 | 4.58
RA/G | 4.25 | 4.74
W% | .582 | .516
Pythagenpat | .557 | .484

In August, the Tigers scored fewer runs, allowed more runs, and won fewer games than in the first half. On some level, that’s all that really matters. On another level, something else is different about August for these Tigers.
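
For reference, the Pythagenpat row can be approximated from the two runs-per-game rows. The sketch below assumes the common form of the estimator, in which the exponent is total runs per game raised to the 0.287 power; the table does not say which constant was used, so treat 0.287 as my assumption.

```python
def pythagenpat(runs_per_game: float, runs_allowed_per_game: float,
                power: float = 0.287) -> float:
    """Expected winning percentage from runs scored and allowed per game.

    The exponent floats with the run environment: x = (R/G + RA/G) ** power.
    The default power of 0.287 is an assumed, commonly cited value.
    """
    exponent = (runs_per_game + runs_allowed_per_game) ** power
    scored = runs_per_game ** exponent
    allowed = runs_allowed_per_game ** exponent
    return scored / (scored + allowed)


first_half = pythagenpat(4.80, 4.25)
august = pythagenpat(4.58, 4.74)
print(f"First half: {first_half:.3f}  August: {august:.3f}")
```

With the table’s inputs this lands within a point or so of the .557 and .484 shown above, which suggests the published row was built in a similar way.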

Back on July 14, Buster Olney and Jeff Sullivan both wrote articles about schedule strength. Olney called the Tigers’ schedule the second-most difficult of 17 “contending” teams (paywall), while Sullivan said it was the easiest in all of MLB. One of the key reasons for the discrepancy was that Sullivan was using projections to determine the difficulty of a particular opponent, while Olney was using actual results. Score one for Sullivan. Another key difference was that as of July 14, the Tigers were about to play 55 games in 56 days, which did not factor into Sullivan’s analysis.

A point for Olney? Perhaps. But first, what would we expect to see if this were a result of schedule fatigue? Or put another way, which groups of players might be hurt most or least by not having a day off? Based on conventional wisdom, the bullpen would probably be the most affected, and the starters the least. So how does this match up to the Tigers?


The Search for a Good Approach

Last week I explored the strategic effect of seeing more pitches per plate appearance. I love the ten-pitch walk as much as the next guy, but what I love even more is seeing a guy be able to change that approach to beat a scouting report. Let’s take a look at June 5, 2014, when the A’s went to see Masahiro Tanaka for the first time. The first batter is Coco Crisp:

Pitcher: M. Tanaka
Batter: C. Crisp

# | Speed | Pitch | Result
1 | 91 | Sinker | Ball
2 | 90 | Sinker | Ball
3 | 91 | Fastball (Four-seam) | Ball
4 | 90 | Fastball (Four-seam) | Called Strike
5 | 91 | Fastball (Four-seam) | Foul
6 | 92 | Fastball (Four-seam) | In play, out(s)

So Crisp doesn’t get the best of Tanaka, but he makes Tanaka labor a bit through six pitches. If you’re going to make an out to start the game, it might as well be a long one. For the next batter, John Jaso, Tanaka decides to go right after him:

Pitcher: M. Tanaka
Batter: J. Jaso

# | Speed | Pitch | Result
1 | 90 | Sinker | In play, run(s)

I may be looking too deeply into the narrative here, but I love to imagine Tanaka getting a bit frustrated. Perhaps the scouting report said that Coco is aggressive early, while Jaso’s 15% walk rates in 2012 and 2013 suggest that he’s more patient. Tanaka has to throw six pitches in order to get Crisp out, but after deciding to go right after Jaso, he gets taken deep.

So I wondered if there are players who are able to fulfill both ends of this spectrum. Are there any players who are capable of prolonging their time at the plate until they see the pitch they want, but are also aggressive enough to hit the gas on the first pitch? I used FanGraphs for the pitches per plate appearance data, and Baseball-Reference’s Play Index to look up all instances of first-pitch hits this season. Originally I was going to use first-pitch swings, but I decided to just stick to times when the pitcher gets punished for trying to get ahead early. After all, if your decision is to get ahead early in the count, and the guy swings but all he does is foul it off or hit into an out, then that doesn’t change your approach as a pitcher. I wanted to see guys whom the book isn’t written on yet. Advance Warning: These stats will be about a week old by the time you see them, as I am a slow, slow man.

Best P/PA Rank + FPH Rank (I have no idea how to pitch to them)

Player | FPH% | P/PA | FPHR | PPAR | FPHR + PPAR | wOBA
Scott Van Slyke | 5.940594059 | 4.143564356 | 26 | 45 | 71 | 0.385
Eric Campbell | 4.2424242424 | 4.248520710 | 117 | 18 | 135 | 0.326
Jesus Guzman | 4.294478528 | 4.17791411 | 111 | 33 | 144 | 0.247
Daniel Murphy | 4.577464789 | 4.111842105 | 87 | 58 | 145 | 0.305
Joey Votto | 4.044117647 | 4.334558824 | 135 | 12 | 147 | 0.359
Mark Reynolds | 5.037783375 | 4.0375 | 59 | 91 | 150 | 0.307

(For Reference: FPH% = First Pitch Hit Percentage, or how often a batter gets a hit on the first pitch they see.  P/PA = Pitches per Plate Appearance. FPHR = First Pitch Hit Ranking, or how they rank in this category compared to the rest of the league.  PPAR = Pitches per Plate Appearance Ranking.  FPHR + PPAR = The addition of these two numbers.)
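
To make the ranking columns concrete, here is a small Python sketch of how the two ranks and their sum could be computed. It assumes rank 1 goes to the highest FPH% and the highest P/PA, with ties broken by input order (my choice, not something stated above). The four-player sample uses rounded values from the tables in this post, so the ranks are relative to that sample only, not the full league.

```python
def rank_descending(values):
    """Rank players so that 1 = highest value. values: dict of name -> stat."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def combined_ranks(fph_pct, pitches_per_pa):
    """Return name -> (FPHR, PPAR, FPHR + PPAR)."""
    fphr = rank_descending(fph_pct)
    ppar = rank_descending(pitches_per_pa)
    return {name: (fphr[name], ppar[name], fphr[name] + ppar[name]) for name in fph_pct}


# Rounded values from this post's tables; a four-player sample for illustration.
fph_pct = {"Scott Van Slyke": 5.94, "Joey Votto": 4.04, "Jose Altuve": 8.16, "Joaquin Arias": 0.65}
p_per_pa = {"Scott Van Slyke": 4.14, "Joey Votto": 4.33, "Jose Altuve": 3.18, "Joaquin Arias": 3.55}

combined = combined_ranks(fph_pct, p_per_pa)
for name, (fphr, ppar, total) in sorted(combined.items(), key=lambda item: item[1][2]):
    print(f"{name}: FPHR {fphr}, PPAR {ppar}, sum {total}")
```

A low sum means a hitter is near the top in both categories; a large gap between the two ranks is what the later tables sort on.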

I like this table! I have wondered at times what has caused Scott Van Slyke’s resurgence this year. Perhaps this table gives us a bit of a clue. Van Slyke is the only player in MLB to rank in the top 50 in both FPHR and PPAR. That’s pretty neat. Daniel Murphy is also quite balanced, but he’s been much more consistent over the last few years. He’s particularly interesting in that he doesn’t have a particularly high walk rate or strikeout rate. I guess he’s just selective at times. Jesus Guzman’s presence on this list goes to show that a good approach doesn’t necessarily mean success; it just means that he may not head back to the bench in any predictable fashion. I stretched out the table one spot to include Mark Reynolds, because his name on this table makes me feel better about drafting him in Fantasy Baseball for the past five years.

I also wanted to look at the flip-side.  Who are the guys who don’t tend to take a lot of pitches, but also don’t tend to make any decent contact on first pitches?

Highest P/PA Rank + FPH Rank (Pick your poison)

Player | FPH% | P/PA | FPHR | PPAR | FPHR + PPAR | wOBA
Joaquin Arias | 0.6451612903 | 3.55483871 | 400 | 370 | 770 | 0.221
Ben Revere | 1.629327902 | 3.563636364 | 368 | 365 | 733 | 0.307
Endy Chavez | 0.9345794393 | 3.674311927 | 393 | 321 | 714 | 0.301
Conor Gillaspie | 2.168674699 | 3.587112172 | 329 | 359 | 688 | 0.353
Jean Segura | 2.564102564 | 3.42462845 | 289 | 396 | 685 | 0.262

Here we have a much less impressive list. Joaquin Arias has been one of the worst hitters in the majors this year, and his dominance atop this leaderboard makes a bit of sense. However, Conor Gillaspie is having an excellent season for the Pale Hose, despite the fact that he doesn’t seem to excel in either of the areas this article is interested in. One peculiar note is that this group is pretty poor at hitting for power in general; these 5 guys have 13 home runs between them on the year, and six of those are Gillaspie’s.

So now let’s look at the weird ones. I would think that if there are certain players who tend to take a lot of pitches and who also never seem to square up the first pitch, then we know our game plan. Get ahead early on these batters. We can try to view that by simply looking at each player’s FPH ranking minus their P/PA ranking. This is the same as looking at the absolute value of their PPAR minus their FPHR. Here are the top five in that respect:

Worst in FPHR, Best in PPAR (Groove it Early)

Player | FPH% | P/PA | FPHR | PPAR | FPHR-PPAR | wOBA
Jason Kubel | 1.136363636 | 4.471590909 | 387 | 4 | 383 | 0.278
Aaron Hicks | 0.641025641 | 4.224358974 | 401 | 21 | 380 | 0.286
Mike Trout | 1.217391304 | 4.418965517 | 385 | 6 | 379 | 0.401
Matt Carpenter | 1.376936317 | 4.357264957 | 380 | 8 | 372 | 0.343
A.J. Ellis | 1.181102362 | 4.255813953 | 386 | 17 | 369 | 0.264

Golly; I’ve figured out Mike Trout! Mike Trout ranks very highly on our list of PPAR (6th) but unfortunately sits near the bottom (385th) when it comes to the first-pitch punish. All of these guys actually fit this mold. We have three relatively poor hitters accompanied by the best player in baseball and an above-average infielder on a winning team. So we can tell that being patient isn’t necessarily a good or bad thing; it’s just that hitter’s style. Now let’s take a look at the reverse:

Best in FPHR, Worst in PPAR (Don’t throw it in the zone early)

Player | FPH% | P/PA | FPHR | PPAR | PPAR-FPHR | wOBA
Jose Altuve | 8.159722222 | 3.175862069 | 5 | 407 | 402 | 0.355
Wilson Ramos | 7.169811321 | 3.293680297 | 6 | 405 | 399 | 0.327
Erick Aybar | 6.628787879 | 3.347091932 | 12 | 401 | 389 | 0.312
Ender Inciarte | 8.360128617 | 3.471518987 | 3 | 391 | 388 | 0.284
A.J. Pierzynski | 6.413994169 | 3.391930836 | 16 | 399 | 383 | 0.283

It’s always satisfying when the data shows what you expect it to. I imagined Jose Altuve as being among the more aggressive hitters, and this shows that at least. Altuve ranks 5th in the league in FPH% and sits near the very bottom (407th) in the P/PA category. Interesting to see that this top five is also sorted by wOBA; Altuve is the best hitter on the list, and Pierzynski is the worst. So there’s nothing necessarily wrong with an aggressive approach, but it does give us a clue as to a possible plan of attack.

So all this is to say, like my last article, that no particular approach is best.  One can look to swing at the first pitch, or one can be patient and wait for their pitch to come.  That said, everybody does have an approach, and that means they’ve got something they’re not looking for.  Stats like FPH and PPAR may just give us more clues as fans as to what teams put together with scouting reports.

So to conclude by going back to our first example, perhaps Tanaka should have read this data before his start against the A’s.  Coco ranks 266th in the league in FPHR, but a respectable 76th in PPAR.  Conversely, Jaso ranks 80th in the league in FPHR, but just 225th in PPAR.  Tanaka might have been better served by going after the aging Crisp and saving his energy for the somewhat aggressive Jaso.


Baseball’s 10 Most Unusual Hitters

Baseball, more than any other major team sport, has the reputation for having the least athletic athletes. Jose Molina is obligated to, at times, sprint. Jorge de la Rosa must swing a baseball bat. David Ortiz sometimes has to play in the field. Having skills like catcher defense, pitching, and hitting with power will earn you playing time, and many players have such elite strengths that it’s worth it just to deal with those weaknesses. So many of baseball’s skills are unrelated that players have to spend a lot of time doing things they aren’t good at, at least relative to other MLB talent. A good way to make anyone look unathletic is to make them perform a long list of skills that have little to do with one another and compare them to the best in the world at those tasks.

I wanted to assemble a list of players who experienced something like this phenomenon the most frequently. Essentially, I wanted to see which players’ strengths and weaknesses were the farthest apart. To determine the players whose skills varied the most, I gathered what I consider to be the six stats that best describe a player’s strengths and weaknesses: BABIP and K% for contact, BB% for discipline, ISO for power, and Fielding and Baserunning values. I then gathered stats from 2011-2014 to better control for less reliable fielding metrics, assigned each player’s stats a percentile rank, and calculated the standard deviation of those six percentile ranks for each player.
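
The method can be sketched in a few lines of Python. This is an illustration under assumptions rather than the exact procedure: the percentile formula (share of other players beaten), the flag that flips K% so that lower is better, and the use of the population standard deviation are all my choices, and the sample profiles are invented.

```python
from statistics import pstdev


def percentile_ranks(values, higher_is_better=True):
    """Percentile rank (0-100) of each value: the share of other players it beats."""
    count = len(values)
    ranks = []
    for value in values:
        if higher_is_better:
            beaten = sum(1 for other in values if other < value)
        else:
            beaten = sum(1 for other in values if other > value)
        ranks.append(100 * beaten / (count - 1) if count > 1 else 50.0)
    return ranks


def skill_spread(player_percentiles):
    """Standard deviation of one player's six percentile ranks."""
    return pstdev(player_percentiles)


# Invented profiles: BABIP, K%, BB%, ISO, Fielding, Baserunning percentiles.
balanced = [50, 55, 45, 60, 40, 50]
lopsided = [99, 5, 10, 2, 95, 98]
print(round(skill_spread(balanced), 1), round(skill_spread(lopsided), 1))
```

A larger spread means a player’s best and worst tools sit further apart, which is what the top-10 list below is sorted on.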

For instance, Mike Trout’s attributes look like this:

[Chart: Mike Trout]

His strikeout rate has been higher than MLB average, but he is otherwise an exceptionally well rounded player, as we know.

The most evenly talented player in baseball has been Kyle Seager, who is almost in the middle third at every stat.

[Chart: Kyle Seager]

Many players have much more severe strengths and weaknesses. Here are the 10 players whose stats show the greatest variation from one another.

10. Dexter Fowler

9. Ichiro Suzuki

8. Jose Altuve

7. Curtis Granderson

6. Mark Reynolds

5. Giancarlo Stanton

4. Miguel Cabrera

3. Darwin Barney

2. Adam Dunn

1. Ben Revere

The whole list is fun to look through and play around with, so feel free to click here and look through all the qualifying players.


Not All One-Run Games are Created Equal

It’s the bottom of the fourth. No outs. Your beloved Milwaukee Brewers are up to bat trailing the Dodgers 1-0, with Clayton Kershaw on the mound. They’ve picked up two scattered hits and drawn a walk over four innings, but the sentiment in the dugout and the stands seems to read: if they haven’t scored yet, chances don’t look so good.

Consider the same situation, now, with one small change. Your Brewers are still down by a run. It’s still the bottom of the fourth. Kershaw is still dealing. But it’s 2-1 Los Angeles this time. Milwaukee has still only gotten two hits and drawn a single walk, but the timing has worked out such that a run scored. By the numbers, things are almost exactly the same. No question about it. The sentiment, though, is certainly different. We’ve broken through once already, think the players, manager, and fans. We can do it again. Well, of course the Brewers can do it again. But, statistically speaking, will they? That is: when trailing by one run as they enter a half-inning, is a team more likely to come back in a non-shutout than in a game in which they haven’t yet scored?

The answer is “yes,” although only by what initially appears to be a small margin. In 2013, 5705 half-innings began with the batting team trailing by a run. 11.4% (651) of those half-innings ended with the batting team tied or in the lead. The same year, 2915 half-innings began with the batting team trailing specifically by the score of 1 to 0. 11.1% (324) of those ended in a lead change or tie.
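
Here is a minimal Python sketch of how those two percentages could be tallied from half-inning records. The `HalfInning` record and its field names are hypothetical, not a real play-by-play schema, and the sample data is invented. A comeback is counted when the trailing team scores at least one run in the half-inning.

```python
from dataclasses import dataclass


@dataclass
class HalfInning:
    inning: int
    batting_score: int   # batting team's runs when the half-inning starts
    fielding_score: int  # fielding team's runs when the half-inning starts
    runs_scored: int     # runs the batting team scores in this half-inning


def comeback_rate(half_innings, shutout_only=False):
    """Share of one-run-deficit half-innings ending tied or with a lead change.

    With shutout_only=True, only 1-0 deficits are counted.
    """
    starts = comebacks = 0
    for half in half_innings:
        if half.fielding_score - half.batting_score != 1:
            continue
        if shutout_only and half.batting_score != 0:
            continue
        starts += 1
        if half.runs_scored >= 1:
            comebacks += 1
    return comebacks / starts if starts else float("nan")


# Invented sample: five one-run situations (three of them 1-0) and one tie game.
sample = [
    HalfInning(4, 0, 1, 0), HalfInning(4, 1, 2, 1), HalfInning(7, 0, 1, 0),
    HalfInning(2, 3, 3, 1), HalfInning(9, 2, 3, 1), HalfInning(5, 0, 1, 1),
]
print(comeback_rate(sample), comeback_rate(sample, shutout_only=True))

# The 2013 figures quoted above follow from the raw counts.
print(round(100 * 651 / 5705, 1), round(100 * 324 / 2915, 1))
```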

At first glance, a 0.3-percentage-point difference between the odds of scoring when down by a run and the specific case of being down 1-0 seems minor. And it is, really. For years with complete-season data available since 1871, the percentage of half-innings that began as a one-run game with the trailing team up to bat and ended in a lead change or tie (let’s call this %ORLC) averages out to 11.5% ± 1.3% (1 σ). The subset of these in which the batting team was being shut out (let’s call this %ORSLC) has an average of 10.6% ± 1.1% (1 σ). Middle-school statistics will tell you that while, yes, %ORSLC is on average nearly a percentage point lower than %ORLC, they fall within a standard deviation of each other and, thus, their difference is not statistically significant.

That’s true. But baseball isn’t middle-school statistics and two subsets whose error ranges overlap are not for all practical purposes equal. Quite remarkably, %ORLC has exceeded %ORSLC for each consecutive season of Major League Baseball since 1977 (when %ORSLC was 0.2% higher) and every year since 1871 except for five seasons (out of the 111 years of complete-season data that were available).

That is: in 106 out of the last 111 seasons for which box scores have been logged every game, a batting team behind in a one-run ballgame has successfully erased the deficit more often when not trailing 1-0. The margin isn’t huge, of course, but the trend is meaningful.

[Chart: Percentage of one-run game situations and specific 1-0 game situations (%ORLC and %ORSLC, respectively) in which the team losing scores to tie or take the lead]

After all, baseball is a game of small but meaningful margins. The 111-year average relative difference between these two metrics (10.6% vs 11.5%) is proportional to a .277 batting average versus .300, or 89 wins in a 162-game season instead of 97. The latter is perhaps a more relevant comparison, since it is gaining (and maintaining) a lead that is crucial to winning games.
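
The proportional comparison is quick to check: scale the larger benchmark by the ratio of the two averages. A tiny Python sketch, with a helper name of my own:

```python
def scale_by_ratio(benchmark: float, lower: float = 10.6, higher: float = 11.5) -> float:
    """Shrink a benchmark by the same proportion as %ORSLC relative to %ORLC."""
    return benchmark * lower / higher


scaled_average = scale_by_ratio(0.300)  # a .300 hitter scaled down
scaled_wins = scale_by_ratio(97)        # a 97-win season scaled down
print(f"{scaled_average:.3f}", round(scaled_wins))
```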

Among teams in 2013, however, these differences aren’t so marginal. In %ORLC (percentage of half-innings in which a team trailing by a run ties it up or takes the lead) the Royals finished first at 16.7% and the Cubs finished last at 6.5%. In %ORSLC (same stat but for the score 1-0), the Rays finished first at 16.7% (same number, coincidentally) and the Red Sox finished last at 4.9%. Considering the Royals didn’t make the playoffs in 2013 and the Red Sox won the World Series, I wouldn’t use %ORLC and %ORSLC as indicators of a team’s ultimate success unless you’re looking to lose a lot of money in Vegas.

While one could theorize for hours on the meaning and utility of each made-up statistic, it sure doesn’t seem like %ORLC and %ORSLC are indicative of much on a team-by-team basis. But that doesn’t mean they’re useless. Let’s go back to the long-term trend of %ORLC and %ORSLC, where the former was higher than the latter 106 out of 111 times.

Some underlying process, it would seem, must be responsible for this impressive stat. If we are to believe that teams truly underperform, ever so slightly, when they’re losing 1-0 due only to the fact that they’re being shut out, shouldn’t we be able to see the effect of psychology on performance somewhere else?

As it turns out, you don’t have to look far. Let’s consider the general situation of a team coming up to bat down by a run (not only the specifically 1-0 case), which is colloquially termed a “one-run game.” We’ll abbreviate any instance of this (a team trailing by one coming to bat in any half-inning) as OR. Now this situation could happen at any point in a game. A visiting team leads off with a run in the top of the 1st, the home team comes up to bat – that’s an OR. It’s all tied up in the top of the 13th, the third baseman slugs a solo shot to left, three outs are recorded, the home team steps up to the plate with one chance to stay alive – that’s an OR. So, in what inning on average does an OR occur?

In 2013, the answer was the 4.95th inning. In 2012 and also for the last 111 years of available records, the 4.91st inning. Baseball amazes us once again with its year-to-year consistency in obscure statistics. But this obscure stat isn’t all that meaningful on its own. Okay, so most one-run situations occur near the 5th inning – so what?

Well, let’s take a look now at the average inning in which a team scored in an OR to tie or take the lead. We’ll call this a one-run game situation where the lead changes, or ORLC. In 2013, of all the instances of ORLCs, the average time they occurred was the 5.18th inning. In 2012, the 5.10th inning. And for the same 111 seasons of recorded game data, the 5.20th inning. Once again, we see a marginal but nonetheless compelling deviation from the average, just as we saw with %ORSLC. Teams score in one-run situations about a third of an inning later than the one-run situations tend to occur themselves. That may not seem like a whole lot, but consider that in our 111-season dataset only two years – 1902 and 1912 – saw earlier ORLCs than ORs on average. Just two years in one-hundred eleven.


Above: Average innings of occurrence for one-run game situations (OR) and one-run game situations in which the trailing team scores to tie or take the lead (ORLC)
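For anyone who wants to try this at home, here is a minimal sketch of how the two averages could be computed. It assumes each half-inning is stored as a tuple of (inning, the batting team's deficit when it comes to bat, runs it scores in that half-inning). That record layout and the sample data are hypothetical, not the format of the actual game logs.

```python
from statistics import mean

# Hypothetical half-inning records: (inning, batting team's deficit when it
# comes to bat, runs it scores in that half-inning). Made-up sample data.
half_innings = [
    (1, 1, 0),
    (3, 1, 2),
    (4, 0, 1),
    (5, 1, 0),
    (7, 1, 1),
    (9, 2, 0),
]

def average_innings(records):
    """Return (average OR inning, average ORLC inning)."""
    # OR: the batting team trails by exactly one run when it comes up.
    or_innings = [inning for inning, deficit, _ in records if deficit == 1]
    # ORLC: an OR in which the trailing team scores at least once,
    # which ties the game or gives it the lead.
    orlc_innings = [
        inning for inning, deficit, runs in records
        if deficit == 1 and runs >= 1
    ]
    return mean(or_innings), mean(orlc_innings)

avg_or, avg_orlc = average_innings(half_innings)
print(f"average OR inning:   {avg_or:.2f}")
print(f"average ORLC inning: {avg_orlc:.2f}")
print(f"difference:          {avg_orlc - avg_or:.2f}")
```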

So what’s going on? I like to think of average ORLC minus average OR as a league-wide statistic for urgency. Consider the following: if the inning number had no effect on the performance of a trailing team in a one-run situation, then we would see roughly the same average inning of occurrence for both OR and ORLC. Out of 111 years, we’d expect to see about 55 years in which OR occurred earlier on average than ORLC and around 55 in which it didn’t. But we don’t see this at all, which strongly suggests that inning number has an effect on how a team does at the plate when down by a run. This is the urgency statistic. It describes a trend that has rung true for the past 101 consecutive seasons of Major League Baseball – when time is running out and the 9th inning is rapidly approaching, teams in close games get their acts together and produce runs. Not every time, of course, but we’re speaking in averages of massive sample sizes here.
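To put a rough number on how lopsided 109 out of 111 is, here is a quick sketch of a sign test. It assumes, purely for illustration, that each season is an independent coin flip with a 50% chance of ORLC coming later than OR. The function name is my own and nothing here comes from the original analysis.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

seasons = 111
later_orlc = 109  # every season except 1902 and 1912

# Chance of seeing 109 or more "later" seasons if inning number had no effect
# (assumption: seasons are independent 50/50 coin flips).
p_value = prob_at_least(later_orlc, seasons)
print(f"P(at least {later_orlc} of {seasons}) = {p_value:.2e}")
```

Under those assumptions the probability is on the order of 10^-30, which is why a coin-flip explanation is so hard to swallow.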

So, while your Brewers are likely to fare worse trailing Kershaw and the Dodgers 1-0 than 2-1, take solace in the fact that it’s the fourth inning. Statistically speaking, they’ll have a better chance of breaking through as the game goes on and their need for a run becomes more urgent. The effect of team psychology has left its imprint on the records of baseball games since the sport’s earliest days.


Pitches Seen: Baseball’s Boring Inefficiency

I think I might be the biggest fan in the world of the Ten-Pitch Walk.  I don’t know why, but I get overly excited when I see a player really battle for a long time, against everything the pitcher has, only to win the battle through patience.  Perhaps it’s because it’s so contrary to the spirit of what’s actually exciting about baseball; seeing players run around and field a batted ball.  It’s wholly a battle of attrition.  It’s the baseball equivalent of watching somebody run a marathon; you may not think the act itself is exciting, but it’s certainly an impressive feat in a vacuum.

So this has also led to a fascination with pitches seen per plate appearance.  I’ve long wondered if certain teams place an emphasis on teaching their players to see more pitches per plate appearance.  It seems fairly self-evident that seeing more pitches is, in isolation, better than seeing fewer pitches.  You tire the pitcher out quicker, you see more data for your next at-bat to work with, and you give your team a chance to see what the pitcher has, and how he’ll react in different situations.  I hypothesized, purely based on conventional wisdom, that the A’s would be good at this and the Blue Jays would be bad at this.  That’s not to say that one approach is better than the other, but just that some teams seem more patient than others.

Fortunately, FanGraphs has data available per hitter as to how many pitches they see.  I pulled that data out and found each player’s average pitches per plate appearance (P/PA) since the year 2003 (the earliest we have this data, from what I can tell) and restricted the findings to active players only.  Then I ran some regressions to see if there was any correlation between P/PA and useful batting stats.  Here’s what I found:

We see a slightly positive correlation between P/PA and wOBA.  It’s not really anything to write home about, but at least it isn’t negative.  At first glance, seeing more pitches doesn’t seem to relate heavily to overall performance at the plate.  What about on base percentage?

Slightly better here, but still not great.  Seeing more pitches does have a little more correlation to getting on base, but there are plenty of aggressive swingers that don’t follow that model, so it means the correlation is loose at best.  What if we talk just about taking walks?

Here we have a real correlation.  .59 is a fairly strong correlation, and that makes sense.  The more pitches you see, the more likely you are to take a walk.  If you can successfully foul off anything in the strike zone, you will eventually walk (or the pitcher will die of exhaustion, either way, you win).  This is reasonably useful.  If you’re trying to find a way to make your team walk more, maybe you can invest in some players that see more pitches per plate appearance than normal.  A correlation this strong makes me think about strikeout percentage too, though, because every pitch you foul off brings you closer to (or leaves you just one whiff away from) striking out.
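For reference, here is a small sketch of how a correlation like this could be computed by hand.  It assumes the figure in question is a Pearson correlation coefficient (it could also be an R-squared from the regression), and the five data points are made up for illustration rather than taken from the FanGraphs data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up (P/PA, BB%) pairs for five hypothetical hitters.
pitches_per_pa = [3.4, 3.7, 3.9, 4.1, 4.4]
walk_rate = [4.5, 7.0, 6.5, 9.0, 12.0]

r = pearson_r(pitches_per_pa, walk_rate)
print(f"r = {r:.2f}")
```

Swapping in wOBA, OBP, or K% for the second list would give the other comparisons in this post.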

There is a positive correlation here, but not nearly as strong as between BB% and P/PA.  It’s stronger than the other useful stats like wOBA, but it’s interesting to know that seeing more pitches relates much more strongly to taking a walk than to striking out, at least on a grand scale.  There is some research to be done here to see how the odds of each plate appearance outcome change as the pitch count increases, but I’ll leave that for another day.  My next thought was to see if there are, in fact, any teams that are better at this than other teams.  Here’s what we’ve got on a team level:

Rank Team P/PA
1 Red Sox 4.0506764011
2 Twins 4.0396551724
3 Cubs 3.9222196952
4 Yankees 3.9142662735
5 Pirates 3.9037861915
6 Astros 3.9028792437
7 Padres 3.9021177686
8 Mets 3.9009743938
9 Marlins 3.8916836619
10 Indians 3.8914762742
11 Athletics 3.8899398108
12 Phillies 3.8839715662
13 Blue Jays 3.8685393258
14 Cardinals 3.8634547591
15 Rays 3.8511224058
16 Rangers 3.8489497286
17 Dodgers 3.8480325645
18 Tigers 3.8314217702
19 Angels 3.8280856423
20 Diamondbacks 3.8161904762
21 Nationals 3.8146927243
22 White Sox 3.811023622
23 Giants 3.8038379531
24 Reds 3.8015854512
25 Orioles 3.8014611087
26 Braves 3.7944609751
27 Mariners 3.7358235824
28 Royals 3.7310519063
29 Rockies 3.7244254169
30 Brewers 3.6745739291

Well, my original hypotheses were not great ones.  The A’s and the Blue Jays, at 11th and 13th, are both decidedly middle of the road teams.  I find it most fun in times like this to look at the extremes; in this case, the Red Sox and the Brewers.  The difference in pitches seen per plate appearance between these two teams is 0.38.  That may seem small, but it adds up.  If we assume the average pitcher faces 4 batters per inning, that’s an additional 1.5 pitches per inning, and 9 pitches by the end of the sixth, just purely by the nature of the hitters.  In a tightly contested game, that may mean the difference between getting to the bullpen in the 7th rather than the 8th, or even the 7th rather than the 6th.

It should be noted that I limited this data set to 2014 (in contrast to the earlier data which was 2003 onwards) just so we could get a realistic look at roster construction, and to see if any teams are, right now, putting any particular emphasis in this area. The BoSox are carried by the very patient eye of Mike Napoli (4.51 P/PA), but hurt by the rather hacky eye of AJ Pierzynski (3.42 P/PA). Even on one team, that’s more than a pitch per plate appearance, which is pretty startling. The Brewers don’t have nearly the same difference; their best is Mark Reynolds with 4.04 P/PA and their worst is Jean Segura with 3.42 P/PA. As an aside, Chone Figgins is by far the best in this with a whopping 4.99 P/PA, though it was in just 76 PA. Kevin Frandsen brings up the rear with 3.16 P/PA in 189 PA. A lineup of all Mike Napolis would see 24.3 more pitches than a lineup of Kevin Frandsens before the leadoff Napoli even comes up a third time. I would feel bad for that pitcher.
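The back-of-the-envelope math above is just a P/PA gap multiplied by a number of plate appearances. A minimal sketch, with a helper name of my own choosing and the same assumptions as in the text (4 batters per inning, two full turns through a nine-man order):

```python
def extra_pitches(p_pa_high, p_pa_low, plate_appearances):
    """Additional pitches the more patient hitters see over the given PAs."""
    return (p_pa_high - p_pa_low) * plate_appearances

# Red Sox vs. Brewers, using the team P/PA figures from the table above
# and the assumed 4 batters faced per inning.
per_inning = extra_pitches(4.0506764011, 3.6745739291, 4)
through_six = extra_pitches(4.0506764011, 3.6745739291, 4 * 6)

# Nine Napolis vs. nine Frandsens over two full turns through the order.
napoli_vs_frandsen = extra_pitches(4.51, 3.16, 2 * 9)

print(f"{per_inning:.1f} extra pitches per inning, {through_six:.1f} through six")
print(f"{napoli_vs_frandsen:.1f} extra pitches in two turns through the lineup")
```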

The talk about teams possibly emphasizing this data made me wonder if I could make a huge difference if I compiled a team solely to do this; just make sure the pitchers throw a ton of pitches.  With that, I present to you the 2014 All-Stars and Not-So-All-Stars in this area, with a PA minimum thrown in to eliminate Figgins-like outliers:

All-Stars P/PA wOBA
C A.J. Ellis 4.344444444 0.311
1B Mike Napoli 4.353585112 0.371
2B Matt Carpenter 4.20647526 0.362
3B Mark Reynolds 4.179741578 0.341
SS Nick Punto 4.033495408 0.293
LF Brett Gardner 4.305959302 0.332
CF Mike Trout 4.219285365 0.404
RF Jayson Werth 4.399714635 0.364
DH Carlos Santana 4.297962322 0.356

Not-So-All-Stars P/PA wOBA
C A.J. Pierzynski 3.33404535 0.320
1B Yonder Alonso 3.603264727 0.318
2B Jose Altuve 3.266379723 0.321
3B Kevin Frandsen 3.41781874 0.296
SS Erick Aybar 3.415445741 0.308
LF Delmon Young 3.450895017 0.321
CF Carlos Gomez 3.517879162 0.321
RF Ben Revere 3.544046983 0.296
DH Salvador Perez 3.366071429 0.331

Despite the fact that there isn’t a strong correlation between wOBA and P/PA directly, it’s worth noting that the P/PA All-Stars are noticeably better hitters than the Not-So-All-Stars: averaging the wOBA figures listed above gives .348 for the All-Stars, compared to .315 for the Not-So-All-Stars. The Not-So-All-Stars certainly present a fine lineup though; the All-Stars just have the benefit of having Mike Trout in their lineup. It’s nice to know that this is one other area that Mike Trout simply is amazing at, confirming the obvious. The All-Stars have a collective P/PA of 4.26, while their counterparts sit down at 3.43. That’s a gap of .83 pitches per plate appearance, which over the course of two turns through the lineup (18 plate appearances) is about 15 pitches; that’s definitely something notable.
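Here is a short sketch of the lineup comparison, using simple unweighted means of the values in the two tables above. That is an assumption on my part: figures quoted in the text may have been weighted or rounded differently, so small discrepancies are possible.

```python
from statistics import mean

# (P/PA, wOBA) pairs copied from the two lineup tables above.
all_stars = [
    (4.344444444, 0.311),  # A.J. Ellis
    (4.353585112, 0.371),  # Mike Napoli
    (4.20647526, 0.362),   # Matt Carpenter
    (4.179741578, 0.341),  # Mark Reynolds
    (4.033495408, 0.293),  # Nick Punto
    (4.305959302, 0.332),  # Brett Gardner
    (4.219285365, 0.404),  # Mike Trout
    (4.399714635, 0.364),  # Jayson Werth
    (4.297962322, 0.356),  # Carlos Santana
]
not_so_all_stars = [
    (3.33404535, 0.320),   # A.J. Pierzynski
    (3.603264727, 0.318),  # Yonder Alonso
    (3.266379723, 0.321),  # Jose Altuve
    (3.41781874, 0.296),   # Kevin Frandsen
    (3.415445741, 0.308),  # Erick Aybar
    (3.450895017, 0.321),  # Delmon Young
    (3.517879162, 0.321),  # Carlos Gomez
    (3.544046983, 0.296),  # Ben Revere
    (3.366071429, 0.331),  # Salvador Perez
]

def lineup_means(lineup):
    """Unweighted mean P/PA and wOBA across the hitters in a lineup."""
    return mean([p for p, _ in lineup]), mean([w for _, w in lineup])

as_ppa, as_woba = lineup_means(all_stars)
ns_ppa, ns_woba = lineup_means(not_so_all_stars)

# Extra pitches over two turns through a nine-man lineup (18 PA).
extra = (as_ppa - ns_ppa) * 18

print(f"All-Stars:        {as_ppa:.2f} P/PA, {as_woba:.3f} wOBA")
print(f"Not-So-All-Stars: {ns_ppa:.2f} P/PA, {ns_woba:.3f} wOBA")
print(f"Extra pitches over 18 PA: {extra:.1f}")
```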

So, it appears this is a demonstrable skill with some value, though not a ton. We can see that some teams are better at this than others, and we see some positive benefit from this, most notably in walk rate. While we see plenty of players on both sides of the scale who are excellent ballplayers, the data does seem to suggest that seeing more pitches is better than not doing so, though only marginally on a league-wide scale. When we isolate the leaders in this area and compare them with more aggressive hitters, we can see some startling differences though, suggesting that perhaps there is an advantage to be gained here.


Ben Revere and the Emptiest Batting Average Ever

I was listening to the Jonah Keri Podcast on Grantland recently, and he had Phillies beat writer Matt Gelb on the show. Gelb talked about all the sad things that Phillies fans are already tired of discussing, but he did make a statement that I found particularly poignant. He described Ben Revere’s season as something to the effect of “the emptiest batting average ever.” By empty, he means that while Revere is hitting above .300, an impressive feat in this offense-starved MLB landscape, he does so with almost no walks or extra-base hits. His value at the plate is almost entirely in the form of singles. This comment got me thinking: just how empty is his batting average?

As of this writing, Revere is hitting .314 with a .331 on-base percentage and a .371 slugging percentage. For comparison, the average player has a substantially worse batting average (think .240) but with a similar OBP and a substantially better SLG. To illustrate with normal stats, Revere has 27 total doubles, triples, homers, and walks this year. So far in 2014, there are 42 players with at least 27 doubles, 8 players with at least 27 homers, and 144 players with at least 27 walks.

But how rare is it to have this single-happy nature with such a high average? To look for players to compare to Revere historically, I looked for other player seasons since 2000 which had enough plate appearances to qualify for the batting title with a batting average at least as high as Revere’s but a walk rate and isolated slugging (slugging minus batting average) below his.

But there weren’t any, so I extended the search back to 1980.

Still nobody. 1960?

Nothing.

1900?

Zilch.

Now, to be fair, Ben Revere himself hasn’t completed a full season, so let’s use a more relaxed criterion of 400 plate appearances (Revere has 459).

OK, you get it.

In fact, since 1900 (it’s not worth going earlier because seasons were much shorter then), the only player with at least 400 plate appearances who had as high a batting average with as little other hitting value is … Ben Revere. That’s it.
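The search itself is just a filter over player seasons. Here is a minimal sketch of it. The `Season` class, the sample seasons, and Revere's walk rate (which isn't quoted in this post) are all placeholders for illustration, not real data.

```python
from dataclasses import dataclass

@dataclass
class Season:
    player: str
    pa: int
    avg: float
    slg: float
    bb_pct: float

    @property
    def iso(self) -> float:
        # Isolated slugging: slugging percentage minus batting average.
        return self.slg - self.avg

def emptier_seasons(seasons, benchmark, min_pa=400):
    """Seasons with at least the benchmark's average but less of everything else."""
    return [
        s for s in seasons
        if s.pa >= min_pa
        and s.avg >= benchmark.avg
        and s.bb_pct < benchmark.bb_pct
        and s.iso < benchmark.iso
    ]

# AVG, SLG, and PA are from the text; bb_pct is a placeholder value.
revere = Season("Ben Revere", pa=459, avg=0.314, slg=0.371, bb_pct=2.0)

# Made-up comparison seasons.
pool = [
    Season("Player A", pa=600, avg=0.330, slg=0.450, bb_pct=6.0),  # too much power
    Season("Player B", pa=450, avg=0.320, slg=0.370, bb_pct=1.5),  # would qualify
    Season("Player C", pa=300, avg=0.340, slg=0.380, bb_pct=1.0),  # too few PA
]

matches = emptier_seasons(pool, revere)
print([s.player for s in matches])
```

In the real data, of course, the list of matches comes back empty.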

I’m not really sure that there’s much to be done with this information, but it’s a pretty shocking finding. As a member of a roster that’s overpaid and underperforming across the board, Revere’s limited skillset has been overshadowed by his better-compensated counterparts. However, I was fascinated to discover that on a team that has had plenty of notable failings, Revere has had perhaps the most “unique” and “special” stats of any of them, as long as you’re not taking annual salary into account.

If you disregard his sub-par defense (especially compared to what you would expect from a guy with his speed), Revere really isn’t a terrible offensive player. If you took away all of his steals and instead turned that many singles into doubles, he’d have a slugging percentage around the league average. The problem is, a single followed by a steal isn’t as valuable as a double because it doesn’t advance runners on base, so his value would really be something less than that of a player with league-average slugging. Even if he posts a batting average way above the mean in any given season, he never walks or gets extra-base hits, so he has to sustain that mark against all kinds of luck and defensive factors in order to give the Phillies even passable offensive value. It’s a game that the Phillies seem interested in playing, and it’s defensible because of his obviously high average and stolen base totals, but I’m just not sure if they’re going to win that way.


2014’s Most Average Hitter

The premise of this article is a very simple one: which hitter has been the most average in 2014? Considering this question led me through a very simple process, and to a very sad answer (I urge you not to look at the links until the end because suspense). To the leaderboards we go!

Seeing as we’re looking for the most average hitter (not considering defense), and wRC+ is a hitting statistic designed to compare hitters against the average, it seems like a natural starting place. Considering only players with wRC+ between 95 and 105 gives us a list of 24 players.

Next, let’s look at wRC+’s partner in crime: wOBA. League average for wOBA is .316, so this round we’ll be restricting our list of 24 even further, only looking at hitters with wOBA between .310 and .320. Doing so cut our list (almost) in half! We are now left with only 13 players, progress!

Now that we’ve condensed the list based on production, it’s time to look at the composition of said production. Our average player should have a BB% of about 7.9, and a K% of 19.8. Adjusting our leaderboard leaves us with the three most average hitters in the league. One of these three is not a surprise. The other two are very sad surprises.

But we want one average player, not three, so to narrow it down to a single name, I have included another filter for ISO, because our most average hitter should hit for an average amount of power. This final filter leaves us with the single most average player in the major leagues, and fair warning, it will sadden you:

Evan Longoria: BB%: 8.8 / K%: 18.8 / ISO: .139 / BABIP: .287 / OBP: .324 / SLG: .390 / wOBA: .312 / wRC+: 102

League Average: BB%: 7.9 / K%: 19.8 / ISO: .140 / BABIP: .301 / OBP: .319 / SLG: .396 / wOBA: .316 / wRC+: 100
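The whole funnel can be sketched as a single check. The wRC+ and wOBA windows are the ones described above (assumed inclusive); the BB%, K%, and ISO tolerances are hypothetical, since the exact cutoffs aren't spelled out, and the second hitter is made up for contrast.

```python
# League-average rates quoted above.
LEAGUE = {"bb_pct": 7.9, "k_pct": 19.8, "iso": 0.140}

def is_average(hitter, bb_tol=1.5, k_tol=1.5, iso_tol=0.010):
    """True if a hitter passes every 'averageness' filter.

    The tolerances for BB%, K%, and ISO are assumptions for illustration.
    """
    return (
        95 <= hitter["wrc_plus"] <= 105
        and 0.310 <= hitter["woba"] <= 0.320
        and abs(hitter["bb_pct"] - LEAGUE["bb_pct"]) <= bb_tol
        and abs(hitter["k_pct"] - LEAGUE["k_pct"]) <= k_tol
        and abs(hitter["iso"] - LEAGUE["iso"]) <= iso_tol
    )

longoria = {"wrc_plus": 102, "woba": 0.312, "bb_pct": 8.8, "k_pct": 18.8, "iso": 0.139}
# Made-up slugger who fails the very first filter.
slugger = {"wrc_plus": 140, "woba": 0.380, "bb_pct": 10.0, "k_pct": 25.0, "iso": 0.250}

print(is_average(longoria))
print(is_average(slugger))
```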

There was a time when Longoria was to baseball what Mike Trout is today (well maybe not quite on the same level). He came up in 2008 and was the star of the Rays in their surprising march to the World Series. He showed off 100% not-fake, seemingly-superhuman powers. From 2008 to 2013, Longoria’s wRC+ was 15th in baseball, in a virtual tie with David Wright (who happens to be one of the other most average players). He was also the single most valuable position player by WAR (36.1) in that time. For the first 6 years of his career, Longoria was a model of offensive consistency.

2014 has been a different story though. I’m not the first to write about Longoria’s down year, so I’ll refer you to the works of Jeff Sullivan and James Krueger. The bottom line: Longoria’s bat speed is down, which is killing his power and his ability to hit inside fastballs. This can be seen in his power numbers: a .139 ISO is a far cry from his career ISO of .225 (for reference, David DeJesus has a career ISO of .140). Longoria’s only hitting 9.7% of his fly balls for home runs, compared to 15.5% for his career.

His power hasn’t fully disappeared, but it’s nowhere close to what it was. It’s this sort of sharp power decline that usually suggests some sort of injury à la Matt Kemp (.236 ISO in 2012, .125 in 2013 following a shoulder surgery). Longoria is not expected to miss much time with his latest foot injury, and as Krueger points out, Longoria himself has attributed these struggles to mechanical issues. However, if I were a betting man (or at least old enough to legally gamble in casinos), I would put money on the Rays’ third baseman undergoing some sort of procedure over the offseason.

Now the good news for the Rays is this: even as the league-average hitter, Longoria is still very valuable. Dave Cameron ranked him 9th in his trade value series, no doubt in large part due to his superb defense and very team-friendly contract. Projections have Longoria finishing 2014 with 4.0 WAR. If the cost of a win is approximately $6 million, then he’s worth about $24 million in 2014, but is only being paid $7.5 million. Even if Longoria continues to be a league-average bat with excellent defense, he will be very underpaid and very valuable. Really goes to show how great that contract was, huh?
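For the curious, the value math is a one-liner. A tiny sketch using the figures above (all in millions of dollars; the function name is my own):

```python
def surplus_value(war, dollars_per_win, salary):
    """Return (market value, surplus over salary), all in $ millions."""
    market_value = war * dollars_per_win
    return market_value, market_value - salary

# 4.0 projected WAR, roughly $6M per win, $7.5M salary.
value, surplus = surplus_value(war=4.0, dollars_per_win=6.0, salary=7.5)
print(f"market value ${value:.1f}M, surplus ${surplus:.1f}M")
```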

Even more fortunate for Rays fans is that, given Longoria’s career history, this sort of drop-off in offensive production likely is not representative of his true-talent level. While his days as a ~135 wRC+ hitter may be behind him, 119 games is not a huge sample size and Longoria is still just 28. It’s likely that Longoria’s production will climb back toward his career averages (Oliver projects him for a 126 wRC+ next year, which definitely passes the sniff test). The fact remains: Evan Longoria, despite being the most average hitter in baseball, is still one of the most valuable. Now we’ll just have to see what happens to that other average-hitting third baseman.