# The Leadoff Walk

We’ve all heard a broadcaster comment on the impending doom of a leadoff walk, yet they never seem to attach the same fateful outcome to a leadoff single. I thought it would be interesting to look at each of the ways a player can lead off an inning by reaching first base and see whether it affects whether the runner goes on to score. I took the Retrosheet data since 1952 (not including this year), which I have as a MySQL database, and wrote a quick Python script to determine the results. I took it further and examined whether the breakdown is any different in late-game situations, as I’m always hearing “You never want to walk the leadoff batter, but especially late in close ball games”. I was also curious whether, in general, more solitary runs get manufactured once a leadoff runner gets on base in late-game situations.

Total times a batter led off an inning by getting to first: 508312
Total times runner scored: 192150

So a leadoff batter who starts on first base scores 37.80% of the time. Here is the breakdown by how they got aboard:

#### Any inning

```
Single          325455  Scored  122662  37.69%
Walk            150570  Scored   57189  37.98%
HBP              11865  Scored    4600  38.77%
Error            19260  Scored    7270  37.74%
Strikeout         1007  Scored     375  37.24%
Catcher's Int.     155  Scored      54  34.84%
Totals          508312  Scored  192150  37.80%
```
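As a quick check, the percentages in the table can be recomputed from the raw counts. The sketch below is not the original script; the dictionary `any_inning` and the helper `scoring_rate` are names made up for illustration, and the counts are simply copied from the table.

```python
# Counts copied from the "Any inning" table: (times reached first, times scored).
# The names any_inning and scoring_rate are illustrative, not from the original script.
any_inning = {
    "Single": (325455, 122662),
    "Walk": (150570, 57189),
    "HBP": (11865, 4600),
    "Error": (19260, 7270),
    "Strikeout": (1007, 375),
    "Catcher's Int.": (155, 54),
}


def scoring_rate(reached, scored):
    """Return the share of leadoff runners who scored, as a percentage."""
    return 100.0 * scored / reached


# The totals row is just the column sums.
total_reached = sum(reached for reached, _ in any_inning.values())
total_scored = sum(scored for _, scored in any_inning.values())

for how, (reached, scored) in any_inning.items():
    rate = scoring_rate(reached, scored)
    print(f"{how:<15}{reached:>7} Scored {scored:>7} {rate:6.2f}%")

total_rate = scoring_rate(total_reached, total_scored)
print(f"{'Totals':<15}{total_reached:>7} Scored {total_scored:>7} {total_rate:6.2f}%")
```

Because this rounds to two decimals, a recomputed value may differ from the table in the last digit.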

So it appears that the difference between the walk and the single is not statistically significant. The HBP number seems to be a bit of an outlier; I wonder whether that is just sample size, or whether such an outcome rattles the pitcher enough that more runs are produced.

Let’s now examine the breakdown based upon the stage of the game.

#### 6th inning or earlier

```
Single          217421  Scored   83243  38.29%
Walk            100587  Scored   38798  38.57%
HBP               7879  Scored    3070  38.96%
Error            12778  Scored    4880  38.19%
Strikeout          645  Scored     244  37.83%
Catcher's Int.     107  Scored      36  33.64%
Totals          339417  Scored  130271  38.38%
```

#### 7th inning or later

```
Single          108034  Scored   39419  36.49%
Walk             49983  Scored   18391  36.79%
HBP               3986  Scored    1530  38.38%
Error             6482  Scored    2390  36.97%
Strikeout          362  Scored     131  36.19%
Catcher's Int.      48  Scored      18  37.50%
Totals          168895  Scored   61879  36.64%
```

Interesting how the share of leadoff runners reaching first who go on to score is 1.74 percentage points higher in the earlier innings. Is this a comment on the failure of manufacturing runs, or on pitching being different in the later stages of the game? Perhaps a deeper look based upon “close game situations” is in order for that.
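The early-versus-late gap can be worked out from the two totals rows. The sketch below also applies a pooled two-proportion z-test. That test is an assumption of this example rather than something the original script did, and all variable names are illustrative.

```python
import math

# Totals rows from the two tables above: (times reached first, times scored).
early_reached, early_scored = 339417, 130271  # 6th inning or earlier
late_reached, late_scored = 168895, 61879     # 7th inning or later

early_rate = early_scored / early_reached
late_rate = late_scored / late_reached
diff_points = 100.0 * (early_rate - late_rate)  # gap in percentage points

# Pooled two-proportion z-test (an assumption of this sketch, not the article's method):
# under the null hypothesis both stages share one true scoring rate, estimated by pooling.
pooled = (early_scored + late_scored) / (early_reached + late_reached)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / early_reached + 1 / late_reached))
z_score = (early_rate - late_rate) / std_err

print(f"early {100 * early_rate:.2f}%  late {100 * late_rate:.2f}%")
print(f"gap {diff_points:.2f} percentage points, z = {z_score:.1f}")
```

A z-score well above 2 would suggest the gap is unlikely to be sampling noise alone, though it says nothing about whether run manufacturing or pitching is the cause.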

We hoped you liked reading The Leadoff Walk by plen!


Guest
Evan

Couple of things: The only reasonable explanation I can think of for the slight elevation in runs scored off of a lead-off walk versus a lead-off single is that when a lead-off walk is issued, it’s more likely that you’re dealing with a pitcher who has control problems. It’s even more likely when the leadoff runner reaches via HBP. Based on this explanation you might expect to see more lead-off HBP’s score than lead-off walks than lead-off hits, which you apparently do. Regarding the lower likelihood of lead-off runners scoring in later innings I think that definitely has to do… Read more »

Guest
72'Yankees

Nice! I was always curious about that stat. But I also always thought that a leadoff walk (or, in this case, a leadoff HBP) would usually score, because it’s early evidence that the pitcher is losing his control. In other words: Obviously, no pitcher wants to walk the leadoff batter, nor hit him with a pitch, but when that does happen, that guy usually scores, because the pitcher is losing his control over the outcome of pitches. He starts to miss his targets, and when that happens, he usually gets hit. I don’t know if there is a way… Read more »

Guest
LarryInLA

All of those numbers are within the sampling error, which should be listed explicitly with data like this.

1 SD for N trials with an expected success rate p is sqrt(p*(1-p)/N)

So for all innings:

Singles = 38.29 +/- 0.09%
Walks = 37.98 +/- 0.13%
HBP = 38.77 +/- 0.44%
Error = 37.74 +/- 0.35%

That’s all within the margin of error, and there’s no reason to rule out the null hypothesis that they all have the same true scoring percentage.

The differences between the early and late innings are statistically significant though, it would seem.
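For readers who want to reproduce that margin-of-error calculation, here is a minimal sketch of sqrt(p*(1-p)/N) applied to the counts in the article’s “Any inning” table. The names `any_inning` and `standard_error` are made up for illustration, and the results may differ slightly from the rounded figures quoted in the comment.

```python
import math

# Counts from the article's "Any inning" table: (times reached first, times scored).
# Names here are illustrative; only the four largest categories are included.
any_inning = {
    "Singles": (325455, 122662),
    "Walks": (150570, 57189),
    "HBP": (11865, 4600),
    "Error": (19260, 7270),
}


def standard_error(reached, scored):
    """One standard deviation of the observed scoring rate, in percentage points."""
    p = scored / reached
    return 100.0 * math.sqrt(p * (1 - p) / reached)


errors = {how: standard_error(r, s) for how, (r, s) in any_inning.items()}

for how, (reached, scored) in any_inning.items():
    rate = 100.0 * scored / reached
    print(f"{how} = {rate:.2f} +/- {errors[how]:.2f}%")
```

The smaller the sample, the wider the band, which is why the HBP and error rows carry the largest uncertainty.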

Guest
joeIQ`

a good collection of info, but the leadoff walk is so dreaded because it’s your own fault. If they get a single they’ve earned it. It’s not that the walk is so bad, it’s just that the walk is a strategic error. Also, look at it this way, how often does a leadoff walk score (37%) vs how often a ball put in play scores? If you make them put it in play it’s more like a 10% chance to score. (the odds of getting on x the odds of scoring after you get on) So indeed, a lead off… Read more »

Guest
Rick

I would think that leadoff hitters who reach in the 7th inning and later are less likely to score because the team is more likely to pull the ineffective pitcher and put in a more effective one.

Guest

What Joe IQ said.

Guest
Socrates

First of all, this Community Blog thing is great. I have loved the two that I have read so far. Great work Plen. Second, I have to agree with Evan that the difference between the late inning and early inning numbers is likely pitching changes. For instance in a close game (7th inning or later), you are MUCH more likely to see a platoon matchup for a replacement pitcher. Lastly, as LarryinLA points out there are definitely some statistically significant differences in these numbers (I think the sample size being half a million is cool). It would seem to me… Read more »

Guest
Mike

I think the likelihood that the runner gets bunted to 2nd is higher in the later innings. Does a run score more often or less often when a runner is bunted over? I know that the expected runs after a bunt is lower, but is the chance that the one specified runner scores higher?

Guest
Josh

Ditto the above posters. wOBA is about .015 points lower in innings 7-9 than it is in innings 1-6.

Guest

Peak scoring happens between the 5th and 7th innings. I have no evidence to prove why, but I suspect it has to do with the fact that your starting pitchers are tired by the 5th/6th/7th inning, and hitters have gotten a better look at them. If your starters are yanked, then your bottom tier bullpen arms are brought in. In the 8th and 9th, you’ve typically got your better bullpen arms, who are fresh, and only facing batters once, so those 2 innings tend to be lower scoring than the rest (particularly the 9th, where your closer (best relief pitcher)… Read more »

Guest
The Duder

Selection bias.

Guest
Ted Hoppe

The recent trend is for pitchers to throw only 6 or 7 innings and then have a middle reliever take over. This wasn’t always the case. So measuring the 4th and 5th innings, where starters have been through the order once, might perhaps be a better indicator, and good and mediocre pitchers will average out. The ninth inning, where a closer is called in (but not always), may also lower the 37% rate. Blown saves may have higher rates than the 37% rate. I would be interested in knowing if there are any significant differences by decade, but… Read more »

Guest
badenjr

It would be interesting to see if the likelihood of scoring has changed over the years. In particular, it would be interesting if you see that only in the bullpen era does the difference between early game and late game scoring exist. While I’d assume that the relief pitching suggestion is the cause of the difference between early and late game scoring, I’ll offer another possible explanation as well. The early part of the game includes the first inning. Generally, over the period covered by the data, the leadoff hitter in the first inning is usually (1) fast and therefore… Read more »

Member
Nathaniel Dawson

An interesting follow-up on that would be checking the difference in late game scoring to early game scoring throughout the years. Relief pitchers are used much more these days, possibly widening that variance over time.

Guest
kbertling353

Great stuff, Plen. I don’t really have much to add, but I wanted to thank you for putting the work in to research this.

Guest
Chris

Not sure if this was already asked, but what about IBB? Were those looked at? Because the next batter will most likely be getting challenged more and have a greater opportunity to drive a pitch.

Guest
badenjr

plen, you’ve shown that the percentage is significantly higher in the first inning. Does that explain the difference between early and late in games? If you removed the first inning, does the percentage look about the same for innings 2-6 as for innings 7+? The common explanation was that a fresh reliever better prevented the run from scoring in the late innings. If that was true, you’d expect to see the percentages decline as the years progressed and relievers became more prevalent. Your data suggests the opposite. From 1952 through 1976, the percentage never climbed as high as 38%. It… Read more »

Guest

I’ve got a couple of thoughts (caveat’ing that, as Larry pointed out (and mentioning that Larry’s singles number was the one for 1-6th inning), your differences are still within the margin for error). 1) If you can, and continuing on with badenjr’s idea, break down the difference by lineup position. Of course, maybe this will result in too few samples, especially for the eighth and ninth hitters. 2) I’d also break it down by n’th time seeing that pitcher/batter matchup in that game. I think this will equalize for bullpen usage over the years, as there would be more 3rd… Read more »

Guest
Bill

The leadoff hitter tends to be a faster runner. They ALWAYS lead off the first inning, hence the increase.
The second inning, the 4-5-6 (slower) hitters tend to start the inning, hence it’s harder to score. (Especially since they have to rely on the 6-7-8-9 hitters to drive them in).
That seems to clearly explain the increase % in the first inning and the decrease in the 2nd inning.
9th inning, your closer is in the game, hence the lower %.

I’d like to think innings 3-6 vs 7-8 would be fairly equal.

my \$0.02

Guest

Just read this post, it’s awesome. I’m curious though, do you know where I can find data on the events that transpire on the next at bat (after a lead-off walk)? I have a few theories, like increased errors (possibly catching his defense a little bit more off-guard from the lack of engagement on the last at bat) and increased hits (since the pitcher doesn’t want to fall behind, giving the batter a chance to tee off on a first pitch over the plate).

Member
charles

Same as kbertling353 says:
thanks

Guest
Charles

really nice article and good posts…ONE thing I’d like to see is pct of times the leadoff baserunner scores if he is on 2nd base vs 1st base. And the difference between whether they got to first on a double or stole while the 2nd batter was still batting. Just curious since this is considered scoring position where they can score on a single. Lastly…the other time a leadoff runner might get on first is if they get on via an error. Can’t imagine the pct would be much different other than due to small sample size similar to IBB

Guest
pete

Am I correct that the data shows the percent of the time that the actual player who led off with a walk, single, etc. scores? Does anyone know the percent of the time a team scores in an inning when the leadoff man reaches first? For example, leadoff walk followed by fielder’s choice force out at second; now man on 1st with one out and he eventually scores.

Guest
Jianadaren

The differences between the types of ways a player can reach are likely noise. They might be showing some sort of proxy for command or the hit types might somehow select for different parts of the batting order, but I highly doubt that can be teased out from here.

With respect to innings, it’s already well-known that the later innings have a lower run environment so that’s what you’re seeing. This is caused by cooler weather and bullpens.

Guest
Matt Murray

Is there data to compare four-pitch leadoff walks vs non-four-pitch walks?