Examining Latino Hitters’ Plate Discipline
Abstract
It has been suggested that Latino players in Major League Baseball swing wildly at pitches outside the defined strike zone, that is, that they are undisciplined batters, especially when compared to their trained American counterparts in MLB. This paper examines that assumption and also the effectiveness of this possible lack of discipline: whether Latino players make contact with more pitches outside the strike zone, hit for a higher batting average than American players, and walk or strike out at different rates. Based upon common beliefs in the baseball community that Latino players are less disciplined, and that they are generally superior players to Americans, the initial hypothesis of this observational study is that Latino players will swing at more pitches outside the zone, hit more of those pitches, walk less, strike out more, and hit for a higher batting average than their American counterparts.
The observational study focused on two populations: American Major League players and Major League players from the Dominican Republic, Cuba, Puerto Rico, Panama, and Venezuela. Further, the population consisted only of players who are currently on a Major League roster, whether active or on the Disabled List, and who have had at least 500 career at-bats, the equivalent of approximately one full MLB season of at-bats.
The initial hypothesis was largely, but not completely, correct: Latino players did swing at a significantly higher percentage of pitches outside the strike zone than did American players, made contact with a higher percentage of those pitches, and hit for a higher batting average. However, there was no evidence that either group walked more than the other, and, contrary to the hypothesis, there was evidence that American players strike out at a higher rate than their Latino counterparts.
The Study
Research Question
There is a Major League Baseball adage about non-Mexican Latino baseball players who aspire to leave their homelands to become part of American baseball. The maxim is “You never walk off the island,” that is, batters are encouraged to swing at pitches outside the defined strike zone in the hope of becoming more successful hitters. The central question to be answered in this analysis is, “Are Major League Baseball players from the Dominican Republic, Cuba, Puerto Rico, Panama, and Venezuela less disciplined at the plate than their American counterparts?” An undisciplined batter is defined as one who swings at pitches outside the strike zone. A secondary question to be investigated is, “Does this undisciplined approach actually work for those Latino players?”
Data Collection
Using all 30 MLB team rosters, courtesy of the 30 team pages on http://espn.go.com/mlb/players, this researcher found that there are 217 Major League players from the United States who are currently on a roster, whether active or on the Disabled List, and who have had at least 500 career at-bats. There are additionally 82 players from the Dominican Republic, Cuba, Panama, Puerto Rico, and Venezuela who are on an MLB roster and have had at least 500 career at-bats. The 217 American players were each assigned a number from 1 to 217, and a random number generator was used to produce 21 unique numbers, which correspond to the 21 American players in the sample. The 82 Latino players were each assigned a number from 1 to 82, and a random number generator was used to produce eight unique numbers, which correspond to the eight Latino players in the sample. Each random sample was thus drawn from the entire population of American or Latino players with at least 500 career at-bats, and each sample, following research guidelines, is less than 10% of its population (21 of 217 is about 9.7%, and 8 of 82 is about 9.8%). Once the two samples had been assembled, the search feature on fangraphs.com provided all five statistics, from its Standard and Plate Discipline charts, for each player. The statistics will be referred to as OS for O-Swing%, the percentage of pitches outside the strike zone at which the batter swings; OC for O-Contact%, the percentage of those swings on which contact is made; K for strikeout percentage; BB for walk percentage; and BA for batting average.
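The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which random number generator was used, so the standard-library `random` module, the fixed seed, and the helper names `draw_sample` and `sampling_fraction` are all hypothetical. The population and sample sizes are the ones reported in the text.

```python
import random

# Population and sample sizes as reported in the text.
POPULATION = {"American": 217, "Latino": 82}
SAMPLE_SIZE = {"American": 21, "Latino": 8}


def draw_sample(population_size, sample_size, rng):
    """Draw unique roster numbers from 1..population_size without replacement."""
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


def sampling_fraction(population_size, sample_size):
    """Share of the population in the sample, used to check the 10% condition."""
    return sample_size / population_size


if __name__ == "__main__":
    # Hypothetical seed, chosen only so that the draw is repeatable.
    rng = random.Random(42)
    for group, size in POPULATION.items():
        picks = draw_sample(size, SAMPLE_SIZE[group], rng)
        share = sampling_fraction(size, SAMPLE_SIZE[group])
        print(f"{group}: {picks} ({share:.1%} of population)")
```

Drawing without replacement is what guarantees the unique numbers the text describes. The numbers drawn here will not reproduce the players actually selected for the study.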
Data Charts
Latino Player | Outside Swing, OS % | Outside Contact, OC % | Strikeout, K % | Walk, BB % | Batting Average, BA |
Pablo Sandoval | 45.6 | 77.9 | 13.2 | 7.3 | 0.293 |
Alexei Ramirez | 37.7 | 72.6 | 11.9 | 4.8 | 0.276 |
Alcides Escobar | 34.7 | 74.6 | 13.2 | 4.2 | 0.265 |
Jose Altuve | 36.1 | 81.1 | 10.6 | 5.3 | 0.304 |
A. Hechavarria | 37.5 | 76.2 | 16.8 | 4.5 | 0.258 |
Miguel Montero | 30.8 | 67.7 | 19.5 | 9.9 | 0.265 |
Yadier Molina | 30.0 | 77.1 | 9.3 | 7.0 | 0.284 |
Carlos Gonzalez | 35.2 | 56.6 | 22.4 | 7.9 | 0.290 |
American Player | Outside Swing, OS % | Outside Contact, OC % | Strikeout, K % | Walk, BB % | Batting Average, BA |
Kole Calhoun | 31.0 | 63.7 | 19.4 | 7.8 | 0.276 |
Jarrod Dyson | 24.5 | 74.3 | 18.7 | 8.4 | 0.252 |
Brian Dozier | 25.7 | 70.7 | 18.3 | 9.4 | 0.243 |
Jonny Gomes | 27.4 | 51.7 | 26.9 | 10.3 | 0.243 |
Evan Gattis | 43.5 | 66.8 | 23.3 | 5.1 | 0.243 |
Dee Gordon | 34.9 | 80.0 | 15.8 | 5.1 | 0.288 |
Reed Johnson | 32.2 | 63.4 | 18.2 | 4.5 | 0.279 |
JJ Hardy | 26.2 | 71.0 | 14.6 | 6.8 | 0.260 |
Collin Cowgill | 29.4 | 62.3 | 25.2 | 7.7 | 0.238 |
Jonathan Lucroy | 31.1 | 76.3 | 14.0 | 7.8 | 0.282 |
AJ Pierzynski | 39.2 | 72.0 | 11.6 | 4.8 | 0.282 |
Travis d’Arnaud | 27.8 | 73.0 | 15.4 | 7.8 | 0.240 |
Chris Johnson | 40.7 | 55.5 | 24.2 | 4.8 | 0.283 |
Torii Hunter | 30.1 | 55.0 | 17.9 | 6.9 | 0.279 |
Delmon Young | 41.2 | 61.1 | 18.0 | 4.1 | 0.284 |
Devin Mesoraco | 33.6 | 63.0 | 20.1 | 8.3 | 0.241 |
Anthony Rizzo | 23.7 | 77.4 | 18.5 | 11.0 | 0.261 |
Nolan Arenado | 40.5 | 74.5 | 13.0 | 4.8 | 0.277 |
David Wright | 23.2 | 65.3 | 18.4 | 10.9 | 0.298 |
Mike Zunino | 36.4 | 51.0 | 32.0 | 5.0 | 0.199 |
Josh Thole | 27.5 | 76.5 | 13.5 | 9.1 | 0.249 |
Brief Discussion of Summary Statistics
There is much less variability in the sample statistics for the Latino players than for the American players. The interquartile ranges for four of the five statistics are smaller for Latino players, and the ranges of the sample data are smaller for all five statistics among the Latino players. These smaller ranges and interquartile ranges suggest higher homogeneity among the Latino players participating in American baseball and higher heterogeneity among their American counterparts.
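The range and interquartile range comparison can be recomputed from the two data charts with a short Python sketch. The values are transcribed from the charts above; the helper names `value_range` and `iqr` are hypothetical. Quartile conventions differ between tools, and the text does not say which one was used, so the interquartile ranges printed here (Python's default "exclusive" method) may not match the author's figures exactly.

```python
import statistics

COLUMNS = ("OS", "OC", "K", "BB", "BA")

# Values transcribed from the Latino and American data charts, in row order.
LATINO = {
    "OS": [45.6, 37.7, 34.7, 36.1, 37.5, 30.8, 30.0, 35.2],
    "OC": [77.9, 72.6, 74.6, 81.1, 76.2, 67.7, 77.1, 56.6],
    "K": [13.2, 11.9, 13.2, 10.6, 16.8, 19.5, 9.3, 22.4],
    "BB": [7.3, 4.8, 4.2, 5.3, 4.5, 9.9, 7.0, 7.9],
    "BA": [0.293, 0.276, 0.265, 0.304, 0.258, 0.265, 0.284, 0.290],
}
AMERICAN = {
    "OS": [31.0, 24.5, 25.7, 27.4, 43.5, 34.9, 32.2, 26.2, 29.4, 31.1, 39.2,
           27.8, 40.7, 30.1, 41.2, 33.6, 23.7, 40.5, 23.2, 36.4, 27.5],
    "OC": [63.7, 74.3, 70.7, 51.7, 66.8, 80.0, 63.4, 71.0, 62.3, 76.3, 72.0,
           73.0, 55.5, 55.0, 61.1, 63.0, 77.4, 74.5, 65.3, 51.0, 76.5],
    "K": [19.4, 18.7, 18.3, 26.9, 23.3, 15.8, 18.2, 14.6, 25.2, 14.0, 11.6,
          15.4, 24.2, 17.9, 18.0, 20.1, 18.5, 13.0, 18.4, 32.0, 13.5],
    "BB": [7.8, 8.4, 9.4, 10.3, 5.1, 5.1, 4.5, 6.8, 7.7, 7.8, 4.8,
           7.8, 4.8, 6.9, 4.1, 8.3, 11.0, 4.8, 10.9, 5.0, 9.1],
    "BA": [0.276, 0.252, 0.243, 0.243, 0.243, 0.288, 0.279, 0.260, 0.238,
           0.282, 0.282, 0.240, 0.283, 0.279, 0.284, 0.241, 0.261, 0.277,
           0.298, 0.199, 0.249],
}


def value_range(values):
    """Range of a sample: maximum minus minimum."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range using statistics.quantiles (default 'exclusive' method)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


if __name__ == "__main__":
    for col in COLUMNS:
        for label, data in (("Latino", LATINO), ("American", AMERICAN)):
            print(f"{col:>2} {label:<8} mean={statistics.mean(data[col]):.3f} "
                  f"range={value_range(data[col]):.3f} IQR={iqr(data[col]):.3f}")
```

Because the Latino sample has 8 players and the American sample has 21, some difference in range is expected from sample size alone, so the printed spreads are best read alongside the sample sizes.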
Inference Procedures
µ1 = Mean O-Swing%, O-Contact%, K%, BB%, or BA for Latino players
µ2 = Mean O-Swing%, O-Contact%, K%, BB%, or BA for American players
Conditions:
1) Random samples (stated)
2) The samples are at most 10% of the population (stated)
3) n < 30, but the normal probability plots for all five statistics for both Latinos and Americans appear linear, so normality of the statistics is assumed
H0: µ1 – µ2 = 0
HA: µ1 – µ2 > 0 (For walks and strikeouts, two t-tests were used; HA: µ1 – µ2 < 0 in the second tests)
α = .05
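As an illustration of the procedure, the sketch below runs a one-sided two-sample t-test on the O-Swing% column. It rests on assumptions: the text does not say whether a pooled or unpooled test was used, so the unpooled (Welch) version is assumed here, and the function names `t_pdf`, `t_upper_tail`, and `welch_t_test` are hypothetical. The p-value is obtained by numerically integrating the t density so that only the Python standard library is needed; a statistics package would normally be used instead.

```python
import math
import statistics

# O-Swing% values transcribed from the two data charts.
LATINO_OS = [45.6, 37.7, 34.7, 36.1, 37.5, 30.8, 30.0, 35.2]
AMERICAN_OS = [31.0, 24.5, 25.7, 27.4, 43.5, 34.9, 32.2, 26.2, 29.4, 31.1, 39.2,
               27.8, 40.7, 30.1, 41.2, 33.6, 23.7, 40.5, 23.2, 36.4, 27.5]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t >= 0 else 0.5 + area


def welch_t_test(sample1, sample2):
    """Return (t, df, one-sided p) for HA: mean(sample1) - mean(sample2) > 0."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1  # squared standard error, sample 1
    v2 = statistics.variance(sample2) / n2  # squared standard error, sample 2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, t_upper_tail(t, df)


if __name__ == "__main__":
    t, df, p = welch_t_test(LATINO_OS, AMERICAN_OS)
    print(f"O-Swing: t = {t:.3f}, df = {df:.1f}, one-sided p = {p:.4f}")
    print("Reject H0 at alpha = .05" if p < 0.05 else "Fail to reject H0 at alpha = .05")
```

For the tests with HA: µ1 – µ2 < 0, the one-sided p-value is the lower tail, which is 1 minus the value returned here. The other four statistics can be tested the same way by swapping in their columns.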
Discussion and Conclusions for Each Statistic
1) O-Swing, OS: At the proposed α = .05, there is evidence that Latino players are less disciplined hitters than American players, that is, that they swing at a higher percentage of pitches thrown outside of the strike zone. The evidence is not sufficient at the stricter α = .01.
2) O-Contact, OC: At the proposed α = .05, there is evidence that Latino players are more adept at making contact with pitches outside of the strike zone, that is, that the mean O-Contact for Latino players is higher than for American players. The evidence is not sufficient at α = .01.
3) Batting Average, BA: At the proposed α = .05, there is evidence that Latino players hit for a higher batting average than American players. The evidence is not sufficient at α = .01.
4) Base on Balls, BB: There is not sufficient evidence that American players have a higher walk rate than Latino players, nor, in the second test, that Latino players have a higher walk rate than Americans. In this sample, then, the Americans' more disciplined approach does not show up as a detectably higher walk rate.
5) Strikeout, K: There is not sufficient evidence that Latino players strike out more than American players. In fact, there is evidence that Americans strike out more than Latinos.
Final Conclusion
Utilizing the five two-sample t-tests, there is evidence at the proposed α = .05 significance level, but not at α = .01, that Latino players are less disciplined hitters than American players, that is, that they swing at a higher percentage of pitches outside the strike zone. However, this undisciplined approach works for players from the Dominican Republic, Cuba, Panama, Puerto Rico, and Venezuela, as they more often make contact with those outside pitches. The term ‘less disciplined’ connotes a hitter who is less effective and fails to make contact, but the sample does not bear this out: Latino players actually make contact on a higher percentage of their swings at pitches outside of the strike zone than American players do. Further, they hit for a statistically higher batting average than their American counterparts, which is not unexpected given that they hit more pitches outside the zone than do trained American hitters. Similarly, they strike out statistically less frequently than American players, which is consistent with their being better ‘outside the zone’ hitters. Lastly, despite American players’ more disciplined approach at the plate, that is, their swinging at fewer outside pitches, there is no statistical evidence that Americans have a higher walk rate than Latinos.
As noted above, all conditions for inference were met. The selection of players was completely randomized, but an interesting trend occurred within the selected players: more ‘star players’ were randomly selected in the Latino group than in the American group, again suggesting a higher level of homogeneity in the Latino players. The eight randomly chosen Latino players have appeared in 15 All-Star games (about 1.9 per player), while the 21 American players account for only 19 All-Star appearances (about 0.9 per player). Based upon the analyses, it appears that non-Mexican Latino baseball players do indeed “swing their way off the island.”
Too complicated for me to understand.
Could there be selection bias in this? Meaning, is it easier for a marginal American player to make the big leagues than a marginal Latino player? This might at least explain why the American players perform worse on swings outside the zone.
Great comment! A league selection bias is an interesting idea, and I believe it could come from scouting differences. That is, American players may still be scouted using older “good body, good swing” ideas, whereas Latinos may be judged more on production.
Yeah but…
At what ages were these players signed? A lot of international players sign in their teens; American players sign only late in high school or early in college.
If the Latin players are taken into professional systems at early ages, then the teams are more responsible for their development than anything else.
Interesting study. I like when people challenge sweeping assumptions with data.
I think part of the problem you are trying to solve is putting a definition to “less disciplined.” You seem to have some biases associated with that term that may not actually be a part of the definition. I believe that results from batted balls can be mutually exclusive from how disciplined a batter is. So the question that you set out to answer about the maxim, “You can’t walk off the island,” is really about O-swing as you identified in your statistical analysis. Results of batted balls are separate and aside from how many swings outside the zone a player takes. I would suggest keeping that out of your conclusions.
Also, why take a representative sample when the total pools of players are small to begin with? If the software you are using could handle 217 players you would have more data. Conditions for inference may have been met, but your starting population was not that large.