Dominican Major Leaguers and the Provinces They Hail From

It shouldn’t come as any great surprise to a typical baseball fan that Dominican players play an outsized role in Major League Baseball today. In fact, the Dominican Republic, whose population is only about 3.3% that of the United States, supplies MLB with upwards of 10% of its players. Major League Baseball and baseball fans are better off because of this. After all, who wants to live in a baseball world without Nelson Cruz or Fernando Tatis Jr., for instance?

With this point in mind, the following takes a look at players from the Dominican Republic: specifically, where in the D.R. they were born and when they made their way to MLB. What follows is split into three brief sections: a description of the data used, some insights into the growth of the D.R.’s influence in MLB, and finally some map-based depictions of the players’ provinces of birth within the Dominican Republic.


Are Third Base Coaches Too Hesitant in Sacrifice Fly Situations?

Imagine you are coaching third base. Your team is at bat with a runner on third and one out. There is a flyball caught in marginally shallow left field. You think your runner has about a 50/50 chance of scoring if you send him. Do you send him?

Many of you would probably say no. This is a risky call. There is a 50% chance the runner would be out, which would be a huge momentum killer. Furthermore, if he gets caught and your team loses by a run, you are going to be the person blamed by the media.

My hypothesis is that third base coaches are leaving runs on the table. Over the past four seasons, runners on third scored 98% of the time when sent in sac fly situations, suggesting that coaches send them only when they are highly confident of success. I hypothesize they won’t send runners unless they feel they have at least an 80% chance of scoring, but my analysis says runners should be sent even with much lower chances.
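
To make the idea of a break-even chance concrete, here is a minimal sketch, not the article's own analysis. It assumes the decision is judged by expected runs for the rest of the inning (rather than win probability), and the three run values are illustrative placeholders, not measured figures. The function and constant names are hypothetical.

```python
def break_even_probability(re_hold: float, re_safe: float, re_out: float = 0.0) -> float:
    """Success probability at which sending the runner equals holding him.

    re_hold: expected runs for the rest of the inning if the runner stays at third.
    re_safe: expected runs if he is sent and scores (the run itself plus what follows).
    re_out:  expected runs if he is sent and thrown out.

    Solves p * re_safe + (1 - p) * re_out = re_hold for p.
    """
    if re_safe <= re_out:
        raise ValueError("scoring must be worth more than being thrown out")
    return (re_hold - re_out) / (re_safe - re_out)


# Placeholder run values (assumptions, NOT measured). The catch is the second out, so:
RE_HOLD = 0.35  # assumed: runner on third, two outs
RE_SAFE = 1.10  # assumed: one run in, then bases empty with two outs
RE_OUT = 0.00   # runner thrown out at home ends the inning

p_star = break_even_probability(RE_HOLD, RE_SAFE, RE_OUT)
print(f"break-even success probability: {p_star:.3f}")
```

Under these placeholder values the break-even point sits well below 50%, which is the shape of the argument above: a coach who waits for 80% confidence would be passing up sends that pay off on average. Real run-expectancy or win-probability inputs would move the exact number.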


Computer Vision and Pitch Framing

Quantifying catcher framing was a huge step for the analytical community in trying to understand the position more fully. It has allowed evaluators to have more accurate numbers on what a catcher is adding to the team. It has seemingly also brought more organizational focus to framing at the expense of blocking across the league, as can be seen in the increased prevalence of catching from a knee.

Perhaps all this work will be moot if robo-umpires are ever implemented, but teams clearly see marginal advantages to be gained by research and development on this topic for now. With this in mind, the quantification of a catcher’s ability to frame is only the first step in the journey. Next we should be looking to find what makes a catcher good or bad at framing in order to improve player development practices. Finding this from a statistical perspective is tricky, as we don’t really have easily accessible data on what the catcher is doing behind the plate other than the video of it happening. This may not be the case on the team side, as markerless motion capture is a developing technology in this space that can record more data, but publicly, we just have video. Instead of sitting down and trying to watch thousands of pitches, as many coaches surely have, I’ll try my hand with OpenCV and TensorFlow.


Solving Adam Ottavino

There are many contributing factors to the lackluster performance of Boston Red Sox reliever Adam Ottavino, who made headlines in 2018 after saying he “would strike Babe Ruth out every time.” At the time of writing, Ottavino has a 3.68 ERA and a 3.27 FIP in 58.2 innings pitched, placing him 119th out of 283 qualifying pitchers. Some big reasons for his mediocre stat line include his inability to get left-handed batters out, command struggles, and pitch selection.

Red Sox manager Alex Cora has done well to put Ottavino in situations to succeed, and without Cora at the helm, Ottavino’s stat line would look much worse. The bottom line is that the right-hander has been an abomination against lefties in 2021. In 19.2 innings pitched, he has allowed 10 earned runs, 22 hits, and 10 walks. Ottavino’s comments to the Boston Herald earlier in the season did not age well:

“I have no idea what they’re looking for these days in terms of roles and stuff like that, but I do think it would benefit me to get a full season in facing as many lefties as possible so I can put that kind of narrative to bed.”



Pitch Mix Effectiveness

In a previous project, I attempted to determine what types of pitches are most effective in 1-2 and 0-2 counts based on suspicions that wasting pitches was not inherently strategic. I did this by analyzing league-average wOBA values of different types of pitches in and out of the strike zone. The findings showed that, on average, breaking and off-speed pitches outside of the zone were the most effective pitches to throw in order to minimize wOBA in both 0-2 and 1-2 counts.

While using league-average data produced some interesting results, I was still unsatisfied, since trying to project pitching strategy to a single pitcher doesn’t work when the data is league-wide. My goal was then to write an algorithm that could use a specific pitcher’s career pitching history to analyze the results of each of their pitches and determine every pitcher’s most effective pitch mix.

After a long time writing and editing code, I believe I have written a script that can do just that: evaluate each pitcher who has thrown more than 1,250 pitches since the start of 2019 and determine the wOBA value of each of their pitches at every count.
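
The general shape of such a script can be sketched as follows. This is not the author's code: the record layout, the function names, and the sample values are all hypothetical. It also assumes that each record is a pitch that ended a plate appearance and carries the wOBA weight of the outcome.

```python
from collections import defaultdict

# Hypothetical records for one pitcher: (pitch_type, balls, strikes, woba_value).
# woba_value is the wOBA weight of the outcome the pitch produced; values are made up.
pitches = [
    ("FF", 0, 2, 0.0),
    ("FF", 0, 2, 0.9),
    ("SL", 0, 2, 0.0),
    ("SL", 0, 2, 0.0),
    ("SL", 1, 2, 0.7),
    ("CH", 1, 2, 0.0),
]


def woba_by_pitch_and_count(records, min_samples=1):
    """Average wOBA value for every (pitch_type, balls, strikes) cell."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of values, number of pitches]
    for pitch_type, balls, strikes, woba_value in records:
        cell = totals[(pitch_type, balls, strikes)]
        cell[0] += woba_value
        cell[1] += 1
    return {key: s / n for key, (s, n) in totals.items() if n >= min_samples}


def best_pitch_per_count(table):
    """For each count, pick the pitch type with the lowest average wOBA value."""
    best = {}
    for (pitch_type, balls, strikes), woba in table.items():
        count = (balls, strikes)
        if count not in best or woba < best[count][1]:
            best[count] = (pitch_type, woba)
    return best


table = woba_by_pitch_and_count(pitches)
best = best_pitch_per_count(table)
for count, (pitch_type, woba) in sorted(best.items()):
    print(count, pitch_type, round(woba, 3))
```

With real data, `min_samples` would need to be raised well above 1 so that a cell with a handful of pitches does not decide the recommendation.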


A Regional View of the MiLB Housing Crisis

Like millions across the country, minor league players are facing a housing crisis. The practice of using host families to house prospects was put on hold due to the pandemic, leaving players responsible for obtaining their own housing. Things have not gone well. While stories have come to light bit-by-bit, team-by-team, a piece last month by Brittany Ghiroli of The Athletic is one of the more comprehensive looks at the minor league housing crisis to date.

Ghiroli’s story details a number of ways in which minor league players get squeezed by housing, all of which is best summed up by this quote from catcher Caleb Joseph: “Finding a place to put your head at night is the hardest, most stressful thing to do as a minor leaguer.” Joseph would know, as he slept in his team’s clubhouse one year to save on housing.

The comments by Joseph, who spent 2014-2020 in the majors, also underscore that while the situation with host families is specific to this season, housing has long been an issue for minor leaguers. But in light of Ghiroli’s piece and the amount of reporting on this issue recently, I was interested in putting some numbers to the stories players have shared, particularly since housing costs can vary greatly from market to market and minor league teams are scattered across the country.


James McCann Has Lost His Progress

For Mets catcher James McCann, 2019 represented a career-altering triumph over the struggles that had plagued him through his first five big-league seasons in Detroit. With the Tigers, McCann’s abject lack of success at the plate led him to yo-yo between batting stances and approaches. In June 2016, he replaced his leg kick with a quieter front-foot step, struck out in a career-high 29.2% of his at-bats, and tweaked his stance again in the offseason. McCann closed the book on his rookie contract with a 2018 season from hell: an abysmal triple-slash of .220/.267/.314 and a wRC+ of 56, good for second-worst among all hitters with at least 450 plate appearances.

2018 Worst wRC+ (450+ PA)
Player             PA    wRC+
Chris Davis        522   46
James McCann       457   56
Alcides Escobar    531   59
Scott Kingery      484   61
Billy Hamilton     556   68
JaCoby Jones       467   68
Adam Engel         463   68
Wilmer Difo        456   71
Jonathan Lucroy    454   72
Victor Martinez    508   73

Things opened up (literally) for McCann in Chicago. After signing a one-year, $2.5 million “prove it” deal with the White Sox, McCann’s most radical tweak struck gold. Opening his stance and bringing his hands closer to load unlocked an entirely different hitter in the once-struggling backstop. McCann became a legit power threat, popping 18 homers in just 118 games, and his wRC+ jumped to 108, placing him eighth among all catchers with at least 300 PAs. He continued this trend in 2020’s short-season madness: a slash of .289/.360/.460, a wRC+ within the top 40 of all hitters with as many at-bats, and even a positive grade as a framer.


Using Clustering To Generate Bullpen Matchups

In today’s game, reliever usage may be more important than ever. As starters go less deep into games, more emphasis is placed on bullpen strategy to survive the mid-to-late innings. Teams can use data to streamline this process, strategizing relief pitcher usage based on their pitch repertoires and batter ability. My goal is to produce a matchup tool that can potentially give us some insight as to how the big league teams “play the matchups.”

The basis of a bullpen matchup recommender will be at the pitch level: what types of pitches does a particular hitter struggle against, and how do they align with what a particular pitcher throws? To do this, I will first use clustering methods in order to redefine pitcher arsenals based on pitch flight characteristics. Matchups will then be selected according to which pitcher is expected to perform the best against a given batter, optimizing pitcher strengths against batter weaknesses.

Data

To conduct this research I used available Statcast data from 2016-2021 (through this year’s trade deadline). My variables of interest are as follows: pitch location (plate_x & plate_z), perceived pitch speed derived from release extension (effective_speed), pitch movement (pfx_x & pfx_z), spin rate (release_spin_rate), and the newly introduced spin axis (spin_axis). I elected to include spin axis in order to account for how the batter may see the pitch as it’s released. All in all, the variables selected measure the stuff and location of each pitch so that we may classify them more accurately beyond the basic pitch type labels. After cleaning this dataset and removing outliers, I was ready to move on to the modeling process.
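
As a rough illustration of regrouping pitches by flight characteristics rather than by their labels, the sketch below standardizes a few of the variables named above and runs a small k-means. The text here says only "clustering methods", so k-means is a stand-in chosen for brevity, not necessarily the method used. The six sample pitches are made up, and spin_axis is left out because an angle would need circular handling.

```python
import math
import random

# Made-up pitches: (effective_speed, pfx_x, pfx_z, release_spin_rate).
rows = [
    (95.1, -0.6, 1.4, 2300.0),
    (94.3, -0.7, 1.5, 2250.0),
    (96.0, -0.5, 1.3, 2350.0),
    (79.2, 0.9, -1.0, 2700.0),
    (80.1, 1.0, -1.1, 2750.0),
    (78.5, 0.8, -0.9, 2650.0),
]


def standardize(data):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in data]


def kmeans(data, k, iterations=50, seed=0):
    """Plain k-means: assign each row to its nearest center, then recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(data, k)  # start from k distinct rows
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: math.dist(row, centers[j]))
                      for row in data]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


labels = kmeans(standardize(rows), k=2)
print(labels)
```

Standardizing first matters because spin rate is measured in the thousands while movement is measured in fractions of a foot; without it, spin rate alone would dominate the distance.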


When 1 + 1 Doesn’t Equal 2

By Bryan Woolley, JP Wong, and Nick Skiera.

Baseball, like all sports, is exciting because of the concept of variance. No team scores the exact same number of runs every game. That is why the Dodgers (5.82 runs/game) were not 60-0 in 2020. Runs per game strongly correlates with winning percentage for obvious reasons, but a team’s variance (essentially their consistency) plays a crucial role in their ability to win baseball games.

Relating to this, we came across an interesting game theory concept. Given certain properties of the run-scoring distributions, the competitor with the lower output can increase their win probability by increasing the variance in their output. Conversely, the competitor with the higher output can increase their win probability by decreasing the variance in their output. Were this to apply to baseball, lower-scoring teams could win more games by becoming more inconsistent. Of course this is all just in theory, so the requirements for it to be relevant in reality to baseball might not be met.
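
A small worked example shows the effect described above. It assumes, purely for illustration, that each team's output is an independent normal variable; real run scoring is discrete and non-negative, so this is a sketch of the theory and not a model of baseball. The team figures and names are hypothetical.

```python
import math


def win_probability(mean_a, sd_a, mean_b, sd_b):
    """P(A outscores B) when both outputs are independent normal variables.

    A - B is then normal with mean (mean_a - mean_b) and variance sd_a^2 + sd_b^2.
    """
    z = (mean_a - mean_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical matchup: the underdog averages 4.0, the favorite 5.0.
steady_underdog = win_probability(4.0, 2.0, 5.0, 3.0)
volatile_underdog = win_probability(4.0, 4.0, 5.0, 3.0)

# The same matchup from the favorite's side, varying the favorite's own spread.
volatile_favorite = win_probability(5.0, 3.0, 4.0, 2.0)
steady_favorite = win_probability(5.0, 1.0, 4.0, 2.0)

print(f"underdog: steady {steady_underdog:.3f}, volatile {volatile_underdog:.3f}")
print(f"favorite: volatile {volatile_favorite:.3f}, steady {steady_favorite:.3f}")
```

Because the mean gap is fixed, extra variance on either side pulls the result toward a coin flip, which helps the lower-mean side and hurts the higher-mean side.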

We will examine the importance of variance in baseball both to test the theory and to attempt to uncover interesting trends in the sport. In our analysis we find that variance plays a significant role in a team’s success, suggesting that roster and lineup construction can be optimized by going beyond mean production. So as our title proposes, 1 WAR + 1 WAR and 2 WAR might not always be worth the same amount to a team if they are produced with different consistencies.


Which Pitch Should Be Thrown Next?

There are few things I enjoy in baseball more than the pitcher vs. hitter dynamic. Everyone likes to see highlight plays like a great catch or a mammoth home run, but those plays are few and far between. I believe that the tension created in a drawn-out plate appearance is where baseball is most enjoyable. Every pitch is meaningful, and the strategy of the game is on full display. The pitcher is trying to decide the best way to get the hitter to produce an out and the hitter is doing everything he can to thwart the pitcher.

This dynamic of baseball has always fascinated me. I was curious how pitchers and catchers decided which pitch was correct to throw in a situation. There are plenty of tools available to them that were not readily available when I was a child, like heat maps made from pitch-tracking data, but they show results without the context of what previous pitches were thrown in the plate appearance. Heat maps provide useful data, but the real art of pitching is being able to set up a hitter to take advantage of their weaknesses. If a pitcher throws the same pitch in the same location every time, eventually the hitter is going to catch on and change his strategy accordingly. So which sequence of pitches is the most effective at retiring hitters? This is the question I attempted to answer with this article.