Imagine you are coaching third base. Your team is at bat with a runner on third and one out. A fly ball is caught in somewhat shallow left field. You think your runner has about a 50/50 chance of scoring if you send him. Do you send him?
Many of you would probably say no. This is a risky call. There is a 50% chance the runner would be out, which would be a huge momentum killer. Furthermore, if he gets caught and your team loses by a run, you are going to be the person blamed by the media.
My hypothesis is that third base coaches are leaving runs on the table. Over the past four seasons, third base runners scored 98% of the time when sent in sac fly situations, suggesting that coaches are sending them only when they have a very high degree of confidence of success. I hypothesize they won’t send runners unless they feel they have at least an 80% chance of scoring, but my analysis says they should be sent even with much lower chances.
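One way to see why a lower threshold can make sense is a break-even calculation on expected runs. The sketch below is only an illustration: the function name and the two run-expectancy values are placeholders I am assuming for the example, not figures from the analysis, and it considers only the runner on third. After the catch there are two outs, so holding leaves a runner on third with two outs, a successful send scores one run and leaves the bases empty with two outs, and a failed send ends the inning.

```python
def break_even_send_probability(re_hold, re_after_score,
                                runs_scored=1.0, re_after_out=0.0):
    """Success probability at which sending and holding have equal expected runs.

    Solves p * (runs_scored + re_after_score) + (1 - p) * re_after_out = re_hold
    for p. All inputs are run expectancies for the rest of the inning.
    """
    value_if_safe = runs_scored + re_after_score
    return (re_hold - re_after_out) / (value_if_safe - re_after_out)


# Illustrative placeholder values (assumptions, not the author's numbers):
RE_THIRD_TWO_OUT = 0.35   # expected runs with a runner on third, two outs
RE_EMPTY_TWO_OUT = 0.10   # expected runs with bases empty, two outs

p_star = break_even_send_probability(RE_THIRD_TWO_OUT, RE_EMPTY_TWO_OUT)
print(f"break-even success probability: {p_star:.2f}")
```

With these placeholder inputs the break-even point comes out to roughly 32%, well below both a coin flip and the 80% comfort level hypothesized above. Real run-expectancy tables, and game context such as score and inning, would shift the exact number.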
Balls in play are subject to lucky bounces, bloops, and exquisite defensive plays. Are some great hitting seasons and breakout performances just a player getting lucky on more than their fair share of balls? Is there any way to tell if a player is truly lucky or good, or if his batting average on balls in play is higher than we would expect? Could building a better expected BABIP help us find over- or undervalued players?
In the hopes of better understanding players’ true abilities, I looked specifically at the correlation between BABIP and launch characteristics. A player’s BABIP viewed across a short timeframe, such as a single season, can be highly influenced by luck, because BABIP doesn’t converge well over a small sample. By the law of large numbers, we know that given enough balls in play, a player’s BABIP should converge to their “true” BABIP. Fortunately, launch characteristics like exit velocity and launch angle (both vertical and horizontal) converge more quickly. My goal was to build a model for expected BABIP based on those launch characteristics that removes as much luck as possible and more closely reflects a player’s true skill.
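To get a feel for how slowly BABIP settles down, here is a rough sketch of the sampling noise. It makes a simplifying assumption that is mine, not part of the model: every ball in play is treated as an independent trial with the same hit probability, so the observed BABIP has a binomial standard error. The .300 "true" BABIP and the sample sizes are hypothetical.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Binomial standard error of observed BABIP over a given number of balls in play."""
    return math.sqrt(true_babip * (1.0 - true_babip) / balls_in_play)


TRUE_BABIP = 0.300  # hypothetical "true" skill level

# Hypothetical sample sizes, from a few weeks of play up to several seasons.
for n in (50, 200, 450, 2000):
    se = babip_standard_error(TRUE_BABIP, n)
    low, high = TRUE_BABIP - 2 * se, TRUE_BABIP + 2 * se
    print(f"{n:>5} balls in play: roughly {low:.3f} to {high:.3f} (+/- 2 SE)")
```

Under this assumption the noise shrinks only with the square root of the sample size, which is why a single season of balls in play can still leave a wide band around a player's true BABIP.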
This project started as work I did along with Eric Langdon, Kwasi Efah, and Jordan Genovese for Safwan Wshah’s machine learning class at the University of Vermont. We were using launch characteristics (exit velocity, vertical launch angle, and derived horizontal launch angle) to predict if balls would land for hits or not. We initially tried using a support vector machine classification but found that a random forest model delivered more accurate predictions.