Author Archive

Using Decision Trees To Classify Yu Darvish Pitch Types

Last year, I wrote a post which outlined the application of a K Nearest Neighbors algorithm to make pitch classifications. This post will be, in some ways, an extension of that as pitches will yet again be classified using a machine learning model. However, as one might have presumed given this post’s title, the learner of choice here will be a decision tree. Additionally, this time around, instead of classifying pitches thrown over the course of a single game I will aim to classify pitches thrown by a single pitcher over the course of an entire season.

What follows will be divided into three sections: a brief conceptual explanation of decision tree learners, a description of the data and steps taken to train the decision tree model of choice here, and finally a run-through of the model’s results. I am not an expert on machine learning, but I believe that this is an interesting exercise that (very, very basically) highlights a powerful model using interesting baseball data. The work to support this post was conducted in scripting language R and with the direction of the book Machine Learning with R by Brett Lantz. Read the rest of this entry »