Archive for Outside the Box

Hierarchical Clustering For Fun and Profit

Player comps! We all love them, and why not? It’s fun to hear how Kevin Maitan swings like a young Miguel Cabrera or how Hunter Pence runs like a rotary telephone thrown into a running clothes dryer. Comps are helpful, too: if there’s a player we’ve never seen before, they give us some idea of what that player is like.

When it comes to creating comps, there’s more than just the eye test. Chris Mitchell provides Mahalanobis comps for prospects, and Dave recently did something interesting to make a hydra-comp for Tim Raines. We’re going to proceed with my favorite method of unsupervised learning: hierarchical clustering.

Why hierarchical clustering? Well, for one thing, it just looks really cool:

That right there is a dendrogram showing a clustering of all player-seasons since the year 2000. “Leaf” nodes on the left side of the diagram represent the seasons, and the closer together, the more similar they are. To create such a thing you first need to define “features” — essentially the points of comparison we use when comparing players. For this, I’ve just used basic statistics any casual baseball fan knows: AVG, HR, K, BB, and SB. We could use something more advanced, but I don’t see the point — at least this way the results will be somewhat interpretable to anyone. Plus, these stats — while imperfect — give us the gist of a player’s game: how well they get on base, how well they hit for power, how well they control the strike zone, etc.

Now hierarchical clustering sounds complicated — and it is — but once we’ve made a custom leaderboard here at FanGraphs, we can cluster the data and display it in about 10 lines of Python code.

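import pandas as pd
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
# Read the custom leaderboard exported to csv
df = pd.read_csv('leaders.csv')
# Keep only the feature columns
data_numeric = df[['AVG','HR','SO','BB','SB']]
# Create the linkage array using Ward's method
w2 = linkage(data_numeric, method='ward')
# Label each leaf with the first two columns of the csv (season and name)
labels = tuple(df.apply(lambda x: '{0} {1}'.format(x.iloc[0], x.iloc[1]), axis=1))
# Draw the dendrogram with the leaves on the left, passing in the labels
d = dendrogram(w2, labels=labels, orientation='right', color_threshold=300)
plt.show()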

Let’s use this to create some player comps, shall we? First let’s dive in and see which player-seasons are most similar to Mike Trout’s 2016:

2016 Mike Trout Comps
Season Name AVG HR SO BB SB
2001 Bobby Abreu .289 31 137 106 36
2003 Bobby Abreu .300 20 126 109 22
2004 Bobby Abreu .301 30 116 127 40
2005 Bobby Abreu .286 24 134 117 31
2006 Bobby Abreu .297 15 138 124 30
2013 Shin-Soo Choo .285 21 133 112 20
2013 Mike Trout .323 27 136 110 33
2016 Mike Trout .315 29 137 116 30
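
One way a comp table like this could be pulled out of the tree is to cut it into flat clusters and keep every row that lands in the same cluster as the target season. The sketch below is not necessarily how the table above was produced. It assumes SciPy's fcluster with a cluster count picked by hand, the helper name find_comps is made up for illustration, and the rows are typed in from the Trout table above and the Brian Giles table later in this post so that it runs without leaders.csv.

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

FEATURES = ['AVG', 'HR', 'SO', 'BB', 'SB']

def find_comps(df, season, name, n_clusters):
    """Return the rows that share a flat cluster with one player-season.

    Hypothetical helper: n_clusters is chosen by hand, not taken from the post.
    """
    tree = linkage(df[FEATURES].to_numpy(), method='ward')
    # Cut the tree into at most n_clusters flat clusters
    cluster_ids = fcluster(tree, t=n_clusters, criterion='maxclust')
    is_target = ((df['Season'] == season) & (df['Name'] == name)).to_numpy()
    target_cluster = cluster_ids[is_target][0]
    return df[cluster_ids == target_cluster]

# Rows copied from the two tables in this post
rows = [
    (2001, 'Bobby Abreu', .289, 31, 137, 106, 36),
    (2003, 'Bobby Abreu', .300, 20, 126, 109, 22),
    (2004, 'Bobby Abreu', .301, 30, 116, 127, 40),
    (2005, 'Bobby Abreu', .286, 24, 134, 117, 31),
    (2006, 'Bobby Abreu', .297, 15, 138, 124, 30),
    (2013, 'Shin-Soo Choo', .285, 21, 133, 112, 20),
    (2013, 'Mike Trout', .323, 27, 136, 110, 33),
    (2016, 'Mike Trout', .315, 29, 137, 116, 30),
    (2000, 'Brian Giles', .315, 35, 69, 114, 6),
    (2001, 'Brian Giles', .309, 37, 67, 90, 13),
    (2002, 'Brian Giles', .298, 38, 74, 135, 15),
    (2003, 'Brian Giles', .299, 20, 58, 105, 4),
    (2005, 'Brian Giles', .301, 15, 64, 119, 13),
    (2006, 'Brian Giles', .263, 14, 60, 104, 9),
    (2008, 'Brian Giles', .306, 12, 52, 87, 2),
]
df = pd.DataFrame(rows, columns=['Season', 'Name'] + FEATURES)

comps = find_comps(df, 2016, 'Mike Trout', n_clusters=2)
print(comps.to_string(index=False))
```

Because the features stay on their raw scales, as in the code above, counting stats like SO and BB drive the distances far more than AVG does. Z-scoring each column before calling linkage is one option if you want AVG to carry comparable weight.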

Remember Bobby Abreu? He’s on the Hall of Fame ballot next year, and I’m not even sure he’ll get 5% of the vote. But man, take defense out of the equation, and he was Mike Trout before Mike Trout. The numbers are stunningly similar and a sharp reminder of just how unappreciated a career he had. Also Shin-Soo Choo is here.

So Abreu is on the short list of most underrated players this century, but for my money there is someone even more underrated, and he certainly pops out from this clustering. Take a look at the dendrogram above: do you see that thin gold-colored cluster? In there are some of the greatest offensive performances of the past 20 years. Barry Bonds’s peak is in there, along with Albert Pujols’s best seasons, and some Todd Helton seasons. But let’s see if any of these names jump out at you:

First of all, holy hell, Barry Bonds. Look at how far separated his 2001, 2002 and 2004 seasons are from anyone else’s, including these other great performances. But I digress — if you’re like me, this is the name that caught your eye:

Brian Giles’s Gold Seasons
Season Name AVG HR SO BB SB
2000 Brian Giles .315 35 69 114 6
2001 Brian Giles .309 37 67 90 13
2002 Brian Giles .298 38 74 135 15
2003 Brian Giles .299 20 58 105 4
2005 Brian Giles .301 15 64 119 13
2006 Brian Giles .263 14 60 104 9
2008 Brian Giles .306 12 52 87 2

Brian Giles had seven seasons that, according to this method at least, are among the very best this century. He had an elite combination of power, batting eye, and a little bit of speed that is very rarely seen. Yet he didn’t receive a single Hall of Fame vote, for various reasons (short career, small markets, crowded ballot, PED whispers, etc.). He’s my vote for most underrated player of the 2000s.

This is just one application of hierarchical clustering. I’m sure you can think of many more, and you can easily do it with the code above. Give it a shot if you’re bored one offseason day and looking for something to write about.


Gary Sanchez Should Bat Second

What do Mike Trout, Josh Donaldson, Dustin Pedroia, Corey Seager and Manny Machado all have in common? Besides the numerous accolades that they share between the Rookies of the Year, the Silver Sluggers, the MVP awards and the combined 16 All-Star appearances, they all share one less obvious trait: they have more career plate appearances batting second in the lineup than anywhere else. Gone are the days of your team’s best player batting third or fourth. The new normal is now MVP-caliber players batting second. It has worked for Pedroia and the Boston Red Sox, Machado and the Baltimore Orioles, Donaldson and the Toronto Blue Jays and Seager and the Los Angeles Dodgers. Not for nothing, but those teams all made the postseason last year with large contributions from their second-hole hitters AND Trout was the AL MVP for the second time in his career on a last-place Los Angeles Angels team. And as more teams continue to adopt this trend, the New York Yankees should also look to bump up their best hitter.

In a YES Network interview the other week, GM Brian Cashman said that the Yankees have kicked the tires on splitting up Brett Gardner and Jacoby Ellsbury in the lineup. This makes a lot of sense given their games; both rely on their ability to get on base and set the table more than on their ability to drive in runs. Additionally, both players have slowly but noticeably declined in recent seasons, primarily due to age and injury. Gardner has been the subject of trade rumors over the past few seasons, and Ellsbury has drawn the ire of the New York media for largely failing to live up to the seven-year, $153-million deal he signed before the 2014 season. River Ave Blues has already looked at how the Yankees could approach this situation and offered a solid solution, but they almost immediately toss out the idea of Gary Sanchez batting second, even though Sanchez is the most deserving of the promotion.

Sanchez has established himself as the Yankees’ most dominant hitter after bursting onto the scene last year. The Yankees, their fans, and the nation all expect Sanchez to hit in the third spot in the lineup, a prestigious position considering the history of the franchise, but moving the young slugger to second would not only better suit the team, but would also play to his strengths. Sanchez, despite the small sample of 231 plate appearances, has proved to be a pretty good fastball hitter. Of the 294 fastballs he has seen, he has connected for a .328 AVG and .781 SLG, and nine of his 20 home runs. Why does this matter? Traditionally, number-two hitters have seen more fastballs than hitters elsewhere in the lineup. To further cement his case against the fastball, per Brooks Baseball, Sanchez had an exit velocity of 94.3 MPH against the heater (he ranked in the top 10 in overall exit velocity last year). Young players are also traditionally late to adapt to major-league breaking pitches. Can you blame them when they’re up against this or this?

Secondly, it has been proven that two-hole hitters collect more plate appearances per season than the three through nine spots. This is not new information, but the exact number of plate appearances has been up for debate for years. Beyond the Box Score might’ve ended the debate while also examining how the two hole has changed, stating that “[e]ach drop in the batting order position decreases plate appearances by around 15-20 a year,” which might explain why MVPs Trout and Donaldson have made a living there over the past few seasons. An extra 10-20 plate appearances could mean an extra home run or two over the course of the season. Baseball is a game of inches, but it’s also a game of runs.
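
As a rough back-of-the-envelope check on that home-run claim, the sketch below applies Sanchez's 2016 rate (20 home runs in 231 plate appearances, the figures quoted earlier in this post) to 10, 15, and 20 extra plate appearances. Treating that small-sample rate as constant is purely an assumption for illustration, and almost certainly a generous one.

```python
# Sanchez's 2016 line as quoted in this post
home_runs = 20
plate_appearances = 231
hr_per_pa = home_runs / plate_appearances

# Extra plate appearances gained by moving up in the order (ranges from the post)
extra_hr = {extra_pa: round(extra_pa * hr_per_pa, 1) for extra_pa in (10, 15, 20)}

for extra_pa, hr in extra_hr.items():
    print(f"{extra_pa} extra PA -> about {hr} HR")
```

At that pace the extra trips to the plate are worth roughly one to two home runs, which is in line with the estimate above.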

With a lineup bereft of veteran power and more intent on utilizing the “Baby Bombers,” as they’ve been so aptly named, moving Sanchez up to second could and should give the lineup a much-needed boost if the reliance on Greg Bird and Aaron Judge should go somehow awry. Veterans Matt Holliday, Chase Headley and Starlin Castro have had good seasons and impressive resumes, but they need to return to All-Star form to carry a team of youngsters and a questionable starting rotation. No one really expects Sanchez to produce at the same rate that he did last year, but perhaps a bump up would allow him to produce at an above-average level again.


wRC+ by Leverage: the Good, the Bad, and the Funky

So I got a little carried away with the new splits leaderboard when I was looking up some wRC+ data. I was curious about which players performed the best and worst in high-leverage situations, and one thing led to another until I was looking at top performers across all three leverage situations (low, medium, and high). If you want to know more about how leverage is calculated, there is an old article in The Hardball Times here.

I used the splits leaderboards to gather 2016 hitter data by leverage situation and I only included players who had a minimum of 20 PA per split. Once I gathered all the data I converted each player’s wRC+ by leverage situation to a percentile and calculated each player’s mean percentile rank along with the variation around the mean using standard deviation to produce the following plot.

The blue line is just a LOESS line showing the general trend of the data. What the line is telling us is that players at the extreme ends of the percentile ranks also seem to have the lowest variation or, more simply put, good players seem to be consistently good and bad players seem to perform poorly across all leverage situations. Using that plot as my baseline, I started exploring the data to answer some questions about player performances in 2016. I included the top 10 players in ordered tables going from least interesting to most interesting, at least in my opinion. First, let’s look at the top performers from this year.

Players who ranked highest in wRC+ across all leverage situations
Leverage Rank
Name Low Medium High Mean Rank SD
Mike Trout 98 97 99 98 1
Freddie Freeman 97 88 94 93 4.6
Josh Donaldson 97 94 88 93 4.6
Anthony Rizzo 93 88 97 92.7 4.5
Joey Votto 96 98 84 92.7 7.6
David Ortiz 96 99 77 90.7 11.9
Matt Carpenter 91 82 94 89 6.2
Paul Goldschmidt 88 86 88 87.3 1.2
Tyler Naquin 93 81 87 87 6
Ryan Schimpf 80 86 93 86.3 6.5
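
The Mean Rank and SD columns appear to be the plain mean and the sample standard deviation (n - 1 in the denominator) of the three percentile ranks. That reading is inferred from the table values, not taken from the original R code. A minimal Python sketch, with summarize_ranks as a made-up helper name:

```python
import statistics

def summarize_ranks(low, medium, high):
    """Mean percentile rank and sample standard deviation, rounded to one decimal."""
    ranks = [low, medium, high]
    mean_rank = round(statistics.mean(ranks), 1)
    sd = round(statistics.stdev(ranks), 1)   # stdev uses n - 1 (assumed to match the table)
    return mean_rank, sd

# Percentile ranks copied from the table above
players = [
    ('Mike Trout', (98, 97, 99)),
    ('Freddie Freeman', (97, 88, 94)),
    ('Joey Votto', (96, 98, 84)),
]
for name, ranks in players:
    print(name, summarize_ranks(*ranks))
```

With only three values per player, a single bad split moves the SD a lot, which is worth keeping in mind for the volatility rankings later in the post.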

Boring, Mike Trout leads the way as the top performer. Apparently it doesn’t matter when he comes up to the plate; he is going to smash the ball. But I’m not going to focus on Trout, as I’m not qualified to write about him and he’s above my pay grade, so let’s leave him to the professionals. Like I said before, least interesting first and hopefully it’ll get more exciting as we go. Here’s a fun fact to keep you going: In high-leverage situations among players with a minimum of 20 PA, Ryan Howard led the league in ISO with a 0.640 mark. Ryan Schimpf was second with an ISO of 0.542. And Howard did that with a 0.118 BABIP, too.

Second, let’s take a look at the worst performers of the season.

Players who rated as the worst performers across all leverage situations
Leverage Rank
Name Low Medium High Mean Rank SD
Yan Gomes 17 23 0 13.3 11.9
A.J. Pierzynski 17 21 9 15.7 6.1
J.B. Shuck 25 16 11 17.3 7.1
Nick Ahmed 15 35 3 17.7 16.2
Jake Marisnick 37 19 6 20.7 15.6
Ramon Flores 21 20 21 20.7 0.6
Gerardo Parra 33 29 1 21 17.4
Juan Uribe 19 38 11 22.7 13.9
Adeiny Hechavarria 20 34 15 23 9.8
Alex Rodriguez 19 30 22 23.7 5.7

After a pretty impressive career, although one that came with its fair share of controversy, we see A-Rod make this list. And it doesn’t look like he is going to be playing again this year, which casts some doubt on whether he is going to make it to 700 career home runs (he’s currently at 696). But more importantly, our poorest performer of 2016 looks to be Yan Gomes. I was inclined to say A.J. Pierzynski should actually be considered the poorest performer of the year since his standard deviation was about half of Gomes’, but then I noticed that Yan Gomes was in the 0th percentile in high-leverage situations, literally the worst. Not all-time worst, but still pretty bad! And I guess if you want to argue that the worst percentile should actually be 1, as in the 1st percentile, then you could make that argument, but the value was rounded to 0 when Yan Gomes registered a whopping -72 wRC+ in high-leverage situations. The second-worst was Gerardo Parra at a -59 wRC+; that’s a pretty significant gap between worst and second-worst. Fun-fact time: In high-leverage situations, Mike Zunino ran a 30.8% walk rate, although he also struck out 30.8% of the time. Yasmani Grandal had a 30.4% walk rate to go with a much smaller 13% K%.

Everyone always seems to be looking for players who are on the extreme ends of the leaderboards, but let’s give some love to the unsung heroes of the world, the completely average performers! I wasn’t sure if I simply wanted to use mean percentile rank as a measure for averageness, so I decided to go with what I called Deviation in the table. Deviation is calculated by adding the standard deviation (SD) of a player’s percentile ranks to the Δ50 column. The Δ50 column is calculated as the absolute value of a player’s mean rank minus 50.

The most average performers of 2016 in wRC+
Leverage Rank
Name Low Medium High Rank SD Δ50 Deviation
Scooter Gennett 55 46 49 50 4.6 0 4.6
Ezequiel Carrera 46 44 51 47 3.6 3 6.6
Leonys Martin 44 54 47 48.3 5.1 1.7 6.8
Matt Duffy 41 49 49 46.3 4.6 3.7 8.3
Avisail Garcia 45 51 42 46 4.6 4 8.6
Howie Kendrick 46 59 52 52.3 6.5 2.3 8.8
Johnny Giavotella 40 44 42 42 2 8 10
Jason Castro 47 49 62 52.7 8.1 2.7 10.8
Jonathan Schoop 62 53 54 56.3 4.9 6.3 11.2
Brandon Phillips 55 48 63 55.3 7.5 5.3 12.8
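
The Deviation column can be reproduced from the three percentile ranks alone. The sketch below assumes SD is the sample standard deviation, which matches the table values, and deviation_score is a hypothetical helper name, not something from the original R code.

```python
import statistics

def deviation_score(low, medium, high):
    """SD of the three percentile ranks plus the distance of their mean from 50."""
    ranks = [low, medium, high]
    sd = statistics.stdev(ranks)                  # sample SD (assumed)
    delta_50 = abs(statistics.mean(ranks) - 50)   # the Δ50 column
    return round(sd + delta_50, 1)

# Percentile ranks copied from the table above
print(deviation_score(55, 46, 49))   # Scooter Gennett
print(deviation_score(46, 44, 51))   # Ezequiel Carrera
print(deviation_score(40, 44, 42))   # Johnny Giavotella
```

Adding the two terms means a player can score as "average" only by being both close to the 50th percentile and steady across leverage situations.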

And Scooter Gennett comes away as the most average performer of the season! He also ran a 0.149 ISO on the season and I think 0.150 is usually considered average. Look how wonderfully average these guys were; we should all take a minute to enjoy the little things in life. I realize this may not be the sexiest table, but it’s still interesting. You might not be getting a whole lot out of these guys over an entire season, but they are going to go up there and do average things whether you like it or not.

Two tables left; hopefully you’re still with me here. Let’s look at consistency. People always say consistency is key. I guess that’s good advice except when you’re on the terrible end of the spectrum.

Table looking at the most consistent performers based on percentile rank
across the 3 leverage situations (low, medium and high)
Leverage Rank
Name Low Medium High Mean Rank SD
Ramon Flores 21 20 21 20.7 0.6
Ivan De Jesus 32 32 33 32.3 0.6
Mike Trout 98 97 99 98 1
Paul Goldschmidt 88 86 88 87.3 1.2
Johnny Giavotella 40 44 42 42 2
Yunel Escobar 66 69 65 66.7 2.1
Hunter Pence 79 76 80 78.3 2.1
Wilson Ramos 81 80 85 82 2.6
Alexei Ramirez 26 32 28 28.7 3.1
Austin Jackson 43 38 37 39.3 3.2

Ramon Flores and Ivan De Jesus both had extremely consistent seasons; it’s just too bad they are on the wrong end of the spectrum. But I have to say Ramon Flores beats out Ivan De Jesus for the dubious honor, as he registered about 12 percentile ranks lower on average. In third we see Mike Trout showing incredible consistency while being the top performer in the league, followed closely by Paul Goldschmidt. It’s interesting to see the top four players on this list come from opposite ends of the spectrum, but the rest of the list bounces back and forth as well.

And here we are, the last one or, as the title says, “the Funky”. I found volatility to be the most interesting question: which players showed the most boom or bust in 2016? Most of the players in this list performed best in low- and medium-leverage situations, often above the 90th percentile.

Looking at players who showed the highest volatility based on percentile rank
across the 3 leverage situations (low, medium and high)
Leverage Rank
Name Low Medium High Mean Rank SD
Sandy Leon 96 84 2 60.7 51.2
David Peralta 95 29 1 41.7 48.3
Dansby Swanson 23 99 15 45.7 46.4
Yangervis Solarte 93 73 5 57 46.1
Mac Williamson 59 95 4 52.7 45.8
Alex Avila 36 99 12 49 44.9
Jarrod Saltalamacchia 41 9 97 49 44.5
Pedro Alvarez 86 85 9 60 44.2
Ryan Zimmerman 21 85 1 35.7 43.9
Kris Bryant 98 91 19 69.3 43.7

After perusing the list, one of the most interesting names that jumps out should be Jarrod Saltalamacchia and his 97th percentile rank in high-leverage situations last year. And here’s another twist: would it surprise you to hear that in 2016 Miguel Cabrera was the least-clutch hitter among all qualified Tigers hitters? Check out the Tigers leaderboard here. But the 2016 volatility award goes to Sandy Leon, who absolutely mashed balls in low-leverage situations, was no slouch in medium-leverage spots, but dropped off the map in high-leverage situations. I have no idea how BABIP relates to wRC+, but with Sandy Leon it looks like his BABIP reflects what was happening in the different situations (0.434, 0.393 and 0.190). There is probably some combination of descriptive stats that would explain some of the variance, and BABIP may very well be included, but I’m not going to go into that here.

Hope you enjoyed this. If anyone wants a copy of the R code I used to make the graph and tables, leave a comment below and I’ll pass it along. I ended up finding a pretty cool library to create HTML tables in R so you don’t have to mess around with formatting and manual inputs. As long as you’re willing to put a little work into understanding CSS, you can basically customize the look of your tables.


The 2017 Phillies Can Change Baseball Forever

The GM of the Philadelphia Phillies has been accumulating the players to potentially pull off the greatest single-season heist in the history of baseball.

How will they do this, you might ask?

By utilizing the 3-3-3 rotation.

I will explain why recent rotation alterations by the 1993 Athletics and 2012 Colorado Rockies were not successful. Then I will show how the Phillies version of the 3-3-3 will change the baseball world. But first, let me explain the 3-3-3 rotation and its benefits.

The classic 3-3-3 rotation uses three groups of three pitchers each, pitching once every three games.

Game 1 – Innings 1-3 (Pitcher #1) Innings 4-6 (Pitcher #2) Innings 7-9 (Pitcher #3)

Game 2 – Innings 1-3 (Pitcher #4) Innings 4-6 (Pitcher #5) Innings 7-9 (Pitcher #6)

Game 3 – Innings 1-3 (Pitcher #7) Innings 4-6 (Pitcher #8) Innings 7-9 (Pitcher #9)

Ideally, each pitcher will throw three innings or 30-50 pitches per appearance. By the end of the season each pitcher will pitch about 162 innings over 54 appearances.
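
The season totals follow directly from the schedule. Here is a minimal sketch of the classic 3-3-3 assignment, assuming a 162-game season of regulation nine-inning games with no extra innings, injuries, or skipped turns. The function name pitchers_for_game is made up for illustration.

```python
from collections import Counter

GAMES_PER_SEASON = 162

def pitchers_for_game(game_number):
    """Return [(pitcher, first_inning, last_inning), ...] for a 1-based game number."""
    group = (game_number - 1) % 3        # which trio is up: 0, 1, or 2
    first_pitcher = group * 3 + 1        # pitchers are numbered 1 through 9
    return [(first_pitcher + slot, 3 * slot + 1, 3 * slot + 3) for slot in range(3)]

# Tally appearances and innings over a full season
appearances = Counter()
innings = Counter()
for game in range(1, GAMES_PER_SEASON + 1):
    for pitcher, first_inning, last_inning in pitchers_for_game(game):
        appearances[pitcher] += 1
        innings[pitcher] += last_inning - first_inning + 1

print(pitchers_for_game(2))
print(appearances[1], innings[1])
```

Every one of the nine pitchers ends up with the same workload, which is where the 54 appearances and 162 innings come from.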

This rotation will help pitchers succeed by:

1) Allowing hitters only one plate appearance against each pitcher

2) Eliminating fatigue by keeping pitch counts down

The more opportunities a hitter has against a pitcher, the more success he has. Dave Fleming of Bill James Online provided statistical evidence from 2008 supporting this fact:

Split           PA      BA    OBP   SLG   OPS
1st PA in G     108606  .255  .328  .398  .727
2nd PA in G     44505   .270  .334  .431  .765
3rd PA + in G   34520   .282  .346  .453  .800

Notice how every hitting statistic increases with each plate appearance. To make a few comparisons, Eduardo Nunez was an All-Star last year, and his OPS was .758. All-Star Xander Bogaerts had an OPS of .802. So if you leave a pitcher in for a third trip through the order (generally by the 7th or 8th inning), you’re facing a lineup full of 2016 Xander Bogaertses. Not exactly a winning formula.

A similar pattern was echoed in pitch counts:

Pitches         PA      BA    OBP   SLG   OPS
Pitch 1-25      87685   .261  .333  .410  .743
Pitch 25-50     39383   .257  .326  .400  .726
Pitch 51-75     31791   .270  .333  .429  .763
Pitch 76-100    24261   .277  .344  .450  .795

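The fact that pitches 1-25 were less effective than 25-50 is likely due to lineup construction, since a starter’s first pitches of a game go to the top of the order. The rest of the numbers show that pitchers get steadily worse after the 50th pitch, with OPS allowed rising from .726 to .763 to .795.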

In this post, I will explain:

1) Why the 3-3-3 rotation did not work for La Russa in 1993

2) Why the Rockies’ alternative rotation wasn’t accepted in 2012

3) The benefits the 3-3-3 rotation will provide the Phillies in 2017 and beyond

Before we begin, there are a few concepts we must accept:

1) Baseball is not archaic; it is ever-changing

2) Categorizing pitchers as only “starters”, “relievers” or “closers” is limiting to the pitchers’ value and abilities. We have to look beyond these inadequate labels. I will use these terms in this article, but attempt to focus on these underlying meanings:

a) Starter – Pitcher trained to throw 5+ innings

b) Reliever – Pitcher trained to throw 1-2 innings

c) Closer – Pitcher with experience throwing the last inning

3) There is no one system that produces winners or losers. You must utilize your personnel to the best of their abilities and limitations

Why the 3-3-3 rotation did not work in 1993

1) The Athletics did not have the personnel to execute the strategy

2) The experiment lasted one week

First, the Athletics had one of the worst pitching staffs in the league in 1993. They were in last place when they implemented the 3-3-3 rotation and had lost nine of their last 12 games. Here is a list of their ERAs in ascending order:

Name             Training  ERA   Synopsis
Bobby Witt       SP        4.21  97 ERA+
Goose Gossage    RP        4.53  Age-41 season
Todd Van Poppel  SP        5.04  21-year-old rookie
Ron Darling      SP        5.16  79 ERA+
Bob Welch        SP        5.29  Age-36 season
Mike Mohler      RP/SP     5.60  Started 9 of 42 appearances
Kelly Downs      RP/SP     5.64  Started 12 of his 42 appearances
Shawn Hillegas   RP/SP     6.97  Started 11 of 18 appearances
John Briscoe     RP        8.03  Started 2 games in 139 IP in career

Only Bobby Witt and Goose Gossage had an ERA under 5.00. Witt was by far their best pitcher, and even his 97 ERA+ was below average.

The second reason it did not work is the experiment only lasted one week. The public and media backlash from the switch to this rotation was so great that La Russa was forced to abandon the experiment after one week. One week! I don’t care what you do in baseball, if it only lasts one week, then you didn’t give it a real chance. Buster Posey hit .118 in his first week in the MLB in 2009, but the Giants wisely kept him around for 2010.

Why the Rockies’ alternative rotation did not work in 2012

1) They did not have the right personnel

First, let’s describe the specifics of the Rockies’ new rotation. It was a four-man rotation of Jeff Francis, Jeremy Guthrie and rookies Drew Pomeranz and Christian Friedrich. In each start, these four pitchers were given a strict 75-pitch limit. Three rotating pitchers called “piggybacks” would then relieve them.

Game 1 – Francis (75 pitches) Piggyback #1 Reliever #1 Closer #1

Game 2 – Guthrie (75) Piggyback #2 Reliever #2 Closer #1

Game 3 – Pomeranz (75) Piggyback #3 Reliever #3 Closer #1

Game 4 – Friedrich (75) Piggyback #1 Reliever #1/2 Closer #1

Similar to the 1993 A’s, the Rockies made their switch out of desperation. When implemented on June 20th, the Rockies were 18 games below .500 and in a 6-15 slump, on pace to lose over 100 games. Here is a look at the top six Rockies pitcher stats by the end of the year, with ERAs in ascending order:

Name                 Training  ERA   ERA+  IP
Jhoulys Chacin       SP        4.43  105   69
Drew Pomeranz        SP        4.93  94    96.2
Alex White           SP/RP     5.51  84    98
Jeff Francis         SP        5.58  83    113
Christian Friedrich  SP        6.17  75    84.2
Jeremy Guthrie       SP        6.35  73    90.2

Only one of these starters was even an average pitcher. Three of the four rotation mates were at least 17% worse than the average pitcher in 2012 by ERA+, and Guthrie was 27% worse. The issue with the 1993 A’s and the 2012 Rockies is that they made these moves in the middle of last-place seasons. They were desperate to change what were the worst pitching staffs in the league. No team heading for a last-place finish is going to respond well to a complete overhaul of the staff in the middle of the summer.

The good news for this particular experiment, however, is that the Rockies pitching staff performed much better after the change was made. In the first 21 games that it was implemented, the starting pitchers improved from a league-worst 6.28 ERA to a league-worst 5.22 ERA. That’s more than an entire one-run improvement! Still the league worst (control your laughter), but that’s a major improvement.

I believe that gives us hope that an alternative and better rotation can be found in the correct circumstances. With the right rotation mates and the correct distribution of pitch counts, I believe there is room for improvement. The key is to train and implement the rotation before the season begins. No pitcher is going to be motivated to try a new system if it is implemented in the middle of a terrible season. It has to be the game plan to begin with, and everyone must be on board. Below you will see why the Phillies have the perfect staff for a 3-3-3 rotation. I have used the 3-3-3 rotation as my basis, but implemented some changes inspired by the 2012 Rockies to ensure success.

How the 3-3-3 Rotation will benefit the Phillies

1) Utilizing the perfect personnel

2) Peak value from assets

3) Health (Physical and Mental)

Personnel

The Phillies have eight middle-of-the-rotation MLB-ready starters who have demonstrated the ability to get MLB hitters out for multiple innings per appearance. The Phillies have five quality relievers who have demonstrated the ability to get MLB hitters out for one inning+ per appearance. Let’s take a look at the 2016 Phillies stats in order of ascending ERAs:

Name        Training  MLB IP 2016  ERA 2016       MLB service
Asher       SP        27.2         2.28           0.061 years
Neris       RP        80.1         2.58           1.104 years
Benoit      RP/CP     48           2.81           Final Year
Neshek      RP/CP     47           3.06           Final Year
Eickhoff    SP        197.1        3.65           1.045 years
Hellickson  SP        189          3.71           Final Year
Ramos       RP        40           3.83           0.101 years
Buchholz    SP        139.1        Career 3.96    Final Year
Velasquez   SP        131          4.12           1.086 years
Nola        SP        111          4.78           1.076 years
Gomez       RP/CP     68.2         4.85 w/ 37 SV  Final Year
Eflin       SP        63.1         5.54           0.111 years
Thompson    SP        53.2         5.70           0.058 years

Asher, Eickhoff and Hellickson were MLB starters with ERAs of 3.71 or better last year. Buchholz has the ability to be a front-line starter and carries a career 3.96 ERA. Velasquez and Nola showed great promise despite rather average ERAs in the 4s. Velasquez sported a 10.6 K/9 ratio while Nola’s curveball has the best horizontal movement in the Majors (9.3 inches, beating out Gerrit Cole). The only two pitchers who disappointed were Eflin and Thompson, two young starters getting their first crack at the majors. Let’s count on them performing better next year.

The main reason this personnel is perfect is that all of the trained starters have generally similar projections. From a projection and performance standpoint, all of these pitchers are middle- to back-of-the-rotation guys with upside. Nola and Velasquez are projected #2/#3 guys while Eflin, Thompson, Asher and Eickhoff are #3 to back-of-the-rotation guys (though Eickhoff did have an impressive year in 2016). There is no Kershaw or Verlander or Bumgarner or Cueto who is expected to dominate and throw eight innings every start.

By only allowing them up to 50 pitches and one time through the lineup, the numbers listed in the introduction illustrate that the 3-3-3 rotation puts these players in the best possible position to succeed. Since the numbers are now in their favor, pitchers will have a refined focus and confidence. They can make a structured game plan on how they’re going to attack each hitter. This will limit extended innings under duress and ultimately build confidence in the minds of these young pitchers.

You may ask, Kevin, the Phillies aren’t going to contend in 2017. Why go through such a drastic change to get marginally better?

The answer is using the 2017 season as a stage for their assets to increase in value.

Asset Valuation

The Phillies are not in line for a winning season in 2017. They most likely won’t win 80 games in 2018. But 2019 is their year. That amazing 2018-2019 class of Kershaw, Donaldson, Machado, Harper, Pollock, LeMahieu, Keuchel, Harvey, Wainwright, Corbin, Smyly and Shelby Miller will be theirs for the taking, as the only money they have tied up is to Odubel Herrera. Even the 2017-2018 class of Arrieta, Cobb, Darvish, Duffy, Pineda, Tanaka (option), and Cueto (option) could insert an ace or #2 into their staff.

That is why they need to act now. They must increase their pitchers’ values now and acquire better assets with 2019 in mind. The free-agent market will be booming from 2017-2019, thus lowering trade-market value of any player after this year’s deadline. Instead of trading away prospects to get the guys they need, teams will simply open their pocketbooks. Now is the time to trade these middle-of-the-rotation guys away. Especially because they are not all in the 2019 plans.

“Utility Pitchers”

What is the most overpriced asset on the market right now? Relief pitching. More specifically, pitchers who can pitch multiple innings in relief in tough situations. See: Andrew Miller, Kenley Jansen, and Aroldis Chapman. By utilizing the 3-3-3 method, you are training your starters to pitch multiple innings in different scenarios and relieve in later innings. The 3-3-3 method trains your pitchers to achieve the greatest possible value by becoming what I like to call “utility pitchers.”

What makes players like Ben Zobrist, a .266 career hitter, and Ian Desmond, a .267 hitter, worth $60-70 million? They are utility players. Teams these days love utility players and are willing to pay big money for them. They are more valuable now than they have been in all of history. The same can be said for utility pitchers.

If you have ever been to the Arizona Fall League, you know it is used as a stage for the game’s top prospects. Starting pitchers generally pitch three innings, and relief pitchers will pitch 1-2 innings each for the remainder of the game. Teams do this to give their top minor-league players exposure to higher competition, with the added benefit of raising prospect value in the eyes of other teams. When players compete against top minor-league competition for all scouts to see, a good showing will raise potential trade interest. For example, this year the Giants sent a young catcher named Aramis Garcia, a former second-round pick. Garcia doesn’t fit into the Giants’ MLB plans with a player like Buster Posey entrenched at catcher until 2022, but they used him as one of their eight player selections anyway. I can surmise they did this to boost his stock for potential trade scenarios. The Phillies do not have all their current pitchers in their 2018-2019 MLB plans, so why not show them off to other teams?

If the Phillies use the 3-3-3 method in MLB as a stage for their abundance of young pitching talent, their pitchers will:

1) Get experience against the top talent in the world

2) Potentially increase their trade value

3) Limit innings to 130 – 160 IP

4) Give young pitching the best chance to succeed at the MLB level

5) Keep their innings down and arms fresh

The Phillies 2017 3-3-3 rotation, which you will notice is a quasi version of the 3-3-3 that I referenced above, would look like this:

1st Group – Hellickson (3) Asher (3) Eflin (2) Neris (1)

2nd Group – Nola (3) Eickhoff (3) Thompson (2) Gomez (1)

3rd Group – Velasquez (3) Buchholz (3) Benoit (1) Ramos (1) Neshek (1)

Why this particular grouping?

1. Ability to sell three of what we call “closers” at the deadline. They can also switch Benoit and Ramos to the closer role on any particular day, giving Klentak five pitchers with closing experience to sell.

2. Give Eflin and Thompson only 2 IP per appearance because of their struggles last year. This should increase their confidence by decreasing their perceived pressure.

3. Since the Phillies signed two relievers to one-year deals in the offseason, it is apparent that Klentak wants to sell them off at the deadline. This is why I chose the quasi 3-3-3 system.

Imagine Klentak’s bargaining power at the deadline if he has even three of these newly trained utility pitchers pitching well, especially if one is a guy like Asher, Eflin, or Thompson? He could promise 5+ years of control of a utility pitcher who can be a traditional starter or a multi-inning reliever out of the bullpen.

Some people will read this and think that this would be a “demotion” or “devaluation” from being a “starter.” This is not true. All of these pitchers made it to the MLB as what you would call “starters.” They have excelled at pitching 6+ innings per game. This experiment would simply add value to all of them. Just as playing Ben Zobrist at LF, RF and SS doesn’t take away his ability to play 2B.

Most relief pitchers don’t get drafted as closers or relief pitchers. They are given chances at various roles and stick with whichever role suits their strengths best. Look at Chapman and Andrew Miller. Look at Joe Blanton! Terrible pitcher as a labeled “starter” but excelled in a set-up role for the Dodgers last year. General managers won’t trade for a guy for a postseason run if he hasn’t proven that he is going to be a solid contributor in the specific role they need for their team. So by using 2017 as a value-booster, you train all of your pitchers for multiple roles so you can have the leverage to trade any of your guys to any team. Every postseason team needs pitching. The 3-3-3 rotation will give Klentak unlimited options to acquire talent that will help the 2019 team be successful. GMs are most vulnerable at the deadline, and it is time to take full advantage.

Some people might argue that bringing up all of these pitchers at once would be a waste of MLB service time. But what is more important to a GM who has multiple pitchers with middle-of-the-rotation ceilings? An option year or service time? This experiment is exactly that, an experiment. It is a trial run for one half of a season to ramp up current asset valuations to acquire a lot of quality pieces for the future. Since all of these pitchers are already on the 40-man roster, sending them to the minors would waste an option year anyway. So why not give this a try? The worst thing you could lose is half a season of MLB service time on a few guys who have served less than 20% of one year in their career.

HEALTH

An arm-health study by Dr. James R. Andrews produced the following chart of pitch counts, with the required days of rest in parentheses:

Ages 14 and under – 66+ pitches (4 days rest) 51-65 (3) 36-50 (2) 21-35 (1) 1-20 (0)

Ages 15 and over – 76+ pitches (4 days rest) 61-75 (3) 46-60 (2) 31-45 (1) 1-30 (0)
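To make the chart easier to apply, here is a minimal Python sketch of it as a lookup. The function name `required_rest_days` and the table layout are my own, chosen for illustration; the thresholds are taken directly from the chart.

```python
# Pitch-count thresholds from the chart: (minimum pitches, days of rest),
# listed from the highest tier down. The dictionary keys are hypothetical labels.
REST_CHART = {
    "14_and_under": [(66, 4), (51, 3), (36, 2), (21, 1), (1, 0)],
    "15_and_over": [(76, 4), (61, 3), (46, 2), (31, 1), (1, 0)],
}


def required_rest_days(age, pitches):
    """Return the days of rest the chart calls for after an outing."""
    if pitches < 1:
        raise ValueError("pitches must be at least 1")
    tiers = REST_CHART["14_and_under" if age <= 14 else "15_and_over"]
    for minimum, days in tiers:
        if pitches >= minimum:
            return days


print(required_rest_days(12, 45))
print(required_rest_days(25, 45))
```

Under this lookup, a 45-pitch outing calls for two days of rest on the 14-and-under scale but only one day on the 15-and-over scale.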

These pitchers are prized assets. Millions of dollars coupled with thousands of hours of prep, coaching and playing time are used per arm. Why don’t we take better care of these players?

When you were a kid, your parents told you to eat your vegetables, sleep eight hours a night and stay in school while getting 60 minutes of exercise a day. But as we grow older we continually skip our vegetables, sleep five or six hours a night, forget to keep our brains active, and rarely exercise. We feel that we can still function this way, but more importantly, we feel we have to function this way. This is because we put too many responsibilities on ourselves at the expense of our own well-being. I’m arguing that we are giving these pitchers too many responsibilities, to the detriment of their peak physical health. Why? Because traditional baseball knowledge tells us that a five-man starting staff is the right way to go in 2017. But look back at history: there used to be one-man, two-man, three-man and even four-man rotations. Those proved to be unsuccessful. I am saying that the five-man rotation isn’t working either. It’s time to make a change.

What if we treated these valuable multi-million-dollar arms with the care that we take with our Little League arms? I propose a hopeful plan of three innings finished for each starter, but an absolute maximum of 36-50 pitches no matter what. These pitchers will then receive two days of rest for every 36-50 pitches, thus receiving the care a child under 14 would receive (see chart above). It is impossible to argue that this wouldn’t be a healthier system than the one we have now. Finally, let’s shift back to trade value. If Klentak is making deals on July 31 and a playoff contender is asking him how his players can help them win a championship, health is another big concern! If he can say that his pitchers have been put on a stricter regimen than any other team in the league, and that his players’ arms are healthier and more fresh than any other team in July in the history of baseball, that is going to increase his bargaining power. Remember, keeping players healthy, putting them in the best position to succeed and increasing trade value all are focused on the 2019 season. Klentak’s initial plan has always been focused on the 2019 season. And this plan will add tremendous benefit to that goal.

Conclusion

Now I am not saying that every team should utilize this strategy. I am not saying this is the future of baseball for eternity. I am saying that with the Phillies’ assets, at the perfect time in their development, this will be a great strategy to use. A Double-A or Triple-A prospect is worth much less than an MLB-proven prospect. A pitcher who can relieve, start and spot-start is worth more than just a conventional “starter” or “reliever.” More utility is always better than less utility. Healthier arms are better than overused arms.

I am saying the Phillies should give this a try for half of a season in which they won’t win more than 80 games. There is nothing to lose. And hey, if everything goes to plan, maybe this starts a revolution. If not, then they seamlessly revert to a five-man rotation in August. The goal of business is to buy low and sell high, looking for the most reward for the least amount of risk. This is about as high-reward as you can get in a sub-.500 season with about as little risk as I can imagine.

A new idea is always crazy before it makes sense. In the 1920s and 30s it was a rule that star pitchers had to throw 10-20 relief appearances in addition to their normal starting roles. In the 1880s, catching a ball on one bounce was an out. It even used to be legal for a first baseman to grab a runner by the belt so he couldn’t steal second! It is time for a new discussion about the modern-day pitching staff. It is time for rebuilding teams to try new things to get an edge on the competition. It is time for the game of baseball to go through yet another change. We owe it to the fans, to the players, and to the history of our beloved game. We owe it to ourselves to put our reputations on the line for the greater good of baseball.


Who Is the Greatest Second Baseman Ever?

It was when I was in sixth grade that I first began to seriously examine baseball.  I made my first annual Top 100 MLB players list that year.  Of course I didn’t know about advanced stats at the time, so Miguel Cabrera was atop that list.  Ironically that was before his Triple Crown.  Brian Kenny had educated me by then, and Trout has been first on every list since.  Anyway, back to the point, I also received the Bill James Historical Abstract that year, and became obsessed with his all-time rankings.  There was his all-time Top 100, and a Top 100 at each position.  Thinking about this the other day, it occurred to me how unusual the second-base rankings were.  Far be it from me to question the Godfather of Sabermetrics, but they seem wrong to me.  Here is the Top 10:

  1. Joe Morgan
  2. Eddie Collins
  3. Rogers Hornsby
  4. Jackie Robinson
  5. Craig Biggio
  6. Nap Lajoie
  7. Ryne Sandberg
  8. Charlie Gehringer
  9. Rod Carew
  10. Roberto Alomar

Again, this seems wrong, but it is Bill James I’m disputing, so some research is probably required.  First, let’s rank the group by career rWAR:

  1. Rogers Hornsby 128.7
  2. Eddie Collins 122.2
  3. Nap Lajoie 104.8
  4. Joe Morgan 99.6
  5. Charlie Gehringer 79.6
  6. Rod Carew 76.7
  7. Craig Biggio 65.5
  8. Roberto Alomar 65.2
  9. Ryne Sandberg 64.2
  10. Jackie Robinson 59.4

Career rankings are tricky.  Volume does matter, but at some point a great peak is better than a long career, and players like Robinson, who played only 10 seasons, suffer in career totals.  Let’s rank the group by four-year peak instead, using the total fWAR from each player’s four best seasons:

  1. Hornsby 45.6
  2. Morgan 38.7
  3. Collins 38.0
  4. Lajoie 36.4
  5. Robinson 33.2
  6. Gehringer 30.8
  7. Carew 28.7
  8. Sandberg 28.1
  9. Biggio 26.9
  10. Alomar 25.7

That’s nice.  We now know who the best among the group were for their career and for condensed excellence.  However, simply having a long career doesn’t mean a player is the best, nor does having the best brief period of dominance.  Luckily, there’s JAWS.  JAWS is a system for ranking players that averages a player’s career WAR with the WAR from his seven-year peak.  It is often used for analysis of Hall of Fame candidacies.  Let’s check out our group when using the JAWS system:

  1. Hornsby 100.2
  2. Collins 94.1
  3. Lajoie 83.8
  4. Morgan 79.7
  5. Gehringer 65.6
  6. Carew 65.4
  7. Sandberg 57.2
  8. Robinson 56.8
  9. Alomar 54.8
  10. Biggio 53.4
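For anyone who wants to see the JAWS arithmetic, here is a minimal Python sketch.  JAWS is the average of career WAR and the WAR from a player’s seven best seasons.  The function name `jaws` and the season-by-season numbers are hypothetical, made up only to show the calculation; they are not any real player’s record.

```python
def jaws(season_wars, peak_years=7):
    """Average of career WAR and the total WAR from the best `peak_years` seasons."""
    career = sum(season_wars)
    peak = sum(sorted(season_wars, reverse=True)[:peak_years])
    return (career + peak) / 2


# Hypothetical ten-season career, for illustration only.
example_seasons = [8.0, 7.5, 6.0, 5.5, 5.0, 4.5, 4.0, 3.0, 2.0, 1.0]
print(jaws(example_seasons))
```

Working backward from the lists above, Hornsby’s 100.2 JAWS and 128.7 career rWAR imply a seven-year peak of about 71.7.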

After seeing these three lists it is evident that only four of the ten are in the running for the title of being the top second baseman of all time:  Collins, Hornsby, Lajoie, and Morgan.  So far all I’ve used to evaluate these players is WAR.  Now, WAR is definitely a great tool, but it is not the only tool.  How about comparing the remaining four players in a few other ways?  Let’s see career wRC+ and Def for starters.

  • Collins:  144, 68.3
  • Hornsby:  173, 126.5
  • Lajoie:  144, 86.3
  • Morgan:  135, 14.0

Hornsby is the top-rated player in both wRC+ and Def.  He led all three lists of WAR metrics.  This doesn’t really look close.  Why then did Bill James have both Morgan and Collins ahead of Hornsby?  Hornsby was clearly the best hitter of the three, so that can’t be why.  He led both of them in defensive value, so that can’t be why either.  Maybe it’s baserunning?  Let’s check out these three players (sorry Nap Lajoie) in BsR.

  • Collins 42.3
  • Hornsby -1.8
  • Morgan 79.0

Here we go!  Finally, a reason to question Hornsby as the greatest second baseman.  Morgan was first for Bill James, so clearly he believes that the mediocre baserunning of Hornsby and the tremendous baserunning of Morgan make a huge difference.  Let’s concede hitting to Hornsby, and focus on the two final candidates in just fielding and running the bases.  For their careers, Hornsby’s edge in fielding was 112.5 runs, while Morgan’s edge in baserunning was 80.8 runs.  Hornsby still wins.  No matter how it is examined, Hornsby always comes out on top.  The greatest second baseman in baseball history is Rogers Hornsby.
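The fielding-versus-baserunning comparison is just subtraction, and a short Python sketch makes the net result explicit.  The Def and BsR values are the ones listed above; the variable names are my own.

```python
# Career Def and BsR (both in runs) as listed above.
career = {
    "Hornsby": {"def": 126.5, "bsr": -1.8},
    "Morgan": {"def": 14.0, "bsr": 79.0},
}

# Hornsby's advantage in the field and Morgan's advantage on the bases.
fielding_edge = career["Hornsby"]["def"] - career["Morgan"]["def"]
baserunning_edge = career["Morgan"]["bsr"] - career["Hornsby"]["bsr"]

# Positive means Hornsby comes out ahead once both are combined.
net_for_hornsby = round(fielding_edge - baserunning_edge, 1)
print(round(fielding_edge, 1), round(baserunning_edge, 1), net_for_hornsby)
```

Even before hitting is counted, the two components net out to roughly 31.7 runs in Hornsby’s favor.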


An Attempt at Modeling Pitcher xBABIP Allowed

Despite an influx of information resulting from the advent of Baseball Info Solutions’ batted-ball data and the world’s introduction to Statcast, surprisingly little remains known about pitchers’ control over the contact quality that they allow.  Public consensus seems to settle on “some,” yet in a field so hungry for quantitative measures, our inability to come to a concrete conclusion is maddeningly unsatisfying.  In the nearly 20 years since Voros McCracken first proposed the idea that pitchers have no control over the results of batted balls, a tug-of-war has ensued between those who support Defense Independent Pitching Statistics (DIPS) and those who staunchly argue that contact quality is a skill that can be measured using ERA.  Although it seems as if the former may prevail, the latter seems resurgent in recent years, as some pitchers have consistently been able to outperform DIPS, hinting at the possibility of an under-appreciated skill.

It is also widely assumed that a hitter’s BABIP will randomly fluctuate during the season, and that changes in this measure often help to explain a prolonged slump or a hot streak at the plate.  Hitters’ BABIPs can also vary drastically from year to year, making it difficult to gauge their true-talent levels.  Research in this field has been done, however, and there have been numerous attempts to develop a predictive model for this statistic, one that projects how a player should have performed, or perhaps more succinctly, his expected BABIP, or xBABIP.  Inspired by the progress and, albeit limited, success of these models, I embarked upon a similar project, instead focusing on the BABIP allowed by pitchers, rather than that produced by batters.  What began as a rather cursory look at exit velocity evolved into a much deeper look, and with this expansion of scope, I achieved some success, though not as much as I had hoped.

My research began with a perusal of Statcast data, and I began to use scatter plots in R to visualize each statistic’s relationship to BABIP.  Most of the plots looked something like this:

In the majority of plots, it seemed as if there may have been some signal, but there was quite a bit of noise, making it difficult to detect anything of significance.  This perhaps explains the lack of progress in projecting BABIP: after looking at these plots, it appears quite simply difficult to do.  Despite these obvious challenges, I remained hopeful that I could perhaps develop something worthwhile with enough data.  Therefore, I began aggregating information, collecting individual pitcher-seasons from FanGraphs, Baseball Savant, Brooks Baseball, and ESPN, then manipulating and storing the data in a workable format using SQL.  Since Statcast data only became available to the public in 2015, my sample size is unfortunately a bit limited.  I also wanted to incorporate the defense that pitchers had behind them along with park factors when creating my model, so I removed all pitchers that had changed teams mid-season from my records.  This left me with a grand total of 641 pitcher-seasons (323 from 2015, 318 from 2016), and 188 pitchers showed up in both years.  For the remainder of my study, I used the 641 pitcher-seasons to develop the model, but when checking its year-to-year stability and predictive value, I could only use the 188 common data points.

To begin, I fed 29 variables into R: K/9, BB/9, GB%, average exit velocity, average FB/LD exit velocity, average GB exit velocity, the pitcher’s team’s UZR, the pitcher’s home park’s park factor, his Pull/Cent/Oppo and Soft/Med/Hard percentages, and an indicator variable for every PITCHf/x pitch classification.  (Looking back on this, I wish I had included more data in my analysis to truly “throw the kitchen sink” at this problem, perhaps including pitch velocity, horizontal and vertical movement, and interaction terms to more accurately represent each individual’s repertoire.  Alas, I plan on keeping this in mind and possibly revisiting the topic, especially as more Statcast data becomes available.)  This resulted in an initial model with an adjusted R-squared of about 0.3; I then ran a backwards stepwise regression with a cutoff p-value of 0.01 to determine which variables were most statistically significant.  Here is the R output:

For clarity, the formula: xBABIP = -0.157 + 0.005684 * BB/9 + 0.0009797 * GB% + 0.003142 * GB Exit Velocity – 0.0001483 * Team UZR + 0.005751 * LD%
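As a quick illustration, the formula can be wrapped in a small Python function.  This is only a sketch: the function and argument names are mine, and I am assuming GB% and LD% are entered in percentage points (45.0 rather than 0.45) and GB exit velocity in mph, since that scale produces BABIP-like values.  The sample inputs are hypothetical, not a real pitcher’s line.

```python
def xbabip(bb_per_9, gb_pct, gb_exit_velo, team_uzr, ld_pct):
    """Evaluate the fitted xBABIP formula.

    Assumed units: gb_pct and ld_pct in percentage points, gb_exit_velo in mph.
    """
    return (
        -0.157
        + 0.005684 * bb_per_9
        + 0.0009797 * gb_pct
        + 0.003142 * gb_exit_velo
        - 0.0001483 * team_uzr
        + 0.005751 * ld_pct
    )


# Hypothetical pitcher: 3.0 BB/9, 45% GB, 85 mph GB exit velocity,
# a neutral team defense (UZR of 0), and a 20% line-drive rate.
print(f"{xbabip(3.0, 45.0, 85.0, 0.0, 20.0):.3f}")
```

With those made-up inputs the formula lands at roughly .286, which is at least in the neighborhood of a typical BABIP allowed.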

I again obtain an adjusted R-squared of about 0.3, and I don’t find any of these results to be overly surprising, but to be fair, I had little idea of what to expect.  Before examining the accuracy of my entire model, I checked each variable’s individual relationship to BABIP, along with the year-to-year stability of each.  These can be found below in pairs:

I was most perplexed by the statistical significance of BB/9, and even after completing my research, I still find no entirely compelling explanation for its inclusion.  Typically, BB/9 is considered a measure of control rather than command, but intuitively, these skills seem to be linked, and perhaps pitchers with better command and control are able to paint edges more effectively, thus avoiding the barrel and preventing strong contact.  I was disappointed that its relationship to BABIP appeared so weak, but because of its relative year-to-year stability, I hoped that it would retain some predictive power.

Previous research has indicated that ground-ball hitters are able to sustain higher-than-average BABIPs, and thus, the inclusion of GB% in my model should not come as a shock.  Again, it would have been nice to see a stronger correlation between GB% and BABIP, but there is obviously quite a bit of noise.  However, it does seem that generating ground balls is a repeatable skill, which lends itself nicely to the long-term predictive nature of an xBABIP model.

Again, as previous research has suggested, the inclusion of GB exit velocity is to be expected.  However, its correlation with BABIP is not as high as I would have hoped; I suspect this may be a result of the unfair nature of ground balls.  In a vacuum, one would expect that low exit velocities are always superior, yet a fortunately-placed chopper may actually have better results than a well-struck ground ball hit right at a fielder, and thus, exit velocity’s signal may be dampened.  There does appear to be some year-to-year correlation though, which offers some promise of an unappreciated skill.

Here, I’m surprised by the lack of correlation between UZR and BABIP; I collected this data to control for the quality of the defense behind a pitcher, assuming that this could be a pretty significant factor, and although it did remain in my model, the relationship appears to be quite weak.  We should expect a very low year-to-year correlation for UZR, as pitchers that changed teams in the offseason were included in my study, and even if they remained on the same roster, teams’ defensive makeups can change drastically from one season to the next.  Thus, the latter graph is rather useless, but I chose to include it for consistency.

Unsurprisingly, LD% has the strongest relationship to BABIP, checking in with an R-squared of about 0.15.  I obviously wish that there were a stronger correlation between the two, yet despite the noise, when looking at the data, I think it is fairly evident that there is a signal.  And although I have read that LD% fluctuates wildly from year to year, I was shocked by the latter graph.  It seems as if this is entirely random, and that this portion of a pitcher’s batted-ball profile can be simply chalked up to luck.  This revelation is a bit discouraging, as it suggests that my model may struggle with predictive power, since its most significant variable is almost entirely unpredictable.

I anticipated that more variables would be statistically significant, and I am surprised by their disappearance from the model.  I assumed that Hard% would be highly correlated with BABIP, but it disappeared from my formula rather quickly.  I also assumed that pitchers who generated a high true IFFB% would exhibit suppressed BABIPs, but nothing turned up in the data.  And finally, I thought that K/9 may have been significant; it can be considered a rough estimate of a pitcher’s “stuff,” and I speculated that pitchers with high K/9 probably throw pitches with more movement than usual, perhaps making them harder to square up, but my model found nothing.
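For readers who have not seen backward elimination before, here is a rough Python sketch of the idea: fit the full model, drop the variable with the largest p-value, and repeat until everything left clears the 0.01 cutoff.  This is not the R code used for this study.  The function names are hypothetical, the p-values use a normal approximation to the t distribution (an assumption that is reasonable with several hundred rows), and the data is synthetic.

```python
import math

import numpy as np


def ols_p_values(X, y):
    """Two-sided p-values for each column of X in an OLS fit with an intercept."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - k - 1)
    std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(design.T @ design)))
    t_stats = beta / std_err
    # Assumption: normal approximation to the t distribution (large n).
    p_values = [math.erfc(abs(t) / math.sqrt(2)) for t in t_stats]
    return p_values[1:]  # skip the intercept


def backward_eliminate(X, y, names, cutoff=0.01):
    """Drop the least significant variable until all p-values are <= cutoff."""
    cols = list(range(X.shape[1]))
    while cols:
        p_values = ols_p_values(X[:, cols], y)
        worst = max(range(len(cols)), key=lambda i: p_values[i])
        if p_values[worst] <= cutoff:
            break
        del cols[worst]
    return [names[c] for c in cols]


# Synthetic data: two real signals plus one pure-noise column.
rng = np.random.default_rng(0)
n = 641
ld_pct = rng.normal(20.0, 3.0, n)
gb_pct = rng.normal(45.0, 6.0, n)
noise_col = rng.normal(0.0, 1.0, n)
babip = 0.1 + 0.005 * ld_pct + 0.001 * gb_pct + rng.normal(0.0, 0.01, n)

X = np.column_stack([ld_pct, gb_pct, noise_col])
kept = backward_eliminate(X, babip, ["LD%", "GB%", "noise"])
print(kept)
```

A variable that truly has no relationship to BABIP will still sneak past a 0.01 cutoff about one time in a hundred, which is worth remembering when 29 candidates go in.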

After considering each of the significant variables individually, I wanted to examine the overall accuracy of my entire model.  To do so, I plotted pitchers’ xBABIPs vs. their actual BABIPs, along with the difference:

As mentioned earlier, after incorporating all of the statistically significant variables in my model, I achieve an R-squared of about 0.3, a result that I find satisfying.  I obviously wish that my model could have done a better job explaining some of the variation in the data, and I suspect my model could be improved, although I have no idea by how much.  There is an inherent amount of luck involved in BABIP, and it is entirely plausible that pitching and defense can in fact account for only 30% of the observed variance, and the rest can only be explained by chance.  Despite the lower-than-desired R-squared, I do believe it still verifies the validity of my model, if only for determining which pitchers over- or under-performed their peripherals, saying nothing about why they did so or if they can be expected to do so again in the future.  The lack of correlation in the difference plot indicates that pitchers have been unable to systematically over- or under-perform their xBABIP from year to year, and along with the residual plot, suggests that my model is relatively unbiased and doesn’t appear to miss any other variables that obviously contribute to BABIP.
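For a simple one-to-one comparison like xBABIP against actual BABIP, R-squared is just the squared correlation between the two columns.  Here is a minimal Python sketch of that calculation; the function name and the sample values are hypothetical and are not drawn from my dataset.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical xBABIP and BABIP values for five pitcher-seasons.
example_xbabip = [0.285, 0.292, 0.301, 0.310, 0.296]
example_babip = [0.270, 0.305, 0.298, 0.322, 0.281]
print(round(r_squared(example_xbabip, example_babip), 3))
```

An R-squared of 0.3 means the model accounts for about 30% of the variance in observed BABIP, with the rest left to chance or to variables the model does not include.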

After determining that my metric had some value in a retrospective sense, I set out to determine whether it had any predictive power.  Because of the lack of year-to-year correlation for most of the statistically significant variables included in the model, I was quite pessimistic, although still hopeful.  I first checked the year-to-year stability of both BABIP and xBABIP:

It seems that both measures are almost entirely random, although xBABIP is perhaps just a bit more stable from season to season.  Despite this, comparing 2015 BABIP to 2016 xBABIP revealed that, as expected, my model holds little to no predictive power:

Again, although disappointing, this result was to be expected, as the most powerful variable in my model, LD%, fluctuates wildly.  Despite this lack of predictive power, I stand by my model’s validity when considering past performance, and as more data accumulates, perhaps it can be adapted into a stronger predictive form.

Even after concluding that my metric has little predictive value, I thought it would be interesting to look at some of the biggest outliers.  2015’s biggest under- and over-achievers (with their 2016 seasons included as well), along with 2016’s luckiest and unluckiest pitchers can be found below:

Although the model holds no predictive power after quantitative analysis, anecdotally, it appears to do a decent job.  Each of the 10 pitchers featured as an over- or under-achiever in 2015 saw the absolute value of their difference fall in 2016 (although the sign did change in some cases); in no way am I suggesting that the model is predictive, I just find this to be an odd quirk.  I also find it perplexing that George Kontos appears as an over-achiever in both years and can think of no explanation for this.  Along with outperforming xBABIP, his ERA has also beaten FIP and xFIP in each of the last two seasons and five of the last six, suggesting a wonderful streak of luck, or perhaps hinting that the peripheral metrics are missing something.

Ultimately, although it would have been nice to draw stronger conclusions from my research, I am mostly satisfied with the results.  When developing his own model for hitter BABIP, Alex Chamberlain achieved an R-squared of about 0.4 when examining the correlation between BABIP and xBABIP, the highest I have found.  However, his model included speed score, a seemingly crucial variable that I was unable to account for when analyzing pitchers’ BABIPs.  With this in mind, I find an R-squared of 0.3 for my model entirely reasonable, and despite its lack of predictive power, I consider it to be a worthy endeavor.  As the sample size grows and more Statcast data is released, I plan to revisit my formula in coming offseasons, perhaps refining and improving it.


Imagining Shohei Otani as a True Free Agent

We all know about Shohei Otani, but in case you are the one baseball fan who doesn’t, he is possibly the best baseball player in the world.  Otani turned 22 years old in 2016.  Although he did not have enough plate appearances to qualify, if he did, Otani’s 1.004 OPS would have led the country (of Japan).  In 382 plate appearances, he posted a slash line of .322/.416/.588, in addition to hitting 22 home runs.  That sounds like a very good player who would draw serious interest from MLB teams if posted.  However, that’s not all.  Otani also posted a 1.86 ERA in 140 IP with an 11.2 K/9.  He owns the NPB record for fastest pitch, at 165 km/h (102.53 mph).  The pitching stats alone would have every team in the MLB drooling.  Combine this with his hitting, and Otani might just be the best baseball player in the world.  And the best baseball player in the world is not going to be paid like his title suggests.

The problem is that Otani will not yet be 25 after next season.  The new CBA keeps all international players under 25 from being exempt from the bonus pool system.  A tweet from Jim Allen reported that Otani still wishes to be posted after the 2017 season, when he will be 23 years old.  According to an excellent Dave Cameron article also on FanGraphs, the most money Otani could receive is $9.2 million.  This figure would be equivalent in 2016 to a player worth approximately 1.15 WAR.  Otani would surely be worth more wins than 1.15.

At first I wondered if this would make Otani the most underpaid player in the MLB.  Before that question could be answered, however, I had to answer a more important one: how much would Shohei Otani be worth in wins and, by extension, in dollars?  To make this more interesting, let’s make it a one-year deal, in which Otani would be paid the 2017 projected average price of $8.4 million per win above replacement.

The NPB has no available WAR figure, and no OPS+ or ERA+ was offered either.  Unfortunately, I could not find NPB league totals, so no calculating OPS+ or ERA+ on my own, at least not accurately.  I’ll use MLB league totals to find these numbers, but it is an obvious flaw in my research.  If anyone can find NPB totals for me, post the link in the comments, and I’ll gladly redo the study with those figures.

So, using the MLB totals, here are Otani’s numbers in 2016.  OPS+ 170.  ERA+ 225.

Those numbers look really good.  If these were for an MLB player, he would be by far the best player in the league.  How good were the numbers of other Japanese players before and after they were posted though? Let’s see, using three pitchers’ ERA+ and three hitters’ OPS+.  First the pitchers, including what Otani would hypothetically produce in 2017 based on what the others produced.

Masahiro Tanaka:  2013 (NPB) 305; 2014 (MLB) 138

Yu Darvish:  2011 (NPB) 274; 2012 (MLB) 112

Hisashi Iwakuma:  2011 (NPB) 163; 2012 (MLB) 121

Shohei Otani:  2016 (NPB) 225; 2017 (MLB) 113

Now for innings pitched, another component required for the crude WAR I’ll project.

Tanaka:  212.0; 136.1

Darvish:  232.0; 191.1

Iwakuma:  119.0; 125.1

Otani:  140.0; 112.2

The raw numbers of IP and ERA+ can be converted into a metric (PV) that I can change into WAR.

Tanaka:  85.227 PV; 3.3 WAR

Darvish:  103.932; 3.9

Iwakuma:  73.552; 2.0

Otani:  63.911; 1.7

Pitching, Otani would be projected for a 1.7 WAR.  That is worth $14.28 million in real value.  Now for batting, which will be OPS+.

Ichiro Suzuki:  2000 (NPB) 157; 2001 (MLB) 126

Hideki Matsui:  2002 (NPB) 205; 2003 (MLB) 109

Kosuke Fukudome: 2007 (NPB) 155; 2008 (MLB) 89

Otani:  2016 (NPB) 170; 2017 (MLB) 107

That is the quality component of WAR.  Plate appearances now for quantity.  As a side note, because I’m not factoring in defense, oWAR is going to be used instead of WAR.

Ichiro:  459; 738

Matsui:  623; 695

Fukudome:  348; 590

Otani:  382; 540
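The projection method is not spelled out above.  One reading that lands close to the listed figures is a pooled ratio: add up the comparison players’ MLB numbers, divide by the sum of their NPB numbers, and scale Otani’s NPB figure by that ratio.  The Python sketch below assumes that reading; the function name is hypothetical and the inputs are the ERA+, OPS+, and plate-appearance figures from the lists above.

```python
def pooled_ratio_projection(npb_value, comps):
    """Scale an NPB figure by the comps' combined MLB-to-NPB ratio.

    comps is a list of (npb_value, mlb_value) pairs for the comparison players.
    """
    npb_total = sum(npb for npb, _ in comps)
    mlb_total = sum(mlb for _, mlb in comps)
    return npb_value * mlb_total / npb_total


# Tanaka, Darvish, Iwakuma: (NPB ERA+, MLB ERA+)
projected_era_plus = pooled_ratio_projection(225, [(305, 138), (274, 112), (163, 121)])

# Ichiro, Matsui, Fukudome: (NPB OPS+, MLB OPS+) and (NPB PA, MLB PA)
projected_ops_plus = pooled_ratio_projection(170, [(157, 126), (205, 109), (155, 89)])
projected_pa = pooled_ratio_projection(382, [(459, 738), (623, 695), (348, 590)])

print(f"{projected_era_plus:.1f} {projected_ops_plus:.1f} {projected_pa:.0f}")
```

Under that assumption the sketch gives roughly 112.5 for ERA+, 106.5 for OPS+, and 540 plate appearances, in line with the 113, 107, and 540 listed.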

Now for my metric to convert to oWAR.  I’ll call it OV.

Ichiro:  115.305 OV; 6.1 oWAR

Matsui:  93.179; 3.1

Fukudome:  63.012; 0.6

Otani:  68.180; 0.9

On offense Otani would have a 0.9 WAR.  This translates into $7.56 million.  For a one-year deal using real value, Otani should receive $21.84 million, while producing a 2.6 WAR.  But what about a long-term deal with market value instead of real value?  Using Bill James’ stat of projected years remaining to determine the length of the deal, it would be 10 years.  The first year would not have a salary of $21.84M, but $13.72M.  This year was easy.  Now for the next nine years.  First, we’ll examine his pitching value.  I won’t bore you with all the calculations.  This article is tedious enough without it.  Just the pitching WAR for each year.

2018 2.1; 2019 2.9; 2020 3.9; 2021 4.8; 2022 5.9; 2023 5.7; 2024 3.8; 2025 2.1; 2026 0.7

Now the oWAR for each of the seasons:

2018 1.6; 2019 2.3; 2020 3.0; 2021 3.8; 2022 4.5; 2023 5.3; 2024 4.7; 2025 3.1; 2026 1.7

The total WAR for the years are as follows:

2018 3.7; 2019 5.2; 2020 6.9; 2021 8.6; 2022 10.4; 2023 11.0; 2024 8.5; 2025 5.2; 2026 2.4
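The bookkeeping behind these totals is easy to check.  The Python sketch below adds the pitching WAR and oWAR for each season, compares the result with the listed totals, and then rolls in the 2017 projection (1.7 pitching, 0.9 offense) at $8.4 million per win.  The variable names are my own; the figures all come from the lists above.

```python
DOLLARS_PER_WAR_2017 = 8.4  # $M per win, the projected 2017 price used above

pitching_war = {2018: 2.1, 2019: 2.9, 2020: 3.9, 2021: 4.8, 2022: 5.9,
                2023: 5.7, 2024: 3.8, 2025: 2.1, 2026: 0.7}
offensive_war = {2018: 1.6, 2019: 2.3, 2020: 3.0, 2021: 3.8, 2022: 4.5,
                 2023: 5.3, 2024: 4.7, 2025: 3.1, 2026: 1.7}
listed_total = {2018: 3.7, 2019: 5.2, 2020: 6.9, 2021: 8.6, 2022: 10.4,
                2023: 11.0, 2024: 8.5, 2025: 5.2, 2026: 2.4}

# Seasons where pitching + offense does not match the listed total.
mismatches = [
    year for year in listed_total
    if round(pitching_war[year] + offensive_war[year], 1) != listed_total[year]
]

war_2017 = round(1.7 + 0.9, 1)
ten_year_war = round(war_2017 + sum(listed_total.values()), 1)
one_year_value = round(war_2017 * DOLLARS_PER_WAR_2017, 2)

print(mismatches, ten_year_war, one_year_value)
```

The nine listed seasons sum to 61.9 WAR, and adding the 2.6 projected for 2017 brings the 10-year total to 64.5.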

Over the course of the 10-year deal, Otani would have a total WAR of 64.5.  This is not what he would likely produce.  My projections are — ahem — optimistic.  These are the numbers he could produce if played as both a pitcher and a semi-regular hitter.  Using real value and these WAR figures, Otani would have a real value of $689.14M.  You can read that number again.  I had to do a double-take.  Go ahead and do one too; it’s still $689.14M.  That is real value — however, not market value.  The market value is the much more important, and interesting, number.  What the market value turns out to be, $249.01M, is still massive, but at least the $24.901M AAV is more reasonable in the market.  In fact, this is likely what he will receive when posted, if he is eligible for this kind of deal.  It will be a shorter deal than 10 years, but the AAV should be in line with what I projected.

However, Otani is a mind-boggling player, so no contract, no matter how mind-boggling it may seem, is out of the question for him.  Even $689.14M.


The Season’s Least Likely Non-Homer

A little while back, I took a look at what might be considered the least likely home run of the 2016 season. I ended up creating a simple model which told us that a Darwin Barney pop-up which somehow squeaked over the wall was the least likely to end up being a homer. But what about the converse? What if we looked at the ball that was most likely to be a homer, but didn’t end up being one? That sounds like fun, let’s do it. (Warning: GIF-heavy content follows.)

The easy, obvious thing to do is just take our model from last time and use it to get a probability that each non-homer “should” be a home run. So let’s be easy and obvious! But first — what do you think this will look like? Maybe it was robbed of being a home run by a spectacular play from the center fielder? Or maybe this fly ball turned into a triple in the deepest part of Minute Maid Park? Perhaps it was scalded high off the Green Monster? Uh, well, it actually looks like this.

That’s Byung-ho Park, making the first out of the second inning against Yordano Ventura on April 8. Just based off exit velocity and launch angle, it seems like a worthy candidate for the title, clocking in at an essentially ideal 110 MPH with a launch angle of 28 degrees. For reference, here’s a scatter plot of similarly-struck balls and their result (click through for an interactive version):

(That triple was, of course, hit onto Tal’s Hill.)

But, if you’re anything like me, you’re just a tad underwhelmed at this result. Yes, it was a very well-struck ball, but it went to the deepest part of the park. What’s more, Kauffman Stadium is a notoriously hard place to hit a home run. It really feels like our model should take into consideration both the ballpark in which the fly ball was hit, and the horizontal angle of the batted ball, no? Let’s do that and re-run the model.

One tiny problem with this plan is that Statcast doesn’t actually provide us with the horizontal angle we’re after. Thankfully Bill Petti has a workaround based on where the fielder ended up fielding the ball, which should work well enough for our purposes. Putting it all together, our code now looks like this:

# Read the data
my_csv <- 'data.csv'
data_raw <- read.csv(my_csv)
# Convert some to numeric
data_raw$hit_speed <- as.numeric(as.character(data_raw$hit_speed))
data_raw$hit_angle <- as.numeric(as.character(data_raw$hit_angle))
# Add in horizontal angle (thanks to Bill Petti)
horiz_angle <- function(df) {
  angle <- with(df, round(tan((hc_x-128)/(208-hc_y))*180/pi*.75,1))
  angle
}
data_raw$hor_angle <- horiz_angle(data_raw)
# Remove NULLs
data_raw <- na.omit(data_raw)
# Re-index
rownames(data_raw) <- NULL

# Make training and test sets
cols <- c('HR','hit_speed','hit_angle','hor_angle','home_team')
library(caret)
inTrain <- createDataPartition(data_raw$HR,p=0.7,list=FALSE)
training <- data_raw[inTrain,cols]
testing <- data_raw[-inTrain,cols]
# gbm == boosting
method <- 'gbm'
# train the model
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 5)
modelFit <- train(HR ~ ., method=method, data=training, trControl=ctrl)
# How did this work on the test set?
predicted <- predict(modelFit,newdata=testing)
# Accuracy, precision, recall, F1 score
accuracy <- sum(predicted == testing$HR)/length(predicted)
precision <- posPredValue(predicted,testing$HR)
recall <- sensitivity(predicted,testing$HR)
F1 <- (2 * precision * recall)/(precision + recall)

print(accuracy) # 0.973
print(precision) # 0.811
print(recall) # 0.726
print(F1) # 0.766
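
As a quick sanity check on those printed scores, the F1 value follows directly from precision and recall. The sketch below is in Python rather than R purely for illustration; the function name is made up for this example, and the inputs are the rounded precision and recall shown in the comments above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return (2 * precision * recall) / (precision + recall)

# Rounded values reported in the comments above.
reported_precision = 0.811
reported_recall = 0.726

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))
```

Because F1 is a harmonic mean, it sits closer to the lower of the two scores (recall, in this case) than a simple average would.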

Great! Our performance on the test set is better than it was last time. With this new model, the Park fly ball “only” clocks in at a 90% chance of becoming a home run. The new leader, with a greater than 99% chance of leaving the yard with this model, is ARE YOU FREAKING KIDDING ME

I bet you recognize the venue. And the away team. And the pitcher. This is, in fact, the third out of the very same inning in which Byung-ho Park made his 400-foot out. Byron Buxton put all he had into this pitch, which also had a 28-degree launch angle, and a still-impressive 105 MPH exit velocity. Despite the lower exit velocity, you can see why the model thought this might be a more likely home run than the Park fly ball — it’s only 330 feet down the left-field line, so it takes a little less for the ball to get out that way.

Finally, because I know you’re wondering, here was the second out of that inning.

This ball was also hit at a 28-degree launch angle, but at a measly 102.3 MPH, so our model gives it a pitiful 81% chance of becoming a home run. Come on, Kurt Suzuki, step up your game.


Ranking the Importance of the Five Tools

A good friend of mine, with whom I often argue about baseball, once posed a very interesting question to me.  He asked me, if I were to build a team completely devoid of one tool, which tool would I want to be missing?  In the ensuing argument, I was asked to rank the tools from least to most important for team success.  I put the order as arm, speed, fielding, contact, and power.  It was not until later that day that it struck me just how great a question he had asked.  Now, several months later, I will attempt to quantify the tools.

The rules for this study will be simple.  Two teams will be assembled for each of the five tools.  Each team will be considered league-average in every tool but the one for which it is being evaluated.  One of the teams for each tool will be the best possible in that one area, and the other will be the worst possible.  The runs lost relative to league average by the worst possible team will be subtracted from the runs gained by the best possible team.  The larger the difference, the more important the tool.  The teams will have one player for each position (minimum 250 PA, 450 Inn).

Note:  Pitchers are not included.  Losing arm does not mean losing value from pitchers.

Power

The players on the teams for power will be determined using isolated power.

Best Possible Team:  C) Evan Gattis (.257); 1B) Chris Carter (.277); 2B) Ryan Schimpf (.315); 3B) Nolan Arenado (.275); SS) Trevor Story (.296); LF) Khris Davis (.277); CF) Yoenis Cespedes (.251); RF) Mark Trumbo (.277)

This group has a combined ISO of .276, which would put their team OPS+ at about 115.4.  An average team has 6152.6 PA in a season.  Using these figures, they would score 836 runs as a team, compared to the 725 of an average team.

Worst Possible Team:  C) Francisco Cervelli (.058); 1B) Chris Johnson (.107); 2B) Jed Lowrie (.059); 3B) Yunel Escobar (.087); SS) Ketel Marte (.064); LF) Ben Revere (.083); CF) Ramon Flores (.056); RF) Flores

The combined ISO for this team was only .072, making the OPS+ about 87.8.  Runs scored for this team would then be 636.

Difference between BPT and WPT:  200 runs

Contact

The players on the teams for contact will be determined using K%.

BPT:  C) Yadier Molina (10.8); 1B) James Loney (10.1); 2B) Joe Panik (8.9); 3B) Jose Ramirez (10.0); SS) Andrelton Simmons (7.9); LF) Revere (9.1); CF) Revere; RF) Mookie Betts (11.0)

Collectively, this team would strike out in 9.7% of their plate appearances.  League average in 2016 was 21.1%, meaning the BPT is 11.4 percentage points better than league average.  The team would score 807 runs.

WPT:  C) Jarrod Saltalamacchia (35.6); 1B) Chris Davis (32.9); 2B) Schimpf (31.8); 3B) Miguel Sano (36.0); SS) Story (31.3); LF) Ryan Raburn (31.3); CF) Byron Buxton (35.6); RF) Sano

This high swing-and-miss team would strike out in 33.9% of plate appearances.  This is 12.8 percentage points higher than average.  The team would score 632 runs.

Difference between BPT and WPT:  175 runs

Fielding/Arm

As it turns out, there are not really any stats that exclusively measure a fielder’s arm.  Baseball-Reference has Arm Runs Saved, but that is not for infielders.  Additionally, the stat I originally wanted to use for Fielding, UZR/150, is not available for catchers.  To remedy both of these problems, I elected to use DRS.  DRS is available for all positions, and it takes a fielder’s arm into account.  Because I will not be taking values for fielding and arm on their own, fielding will receive about 60% of the total difference in the category.  The remaining 40% will be attributed to arm.

BPT:  C) Buster Posey (23); 1B) Anthony Rizzo (11); 2B) Ian Kinsler/Dustin Pedroia (12); 3B) Arenado (20); SS) Brandon Crawford (20); LF) Starling Marte (19); CF) Kevin Kiermaier (25); RF) Betts (32)

Kinsler and Pedroia tied for the lead at second base, so I just listed both of them.  The brilliant defensive team would be 162 runs better than the average in the field.  Of these, 97 will be attributed to fielding and 65 to arm.

WPT:  C) Nick Hundley (-16); 1B) Joey Votto (-14); 2B) Schimpf/Daniel Murphy/Rougned Odor (-9); 3B) Danny Valencia (-18); SS) Alexei Ramirez (-20); LF) Robbie Grossman (-21); CF) Andrew McCutchen (-28); RF) J.D. Martinez (-22)

This team, made up of players who otherwise look pretty good, would have a defensive value of -148 runs.  Of these, -89 are attributed to fielding and -59 to arm.

Difference between BPT and WPT (Fielding):  186 runs

Difference between BPT and WPT (Arm):  124 runs
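
The 60/40 split above can be retraced with a short Python sketch.  The DRS values are copied from the two rosters; the rounding rule (round the fielding share to a whole run, then give the remainder to arm) is an assumption made here, though it reproduces the figures quoted.

```python
# DRS values copied from the best and worst possible teams above.
best_drs = [23, 11, 12, 20, 20, 19, 25, 32]
worst_drs = [-16, -14, -9, -18, -20, -21, -28, -22]

FIELDING_SHARE = 0.6  # about 60% to fielding, the remainder to arm


def split_runs(total_drs):
    """Split a team DRS total into (fielding, arm) runs.

    Assumption: fielding is rounded to a whole run and arm gets the rest.
    """
    fielding = round(total_drs * FIELDING_SHARE)
    arm = total_drs - fielding
    return fielding, arm


best_fielding, best_arm = split_runs(sum(best_drs))
worst_fielding, worst_arm = split_runs(sum(worst_drs))

fielding_gap = best_fielding - worst_fielding
arm_gap = best_arm - worst_arm

print(sum(best_drs), sum(worst_drs))
print(fielding_gap, arm_gap)
```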

Speed

Speed presents a problem.  It is valuable on the basepaths, obviously, but it is also valuable in the field.  More speed means more range.  Speed Score is a stat that represents the importance of both, but it does not translate well into value.  I decided to go with FanGraphs BsR, even though it does not measure speed in the field.  That value can be circumvented by routes and reactions anyway.

BPT:  C) Derek Norris (1.8); 1B) Wil Myers (7.8); 2B) Dee Gordon (6.2); 3B) Ramirez (8.8); SS) Xander Bogaerts (6.1); LF) Rajai Davis (10.0); CF) Billy Hamilton (12.8); RF) Betts (9.8)

This speed roster is a team that anyone would like to run out every day.  It is a young and athletic team.  Even so, based on speed alone, the team is just 63 runs above average.  That is the lowest value above average for any BPT.

WPT:  C) Molina (-8.7); 1B) Miguel Cabrera (-10.0); 2B) Pedroia (-4.5); 3B) Escobar (-5.6); SS) Erick Aybar (-3.9); LF) Yasmany Tomas (-5.5); CF) Jake Smolinski (-3.4); RF) Tomas

The lead-foot team is 47 runs below average.  That is the closest to average of any WPT.  Speed clearly has the least impact of the five tools.  I regret not putting it last.

Difference between BPT and WPT:  110 runs

Conclusion

I will admit that I was wrong.  Arm actually has some real value.  My excuse, I guess, is to say that it slipped my mind that arm is important for infielders as well as outfielders.  That should not have happened, and I am a little upset I made that mistake.  Fielding also beat out contact, which I did not expect.  I do not even have a defense for this one, as I do not know what I was thinking.

In all honesty, this post was written to win an argument.  However, it does have a deeper purpose.  This answers the question posed so many years ago in Moneyball.  If a general manager can afford to buy players with only one tool, which tool should it be?  This information is probably not new to any front office in baseball, but it is something to remember when considering small-market strategy.

Anyway, here is the official list of the five tools by importance, at least for 2017.

1.  Power

2.  Fielding

3.  Contact

4.  Arm

5.  Speed
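
For reference, the ranking can be rebuilt from the run totals quoted in each section with a minimal Python sketch.  Power and contact are given as total runs scored, while the other tools are given as runs relative to average; since only the gap between the best and worst team matters, the two scales can sit side by side.  The dictionary layout is just one convenient way to hold the numbers.

```python
# (best possible team, worst possible team) run figures from each section.
# Power and contact are total runs scored; the rest are runs vs. average.
team_runs = {
    "Power": (836, 636),
    "Contact": (807, 632),
    "Fielding": (97, -89),
    "Arm": (65, -59),
    "Speed": (63, -47),
}

# Gap between the best and worst team for each tool.
differences = {tool: best - worst for tool, (best, worst) in team_runs.items()}

# Largest gap first: the bigger the gap, the more important the tool.
ranking = sorted(differences, key=differences.get, reverse=True)

for rank, tool in enumerate(ranking, start=1):
    print(rank, tool, differences[tool])
```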


Derek Norris, 2016 — A Season to Forget

While it may not be the most exciting Nationals story of the offseason, Wilson Ramos signing with the Rays and the subsequent trade for Derek Norris to replace him is a very big change for the Nats. Prior to tearing his ACL in September, Ramos was having an incredible 2016, and he really carried the Nationals offense through the first part of the year (with the help of Daniel Murphy, of course) when Harper was scuffling and Anthony Rendon was still working back from last season’s injury. Given Ramos’ injury history it makes sense to let him walk, but Nationals fans have reasons to be concerned about Norris.

After a few seasons of modest success, including an All-Star appearance in 2014, Norris batted well under the Mendoza line (.186) in 2016 with a significant increase in strikeout rate. What was the cause for this precipitous decline? Others have dug into this lost season as well, and this article will focus on using PitchFx pitch-by-pitch data through the pitchRx package in R as well as Statcast batted-ball data manually downloaded into CSV files from baseballsavant.com, and then loaded into R. Note that the Statcast data has some missing values so it is not comprehensive, but it still tells enough to paint a meaningful story.

To start, Norris’ strikeout rate increased from 24% in 2015 to 30% in 2016, but that’s not the entire story. Norris’ BABIP dropped from .310 in 2015 to .238 in 2016 as well, but his ISO stayed relatively flat (.153 in 2015 vs. .142 in 2016). Given the randomness that can be associated with BABIP, this could be good news for Nats fans, but upon further investigation there’s reason to believe this drop was not an aberration.

Using the batted-ball Statcast data, it doesn’t appear that Norris is making weaker contact, at least from a velocity standpoint (chart shows values in MPH):

[Chart: batted-ball velocity (MPH), 2015 vs. 2016]

Distance, on the other hand, does show a noticeable difference (chart shows values in feet):

[Chart: batted-ball distance (feet), 2015 vs. 2016]

So Norris hit the ball farther in 2016, but with less success, which points to lazy fly balls. This is borne out by the angle of balls he put in play in 2015 vs. 2016 (values represent the vertical angle of the ball at contact).

[Chart: vertical angle of balls in play, 2015 vs. 2016]

The shifts in distance & angle year over year are both statistically significant (velocity is not), indicating these are meaningful changes, and they appear to be caused at least in part by the way pitchers are attacking Norris.

Switching to the PitchFx data, it appears pitchers have begun attacking Norris up and out of the zone more in 2016. The below chart shows the percentage frequency of all pitches thrown to Derek Norris in 2015 & 2016 based on pitch location. Norris has seen a noticeable increase in pitches in Zones 11 & 12, which are up and out of the strike zone.

[Chart: percentage of pitches by location zone, 2015 vs. 2016]

Norris has also seen a corresponding jump in fastballs, which makes sense given this changing location. This shift isn’t as noticeable as location, but Norris has seen fewer change-ups (CH) and sinkers (SI) and an increase in two-seam (FT) & four-seam fastballs (FF).

[Chart: pitch-type frequency, 2015 vs. 2016]

The net results from this are striking. The below chart shows Norris’ “success” rate for pitches in Zones 11 & 12 (represented by “Yes” values, the bars on the right) compared to all other zones, counting only outcome pitches, meaning the last pitch of a given at-bat. In this case success is defined as getting a hit of any kind, and a failure is any non-productive out (so, excluding sacrifices). All other plate appearances were excluded.

[Chart: success rate on outcome pitches, Zones 11 & 12 vs. all other zones, 2015 vs. 2016]

While Norris was less effective overall in 2016, the drop in effectiveness on zone 11 and 12 pitches is extremely noticeable. Looking at the raw numbers makes this even more dramatic:

[Images: raw outcome-pitch numbers for 2015 (left) and 2016 (right)]

Not only did more at-bats end with pitches in Zones 11 and 12, but Norris also went a shocking 2-for-81 in these situations in 2016.
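
The success definition used above (a hit is a success, a non-productive out is a failure, everything else is dropped) can be sketched in a few lines of Python. The event labels here are hypothetical stand-ins, not the actual labels in the PitchFx data, and the sample sequence is made up purely to exercise the function; the linked R script remains the reference for the real analysis.

```python
# Hypothetical event labels; the real data may name these differently.
HIT_EVENTS = {"Single", "Double", "Triple", "Home Run"}
OUT_EVENTS = {"Strikeout", "Groundout", "Flyout", "Lineout", "Pop Out"}


def classify(event):
    """Return True for a hit, False for a non-productive out, None if excluded."""
    if event in HIT_EVENTS:
        return True
    if event in OUT_EVENTS:
        return False
    return None  # walks, sacrifices, and anything else are left out


def success_rate(events):
    """Share of counted plate appearances that ended in a hit."""
    outcomes = [classify(e) for e in events]
    counted = [o for o in outcomes if o is not None]
    return sum(counted) / len(counted) if counted else float("nan")


# Made-up sequence: one hit, three outs, two excluded plate appearances.
sample = ["Single", "Strikeout", "Walk", "Sac Fly", "Flyout", "Groundout"]
print(success_rate(sample))

# The 2016 figure quoted above for Zones 11 and 12: 2 hits in 81.
zone_rate_2016 = round(2 / 81, 3)
print(zone_rate_2016)
```

At 2-for-81, the success rate on those pitches works out to roughly 2.5%.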

In short, Norris should expect a steady stream of fastballs up in the zone in 2017, and if he can’t figure out how to handle them, the Nationals may seriously regret handing him the keys to the catcher position.

All code can be found at the following location: https://github.com/WesleyPasfield/Baseball/blob/master/DerekNorris.R