
An Attempt to Predict Hits With Statcast

Most of what happens in a baseball game is influenced by chance. A ball hit on the screws can end up in the outstretched glove of a diving fielder. The outfield wall could be just six inches too tall, keeping a home run in the park. Strike three could be called ball four by the home plate umpire. Traditional statistics can’t account for all of this, which is why sabermetricians have developed context-specific statistics like DIPS (defense independent pitching statistics) or wRC (weighted runs created). These stats try to explain the outcomes of batted balls while controlling for defense and ballparks.

I set out to create a model that controls for defense, but from the hitter’s perspective. A model that could predict batted ball outcomes could be used to better evaluate hitters and their quality of contact. Using the batted ball statistics in 2017 MLB pitch-by-pitch Statcast data (launch angle, exit velocity, outcome and spray angle), I used a random forest to model whether a batted ball would be a hit or an out. I trained my model on 20% of the data, and felt confident the training set and test set were comparable, with similar means and standard deviations for launch angle, speed and spray angle.

I chose a random forest because it fits multiple decision trees on random subsets of the training set and averages their predictions. Each decision tree is a series of binary splits on the inputs that ends in a predicted outcome, and a random forest uses k of them to model the data. Averaging across trees reduces variance, which helps prevent overfitting, something I was afraid of doing. The random forest provided much better accuracy than a logistic regression, my alternative hypothesized model, which I attribute to the number of trees (10) and the nature of a decision tree versus a regression.
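To make the averaging argument concrete, here is a minimal simulation sketch. It is not the model from this post: each "tree" is just a true hit probability plus independent random noise, and both `TRUE_HIT_PROB` and `TREE_NOISE_SD` are made-up illustration values. Real trees in a forest are correlated, so the variance reduction in practice is smaller than this idealized case suggests.

```python
import random
import statistics

TRUE_HIT_PROB = 0.30   # illustrative value, not taken from the model
TREE_NOISE_SD = 0.20   # hypothetical noise in a single tree's estimate


def forest_estimate(rng, n_trees):
    """Average the noisy estimates of n_trees independent 'trees'."""
    votes = [TRUE_HIT_PROB + rng.gauss(0.0, TREE_NOISE_SD) for _ in range(n_trees)]
    return statistics.fmean(votes)


rng = random.Random(2017)  # fixed seed so the run is repeatable

# Repeat the experiment many times for a single tree and for a 10-tree forest.
single_tree = [forest_estimate(rng, 1) for _ in range(2000)]
ten_trees = [forest_estimate(rng, 10) for _ in range(2000)]

single_var = statistics.pvariance(single_tree)
forest_var = statistics.pvariance(ten_trees)

print(f"variance of one tree's estimate:   {single_var:.4f}")
print(f"variance of 10-tree average:       {forest_var:.4f}")
```

With independent noise, the variance of the average shrinks roughly in proportion to the number of trees, while the average itself stays centered on the same value. Averaging does not remove any bias the individual trees share.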

 

Without further ado, the results (in visual form):

[Figures: Actual Hits & Outs (left) and Predicted Hits & Outs (right)]

There’s quite a bit going on in these plots. Let me break it down.

These plots are of every fair ball hit (with a few misclassifications) in 2017 and their landing (or caught) locations. The dark blue balls in play are hits, while the light blue balls are outs. On the left are the actual hits and outs, while on the right are the predicted hits and outs. There are almost a hundred thousand points on these plots, making it difficult to sift through. Here is an explanation of these plots in tabular form:

[Table image: correct]

My model does a much better job of predicting outs than hits. It was correct almost 90% of the time on outs, compared to merely 66% of the time on hits. From the perspective of hits being good (the batter’s perspective), 10% of outs were false positives, and 34% of hits were false negatives. I believe my model did better with outs because there are many more outs than hits: league-average BABIP is .300, so a ball in play is a hit about 30% of the time and an out about 70% of the time. The model was accurate 81.4% of the time. Despite the high accuracy, the model’s R-squared was only .1769. That is, the model was able to explain 17.7% of the variance in batted ball results.
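A quick back-of-the-envelope check shows how class imbalance shapes the overall accuracy. This sketch uses only the rounded figures quoted above and treats the .300 BABIP as the share of hits, which is an approximation (BABIP excludes home runs, and the actual hit share in the data set is not stated), so it will not land exactly on 81.4%.

```python
def blended_accuracy(hit_share, hit_accuracy, out_accuracy):
    """Overall accuracy as a weighted average of the per-class accuracies."""
    return hit_share * hit_accuracy + (1.0 - hit_share) * out_accuracy


HIT_SHARE = 0.30      # assumed from league-average BABIP; an approximation
HIT_ACCURACY = 0.66   # rounded figure quoted in the text
OUT_ACCURACY = 0.90   # "almost 90%" in the text, rounded up here

model_accuracy = blended_accuracy(HIT_SHARE, HIT_ACCURACY, OUT_ACCURACY)

# A naive baseline that calls every ball in play an out is right
# whenever the ball really is an out.
always_out_accuracy = 1.0 - HIT_SHARE

print(f"blended accuracy from rounded inputs: {model_accuracy:.3f}")
print(f"always-predict-out baseline:          {always_out_accuracy:.3f}")
```

Because outs dominate, even the always-out baseline is right about 70% of the time under these assumptions, which is part of why a high accuracy can sit alongside a low R-squared.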

Overall, I feel this model can help predict batted ball results. Two main drawbacks of the model are that it only predicts hits instead of the type of hit and that it requires more data to increase accuracy. I believe having fielder data, such as shifts and defensive capabilities, would greatly increase the accuracy of the model, though at the risk of overfitting (given the small samples of fielded balls in certain areas).

I plan to explore this model further and look at individual batters to compare their actual hits to the predicted ones.

 


Effect of Pitch Selection on Launch Angle and Exit Velocity

When talking about launch angle, much of the focus is on swing plane, and rightfully so. Many players like Jose Bautista, Josh Donaldson, Daniel Murphy and Justin Turner have demonstrated that it is possible to change the swing and achieve spectacular gains in power output.

However, the hitter’s plate discipline and the way he is pitched also have an effect. Looking at Statcast data, the average launch angle in the upper third of the zone is around 20 degrees, while it is only 5 degrees in the lower third. Of course, that doesn’t mean higher pitches are better to swing at; high pitches are also known to induce more pop-ups and, on certain types of fastballs (high spin), more whiffs. But for players who have trouble elevating the ball, it can make sense to swing a little less in the lower part of the zone. On the other hand, for a player with a high whiff or pop-up rate who already has a good launch angle, it might make sense to leave the high pitches alone.

I did a breakdown of the zones for right-handed hitters. I looked at LA but also exit velocity to see where the good parts are. Unsurprisingly, pitches over the plate do better in both LA and EV. Inside pitches did better in LA but worse in EV, and for outside pitches it was vice versa: better EV but worse LA.

Looking only at height, high and low pitches did about the same in EV, but high pitches did better in LA by far. Looking more finely, we could confirm that the combination of low and away gave the lowest launch angles and up and in gave the highest. Up and in also yielded by far the worst exit velocities, probably because there is the least space to get the barrel around on a pitch up and tight, so there is a trade-off between EV and LA.


Over the plate is, of course, good, and so are middle pitches, as are up and away and down and in. The down-and-away to up-and-in axis is probably one to avoid. So ideally a batter would have a zone that is tilted slightly away from him (imagine the zone is a rectangular piece of wood and the batter pushes the top of the piece away from him, so that the top is farther from him than the bottom). It should also be a little wider in the middle than at the very edges (like an ellipse).


Of course, the pitcher has a say in this too. If a hitter adjusts, pitchers will adjust too. There are some batters who can beat that a little, for example Brian Dozier, who is very quick to the inside and thus can crowd the plate a little without opening up, but for most hitters that is not really true. So if a batter makes a swing change and then struggles in the second half, we should probably look at both the swing and the pitch profile. Still, it is good for a hitter to match his swing rates to his hot zones, as even good pitchers will miss their target quite a few times. A batter not aware of his hot zones could leave serious potential on the table.

I also found one interesting thing. I looked mostly at right-handed batters in my analysis but also did a quick check on lefties. The lefties had a higher LA on inside pitches than the righties but a lower one on outside pitches. Why is that? The handedness of the pitchers faced, maybe? I did find that righties facing opposite-handed pitchers have a higher LA on inside pitches than against same-sided pitchers, and against LHPs it was vice versa, so there seems to be an effect there.

And, lastly, the LA on offspeed pitches (10 degrees) was slightly lower than on fastballs (11 degrees). Surprisingly, low breaking balls had a higher LA than low fastballs, but inside offspeed pitches were easier to lift.



Relief Pitchers Haven’t Been Feeling the Pressure of the Weak FA Market

I’m sure that you’ve seen a plethora of articles about how the FA market is in free-fall. Here’s Craig Edwards talking about the decline in payroll that might be either a cause or a result of the slow market, here’s Tom Verducci speculating on the reasons behind the slow market, and here’s Jay Jaffe talking about how the slow FA market might have its own structure to blame. We could write an encyclopedia about why the FA market has stalled out so much. But curiously enough, there’s a group of free agents that isn’t really experiencing these difficulties: relief pitchers.

Methodology

Previously, I discussed using a similarity tool to generate most-similar comparisons based on batted ball data and peripheral data. In this article, I’ll use the same notion to find most similar FAs and compare the contracts that historical comps have signed to the ones signed by 2017’s FA class. We can use this to illustrate the differences between the position player market, the SP market, and the RP market.

I modified my similarity tool to generate similarity scores for players on the basis of their production last season (in fWAR), their production over their career up until their free agent year (again, in fWAR), and their age, with age weighted twice as much as the other production measures. I then downloaded free agent contract data for all MLB free agents from 2006 to 2017 from ESPN, adjusted those figures to account for inflation, and then added production data to my dataset.

Then, using the tool, I generated a list of the most similar free agents for players in a given year – we are then assuming that, within a position, a player who produces X amount of WAR in a year, has Y amount of career WAR, and is Z years old should generate the same contract as a player who is of similar age with a similar history of production. While this assumption ignores aging curves and the strength of the market, it gives us a rough idea of who is most similar to whom in terms of production entering free agency, and we can then compare what contract they received versus what contracts players have historically received for similar production.

For an example, let’s look at Todd Frazier. Here are Frazier’s most similar comparisons at 3B, according to the tool.

Todd Frazier Most Similar FAs
Year | Name | Similarity Score | WAR in FA Year | WAR up to FA Year | Age | Contract ($M/years, adj. for inflation) | AAV ($M)
2017 | Todd Frazier | N/A | 3 | 21.2 | 31 | $17/2 | $8.5
2013 | Jhonny Peralta | 0.489 | 3.8 | 22.3 | 31 | $56/4 | $14
2010 | Juan Uribe | 0.629 | 2.8 | 13.6 | 31 | $23/3 | $7.6
2009 | Orlando Hudson | 0.687 | 2.8 | 16.9 | 32 | $6/1 | $6

Since 2006, the runaway most similar player to Frazier is Peralta, who made nearly twice as much in terms of AAV as Frazier, received twice as many years, and received three times as much guaranteed money! Uribe and Hudson each signed deals similar to Frazier’s in terms of AAV, but neither had anywhere close to Frazier’s history of production.
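The exact formula behind the similarity tool isn’t spelled out in this article, so the following is only a sketch of one plausible approach: standardize each input, then take a weighted Euclidean distance in which age counts twice as much as each WAR measure. The function names and the way the weight is applied are assumptions. Because it standardizes over just the four players in the Frazier table rather than the full 2006-2017 data set, it will not reproduce the scores shown above.

```python
from math import sqrt
from statistics import mean, pstdev

# Rows from the Frazier table: (name, WAR in FA year, WAR up to FA year, age)
players = [
    ("Todd Frazier", 3.0, 21.2, 31),
    ("Jhonny Peralta", 3.8, 22.3, 31),
    ("Juan Uribe", 2.8, 13.6, 31),
    ("Orlando Hudson", 2.8, 16.9, 32),
]

# Assumed weighting: age counts twice as much as each production measure.
WEIGHTS = (1.0, 1.0, 2.0)


def standardize(column):
    """Convert raw values to z-scores so WAR and age share a common scale."""
    mu, sigma = mean(column), pstdev(column)
    return [(value - mu) / sigma for value in column]


def similarity_to(target_name, rows, weights=WEIGHTS):
    """Return (name, distance) pairs sorted from most to least similar."""
    names = [row[0] for row in rows]
    columns = [standardize([row[i] for row in rows]) for i in (1, 2, 3)]
    z = {name: [col[j] for col in columns] for j, name in enumerate(names)}
    target = z[target_name]
    scores = []
    for name in names:
        if name == target_name:
            continue
        # Weighted Euclidean distance: lower means more similar.
        dist = sqrt(sum(w * (a - b) ** 2
                        for w, a, b in zip(weights, z[name], target)))
        scores.append((name, dist))
    return sorted(scores, key=lambda pair: pair[1])


ranking = similarity_to("Todd Frazier", players)
for name, dist in ranking:
    print(f"{name}: {dist:.3f}")
```

As in the table, a lower score means a closer comp. On this tiny pool the ordering comes out Peralta, Uribe, Hudson, but the distances themselves depend heavily on which players are included in the standardization.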

Frazier’s deal is emblematic of the problems facing the free agent market today. Among ESPN Top-25 free agents who have signed, here is each player’s most similar free agent and the deals that they signed.

Top FAs vs. Most Similar Historical FAs
Player | ESPN FA Rank | Contract ($M/years) | Most Similar FA | Most Similar FA’s Contract ($M/years, adj.)
Lorenzo Cain | 2 | $80/5 | Gary Matthews Jr., 2006 | $61/5
Zack Cozart | 3 | $38/3 | Justin Turner, 2016 | $65/4
Carlos Santana | 5 | $60/3 | Carlos Lee, 2006 | $121/6
Todd Frazier | 9 | $17/2 | Jhonny Peralta, 2013 | $56/4
Jay Bruce | 12 | $39/3 | Nick Markakis, 2014 | $46/4
Jhoulys Chacin | 14 | $16/2 | Mike Pelfrey, 2013 | $12/2
Yonder Alonso | 15 | $16/2 | James Loney, 2013 | $22/3
Jake McGee | 17 | $27/3 | Ryan Madson, 2011 | $9/1
Anthony Swarzak | 20 | $14/2 | Jesse Crain, 2013 | $3/1
Mike Minor | 22 | $28/3 | Scott Feldman, 2013 | $32/3
CC Sabathia | 23 | $10/1 | Tim Hudson, 2013 | $24/2
Welington Castillo | 25 | $25/2 | John Buck, 2010 | $20/3

Across the board, free agents are signing contracts that are either in the ballpark of their comparables or significantly lower. Some of these are imperfect comparisons that ignore market factors — Gary Matthews Jr. was competing with Barry Bonds, Jim Edmonds, and Alfonso Soriano in 2006 while Cain’s only major competition this year was J.D. Martinez, a guy who would probably be best served signing somewhere as a DH — but, still, there exists a shocking trend in underpayment, where players are getting fewer years and less guaranteed money than their most similar comps.

Take, for example, Carlos Lee versus Carlos Santana:

Carlos Santana vs. Carlos Lee
Year | Name | Similarity Score | WAR in FA Year | WAR up to FA Year | Age | Contract ($M/years, adj.) | AAV ($M)
2017 | Carlos Santana | N/A | 3 | 23 | 31 | $60/3 | $20
2006 | Carlos Lee | 1.055 | 1.9 | 19.7 | 30 | $121/6 | $20

I can certainly see the reasons for giving Lee a six-year deal, and Santana surpasses Lee in every respect except for being a year older. It astounds me to think that Santana, who is a much better player than Lee ever was, got half the deal that he did. Santana feels like a victim of the market.

And just last year, consider that Justin Turner received a $65/4 deal from the Dodgers, which was called “a massive bargain” for the Dodgers by Dave Cameron:

“…realistically, given the Cespedes/Fowler/Desmond signings, it feels like Turner should have gotten something like $90 to $100 million in this market. And as Craig Edwards showed in his piece on Turner in November, that’s pretty much what we should expect him to be worth based on recent comparable players.”

If the Turner deal was “a massive bargain”, then the Zack Cozart deal was finding a diamond ring on the sidewalk.

Zack Cozart vs. Justin Turner
Year | Name | Similarity Score | WAR in FA Year | WAR up to FA Year | Age | Contract ($M/years, adj.) | AAV ($M)
2017 | Zack Cozart | N/A | 5 | 14.9 | 32 | $38/3 | $13
2016 | Justin Turner | 0.332 | 5.5 | 13 | 32 | $64/4 | $16

Even if we rely upon conservative estimates and think that Cozart settles in around a 2.5-3 WAR player, especially after losing the positional adjustment bonus from playing at SS, Cozart is being paid like he’s still in arbitration while producing like he’s in his prime. Something is wrong, oh so terribly wrong with the MLB FA market, and we can talk and talk about it until Rob Manfred comes in and institutes a debate clock to speed up pace-of-discussion. But strangely enough, RPs seem insulated from this market downturn.

The Differences Between Position Players, SPs, and RPs

I split up our MLB FA Class of 2017 into Position Players, SPs, and RPs, and then looked at each player who received an MLB contract and whose most similar free agent also received an MLB contract.

Differences in contracts compared to most similar players by position
Position | Average % Difference in Total Contract Value | Average % Difference in Years | Average % Difference in AAV
Position Players | -38% | -7% | -12%
SPs | -16% | -11% | -6%
RPs | 17% | 5% | 17%
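For a single player-and-comp pair, the three percentage differences in the table above can be computed as in the sketch below. It assumes contracts are written in the "$total/years" shorthand used throughout this article (total in millions) and that each difference is measured relative to the comp’s contract; the helper names are illustrative, and how the per-pair figures were averaged within each position group is not shown here.

```python
def parse_contract(text):
    """Turn a '$56/4' style string into (total in $M, years)."""
    total, years = text.lstrip("$").split("/")
    return float(total), int(years)


def pct_diff(player_value, comp_value):
    """Percent difference of the player's value relative to the comp's."""
    return (player_value - comp_value) / comp_value * 100.0


def compare_contracts(player_contract, comp_contract):
    """Percent differences in total value, years, and AAV versus the comp."""
    p_total, p_years = parse_contract(player_contract)
    c_total, c_years = parse_contract(comp_contract)
    return {
        "total": pct_diff(p_total, c_total),
        "years": pct_diff(p_years, c_years),
        "aav": pct_diff(p_total / p_years, c_total / c_years),
    }


# Todd Frazier ($17/2) versus Jhonny Peralta ($56/4, adjusted)
frazier = compare_contracts("$17/2", "$56/4")
# Jake McGee ($27/3) versus Ryan Madson ($9/1, adjusted)
mcgee = compare_contracts("$27/3", "$9/1")

for label, result in (("Frazier vs. Peralta", frazier), ("McGee vs. Madson", mcgee)):
    print(label, {key: round(value, 1) for key, value in result.items()})
```

Frazier comes out roughly 70% below Peralta in total value, 50% below in years, and 39% below in AAV, while McGee tripled Madson’s total and years at the same AAV.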

What about players who received minor league contracts or players who signed in Japan? My data contains 40 position players who have signed free agent contracts, and of those, 19 have taken minor league deals or signed in Japan. 13 of those players’ most similar free agents also took minor league contracts, but six of the players who took minor league deals had most similar free agents with major league deals. Of five free-agent SPs who signed minor league deals, 2 of them took minor league deals when their most similar player had received a major league deal. But not a single RP who took a minor league deal had a most similar FA with a major league contract. Not one.

Conversely, among position players, only three players received MLB contracts when their most similar player only got a minor league deal out of 20 FAs with MLB contracts (and one of them was Alcides Escobar signing with the Royals, which is cheating). That figure is 2 out of 11 MLB starters, but it’s 8 out of 26 among MLB relievers.

In other words: in a year when position players and SPs are more frequently being forced to take minor league and overseas deals instead of MLB deals when they might have historically deserved an MLB deal, the reverse is true of relievers.

Perhaps the best example of this phenomenon would be Bryan Shaw, who signed a 3-year, $27 million deal with the Rockies earlier this offseason. Here are Shaw’s closest comps according to the tool.

Bryan Shaw most similar FAs
Year | Name | Similarity Score | WAR in FA Year | WAR up to FA Year | Age | Contract ($M/years, adj.) | AAV ($M)
2017 | Bryan Shaw | N/A | 1.6 | 4 | 30 | $27/3 | $9
2013 | Chad Gaudin | 0.564 | 1.2 | 4.6 | 30 | Minor League | N/A
2008 | Tim Redding | 0.566 | 1.2 | 4.7 | 30 | $3/1 | $3
2009 | Rafael Soriano | 0.648 | 2 | 6.2 | 30 | $8/1 | $8

No one among Shaw’s closest comps got even a third of the guaranteed money he was offered, and Soriano, who had a much better history of production, received only a one-year deal for $8.3 million (adjusted). Shaw’s most similar reliever, Chad Gaudin, couldn’t even get a major league deal! Sure, the Rockies have historically had to overpay free agent pitchers to get them to sign, but nowhere near to this degree. A contract like this for a reliever of Shaw’s caliber is without precedent.

The Virtues of not waiting for the market to collapse around you

The next logical step is to examine why relievers are flourishing when others are floundering: There does not immediately appear to be a single, straightforward answer to this question, but rather, several confounding factors.

One of the largest drivers of this trend has been the rise in demand for relievers. As I discussed last season for Sporting News, thanks to the postseason success of teams with “super-pens” (Cubs, Indians, Dodgers), relievers have been sought after in both trade and free agency, and as a result, teams are willing to pay pretty pennies to build their own super-pen.

Using a $/WAR framework, it’s obvious that relievers are usually paid considerably more than position players and starters in terms of $/WAR (which I would attribute to the fact that WAR, as a largely context-neutral metric, undervalues relievers whose value is very context-dependent). But $/WAR for relievers has spiked quite a bit from last season to this off-season.

$/WAR by Position, 2006-2017

There’s a substantial amount of year-to-year variation, but $/WAR for relievers is at its highest level since 2007 – thus, I’m inclined to believe that relievers are being valued more than they have been in recent seasons. But at the same time, $/WAR might be an indicator of another market trend — the fact that most relievers were off the market well before the FA market collapsed in on itself.
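As a rough illustration of the $/WAR idea, the sketch below divides a contract’s AAV by the player’s WAR in his free agent year, using two rows from the tables earlier in this article. That denominator is an assumption made for simplicity: the article does not say whether the chart uses prior-season WAR, projected WAR, or something else, so these numbers are not comparable to the chart’s values.

```python
def dollars_per_war(aav_millions, war):
    """AAV in $M divided by WAR; only meaningful for positive WAR."""
    if war <= 0:
        raise ValueError("WAR must be positive to compute $/WAR")
    return aav_millions / war


# AAV and FA-year WAR taken from the comparison tables above.
shaw = dollars_per_war(9.0, 1.6)      # Bryan Shaw, RP: $27/3, 1.6 WAR
frazier = dollars_per_war(8.5, 3.0)   # Todd Frazier, 3B: $17/2, 3.0 WAR

print(f"Shaw:    ${shaw:.2f}M per WAR")
print(f"Frazier: ${frazier:.2f}M per WAR")
```

On this crude measure the reliever is paid about twice as much per win as the third baseman, which is the direction of the gap described in the text.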

MLB’s transaction tracker counted 69 reliever free agents who signed MiLB or MLB contracts this offseason. Forty-seven of them signed before 2018. In the span of Dec. 12-17 (about the same time as the winter meetings with some lag to account for processing the signings), 12 relievers signed MLB free agent deals for multiple years – guys like Anthony Swarzak, Steve Cishek, and Brandon Morrow. Just like that, most of the big-names RPs were off the market, well before people realized how awful the free agent market would truly be.

RPs who signed in January or later also didn’t experience as much of a boon as those who signed earlier. RPs who signed MLB deals in January or later whose most similar FA also signed an MLB deal saw only 5% more money and 4% fewer years, and only two signed MLB deals when their most similar FA had signed a minor league deal (though only nine MLB RP FAs have signed in 2018, so take this with a sprinkling of “small-sample-size-salt”).

It also raises the question: have the RPs taken the FA money away from other types of players? I plotted the percent of FA money spent on RPs versus other players, and it would certainly appear as though RPs are occupying much more of the market in terms of overall money now compared to years past.

% Of Total FA spent Distribution

However, teams are not shortchanging SPs and position players to pay RPs; there has simply been extremely little money thrown around thus far. Even if the remaining FAs sign large contracts (which seems unlikely in their current situation), it will still take nearly seven hundred million dollars’ worth of contracts for FA spending to reach 2016 levels.

FA Spending Year By Year

While the current distribution of money is skewed towards RPs, that is more a result of many RPs having already signed while more SPs and position players are still waiting for contracts than of teams robbing SPs and position players to pay RPs.

There has simply been a large absence of money in free agency, partially because many FAs have yet to sign, but also because many SPs and position players have not been paid what they have been paid in the past. But that hasn’t been a problem for RPs, because many RPs got in on the ground floor. The end result? A new dynamic in the FA market. Here’s hoping that we see some correction in the market, and soon. I’m running out of things to write about other than how slow the FA market is…


A Bear Market for Moose

Mike Moustakas is a free agent, and, like seemingly 10,000 other players, remains unsigned. Along with J.D. Martinez and former Royals teammate Eric Hosmer, he’s considered one of the top position players available and is discussed as a guy who could land a multi-year contract somewhere north of $70 million. But “Moose” is special in that he’s especially unfortunate to be a free agent right now.

Let’s play a game. Each of these 2017 stat lines belongs to an active third baseman. Which of these players is worth at least $17.4 million per season AND a draft pick?

1) .260/.367/.461, 117 wRC+, 4.1 WAR
2) .249/.323/.450, 106 wRC+, 3.5 WAR
3) .273/.349/.513, 119 wRC+, 3.4 WAR
4) .272/.341/.472, 112 wRC+, 2.5 WAR
5) .248/.357/.487, 111 wRC+, 2.5 WAR
6) .272/.314/.521, 114 wRC+, 2.2 WAR

Did you guess player #6? Because that player with the acceptable-but-definitely-not-amazing 2.2 WAR last season is in fact Mike Moustakas. The others in this group? In order: Eugenio Suarez, Kyle Seager, Travis Shaw, Jedd Gyorko, and Jake Lamb. Solid players, but not exactly the centerpieces of their respective franchises. To get a sense of Moustakas’s production in comparable dollar value, those five players COMBINED to earn $18.71 million total in 2017, just a notch (in MLB contract terms) above the qualifying offer Moose already turned down.

In fact, that $17.4 million Moose declined would have given him the 6th highest annual salary among all third basemen in 2018, landing right in between Nolan Arenado ($17.75 mil) and free-agent-to-be Manny Machado ($16 mil). In other words, Moustakas wants top-5-player-at-his-position type money, a category to which he clearly doesn’t belong. For instance, last year among third basemen, Arenado and Machado ranked 4th and 14th in WAR, compared to Moustakas alllll the way down at 22nd. His numbers don’t look much better for offensive rating (20th) or abysmal defensive rating (79th) either. As far as BsR is concerned, he was the second worst 3B on the basepaths all year with an atrocious -5.4 rating, only stumbling ahead of the notoriously rock-footed Asdrubal Cabrera (sidenote here: c’mon Mets, the joke’s over).

Compared to one of the guys who will fill a previously open 3B role for an expected Moustakas bidder, Evan Longoria, Moose looks like an even more remarkable bust. Despite overall lackluster offensive numbers, .261/.313/.424 96 wRC+, Longo still managed a higher WAR (2.5) as a result of his respectable ratings among 3B in defense (14th) and BsR (13th). To top it off he’s “only” making $13.5 million next year, making him a significant bargain over what it would cost to sign Moustakas despite the moderate drop in offensive production, especially considering the relative gains in defense and baserunning.

As the Giants did have to part with top prospect Christian Arroyo to complete the deal, this is actually a solid baseline from which to compare Moose. Would most teams give out a Longoria-sized contract and a draft pick to acquire Moustakas? No? How about for even more money? Still no? Shocking.

So who is even around for Moustakas to sign with? Someone has to want his 38 home runs, right? Well…

The market for third basemen was actually fairly robust at the onset of the offseason, but as we’ve gotten deeper into winter, it seems as though just about everyone has filled the role or spent their money elsewhere. Inept or power-needy teams with hopes to compete, such as the Angels (signed Zack Cozart), Giants (traded for Evan Longoria), and Mets (signed Todd Frazier), have filled voids. Similarly, teams like the Yankees, Cardinals, and Phillies were reported to show some interest, but have all since opted to spend their money elsewhere, adding Giancarlo Stanton, Marcell Ozuna, and Carlos Santana, respectively.

Of course, any team would love to unilaterally add another 30+ home runs to their stat sheet, but in this modern home-run-happy era of baseball, dingers aren’t really all that hard to find. The market is still saturated with available niche power-hitting corner guys with higher walk rates such as Morrison, Duda, Carter, Napoli, Reynolds, and Lind, most of whom will likely provide greater on-field value per dollar spent than Moose will. Additionally, the success of teams like the Cubs, Astros, Rockies, Diamondbacks, Red Sox, and Indians in finding and developing premier young, inexpensive power hitters has further strained a market that has in the past been governed by whichever available name was the most prolific.

Then of course there are the two elephants in the room: Machado and Donaldson.

The two soon-to-be free agents are certainly affecting this year’s free agent crop, but no one has lost more future money as a result of their impending free agency than Moustakas. Not only are Machado and Donaldson much more highly touted as all-around third basemen, being both offensive difference makers and defensive wizards, they’re also going to cost their future signatory teams a fortune to bring onboard, factors which are extremely limiting to Moose’s potential suitors. Just the potential to sign one of the two titans next offseason (or the likes of Bryce Harper, Charlie Blackmon, Daniel Murphy, Andrew McCutchen, and many more!) affects how a multitude of teams are using their dollars this winter. Teams don’t want to sign a hitter to a massive, long-term contract if there are better options next season around the diamond, and if they plan on expanding their payroll in future seasons, they’ll need to plan to get under the luxury tax for this coming one. Thus, despite the availability of funds for teams like the Yankees and Phillies, the incentive to sign someone now just isn’t there. Combine this economic sentiment with Moustakas’s on-field production (or comparative lack thereof) and draft pick compensation, and you’ve found a perfect storm of free agency limbo.

Ok, so what’s the field actually look like then? Somebody’s gotta want this guy. Who out there is willing to shell out a multi-year, $70mil+ contract, and give up a draft pick to do it?

Well, there are only three teams without obvious opening day starting third basemen that I can tell: the Yankees, Royals, and Braves. The Yankees will more than likely look elsewhere for a cheaper, single-season solution, as they look to stay under the luxury tax for 2018 before throwing big money at Machado in the offseason. Moustakas could opt to return to the Royals, but they are much more intent on re-signing Hosmer to a long-term deal. The Braves have an opening and the funds, but they don’t seem to be in compete mode for the next few seasons, so it’s doubtful that they’ll make a free agent splash like Moose unless it’s a deal for 5+ years.

There is always the option of signing a one-year deal with someone, but how many teams are willing to give up a draft pick for one year of a guy? The correct answer is no one, especially if the on field production is shaky to begin with. There is the possibility that the Royals come out with a one-year deal, as they of course wouldn’t have to forfeit a draft pick, but that doesn’t appear to be a part of the Royals’ long term strategy. As they dive into full fledged rebuild mode, the Royals are looking to get younger, stock picks, and cut costs. So it makes sense to sign someone like Eric Hosmer to a long term deal, but very little sense to give out a massive long term contract to a guy they don’t view as a centerpiece of a franchise. There just isn’t much motivation for a team with little anticipation to compete this year to go out and overpay for one season of an overrated niche power guy with a low walk rate, forgoing a future pick in the process.

Moose probably doesn’t have much interest in a one-year deal anyway, regardless of the salary. Though it would undoubtedly benefit him to re-enter free agency next year without the compensatory pick attached to him, as a player can only receive a qualifying offer once, the notion of having to compete with Machado, Donaldson, Murphy and others in next year’s market is less than enticing. Being at best the fourth-ranked free agent at your position, especially when the teams losing the top three will likely look for in-house options to fill the vacated roles, is not a recipe for a big contract. Because of this, there’s little reason to think that next year’s market will be any more advantageous for Moustakas, especially if his peripheral stats stay steady through next year.

Thus it’s increasingly looking as though the most likely path forward for Moose is in the Todd Frazier 2-year deal mold, but the lingering questions of with whom and for how much remain murky. Frazier signed for just $17 mil total over those two years, well below the three-year, $42-million deal he was projected to receive, as he fell victim to many of the same analytical obstacles plaguing Moustakas. However, despite the lower projected price tag, Frazier’s .213/.344/.428 slash line, 108 wRC+, and 3.0 WAR in 2017 actually parallel Moose’s offensive production quite closely, and his positive defensive rating (10th among 3B) clearly sets him apart. Here it becomes increasingly apparent why the Mets, yet another team previously thought to be interested in Moustakas, opted for his free agent alternative. A slight downturn in home runs in exchange for comparable production, better defense, and much less money is far too sweet a deal to overlook.

So yes, Moustakas, the Scott Boras client who turned down a qualifying offer, whom MLB Trade Rumors projected to receive a $85mil/5year contract at the start of the offseason, who will be just 29 years and 199 days old come opening day, can’t seem to find a job. And, well, honestly, would you pay the man? Teams are too analytically savvy nowadays and every MLB executive has access to Fangraphs. If I’m Scott Boras I have the Royals and Braves on speed dial and I’m calling them every hour in the hopes of making magic happen. But if I’m Mike Moustakas, I’m investing in a really comfy couch and fine-tuning my March Madness bracket.


Omar Vizquel: G.O.A.T. Defender?

In the 2018 Hall of Fame balloting, Omar Vizquel received 37% of the vote in his first year on the ballot.  This implies strong voter support, and a high likelihood of being inducted into the Hall in the coming years.  The problem, as has been noted by many writers including Craig Edwards here at Fangraphs, is that Omar Vizquel was not a good offensive player.  Edwards compares Vizquel to other below-average offensive producers already inducted into the Hall and concludes:

“It seems necessary to point out that Vizquel’s [offensive] deficiency wasn’t a run-of-the-mill weakness. If elected to the Hall of Fame, he might be the worst offensive player there.”

Of course, Vizquel is not getting support for the Hall of Fame based on his offensive reputation. He’s known as a great defender. Yet advanced stats seem to indicate in no uncertain terms that the value Vizquel provided with his glove was not nearly enough to make him a Hall of Famer. According to JAWS, a system developed by Jay Jaffe to evaluate Hall of Fame worthiness, Vizquel is about as strong a candidate as Hanley Ramirez, Dave Concepcion or Rafael Furcal; that is, he is not particularly worthy and it’s not particularly close. But those 37% of voters seem pretty insistent. What are they seeing that the statistics aren’t?

Vizquel was a mediocre offensive player, and that can’t be disputed. The ability of offensive statistics such as wRC+ and BsR to quantify historical offensive value and adjust for historical context is firmly established. Defensive statistics, on the other hand, remain controversial. Since 2003, when granular fielding data became available through Baseball Info Solutions, Baseball-Reference has used Defensive Runs Saved (DRS) in their WAR calculations, and Fangraphs has used Ultimate Zone Rating (UZR), both statistics derived from the BIS data. I believe that both are good metrics for evaluating defense, but they are far from perfect. Even further from perfect is the statistic used to calculate defensive WAR for both Baseball-Reference and Fangraphs for seasons prior to 2003, Total Zone (TZ), which is calculated using Retrosheet play-by-play data. There has been criticism of the use of these statistics for historical comparison, including by Bill James, who argues against Andruw Jones’ defensive-value-based case for the Hall by stating that older defensive metrics such as TZ are more conservative in their allotment of value because the data are too limited to quantify exceptional performance. He argues that comparing players evaluated by new metrics to players evaluated by old metrics is comparing apples to oranges, that the methodologies are too different, and that their accuracy is too poorly understood for strong arguments about players to be based on them.

Vizquel was 36 when UZR and DRS became available in 2003, and as such his prime years are all being evaluated by TZ. Here are the defensive runs valuations across his career, per Fangraphs, bucketed into ranges of years where the statistics are stable:

Year Age Innings Fielding Fielding/1500 Metric
1989-1994 22-27 5833.2 66.0 17.0 TZ
1995-2001 28-34 8987.0 18.0 3.0 TZ
2002-2007 35-40 6880.1 41.0 8.9 UZR
2008-2012 41-45 2617.1 6.2 3.6 UZR
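The Fielding/1500 column is just fielding runs scaled to 1,500 innings. A minimal sketch of that arithmetic follows, assuming the Innings column uses the usual baseball notation in which `.1` and `.2` mean one and two outs (thirds of an inning); the function names are made up for illustration.

```python
def innings_to_float(innings: float) -> float:
    """Convert baseball innings notation (x.1 = one out, x.2 = two outs) to true innings."""
    whole = int(innings)
    outs = round((innings - whole) * 10)  # the digit after the decimal point counts outs
    return whole + outs / 3


def runs_per_1500(fielding_runs: float, innings: float) -> float:
    """Scale a fielding-runs total to a 1,500-inning workload."""
    return fielding_runs / innings_to_float(innings) * 1500


# (years, innings, fielding runs) copied from the table above
buckets = [
    ("1989-1994", 5833.2, 66.0),
    ("1995-2001", 8987.0, 18.0),
    ("2002-2007", 6880.1, 41.0),
    ("2008-2012", 2617.1, 6.2),
]

rates = {years: round(runs_per_1500(runs, inn), 1) for years, inn, runs in buckets}
for years, rate in rates.items():
    print(years, rate)
```

Rounded to one decimal, the results match the Fielding/1500 column in the table.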

So the metrics here are telling us that early in his career, Vizquel was a top-of-the-league defender, then dipped to a slightly above-average defender for his late 20s and early 30s. Then he pops back up to great for his late 30s when UZR kicks in, and dips back to slightly above average for his 40s. This is odd, especially with how Vizquel falls off a cliff in his late 20s, then returns to form in his late 30s. It is important to note that over half of the defensive runs accumulated in the 2002-2007 interval come from a 23-run 2007 season, the best single-season total of his career. Did Omar Vizquel have far-and-away his best defensive season as a 40-year-old on the Giants? Maybe. Things happen. But probably not, right? Was Omar Vizquel a much better defensive infielder in his late 30s than in his late 20s? Maybe. It’s possible. But that doesn’t really make sense, does it?

I’m not showing this to discredit defensive statistics. I’m just trying to illustrate that there’s a wide margin of error that we’re dealing with here, along with the further complication of a change in metrics halfway through Vizquel’s career. Is it possible that Omar Vizquel’s Hall of Fame case is being lost in all that? Let’s see. Let’s say we don’t trust Vizquel’s defensive metrics at all. Let’s say that all we trust are the distributions of valuations that defensive metrics assign to each year’s pool of players. Let’s give Omar Vizquel as many defensive runs as he needs to be a Hall of Famer, and then let’s look at what that implies about how good he would have had to have been, relative to the league. For instance, if Vizquel with his added value now has the career defensive numbers of Mark Belanger, and you want to argue that he was actually as good as Mark Belanger defensively, then you can also argue Vizquel is Hall-worthy.

For this exercise, I’m going to define Hall of Fame worthiness as the average JAWS of Hall of Fame shortstops, 54.8.  JAWS is calculated by averaging a player’s career WAR and best 7 seasons worth of WAR.  I needed to get Vizquel’s 34.2 JAWS up to 54.8 by adding only fielding runs.  To accomplish this, I threw away Vizquel’s metrics and assumed that he produced fielding runs at a constant per-inning rate throughout his career.  I then took into account aging by adding a linear 3% decrease in this rate starting at age 33.  Then, using the values of his other WAR components provided in his Value table on Fangraphs, I was able to calculate his career and peak WAR for different per-inning fielding runs rates.  To be clear, I kept all of his career values estimated by Fangraphs the same, including his positional adjustment.  I have him playing the exact same number of innings that he did in real life.  The only thing changing here is the rate at which he produced fielding runs.  The rate that got him to 55 JAWS turned out to be 0.019 Fielding Runs/Inning.  Here’s what that looks like in terms of WAR:

Player WAR WAR7 JAWS
JAWS SS Average 66.7 42.8 54.8
Vizquel Actual 42.6 25.8 34.2
Vizquel Proposed 71.9 38.1 55.0
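The JAWS figures in this table follow directly from the definition given above (the average of career WAR and best-seven-seasons WAR). The sketch below recomputes them and also shows one possible reading of the aging rule. The 3%-per-season interpretation, the 1,400-inning season and the function names are assumptions for illustration, not the exact calculation used for the table.

```python
def jaws(career_war: float, peak7_war: float) -> float:
    """JAWS as described in the text: mean of career WAR and best-seven-seasons WAR."""
    return (career_war + peak7_war) / 2


def fielding_rate(age: int, base_rate: float = 0.019,
                  yearly_decline: float = 0.03, decline_start: int = 33) -> float:
    """Fielding runs per inning at a given age.

    Assumed reading of the aging rule: the rate is flat before `decline_start`,
    then loses 3% of the base rate for each season from that age onward.
    """
    seasons_declining = max(0, age - decline_start + 1)
    return base_rate * max(0.0, 1 - yearly_decline * seasons_declining)


actual = jaws(42.6, 25.8)      # Vizquel Actual row
proposed = jaws(71.9, 38.1)    # Vizquel Proposed row
added_war = 71.9 - 42.6        # extra career WAR the proposed version needs

print(f"actual JAWS {actual:.1f}, proposed JAWS {proposed:.1f}, added WAR {added_war:.1f}")

# Hypothetical 1,400-inning season, to show how the assumed decline plays out
for age in (28, 33, 40):
    print(age, round(fielding_rate(age) * 1400, 1))
```

Under this reading, the per-inning rate at age 40 is 76% of the base rate, so the proposed Vizquel still produces most of his peak fielding value very late in his career.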

Did I just give Omar Vizquel 29.3 more career WAR?  Yes, it appears so.  Here is what my “proposed”, hypothetical Vizquel fielding runs totals look like compared to his actual runs.

That seems like a whole lot of extra fielding runs, doesn’t it?  An unrealistically high amount, perhaps?  Well, let’s see.  Below, I plotted the proposed and actual defensive runs (with the positional adjustment added) on top of violin plots of the distribution of defensive runs for all players in the league each year.  The proposed Vizquel seasons are red triangles, while the actual Vizquel seasons are the blue squares.

What we’re seeing here is that for my proposed Vizquel defensive seasons, he would be at or near the top of the league nearly every year for about 20 straight years, apart from two seasons where his playing time was down due to injury. So, it looks like Vizquel needs to have been pretty damn good at defense to be Hall-worthy. Here is where he would rank among the league each year with my proposed defensive runs totals, along with where he actually ranked, and the proposed and actual runs totals.

Year Proposed Lg. Rank Actual Lg. Rank Proposed Def. Runs Actual Def. Runs
1989 4 31 28.3 12.9
1990 17 16 17.1 17.2
1991 2 5 28.6 21
1992 3 7 29.0 20.1
1993 1 3 33.3 24
1994 8 47 15.4 7.8
1995 2 47 29.9 8.3
1996 2 56 33.0 9.1
1997 1 55 32.9 10.1
1998 3 20 33.1 17.1
1999 4 14 30.6 21.5
2000 1 91 31.3 5.8
2001 2 230 30.4 -1.2
2002 1 132 28.9 3.7
2003 30 78 12.0 7.3
2004 4 87 26.5 5.6
2005 2 20 26.7 13.5
2006 2 14 25.8 16.1
2007 6 1 23.9 30.2
2008 32 81 12.5 6.3
2009 80 39 7.0 11.2
2010 34 294 11.6 -3.5
2011 96 269 5.3 -2
2012 111 171 4.8 1.7
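The rank tallies and career totals quoted around this table can be recomputed from its rows. This is only a checking sketch; the tuple layout and variable names are made up for illustration.

```python
# (year, proposed rank, actual rank, proposed def. runs, actual def. runs), from the table above
rows = [
    (1989, 4, 31, 28.3, 12.9), (1990, 17, 16, 17.1, 17.2), (1991, 2, 5, 28.6, 21.0),
    (1992, 3, 7, 29.0, 20.1), (1993, 1, 3, 33.3, 24.0), (1994, 8, 47, 15.4, 7.8),
    (1995, 2, 47, 29.9, 8.3), (1996, 2, 56, 33.0, 9.1), (1997, 1, 55, 32.9, 10.1),
    (1998, 3, 20, 33.1, 17.1), (1999, 4, 14, 30.6, 21.5), (2000, 1, 91, 31.3, 5.8),
    (2001, 2, 230, 30.4, -1.2), (2002, 1, 132, 28.9, 3.7), (2003, 30, 78, 12.0, 7.3),
    (2004, 4, 87, 26.5, 5.6), (2005, 2, 20, 26.7, 13.5), (2006, 2, 14, 25.8, 16.1),
    (2007, 6, 1, 23.9, 30.2), (2008, 32, 81, 12.5, 6.3), (2009, 80, 39, 7.0, 11.2),
    (2010, 34, 294, 11.6, -3.5), (2011, 96, 269, 5.3, -2.0), (2012, 111, 171, 4.8, 1.7),
]

proposed_top10 = sum(1 for r in rows if r[1] <= 10)   # seasons ranked in the top ten
proposed_first = sum(1 for r in rows if r[1] == 1)    # seasons ranked first
actual_top10 = sum(1 for r in rows if r[2] <= 10)
proposed_total = round(sum(r[3] for r in rows), 1)    # career proposed defensive runs
actual_total = round(sum(r[4] for r in rows), 1)      # career actual defensive runs

print(proposed_top10, proposed_first, actual_top10, proposed_total, actual_total)
```

The proposed column sums to 557.9 runs against 263.8 actual, which is where the "twice as good as the metrics say" framing comes from.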

My proposed Vizquel seasons put him as a top-10 defender in the league 17 times, and at number one four times. That’s a lot of times! One might say way too many to realistically expect! Hmmm… Now let’s look at how my proposed Vizquel’s career defensive value stacks up against all post-War non-catchers. This table was taken from the Craig Edwards piece cited at the start of my article, by the way.

Most Defensive Runs Above Average
Player Def
Omar Vizquel Proposed 557.9
Ozzie Smith 375.3
Brooks Robinson 359.8
Mark Belanger 345.6
Cal Ripken 310.1
Luis Aparicio 302.7
Andruw Jones 281.3
Omar Vizquel Actual 263.8
Adrian Beltre 226.1

Yowza! That’s a lot of runs!

If the conclusion of this analysis isn’t obvious by now, here it is:  To make Omar Vizquel a Hall of Famer by boosting his fielding numbers, you have to make him really, REALLY good at defense.  Like capitalized, bolded, italicized REALLY good.  Twice as good as the metrics say.  182 runs better than Ozzie Smith.  You have to believe that he performed as a top-10 defender in the league from age 22 to age 40.  You’re saying he was peak-Andrelton Simmons for nearly two decades.  To argue Vizquel is worthy of the Hall of Fame, given his offensive value is what it is, you’ll have to argue that he was, by a considerable margin, the greatest defender of all time.

There are ways I could have made these proposed numbers a little more plausible. I could’ve concentrated Vizquel’s defensive value more into his seven peak seasons, which would’ve meant he needed less career WAR to achieve the same JAWS score, but that would’ve made the value of those peak years absolutely absurd. I could’ve lowered the bar, just trying to get him to, say, one standard deviation below the mean Hall of Fame shortstop JAWS score. But that puts his value in the territory of the Joe Tinkers, Hughie Jenningses and Dave Bancrofts of the world, whose own inclusion in the Hall is questionable. And I can’t see how doing any of these things would even get my proposed values down near Ozzie Smith. Ozzie Smith! Y’know, like, the greatest defensive shortstop of all time?

If you want to make the argument that Omar Vizquel is underrated by fielding metrics, that could very well be the case. He was a great player who played on some phenomenal teams, and it’s plausible the metrics aren’t getting his fielding numbers quite right. But just bumping up Vizquel a few runs here and there still isn’t going to get him anywhere near the Hall of Fame. The bottom line is that a player who runs an 83 wRC+ over 24 years in the majors has an enormous amount of ground to make up with his defense if he is going to be Hall-worthy.

If you want to make the case that he is a Hall of Famer based on his fielding, as 37% of Hall voters seem to have, you are also going to have to inflate the value of his fielding to the point of absurdity.  It’s important to note just how good you’re implying he was.


Has Barreled Contact Reached Statistical Stability?

When making evaluations on player ability in terms of their quantifiable actions, there comes a point when you have to take into consideration sample size to determine the validity of the numbers you’re seeing.

Take a batter who comes up 100 times and gets 27 hits. That’s a .270 batting average. Not bad. Another batter comes up 1000 times and gets 270 hits for the same .270 average. So, are both hitters the same? On the surface, yes. However, can you expect the hitter who came up 100 times to continue to hit .270? Is that a reliable number of at-bats to make an inference? Can we assume the batter with 1000 at-bats is more likely to continue to hit around .270 going forward? I believe we’d all agree, since this is pretty basic-level statistics, that the more at-bats, the more reliable the batting average.
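To put rough numbers on that intuition, here is a small sketch that treats each at-bat as an independent trial with a fixed hit probability (a simplifying assumption) and uses the normal approximation to build a 95% interval around the observed average.

```python
import math


def batting_average_interval(hits: int, at_bats: int, z: float = 1.96):
    """Approximate 95% interval for the true average, using the binomial standard error."""
    p = hits / at_bats
    se = math.sqrt(p * (1 - p) / at_bats)  # standard error shrinks with sqrt(at_bats)
    return p - z * se, p + z * se


low_100, high_100 = batting_average_interval(27, 100)
low_1000, high_1000 = batting_average_interval(270, 1000)

print(f"100 AB:  {low_100:.3f} to {high_100:.3f}")
print(f"1000 AB: {low_1000:.3f} to {high_1000:.3f}")
```

Under these assumptions the 100 at-bat hitter could plausibly be anything from a sub-.200 hitter to a .350 hitter, while the 1000 at-bat hitter is pinned to within roughly 28 points of .270.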

Statcast has a new-ish measurement of balls hit on the barrel of the bat, or ‘barrels’. This is useful because now we can see how well batters are squaring up on pitches.

Let’s say you have two different batters. One who bloops singles off the end of the bat or sneaks grounders past the infield may have a similar batting average to a guy who regularly rips hits into the outfield. So how would you judge the better hitter? They both (with exceptions) produce the same result. Would you go with the guy who regularly squares up on pitches, a hitter who is likely to produce more ‘effective’ hits? Or a batter who tends to hit the ball off the end of the bat, in on the hands, etc., and who tends to produce weak contact that could result in groundouts, pop-ups, etc.?

If you have to pick one to pinch hit, who would you rather have walking to the plate?

Before I roll up my sleeves, glance below at the type of contact MLB hitters have been producing on average the past three years.

contactType

What I’m going to do is determine if three years of data is enough to make an inference on what we can reasonably expect an average hitter to produce in terms of barrels per contact; have we reached a point where the three-year sample size is reliable to make inferences going forward?

First, I looked at the collection of batted ball events since 2015. Each year had roughly 900 hitters with at least one batted ball event. Altogether, that made a total ‘population’ of about 2700 hitters. I decided it would be easier and more informative to break it down year by year.

Using the 900-something batters per year, I wanted to develop a sample size from that group with a confidence interval no higher than five. Using the entire three-year ‘population’ of hitters would show results all over the board; the data became very volatile as the batted ball events decreased.

By requiring no fewer than 100 occurrences of contact, the data becomes more reasonable to scale. That 100-event cutoff is roughly 40% of the overall average of 253 batted ball events (BBE) per hitter (among hitters with at least one event). This keeps the group closer to hitters who had at least several dozen BBEs, rather than batters with only a few events, who produced large fluctuations.

You could ask “Why didn’t you take ALL the data and average it out?” Well, I could have. The problem I had was that the variation is incredibly high; too many of the 2700+ hitters had too few events to produce a reliable barrel rate. On a scatter plot, it tells us almost nothing.

Instead, I cut the ‘population’ down and required at least 100 BBEs. That gave me a total of 1170 players, or a little less than half of the entire 2015-2017 hitter population.

This is the scatter plot of BBEs (Y-axis, vertical) against total barreled hits (X-axis, horizontal) that was produced using those criteria.

chart (15)

In the above chart, the coefficient of determination (r²) equaled 0.161: not great, but certainly not trivial, as a measure of correlation between BBE and total barrels.

In layman’s terms, the more events you produce, the higher the expectation of having more barrels becomes. You could have made that inference without the chart; however, I was curious to see if the increase was as sharp as I expected it to be (it wasn’t).
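For readers who want to see what goes into that number, r² for a simple linear trend line is the squared Pearson correlation between the two variables. The sketch below uses a handful of made-up hitters (not real Statcast rows) purely to show the calculation for both total barrels and barrel rate.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. r² of a simple linear fit of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy ** 2) / (sxx * syy)


# Hypothetical hitters: (batted ball events, barrels). Illustrative values only.
hitters = [(120, 4), (210, 19), (305, 12), (410, 41), (480, 22), (530, 35)]
bbe = [b for b, _ in hitters]
barrels = [k for _, k in hitters]
barrel_rate = [k / b for b, k in hitters]

print(f"r² of BBE vs total barrels: {r_squared(bbe, barrels):.3f}")
print(f"r² of BBE vs barrel rate:   {r_squared(bbe, barrel_rate):.3f}")
```

An r² near 1 means the trend line explains almost all of the variation; an r² near 0 means knowing one variable tells you almost nothing about the other.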

So I wanted a more reliable correlation, as it is logical to assume that the more you do something, the more often you achieve your goal.

I then plotted barrels as a percentage of BBEs (X-axis) against total BBEs (Y-axis). I feel that ratio produces a much more accurate relationship.

chart (16)

This time, the r² equaled just 0.006, a nearly flat and much more stable trend, with several outliers present. The further you look down from those outliers, the more concentrated the chart. For the most part, roughly 80% of the plot points are at 10% or below. The hitters above that 10% mark would be baseball’s elite power hitters.

It appears we may have concrete proof of normalization.

So, for now, we can assume that your average batter can expect to have maybe 5%-7% barrels per contact; slightly more as your contact events increase.

But, let’s break it down a bit so we can say with certainty that this ratio is dependable for hitters going forward. I wanted to keep the sample size the same throughout the three years of collected Statcast data; 66%, or 395 batters.

We’ll start with 2015.

Below I took the total population of 915 batters in 2015 and used a confidence interval of 4.89 to get the sample size of 395. And, as with all subsequent charts, I worked with a 99% confidence level.
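The text does not say which calculator produced the sample size, so the following is only a sketch under an assumption: the standard survey formula with a finite population correction and the most conservative proportion (p = 0.5). With those assumptions it reproduces the 2015 figure.

```python
from statistics import NormalDist


def sample_size(population: int, margin_pct: float,
                confidence: float = 0.99, p: float = 0.5) -> int:
    """Sample size for a given margin of error (in percentage points) and confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided critical value (about 2.576 at 99%)
    e = margin_pct / 100
    n0 = z ** 2 * p * (1 - p) / e ** 2               # size needed for an infinite population
    return round(n0 / (1 + (n0 - 1) / population))   # finite population correction


n_2015 = sample_size(population=915, margin_pct=4.89)
print(n_2015)
```

Tightening the margin or raising the confidence level pushes the required sample up; shrinking the population pulls it down.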

With all remaining charts, the X-axis is barrels as a percentage of BBEs and the Y-axis is the BBEs.

chart (17)

For 2015, the coefficient of determination is 0.032 with maybe nine outliers. There is a minor amount of regression but mostly a stable trend line. And, we see the line staying within a 7%-9% ratio of barrels to BBEs.

Here is 2016’s data; a population of 909 hitters with a 5.00 confidence interval.

chart (18)

Now, even with an r² similar to 2015’s (0.039), we are starting to get larger variation and a few more outliers. Yet the trend line again regresses, this time slightly more sharply.

For 2017, 905 total hitters and a confidence interval of 4.88.

chart (19)

2017 comes across as a mess of variation with dozens of outliers. The trend line produced an r2 of 0.007. And, in contrast to the previous years, there wasn’t a regressive trend as BBEs became more frequent; it actually shows a slight increase.

What does that mean? No idea. Could it be that, now that this information is available, hitting coaches are working with batters to improve their contact? It’s a shot in the dark, but I can’t come up with a better inference.

Now, let’s combine each year’s sample (1175 batters) and use a confidence interval of 4.9 (the average CI of the three years of study) to come up with a sample size of 66%, or 552 batters.

chart (20)

Now we have a very stable trend (with a negligible increase) and a 0.003 coefficient of determination, with some variation and exceptions at a rate of 10%.

Most of those outliers from the graphs are represented in the following chart. And, of those aberrations, several appear in all three groups.

3YearBBE

So, the question is whether or not the available Statcast data on barrels is considered stabilized after three years; can we reliably scale a batter’s barrel rate? Do we have a reliable sample size for hitters?

It looks as though we do.

After three years, the overall trend line(s) appear to be somewhat stable in the 5-8% window for an average batter; we can expect most hitters to be at or below 10% barrels per batted ball event.


Baseball Prospectus’ New Metric Has a Bartolo Colon Problem

Last Tuesday, Baseball Prospectus rolled out three new metrics for evaluating pitcher performance – Power (PWR), Command (CMD) and Stamina (STM). I was particularly drawn to the PWR metric, which is described as a way of evaluating how much a pitcher fits into the “power pitcher” archetype. It’s an intriguing and novel approach to evaluating and classifying pitchers, and I think it’s a great new lens for looking at pitchers. But when I looked a little closer at the 2016 PWR scores, something jumped out at me.

Baseball Prospectus PWR Leaders, SP, 2016

Oh no.

Bartolo. Colon.

Oh no.

Bartolo Colon is not a power pitcher. Bartolo Colon is the exact opposite of a power pitcher. Bartolo’s peak fastball velocity by Baseball Prospectus’s metrics was 72nd out of 84 pitchers with 150 IP in 2016. Bartolo does not blow anyone away with his 90 MPH fastball; he relies on pinpoint placement to generate whiffs and mixes in offspeed stuff to generate weak contact (indeed, Colon ranked 7th in CMD in 2016).

PWR is still an effective measurement: look at all of the other pitchers it (correctly) classifies as power pitchers. But there does not exist an interpretation of the phrase where one could think of Colon’s 2016 as emblematic of a power pitcher. So what gives? Can PWR be adjusted to relieve it of its Bartolo Colon problem?

As with a computer program, if I want to debug this, I have to know how PWR works. Fortunately, BP tells us how PWR is calculated in a fairly straightforward manner:

As of right now, our Power Score is comprised of these three identifiable parts: Fastball velocity (three parts), fastball percentage (two parts), and the velocity of all offspeed pitches (one part). There are some other factors that we considered when developing this metric—such as the tendency to work up in the zone, and to lean on fastballs in put-away counts—but the current version of this metric only includes the three main components discussed above.

While I don’t have access to BP’s exact numbers used for calculating PWR, I rigged up a rough approximation using the PITCHf/x numbers available on FanGraphs by normalizing each of the above components and weighting them as described above. I plotted my values (xPWR) against BP’s (PWR) and they look reasonable, so I’ll try to use xPWR to mess around and see if I can resolve PWR’s Bartolo Colon issue while maintaining their current level of accuracy for evaluating actual power pitchers.

xPWR vs. PWR
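A rough sketch of that kind of approximation: convert each component to a z-score, combine the z-scores with the 3:2:1 weights from BP’s description, and rescale. The four pitchers, the 50/10 rescaling and the function names are all assumptions for illustration; this is not BP’s calculation or the exact xPWR behind the plot.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a column: subtract the mean, divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def xpwr(fb_velo, fb_pct, offspeed_velo, weights=(3, 2, 1), center=50, spread=10):
    """Weighted average of component z-scores, rescaled to an assumed 50 +/- 10 scale."""
    columns = [z_scores(col) for col in (fb_velo, fb_pct, offspeed_velo)]
    total_weight = sum(weights)
    scores = []
    for i in range(len(fb_velo)):
        raw = sum(w * col[i] for w, col in zip(weights, columns)) / total_weight
        scores.append(center + spread * raw)
    return scores


# Hypothetical pitchers. D throws the slowest fastball but throws it 90% of the time.
names = ["A", "B", "C", "D"]
fb_velo = [97, 93, 91, 88]        # average fastball velocity, mph
fb_pct = [60, 55, 50, 90]         # fastball usage, percent
offspeed_velo = [88, 84, 82, 81]  # average offspeed velocity, mph

scores = dict(zip(names, xpwr(fb_velo, fb_pct, offspeed_velo)))
lighter_fb_pct = dict(zip(names, xpwr(fb_velo, fb_pct, offspeed_velo, weights=(3, 1, 1))))
for name in names:
    print(name, round(scores[name], 1), round(lighter_fb_pct[name], 1))
```

In this toy data, the extreme fastball usage is enough to lift pitcher D above pitcher C under the 3:2:1 weights even though D has the slowest fastball and offspeed pitches; passing a smaller fastball-usage weight reverses that order.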

2016 Colon has an xPWR score of 54, not 59, and he’s only 20th in xPWR, which doesn’t seem so bad until you realize that Colon’s xPWR puts him squarely between Jose Fernandez and Max Scherzer. Colon needs dramatic adjustment, and hopefully, the adjustments I make in terms of xPWR can be translated to PWR as well.

The best way to fix a problem is to address the cause, so why is Colon registering an abnormally high PWR score? The main culprit is likely his Fastball%. Here are the leaders in FB% from 2016:

MLB FB% Leaders (2016)
Name FB% PWR xPWR
Bartolo Colon 89.5% 59 54
Aaron Sanchez 74.3% 58 64
J.A. Happ 73.5% 56 57
Robbie Ray 71.1% 63 62
Jimmy Nelson 71.0% 59 60
Jose Quintana 66.5% 46 50
Kevin Gausman 66.3% 59 63
Ian Kennedy 66.2% 51 52
Doug Fister 65.8% 36 37
Brandon Finnegan 65.6% 52 52

I know that FB% is worth about one-third of PWR, and Bartolo is in a league of his own when it comes to FB%. Hence, the most likely culprit appears to be Colon’s insane FB%. There have only been two seasons where pitchers threw 2000+ pitches in a season and posted an FB% above 89%, and both belong to Bartolo Colon – 2012 and 2016. Starters (and to a large extent, relievers) do not typically rely upon their fastballs so much, and since Colon is such an outlier, using normalized scores makes him stand out in a big way. The closest any starter came to Colon’s crazy FB% values was Henderson Alvarez in 2014 (82.7%), so Colon receives a (rather unfair) bonus in PWR scores for throwing so many fastballs, one that makes up for his lack of velocity. Colon cheats the PWR metric by throwing pitches that are technically fastballs and are classified as such but aren’t nearly as fast as a traditional fastball. The flaw in PWR is that it assumes that any pitch classified as a fastball is, well, fast – but this isn’t the case for Bart, and so he presents an anomaly.

Perhaps I can rectify giving Bart such an advantage by reducing the weight of FB% – if I drop the weight on FB% to one part instead of two, our top pitchers (min 150 IP) by xPWR (v2) look like this:

MLB xPWR (v2) Leaders (min 150 IP, 2016)
Pitcher xPWR
Noah Syndergaard 65
Carlos Martinez 62
Yordano Ventura 62
Aaron Sanchez 62
Robbie Ray 61
Michael Fulmer 60
Jon Gray 59
Danny Duffy 59
Jose Fernandez 59
Carlos Rodon 57

And here are our best xPWR scores for relievers (min 40 IP):

MLB xPWR (v2) Leaders (min 40 IP, 2016)
Pitcher xPWR
Aroldis Chapman 87
Arquimedes Caminero 78
Trevor Rosenthal 74
Zach Britton 73
Pedro Baez 73
Carlos Estevez 72
Craig Kimbrel 72
J.C. Ramirez 72
Edwin Diaz 71
Hunter Strickland 70

Note that I scaled the original values to best match the scale of PWR.

Colon has, rather ignominiously, dropped out of the top ten, with his xPWR (v2) falling all the way to 45, the same as John Lackey and Jake Odorizzi. The leaders in xPWR (v2) all fit the profile of a power pitcher (hard throwers, fast offspeed stuff, heavy reliance on the fastball), and Colon can’t cheat the metric as much. But at the same time, we’re still committing the same mistake as the original PWR metric in assuming that fastballs are thrown hard, just to a lesser degree. Maybe we should revamp our approach to the PWR metric.

Perhaps we can simply use average pitch speed across all pitches. This approach rewards pitchers for simply throwing hard and doing so frequently. If I use total average pitch velocity and normalize those values to fit with PWR, Bart’s exploit of FB% can’t work. At the same time, taking a straight average of pitch velocity and normalizing it incorporates all of the tenets of PWR (fastball velocity, FB%, and offspeed velocity), so we’re staying true to the spirit of the original metric. Let’s use this approach for xPWR (v3).
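As a small illustration of why this closes the loophole, average pitch velocity is just each pitch type’s velocity weighted by how often it is thrown. The two pitch mixes below are hypothetical, not real PITCHf/x data, and the function name is made up.

```python
def average_pitch_velocity(pitch_mix):
    """Usage-weighted mean velocity; pitch_mix is a list of (usage fraction, mph) pairs."""
    total_usage = sum(usage for usage, _ in pitch_mix)
    return sum(usage * mph for usage, mph in pitch_mix) / total_usage


# Hypothetical pitch mixes
soft_tosser = [(0.90, 88.0), (0.10, 81.0)]    # throws a slow fastball 90% of the time
power_pitcher = [(0.60, 97.0), (0.40, 88.0)]  # harder fastball, harder offspeed

soft_avg = average_pitch_velocity(soft_tosser)
power_avg = average_pitch_velocity(power_pitcher)
print(round(soft_avg, 1), round(power_avg, 1))
```

Throwing a slow fastball more often now pulls the average toward that slow velocity instead of earning a separate bonus for usage.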

Here are the leaders for 2016 in xPWR (v3) among pitchers with 150+ IP…

MLB xPWR (v3) Leaders (min 150 IP, 2016)
Pitcher xPWR
Noah Syndergaard 73
Aaron Sanchez 63
Carlos Martinez 63
Michael Fulmer 63
Robbie Ray 62
Yordano Ventura 61
Jon Gray 60
Jimmy Nelson 60
Jeff Samardzija 60
Carlos Rodon 60

… and relievers with 40+ IP.

MLB xPWR (v3) Leaders (min 40 IP, 2016)
Pitcher xPWR
Aroldis Chapman 88
Arquimedes Caminero 79
Zach Britton 77
Trevor Rosenthal 75
Pedro Baez 73
Jeurys Familia 73
Carlos Estevez 73
J.C. Ramirez 72
Craig Kimbrel 72
Edwin Diaz 71

And what of our good friend Bartolo? Colon’s xPWR (v3) score falls around 47, the same range as Kyle Gibson and Felix Hernandez.

This third method gives us a lot less range in terms of scores, so it’s more difficult to differentiate between players – but at the same time, it does just as good a job of identifying pitchers who fall into the power-pitcher archetype while leaving out those who do not.

Is PWR “broken” in its current state? Of course not. Almost every metric has a few players who can cheat it one way or another. Colon happens to be extremely good at cheating the PWR metric. With a couple changes, however, BP might be able to keep Colon from breaking into the top ten with a ridiculous PWR score while maintaining the integrity of the metric as a method of evaluating how well pitchers fit into the PWR archetype.


Building a Team of Free Agents on a Budget

There is no need to emphasize how bizarre this off-season has been. By this time last year, the best available free agents were Matt Wieters and Jason Hammel. This year, there are enough available free agents to create an All-Star team. With that in mind, I began to wonder if a team could actually be competitive by signing 25 free agents. A super-team of current free agents would undoubtedly contend this year. However, it would also require a payroll in the range of $300MM. If such a team had to stay within the luxury tax threshold, it would need to make significant cuts.

To satisfy my curiosity, I made a spreadsheet of WAR and salary projections for all of the remaining free agents. I attempted to construct the best teams possible within a variety of budgets, and compared my projected WAR totals to teams with similar payrolls. Constructing a great team was more challenging than I expected. I encourage readers to give it a try.

Note: Most contract values are based on a combination of reported offers, the MLBTR free agent predictions, and recent signings. It is likely that many of these players will sign deals that are far off my projections.

Download the Team Builder:

Click here or use the link below to download the team builder spreadsheet. The file should be titled “Free Agent Team Builder 2018”. I suggest using Excel; I haven’t tested it on other programs.

https://www.dropbox.com/s/aii5ewmhabpna8q/Free%20Agent%20Team%20Builder%20February%202018.xlsx?dl=0

Create your own free agent super-team, or see if you can build a competitive roster on a budget. Feel free to comment or share your team, and see how your team stacks up against mine.

Here are two examples of teams I created:

Small Market Team ($90MM Payroll)

Pos. Name 2017 WAR DC Proj. WAR My Proj. WAR Proj. AAV Years Total Value
Starting Lineup
CF Ben Revere 0.0 0.0 -0.1 $2.0MM 1 $2.0MM
SS Eduardo Nunez 2.2 1.7 2.2 $8.5MM 2 $17.0MM
3B Todd Frazier 3.0 2.4 2.3 $11.0MM 4 $44.0MM
DH Lucas Duda 1.1 1.8 1.3 $7.0MM 2 $14.0MM
RF Jose Bautista -0.5 0.0 0.2 $5.0MM 1 $5.0MM
LF Melky Cabrera 0.0 0.1 0.2 $3.0MM 1 $3.0MM
1B Mike Napoli -0.5 0.3 0.3 $2.5MM 1 $2.5MM
2B Chase Utley 1.3 0.0 0.1 $2.0MM 1 $2.0MM
C Carlos Ruiz 0.5 0.0 0.2 $2.5MM 1 $2.5MM
Bench
C Jose Lobaton -0.6 0.2 -0.1 $1.5MM 1 $1.5MM
IF Cliff Pennington 0.4 -0.1 0.1 $1.5MM 1 $1.5MM
OF Craig Gentry 0.1 0.0 0.1 $1.0MM 1 $1.0MM
OF Alex Presley 0.2 0.0 0.0 $1.0MM 1 $1.0MM
Rotation
SP Alex Cobb 2.4 1.7 2.0 $14.5MM 4 $58.0MM
SP Jeremy Hellickson 0.3 0.2 0.7 $5.5MM 1 $5.5MM
SP Brett Anderson 0.8 1.7 0.5 $5.0MM 1 $5.0MM
SP Jesse Chavez 0.3 0.1 -0.2 $3.0MM 1 $3.0MM
SP Nick Martinez 0.0 0.3 -0.2 $2.0MM 1 $2.0MM
Bullpen
CP Huston Street 0.1 -0.1 0.0 $2.0MM 1 $2.0MM
SU Tyler Clippard 0.2 -0.1 0.3 $3.0MM 1 $3.0MM
SU Fernando Abad 0.3 0.0 0.2 $1.5MM 1 $1.5MM
MR Luke Hochevar 0.0 0.0 0.3 $1.5MM 1 $1.5MM
MR Zac Rosscup 0.1 0.0 0.1 $1.0MM 1 $1.0MM
MR Shae Simmons 0.0 0.1 0.0 $1.0MM 1 $1.0MM
LR Henderson Alvarez -0.1 1.0 0.0 $1.5MM 1 $1.5MM
2017 WAR DC Proj. WAR My Proj. WAR
Total WAR 11.6 11.3 10.5
2018 Payroll and Total Commitments $90.0MM $182.0MM
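The totals under this roster can be rebuilt from its rows, which is roughly what the spreadsheet does. The sketch below copies the small-market table and checks it against the $90MM budget; the tuple layout and variable names are made up for illustration and are not the spreadsheet’s actual structure.

```python
# (pos, name, 2017 WAR, DC proj. WAR, my proj. WAR, AAV in $MM, years), from the table above
roster = [
    ("CF", "Ben Revere", 0.0, 0.0, -0.1, 2.0, 1),
    ("SS", "Eduardo Nunez", 2.2, 1.7, 2.2, 8.5, 2),
    ("3B", "Todd Frazier", 3.0, 2.4, 2.3, 11.0, 4),
    ("DH", "Lucas Duda", 1.1, 1.8, 1.3, 7.0, 2),
    ("RF", "Jose Bautista", -0.5, 0.0, 0.2, 5.0, 1),
    ("LF", "Melky Cabrera", 0.0, 0.1, 0.2, 3.0, 1),
    ("1B", "Mike Napoli", -0.5, 0.3, 0.3, 2.5, 1),
    ("2B", "Chase Utley", 1.3, 0.0, 0.1, 2.0, 1),
    ("C", "Carlos Ruiz", 0.5, 0.0, 0.2, 2.5, 1),
    ("C", "Jose Lobaton", -0.6, 0.2, -0.1, 1.5, 1),
    ("IF", "Cliff Pennington", 0.4, -0.1, 0.1, 1.5, 1),
    ("OF", "Craig Gentry", 0.1, 0.0, 0.1, 1.0, 1),
    ("OF", "Alex Presley", 0.2, 0.0, 0.0, 1.0, 1),
    ("SP", "Alex Cobb", 2.4, 1.7, 2.0, 14.5, 4),
    ("SP", "Jeremy Hellickson", 0.3, 0.2, 0.7, 5.5, 1),
    ("SP", "Brett Anderson", 0.8, 1.7, 0.5, 5.0, 1),
    ("SP", "Jesse Chavez", 0.3, 0.1, -0.2, 3.0, 1),
    ("SP", "Nick Martinez", 0.0, 0.3, -0.2, 2.0, 1),
    ("CP", "Huston Street", 0.1, -0.1, 0.0, 2.0, 1),
    ("SU", "Tyler Clippard", 0.2, -0.1, 0.3, 3.0, 1),
    ("SU", "Fernando Abad", 0.3, 0.0, 0.2, 1.5, 1),
    ("MR", "Luke Hochevar", 0.0, 0.0, 0.3, 1.5, 1),
    ("MR", "Zac Rosscup", 0.1, 0.0, 0.1, 1.0, 1),
    ("MR", "Shae Simmons", 0.0, 0.1, 0.0, 1.0, 1),
    ("LR", "Henderson Alvarez", -0.1, 1.0, 0.0, 1.5, 1),
]

budget_mm = 90.0
payroll = round(sum(p[5] for p in roster), 1)             # 2018 payroll: sum of AAVs
commitments = round(sum(p[5] * p[6] for p in roster), 1)  # total value of every contract
war_2017 = round(sum(p[2] for p in roster), 1)
war_dc = round(sum(p[3] for p in roster), 1)
war_mine = round(sum(p[4] for p in roster), 1)

print(len(roster), payroll, commitments, war_2017, war_dc, war_mine)
print("within budget:", payroll <= budget_mm)
```

Swapping players in and out of the list and re-running the sums is a quick way to try other rosters against a different budget.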

 

Big Market Team ($197MM Payroll)

Pos. Name 2017 WAR DC Proj. WAR My Proj. WAR Proj. AAV Years Total Value
Starting Lineup  
CF Jon Jay 1.6 0.5 1.5 $7.0MM 2 $14.0MM
SS Eduardo Nunez 2.2 1.7 2.2 $8.5MM 2 $17.0MM
RF J.D. Martinez 3.8 2.7 4.4 $25.0MM 6 $150.0MM
1B Eric Hosmer 4.1 2.8 2.7 $20.0MM 7 $140.0MM
3B Todd Frazier 3.0 2.4 2.3 $11.0MM 4 $44.0MM
LF Carlos Gonzalez -0.2 1.0 0.9 $10.0MM 1 $10.0MM
C Jonathan Lucroy 1.2 2.4 2.0 $10.0MM 2 $20.0MM
DH Melky Cabrera 0.0 0.1 0.2 $3.0MM 1 $3.0MM
2B Brandon Phillips 1.6 1.1 0.9 $6.0MM 1 $6.0MM
Bench
C Jose Lobaton -0.6 0.2 -0.1 $1.5MM 1 $1.5MM
IF Cliff Pennington 0.4 -0.1 0.1 $1.5MM 1 $1.5MM
OF Craig Gentry 0.1 0.0 0.1 $1.0MM 1 $1.0MM
OF Seth Smith 0.5 0.8 0.3 $2.5MM 1 $2.5MM
Rotation
SP Yu Darvish 3.5 3.6 3.8 $26.0MM 6 $156.0MM
SP Alex Cobb 2.4 1.7 2.0 $14.5MM 4 $58.0MM
SP Andrew Cashner 1.9 0.9 1.3 $8.5MM 2 $17.0MM
SP Jeremy Hellickson 0.3 0.2 0.7 $5.5MM 1 $5.5MM
SP Brett Anderson 0.8 1.7 0.5 $5.0MM 1 $5.0MM
Bullpen
CP Greg Holland 1.1 0.1 1.3 $11.5MM 3 $34.5MM
RP Seung Hwan Oh 0.1 0.2 0.4 $5.0MM 1 $5.0MM
SU Tony Watson 0.1 0.0 0.4 $5.0MM 2 $10.0MM
MR Tyler Clippard 0.2 -0.1 0.0 $3.0MM 1 $3.0MM
MR Fernando Abad 0.3 0.0 0.0 $1.5MM 1 $1.5MM
MR Luke Hochevar 0.0 0.0 0.0 $1.5MM 1 $1.5MM
LR Jesse Chavez 0.3 0.1 -0.2 $3.0MM 1 $3.0MM
2017 WAR DC Proj. WAR My Proj. WAR
Total WAR 28.7 24.0 27.7
2018 Payroll and Total Commitments $197.0MM $710.5MM

My Analysis: 

Based purely on WAR projections, my small market team would be the worst team in baseball. It’s worth noting that this spreadsheet has some flaws. Projected WAR would increase with a full 40-man roster, but so would payroll obligations. With that in mind, it is still doubtful this team would have a winning record. There is some potential upside throughout the roster, but my lineup is heavily reliant on veteran players returning to old form. The bullpen is probably the biggest weakness, but spending my budget on relief pitching would have been a tough decision to make.

The big market team is much more promising. However, my projected WAR would still rank them among the bottom tier of teams. With some added depth, I think this team would have a winning record, but contending for a World Series would be a bit of a reach. With such a high payroll, this team would likely also rank among the teams getting the lowest amount of value for each player.

In a way, I believe this project exemplifies one reason why the free agent market has been so stagnant. While these teams are respectable, they would project poorly compared to others. I knew beforehand it would be tough to build a team paying 25 players for past performance, but attempting to put these teams together helped me further appreciate the value of homegrown talent.

Now it’s your turn to build a team. Download the spreadsheet and give it a shot! If you have any thoughts or anything to add, please feel free to comment below.


Miguel Cabrera and the Inevitable Decline

Miguel Cabrera had a tough 2017. Could his decline be due to regression? Age? Could it be the back problems he allegedly played through? Or, was he just plain unlucky?

The knee-jerk assumption is health issues. From Jon Tayler (Sports Illustrated):

…it’s clear that, at age 34, his body is breaking down. On (September 24th 2017), Detroit learned that Cabrera, who had to leave Saturday’s game early with back pain, has been diagnosed with two herniated discs in his lower back, with manager Brad Ausmus telling reporters that his star may not play again this year. Back issues have been a problem for Cabrera since he played for Venezuela in the World Baseball Classic back in March and are the latest in a litany of aches and pains he’s dealt with since turning 30; as Ausmus put it, “This has probably slowly been developing for years.”

Baseball players break down. Some sooner (and more drastically) than others.  A player with Cabrera’s skill set can regress and still be above average.

So, let’s delve into regression and luck.

A quick overview of the last four seasons for Cabrera.

mCabrera1417

2017 was likely worse than anyone could have reasonably expected.

We’ll mostly work with Weighted On-Base Average. wOBA is a great tool that helps determine how productive a hitter has been.

It’s more informative than OBP as it uses weights to determine where a hitter ended up and what he accomplished when reaching base. OBP only tells us that the batter got on base. That might be enough for others, but some of us would like a little more context; no judgment on which you prefer.
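A small sketch makes the difference concrete. The linear weights below are illustrative (roughly the values FanGraphs published for the 2013 season; the real weights are recalculated every year), and both stat lines are hypothetical. Treat all of the numbers as assumptions.

```python
# Illustrative linear weights; real wOBA weights change slightly every season.
WEIGHTS = {"ubb": 0.690, "hbp": 0.722, "1b": 0.888, "2b": 1.271, "3b": 1.616, "hr": 2.101}


def obp(s):
    """On-base percentage: every time on base counts the same."""
    hits = s["1b"] + s["2b"] + s["3b"] + s["hr"]
    return (hits + s["bb"] + s["hbp"]) / (s["ab"] + s["bb"] + s["hbp"] + s["sf"])


def woba(s, weights=WEIGHTS):
    """Weighted on-base average: each way of reaching base gets its own weight."""
    ubb = s["bb"] - s["ibb"]  # unintentional walks
    numerator = (weights["ubb"] * ubb + weights["hbp"] * s["hbp"]
                 + weights["1b"] * s["1b"] + weights["2b"] * s["2b"]
                 + weights["3b"] * s["3b"] + weights["hr"] * s["hr"])
    return numerator / (s["ab"] + s["bb"] - s["ibb"] + s["sf"] + s["hbp"])


# Two hypothetical hitters with identical hit and walk totals
singles_hitter = {"ab": 500, "1b": 150, "2b": 0, "3b": 0, "hr": 0, "bb": 50, "ibb": 0, "hbp": 0, "sf": 0}
slugger = {"ab": 500, "1b": 90, "2b": 30, "3b": 0, "hr": 30, "bb": 50, "ibb": 0, "hbp": 0, "sf": 0}

for name, line in (("singles hitter", singles_hitter), ("slugger", slugger)):
    print(f"{name}: OBP {obp(line):.3f}, wOBA {woba(line):.3f}")
```

Both hitters reach base at exactly the same rate, so OBP cannot tell them apart, while wOBA credits the slugger for the extra bases.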

For regression’s sake, let’s look at Cabrera’s career wOBA against the league average in terms of age.

chart (11)

As we can see, Cabrera had a wOBA well above average for a player his age. According to the chart data, league-average wOBA seems to peak at ages 29 and 32: .319 for 29-year-olds and .320 for 32-year-olds.

Once Cabrera hit 34, he crashed back to earth and managed a league average wOBA.

Perhaps 2017 was an anomaly, a result of bad luck? Here’s a glance at his batting average on balls put in play. His career BABIP is .344, and in 2017 he posted a .292, slightly worse than league average.

chart (12)

Cabrera’s BABIP remained steady, save for a few fluctuations, then plunged into mediocrity. So he was just unlucky…right?

We can look at this graph and see from age 19 to 23 it continually climbed, then dove down at 25. The chart follows a similar trend, with a bit more volatility, from age 26 to 32. Can we infer it will trend upward again? Since it would be quite a feat for his BABIP to go on another positive run, I’d venture to guess that, at this point, it will stabilize.
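For reference, BABIP is normally computed as hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The stat line in this sketch is hypothetical, not Cabrera’s.

```python
def babip(hits: int, home_runs: int, at_bats: int, strikeouts: int, sac_flies: int) -> float:
    """Batting average on balls in play: home runs and strikeouts are excluded."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


# Hypothetical season line
example = babip(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")
```

Because only balls in play count, a few dozen extra bloops or hard-hit outs can swing a single season’s BABIP noticeably, which is why it is often read as a rough luck indicator.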

So, can we blame the injury now?

Well, we can’t measure how much his back problems affected his hitting. We can take it into consideration, but we don’t know how much it was actually bothering him. He managed to play in 130 games, so it’s hard to say it was that much of a problem for him. I would presume, depending on the pride of the player, that as a player gets older he wants to protect his body more and give it more rest. Cabrera is obviously a tough guy, as he averages something like 150 games a season. Knowing that the organization was sliding into obscurity, maybe he felt it was his duty to keep playing for the fans.

That’s a lot of maybes.

Other than his rookie year, he’s never played fewer than 100 games in a season. He had injuries in 2015: he ended up on the 15-day DL on July 4th with a calf injury and was listed as day-to-day with back soreness on September 23rd.

Regardless, the sharpness of the wOBA decline is what I find disconcerting. His biggest previous drop in wOBA occurred between the ages of 29 and 30, a drop of about .070. Then, going from age 33 to 34, it dropped .086. To note, the league-average wOBA actually increases two-hundredths of a point from 33 to 34.

So why did this happen? We’re going to investigate Cabrera’s wOBA versus his xwOBA for 2017.

To summarize xwOBA: it describes what was expected to happen, based upon the type of contact, versus what actually happened.

*Already know xwOBA? Skip down to the chart

Aaron Judge drives a ball into the left-field gap at a certain launch angle and exit velocity. Let’s say he hits it into an average outfield and it drops in for a double. Alternatively, Mike Trout drives a ball under the same conditions, but Billy Hamilton is playing center field. Since Hamilton has elite speed and is a good defender, he catches the same type of ball that Judge dropped in between inferior defenders.

One other thing I want to point out about xwOBA is how it handles speed. Albert Pujols is not a fast runner; he’s much slower than average. As a result, he’s more inclined to hit into double plays and less able to leg out an infield single. A ball hit with the same trajectory by other players might be beaten out.

I understand that speed is a factor in a game, but xwOBA is built from the contact rather than the runner. Given the likelihood of that ground ball going for an infield single, xwOBA would credit a player like Pujols as if he were expected to leg it out. That aspect could be seen as a flaw, depending on your point of view.

This might be oversimplifying the concept…or making it even more confusing. And it might not be an exact science, but it’s pretty darn close.
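One rough way to picture the idea in code: group batted balls into buckets of similar exit velocity and launch angle, then give every ball in a bucket the average result of that bucket. This is only a sketch of the concept and not how Statcast actually computes xwOBA; the bucket sizes, the `bucket` and `expected_values` helpers, and all the batted-ball values below are made up for illustration.

```python
from collections import defaultdict


def bucket(exit_velocity, launch_angle, ev_step=5, la_step=10):
    """Coarse contact bucket; the step sizes are arbitrary assumptions."""
    return (int(exit_velocity // ev_step), int(launch_angle // la_step))


def expected_values(league_batted_balls):
    """Average observed wOBA value per bucket of similar contact."""
    totals = defaultdict(lambda: [0.0, 0])
    for ev, la, value in league_batted_balls:
        entry = totals[bucket(ev, la)]
        entry[0] += value
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical league sample: (exit velocity mph, launch angle deg, wOBA value).
league = [(105.0, 18.0, 1.25), (106.0, 17.0, 0.0),
          (107.0, 12.0, 1.25), (108.0, 15.0, 0.9)]
table = expected_values(league)

# Same contact, different outcomes: one ball falls for a double, one is caught.
judge_actual, trout_actual = 1.25, 0.0
expected = table[bucket(106.0, 16.0)]
print(f"expected {expected:.2f}; actual {judge_actual} and {trout_actual}")
```

Both hitters get the same expected value because the contact was the same; only the actual results differ, which is the gap the Judge and Trout example describes.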

Here’s a chart comparing Cabrera to other hitters who saw a gap between their xwOBA and their (actual) wOBA in 2017.

2017XWOBA

One thing stands out: Cabrera, by two-hundredths of a point, is well ahead of the other nine in terms of the difference. There is a bit of a dip from Brandon Moss to Logan Forsythe as well, but it is not as drastic.

Going back to 2016, Billy Butler had the biggest drop at -.058; Cabrera finished fourth with -.050 (.459 xwOBA/.409 wOBA).
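The differential itself is just actual minus expected. A minimal sketch using the 2016 figures quoted above (the `woba_gap` helper is a hypothetical name):

```python
def woba_gap(woba, xwoba):
    """Actual minus expected; negative means results lagged the contact."""
    return round(woba - xwoba, 3)


# 2016 figures from the text: .459 xwOBA and .409 wOBA for Cabrera.
cabrera_2016 = woba_gap(woba=0.409, xwoba=0.459)
butler_2016 = -0.058  # differential quoted in the text

print(f"Cabrera 2016: {cabrera_2016:+.3f}, Butler 2016: {butler_2016:+.3f}")
```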

So, two reasonably big differential drops over the course of two years. The caveat here is in 2016, Cabrera was much more productive; a 4.8 WAR with a 152 wRC+.

Consider this contact visual, from ‘16-‘17, of Cabrera’s xwOBA and wOBA.

mCabrera1

Quite a distinction in contact as well as balls in play. Yet his launch angles remained in the same range, roughly between -20 and 40 degrees. Cabrera clearly isn’t having a problem with his swing, mechanically anyway.

Let’s move on to contact and exit velocity during the drop-off years of ’16 and ’17. In 2016, Cabrera had a total of 238 ‘good contact’ hits (barrels/solid/flares) on 9.5% of pitches seen:

Miguel Cabrera (6)

In 2017, 161 with a 7.8% ratio:

Miguel Cabrera (5)

 

And how about Cabrera’s exit velocity?

mCabreraEV

Was he swinging in pain? How much and to what detriment isn’t quantifiable, especially because he still managed 500-plus plate appearances. Pain to you isn’t necessarily pain to someone else.

Cabrera is on the decline, and his xwOBA data makes a case for that. I’m going to infer it was simply a coincidence that his injury occurred the same year. While he wasn’t hitting the ball as hard (humans do lose strength), he maintained his launch angles, something that would have changed (at least a little) if he were burdened by back pain.

You can’t play at a high level, like Cabrera has, forever. Even during his not-so-great years, he was still so much better than an average player. Regression is inevitable. Last season appeared to be the year that it happened to one of the best hitters the game has ever seen.

*Statcast data courtesy of Baseball Savant


Try to Catch Corey Knebel Upstairs

There is a good chance you didn’t know who Corey Knebel was until last season. As a mediocre middle reliever for a bad Milwaukee Brewers team, he did not receive much attention in 2015 and 2016. Most are familiar with him now after he posted 39 saves for a contending Brewers team. Knebel ranked third in the league with a 40.8% strikeout rate in 2017, sitting behind only the far-and-away best relievers in baseball: Craig Kimbrel and Kenley Jansen. Let’s take a look at what he has to offer.

Knebel will do one of two things.

He will blow 98+ mph past you, or:

… he will drop this nasty breaking ball on you.

Knebel really doesn’t do anything else, and, as 2017 proved, he doesn’t need to. He has two incredible pitches, and that’s all a reliever needs to be successful. That curveball is exceptional, but let’s focus on Knebel’s fastball for a little.

Velocity doesn’t make you good, but velocity is generally good. Spin doesn’t make you good, but spin is generally good. Most pitchers would say they would rather have more of the two than less. Neither of them guarantees success, and the two combined don’t guarantee success either. However, it’s unlikely you will find a pitcher with an abundance of both who isn’t at least marginally successful IF he can command his pitches. And that’s a big if. (See Betances, Dellin.)

Knebel’s fastball has an abundance of velocity and spin. In his career, the pitch has averaged 96.4 mph and 2375 RPM (2245 is roughly league average). Here is a plot of spin rate and velocity for all the relief pitchers who have thrown at least 200 four-seam fastballs since Knebel’s first full season in 2015. Knebel is highlighted in yellow.

Knebel ranks near the top in both categories. Using Z-scores (which measure how far something sits from average, positively or negatively, in standard deviations) to standardize velocity and spin rate, his combined score ranks 41st out of 347. Impressive, of course, but not outstanding. So why did Knebel’s fastball improve from a 117 wRC+ against in 2015-16 to a 76 wRC+ against last season, despite the same high velocity and spin? Watch the fastball GIF again and pay attention to how high in the zone, or out of the zone, rather, Knebel throws the ball.
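Here is a minimal sketch of that kind of ranking. It assumes the combined score is simply the sum of the velocity and spin Z-scores, which may not be exactly how the figure above was built. Knebel’s 96.4 mph and 2375 RPM come from the text; the other three pitchers are invented for illustration.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Distance of each value from the mean, in standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def combined_rank(pitchers):
    """Rank (name, velocity_mph, spin_rpm) rows by summed Z-scores, best first."""
    velo_z = z_scores([velo for _, velo, _ in pitchers])
    spin_z = z_scores([spin for _, _, spin in pitchers])
    scores = {row[0]: vz + sz for row, vz, sz in zip(pitchers, velo_z, spin_z)}
    return sorted(scores, key=scores.get, reverse=True)


# Knebel's career averages are from the text; the rest are hypothetical.
sample = [("Knebel", 96.4, 2375), ("Pitcher A", 92.0, 2150),
          ("Pitcher B", 98.5, 2500), ("Pitcher C", 94.0, 2245)]
ranking = combined_rank(sample)
print(ranking)
```

Standardizing first matters because mph and RPM live on very different scales; adding the raw numbers would let spin swamp velocity.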

Now, here are two pitch heat maps. The first, Knebel’s fastball location in 2015-16. The second, his fastball location in 2017.

The first is essentially middle-middle. High-heat pitchers are less afraid to throw in the strike zone, as they want to force hitters to catch up to the pitch. Hitters caught up to Knebel. While the spray in 2017 is not as compact, the shift up in the zone is obvious. With how hard he throws his fastball and how much spin it has, it’s quite difficult to hit Knebel high in the strike zone.

Even when his fastball was not successful prior to last season, it was still troublesome for hitters up in the zone. In 2015-16, Knebel caused opponents to whiff on just over 40% of their swings at fastballs high in the zone, ranking 12th among relievers. That figure was even more spectacular in 2017, as he posted a 46.4% rate, which placed him third among relievers. Hitters can’t make contact with that velocity and spin that high up. The change in pitch location doubled his swinging strike rate from 8% in 2016 to 16% last season and pushed his chase rate up from 22% to 32.6%.
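Since three different rates show up in that paragraph, here is a small sketch of how each is commonly defined: whiffs per swing, swinging strikes per pitch, and swings per out-of-zone pitch. Data providers can define these slightly differently, and the ten pitches below are hypothetical rather than Knebel’s.

```python
def plate_discipline(pitches):
    """pitches: list of dicts with boolean keys 'swing', 'whiff', 'in_zone'."""
    total = len(pitches)
    swings = sum(p["swing"] for p in pitches)
    whiffs = sum(p["whiff"] for p in pitches)
    out_of_zone = [p for p in pitches if not p["in_zone"]]
    chases = sum(p["swing"] for p in out_of_zone)
    return {
        "whiff_rate": whiffs / swings if swings else 0.0,
        "swinging_strike_rate": whiffs / total if total else 0.0,
        "chase_rate": chases / len(out_of_zone) if out_of_zone else 0.0,
    }


# Ten hypothetical fastballs.
sample_pitches = (
    [{"swing": True, "whiff": False, "in_zone": True}] * 3    # contact in zone
    + [{"swing": True, "whiff": True, "in_zone": True}] * 1   # whiff in zone
    + [{"swing": False, "whiff": False, "in_zone": True}] * 2  # called strikes
    + [{"swing": True, "whiff": True, "in_zone": False}] * 2   # chased and missed
    + [{"swing": False, "whiff": False, "in_zone": False}] * 2  # taken for balls
)
rates = plate_discipline(sample_pitches)
print(rates)
```

The denominators are the point: the same three whiffs are 50% of swings but only 30% of all pitches, which is why a whiff rate and a swinging strike rate for the same pitcher look so different.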

Knebel always had the stuff for a great fastball; he just had to figure out how to command it. Now that he has, he looks like a terror of a pitcher. Knebel can finish hitters off with his curveball or dare them to catch up to his fastball upstairs.