Month: May 2016 | Page 2 | Community Blog

Archive for May, 2016

The Future of Analytics In Baseball: How Will Small-Market Teams Fare?

May 19, 2016

This post originally appeared on the Pittsburgh Pirates blog Bucco’s Cove.

A recent episode of the Baseball Prospectus podcast Effectively Wild (and if you don’t listen to it, this is one of the best baseball podcasts out there) had two analysts from the LA Dodgers’ front office as guests. During the episode, one of them said, “Even though we have grown substantially in the last year…” and went on to talk about the size of their analytics department and how they work together. This is a scary prospect for small-market teams like the Pirates; embracing analytics before such things were en vogue allowed teams like the Moneyball A’s, the Royals, the Pirates, and many others to gain a competitive advantage over their comparatively retrograde competition still throwing money at their problems every offseason.

The window of opportunity for small-market teams to use advanced analytics to their advantage may be closing faster than we think. Most teams (and possibly all; I don’t have access to every team’s front office payroll) have some sort of analytics department (or “Baseball Operations Department,” as they’re often dubbed). According to this ESPN article from about 14 months ago, only two woeful teams are listed as “nonbelievers,” the Marlins and the Phillies, and the Phillies have since seen some significant shuffling in their front office. Larger teams are beginning to emulate their smaller counterparts to varying extents, with results that will bear fruit over the coming seasons. As a fan of a small-market team, I find this concerning; the limited dividends paid from the analytics advantage may mean a return to the old power structure in baseball in which larger-market teams with more money have the ability to acquire players at will. The difference, however, will be that stats will have informed the signings, so if two teams are targeting the same player for “sabermetric reasons,” the team with more money will obviously still have the upper hand.

Scarier still for fans of small-market teams is that the greater financial capital available to geographically-favored franchises can be employed to sign not only the best players, but also the most talented analysts, and more of them. The premise that teams all have access to effectively the same data and analysis is rendered moot if larger franchises can secure a stronger analytics department, both in terms of the number of analysts and the talent of the analysts (money could even be used to lure talented analysts to the richer franchises in the same way that players are). For example, the Cubs thus far this season seem to be a perfect confluence of young talent, effective free-agent signings based on a strong analytics department, and a hell of a lot of money, which is exactly where you want to be if you’re trying to create a dynasty and win multiple Commissioner’s Trophies.

Parity in the league is still greater than that of the NFL, but we could be witnessing the last generation of such parity. How is such a situation solved? The one obvious choice is a salary cap; the players’ association would be loath to support such an idea, although it’s perhaps beginning to be in their interest. As the league’s revenue increases, players haven’t been getting the same share of that revenue, according to Nathaniel Grow on FanGraphs. A quote from that article:

“The biggest difference between the NBA and MLB, then, isn’t the fact that the former has a salary cap while the latter does not. Instead, the primary difference between the two leagues’ economic models is that by agreeing to a “salary cap,” NBA players in turn receive a guaranteed percentage of the league’s revenues, while MLB players do not.”

According to the same article, the players’ share of revenue has fallen about 13% to 16% since 2002 or 2003. While this argument is unlikely to induce the MLBPA to support a salary cap, a downturn in league parity could force their hand at some point in the future. This would be a long-term effect, however; many years of a “lack of parity,” coupled with a downturn in the popularity of the sport as a whole, would be required to even have the MLBPA thinking about acquiescing to a salary cap.

Coming back to the proliferation of analytics departments among MLB teams and their effect on important advantages held by those willing to embrace statistics: I don’t know what’s going to happen. There are many facets to analytics, more than just comparing players based on BABIP or K% or arm slot, or determining which players to acquire and how much they’re worth. For example, one of the Effectively Wild guests from the episode I cited earlier was a biomedical engineering major during her undergraduate studies, implying that the front office is becoming interested in the medical side of analytics: preventing injuries, improving player health, and looking at the biomechanical aspect of baseball, which takes a significant toll on players’ bodies. This is not too dissimilar from what the Pirates have done in recent years and is just one of the many components to assembling and maintaining a competitive squad.

This line of thinking admittedly removes the human component from the equation, which is still incredibly important to this entire process. There will always be GMs who are more willing to try new strategies to win and those who are unwilling to change (*cough* Ruben Amaro, Jr.). Coaching and player development, especially in the minor leagues, will continue to be extremely important for MLB franchises and are largely outside the purview of the type of statistical analysis that is widely considered in evaluating players. Rather, this part of baseball can be thought of, to a certain extent, as producing the statistics that analysts ultimately study. As a result, there will always be opportunities for smaller-market teams to hire talented personnel, including trainers, coaches, scouts, and other employees outside the scope of the Major League analytics departments, who will influence franchises’ success and failure.

However, analytics at the MLB level may start to be influenced by money. Ultimately, stories like the Pirates’ repeated acquisitions of undervalued Yankee catchers who are stellar pitch framers, the Royals’ World Series win relying on great defense and a crazy strong bullpen, and the general parity of the league beyond the traditionally great franchises may be fewer and further between. Those franchises with more money may regain the competitive advantage that the sabermetric revolution has wrested away from them for the past decade, and smaller-market teams will have to find yet another way to adapt to the ever-changing baseball landscape.

Got Projections?

by Juan Pablo Zubillaga

May 18, 2016

Back in college, I remember being fascinated by a concept I learned in one of the first chemistry classes I took: the atomic orbitals. Contrary to what I thought at the time, electrons don’t orbit around the atom’s nucleus in a defined path, the way the planets orbit around the sun. Instead, they move randomly in the vicinity of the nucleus, making it really hard to pinpoint their location. In order to describe the electrons’ whereabouts within the atom, scientists came up with the concept of orbitals, which, simply put, are areas where there’s a high probability of finding an electron. That’s pretty much how I see baseball projections.

A term that is very often used by the sabermetric community is “true talent level,” and just like an electron’s position, it is a very hard thing to pinpoint. Projections, however, do a very good job of defining the equivalent of an atomic orbital: a range of values where there’s a high probability of finding a certain stat. I know what you’re thinking; projections are not a range of values. But you can always convert them very quickly just by adding a ±20% error (or any other percentage you consider fitting). So, for example, if a certain player is projected to hit 20 home runs, you can reasonably expect to see him slug 16 to 24 homers.
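As a rough illustration of turning a point projection into a range, here is a minimal Python sketch. The function name `projection_range` is made up for this example, and the 20% default simply mirrors the figure used above; any other percentage could be passed in.

```python
def projection_range(projected, pct_error=0.20):
    """Return the (low, high) band around a point projection.

    pct_error is the fraction added and subtracted; 0.20 mirrors the
    +/-20% used in the text and is a judgment call, not a fitted value.
    """
    margin = projected * pct_error
    return projected - margin, projected + margin


# A player projected for 20 home runs, as in the example above.
low, high = projection_range(20)
print(f"Expect roughly {low:.0f} to {high:.0f} home runs")
```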

As a 12-year veteran fantasy baseball manager (and not a very good one at that), I’ve never used projected stats as a player-evaluating tool when I’ve gone into a draft. For some reason (probably laziness), I’ve mainly focused on “last year’s” stats, and felt that players repeating their last season’s numbers was as good a bet as any. This year, after taking a lot of heat for picking Francisco Lindor and Joe Panik much higher than my buddies thought they should’ve been taken, I started wondering how much of a disadvantage it was to use simple prior-year data instead of a more elaborate method.

To satisfy my curiosity, I decided to evaluate how good a predictor “last year” numbers are, and to compare them to other options such as using the last two or three years, or using some publicly available projections. In this particular piece, I’ll limit the study to offensive stats, but I’ll probably tackle pitching stats in a second article.

The first step for this little research was to establish the criteria with which to compare the different projections. A simple way to evaluate projection performance is using the sum of the squared errors; the greater the sum, the worse the projection (in case you’re wondering, squared errors are used in order to make negative errors positive so they can be added; squaring also penalizes bigger errors more than smaller ones). In this particular case, however, I wanted to evaluate projections for a number of different stats, so a simple sum of squared errors would have an obvious caveat in that stats with bigger values have bigger errors. For example, an error of 10 at-bats is a very small one, given that most players log 450+ of them per season. On the other hand, an error of 10 HR is huge. Additionally, not every stat has the same variation among players. Home runs, for example, have a standard deviation of around 70% of the mean, while batting average’s standard deviation is only about 11% of the mean. So, you could say that it’s harder to predict HR than it is to predict AVG.

Long story short, I divided each squared error by the squared standard deviation, and calculated the average of all those values for each stat. Finally, I converted those averages to a 0 to 1 scale, with 1 being a perfect prediction (in reality, these values could be less than zero when errors are greater than 1.5 standard deviations, but I scaled it so that none of the averages came out negative).
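For readers who want to see the shape of that calculation, here is a hedged Python sketch. The first function follows the description above: each squared error is divided by the squared standard deviation and the results are averaged. The second function is only a guess at the rescaling, since the exact conversion to a 0-to-1 scale is not spelled out; it assumes the score reaches zero when the typical error equals 1.5 standard deviations, which matches the parenthetical above. Using the spread of the actual values as the standard deviation is also an assumption, and the function names are invented for this example.

```python
from statistics import mean, pvariance


def normalized_error(projected, actual):
    """Average of squared errors, each divided by the squared standard deviation.

    Assumption: the standard deviation is that of the actual values
    across players; the text does not say which spread was used.
    """
    variance = pvariance(actual)  # squared standard deviation
    return mean([(p - a) ** 2 / variance for p, a in zip(projected, actual)])


def score(projected, actual, zero_point=1.5):
    """Rescale so that 1 is a perfect prediction.

    Assumption: the score hits 0 when the typical error equals
    `zero_point` standard deviations. The real rescaling may differ.
    """
    return 1 - normalized_error(projected, actual) / zero_point ** 2


# Made-up home run totals for three players, for illustration only.
actual_hr = [10, 20, 30]
print(score(actual_hr, actual_hr))     # perfect projection
print(score([20, 30, 40], actual_hr))  # every projection 10 HR too high
```

Dividing by the variance is what lets a 10-HR miss and a 10-AB miss be compared on the same footing.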

For this study, only players with at least 250 AB on the season were considered. Also, players that were predicted to have fewer than 100 AB were not considered, even if they did amass more than 250 AB on the season. The analysis was done on five different sets of predicting data:

Last season stats.

A weighted average of the two preceding seasons, with a weight of 67% for year n-1, and 33% for year n-2.

A weighted average of the last three seasons, with 57.5% for year n-1, 28.5% for year n-2, and 14% for year n-3.

ZiPS projections (Created by Dan Szymborski, available at FanGraphs)

Steamer projections (Created by Jared Cross, Dash Davidson, and Peter Rosenbloom. Also available at FanGraphs)
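The two weighted-average options in the list above are simple enough to sketch in Python. The weights come straight from the list; the function name and the sample home run totals are invented for illustration.

```python
# Weights from the list above, most recent season first.
WEIGHTS_2YR = (0.67, 0.33)
WEIGHTS_3YR = (0.575, 0.285, 0.14)


def weighted_projection(seasons, weights):
    """Weighted average of past seasons for one stat.

    `seasons` holds the stat for year n-1, n-2, ... in that order and
    must be the same length as `weights`.
    """
    if len(seasons) != len(weights):
        raise ValueError("need exactly one season per weight")
    return sum(w * s for w, s in zip(weights, seasons))


# Hypothetical player: 30 HR last year, 20 the year before, 10 before that.
print(weighted_projection([30, 20], WEIGHTS_2YR))
print(weighted_projection([30, 20, 10], WEIGHTS_3YR))
```

The length check also makes the practical drawback visible: a player without three full seasons of data cannot be projected with the three-year weights at all.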

The following graph shows the average score of each of the 5 projections for each individual stat considered in this study. The graph also shows the overall score for each stat, in order to have an idea of the “predictability” of each one of them. Remember, higher scores indicate better performance, with 1 being a perfect prediction.

Other than hinting that it is in fact a very poor decision to use only last year’s data, this graph doesn’t tell us much about which predicting data has a better overall performance. It does provide, however, a very good idea of the comparative reliability of each stat within the projections.

Aside from stolen bases (which honestly surprised me as being the most predictable stat of the bunch), the three most reliable stats are the ones you would’ve expected: HR, BB, and K. They’re called “true outcomes” for a reason: they depend a great deal on true talent level, and involve very few external factors such as luck or the opponent’s defensive ability.

On the other end of the spectrum, it’s really no surprise to find three-baggers as the least reliable stat. This may seem counterintuitive at first, given that the players who lead the league in triples are usually speedy guys. Nonetheless, triples almost always involve an outfielder misplaying a ball and/or a weird feature of the park such as the Green Monster in Fenway or Tal’s Hill in Minute Maid’s center field, making them unusual and random events. Playing time (represented in this case by at-bats) also has an understandably low overall score. Most injuries, which are a major modifier of playing time, are random and hard to predict. Also, managerial or front-office decisions can affect a player’s playing time. It does surprise me, however, to see doubles so far down in this graph, and I really can’t find a logical explanation for it.

Let’s move on now to the real reason why we started doing all this in the first place. Here’s a graph that shows the average score for each predicting data, for years 2013, 2014, and 2015. It also shows the three-year average score.

The one fact that clearly stands out in this graph is that last-year numbers are a very poor predicting tool. Their performance is consistently and considerably worse than that of any other set of data used. So my initial question is answered in a pretty definite way: it is a huge mistake to rely on just last season’s numbers when trying to predict future performance.

Turning our attention to the other four projections, it becomes a bit harder to separate them from each other, especially using only three years’ worth of data. The average performance of the three-year period gives us a general idea of the accuracy of each option, but looking at the year-by-year numbers, it’s not really clear which one is better. Steamer seems to be the winner here, since it had the best score in all three years. ZiPS, on the other hand, despite having a better overall score than the three-year weighted average, has a worse score in two of the three years. They were really close in 2014 and 2015, but ZiPS was considerably better in 2013, which, interestingly, was a less predictable year than the other two.

The biggest point in favor of ZiPS when comparing against the three-year weighted average is that ZiPS doesn’t actually need players to have three years’ worth of MLB data in order to predict future performance, and that makes a huge difference. Another major point in favor of ZiPS is that it’s doing all the work for you! Believe me, you do not want to be matching data from three different years every time drafting season comes around (I just did it for this piece and it’s really dull work).

After all is said and done, projection systems such as Steamer or ZiPS do a fine job of giving us a good indication of what to expect from players. We’re much better off using them as guidelines when constructing our fantasy teams than any home-made projection we could manufacture (unless you’re John Nash or Bill freaking James). I know next March I’ll be taking advantage of these tools, hoping they translate into my very elusive first fantasy league title.

Tyler Wilson and His Five Plus Pitches

by Michael Tamburri

May 17, 2016

Let me preface this article by saying that I watch A LOT of baseball. I also have an extensive analytical background and am always analyzing baseball stats looking for value in players. Last week, I was watching an Orioles game and the starting pitcher was a player I had never heard of. His name is Tyler Wilson. While watching the game, I was very impressed with his overall make-up and the confidence he displayed in each one of his pitches. Many times what separates a pitcher who can start at the big-league level from one destined for the bullpen is the ability to throw multiple pitches. The ability to throw each of those pitches effectively, however, can be what separates a good starting pitcher from a great starting pitcher. The more I watched of Wilson, the more intrigued I became about his future outlook, and the more motivated I became to write this article. (I went back and watched all of Wilson’s starts this year before writing this article.)

To give you a little background, Tyler Wilson has never been an elite prospect. He attended college at the University of Virginia, where he was overshadowed by fellow staff-mate and future first-round pick Danny Hultzen. Wilson was drafted by the Orioles in the 10th round of the 2011 MLB Draft. Ever since being drafted, he has quietly excelled at every level. He doesn’t have the dominant strikeout numbers that you look for in pitching prospects, which is a big reason he has gone overlooked for much of his career.

After climbing his way through the organizational ladder, Wilson made his major league debut with the Orioles last year and eventually made the team this year out of spring training. Although he made the team in a bullpen role, early season injuries to the Orioles pitching staff opened up an opportunity and Wilson has really taken advantage of it. Enough of the background though. Let’s move on to what I saw while actually watching him pitch.

Tyler Wilson features a cutter and a two-seam fastball. Each of these pitches sits in the 89-91 mph range and both show a great amount of movement. The cutter is most effective against right-handed batters when thrown on the outside portion of the plate. Check out the video below to watch him fool Kansas City Royals outfielder Lorenzo Cain with three straight cutters:

He essentially gave Cain, a very good hitter, three of the exact same pitches in a row…and Cain couldn’t touch them. In every start this year, Wilson has pounded the outside corner with this cutter and has had fantastic results. Don’t think by any means though that he is a one trick pony. As soon as you start to expect that cutter on the outside corner, Wilson will come right back in on you with a two-seam fastball:

Look at the horizontal movement on that pitch! Absolutely filthy! Wilson has shown a ton of confidence in both of those pitches so far this season as he uses them to pound both sides of the strike zone, and his command of them has been exceptional. He is not afraid to throw them in any count and they are equally effective vs. both left-handed and right-handed batters.

While his fastballs both seemed to be plus pitches upon first glance, I started to have thoughts that this guy might be for real as soon as he started throwing his curveball. Wilson’s breaking ball sits in the 77-79 mph range. I was astonished by how well he was able to locate his curve and the amount of movement on each and every one he threw. Watch him send White Sox slugger Jose Abreu down swinging in the video below:

Abreu had no chance. In his most recent start against the Twins, Wilson’s curve looked even better. Check out the one he threw to Byung-Ho Park:

Both of those pitches came in a 2-2 count. Many pitchers are scared to throw a breaking ball in a 2-2 count, especially to players with plus power such as Abreu and Park. If you miss your target, two things can happen. One — you leave the ball up in the zone and it gets hit out of the stadium. Two — you throw it in the dirt; the hitter lays off; and now you have to pitch to this slugger with a full count. Wilson isn’t scared to throw his curveball in any count and that is what makes him so dangerous. You never know when to expect it, but at the same time you have to expect that he can throw it at any moment.

The last pitch in Wilson’s arsenal is his changeup. This pitch has a ton of downward movement and produces a lot of groundballs. While there were many better examples that I could have shown you of his change-up in action, I wanted to show one of his bad ones. Even when he missed his target, the batter was still fooled by the amount of movement on this pitch. Check out the following pitch to Royals SS Alcides Escobar:

The catcher set up down in the zone and Wilson clearly missed his target. Luckily it didn’t seem to matter, as the pitch had an insane amount of horizontal movement, running in on Escobar and jamming him.

Take a look at the chart below, showing the vertical and horizontal movement on each of Wilson’s pitches:

The middle portion of this chart is empty. All five of his pitches have a tremendous amount of movement, and none of them move in the same direction. The fact that he is able to command each of these pitches so well and keep hitters guessing with which one will come next is the reason why he has had so much success. A big reason why hitters are having trouble guessing his pitches is because of how well Wilson is able to repeat his delivery. The chart below shows Wilson’s release point for each type of pitch:

As you can see, his release point is almost identical with all five of his pitches. At this point, I had watched all of his starts from this season and was very impressed. I then decided to do some research and was immediately impressed with stats such as his career BB rate and low WHIP, but I wanted to dig further. I began to look through the PITCHf/x data because I was curious to see how effective each of his pitches actually was. Based on the PITCHf/x value metric, all of his pitches so far this year have graded as above average. If you are not familiar with the PITCHf/x value scale, a fastball value of zero means that the pitcher possesses an average fastball. Any value above zero means that the pitch is above average, and the higher the number, the better the pitch. Likewise, a negative value means that the pitch is below average. See the table below for the breakdown of Wilson’s arsenal:

Based on the above values, the change-up has been Wilson’s most valuable pitch this season with his curveball close behind. Obviously it is very early in the season and we are working with a small sample size…but that doesn’t mean we can’t have fun! While doing this research, I set out the goal to find every starting pitcher who throws five or more above-average pitches. Below is the list of players who fit that description:

IP = Innings Pitched
FA = Fastball
FT = Two-Seam Fastball
FC = Cut Fastball
SI = Sinker
SL = Slider
CU = Curveball
CH = Change-up
KC = Knuckle Curveball
EP = Eephus

There are only five pitchers who have thrown five or more pitches above average so far this season! Wilson is in great company, as the other four pitchers are all All-Star-caliber players and borderline household names. Being that this is such a small sample size, I decided to look back at last year’s stats to see how many players fit this description over a full season. Using the same parameters and setting the minimum IP to 100, the following table was produced:

Once again, the names on this list are some of the top pitchers in baseball. A few of these pitchers have a pitch that graded out as below average, but since they had five or more different pitches all individually grade as above average, they made the final cut.
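The selection rule described above (at least five pitch types with a positive pitch value, even if some other pitch grades out below average) can be sketched in a few lines of Python. Everything in the sample data is invented: the pitcher names are placeholders and the values are not real PITCHf/x numbers. Only the pitch abbreviations come from the legend above.

```python
def above_average_pitches(pitch_values):
    """Return the pitch types whose value is above zero (above average)."""
    return sorted(pitch for pitch, value in pitch_values.items() if value > 0)


def has_five_plus(pitch_values, minimum=5):
    """True if at least `minimum` pitch types grade as above average."""
    return len(above_average_pitches(pitch_values)) >= minimum


# Hypothetical pitch values keyed by pitch type; NOT real data.
pitchers = {
    "Pitcher A": {"FA": 1.2, "FT": 0.4, "FC": 2.1, "CU": 0.9, "CH": 1.5},
    "Pitcher B": {"FA": 3.0, "SL": 2.2, "CH": -0.5},
    "Pitcher C": {"FA": 0.3, "FT": 0.1, "FC": 0.8, "SL": 1.1, "CU": 0.2, "CH": -1.4},
}

qualifiers = [name for name, values in pitchers.items() if has_five_plus(values)]
print(qualifiers)
```

Pitcher C shows the case mentioned above: one below-average pitch does not disqualify a pitcher as long as five others are individually above average.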

As you can see, it is very rare to have a pitcher who has five legitimate plus pitches. I am very interested to see if Tyler Wilson can maintain these results over the course of a full season, and I really hope he is given the opportunity to do so. If he continues to pitch the way he has been, the Orioles will have no choice but to leave him in the rotation. Despite his success in a limited sample, Wilson has struggled in each of his starts when facing the lineup the third time around. This could be due to the fact that he is still in the process of being stretched out from his bullpen role. When in the bullpen, you don’t have to prepare to face the same hitter three times. I am hopeful that once he is fully stretched out and back into his starter mentality, he will be able to make the necessary adjustments and continue to throw all of his pitches with confidence. If he can continue to make quality pitches as he faces the lineup for a third time, I believe Tyler Wilson has the chance to become a very special pitcher.

Memorable quotes I heard during the TV broadcasts:

“Everyone thinks that I pitch with a chip on my shoulder but I really don’t. I just go out and compete. I don’t think of it that way.” – Tyler Wilson

“I think he understands himself. He can maintain his game-plan throughout the game. He’s going to keep us in the game and give us a chance to win. What more can you ask for?” – Pitching Coach Dave Wallace

“I love that he can make the ball run in and then cut away. He pitches to both sides of the plate. Not a lot of young pitchers can do that.” – Manager Buck Showalter

…no Buck, not a lot of young pitchers can do that.

Twitter – @mtamburri922

Drew Pomeranz Is Here to Stay

by Matthew Prowant

May 15, 2016

After shutting out the Chicago Cubs offense over six innings of 10-strikeout ball, Drew Pomeranz lowered his season ERA to 1.80 and FIP to 2.61. He currently ranks 3rd among qualified starters in K% and is tied for 11th in WAR. Furthermore, Pomeranz has faced four of the top five offenses in the National League, as well as having had a season opener at Coors Field, hence we cannot claim stat padding against mediocre competition. While a .250 BABIP and 82.1 LOB% may not exhibit the greatest signs of stability, Pomeranz is finally reaching the potential that garnered him a top-30 prospect ranking from Baseball America. So what has Pomeranz done to unlock this potential?

Pomeranz has found his success by neutralizing right-handed bats. Earlier in his career, Pomeranz’s relative struggles against righties led many to wonder whether his ultimate fate rested in the bullpen. In fact, heading into 2016 many doubted whether he could even earn a spot in the Padres rotation; he couldn’t even earn a mention in Jeff Sullivan’s positional preview post. This sentiment was understandable given his career .340 wOBA against and 7.1 K-BB% when facing right-handed hitters up to this point. In 2016, however, he has lowered the wOBA against to a measly .240 while striking out 34% of righties. By dropping 100 points of wOBA, he’s essentially transformed his average opposite-handed plate appearance from Kyle Seager to Omar Infante. As with any dramatic improvement in performance, a confluence of factors has led to Pomeranz’s success.

Since debuting in 2011, Pomeranz has gradually raised his vertical release point nearly half a foot. This more over-the-top delivery has undoubtedly provided him greater deception against righties. More noticeably, however, Pomeranz has brought his changeup back from the dead. Early in his career, Pomeranz threw his change roughly 9% of the time to righties. From 2013-2015, when 72% of his appearances came out of the bullpen, Pomeranz lowered that rate to 3%. This season, however, Pomeranz is utilizing his change-piece over 15% of the time against right-handers. Thrown around 87 mph, Pomeranz’s change nearly perfectly mimics his sinker in both velocity and movement, but with differing results. Pomeranz generates an above-average 44% fly balls on balls in play with his change, while the sinker gets 67% groundballs. This deception, combined with Pomeranz’s pitcher-friendly home park, has led to a dearth of quality contact on the changeup, as illustrated by the .111 ISO against on the pitch.

Despite the resurgence of Pomeranz’s changeup, his improved curveball has been the true game-changer. He trails only the enigmatic Rich Hill in percentage of pitches that are curveballs; likewise, he employs it over 43% of the time against righties, up from 23% over his career before joining San Diego. His 4.6 curveball pitch value trails only the Phillies duo of Aaron Nola and Jerad Eickhoff, and their club’s experimental pitching philosophy, so far in 2016. After leaving the breaking-ball-murdering confines of Coors Field in 2014, Pomeranz witnessed a significant increase in both vertical movement and velocity. This, however, does not explain his recently-discovered success. Similarly, he has kept his Zone% on the curve right around his career average of 43%. The key lies in where out of the zone he locates the ball. This season, Pomeranz is hitting low-and-gloveside off the plate with almost 30% of his curves to both righties and lefties alike. Prior to this campaign, Pomeranz only hit that spot about 10% of the time, as he more evenly distributed his curveballs across the zone horizontally. Whether a change in approach or simply improved mechanics and command, Pomeranz is finding tremendous success with his hook. Using the curve against righties, Pomeranz has raised his Whiff% to a career-high 16.4% in addition to generating a career-high 39.6 Swing %. Furthermore, nearly three-quarters of his balls in play off the curve are grounders and he has yet to permit a single fly ball on the pitch vs. right-handed hitters.

As Eno Sarris noted in his discussion with him last December, Pomeranz’s success hinges on three things: “his health, his changeup, and his curveball.” Seven starts into the season, Pomeranz’s progress on these three fronts has led him to success against righties and helped him unlock his prior potential. He’s gone from a guy the Athletics traded for spare parts to a solidly above-average starter for the Padres. Perhaps the most encouraging aspect of this emergence: Pomeranz is still only 27 years old. With almost three more years of service time left, and an inevitable sell-off of Tyson Ross, Andrew Cashner, and James Shields on the horizon, Pomeranz could potentially parlay his improvement into an ace role on the Padres staff. Of course, Pomeranz could find himself on the market in the near future, and he would certainly fetch more than Yonder Alonso and Mark Rzepczynski this time around.

xHR%: Questing for a Formula (Part 5)

by Jackson Mejia

May 15, 2016

This is the long-delayed fifth part in the xHR series. If you really want to read the first four parts, they can be located here, here, here, and here.

More than a month late, the highly anticipated follow-up to the first iteration of xHR has arrived. Once more, that increasingly trivial metric will grace the page of FanGraphs, wallowing in the mostly prestigious Community Research section (on the other hand, this section is most definitely the best section on the World Wide Web for experimental metrics and amateur analyses).

Unless the reader has an impeccable memory for breezily scanned, frivolous articles, he or she likely needs a reminder as to what xHR% is and aims to be. xHR% is a metric that describes at what rate a player should have hit home runs over a given season. From this, expected home runs, a more understandable counting statistic, can be found by multiplying plate appearances by xHR%. It cannot be emphasized enough that the metric is not predictive; it only aims to describe. Without further ado, the formula is here:

I know that’s a lot to look at, and it isn’t exactly self-evident what all of the variables mean. As such, an explication of each part is necessary and provided below. (For logical rather than chronological purposes, the Kn variable will be analyzed last.)

AeHRD – One of the biggest differences between this formula and the last one is that this one does not use home run distance. This iteration uses expected distance, rendering it a combination of simple math, sabermetric theory, and physics. As such, expected home run distance strips out one of the biggest factors in luck — the weather.

Expected home run distance is found by utilizing a method taken from Newtonian Mechanics to calculate how far objects go. By using ESPN’s HitTracker website, I was able to obtain launch angles and velocities for nearly every home run hit in 2015. From this, I was able to resolve velocity into its respective parts, velocity in the x-direction (Vx) and velocity in the y-direction (Vy). After that, I calculated the amount of time the ball would be in the air with the formula vf=vi+gt, where vf is final velocity (0 m/s), vi is initial velocity (Vy), and g is simply the gravitational acceleration constant. Finally, I multiplied Vx by time in order to get the total expected distance.

I repeated that process for every home run hit by a given player in order to find his average expected home run distance. By doing this, I was able to strip out all weather-related components.
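The projectile calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, speeds are taken in metres per second, drag and launch height are ignored, and the time to the top of the arc (the point where vf = 0) is doubled to cover the descent, which may or may not match how the step was applied in the original spreadsheet.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def expected_distance(speed_ms, launch_angle_deg, full_flight=True):
    """Drag-free horizontal distance for one batted ball (hypothetical helper)."""
    theta = math.radians(launch_angle_deg)
    vx = speed_ms * math.cos(theta)  # velocity in the x-direction
    vy = speed_ms * math.sin(theta)  # velocity in the y-direction
    # vf = vi + g*t with vf = 0 and gravity acting downward gives time to the apex.
    t_apex = vy / G
    # Assumption: the ball lands at launch height, so total air time is twice t_apex.
    t = 2 * t_apex if full_flight else t_apex
    return vx * t


def average_expected_distance(home_runs):
    """Mean expected distance over (speed, launch angle) pairs for one player."""
    distances = [expected_distance(speed, angle) for speed, angle in home_runs]
    return sum(distances) / len(distances)
```

Passing `full_flight=False` reproduces the distance covered only up to the apex, which is half the full-flight figure.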

AeHRDH – Utilizing the same process as above, I found the average expected home run distance for every stadium. This is the player’s home stadium’s average home run distance, regardless of team.

AeHRDL – The same as above, but done for every home run hit in the majors last season.

When put together in the numerator and the denominator, the above variables serve as a “distance constant” of sorts that will at most adjust the resulting expected home runs by plus or minus two. Occasionally, the impact is negligible because the average expected distance is very close to that of the player’s home stadium and the league. Averaging the mean expected home run distance of the league and of the home stadium allows the metric to paint a more accurate picture of where the player hit his home runs and whether or not they should have left the park. Nevertheless, it’s important to note that this formula still fails to account for fly balls that fell just short of the wall due to the wind and other factors, meaning that there are still expected home runs unaccounted for.

FB% – If you remember correctly, or took the time to briefly review the previous posts, then you will recall that in the prior iteration of the formula there was a section very similar to this one. The only differences are that the weights on each year of data have changed (those are still somewhat arbitrary, however, but I am working on getting them to more precisely reflect holdover talent from past years) and the primary statistic used.

Previously, HR/PA was used, but it had to be abandoned because the results were too closely correlated with reality. This time, I looked at how similarly descriptive formulas were quantified. Oftentimes, those metrics did not use the target expected metric in their formulas. Rather, they utilized other metrics that correlated moderately well or strongly with their expected metric. In this case, I decided to use FB% because it’s a relatively stable metric (especially in comparison with HR/FB), and it has a strong correlation with HR% (about .6).

As a clarification, the subscripts Y3, Y2, and Y1 indicate the years away from the season being examined, where Y1 is really Y0 because it’s zero years away. So just to be clear, Y1 is the in-season data from the year being examined. In the data to be examined, for example, Y1 is 2015, Y2 is 2014, and Y3 is 2013.

Kn – As you can well imagine, FB% numbers are always far greater than HR% numbers*, resulting in some truly ridiculous results if a constant isn’t applied that relates HR% to FB%. For instance, without a constant to modify the results, Jose Bautista would have been expected to hit 304 home runs last season. That’s a lot of home runs. Just two and a half seasons of playing at that level and he’d have the home run record in the bag. Luckily, I’m not stupid enough to think that that’s actually possible, and so I initially related FB% and xHR% with a constant, called KCon.

Unfortunately, KCon didn’t work as well as I’d hoped because it skewed expected home run results way up for terrible home run hitters and way down for the best home run hitters. By skewed, I mean off by more than six home runs. And so I, in my infinite (and infantile) amateur mathematical wisdom, made it into a seven-part piecewise** function. By this, I mean that there’s a different constant for each piece of the formula, defined by HR% at somewhat arbitrary, though round, points. For clarity, here they are:

K1 = HR%<1

K2 = 1≤HR%<2

K3 = 2≤HR%<3

K4 = 3≤HR%<4

K5 = 4≤HR%<5

K6 = 5≤HR%<6

K7 = 6≤HR%
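The bracket lookup above can be sketched as a small function. It returns only the index n of the constant Kn, since the constants' actual values are not given here. The function name is hypothetical, HR% is assumed to be expressed in percent, and an HR% of exactly 6 is assumed to fall in the top bracket.

```python
def k_index(hr_pct):
    """Return n such that K_n applies to a player with the given HR% (in percent)."""
    if hr_pct < 0:
        raise ValueError("HR% cannot be negative")
    # Each bracket is one percentage point wide; everything from 6 upward maps to K7.
    return min(int(hr_pct), 6) + 1
```

For example, `k_index(0.4)` selects K1 and `k_index(5.9)` selects K6.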

It works quite well. I am very excited about the current iteration of xHR%, its implications, and all it has to offer. Of course, it is not finished, but I think I’m getting closer. Please comment if you have any questions, an error to point out, or anything of that nature. There will be a results piece published soon on the 2015 season, so keep an eye out.

*It wouldn’t be surprising if Ben Revere became the first player to have a HR% equal to FB% (both at 0%, naturally).

**It is neither continuous nor differentiable.

The Cubs, the Astros, and Tank Warfare Revisited

by 1908

May 14, 2016

Last year the once lowly Cubs won 97 games, and the also once lowly Astros won 86. Because both clubs had been as bad as Trump’s rug for years, many attributed these successes to the practice of tanking — intentionally losing games to acquire high draft picks with which to rebuild. This year, the Astros have gone a bit backward in the early going, thanks mainly to an incendiary pitching staff (if you had this guy second among Houston pitchers in WAR by mid-May, stop reading this right now and go fix world hunger). The Cubs have continued to roll, and as you know are currently on a pace to win 3.4 billion games this year. Those tanks seem unstoppable.

The interwebs were aflame with tanking debates during the offseason, with some saying it’s destroying Our Way of Life, and others saying well, no, it isn’t. This seems like a question susceptible to analysis using a new statistic with a vaguely humorous name. But before we get to that, we need to define the “tank” — I consider it to be the bottom six teams in the majors in any year. I arrived at six by rigorously counting the number of divisions in major-league baseball, and assuming that in most years the bottom six teams will be in their respective divisional cellars. This won’t always be true, but it will seldom be egregiously false.

So a team in the tank gets one of the top six draft picks in the following June draft. The new statistic, TankWAR, is simply the WAR attributable to each player the team drafted with a top-six pick, or to players obtained by trading one of those top-six players.

The Cubs and Astros each had four tank picks in the last ten drafts, twice the random expectation. The italicized players have reached the majors.

Cubs Tank Picks 2006-2015

Albert Almora (6) 2012

Kris Bryant (2) 2013

Warbird (4) 2014

Astros Tank Picks 2006-2015

Carlos Correa (1) 2012

Mark Appel (1) 2013

Brady Aiken (1) 2014

Alex Bregman (2) 2015

Last year the Cubs accumulated 50.2 WAR. Bryant contributed 6.5 of that, while Kyle Schwarber added another 1.9. So the Cubs’ TankWAR last year was 8.4, or 16.7% of the team total. On the one hand, the Cubs probably would have come close to 90 wins without these guys. On the other hand, wins 90-97 are among the most valuable in baseball. On the third hand, last year it wouldn’t have made a difference. At 89 wins or 97 the Cubs were the second wild card. On the fourth hand, that’s probably pretty rare.

Also note that of the Cubs’ starting 13 (eight position players plus five starting pitchers) only Bryant and Schwarber were Cubs draftees. The team acquired the other 11 through trades and free-agent (including international) signings. To put it another way, 42 of the Cubs’ 50 WAR came from players that every other GM had access to regardless of the previous year’s record.

This year, the Cubs’ TankWAR is just 1.4 (with Bryant contributing 1.5 and Schwarber subtracting 0.1 before suffering his season ending injury). That’s just under 10% of the Cubs’ total WAR of 15.6. So however important tanking was to the Cubs last year, this year it’s mattered less thus far.

For the Astros, Carlos Correa put up a 3.3 TankWAR in 2015, just over 7% of the Astros total of 44.6. Those three wins put the Astros in the playoffs — without them, The Fightin’ (and I do mean fightin’) Scioscias would have been in. To no one’s great surprise, in the current season Correa has just about doubled his contribution to the team — his 0.8 TankWAR is 14% of the team’s 5.6 total. (In theory, Ken Giles‘ -0.3 WAR could also be considered TankWAR since Mark Appel was one of the Ryder-load of prospects Houston traded for him, but Appel seemed to be an afterthought in that deal.)
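The TankWAR shares quoted above are simple ratios, and a short sketch makes them easy to re-check as the season goes on. The function name and the dictionary layout are hypothetical; the WAR figures are the ones given in the text.

```python
def tankwar_share(tank_player_wars, team_war):
    """Fraction of a team's WAR that comes from its tank picks."""
    return sum(tank_player_wars) / team_war


# (WAR of each tank pick, team total WAR), as quoted in the text.
cases = {
    "Cubs 2015": ([6.5, 1.9], 50.2),
    "Cubs 2016": ([1.5, -0.1], 15.6),
    "Astros 2015": ([3.3], 44.6),
    "Astros 2016": ([0.8], 5.6),
}

for label, (tank_wars, team_total) in cases.items():
    print(f"{label}: {tankwar_share(tank_wars, team_total):.1%}")
```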

The Astros were a more draft-dependent team than the Cubs in 2015, with six of their 14 regulars (including the DH) being Houston draftees. George Springer was by far the highest pick of the lot, costing Houston the 11th overall pick, thanks to the Astros’ bad-but-not-especially-tankly 76-86 finish in 2010 (good for fourth of six in the then-bloated NL Central). Most of the Houston draftees were guys that the other 29 GMs had passed over, and over, and sometimes even over again.

Both teams still have solid farm systems, if somewhat less spectacular than in recent years thanks to graduations and in the Astros’ case, that ill-advised Giles trade. The tank picks currently in their respective systems could help their teams relatively soon. But these teams are already very good. The remaining tank draftees won’t be turning their teams around so much as extending their respective windows of success, either by joining the big club or anchoring key trades.

So the evidence that tanking works is mixed. Both teams have benefited from their tank picks, but it is a significant exaggeration to say the Cubs’ and Astros’ recent successes are solely or even primarily because of tanking. However, Bryant and Correa in particular are players that can move their teams from good to great. These are the kinds of players that will typically be available only to the very worst teams under the current draft system. Thus, the worrywarts aren’t entirely … wartless — there will always be some incentive under some circumstances to get one of those top picks.

That said, the case for making major rules changes in response to tanking remains thin. While it’s clear that in recent years the Cubs and Astros lacked quality major-league talent, it isn’t at all clear that they were deliberately trying to sabotage their rosters (the case of Kris Bryant’s AAA hostage drama is a different problem). And, as noted above, most of the Cubs’ and Astros’ WAR during their recent resurgence has come from players who they could have obtained whether they had tanked or not. Indeed, one of the most tank-dependent teams of all time, your 2008 World Series Rays, obtained less than a quarter of its WAR from tank picks.

Another thing to bear in mind is that every team is different. For some teams, attendance is highly correlated with winning percentage, and for others, not so much. Tanking will probably cost the highly correlated teams more revenue, making it harder for those teams to finance the other rebuilding components. The low correlation teams have more patient fans and thus may have the room to explore more radical roster revision approaches.

Thus, a patient fan base is an asset. Changing the rules to prevent death-and-resurrection rebuilds isn’t a neutral solution — it would directly favor the teams whose fans desert them in the lean years (these are discussed in detail in the preceding link), and disfavor the teams with patient fans (like the Cubs and the Astros). The case hasn’t been made that the patient fan problem is so egregious that it needs to be legislated out of existence; indeed, it isn’t clear there’s a problem here at all. Each franchise (well, maybe except this one) tries to win by maximizing the advantages it has over its competitors while minimizing the impact of its relative weaknesses.

That doesn’t sound very nefarious. In fact, it sounds a lot like baseball.

Don’t Worry About Brett Cecil (Too Much)

by jamesryu

May 14, 2016

My friend posted something interesting on Facebook. It said:

“Dear Jays bandwagoners, stop booing Brett Cecil. Form is temporary, class is permanent.
2014 April: 5.14 ERA, May-Sept: 2.09 ERA
2015 April: 5.23 ERA, May-Oct: 2.09 ERA
2016 April: 5.79 ERA”

Maybe he is a slow starter and he should be able to go back to his second-half form as the season goes on. What I am slightly concerned about is that his April 2016 ERA is worse than his April ERAs from the two previous seasons.

Let’s examine his pitches. He struggled big time in June 2015 when he posted an abysmal 9.00 ERA, but he did not allow a single run after June 30th that season. He has a 5.59 ERA as of May 11th. I went to brooksbaseball.net and researched his four-seam fastball, curve, and sinker between these three periods.

Four-seamer

Usage: 31%(June 2015) -> 21%(After June 30th of 2015 season) -> 13% (This season, as of May 4th)

Velocity: 93.9 mph -> 93.0 mph -> 92.8 mph

Horizontal movement: 3.6 inches -> 4.4 inches -> 5.1 inches

Whiff/Swing rate: 8% whiff/swing -> 17% whiff/swing -> 8% whiff/swing

GB/BIP: 13% -> 39% -> 11%

LD/BIP: 38% -> 30% -> 33%

FB/BIP: 38% -> 26% -> 56%

Horizontal release point: 0.83ft (June 2015) -> 0.89 (July 2015) -> 0.55 (August 2015) -> 0.61 (Sep 2015) -> 0.64 (This season)

Vertical release point: 6.57ft (June 2015) -> 6.49ft (July 2015) -> 6.58ft (August 2015) -> 6.51ft (Sep 2015) -> 6.54ft (This season)

Brett is relying less on his four-seam fastball as time goes on. He is trying to adapt to the ‘sinker-ball’ trend. While his four-seamers have some movement, he may have felt the need to opt for a new pitch with more movement. His fastball velocity is in the low 90s and he can reach for 94 on occasion. That’s not ideal for a relief pitcher. His four-seamer is gaining more horizontal movement as time goes on: this season it has 1.5 more inches of horizontal movement than it did in June 2015. He had big success with his four-seamer after June 2015, when it induced a 17% whiff rate, nine percentage points higher than in June 2015.

He also recorded a 39% GB/BIP using his four-seamer in the last three months of the 2015 season, which is 26 percentage points higher than in June 2015 (39% GB/BIP means that he induced 39 ground balls in every 100 balls in play off his four-seam fastball). His LD/BIP and FB/BIP also had substantial decreases in the last three months of the 2015 season, which helped him record a 0.00 ERA in that span. One of my theories about his successful 2015 season is that he changed his horizontal release point as the season went on. You can see the changes above. You can also observe the changes in the graph that I created using R:

Blue plots indicate his release points from April to June 2015 when he struggled to get batters out. Red plots indicate his release points from July to October 2015. You can definitely see that red plots clustered away from the blue plots. He made this adjustment and his command significantly improved, as well as other metrics.

April-June 2015: 25IP 11BB 5.40 ERA
July-Oct 2015: 29.1IP 2BB 0.00 ERA
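To put the improvement in command on a common scale, the two lines above can be converted to walks per nine innings. The sketch below assumes the usual box-score convention in which the digit after the decimal point counts thirds of an inning (so 29.1 means 29⅓); the helper names are hypothetical.

```python
def innings(ip_notation):
    """Convert box-score innings such as '29.1' (29 and 1/3) to a true number."""
    whole, _, thirds = ip_notation.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def bb_per_9(walks, ip_notation):
    """Walks allowed per nine innings pitched."""
    return 9 * walks / innings(ip_notation)


first_half = bb_per_9(11, "25")     # April-June 2015
second_half = bb_per_9(2, "29.1")   # July-Oct 2015
print(f"{first_half:.2f} BB/9 -> {second_half:.2f} BB/9")
```

By this measure his walk rate fell from roughly four per nine innings to well under one.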

Batters have adapted to him this season. His release points this season are consistent with his 2015 second half, but he is struggling. His four-seam fastball is being hit hard again. His whiff/swing rate in the second half of 2015 was 17%, and his 2016 whiff/swing rate is 8%. If you refer to the ball-in-play stats above, his 2016 ground ball/BIP, line drive/BIP, and fly ball/BIP rates are also worse than in the second half of last season. But I don’t see a velocity drop or a change in release point for his four-seamer. The movement on his four-seamer is actually better. I can’t seem to diagnose what is wrong with his four-seam fastball this year, and it leads me to assume that his lackluster breaking balls are hindering the effects of his fastball as well. Now I am going to continue by researching his other pitches and examining some specific situations.

Cecil is throwing significantly fewer four-seam fastballs for the first pitch of at-bats. He seems to be afraid of throwing it for the first pitch. Maybe he thinks that batters are waiting for it. Or maybe he wants to induce more groundballs and decided to throw more sinkers. You can see that he throws more sinkers for a first pitch instead of four-seamers.

His sinkerball approach for the first pitch seems to be a good one because most of the sinkers he throws for the first pitch are strikes. Last year, he threw 64% of his first-pitch sinkers for a strike. 19% of sinkers he’s thrown this year in his first pitch have been balls. Refer to pitch outcomes below:

However, he should avoid throwing a curveball for the first pitch, if he doesn’t want to get behind. Out of 12 curveballs that he’s thrown for the first pitch this year, nine of them were called a ball. If you look at the tables above, he did much better last year with his curveball for the first pitch.

He should not throw a curveball if he wants to get further ahead either. Look at the table below for pitch outcomes in 0-1 counts. You will notice that batters are not chasing it, and they don’t whiff on it when they do swing. Although Cecil’s 2016 sample of 0-1 curveballs is limited to only nine pitches, you can see the pattern. Batters have taken 12% more balls against Cecil in 0-1 counts this year compared to the second half of 2015, and they have taken 36% fewer swings against his curve. No batters have whiffed against Cecil’s 0-1 curveball this year. His 0-1 curveball served him well in the second half of 2015, inducing whiffs 26% of the time. Now that he can’t do that, he is failing to get ahead 0-2 as often as last year, which gives him more trouble getting outs.

And when he does get to an 0-2 count somehow, he is struggling to get guys out with his curveball. You can see here:

Half the curveballs he’s thrown in 2016 in 0-2 counts were called balls, a worse rate than last year. Batters swung at it 61% of the time in the second half last season, while they now swing at it only 39% of the time. Batters are also making more contact with 0-2 curveballs this year than last year. It’s the same story across all counts in which he is ahead.

His refined curveball in the second half of the 2015 season was the reason why he was doing so well. According to FanGraphs, his wCu/C in the 2014 and 2015 seasons were 2.5 and 2.8, respectively. This year, it is an awful -5.2. His curveball must be refined because batters figured it out.

Let’s figure out what could be wrong with his curveball then.

His curveball’s horizontal movement deviates from last year’s second half. His curveball was great in the first half of last season as well. Last season, the horizontal movement of the curveball ranged from 0 to 1 inch. This means that his curveball last year moved 0 to 1 inch away from the catcher’s glove side. This season, it is moving toward the glove side of the catcher. I don’t know whether that has a negative impact. It’s inconclusive.

Brett’s horizontal release points of 2016 curveballs are up to par with the second half of 2015. So I don’t think horizontal release point has had any impact on his curveball this year.

He has more vertical depth on his curveball this season than the last. More vertical depth on his curve is a good thing. But I don’t think improving vertical depth will fix anything, given that his curveball got its job done last year with less vertical depth.

Vertical release point of his curveball this season is within the range of second half of 2015. I don’t think vertical release point of his curveball is a problem either.

His curveball velocity is down this year. This is likely the biggest problem with Cecil. This implies that batters have some more fractions of a second to judge whether the curveball is a ball or strike. This gives batters some more time to decide whether to swing or not. I am convinced that a velocity increase will help him. Fortunately, he experienced a velocity increase throughout each of his last four seasons (2012 to 2015), as you can see in this graph:

It does seem to explain his improved ERA throughout each of the last two seasons. We should monitor his velocity this May to see if there is any sign of improvement. In the meantime, it’d be best to let him pitch in low-to-medium leverage situations until he is warmed up for the home stretch. He looks to me like he will be okay. He is only 29 this year and I don’t think we need to worry yet that his velocity drop is a permanent thing. Message to Brett: “Just relax and stop thinking about your disappointing start to this season. It’s likely nothing and only time will solve it. Congratulations on the birth of your daughter.”

David Price Should Be Okay

by bostnboy3

May 13, 2016

(Written before Price dominated on Thursday)

Obviously there is some concern about David Price. So I went and dove into his numbers to see what I could figure out. (All data below was obtained through FanGraphs, who coincidentally also wrote an article about Price, with similar methodology and results.)

So let’s start at the top and look at his ERA.

| | ERA |
|---|---|
| Career | 3.19 |
| 3 Year Average | 3.01 |
| 2016 | 6.75 |

Yikes! His ERA this season is more than twice what we’ve ever seen out of Price. This is no surprise to anyone. But we all know that historical ERA isn’t really a good predictor of future ERA (it includes too much “noise” from things that the pitcher can’t control). So let’s look at some metrics that are better indicators of the way he’s pitching.

| | SIERA | xFIP |
|---|---|---|
| Career | 3.36 | 3.34 |
| 3 Year Average | 3.09 | 3.07 |
| 2016 | 2.99 | 2.94 |

Okay, so according to both xFIP and SIERA, Price is actually pitching as well as he’s ever pitched. Nothing to be concerned with here, and in fact we should be really happy with how he’s pitching.

In most cases, when a pitcher’s ERA is significantly higher or lower than their xFIP and SIERA, it can usually be chalked up to variance and you should expect things to settle back to their historical numbers.

Over his career, Price’s ERA has actually outperformed his xFIP by about 0.15 runs per nine innings (3.19 versus 3.34), which makes it even more peculiar that this season his ERA is *lagging* his xFIP by such a significant margin.
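The size of that reversal is easy to see by subtracting xFIP from ERA for each row of the two tables above. This is just arithmetic on the quoted figures; the variable names are hypothetical.

```python
# (ERA, xFIP) pairs taken from the tables above.
rows = {
    "Career": (3.19, 3.34),
    "3 Year Average": (3.01, 3.07),
    "2016": (6.75, 2.94),
}

# Negative gap: ERA beat xFIP. Positive gap: ERA is lagging xFIP.
gaps = {label: round(era - xfip, 2) for label, (era, xfip) in rows.items()}

for label, gap in gaps.items():
    print(f"{label}: ERA minus xFIP = {gap:+.2f}")
```

The career and three-year gaps are slightly negative, while the 2016 gap is nearly four runs in the other direction.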

So let’s go a little deeper and try to figure out *why* his ERA is so much higher than his xFIP.

Well, the obvious first things that jump off the page are his BABIP and Left on Base % (LOB%)

| | BABIP | LOB% |
|---|---|---|
| Career | .286 | 75% |
| 3 Year Average | .298 | 74% |
| 2016 | .373 | 54% |

His BABIP is 75 points higher than his three-year average and he’s stranding 20% fewer runners. It’s easy to look at these numbers and say he’s just getting unlucky on balls in play and getting unlucky on batter sequencing.

The LOB% I can buy being just bad luck, but the BABIP I want to check on. Let’s look at his batted ball profile and see how unlucky he’s been on balls in play:

| | LD% | GB% | FB% | Soft % | Med % | Hard % |
|---|---|---|---|---|---|---|
| Career | 20% | 44% | 36% | 18% | 56% | 27% |
| 3 Year Average | 22% | 42% | 36% | 17% | 55% | 28% |
| 2016 | 29% | 40% | 31% | 17% | 42% | 41% |

Uh-oh. His soft-hit and ground-ball ratios are constant, but in 2016 he’s giving up more line drives and harder contact by a significant margin. Giving up more line drives and harder hit balls helps explain his elevated BABIP… It’s not just bad luck. By my calculation his xBABIP based on this batted ball profile is .361. That’s slightly lower than his actual BABIP (.373), but still way higher than his career average.

This is definitely a bit concerning, but let’s see if we can figure out why he’s giving up such hard contact. First place I like to look is his command and velocity numbers.

| | Fastball Velocity | Fastball % |
|---|---|---|
| Career | 94.6 | 35% |
| 3 Year Average | 93.6 | 23% |
| 2016 | 91.8 | 12% |

Another red flag. His fastball velocity is down almost 2mph from his three-year average. I did check, and his velocity went up about 1.5mph between April and August last year so we should see his velocity pick up as the year goes on, but this isn’t something you want to see out of a guy you just spent $217M on. To go along with the reduced velocity, you are seeing Price rely way less on his four-seamer. He’s basically replaced it with two-seam fastballs and cutters, hoping the movement he gets out of them makes up for the reduced velocity.

But how is he doing with his slightly altered pitch selection?

| | K% | BB% | Zone % | Contact % | SwStr% |
|---|---|---|---|---|---|
| Career | 23% | 6% | 47% | 80% | 9% |
| 3 Year Average | 25% | 4% | 48% | 79% | 10% |
| 2016 | 29% | 7% | 48% | 71% | 14% |

First takeaway is that his strikeouts are actually up! Despite the reduced velocity, he’s striking out more batters and inducing more swing and misses. These are good signs that his “stuff” is still there.

Not shown above, but he’s not getting guys to chase pitches like he used to (3% drop in swing rate on balls out of the zone compared to his three-year average), but on pitches in the zone he’s getting way *more* swing and misses (12% improvement on batter contact rate on pitches in the zone).

**So what does this all mean?**

As far as I can tell, Price will be fine. He’s lost some velocity, so you are seeing him switch from a four-seam fastball to a two-seam fastball. Because of the movement on these pitches, he’s getting more swings and misses when he throws strikes. But with the drop in velocity, when guys do put the bat on the ball, they are doing so with more authority. What this means for Price is that he will need to get his offspeed pitches working to keep batters off balance and induce more swings on pitches out of the zone, namely his changeup, which has seen a big drop in value so far this year.

His LOB% should stabilize and if he can start commanding his changeup better, his BABIP should drop as well, which will make his ERA start to resemble that of the Price the Red Sox paid for this offseason.

The best news of all? It’s only May, so we have a lot of baseball left. No need to panic yet, as far as I can tell.

A Conversation On the Trainwreck in Atlanta

by Jaack

May 13, 2016

I am a Braves optimist. I believe that the Braves are just a typically bad team on their way to a typically bad season.

I am a Braves pessimist. I believe that 60 wins would be a miracle for this travesty of a team. I think they would be no better than average in the International League.

You’re overreacting. Yeah, an 8-24 record is nothing to brag about, but that isn’t an historically awful month. I mean, just last year, the Phillies had a 3-19 stretch in May and June, and they didn’t even lose 100 games that season. Or even better, look at the Twins, they’ve only won ~~one~~ none more games than the Braves. No one is talking about them as a historically bad team. I mean, the 2014 Giants, who won the World Series, had a 7-21 stretch in June. Calm down, it’s only May.

This isn’t simply a matter of the Braves having a poor stretch. The Braves simply don’t have good players. Freddie Freeman is good, Ender Inciarte is probably all right, and Nick Markakis is average. And that’s it. Their top two pitchers are Julio Teheran, who won’t be a Brave in two months, and Jhoulys Chacin, who hasn’t pitched 100 innings since 2013, and who has also now been traded so nevermind. The Braves have a severe lack of talent, and the little talent they have is going to be traded away.

Yeah the team doesn’t have very many established quality players, but help is on the way. Mallex Smith is already up and Dansby Swanson is on the way. Aaron Blair. Maybe they can get something out of Hector Olivera. The kids on the way will help boost the offense once Markakis and Aybar get traded away midseason.

What offense is there to boost? The Braves’ team wRC+ is 57. The 1920 Athletics, the worst hitting team of all time, had a wRC+ of 67. The team has hit seven home runs. Trevor Story did that in about a week. Ryan Howard’s rotting corpse has hit about as many home runs as this entire team. And it isn’t like they have been unlucky. The team’s BABIP is .289, which is just about league average. By BaseRuns the Braves have won exactly as many games as they ought to have. In fact, BaseRuns calculates that the Braves should be averaging 2.6 runs per game.

The Braves’ BaseRuns are bad, but the Brewers and Reds haven’t exactly been much better. Besides, the Braves are still projected to win 60 games if you look at the depth charts. Even if you think that’s too optimistic, it’s probably not 15 wins too optimistic, which is what it would have to be for the Braves to be historically bad.

The 1962 Mets were better through 28 games than the 2016 Braves have been. They lost 120 games. The Braves are on pace to lose 124.

Wait a second, you aren’t even responding to my points, you’re just saying scary things.

The Braves’ run differential is -63. Extrapolate that out to 162 games and that’s -340. The 119-loss 2003 Tigers had a run differential of -337.

I GET IT! The Braves have been truly awful so far. But they’ve had a ridiculous schedule too. The worst two teams they have faced so far are the Marlins and the Diamondbacks, and they went 3-3 against them. Once the Braves get some games against the Phillies, Reds, and Brewers, their record will improve.

The Braves are 2-16 at home.

But they’re 6-8 on the road! That’s actually not terrible!

Ryan Weber is sixth on the team in offensive value among players with plate appearances. He is a reliever. He grounded out in his one at bat.

But…. but..

Also, Jeff Francoeur.

Embrace the darkness, my child.

Simulating the WARriors

by Gus Madsen

May 13, 2016

116.

116 is the Major League Baseball record for most wins in a single season, achieved by the 1906 Chicago Cubs and the 2001 Seattle Mariners.

For 95 years the record was unbreakable. Fifteen years after that, it remains unmatched.

Major-league players are assigned a value called Wins Above Replacement (WAR), a statistic that displays the number of wins a player added to the team above what a replacement player would have added. In recent years, a WAR value of 8 or higher would be associated with an MVP-quality season, a value of 5 for an All-Star, 2 for the average starter, 0-2 for a bench player, and less than 0 for a replacement player.

With my curiosity looming, I decided to do a little research and came up with a list of the highest single-season WAR values for every position throughout history. But I decided to take it a step further. I wanted to create the greatest WAR-based roster of all time, a 25-man winning powerhouse that would be called, fittingly, the WARriors. I found the highest single-season WAR for each of the starting eight non-pitcher positions, followed by the highest single-season WAR for a five-man starting rotation, and then decided to add three infielders, three outfielders, a catcher, four relief pitchers, and a closer, all with the highest single-season WAR in their respective position (for the bench hitters, I chose the players with the NEXT-highest WAR at their position, behind the starting eight).

Here’s what I came up with:

WARriors Roster

C- Mike Piazza 1997 – 8.7 WAR
1B- Lou Gehrig 1927 – 11.8 WAR
2B- Rogers Hornsby 1924 – 12.1 WAR
3B- Mike Schmidt 1974 – 9.7 WAR
SS- Cal Ripken Jr. 1991 – 11.5 WAR
LF- Carl Yastrzemski 1967 – 12.4 WAR
CF- Barry Bonds 2001 – 11.8 WAR
RF- Babe Ruth 1923 – 14.1 WAR

Total: 92.1 WAR

UT- Honus Wagner 1908 – 11.5 WAR
OF- Ty Cobb 1917 – 11.3 WAR
OF- Mickey Mantle 1957 – 11.3 WAR
OF- Willie Mays 1965 – 11.2 WAR
UT- Joe Morgan 1975 – 11.0 WAR
UT- Jimmie Foxx 1932 – 10.5 WAR
C- Johnny Bench 1972 – 8.6 WAR

Total: 75.4 WAR

SP- Tim Keefe 1883 – 20 WAR
SP- Old Hoss Radbourn 1884 – 19.3 WAR
SP- Jim Devlin 1876 – 18.6 WAR
SP- Pud Galvin 1884 – 18.4 WAR
SP- Guy Hecker 1884 – 17.8 WAR

Total: 94.1 WAR

RP- Jim Kern 1979 – 6.2 WAR
RP- Mark Eichhorn 1986 – 7.4 WAR
RP- John Hiller 1973 – 8.1 WAR
RP- Bruce Sutter 1977 – 6.5 WAR
CL- Goose Gossage 1975 – 8.2 WAR

Total: 36.4 WAR

Added together, the total team WAR for the WARriors is a ridiculous 298. That’s almost two full seasons of wins. To put it in perspective, the 2001 Mariners had a total team WAR of 67.7, and the 1906 Cubs’ total was 56. This is expected, however, and is a near impossible task to analyze efficiently because of the lack of pre-1900 data, and the mix of players from almost every decade. But it’s still fun to look at, so let’s run with it.
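For anyone who wants to check the addition, the subtotals and the team total can be reproduced from the roster listed above. The grouping labels and variable names here are hypothetical; the WAR values are the ones in the roster.

```python
# Single-season WAR values copied from the roster above, grouped as listed.
roster = {
    "starting eight": [8.7, 11.8, 12.1, 9.7, 11.5, 12.4, 11.8, 14.1],
    "bench": [11.5, 11.3, 11.3, 11.2, 11.0, 10.5, 8.6],
    "rotation": [20.0, 19.3, 18.6, 18.4, 17.8],
    "bullpen": [6.2, 7.4, 8.1, 6.5, 8.2],
}

subtotals = {group: round(sum(wars), 1) for group, wars in roster.items()}
team_war = round(sum(subtotals.values()), 1)

# How many 162-game seasons' worth of wins the total represents.
seasons_of_wins = team_war / 162

for group, total in subtotals.items():
    print(f"{group}: {total}")
print(f"team: {team_war} WAR, about {seasons_of_wins:.2f} seasons of wins")
```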

Now, the question on the table is this: Would this team win more than 116 games? I’d put money on it. But here’s an even bolder question: would this team go 162-0? Again, we have to understand what we’re dealing with. The skill level of a ballplayer in 2016 is entirely different from that of an 1800s hurler pitching 500-600 innings per year. Luckily, we have the technology.

First, we need a starting lineup. As the self-proclaimed WARriors manager, here’s the Opening Day nine that I would play (each player listed had the highest single season WAR value for their position):

Hornsby 2B – .424/.507/.696

Bonds CF – .323/.515/.863

Ruth RF – .393/.545/.764

Gehrig 1B – .373/.474/.765

Yastrzemski LF – .326/.418/.622

Schmidt 3B – .282/.395/.546

Piazza C – .362/.431/.638

Ripken SS – .323/.374/.566

Keefe SP – 41-27/2.41/359 K’s

But to go 162-0, we need to play 162 games, and who would those games be against? My idea was to simulate a 162-game season by playing 54 three-game series against the last 54 World Series champions (54 times 3 = 162). That should make it interesting, right? So for example, the WARriors would begin with three games against the 2015 Royals, followed by three against the 2014 Giants, then three versus the 2013 Red Sox, and so on, dating back to 1961 (no World Series was played in 1994, which is why 54 champions cover 55 years). To be fair, every other series would be on the road, and the pitcher’s spot would bat. To support my love of the Reds, I decided to use the 2003 Great American Ball Park as the WARriors’ home stadium.
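Here is a rough Python sketch of that schedule, assuming the WARriors open at home and then alternate home and road by series. The function name build_schedule and the tuple layout are my own choices, not anything taken from the simulator.

```python
# No World Series was played in 1994, so that year is skipped.
champion_years = [year for year in range(2015, 1960, -1) if year != 1994]


def build_schedule(years, games_per_series=3):
    """Return (year, site, game_number) tuples, alternating home and road by series."""
    schedule = []
    for index, year in enumerate(years):
        # Assumption: the first series is at home, and the site flips every series.
        site = "home" if index % 2 == 0 else "road"
        for game in range(1, games_per_series + 1):
            schedule.append((year, site, game))
    return schedule


schedule = build_schedule(champion_years)
print(len(champion_years), "series,", len(schedule), "games")
```

Counting down from 2015 to 1961 and dropping 1994 leaves exactly 54 champions, which is what makes the 162-game total work.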

I used the whatifsports.com Dream Team simulator to assemble the WARriors roster. Because the data on their website only goes back to 1885, I will need to eliminate the years of my entire starting rotation from the original roster. However, I am replacing that data with each pitcher’s next-best year post-1885, or finding the next-best-WAR starting pitcher if one of the originals did not play beyond 1885, or if that next-best had a better WAR. Whatifsports manually subs position players as needed, and I manually rotated the starting pitchers every game, also switching the WARriors to the road team every other series.

Without further ado, here are the results of the simulated games:

2015 Royals @ WARriors

Game 1: WARriors 18 Royals 0

Game 2: WARriors 19 Royals 3

Game 3: WARriors 17 Royals 10

WARriors @ 2014 Giants

Game 1: WARriors 11 Giants 0

Game 2: WARriors 2 Giants 1

Game 3: WARriors 11 Giants 8

2013 Red Sox @ WARriors

Game 1: WARriors 5 Red Sox 4

Game 2: WARriors 23 Red Sox 4

Game 3: WARriors 11 Red Sox 7

WARriors @ 2012 Giants

Game 1: WARriors 4 Giants 2

Game 2: WARriors 18 Giants 4

Game 3: WARriors 21 Giants 3

2011 Cardinals @ WARriors

Game 1: WARriors 21 Cardinals 2

Game 2: WARriors 27 Cardinals 0

Game 3: WARriors 23 Cardinals 2

WARriors @ 2010 Giants

Game 1: WARriors 18 Giants 8

Game 2: WARriors 6 Giants 1

Game 3: WARriors 13 Giants 10

2009 Yankees @ WARriors

Game 1: WARriors 7 Yankees 2

Game 2: WARriors 15 Yankees 3

Game 3: WARriors 10 Yankees 6

WARriors @ 2008 Phillies

Game 1: WARriors 5 Phillies 4

Game 2: WARriors 13 Phillies 1

Game 3: WARriors 9 Phillies 5

2007 Red Sox @ WARriors

Game 1: WARriors 8 Red Sox 3

Game 2: WARriors 16 Red Sox 8

Game 3: WARriors 12 Red Sox 5

WARriors @ 2006 Cardinals

Game 1: WARriors 21 Cardinals 7

Game 2: WARriors 18 Cardinals 4

Game 3: WARriors 17 Cardinals 11

2005 White Sox @ WARriors

Game 1: WARriors 8 White Sox 2

Game 2: WARriors 14 White Sox 0

Game 3: WARriors 12 White Sox 4

WARriors @ 2004 Red Sox

Game 1: WARriors 5 Red Sox 3

Game 2: WARriors 7 Red Sox 1

Game 3: WARriors 3 Red Sox 1

2003 Marlins @ WARriors

Game 1: WARriors 15 Marlins 0

Game 2: WARriors 23 Marlins 6

Game 3: WARriors 21 Marlins 5

WARriors @ 2002 Angels

Game 1: WARriors 9 Angels 7

Game 2: WARriors 7 Angels 0

Game 3: WARriors 16 Angels 5

2001 Diamondbacks @ WARriors

Game 1: WARriors 2 Diamondbacks 0

Game 2: WARriors 5 Diamondbacks 1

Game 3: WARriors 5 Diamondbacks 4

WARriors @ 2000 Yankees

Game 1: WARriors 13 Yankees 10

Game 2: WARriors 13 Yankees 12

Game 3: WARriors 19 Yankees 3

1999 Yankees @ WARriors

Game 1: WARriors 19 Yankees 13

Game 2: WARriors 16 Yankees 12

Game 3: WARriors 19 Yankees 9

WARriors @ 1998 Yankees

Game 1: WARriors 11 Yankees 5

Game 2: WARriors 8 Yankees 4

Game 3: WARriors 16 Yankees 1

1997 Marlins @ WARriors

Game 1: WARriors 27 Marlins 0

Game 2: WARriors 24 Marlins 2

Game 3: WARriors 15 Marlins 0

WARriors @ 1996 Yankees

Game 1: WARriors 13 Yankees 3

Game 2: WARriors 16 Yankees 0

Game 3: WARriors 25 Yankees 10

1995 Braves @ WARriors

Game 1: WARriors 9 Braves 5

Game 2: WARriors 10 Braves 2

Game 3: WARriors 6 Braves 4

WARriors @ 1993 Blue Jays

Game 1: WARriors 12 Blue Jays 6

Game 2: WARriors 13 Blue Jays 2

Game 3: WARriors 7 Blue Jays 1

1992 Blue Jays @ WARriors

Game 1: WARriors 10 Blue Jays 4

Game 2: WARriors 17 Blue Jays 13

Game 3: WARriors 15 Blue Jays 10

WARriors @ 1991 Twins

Game 1: WARriors 12 Twins 0

Game 2: WARriors 19 Twins 8

Game 3: WARriors 6 Twins 4

1990 Reds @ WARriors

Game 1: WARriors 10 Reds 9

Game 2: WARriors 5 Reds 1

Game 3: WARriors 12 Reds 2

WARriors @ 1989 A’s

Game 1: WARriors 16 A’s 12

Game 2: WARriors 11 A’s 7

Game 3: WARriors 21 A’s 6

1988 Dodgers @ WARriors

Game 1: WARriors 8 Dodgers 3

Game 2: WARriors 14 Dodgers 11

Game 3: WARriors 9 Dodgers 3

WARriors @ 1987 Twins

Game 1: WARriors 20 Twins 6

Game 2: WARriors 22 Twins 1

Game 3: WARriors 15 Twins 9

1986 Mets @ WARriors

Game 1: WARriors 12 Mets 2

Game 2: WARriors 15 Mets 5

Game 3: WARriors 9 Mets 5

WARriors @ 1985 Royals

Game 1: WARriors 9 Royals 5

Game 2: WARriors 4 Royals 3

Game 3: WARriors 17 Royals 5

1984 Tigers @ WARriors

Game 1: WARriors 8 Tigers 3

Game 2: WARriors 4 Tigers 1

Game 3: WARriors 14 Tigers 0

WARriors @ 1983 Orioles

Game 1: WARriors 19 Orioles 3

Game 2: WARriors 23 Orioles 4

Game 3: WARriors 14 Orioles 2

1982 Cardinals @ WARriors

Game 1: WARriors 21 Cardinals 0

Game 2: WARriors 18 Cardinals 1

Game 3: WARriors 7 Cardinals 5

WARriors @ 1981 Dodgers

Game 1: WARriors 6 Dodgers 0

Game 2: WARriors 16 Dodgers 0

Game 3: WARriors 10 Dodgers 6

1980 Phillies @ WARriors

Game 1: WARriors 9 Phillies 6

Game 2: WARriors 12 Phillies 0

Game 3: WARriors 15 Phillies 12

WARriors @ 1979 Pirates

Game 1: WARriors 8 Pirates 4

Game 2: WARriors 10 Pirates 9

Game 3: WARriors 15 Pirates 5

1978 Yankees @ WARriors

Game 1: WARriors 3 Yankees 0

Game 2: WARriors 6 Yankees 1

Game 3: WARriors 14 Yankees 1

WARriors @ 1977 Yankees

Game 1: WARriors 17 Yankees 14

Game 2: WARriors 11 Yankees 7

Game 3: WARriors 14 Yankees 9

1976 Reds @ WARriors

Game 1: WARriors 18 Reds 5

Game 2: WARriors 2 Reds 0

Game 3: WARriors 5 Reds 3

WARriors @ 1975 Reds

Game 1: WARriors 9 Reds 0

Game 2: WARriors 4 Reds 6

Game 3: WARriors 8 Reds 4

1974 A’s @ WARriors

Game 1: WARriors 16 A’s 13

Game 2: WARriors 10 A’s 2

Game 3: WARriors 9 A’s 7

WARriors @ 1973 A’s

Game 1: WARriors 1 A’s 0

Game 2: WARriors 12 A’s 4

Game 3: WARriors 4 A’s 0

1972 A’s @ WARriors

Game 1: WARriors 8 A’s 5

Game 2: WARriors 5 A’s 3

Game 3: WARriors 9 A’s 5

WARriors @ 1971 Pirates

Game 1: WARriors 16 Pirates 3

Game 2: WARriors 5 Pirates 1

Game 3: WARriors 11 Pirates 9

1970 Orioles @ WARriors

Game 1: WARriors 14 Orioles 12

Game 2: WARriors 9 Orioles 8

Game 3: WARriors 12 Orioles 2

WARriors @ 1969 Mets

Game 1: WARriors 22 Mets 0

Game 2: WARriors 17 Mets 0

Game 3: WARriors 15 Mets 1

1968 Tigers @ WARriors

Game 1: WARriors 12 Tigers 6

Game 2: WARriors 10 Tigers 4

Game 3: WARriors 18 Tigers 16

WARriors @ 1967 Cardinals

Game 1: WARriors 16 Cardinals 5

Game 2: WARriors 13 Cardinals 7

Game 3: WARriors 24 Cardinals 14

1966 Orioles @ WARriors

Game 1: WARriors 15 Orioles 2

Game 2: WARriors 20 Orioles 8

Game 3: WARriors 9 Orioles 3

WARriors @ 1965 Dodgers

Game 1: WARriors 5 Dodgers 3

Game 2: WARriors 6 Dodgers 3

Game 3: WARriors 5 Dodgers 0

1964 Cardinals @ WARriors

Game 1: WARriors 12 Cardinals 1

Game 2: WARriors 19 Cardinals 7

Game 3: WARriors 12 Cardinals 8

WARriors @ 1963 Dodgers

Game 1: WARriors 8 Dodgers 0

Game 2: WARriors 8 Dodgers 1

Game 3: WARriors 6 Dodgers 4

1962 Yankees @ WARriors

Game 1: WARriors 10 Yankees 9

Game 2: WARriors 3 Yankees 1

Game 3: WARriors 5 Yankees 2

WARriors @ 1961 Yankees

Game 1: WARriors 17 Yankees 11

Game 2: WARriors 11 Yankees 0

Game 3: WARriors 13 Yankees 2

WARriors Final Season Record: 161-1

Unbelievable. Well folks, there it is. If you actually sifted through all those results, you would see that the one, tiny blemish on an otherwise perfect season was game two against the notorious 1975 Big Red Machine. According to the simulation, George Foster went 1-for-4 in the game with a two-run shot, and Pete Rose added an RBI single and a stolen base. Ironically, my Reds were the ones to end the streak.

In short, a 25-man roster of the best single-season WAR values in the history of baseball went 161-1 against the last 54 World Series Champions, playing each champ in a three-game series and alternating between road and home venues. The WARriors scored an outrageous 2,002 runs in 162 games during this simulation, equal to about 12.4 runs per game. Their opponents scored 708 runs in 162 games, equal to about 4.4 runs per game. That’s a run differential of 1,294.
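The per-game numbers are easy to double-check. This is a quick Python sketch that takes the season as the 162 games of the 54 three-game series and uses the run totals from the results above; the variable names are mine.

```python
games = 54 * 3  # 54 three-game series
runs_scored = 2002  # WARriors runs, summed from the results above
runs_allowed = 708  # opponents' runs, summed from the results above

run_differential = runs_scored - runs_allowed
scored_per_game = runs_scored / games
allowed_per_game = runs_allowed / games

print(f"{games} games, run differential {run_differential}")
print(f"{scored_per_game:.1f} scored and {allowed_per_game:.1f} allowed per game")
```

Over 162 games that works out to roughly 12.4 runs scored and 4.4 allowed per game.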

I am astounded both that I had the patience to run all of those games and that not one other team was able to sneak by this loaded roster.

This makes for a very interesting case, and leads to further questions and different match-ups that would be extremely fun to see. Different ballparks, more accurate values, different lineups, etc. would obviously produce different outcomes, but these simulations revealed that winning isn’t everything.

Okay, maybe 161 times out of 162 it is.
