KC’s Secret Sauce: Age-Defying Player Development

The 2006 Kansas City Royals went 62-100, tallying the team’s fourth 100-loss record in five years. In June of this particular season, owner David Glass hired a new general manager, Dayton Moore. ESPN’s first reaction was that Moore could have waited for the GM job with the Braves, who unlike the Royals were an “admired organization.” Jason Whitlock, who was in the midst of a 16-year stint as a writer for the Kansas City Star, declared that Moore was owner David Glass’ new scapegoat, and would soon be undermined by Glass’ “cheapness and incompetence.”

It took Dayton Moore a while to gain respect in Kansas City. In January of 2014, the last off-season that Moore’s Royals would endure without a World Series appearance, Royals Review mocked Moore’s tenure in Kansas City, which had still produced only one winning season.

Dayton Moore’s professional history before moving to Kansas City seemed to be the opposite of what a modern general manager was supposed to be. While the league’s front offices frantically shifted towards advanced statistics, Moore’s background was in scouting and player development. He established an excellent analytics office in the Royals’ organization, but his expertise and focus were drafting and signing promising young athletes and patiently developing the right type of team.

It worked, slowly. In 2009, they won 65 games. Then they won more in 2010 — and again in 2011, 2012, 2013, 2014, and 2015.

For six straight seasons, the Royals won more games than in the year before. They are the only team since World War II to do that. And they did it by being better than the rest of the league at exactly what Moore was supposed to do: develop talent.

In 2013, Jeff Zimmerman found that aging curves in baseball were changing. The fascinating and significant article showed that, in general, hitters no longer improved throughout their 20’s.

Dayton Moore’s Royals did not get the memo, and this may be what sets them apart from the rest of the league more than anything else. Sure, their bullpen is historically dominant. Yes, their solid and spectacular defense is remarkable. Their contact ability is extreme.

But all of these factors ignore something else that the Royals do historically well. Their hitters keep getting better, at points in their career when they are not supposed to.

From the ages of 23 to 26, Alex Gordon had a wRC+ of 93. That is not supposed to get better. From the age of 27 to 31, his wRC+ has been 123.

From the ages of 25 to 27, Lorenzo Cain had a wRC+ of 86. That is not supposed to get better. In his age-28 and age-29 seasons, his wRC+ has been 121.

From the ages of 22 to 25, Mike Moustakas had a wRC+ of 82. This year, at age 26, it was 124.

This year, career 104 wRC+ hitter Eric Hosmer hit 125 at the age of 25.

Possibly most importantly for the 2015 team, Kendrys Morales followed up a 71 wRC+ season at the age of 31 with a 131 mark this year.

These players are not extremely old. None of them are at a point in their careers where they should be falling off a cliff, but recent history suggests that they should be stagnating, at best. But that’s not what the Royals have done.

Since Dayton Moore’s first full season in 2007, there have been plenty of mid-career surprises.

To create an aging curve, I measured the difference in wRC+ between consecutive seasons in which players had at least 300 PA. Then, I made every season relative to the performance at age 26, because that appears to be the first year of plateauing across MLB. (Note: The results are not identical to Zimmerman’s.)
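For anyone who wants to reproduce the method, here is a minimal Python sketch of the calculation described above. It is only an illustration under stated assumptions: the function name `aging_curve`, the `(player, age, pa, wrc_plus)` row layout, and the sample rows are all hypothetical, and it uses a simple unweighted average of year-to-year changes, which may not match the weighting behind the chart.

```python
from collections import defaultdict


def aging_curve(seasons, min_pa=300, base_age=26, base_value=100.0):
    """Build an aging curve from year-to-year wRC+ changes.

    seasons: iterable of (player, age, pa, wrc_plus) rows (hypothetical layout).
    Only consecutive-age pairs where both seasons reach min_pa are used.
    """
    # Keep qualifying seasons, indexed by player and age.
    by_player = defaultdict(dict)
    for player, age, pa, wrc_plus in seasons:
        if pa >= min_pa:
            by_player[player][age] = wrc_plus

    # Collect the change from age N to age N+1, stored under age N+1.
    deltas = defaultdict(list)
    for ages in by_player.values():
        for age, wrc_plus in ages.items():
            if age + 1 in ages:
                deltas[age + 1].append(ages[age + 1] - wrc_plus)

    # Assumption: a plain (unweighted) average of the changes at each age.
    mean_delta = {age: sum(d) / len(d) for age, d in deltas.items()}

    # Chain the average changes outward from the base age.
    curve = {base_age: base_value}
    age = base_age
    while age + 1 in mean_delta:
        curve[age + 1] = curve[age] + mean_delta[age + 1]
        age += 1
    age = base_age
    while age in mean_delta:
        curve[age - 1] = curve[age] - mean_delta[age]
        age -= 1
    return dict(sorted(curve.items()))


# Hypothetical rows, not real player data.
sample = [
    ("A", 25, 500, 90), ("A", 26, 500, 100), ("A", 27, 500, 105),
    ("B", 26, 400, 110), ("B", 27, 400, 111),
]
print(aging_curve(sample))
```

Chaining the average changes outward from age 26 is one common way to turn paired-season differences into a curve; weighting each pair by plate appearances would be a reasonable alternative.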

From 2007-2015, with Age 26 being normalized to 100, here is what we get:

[Figure: Aging Curve]

Royals players dramatically improve from their early 20’s all the way until they turn 30. While the sample size is not huge, the consistent improvement is remarkable. Kansas City’s players show greater improvement than the MLB average at ages 25, 26, 27, 28, and 30. This feat is doubly difficult when you consider that a lucky season at age 27 should naturally show a decline at age 28.

Dayton Moore and the Royals organization are rightly being showered with praise after their second consecutive World Series appearance and their seemingly invincible run through the 2015 playoffs. But their formula for success is not a frozen-in-time snapshot of the 2015 team. Player production in one year does not define the strengths of the Royals organization. Rather, the possibility that it could be even better next year does.


Help With the Physics Behind PITCHf/x

I’ve been digging into the PITCHf/x data over the past few weeks and stumbled across something I can’t quite figure out. When I first started using the data, I didn’t realize that px and pz were where PITCHf/x maps the final location of the ball; undeterred, I turned to Google to jog my memory on the basic physics formulae that relate time to initial velocity, final velocity, distance and constant acceleration.

Step 1 was to calculate final velocity for every pitch from -50 feet to 0 feet. This was a simple formula: SQRT(vy0^2-2*50*ay). That is, initial velocity squared less 2 * acceleration * distance, based on y0 being 50 feet from home plate.

Step 2 was to calculate time based on initial velocity and final velocity. I cross-checked my numbers by using the Start_Speed and End_Speed (which don’t match up to vy0 for some reason) and got basically the same number.

Step 3 was to calculate xFinal based on Time, ax and x0 (ditto for zFinal). Strangely, my zFinal was a little lower (about .17 feet) than the PITCHf/x pz value, and my xFinal was about .015 feet farther to the right than the px value. That might mean that they are measuring z and x 50 feet from release point, rather than at home plate.
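To make the three steps easier to check, here is a rough Python sketch of the same constant-acceleration calculation. The function name `plate_crossing` and the `y_final` parameter are labels invented for this sketch, not PITCHf/x fields, and the sample pitch is made up. Two things in it are assumptions worth flagging. First, the position formula includes the initial velocity terms (vx0 and vz0) alongside x0, ax, z0 and az. Second, `y_final` defaults to 0 to match Step 1; px and pz are generally described as the location at the front of home plate (y of about 1.417 feet) rather than at y = 0, so trying `y_final=1.417` may account for some of the gap. Treat that as a possibility to test, not a confirmed answer.

```python
import math


def plate_crossing(x0, z0, vx0, vy0, vz0, ax, ay, az, y0=50.0, y_final=0.0):
    """Return (time, x, z) when the ball reaches y_final under constant acceleration.

    Units are feet and seconds. vy0 is negative because the ball travels
    toward the plate (y decreases from y0 to y_final).
    """
    distance = y0 - y_final
    if ay == 0:
        # No acceleration along y: time is simply distance over speed.
        t = -distance / vy0
    else:
        # Step 1: v^2 = v0^2 + 2*a*(y - y0). The ball is still moving toward
        # the plate, so the final y-velocity takes the negative root.
        vy_final = -math.sqrt(vy0 ** 2 - 2.0 * ay * distance)
        # Step 2: time from the change in velocity.
        t = (vy_final - vy0) / ay
    # Step 3: position = start + velocity * t + 0.5 * acceleration * t^2.
    x = x0 + vx0 * t + 0.5 * ax * t ** 2
    z = z0 + vz0 * t + 0.5 * az * t ** 2
    return t, x, z


# Hypothetical pitch, not real PITCHf/x data.
t, x, z = plate_crossing(x0=-1.5, z0=6.0, vx0=5.0, vy0=-130.0, vz0=-5.0,
                         ax=-8.0, ay=28.0, az=-20.0)
print(round(t, 3), round(x, 3), round(z, 3))
```

Running the same pitch with `y_final=0.0` and `y_final=1.417` and comparing both results with px and pz is a quick way to see which plane the published values correspond to.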

I need to know if (a) my math is wrong, (b) pz and px are wrong, or (c) ax and az are wrong.

Any help would be appreciated!


Meant to Be? The Rockies and the 3-3-3 Rotation

Ever since the Rockies started playing baseball in Colorado, they’ve continually run into the same problem: pitching. We’re all familiar with the situation: the altitude and thin air create a hitter’s haven and a nightmare for pitchers, particularly starting pitchers. The Rockies have tried to remedy the situation in the past by bringing in top-tier starting pitchers, only to have them struggle. In 2012 they tried a four-man rotation with a 75-pitch limit, which led to a 64-98 record and a 5.22 team ERA. 2013 was a bit more successful, as they finished with a 74-88 record and a 4.44 team ERA. Still, it wasn’t good enough to contend for a playoff spot and definitely not good enough to compete for a World Series title. In fact, in 2007 when the Rockies had their only World Series appearance, they carried a team ERA of 4.32. Only four teams since 2007, including the Rockies, have had a team ERA of over 4.00 and made it to the Fall Classic. The others were the 2009 New York Yankees and Philadelphia Phillies and the 2010 Texas Rangers. As the Mets and Royals have shown us this year, quality starting and relief pitching can take you pretty far in this game. My question is, with all the different strategies the Rockies have tried, what can they do differently to compete?

My suggestion is a slight tweak on an idea that Dave Fleming wrote about in 2009 called the 3-3-3 Rotation. In his article he describes the 3-3-3 Rotation as three pitchers, pitching three innings, every third day with a pitch limit of 40-60 pitches. Having a pitcher essentially go through the order one time allows him to give it all he has for a short time instead of conserving his energy for the later innings. In theory, this makes sense. Look at the Royals the past few years; they’ve turned a number of former starting pitchers into relievers and most, if not all, have found success in their new roles. In his first year as a starter in 2008, Luke Hochevar had an opponents’ batting average (BAA) of .243/.289/.319 the 1st, 2nd and 3rd time through the order. His last year as a starter in 2012 was a little bit better with a .288/.263/.294 BAA, but his best season in the majors came as a reliever in 2013 when he held opponents to a .169 BAA.

This may hit a little too close to home for Rockies fans but last year Franklin Morales as a starter for Colorado had a split of .300/.337/.220; in his first year with the Royals out of the pen he held opponents to a .246 BAA. Staying with the Kansas City bullpen, we can look at Wade Davis, who actually had declining BAA numbers in his last year as a starter — .280/.251/.236 — but still posted a solid .151 BAA in his first season in relief. Andrew Miller had a split of .336/.261/.300 in his last year as a starter in 2011; his first year as a reliever in 2012 was significantly better with a .194 BAA. Zach Britton is a similar case with a .272/.266/.293 BAA split in his last year as a starter in 2011 and a .180 BAA in 2012 as a bullpen piece. The point is, generally speaking, when a major-league hitter has a chance to see a pitcher three times in one game, the advantage shifts to the hitter, and if a pitcher with quality stuff can face the order once, the advantage goes to the pitcher. This point is even more important for the Rockies who can’t afford to give their opponents any more advantages when playing in Colorado.

The Rockies have always struggled to attract top-tier starting pitching, since no one really wants to inflate their numbers by pitching half of their games at Coors Field. Colorado has tried to draft and develop power arms who rely on strikeouts and ground balls more so than fly-ball pitchers but still the results are the same: a sub-.500 team with an ERA over 5.00, which is not a recipe for success. The average major-league team has five starting pitchers and carries seven relievers in their bullpen. My tweak on Dave Fleming’s 3-3-3 rotation would be to split the 12 pitchers into four groups of three, all with a pitch count of 40-60 depending on effectiveness. In a perfect world every pitcher would go through the order once, throwing anywhere from 30-60 pitches and then turning the ball over to the next guy up who would hopefully do the same thing.

But we don’t live in a perfect world, so by having four groups of three, each pitcher could be shifted around depending on the number of pitches thrown in a week, meaning an effective pitcher could pitch as much as three to four times a week. The average starting pitcher in the majors pitches once, maybe twice, a week, each time throwing anywhere between 70-120+ pitches depending on the outing; by splitting up that workload they could see action three to four times a week. The average reliever definitely pitches fewer innings, around 70-80, and in turn throws fewer pitches, but many major-league relievers spent time in the minors as starters, throwing 100+ innings a season. The workload is definitely something to monitor, but in 2015 the Rockies used 29 different pitchers. The average number of innings that a team played was 1,447, and the Rockies staff as a whole pitched 1,426.1 innings. So between the 29 different pitchers, you could keep arms fresh and put pitchers in a position to succeed.
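To put a rough number on that workload, here is a back-of-the-envelope Python sketch. It assumes a fixed rotation of four groups, nine-inning games, and exactly three innings per pitcher, none of which would hold perfectly in practice; the pitcher labels P1 through P12 and the function names are placeholders.

```python
from collections import Counter
from itertools import cycle, islice

# Four hypothetical groups of three pitchers each.
GROUPS = [
    ("P1", "P2", "P3"),
    ("P4", "P5", "P6"),
    ("P7", "P8", "P9"),
    ("P10", "P11", "P12"),
]


def rotation_schedule(groups, games):
    """Assign one group to each game, cycling through the groups in order."""
    return list(islice(cycle(groups), games))


def innings_per_pitcher(groups, games=162, innings_each=3):
    """Total innings per pitcher if every pitcher in a group throws innings_each."""
    innings = Counter()
    for group in rotation_schedule(groups, games):
        for pitcher in group:
            innings[pitcher] += innings_each
    return innings


workload = innings_per_pitcher(GROUPS)
for pitcher, total in workload.items():
    print(pitcher, total)
```

Under those assumptions each pitcher lands at 120 to 123 innings over 162 games, which is well above the 70-80 a typical reliever throws, so the real question is how much shuffling between groups it would take to keep that load manageable.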

Which brings me to my next point: putting pitchers in a position to succeed. When an offense has a strong 3 and 4 hitter, a manager may put a young player in the 2nd spot instead of lower in the order to ensure that the young player will see strikes. A pitcher never wants to walk someone in front of a player who can crush it out of the park. This leads to more balls seen in the strike zone, hopefully leading towards a positive result; Josh Donaldson is a great example of that this year. Joe Maddon has also implemented a strategy to set young Addison Russell up for success by having him bat 9th after the pitcher instead of 8th before the pitcher. The logic is the same: Russell will see more strikes because opposing pitchers don’t want to walk him and turn the lineup over to their heavy hitters.

I believe the 3-3-3 rotation does this for pitchers, especially pitchers in Colorado. The Rockies had a collective split of .298/.339/.351 the 1st, 2nd and 3rd time through the order in 2015. By having their pitchers face the opposing lineup once, it allows them to display all of the pitches right away. Instead of establishing your A and B pitches the first time through the order and showing your C and possibly D pitches through the second and third time, a pitcher can show all of them through the first three innings. This creates confusion for the hitters and also forces them to be more aggressive at the plate early, something that can be taken advantage of if properly executed. It’s also worth mentioning that some of the best offenses in the game do a tremendous job of communicating with their teammates about the pitcher and the pitches they’re seeing. Remember, the more familiar the pitcher is to the batter, the more advantage the batter has. If you can remove that advantage from the opposing offense, it further sets your pitching staff up for success. Opposing teams would have to have different game plans for each pitcher they see, and those quick adjustments aren’t the easiest to make throughout a 162-game season.

All in all it’s an experiment, and besides Tony La Russa trying something similar for a week in 1993, no other team has tried this method. For many teams, the classic five-man rotation works, and who am I to say they’re wrong? But the Rockies have never really been able to figure it out, and if any team is in a position to give it a shot, I believe it’s them.


An Ode to The Dude

Lucas Duda.  The Dude.  The Big Lebowski.  If there is one player on the Mets who would be most deserving of the title of “Most Underrated” it would be Lucas Duda.  I arrived at this conclusion based on my own subjective, and fallible, perception of the casual baseball fan’s perception of Lucas Duda, which I would assume would be somewhere in the vicinity of none.  Some of his relative underratedness may stem from the fact that he was relatively “streaky” during the course of this regular season, which is an inherent trait of a player who produces a large amount of his value by way of the home run.  Also, after Duda had eight straight hits go for home runs during the course of a seven-game stretch in late July, Yoenis Cespedes caught the eyes of the national media from basically the moment he was traded to New York.  That being said, I think Mr. Duda deserves a little recognition for his solid year, especially since he looms as an important figure in the active World Series.

On the surface Duda has almost exactly replicated his breakout 2014 in 2015.  For reference, this table of arbitrary statistics:

Statistic   2014    2015
WAR         3.2     3.1
wRC+        136     133
BABIP       0.283   0.285
ISO         0.228   0.242

His overall value has remained almost exactly the same over the past two years as he has churned out two straight 3-win seasons.  His walk rate and strikeout percentage have been fairly stable as well, as have his various swing rates.  In this regard Lucas has been remarkably consistent.

However, there is one portion of Duda’s underlying statistics that differed significantly from this year to the rest of his career.  Duda pulled the ball less, and went to opposite field more.  Here is another table to illustrate this fact:

Year     Pull%   Cent%   Oppo%
2012     44.0%   34.2%   21.8%
2013     45.9%   31.7%   22.5%
2014     44.1%   34.7%   21.1%
2015     39.0%   33.9%   27.1%
Career   43.2%   33.8%   23.1%

* The table starts in 2012 simply because the prior years don’t really provide any additional insight

 

And here are the batted-ball maps for 2014 and 2015 to further illustrate Duda’s change in approach:

[Figure: Duda batted-ball map, 2014]

[Figure: Duda batted-ball map, 2015]

It looks as though Duda has tried to make himself a more balanced hitter, and decrease the number of shifts he faces, as he has made an obvious attempt to go the other way more often this year.  This didn’t result in any additional offensive value this year (as we saw in the first table, his overall value stayed steady), and we didn’t even see an increase in BABIP.  Regardless, this seems to be a trend worth keeping an eye on, and worth remembering during the World Series.

In specific regard to the World Series, Duda has a relatively significant platoon split: a career 91 wRC+ vs LHP and 136 wRC+ vs RHP.  With the Kansas City Royals featuring right-handers Yordano Ventura, Edinson Volquez, Johnny Cueto, Chris Young, Wade Davis, Ryan Madson, and Kelvin Herrera, among others, Duda looks to occupy an important role during the series.  With all of Kansas City’s probable starters being right-handed, Duda should start every game, and it doesn’t seem like he will be pinch-hit for too often with Kansas City’s three best relievers being right-handed as well.

Lucas Duda, chronically underrated and under-spoken, might just be the Mets’ most pivotal player during this World Series.  Or not.  Probably not, there are a lot of players on a baseball team, but he will assuredly be a pivotal player.  The Dude Abides.


Jacob’s Ladder: Arrieta’s Atypical Ascent

Let’s look at two pitchers:

Pitcher         ERA    ERA+   FIP    K/9   BB/9   HR/9
Jake Arrieta    1.77   222    2.35   9.3   1.9    0.4
Pitcher X       5.23    80    4.75   6.9   4.0    1.2

Pitcher X is not Jake’s long lost brother, but is in fact Jake Arrieta – those are his cumulative stats from 2010-2013, his first four years in the majors. And that’s not a small sample size; Arrieta accumulated 409 innings in his first four years. The top line is from 2015, a season which has put Arrieta within shouting distance of a Cy Young award.

No other pitcher has had a surge like this after floundering so badly for his first four years, but even before 2015, Arrieta was traveling through a baseball landscape witnessed by very few humans. Just 26 pitchers in major-league history have amassed over 400 innings in their first four years and “achieved” an ERA+ of 80 or worse. The list is here. It’s most notable for its lack of notability: an array of names you haven’t heard of, interspersed with a few modern guys who, for the most part, failed to make an impact.

The other notable thing about the list is how short it is; most teams will have given up on a pitcher this consistently bad long before he’s eaten 400 innings of paychecks. Besides Arrieta, just two of the 26 had successful major-league careers as starters: Bullet Joe Bush and Camilo Pascual. Neither of them ever came close to Arrieta’s 222 ERA+ in 2015; in this respect, Arrieta walks (or rather suppresses walks) alone.

Arrieta came up in 2010 to participate in two 90-loss Orioles seasons, but the Birds were taking flight, and in 2012 they would win 93, before slipping to 85 wins in 2013. These were good teams, patched together with Dan Duquette’s yard-sale bargains and Buck Showalter’s newly humanized intensity.

Arrieta had a quiet breakthrough in 2012, his third year with the Orioles, when his K/9 spiked at 8.7, while his BB/9 plunged from over 4 down to 2.7. These front-line starter numbers were buried by his 6.20 ERA, which in turn stemmed in part from a high homer rate (1.3/9). The Orioles understood that beneath the ERA there was progress, and did not trade him.

In 2013, Arrieta gave most of his gains back. The strikeouts remained, but he began walking the house. Even FIP began to have doubts, and on June 17 the Orioles shipped him to the Cubs (along with Pedro Strop, now a competent set-up man) in exchange for backup catcher Steve Clevenger and 90 mediocre innings from Scott Feldman. It looks like a disastrous trade now, but at the time it seemed suicidal for a contender to hand Arrieta the ball every fifth day.

Perhaps the Orioles’ inflexible approach to pitching mechanics impeded Arrieta’s development. Perhaps he developed a new pitch, or refined an old one. Maybe he needed a change of scenery. Whatever the case, Arrieta didn’t become Jake Arrieta with the Cubs right away. He brought the walks and homers with him in his carry-on luggage when he touched down in Chicago in late June. His strikeout rate actually dropped. But perhaps most importantly, his WHIP plunged, from over 1.7 with the Orioles in 2013 to just over 1.1 with the Cubs. Both rates were BABIP driven: in Baltimore, .343, in Chicago, an equally unsustainable .190. But Arrieta took advantage of his luck, using the emotional breathing space the sudden drop in traffic provided to focus on developing his devastating sinkerslidercutterwhatever. In this case the change of scenery provided an immediate and positive, if accidental, dividend. Arrieta’s BABIP has since returned to a normal neighborhood: .274 in 2014 and .246 in 2015. (His success this year was only partially BABIP-fueled, a fact that should scare the pine tar off the bats of NL Central hitters.)

Although no one on the List of 26 is really a comp for Arrieta, Pascual probably comes closest, thanks to his dominating stuff. Featuring a knee-buckling curve, Pascual achieved strikeout rates that wouldn’t look at all out of place in today’s game. From 1958-1964, Pascual’s K/9 never fell below 7; this in an era when league strikeout rates were typically in the 5s. Wildness and gopheritis plagued Pascual in his early years, but he became a rotation mainstay for the Senators in ’58, and stuck with the franchise until 1966, a year after the now-Twins went to the Series.

Pascual had a quiet breakthrough in 1956, his third year with the Senators. His strikeout rate spiked at 7.7/9, while the walk rate dropped to a (still high) 4.2. Victimized by a ghastly 33 homers, his ERA was awful, but there were signs of promise. In 1957 it all went backwards. His ERA peaked at an eye-watering 6.11 on May 4, and after ebbing somewhat, reached another appalling summit at 5.49 on June 22, about the same time in the season that Arrieta’s career ended in Baltimore.

But the Senators did not blink. It was around this time, in the sweltering 1957 summer, that Camilo Pascual became Camilo Pascual. The strikeouts came back, the homers did not. He finished with a respectable 4.10 ERA, a figure he would easily beat for the next 8 years.

Pascual was only 23 during his Crossroads Year; Arrieta was 27, a much easier age for a team to give up on a player. Arrieta has Scott Boras as his agent; Pascual had the reserve clause as his ankle bracelet. Perhaps most importantly, the 1957 Senators were simply abominable. They would lose 99 games (out of 154!) in 1957, and indeed would exceed 90 losses during every season from 1955-1959.

In 2013, the Orioles couldn’t risk nine more Jake Arrieta starts if they hoped to contend; the 1957 Senators wouldn’t contend until 1962, by which time they had moved to a different time zone. The 2013 Orioles’ team success produced a roster assembly failure, while the 1957 Senators’ team failure produced a roster assembly success.

Pascual was very good for several years. Arrieta has been outstanding for two. His FIP has been very consistent in his two full years with the Cubs: 2.26 in 2014, and actually slightly higher (2.35) in 2015.  Arrieta’s remarkable climb has reached the top rung; it remains to be seen how long he can stay there.


Measuring Team Chemistry with Social Science Theory

Every athlete, professional or otherwise, talks about that feeling of being on a team. There’s something that happens when a team “clicks” – it’s a united feeling of team spirit that propels team members to compete, most often referred to as team chemistry. In the social sciences there’s no measure of team chemistry, but there is, however, Team Cohesion, which is defined as:

A dynamic process that is reflected in the tendency of a group to stick together and remain united in the pursuit of its instrumental objectives and/or for the satisfaction of member affective needs [1].

Team cohesion has been shown to exist across multiple work group settings (organizational, military and sport) [2], as well as across multiple sports (basketball, golf [3], softball, and baseball [4]). Perhaps more interestingly, cohesion has also been bi-directionally linked to performance: when teams perform better, they are more cohesive; and when they are more cohesive, they perform better [2,5]. And while the research on this relationship is clear, it has mostly been conducted with non-professional teams. Indeed, team cohesion is one of many “unobservable” properties that remain untapped within professional sports.

How can we measure team cohesion in professional sports?

As researchers, we would normally use a validated survey that could be relied on to accurately measure team cohesion. Unfortunately, when I don’t have access to a team, I’m forced to use alternative methods. The first step is to examine the literature; a few key findings are brought to light about indications of team cohesion:

  • Team cohesion is related to the extent that members accept the roles on their team (captain, motivator, leader, follower, etc.) [6].
  • Charismatic leaders will refer to their teams more often than referring to themselves [7].
  • The higher the level of team cohesion, the better the team performance [2,5].

So, if I can somehow measure how often leaders refer to their teams (vs. themselves), then I can use this as an approximation of their leadership characteristics. And if leaders are acting like leaders, they may also be helping to solidify roles within their team. Therefore we might expect that:

Hypothesis 1: As leaders reference their team more, we should see increased team cohesion – and as team cohesion increases, we should see better performance.

A charismatic leader does not typically arise without a contextual or conditional trigger. Crisis often prompts the emergence of charismatic leadership – a setting that allows a charismatic leader to propose an ambitious goal [8]. Both the context and the charismatic leader influence one another, almost as if the leader requires crisis as an occasion to exemplify charismatic leadership [9]. Additionally, at the group level, team members have been shown to become more attached to the leader in times of crisis, prompting a greater presence of cohesion during times of crisis as followers rally around the charismatic leader [10].

In baseball, teams experience all types of crises throughout the long season, including injuries, losing streaks, playoff races, and team conflicts. Perhaps the most common and least contextual of these crises is the race to the playoffs as the season comes to an end. With an understanding of how and when the playoff races begin to make an impression, I can expect to observe a temporal effect of charismatic leadership by using our previous indicator of team reference. That is, it may not only be that “there is a positive relationship between a leader’s team references and the amount of wins his team will have at the end of the regular season”, but also:

Hypothesis 2: The timing of when a team leader references his team can determine the effectiveness of his leadership.

Methods

As the first component of the measure, I needed to assess team leaders’ references to themselves or their team. I used the most popular newspaper from that team’s city to extract quotations (e.g., San Francisco Chronicle for the Giants; the New York Times for the Yankees). A team leader was a player identified by teammates, coaches, or front offices as a “leader”, a “captain”, or having either of these qualities. If there was more than one identified team leader, I randomly chose between them. I tracked the quotes from 8 randomly selected baseball team leaders from 8 randomly selected teams across an entire regular season (April 4th, 2012 – October 3rd, 2012). Statement settings included comments made in locker rooms after games, during the All-Star break, before a game started, or in any other setting. Any time the leader was quoted in the newspaper, that quote was documented for analysis. Leader quotes were qualitatively coded independently by 3 different coders. Each quote was coded as containing “self-reference”, “team-reference”, and/or “other reference” (the 3 coders had 97% agreement on their final codes). I began this study in 2013; thus I used the 2012 season, which was the latest complete season at my disposal.

Due to the disparity in responses, the sample was aggregated based on team leaders who played on teams that finished with a certain number of wins. Since 1996, no AL team has made the playoffs with fewer than 86 wins [11]. During the same time period, no NL team has made the playoffs with fewer than 82 wins [12]. For this study, leaders were categorized based on how their teams finished the regular season (86 or more wins for AL teams and 82 or more wins for NL teams). Those at or above the win mark were titled “high team leader” (HTL) and those below the win mark were titled “low team leader” (LTL). Four teams in the sample met the HTL criteria and their combined record was 368 – 280 (.568 winning percentage). Not all HTLs were on teams that made the playoffs in 2012, but each of the four teams was competing for a playoff spot in the months of August and September. Four teams in the sample met the LTL criteria and their combined record was 296 – 352 (.457 winning percentage).

 

High or low team leader classification

Team       League  2012 Regular Season Record  Team Leader      High or Low Team Leader
Angels     AL      89-73                       Torii Hunter     HTL
Giants     NL      94-68                       Buster Posey     HTL
Yankees    AL      95-67                       Derek Jeter      HTL
Rays       AL      90-72                       Evan Longoria    HTL
Rockies    NL      64-98                       Michael Cuddyer  LTL
Twins      AL      66-96                       Justin Morneau   LTL
White Sox  AL      85-77                       Paul Konerko     LTL
Phillies   NL      81-81                       Jimmy Rollins    LTL
Table 1. Classification of high or low team leaders based on their team’s 2012 regular season record
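The classification rule can be written out as a short Python sketch. The thresholds (86 wins for the AL, 82 for the NL) and the records come from the text and Table 1; the function and variable names are illustrative only.

```python
# Win marks from the text: 86 or more for AL teams, 82 or more for NL teams.
WIN_MARK = {"AL": 86, "NL": 82}

# (team, league, wins, losses) taken from Table 1.
TEAMS = [
    ("Angels", "AL", 89, 73),
    ("Giants", "NL", 94, 68),
    ("Yankees", "AL", 95, 67),
    ("Rays", "AL", 90, 72),
    ("Rockies", "NL", 64, 98),
    ("Twins", "AL", 66, 96),
    ("White Sox", "AL", 85, 77),
    ("Phillies", "NL", 81, 81),
]


def classify(league, wins):
    """Return 'HTL' at or above the league win mark, otherwise 'LTL'."""
    return "HTL" if wins >= WIN_MARK[league] else "LTL"


def combined_record(teams, label):
    """Sum wins and losses over every team whose leader carries the label."""
    selected = [(w, l) for _, lg, w, l in teams if classify(lg, w) == label]
    return sum(w for w, _ in selected), sum(l for _, l in selected)


for label in ("HTL", "LTL"):
    wins, losses = combined_record(TEAMS, label)
    print(label, f"{wins}-{losses}", round(wins / (wins + losses), 3))
```

Note that the White Sox (85-77) fall one win short of the AL mark and the Phillies (81-81) one win short of the NL mark, so both land in the LTL group despite finishing at or above .500.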

Results

There was no significant correlation between the total number of team references and the total number of wins that a leader’s team had at the end of the regular season (r = .237, p > .05). Nor was there an indication of a negative correlation between self-references and total number of team wins (r = -.086, p > .05).

Leader responses were then aggregated between LTLs and HTLs. Of the 490 total responses, 252 responses were made after or in reference to a previous game. Quotes were then selected for these post-game interview responses after a leader’s team had won a game (162 total) or lost a game (90 total). After a loss, both HTLs and LTLs referred to their teams much more often than referring to themselves. LTLs were 7.20 times as likely to reference their team after a loss than reference themselves. When compared to LTLs, HTLs were less likely to refer to their team after a loss (4.42:1). After a win, LTLs were 1.41 times as likely to reference their team than themselves. HTLs on the other hand were 2.32 times as likely to reference their team than themselves after a win (Table 2).

Reference to team or self as ratio

Leader  Loss            Win
HTL     31:7 (4.42:1)   65:28 (2.32:1)
LTL     36:5 (7.20:1)   45:32 (1.41:1)
Table 2. Ratios of team vs. self references for each type of leader
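The ratios in Table 2 are simply team references divided by self references. A small Python sketch, using the counts from the table and reading the HTL win entry as 65:28 (an assumption consistent with the reported 2.32:1 ratio), shows the arithmetic; the dictionary layout and function name are illustrative.

```python
# (team references, self references) for each leader type and game outcome.
COUNTS = {
    ("HTL", "loss"): (31, 7),
    ("HTL", "win"): (65, 28),   # assumed reading of the HTL win entry
    ("LTL", "loss"): (36, 5),
    ("LTL", "win"): (45, 32),
}


def team_to_self_ratio(team_refs, self_refs):
    """How many team references were made per self reference."""
    if self_refs == 0:
        raise ValueError("ratio is undefined without any self references")
    return team_refs / self_refs


for (leader, outcome), (team_refs, self_refs) in COUNTS.items():
    ratio = team_to_self_ratio(team_refs, self_refs)
    print(f"{leader} after a {outcome}: {team_refs}:{self_refs} ({ratio:.2f}:1)")
```

One small wrinkle: 31:7 works out to roughly 4.43 when rounded to two decimals, so the 4.42 in the table appears to be truncated rather than rounded.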

The monthly distribution of team reference for LTLs was relatively even across all months of the regular season. The highest percentage was July (19.9%) and the lowest was August (12%), a difference of 7.9 percentage points (Figure 1). The overall standard deviation for team references by month was σ = 2.88. In contrast, team reference for HTLs was much more dynamic. The highest percentage was September (39.6%) and the lowest was June (5.8%), a difference of 33.8 percentage points. September team references for HTLs were more than double any other month. The overall standard deviation was σ = 12.2, with the resulting distribution becoming much more parabolic (Figure 2). The quadratic trend line that is used to represent the team reference distribution for HTLs showed a very good fit (R² = .91).

Figure 1. Percentage of team reference by month LTLs
Figure 2. Percentage of team reference by month HTLs with quadratic trend line
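The spread and trend-line statistics reported for the monthly distributions can be reproduced along the following lines. This is a sketch under stated assumptions: only the June (5.8%) and September (39.6%) values come from the text, the other monthly shares are hypothetical placeholders, and the use of the population standard deviation is a guess at how σ was computed.

```python
import statistics


def fit_quadratic(xs, ys):
    """Least-squares fit y = a*x^2 + b*x + c via the 3x3 normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]             # sums of x^k
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [rv - factor * cv for rv, cv in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def r_squared(xs, ys, coeffs):
    """Share of the variance in ys explained by the fitted quadratic."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


months = [1, 2, 3, 4, 5, 6]  # April through September
# Hypothetical HTL shares; only June (5.8) and September (39.6) are from the text.
htl_share = [18.0, 12.0, 5.8, 10.0, 14.6, 39.6]

sigma = statistics.pstdev(htl_share)
coeffs = fit_quadratic(months, htl_share)
print(f"sigma = {sigma:.2f}, R^2 = {r_squared(months, htl_share, coeffs):.2f}")
```

Because the placeholder values are invented, the printed numbers will not match the reported σ = 12.2 and R² = .91.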

 

Discussion

The increased rate of team reference by HTLs as compared to LTLs may have helped to establish better role clarity – a characteristic of more cohesive teams. This was further marked by the fact that HTLs were on higher performing teams than LTLs. The direction of the team cohesion to performance relationship in this case is still unknown.

HTLs also referred to their teams most often during the end of the regular season. This relates to the theory that charismatic leaders will “activate” in times of crisis. In turn, this helps to create more team cohesion as members attach themselves to leaders in times of crisis.

 

[1] Carron, A.V., Colman, M.M., Wheeler, J., & Stevens D. (2002). Cohesion and Performance in Sport: A Meta Analysis. Journal of Sport & Exercise Psychology, 24, 168-188.

[2] Mullen, B. and Copper, C. (1994). The relation between group cohesiveness and performance: an integration. Psychological Bulletin, 115, 210-227.

[3] Vincer, D., & Loughead, T.M. (2010). The Relationship Among Athlete Leadership Behaviors and Cohesion in Team Sports. The Sport Psychologist, 24, 448-467.

[4] Carron, A.V., Bray, S.R., & Eys, M.A. (2002). Team Cohesion and Team Success in Sport. Journal of Sports Sciences. 20(2). 119-126.

[5] Oliver, L.W., Harman, J., Hoover, E., Hayes, S.M., & Pandhi, N.A. (2003) A quantitative integration of the military cohesion literature. Military Psychology, 11, 57-83.

[6] Carron, A. V., & Eys, M. A. (2012). Group dynamics in sport (4th ed.). Morgantown, Fitness Information Technology.

[7] Shamir, B., Arthur, M.B., & House, R.J. (1994). The rhetoric of charismatic leadership: A theoretical extension, a case study, and implications for research. The Leadership Quarterly, 5(1), 25-42.

[8] Poon, J. & Fatt, T. (2000). Charismatic Leadership. Equal Opportunities International. 19(8), 24-28.

[9] Conger, J. A. (1999). Charismatic and transformational leadership in organizations: An insider’s perspective on these developing streams of research. The Leadership Quarterly, 10, 145-179.

[10] Kets de Vries, F. R. (1988). Prisoners of leadership. Human Relations, 41, 261-280.

[11] Gaines, C. (2011, April 21). Chart of the Day: What it takes to make the playoffs in Baseball. Business Insider. Retrieved from http://www.businessinsider.com/chart-of-the-day-what-it-takes-to-make-the-playoffs-in-baseball-2011-4

[12] Bloom, B.M. (2005). Padres Try to Recover from 82-80 Record. San Diego Padres. Retrieved from http://m.padres.mlb.com/news/article/1236830/


Hardball Retrospective – The “Original” 1931 Philadelphia Athletics

In “Hardball Retrospective: Evaluating Scouting and Development Outcomes for the Modern-Era Franchises”, I placed every ballplayer in the modern era (from 1901-present) on their original team. Therefore, Frank Tanana is listed on the Angels roster for the duration of his career while the White Sox declare Edd Roush and the Yankees claim Hippo Vaughn. I calculated revised standings for every season based entirely on the performance of each team’s “original” players. I discuss every team’s “original” players and seasons at length along with organizational performance with respect to the Amateur Draft (or First-Year Player Draft), amateur free agent signings and other methods of player acquisition.  Season standings, WAR and Win Shares totals for the “original” teams are compared against the “actual” team results to assess each franchise’s scouting, development and general management skills.

Expanding on my research for the book, the following series of articles will reveal the finest single-season rosters for every Major League organization based on overall rankings in OWAR and OWS along with the general managers and scouting directors that constructed the teams. “Hardball Retrospective” is available in digital format on Amazon, Barnes and Noble, GooglePlay, iTunes and KoboBooks. The paperback edition is available on Amazon, Barnes and Noble and CreateSpace. Supplemental Statistics, Charts and Graphs along with a discussion forum are offered at TuataraSoftware.com.

Don Daglow (Intellivision World Series Major League Baseball, Earl Weaver Baseball, Tony LaRussa Baseball) contributed the foreword for Hardball Retrospective. The foreword and preview of my book are accessible here.

Terminology

OWAR – Wins Above Replacement for players on “original” teams

OWS – Win Shares for players on “original” teams

OPW% – Pythagorean Won-Loss record for the “original” teams
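For readers unfamiliar with the Pythagorean record, here is a minimal sketch of the classic Bill James formulation, which estimates winning percentage from runs scored and runs allowed. The exponent of 2 is the original version of the formula; whether the book uses this or a refined exponent is not stated here, so treat it as an assumption. The function names and sample inputs are illustrative.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_record(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected (wins, losses) over a schedule of the given length."""
    pct = pythagorean_win_pct(runs_scored, runs_allowed, exponent)
    wins = round(pct * games)
    return wins, games - wins


# Hypothetical run totals over a 154-game schedule
print(pythagorean_win_pct(800, 760))
print(expected_record(800, 760, 154))
```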

Assessment

The 1931 Philadelphia Athletics    OWAR: 53.6     OWS: 347     OPW%: .524

Connie Mack acquired all of the ballplayers on the 1931 Athletics roster. Based on the revised standings the “Original” 1931 A’s finished in second place, two games behind the Yankees. Philadelphia paced the Junior Circuit in OWS and led the League in OWAR for the fourth straight season (1928-1931).

“Bucketfoot” Al Simmons (.390/22/128) collected his second successive batting title and placed third in the American League MVP balloting. Mickey Cochrane drilled 31 doubles and delivered a .349 BA. Max “Camera Eye” Bishop amassed over 100 bases on balls in eight consecutive seasons (1926-1933). Jimmie Foxx belted 30 round-trippers and drove in 120 baserunners. Charlie Grimm aka “Jolly Cholly” contributed a .331 BA with 33 doubles and 11 triples.

Jimmie Foxx ranks second to Lou Gehrig among first basemen while Lefty Grove places runner-up to Walter Johnson according to Bill James in “The New Bill James Historical Baseball Abstract.” Teammates cataloged in the “NBJHBA” top 100 rankings include Cochrane (4th-C), Simmons (7th-LF), Wally Schang (20th-C), Bishop (43rd-2B), Jimmie Dykes (52nd-3B), Grimm (85th-1B), Joe Dugan (88th-3B) and Doc Cramer (91st-CF).

LINEUP POS WAR WS
Max Bishop 2B 5.27 24.91
Mickey Cochrane C 5.68 28.31
Al Simmons LF 5.89 33.75
Jimmie Foxx 3B/1B 3.93 24.11
Charlie Grimm 1B 3.02 20.08
Rube Bressler LF 0.39 3.09
Lou Finney RF 0.31 1.69
Dib Williams SS -0.32 9.16
BENCH POS WAR WS
Jimmie Dykes 3B 0.65 13.13
Charlie Berry C 1.88 10.79
Val Picinich C 0.18 1.41
Glenn Myatt C -0.05 3.87
Joe Palmisano C -0.1 0.72
Lena Styles C -0.15 0.73
Cy Perkins C -0.16 0.49
Joe Dugan 3B -0.19 0.09
Wally Schang C -0.32 1.16
Eric McNair 3B -0.35 5.71
Doc Cramer CF -0.54 3.61
Frank Sigafoos 3B -0.68 0.34
Joe Boley SS -1.15 3.29

Lefty Grove claimed the 1931 American League MVP award with a dominant performance including League-bests in victories (31), ERA (2.06), WHIP (1.077) and complete games (27). He also struck out the most batsmen in the circuit for the seventh year in a row. George “Moose” Earnshaw topped the 20-win plateau for the third straight season. Herb Pennock and Tom Zachary furnished 11 victories apiece.

ROTATION POS WAR WS
Lefty Grove SP 10.74 41.58
George Earnshaw SP 5.57 28.08
Tom Zachary SP 3.99 19.78
Herb Pennock SP 2.78 9.47
BULLPEN POS WAR WS
Eddie Rommel SP 2.6 12.06
Fred Heimach SP 0.85 9.61
Lew Krausse SP 0.11 0.92
Hank McDonald SP 0.05 3.95
Jim Peterson SW -0.1 0.3
Sol Carter RP -0.32 0
Bill Shores SP -0.64 0.14
Dolly Gray SP -0.95 9.99
Socks Seibold SP -1.22 6.27

The “Original” 1931 Philadelphia Athletics roster

NAME POS WAR WS General Manager Scouting Director
Lefty Grove SP 10.74 41.58 Connie Mack
Al Simmons LF 5.89 33.75 Connie Mack
Mickey Cochrane C 5.68 28.31 Connie Mack
George Earnshaw SP 5.57 28.08 Connie Mack
Max Bishop 2B 5.27 24.91 Connie Mack
Tom Zachary SP 3.99 19.78 Connie Mack
Jimmie Foxx 1B 3.93 24.11 Connie Mack
Charlie Grimm 1B 3.02 20.08 Connie Mack
Herb Pennock SP 2.78 9.47 Connie Mack
Eddie Rommel SP 2.6 12.06 Connie Mack
Charlie Berry C 1.88 10.79 Connie Mack
Fred Heimach SP 0.85 9.61 Connie Mack
Jimmie Dykes 3B 0.65 13.13 Connie Mack
Rube Bressler LF 0.39 3.09 Connie Mack
Lou Finney RF 0.31 1.69 Connie Mack
Val Picinich C 0.18 1.41 Connie Mack
Lew Krausse SP 0.11 0.92 Connie Mack
Hank McDonald SP 0.05 3.95 Connie Mack
Glenn Myatt C -0.05 3.87 Connie Mack
Jim Peterson SW -0.1 0.3 Connie Mack
Joe Palmisano C -0.1 0.72 Connie Mack
Lena Styles C -0.15 0.73 Connie Mack
Cy Perkins C -0.16 0.49 Connie Mack
Joe Dugan 3B -0.19 0.09 Connie Mack
Wally Schang C -0.32 1.16 Connie Mack
Dib Williams SS -0.32 9.16 Connie Mack
Sol Carter RP -0.32 0 Connie Mack
Eric McNair 3B -0.35 5.71 Connie Mack
Doc Cramer CF -0.54 3.61 Connie Mack
Bill Shores SP -0.64 0.14 Connie Mack
Frank Sigafoos 3B -0.68 0.34 Connie Mack
Dolly Gray SP -0.95 9.99 Connie Mack
Joe Boley SS -1.15 3.29 Connie Mack
Socks Seibold SP -1.22 6.27 Connie Mack

Honorable Mention

The “Original” 1911 Athletics            OWAR: 46.1     OWS: 303     OPW%: .597

Philadelphia coasted to the pennant by a nine-game margin over Boston. “Shoeless” Joe Jackson posted a .408 BA in his first full season. He collected 233 safeties, scored 126 runs and led the Junior Circuit with a .468 OBP. Eddie Collins swiped 38 bags while batting at a .365 clip. “Home Run” Baker (.334/11/115) topped the American League in circuit clouts for the first of four consecutive campaigns. Matty McIntyre totaled 102 runs and produced a .323 BA. “Gettysburg” Eddie Plank delivered a 23-8 record with a 2.10 ERA including six shutouts. Jack Coombs led the League with 28 victories despite allowing 360 hits in 336.2 innings pitched. Bris Lord aka the “Human Eyeball” supplied a .310 BA and accrued 92 tallies.

The “Original” 2002 Athletics            OWAR: 45.8     OWS: 304     OPW%: .578

Jason Giambi (.314/41/122) coaxed 109 bases on balls and tallied 120 runs as the ’02 squad finished five games ahead of the Angels for the American League pennant. Miguel Tejada (.308/34/131) achieved MVP honors and made his first All-Star appearance while registering 108 aces and 204 base knocks. Barry Zito claimed the Cy Young Award with a record of 23-5 and an ERA of 2.75. Tim Hudson contributed 15 victories and a 2.98 ERA while portsider Mark Mulder accrued 19 wins. Eric Chavez launched 34 long balls, drove in 109 baserunners and earned the second of six consecutive Gold Glove Awards.

On Deck

The “Original” 1907 Phillies

References and Resources

Baseball America – Executive Database

Baseball-Reference

James, Bill. The New Bill James Historical Baseball Abstract. New York, NY.: The Free Press, 2001. Print.

James, Bill, with Jim Henzler. Win Shares. Morton Grove, Ill.: STATS, 2002. Print.

Retrosheet – Transactions Database

Seamheads – Baseball Gauge

Sean Lahman Baseball Archive


Give Me a Rise

It is well established that having more rise on your four-seam fastball is a good thing. The question then becomes: can we identify the optimal amount of rise as compared to the league-average fastball? For the purposes of this analysis, we will look at swinging-strike rate on all four-seam fastballs thrown in regular-season action since the dawn of the PITCHf/x era.

We in the sabermetrically-inclined community tend to pooh-pooh popular baseball concepts, particularly ones where the science, on the surface, doesn’t appear to jibe with the age-old baseball wisdom. Don’t worry, this is not a DIPS discussion, nor a discussion on a pitcher’s ability to manage contact. I bring up this concept in relation to the term “late life,” as in movement later in the pitch’s trajectory. Physics tells us that the ball will have a very predictable trajectory from the moment it leaves the pitcher’s hand until it reaches the front of the plate. That, however, is merely half the story. There are two important points I want to bring up:

  1. Batters cannot compute vertical trajectory explicitly; they essentially tap into a huge vault of experience telling them how far a pitch will drop based on their experience with pitches of similar velocity.
  2. A hitter’s swing is largely ballistic (very difficult to change mid-swing) and takes about 0.18 seconds to execute. That means that a hitter has roughly 0.2 seconds post-release of the ball to gather information and form an educated guess as to where the ball will end up.

Based on these assumptions, I computed late movement in both the vertical and horizontal directions. I then compared this to the expected vertical movement based on the velocity (more velocity, less drop, obviously). This to me is the optimal way to look at movement, since presumably hitters cannot gather any more information after that point. A great hitter may be able to factor in their knowledge of the pitcher’s ability to get rise on the fastball, but they are fighting their memories of all the other fastballs they’ve seen, so it is more difficult than you would think.
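For anyone who wants to try this at home, here is one possible reading of the late-movement calculation, sketched in Python. It assumes the usual PITCHf/x constant-acceleration fit (x0, y0, z0, vx0, vy0, vz0, ax, ay, az), treats the initial conditions as the release point, and defines late movement as the change in position between 0.2 seconds after release and the front of the plate. Those choices, and the sample pitch, are assumptions for illustration rather than the exact method behind the graphs.

```python
import math

PLATE_Y = 17 / 12     # feet; front edge of home plate in PITCHf/x coordinates
DECISION_TIME = 0.2   # seconds after release, per the estimate in the text


def time_to_plate(y0, vy0, ay, plate_y=PLATE_Y):
    """Time for the ball to travel from y0 to the front of the plate."""
    if ay == 0:
        return (plate_y - y0) / vy0
    disc = vy0 ** 2 - 2 * ay * (y0 - plate_y)
    return (-vy0 - math.sqrt(disc)) / ay


def coordinate(p0, v0, a, t):
    """Position along one axis after t seconds of constant acceleration."""
    return p0 + v0 * t + 0.5 * a * t * t


def late_movement(pitch, decision_time=DECISION_TIME):
    """Horizontal and vertical change (feet) from decision time to the plate."""
    t_plate = time_to_plate(pitch["y0"], pitch["vy0"], pitch["ay"])
    dx = (coordinate(pitch["x0"], pitch["vx0"], pitch["ax"], t_plate)
          - coordinate(pitch["x0"], pitch["vx0"], pitch["ax"], decision_time))
    dz = (coordinate(pitch["z0"], pitch["vz0"], pitch["az"], t_plate)
          - coordinate(pitch["z0"], pitch["vz0"], pitch["az"], decision_time))
    return dx, dz


# Hypothetical four-seamer, treated as if released at y0 (an assumption)
pitch = {"x0": -1.5, "y0": 50.0, "z0": 6.0,
         "vx0": 5.0, "vy0": -135.0, "vz0": -5.0,
         "ax": -8.0, "ay": 28.0, "az": -15.0}
dx, dz = late_movement(pitch)
print(f"late horizontal: {dx:.2f} ft, late vertical: {dz:.2f} ft")
```

Subtracting the average late vertical change for pitches of the same velocity from each pitch's value would then give the rise relative to expectation.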

Which brings us to a very interesting graph: The height and colours in the histogram reflect the magnitude of the swinging-strike rates, shown in sequential order of velocity. If you scroll all the way to the bottom, you’ll see that the center of the histogram is somewhere around -.6, or 0.6 feet more rise than the average four-seam fastball when looking at the pitch 0.2 seconds after release until it crosses home plate.

We see a very clear normal curve, with more “normal” at higher n. Thus we can now compute the value of rise in a four-seam fastball, as distributed by a normal curve centered around 0.6 feet above the mean drop. Not really a stats guy, so not sure how to do that exactly. What I find interesting is that the 7 inches or so of rise is pretty consistent across the velocity spectrum. I’m not sure why it peaks at this point, though I would surmise that it’s probably the sweet spot where the hitter feels like they can make contact, but can’t, as opposed to extreme rise which would freeze the hitter.
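One crude way to put numbers on that bell shape is a method-of-moments estimate: treat the swinging-strike rate in each rise bin as a weight and compute the weighted center and spread. The sketch below does that with hypothetical bin values chosen only to be symmetric around -0.6 feet; they are not the data behind the graph, and a proper curve fit would be more rigorous.

```python
import math


def bell_curve_moments(bin_centers, rates):
    """Weighted mean and standard deviation, using the rates as weights."""
    total = sum(rates)
    center = sum(b * r for b, r in zip(bin_centers, rates)) / total
    variance = sum(r * (b - center) ** 2
                   for b, r in zip(bin_centers, rates)) / total
    return center, math.sqrt(variance)


# Hypothetical bins: late vertical movement vs. average (feet) and whiff rate
rise_bins = [-1.2, -1.0, -0.8, -0.6, -0.4, -0.2, 0.0]
whiff_rates = [0.04, 0.07, 0.10, 0.12, 0.10, 0.07, 0.04]

center, spread = bell_curve_moments(rise_bins, whiff_rates)
print(f"center = {center:.2f} ft, spread = {spread:.2f} ft")
```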

This leads us to our last graph (warning: this one scrolls for a while). You’ll see the same graph as above, but you’ll see Whiff%, GB% and HR% stacked one on top of the other.

This actually paints a very intuitive picture. If there is more rise than average, you’ll get swinging strikes. If it drops more than average, you’ll get groundballs, and if it drops about what you’d expect, you’ll get some groundballs, but also homers. Ignore the SSS noise with homers at the higher velocities. Again, what is interesting with the GB% and Whiff% histograms is how consistent they are irrespective of velocity. So… if velocity doesn’t impact this analysis, let’s collapse it all into one final graph:

Paints a very clear picture: if your four-seam fastball isn’t getting at least 5 inches of late rise, you are going to be giving up a lot of homers. Note that swing% (swings/total pitches) is normally distributed around a mean of .2 feet of rise and appears to track pretty closely to HR%, implying that hard contact is not affected within 1 standard deviation.

Looking forward to the feedback.


Vertical Command – Or Lack Thereof

I read a great book by Mike Stadler called The Psychology of Baseball. In it he notes that it is far more difficult for humans to control where a ball ends up vertically (due to the need for advanced spatial reasoning) than horizontally. You can find his discussion starting on page 86. Amazon Link

I’m going to show you three pictures which will illustrate this quite well. Data is inclusive of all pitches thrown in regular season games since 2010. The first is a heat map of sorts which maps vertical distance from the center of the zone (from PITCHf/x data sz_top and sz_bottom) on the y axis and velocity on the x axis. What we see quite clearly is that it is *much* better to throw a four-seam fastball up in the zone than down in the zone, almost irrespective of velocity. In fact, a 92 MPH four-seam fastball thrown 0.8 feet above the center of the zone will get about 13% swings and misses; a 98 mph four-seam fastball thrown below the center of the zone will get 12% swings and misses. Behold the graph, from a fan:

Four-Seam Fastball, Depth x Velocity
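A grid like the one in the heat map can be built by bucketing each pitch on height relative to the center of the zone and on velocity, then dividing whiffs by pitches in each cell. The sketch below assumes PITCHf/x-style field names (pz, sz_top, sz_bot, start_speed), a hypothetical boolean whiff flag, a per-pitch denominator, and bin widths of 0.2 feet and 1 MPH; the sample pitches are made up.

```python
import math
from collections import defaultdict


def height_above_center(pz, sz_top, sz_bot):
    """Vertical distance (feet) from the middle of the batter's zone."""
    return pz - (sz_top + sz_bot) / 2


def bucket(value, width):
    """Lower edge of the bin of the given width that contains value."""
    return round(math.floor(value / width) * width, 1)


def whiff_rate_grid(pitches, height_step=0.2, velo_step=1.0):
    """Whiffs per pitch for each (height bin, velocity bin) cell."""
    totals = defaultdict(int)
    whiffs = defaultdict(int)
    for p in pitches:
        height = height_above_center(p["pz"], p["sz_top"], p["sz_bot"])
        key = (bucket(height, height_step), bucket(p["start_speed"], velo_step))
        totals[key] += 1
        if p["whiff"]:
            whiffs[key] += 1
    return {key: whiffs[key] / n for key, n in totals.items()}


# Hypothetical pitches; "whiff" is an illustrative flag, not a PITCHf/x field
sample = [
    {"pz": 3.35, "sz_top": 3.5, "sz_bot": 1.5, "start_speed": 92.4, "whiff": True},
    {"pz": 3.40, "sz_top": 3.5, "sz_bot": 1.5, "start_speed": 92.7, "whiff": False},
    {"pz": 2.05, "sz_top": 3.5, "sz_bot": 1.5, "start_speed": 98.2, "whiff": False},
]
grid = whiff_rate_grid(sample)
```

Values that sit exactly on a bin edge can land in either neighboring cell because of floating-point rounding, which is harmless at this sample size but worth knowing about.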

The question then becomes, if a pitcher throws the ball up in the zone, how will the probability of a HR change? This brings us to picture #2, where we have the same x and y axes (apparently that’s the plural of axis, thanks Google), but instead we have HR% (# of HRs/Total Pitches). I’ve removed pitches of 99+ MPH from the graph as they were displaying SSS noise.

HR% by Depth and Velocity

So interestingly, if you look at the totals on the right, they show that HRs are NOT hit on high fastballs, but rather on fastballs closer to the heart of the zone (vertically). In fact (and a story for another day) distance from the center of the zone and HR% have an R-squared of 97%. As an aside, this also reproduces other research which indicates that faster fastballs yield fewer home runs. The trend is also quite linear (I don’t have a computed R² for that, but that’s old news anyway).

Now, if you are far more likely to get a swinging strike and you aren’t putting yourself at risk for a home run by throwing up in the zone, then a distribution of four-seam fastballs should show a higher proportion of four-seamers up in the zone, ideally right at the top, 0.8 to 1.0 feet above the center of the zone, where whiffs are plentiful and HRs are scarce. Beware SSS in some of the higher velocities, but note that a 95 MPH fastball only .4 feet above the center of the zone will yield more HRs than an 88 MPH fastball thrown at the top of the zone (the 95 MPH fastball will still yield more whiffs, but it just goes to show how important command is). This is what we actually see:

A nearly uniform distribution across all velocities, slightly skewed to below the center of the zone. I’m not ready to conclude that pitchers are not capable of pitching up in the zone with four-seam fastballs; it may just be old school “pitch down in the zone” thinking. I still find it astonishing how consistent the data is across the velocity spectrum. It almost appears to me that if a pitcher can simply pitch higher in the zone with a four-seam fastball, they can make their stuff play up a lot, sort of like MadBum:

Still not pitching at the top end of the zone, but definitely skewed higher, with his distribution centered around .3 feet above the heart of the zone.


GB% by Pitch Type and Location

Red = High GB% rate (ground balls / total pitches)
Yellow = Medium ; Green = Low

The size of the circle also represents the magnitude.

Numbers are in Feet, with -X being inside (handedness neutral) and Z being height in feet above the center of the strike zone (as per PITCHf/x strike zone top and bottom). The X is flipped for left handed batters. After I’ve published a few of these, I’ll work on publishing a version to Tableau Public, though not sure how it will perform given the huge underlying data set.
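For the curious, the handedness flip and grid assignment described above could look something like this. It is a sketch, assuming PITCHf/x-style fields (px, pz, sz_top, sz_bot, stand), a hypothetical ground_ball flag, and half-foot cells; the actual cell size and workflow in the charts may differ.

```python
from collections import Counter


def inside_outside_x(px, stand):
    """Flip px for left-handed batters so negative always means inside."""
    return -px if stand == "L" else px


def grid_cell(px, pz, sz_top, sz_bot, stand, step=0.5):
    """Nearest grid point (x, z) in feet, with z measured from zone center."""
    x = inside_outside_x(px, stand)
    z = pz - (sz_top + sz_bot) / 2
    return (round(x / step) * step, round(z / step) * step)


def gb_rate_by_cell(pitches):
    """Ground balls divided by total pitches for each grid cell."""
    thrown = Counter()
    grounders = Counter()
    for p in pitches:
        cell = grid_cell(p["px"], p["pz"], p["sz_top"], p["sz_bot"], p["stand"])
        thrown[cell] += 1
        if p["ground_ball"]:
            grounders[cell] += 1
    return {cell: grounders[cell] / n for cell, n in thrown.items()}


# Hypothetical pitches: the same inside location to a righty and a lefty
sample_pitches = [
    {"px": -0.6, "pz": 2.0, "sz_top": 3.5, "sz_bot": 1.5, "stand": "R", "ground_ball": True},
    {"px": 0.6, "pz": 2.0, "sz_top": 3.5, "sz_bot": 1.5, "stand": "L", "ground_ball": False},
]
gb_rates = gb_rate_by_cell(sample_pitches)
```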

Some observations:

1) The cutter, which appeared to have two hot zones for swings and misses, appears to have only one hot zone for groundballs, of about .5 feet to 1 foot below the center of the zone and between .4 feet away and .4 feet in from the center of the plate. In the previous post we saw that as you went farther away from the plate horizontally and about .5 foot lower, you get swinging strikes.

2) Changeups down and away get groundballs. They also get swings and misses. Groundbreaking stuff here…

3) Two-seamers and sinkers have a very large area that gets groundballs (another shocker), though what surprises me is how high it starts (almost at the center of the plate). It makes me wonder if I need to double-check my methodology. As you get lower in the zone, you get fewer swings and more takes, so the GB% goes down dramatically.

4) Curveballs only get groundballs if they are in the strike zone when crossing the plate (down and away). If you bury it, you basically trade the GB for a swing and a miss. I’m thinking I need to rebuild this chart with fewer grids, but a bunch of pie charts, to somehow visualize how results morph based on location.

Finally figured out how to get PITCHf/x data into Tableau (used Alteryx to scrape MLB) — having lots of fun and appreciate the feedback!