Archive for Research

Visualizing Major League Baseball During the Aughts

2010 marks the end of the “aught” decade for Major League Baseball.  I thought I would take the opportunity to analyze the last 10 years by visualizing team data.  I used Tableau Public to create the visualization and pulled team data from ESPN.com (on-field statistics) and USA Today (team payroll).

The data is visualized through three dashboards.  The first visualizes the relationship between run differential (RunDiff) and OPS differential (OPSDiff) as well as the cost per win for teams.  The second visualization is in table form and can be sorted and filtered along a number of dimensions.  The final visualization looks at expected wins and actual wins.

Read the rest of this entry »


A Pitch F/X Look at Cliff Lee

Lee has a tremendous variety of movement in his pitches. He has three pitches that tail away from righties (fourseam, twoseam, changeup) and a nasty curveball with a ton of movement. For most pitchers this would be plenty, but Cliff Lee is not like most pitchers. He also packs a cutter with as much horizontal movement as some sliders.

We can see this with the following graph, which is from the catcher’s perspective (same with all following graphs):

[Figure: Cliff Lee pitch movement and location]

CU=curveball, FC=cutter, FF=fourseam, FT=twoseam, CH=changeup. The black box represents the strike zone, and the average location of each pitch is marked on it.

Looking at a pitcher’s entire repertoire like this is useful, but it can be more interesting to look at pitches individually when it comes to pitchers like Lee.

[Figure: fourseam density vs. RHB, with strike zone]

[Figure: fourseam density vs. LHB, with strike zone]




Against righties his location is pretty varied with the fourseam. He mainly locates the pitch middle-away, but often goes up and in too. Against lefties, he consistently pounds the outer half.

Split    Pitch Type  Count  Selection%  Swing%  Swing-Miss%   HR%   GB%   LD%   FB%
vs. RHB  FF            352        13.8    50.9         12.3   0.9  28.9  25.0  46.1
vs. LHB  FF            305        36.4    47.2         13.2   0.7  45.6  12.3  42.1

[Figures: twoseam density vs. RHB and vs. LHB, with strike zone]


Against righties he primarily throws the twoseam pitch up and away, which explains why he has a high flyball rate on a pitch typically associated with groundballs. Against lefties the pitch is pretty much thrown low and over the middle of the plate.

Split    Pitch Type  Count  Selection%  Swing%  Swing-Miss%   HR%   GB%   LD%   FB%
vs. RHB  FT           1174        46.2    48.2         14.5   0.5  31.0  21.0  48.1
vs. LHB  FT            241        28.8    46.1         11.7   0.0  59.6  27.7  12.8

[Figure: cutter density vs. RHB, with strike zone]

[Figure: cutter density vs. LHB, with strike zone]


Against righties the pitch is a real weapon; the cutter results in many whiffs and a solid amount of groundballs. Against lefties the pitch isn’t as remarkable, but still solid. His location against lefties with the cutter is very similar to his location with his fourseamer against lefties.

Split    Pitch Type  Count  Selection%  Swing%  Swing-Miss%   HR%   GB%   LD%   FB%
vs. RHB  FC            510        20.1    54.9         20.4   0.6  47.0  23.0  30.0
vs. LHB  FC            185        22.1    49.7         17.4   1.1  41.9  18.6  39.5

[Figure: curveball density vs. RHB, with strike zone]

[Figure: curveball density vs. LHB, with strike zone]


His curveball location against righties and lefties is much the same, though he does occasionally backdoor the pitch to righties. He throws his curve almost exclusively late in counts, looking for strikeouts.

Split    Pitch Type  Count  Selection%  Swing%  Swing-Miss%   HR%   GB%   LD%   FB%
vs. RHB  CU            170         6.7    44.1         37.3   0.0  76.0  12.0  12.0
vs. LHB  CU             49         5.8    36.7         38.9   0.0  20.0  20.0  60.0

[Figure: changeup density]

Pitch Type  Count  Selection%  Swing%  Swing-Miss%   HR%   GB%   LD%   FB%
CH            293        11.5    58.7         29.7   0.3  42.5  17.5  40.0

Only one graph here because he only threw 20 changeups to lefties the entire year, so I’m just going to ignore those. According to Fangraphs pitch run values, his changeup was his most effective pitch this year. And you can see why: he was great at locating the pitch down and away.

*all data and tables are from Joe Lefkowitz’ site.

*This article was originally posted on www.pendingpinstripes.net


Graphical wOBA by Count

I am a big fan of graphs and baseball. Fangraphs made me excited because putting complex data into reasonably easy to understand graphs helps open up sabermetrics to more fans. I’m a big fan of statistical analysis, but after a while, a table full of numbers just starts running together and stops making sense. That’s what makes graphs such an effective tool.

I’ve dabbled in graphs myself. When people were creating the WAR graphs to compare hall of famers, I made a sample graph showing cumulative WAR by age on Tom Tango’s Book Blog:


Read the rest of this entry »


Another Season Gone By Without Realignment

For the first time ever in the divisional era, one division has managed to run the table against the other two. Barring the Royals taking 3 of 4 from Tampa over the weekend, there won’t be a single team in the American League that will have put together a winning record against the AL East.

Since 2007, only 8 teams in the American League have been able to put together a winning record against the AL East (LAA 3 times, OAK twice, TEX, DET and SEA once). This season, as things currently stand, the Central and West divisions have combined to go 149-198 (a .429 winning percentage) against the AL East, a record comparable to the Cleveland Indians or the Washington Nationals.

What’s worse, the AL East will again finish with the team that has either the worst or second-worst record in the American League (assuming Buck Showalter doesn’t phone it in this weekend). But looking deeper into their divisional performance, how bad exactly is the AL East’s punching bag?

(Record, Winning Percentage, % Diff from total)
2010 Orioles: Non-AL East Record (39-47*, .453, +.054), 4 games at home v. Detroit left
2009 Orioles: Non-AL East Record (40-50, .444, +.049)
2008 Orioles: Non-AL East Record (46-43, .516, +.095)

AVERAGE % Diff = +.066, or 10.7 Wins

And the 4th place Jays?

2010 Blue Jays: Non-AL East Record (43-43*, .500, -.019), 4 games in Minnesota left
2009 Blue Jays: Non-AL East Record (49-41, .544, +.081)
2008 Blue Jays: Non-AL East Record (49-41, .544, +.013)

AVERAGE % Diff = +.025, or 4 Wins

This means that, on average, even the worst the AL East has to offer plays almost 5 percentage points better against non-divisional foes, or roughly 7 wins better across a 162-game season. How much are those 7 wins worth? If you’re Tampa, obviously zero, since the only way to fill the ballpark there is to give away 20,000 tickets. But consider a team like Seattle or Toronto (as highlighted above): the 2009 Jays won 75 games and drew 23,162 fans a game, whereas the 86-win 2008 version averaged 29,626. Obviously it’s a very rudimentary look at attendance and I’m ignoring plenty of factors, but the fact is, and has always been, that outside of Tampa Bay, people will inevitably jump on the bandwagon and go to see winning teams win ballgames. The point I’d like to make here is the financial impact of regional divisional slotting, and if you take the leap with me, the analysis obviously has a more profound impact.
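For anyone who wants to check the arithmetic, here is a minimal Python sketch of the averaging and the conversion to wins. The six differences are the ones listed above; the function and variable names are my own, and the conversion simply assumes a percentage difference scales linearly over 162 games.

```python
# Winning-percentage differences (non-division minus overall), as listed above.
ORIOLES_DIFFS = [0.054, 0.049, 0.095]     # 2010, 2009, 2008
BLUE_JAYS_DIFFS = [-0.019, 0.081, 0.013]  # 2010, 2009, 2008
SEASON_GAMES = 162


def average(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


def wins_over_season(pct_diff, games=SEASON_GAMES):
    """Scale a winning-percentage difference to wins over a full season."""
    return pct_diff * games


for label, diffs in [
    ("Orioles", ORIOLES_DIFFS),
    ("Blue Jays", BLUE_JAYS_DIFFS),
    ("Both teams", ORIOLES_DIFFS + BLUE_JAYS_DIFFS),
]:
    avg = average(diffs)
    print(f"{label}: {avg:+.3f} -> {wins_over_season(avg):.1f} wins")
```

Pooling all six seasons gives the "almost 5 percentage points, roughly 7 wins" figure quoted above.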

For comparison purposes, let’s look at the AL CENTRAL 4th and 5th place finishers:

2010 Kansas City: Non-AL Central Record (36-50*, .418, +.007), 4 games at home v. Tampa left
2009 Kansas City: Non-AL Central Record (33-57, .367, -.034)
2008 Detroit: Non-AL Central Record (47-43*, .522, +.065), 13-5 in interleague games

2010 Cleveland: Non-AL Central Record (33-55, .389, -.039)
2009 Cleveland: Non-AL Central Record (33-55, .389, -.012)
2008 Kansas City: Non-AL Central Record (44-46*, .489, +.025). 13-5 in interleague games

AVERAGE % Diff = .002, or +.4 Wins

The 2008 figures are both skewed in the positive direction by the AL Central’s success in interleague play: its teams played the even weaker NL West in the majority of their IL games and had zero games against that division’s winner (the 84-78 Dodgers). Discounting those games, the result is:

AVERAGE % Diff = -.015 or -2.7 Wins

Does the AL East really cost a team 7-10 wins a season, as it has the Orioles? Probably not; there are plenty of other factors that go into a team’s eventual W-L. But traditional Strength of Schedule metrics that value the winning percentage of your opponents, or Runs Scored/Runs Allowed differentials, often don’t capture the simple fact that your record and team stats will be positively influenced by facing the Royals 18 times a season. Who knows, maybe the 2010 Blue Jays, who lead the majors in slugging and home runs, could’ve competed this season in another division where their young pitching staff wasn’t exposed to the three highest-scoring teams in the league (New York, Boston, Tampa) for 54 games? Under the current alignment, we can never be too sure.

On the positive side, at least the divisions make regional sense in their current form, unlike football. I’m sorry, putting Miami in the AFC EAST, Baltimore in the AFC NORTH, and Indianapolis in the AFC SOUTH makes about as much sense as the BCS national championship formula.


Making Cent$ of Home Field Advantage

Over the last couple of weeks there has been a lot of debate about the value of home field advantage in baseball. The discussion was crystallized most recently when the Yankees rested key bullpen members in their first place showdown against the Rays, but it has really been going on since the advent of the Wild Card.

Yesterday, we wondered about the value of finishing first from an accomplishment perspective, but ultimately, that is a very intangible way of looking at the question. At SI.com, Joe Sheehan looked at home field advantage from a competitive standpoint and came to the conclusion that it really wasn’t an advantage at all. According to Sheehan’s research, the number one seed has advanced to the World Series in only eight of 24 chances since 1998, when the current playoff format was established. What’s more, over that span, the home team has only gone 45-39 in all postseason series. In other words, there really isn’t a home field advantage in baseball during the postseason.

There is at least one more vantage point from which to consider this question, and it could very well be the most important: economics.

In the post season, gate revenue (i.e., attendance) is divided between the players and hosting team using the following format:

  • Players: 60% of gate receipts from first three games of LDS and first four games of LCS and World Series; no contribution from other games.
  • Home team: 40%* of gate receipts from first three games of LDS and first four games of LCS and World Series; 100% of gate receipts from all other games.

*A small percentage (approximately 1.5%) of LDS gate receipts goes to the umpires, while 15% of LCS and World Series gate receipts go to MLB.
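To make the split concrete, here is a rough Python sketch of the 60/40 rule for a single game. It is a simplification of my own: it ignores the umpire and MLB cuts in the footnote, and the function name, series labels, and the gate figure in the example are all made up for illustration.

```python
# Games per series whose gate is shared with the players' pool (from the list above).
SHARED_GAMES = {"LDS": 3, "LCS": 4, "WS": 4}
PLAYER_SHARE = 0.60  # players' share of a shared game's gate


def split_gate(series, game_number, receipts):
    """Return (players_pool, home_team) for one game's gate receipts.

    Simplification: the umpire and MLB cuts from the footnote are ignored.
    """
    if series not in SHARED_GAMES:
        raise ValueError(f"unknown series: {series}")
    if game_number <= SHARED_GAMES[series]:
        players = receipts * PLAYER_SHARE
        return players, receipts - players
    # Later games: the hosting team keeps the whole gate.
    return 0.0, receipts


# Hypothetical $1,000,000 gate for LDS Game 1 versus LDS Game 4.
print(split_gate("LDS", 1, 1_000_000.0))
print(split_gate("LDS", 4, 1_000_000.0))
```

The jump from 40% to 100% of the gate in the later games is where any economic edge would have to come from.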

On the face of it, there seems to be an economic advantage to having home field. But, is it real, and if so, how significant is it?

Read the rest of this entry »


The Leadoff Walk

We’ve all heard a broadcaster comment on the impending doom of a leadoff walk, and yet they never seem to predict the same fateful outcome for a single. I thought it would be interesting to look at each of the ways a player can lead off an inning by getting to first base and see whether it affects whether or not the runner goes on to score. I took the Retrosheet data since 1952 (but not including this year) that I have as a MySQL database and created a quick Python script to determine these results. I took it further and examined whether the breakdown was any different in late game situations, as I’m always hearing “You never want to walk the leadoff batter but especially late in close ball games”. I was also curious whether, in general, more solitary runs get manufactured once a leadoff runner gets on base in late game situations.

Total times a batter led off an inning by getting to first: 508312
Total times runner scored: 192150

So a leadoff batter who starts on first base scores 37.80% of the time. Here is the breakdown by the means they get aboard:

Any inning

Single      325455 Scored 122662   37.69%
Walk        150570 Scored  57189   37.98%
HBP          11865 Scored   4600   38.77%
Error        19260 Scored   7270   37.74%
Strikeout     1007 Scored    375   37.24%
Catcher's Int. 155 Scored     54   34.84%
Totals      508312 Scored 192150   37.80%

So it appears as though there isn’t much of a statistically significant difference between the walk and the single. The HBP number seems to be a bit of an outlier; I’m wondering if that is just sample size or if such an outcome rattles the pitcher to the point of that many more runs being produced.
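One way to put a number on "statistically significant" is a two-proportion z-test on the counts in the table. This is a sketch of a standard textbook test, not what my original script did, and it assumes each leadoff event is independent, which is a simplification.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Walk versus single, using the "Any inning" counts above.
z, p = two_proportion_z(57189, 150570, 122662, 325455)
print(f"walk vs. single: z = {z:.2f}, p = {p:.3f}")
```

The HBP row can be run against the single row the same way to see whether its gap survives the smaller sample.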

Let’s now examine the breakdown based upon the stage of the game.

6th inning or earlier

Single      217421 Scored  83243 38.29%
Walk        100587 Scored  38798 38.57%
HBP           7879 Scored   3070 38.96%
Error        12778 Scored   4880 38.19%
Strikeout      645 Scored    244 37.83%
Catcher's Int. 107 Scored     36 33.64%
Totals      339417 Scored 130271 38.38%

7th inning or later

Single     108034 Scored 39419 36.49%
Walk        49983 Scored 18391 36.79%
HBP          3986 Scored  1530 38.38%
Error        6482 Scored  2390 36.97%
Strikeout     362 Scored   131 36.19%
Catcher's Int. 48 Scored    18 37.50%
Totals     168895 Scored 61879 36.64%

Interesting how leadoff runners reaching first score 1.74 percentage points more often in the earlier innings.  Is this a comment on the failure of manufacturing runs or pitching being different in the later stages of the game?  Perhaps a deeper look based upon “close game situations” is in order for that.
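The early-versus-late gap can be read two ways, in percentage points or as a relative change, and this small sketch computes both from the Totals rows above (the variable names are mine).

```python
# Totals rows from the two tables above.
EARLY_SCORED, EARLY_TOTAL = 130271, 339417   # 6th inning or earlier
LATE_SCORED, LATE_TOTAL = 61879, 168895      # 7th inning or later

early_rate = EARLY_SCORED / EARLY_TOTAL
late_rate = LATE_SCORED / LATE_TOTAL

# Absolute gap in percentage points, and the gap relative to the late-inning rate.
gap_points = (early_rate - late_rate) * 100
relative_gap = (early_rate - late_rate) / late_rate * 100

print(f"early {early_rate:.2%}, late {late_rate:.2%}")
print(f"gap: {gap_points:.2f} points, {relative_gap:.1f}% relative")
```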


2010 Pitchf/x Summit Recap

A few weeks ago, Sportvision hosted the 3rd Annual Pitchf/x Summit.  Sportvision is the company behind the Pitchf/x system and has initiated Fieldf/x, which I’ll get into in a minute.  The goal of the summit was to share some of the research being done in baseball analysis, while also serving to explain the possibilities that exist with the new system.  Without further ado, here were the presentations:

Using Velocity Components to Evaluate Pitch Effectiveness (Matt Lentzner/Mike Fast): The purpose of this study was to change the reference point by which Pitchf/x data are measured.  Often, fastballs show more movement than breaking balls, but without the proper frame of reference, it means nothing.  Mike and Matt were able to demonstrate how to determine the horizontal and vertical velocities with respect to the batter’s eye and make the Pitchf/x data more meaningful.

Pitchf/x Application in Player Development and Evaluation (Dr. Glenn “Butch” Schoenhals): Dr. Schoenhals has a Pitchf/x system set up at his instructional school, which allows pupils (including some major leaguers) to see their pitches broken down immediately and make adjustments.  In conjunction with three cameras set up around the pitcher, the Pitchf/x data benefit both pitchers and instructors in learning and teaching how to pitch.

Okajima’s Mystery Pitch (Matt Lentzner): Hideki Okajima throws a pitch roughly 20% of the time that had previously been classified as a curveball, more specifically a “rainbow curveball.”  Actually, it didn’t really fit any of the known pitch types.  Using his research on pitch types and arm slots (“The Pitching Peanut”), Lentzner showed that this pitch has almost no break, is faster than a curveball but slower than a slider, and falls at the exact center of the peanut.  His explanation: Okajima is the Boston pitcher who is actually throwing the gyroball, not his more famous teammate Daisuke Matsuzaka.

Leaving the No-Spin Zone (Alan Nathan): Dr. Nathan showed his experiments that relate the spin of the baseball just before and just after it is hit. The result? The two are almost totally independent of each other! I couldn’t believe that, but Dr. Nathan made a lot of sense.  This was a high-grade physics lesson, crammed into about 20 minutes.  He explained why balls tend to curve toward the foul lines; he showed that the bat actually “grips” the ball for a few nanoseconds or so before the ball explodes off the bat, which contrasts with the earlier model of the ball “rolling” off the bat.  Really, really cool.

Fieldf/x System Overview (Vidya Elangovan): And the main event began.  Fieldf/x is a new tracking system that utilizes cameras attached to the light standards in baseball stadiums (for now, just AT&T Park) to track the movement of every person on the field 15 TIMES A SECOND.  As soon as I heard that, my mind started going crazy and I don’t think I paid attention for about 5 minutes.  The only issue at the time is that the system does not include the ball (but it will).  All ball events currently have to be added by someone watching the video.  The following presentations showed some of the things you can actually do with the data, and it’s fairly obvious that these data, particularly when connected to batted ball data through the Hitf/x database, are about to revolutionize how baseball players are evaluated.

Infield Defense with Fieldf/x (John Walsh): This was actually the first presentation, thanks to Walsh being in Italy (tough life), but it really would have been more helpful after the overview.  Either way, a lot of cool stuff.  First thing he said was that in tracking the different players, he noticed that an average centerfielder runs 8 miles per game, which stunned me and kept my attention.  Thanks to these new data, we can also see the effects of shifts and what players away from the ball are doing while teammates are attempting to make plays.  Other questions John poses: can we see infielders cheating in a certain direction as the pitcher throws the ball? Do infielders lean in a certain direction before the pitch? Based on his initial investigations, he saw that third basemen step toward the line as the pitch is delivered and shortstops step directly at home plate.  Weird, but potentially important, and just a peek into what can be obtained.

From Raw Data to Analytical Database (Peter Jensen): As a baseball nerd and a programming dork, this was really cool.  Peter Jensen took the 400,000 lines of code that result from each game and wrote a macro to display what actually happened in the game in an Excel worksheet.  The simulation relates the position of each player as well as an approximation of where the ball is throughout the play.  His solution for reorganizing the data was very impressive for a first run, and it is absolutely vital to making the data useful for analysis.

Using Fieldf/x to Assess Fielders’ Routes to Fly Balls (Dave Allen): These next three were absolutely incredible to me (and I’m sure the last three would have joined them had I had the time to stay).  By using the data to reconstruct fielders’ routes to the ball, Allen surmises that the Fieldf/x data can be used to determine the speed of an outfielder as they pursue a ball, the starting points of each fielder at the time of the pitch (and hit), and how efficient each player is in getting to the ball (measuring the distance traveled against the shortest distance to the ball).  To me, this is something that teams can use to help players they already have by addressing alignment issues or noticing what is happening during the different points of pursuit.  Are outfielders getting good reads/jumps on the ball?  Are they running in straight lines or weaving?  Simply put, the data can confirm for us (and also measure exactly and more efficiently) what our eyes (and scouts’ eyes) have seen.
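To give a feel for the route-efficiency idea, here is a minimal Python sketch that assumes a fielder's track is just a list of (x, y) positions in feet, sampled 15 times a second. That layout and all of the names are my own assumptions; I have not seen the actual Fieldf/x data format, and this is not Allen's code.

```python
import math

SAMPLE_RATE_HZ = 15  # samples per second, per the Fieldf/x overview


def path_length(points):
    """Total distance covered along the sampled route."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def route_efficiency(points):
    """Straight-line distance divided by distance run (1.0 = perfectly direct)."""
    traveled = path_length(points)
    if traveled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / traveled


def average_speed(points, rate_hz=SAMPLE_RATE_HZ):
    """Average speed in feet per second over the whole route."""
    if len(points) < 2:
        raise ValueError("need at least two samples")
    elapsed = (len(points) - 1) / rate_hz
    return path_length(points) / elapsed


# Hypothetical routes: one straight at the ball, one with a wrong first step.
direct = [(0, 0), (3, 4), (6, 8)]
detour = [(0, 0), (3, 0), (3, 4)]
print(route_efficiency(direct), route_efficiency(detour))
```

A weaving or banana-shaped route shows up as an efficiency below 1.0 even when the catch is made.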

Measuring Base Running with Fieldf/x (Mike Fast): Mike’s presentation examined the different portions of base running and what the data can be used for.  Mike was able to track each base runner’s path around the bases, even what they were doing on pitches that weren’t hit (during which we would typically say “nothing happened”).  Obviously, with all of these data, there’s a lot happening.  Also, by knowing the position of the player at each moment in time, we can track both his speed and acceleration as he rounds the bases, which is very valuable information for measuring “baseball speed.”
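Getting speed and acceleration out of position samples is just a matter of differencing. Here is a rough sketch, again assuming (x, y) positions in feet at 15 samples per second; the names are mine, and real tracking data would need smoothing before this is trustworthy.

```python
import math

SAMPLE_RATE_HZ = 15  # samples per second, per the Fieldf/x overview


def speeds(points, rate_hz=SAMPLE_RATE_HZ):
    """Speed (ft/s) between each pair of consecutive samples."""
    return [math.dist(a, b) * rate_hz for a, b in zip(points, points[1:])]


def accelerations(speed_series, rate_hz=SAMPLE_RATE_HZ):
    """Change in speed (ft/s^2) between consecutive speed estimates."""
    return [(v2 - v1) * rate_hz for v1, v2 in zip(speed_series, speed_series[1:])]


# Hypothetical runner leaving first base along a straight line.
track = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
v = speeds(track)
print(v, accelerations(v))
```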

Fieldf/x of Probabilities: Converting Time and Distance into Outs (Jeremy Greenhouse): The coolest of the presentations.  As soon as he said the words “probability model,” I was sold.  Jeremy first examined stolen base attempts (in the thirteen games of data released, he only found four) and tried to determine the different component times of the stolen base attempt.  Some things he brought up that were interesting: “Pop” times, or the time it takes a catcher to catch the ball and get it to second base, were between 2.0 and 2.2 seconds for all attempts, which suggests that a lot of stolen bases are taken off pitchers, not catchers.  The ability to get a good lead is now measurable, as well as the jump a runner gets on the pitcher.

Jeremy also developed a model to determine the probability that a player makes a play on a ball hit near him.  The model was based on where the player is, where the ball would come down, and how long it would take the ball to get there.  From there, the player’s probability can change based on his jump, route, speed, and what I called “catching ability,” or the ability to actually make a play on the ball when in the vicinity.  It was shocking to see some of the plays made where players started out with low (less than 10 percent) chances of catching the ball, but by getting a good jump and running (quickly) in a straight line toward the ball, their probability would increase each 1/15 of a second.  He then showed the video of these plays and we were able to see the spectacular catches made by really good outfielders.  This also applies to outfielders who start with a low probability to make the catch, but increase it as they, for example, chase a ball into the gap, close quickly on it, but don’t catch it.  The ability to increase the probability of a catch is very valuable and that knowledge would be immensely valuable to teams.  Lastly, he also showed how bad outfielders can turn outs into hits by reading the ball poorly, getting bad jumps, and being indecisive.  Super cool, and as soon as the presentations are made available online (which hopefully will be soon), I will link to some of them, but especially some of these graphs.
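I don't have Jeremy's model, but a toy version shows the flavor of a probability that updates every frame. The sketch below is entirely my own invention: a logistic curve on the speed a fielder would need to reach the landing spot in time, with placeholder parameters (`typical_speed`, `steepness`) that are not fitted to any data.

```python
import math


def catch_probability(fielder_xy, landing_xy, time_left_s,
                      typical_speed=27.0, steepness=0.5):
    """Toy catch probability from the speed needed to reach the ball.

    typical_speed (ft/s) and steepness are made-up placeholders, not fitted values.
    """
    distance = math.dist(fielder_xy, landing_xy)
    if time_left_s <= 0:
        return 1.0 if distance == 0 else 0.0
    required_speed = distance / time_left_s
    # Clamp the exponent so math.exp cannot overflow for huge required speeds.
    exponent = min(steepness * (required_speed - typical_speed), 700.0)
    return 1.0 / (1.0 + math.exp(exponent))


# Hypothetical play: 100 ft to cover with 3 s of hang time, then 64 ft with 2 s left.
at_contact = catch_probability((0, 0), (100, 0), 3.0)
after_jump = catch_probability((36, 0), (100, 0), 2.0)
print(at_contact, after_jump)
```

In this toy setup a fielder who covers ground faster than the play required at contact sees his probability rise frame by frame, which is the behavior described above.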

Unfortunately, I missed the following presentations, so I will just show the abstracts presented in the program.

Where Fielders Field: Spatial and Time Considerations (Matt Thomas): Continued application of close-range photogrammetry through high-resolution digital photography to baseball is revealing hitherto unseen patterns of fielding in the game. Matt examines these patterns and where data permit, factors time into this examination. After reviewing general trends he notes specific achievements and then speculates on whether any of this freshly quantified insight tells us what makes for good (and not so good) fielding.

Scoutf/x (Max Marchi): This presentation evaluates players’ tools with Pitchf/x, Hitf/x, and Fieldf/x.

True Defensive Range (TDR): Getting out of the Zone (Greg Rybarczyk): Greg intends to display detailed tracking of the 25 batted balls in the released data that were hit in the air to the outfield. Presented data will include the relative positions of the outfielders and the ball from the time the ball leaves the bat until the time it is retrieved by the fielder. Using the essential elements of this data (fielder starting position, ball hang time and landing point), he outlines the fundamentals of a new outfield defensive metric, called ‘True Defensive Range’ or TDR, which should provide more accurate player defensive ratings with a smaller required sample size than current metrics. Full realization of this metric will require establishment of baseline values using the full data set. TDR for infielders will employ a similar method, but it will not be covered during this presentation.

The Future of Sportvision’s Data Collection (Greg Moore): Greg will talk about several bits of baseball data that Sportvision might collect in the future, and he will discuss how the data can be used in conjunction with Pitchf/x, Hitf/x, and Fieldf/x. Greg will also conclude the 2010 Pitchf/x Summit with closing remarks.

Obviously, there was a lot of cool stuff presented.  As mentioned, only 13 games’ worth of data were released to the analysts and most of the presentations were about determining what could be done with the data.  But with enough work and research, these data will not only change the way teams and analysts evaluate players, but also give teams another tool with which to teach their players and improve the guys they already have on the roster.  We’ll also know exactly what skills are important in each aspect of the game (base running, fielding, etc.), and as we learn these things we’ll discover other things we want to know.  I’d love to know what you guys think of all this and I’ll try to answer any questions you have about what can and can’t be measured and how we’ll use it in the future.

UPDATE: After I wrote this mess, I discovered this, much cleaner, detailed, mess, written by Baseball Prospectus writer Ben Lindbergh.  I’ll link to it down here because I want you to read what I wrote instead of Ben’s running diary.  Sorry, Ben.

This article was originally published at Knuckleballs, written by Dan Hennessey.


Tribe’s Ongoing Draft Difficulties

Successful small-market teams contend by acquiring above-average talent on two levels: (a) development, either through international free agency or the amateur draft, and (b) trades. 

Success in one area, and not the other, can create a small window of opportunity to win, but for a team to compete for multiple seasons it is important to find talent on both avenues.  

Many fans remember John Hart as the architect of those great Cleveland teams in the late 90’s, but few realize how fortunate he was to step into that situation. Much of the groundwork for the Indians renaissance can be traced back to former General Manager Hank Peters.

During his four-year tenure as GM, Peters drafted Manny Ramirez, Charles Nagy, Jim Thome, Brian Giles, David Bell, Chad Ogea, and Paul Byrd.  He also did well when he acquired Sandy Alomar Jr. and Carlos Baerga in a trade with San Diego for Joe Carter.

The foundation was set for Hart to succeed.

Hart, as he should, receives a lot of the credit for those teams.  He was one of the first general managers to sign young players, not quite at their peak, to long term contracts. He acquired a borderline future Hall of Famer in Kenny Lofton for Willie Blair and Eddie Taubensee – both would become more journeyman than established player.

John Hart did a lot of things right but what he could not do was mimic Peters’ draft success. 

Hart’s tenure as General Manager lasted ten years, 1992 – 2001, and in that time he managed to draft and sign only three impact players – CC Sabathia, Sean Casey, and Richie Sexson. Both Casey and Sexson were eventually traded for spare parts.

Using WAR (Wins Above Replacement player) as a statistical guideline, the top ten players drafted by Hart are:

Name             Year Drafted  Round  Career WAR
C.C. Sabathia            1998      1          38
Sean Casey               1995      2        15.8
Richie Sexson            1993     24        14.4
Russell Branyan          1994      7         8.5
Steve Kline              1993      8         7.9
Luke Scott               2001      9         7.3
David Riske              1996     56         6.9
Ryan Church              2000     14         6.9
Paul Shuey               1992      1         6.8
Jon Nunnally             1992      3         5.3

While all those players carved out careers of varying success, only Sabathia ranks in the top 1000 players in career WAR.  Hart’s successor and current GM, Mark Shapiro, was not as lucky and stepped into a much more difficult situation.

Hart’s final two drafts epitomized the team’s long-standing draft failures and would eventually signal the beginning of Cleveland’s first rebuilding process in nearly a decade.

In 2000 and 2001, the Indians owned 3 of the top 55 picks and 4 of the top 43 picks, respectively.  The only player chosen to mount any type of tangible career was Toronto long reliever/spot starter Brian Tallet.

As with Hart, the foundation for Shapiro was set – but this one signaled the beginning of a rebuilding period and the end of an era.

In November of 2001, Shapiro was promoted to General Manager of a team with high-priced, aging veterans and a farm system fraught with failure.  Shapiro sought to rebuild the organization from the ground up – in four seasons – the Hank Peters way.

He went about stocking the farm system by trading Bartolo Colon for Cliff Lee, Brandon Phillips, and Grady Sizemore.  Shapiro then drafted college phenom Jeremy Guthrie and signed him to a club record $3 million bonus.  He followed up the 2002 draft by selecting Michael Aubrey, a polished college hitter, in the first round the next season.   Shapiro set about reestablishing Cleveland baseball.

On paper, his way parallels that of Peters.  One problem: while he has had success in trading for prospects – Shin-Soo Choo, Asdrubal Cabrera, and Carlos Santana among others – his draft picks have not panned out.  Guthrie bounced between Buffalo and Cleveland, the bullpen and the rotation.  Aubrey spent more time in the chiropractor’s office than on the field, and the rest of the prospects failed to make a longstanding impact.

A small-market team has to have the ability to replace players as they enter arbitration years or they will never be able to consistently compete – which would explain the two winning seasons the Indians have had since 2002. 

Shapiro and assistant GM Chris Antonetti have changed draft philosophies in recent years: they no longer take players considered “safe” choices and now focus on “toolsy” players.  In the last three years the team has added much more promising prospects like Alex White, Lonnie Chisenhall, and 2010 second rounder LeVon Washington.

Much like in 1992 and again in 2002, the front office and the team have both entered another period of transition.  Chris Antonetti will spearhead another rebuilding effort in hopes of creating something fans have not experienced in so many years – a consistently competitive team.  It would, of course, be easier if he steps into a situation much like the one Hank Peters created and not the one John Hart left.

Hank Peters’ time highlighted how a successful franchise is built on strong talent development and smart trades.  His time in Cleveland is a perfect example of combining strong draft results with dealing higher priced veteran players for prospects.  In four years he was able to set the foundation, along with Hart’s subsequent tweaking, for the Tribe’s revival.


xBABIP Experiment: Mark Kotsay

I am tired of Mark Kotsay. I am tired of his automatic 4-3 ground outs. I am tired of his lazy fly balls to left center. I am tired of his .190 batting average with runners in scoring position. I am tired of his .688 OPS. But most of all, I am tired of people in the White Sox organization defending Mark Kotsay. From Ozzie Guillen to Hawk Harrelson and Chris Rongey, the excuses are coming from every corner of the organization. And as an objective White Sox fan, the constant excuses are getting tiring. Luck or no luck, Mark Kotsay is a bad baseball player, that much is for certain.

Kotsay does nothing well and he contributes nothing on the field to this White Sox team, as shown by Mark Kotsay’s -0.6 WAR, good for the fourth worst in all of Major League Baseball amongst players who have at least 280 plate appearances. This shouldn’t come as a surprise, as of this moment, Mark Kotsay is hitting .228. Yet you have Ozzie Guillen saying things like, “Personally, the numbers out there for Kotsay [are not what] he deserves.” Followed by…“You can ask his teammates, you can ask [hitting coach] Greg Walker. He should have better numbers than what he has.”

You can ask any average White Sox fan or anybody in the White Sox organization and they will say that Kotsay has been unlucky or he “deserves” better. However, just how much better? Fortunately for us, the great people in the sabermetric community have come up with something that tries to battle this thing called luck. I think everybody knows of BABIP by now, but there is something better, something more contextual: xBABIP (Expected Batting Average on Balls In Play). The concept is simple: take the mean batting average of line drives, ground balls, and fly balls that are not home runs, then create a BABIP based on those averages.
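As a rough illustration of the idea, xBABIP can be sketched as a weighted average: each batted-ball type's share of a hitter's balls in play, times the league batting average on that type. This is my reading of the concept, not necessarily the exact formula behind any published xBABIP, and every number in the example is a round placeholder, not a measured rate.

```python
def xbabip(batted_ball_rates, league_avg_by_type):
    """Weighted average of league batting average by batted-ball type."""
    missing = set(batted_ball_rates) - set(league_avg_by_type)
    if missing:
        raise KeyError(f"no league average for: {sorted(missing)}")
    total_rate = sum(batted_ball_rates.values())
    weighted = sum(rate * league_avg_by_type[kind]
                   for kind, rate in batted_ball_rates.items())
    # Dividing by the total lets the rates act as weights even if they do not sum to 1.
    return weighted / total_rate


# Placeholder inputs for illustration only.
hitter_rates = {"LD": 0.20, "GB": 0.50, "FB": 0.30}
league_avgs = {"LD": 0.70, "GB": 0.20, "FB": 0.10}
print(round(xbabip(hitter_rates, league_avgs), 3))
```

Because line drives fall for hits far more often than anything else, the line drive rate does most of the work in a calculation like this.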

So what if Kotsay wasn’t lucky or unlucky? What if this was a perfect world where the average always happens? For the record, Mark Kotsay is a good player for this little experiment since his career offensive numbers are about as average as you can get. Currently, Mark Kotsay’s xBABIP is .269. This is more or less based on his line drive rate of 15.9%. His actual BABIP is .239. So as you see, he has been pretty unfortunate, as that’s a 30-point disparity. Now let’s take this a step further and say that .269 xBABIP is his actual BABIP. Mark Kotsay has hit 222 balls into the field of play (this does not include home runs). If he gets hits 26.9% of the time on those 222 balls, he would have 60 hits. Add his 7 home runs to those 60 hits and you have 67 hits in 258 at bats, which comes out to a .259 Batting Average. What about his On Base Percentage? Taking those 67 hits, adding his 30 walks, and dividing by his 286 plate appearances, we would get an OBP of .336. So far so good, right? Looks like Mark Kotsay would be a decent ballplayer if it wasn’t for those “hang wiffums,” right?
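For anyone who wants to retrace that arithmetic, here is a rough sketch in Python using only the figures quoted above. The variable names are mine, and leaving the expected hit total unrounded (59.7 rather than 60) is an assumption about how the numbers were worked out.

```python
# Figures quoted in the paragraph above (variable names are illustrative).
xbabip = 0.269            # expected BABIP
actual_babip = 0.239      # actual BABIP
balls_in_play = 222       # excludes home runs
home_runs = 7
at_bats = 258
walks = 30
plate_appearances = 286

# Gap between expected and actual BABIP, in "points" (thousandths).
babip_gap_points = round((xbabip - actual_babip) * 1000)

# Assumption: expected hits are kept unrounded rather than rounded to 60.
expected_hits_in_play = xbabip * balls_in_play
expected_hits = expected_hits_in_play + home_runs

expected_avg = expected_hits / at_bats
expected_obp = (expected_hits + walks) / plate_appearances

print(f"BABIP gap: {babip_gap_points} points")
print(f"Expected AVG: {expected_avg:.3f}")
print(f"Expected OBP: {expected_obp:.3f}")
```

With these inputs the average matches the .259 above, while the OBP lands a couple of points higher than .336. Dividing by at bats plus walks (288) instead of 286 gives .336, so the original calculation may have used a slightly different denominator.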

Hold on just a second here, we can also apply this to his Slugging Percentage. We can play the rate game, which is a dangerous game to play, but we’ll do it anyways. Of the 53 hits Mark Kotsay has put in play, 38 have been singles, 13 have been doubles, and 2 have been triples. So from this, we can see that 71% of Kotsay’s non-home run hits have been singles, 25% have been doubles, and 4% have been triples. As I said before, this is a dangerous game to play, almost a fallacy, but since Kotsay does have an appropriate sample size here, it might be safer than usual. Applying these rates to his 60 expected hits, his new hit figures are 43 singles, 14 doubles, and 2 triples. This would result in a Slugging Percentage of .414. By adding Kotsay’s expected OBP and SLG together we come up with a .750 Expected On Base Plus Slugging or OPS, which is just about average.
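The rate game can be sketched the same way. This is a minimal illustration, not necessarily the exact method used above: it assumes the expected hits in play (222 × .269) are split in the same proportions as the actual 38/13/2 singles, doubles, and triples, without rounding to whole hits along the way.

```python
# Figures quoted in the text (names are illustrative).
xbabip = 0.269
balls_in_play = 222
home_runs = 7
at_bats = 258
quoted_expected_obp = 0.336   # OBP as stated in the previous paragraph

actual_hits = {"1B": 38, "2B": 13, "3B": 2}   # non-HR hits so far
bases_per_hit = {"1B": 1, "2B": 2, "3B": 3}

total_actual = sum(actual_hits.values())

# Assumption: keep expected hits unrounded and scale each hit type
# by its share of the actual non-HR hits.
expected_hits_in_play = xbabip * balls_in_play
expected_hits = {
    kind: expected_hits_in_play * count / total_actual
    for kind, count in actual_hits.items()
}

# Total bases from expected non-HR hits, plus four bases per home run.
total_bases = sum(expected_hits[k] * bases_per_hit[k] for k in expected_hits)
total_bases += 4 * home_runs

expected_slg = total_bases / at_bats
expected_ops = quoted_expected_obp + expected_slg

print(f"Expected SLG: {expected_slg:.3f}")
print(f"Expected OPS: {expected_ops:.3f}")
```

Under those assumptions the sketch lands on the same .414 SLG and .750 OPS quoted above.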

Alright, so how do these new expected rates affect Mark Kotsay’s value? If we calculated an expected wOBA from these newly calculated values, Mark Kotsay would have a .329 Expected wOBA, just about average. We can then convert this into a run value to produce a new expected WAR. In this case, Kotsay would have produced -0.99 batting runs (without ballpark adjustment) in comparison to the average replacement player, much better than his previous rate of -6.2. So in this case, Kotsay’s WAR goes from -0.6 to -0.08. A half-win difference can go a long way at times.
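The jump from batting runs to WAR is easy to check. The sketch below assumes the common rule of thumb of roughly 10 runs per win; that conversion rate is my assumption, since the exact figure used isn't stated here.

```python
# Assumption: about 10 runs per win (rule of thumb, not stated in the text).
RUNS_PER_WIN = 10.0

actual_batting_runs = -6.2     # quoted above
expected_batting_runs = -0.99  # quoted above
actual_war = -0.6              # quoted above

# Runs gained in the "luck-neutral" scenario, converted to wins.
runs_gained = expected_batting_runs - actual_batting_runs
wins_gained = runs_gained / RUNS_PER_WIN
expected_war = actual_war + wins_gained

print(f"Runs gained: {runs_gained:.2f}")
print(f"Wins gained: {wins_gained:.2f}")
print(f"Expected WAR: {expected_war:.2f}")
```

At 10 runs per win, the 5.21-run swing is worth about half a win, which is consistent with the move from -0.6 to -0.08.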

So what does this tell us? Well, first off, it says that Kotsay is a very average hitter in a luck-isolated world and average hitters should not be DHing 1/3 of the games for a team that already has issues scoring runs.  It also tells us that Mark Kotsay has no place on this current White Sox team. He is a replacement level player who is only capable of DHing and playing 1B, and that’s even if he hit like he “deserved” to hit. With Mark Teahen coming back and Brent Lillibridge already on the team, this team could be incredibly versatile. Isn’t that what Ozzie Guillen wanted? Isn’t that why Ozzie said no to Jim Thome, who is clubbing the ball for the rival Twins and is also a great clubhouse guy? This love affair with Mark Kotsay has gone too far. He is in fact costing this team on the offensive side of the ball. I would have no problem if Kotsay stays on this team as a pinch hitter and starts maybe once a week; he’s apparently a good guy in the clubhouse (as is his wife, I imagine). But the fact that this replacement level player has played 3/4 of this team’s games is disturbing. With the way that this situation has been tended to, you’d think the White Sox’ new slogan would be something along the lines of “White Sox Baseball: Here to Make Friends, Not to Win”.


Keeping Up With the Musials

It’s safe to say that Andruw Jones has been one of the most disappointing baseball players in recent memory. Just five years ago, Jones was in the middle of a fantastic season wherein he hit 51 homers with a .922 OPS (despite a .240 BABIP) and was worth 8.3 WAR. As recently as 2007, he slammed 26 long balls while driving in 94 and accumulating 3.8 WAR.

Then disaster struck. In 2008, after signing a two-year, $36 million contract with the Dodgers, Jones absolutely tanked, hitting just .158 with three homers and a .505 OPS; he struck out in more than a third of his at-bats and his once prodigious power disappeared, as evidenced by his Michael Bourn-esque .091 ISO.

In the 160 games Jones has played with the Rangers and White Sox in 2009-10, he’s regained some of his lost power, bashing 32 homers with a .244 ISO in just under 600 plate appearances. However, those numbers don’t seem particularly special for a guy who’s spent the majority of his time at first base and DH, especially when combined with a putrid .209 batting average. No one’s mistaking him for an All-Star.

And yet, there is no doubt that Andruw Jones belongs in the National Baseball Hall of Fame.

Wait, what?

For starters, let’s not be too hasty in dismissing his earlier offensive accomplishments. In 12 years with the Braves, he averaged 33 homers and 98 RBI per 162 games with an .824 OPS. He joined the 20/20 club three times, including his 31/27 season in 1998.

His 403 career homers put him 46th all-time — ahead of current Cooperstown residents Al Kaline (399), Jim Rice (382), Ralph Kiner (369), and Albert Pujols (okay, so he’s not in the Hall of Fame yet, but I’m sure they’re already planning out his plaque). And while 31 was a tad on the young side for a complete collapse, don’t forget that he had established himself as a key part of the Braves’ outfield before he was old enough to drink. But all of that is just icing on the cake.

Forget everything he did at the plate, on the basepaths, or in the dugout; if for no other reason, Andruw Jones deserves to be enshrined because of what he did in center field. Jones isn’t just one of the best defensive outfielders of his generation — he’s arguably the best-fielding outfielder of all time, and surely ranks among the top glovesmen in baseball history at any position.

Jones won 10 consecutive Gold Gloves from 1998-2007. Even opening it up to players who were honored in multiple, nonconsecutive years, that beats Ichiro (nine), Torii Hunter (nine), Andre Dawson (eight), Jim Edmonds (eight), Larry Walker (seven), and Kenny Lofton (four). The only outfielders who have ever done better are Willie Mays and Roberto Clemente (12 apiece), but I’m sure you’ll join me in forgiving Jones for not quite living up to their lofty standard.

Of course, you could argue that Gold Gloves are a popularity contest, and aren’t necessarily the best way to determine the game’s best defenders (see “Kemp, Matt” and “Jeter, Derek” last year). It’s true, they don’t accurately describe Jones’ accomplishments — they don’t do them justice.

According to TotalZone (used for seasons from 1954-2001) and Ultimate Zone Rating (2002-now), Jones has saved 274.3 runs in his career with his glove. Two hundred seventy-four point three runs. That’s about 28 wins’ worth of value for his career without taking into account anything he’s done with his bat.

If that number isn’t terribly impressive to you, perhaps you should consider the context: it’s the best score of any outfielder in baseball history, and a look at the Top 10 shows that it’s not particularly close:

1. Andruw Jones 274.3
2. Roberto Clemente 204.0
3. Barry Bonds 187.7
4. Willie Mays 185.0
5. Carl Yastrzemski 185.0
6. Paul Blair 174.0
7. Jesse Barfield 162.0
8. Al Kaline 156.0
9. Jim Piersall 156.0
10. Brian Jordan 148.0

These statistics are far from perfect, and there’s definitely an argument to be made that the older numbers are particularly flawed. But even if we can’t use it to compare players of different eras (could the margin of error really be more than 70 runs?), we can see just how amazing Jones has been by comparing him to his contemporaries. If you noticed that the only other names of those 10 who played at the same time as Jones were Bonds (whose days as a serviceable fielder were numbered by the time Jones made his debut) and the woefully unappreciated Jordan, you can probably see where this is going.

Darin Erstad (146.6)? Ichiro (120.2)? Carl Crawford (119.8)? Lofton (114.5)? Mike Cameron (110.7)? Walker (86.0)? Edmonds (57.5)? None of them even come close. In fact, Jones’ score is better than any two of those names’ combined.
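If you’d like to check those comparisons yourself, here is a quick sketch using only the runs-saved totals quoted in this article (the variable names are mine). It computes Jones’ lead over the runner-up and the best total any two of those contemporaries can manage together.

```python
from itertools import combinations

# Career runs saved, as quoted in this article.
jones = 274.3
runner_up = 204.0  # Roberto Clemente, second on the all-time outfield list

contemporaries = {
    "Darin Erstad": 146.6,
    "Ichiro": 120.2,
    "Carl Crawford": 119.8,
    "Kenny Lofton": 114.5,
    "Mike Cameron": 110.7,
    "Larry Walker": 86.0,
    "Jim Edmonds": 57.5,
}

# Jones' margin over the second-best outfielder.
lead = jones - runner_up

# Highest combined total for any two contemporaries.
best_pair = max(
    combinations(contemporaries.items(), 2),
    key=lambda pair: pair[0][1] + pair[1][1],
)
best_pair_names = [name for name, _ in best_pair]
best_pair_total = sum(runs for _, runs in best_pair)

print(f"Lead over runner-up: {lead:.1f} runs")
print(f"Best pair: {best_pair_names} with {best_pair_total:.1f} runs")
print(f"Jones beats every pair: {best_pair_total < jones}")
```

Even the strongest pairing, Erstad and Ichiro, comes up short of Jones on his own.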

It’s not just outfielders, either. Jones’ TZR/UZR is the second best of all-time, behind only Brooks Robinson. Compare his 274.3 runs saved with Cal Ripken Jr.’s 181.0, Ivan Rodriguez’ 156.0, Luis Aparicio’s 149.0, and Omar Vizquel’s 136.4. He even beats true defensive legends like Joe Tinker (180.0), Honus Wagner (85.0), and the amazing Ozzie Smith (239.0). If you can go toe-to-toe with the “Wizard of Oz” in the field, you barely need a pulse offensively to deserve a place in Cooperstown.

Jones hasn’t had time to slowly build up his score by being a consistently solid fielder; instead, he grabbed the bull by the horns and has enjoyed some of the best individual defensive seasons in baseball history.

In 1998 — at age 21 — he was worth 35 runs in the field, which at the time was tied for the second-best defensive performance since tracking began in 1950. In 1999, he promptly went out and beat that, earning 36 TZR. All told, he appears on the Top 80 list for single-season Total Zone Rating five times. And that’s not including UZR, which has been kinder to him than TZR since 2003.

Will the BBWAA vote him in when his time comes? Probably not. Even assuming the voters have learned how to use the newfangled defensive metrics by then (far from a sure thing, given that a majority of NL Cy Young voters implicitly declared wins to be the most important pitching statistic last year), there are too many reasons for them to doubt his candidacy.

While TZR and UZR make sense and are great tools for getting a general idea of a player’s defensive prowess, they’re too inconsistent for fans to take as the word of God (though, in my opinion, a 70-run lead is more than enough to cancel out the margin of error). Aside from that, you’ve just got a free-swinging, power-hitting outfielder (a dime a dozen over the last 20 years) who fell off a cliff right before his 32nd birthday. He’d have to return to his younger form and maintain it for at least a few more years in order to have a realistic shot at Cooperstown.

But, as the Beatles once sang, “all you need is glove” (unless I heard that wrong), and that’s what Ozzie Smith proved when he got more than 90 percent of the vote for the Hall of Fame in 2002. Combine phenomenal defense with a solid bat (remember those 403 homers?) and there’s no question Andruw Jones deserves a spot in Cooperstown.

Lewie Pollis is a freshman at Brown University studying political science. He also contributes to BleacherReport.com, ManCaveSports.org, and Green Pages, the quarterly publication of the U.S. Green Party.