Announcers and baseball fans alike often claim that knuckleball pitchers can throw hitters off their game and leave them in funks for days. Some managers even sit certain players to avoid this effect. I decided to find out whether the effect is real and, if so, how large it is. R.A. Dickey is the main knuckleballer in the game today, and he is a special breed because of the extra velocity he has.
Most people who try to analyze this Dickey effect group all the pitchers that follow him into one pool with one ERA and compare it to the total ERA of the bullpen or rotation. This is a simplistic and non-descriptive way of analyzing the effect, because it does not account for how often each pitcher pitches when not following Dickey.
I decided to determine whether there truly is an effect on the statistics (ERA, WHIP, K%, BB%, HR%, and FIP) of pitchers who follow Dickey in relief and of the starters of the next game against the same team. I went through every game that Dickey has pitched and recorded the stats (IP, TBF, H, ER, BB, K) of each reliever individually and of the next starting pitcher, if the next game was against the same team. I did this for each season. I then took each pitcher’s stats for the whole year and subtracted his after-Dickey stats from them to get his stats when he did not follow Dickey. I summed the after-Dickey stats and weighted each pitcher by the batters he faced as a share of the total batters faced after Dickey. I then calculated the rate stats from the total. The same weight was then applied to the not-after-Dickey stats. For example, if Janssen faced 19.11% of batters after Dickey, his line was adjusted so that he also faced 19.11% of the batters not after Dickey. This puts the two samples on the same mix of pitchers, so the comparison is fair. The not-after-Dickey stats were then summed and the rate stats were calculated as well. The after-Dickey and not-after-Dickey rate stats were compared using this formula: (afterDickeySTAT-notafterDickeySTAT)/notafterDickeySTAT. This tells me, as a percentage, how much better or worse relievers or starters did when following Dickey.
I then added up the after-Dickey stats and the not-after-Dickey stats for starters and relievers across all four years and applied the same weighting, so that if Niese’12 faced 10.9% of all batters faced by starters following a Dickey start against the same team, his line was adjusted so that he also faced 10.9% of the batters faced by starters not after Dickey (only the starters that pitched after Dickey that season). The same technique was used as in the year-by-year analysis, and a total % change for each stat was calculated.
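The weighting step is easier to follow with a small sketch. The one below is only an illustration of the idea for a single rate stat (K%); the `Line` container, the function names, and the two pitchers' numbers are all hypothetical and are not the data used in this post.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Counting stats for one pitcher in one split (hypothetical container)."""
    tbf: float  # total batters faced
    k: float    # strikeouts


def not_after(season: Line, after: Line) -> Line:
    """Season totals minus the after-Dickey line leaves the not-after-Dickey line."""
    return Line(season.tbf - after.tbf, season.k - after.k)


def compare_k_rate(season: dict, after: dict) -> tuple:
    """Return (K% after Dickey, reweighted K% not after Dickey, relative change)."""
    total_after_tbf = sum(line.tbf for line in after.values())
    ad_rate = sum(line.k for line in after.values()) / total_after_tbf

    scaled_tbf = 0.0
    scaled_k = 0.0
    for name, ad_line in after.items():
        nad_line = not_after(season[name], ad_line)
        # Scale the not-after line so this pitcher's share of batters faced
        # matches his share of the after-Dickey sample.
        scale = ad_line.tbf / nad_line.tbf
        scaled_tbf += nad_line.tbf * scale
        scaled_k += nad_line.k * scale
    nad_rate = scaled_k / scaled_tbf

    return ad_rate, nad_rate, (ad_rate - nad_rate) / nad_rate


# Made-up season and after-Dickey lines for two pitchers.
season = {"Pitcher A": Line(300, 75), "Pitcher B": Line(200, 40)}
after = {"Pitcher A": Line(60, 18), "Pitcher B": Line(40, 8)}

ad_rate, nad_rate, change = compare_k_rate(season, after)
print(f"K% after: {ad_rate:.4f}, not after: {nad_rate:.4f}, change: {change:+.1%}")
```

Scaling each not-after line before summing is what keeps a heavily used pitcher from dominating the baseline just because he threw many innings on other days.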
The most important stat to look at is FIP. This gives a more accurate value of the effect. Also make note of the BABIP and ERA, and you can decide for yourself if the BABIP is just luck, or actually better/worse contact. Normally I would regress the results based on BABIP and HR/FB, but FIP does not include BABIP and I do not have the fly ball numbers.
The size of the sample is also included; aD means after Dickey and naD means not after Dickey. Here are the results for starters following Dickey against the same team.
It can be concluded that starters after Dickey see an improvement across the board. Like I said, it is probably better to use FIP rather than ERA. Over the past 4 years, starters see an approximate 18.9% decrease in their FIP when they follow Dickey. So assuming 130 IP are pitched after Dickey by a league average set of pitchers (~4.00 FIP), this would decrease their FIP to around 3.25. 130 IP was selected by assuming that ⅔ of a full season of starter innings (200) follow Dickey against the same team. Over 130 IP this would be a 10.8 run difference, or around 1.1 WAR! This is amazingly significant and appears to come mainly from a reduction in HR%. If we regress the HR% change down to -10% (which seems more than fair), the FIP reduction drops to around 7%. A 7% reduction would take a 4.00 FIP down to 3.72 and save 4.0 runs, or 0.4 WAR.
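The arithmetic behind those run and WAR figures can be sketched as follows. The function name is made up for illustration, and the 10 runs per win conversion is an assumption inferred from the figures above (10.8 runs for about 1.1 WAR), not a stated constant.

```python
RUNS_PER_WIN = 10.0  # assumption: implied by "10.8 runs ... or around 1.1 WAR"


def fip_reduction_value(base_fip, pct_reduction, innings):
    """Return (new FIP, runs saved, approximate WAR) for a relative FIP reduction."""
    new_fip = base_fip * (1.0 - pct_reduction)
    runs_saved = (base_fip - new_fip) * innings / 9.0  # FIP is on a per-9-innings scale
    return new_fip, runs_saved, runs_saved / RUNS_PER_WIN


# Un-regressed starter effect: 18.9% off a 4.00 FIP over 130 IP.
new_fip, runs, war = fip_reduction_value(4.00, 0.189, 130)
print(f"{new_fip:.2f} FIP, {runs:.1f} runs, {war:.1f} WAR")

# Regressed version: 7% off a 4.00 FIP over the same 130 IP.
reg_fip, reg_runs, reg_war = fip_reduction_value(4.00, 0.07, 130)
print(f"{reg_fip:.2f} FIP, {reg_runs:.1f} runs, {reg_war:.1f} WAR")
```

Without rounding the new FIP to 3.25 first, the un-regressed case comes out to about 10.9 runs rather than 10.8; the difference is only rounding.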
Here are the numbers for relievers following Dickey in the same game.
Relievers see improvements in the FIP components (K, BB, HR) that are more consistent with one another (11.4, 8.1, 4.9). FIP was reduced 10.3%. Assuming 65 IP after Dickey (in between his 2012 and 2013 totals) from an average bullpen (or slightly above average, since Dickey will likely have setup men and closers after him) with a 3.75 FIP, FIP would be reduced to 3.36, saving 3 runs or 0.3 WAR.
Combining the un-regressed results, Dickey would contribute around 1.4 WAR over a full season simply by having pitchers pitch after him. If you assume the effect is just a 10% reduction in FIP for both groups, this number comes down to around 0.9 WAR, which is not a crazy estimate at all based on the results. I can say with great confidence that if Dickey pitches over 200 innings again next year, he will contribute above 1.0 WAR just from baffling hitters for the next guys. If we take the un-regressed 1.4 WAR and add it to his 2013 WAR (2.0) we get 3.4 WAR; if we add in his defence (7 DRS), we get 4.1 WAR. Even though we all were disappointed with Dickey’s season, with the effect he provides and his defence, he is still all-star calibre.
Just for fun, let’s apply this to his 2012. He had 4.5 WAR in 2012; add on the 1.4 and his 6 DRS and we get 6.5 WAR, wow! Using his RA9 WAR (6.2) instead (commonly used for knucklers instead of fWAR) we get 7.6 WAR! That’s Miguel Cabrera value! We can’t include his DRS when using RA9 WAR though, as it should already be incorporated.
This effect may extend even further: relievers may (and likely do) get a boost the following day, just as starters do. Assuming it is the same boost, that’s around another 2.5 runs or 0.25 WAR. Maybe the second day after Dickey also sees a boost? (The sample size is a lot smaller, since Dickey would have to pitch the first game of a series.) We could assume the effect is cut in half on that second day, and that’d still be another 2 runs (90 IP of starters and relievers). So under these assumptions, Dickey could effectively have a 1.8 WAR after-effect over a full season! This WAR is not easy to place, however, and cannot just be added onto the team’s WAR; it is hidden among all the other pitchers’ WARs (just like catcher framing).
You may be disappointed with Dickey’s 2013, but he is still well worth his money. He is projected for 2.8 WAR next year by Steamer, and adding on the 1.4 WAR Dickey Effect and his defence, he could be projected to really have a true underlying value of almost 5 WAR. That is well worth the $12.5M he will earn in 2014.
For more of my articles, head over to Breaking Blue where we give a sabermetric view on the Blue Jays, and MLB. Follow on twitter @BreakingBlueMLB and follow me directly @CCBreakingBlue.
TIPS is a new ERA estimator that I have created. The post on the estimator can be found here.
In short, TIPS is an estimator that attempts to measure pitcher skill completely independent from all other factors other than batter-pitcher relationships (removing defense, catchers, umpires, batted ball luck, etc.). The formula is:
TIPS = 6.5*O-Looking(PitchF/x) – 9.75*SwStr% – 4.8*Foul% + C (around 2.60)
where: O-Looking(PitchF/x) = 1 – O-Swing% (PitchF/x), SwStr% = percent of pitches swung at and missed, Foul% = percent of contacts fouled off
The estimator was found to be the most predictive of any estimator in samples less than 70 IP.
I have taken the free agent custom leaderboards provided by Dave Cameron and ranked the pitchers by TIPS.
TIPS may not have as much power with starting pitchers, since the samples will be larger than 70 IP, but since these pitchers will be changing defense, park, and catcher, I believe it can be useful (when used with FIP and xFIP). Click this text for the starting pitcher leaderboard.
If you cannot view the google spreadsheet, here are the top free agent starting pitchers by TIPS. Yes, I know Lincecum has since signed, but he is still included.
Kazmir, Marcum, Haren, Hughes, and Johnson all look like really good value signings (when comparing their ERA and FIP/xFIP/TIPS). Scott Kazmir is someone who I believe could be a legit number 2 guy moving forward if he can keep his velocity. I know Jason Marquis had a 4.05 ERA, but he is someone you should be wishing your team does not sign.
But now on to where TIPS really shines, relievers!
Here is the RHP leaderboard and LHP leaderboard. I am also providing the full combined leaderboard:
There are a few notable FA relief pitchers. Mujica, Benoit, Nathan, Rodney, Balfour, Hawkins, and Gregg all closed this year. Crain is a pitcher who could potentially close as well. Looking at the closers, Mujica is alone in the top tier by TIPS. Then Benoit, Crain, and Nathan are second tier. Rodney and Balfour are in the next tier, while Hawkins and then Gregg are in the final tiers. Gregg in particular looks like an RP that no team should touch. Parra and Logan make for some good LOOGY signings if teams are looking for left-handed relievers. There are quite a few names in this list that would do a fine job of filling out a bullpen. It goes to show that trading for bullpen pieces might be akin to trading your brother or sister your blueberry for their strawberry when there is a pack of strawberries on the counter. A bit of a random analogy, but it makes sense. The SP crop is much thinner than the RP crop. There are no big-name or potential number 1 pitchers in the FA crop, which means teams that are looking to add to the front of their rotation might have to do so through trade.
On a bit of a side note, I wanted to talk a little more about TIPS. Why does TIPS really like Mujica? It loves his amazing 44.2% O-Swing% and his 12.5% SwStr% isn’t too shabby either. O-Swing% (I use the PitchF/x value), SwStr%, and Foul% are peripherals that you should be accustomed to looking at and understanding. Foul% is not readily available, but is not too hard to calculate. What value is good? What is bad? I will explain here:
To finish this off, I’d like to say Koji Uehara is a monster: 39.2% O-Swing% (Above Excellent), 18.5% SwStr% (Above Excellent), and 60.8% Foul% (Almost Excellent).
In the Marlins deal last November, Josh Johnson was the main headlining piece along with Jose Reyes and Mark Buehrle. Then the Blue Jays added R.A. Dickey in December and the starting rotation looked to be very strong. Dickey, Morrow, Johnson, Buehrle, and Happ were all supposed to have strong seasons and hope for a 2013 World Series title was in abundance. Then came April. The rotation struggled, terribly. Josh Johnson seemed to be the worst offender of them all. He was the biggest disappointment of the season. But was he actually that bad?
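As a rough illustration of how those three peripherals feed the formula, here is a minimal sketch that plugs Uehara's quoted numbers into the TIPS equation given earlier in this post. The function name is made up, and the constant is taken as exactly 2.60 even though the post only says "around 2.60", so treat the output as approximate rather than as his official TIPS.

```python
def tips(o_swing, swstr, foul, constant=2.60):
    """TIPS from the formula in this post; all rates are fractions, not percents.

    constant=2.60 is an assumption (the post says "around 2.60").
    """
    o_looking = 1.0 - o_swing  # O-Looking = 1 - O-Swing% (PitchF/x)
    return 6.5 * o_looking - 9.75 * swstr - 4.8 * foul + constant


# Uehara's peripherals as quoted above: 39.2% O-Swing, 18.5% SwStr, 60.8% Foul.
uehara_tips = tips(o_swing=0.392, swstr=0.185, foul=0.608)
print(f"Approximate TIPS: {uehara_tips:.2f}")
```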
Using all of the standard metrics for pitchers, Josh Johnson was brutal. He was 2-8 with a 6.20 ERA and 1.66 WHIP. He also only pitched 81 and a third innings. How could you possibly say he had a good season? Those stats look worse than 2012 Ricky Romero. If you take a look at his K/9 of 9.18, you see he had the best K/9 of his career. You also see that he had the worst BB/9 of his more recent years at 3.32. These two stats are a little deceiving in this case, however. Because his innings were much longer, his K/9 and BB/9 would both be up, as he faced more batters per inning. We therefore have to look at the rate per batter. He had a K% of 21.6%, which is just shy of his career average (not his best, as K/9 suggests) of 21.9%. This makes his strikeout rate look less appealing, but it is still very good. The reverse is true for his walk rate, as his BB% was 7.8%. This mark is better than his last two years and better than his career average of 8.1%.
Now on to why I believe Josh Johnson will be a good starter next year and onward. In case you haven’t heard of them before, there are ERA-accompanying stats called FIP, xFIP, and SIERA. These stats try to eliminate events that are beyond the pitcher’s control (fielding independent pitching). FIP is calculated from K’s, BB’s, and HR’s relative to IP. xFIP is the same, except that it corrects the pitcher’s HR total to what it would be with a league average HR/FB rate. SIERA uses a more complex formula based on K%, BB%, and batted ball profiles (ground balls, fly balls, and pop ups) to approximate ERA. These three stats do a much better job of predicting future ERA than of matching current ERA. ERA fluctuates greatly from year to year and sample to sample for pitchers, while the guts of these metrics are more constant. ERA is not stable, as it depends on luck in BABIP, HR/FB, and LOB as well as team defense. FIP is usually closest to the ERA of the sample, as it uses actual home runs and does not correct for HR/FB luck. SIERA is the best at predicting future ERA, followed closely by xFIP, then FIP, and lastly, ERA.
So while Josh Johnson’s ERA is 6.20, his BABIP is an inflated .356 (compared to a career average of .305 and league average of .294) and this should regress back towards the mean. FIP has BABIP luck taken out of the equation and has Johnson with a FIP of 4.62. This is much lower than the 6.20 ERA, but 4.62 is still not very good for a pitcher of his price tag. However, FIP does not assume a league average HR/FB rate; this is where xFIP comes into play. Johnson’s HR/FB% this year is an abysmal 18.5% (compared to an 8.2% career average and 10.6% league average). It can be assumed that this will regress towards the mean as well next year. So accounting for this absurd HR/FB%, Josh Johnson had an xFIP of 3.60. That looks a little better, doesn’t it? Especially since xFIP does a better job of predicting future ERA.
The one problem with using FIP and xFIP in this case, however, is that they are based on rates with IP as the denominator. As I discussed earlier, the long nature of Josh Johnson’s innings would increase his K, BB, and HR per inning, as more batters come to the plate. This is where SIERA comes into play as the best statistic to use in this case. SIERA, as mentioned prior, deals with rates where PA (or BF) is the denominator. It has also been shown that batted ball profiles are somewhat controllable by the pitcher and have an impact on results. In most cases, xFIP and SIERA are very similar, but replacing the IP denominator with BF and including some batted ball profile gives SIERA the slight edge in predictability. Josh Johnson’s SIERA this year was 3.73, which is probably the best guess as to what we can expect his ERA to be going forward.
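To see why long innings inflate per-9 rates, consider this small sketch. The two pitchers are entirely hypothetical: both face 100 batters and strike out 22 of them, but one needs more batters per inning to get his outs.

```python
def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings."""
    return strikeouts * 9.0 / innings


def k_pct(strikeouts, batters_faced):
    """Strikeouts per batter faced."""
    return strikeouts / batters_faced


# Hypothetical pitchers: same strikeouts, same batters faced, different innings.
efficient = {"k": 22, "tbf": 100, "ip": 27.0}  # about 3.7 batters per inning
laboured = {"k": 22, "tbf": 100, "ip": 22.0}   # about 4.5 batters per inning

for name, p in (("efficient", efficient), ("laboured", laboured)):
    print(name, round(k_per_9(p["k"], p["ip"]), 2), k_pct(p["k"], p["tbf"]))
```

The pitcher who allows more baserunners posts the higher K/9 even though both strike out the same share of hitters, which is why the per-batter rate is the fairer comparison here.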
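For readers who want to see the mechanics, here is a minimal sketch of FIP and xFIP using the commonly published form of the formulas, which also counts hit batters alongside walks. The constant of 3.10 is an assumption (it is recalculated each year to put FIP on the ERA scale), and the stat line is hypothetical, not Johnson's.

```python
FIP_CONSTANT = 3.10  # assumption: the league constant changes slightly each year


def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """Fielding Independent Pitching from actual HR, BB, HBP, and K."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, lg_hr_per_fb, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """Same as FIP, but home runs are replaced by fly balls times the league HR/FB rate."""
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical line: 100 IP, 15 HR on 90 fly balls, 30 BB, 2 HBP, 90 K.
line_fip = fip(hr=15, bb=30, hbp=2, k=90, ip=100)
line_xfip = xfip(fb=90, lg_hr_per_fb=0.106, bb=30, hbp=2, k=90, ip=100)
print(f"FIP {line_fip:.2f}, xFIP {line_xfip:.2f}")
```

A pitcher whose HR/FB rate is well above the league rate, as in this made-up line, ends up with an xFIP noticeably below his FIP.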
3.73 or 3.60 look excellent and amazing considering the results we saw. What a ray of hope! But what if he really was just more hittable this year? What if he wasn’t unlucky and batters can just hit him? This is what I will look into now.
Johnson’s injury history and the effect it has had on his velocity is well documented. He is not the same pitcher he was in ’09 and ’10. He is a different pitcher now, but he has been this way for two years, not one. Josh Johnson is the same pitcher that he was in 2012 when he posted a 3.81 ERA for the Marlins (he might even be better). How is this possible you say? His ERA has jumped 2.39 runs! I will dive into all of his peripherals to prove that he hasn’t changed that much.
First let’s take a look at his velocity (I will be using PITCHf/x numbers for all values).
His average FB velocity in 2012 was 92.8mph, while this year it is 92.9mph. Slider velocity was 86.9mph and now is 86.1mph. Curve was 78.5mph and now is 79.1mph, while his changeup was 87.6mph and now is 88.6mph. All of these velocities are very constant! There is nothing here implying that he is more hittable than last year, so let’s move on.
Let’s look at plate discipline to see if there is anything that suggests hittability. His O-Swing% (outside zone swing%) was 30.9% and now is 32.3%. This should decrease hittability if anything, since contact should be worse on pitches outside of the zone. His Z-Swing% (zone swing%) is a constant 60.4% compared to last year. His O-Contact% is slightly up (59.5% to 61.9%) but this shouldn’t matter, as these pitches should be less hittable. His Z-Contact% is slightly down (90.9% to 89.6%), which should be good as it means more whiffs in the zone. His Zone% is also slightly down (44.9% to 43.7%), but who cares if he doesn’t walk more batters. Lastly, his SwStr% (swinging strike%) is essentially constant (9.2% to 9.3%). Again there is nothing here to suggest that batters should be able to hit him better.
I have heard some people say that he just gets rattled when things go bad. I’d like to partially debunk this theory, as his pace (time between pitches) is essentially the same as last year (20.9s in 2012 and 21.0s in 2013). Pitchers who are rattled generally take more time between pitches. There aren’t really any other stats that can prove otherwise, as all his peripherals are fairly constant.
The one main difference that is notable in his peripherals between 2012 and 2013, is his 2-seam fastball use. He has used his two-seamer 13.3% of the time compared to only 4.8% last season. This difference has come at an expense of all three of his secondary pitches, which are all slightly down in usage. Is his two-seamer a bad pitch? It’s certainly not his best. I would take pitch values from this year with a grain of salt, as they are all low due to his bad luck, but his two-seamer has been below average for three years in a row: -1.94 RAA/100 pitches (runs above average) in 2011, -2.43 RAA/100 in 2012 and -1.99 RAA/100 in 2013. Other than his changeup since his velocity decline (which went from average to well below average), the two-seamer has been consistently his worst pitch. The fact that he is using it more is not a good thing, but this is easily corrected if it is pointed out to him. It has nothing to do with a lack of ability. His above average curve and slider have taken a hit in usage and this needs to be corrected.
Pitch selection hasn’t been too much of an issue for him in terms of strikeouts and walks, however. Both his K% and BB% are trending in the right direction from last year. His K% is up 0.9 percentage points, while his BB% is down 0.4 percentage points. These both suggest he has improved since last year, and his xFIP and SIERA mirror that. xFIP has gone from 3.73 to 3.60 while SIERA has improved from 3.86 to 3.73. He has been getting better at pitching with his reduced velocity, not worse (as it appears on the surface).
One counterargument to this could be that he’s just throwing more meatballs down the middle that are getting hit, which would also mean he walks fewer batters and strikes out more. This was partially debunked by his lower Zone% and lower Z-Contact% from before, but I want a little more proof that this is not the case. FanGraphs, with the help of PITCHf/x, is an amazing website that, in addition to all these fancy stats, also provides heat maps for pitchers to see exactly where they are throwing the ball.
Here are Johnson’s 2012 heat maps:
And here are his 2013 heat maps:
Not much difference is there? He enjoys throwing down and away the most, and this hasn’t changed at all. In case you’re wondering, there is less yellow in 2013 because he’s thrown about half as many pitches.
Another theory I have heard is that his pitches are straighter now. I will look into this. This actually might have a case. His movement on each pitch has decreased since last year (by around .6 inches for each pitch). However, we need to look into the numbers a little deeper. PITCHf/x movement in the z-direction (up or down) excludes gravity and gives the amount the ball would move without gravity. What does this mean if we have positive movement values (which Johnson does with every pitch except his curve)? It means that, without gravity, each pitch would move up. In reality, gravity is much larger than this movement force and the ball drops. So a larger positive movement number means that the ball will drop less than it would with a smaller value, and therefore shows less downward movement. Johnson’s fastball and his two-seam fastball (to a larger extent) both have less rise this year, which means they actually have more drop. His slider is about the same, while his changeup and curve are showing slightly less drop. I might say this is a problem, but his curve was his best pitch this year, while his changeup has been bad for 2 years anyway and should just be a show pitch. I would be more concerned if he was showing less movement in the horizontal direction, but this isn’t the case. With the exception of his changeup (which is moving less), each pitch’s horizontal movement is almost identical to 2012. All things considered, nothing here suggests that he is any more hittable, especially considering his batted ball profile.
One last thing to check, to see if batters are making better contact aside from the high home run rate, is his batted ball profile. Again, these rates look almost identical to 2012. His line drive rate is slightly up (23.6% to 24.2%). It isn’t much, but it is still a small concern. His ground ball rate is related and took a small hit (46.2% to 45.1%). His fly ball rate is slightly up too (30.2% to 30.7%), but that’s not a problem either. His infield fly ball rate is also up (7.2% to 8.6%), which is actually good since infield flies are almost always an out. His infield-hit rate is up (5.1% to 5.9%), showing some more of his bad luck. Again, SIERA takes batted balls into consideration, and at 3.73 it wasn’t too concerned with his rates. There are some xBABIP formulae out there that predict what BABIP should be based on batted balls. These formulae are better at suggesting whether a pitcher (or batter) has changed their true talent BABIP (instead of getting lucky) than at actually predicting BABIP. Using Steve Staud’s xBABIP, which uses LD%, FB%, and IFFB%, Josh Johnson’s 2012 and 2013 xBABIPs are nearly identical (.3163 to .3159). Matt Swartz’s xBABIP uses GB% and K% and yields .2894 in 2012 and .2880 in 2013. This is almost exactly the same again. This suggests that Josh Johnson’s true talent BABIP has not changed and that he has been getting very unlucky. There are no large or conclusive outliers in Josh Johnson’s stats suggesting that he is any different a pitcher than in 2012.
Another thing that I would like to add is that Josh Johnson has been very consistent at preventing home runs and posting a HR/FB rate that is lower than league average. This is shown by his 8.2% career average and by the fact that he has posted HR/FB rates lower than league average in every year of his career except 2013. This causes his FIP to be consistently lower than his xFIP and SIERA (it has been every year save 2013). So while xFIP and SIERA are the best estimators of ERA, Josh Johnson usually outperforms them in FIP. He had an excellent 3.40 FIP last year and was just a bit unlucky with LOB%, which caused his ERA to be higher at 3.81. Using all of this information and the proof that Josh Johnson hasn’t changed, it would be safe to say that his ERA should be around 3.55 next year (if he were still in the NL) if everything keeps trending the same way.
There are two more things to consider though: league change and age. The AL ERA this year is 0.26 runs higher than the NL ERA. Applying this to the 3.55 brings him back to around 3.70-3.90. Age is another thing to consider, as Josh Johnson is going from 29 to 30 years old. As a pitcher, this actually gives him an approximate 0.05 decrease in ERA. This generalization is shown in this graph from Baseball Prospectus. Taking this into consideration, I believe we will see Josh Johnson post an ERA between 3.65 and 3.85 next year.
So let’s say we have Johnson posting a 3.75 ERA next year. A full season of Johnson should be around 3.0 WAR, cut his innings in half (injury risk) and that’s still 1.5 WAR. With wins being worth approximately $9M next year, Josh Johnson could realistically be worth anywhere from $13.5M to $27M, depending on injuries. A qualifying offer will be around that $13.5M. So even with a qualifying offer, the downside is that you will pay what you get, while the upside is much better. You can’t really lose. However I don’t think the Jays need to pay him $13.5M. Remember, he posted a 6.20 ERA this year. GMs around the league, as well as agents, will want to stay away from a bad, injury-prone pitcher. I believe the Jays could extend Johnson at around $11M/year over three years. At this price you could most certainly expect positive value from him. There are not really any cases like this to compare the situation with, so predicting possible contracts is a shot in the dark, but no matter the contract, I am positive it will be worth it. The Blue Jays definitely need to extend Josh Johnson as soon as possible. It is one of the best buy low opportunities they’ll ever encounter.
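The value range above is simple multiplication, sketched below. The break-even line is my own addition for illustration, and it assumes the price of a win stays at $9M across all three years of the hypothetical extension, which the post does not claim.

```python
DOLLARS_PER_WIN_M = 9.0  # $ millions per WAR, the figure quoted for next year


def market_value_m(war):
    """Market value in $ millions for a given WAR at the quoted price per win."""
    return war * DOLLARS_PER_WIN_M


full_season = market_value_m(3.0)  # healthy full season
half_season = market_value_m(1.5)  # innings cut in half by injury
print(f"Value range: ${half_season:.1f}M to ${full_season:.1f}M")

# WAR per year needed to break even on an $11M/year extension,
# assuming (for illustration only) a constant $9M per win.
break_even_war = 11.0 / DOLLARS_PER_WIN_M
print(f"Break-even WAR per year at $11M: {break_even_war:.2f}")
```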
FIP, xFIP, and SIERA are all very good ERA estimators, and their predictability is well documented. It is well known that SIERA is the best ERA estimator over samples that occur from season to season, followed very closely by xFIP, with FIP lagging behind. FIP is best at showing actual performance though, because it uses all real events (K, BB, HR). Skill is commonly best attributed to either xFIP or SIERA. ERA is also well known to be the worst metric at predicting future performance, unless the sample size is very large (>500 IP) with the pitcher remaining in the same or a very similar pitching environment.
FIP, xFIP, and SIERA are supposed to be Defense Independent Metrics, and they are. Well, they are independent of field defense, but there is one small error in the claim of defense independence. K’s and BB’s are not completely independent of defense. Catcher pitch framing plays a role in K’s and BB’s. Catchers can be good or bad at changing balls into strikes, and this affects K’s and BB’s. Umpire randomness and umpire bias also play a role in K’s and BB’s. It is unknown how much of getting umpires to call more strikes is a skill for a pitcher. Some pitchers are consistent at getting more strike calls (Buehrle, Janssen) or fewer strike calls (Dickey, Delabar), but for most pitchers it is very random (especially in small sample sizes). For example, Jason Grilli was in the top 5% in 2013 but was in the bottom 10% in 2012.
I wanted to come up with another ERA estimator that eliminates catcher framing, umpire randomness and bias, and defense. I took the sample of pitchers who have pitched at least 200 IP since 2008 (N=410) and analyzed how different statistics that meet these criteria affect ERA-. I used ERA- since it takes out park factors and adjusts for the changes in the league from year to year. I looked at the plate discipline PitchF/X numbers (O-Swing, Z-Swing, O-Contact, Z-Contact, Swing, Contact, Zone, SwStr), the six different results based off plate discipline (zone or o-zone, swing or looking, contact or miss for ZSC%, ZSM%, ZL%, OSC%, OSM%, OL%), and batted ball profiles (GB%, LD%, FB%, IFFB%). *Please note that all plate discipline data is PitchF/X data, not the other plate discipline data on FanGraphs; this is important as the values differ*
The stats with very little to absolutely no correlation (R^2<0.01) were: Z-Swing%, Zone%, OSC%, ZSC%, ZL% (was a bit surprised as this would/should be looking strike%), GB%, and FB%. These guys are obviously a no-no to include in my estimator.
The stats with little correlation (R^2<0.1) were: Swing%, LD%, and IFFB%. I shouldn’t use these either.
O-Contact% (0.17), Z-Contact% (.302), Contact% (.319), OSM% (0.206), and ZSM% (.248) are all obviously directly related to SwStr%. SwStr% had the highest correlation (.345) out of any of these stats. There is obviously no need to include all of the sub stats when I can just use SwStr%. SwStr% will be used in my metric.
OL% (0.105) is an obvious component of O-Swing% (0.192). O-Swing% had the second highest correlation of the metrics (other than the components of SwStr%). I will use it as well. The theory behind using O-Swing% is that when the batter doesn’t swing it should almost always be a ball (which is bad), but when the batter swings, there are two outcomes: a swing and miss (which is a for-sure strike) or contact. Intuitively, you could say that contact on pitches outside the zone is not as harmful to pitchers as contact on pitches inside the zone, as the batter should make worse contact. This is partially supported by the lower R^2 for O-Contact% compared to Z-Contact%. It is more harmful for a pitcher to have a batter make contact on a pitch in the zone than on a pitch out of the zone. This is why O-Swing% is important and I will use it.
Using just SwStr% and O-Swing%, I came up with a formula to estimate (with the help of Excel) ERA-. I ran this formula through different samples and different tests, but it just didn’t come up with the results I was looking for. The standard deviation was way too small compared to the other estimators, and the root mean square error was just not good enough for predicting future ERA-.
I did not expect/want this estimator to be more predictive than xFIP or SIERA. This is because xFIP and SIERA have more environmental impacts in them that remain fairly constant. K% is always a better predictor of future K% than any xK% that you can come up with. Same with BB%. Why? Probably because the environment of catcher framing and umpire bias remains somewhat constant. Also (just speculation), pitchers who have good control can throw a pitch well out of the zone when they are ahead in the count, just to try to get the batter to swing or to “set up” a pitch. They would get minus points for this from O-Swing%, depending on how far the pitch is off the plate, but it may not affect their K% or BB% if they come back and still strike out the batter.
So I didn’t expect my statistic to be more predictive, but the standard deviation coupled with not that great of RMSE (was still better than ERA and FIP with a min of 40IP), caused me to be unhappy with my stat.
I then started to think about whether there were any skill-based stats, dependent only on the interaction between batter and pitcher, that FanGraphs does not have readily available. I started thinking about foul balls and wondered if foul ball rates were skill based and if they were related to ERA-. I then calculated the number of foul balls that each pitcher had induced. To find this I subtracted BIP (balls in play or FB+GB+LD+BU+IFFB) from contacts (Contact%*Swing%*Pitches). This gave me the number of fouls. I then calculated the rates of fouls/pitch and fouls/contact and compared these to ERA-. Foul/Contact, or what I’m calling Foul%, had an R^2 of .239. That’s second only to SwStr%. This got me excited, but I needed to know if Foul% is skill based and see what else it correlates with.
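The foul ball calculation described above can be sketched like this. The function name and the pitcher's numbers are hypothetical; the balls-in-play sum simply follows the FB+GB+LD+BU+IFFB definition given in the text.

```python
def foul_pct(pitches, swing_pct, contact_pct, fb, gb, ld, bu, iffb):
    """Foul% = fouls / contacts, with fouls = contacts - balls in play.

    swing_pct and contact_pct are fractions (0.46, not 46).
    """
    contacts = contact_pct * swing_pct * pitches
    balls_in_play = fb + gb + ld + bu + iffb
    fouls = contacts - balls_in_play
    return fouls / contacts


# Hypothetical reliever: 1000 pitches, 46% Swing%, 78% Contact%.
example_foul_pct = foul_pct(
    pitches=1000, swing_pct=0.46, contact_pct=0.78,
    fb=60, gb=80, ld=35, bu=3, iffb=7,
)
print(f"Foul%: {example_foul_pct:.1%}")
```

Because Contact% and Swing% are rounded rates, the contact count is an estimate, so the resulting foul total will not always be a whole number.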
This article from 2008 gave me some insight into Foul%. Foul% correlates well to K% (obviously) and to BB% (negative relationship), since a foul is a strike. Foul% had some correlation to SwStr%, this is good as it means pitchers who are good at getting whiffs are also usually good at getting fouls. Foul% also had some correlation to FB% and GB%. The more fouls you give up, the more fly balls you give up (and less GB). This doesn’t matter however, as GB% and FB% had no correlation to ERA-. Foul% is also fairly repeatable year to year as evidenced in the article, so it is a skill. I will come up with a new estimator that includes Foul% as well.
I decided to use O-Looking% instead of O-Swing%, just to get a value that has a positive relationship to ERA (more O-looking means higher ERA), because SwStr% and O-Swing are negatively related. O-Looking is just the opposite of O-Swing and is calculated as (1 – O-Swing%).
The formula that Excel and I came up with is this: (I am calling the metric TIPS, for True Independent Pitching Skill)
TIPS = 6.5*O-Looking(PitchF/x)% – 9.5*SwStr% – 5.25*Foul% + C
C is a constant that changes from year to year to adjust to the ERA scale (to make an average TIPS = average ERA). For 2013 this constant was 2.68.
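One way such a constant could be fitted is sketched below: compute the formula without C for every pitcher, then set C to the gap between the average ERA and the average raw value. This is only a sketch under assumptions; the three pitchers are made up, and the unweighted mean is my simplification (weighting by innings or pitches would be another reasonable choice).

```python
from statistics import mean


def raw_tips(o_swing, swstr, foul):
    """TIPS formula without the constant; inputs are fractions."""
    return 6.5 * (1.0 - o_swing) - 9.5 * swstr - 5.25 * foul


def fit_constant(pitchers):
    """Choose C so that average TIPS equals average ERA over the sample."""
    raw_values = [raw_tips(p["o_swing"], p["swstr"], p["foul"]) for p in pitchers]
    return mean(p["era"] for p in pitchers) - mean(raw_values)


# Hypothetical sample; not real pitchers or a real season.
sample = [
    {"o_swing": 0.33, "swstr": 0.11, "foul": 0.52, "era": 3.20},
    {"o_swing": 0.29, "swstr": 0.08, "foul": 0.48, "era": 4.40},
    {"o_swing": 0.31, "swstr": 0.09, "foul": 0.50, "era": 3.90},
]

C = fit_constant(sample)
tips_values = [raw_tips(p["o_swing"], p["swstr"], p["foul"]) + C for p in sample]
avg_tips = mean(tips_values)
avg_era = mean(p["era"] for p in sample)
print(f"C = {C:.2f}, average TIPS = {avg_tips:.2f}, average ERA = {avg_era:.2f}")
```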
I converted this to TIPS- to better analyze the statistic. FIP, xFIP, and SIERA were also converted to FIP-, xFIP-, and SIERA-. I took all pitchers’ seasons from 2008-2013 to analyze. The sample varied in IP from 0.1 IP to 253 IP. I found the following season’s ERA- for each pitcher if they pitched more than 20 IP the next year and eliminated any huge outliers. Here were the results with no min IP. RMSE is root mean square error (smaller is better), AVG is the average difference (smaller is better), R^2 is self explanatory (larger is better), and SD is the standard deviation.
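For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of the four summary numbers. It assumes AVG means the mean absolute difference, R^2 means the squared Pearson correlation, and SD is the spread of the estimator's own values; the paired values are hypothetical.

```python
import math
from statistics import mean, pstdev


def rmse(predicted, actual):
    """Root mean square error between an estimator and next-season ERA-."""
    return math.sqrt(mean((p - a) ** 2 for p, a in zip(predicted, actual)))


def avg_abs_diff(predicted, actual):
    """Mean absolute difference (assumed meaning of AVG)."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def r_squared(predicted, actual):
    """Squared Pearson correlation between the two series."""
    mp, ma = mean(predicted), mean(actual)
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_a = sum((a - ma) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


# Hypothetical estimator values this year and ERA- the following year.
estimator_minus = [90, 100, 110, 120]
next_era_minus = [95, 100, 105, 130]

print("RMSE:", round(rmse(estimator_minus, next_era_minus), 2))
print("AVG:", avg_abs_diff(estimator_minus, next_era_minus))
print("R^2:", round(r_squared(estimator_minus, next_era_minus), 3))
print("SD:", round(pstdev(estimator_minus), 2))
```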
Wow, TIPS- beats everyone! But why? Most likely because I have included small samples and TIPS- is calculated per pitch, as opposed to per batter (SIERA) or per inning (xFIP and FIP). There are far more pitches than AB or IP, so TIPS will stabilize very fast. Let’s eliminate small sample sizes and look again.
Now, TIPS is beaten out by xFIP and SIERA, but beats ERA and is close to FIP (wins in RMSE, loses in R^2). This is what I expected, as I explained earlier: K% and BB% are always better at predicting future K% and BB%, and they are included in SIERA and xFIP. SIERA and xFIP use more concrete events (K, BB, GB) than TIPS. I didn’t want to beat these estimators, but instead wanted an estimator that is independent of everything except for the pitcher-batter interaction.
TIPS won when there was no IP limit, so it obviously is the best to use in smaller sample sizes, but when is it better than xFIP and SIERA, and where does it start falling behind? I plotted the RMSE for my entire sample at each IP. Theoretically this should be an inverse relationship. After 150 IP it gets a bit iffy, as most of my sample is less than 100 IP. I’m more interested in IP under 100 anyhow.
Orange is TIPS, Blue is ERA, Red is FIP, Green is xFIP, and Purple is SIERA. If you can’t see xFIP, it’s because it is directly underneath SIERA (they are almost identical). This is roughly what the graph should look like to 100 IP:
Looking at the graph, at what IPs is TIPS better at predicting future ERA than xFIP and SIERA? It appears to be from 0 IP to around 70 IP.
Here is the graph for 1/RMSE (higher R^2). Higher number is better. This is the most accurate graph as the relationship should be inverse.
The 70-80 IP mark is clear here as well.
I’m not suggesting my estimator is better than xFIP or SIERA (it isn’t in samples over 75 IP), but I think it is, and can be, a very powerful tool. Most bullpen pitchers stay under 75 IP in a season. This means that TIPS would be very useful for predicting the future ERA of bullpen arms. I also believe that my estimator is a very good indicator of the raw skill of a pitcher. It would probably be even more predictive if we had robo-umps that eliminated umpire bias, randomness, and pitch framing.
2013 TIPS Leaders with 100+IP
And Leaders from 40IP to 100IP
I decided to determine whether there truly is an effect on the statistics (ERA, WHIP, K%, BB%) of pitchers who follow Dickey in relief and of the starters of the next game against the same team. I went through every game that Dickey has pitched and recorded the stats (IP, TBF, H, ER, BB, K) of each reliever individually and the stats of the next starting pitcher if the next game was against the same team. I did this for each season. I then took each pitcher’s stats for the whole year and subtracted his following-Dickey stats from them to get his stats when he did not follow Dickey. I summed the stats for following Dickey and weighted each pitcher by the batters he faced as a share of the total batters faced after Dickey. I then calculated the rate stats from the total. This weight was then applied to the not-after-Dickey stats. So for example, if Francisco faced 19.11% of batters after Dickey, his line was adjusted so that he also faced 19.11% of the batters not after Dickey. This keeps the mix of pitchers the same in both samples, so the comparison is fair. The not-after-Dickey stats were then summed and the rate stats were calculated as well. The two rate stats, after Dickey and not after Dickey, were compared using this formula: (afterDickeySTAT-notafterDickeySTAT)/notafterDickeySTAT. This tells me, as a percentage, how much better or worse relievers or starters did when following Dickey.
I then added up the after-Dickey stats and the not-after-Dickey stats for starters and relievers across all three years and applied the same weighting technique, so that if Niese’12 faced 10.9% of all batters faced by starters following a Dickey start against the same team, his line was adjusted so that he faced 10.9% of the batters faced by starters not after Dickey (counting only the starters that pitched after Dickey that season). The rest of the calculation was the same as in the year-by-year analysis, and a total % change for each stat was calculated.
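The weighting can be sketched in Python like this. It is a minimal version for a single per-batter rate stat such as K%. The data layout, function name, and numbers are all invented for illustration and are not taken from my spreadsheet.

```python
def rate_change(pitchers, stat):
    """Percent change in a per-batter rate: (after - not_after) / not_after."""
    total_after_tbf = sum(p["after"]["tbf"] for p in pitchers)
    after_rate = sum(p["after"][stat] for p in pitchers) / total_after_tbf

    not_after_stat = 0.0
    not_after_tbf = 0.0
    for p in pitchers:
        # Share of all after-Dickey batters that this pitcher faced.
        weight = p["after"]["tbf"] / total_after_tbf
        # Not-after-Dickey line = season line minus after-Dickey line.
        tbf_not = p["season"]["tbf"] - p["after"]["tbf"]
        stat_not = p["season"][stat] - p["after"][stat]
        # Rescale the line so the pitcher holds the same share here too.
        scale = weight / tbf_not
        not_after_stat += stat_not * scale
        not_after_tbf += tbf_not * scale
    not_after_rate = not_after_stat / not_after_tbf
    return (after_rate - not_after_rate) / not_after_rate


# Two invented relievers: A faced 20% of after-Dickey batters, B faced 80%.
relievers = [
    {"after": {"tbf": 20, "k": 6}, "season": {"tbf": 220, "k": 56}},
    {"after": {"tbf": 80, "k": 16}, "season": {"tbf": 180, "k": 36}},
]
change = rate_change(relievers, "k")
print(f"{change:+.2%}")
```

Without the reweighting, reliever A would count for two thirds of the not-after-Dickey sample (200 of 300 batters) despite facing only 20% of the batters after Dickey, which is the distortion the weights are meant to remove.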
Here is the weighted year by year breakdown of the starters’ statistics following Dickey and a total (- indicates a decrease which is desired for all stats except K%):
ERA: -46.94% with 5/5 starters seeing a decrease
WHIP: -16.16% with 4/5 seeing a decrease
K%: 47.04% with 4/5 seeing an increase
BB%: 6.50% with 3/5 seeing a decrease
HR%: -50.53% with 5/5 seeing a decrease
BABIP: -14.08% with 4/5 seeing a decrease
FIP: -25.17% with 5/5 seeing a decrease
ERA: 17.92% with 0/3 seeing a decrease
WHIP: -9.63% with 2/3 seeing a decrease
K%: -2.64% with 2/3 seeing an increase
BB%: -15.94% with 2/3 seeing a decrease
HR%: -9.21% with 2/3 seeing a decrease
BABIP: -15.14% with 2/3 seeing a decrease
FIP: -5.58% with 2/3 seeing a decrease
ERA: -23.82% with 5/7 seeing a decrease
WHIP: 1.68% with 5/7 seeing a decrease
K%: -22.91% with 1/7 seeing an increase
BB%: -2.34% with 5/7 seeing a decrease
HR%: -43.61% with 5/7 seeing a decrease
BABIP: -3.61% with 4/7 seeing a decrease
FIP: -10.61% with 5/7 seeing a decrease
ERA: -17.21% with 10/15 seeing a decrease
WHIP: -8.10% with 11/15 seeing a decrease
K%: -3.38% with 7/15 seeing an increase
BB%: -5.17% with 10/15 seeing a decrease
HR%: -32.96% with 12/15 seeing a decrease
BABIP: -11.04% with 10/15 seeing a decrease
FIP: -13.34% with 12/15 seeing a decrease
So for starters that pitch in games following Dickey against the same team, it can be concluded that there is an effect on ERA, WHIP, BABIP, and FIP and a slight effect on BB% and on K%. There is also a large effect on HR rates which we can attribute the ERA effect to. This also tells us that batters are making worse contact the day after Dickey.
So a starter (like Morrow) who follows Dickey against the same team can expect to see around a 17.2% reduction in his ERA that game compared to if he was not following Dickey against the same opponent. For example if Morrow had a 3.00 ERA in games not after Dickey he can expect a 2.48 ERA in games after Dickey.
So in a full season where Morrow follows Dickey against the same team 66% of the time (games 2 and 3 of a series) and would normally have a 3.00 ERA without Dickey ahead of him, he could expect a 2.66 ERA for the season. This seems to be a significant improvement and would equate to a 7.6 run difference (or 0.8 WAR) over 200 innings.
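The ERA arithmetic in the last two paragraphs can be reproduced with a short Python sketch. The function name is hypothetical, and the share of starts following Dickey is entered as 2/3 (games 2 and 3 of a three-game series), which the text rounds to 66%.

```python
def expected_era(base_era, effect=-0.1721, share_following=1.0):
    """Scale a baseline ERA by the Dickey effect.

    effect: fractional ERA change when following Dickey (-17.21% for starters).
    share_following: fraction of the pitcher's games that follow Dickey.
    """
    return base_era * (1.0 + effect * share_following)


single_game = expected_era(3.00)                       # one game right after Dickey
season = expected_era(3.00, share_following=2 / 3)     # blended over a season
print(round(single_game, 2), round(season, 2))
```

With a 3.00 baseline this gives about 2.48 for a single game after Dickey and about 2.66 over the season, matching the figures above.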
Here is a year by year breakdown of relievers after Dickey (these are smaller sample sizes so I will not include how many relievers saw an increase or decrease):
As expected there was a good effect on the relievers’ ERA, FIP, K%, and BB%, but the WHIP and BABIP were affected negatively. This tells me that the batters were more free swinging after just seeing Dickey (more hits, fewer walks, more strikeouts).
So in a season where there are 55 IP after Dickey in games (like in 2012) there would be a 16.6% reduction in runs given up in those 55 innings. If the bullpen’s ERA is 4.20 without Dickey it can be expected to be 3.50 after Dickey. Over 55 IP this difference would save 4.3 runs (or 0.4 WAR).
Combine this with the saved starter runs and you get 11.9 runs saved (or 1.2 WAR). This is Dickey’s underlying value to the team, which he creates by baffling hitters. This 1.2 WAR is if Morrow has a 3.00 ERA normally and the bullpen has a 4.20 ERA. If Morrow normally had a 4.00 ERA, then his ERA would drop to 3.54 over the season with 10.2 runs saved over 200 innings (1.0 WAR), and if the bullpen has a 4.00 ERA normally as well, 4.1 runs would be saved there, for a total of 14.3 runs saved or 1.4 WAR over a season.
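To make the runs-saved totals easier to check, here is a small Python sketch. It assumes roughly 10 runs per win for the WAR conversion (an assumption, but it is consistent with 7.6 runs being about 0.8 WAR) and uses the 4.20 bullpen ERA from the 55 IP example. The function name is hypothetical.

```python
RUNS_PER_WIN = 10.0  # assumed conversion from runs saved to WAR


def runs_saved(base_era, reduction, innings):
    """Runs saved when ERA drops by `reduction` (a fraction) over `innings`."""
    return base_era * reduction * innings / 9.0


# Starter: 17.21% effect applied to 2/3 of his starts, over 200 IP.
starter = runs_saved(3.00, 0.1721 * 2 / 3, 200)
# Bullpen: 16.6% effect over the 55 IP thrown after Dickey.
bullpen = runs_saved(4.20, 0.166, 55)
total = starter + bullpen
print(round(starter, 1), round(bullpen, 1), round(total, 1),
      round(total / RUNS_PER_WIN, 1))
```

Swapping in a 4.00 ERA for both the starter and the bullpen gives the second scenario: about 10.2 and 4.1 runs, for 14.3 in total.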