## Estimating Team Wins With Innings Pitched

Throughout the baseball season, I like to estimate team wins, but I don’t do it in the traditional way. Some time ago, I discovered that I could use innings pitched to get a close estimate. Here’s what I do:

1) Take team games played and divide by 2;

2) Take the team’s innings pitched and subtract its opponents’ innings pitched;

3) Add the two results together.

For example, the Washington Nationals, as of the All-Star break, have played 88 games. They have 789.33 IP, and their opponents have 781.33 IP. So I take 88 divided by 2, which gives me 44. Then I take 789.33 minus 781.33, which gives me 8. Then 44 plus 8 gives me an estimate of 52 team wins. Checking the standings, I see that Washington indeed has 52 wins.
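For anyone who would rather script the arithmetic, here is a minimal Python sketch of the steps above. The function name `estimate_wins_from_ip` is my own label, and the sketch assumes innings pitched are entered with decimal thirds (.33 and .67), the way they appear in this post, rather than in box-score notation (.1 and .2).

```python
def estimate_wins_from_ip(games_played, innings_pitched, opp_innings_pitched):
    """Estimate wins as half the games played plus the innings-pitched margin."""
    half_of_games = games_played / 2
    ip_margin = innings_pitched - opp_innings_pitched
    return half_of_games + ip_margin


# Washington at the All-Star break: 88 games, 789.33 IP, 781.33 opponent IP.
nationals_estimate = estimate_wins_from_ip(88, 789.33, 781.33)
print(round(nationals_estimate, 2))  # 52.0
```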

How does my method compare with the traditional Pythagorean? (The Pythagorean method, of course, takes runs scored squared and divides by runs scored squared plus runs allowed squared.) I’ve set up some charts to demonstrate. First, let me present the relevant statistics for all teams as of the All-Star break (all statistics courtesy CBS Sportsline):

| Team | GP | IP | IPA | R | RA |
|---|---|---|---|---|---|
| Arizona | 89 | 797 | 787 | 446 | 344 |
| Atlanta | 87 | 783 | 787.67 | 405 | 449 |
| Baltimore | 88 | 782.67 | 790.67 | 392 | 470 |
| Boston | 89 | 794.67 | 795 | 431 | 366 |
| Chi. Cubs | 88 | 785 | 787 | 399 | 399 |
| Chi. White Sox | 87 | 760.33 | 771.33 | 397 | 429 |
| Cincinnati | 88 | 781.67 | 786.67 | 424 | 463 |
| Cleveland | 87 | 768.67 | 763.67 | 421 | 347 |
| Colorado | 91 | 812.33 | 806.67 | 461 | 419 |
| Detroit | 87 | 762.67 | 766.67 | 409 | 440 |
| Houston | 89 | 800 | 784.33 | 527 | 365 |
| Kansas City | 87 | 775.33 | 775.67 | 362 | 387 |
| L.A. Angels | 92 | 817 | 824.33 | 377 | 399 |
| L.A. Dodgers | 90 | 806.33 | 786.67 | 463 | 300 |
| Miami | 87 | 771.67 | 777 | 410 | 429 |
| Milwaukee | 91 | 818.67 | 809.33 | 451 | 406 |
| Minnesota | 88 | 785.67 | 781 | 403 | 463 |
| N.Y. Mets | 86 | 773 | 775 | 406 | 455 |
| N.Y. Yankees | 86 | 768 | 765.33 | 477 | 379 |
| Oakland | 89 | 784 | 790.67 | 382 | 470 |
| Philadelphia | 87 | 775 | 790.33 | 332 | 424 |
| Pittsburgh | 89 | 800.67 | 802 | 378 | 403 |
| San Diego | 88 | 776.33 | 781 | 312 | 440 |
| San Francisco | 90 | 813.33 | 827.33 | 431 | 435 |
| Seattle | 90 | 800 | 797.67 | 354 | 453 |
| St. Louis | 88 | 798 | 793 | 402 | 389 |
| Tampa Bay | 90 | 805 | 802.33 | 428 | 412 |
| Texas | 88 | 783.67 | 783 | 444 | 415 |
| Toronto | 88 | 789 | 788.33 | 366 | 430 |
| Washington | 88 | 789.33 | 781.33 | 486 | 396 |

Now let me present a chart showing how many team wins are predicted by my method and by the Pythagorean method (for the Pythagorean method, I’m using 1.82 as my exponent, as shown by MLB on their Standings page):

| Team | EST W (IP) | EST W (R) | Actual W |
|---|---|---|---|
| Arizona | 54.50 | 54.82 | 53 |
| Atlanta | 38.83 | 39.43 | 42 |
| Baltimore | 36.00 | 36.80 | 42 |
| Boston | 44.17 | 51.07 | 50 |
| Chi. Cubs | 42.00 | 44.00 | 43 |
| Chi. White Sox | 32.50 | 40.44 | 38 |
| Cincinnati | 39.00 | 40.48 | 39 |
| Cleveland | 48.50 | 51.07 | 47 |
| Colorado | 51.16 | 49.45 | 52 |
| Detroit | 39.50 | 40.61 | 39 |
| Houston | 60.17 | 58.84 | 60 |
| Kansas City | 43.16 | 40.86 | 44 |
| L.A. Angels | 38.67 | 43.63 | 45 |
| L.A. Dodgers | 64.66 | 61.90 | 61 |
| Miami | 38.17 | 41.71 | 41 |
| Milwaukee | 54.84 | 49.84 | 50 |
| Minnesota | 48.67 | 38.47 | 45 |
| N.Y. Mets | 41.00 | 38.56 | 39 |
| N.Y. Yankees | 45.67 | 51.87 | 45 |
| Oakland | 37.83 | 36.20 | 39 |
| Philadelphia | 28.17 | 33.97 | 29 |
| Pittsburgh | 43.17 | 41.91 | 42 |
| San Diego | 39.33 | 30.67 | 38 |
| San Francisco | 31.00 | 44.62 | 34 |
| Seattle | 47.33 | 35.07 | 43 |
| St. Louis | 49.00 | 45.32 | 43 |
| Tampa Bay | 47.67 | 46.56 | 47 |
| Texas | 44.67 | 46.70 | 43 |
| Toronto | 44.67 | 37.59 | 41 |
| Washington | 52.00 | 52.11 | 52 |

My method appears in the second column, and the Pythagorean method appears in the third column, with actual team wins in the last column. My method, as shown above, gives estimated wins directly. The Pythagorean method actually computes winning percentage. To get the estimated wins for the Pythagorean method, I multiplied the team’s estimated winning percentage by the team’s games played.
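Here is a rough Python sketch of the Pythagorean side of that chart, using the 1.82 exponent and the multiply-by-games-played step described above. The function name `pythagorean_wins` and its parameter names are my own, chosen just for illustration.

```python
def pythagorean_wins(runs_scored, runs_allowed, games_played, exponent=1.82):
    """Pythagorean winning percentage, converted to wins via games played."""
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    win_pct = scored_term / (scored_term + allowed_term)
    return win_pct * games_played


# Cubs: runs scored equal runs allowed, so the estimate is half of 88 games.
print(round(pythagorean_wins(399, 399, 88), 2))  # 44.0

# Washington: 486 runs scored, 396 allowed, 88 games (the chart lists 52.11).
print(round(pythagorean_wins(486, 396, 88), 2))
```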

The methods are pretty close! On a couple of teams, though, the methods miss by a wide margin. I’m way off on the Angels, for example, while Pythagoras is off on the Giants. But which of these methods is closer overall? I did an r-squared between each of the estimated win columns and the actual wins and got these results:

| RSQ (IP) | RSQ (R) |
|---|---|
| 0.8497 | 0.7147 |
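If you want to check an r-squared figure like these yourself, here is a small Python sketch. It assumes r-squared means the squared Pearson correlation between an estimate column and the actual wins (which is what a spreadsheet RSQ function returns); the function name `r_squared` is just a label I picked.

```python
def r_squared(estimates, actuals):
    """Squared Pearson correlation between two equal-length lists of numbers."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    mean_act = sum(actuals) / n
    covariance = sum((e - mean_est) * (a - mean_act)
                     for e, a in zip(estimates, actuals))
    spread_est = sum((e - mean_est) ** 2 for e in estimates)
    spread_act = sum((a - mean_act) ** 2 for a in actuals)
    return covariance ** 2 / (spread_est * spread_act)


# Usage: pass one full estimate column and the Actual W column.
# Shown here with only the first three teams (Arizona, Atlanta, Baltimore).
print(r_squared([54.50, 38.83, 36.00], [53, 42, 42]))
```

One thing to keep in mind: r-squared only measures how well two columns move together in a straight line, so a method could score well here while still missing every team by the same few wins. That is one reason a second check is worth running.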

Mine’s a little higher, but let’s use mean squared error (MSE) as a cross-check. Here are my numbers:

| Team | MSE (IP) | MSE (R) |
|---|---|---|
| Arizona | 2.25 | 3.33 |
| Atlanta | 10.05 | 6.61 |
| Baltimore | 36.00 | 27.05 |
| Boston | 33.99 | 1.15 |
| Chi. Cubs | 1.00 | 1.00 |
| Chi. White Sox | 30.25 | 5.94 |
| Cincinnati | 0.00 | 2.20 |
| Cleveland | 2.25 | 16.60 |
| Colorado | 0.71 | 6.53 |
| Detroit | 0.25 | 2.60 |
| Houston | 0.03 | 1.34 |
| Kansas City | 0.71 | 9.86 |
| L.A. Angels | 40.07 | 1.88 |
| L.A. Dodgers | 13.40 | 0.81 |
| Miami | 8.01 | 0.50 |
| Milwaukee | 23.43 | 0.03 |
| Minnesota | 13.47 | 42.61 |
| N.Y. Mets | 4.00 | 0.20 |
| N.Y. Yankees | 0.45 | 47.20 |
| Oakland | 1.37 | 7.82 |
| Philadelphia | 0.69 | 24.74 |
| Pittsburgh | 1.37 | 0.01 |
| San Diego | 1.77 | 53.77 |
| San Francisco | 9.00 | 112.82 |
| Seattle | 18.75 | 62.92 |
| St. Louis | 36.00 | 5.36 |
| Tampa Bay | 0.45 | 0.19 |
| Texas | 2.79 | 13.70 |
| Toronto | 13.47 | 11.61 |
| Washington | 0.00 | 0.01 |
| AVG | 10.20 | 15.68 |
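For reference, each team's figure above is the squared gap between the estimate and the actual wins, and the AVG row is the mean of those squares. A minimal Python sketch of that calculation follows; the function name `mean_squared_error` is mine.

```python
def mean_squared_error(estimates, actuals):
    """Average of the squared differences between estimates and actual values."""
    squared_errors = [(e - a) ** 2 for e, a in zip(estimates, actuals)]
    return sum(squared_errors) / len(squared_errors)


# Arizona alone: estimated 54.50 wins by innings pitched, 53 actual wins.
print(mean_squared_error([54.50], [53]))  # 2.25

# Usage with more teams: pass the whole estimate column and the Actual W column.
# Shown here with Arizona, Atlanta, and Baltimore only.
print(mean_squared_error([54.50, 38.83, 36.00], [53, 42, 42]))
```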

I’m not a numbers person, so if I’ve made errors in my calculations, please let me know, and I will never, ever trouble you fine readers again with another post. But I’ve published previous studies of both methods (in other places, under other names) and have found each time that my method edges out the Pythagorean in both r-squared and MSE.

If my method works at all, it’s because better teams typically have to get more outs to finish off their opponents. If the Dodgers, say, are at home against the Phillies, chances are they’re already winning when they go to the bottom of the ninth, and so the Dodgers don’t have to come to bat. That means the Dodgers had to get 27 outs and the Phillies had to get only 24. Conversely, on the road, if the Dodgers are leading the Phillies, the Phillies have to come to bat in the bottom of the ninth, and the Dodgers have to get the full 27 outs to end the game.
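To see why that logic lands so close to actual wins, here is a simplified Python sketch. It rests on assumptions of my own: every game goes exactly nine innings, there are no walk-offs, and the team has played the same number of home and road games. Under those assumptions a home win is worth one extra inning pitched (9 against the visitors' 8), a road loss costs one (8 against the home team's 9), and home losses and road wins are a wash. The home/road split below is made up for illustration; it is not Washington's real record.

```python
def ip_margin(home_wins, road_losses):
    """Innings pitched minus opponents' innings pitched, under the
    simplifying assumptions: nine-inning games only and no walk-offs."""
    # Home win: we pitch 9 innings, the visitors pitch 8 (margin +1).
    # Road loss: we pitch 8 innings, the home team pitches 9 (margin -1).
    # Home losses and road wins: both teams pitch 9 (margin 0).
    return home_wins - road_losses


# Hypothetical split: 30-14 at home, 22-22 on the road (44 games each).
home_wins, home_losses = 30, 14
road_wins, road_losses = 22, 22

games_played = home_wins + home_losses + road_wins + road_losses
estimate = games_played / 2 + ip_margin(home_wins, road_losses)
print(estimate, home_wins + road_wins)  # 52.0 52
```

With an even home/road split, home wins minus road losses works out to total wins minus half the games played, which is why adding the innings margin back to half the games recovers the win total. Extra innings, walk-offs, and uneven schedules are what keep the real-world estimates from matching exactly.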

One caveat: my method tends to be more descriptive than predictive, so it’s a better measure of how a team has performed, not a good predictor of how a team will perform in the future. The Pythagorean method is much better as a predictive tool.

So there it is! My estimated team wins method. I hope you find it useful.