Linearization and Fantasy Baseball

November 1, 2016

Among the astounding phenomena abundant throughout calculus, linearization remains one of the least glamorous. It’s incredibly simple, taught in less than a day, and a more precise (and more complicated) method can often be substituted for it. On the other hand, it’s an incredibly powerful tool and one with weighty implications for fantasy baseball. Because of the concept’s relative simplicity, a reader with even the most basic inkling of what calculus actually is should be able to understand the idea of it, so don’t let a fear of mathematics deter you.

First, let’s think about graphs, functions, and derivatives. Put simply, smooth functions, whether they’re linear, quadratic, or exponential, have some rate of change at each point: slope. Think of it as the change in the y direction per unit change in the x direction. Measured between two points, it is the slope of a secant line, or the average rate of change between those points. More interesting, however, is the tangent line, whose slope is the instantaneous rate of change at a given point. Note that the tangent line is defined by one point rather than two, meaning that we can analyze the rate of change at a single spot on the curve without having to compare two points. Importantly, the sign of the tangent line’s slope tells us whether the function is increasing or decreasing, and its magnitude tells us how quickly. So a large positive slope means the function is climbing fast (think of an exponential function far to the right), while a large negative slope means it is falling fast (the far side of a downward-opening parabola).

In calculus, the formula for linearization is:

L(x) = f(a) + f'(a)(x – a)

Here, given some value of a, we get a y-value, f(a). To that we add the product of f'(a), the derivative of f evaluated at a, and the difference between the point we are estimating at, x, and the point we already have, a. The result is the linear approximation, which gives a pretty good estimate as long as x is close to a.

When rendered down to its most basic essence, linearization is a glorified form of estimation that gives credence to gut instinct through a formula. Using the tangent line at a certain point, one can estimate nearby values of the function, but it’s important to note that the step away from that point must be very small. The farther from the initial point a that one travels to find an approximation of y, the less accurate the result will be.
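To make the formula concrete, here is a minimal Python sketch of the standard tangent-line approximation, L(x) = f(a) + f'(a)(x – a). The choice of f(x) = sqrt(x) and a = 4 is purely illustrative and has nothing to do with baseball, and the helper name linearize is made up for this example. It simply shows the estimate getting worse as x moves away from a.

```python
import math

def linearize(f, df, a):
    """Return the tangent-line approximation L(x) = f(a) + f'(a)(x - a)."""
    fa, slope = f(a), df(a)
    return lambda x: fa + slope * (x - a)

# Illustrative function (an assumption for this sketch): f(x) = sqrt(x),
# whose derivative is 1 / (2 * sqrt(x)). We linearize at a = 4.
f = math.sqrt
df = lambda x: 1 / (2 * math.sqrt(x))
L = linearize(f, df, 4.0)

# Compare the estimate to the true value at points farther and farther from a.
errors = []
for x in (4.1, 5.0, 9.0):
    approx, exact = L(x), f(x)
    errors.append(abs(approx - exact))
    print(f"x={x}: L(x)={approx:.4f}, f(x)={exact:.4f}, error={errors[-1]:.4f}")
```

Close to a = 4 the tangent line is nearly indistinguishable from the curve; by x = 9 the estimate is off by a quarter.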

It seems that this would have little application to baseball, but that’s incorrect. Recently, I started toying with a couple of formulas that could actually have some importance in the realm of amateur fantasy baseball, using a regression line fit to a player’s entire career in pretty much any statistic.

L(x) = f(k) + f'(a)(x – a)

Here, f(k) is the actual value at the known point (k), f'(a) is the slope of the regression line at a, x is the point for which we are predicting the value, and a is the point we start from.

L(x) = f(a) + f'(a)(x – a)

The difference here is that f(a) is the value predicted by the regression line at a. As before, f'(a) is the slope of the regression line at a, x is the point for which we are predicting the value, and a is the point we start from.

I don’t know which would work best, but my guess is that the first formula would be most accurate due to its mix of actual and predicted values. Neither of them would be terribly precise, but either is a heck of a lot better than relying on what you feel might be best.
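For anyone who wants to tinker, here is a rough Python sketch of how the two formulas might be compared. It rests on several assumptions that are mine, not part of the formulas themselves: the regression is a plain straight-line least-squares fit (so its slope is the same everywhere), the known point k is taken to be the same as the starting point a, the slope term is added as in the standard tangent-line formula, and the numbers are made up purely for illustration. The function names fit_line and linear_estimate are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def linear_estimate(start_value, slope, x, a):
    """Tangent-line style estimate: start_value + slope * (x - a)."""
    return start_value + slope * (x - a)

# MADE-UP data for illustration only (not a real player):
# x = career games / 100, y = batting average over the preceding 10 games.
xs = [0.7, 0.8, 0.9, 1.0, 1.1]
ys = [0.250, 0.270, 0.240, 0.280, 0.264]

slope, intercept = fit_line(xs, ys)
a, x = 1.1, 1.2                       # start at game 110, predict through game 120
actual_at_a = ys[-1]                  # what the player really did
predicted_at_a = slope * a + intercept  # what the regression line says at a

est_1 = linear_estimate(actual_at_a, slope, x, a)     # formula 1
est_2 = linear_estimate(predicted_at_a, slope, x, a)  # formula 2
print(f"formula 1: {est_1:.3f}, formula 2: {est_2:.3f}")
```

Written this way, the two formulas differ only in their starting value, so the two estimates are always separated by exactly the gap between the player’s actual number and the regression line’s number at a. With a straight-line fit, formula 2 is just the regression line extended to x.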

Regardless of which formula you might prefer, the implications of the linearization idea as applied to fantasy baseball are apparent. Probably best used for 10-day predictions, linearization mixes short-term performance with long-term talent to assess how well a player might perform for a short period of time: whether he’s likely to continue streaking, slumping, or somewhere in between. Rather than having to rely on gut instinct or dated and/or biased statistical analysis, a fantasy player could rely on some concrete math to make short-term decisions. This would be especially helpful in leagues that play for only a month or that allow roster changes only once a week, or even at the end of a highly competitive season (perhaps making the risky move of dropping a slumping MVP for the streaking rookie).

It’s understandable if it’s unclear how to use one of the formulas at this point. To simplify matters, let’s use formula 1 to demonstrate how this might work for something as simple as batting average. What you might have is a regression line through a player’s rolling 10-game batting averages, plotted along with the actual values. In this case, the x-values are career games played divided by 100, so each game is a step of 0.01 (the scale is arbitrary). So 1.1 is the x-value at 110 games played, while 1.2 is the x-value at 120 games. Let’s just say for simplicity that the player has played 110 games in his career, had an actual average of .264 during the last 10-game stretch, and the derivative of the regression line at this point is 0.12. We want to guess his average for the next 10 games, up to career game number 120.

L(1.2) = .264 + (0.12)(1.2 – 1.1)

L(1.2) = .276

We’d expect him to hit .276 over the next 10 games; the regression line is rising at this point, so the estimate lands a little above his recent .264. Hopefully that makes some sense. Obviously it’s still in development and I haven’t done a whole lot of research yet, but expect some to come out later, along with some clarifying material if necessary. Confusion is to be expected, but with some explanation, applied linearization could potentially help a lot of people out next season in fantasy.
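For anyone who would rather check the batting-average example in code than by hand, here is a tiny Python sketch. It assumes the standard tangent-line form, in which the slope term is added, and the same scale as the example (x = career games / 100). The helper names games_to_x and linear_estimate are made up for illustration.

```python
def games_to_x(games):
    """Convert career games played to the x scale used in the example."""
    return games / 100

def linear_estimate(start_value, slope, x, a):
    """Estimate the value at x from a known value and slope at a."""
    return start_value + slope * (x - a)

# Numbers from the example: .264 over the last 10 games, slope 0.12,
# 110 career games played, predicting through game 120.
estimate = linear_estimate(
    start_value=0.264,
    slope=0.12,
    x=games_to_x(120),
    a=games_to_x(110),
)
print(round(estimate, 3))  # 0.276
```

Swapping in a different slope or a different number of games ahead is a one-line change, though the farther ahead you look, the less the estimate should be trusted.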
