Solved – How to interpret the coefficients of a log-linear regression with quadratic terms

I have a regression equation of this kind:

\$\$\log y = a + bx + cx^2 + \epsilon\$\$

where \$a\$ is the intercept, \$b\$ and \$c\$ are the coefficients of \$x\$ and \$x^2,\$ and \$\epsilon\$ is the error. How do I interpret the impact of the variable \$x\$? I am fairly sure that I should not interpret \$x\$ and \$x^2\$ separately, but I can't figure out what their combined impact on \$y\$ is!


By "impact" of \$x\$ I understand you want to estimate the change in the predicted value when \$x\$ changes by some (small) amount \$\delta x.\$ This is a simple calculation beginning with the fitted model

\$\$\log(\hat y(x)) = \hat a + \hat b x + \hat c x^2\$\$

where the "hats" on the terms designate estimated values. Plugging in \$x+\delta x\$ for the changed value of \$x\$ and subtracting the original value of \$\log\hat y\$ gives

\$\$\log(\hat y(x+\delta x)) - \log(\hat y(x)) = \hat b\, \delta x + \hat c\, (2x\, \delta x + (\delta x)^2).\$\$
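
Spelling out the intermediate algebra as a sketch: the intercept cancels, the linear term contributes \$\hat b\,\delta x,\$ and the difference of squares expands, so that

\$\$\hat a + \hat b(x+\delta x) + \hat c(x+\delta x)^2 - \left(\hat a + \hat b x + \hat c x^2\right) = \hat b\,\delta x + \hat c\left((x+\delta x)^2 - x^2\right) = \hat b\,\delta x + \hat c\left(2x\,\delta x + (\delta x)^2\right).\$\$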

Provided \$\hat c(\delta x)^2\$ is of negligible size compared to the remaining terms on the right hand side, that is, when

\$\$\left|\hat c\, \delta x\right| \ll \left|\hat b + 2 \hat c\, x\right|,\$\$

we may neglect it for these interpretive purposes and write

\$\$\log\left(\frac{\hat y(x+\delta x)}{\hat y(x)}\right) = \log(\hat y(x+\delta x)) - \log(\hat y(x)) \approx \left(\hat b + 2 \hat c x\right) \delta x .\$\$
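
To see how little is lost by dropping the squared term, here is a minimal numerical sketch. The coefficient values (`a_hat`, `b_hat`, `c_hat`) and the starting point are hypothetical, chosen only for illustration; they do not come from any fitted model in this post.

```python
# Hypothetical fitted coefficients, chosen only for illustration.
a_hat, b_hat, c_hat = 1.0, 0.30, -0.02


def log_y_hat(x):
    """Fitted value of log(y) at x."""
    return a_hat + b_hat * x + c_hat * x ** 2


def exact_log_change(x, dx):
    """Exact change in the fitted log response when x moves to x + dx."""
    return log_y_hat(x + dx) - log_y_hat(x)


def approx_log_change(x, dx):
    """First-order approximation (b + 2 c x) dx, omitting the c (dx)^2 term."""
    return (b_hat + 2 * c_hat * x) * dx


x, dx = 5.0, 0.1
exact = exact_log_change(x, dx)
approx = approx_log_change(x, dx)
dropped = c_hat * dx ** 2  # the term neglected by the approximation

print("exact change in log y-hat :", exact)
print("approximate change        :", approx)
print("neglected term c (dx)^2   :", dropped)
# The condition for neglecting it: |c dx| much smaller than |b + 2 c x|.
print("|c dx| vs |b + 2 c x|     :", abs(c_hat * dx), abs(b_hat + 2 * c_hat * x))
```

With these made-up numbers the neglected term is about two percent of the approximate change, which is the sense in which the condition above says it is negligible.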

On the left is the logarithm of the ratio of the new predicted response to the old one, \$\hat y(x+\delta x)/\hat y(x).\$ For small relative changes, this (natural) logarithm is very close to 1/100th of the percentage change. For instance, when the log is 0.15, the relative change is close to a 15% increase (the exact figure is about 16.2%). (For many purposes this rule of thumb holds for percentage changes between \$\pm 20\%,\$ roughly.)
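
The rule of thumb can be checked directly. The short sketch below compares the log of the ratio, \$\log(1+p),\$ with the relative change \$p\$ itself for a few values within \$\pm 20\%\$; the particular grid of percentages is an arbitrary choice for illustration.

```python
import math

# Gap between log(1 + p) and p for relative changes p within +/-20%.
gaps = {}
for pct in (-20, -10, -5, 5, 10, 20):
    p = pct / 100
    log_ratio = math.log(1 + p)
    gaps[pct] = log_ratio - p
    print(f"{pct:+4d}%  log ratio = {log_ratio:+.4f}  gap = {gaps[pct]:+.4f}")
```

The gap is always negative (the log understates increases and overstates decreases) and grows toward the ends of the range, which is why the rule is only a rough guide near \$\pm 20\%.\$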

On the right is a multiple of the change \$\delta x\$ induced in the regressor. That multiple is \$\hat b + 2\hat c x.\$ Of note is that it depends on the value of \$x\$ you started with. In other words, the change in the response depends on what the regressor value is: it is not constant.

Another way to restate this interpretation is to exponentiate both sides, which expresses the response on its original (rather than log) scale, yielding

\$\$\hat y(x+\delta x) \approx \hat y(x)\exp\left(\left(\hat b + 2 \hat c x\right) \delta x\right) \approx \hat y(x)\left(1 + \left(\hat b + 2 \hat c x\right) \delta x\right).\$\$

The new value, on the left hand side, is expressed as a change of the old value by approximately \$100\% \times \left(\hat b + 2 \hat c x\right) \delta x.\$
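
As an illustration of how this percentage depends on the starting value of \$x,\$ the sketch below evaluates it at several points and compares it with the exact percentage change implied by the model. The coefficients `b_hat` and `c_hat` are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical fitted coefficients, chosen only for illustration.
b_hat, c_hat = 0.30, -0.02


def approx_pct_change(x, dx):
    """Approximate percentage change in y-hat: 100% * (b + 2 c x) * dx."""
    return 100 * (b_hat + 2 * c_hat * x) * dx


def exact_pct_change(x, dx):
    """Exact percentage change in y-hat implied by the fitted model."""
    log_change = b_hat * dx + c_hat * (2 * x * dx + dx ** 2)
    return 100 * (math.exp(log_change) - 1)


dx = 0.1
for x in (0.0, 5.0, 7.5, 10.0):
    print(f"x = {x:4.1f}  approx = {approx_pct_change(x, dx):+.3f}%  "
          f"exact = {exact_pct_change(x, dx):+.3f}%")
```

With these made-up coefficients the same increment in \$x\$ raises the prediction at small \$x,\$ has essentially no first-order effect near \$x = -\hat b/(2\hat c) = 7.5,\$ and lowers it beyond that point.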

Although this might seem a little complicated and not easy to remember, please note that all the calculations involved are simple: they are just some multiplications and additions. Those familiar with differential calculus can read them directly off the original model equation with only the simplest mental arithmetic, because (taking differentials) it is immediate that

\$\$\frac{y^\prime (x)}{y(x)}\, dx = \frac{d}{dx} \log(y(x))\, dx = (b + 2cx)\, dx\$\$

and all you have to do is "put hats on" all the estimates and, as usual, interpret \$dx\$ as a (sufficiently) small increment in \$x.\$
