# Odds ratio in logistic regression with multiple predictors

It seems to be commonly accepted that $e^\beta$ corresponds to the odds ratio (OR) in logistic regression. I understand that this holds in the univariate case, i.e.

$$
OR = \frac{\text{odds}(F(x+1))}{\text{odds}(F(x))} = \frac{\frac{F(x+1)}{1 - F(x+1)}}{\frac{F(x)}{1 - F(x)}} = \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1}
$$

This is less clear in the multiple-predictor case. What would be the proof for this? What probabilities are being compared? I have also noticed that the OR changes as predictors are added; the explanation given is that it then represents the OR with the other variables taken into account. How does that work? Is it really still the OR once it changes?


I think the relationship is clearer with a different expression of the model. Consider the logistic regression model as it is expressed on p. 225 of this excerpt:

Formally, the logistic regression model is $$\log\frac{p(x)}{1-p(x)} = \beta_0 + \beta_1 x$$

Equivalently:

$$\frac{p(x)}{1-p(x)} = e^{\beta_0 + \beta_1 x}$$
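As a quick numerical sanity check of the equivalence (a sketch with made-up coefficient values, not fitted estimates): the log odds are linear in $x$, and the odds are the exponential of that same line.

```python
import math

# Hypothetical coefficients, chosen only for illustration
b0, b1 = -0.5, 0.8
x = 2.0

# p(x) from the logistic model
p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

log_odds = math.log(p / (1.0 - p))
odds = p / (1.0 - p)

print(log_odds, b0 + b1 * x)        # log odds equal the linear predictor
print(odds, math.exp(b0 + b1 * x))  # odds equal its exponential
```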

In words, logistic regression models the log odds as a linear function of the predictors; the odds are the exponential of this linear combination. In the multivariate case, this gives:

$$\frac{p(x)}{1-p(x)} = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n}$$

Look at it as a product:

$$\frac{p(x)}{1-p(x)} = e^{\beta_0}e^{\beta_1 x_1}\cdots e^{\beta_n x_n}$$

and the relationship between the fitted coefficients and the predictors is clear: for every one-unit increase in $x_i$, the odds are multiplied by a factor of $e^{\beta_i}$ (equivalently, the log odds increase by $\beta_i$). This seems to be the relationship your formula seeks to stress. I believe the proof you ask for is this:

$$OR = \frac{\operatorname{odds}(F(\hat{\textbf x}))}{\operatorname{odds}(F(\textbf x))} = \frac{\frac{F(\hat{\textbf x})}{1 - F(\hat{\textbf x})}}{\frac{F(\textbf x)}{1 - F(\textbf x)}} = \frac{e^{\beta_0 + \beta_1 x_1 + \dots + \beta_m(x_m+1) + \dots + \beta_n x_n}}{e^{\beta_0 + \beta_1 x_1 + \dots + \beta_m x_m + \dots + \beta_n x_n}} = e^{\beta_m}$$

with $\hat{x}_i = x_i$ for $i \ne m$, and $\hat{x}_m = x_m + 1$. (For my part, I don't find this particularly intuitive or illustrative.)
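The identity can also be checked numerically. Here is a minimal sketch with made-up coefficients for three predictors, bumping the second predictor by one unit and comparing the resulting odds ratio to $e^{\beta_2}$:

```python
import math

def F(z):
    """Logistic function: P(Y = 1) given linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

# Hypothetical coefficients and covariate values, for illustration only
b = [-0.8, 0.4, 1.1, -0.6]       # b[0] is the intercept
x = [2.0, -1.0, 0.5]             # x_1 .. x_3
x_hat = [2.0, 0.0, 0.5]          # identical, except x_2 raised by one unit

def lin(xs):
    """Linear predictor b0 + b1*x1 + ... + bn*xn."""
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], xs))

OR = odds(F(lin(x_hat))) / odds(F(lin(x)))
print(OR, math.exp(b[2]))  # the OR equals e^{beta_2}
```

The values of the other predictors cancel in the ratio, which is why the result does not depend on where they are held fixed.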

What probabilities are being compared?

If $x_m$ is continuous, it compares the odds under a one-unit change in $x_m$ while holding all other predictors equal; if $x_m$ is binary, it compares two cases in which all variables are equal except for the presence or absence of $x_m$.
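For the binary case, a small simulation sketch (with made-up coefficients, and $x$ as the only predictor) shows that the empirical odds ratio from the resulting 2×2 table recovers $e^{\beta_1}$:

```python
import math
import random

random.seed(0)
b0, b1 = -1.2, 0.9  # hypothetical intercept and binary-predictor coefficient

def p(x):
    """P(Y = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Simulate outcomes for the two groups (x = 0 and x = 1)
n = 200_000
counts = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
for x in (0, 1):
    for _ in range(n):
        y = 1 if random.random() < p(x) else 0
        counts[(x, y)] += 1

# Empirical odds ratio from the 2x2 table of (x, y) counts
emp_or = (counts[(1, 1)] / counts[(1, 0)]) / (counts[(0, 1)] / counts[(0, 0)])
print(emp_or, math.exp(b1))  # close for large n
```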

Is it really still the OR once it changes?

You may find it more helpful to think of it as your understanding of the odds ratio changing in light of new information.
