# Solved – Why doesn’t the exponential family include all distributions

I am reading Bishop, Pattern Recognition and Machine Learning (2006), which defines the exponential family as distributions of the form (Eq. 2.194):
$$p(\mathbf x|\boldsymbol \eta) = h(\mathbf x) g(\boldsymbol \eta) \exp \{\boldsymbol \eta^\mathrm T \mathbf u(\mathbf x)\}$$
But I see no restrictions placed on $$h(\mathbf x)$$ or $$\mathbf u(\mathbf x)$$. Doesn't this mean that any distribution can be put in this form, by appropriate choice of $$h(\mathbf x)$$ and $$\mathbf u(\mathbf x)$$ (in fact only one of them has to be chosen properly!)? So how come the exponential family does not include all probability distributions? What am I missing?

Finally, a more particular question that I am interested in is this: Is the Bernoulli distribution in the exponential family? Wikipedia claims it is, but since I am obviously confused about something here, I would like to see why.


Well, one consequence of your definition: $$p(\mathbf x|\boldsymbol \eta) = h(\mathbf x) g(\boldsymbol \eta) \exp \{\boldsymbol \eta^\mathrm T \mathbf u(\mathbf x)\}$$ is that the support of the distributions in the family indexed by the parameter $$\eta$$ does not depend on $$\eta$$: since $$g(\boldsymbol \eta)$$ and the exponential factor are strictly positive, the density is zero exactly where $$h(\mathbf x)$$ is zero. (The support of a probability distribution is the smallest closed set with probability one, or in other words, where the distribution lives.) So it is enough to give a counterexample of a distribution family whose support depends on the parameter. The simplest example is the following family of uniform distributions: $$\text{U}(0, \eta), \quad \eta > 0$$. (The other answer by @Chaconne gives a more sophisticated counterexample.)
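
As for the Bernoulli part of the question, here is a minimal numerical sketch, assuming the usual rewriting with natural parameter $$\eta = \ln\frac{\mu}{1-\mu}$$, $$u(x) = x$$, $$h(x) = 1$$ and $$g(\eta) = \frac{1}{1 + e^{\eta}}$$. The function names are made up for illustration. It checks that this form reproduces $$\mu^x (1-\mu)^{1-x}$$, and contrasts it with $$\text{U}(0, \eta)$$, where a fixed point can be inside the support for one parameter value and outside it for another.

```python
import math

def bernoulli_pmf(x, mu):
    """Usual form: mu^x * (1 - mu)^(1 - x) for x in {0, 1}."""
    return mu**x * (1.0 - mu)**(1 - x)

def bernoulli_expfam(x, eta):
    """Exponential-family form h(x) * g(eta) * exp(eta * u(x))."""
    h = 1.0                          # h(x) = 1 on {0, 1}
    g = 1.0 / (1.0 + math.exp(eta))  # g(eta) = 1 - mu
    u = x                            # u(x) = x
    return h * g * math.exp(eta * u)

def uniform_pdf(x, eta):
    """Density of U(0, eta): 1/eta on (0, eta), zero elsewhere."""
    return 1.0 / eta if 0.0 < x < eta else 0.0

# The two Bernoulli forms agree for several values of mu.
for mu in (0.1, 0.5, 0.9):
    eta = math.log(mu / (1.0 - mu))  # natural parameter (log-odds)
    for x in (0, 1):
        assert math.isclose(bernoulli_pmf(x, mu), bernoulli_expfam(x, eta))

# The point x = 1.5 is outside the support for eta = 1 but inside for eta = 2.
print(uniform_pdf(1.5, 1.0), uniform_pdf(1.5, 2.0))  # prints: 0.0 0.5
```

For the Bernoulli, both $$x = 0$$ and $$x = 1$$ get positive probability for every finite $$\eta$$, so the support never moves. For the uniform family the indicator of $$0 < x < \eta$$ ties $$x$$ and $$\eta$$ together, and that cannot be absorbed into $$h(\mathbf x)$$ alone.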