# Solved – KL divergence invariant to affine transformation

I read in this tutorial on page 20 that $$KL$$ divergence is invariant to affine transformations, but I think this is incorrect.

Say we have two 1D normal distributions $$P_{1}(x) = \mathcal N(\mu_{1}, \sigma_{1}^2)$$ and $$P_{2}(x) = \mathcal N(\mu_{2}, \sigma_{2}^2)$$, so that $$KL(P_1(x)\|P_{2}(x))= E_{1}\left(\ln \frac{P_{1}(x)}{P_{2}(x)}\right) = \ln\left(\frac{\sigma_{2}}{\sigma_{1}}\right) + \frac{1}{2\sigma_2^2}\left(\sigma_1^2+(\mu_1-\mu_2)^2\right)-\frac{1}{2}$$
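To make the closed-form expression above concrete, here is a minimal sketch that evaluates it and compares it against a direct numerical estimate of $$E_{1}(\ln P_1/P_2)$$. The function names (`kl_gauss`, `kl_numeric`), the integration range, and the test values are my own illustrative choices, and the second argument of each distribution is treated as a standard deviation.

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2)); s1 and s2 are standard deviations."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def normal_pdf(x, mu, s):
    """Density of N(mu, s^2) at x."""
    return math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def kl_numeric(mu1, s1, mu2, s2, n=20_000, width=12.0):
    """Midpoint-rule estimate of E_1[ln(P_1/P_2)] over mu1 +/- width*s1."""
    lo = mu1 - width * s1
    dx = 2 * width * s1 / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        # ln P_1(x) - ln P_2(x), written out analytically to avoid 0/0 in the tails
        log_ratio = (math.log(s2 / s1)
                     - 0.5 * ((x - mu1) / s1) ** 2
                     + 0.5 * ((x - mu2) / s2) ** 2)
        total += normal_pdf(x, mu1, s1) * log_ratio * dx
    return total

# Arbitrary illustrative parameters (not taken from the tutorial)
print(kl_gauss(0.0, 1.0, 1.0, 2.0))
print(kl_numeric(0.0, 1.0, 1.0, 2.0))
```

The two printed values should agree closely, which is only a sanity check of the formula, not a proof.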

If we define an affine transformation as $$x^{'} = \mu_1 + \frac{1}{\sigma}(x - \mu_1)$$

We will have
$$P_1(x^{'}) = \sigma P_1(x = \mu_1+ \sigma(x' - \mu_1)) = \mathcal N\left(\mu_1, \frac{\sigma_1^2}{\sigma^2}\right)$$ and
$$P_2(x^{'}) = \sigma P_2(x = \mu_1+ \sigma(x' - \mu_1)) = \mathcal N\left(\mu_1-\frac{1}{\sigma}(\mu_1-\mu_2), \frac{\sigma_2^2}{\sigma^2}\right)$$
Then, the $$KL$$ divergence for the two transformed distributions is $$KL(P_1(x')\|P_2(x')) = E'_1\left(\ln \frac{P_1(x')}{P_2(x')}\right) = \ln \left(\frac{\sigma_{2}}{\sigma_{1}}\right) + \frac{1}{2\sigma_2^2}\left(\sigma^2 \sigma_1^2+(\mu_1-\mu_2)^2\right)-\frac{\sigma^2}{2}$$

So clearly, for such a simple case $$KL$$ divergence is not invariant.

However, the invariance of $$KL$$ divergence under affine transformations is crucial for the proof in the tutorial that I referred to.

So, have I misunderstood something?

EDIT:

I think part of my misunderstanding lies in the way that I calculate $$P_1(x')$$ and $$P_2(x')$$. So I will expand this part so others can see where I got it wrong.
$$P_1(x') = \sigma P_1(x) = \sigma P_1(\mu_1+\sigma (x'-\mu_1))$$
given that $$P_1(x)=\mathcal N(\mu_1, \sigma_1^2)$$
so,
$$\sigma P_1(\mu_1+\sigma (x'-\mu_1)) = \sigma \frac{1}{\sqrt{2\pi}\sigma_1} e^{-\frac{1}{2\sigma_1^2}(\sigma (x' - \mu_1))^2} = \frac{1}{\sqrt{2\pi} \frac{\sigma_1}{\sigma}} e^{-\frac{1}{2\frac{\sigma_1^2}{\sigma^2}}(x' - \mu_1)^2} = \mathcal N\left(\mu_1, \frac{\sigma_1^2}{\sigma^2}\right)$$
Then, in exactly the same way, I have $$P_2(x^{'}) = \sigma P_2(x = \mu_1+ \sigma(x' - \mu_1)) = \mathcal N\left(\mu_1-\frac{1}{\sigma}(\mu_1-\mu_2), \frac{\sigma_2^2}{\sigma^2}\right)$$

Is there any problem with this?
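One way to probe this is a quick numerical check: push both Gaussians through $$x' = \mu_1 + \frac{1}{\sigma}(x - \mu_1)$$ and feed the resulting means and standard deviations into the standard closed-form Gaussian $$KL$$ expression. This is only a sketch; the helper names (`kl_gauss`, `affine_params`) and the parameter values are hypothetical choices for illustration, not anything from the tutorial.

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2)); s1 and s2 are standard deviations."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def affine_params(mu, s, center, sigma):
    """Mean and std of x' = center + (x - center)/sigma when x ~ N(mu, s^2)."""
    return center + (mu - center) / sigma, s / abs(sigma)

# Arbitrary illustrative values (assumptions, not from the tutorial)
mu1, s1, mu2, s2 = 0.3, 1.2, -0.7, 2.5
sigma = 3.0

# Both distributions are transformed with the same map, centered at mu1
m1p, s1p = affine_params(mu1, s1, mu1, sigma)
m2p, s2p = affine_params(mu2, s2, mu1, sigma)

kl_before = kl_gauss(mu1, s1, mu2, s2)
kl_after = kl_gauss(m1p, s1p, m2p, s2p)
print(kl_before, kl_after)
```

In this sketch the transformed parameters match the ones derived above (mean $$\mu_1$$ and variance $$\sigma_1^2/\sigma^2$$ for $$P_1$$, mean $$\mu_1-\frac{1}{\sigma}(\mu_1-\mu_2)$$ and variance $$\sigma_2^2/\sigma^2$$ for $$P_2$$), and the two printed values should coincide up to rounding. That suggests the transformed densities themselves are fine, and that any discrepancy would have to come from the step where they are substituted into the $$KL$$ formula: with variances $$\sigma_i^2/\sigma^2$$ and mean difference $$(\mu_1-\mu_2)/\sigma$$, every factor of $$\sigma$$ cancels.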
