Markov chain convergence, total variation and KL divergence

I have a few related questions regarding the convergence of continuous-state Markov chains.

The theorems that I found claim that Markov chains converge in total variation if they are $\phi$-irreducible and aperiodic (e.g., Theorem 4). I am confused because, as the result is phrased, it does not seem to depend on the choice of $\phi$, yet $\phi$-irreducibility seems to always hold for the trivial measure. Maybe I misinterpreted the definition of $\phi$-irreducibility? Could somebody help me understand this important result?

The other question that I have is: under which conditions do Markov chains converge in KL divergence (ideally with respect to $D_\text{KL}[P^n(x, \cdot), Q]$)? Is this known? Pinsker's inequality tells me that convergence in KL divergence, in either direction, is stronger than convergence in total variation.


It is important to state the theorem correctly with all conditions. Theorem 4 in the paper by Roberts and Rosenthal states that the $n$-step transition probabilities $P^n(x, \cdot)$ converge in total variation to a probability measure $\pi$ for $\pi$-almost all $x$ if the chain is $\phi$-irreducible, aperiodic and has $\pi$ as an invariant distribution, that is, if $$\pi(A) = \int P(x, A) \, \pi(\mathrm{d}x).$$ There is also a technical condition that the $\sigma$-algebra on the state space should be countably generated. We return to this below. It is quite important for the general application of the theorem that one knows upfront that there is an invariant $\pi$; otherwise the chain can be null recurrent. In the MCMC context on $\mathbb{R}^d$ of the cited paper, the chains are constructed with a given target distribution as invariant distribution, so in this context it is only the $\phi$-irreducibility and aperiodicity that we need to check.
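
To make the statement concrete, here is a minimal finite-state sketch of my own (it is not taken from the paper). The 3-state transition matrix `P` is hypothetical and chosen with all entries positive, so the chain is irreducible and aperiodic; the helper names are likewise just for illustration. It computes the invariant $\pi$ and then the total variation and KL distances between $P^n(x, \cdot)$ and $\pi$. A finite state space hides all the subtleties about $\phi$ and the $\sigma$-algebra, so treat this only as an illustration of what the convergence means.

```python
import numpy as np

# Hypothetical 3-state transition matrix, chosen only for illustration.
# All entries are positive, so the chain is irreducible and aperiodic.
P = np.array([
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
])

def stationary_distribution(P):
    """Return pi with pi @ P = pi, from the left eigenvector for eigenvalue 1."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()  # normalising also fixes the sign of the eigenvector

def total_variation(p, q):
    """TV distance sup_A |p(A) - q(A)|, which equals half the L1 distance."""
    return 0.5 * np.abs(p - q).sum()

def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions, with the convention 0 log 0 = 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

pi = stationary_distribution(P)

def distances_from_state(x, n_steps):
    """TV and KL between P^n(x, .) and pi for n = 1, ..., n_steps."""
    row = np.zeros(P.shape[0])
    row[x] = 1.0  # point mass at the starting state x
    tv, kl = [], []
    for _ in range(n_steps):
        row = row @ P  # one more step of the chain
        tv.append(total_variation(row, pi))
        kl.append(kl_divergence(row, pi))
    return np.array(tv), np.array(kl)

tv, kl = distances_from_state(x=0, n_steps=30)
for n in (1, 5, 10, 30):
    print(f"n={n:2d}  TV={tv[n - 1]:.3e}  KL={kl[n - 1]:.3e}")
```

In this toy chain both distances shrink towards zero, and the Pinsker bound $\mathrm{TV} \le \sqrt{D_\text{KL}/2}$ can be checked numerically on the returned arrays. This says nothing about KL convergence on a general state space.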

The authoritative reference on these matters is Meyn and Tweedie's book Markov Chains and Stochastic Stability, which is also cited heavily in the paper. However, as far as I can tell, there are minor differences between the results presented in the paper and in the book, and the paper does have a proof of Theorem 4.

Returning to the question, the measure $\phi$ used to define $\phi$-irreducibility is by assumption non-zero, so the trivial measure is ruled out. (This is actually missing in the Meyn and Tweedie book, but stated correctly in the paper on page 31. The Meyn and Tweedie book also lacks the assumption of $\sigma$-finiteness that Roberts and Rosenthal make. I cannot see that it is possible to give this up either.)
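
For reference, the definition as I read it in the paper is roughly the following (the notation $\mathcal{X}$ for the state space and $n(x, A)$ is mine, so check the paper for the exact wording): the chain is $\phi$-irreducible if there is a non-zero $\sigma$-finite measure $\phi$ such that $$\text{for all } A \subseteq \mathcal{X} \text{ with } \phi(A) > 0 \text{ and all } x \in \mathcal{X}, \text{ there exists } n = n(x, A) \geq 1 \text{ with } P^n(x, A) > 0.$$ If $\phi$ were allowed to be the zero measure, no set $A$ would satisfy $\phi(A) > 0$ and the condition would hold vacuously for every chain, which is exactly the degenerate case that the non-zero assumption excludes.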

To return to the assumption on a countably generated $\sigma$-algebra on a general state space, this assumption ensures that $\phi$-irreducible chains have small sets, see Theorem 19 in the paper. If you can prove the existence of a small set by other means, the assumption on the $\sigma$-algebra can be dropped.
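
As a reminder of what is being asserted, and stated from memory rather than quoted, a set $C$ is small if there exist an integer $n_0 \geq 1$, a constant $\epsilon > 0$ and a probability measure $\nu$ such that the minorisation condition $$P^{n_0}(x, A) \geq \epsilon \, \nu(A) \quad \text{for all } x \in C \text{ and all measurable } A$$ holds. The symbols $n_0$, $\epsilon$ and $\nu$ follow what I believe is the paper's notation, but check it against the paper before relying on it.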

Regarding the second question, I am afraid I can't be of much assistance. Why is this of interest? I have not encountered problems where KL-convergence was needed specifically.
