I was watching a video on Gaussian distributions and it defined the degenerate univariate Gaussian as a Gaussian where the variance is zero. However, I am really struggling to understand how the Gaussian can be defined for this.
The Gaussian is supposed to be non-zero on the whole line. In the degenerate case that would fail, since it would be zero everywhere except at a single point. Is there some limiting case where this might be defined?
On top of that, setting the variance to zero in the formula of course does not work. So I assume the degenerate case arises as a limit, but I could not find a good explanation of it.
Best Answer
This question uses "Gaussian" in two distinct mathematical senses (and therein lies its resolution): first as a distribution and then, at the beginning of the second paragraph, as a probability density function. However, a degenerate Gaussian does not have a PDF. Therefore we should be visualizing the distributions in terms of the one object that is guaranteed to exist no matter what; namely, the cumulative distribution function. The CDF of a degenerate Gaussian (of mean $\mu$) leaps from $0$ to $1$ at the value $\mu$, creating no difficulties with definitions or limiting values.
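Written out, and using the usual right-continuous convention $F(x) = \Pr(X \le x)$ (the symbol $F$ is notation introduced here for illustration), that CDF is

$$F(x) = \begin{cases} 0, & x < \mu, \\ 1, & x \ge \mu. \end{cases}$$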
The Gaussian density indeed "degenerates" as the standard deviation decreases, because it becomes arbitrarily large at the mean and shrinks to zero elsewhere, as shown in these plots of PDFs of Gaussian ("Normal") distributions with standard deviations $1, 1/4, 1/16,$ and $1/64$. (For better visualization, the vertical axis is cut off at the peak of the third distribution; the peak of the last and narrowest one, shown in blue, extends above $25$.)
The peak must become very large to compensate for a shrinking width, because a PDF represents probability by means of area and, as required by the axioms of probability, the total area will equal $1$ only when the curve grows large in the other (vertical) direction. See "A Probability distribution value exceeding 1 is OK" for further explanation. These densities have no finite limit at $\mu$, but their limits at all other numbers are zero. No matter what value we care to assign to the limit at $\mu$ (even "$+\infty$"), the area under this limiting function is zero, so it cannot be the PDF of any distribution.
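As a rough numerical check of this behavior, here is a minimal Python sketch that evaluates the Normal density at the mean and at one point away from it for the four standard deviations mentioned above. The helper name `normal_pdf`, the choice of mean $0$, and the offset $0.5$ are assumptions made only for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Normal(mu, sigma^2) distribution; requires sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu = 0.0                         # illustrative mean
sigmas = [1, 1/4, 1/16, 1/64]    # the standard deviations used in the plots

for sigma in sigmas:
    peak = normal_pdf(mu, mu, sigma)       # height at the mean: 1 / (sigma * sqrt(2*pi))
    off = normal_pdf(mu + 0.5, mu, sigma)  # height at a fixed point away from the mean
    print(f"sigma={sigma:<9g} pdf(mu)={peak:9.4f}  pdf(mu+0.5)={off:.3e}")
```

The value at the mean grows in proportion to $1/\sigma$ (exceeding $25$ for the narrowest case), while the value at the fixed offset collapses toward zero.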
The CDFs instead approach a definite curve in the limit of small standard deviations, which is evident in this corresponding plot of the CDFs of these four distributions:
The colors correspond to the distributions in the same way as the previous plot. The CDF for the distribution with standard deviation $1/64$, shown in blue, leaps from $0$ to $1$ within a very short space around the mean $\mu$. In the limit of zero standard deviation, the leap will be instantaneous: the limiting curve is $0$ at all values less than $\mu$ and $1$ at all values greater than $\mu$. (A subtle point is that the value of the limiting curve at $\mu$ itself is $1/2$, whereas the CDF of the degenerate distribution equals $1$ there. This is well understood; the relevant theorems do not assert that the limiting values at points where the limiting CDF has a jump will be correct.) This leap represents an "atom" at $\mu$ where all the probability is concentrated. The limiting function determines a valid probability distribution, but now, because it locates all the probability within a denumerable set of points (namely, a single point), it is discrete rather than continuous. Ordinarily we would work with its probability mass function (equal to $1$ at $\mu$).
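The convergence of the CDFs, including the exceptional value at the jump, can be checked with a short Python sketch. The function names `normal_cdf` and `degenerate_cdf`, the mean $0$, and the test points $\mu \pm 0.1$ are assumptions chosen for illustration; the Normal CDF is computed from the standard library's complementary error function.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of Normal(mu, sigma^2) for sigma > 0, via the complementary error function."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

def degenerate_cdf(x, mu=0.0):
    """CDF of the point mass at mu (right-continuous convention)."""
    return 1.0 if x >= mu else 0.0

mu = 0.0  # illustrative mean
for sigma in [1, 1/4, 1/16, 1/64]:
    below = normal_cdf(mu - 0.1, mu, sigma)  # tends to 0 as sigma shrinks
    at_mu = normal_cdf(mu, mu, sigma)        # stays at 1/2 for every sigma > 0
    above = normal_cdf(mu + 0.1, mu, sigma)  # tends to 1 as sigma shrinks
    print(f"sigma={sigma:<9g} F(mu-0.1)={below:.3e}  F(mu)={at_mu}  F(mu+0.1)={above:.12f}")

# The limiting distribution itself puts the value 1 at mu, not 1/2.
print(degenerate_cdf(mu - 0.1), degenerate_cdf(mu), degenerate_cdf(mu + 0.1))
```

Away from $\mu$ the values settle on $0$ and $1$, while at $\mu$ every nondegenerate CDF equals $1/2$ even though the degenerate CDF equals $1$ there.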