Solved – Divergence Measures on Distributions over a Vector Space

This is much more of a soft question, but I was wondering what work there is on distributions specifically over a vector space. Where I'm coming from: I have run Latent Dirichlet Allocation on several timestamped corpora and have embedded the words into a vector space via word2vec, so I now have topics over time to compare.

Rather than using the typical divergence/distance measures on distributions to compare topics, it seems like there should be additional structure to exploit (potentially an algebraic approach?), given that the outcomes are elements of a vector space.

My background in statistics is not the strongest, so apologies if this is quite simple!

One way to measure the distance between two distributions, apart from well-known divergence measures such as the Kullback–Leibler divergence, the Mahalanobis distance, etc., is the Maximum Mean Discrepancy (MMD), a non-parametric measure, so no density estimation is required.

Assume we have two data sets that follow different distributions; the MMD gives a non-parametric distance between those two distributions.

The MMD represents the distance between distributions as the distance between mean embeddings of their features. Suppose we have two distributions P and Q over a set X (the feature space). The MMD is defined via a feature map φ: X → H, where H is a reproducing kernel Hilbert space. In general, the MMD is

MMD(P, Q) = ‖E_{X∼P}[φ(X)] − E_{Y∼Q}[φ(Y)]‖_H.
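Because H is a reproducing kernel Hilbert space, inner products of feature maps are kernel evaluations, ⟨φ(x), φ(y)⟩_H = k(x, y), so expanding the squared norm gives a form you can estimate directly from samples:

MMD²(P, Q) = E[k(X, X′)] − 2 E[k(X, Y)] + E[k(Y, Y′)].

Here is a minimal NumPy sketch of the (biased) plug-in estimator with an RBF kernel; the function name mmd_rbf, the bandwidth gamma, and the synthetic data standing in for word2vec vectors are all illustrative, not part of the original answer:

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Biased empirical estimate of squared MMD with an RBF kernel.

    X : (n, d) array of samples from P (e.g. word vectors of one topic)
    Y : (m, d) array of samples from Q
    gamma : RBF bandwidth, k(a, b) = exp(-gamma * ||a - b||^2)
    """
    def rbf(A, B):
        # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
        sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * sq)

    # MMD^2 = E[k(x, x')] - 2 E[k(x, y)] + E[k(y, y')]
    return rbf(X, X).mean() - 2 * rbf(X, Y).mean() + rbf(Y, Y).mean()

# Hypothetical usage: two topics as clouds of 50-dimensional word vectors
rng = np.random.default_rng(0)
topic_a = rng.normal(0.0, 1.0, size=(200, 50))  # stand-in for word2vec rows
topic_b = rng.normal(0.5, 1.0, size=(200, 50))
print(mmd_rbf(topic_a, topic_b, gamma=0.1))
```

In the topic-modelling setting, each topic could be represented by the word2vec vectors of its top words; if you want to respect the topic's word probabilities, the unweighted sample means above could be replaced by probability-weighted means.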
