This is much more of a soft question, but I was wondering what work there is on distributions specifically over a vector space. Where I'm coming from is that I have run Latent Dirichlet Allocation on several timestamped corpora and have embedded the words into a vector space via word2vec. Now I have topics over time to compare.

Rather than using typical divergence/distance measures on distributions to compare topics, it seems like there should be additional structure to analyze (potentially an algebraic approach?), given that outcomes are elements of a vector space.

My background in statistics is not the strongest, so apologies if this is quite simple!


#### Best Answer

One way to measure the distance between two distributions, apart from well-known measures such as the Kullback–Leibler divergence or the Mahalanobis distance, is the Maximum Mean Discrepancy (MMD). It is a non-parametric measure, so no density estimation is required.

Assume we have two data sets drawn from different distributions. The MMD is a non-parametric distance between these two distributions.

The MMD represents the distance between distributions as the distance between mean embeddings of their features. Suppose we have two distributions P and Q over a set X (the feature space). The MMD is defined via a feature map φ: X → H, where H is a reproducing kernel Hilbert space. In general, the MMD is

MMD(P, Q) = ‖E_{X∼P}[φ(X)] − E_{Y∼Q}[φ(Y)]‖_H.
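In practice, the kernel trick lets you estimate the squared MMD from samples without ever computing φ explicitly, since ‖·‖²_H expands into expectations of kernel evaluations. A minimal sketch of the (biased) empirical estimator with a Gaussian RBF kernel, assuming NumPy and a bandwidth `sigma` chosen by hand (the function names are illustrative, not from a particular library):

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2)).
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2 * A @ B.T)
    return np.exp(-sq_dists / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    # Biased empirical estimate of squared MMD:
    #   E[k(x, x')] - 2 E[k(x, y)] + E[k(y, y')]
    return (rbf_kernel(X, X, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean())

# Toy check: samples from two well-separated Gaussians should be
# farther apart (in MMD) than two samples from the same Gaussian.
rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 5)), rng.normal(0, 1, (200, 5)))
diff = mmd2(rng.normal(0, 1, (200, 5)), rng.normal(2, 1, (200, 5)))
print(same < diff)
```

For topic distributions over word2vec embeddings, `X` and `Y` would be the embedding vectors of words sampled (or weighted) according to each topic; the kernel then lets the geometry of the embedding space inform the comparison, which is exactly the extra structure a plain divergence on the probability simplex ignores.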

### Similar Posts:

- Solved – Can we apply KL divergence to the probability distributions on different domains
- Solved – Jensen-Shannon Divergence for multiple probability distributions
- Solved – distance measure of two discrete probability histograms (distance between two vectors)