This is much more of a soft question, but I was wondering what work there is on distributions specifically over a vector space. Where I'm coming from is that I have run Latent Dirichlet Allocation on several timestamped corpora and have embedded the words into a vector space via word2vec. Now I have topics over time to compare.

Rather than using typical divergence/distance measures on distributions to compare topics, it seems like there should be additional structure to analyze (potentially an algebraic approach?), given that outcomes are elements of a vector space.

My background in statistics is not the strongest, so apologies if this is quite simple!


#### Best Answer

One way to measure the distance between two distributions, apart from well-known measures such as the Kullback–Leibler divergence or the Mahalanobis distance, is the Maximum Mean Discrepancy (MMD). It is a non-parametric measure, so no density estimation is required.

Assume we have two data sets drawn from different distributions. The MMD is a non-parametric distance between these two distributions.

The MMD represents the distance between distributions as the distance between mean embeddings of their features. Suppose we have two distributions P and Q over a set X (the feature space). The MMD is defined via a feature map φ: X → H, where H is a reproducing kernel Hilbert space. In general, the MMD is

MMD(P, Q) = ‖E_{X∼P}[φ(X)] − E_{Y∼Q}[φ(Y)]‖_H.
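In practice, the kernel trick lets you estimate the squared MMD from samples without ever computing φ explicitly, since ‖·‖²_H expands into expectations of kernel evaluations. A minimal sketch of the (biased) empirical estimator with a Gaussian RBF kernel, assuming NumPy and a bandwidth `sigma` chosen by hand (the function names are illustrative, not from a particular library):

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2)).
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2 * A @ B.T)
    return np.exp(-sq_dists / (2 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    # Biased empirical estimate of squared MMD:
    #   E[k(x, x')] - 2 E[k(x, y)] + E[k(y, y')]
    return (rbf_kernel(X, X, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean())

# Toy check: samples from two well-separated Gaussians should be
# farther apart (in MMD) than two samples from the same Gaussian.
rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 5)), rng.normal(0, 1, (200, 5)))
diff = mmd2(rng.normal(0, 1, (200, 5)), rng.normal(2, 1, (200, 5)))
print(same < diff)
```

For topic distributions over word2vec embeddings, `X` and `Y` would be the embedding vectors of words sampled (or weighted) according to each topic; the kernel then lets the geometry of the embedding space inform the comparison, which is exactly the extra structure a plain divergence on the probability simplex ignores.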

### Similar Posts:

- Solved – Can we apply KL divergence to the probability distributions on different domains
- Solved – Jensen-Shannon Divergence for multiple probability distributions
- Solved – distance measure of two discrete probability histograms (distance between two vectors)