Solved – How does python-glove compute most similar

I am trying to understand how python-glove computes most-similar terms.

Is it using cosine similarity?

Example from python-glove github
https://github.com/maciejkula/glove-python/tree/master/glove

I know that in gensim's word2vec, the most_similar method ranks words by cosine similarity.

Looking at the code, python-glove also computes the cosine similarity. In _similarity_query it performs these operations:

    dst = (np.dot(self.word_vectors, word_vec)
           / np.linalg.norm(self.word_vectors, axis=1)
           / np.linalg.norm(word_vec))

You can find this code in the repository linked above; if it has been updated since, search for _similarity_query.

As you can see, the function first computes the dot product between every word vector and the embedding of the query word, then divides by the norm (length) of each of the two vectors. This matches the definition of cosine similarity, where $\theta$ is the angle between $u$ and $v$:

$$\text{CosineSimilarity}(u, v) = \frac{u \cdot v}{\|u\|_2 \, \|v\|_2} = \cos(\theta) \tag{1}$$
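To make the computation concrete, here is a minimal NumPy sketch of the same idea. It is not the library's code: the function name most_similar, the top_k parameter, and the toy_vectors matrix are made up for illustration, and only the three-line expression mirrors the snippet quoted above.

```python
import numpy as np


def most_similar(word_vectors, query_vec, top_k=3):
    """Rank the rows of word_vectors by cosine similarity to query_vec.

    Hypothetical helper for illustration; not part of glove-python.
    """
    # Same expression as the quoted snippet: dot product divided by both norms.
    scores = (np.dot(word_vectors, query_vec)
              / np.linalg.norm(word_vectors, axis=1)
              / np.linalg.norm(query_vec))
    # Sort by descending similarity; a stable sort keeps ties in row order.
    order = np.argsort(-scores, kind="stable")[:top_k]
    return [(int(i), float(scores[i])) for i in order]


# Toy 2-D "embeddings" with made-up values.
toy_vectors = np.array([
    [1.0, 0.0],   # same direction as the query
    [2.0, 0.0],   # same direction, twice the length
    [0.0, 1.0],   # orthogonal to the query
    [1.0, 1.0],   # 45 degrees from the query
])

result = most_similar(toy_vectors, np.array([1.0, 0.0]), top_k=4)
for index, score in result:
    print(index, round(score, 4))
```

Rows 0 and 1 both score 1.0 even though one is twice as long as the other, which is the point of dividing by the norms: only the direction of the vectors matters, not their magnitude.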
