Solved – How does the loss function work in word2vec?

I was watching CS224n and came across this equation for the word2vec loss function.
As in the blue box, "for each training example t we are calculating the probability of context words given the current word". I wanted to know why we are multiplying the probabilities as in the red boxes. I might be missing some math; it would be great if someone could help me. Thanks.

The probabilities are multiplied because you want the probability of two (or more) events all happening together, i.e. their joint probability, which equals the product of the probabilities of the individual events under the assumption that the events are independent. Here the events are the individual context words appearing around the current word. I highly recommend reading the basic Wikipedia articles on maximum likelihood before continuing, so that you understand the general mechanism.
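To make the multiplication concrete, here is a minimal numerical sketch. The probability values are invented for illustration and do not come from a trained word2vec model; the variable names are also hypothetical. It shows that, under the independence assumption, the joint probability of the context words is the product of the individual probabilities, and that taking the logarithm turns this product into a sum, which is the form usually minimized (with a minus sign) as the loss.

```python
import math

# Made-up values of P(context word | current word) for one current word
# with a window of two words on each side. Purely illustrative numbers.
context_probs = [0.10, 0.25, 0.05, 0.20]

# Independence assumption: the joint probability of observing all four
# context words is the product of the individual probabilities.
likelihood = math.prod(context_probs)

# The log of a product is the sum of the logs, so the same quantity can
# be accumulated as a sum, which is numerically more stable.
log_likelihood = sum(math.log(p) for p in context_probs)

# Maximizing the likelihood is equivalent to minimizing its negative log.
negative_log_likelihood = -log_likelihood

print("likelihood:", likelihood)
print("negative log-likelihood:", negative_log_likelihood)
```

Over a whole corpus the same product is taken again across all positions t, which is why the products appear nested; working with the sum of logs avoids multiplying many small numbers together.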
