Based on this post on quora.com and other sources, I got the impression that each word in a word2vec representation is encoded as a vector with, e.g., 500 dimensions. However, when looking into the sentiment analysis code from the tutorial at deeplearning.net, I found that each sentence is simply captured in a vector of roughly 40-80 values. One such example is below:
x = [17, 25, 769, 83, 3, 14, 80, 62, 3221, 5, 928, 3, 1782, 6, 1, 1, 771, 24, 3350, 7, 1112, 228, 5, 3978, 4, 17, 25, 1212, 80, 6, 189, 7, 62, 1293, 5, 514, 4, 2, 131, 10, 1146, 480, 59, 413, 213, 117, 3, 14, 824, 69, 611, 2, 239, 73, 222, 72, 2338, 2, 67, 147, 4, 15, 1164, 123, 17, 10, 6, 100, 111, 23, 45, 228, 25, 4427, 8, 1131, 73, 31, 4]
y = [1]
With a sentence consisting of many words, this vector clearly cannot represent each word using 200 or 500 dimensions. Instead, it seems that each word is approximated by a single scalar value. How can this be?
Best Answer
Each word in a given sentence is converted into an index into a dictionary of all the words in your corpus. This variable-length vector of indices is your x, and it is separate from the word2vec embeddings.
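As a minimal sketch of that conversion (the vocabulary and sentence here are made up for illustration):

```python
# Hypothetical vocabulary mapping each word in the corpus to an integer index.
vocab = {"the": 1, "movie": 2, "was": 3, "great": 4}

sentence = "the movie was great".split()

# x is the variable-length index vector, analogous to the one in the question.
x = [vocab[word] for word in sentence]
print(x)  # [1, 2, 3, 4]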
Each index is also used to look up the associated fixed-length word2vec vector in a lookup table during training and prediction.
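Continuing the sketch in NumPy: the indices select rows of an embedding matrix, so a length-n sentence becomes an n x d matrix of word vectors. The matrix below is randomly initialized and the 500-dimension size is illustrative; in practice the rows would be trained or pre-trained word2vec vectors.

```python
import numpy as np

vocab_size = 5        # made-up vocabulary size (indices 0-4)
embedding_dim = 500   # e.g. 500 dimensions per word, as in the question

# Lookup table: one fixed-length word2vec-style vector per vocabulary entry.
W = np.random.randn(vocab_size, embedding_dim)

x = [1, 2, 3, 4]      # word indices from the previous step

# Row indexing maps the length-4 index vector to a 4 x 500 matrix:
# one embedding vector per word in the sentence.
embedded = W[x]
print(embedded.shape)  # (4, 500)
```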
This RNN tutorial on deeplearning.net explains it in more detail, with code samples.
In Torch, this word-index lookup is handled by the LookupTable layer.
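For a concrete illustration, here is the same idea assuming modern PyTorch, where nn.Embedding plays the role of Lua Torch's LookupTable (the vocabulary and embedding sizes are made up):

```python
import torch
import torch.nn as nn

# nn.Embedding is PyTorch's counterpart of Lua Torch's LookupTable:
# a trainable (num_embeddings x embedding_dim) lookup table.
embedding = nn.Embedding(num_embeddings=5000, embedding_dim=500)

x = torch.tensor([17, 25, 769, 83])  # word indices, as in the question's x
vectors = embedding(x)               # one 500-dim vector per word
print(vectors.shape)                 # torch.Size([4, 500])
```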