Solved – Contextual Embedding

I understand word embeddings and word2vec.

In this paper: https://arxiv.org/pdf/1603.01547.pdf

they describe a new type of word embedding:

Our model uses one word embedding function and two encoder functions. The word embedding function e translates words into vector representations. The first encoder function is a document encoder f that encodes *every word from the document* d *in the context of the whole document*. We call this the **contextual embedding**. 

Is this some new way of encoding? How can I implement it? Thanks.

The contextual embedding of a word is just the corresponding hidden state of a bi-GRU:

In our model the document encoder $f$ is implemented as a bidirectional Gated Recurrent Unit (GRU) network whose hidden states form the contextual word embeddings, that is $f_i(d) = \overrightarrow{f_i}(d) \,||\, \overleftarrow{f_i}(d)$, where $||$ denotes vector concatenation and $\overrightarrow{f_i}$ and $\overleftarrow{f_i}$ denote forward and backward contextual embeddings from the respective recurrent networks.
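If you want to implement this yourself, here is a minimal PyTorch sketch of that idea: a word embedding layer followed by a bidirectional GRU whose per-position hidden states are the contextual embeddings. The class name, hyperparameters, and shapes are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class ContextualEmbedder(nn.Module):
    """Sketch of the document encoder f: word embedding e followed by a
    bidirectional GRU; its hidden states are the contextual embeddings."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)           # e: word id -> vector
        self.bigru = nn.GRU(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)  # f: document encoder

    def forward(self, token_ids):
        # token_ids: (batch, doc_len) integer word indices
        x = self.embed(token_ids)      # (batch, doc_len, embed_dim)
        states, _ = self.bigru(x)      # (batch, doc_len, 2 * hidden_dim)
        # states[:, i, :] is f_i(d): the forward and backward hidden states
        # at position i concatenated, i.e. the contextual embedding.
        return states

# Usage: contextual embedding of the first word of a toy document
model = ContextualEmbedder(vocab_size=10000)
doc = torch.randint(0, 10000, (1, 20))   # one document of 20 word ids
contextual = model(doc)
first_word_embedding = contextual[0, 0]  # vector of size 2 * hidden_dim
```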

[Figure from the paper: the contextual embedding of the first word is highlighted in red.]
