As a warm-up with recurrent neural networks, I'm trying to predict a sine wave from another sine wave of a different frequency.

My model is a simple RNN; its forward pass can be expressed as follows:

$$
\begin{aligned}
r_t &= \sigma(W_{in} \cdot x_t + W_{rec} \cdot r_{t-1}) \\
z_t &= W_{out} \cdot r_t
\end{aligned}
$$

where $\sigma$ is the sigmoid function.
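The forward pass above can be sketched in NumPy as follows. This is only an illustration, not the Torch code from the Gist: the function name `rnn_forward`, the zero initial state, the random weight initialisation, and the time grid are all assumptions, while the hidden size of 32 and sequence length of 16 match the settings quoted in the question.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-a))

def rnn_forward(x, W_in, W_rec, W_out, r0=None):
    """Run the simple RNN over a sequence x of shape (T, n_in).

    Returns the outputs z_t stacked into an array of shape (T, n_out).
    The initial hidden state defaults to zeros (an assumption; the
    question does not say how the state is initialised).
    """
    r = np.zeros(W_rec.shape[0]) if r0 is None else r0
    outputs = []
    for x_t in x:
        r = sigmoid(W_in @ x_t + W_rec @ r)  # r_t
        outputs.append(W_out @ r)            # z_t
    return np.stack(outputs)

# Hypothetical sizes and weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 1, 32, 1
W_in = rng.normal(scale=0.1, size=(n_hidden, n_in))
W_rec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_out = rng.normal(scale=0.1, size=(n_out, n_hidden))

# One sequence of 16 samples of sin(t); the grid spacing is assumed.
t = np.linspace(0.0, 4.0 * np.pi, 16)
x = np.sin(t)[:, None]
z = rnn_forward(x, W_in, W_rec, W_out)
```

Note that, like the equations, this sketch has no bias terms, and the output layer is purely linear.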

When the input and the expected output are two sine waves of the same frequency, possibly with a phase shift, the model *is able to properly converge* to a reasonable approximation.

However, in the following case, the model converges to a local minimum and predicts zero all the time:

- input: $x = \sin(t)$
- expected output: $y = \sin(\frac{t}{2})$

Here's what the network predicts when given the full input sequence after 10 epochs of training, using mini-batches of size 16, a learning rate of 0.01, a sequence length of 16 and hidden layers of size 32:

This leads me to think the network is unable to learn through time and relies only on the current input to make its prediction.

I tried tuning the learning rate, sequence length and hidden layer size without much success.

I'm having the exact same issue with an LSTM. I don't want to believe these architectures are that flawed. Any hints on what I am doing wrong?

*I'm using an rnn package for Torch, the code is in a Gist.*


#### Best Answer

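Your data basically cannot be learned with an RNN trained that way. Your input $\sin(t)$ is $2\pi$-periodic: $\sin(t) = \sin(t+2\pi)$,

but your target $\sin(t/2)$ is $4\pi$-periodic, and shifting $t$ by $2\pi$ flips its sign: $\sin\left(\frac{t+2\pi}{2}\right) = \sin\left(\frac{t}{2}+\pi\right) = -\sin\left(\frac{t}{2}\right)$.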

Therefore, your dataset contains pairs of identical inputs with opposite outputs. In terms of mean squared error, this means that the optimal solution is the zero function.
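A quick numerical check of this argument, as a sketch rather than a proof: it assumes the prediction depends only on the current input value, so both copies of an input must receive the same prediction. The sampling grid and variable names are arbitrary choices for illustration.

```python
import numpy as np

# Sample one input period, then the same points shifted by 2*pi.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
t_shifted = t + 2.0 * np.pi

x_a, x_b = np.sin(t), np.sin(t_shifted)                # inputs
y_a, y_b = np.sin(t / 2.0), np.sin(t_shifted / 2.0)    # targets

inputs_match = np.allclose(x_a, x_b)    # identical inputs
targets_flip = np.allclose(y_a, -y_b)   # opposite targets

# For a repeated input with targets y and -y, the squared error of a
# single prediction p is (p - y)^2 + (p + y)^2 = 2*p^2 + 2*y^2, which is
# minimised at p = 0, i.e. at the mean of the two targets.
best_pred = 0.5 * (y_a + y_b)
best_is_zero = np.allclose(best_pred, 0.0)

print(inputs_match, targets_flip, best_is_zero)
```

Under that assumption, the MSE-optimal prediction is zero at every sampled point, which is consistent with the flat output described in the question.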

These are two slices of your plot where you can see identical inputs but opposite targets.