Solved – LSTM network window size selection and effect

When working with an LSTM network in Keras, the first layer takes the input_shape parameter shown below: model.add(LSTM(50, input_shape=(window_size, num_features), return_sequences=True)) I don't quite follow the window_size parameter and the effect it will have on the model. As far as I understand, to make a decision the network not only makes use of current … Read more
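To make the role of window_size concrete, here is a minimal sketch with made-up sizes and data (my own illustration, not the asker's setup, assuming TensorFlow's bundled Keras): each training sample is a slice of the window_size most recent time steps, so that slice is the entire history the LSTM sees for one prediction.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Hypothetical sizes, for illustration only.
window_size, num_features = 10, 3

# Build overlapping windows from a toy multivariate series:
# each sample holds the window_size most recent time steps.
series = np.random.randn(500, num_features)
X = np.stack([series[i:i + window_size] for i in range(len(series) - window_size)])
y = series[window_size:, 0]          # predict feature 0 one step ahead

model = Sequential([
    LSTM(50, input_shape=(window_size, num_features), return_sequences=True),
    LSTM(50),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```

A larger window_size lets each sample carry more past context at the cost of more computation per step and fewer usable samples near the start of the series.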

Solved – What’s the relationship between Linear Regression & Recurrent Neural Networks

I am a beginner in machine learning, and I have just studied linear regression: $$h(x) = \sum_{i=0}^n \theta_i x_i$$ By finding the values of $\theta$ that minimize the cost via gradient descent or the normal equation, we can obtain the equation for $h(x)$ to solve the problem, or make a hypothesis about the price of real estate, or classify the … Read more
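For reference, a small NumPy sketch of the two fitting routines the question names, run on made-up data (all names here are illustrative):

```python
import numpy as np

# Toy data: y is roughly linear in x, with x_0 = 1 as the bias term.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.uniform(0, 10, 100)])
y = 3.0 + 2.0 * X[:, 1] + rng.normal(0, 0.5, 100)

# Gradient descent on the squared-error cost J(theta) = (1/2m) ||X theta - y||^2.
theta = np.zeros(2)
lr, m = 0.01, len(y)
for _ in range(5000):
    theta -= lr * (X.T @ (X @ theta - y)) / m

# Normal equation: theta = (X^T X)^{-1} X^T y, solved in closed form.
theta_closed = np.linalg.solve(X.T @ X, X.T @ y)

print(theta, theta_closed)  # both should land close to [3, 2]
```

An RNN differs in that $h$ is a nonlinear function applied recurrently over a sequence, but each individual layer still computes a weighted sum of its inputs, much like $h(x)$ above, before a nonlinearity.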

Solved – How to prepare the input layer for recurrent neural network if there are many categorical variables

I am building a recurrent neural network (RNN). The feature set contains many categorical variables; some of them, like users and items, have many distinct levels. In this case, if I use one-hot encoding and concatenate these vectors into one big vector, the resulting vector will be extremely sparse. Is it fine to do this? I am not … Read more
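A common alternative to one giant sparse one-hot concatenation is to route each categorical variable through a learned embedding. A hedged sketch in Keras, with made-up cardinalities and layer sizes:

```python
from tensorflow.keras import layers, Model

# Hypothetical cardinalities and sequence length, for illustration.
n_users, n_items, seq_len = 10_000, 5_000, 20

user_in = layers.Input(shape=(seq_len,), dtype="int32")
item_in = layers.Input(shape=(seq_len,), dtype="int32")

# Embedding layers learn dense vectors per category,
# avoiding a huge sparse one-hot concatenation.
user_vec = layers.Embedding(n_users, 32)(user_in)
item_vec = layers.Embedding(n_items, 32)(item_in)

x = layers.Concatenate()([user_vec, item_vec])   # (batch, seq_len, 64)
x = layers.LSTM(64)(x)
out = layers.Dense(1, activation="sigmoid")(x)

model = Model([user_in, item_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```

The embedding dimensions here (32) are arbitrary; the point is that each ID becomes a short dense vector instead of a one-hot column the width of its vocabulary.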

Solved – Loss functions that act on real-valued output vectors (and NOT just on 1-hot vectors)

I am trying to modify Andrej Karpathy's char-RNN code. As far as I understand, the loss function used in his code for an LSTM is the Softmax function (in the file model/LSTM.lua). I understand Softmax is the multi-class equivalent of the logistic loss function (used for 2-class classification). The site here says that … Read more
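For real-valued target vectors, a loss such as mean squared error applies directly, with no 1-hot assumption. A tiny NumPy comparison (Python rather than the Lua/Torch of char-RNN, purely for illustration):

```python
import numpy as np

def softmax_cross_entropy(logits, onehot):
    # Standard loss for 1-hot class targets: -sum(t * log softmax(z)).
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -(onehot * np.log(p)).sum()

def mse(pred, target):
    # Acts on arbitrary real-valued vectors; no 1-hot assumption.
    return ((pred - target) ** 2).mean()

logits = np.array([2.0, 0.5, -1.0])
print(softmax_cross_entropy(logits, np.array([1.0, 0.0, 0.0])))
print(mse(np.array([0.9, 0.1, -0.2]), np.array([1.0, 0.0, 0.0])))
```

With MSE the output layer is typically linear rather than softmax, since the targets are no longer a probability distribution over classes.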

Solved – Role of delays in LSTM networks

An LSTM network is assumed to be about memory, keeping the important information for predictions. If that is the case, why do we need to consider delayed inputs as well? My assumption would be that the LSTM, if the model is sufficiently complex, should somehow remember the most recent inputs if they are relevant. (A similar trick … Read more
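For concreteness, "delayed inputs" usually means handing the network explicit lag features rather than relying on its state to retain them. A small sketch (the add_lags helper is hypothetical):

```python
import numpy as np

# Toy univariate series, to illustrate explicit delay (lag) features.
x = np.arange(20, dtype=float)

def add_lags(series, n_lags):
    # Row t holds [x_t, x_{t-1}, ..., x_{t-n_lags}]; the earliest rows,
    # which lack a full set of lags, are dropped.
    cols = [series[n_lags - k : len(series) - k] for k in range(n_lags + 1)]
    return np.column_stack(cols)

print(add_lags(x, 3)[:2])
# [[3. 2. 1. 0.]
#  [4. 3. 2. 1.]]
```

An LSTM can in principle learn to carry this information in its cell state, but providing lags explicitly spares it from having to learn that behaviour from data.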

Solved – RNNs for Sparse Time Series Data

I have time series data that looks something like this:

Series 1
╔══════╦════════╦════════╗
║ Time ║ Value1 ║ Value2 ║
╠══════╬════════╬════════╣
║ 3:30 ║ 10     ║ 100    ║
║ 3:31 ║ 11     ║ …      ║

Series 2
╔══════╦════════╦════════╦════════╗
║ Time ║ Value1 ║ Value2 ║ Value3 ║
╠══════╬════════╬════════╬════════╣
║ 3:32 ║ 12     ║ 56     ║ 34     ║

… Read more
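One common workaround, sketched below under the simplifying assumption that both series share the same two features: pad the irregularly sampled series to a common length and let a Keras Masking layer skip the padded steps. PAD and the pad helper are my own illustrative names.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dense

# Pad variable-length series to one grid and mark missing steps
# with a sentinel value that the Masking layer will skip.
PAD = -999.0
series1 = np.array([[10, 100], [11, 102]], dtype=float)       # 2 steps, 2 features
series2 = np.array([[12, 56], [13, 57], [14, 58]], dtype=float)  # 3 steps, 2 features

max_len = 3
def pad(s):
    out = np.full((max_len, s.shape[1]), PAD)
    out[: len(s)] = s
    return out

X = np.stack([pad(series1), pad(series2)])    # shape (2, 3, 2)
y = np.array([0.0, 1.0])

model = Sequential([
    Masking(mask_value=PAD, input_shape=(max_len, 2)),
    LSTM(16),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, verbose=0)
```

If the series also differ in which features they carry, as in the tables above, a further option is to add the missing columns filled with the same sentinel before padding.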
