LSTM equations with minibatches


I'm looking at the code behind a Keras LSTM, and I noticed something I find odd.



Suppose we're feeding in input of size (batch_size, time_steps, input_dim)
where:




  • batch_size is the number of examples in the minibatch,


  • time_steps is the number of time steps to look back (i.e., the window size),


  • input_dim is the number of input variables.

Suppose we want to input minibatches of size batch_size=5, time_steps=2, and input_dim=1.



That is, we have a univariate time series with five examples per minibatch and we use the previous two values of the time series to predict the next value.
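As a concrete sketch (NumPy only; the series values here are arbitrary, just for illustration), such a minibatch can be built from a univariate series with a sliding window:

```python
import numpy as np

# A toy univariate series; the actual values don't matter for the shapes.
series = np.arange(10, dtype=np.float32)  # shape (10,)

batch_size, time_steps, input_dim = 5, 2, 1

# Each example is the previous `time_steps` values; each target is the next value.
X = np.stack([series[t:t + time_steps] for t in range(batch_size)])
y = series[time_steps:time_steps + batch_size]

# Add the trailing feature axis to get (batch_size, time_steps, input_dim).
X = X.reshape(batch_size, time_steps, input_dim)
print(X.shape)  # (5, 2, 1)
print(y.shape)  # (5,)
```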



Say we want to accomplish this using an LSTM of size 3 (that is, 3 is both the number of hidden units and the dimensionality of the output). Call this size units.



The LSTM is built in lines 1871-1995 of the Keras recurrent-layer source. The equation I'm confused about is on lines 1989-1990, which corresponds to the calculation of the input gate:



i = self.recurrent_activation(x_i + K.dot(h_tm1_i, self.recurrent_kernel_i))


The variable x_i is calculated as:



x_i = K.dot(inputs_i, self.kernel_i)


where




  • inputs_i = the input passed to the input-gate computation, which has shape (batch_size, time_steps, input_dim); in our case, (5, 2, 1),


  • self.kernel_i = the input-gate weight matrix multiplied by the current input, which has shape (input_dim, units); in our case, (1, 3).

I understand this dot product is accomplished via broadcasting, and the final shape is (batch_size, time_steps, units), which in our case is (5, 2, 3).
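This shape can be checked with plain NumPy; the einsum below is a stand-in for what K.dot computes when given a 3-D left operand and a 2-D right operand (contracting the last axis of the first against the first axis of the second):

```python
import numpy as np

batch_size, time_steps, input_dim, units = 5, 2, 1, 3

inputs_i = np.ones((batch_size, time_steps, input_dim))
kernel_i = np.ones((input_dim, units))

# Contract input_dim (d) of `inputs_i` against the first axis of `kernel_i`,
# keeping the batch (b) and time (t) axes: result is (batch, time, units).
x_i = np.einsum('btd,du->btu', inputs_i, kernel_i)
print(x_i.shape)  # (5, 2, 3)
```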



Now let's examine the dot product:



K.dot(h_tm1_i, self.recurrent_kernel_i)


where



  • h_tm1_i = the previous hidden state (h at time t-1) used for the input gate, which has shape (batch_size, units) according to line 295; in our case, (5, 3),


  • self.recurrent_kernel_i = the recurrent weight matrix multiplied by the previous hidden state, which has shape (units, units); in our case, (3, 3).


This is an ordinary matrix product of shapes (5, 3) and (3, 3), so the result has shape (5, 3).
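A quick NumPy check confirms the (5, 3) result shape:

```python
import numpy as np

h_tm1_i = np.ones((5, 3))             # (batch_size, units)
recurrent_kernel_i = np.ones((3, 3))  # (units, units)

# An ordinary 2-D matrix multiplication: (5, 3) @ (3, 3) -> (5, 3).
out = h_tm1_i @ recurrent_kernel_i
print(out.shape)  # (5, 3)
```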



We're left with needing to add x_i and K.dot(h_tm1_i, self.recurrent_kernel_i), which have shapes (5, 2, 3) and (5, 3) respectively. When I try to do that myself in TensorFlow, I get an error:



ValueError: Dimensions must be equal, but are 2 and 5 for 'add_1' (op: 'Add') with input shapes: [5,2,3], [5,3].


Clearly I've done something wrong somewhere, but I can't see my logic error. Can anyone help?



EDIT: To reproduce the error:



>>> import tensorflow as tf
>>> import keras
>>> from keras import backend as K
>>> inputs_i = tf.ones([5, 2, 1])
>>> kernel_i = tf.ones([1,3])
>>> h_tm1_i = tf.ones([5,3])
>>> rec_i = tf.ones([3,3])
>>> x_i = K.dot(inputs_i, kernel_i)
>>> x_i
<tf.Tensor 'Reshape_9:0' shape=(5, 2, 3) dtype=float32>
>>> K.dot(h_tm1_i, rec_i)
<tf.Tensor 'MatMul_4:0' shape=(5, 3) dtype=float32>
>>> x_i + K.dot(h_tm1_i, rec_i)  # Raises ValueError









Tags: python, neural-network, deep-learning, tensorflow, lstm





