Why does the classic Neural Network perform better than LSTM in Sentiment Analysis

My goal is to predict the polarity of some reviews (negative, positive or neutral). I tried two different neural networks:

 # Imports assumed: standalone Keras, NumPy and pandas
 import numpy as np
 import pandas as pd
 from keras.layers import Input, Dense, Concatenate, Embedding, LSTM, Dropout
 from keras.models import Model

 # Model 1: two dense branches merged before the softmax layer
 left_branch = Input((7000,))
 left_branch_dense = Dense(512, activation='relu')(left_branch)

 right_branch = Input((14012,))
 right_branch_dense = Dense(512, activation='relu')(right_branch)
 merged = Concatenate()([left_branch_dense, right_branch_dense])
 output_layer = Dense(3, activation='softmax')(merged)

 model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
 model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
 model.fit([np.array(review_matrix), np.array(X_train)], labels, epochs=2, verbose=1)
 model.save('model.merged')

 #############################################################################

 # Model 2: merge the two inputs in a different way (embedding + LSTM). Accuracy: 70

 # Prepare the review column for embedding:
 review_matrix_for_embedding = prepare_for_encoding(train_set[4].tolist(), 7000)  # Shape: (1503, 100)

 second_matrix = np.array(pd.concat([onehot_category, aspect_matrix], axis=1))

 left_branch = Input(shape=(100,), dtype='int32')
 # input_dim: Size of maximum integer (7001 here); output_dim: Size of embedded vector;
 # input_length: Size of the array
 left_branch_embedding = Embedding(7000, 300, input_length=100)(left_branch)
 lstm_out = LSTM(256)(left_branch_embedding)
 lstm_out = Dropout(0.7)(lstm_out)
 lstm_out = Dense(128, activation='sigmoid')(lstm_out)

 right_branch = Input((7012,))
 merged = Concatenate()([lstm_out, right_branch])
 output_layer = Dense(3, activation='softmax')(merged)

 model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
 model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
 model.fit([review_matrix_for_embedding, second_matrix], labels, epochs=5, verbose=1)

The first one reaches 80% accuracy, while the second one, with embedding vectors and an LSTM layer, only reaches 70%. How is that possible? Is there anything wrong with my architecture?

edited Mar 11 at 18:16

Nischal Hp

48829

asked Mar 11 at 14:49

nolw38

keras nlp

2 Answers

First of all, I noticed that you used a sigmoid activation for the Dense layer after your LSTM, whereas your plain ANN uses ReLU. Maybe, and only maybe, this is one reason for the lower performance, because sigmoid functions suffer from two problems:
- Saturation of gradients: the sigmoid is nearly flat at both tails, so in those regions the gradient is practically zero, which stalls backpropagation and slows training.
- Sigmoid outputs are not zero-centered: this matters for the gradient calculation during backpropagation. If all inputs to a neuron are positive, the gradients on its weights are either all positive or all negative, which adds a 'zigzag' effect to the weight updates and makes training harder.
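To make the two sigmoid issues above concrete, here is a minimal sketch in plain Python. The sample inputs and the `upstream` gradient value are made up for illustration; they do not come from the models in the question.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Saturation: the gradient peaks at 0.25 for x = 0 and collapses in both tails.
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x = {x:6.1f}  sigmoid = {sigmoid(x):.5f}  gradient = {sigmoid_grad(x):.5f}")

# Not zero-centered: a neuron fed by sigmoid outputs only ever sees positive inputs.
# For z = sum(w_i * a_i), dL/dw_i = upstream * a_i, so every weight gradient
# takes the sign of `upstream`.
activations = [sigmoid(v) for v in (-3.0, -0.5, 0.0, 1.5)]
upstream = -0.7  # hypothetical dL/dz arriving from the next layer
weight_grads = [upstream * a for a in activations]
print("all weight gradients share one sign:", all(g < 0 for g in weight_grads))
```

Because every weight gradient has the same sign, a single update can only move all of the neuron's weights up together or down together, which is where the zigzag comes from.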

My comments above are taken from this excellent tutorial: http://cs231n.github.io/neural-networks-1/
I tried to summarize it, but the authors did a masterful job and I think you should read the original.

Second, we must consider other factors: are you analyzing your train/validation/test losses? Maybe your LSTM network just takes longer to train and reach its minimum. You need to work a little more on these parameters before drawing any conclusions. Plot your training and validation loss so we can assess whether your model is underfitting.
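As a minimal sketch of that check: Keras' `model.fit` returns a `History` object whose `.history` attribute is a dict of per-epoch values, and it includes `val_loss` when you pass `validation_split` or `validation_data`. The helper names `summarize_history` and `plot_history` are hypothetical, the numbers in `example` are made up, and the plotting part assumes matplotlib is installed.

```python
def summarize_history(history):
    """Summarize per-epoch losses from a dict shaped like Keras' History.history."""
    loss, val_loss = history["loss"], history["val_loss"]
    best = min(range(len(val_loss)), key=lambda i: val_loss[i])
    return {
        "best_epoch": best + 1,                          # 1-based epoch with lowest val loss
        "best_val_loss": val_loss[best],
        "final_gap": val_loss[-1] - loss[-1],            # large positive gap hints at overfitting
        "val_still_improving": best == len(val_loss) - 1,  # True hints at too few epochs
    }


def plot_history(history):
    """Plot training and validation loss per epoch (matplotlib assumed available)."""
    import matplotlib.pyplot as plt

    epochs = range(1, len(history["loss"]) + 1)
    plt.plot(epochs, history["loss"], label="training loss")
    plt.plot(epochs, history["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()


# Made-up numbers for illustration only, not results from the question.
example = {
    "loss": [1.0, 0.8, 0.6, 0.5, 0.4],
    "val_loss": [1.1, 0.9, 0.8, 0.85, 0.9],
}
summary = summarize_history(example)
print(summary)
```

With a real model you would call something like `history = model.fit(..., validation_split=0.2)` and pass `history.history` to either helper. If the validation loss is still falling at the last epoch, training for more epochs is worth trying before comparing the two architectures.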

Lastly, I have a question for you: why should your LSTM network perform better than a simple ANN? Although LSTMs with embeddings are powerful techniques that have gained attention in many fields, especially NLP, they will not beat classical approaches on every task. I have tried them myself on different data sets and, depending on the application, simple ML algorithms such as SVM can easily beat the more complex ones, including in sentiment analysis.

So to conclude, try these things and let us know your results. Also, if anyone disagrees with my answer, I would like to discuss it. I hope it helps.

answered Mar 11 at 20:58

Victor Oliveira

3657


Thank you for your answer.
I replaced sigmoid with ReLU, and the result is the same. Anyway, I will keep ReLU from now on!

Here are the training plots for both models: the one that reaches 79% accuracy (two merged classic neural networks) and the one that reaches 70%. (Of course, I am talking about test accuracy, not training accuracy.)
I see a difference in the loss values, but I don't know how to interpret it. Does this mean that the weaker architecture does not reach a minimum? If so, what can I modify in my network?

Thank you for the time you took to answer!

Edit: Where can I modify parameters like the learning rate in the LSTM part of the neural network?

 left_branch = Input(shape=(100,), dtype='int32')
 # input_dim: Size of maximum integer (7001 here); output dim: Size of embedded vector; 
 # input_length: Size of the array
 left_branch_embedding = Embedding(7000, 300, input_length=100)(left_branch)
 lstm_out = LSTM(256)(left_branch_embedding)
 lstm_out = Dropout(0.7)(lstm_out)
 lstm_out = Dense(128, activation='sigmoid')(lstm_out)

 right_branch = Input((7012, ))
 merged = Concatenate()([lstm_out, right_branch])
 output_layer = Dense(3, activation = 'softmax')(merged)

 model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
 model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

answered Mar 12 at 10:21

nolw38
