Why does the classic Neural Network perform better than LSTM in Sentiment Analysis

$begingroup$


My goal is to predict the polarity of some reviews (negative, positive or neutral). I tried two different neural networks:



import numpy as np
import pandas as pd
from keras.layers import Input, Dense, Concatenate, Embedding, LSTM, Dropout
from keras.models import Model

# First model: two dense branches merged together. Accuracy: 80
# (review_matrix, X_train and labels are built earlier, not shown)
left_branch = Input((7000, ))
left_branch_dense = Dense(512, activation='relu')(left_branch)

right_branch = Input((14012, ))
right_branch_dense = Dense(512, activation='relu')(right_branch)
merged = Concatenate()([left_branch_dense, right_branch_dense])
output_layer = Dense(3, activation='softmax')(merged)

model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([np.array(review_matrix), np.array(X_train)], labels, epochs=2, verbose=1)
model.save('model.merged')

#############################################################################

# Second model: we try to merge two different models in a different way. Accuracy: 70

# Prepare the review column for embedding:
review_matrix_for_embedding = prepare_for_encoding(train_set[4].tolist(), 7000)  # Shape: (1503, 100)

second_matrix = np.array(pd.concat([onehot_category, aspect_matrix], axis=1))

left_branch = Input(shape=(100,), dtype='int32')
# input_dim: size of maximum integer (7001 here); output_dim: size of embedded vector;
# input_length: size of the array
left_branch_embedding = Embedding(7000, 300, input_length=100)(left_branch)
lstm_out = LSTM(256)(left_branch_embedding)
lstm_out = Dropout(0.7)(lstm_out)
lstm_out = Dense(128, activation='sigmoid')(lstm_out)

right_branch = Input((7012, ))
merged = Concatenate()([lstm_out, right_branch])
output_layer = Dense(3, activation='softmax')(merged)

model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit([review_matrix_for_embedding, second_matrix], labels, epochs=5, verbose=1)


The first one reaches 80% accuracy, while the second one, which uses embedding vectors and an LSTM layer, reaches only 70%. How is that possible? Is there anything wrong with my architecture?





















$endgroup$











      keras nlp














edited Mar 11 at 18:16 by Nischal Hp

asked Mar 11 at 14:49 by nolw38





2 Answers







































                  $begingroup$

First of all, I noticed that you used a sigmoid activation for the Dense layer after your LSTM, whereas your plain ANN uses ReLU. Maybe, MAYBE, this is one reason for the lower performance. Sigmoid functions suffer from two problems:
- Saturation of gradients: the sigmoid flattens out at both tails, so in these 'flat' regions the gradient shrinks to practically zero, which stalls backpropagation and the training process.
- Sigmoid outputs are not zero-centered: this matters for the gradient calculation during backpropagation. When all inputs to a neuron are positive, the gradients on its weights are either all positive or all negative, which adds a 'zigzag' effect to the weight updates and makes training harder.
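
To make the two points above concrete, here is a small NumPy sketch. It is only an illustration with hand-picked input values, not something taken from the question's model: it prints the sigmoid output and its gradient at a few points and compares them with the ReLU gradient.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25 (reached at x = 0)
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

# Hand-picked pre-activation values, chosen only for illustration
x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])

print("sigmoid output:", sigmoid(x))       # always positive, so never zero-centered
print("sigmoid grad:  ", sigmoid_grad(x))  # nearly zero at both tails (saturation)
print("relu grad:     ", relu_grad(x))     # stays at 1 for every positive input
```

The sigmoid gradient never exceeds 0.25 and collapses towards zero for large positive or negative inputs, while the ReLU gradient stays at 1 wherever the unit is active.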



My comments above are taken from this excellent tutorial: http://cs231n.github.io/neural-networks-1/
I tried to summarize it, but the authors did a masterful job and I think you should read it.



Second, we must consider other factors: are you analyzing your training, validation and test losses? Maybe your LSTM network just takes longer to train and reach its minimum. You need to work a little more on these parameters before drawing any conclusions. Plot your training and validation loss so we can assess whether your model is underfitting.
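
As a rough sketch of that check, the per-epoch losses can also be inspected numerically. The function below is hypothetical (its name, the `tail` window and the `tol` threshold are my own assumptions). It expects a dict shaped like the `history.history` attribute that Keras `model.fit` returns when validation data is supplied, and the numbers in the two example dicts are made up for illustration, not taken from the question.

```python
def summarize_losses(history, tail=3, tol=0.01):
    """Summarize per-epoch losses from a dict with 'loss' and 'val_loss' lists.

    tail and tol are arbitrary illustrative choices, not recommended values.
    """
    train = history["loss"]
    val = history["val_loss"]
    tail = min(tail, len(train) - 1)

    # How much the training loss dropped over the last `tail` epochs
    recent_drop = train[-1 - tail] - train[-1]

    # Epoch (0-based) with the lowest validation loss
    best_val_epoch = min(range(len(val)), key=val.__getitem__)

    return {
        # Training loss still falling: the model may simply need more epochs
        "still_improving": recent_drop > tol,
        # 1-based epoch at which the validation loss was lowest
        "best_val_epoch": best_val_epoch + 1,
        # Validation loss has risen since its best epoch: a sign of overfitting
        "val_rising_since_best": val[-1] - val[best_val_epoch] > tol,
    }

# Made-up curves: both losses still falling after 5 epochs (more training may help)
slow_learner = {"loss": [1.05, 0.95, 0.86, 0.78, 0.71],
                "val_loss": [1.02, 0.96, 0.90, 0.85, 0.81]}

# Made-up curves: training loss keeps falling while validation loss rises again
overfitter = {"loss": [0.60, 0.30, 0.15, 0.08, 0.05],
              "val_loss": [0.55, 0.50, 0.52, 0.58, 0.65]}

print(summarize_losses(slow_learner))
print(summarize_losses(overfitter))
```

If both losses are still falling at the last epoch, training for more than 5 epochs is the first thing to try. If the validation loss bottoms out early and then climbs, the problem is more likely overfitting than a lack of training.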



Lastly, I have a question for you: why should your LSTM network perform better than a simple ANN? Although LSTMs with embeddings are powerful techniques that have gained attention in many fields, especially NLP, they will not beat classical approaches on every task. I have tried them on different data sets myself and, depending on the application, simple ML algorithms such as an SVM easily beat the more complex models, including on sentiment analysis.
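
For reference, a classical baseline of that kind takes only a few lines with scikit-learn. This is a minimal sketch under my own assumptions: the toy reviews and labels are invented for illustration, and TF-IDF features with a linear SVM are just one common choice, not necessarily the best one for your data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy data, only to show the shape of the inputs
texts = [
    "great battery and a lovely screen",
    "terrible service, very disappointed",
    "the parcel arrived on tuesday",
    "excellent food, great staff",
    "awful quality, it broke after a day",
    "the store opens at nine",
]
labels = ["positive", "negative", "neutral",
          "positive", "negative", "neutral"]

# TF-IDF over unigrams and bigrams, followed by a linear SVM
baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
baseline.fit(texts, labels)

preds = baseline.predict(["the battery is great", "it was delivered on monday"])
print(preds)
```

Evaluating such a baseline on the same test split as the neural networks gives a reference point for judging whether the extra complexity of the LSTM pays off.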



So, to conclude, try these things and let us know your results. Also, if anyone disagrees with my answer, I would be glad to discuss it. I hope it helps.















                  $endgroup$













answered Mar 11 at 20:58 by Victor Oliveira

































                      $begingroup$

Thank you for your answer.
I replaced sigmoid with ReLU and the result is the same. Anyway, I will keep ReLU from now on!

Here are the training plots for both models: the one that reaches 79% accuracy (two merged classic neural networks) and the one that reaches 70%. (Of course, I am talking about the test accuracy, not the training accuracy.)
I see a difference in the loss values, but I don't know how to interpret it. Does it mean that the weaker architecture has not reached a minimum? If so, what can I modify in my NN?



                      Thank you for the time you took to answer!!



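[Figure: training plot of the model reaching 70% accuracy]

[Figure: training plot of the model reaching 79% accuracy]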



Edit: Where can I modify parameters such as the learning rate in the LSTM part of the neural network?



left_branch = Input(shape=(100,), dtype='int32')
# input_dim: size of maximum integer (7001 here); output_dim: size of embedded vector;
# input_length: size of the array
left_branch_embedding = Embedding(7000, 300, input_length=100)(left_branch)
lstm_out = LSTM(256)(left_branch_embedding)
lstm_out = Dropout(0.7)(lstm_out)
lstm_out = Dense(128, activation='sigmoid')(lstm_out)

right_branch = Input((7012, ))
merged = Concatenate()([lstm_out, right_branch])
output_layer = Dense(3, activation='softmax')(merged)

model = Model(inputs=[left_branch, right_branch], outputs=output_layer)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])














                      $endgroup$













































                          answered Mar 12 at 10:21









                          nolw38































































































