Why can decision trees have a high amount of variance?



I've heard that decision trees can have high variance: for a data set $D$ split into training and test sets, the fitted tree can differ considerably depending on how the data was split. Apparently, this is the motivation for algorithms such as Random Forest. Is this correct? Why does a decision tree suffer from high variance?







      machine-learning classification decision-trees training variance






asked 33 mins ago by baxx
edited 10 mins ago by Media

          1 Answer
The point is that if your training data contains no two examples with identical input features but different labels (so the Bayes error on the training set is $0$), a fully grown decision tree can fit it perfectly, memorizing every example. That can lead to overfitting, which shows up as high variance: a small change in the training data can produce a very different tree. This is why people usually prune the tree, choosing the amount of pruning by cross-validation, to keep it from overfitting the training data.
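To make the memorization point concrete, here is a minimal sketch in plain Python, not a reference implementation. The helpers `grow`, `predict`, `accuracy` and `disagreement`, the 1-D toy data and the 20% label noise are all assumptions made up for the illustration. It grows a tree until every leaf is pure and compares it with a depth-1 tree, then fits each kind of tree on two different random training subsets to see how much the two fits disagree.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(points, max_depth=None):
    """points: list of (x, label). Returns a nested-tuple tree (hypothetical format)."""
    labels = [y for _, y in points]
    xs = sorted({x for x, _ in points})
    if len(set(labels)) == 1 or len(xs) == 1 or max_depth == 0:
        return ("leaf", Counter(labels).most_common(1)[0][0])
    best = None
    for t in xs[:-1]:  # candidate split "x <= t"; both sides are non-empty
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        score = len(left) * gini(left) + len(right) * gini(right)
        if best is None or score < best[0]:
            best = (score, t)
    t = best[1]
    depth = None if max_depth is None else max_depth - 1
    return ("node", t,
            grow([p for p in points if p[0] <= t], depth),
            grow([p for p in points if p[0] > t], depth))

def predict(tree, x):
    while tree[0] == "node":
        tree = tree[2] if x <= tree[1] else tree[3]
    return tree[1]

def accuracy(tree, points):
    return sum(predict(tree, x) == y for x, y in points) / len(points)

rng = random.Random(0)
xs = [i / 100 for i in range(100)]  # distinct inputs: no conflicting duplicates
data = [(x, int(x > 0.5) ^ (rng.random() < 0.2)) for x in xs]  # assumed 20% label noise
rng.shuffle(data)
train, test = data[:60], data[60:]

full_tree = grow(train)            # grown until every leaf is pure
stump = grow(train, max_depth=1)   # heavily restricted tree
print("full tree train/test:", accuracy(full_tree, train), accuracy(full_tree, test))
print("depth-1   train/test:", accuracy(stump, train), accuracy(stump, test))

def disagreement(max_depth):
    """Fit on two random 60-point subsets; fraction of inputs where the fits differ."""
    a = grow(rng.sample(data, 60), max_depth)
    b = grow(rng.sample(data, 60), max_depth)
    return sum(predict(a, x) != predict(b, x) for x in xs) / len(xs)

print("disagreement, full trees:", disagreement(None))
print("disagreement, depth-1   :", disagreement(1))
```

The full tree always reaches a training accuracy of 1.0 here because the inputs are distinct. How much the two fits disagree depends on the random subsets, so treat the printed numbers as one draw rather than a general result.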



Decision trees are powerful classifiers. Ensemble methods such as bagging combine many powerful classifiers so that the combined classifier does not have high variance. One ingredient is bootstrapping: each decision tree is trained on examples drawn at random from the training data with replacement, meaning each drawn example is put back before the next draw. The other, used by Random Forest on top of bootstrapping, is to ignore some features and consider only a random subset of them at each split, which makes the trees less correlated with one another so that combining them generalizes better.
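The two ingredients above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not how any particular library implements Random Forest: `grow`, `bagged_forest`, `vote`, the two-feature synthetic data and the choice of 25 trees are all made up for the example.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(rows, rng, max_features=None):
    """rows: list of (feature_tuple, label). Fully grown tree as nested tuples."""
    labels = [y for _, y in rows]
    if len(set(labels)) == 1:
        return ("leaf", labels[0])
    feats = list(range(len(rows[0][0])))
    if max_features is not None:
        feats = rng.sample(feats, max_features)  # random feature subset per split
    best = None
    for j in feats:
        values = sorted({x[j] for x, _ in rows})
        for t in values[:-1]:  # split "x[j] <= t"; both sides are non-empty
            left = [y for x, y in rows if x[j] <= t]
            right = [y for x, y in rows if x[j] > t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # sampled features cannot split this node: majority leaf
        return ("leaf", Counter(labels).most_common(1)[0][0])
    _, j, t = best
    return ("node", j, t,
            grow([r for r in rows if r[0][j] <= t], rng, max_features),
            grow([r for r in rows if r[0][j] > t], rng, max_features))

def predict(tree, x):
    while tree[0] == "node":
        _, j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree[1]

def bagged_forest(rows, n_trees, rng, max_features=None):
    forest = []
    for _ in range(n_trees):
        boot = [rng.choice(rows) for _ in rows]  # bootstrap: draw with replacement
        forest.append(grow(boot, rng, max_features))
    return forest

def vote(forest, x):
    return Counter(predict(t, x) for t in forest).most_common(1)[0][0]

rng = random.Random(1)

def make(n):
    """Synthetic data (assumption): label is x0 + x1 > 1, flipped 15% of the time."""
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] + x[1] > 1.0)
        if rng.random() < 0.15:
            y = 1 - y
        rows.append((x, y))
    return rows

train, test = make(80), make(200)
single = grow(train, rng)
forest = bagged_forest(train, 25, rng, max_features=1)
acc_single = sum(predict(single, x) == y for x, y in test) / len(test)
acc_forest = sum(vote(forest, x) == y for x, y in test) / len(test)
print("single tree test accuracy:", acc_single)
print("bagged forest test accuracy:", acc_forest)
```

With `max_features=None` the same code is plain bagging; setting it to a number smaller than the feature count adds the Random Forest style feature sampling.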



The reason that decision trees can overfit is their VC dimension. For a tree of bounded size it is not infinite, unlike that of 1-NN, but it is very large, which leads to overfitting. In practice this means you have to provide a large amount of data in order not to overfit. To understand the VC dimension of decision trees, take a look at the question "Are decision tree algorithms linear or nonlinear".
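As a rough illustration of how large that capacity is, the sketch below checks that a tree allowed as many leaves as there are points can reproduce every possible labelling of 8 distinct 1-D points. It is a toy under assumptions: `grow` uses a deliberately simple rule (split at the first label change) rather than an impurity criterion, and all names are hypothetical.

```python
from itertools import product

def grow(points):
    """points: (x, label) pairs sorted by distinct x. Splits at the first label change."""
    labels = [y for _, y in points]
    for i in range(1, len(points)):
        if labels[i] != labels[0]:
            t = points[i - 1][0]  # threshold: last x of the first run of equal labels
            return ("node", t, ("leaf", labels[0]), grow(points[i:]))
    return ("leaf", labels[0])

def predict(tree, x):
    while tree[0] == "node":
        tree = tree[2] if x <= tree[1] else tree[3]
    return tree[1]

def n_leaves(tree):
    return 1 if tree[0] == "leaf" else n_leaves(tree[2]) + n_leaves(tree[3])

n = 8
xs = [float(i) for i in range(n)]
fitted_all = True
max_leaves = 0
for labelling in product([0, 1], repeat=n):  # all 2**n labellings of the points
    points = list(zip(xs, labelling))
    tree = grow(points)
    fitted_all &= all(predict(tree, x) == y for x, y in points)
    max_leaves = max(max_leaves, n_leaves(tree))

print(fitted_all, max_leaves)  # prints: True 8
```

Every labelling is fitted exactly, and the alternating labelling needs all 8 leaves. So in this 1-D setting a tree with $n$ leaves can shatter $n$ points, and the capacity keeps growing as the tree is allowed to grow, which is why more data (or a size limit) is needed to keep it from overfitting.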







answered 11 mins ago by Media











• "the same input features with different labels which leads to 0 Bayes error", I'm not sure what you mean by this. – baxx, 2 mins ago















