


Clustering with groups in data related to cluster label



I want to predict which device was used in which room. For this I have device data and sensor data.

My idea was to create a feature vector like this:



-----------------------------------------------------------------------
Data-Vector: | u_1 u_2 u_3 | x_1 ... x_7 | y_1 ... y_12 | z_1 ... z_4 |
-----------------------------------------------------------------------
Categories:  | device_data | room 1 data | room 2 data  | room 3 data |
-----------------------------------------------------------------------
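
To make the layout concrete, here is a minimal sketch of how such a vector could be assembled and split back into its blocks. The block sizes (3, 7, 12, 4) are read off the table above; the names `BLOCK_SIZES`, `block_slices` and `build_vector` are made up for illustration, and the values are placeholders, not real measurements.

```python
# Block sizes taken from the layout above: 3 device values, then 7, 12 and 4 room values.
BLOCK_SIZES = {"device": 3, "room1": 7, "room2": 12, "room3": 4}


def block_slices(sizes):
    """Map each block name to the slice it occupies in the flat feature vector."""
    slices, start = {}, 0
    for name, size in sizes.items():
        slices[name] = slice(start, start + size)
        start += size
    return slices


def build_vector(blocks, sizes=BLOCK_SIZES):
    """Concatenate the per-block value lists into one flat feature vector."""
    vector = []
    for name, size in sizes.items():
        values = blocks[name]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(float(v) for v in values)
    return vector


SLICES = block_slices(BLOCK_SIZES)

# Placeholder values only, not real measurements.
example = build_vector({
    "device": [0.0] * 3,
    "room1": [1.0] * 7,
    "room2": [2.0] * 12,
    "room3": [3.0] * 4,
})

# The room 2 block can be recovered from the flat vector via its slice.
room2_block = example[SLICES["room2"]]
```

Keeping the slices in one place means the information "which columns belong to which room" stays available after the blocks are flattened into a single vector.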


My device data contains, among other things:

+ timestamps of when the device was turned on/off

+ average power consumption and divergences



My room data contains, for example:

+ motion detector readings with timestamps

+ lamp states (turned on/off) with timestamps

+ weather data



For each feature vector I take the room data closest in time to the device's turn-on/off timestamp.
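
Picking the room reading closest to a switch event could look roughly like the sketch below. It assumes the readings of one sensor are stored as a sorted list of timestamps plus a parallel list of values. The function name `nearest_reading` and the sample numbers are hypothetical.

```python
from bisect import bisect_left


def nearest_reading(timestamps, values, t):
    """Return the value whose timestamp is closest to t.

    `timestamps` must be sorted ascending and aligned with `values`.
    Ties go to the earlier reading.
    """
    if not timestamps:
        raise ValueError("no readings available")
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    return values[i - 1] if t - before <= after - t else values[i]


# Hypothetical motion-sensor readings: seconds since some reference, and the value.
motion_times = [10.0, 20.0, 30.0]
motion_values = [0.0, 1.0, 0.0]

# Device switched on at t = 18 s: the closest reading is the one at t = 20 s.
closest = nearest_reading(motion_times, motion_values, 18.0)
```

A maximum allowed time gap may also be worth adding, so that a reading far away in time is not silently treated as "closest".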



All data points themselves are floats.



My idea was to use k-means for clustering.

My problems are:

1. When using k-means, how can I tell which cluster corresponds to which room label (room1, room2 or room3)?

2. I think it could help to somehow add the information about which sensor is in which room.
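
The first problem can be illustrated with a small sketch: the cluster numbers that k-means returns are arbitrary and depend on the initial centroids. The code below is a bare-bones Lloyd's algorithm written only for this illustration (it is not a library implementation), run on four made-up 2-D points.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up toy data: two obvious groups.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]

# Same data, same two groups, but the initial centroids are given in swapped order.
labels_a, _ = kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
labels_b, _ = kmeans(points, [[10.0, 10.0], [0.0, 0.0]])

print(labels_a)  # [0, 0, 1, 1]
print(labels_b)  # [1, 1, 0, 0]
```

Both runs find the same partition, but the index attached to each group is swapped. A cluster index therefore carries no room meaning by itself; any mapping to room1, room2 or room3 has to come from extra information.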



Can I manipulate the data in the k-means algorithm so that it only considers:

+ the device data and room 1 data for the first cluster, setting everything else to zero

+ the device data and room 2 data for the second cluster

+ and so on...

This way I could tell that cluster x corresponds to room x.

Or will this somehow break the k-means algorithm?
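
For reference, the masking idea described above could be written down as the following sketch. It only shows what the proposed assignment step would compute and is not a statement that the approach is sound. The column ranges come from the feature-vector layout; `masked_distance`, `assign` and the example vectors are hypothetical.

```python
import math

# Column ranges from the layout in the question.
DEVICE = slice(0, 3)
ROOMS = [slice(3, 10), slice(10, 22), slice(22, 26)]
N_FEATURES = 26


def masked_distance(x, centroid, room_index):
    """Euclidean distance using only the device block and one room block."""
    cols = [*range(N_FEATURES)[DEVICE], *range(N_FEATURES)[ROOMS[room_index]]]
    return math.sqrt(sum((x[c] - centroid[c]) ** 2 for c in cols))


def assign(x, centroids):
    """Assign x to the room cluster with the smallest masked distance."""
    distances = [masked_distance(x, c, k) for k, c in enumerate(centroids)]
    return distances.index(min(distances))


x = [0.0] * N_FEATURES
c0 = [0.0] * N_FEATURES
c0[3:10] = [1.0] * 7    # differs from x inside the room 1 block
c1 = [0.0] * N_FEATURES
c1[3:10] = [1.0] * 7    # same difference, but cluster 1 ignores the room 1 block
c2 = [0.0] * N_FEATURES
c2[22:26] = [2.0] * 4   # differs from x inside the room 3 block

print(assign(x, [c0, c1, c2]))  # 1
```

Note that each cluster then measures distance over a different number of columns (10, 15 and 7), so the distances are not directly comparable, and the procedure is no longer standard k-means, which uses the same distance over all features for every cluster.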







      machine-learning clustering k-means














asked Jun 11 '18 at 15:34 by GAV3, edited Jun 13 '18 at 14:35



























          1 Answer



If you have labels known a priori, then use the labels, not clustering.

Clustering is quite difficult and fragile, so always prefer a simpler solution if you can.

It's not very clear what you are trying to solve. Do you want to select columns to split the data into rooms? Then just select the columns.
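
As a rough illustration of "use the labels": if some observations did come with a known room, a very simple supervised method could be tried before any clustering. The sketch below uses a nearest-class-mean rule, which is just one possible choice and not something this answer prescribes. The training data and the names `fit_class_means` and `predict` are invented for the example.

```python
import math
from collections import defaultdict


def fit_class_means(vectors, labels):
    """Average the feature vectors of each known label."""
    groups = defaultdict(list)
    for v, lab in zip(vectors, labels):
        groups[lab].append(v)
    return {
        lab: [sum(col) / len(members) for col in zip(*members)]
        for lab, members in groups.items()
    }


def predict(vector, class_means):
    """Return the label whose mean vector is closest to `vector`."""
    return min(class_means, key=lambda lab: math.dist(vector, class_means[lab]))


# Hypothetical labelled observations (2 features each, for brevity).
train_x = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
train_y = ["room1", "room1", "room2", "room2"]

means = fit_class_means(train_x, train_y)
print(predict([4.8, 5.0], means))  # room2
```

Unlike a cluster index, the predicted value here is directly a room label, because the labels were part of the training data.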

















answered Jun 12 '18 at 6:06 by Anony-Mousse











• I have the power consumption of devices and some sensor data of rooms. I know which sensors are in which room, but I don't know anything about the devices. My goal is to predict which device was turned on in which room, based on the sensor and device data. Since I don't know anything about the devices or how the sensor data correlates, I cannot label the vectors and thus need clustering, right? – GAV3, Jun 12 '18 at 13:45










• What is the device data then? We don't have your data. We can only guess! You need to ask much more precise questions. – Anony-Mousse, Jun 12 '18 at 19:51










• OK, I added to my question above what my data contains and what exactly my problems and my ideas for fixing them are. – GAV3, Jun 13 '18 at 14:37















