How to group this dataframe in python?



I have this problem:

    import pandas as pd

    stripline = "----------------------------"

    rawData = {
        'order number': ['11xa', '11xa', '11xa', '21xb', '31xc'],
        'working area': ['LLA', 'LLE', 'LLS', 'MLA', 'MLE'],
        'time': ['1', '6', '13', '35', '24']
    }

    df = pd.DataFrame(rawData)
    print("original data:")
    print(df.head())

    print(stripline)

    rawData2 = {
        'order number': ['11xa', '21xb', '31xc'],
        'working area': ['LLS', 'MLA', 'MLE'],
        'time': ['20', '35', '24']
    }
    df2 = pd.DataFrame(rawData2)

    print("expected result:")
    print("group by order number, sum all times for that order and choose the working area with the biggest time")
    print(df2.head())

How can I transform my dataframe df to get df2?

I want to sum all values in the time column that belong to the same order number. For each order I want to keep the working area with the highest time, and, importantly, I want to keep the rest of the data. The new dataframe has three rows, the old one five.







Tags: python, pandas, dataframe

asked Oct 24 '18 at 11:01 by ScienceLover

2 Answers






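1) Convert the time column to integers (the sample data stores the times as strings):

    df['time'] = df['time'].astype(int)

2) For each row, find the working area with the maximum time within the same order:

    for index, row in df.iterrows():
        # Restrict the lookup to rows of the same order, so an equal time in another order cannot match.
        same_order = df[df['order number'] == row['order number']]
        df.at[index, 'max working area'] = same_order.loc[same_order['time'].idxmax(), 'working area']

3) Sum the time column per order:

    df2 = df.groupby(['order number', 'max working area'], as_index=False)['time'].sum()

Is this what you wanted?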
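The row-by-row loop in step 2 can probably be avoided. The following is a minimal sketch of one vectorized alternative, assuming the sample data from the question and integer times; the names `top_rows` and `result` are only illustrative. It keeps the whole row with the largest time per order, so any extra columns would be carried along, and then overwrites the time with the total for that order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'order number': ['11xa', '11xa', '11xa', '21xb', '31xc'],
    'working area': ['LLA', 'LLE', 'LLS', 'MLA', 'MLE'],
    'time': ['1', '6', '13', '35', '24'],
})
df['time'] = df['time'].astype(int)

# Row label of the largest time within each order.
top_rows = df.groupby('order number')['time'].idxmax()

# Keep those rows whole, so additional columns survive.
result = df.loc[top_rows].copy()

# Replace each kept row's time with the total time of its order.
order_totals = df.groupby('order number')['time'].transform('sum')
result['time'] = order_totals.loc[top_rows]
result = result.reset_index(drop=True)

print(result)
```

If two rows of an order tie for the largest time, `idxmax` keeps the first one, which may or may not be the behaviour you want.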







answered Oct 24 '18 at 12:14 by Pratham Solanki

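This line sums the time for each combination of order number and working area, after converting the time column to integers:

    df.astype({'time': int}).groupby(["order number", "working area"])['time'].sum()

Note that it keeps one row per working area, so for the sample data it returns five rows rather than the three in df2.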
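Grouping on both `order number` and `working area` produces one row per pair. A possible single-statement variant that groups on the order alone is sketched below, assuming integer times and that `working area` is the only column to carry over; `per_pair` and `result` are illustrative names. It sorts by time so that the last row of each group is the one with the largest time.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'order number': ['11xa', '11xa', '11xa', '21xb', '31xc'],
    'working area': ['LLA', 'LLE', 'LLS', 'MLA', 'MLE'],
    'time': ['1', '6', '13', '35', '24'],
})
df['time'] = df['time'].astype(int)

# Grouping on both columns: one row per (order, area) pair.
per_pair = df.groupby(['order number', 'working area'])['time'].sum()

# Grouping on the order only: sort by time first, then take the
# last working area (largest time) and the summed time per order.
result = (
    df.sort_values('time')
      .groupby('order number', as_index=False)
      .agg({'working area': 'last', 'time': 'sum'})
)

print(len(per_pair))
print(result)
```

Any further columns would have to be listed in the `agg` dictionary with their own rule, otherwise they are dropped.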






answered Oct 24 '18 at 11:57 by Kaustubh, edited Oct 24 '18 at 12:45 by n1k31t4
