How to group this dataframe in python?
I have this problem:
import pandas as pd
stripline = "----------------------------"
rawData = {
    'order number': ['11xa', '11xa', '11xa', '21xb', '31xc'],
    'working area': ['LLA', 'LLE', 'LLS', 'MLA', 'MLE'],
    'time': ['1', '6', '13', '35', '24']
}
df = pd.DataFrame(rawData)
print("original data:")
print(df.head())
print(stripline)
rawData2 = {
    'order number': ['11xa', '21xb', '31xc'],
    'working area': ['LLS', 'MLA', 'MLE'],
    'time': ['20', '35', '24']
}
df2 = pd.DataFrame(rawData2)
print("expected result:")
print("group by order number, sum all times for that order and choose the working area with the biggest time")
print(df2.head())
How can I manipulate my dataframe df to get df2?
I want to sum up all values in the time column that belong to the same order number. For each order I want to keep the working area with the highest time, and, importantly, I want to keep the rest of the data. The new dataframe has three orders, the old one five.
python pandas dataframe
asked Oct 24 '18 at 11:01
ScienceLover
2 Answers
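1) Convert time column to integer:
df['time'] = df['time'].astype(int)
2) Find working area with maximum time:
for index, row in df.iterrows():
    # Look only at rows of the same order, so an equal time in another order cannot match
    same_order = df[df['order number'] == row['order number']]
    df.at[index, 'max working area'] = same_order.loc[same_order['time'].idxmax(), 'working area']
3) Aggregate time column:
# Sum only the time column; the string columns are not meant to be added up
df2 = df.groupby(['order number', 'max working area'], as_index=False)['time'].sum()
Is this what you wanted?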
answered Oct 24 '18 at 12:14
Pratham Solanki
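The three steps above can also be written without the row loop. The following is only a sketch on the sample data from the question; the names `top_rows`, `totals` and `result` are made up for illustration, and it assumes every value in `time` parses as an integer.

```python
import pandas as pd

# Sample data copied from the question; the times are stored as strings.
df = pd.DataFrame({
    "order number": ["11xa", "11xa", "11xa", "21xb", "31xc"],
    "working area": ["LLA", "LLE", "LLS", "MLA", "MLE"],
    "time": ["1", "6", "13", "35", "24"],
})

# Step 1: convert the time column to integers so it can be summed and compared.
df["time"] = df["time"].astype(int)

# Step 2: for each order, pick the row whose time is largest.
# idxmax returns the index label of the first maximum within each group.
top_rows = df.loc[
    df.groupby("order number")["time"].idxmax(),
    ["order number", "working area"],
]

# Step 3: sum the times per order.
totals = df.groupby("order number", as_index=False)["time"].sum()

# Combine both: one row per order with its chosen working area and total time.
result = top_rows.merge(totals, on="order number").reset_index(drop=True)
print(result)
```

Here `time` stays an integer column. If it has to be a string column again, as in the expected `df2`, `result["time"].astype(str)` would be one way to convert it back.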
This line of code should do it for you:
df.groupby(["order number", "working area"])['time'].agg('sum')
edited Oct 24 '18 at 12:45
n1k31t4
answered Oct 24 '18 at 11:57
Kaustubh
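For comparison, here is a small sketch on the question's sample data of what the choice of grouping keys changes. It assumes the `time` column is converted to integers first, which the one-liner above does not state, and the names `by_both` and `by_order` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "order number": ["11xa", "11xa", "11xa", "21xb", "31xc"],
    "working area": ["LLA", "LLE", "LLS", "MLA", "MLE"],
    "time": ["1", "6", "13", "35", "24"],
})

# Assumption: times are converted to integers; summing the raw strings
# would concatenate them instead of adding them.
df["time"] = df["time"].astype(int)

# Grouping by both columns keeps every (order, area) pair as its own group.
by_both = df.groupby(["order number", "working area"])["time"].agg("sum")

# Grouping by the order alone collapses the three '11xa' rows into one total.
by_order = df.groupby("order number")["time"].agg("sum")

print(len(by_both))  # number of (order, area) groups
print(by_order)
```

On this data every (order number, working area) pair occurs only once, so grouping by both columns still returns five rows. Grouping by `order number` alone gives the three totals the question asks for, but it does not pick the working area.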