It took me days to understand groupby in pandas. What I understand so far is that, once you aggregate, the "grouped-by" column becomes the index of the result by default. To keep it as a regular column instead, pass as_index=False to groupby or call reset_index() on the result.
syntax: df.groupby('col1')
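Here is a small sketch of what happens to the grouped column. The dataframe and its values are made up for illustration; only the column name col1 comes from the syntax line above.

```python
import pandas as pd

# Toy data, invented for this example
df = pd.DataFrame({
    "col1": ["a", "b", "a", "b"],
    "col2": [1, 2, 3, 4],
})

# groupby on its own returns a lazy GroupBy object; nothing is computed yet
grouped = df.groupby("col1")

# After an aggregation, col1 becomes the index of the result by default
summed = grouped.sum()

# With as_index=False, col1 stays a regular column
flat = df.groupby("col1", as_index=False).sum()

print(summed)
print(flat)
```

Calling summed.reset_index() should give the same layout as flat, if you only decide afterwards that you want the column back.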
With this, you get to cut up your dataframe into smaller pieces for further processing. In my tests with %time, groupby did the processing quicker than slicing the dataframe by hand, though timings will depend on your data. Once grouped, you can use aggregate (.agg({...}), passing a dict that maps column names to functions) to further manipulate your data.
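To make the "smaller pieces" and the aggregate step concrete, here is a minimal sketch. The columns col2 and col3 and all the values are assumptions I made up for the example.

```python
import pandas as pd

# Toy data, invented for this example
df = pd.DataFrame({
    "col1": ["a", "b", "a", "b"],
    "col2": [1, 2, 3, 4],
    "col3": [10.0, 20.0, 30.0, 40.0],
})

grouped = df.groupby("col1")

# Each piece is an ordinary dataframe holding the rows for one group
piece_a = grouped.get_group("a")

# agg takes a dict mapping column name -> aggregation to apply
result = grouped.agg({"col2": "sum", "col3": "mean"})

print(piece_a)
print(result)
```

The dict form is handy when different columns need different treatment, here a sum for col2 and a mean for col3.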
I am just trying to sound smart. Well...you'd be better off going here: http://pandas.pydata.org/pandas-docs/stable/groupby.html