Whenever I work with pandas, I try to use an IPython-style console so that I can mess about with the dataframe quickly and then later solidify what I did into code.
That, and a combination of Stack Exchange and the pandas documentation.
I rated this free online book, which has a chapter on NumPy and a chapter on pandas:
https://jakevdp.github.io/PythonDataScienceHandbook/
The setup for itertools groupby is usually the same:
from itertools import groupby
data = sorted(data, key=sorter)  # groupby only groups adjacent items, so sort by the same key first
grouped = groupby(data, key=sorter)
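A quick sketch of why the sort comes first, as far as I understand it: groupby only merges items that sit next to each other, so unsorted data can give you the same key more than once. The `words` list and the `first_letter` function are made-up names for illustration, not anything from the setup above.

```python
from itertools import groupby

# Hypothetical sample data and key function, just for illustration.
words = ["apple", "bob", "avocado", "banana"]

def first_letter(word):
    return word[0]

# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]

# Sorting by the same key first puts equal keys next to each other.
sorted_words = sorted(words, key=first_letter)
sorted_keys = [key for key, _ in groupby(sorted_words, key=first_letter)]

print(unsorted_keys)  # ['a', 'b', 'a', 'b']
print(sorted_keys)    # ['a', 'b']
```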
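groupby returns a lazy iterator (it behaves like a generator object). Iterating through it gives you (key, group) pairs.
A group is also a lazy iterator, not a list. Iterating through the group gives you the data with that key.
That's why in all the code with itertools groupby you have to iterate through the group, for example with a loop or list(group), before moving on to the next key: the group shares its underlying data with groupby, so it is no longer available once groupby advances.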
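Here is a minimal sketch of that pattern, assuming some made-up `records` and a hypothetical `category` key function (neither comes from the code above). Each group gets turned into a list inside the loop, before groupby moves on to the next key.

```python
from itertools import groupby

# Hypothetical records and key function, just for illustration.
records = [
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("fruit", "pear"),
    ("veg", "leek"),
]

def category(record):
    return record[0]

# Same setup as before: sort by the key, then group by the same key.
records = sorted(records, key=category)

result = {}
for key, group in groupby(records, key=category):
    # group is a one-shot iterator, so collect its items while it is current.
    result[key] = [name for _, name in group]

print(result)  # {'fruit': ['apple', 'pear'], 'veg': ['carrot', 'leek']}
```

If you only need to loop over each group once, you can skip the list and iterate over `group` directly inside the outer loop.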