Python - Groupby

Hey all

Any ideas on how to sum one column where the other columns are duplicated across multiple rows? Example below:

Input file:


Any ideas on a solution? I don't want to use Python's pandas.

If you already have a pandas DataFrame, using DataFrame operations is usually the most efficient way to do this.

That being said, for pure Python you can use a dict with tuples as keys:

# Notice that each row is a tuple (tuples are hashable, so a slice of a row can be a dict key)
rows = (
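my_map = {r[:3]: 0 for r in rows}  # one zeroed total per distinct first-three-columns key
for r in rows:
    my_map[r[:3]] += r[3]  # add the fourth column to its group's total
[k + (v,) for k, v in my_map.items()]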


[(11, 'No', 'No', 15),
 (12, 'Yes', 'Yes', 5)]
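
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same dict-with-tuple-keys idea. The sample rows are made up for illustration (the original input isn't shown in this thread); they were chosen only so that the result matches the output above. It uses collections.defaultdict instead of pre-filling the dict with zeros, which is a small variation rather than the exact code from the reply.

```python
from collections import defaultdict

# Hypothetical sample rows: three key columns followed by the value to sum
rows = (
    (11, 'No', 'No', 10),
    (11, 'No', 'No', 5),
    (12, 'Yes', 'Yes', 5),
)

totals = defaultdict(int)  # missing keys start at 0
for r in rows:
    totals[r[:3]] += r[3]  # group by the first three columns, sum the fourth

# Dicts keep insertion order (Python 3.7+), so groups appear in first-seen order
result = [key + (total,) for key, total in totals.items()]
print(result)
```

With these sample rows the two (11, 'No', 'No') rows collapse into a single row with a total of 15.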

Thank you! Instead of hard-coding the values from the input file, is there a way to read them from the file instead?

This will do the trick.

with open('file_name','r') as f:
    data = [ tuple(i.strip().split(',')) for i in f]

One additional step you need to do is convert some of the values to int, since everything read from the file is a string. I will leave that as an exercise for you.
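
In case it helps later readers, here is one possible way to do that conversion and the grouping in a single pass. This is only a sketch under assumptions: the file is comma-separated with no header row, has exactly four columns, and the first and last columns are integers. The function name sum_by_key is made up for this example, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import csv
import io
from collections import defaultdict

def sum_by_key(lines):
    """Sum the last column for each distinct combination of the first three."""
    totals = defaultdict(int)
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        # Assumed layout: int, str, str, int
        key = (int(row[0]), row[1].strip(), row[2].strip())
        totals[key] += int(row[3])
    return [key + (total,) for key, total in totals.items()]

# Stand-in for the real file contents (hypothetical data)
sample = io.StringIO("11,No,No,10\n11,No,No,5\n12,Yes,Yes,5\n")
result = sum_by_key(sample)
print(result)
```

With a real file you would pass the open handle instead, e.g. `with open('file_name', newline='') as f: result = sum_by_key(f)`. If the file has a header row or different column types, the key and int() conversions would need adjusting.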
