Python - Groupby

Hey all

Any ideas how to sum one column where the other columns are duplicated across multiple rows? Example below:

Input file:
11,No,No,10
12,Yes,Yes,5
11,No,No,5

Output:
11,No,No,15
12,Yes,Yes,5

Any ideas on a solution? I don’t want to use Python’s pandas.

Firstly, welcome to the forums.

While we are primarily here to help people with their Free Code Camp progress, we are open to people on other paths, too.

With your current question, we don’t have enough context to know what you already know or don’t know, so it is impossible to guide you without just telling you the answer (which we won’t do).

It is pretty typical on here for people to share a codepen / repl.it / jsfiddle example of what they have tried so that anyone helping has more of an idea of what help is actually helpful.

Please provide some example of what you’ve tried and I’m sure you’ll get more help.

Happy coding :slight_smile:

If you already have a pandas dataframe, using dataframe operations is the most efficient way to do things.

That being said, for pure Python you can use a dict with tuples as keys:

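# Notice how each row is a tuple (tuples are hashable, so a slice of one can be a dict key)
rows = (
    (11, 'No', 'No', 10),
    (12, 'Yes', 'Yes', 5),
    (11, 'No', 'No', 5),
)
# Start every distinct key (the first three columns) at a total of 0
my_map = {r[:3]: 0 for r in rows}
# Add each row's last column to the total for its key
for r in rows:
    my_map[r[:3]] += r[3]
# Rebuild full rows: key columns followed by the summed value
[k + (v,) for k, v in my_map.items()]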

Out:

[(11, 'No', 'No', 15),
 (12, 'Yes', 'Yes', 5)]
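
For what it's worth, the same dict-of-tuples idea can probably be written a bit more compactly with collections.defaultdict, which removes the need to pre-fill the keys with 0. This is only a sketch: the function name sum_by_key and the key_len parameter are made up for illustration, and it assumes the value to sum sits right after the key columns.

```python
from collections import defaultdict

def sum_by_key(rows, key_len=3):
    """Sum the column at index key_len, grouped by the first key_len columns.

    sum_by_key and key_len are hypothetical names used for this sketch.
    """
    totals = defaultdict(int)  # missing keys start at 0 automatically
    for row in rows:
        totals[tuple(row[:key_len])] += row[key_len]
    # Rebuild full rows: key columns followed by the summed value
    return [key + (total,) for key, total in totals.items()]

rows = [
    (11, 'No', 'No', 10),
    (12, 'Yes', 'Yes', 5),
    (11, 'No', 'No', 5),
]
result = sum_by_key(rows)
print(result)
```

Since dicts keep insertion order in Python 3.7+, the groups come out in the order they first appear in the input.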

Thank you! Instead of hard-coding the values from the input file, is there a way to read them from the file instead?

This will do the trick.

with open('file_name','r') as f:
    data = [ tuple(i.strip().split(',')) for i in f]

One additional step you need to do is convert some of the values to int, since everything read from the file is a string. I will leave that as an exercise for you.
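
For anyone landing here later, one way the pieces could fit together is sketched below, using the standard-library csv module instead of a manual split. It assumes the file has no header row and that the last column always holds an integer; the function name sum_last_column and the file name input.csv are made up for illustration, and the sample file is written to a temporary directory just so the snippet runs on its own.

```python
import csv
import os
import tempfile
from collections import defaultdict

def sum_last_column(path):
    """Group rows by every column except the last and sum the last one.

    Hypothetical helper: assumes no header row and an integer last column.
    """
    totals = defaultdict(int)
    with open(path, newline='') as f:
        for row in csv.reader(f):
            if not row:  # skip blank lines
                continue
            *key, value = row
            totals[tuple(key)] += int(value)  # convert the string to int here
    return [key + (total,) for key, total in totals.items()]

# Demo with the sample data from the question, written to a temporary file
sample = "11,No,No,10\n12,Yes,Yes,5\n11,No,No,5\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.csv")  # hypothetical file name
    with open(path, "w", newline="") as f:
        f.write(sample)
    summed = sum_last_column(path)

for row in summed:
    print(",".join(str(col) for col in row))
```

Note that the key columns stay as strings here ('11' rather than 11); only the summed column is converted.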
