This is regarding the test to select the country with the highest percentage of people with a salary of >50K.
I thought the easiest way to do this would be to groupby native-country, then select the salary column, and do value_counts to get normalized percentages. This way I can just select the row with the highest percentage:
country_percentages = df.groupby('native-country')['salary'].value_counts(normalize=True)
highest_earning_country = country_percentages.idxmax()
highest_earning_country_percentage = (country_percentages.max() * 100).round(1)
native-country  salary
?               <=50K     0.749571
                >50K      0.250429
Cambodia        <=50K     0.631579
                >50K      0.368421
Canada          <=50K     0.677686
                            ...
United-States   >50K      0.245835
Vietnam         <=50K     0.925373
                >50K      0.074627
Yugoslavia      <=50K     0.625000
                >50K      0.375000
Meanwhile, I have used an alternate approach which has allowed me to complete the challenge, but I’d like to try to understand if I was on the right lines with my initial solution and how I might’ve been able to fix it.
Is anyone familiar enough with Pandas to know if my initial approach was viable and how I could’ve completed it this way?
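For reference, here is a minimal sketch of one way the groupby approach might be completed. It assumes the sticking point is that the result is indexed by both country and salary, so idxmax() returns a (country, salary) tuple and can land on a <=50K row. The small DataFrame is made up for illustration and only stands in for the real census data; rich_share is a hypothetical name.

```python
import pandas as pd

# Made-up stand-in for the census data (not the real dataset).
df = pd.DataFrame({
    "native-country": ["A", "A", "A", "A", "B", "B", "B", "C", "C"],
    "salary": [">50K", "<=50K", "<=50K", "<=50K",
               ">50K", ">50K", "<=50K",
               "<=50K", "<=50K"],
})

# Same first step as in the post: per-country share of each salary band.
country_percentages = (
    df.groupby("native-country")["salary"].value_counts(normalize=True)
)

# The index has two levels (country, salary), so keep only the '>50K'
# rows before looking for the maximum. Countries with no '>50K' rows
# simply do not appear in the filtered Series.
rich_share = country_percentages.xs(">50K", level="salary")

highest_earning_country = rich_share.idxmax()
highest_earning_country_percentage = round(rich_share.max() * 100, 1)

print(highest_earning_country, highest_earning_country_percentage)
```

Selecting the >50K level first means idxmax() returns a plain country name rather than a tuple, and the maximum is taken over the >50K shares only.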
Challenge: Demographic Data Analyzer
Link to the challenge: