# Highest earning country and respective percentage

Tell us what’s happening:
I’m failing only two related tests: finding the highest-earning country and its percentage. When I download the CSV file and filter for salary >50K, I get 7841 rows in total. When I additionally filter by country for United-States, I get 7171. My code produces the same result, yet the expected answer is Iran. How is that possible?

```
dataset_richest_by_country = ((df[df['salary']=='>50K']).groupby('native-country').count()['age']).sort_values(ascending=False)
total_rich = dataset_richest_by_country.sum()
```

My boilerplate url

User Agent is: `Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36`

Challenge: Demographic Data Analyzer


MOD EDIT: SOLUTION REDACTED

It is great that you solved the challenge, but instead of posting your full working solution, it is best to stay focused on answering the original poster’s question(s) and help guide them with hints and suggestions to solve their own issues with the challenge.

We are trying to cut back on the number of spoiler solutions found on the forum and instead focus on helping other campers with their questions and definitely not posting full working solutions.


It’s a percentage. There are more US entries, but which country has the highest percentage of high salaries (>50K) relative to its own total number of entries? You are correctly generating the number of >50K salaries per country in `dataset_richest_by_country`, but you need the percentage, and in `highest_earning_country` you are just finding the largest number of earners. Or, to quote the specs:

• What country has the highest percentage of people that earn >50K and what is that percentage?
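To make the difference concrete, here is a minimal sketch on made-up data (two invented countries `A` and `B`, not the challenge dataset; only the column names are taken from the thread). It compares the raw count of high earners with the percentage of high earners within each country:

```python
import pandas as pd

# Toy data, invented for illustration only.
# Country A: 8 people, 3 earn >50K. Country B: 3 people, 2 earn >50K.
df = pd.DataFrame({
    "native-country": ["A"] * 8 + ["B"] * 3,
    "salary": [">50K"] * 3 + ["<=50K"] * 5 + [">50K"] * 2 + ["<=50K"],
})

# Number of high earners per country (what the question's code computes).
rich_counts = df[df["salary"] == ">50K"]["native-country"].value_counts()

# Number of all people per country.
all_counts = df["native-country"].value_counts()

# Share of all high earners that live in each country.
share_of_all_rich = rich_counts / rich_counts.sum() * 100

# Percentage of each country's own population that earns >50K.
pct_rich_within_country = rich_counts / all_counts * 100

print(rich_counts.idxmax())              # country with the most high earners
print(pct_rich_within_country.idxmax())  # country with the highest percentage
```

In this toy data, `A` has the most high earners, but `B` has the higher percentage, which is the same situation as United-States versus Iran. Note that a country with no high earners at all would show up as `NaN` in the division, so real data may need that case handled.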

But I also have the `total_rich` variable. It is the sum of all people who earn >50K, and at the end I divide the highest count (US) by `total_rich` and multiply by 100 to get the percentage. Shouldn’t that be the right way?

`dataset_richest_by_country` doesn’t contain percentages. It holds the number of high earners in each country. Dividing the US count by `total_rich` gives the US share of all high earners, not the percentage of people in the US who are high earners. The values need to be percentages before you find the maximum. You’re trying to determine both the maximum percentage and which country has it, but you are setting `highest_earning_country` before you calculate the percentages for `highest_earning_country_percentage`.
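As a rough sketch of that ordering (again on invented rows, with hypothetical names `pct_by_country`, `top_country` and `top_pct` that are not from the original code), one possible approach is to build the percentage Series first and then read both the country and the value from that same Series:

```python
import pandas as pd

# Hypothetical toy data; column names mirror the thread, rows are invented.
# Country X: 1 of 4 earns >50K. Country Y: 1 of 2 earns >50K.
df = pd.DataFrame({
    "native-country": ["X", "X", "X", "X", "Y", "Y"],
    "salary": [">50K", "<=50K", "<=50K", "<=50K", ">50K", "<=50K"],
})

# Boolean Series: True where the person is a high earner.
is_rich = df["salary"] == ">50K"

# The mean of a boolean group is the fraction of True values,
# so this is the percentage of high earners within each country.
pct_by_country = is_rich.groupby(df["native-country"]).mean() * 100

# Only now pick the winner, using the percentages rather than the counts.
top_country = pct_by_country.idxmax()
top_pct = round(pct_by_country.max(), 1)

print(top_country, top_pct)
```

Because both values come from the same Series, the country and its percentage cannot get out of sync. This is only one way to do it; dividing two per-country counts works just as well.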


Thanks for the help, I got it. Even once you have the number of rich people per country, you still have to divide it by the total number of people in that country.

OK @JeremyLT, from next time I will not post a full solution. Thanks for guiding me.
