Highest earning country and respective percentage

faridhuseynov.eng · August 30, 2021, 12:01pm

Tell us what’s happening:
I’m failing only two related test cases: finding the highest-earning country and its percentage. When I download the CSV file and filter for salary >50K, I get 7841 rows in total. When I additionally filter by country for United-States, I get 7171. My code produces the same result, yet the expected answer is Iran. How is that possible?

Your code so far

dataset_richest_by_country = ((df[df['salary']=='>50K']).groupby('native-country').count()['age']).sort_values(ascending=False)
highest_earning_country = dataset_richest_by_country.head(1).index[0]
total_rich =dataset_richest_by_country.sum()
highest_earning_country_percentage =round(dataset_richest_by_country.head(1)[0]/total_rich*100,1)

Challenge: Demographic Data Analyzer

AadityaSingh7 · August 30, 2021, 2:45pm

MOD EDIT: SOLUTION REDACTED
I hope this will help you

JeremyLT · August 30, 2021, 2:49pm

It is great that you solved the challenge, but instead of posting your full working solution, it is best to stay focused on answering the original poster’s question(s) and help guide them with hints and suggestions to solve their own issues with the challenge.

We are trying to cut back on the number of spoiler solutions found on the forum and instead focus on helping other campers with their questions and definitely not posting full working solutions.

jeremy.a.gray · August 31, 2021, 12:09am

It’s a percentage. There are more US entries, but which country has the highest percentage of high salaries (>50K) relative to its own total entries? You are correctly generating the number of >50K salaries per country (in dataset_richest_by_country), but you need the percentage, and in highest_earning_country you are just finding the largest number of earners. Or, to quote the specs:

What country has the highest percentage of people that earn >50K and what is that percentage?

faridhuseynov.eng · August 31, 2021, 4:47am

But I also have the total_rich variable there. This is the sum of all people who earn >50K, and at the end I divide the highest count (US) by total_rich and multiply by 100 to get the percentage. Shouldn’t that be the right way?
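To make concrete what that division measures, here is a small arithmetic sketch using only the two counts quoted in the first post. The variable names are made up for illustration. The result is the share of all high earners who come from the United States, which is a different quantity from the share of United States entries that are high earners.

```python
# Counts quoted in the first post of this thread
high_earners_total = 7841  # rows with salary '>50K'
high_earners_us = 7171     # of those, rows with native-country 'United-States'

# This mirrors the division in the posted code: US high earners / all high earners
share_of_high_earners = round(high_earners_us / high_earners_total * 100, 1)
print(share_of_high_earners)
```

A country with many rows will dominate this figure regardless of how its own population is split between the two salary groups.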

jeremy.a.gray · August 31, 2021, 11:04am

dataset_richest_by_country doesn’t contain percentages. It’s a dataset with the quantity of high earners in each country. It needs to be converted to percentages before you find the maximum. You’re trying to determine both the maximum percentage and which country has it, but you are setting highest_earning_country before you calculate the percentages for highest_earning_country_percentage.
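The difference between ranking by count and ranking by percentage can be seen on a tiny made-up table. This is only a sketch of the idea: the data, the country labels "A" and "B", and the variable names are all hypothetical, and it is not the challenge dataset or a solution to it.

```python
import pandas as pd

# Hypothetical toy data: country A has 5 rows, country B has 2 rows
df = pd.DataFrame({
    "native-country": ["A", "A", "A", "A", "A", "B", "B"],
    "salary": [">50K", ">50K", "<=50K", "<=50K", "<=50K", ">50K", "<=50K"],
})

# Number of high earners per country (A: 2, B: 1)
high_counts = df[df["salary"] == ">50K"]["native-country"].value_counts()

# Number of all rows per country (A: 5, B: 2)
total_counts = df["native-country"].value_counts()

# Per-country percentage; the division aligns on the country index
pct = high_counts / total_counts * 100

top_by_count = high_counts.idxmax()  # largest number of high earners
top_by_pct = pct.idxmax()            # largest percentage of high earners
top_pct = round(pct.max(), 1)

print(top_by_count, top_by_pct, top_pct)
```

Country A wins on raw count because it has more rows, while B wins on percentage. The maximum is taken only after the counts have been turned into percentages.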

faridhuseynov.eng · August 31, 2021, 11:43am

Thanks for the help, I got it. Even once you have the number of high earners per country, you still have to divide it by the total number of people in that country.

AadityaSingh7 · August 31, 2021, 1:29pm

OK @JeremyLT, from next time I will not post the full solution. Thanks for guiding me.

system · March 2, 2022, 1:29am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.
