Demographic Data Analyzer: Grouping and Sorting creates empty dataframe

I am currently attempting the Demographic Data Analyzer project and have run into a problem that I can't seem to solve. I am trying to answer the question: what is the most popular occupation in India among those who make over 50K? So far I have the following:

#What percentage of people without advanced education make more than 50K?
    lower_education = df[['education', 'salary']].loc[(df['education']=='None')].value_counts()

The line above prints only the Series info (dtype: int64) with no values, and I am wondering why, given that my answers to the previous questions printed actual data rather than just the data-structure info. If someone could explain that and tell me whether my syntax is going in the right direction, that would be greatly appreciated. I think the only thing left to add is sorting the combined columns by occupation frequency, and then I should be able to use .idxmax() to get the most popular occupation. Let me know if my logic is off as well.

Could you provide a link to your project?
I don’t really understand what error / output you are getting and your code snippet doesn’t seem related to your question.

It’s one of these REALLY annoying errors: salary is written with an uppercase “K”.

Salary is written with an uppercase K…? Can you elaborate on that?

In your linked code, you check if salary is equal to the string “<=50k” or so, which will always be false because salary is either “<=50K” or “>50K” - written with an uppercase K.
So if the strings are always different, the condition will always be false and your selection will return an empty series-object.

Meanwhile, in the code snippet from your first post, you check whether education is equal to the string “None”, which it never is. Hence it also creates an empty series-object, which is why it’s not giving anything but the data-structure info.
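
To make the case-sensitivity point concrete, here is a minimal sketch. The DataFrame is toy data made up for illustration (only the column names and the “<=50K” / “>50K” labels come from this thread), so treat it as a demonstration of the behaviour rather than the project's actual data:

```python
import pandas as pd

# Hypothetical toy data: column names follow the thread, the rows are invented.
df = pd.DataFrame({
    "education": ["Bachelors", "HS-grad", "Masters"],
    "salary": ["<=50K", ">50K", ">50K"],
})

# String comparison is case-sensitive, so a lowercase "k" never matches.
wrong = df[df["salary"] == ">50k"]
right = df[df["salary"] == ">50K"]

# No row has education equal to the string "None", so this selection is empty too.
no_match = df[df["education"] == "None"]

print(len(wrong), len(right), no_match.empty)

# Listing the distinct values is a quick way to spot this kind of mismatch.
print(df["salary"].unique())
```

Checking the result of df["salary"].unique() before writing a filter shows the exact spelling the data uses.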

WOOWWW, big oof on my part, thanks for catching that mistake. After correcting the issue I was able to get it working. However, looking at the query result, the answer to the question appears to be Exec-managerial with 1968 occurrences, but the correct answer according to the unit test's expected value is Prof-specialty with 1859 occurrences. Do you happen to know why that is? Could it be a typo in the unit tests?

The last thing you should ever expect is a failure in the provided code :wink:

Task: Identify the most popular occupation for those who earn >50K in India.
You didn’t filter for India :wink:
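
A rough sketch of the difference the country filter makes, again on invented toy data. The column name "native-country" is an assumption about how the dataset labels the country column, so adjust it to match your file:

```python
import pandas as pd

# Hypothetical toy data; "native-country" is an assumed column name.
df = pd.DataFrame({
    "native-country": ["India", "India", "India",
                       "United-States", "United-States", "United-States"],
    "occupation": ["Prof-specialty", "Prof-specialty", "Exec-managerial",
                   "Exec-managerial", "Exec-managerial", "Exec-managerial"],
    "salary": [">50K"] * 6,
})

# Filtering on salary alone answers a different question (all countries).
high_earners = df[df["salary"] == ">50K"]
top_overall = high_earners["occupation"].value_counts().idxmax()

# Combine both conditions with & (each wrapped in parentheses) to restrict to India.
india_high = df[(df["native-country"] == "India") & (df["salary"] == ">50K")]
top_in_india = india_high["occupation"].value_counts().idxmax()

print(top_overall)
print(top_in_india)
```

Note that value_counts() already sorts by frequency in descending order by default, and idxmax() returns the label with the highest count regardless of order, so no separate sorting step is needed.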
