I am currently attempting the demographic data analyzer project and I ran into a problem that I can't seem to solve. I am trying to answer the question: what is the most popular occupation in India among those who make over 50K? So far I have the following:
#What percentage of people without advanced education make more than 50K?
lower_education = df[['education', 'salary']].loc[(df['education']=='None')].value_counts()
The code above prints only the Series info (dtype: int64), and I am wondering why, given that my answers to previous questions printed actual values rather than just the data structure info. If someone could explain that, and tell me whether I am going in the right direction with the syntax, that would be greatly appreciated. I think the only thing left to add is sorting the combined columns by occupation frequency, and then I should be able to use .idxmax() to get the most popular occupation. Let me know if my logic is off as well.
Could you provide a link to your project?
I don’t really understand what error / output you are getting and your code snippet doesn’t seem related to your question.
In your linked code, you check if salary is equal to the string “<=50k” or so, which will always be false because salary is either “<=50K” or “>50K” - written with an uppercase K.
So if the strings are always different, the condition will always be false and your selection will return an empty series-object.
Meanwhile, in the code snippet in your first post, you check whether education is equal to the string “None”, which it never is. Hence it also creates an empty Series object, which is why it gives nothing but the data-structure info.
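To make the point above concrete, here is a minimal sketch on a toy DataFrame. The rows are invented for illustration and are not the project's dataset; only the column names `education` and `salary` come from the snippet in the first post. It shows that a comparison string that never occurs in the column yields an all-False mask, so the selection (and its `value_counts()`) is empty.

```python
import pandas as pd

# Toy data, invented for illustration; not the project's dataset.
df = pd.DataFrame({
    "education": ["Bachelors", "HS-grad", "Masters", "HS-grad"],
    "salary": [">50K", "<=50K", ">50K", ">50K"],
})

# Lowercase "k" never appears in the column, so every comparison is False.
wrong_mask = df["salary"] == ">50k"
right_mask = df["salary"] == ">50K"

# An all-False mask selects no rows, so value_counts() has nothing to count.
wrong = df.loc[wrong_mask, "education"].value_counts()
right = df.loc[right_mask, "education"].value_counts()

print(wrong_mask.any())  # whether any row matched the misspelled string
print(len(wrong))        # number of distinct values counted
print(right)
```

Checking `mask.any()` or `mask.sum()` before going further is a quick way to catch this kind of string mismatch.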
WOOWWW, big oof on my part, thanks for catching that mistake. After correcting the issue I was able to get it working. However, looking at the query, the answer to the question appears to be Exec-managerial with 1968 occurrences, but the correct answer according to the unit tests' expected entry is Prof-specialty with 1859 occurrences. Do you happen to know why that is? Could it be a typo on the unit tests' part?
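One thing worth ruling out before suspecting the tests is whether the selection applies both conditions from the question (country and salary) at the same time. The sketch below uses invented toy rows, and the column name `native-country` is an assumption about how the dataset labels the country column. It only illustrates how the top occupation can change depending on whether the country filter is applied; it does not claim to be the cause of the mismatch described above.

```python
import pandas as pd

# Toy data, invented for illustration; column names are assumptions.
df = pd.DataFrame({
    "native-country": ["India", "India", "India",
                       "United-States", "United-States", "United-States"],
    "occupation": ["Prof-specialty", "Prof-specialty", "Exec-managerial",
                   "Exec-managerial", "Exec-managerial", "Exec-managerial"],
    "salary": [">50K", ">50K", ">50K", ">50K", ">50K", "<=50K"],
})

# Salary filter only: counts high earners from every country.
high_earners = df[df["salary"] == ">50K"]

# Both filters combined with &; each condition needs its own parentheses.
india_high = df[(df["native-country"] == "India") & (df["salary"] == ">50K")]

# value_counts() already sorts by frequency, and idxmax() returns the
# label with the highest count, so no separate sorting step is needed.
top_overall = high_earners["occupation"].value_counts().idxmax()
top_india = india_high["occupation"].value_counts().idxmax()

print(top_overall)
print(top_india)
```

In this toy data the two selections give different answers, which is why comparing the row count of the filtered frame against expectations can be a useful sanity check.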