Demographic-data-analyzer - higher_richer 46.5 vs 46.6

m.ramiro.r · May 31, 2024, 8:15pm

Hi! I can’t get past this challenge because the test doesn’t pass.

I think I got the correct answer, but I can’t find what the cause might be.

    # with and without `Bachelors`, `Masters`, or `Doctorate`
    higher_education = df[df["education"].isin(["Bachelors", "Masters", "Doctorate"])]
    lower_education = df[~df["education"].isin(["Bachelors", "Masters", "Doctorate"])]

    # percentage with salary >50K
    higher_education_rich = round(len(higher_education[higher_education["salary"] == ">50K"].value_counts())/len(higher_education.value_counts())*100, 1)
    lower_education_rich = round((len(lower_education[lower_education["salary"] == ">50K"].value_counts()))/len(lower_education.value_counts())*100, 1)

The test fails with an error on the higher-education value only (46.5 vs 46.6, as in the title).

Thanks in advance!

pkdvalis · May 31, 2024, 10:09pm

It’s so close, it seems like a rounding error.

What’s the effect of that round() function? I would double check anywhere you are doing rounding, check if you are rounding up or down, etc.

Do you want to round up or truncate?

m.ramiro.r · June 1, 2024, 12:13am

The challenge says: “Round all decimals to the nearest tenth.” So I’m rounding with the round function, passing 1 as the second argument. I don’t see how that could cause trouble…

I don’t want to cheat, but the only way I can get the test to pass is with a “-0.1” in the creation of the variable.

pkdvalis · June 1, 2024, 12:50am

Well, you are off by exactly one tenth, it’s definitely a rounding error.

Try a different approach, examine what the number is before and after the rounding. Try a different method to get the correct number.

You’re correct, you don’t want to subtract 0.1 from everything, that will also result in incorrect results.
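
As a minimal sketch of what examining the number before and after rounding could look like: the tiny DataFrame below is made-up toy data, not the challenge dataset, and the variable names `raw` and `rounded` are only for illustration.

```python
import pandas as pd

# Hypothetical toy data, NOT the challenge dataset.
df = pd.DataFrame({
    "education": ["Bachelors", "Masters", "Bachelors", "HS-grad",
                  "Doctorate", "Bachelors", "Masters", "Doctorate"],
    "salary": [">50K", "<=50K", ">50K", "<=50K",
               ">50K", "<=50K", "<=50K", "<=50K"],
})

higher = df[df["education"].isin(["Bachelors", "Masters", "Doctorate"])]

# Share of higher-education rows earning >50K, before any rounding.
raw = len(higher[higher["salary"] == ">50K"]) / len(higher) * 100

# The same value rounded to the nearest tenth.
rounded = round(raw, 1)

print("before rounding:", raw)      # 3 of 7 rows, about 42.857
print("after rounding: ", rounded)  # 42.9
```

If the unrounded value already sits on the wrong side of the expected answer, the problem is in the counts feeding the division, not in round() itself.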

pkdvalis · June 1, 2024, 1:00am

Your brackets for round() are arranged a bit differently here

pkdvalis · June 1, 2024, 1:21am

Ok, the rounding and the brackets were red herrings. Look at this:

    print(len(higher_education.value_counts()))
    print(len(higher_education))

>>> 7488
>>> 7491

I believe if there are entries with the exact same stats, value_counts() will aggregate them as 1 line, so that’s why there are fewer.

I don’t think there are any columns that guarantee each entry is unique.

Not positive, but I think that might be what accounts for the difference.
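
Here is a small sketch of that effect on made-up toy data (not the challenge dataset), assuming a pandas version that has `DataFrame.value_counts()`. The first two rows are identical on purpose, so counting unique rows and counting all rows give different answers.

```python
import pandas as pd

# Hypothetical toy data; rows 0 and 1 are exact duplicates.
df = pd.DataFrame({
    "education": ["Bachelors", "Bachelors", "Masters", "Masters", "Doctorate"],
    "salary": [">50K", ">50K", "<=50K", ">50K", ">50K"],
})

rows = len(df)                           # every row counts: 5
unique_rows = len(df.value_counts())     # duplicates collapse into one entry: 4
duplicates = int(df.duplicated().sum())  # rows repeating an earlier row: 1

rich = df[df["salary"] == ">50K"]

# Percentage based on all rows: 4 / 5.
pct_by_rows = round(len(rich) / len(df) * 100, 1)

# Percentage based on unique rows only: 3 / 4.
pct_by_unique = round(len(rich.value_counts()) / len(df.value_counts()) * 100, 1)

print(rows, unique_rows, duplicates)  # 5 4 1
print(pct_by_rows, pct_by_unique)     # 80.0 75.0
```

Using `len()` on the filtered DataFrame directly keeps every row, duplicates included, in both the numerator and the denominator.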

m.ramiro.r · June 3, 2024, 5:42pm

Oh man you really rock, thank you it was breaking my head!!!

I got it!

system · December 3, 2024, 5:42am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.
