Demographic-data-analyzer - higher_richer 46.5 vs 46.6

Hi! I can’t get past this challenge because one of the tests doesn’t pass.

I think I got the correct answer, but I can’t find what the cause might be.

    # with and without `Bachelors`, `Masters`, or `Doctorate`
    higher_education = df[df["education"].isin(["Bachelors", "Masters", "Doctorate"])]
    lower_education = df[~df["education"].isin(["Bachelors", "Masters", "Doctorate"])]

    # percentage with salary >50K
    higher_education_rich = round(len(higher_education[higher_education["salary"] == ">50K"].value_counts())/len(higher_education.value_counts())*100, 1)
    lower_education_rich = round((len(lower_education[lower_education["salary"] == ">50K"].value_counts()))/len(lower_education.value_counts())*100, 1)

The test fails, and the error is only on the higher-education value (46.5 vs 46.6).

Thanks in advance!

It’s so close, it seems like a rounding error.

What’s the effect of that round() function? I would double-check every place where you are rounding, and check whether you are rounding up or down.

Do you want to round up or truncate?

The challenge says: “Round all decimals to the nearest tenth.” So I’m rounding with the round() function, passing 1 as the number of digits. I don’t see how that could cause trouble…

I don’t want to cheat, but the only way I can get the test to pass is by adding a “-0.1” when I create the variable.

Well, you are off by exactly one tenth; it’s definitely a rounding error.

Try a different approach, examine what the number is before and after the rounding. Try a different method to get the correct number.

You’re correct, you don’t want to subtract 0.1 from everything; that would make other values incorrect.
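
One way to examine the number before and after rounding is to keep the counts, the raw percentage, and the rounded percentage in separate variables and print each one. This is only a minimal sketch: the DataFrame below is toy data made up for illustration (not the challenge dataset), and the names rich_count, total_count, raw_percentage and rounded_percentage are hypothetical.

```python
import pandas as pd

# Toy data, made up for illustration; NOT the challenge dataset.
df = pd.DataFrame({
    "education": ["Bachelors", "Masters", "Doctorate", "HS-grad", "Bachelors", "Masters"],
    "salary": [">50K", "<=50K", ">50K", "<=50K", "<=50K", ">50K"],
})

higher_education = df[df["education"].isin(["Bachelors", "Masters", "Doctorate"])]

# Keep every intermediate value so each step can be inspected on its own.
rich_count = int((higher_education["salary"] == ">50K").sum())  # rows earning >50K
total_count = len(higher_education)                             # all rows in the group

raw_percentage = rich_count / total_count * 100
rounded_percentage = round(raw_percentage, 1)

print(rich_count, total_count)             # 3 5
print(raw_percentage, rounded_percentage)
```

If the raw percentage is nowhere near a rounding boundary, rounding is unlikely to be the culprit, and the counts themselves are the next thing to check.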

Your brackets for round() are arranged a bit differently here.

Ok, the rounding and the brackets were red herrings. Look at this:

    print(len(higher_education.value_counts()))
    print(len(higher_education))

>>> 7488
>>> 7491

I believe that if several entries are exactly identical in every column, value_counts() aggregates them into a single row, which is why its length is smaller than len(higher_education).

I don’t think there are any columns that guarantee each entry is unique.

Not positive, but I think that might be what accounts for the difference.
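
Here is a small sketch of that effect. The DataFrame is toy data made up for illustration (not the challenge dataset), with two rows that are deliberately identical, and the variable names are hypothetical. It compares a percentage built from len(df.value_counts()) with one built from plain len(df).

```python
import pandas as pd

# Toy data, made up for illustration; rows 0 and 1 are exact duplicates.
df = pd.DataFrame({
    "age": [39, 39, 50, 28],
    "education": ["Bachelors", "Bachelors", "Masters", "Doctorate"],
    "salary": [">50K", ">50K", "<=50K", ">50K"],
})

# DataFrame.value_counts() returns one entry per distinct row,
# so duplicated rows are collapsed into a single entry.
unique_rows = len(df.value_counts())
all_rows = len(df)

rich = df[df["salary"] == ">50K"]

# Percentage computed from distinct rows (what the original code does).
pct_from_unique = round(len(rich.value_counts()) / unique_rows * 100, 1)
# Percentage computed from actual rows (every person counted once).
pct_from_rows = round(len(rich) / all_rows * 100, 1)

print(unique_rows, all_rows)           # 3 4
print(pct_from_unique, pct_from_rows)  # 66.7 75.0
```

Assuming the test expects every person to be counted, using len() on the filtered DataFrame directly, without value_counts(), should avoid the discrepancy.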

Oh man, you really rock, thank you! It was breaking my head!!!

I got it!
