Demographic Data Analyzer Error

Tell us what’s happening:
Hello, below is my code, and I got the errors shown in the attached image. Can someone please help me understand what causes those errors?


Your code so far
def calculate_demographic_data(print_data=True):
    # Read data from file
    df = pd.read_csv('adult.data.csv')
    # How many of each race are represented in this dataset? This should be a Pandas series with race names as the index labels.
    race_count = df.groupby('race')['race'].count()

    # What is the average age of men?
    average_age_men = df[df['sex']=='Male']['age'].mean()

    # What is the percentage of people who have a Bachelor's degree?
    percentage_bachelors = (df[df['education']=='Bachelors']['education'].count())/(df['education'].count())

    # What percentage of people with advanced education (`Bachelors`, `Masters`, or `Doctorate`) make more than 50K?
    # What percentage of people without advanced education make more than 50K?

    # with and without `Bachelors`, `Masters`, or `Doctorate`
    df1=df[(df['education'].isin(['Bachelors','Masters','Doctorate']))&(df['salary']=='>50K')]
    df2=df[df['education'].isin(['Bachelors','Masters','Doctorate'])]
    df3=df[(~df['education'].isin(['Bachelors','Masters','Doctorate']))&(df['salary']=='>50K')]
    df4=df[~df['education'].isin(['Bachelors','Masters','Doctorate'])]
    higher_education = df2
    lower_education = df4

    # percentage with salary >50K
    higher_education_rich = df1['education'].count()/df2['education'].count()
    lower_education_rich = df3['education'].count()/df4['education'].count()

    # What is the minimum number of hours a person works per week (hours-per-week feature)?
    min_work_hours = df['hours-per-week'].min()

    # What percentage of the people who work the minimum number of hours per week have a salary of >50K?
    num_min_workers = df[df['hours-per-week']==min_work_hours]['hours-per-week'].count()
    num_min_salary_works=df[(df['hours-per-week']==min_work_hours)&(df['salary']=='>50K')]['hours-per-week'].count()

    rich_percentage = num_min_salary_works/num_min_workers

    # What country has the highest percentage of people that earn >50K?
    count_salary_count=df[df['salary'] == '>50K']['native-country'].value_counts()
    count_country=df['native-country'].value_counts()
    highest_earning_country_table=count_salary_count/count_country
    highest_earning_country_table.sort_values(ascending=False).fillna(0)
    highest_earning_country=highest_earning_country_table.idxmax()
    highest_earning_country_percentage = highest_earning_country_table.max()

    # Identify the most popular occupation for those who earn >50K in India.
    df6=df[(df['native-country']=='India')&(df['salary'] == '>50K')]
    top_IN_occupation = df6.groupby('occupation')['occupation'].count().sort_values(ascending=False).idxmax()

Your browser information:

User Agent is: Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36.

Challenge: Demographic Data Analyzer

Welcome to the forums @joo5227. It’s generally easier to post errors in a code block instead of an image and even better to add a link to the project on repl.it or a similar platform.

The last three of your errors are all of the form “39.4134134 is not equal to 39.4” or something similar, which suggests that you need to round. For the first error, try printing your version of race_count to see how it compares with the one in the test.

For your first error, change the code to
race_count = df['race'].value_counts()
The last three would be solved by rounding to one decimal place.
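A likely reason the groupby version fails (an assumption, since the error output itself is not shown in this thread) is ordering: groupby(...).count() returns the counts sorted alphabetically by index label, while value_counts() returns them sorted by count, largest first. Here is a small sketch on a made-up DataFrame, not the real adult.data.csv:

```python
import pandas as pd

# Hypothetical miniature dataset standing in for adult.data.csv.
df = pd.DataFrame(
    {"race": ["White", "Black", "White", "Asian-Pac-Islander", "White", "Black"]}
)

# groupby + count: same totals, but ordered alphabetically by race name.
by_group = df.groupby("race")["race"].count()

# value_counts: same totals, ordered from most to least frequent.
by_value = df["race"].value_counts()

print(by_group.tolist())  # [1, 2, 3]
print(by_value.tolist())  # [3, 2, 1]
```

Both series hold the same counts per label, so a test that compares the values in order would accept one and reject the other.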

As mentioned above, and also in the directions for the challenge, you need to round the values to the nearest tenth.
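A minimal sketch of what rounding to the nearest tenth could look like, using Python's built-in round() on a made-up DataFrame. The column names follow the code in the question. Multiplying by 100 assumes the tests expect percentages on a 0 to 100 scale, which this thread does not confirm:

```python
import pandas as pd

# Hypothetical sample rows; the real challenge reads adult.data.csv.
df = pd.DataFrame(
    {
        "sex": ["Male", "Male", "Female", "Male"],
        "age": [39, 50, 38, 53],
        "education": ["Bachelors", "HS-grad", "Bachelors", "Masters"],
    }
)

# Mean age of men is 142 / 3 = 47.333..., rounded to one decimal place.
average_age_men = round(df[df["sex"] == "Male"]["age"].mean(), 1)

# Share of rows with a Bachelors degree, scaled to 0-100 (assumption) and rounded.
percentage_bachelors = round((df["education"] == "Bachelors").mean() * 100, 1)

print(average_age_men)       # 47.3
print(percentage_bachelors)  # 50.0
```

The same round(value, 1) call can be applied to each of the other computed averages and percentages before they are returned.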