Demographic Data Analyzer: code works in Jupyter but the tests throw errors on Repl.it

Tell us what’s happening:
Each line of code works in Jupyter, but when I run it on Repl.it the tests give me the errors below:

Number of each race:
race
Amer-Indian-Eskimo 311
Asian-Pac-Islander 1039
Black 3124
Other 271
White 27816
Name: race, dtype: int64
Average age of men: 39.43354749885268
Percentage with Bachelors degrees: 0.16446055096587942%
Percentage with higher education that earn >50K: 0.46535843011613937%
Percentage without higher education that earn >50K: 0.173713601914639%
Min work time: 1 hours/week
Percentage of rich among those who work fewest hours: 0.1%
Country with highest percentage of rich: 7171
Highest percentage of rich people in country: 7171.0%
Top occupations in India: 25
EEEEE.EEEF

ERROR: test_average_age_men (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 16, in test_average_age_men
self.assertAlmostEqual(actual, expected, “Expected different value for average age of men.”)
File “/usr/lib/python3.8/unittest/case.py”, line 957, in assertAlmostEqual
if round(diff, places) == 0:
TypeError: an integer is required (got type str)

======================================================================
ERROR: test_higher_education_rich (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 26, in test_higher_education_rich
self.assertAlmostEqual(actual, expected, “Expected different value for percentage with higher education that earn >50K.”)
File “/usr/lib/python3.8/unittest/case.py”, line 957, in assertAlmostEqual
if round(diff, places) == 0:
TypeError: an integer is required (got type str)

======================================================================
ERROR: test_highest_earning_country (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 46, in test_highest_earning_country
self.assertAlmostEqual(actual, expected, “Expected different value for highest earning country.”)
File “/usr/lib/python3.8/unittest/case.py”, line 943, in assertAlmostEqual
diff = abs(first - second)
numpy.core._exceptions.UFuncTypeError: ufunc ‘subtract’ did not contain a loop with signature matching types (dtype(’<U21’), dtype(’<U21’)) -> dtype(’<U21’)

======================================================================
ERROR: test_highest_earning_country_percentage (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 51, in test_highest_earning_country_percentage
self.assertAlmostEqual(actual, expected, “Expected different value for heighest earning country percentage.”)
File “/usr/lib/python3.8/unittest/case.py”, line 957, in assertAlmostEqual
if round(diff, places) == 0:
TypeError: an integer is required (got type str)

======================================================================
ERROR: test_lower_education_rich (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 31, in test_lower_education_rich
self.assertAlmostEqual(actual, expected, “Expected different value for percentage without higher education that earn >50K.”)
File “/usr/lib/python3.8/unittest/case.py”, line 957, in assertAlmostEqual
if round(diff, places) == 0:
TypeError: an integer is required (got type str)

======================================================================
ERROR: test_percentage_bachelors (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 21, in test_percentage_bachelors
self.assertAlmostEqual(actual, expected, “Expected different value for percentage with Bachelors degrees.”)
File “/usr/lib/python3.8/unittest/case.py”, line 957, in assertAlmostEqual
if round(diff, places) == 0:
TypeError: ‘str’ object cannot be interpreted as an integer

======================================================================
ERROR: test_race_count (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 11, in test_race_count
self.assertAlmostEqual(actual, expected, “Expected race count values to be [27816, 3124, 1039, 311, 271]”)
File “/usr/lib/python3.8/unittest/case.py”, line 943, in assertAlmostEqual
diff = abs(first - second)
TypeError: unsupported operand type(s) for -: ‘list’ and ‘list’

======================================================================
ERROR: test_rich_percentage (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 41, in test_rich_percentage
self.assertAlmostEqual(actual, expected, “Expected different value for percentage of rich among those who work fewest hours.”)
File “/usr/lib/python3.8/unittest/case.py”, line 957, in assertAlmostEqual
if round(diff, places) == 0:
TypeError: an integer is required (got type str)

======================================================================
FAIL: test_top_IN_occupation (test_module.DemographicAnalyzerTestCase)

Traceback (most recent call last):
File “/home/runner/GoldenMemorableBackup/test_module.py”, line 56, in test_top_IN_occupation
self.assertEqual(actual, expected, “Expected different value for top occupations in India.”)
AssertionError: 25 != ‘Prof-specialty’ : Expected different value for top occupations in India.


Ran 10 tests in 3.927s

Your code so far
Here’s my code:

import pandas as pd
import numpy as np

def calculate_demographic_data(print_data=True):
    # Read data from file
    df = pd.read_csv("adult.data.csv")

    # How many of each race are represented in this dataset? This should be a Pandas series with race names as the index labels.
    race_count = df.groupby('race')['race'].count()

    # What is the average age of men?
    average_age_men = df[df['sex'] == 'Male']['age'].mean()

    # What is the percentage of people who have a Bachelor's degree?
    percentage_bachelors = df[df['education'] == 'Bachelors'].shape[0] / df.shape[0]

    # What percentage of people with advanced education (`Bachelors`, `Masters`, or `Doctorate`) make more than 50K?
    # What percentage of people without advanced education make more than 50K?

    # with and without `Bachelors`, `Masters`, or `Doctorate`
    higher_education = df[df['education'].isin(['Bachelors', 'Masters', 'Doctorate'])]
    lower_education = df[~df['education'].isin(['Bachelors', 'Masters', 'Doctorate'])]

    # percentage with salary >50K
    higher_education_rich = higher_education[higher_education['salary'] == '>50K']['salary'].count() / higher_education.shape[0]
    lower_education_rich = lower_education[lower_education['salary'] == '>50K']['salary'].count() / lower_education.shape[0]

    # What is the minimum number of hours a person works per week (hours-per-week feature)?
    min_work_hours = df['hours-per-week'].min()

    # What percentage of the people who work the minimum number of hours per week have a salary of >50K?
    num_min_workers = df[df['hours-per-week'] == 1]['hours-per-week'].count()

    rich_percentage = df[(df['hours-per-week'] == 1) & (df['salary'] == '>50K')].shape[0] / num_min_workers

    # What country has the highest percentage of people that earn >50K?
    highest_earning_country = df[df['salary'] == '>50K'].groupby('native-country')['native-country'].count().max()
    highest_earning_country_percentage = (highest_earning_country / df.groupby('native-country')['native-country'].count()).max()

    # Identify the most popular occupation for those who earn >50K in India.
    top_IN_occupation = df[(df['salary'] == '>50K') & (df['native-country'] == 'India')].groupby('occupation')['occupation'].count().max()

    # DO NOT MODIFY BELOW THIS LINE

    if print_data:
        print("Number of each race:\n", race_count)
        print("Average age of men:", average_age_men)
        print(f"Percentage with Bachelors degrees: {percentage_bachelors}%")
        print(f"Percentage with higher education that earn >50K: {higher_education_rich}%")
        print(f"Percentage without higher education that earn >50K: {lower_education_rich}%")
        print(f"Min work time: {min_work_hours} hours/week")
        print(f"Percentage of rich among those who work fewest hours: {rich_percentage}%")
        print("Country with highest percentage of rich:", highest_earning_country)
        print(f"Highest percentage of rich people in country: {highest_earning_country_percentage}%")
        print("Top occupations in India:", top_IN_occupation)

    return {
        'race_count': race_count,
        'average_age_men': average_age_men,
        'percentage_bachelors': percentage_bachelors,
        'higher_education_rich': higher_education_rich,
        'lower_education_rich': lower_education_rich,
        'min_work_hours': min_work_hours,
        'rich_percentage': rich_percentage,
        'highest_earning_country': highest_earning_country,
        'highest_earning_country_percentage':
        highest_earning_country_percentage,
        'top_IN_occupation': top_IN_occupation
    }

Challenge: Demographic Data Analyzer

For the percentage questions, try multiplying the value you obtained by 100, as the question asks for percentage and not ratio.
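
Here is a minimal sketch of that fix. It uses a small made-up DataFrame instead of adult.data.csv: the column names come from the question, but the rows and the `percentage` helper are invented for illustration. The rounding to one decimal is an assumption about what the tests expect.

```python
import pandas as pd

# Made-up rows standing in for adult.data.csv; only the column names
# ("education", "salary") are taken from the question.
df = pd.DataFrame({
    "education": ["Bachelors", "HS-grad", "Masters", "Bachelors",
                  "HS-grad", "Doctorate", "HS-grad", "Some-college"],
    "salary": [">50K", "<=50K", ">50K", "<=50K",
               "<=50K", ">50K", ">50K", "<=50K"],
})


def percentage(mask):
    """Share of True values in a boolean Series, as a percentage rounded to one decimal."""
    # mask.mean() is the ratio (0..1); multiplying by 100 turns it into a percentage.
    return round(mask.mean() * 100, 1)


percentage_bachelors = percentage(df["education"] == "Bachelors")

# Split rows by advanced education, then measure the >50K share in each group.
higher = df["education"].isin(["Bachelors", "Masters", "Doctorate"])
higher_education_rich = percentage(df.loc[higher, "salary"] == ">50K")
lower_education_rich = percentage(df.loc[~higher, "salary"] == ">50K")

print(percentage_bachelors, higher_education_rich, lower_education_rich)
```

The same pattern (boolean mask, `.mean()`, times 100, then `round`) should carry over to the other percentage questions in the challenge.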

Also try rounding every value to one decimal place. As already said above, you also have to multiply by 100 when asked for percentages.
Hope it helps.

Also, for the last three questions your answers have the wrong type.

You give the following:

Country with highest percentage of rich: 7171
Highest percentage of rich people in country: 7171.0%
Top occupations in India: 25

For the country with the highest percentage, you should give a country name.
The percentage of rich people is too high: 7171.0% cannot be a valid percentage.
And the top occupation should be a string too (the test expects 'Prof-specialty').
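
One possible way to get the right types is sketched below. The rows are made up (only the column names come from the question), and using `idxmax()` is a suggestion, not necessarily the intended solution. The idea is that `.max()` returns the largest value, while `.idxmax()` returns the index label (here the country or occupation name) that holds it.

```python
import pandas as pd

# Made-up rows; only the column names are taken from the question.
df = pd.DataFrame({
    "native-country": ["India", "India", "India", "Iran", "Iran",
                       "United-States", "United-States",
                       "United-States", "United-States"],
    "salary": [">50K", ">50K", "<=50K", ">50K", "<=50K",
               ">50K", "<=50K", "<=50K", "<=50K"],
    "occupation": ["Prof-specialty", "Prof-specialty", "Sales", "Sales", "Sales",
                   "Exec-managerial", "Sales", "Sales", "Craft-repair"],
})

rich = df["salary"] == ">50K"

# Percentage of >50K earners within each country (share of True values per group).
rich_pct_by_country = rich.groupby(df["native-country"]).mean() * 100

# idxmax() gives the country name; max() gives the percentage itself.
highest_earning_country = rich_pct_by_country.idxmax()
highest_earning_country_percentage = round(rich_pct_by_country.max(), 1)

# Most common occupation among >50K earners in India, returned as a string label.
top_IN_occupation = (
    df[rich & (df["native-country"] == "India")]["occupation"]
    .value_counts()
    .idxmax()
)

print(highest_earning_country, highest_earning_country_percentage, top_IN_occupation)
```

Note that the percentage is computed per country (rich people in that country divided by all people in that country), not by dividing a single count by every country's total.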

Try using the round function as well as multiplying by 100 to get the required percentage.
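
A note on why most tests show ERROR with a TypeError instead of a plain FAIL. In the tracebacks the message string is passed to `assertAlmostEqual` as the third positional argument, and in `unittest` that position is `places`, not `msg`. When the two values are equal the method returns early and the string is never used. When they differ, the string reaches `round(diff, places)` and raises the TypeError. So the TypeError most likely just means "the value did not match". The sketch below reproduces this with a throwaway test case (`PlacesDemo` is a made-up name):

```python
import unittest


class PlacesDemo(unittest.TestCase):
    """Throwaway case so assertAlmostEqual can be called outside a test run."""

    def runTest(self):
        pass


case = PlacesDemo()

# Equal values: assertAlmostEqual returns early, so the string in the
# `places` position is never touched and nothing is raised.
case.assertAlmostEqual(39.4, 39.4, "message passed positionally")

# Unequal values: the string is used as `places` and round() rejects it.
try:
    case.assertAlmostEqual(39.43354749885268, 39.4, "message passed positionally")
    outcome = "passed"
except TypeError as exc:
    outcome = f"TypeError: {exc}"
except AssertionError as exc:
    outcome = f"AssertionError: {exc}"

print(outcome)
```

The race count error looks like the same effect: the lists differ, so the subtraction is attempted and fails. Your `groupby` result is ordered alphabetically by race, while the test message lists the counts from largest to smallest. `df['race'].value_counts()` sorts by count in descending order, which appears to be the order the test expects.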