Issue with Demographic Data Analyzer test module

Hi all,

I’m done with the Demographic Data Analyzer project except for one error which - I think - comes from the test_module.py code, and not from the code I have written. But I could use help understanding what’s going on and how to proceed!

One of the project requirements is to produce a Pandas series containing names of races in the dataset and a count of individuals in the dataset for each race. The prompt and my line of code are below:

    # How many of each race are represented in this dataset? This should be a Pandas series with race names as the index labels.
    race_count = df.groupby('race').size()

The test module code creates a list from the resulting series and compares it to a list of the expected values:

    def test_race_count(self):
        actual = self.data['race_count'].tolist()
        expected = [27816, 3124, 1039, 311, 271]
        self.assertAlmostEqual(actual, expected, msg="Expected race count values to be [27816, 3124, 1039, 311, 271]")

Here’s the problem. When that code runs, an error is thrown saying that assertAlmostEqual can’t take two lists as arguments!

ERROR: test_race_count (test_module.DemographicAnalyzerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/boilerplate-demographic-data-analyzer/test_module.py", line 11, in test_race_count
    self.assertAlmostEqual(actual, expected, msg="Expected race count values to be [27816, 3124, 1039, 311, 271]")
  File "/usr/lib/python3.8/unittest/case.py", line 943, in assertAlmostEqual
    diff = abs(first - second)
TypeError: unsupported operand type(s) for -: 'list' and 'list'

So I believe that something like assertCountEqual is required instead. But - needless to say - I don’t want to edit the test_module.py code myself just to pass the test. What should I do?

Welcome, gideon.

I have not done this project yet, so I cannot say what you should do. Here is the same issue opened on the freeCodeCamp GitHub repo:
Data Analysis with Python: demographic-data-analyzer has incorrect assertAlmostEqual function calls · Issue #39243 · freeCodeCamp/freeCodeCamp (github.com)

Perhaps mention your thoughts there; it will help contributors triage the issue so it can be fixed.

Hope this helps

Thank you so much! I didn’t realize that the issue had been reported there - I’ll +1 it.

assertAlmostEqual can tell whether two lists (or other objects) are equal: it first checks first == second and passes if that is true. The problem arises when they are not equal. That’s when one is subtracted from the other, and if the objects don’t support that operation, the result is a TypeError. So right now that error indicates that the actual list is not equal to the expected one.
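
To illustrate the behaviour described above, here is a minimal sketch using only the standard unittest module. The short lists are made up for the demonstration and are not the project's data; `case` is just a throwaway TestCase instance so the assert method can be called outside a test run.

```python
import unittest

# A bare TestCase instance, used only to call the assert method directly.
case = unittest.TestCase()

# Equal lists: the initial `first == second` check succeeds,
# so no subtraction is attempted and the call simply returns.
case.assertAlmostEqual([1, 2, 3], [1, 2, 3])
equal_lists_pass = True

# Unequal lists (same values, different order): the equality check fails,
# the method falls through to `first - second`, and lists do not support `-`.
try:
    case.assertAlmostEqual([1, 2, 3], [3, 2, 1])
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(equal_lists_pass, error_name)
```

So the TypeError is a slightly misleading way of reporting "these two lists differ", not a sign that lists can never be passed in.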

Try race_count = df['race'].value_counts()

Using the groupby method sorts the result by its index (the race names), so the counts come out in a different order from the one the test expects. I believe that was why the test module threw the error.

EDIT - jstephen95, didn’t see your second post until after I submitted! You’re right on both counts.

It turns out assertAlmostEqual only passes for lists with identical elements in identical order; the expected list in the test module is sorted in descending order; and pandas value_counts() sorts descending by count by default.

I should stress to the creators of the assignment, however, that the requirement to sort the series in descending order is not included in the prompt:

  • How many people of each race are represented in this dataset? This should be a Pandas series with race names as the index labels. (race column)

So even though my code returns correct data:

Amer-Indian-Eskimo 311
Asian-Pac-Islander 1039
Black 3124
Other 271
White 27816

This series is not sorted the same way as the expected list in the test module, and therefore the test throws an error. I recommend that the requirements for the assignment be made clearer.
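
As a quick check of that point, here is a small sketch in plain Python using the counts shown above and the expected list from the test module. The `case` object is only a throwaway TestCase instance, and using assertCountEqual here is my own illustration, not what the test module does.

```python
import unittest

# Counts in the order groupby returned them (alphabetical by race name).
actual = [311, 1039, 3124, 271, 27816]
# Expected list copied from the test module.
expected = [27816, 3124, 1039, 311, 271]

# Order-sensitive comparison: this is what the test effectively requires.
same_order = actual == expected

# Sorting the counts in descending order reproduces the expected list.
same_after_sort = sorted(actual, reverse=True) == expected

# assertCountEqual ignores order, so it passes on the unsorted list.
case = unittest.TestCase()
case.assertCountEqual(actual, expected)

print(same_order, same_after_sort)
```

The data are the same; only the ordering differs.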

I guess the creators don’t want us to use the groupby function for that particular exercise.
Also, value_counts() sorts the values in descending order, while groupby sorts the index in ascending order.
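
For anyone who wants to see the two orderings side by side, here is a minimal sketch. The tiny DataFrame is made-up toy data, not the project's dataset, and the variable names are just for illustration.

```python
import pandas as pd

# Made-up toy data standing in for the 'race' column.
df = pd.DataFrame({"race": ["White", "Black", "White", "Other", "White", "Black"]})

# groupby sorts by the group keys, so the index is alphabetical.
by_group = df.groupby("race").size()

# value_counts sorts by count, largest first.
by_count = df["race"].value_counts()

# The groupby result can still be put in the same order by sorting its values.
by_group_sorted = by_group.sort_values(ascending=False)

print(by_group.tolist())
print(by_count.tolist())
print(by_group_sorted.tolist())
```

Either approach gives the same counts; the test only passes when they arrive in descending order.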