Pls what could be wrong with this code?

kingdanieldavid · November 25, 2021, 6:56pm

import pandas as pd


def calculate_demographic_data(print_data=True):
    # Read data from file
    df = pd.read_csv("adult.data.csv")

    # How many of each race are represented in this dataset? This should be a Pandas series with race names as the index labels.
    race_count = df.groupby("race")["race"].count()

    # What is the average age of men?
    average_age_men = df[df["sex"]=="Men"]["age"].mean()

    # What is the percentage of people who have a Bachelor's degree?
    percentage_bachelors = (df[df["education"]=="Bachelors"]["education-num"].count())/(df["education"].count())

    # What percentage of people with advanced education (`Bachelors`, `Masters`, or `Doctorate`) make more than 50K?
    # What percentage of people without advanced education make more than 50K?

    # with and without `Bachelors`, `Masters`, or `Doctorate`
    higher_education = df[df["education"].isin(["Bachelors", "Masters", "Doctorate"])]
    lower_education = df[~(df["education"].isin(["Bachelors", "Masters", "Doctorate"]))]

    # percentage with salary >50K
    higher_education_rich = (df[df["education"].isin(["Bachelors", "Masters", "Doctorate"])]["education-num"].count())/(df["salary"].count())
    lower_education_rich = (df[~(df["education"].isin(["Bachelors", "Masters", "Doctorate"]))]["education-num"].count())/(df["salary"].count())

    # What is the minimum number of hours a person works per week (hours-per-week feature)?
    min_work_hours =df["hours-per-week"].min()

    # What percentage of the people who work the minimum number of hours per week have a salary of >50K?
    num_min_workers = df[df["hours-per-week"]==min_work_hours]["hours-per-week"].count()

    rich_percentage = (df[(df["hours-per-week"]==min_work_hours)& (df["salary"]==">50K")])/(num_min_workers)*100

    # What country has the highest percentage of people that earn >50K?
    highest_earning_country =(df[df["salary"]==">50K"]["native-country"].value_counts()/df["native-country"].value_counts()*100).idxmax()
    highest_earning_country_percentage =(df[df["salary"]==">50K"]["native-country"].value_counts()/df["native-country"].value_counts()*100).max()

    # Identify the most popular occupation for those who earn >50K in India.
    top_IN_occupation = (df[(df["native-country"]=="India")&(df["salary"]==">50K")]["occupation"].value_counts().idxmax())

It is not working on my boilerplate: https://replit.com/@Dandave11/boilerplate-demographic-data-analyzer-3#demographic_data_analyzer.py

Response after running the code:

File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py", line 166, in _na_arithmetic_op
    result = func(left, right)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/computation/expressions.py", line 239, in evaluate
    return _evaluate(op, op_str, a, b)  # type: ignore[misc]
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/computation/expressions.py", line 69, in _evaluate_standard
    return op(a, b)
TypeError: unsupported operand type(s) for /: 'str' and 'int'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 6, in <module>
    demographic_data_analyzer.calculate_demographic_data()
  File "/home/runner/boilerplate-demographic-data-analyzer-3/demographic_data_analyzer.py", line 34, in calculate_demographic_data
    rich_percentage = (df[(df["hours-per-week"]==min_work_hours)& (df["salary"]==">50K")])/(num_min_workers)*100
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/ops/common.py", line 69, in new_method
    return method(self, other)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/arraylike.py", line 116, in __truediv__
    return self._arith_method(other, operator.truediv)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/frame.py", line 6866, in _arith_method
    new_data = self._dispatch_frame_op(other, op, axis=axis)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/frame.py", line 6893, in _dispatch_frame_op
    bm = self._mgr.apply(array_op, right=right)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 325, in apply
    applied = b.apply(f, **kwargs)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 381, in apply
    result = func(self.values, **kwargs)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py", line 224, in arithmetic_op
    res_values = _na_arithmetic_op(left, right, op)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py", line 173, in _na_arithmetic_op
    result = _masked_arith_op(left, right, op)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/pandas/core/ops/array_ops.py", line 131, in _masked_arith_op
    result[mask] = op(xrav[mask], y)
TypeError: unsupported operand type(s) for /: 'str' and 'int'
exit status 1


lasjorg · November 25, 2021, 7:04pm

I’ve edited your post for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (’).

lasjorg · November 25, 2021, 7:13pm

TypeError: unsupported operand type(s) for /: 'str' and 'int'

I would assume this means you are trying to divide a string by a number, if I’m reading it right.

I don’t really know Python, but I’m pretty sure & is a bitwise operator in Python (see line 41 in demographic_data_analyzer.py).

That entire expression is just confusing to read and I can’t really make sense of it.

Jagaya · November 25, 2021, 7:36pm

You have to look at the error message and look for the file you are working on.
The “Traceback” shows every file the interpreter was working through when it encountered the error. Most of those files belong to the libraries you are using and are none of your concern. The same goes for main.py, as this is just the main file from which the actual code is called.

So we got:

  File "/home/runner/boilerplate-demographic-data-analyzer-3/demographic_data_analyzer.py", line 34, in calculate_demographic_data
    rich_percentage = (df[(df["hours-per-week"]==min_work_hours)& (df["salary"]==">50K")])/(num_min_workers)*100

Then we have the actual error

TypeError: unsupported operand type(s) for /: 'str' and 'int'

Meaning you tried to divide a string and an integer in that order.
Soooo the left side of your division returns some kind of string:
(df[(df["hours-per-week"]==min_work_hours)& (df["salary"]==">50K")])
Or in this case, it returns a slice of the dataframe, which contains all kinds of data, including text…

And man, I just wrote all this crap, only to figure out that you just need to get the size of the slice, either with .shape[0] or len()… ^^°
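
To make that concrete, here is a minimal sketch using a tiny made-up DataFrame (the rows are invented for illustration and are not taken from adult.data.csv; the variable names `min_workers` and `rich_min_workers` are also just for this example). Dividing the filtered slice itself fails because the slice still contains a text column, while dividing the number of rows in the slice works.

```python
import pandas as pd

# Made-up sample rows standing in for the real dataset.
# "salary" is stored as object dtype (plain Python strings), which is
# what the traceback in this thread shows.
df = pd.DataFrame({
    "hours-per-week": [1, 1, 1, 1, 40],
    "salary": pd.Series([">50K", "<=50K", "<=50K", "<=50K", ">50K"], dtype=object),
})

min_work_hours = df["hours-per-week"].min()
min_workers = df[df["hours-per-week"] == min_work_hours]
rich_min_workers = min_workers[min_workers["salary"] == ">50K"]

# Dividing the slice itself tries to divide every cell, including the
# text in "salary", so pandas raises TypeError.
try:
    rich_min_workers / len(min_workers)
    division_failed = False
except TypeError:
    division_failed = True

# Dividing the row counts gives the percentage instead.
rich_percentage = len(rich_min_workers) / len(min_workers) * 100
print(division_failed, rich_percentage)
```

With these invented rows, one of the four minimum-hours workers earns >50K, so the count-based division yields a single number rather than a DataFrame.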

sanity · November 25, 2021, 7:40pm

That’s not unusual here: pandas uses the & operator as an element-wise boolean AND.
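
A small sketch of what that looks like, using made-up values (only the column names come from the thread). On boolean Series, & combines the two masks element by element, and each comparison needs its own parentheses because & binds more tightly than ==.

```python
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({
    "hours-per-week": [1, 1, 40],
    "salary": [">50K", "<=50K", ">50K"],
})

# Each comparison produces a boolean Series (a "mask").
# The parentheses are required: without them Python would try to
# evaluate `1 & df["salary"]` first.
both = (df["hours-per-week"] == 1) & (df["salary"] == ">50K")
print(both.tolist())

# Indexing with the mask keeps only the rows where both conditions hold.
matching_rows = df[both]
print(len(matching_rows))
```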

kingdanieldavid · November 25, 2021, 8:36pm

Please, can you explain a bit more, with some kind of recommendation?

Jagaya · November 25, 2021, 9:11pm

Can you specify the question? I thought I explained the issue sufficiently.

CodeLiveNow · November 25, 2021, 11:40pm

All your other lines use .count() why not this one?

kingdanieldavid · November 26, 2021, 3:19pm

Sorry, I understand the point now. I have made the changes and it’s working now. Thanks!!!

kingdanieldavid · November 26, 2021, 3:19pm

You are right, .count() is also applicable on that line.

CodeLiveNow · November 26, 2021, 3:34pm

Does adding .count() fix the issue?

kingdanieldavid · November 26, 2021, 3:40pm

rich_percentage = (df[(df["hours-per-week"]==min_work_hours) & (df["salary"]==">50K")])["hours-per-week"].count()/num_min_workers*100
print(rich_percentage)

Yes, it did, but I also selected the hours-per-week column before calling .count().
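
For comparison, here is a sketch of two ways to get that percentage on a small made-up DataFrame (the data is invented; only the column names come from the thread, and `by_count` and `by_mean` are names chosen for this example). Selecting one column before .count() returns a single number, whereas calling .count() on the whole slice returns one count per column. Taking the mean of a boolean mask is a shorter alternative, not necessarily what the project expects.

```python
import pandas as pd

# Made-up sample rows for illustration only.
df = pd.DataFrame({
    "hours-per-week": [1, 1, 1, 1, 40],
    "salary": [">50K", "<=50K", "<=50K", ">50K", ">50K"],
})

min_work_hours = df["hours-per-week"].min()
min_workers = df[df["hours-per-week"] == min_work_hours]

# Count-based version, following the approach in the post above.
rich = min_workers[min_workers["salary"] == ">50K"]
num_min_workers = min_workers["hours-per-week"].count()
by_count = rich["hours-per-week"].count() / num_min_workers * 100

# .count() on the whole slice gives a Series with one count per column.
per_column_counts = rich.count()

# Mean of a boolean mask: True counts as 1 and False as 0.
by_mean = (min_workers["salary"] == ">50K").mean() * 100

print(by_count, by_mean)
```

Note that .count() skips missing values in the selected column, so len() and .count() can differ when that column contains NaN.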

