Data Analysis with Python Projects - Demographic Data Analyzer

Tell us what’s happening:

Hello!

Problem: Calculating the mean male age yields ‘inf’.

My code seems to work in my local Python environment but breaks in Gitpod. I can’t calculate the average male age, since adding up all the ages or using .mean() yields infinity. Using df[df['sex'] == 'Male']['age'].mean().round(1) yields inf. Using total_age = int(df.loc[df['sex'] == 'Male']['age'].sum()) yields extremely large numbers. How can I fix this?

Thank you very much for reading,
Cheers!

Your browser information:

User Agent is: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:124.0) Gecko/20100101 Firefox/124.0

Can you share your full code please? Or a screenshot showing the problem?

I tested your code and got this result:

import pandas as pd

# Read data from file
df = pd.read_csv('adult.data.csv')
print(df[df['sex'] == 'Male']['age'].mean().round(1))

>>> 39.4

Hello! Thank you for replying!

I ran the following code:

import pandas as pd

columns = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "salary",
]
df = pd.read_csv("adult.data.csv", names=columns)
average_age_men = df[df['sex'] == 'Male']['age'].mean()
print(average_age_men)

This code prints inf. Below is a screenshot:

[screenshot of the output]

Surprisingly, your code works, so I guess the problem is the “names” argument?

I tested:

import pandas as pd
import numpy as np

columns = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "salary",
]
df = pd.read_csv("adult.data.csv", names=columns)
average_age_men = df[df['sex'] == 'Male']['age'].mean()
print(average_age_men)

df = pd.read_csv('adult.data.csv')
print(df[df['sex']=='Male']['age'].mean().round(1))

and got:

[screenshot of the output]

Thank you for your help!

Print the dataframe right after you load it and see what it looks like.

Sure, here is the result:

[screenshot of the printed dataframe]

So summing “age”+39+50+… gives inf? I expected an error message like: “Can’t add str and int types” in this scenario.

Well, you’re taking the mean of a column that holds text, not numbers.
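
One plausible way the inf arises, sketched in plain Python rather than pandas: if the ages are stored as strings, adding them concatenates the digits instead of summing them, and parsing a digit string that long as a float overflows to inf. The values below are made up for illustration, and the exact pandas behaviour depends on the version (newer releases may raise a TypeError instead of returning inf).

```python
# Hypothetical ages stored as text, as they would be if a stray header row
# forced the whole column to be read as strings.
ages_as_text = ["39", "50", "38", "53"] * 100

# "+" on strings concatenates: "39" + "50" -> "3950", not 89.
joined = "".join(ages_as_text)

# Parsing an 800-digit string as a float exceeds the float range,
# so Python returns inf instead of raising an error.
as_float = float(joined)

# int() does not overflow, so a digit string becomes an enormous integer.
# Only the first 20 digits are parsed here to keep the output short.
as_int = int(joined[:20])

print(as_float)  # inf
print(as_int)    # 39503853395038533950
```

That would also be consistent with int(...sum()) giving extremely large numbers rather than an error.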

But now you see why you’re getting unexpected results.
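
A minimal sketch of the likely cause, assuming adult.data.csv already starts with a header row (the plain pd.read_csv('adult.data.csv') call working earlier in the thread suggests it does). Passing names= on its own tells pandas the file has no header, so the header line is kept as the first data row and the age column is read as text. The tiny inline CSV below is a made-up stand-in for the real file.

```python
import io

import pandas as pd

# Stand-in for adult.data.csv: a made-up three-row file with a header line.
csv_text = "age,sex\n39,Male\n50,Male\n38,Female\n"
columns = ["age", "sex"]

# names= alone: the header line becomes row 0 and "age" holds strings.
bad = pd.read_csv(io.StringIO(csv_text), names=columns)
print(bad["age"].tolist())  # ['age', '39', '50', '38']

# header=0 tells pandas to replace the existing header line with names=.
good = pd.read_csv(io.StringIO(csv_text), names=columns, header=0)
male_mean = good.loc[good["sex"] == "Male", "age"].mean()
print(male_mean)  # 44.5
```

Dropping names= entirely, as in the first answer, should work just as well when the file's own column names are the ones you want.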