Data Analysis with Python Projects - Demographic Data Analyzer

Tell us what’s happening:
Describe your issue in detail here.
Console is not able to install pandas as it is viewing a text said “No module names pandas”

Your code so far
import pandas as pd

def calculate_demographic_data(print_data=True):
# Read data from file
df = pd.read_csv (“adult.data.csv”)

# How many of each race are represented in this dataset? This should be a Pandas series with race names as the index labels.
race_count = df['race'].value_counts()

# What is the average age of men?
average_age_men = df[df['sex'] == 'male']['age'].mean().round(1)

# What is the percentage of people who have a Bachelor's degree?
percentage_bachelors = round((len(df[df['education'] == 'Bachelors'])) / len(df), 1)

# What percentage of people with advanced education (`Bachelors`, `Masters`, or `Doctorate`) make more than 50K?
# What percentage of people without advanced education make more than 50K?

# with and without `Bachelors`, `Masters`, or `Doctorate`
higher_education = df[df["education"].isin(["Bachelors","Masters","Doctorate"])]
lower_education = df[~df["education"].isin(["Bachelors","Masters","Doctorate"])]

# percentage with salary >50K
higher_education_rich = round((len(higher_education)/len(higher_education[higher_education.salary > '50K'])) * 100, 1)
lower_education_rich = round((len(lower_education)/len(lower_education[lower_education.salary > '50K'])) * 100, 1)

# What is the minimum number of hours a person works per week (hours-per-week feature)?
min_work_hours = df['hours-per-week'].min()

# What percentage of the people who work the minimum number of hours per week have a salary of >50K?
num_min_workers = df[df['hours-per-week'] == min_work_hours]

rich_percentage = round((len(num_min_workers[num_min_workers.salary=='50k'])/len(num_min_workers))*100,1)

# What country has the highest percentage of people that earn >50K?
country_count = df['native-country'].value_counts()
country_rich_count = df[df['salary'] > '50k']['native-country'].value_counts()

highest_earning_country = (country_rich_count/country_count * 100).idxmax()
highest_earning_country_percentage = round((country_rich_count/country_count * 100).max(), 1)

# Identify the most popular occupation for those who earn >50K in India.
india_people = df[df['native-country']== 'India']
occupation_count = india_people['occupation'].value_counts()
top_IN_occupation = occupation_count.idxmax()

# DO NOT MODIFY BELOW THIS LINE

if print_data:
    print("Number of each race:\n", race_count) 
    print("Average age of men:", average_age_men)
    print(f"Percentage with Bachelors degrees: {percentage_bachelors}%")
    print(f"Percentage with higher education that earn >50K: {higher_education_rich}%")
    print(f"Percentage without higher education that earn >50K: {lower_education_rich}%")
    print(f"Min work time: {min_work_hours} hours/week")
    print(f"Percentage of rich among those who work fewest hours: {rich_percentage}%")
    print("Country with highest percentage of rich:", highest_earning_country)
    print(f"Highest percentage of rich people in country: {highest_earning_country_percentage}%")
    print("Top occupations in India:", top_IN_occupation)

return {
    'race_count': race_count,
    'average_age_men': average_age_men,
    'percentage_bachelors': percentage_bachelors,
    'higher_education_rich': higher_education_rich,
    'lower_education_rich': lower_education_rich,
    'min_work_hours': min_work_hours,
    'rich_percentage': rich_percentage,
    'highest_earning_country': highest_earning_country,
    'highest_earning_country_percentage':
    highest_earning_country_percentage,
    'top_IN_occupation': top_IN_occupation
}

Your browser information:

User Agent is: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 Edg/109.0.1518.78

Challenge: Data Analysis with Python Projects - Demographic Data Analyzer

Link to the challenge:

This is typically the dependency problem mentioned in many posts about the python data analysis projects. You’ll need to update your pyproject.toml file and delete and reinstall your dependencies (assuming you’re using replit). If that doesn’t work, you’ll need to post a link to your repl for debugging.

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.