Data Analysis with Python Projects - Demographic Data Analyzer

Tell us what’s happening:

Describe your issue in detail here.

Your code so far

import pandas as pd

# Load the dataset
data = pd.read_csv('demographic_data.csv')  # Replace 'demographic_data.csv' with your dataset filename

# How many people of each race are represented in this dataset?
race_counts = data['race'].value_counts()

# What is the average age of men?
average_age_men = data[data['sex'] == 'Male']['age'].mean()

# What is the percentage of people who have a Bachelor's degree?
bachelors_percentage = (data['education'] == 'Bachelors').mean() * 100

# What percentage of people with advanced education make more than 50K?
advanced_education = data['education'].isin(['Bachelors', 'Masters', 'Doctorate'])
higher_education_rich = (data.loc[advanced_education, 'salary'] == '>50K').mean() * 100

# What percentage of people without advanced education make more than 50K?
lower_education_rich = (data.loc[~advanced_education, 'salary'] == '>50K').mean() * 100

# What is the minimum number of hours a person works per week?
min_work_hours = data['hours-per-week'].min()

# What percentage of the people who work the minimum number of hours per week have a salary of more than 50K?
min_workers = data[data['hours-per-week'] == min_work_hours]  # rows for people working the minimum hours
rich_percentage = (min_workers['salary'] == '>50K').mean() * 100

# What country has the highest percentage of people that earn >50K and what is that percentage?
highest_earning_country = (data[data['salary'] == '>50K']['native-country'].value_counts() / data['native-country'].value_counts()).idxmax()
highest_earning_country_percentage = (data[(data['native-country'] == highest_earning_country) & (data['salary'] == '>50K')].shape[0] / data[data['native-country'] == highest_earning_country].shape[0]) * 100

# Identify the most popular occupation for those who earn >50K in India.
top_IN_occupation = data[(data['native-country'] == 'India') & (data['salary'] == '>50K')]['occupation'].value_counts().idxmax()

# Displaying the results
print("Race counts:")
print(race_counts)
print("\nAverage age of men:", round(average_age_men, 1))
print("\nPercentage of people with Bachelor's degree:", round(bachelors_percentage, 1))
print("\nPercentage of people with advanced education earning >50K:", round(higher_education_rich, 1))
print("\nPercentage of people without advanced education earning >50K:", round(lower_education_rich, 1))
print("\nMinimum number of hours worked per week:", min_work_hours)
print("\nPercentage of people working min hours per week earning >50K:", round(rich_percentage, 1))
print("\nCountry with the highest percentage earning >50K:", highest_earning_country)
print("Percentage:", round(highest_earning_country_percentage, 1))
print("\nMost popular occupation for >50K earners in India:", top_IN_occupation)
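
Since no specific issue is described, one way to sanity-check the calculations above is to run the same logic on a tiny hand-made DataFrame where the answers can be worked out by hand. The following is only a minimal sketch: it assumes the same column names as the code above, the rows in `sample` are made up for illustration (they are not from the real dataset), and the per-country step uses a `groupby` on a boolean column as an alternative to dividing two `value_counts()` results.

```python
import pandas as pd

# Hypothetical rows for illustration only (not taken from the real dataset)
sample = pd.DataFrame({
    'native-country': ['India', 'India', 'India', 'Iran', 'Iran', 'Peru'],
    'salary': ['>50K', '>50K', '<=50K', '>50K', '<=50K', '<=50K'],
    'occupation': ['Prof-specialty', 'Prof-specialty', 'Sales', 'Sales', 'Sales', 'Sales'],
})

# Boolean mask of >50K earners; the mean of a boolean column is the fraction of True values
is_rich = sample['salary'] == '>50K'

# Percentage of >50K earners per country (countries with none get 0.0 rather than NaN)
rich_share = is_rich.groupby(sample['native-country']).mean() * 100

top_country = rich_share.idxmax()
top_percentage = round(rich_share.max(), 1)

# Most common occupation among >50K earners in India
top_in_occupation = sample[(sample['native-country'] == 'India') & is_rich]['occupation'].value_counts().idxmax()

print(rich_share)
print(top_country, top_percentage, top_in_occupation)
```

In this made-up sample, India has 2 of 3 people earning >50K, Iran 1 of 2, and Peru 0 of 1, so the results are easy to verify by hand before running the same logic on the full dataset.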

Your browser information:

User Agent is: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Challenge Information:

Data Analysis with Python Projects - Demographic Data Analyzer

### Replit Link for Helping Out
https://replit.com/join/acrieunwgo-smrutiparida

You appear to have created this post without editing the template. Please edit your post to Tell us what’s happening in your own words.

We see you have posted some code but did you have a question?

(You have not filled out the “Tell us what’s happening:” field above)
