Demographic data analyzer problem

vertebraofficial01 · February 29, 2024, 6:24pm

Hi, I’m working on project 2 of the data analysis certification, but I have a problem. I can’t figure out what I’m doing wrong when calculating the percentage of people with higher education (Bachelors, Masters, or Doctorate). Here is my code:

higher_education_rich_1=df['education'].loc[(df['education']=='Bachelors')|(df['education']=='Masters')|(df['education']=='Doctorate')]
salary=df['salary'].loc[df['salary']=='>50K']
higher_education_rich=higher_education_rich_1.loc[higher_education_rich_1.index.isin(salary.index)==True].value_counts().sum()/len(df)*100
lower_education_rich_1=df['education'].loc[~(df['education']=='Bachelors')|(df['education']=='Masters')|(df['education']=='Doctorate')]

a2937 · February 29, 2024, 6:33pm

Hi there. Can you please reformat your code so it’s easier to read? You need to wrap your code in three backticks (`): three on a line above it and three on a line below it. Adding that and proper spacing will go a long way toward helping us make sense of what you wrote.

Hope this helps.

vertebraofficial01 · February 29, 2024, 6:35pm

Of course, excuse me.

'higher_education_rich_1=df['education'].loc[(df['education']=='Bachelors')|(df['education']=='Masters')|(df['education']=='Doctorate')]

salary=df['salary'].loc[df['salary']=='>50K']

higher_education_rich=higher_education_rich_1.loc[higher_education_rich_1.index.isin(salary.index)==True].value_counts().sum()/len(df)*100'

ILM · February 29, 2024, 6:37pm

When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (').

vertebraofficial01 · February 29, 2024, 6:41pm

higher_education_rich_1=df['education'].loc[(df['education']=='Bachelors')|(df['education']=='Masters')|(df['education']=='Doctorate')]

salary=df['salary'].loc[df['salary']=='>50K']

higher_education_rich=higher_education_rich_1.loc[higher_education_rich_1.index.isin(salary.index)==True].value_counts().sum()/len(df)*100

pkdvalis · February 29, 2024, 7:03pm

I’ve edited your code for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (').

vertebraofficial01 · February 29, 2024, 7:04pm

OK, thank you. I didn’t know that.

pkdvalis · February 29, 2024, 11:01pm

What percentage of people with advanced education (Bachelors, Masters, or Doctorate) make more than 50K?

You have calculated the percentage of the total group that has both higher education and a salary above 50K, because you are dividing by the length of the original dataframe (/len(df)*100):

higher_education_rich=higher_education_rich_1.loc[higher_education_rich_1.index.isin(salary.index)==True].value_counts().sum()/len(df)*100

vertebraofficial01 · March 1, 2024, 5:20pm

Is it wrong? I don’t understand

pkdvalis · March 2, 2024, 1:10am

What percentage of people with advanced education (Bachelors, Masters, or Doctorate) make more than 50K?

This would be:
(advanced education & 50K) / (advanced education)

You calculated:
(advanced education & 50K) / (Total Group)
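
A minimal sketch of the difference between the two ratios, assuming the `education` and `salary` column names from the code above. The small dataframe is made-up sample data, not the project's dataset, and `advanced`, `rich`, and `share_of_total` are illustrative names:

```python
import pandas as pd

# Made-up sample rows standing in for the project's dataset (hypothetical data).
df = pd.DataFrame({
    "education": ["Bachelors", "Masters", "HS-grad", "Doctorate",
                  "HS-grad", "Bachelors", "Some-college"],
    "salary":    [">50K", "<=50K", "<=50K", ">50K",
                  ">50K", "<=50K", "<=50K"],
})

# Boolean masks: one per condition, aligned on the dataframe index.
advanced = df["education"].isin(["Bachelors", "Masters", "Doctorate"])
rich = df["salary"] == ">50K"

# (advanced education & >50K) / (advanced education)
higher_education_rich = (advanced & rich).sum() / advanced.sum() * 100

# (advanced education & >50K) / (total group): what dividing by len(df) gives
share_of_total = (advanced & rich).sum() / len(df) * 100

# For the lower-education group, negate the whole mask. Writing ~(a)|(b)|(c)
# negates only the first condition, because ~ binds tighter than |.
lower_education_rich = (~advanced & rich).sum() / (~advanced).sum() * 100

print(higher_education_rich, share_of_total, lower_education_rich)
```

On this sample, the first ratio is 50.0 while the second is about 28.6, which shows how much the choice of denominator changes the answer.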

vertebraofficial01 · March 2, 2024, 10:24am

Oh, thank you very much. I didn’t realize that I had to divide not by the entire dataframe or the education column, but by the number of rows that correspond to higher education. I was just missing this step to finish the project; I had already managed to do the lower education part.

Actually, my first approach was to create a crosstab of the education and salary columns as percentages, select the higher education rows with a salary above 50K, and sum them, but it didn’t work. I then found a question here on the forum where someone had the same problem and was told not to group the data, so I used the approach I posted. Thank you very much. It was the only thing I didn’t understand.
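
For reference, a crosstab can also produce this figure if it holds raw counts rather than percentages. One plausible reason the percentage version failed is that summing percentages computed against the whole table brings back the total-group denominator. The sketch below assumes the same `education` and `salary` column names and uses made-up sample data; `table` and `advanced_rows` are illustrative names:

```python
import pandas as pd

# Made-up sample rows standing in for the project's dataset (hypothetical data).
df = pd.DataFrame({
    "education": ["Bachelors", "Masters", "HS-grad", "Doctorate",
                  "HS-grad", "Bachelors", "Some-college"],
    "salary":    [">50K", "<=50K", "<=50K", ">50K",
                  ">50K", "<=50K", "<=50K"],
})

# Count table: one row per education level, one column per salary bracket.
table = pd.crosstab(df["education"], df["salary"])

# Keep only the advanced-education rows, then divide the >50K counts
# by all counts in those rows (not by the whole table).
advanced_rows = table.loc[["Bachelors", "Masters", "Doctorate"]]
higher_education_rich = (
    advanced_rows[">50K"].sum() / advanced_rows.to_numpy().sum() * 100
)

print(higher_education_rich)
```

This should agree with the boolean-mask approach, since both divide the same numerator by the number of advanced-education rows.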

system · August 31, 2024, 10:25pm

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.
