The exercise is to create two bar plots using catplot that show the total number of people who are active and not active in each cardio category, and likewise for the other variables such as gluc, cholesterol, etc.
I can't work out how to compute the total number of people and use it as the y-axis values.
Your code so far
So far I was able to do this:
df_melted = pd.melt(df, id_vars=['cardio'], value_vars=['active', 'gluc', 'cholesterol', 'smoke', 'alco', 'overweight'])
I can get the total counts with df_melted.value_counts(), but I can't figure out how to use the result for plotting.
Challenge: Data Analysis with Python Projects - Medical Data Visualizer
You’ll really need to post all your code (in a code block) or better, post a link to your repl for proper help.
You know you're plotting two bar graphs, so you'll need the frequency data for your bars, separated into two groups based on the cardio variable. Start your data cleaning/processing by printing the data frame initially, and then again after each transformation, as you work toward getting the data into the correct format.
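As a rough illustration of getting that frequency data, here is a minimal sketch on made-up toy data (not the real medical dataset). It assumes a groupby on cardio, variable and value followed by size(), and the column name total is only a placeholder chosen for this example. The project may expect a different approach or naming.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
df = pd.DataFrame({
    "cardio": [0, 0, 1, 1, 1, 0],
    "active": [1, 0, 1, 1, 0, 1],
    "smoke":  [0, 0, 1, 0, 1, 0],
})
print(df)  # inspect the frame before any transformation

# Long format: one row per (person, variable) pair.
df_melted = pd.melt(df, id_vars=["cardio"], value_vars=["active", "smoke"])
print(df_melted.head())  # inspect again after melting

# Count rows for each (cardio, variable, value) combination and
# keep the count as an ordinary column ("total" is an assumed name).
df_counts = (
    df_melted
    .groupby(["cardio", "variable", "value"])
    .size()
    .reset_index(name="total")
)
print(df_counts)
```

Unlike the Series returned by value_counts(), the result here is a plain data frame with the counts in their own column, which is usually easier to hand to a plotting function.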
From the second picture, I want to use the column after value, which holds the total count for each unique value, as the y-axis, with the variable on the x-axis and one bar plot per cardio value: one for cardio = 1 and another for cardio = 0.
I don’t know how to move forward from this point to create the catplot.
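One possible way forward, sketched under assumptions: if the counts are already in a long-format data frame with a count column (hypothetically named total here, with made-up numbers), seaborn's catplot can draw one bar chart per cardio value through its col parameter. This is only an illustration, not necessarily the exact form the project's tests expect.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical pre-counted data: one row per (cardio, variable, value).
df_counts = pd.DataFrame({
    "cardio":   [0, 0, 0, 1, 1, 1, 1],
    "variable": ["active", "active", "smoke", "active", "active", "smoke", "smoke"],
    "value":    [0, 1, 0, 0, 1, 0, 1],
    "total":    [1, 2, 3, 1, 2, 1, 2],
})

# x: the variable names, y: the counts, hue: the 0/1 value,
# col: one subplot for each distinct cardio value.
g = sns.catplot(
    data=df_counts,
    x="variable",
    y="total",
    hue="value",
    col="cardio",
    kind="bar",
)
fig = g.figure  # the underlying matplotlib Figure
```

Because each (cardio, variable, value) combination appears only once in this frame, every bar simply shows the stored count.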