Medical Data Visualizer - Catplot not working properly

Tell us what’s happening:
I’m at the part where I need to create a catplot with the processed data from df_cat. The dataframe seems to be correct, but I can’t get the plot to look anything like the example figure.

Your code so far

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Import data
df = pd.read_csv(r'numpy_pandas\data\medical_examination.csv')

# Add 'overweight' column
# Use np.where
df['overweight'] = np.where(df['weight']/pow(df['height']/100, 2) > 25, True, False)

# Normalize data by making 0 always good and 1 always bad. If the value of 'cholesterol' or 'gluc' is 1, make the value 0. If the value is more than 1, make the value 1.

df.loc[:, ['cholesterol', 'gluc']] = np.where(df.loc[:, ['cholesterol', 'gluc']] > 1, 1, 0)


# Draw Categorical Plot
def draw_cat_plot():
    # Create DataFrame for cat plot using `pd.melt` using just the values from 'cholesterol', 'gluc', 
    # 'smoke', 'alco', 'active', and 'overweight'.
    df_cat = df.melt(id_vars=['cardio'],
                    value_vars=['cholesterol', 'gluc', 'smoke', 
                                'alco', 'active', 'overweight'])
    

    # Group and reformat the data to split it by 'cardio'. Show the counts of each feature.
    df_cat = df_cat.groupby(['cardio', 'variable', 'value'])['variable'].count().reset_index(name='total')

    # Draw the catplot with 'sns.catplot()'
    # Get the figure for the output
    fig = sns.catplot(data=df_cat, x='variable', col='cardio',
             kind='count')  

    # Do not modify the next two lines
    fig.savefig('catplot.png')
    return fig

draw_cat_plot()

This is the output:


Additionally, I can’t get the hue parameter to work, as I get a traceback error suggesting that the ‘value’ column in the df_cat is not actually a column.


Challenge: Data Analysis with Python Projects - Medical Data Visualizer

Ok so I finally figured it out.

First, the correct value to pass to kind is ‘bar’, not ‘count’. The counts are already stored in the ‘total’ column of df_cat, so that column has to go on the y axis.
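To illustrate why ‘count’ looked wrong, here is a minimal sketch with made-up numbers (the values below are hypothetical, not taken from medical_examination.csv). After the groupby step each combination is a single row, so counting rows gives every bar the same height, while the real counts sit in ‘total’.

```python
import pandas as pd

# Hypothetical, already-aggregated data shaped like df_cat after the groupby step.
df_cat = pd.DataFrame({
    "cardio":   [0, 0, 0, 0],
    "variable": ["smoke", "smoke", "alco", "alco"],
    "value":    [0, 1, 0, 1],
    "total":    [900, 100, 950, 50],
})

# Roughly what kind='count' plots: the number of rows per x category.
rows_per_variable = df_cat.groupby("variable").size()

# What kind='bar' with y='total' plots: the stored counts themselves.
totals = df_cat.set_index(["variable", "value"])["total"]

print(rows_per_variable.to_dict())
print(totals.to_dict())
```

Every variable has the same number of rows here, which is why the ‘count’ plot came out flat.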

Secondly, I also had to pass a hue parameter to split up the bars based on the different values. It’s worth noting that, in my case, the plot only worked when this column had an object dtype. If you did this like me, the column has a dtype of int64, the graph won’t work and you’ll get a traceback. You can convert the column using the astype(str) method and then it’ll work properly.

The final graph should look something like this:

    # Group and reformat the data to split it by 'cardio'. Show the counts of each feature.
    df_cat = df_cat.groupby(['cardio', 'variable', 'value'])['variable'].count().reset_index(name='total')
    df_cat['value'] = df_cat['value'].astype(str)

    # Draw the catplot with 'sns.catplot()'
    # Get the figure for the output
    fig = sns.catplot(data=df_cat, x='variable', y='total', col='cardio',
                      kind='bar', errorbar=None, hue='value')

Appreciated, it has helped me.
