Medical Data Visualizer Project - Errors

Hi everyone!

I have been working on the Medical Data Visualizer Project for a couple of days, and it seemed like I almost had it until I ran it in Replit and got a bunch of errors. Can anyone point me to some resources that explain how to fix them? You can find my code below:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# Import data
med_data = pd.read_csv('medical_examination.csv')
df = pd.DataFrame(data = med_data)

# Add 'overweight' column
bmi = (df['weight'] / ((df['height'] / 100) ** 2))
df['overweight'] = np.where(bmi >= 25, 1, 0)

# Normalize data by making 0 always good and 1 always bad. If the value of 'cholesterol' or 'gluc' is 1, make the value 0. If the value is more than 1, make the value 1.
df['cholesterol'] = np.where(df['cholesterol'] > 1, 1, 0)
df['gluc'] = np.where(df['gluc'] > 1, 1, 0)


# Draw Categorical Plot
def draw_cat_plot():
    df_cat = pd.DataFrame(data=pd.melt(df, id_vars='cardio', value_vars=['cholesterol', 'gluc', 'smoke', 'alco', 'active', 'overweight']))
    cat_order = ['active', 'alco', 'cholesterol', 'gluc', 'overweight', 'smoke']
    fig = sns.catplot(x='variable', hue='value', data=df_cat, kind='count', col='cardio', order=cat_order)
    fig.set_ylabels('total')
    fig.savefig('catplot.png')
    return fig

draw_cat_plot()




# Draw Heat Map
def draw_heat_map():
    correctap_mask = df['ap_lo'] <= df['ap_hi']
    correct_ht_mask = (df['height'] >= df['height'].quantile(0.025)) & (df['height'] <= df['height'].quantile(0.975))
    correct_wt_mask = (df['weight'] >= df['weight'].quantile(0.025)) & (df['weight'] <= df['weight'].quantile(0.975))
    clean_data_mask = correctap_mask & correct_ht_mask & correct_wt_mask
    df_heat = df[clean_data_mask]
    corr = df_heat.corr()
    mask = np.zeros_like(corr)
    mask[np.triu_indices_from(mask)] = True

    fig = sns.heatmap(corr, center=0, mask = mask, annot=True, vmax=0.30)
    cbar = fig.collections[0].colorbar
    cbar.set_ticks(np.linspace(-0.08, 0.24, 5))
    fig = fig.figure



    # Do not modify the next two lines
    fig.savefig('heatmap.png')
    return fig

I saw other people getting similar errors, but I haven’t looked at their code. Please help. Thanks!

The errors I am getting:

Matplotlib created a temporary config/cache directory at /tmp/matplotlib-kmwm7d4l because the default path (/config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
EE.[]
F
======================================================================
ERROR: test_bar_plot_number_of_bars (test_module.CatPlotTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/boilerplate-medical-data-visualizer-3/test_module.py", line 26, in test_bar_plot_number_of_bars
    actual = len([rect for rect in self.ax.get_children() if isinstance(rect, mpl.patches.Rectangle)])
AttributeError: 'numpy.ndarray' object has no attribute 'get_children'

======================================================================
ERROR: test_line_plot_labels (test_module.CatPlotTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/boilerplate-medical-data-visualizer-3/test_module.py", line 13, in test_line_plot_labels
    actual = self.ax.get_xlabel()
AttributeError: 'numpy.ndarray' object has no attribute 'get_xlabel'

======================================================================
FAIL: test_heat_map_values (test_module.HeatMapTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/boilerplate-medical-data-visualizer-3/test_module.py", line 47, in test_heat_map_values
    self.assertEqual(actual, expected, "Expected different values in heat map.")
AssertionError: Lists differ: [] != ['0.0', '0.0', '-0.0', '0.0', '-0.1', '0.5[616 chars]0.1']

Second list contains 91 additional elements.
First extra element 0:
'0.0'

Diff is 941 characters long. Set self.maxDiff to None to see it. : Expected different values in heat map.

----------------------------------------------------------------------
Ran 4 tests in 12.559s

FAILED (failures=1, errors=2)

The catplot needs the line “fig = fig.fig”, for reasons I don’t know.
The heatmap, on the other hand, doesn’t.
So add the one, remove the other, and see how it works.

The fig you are returning is not a matplotlib fig object like the tests expect. The seaborn methods return their own types, and you need to find the part that is the matplotlib fig object and return that. sns.catplot() returns a seaborn FacetGrid and sns.heatmap() returns a matplotlib Axes object, if I remember correctly; check the seaborn documentation or search the forums for “facetgrid” for more details.

After you fix the return object types, you still may have some formatting issues to fix as well.
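To make the point about return types concrete, here is a minimal sketch on a tiny made-up DataFrame (the `demo` data and the variable names are assumptions for illustration, not the project's dataset). It shows why the tests raised 'numpy.ndarray' object has no attribute 'get_children': a FacetGrid's `axes` attribute is a NumPy array of Axes, so `fig.axes[0]` is not a single Axes until you return the underlying matplotlib Figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
import pandas as pd
import seaborn as sns

# Hypothetical long-form data, shaped like the output of pd.melt in the project.
demo = pd.DataFrame({
    "cardio":   [0, 0, 1, 1, 0, 1],
    "variable": ["smoke", "alco", "smoke", "alco", "smoke", "alco"],
    "value":    [0, 1, 1, 0, 1, 1],
})

# sns.catplot returns a seaborn FacetGrid, not a matplotlib Figure.
grid = sns.catplot(x="variable", hue="value", col="cardio",
                   data=demo, kind="count")
# The Figure lives on the grid; older seaborn releases expose it as grid.fig.
cat_fig = grid.figure

# sns.heatmap returns a matplotlib Axes; its Figure is available as .figure.
_, ax = plt.subplots()
heat_ax = sns.heatmap(demo[["cardio", "value"]].corr(), annot=True, ax=ax)
heat_fig = heat_ax.figure

print(type(grid).__name__, type(grid.axes).__name__)  # grid.axes is an array
print(isinstance(cat_fig, Figure), isinstance(heat_fig, Figure))
```

With a real Figure returned, `fig.axes` is a plain list and `fig.axes[0]` is a single Axes, which is what the test module appears to rely on.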


I added that to the catplot, but left the heatmap as it was. It is now only returning one failure:
https://replit.com/@HoracioRomo/boilerplate-medical-data-visualizer-4#medical_data_visualizer.py
It looks like the number of values used to create the heatmap is not as expected. I am guessing it has to do with not cleaning the data successfully. Do you see something I am missing?

I kind of followed @Jagaya 's advice:
https://replit.com/@HoracioRomo/boilerplate-medical-data-visualizer-4#medical_data_visualizer.py
I am now only getting one failure having to do with the number of values used to create the heatmap. Do you see something I may be missing? Thanks!
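For anyone hitting the same heatmap failure, here is a hedged sketch of one way to structure the function. It is not necessarily the fix used in this thread. It uses synthetic data with only four columns (the `demo` frame and the name `draw_heat_map_sketch` are assumptions), applies the same cleaning rules as the code above, builds a boolean upper-triangle mask, formats annotations with one decimal to match strings like '0.0' and '-0.1' in the expected list, and draws on a freshly created Axes. The last point is a guess at the cause of the empty list: without an explicit `ax`, sns.heatmap draws on the current Axes, which may belong to the catplot figure created earlier.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for medical_examination.csv (assumption: 4 columns only).
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "height": rng.normal(165, 8, n),
    "weight": rng.normal(70, 12, n),
    "ap_hi":  rng.normal(125, 15, n),
    "ap_lo":  rng.normal(85, 15, n),
})

def draw_heat_map_sketch(df):
    # Keep rows with plausible blood pressure and height/weight inside
    # the 2.5th to 97.5th percentile range (bounds inclusive).
    keep = (
        (df["ap_lo"] <= df["ap_hi"])
        & df["height"].between(df["height"].quantile(0.025),
                               df["height"].quantile(0.975))
        & df["weight"].between(df["weight"].quantile(0.025),
                               df["weight"].quantile(0.975))
    )
    corr = df[keep].corr()

    # Boolean mask hiding the upper triangle, diagonal included.
    mask = np.triu(np.ones_like(corr, dtype=bool))

    # Draw on a new Figure/Axes instead of whatever Axes is current.
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(corr, mask=mask, annot=True, fmt=".1f",
                center=0, vmax=0.3, ax=ax)
    return fig

heat_fig = draw_heat_map_sketch(demo)
labels = [t.get_text() for t in heat_fig.axes[0].texts]
print(len(labels))  # one label per unmasked cell: n_cols * (n_cols - 1) / 2
```

The cell count is a useful sanity check: the error message above reports 91 expected values, which equals 14 * 13 / 2, so it is consistent with a 14-column correlation matrix whose diagonal and upper triangle are masked.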

I finally passed all the tests. This is my code:
https://replit.com/@HoracioRomo/boilerplate-medical-data-visualizer-4#main.py