Problems with "Data Analysis with Python Projects - Medical Data Visualizer"

MrMauriF · August 26, 2020, 3:14pm

Hi.

I'm facing many problems and weird contradictions in this project.

1- The README does not specify which columns the catplot should use, although it says the result should be similar to the example plot, and test_module checks for these columns. But the code comments say I should plot another set of columns.

2- The code comments ask me to do the following:

# Group and reformat the data to split it by 'cardio'. Show the counts of each feature. You will have to rename one of the columns for the catplot to work correctly.

Which I find completely stupid, unnecessary and wrong. I just did this and it worked perfectly:

df_cat = pd.melt(df[["cardio", "cholesterol", "gluc", "smoke", "alco", "active", "overweight"]], id_vars="cardio")

# Draw the catplot with 'sns.catplot()'
fig = sns.catplot(x="variable", col="cardio", hue="value", data=df_cat, kind="count")

fig.set_axis_labels("variable", "total")

Is there any other way?
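For comparison, here is a minimal sketch of what the code comment seems to be describing: melt, then group and count, then name the count column so it can be plotted directly. The toy data and the column name "total" are assumptions for illustration, not taken from the project files.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration)
df = pd.DataFrame({
    "cardio": [0, 0, 1, 1, 1],
    "smoke": [0, 1, 0, 0, 1],
    "alco": [0, 0, 1, 1, 1],
})

# Long format: one row per (patient, feature) pair
df_cat = pd.melt(df, id_vars="cardio", value_vars=["smoke", "alco"])

# Count each (cardio, variable, value) combination and name the count "total"
df_counts = (
    df_cat.groupby(["cardio", "variable", "value"])
    .size()
    .reset_index(name="total")
)

print(df_counts)
```

The resulting frame could then be passed to sns.catplot with x="variable", y="total", hue="value", col="cardio" and kind="bar". With kind="count", as in the post above, seaborn does the counting itself, so the grouping step is not strictly needed to get a similar-looking plot.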

3- Now, even plotting an exact copy of the example plot following all conditions, I receive an error stating the following:

ERROR: test_bar_plot_number_of_bars (test_module.CatPlotTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/fcc-medical-data-visualizer/test_module.py", line 27, in test_bar_plot_number_of_bars
    actual = len([rect for rect in self.ax.get_children()
AttributeError: 'numpy.ndarray' object has no attribute 'get_children'

======================================================================
ERROR: test_line_plot_labels (test_module.CatPlotTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/fcc-medical-data-visualizer/test_module.py", line 13, in test_line_plot_labels
    actual = self.ax.get_xlabel()
AttributeError: 'numpy.ndarray' object has no attribute 'get_xlabel'

4- The code comments say that I should clean the data before plotting the heatmap, but give no clue about what needs to be cleaned.

5- Now, trying to plot the heatmap using the following code based on the entire dataframe I have this problem:

    # Clean the data
    df_heat = df

    # Calculate the correlation matrix
    corr = np.corrcoef(df_heat)

MemoryError: Unable to allocate 29.8 GiB for an array with shape (63259, 63259) and data type float64

Based on this error, I believe I should use a smaller portion of the dataframe, but then I probably won't get the same expected values.
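A note on the MemoryError: by default np.corrcoef treats each row as a variable (rowvar=True), so a dataframe with 63259 rows yields a 63259 x 63259 matrix. Correlating the columns instead keeps the full dataset and gives a small matrix. A minimal sketch on toy data (the column names here are placeholders, not the project's columns):

```python
import numpy as np
import pandas as pd

# Toy frame: 50 rows (observations), 4 columns (variables)
rng = np.random.default_rng(0)
df_heat = pd.DataFrame(rng.normal(size=(50, 4)), columns=["a", "b", "c", "d"])

# Default rowvar=True correlates rows with rows: one entry per pair of rows
by_rows = np.corrcoef(df_heat)

# rowvar=False correlates columns with columns
by_cols = np.corrcoef(df_heat, rowvar=False)

# pandas equivalent: column-wise Pearson correlation, keeps the column labels
corr = df_heat.corr()

print(by_rows.shape)  # (50, 50)
print(corr.shape)     # (4, 4)
```

Using df_heat.corr() also has the advantage that the result is a labelled DataFrame, which is convenient when drawing a heatmap from it.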

craig.lunney · November 2, 2020, 4:54pm

Did you have any success with resolving issue 3? I have the same problem: if I run the code without the test module, it generates a heatmap.png that looks the same as the example (apart from the colourmap), but I get this same error with the test module.

lendoo · November 2, 2020, 7:30pm

Hi,
Try this code:

fig = sns.catplot(...).fig
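A likely reason this helps: sns.catplot returns a FacetGrid, whose axes attribute is a 2-D NumPy array, while a matplotlib Figure keeps its Axes in a plain list. The traceback suggests the test takes the first element of axes and calls Axes methods on it, although that is an assumption, since only self.ax is visible in the error. A small sketch on made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Made-up long-format data, just enough to get two facets
df_cat = pd.DataFrame({
    "cardio": [0, 0, 1, 1],
    "variable": ["smoke", "alco", "smoke", "alco"],
    "value": [0, 1, 1, 0],
})

grid = sns.catplot(x="variable", col="cardio", hue="value", data=df_cat, kind="count")

# FacetGrid.axes is a 2-D array, so indexing [0] gives a row of Axes, not an Axes
grid_first = grid.axes[0]

# The underlying Figure stores its Axes in a list, so [0] is a real Axes
fig = grid.fig
fig_first = fig.axes[0]

print(type(grid_first).__name__)
print(hasattr(fig_first, "get_xlabel"))
```

Newer seaborn releases also expose the same object as .figure, so if .fig is unavailable in your version, that attribute may be worth trying.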

craig.lunney · November 3, 2020, 1:46pm

This worked for me. Thanks, much appreciated.
