I am having some issues with the draw_heat_map and draw_cat_plot functions. What are these functions supposed to return?
Hi @skhanal7 !
You will receive more responses to your post if you provide the forum with more details.
Can you provide a link to your project?
I have moved this to the python category.
I found the project you are working on and added that to your post for context.
The link is above. I am working on the third project in the Data Analysis with Python Projects section. I think I figured out what I was doing wrong for the heat map and cat plot functions, but I now have a different issue. The zeroes in my heat map need to be 0.0 instead of 0. Also, most of the correlation values in my heat map are correct, but I am not sure why some are wrong. If I understood the directions correctly, I had to change all heights and weights above the 97.5th percentile to the 97.5th percentile height and weight. For the ones below the 2.5th percentile, I needed to change them to the 2.5th percentile height and weight. For ap_hi and ap_lo, I had to change ap_lo to ap_hi if ap_lo was greater. Is that all correct?
I haven’t personally gone through the python sections yet.
So I won’t be able to assist you there.
But luckily, there are plenty of people on the forum that can assist.
I would still suggest you share your code with the forum though, so people can take a look at what’s going on and better assist you.
You can also look through the previous discussions on this project.
You might find the answer you are looking for in one of those discussions.
No, you have to literally discard those values - which means creating a series only with values within the given boundaries.
Also, if you do this, you have to apply all 4 conditions in one command - otherwise you will get slightly wrong results in the heatmap.
And please share a link TO YOUR CODE - not to the challenge itself.
Without seeing your code, people will hardly be able to tell you where you might have made mistakes.
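To illustrate the advice above about discarding rows rather than changing them, here is a minimal sketch. The column names (height, weight, ap_hi, ap_lo) come from this thread, but the data is random and made up, `keep` and `df_heat` are hypothetical names, and the ap_lo/ap_hi condition is my reading of the task, so treat this as an assumption rather than the official solution. The point is that every condition is computed against the original data and combined into one selection.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; the real project loads a CSV instead.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height": rng.normal(165, 8, 1000),
    "weight": rng.normal(74, 14, 1000),
    "ap_hi": rng.integers(100, 160, 1000),
    "ap_lo": rng.integers(60, 110, 1000),
})

# Build ONE boolean mask from all conditions. Each quantile is computed
# on the original, unfiltered column, which is why the conditions must
# not be applied one after another.
keep = (
    (df["ap_lo"] <= df["ap_hi"])
    & (df["height"] >= df["height"].quantile(0.025))
    & (df["height"] <= df["height"].quantile(0.975))
    & (df["weight"] >= df["weight"].quantile(0.025))
    & (df["weight"] <= df["weight"].quantile(0.975))
)

# Rows outside the boundaries are dropped, not modified.
df_heat = df[keep]
print(len(df), len(df_heat))
```

If the filters were applied step by step, each later quantile would be computed on already-reduced data, which would shift the boundaries slightly and could explain small differences in the correlation values.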
Here is a link to my code: https://repl.it/@skha789/boilerplate-medical-data-visualizer#medical_data_visualizer.py
Hopefully that is the right way to share my code.
For some reason I am getting an exit status 137. It does not tell me what the error is. I see that if I comment out the code I added to the heat map function I do not get that error. I am not sure what is wrong with the heat map function code, though.
Error-code 137 means the application ran out of memory.
Had the same issue and the only way I got around that was by making the figures a lot smaller. Look into plt.subplots() for that.
As I said about the quantiles, you should not change the data-entries but just ignore everything outside the said quantiles. Meaning everything you do with heighlis, weighlis and aplis is not needed - the following selection is enough.
Also your data cleaning can be done a lot easier.
Any calculation applied to a series creates a new series with the calculation applied to ALL entries, and there is also .map() for applying a function to every entry.
df["double_height"] = df["height"].map(lambda x: 2*x) creates a new column with all the heights doubled. The original height column will be unchanged - of course you could also overwrite it. This makes both adding columns and changing values a lot easier.
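As a small sketch of the two approaches just described (a calculation on the whole series, and .map() with a function), using a tiny made-up DataFrame; the column name double_height_vec is hypothetical and only there to compare the results:

```python
import pandas as pd

# Tiny made-up example data.
df = pd.DataFrame({"height": [150, 160, 170]})

# .map() applies the function to every entry and returns a new series.
df["double_height"] = df["height"].map(lambda x: 2 * x)

# The same result with a calculation applied to the whole series at once.
df["double_height_vec"] = df["height"] * 2

print(df)
```

Both new columns hold the same values, and the original height column is left untouched.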
I just realized you were new to the forum.
When users have questions on projects it is always best to share a link to the project you are working on or write code directly into the forum if it is a small code snippet.
Instructions for writing code in the forum:
When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.
You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.
See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (’).
When you press run, should it take a few minutes for you to see if you passed the tests? For me it is taking a few minutes and I am not sure why. Whenever I run it I also get this message:
“Matplotlib created a temporary config/cache directory at /tmp/matplotlib-o1lixz14 because the default path (/config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.” How would I get rid of this?
Also, I am getting a memory issue even when I reduce the size of the plots. This is the code I am using:
fig = None
# Set up the matplotlib figure
plt.subplots(figsize=(3.1,4.1))
fig, ax= plt.figure(), sns.heatmap(df_corr, vmin = -1, vmax = 1, center = 0, cmap = 'coolwarm', annot = True, mask = mask)
fig.add_axes(ax)
# Draw the heatmap with 'sns.heatmap()'
# Do not modify the next two lines
fig.savefig('heatmap.png')
return fig
Yeah, testing takes quite a while on this one. The site doesn’t seem to offer a lot of resources. I recommend testing your code somewhere else and copy-pasting it into the repl once it gives the right results.
Personally I just use Google Colab, though any Python IDE would do.
Regarding the matplotlib message, I get that as well, though no idea how to resolve that.
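One thing that may be worth trying, going only by what the warning itself recommends: set the MPLCONFIGDIR environment variable to a writable directory before matplotlib is imported. The sketch below assumes a temporary directory is acceptable as the cache location; the folder name matplotlib-cache is a made-up example, and I have not checked whether this removes the message on repl.it.

```python
import os
import tempfile

# Assumption: any writable directory will do; this folder name is hypothetical.
cache_dir = os.path.join(tempfile.gettempdir(), "matplotlib-cache")
os.makedirs(cache_dir, exist_ok=True)

# This has to happen BEFORE matplotlib is imported, because the
# config/cache directory is resolved at import time.
os.environ["MPLCONFIGDIR"] = cache_dir

# import matplotlib.pyplot as plt  # import only after the variable is set
```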
As for error 137 - I think the memory limit applies to ALL data in the project, so making the other plots smaller (no idea how big they are) should also help. I can run your heatmap code in my project.
Also please take a look at your data-preparation, because as I said, there is a mistake and this results in wrong values for the heatmap.
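For reference, a minimal sketch of sizing the figure through plt.subplots() and drawing the heatmap onto that same axes, so only one figure is created. The data is random and made up, the figsize is an arbitrary example, and the upper-triangle mask follows the pattern used in the seaborn gallery; this is not the project's reference solution.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data standing in for the cleaned project data.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(200, 4)),
    columns=["height", "weight", "ap_hi", "ap_lo"],
)
df_corr = df.corr()

# Hide the upper triangle (including the diagonal).
mask = np.triu(np.ones_like(df_corr, dtype=bool))

# Create ONE figure with an explicit size and keep both handles.
fig, ax = plt.subplots(figsize=(6, 5))

# Passing ax=ax draws onto the figure created above instead of a new one.
sns.heatmap(df_corr, mask=mask, vmin=-1, vmax=1, center=0,
            cmap="coolwarm", annot=True, ax=ax)
```

Keeping the fig and ax returned by plt.subplots() avoids creating a second, unused figure, which is one less thing held in memory.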
How would I change all the zeros (“0”) in the heat map to “0.0”? I tried doing it one way but it does not seem to work.
df_corr=df.corr(method="pearson")
df_corr=df_corr.round(1)
df_corr= df_corr.replace(0,0.0)
print(df_corr)
mask = np.zeros(df_corr.shape, dtype=bool)
mask[mask==0]=0.0
mask[np.triu_indices(len(mask))] = True
sns.heatmap(df_corr, vmin = -1, vmax = 1, center = 0, cmap = 'coolwarm', annot = True, mask = mask)
plt.show()
You don’t. You also don’t need the .round(1) - the task asks for neither, and by adding them anyway you are introducing code that might make you fail the final test.
Focus on cleaning up the data properly and you should get the correct results without the need of additional actions.
Here is the message I get if I do not round:
"FAIL: test_heat_map_values (test_module.HeatMapTestCase)
Traceback (most recent call last):
File “/home/runner/boilerplate-medical-data-visualizer/test_module.py”, line 47, in test_heat_map_values
self.assertEqual(actual, expected, “Expected differnt values in heat map.”)
AssertionError: Lists differ: [‘0.0025’, ‘0.0034’, ‘-0.018’, ‘0.00033’, ‘-0.[795 chars].14’] != [‘0.0’, ‘0.0’, ‘-0.0’, ‘0.0’, ‘-0.1’, ‘0.5’, ‘[612 chars]0.1’]
First differing element 0:
Diff is 2250 characters long. Set self.maxDiff to None to see it. : Expected differnt values in heat map."
That is why I thought I may need to round. I know how to round, but I am not sure how I would change a “0” to a “0.0”.
Nah, if numbers are different, the issue is somewhere else.
Kinda annoying how it always takes like 2mins to start my repl to test something…
But turns out the easy way is to add an argument to the heatmap
sns.heatmap(..., fmt='.1f') - this will format numbers as floats with one decimal.
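To see why that helps, here is a small plain-Python sketch of what the '.1f' format spec does to a few of the values from the failing test output above. As far as I know, seaborn applies the fmt string as a standard Python format spec to each annotation, but that is an assumption about its internals; the formatting itself is standard Python.

```python
# Values taken from the failing test output quoted earlier, plus an integer 0.
values = [0.0025, 0.0034, -0.018, 0.00033, 0]

# '.1f' means: fixed-point notation with exactly one decimal place.
labels = [format(v, ".1f") for v in values]

print(labels)  # ['0.0', '0.0', '-0.0', '0.0', '0.0']
```

Note that -0.018 becomes '-0.0', which matches the '-0.0' entry in the expected list. The underlying numbers are not changed, only the text drawn in the cells.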