Medical data visualization help--help with first bar chart task

Tell us what’s happening:

Your code so far

Challenge: Medical Data Visualizer

Link to the challenge:

Hello there.

Do you have a question?

If so, please edit your post to include it in the Tell us what’s happening section.

The more information you give us, the more likely we are to be able to help.

I’m trying to understand what they are asking for in the first task. What information do they need in the bar chart?

You essentially need to replicate the bar chart in examples/Figure_1.png

What specifically are you struggling with?

Are you just not understanding this bit?

  • Add an ‘overweight’ column to the data. To determine if a person is overweight, first calculate their BMI by dividing their weight in kilograms by the square of their height in meters. If that value is > 25 then the person is overweight. Use the value 0 for NOT overweight and the value 1 for overweight.

In a sense. I’m not sure of what they want me to measure in the chart, but I will try.

I’m at my wits’ end with this task. I don’t know what information they want in the chart.

No worries. Take a break, if you need to. Then, come back to this:

  1. Take the data given in medical_examination.csv
  • This data has 12 rows and 4 columns
  • Use the weight and height data to create a new feature. This feature is overweight, and is calculated with this formula: BMI = weight (kg) / height (m)², with overweight set to 1 if BMI > 25 and 0 otherwise
  • Normalise the cholesterol and gluc columns by setting each value to either 0 or 1

  2. Convert the data to long format (all of the data). Now re-create the example figure, with this data:

  3. Now, take all the data, and filter it out, based on the given criteria.

  4. Create a correlation matrix like this:
    image
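As a rough sketch of the normalising step and the correlation matrix, here is a toy example. The rows are made up, and the rule used for normalising (a value of 1 becomes 0, anything above 1 becomes 1) is an assumption you should check against the challenge instructions.

```python
import pandas as pd

# Made-up rows; the real file has more rows and columns.
df = pd.DataFrame({
    "cholesterol": [1, 2, 3, 1],
    "gluc": [1, 1, 2, 3],
    "cardio": [0, 1, 1, 0],
})

# Assumed rule: 1 means "normal" -> 0, anything above 1 -> 1.
for column in ["cholesterol", "gluc"]:
    df[column] = (df[column] > 1).astype(int)

# Pairwise correlations between the numeric columns.
corr = df.corr()

print(df)
print(corr)
```

The `corr` result is itself a DataFrame, so it can be handed to a plotting function to draw the matrix.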

Hope this helps a bit.

Thank you, I’ll try and see what happens

How do you divide one column by another? Honestly, the video resolution was not very clear.

This is my code so far:

import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

medfile = pd.read_csv("medical_examination.csv")

print(medfile)

a = (medfile["height"]**2)

Well, you are almost doing everything you need here:

a = (medfile["height"]**2)

Now, where is the weight property?

  • You have correctly accessed the height property
  • You have correctly performed a mathematical operation on it
  • How do you think you can divide weight by height, now?
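To illustrate the hint, here is a minimal sketch on a toy DataFrame (the values are made up and stand in for the real file). Arithmetic between two columns works row by row, so dividing one column by another gives a new Series of the same length.

```python
import pandas as pd

# Toy data standing in for medical_examination.csv (values are made up).
df = pd.DataFrame({"height": [150, 160, 200], "weight": [45.0, 64.0, 80.0]})

# Squaring a column squares every row.
height_squared = df["height"] ** 2

# Dividing two columns divides them row by row.
ratio = df["weight"] / height_squared

print(ratio)
```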

I’ve edited your post for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (’).

I’m thinking of this formula to divide the columns.

overweight = (“a”, / medfile[“weight”])

But when I tried it, it didn’t work. If I do that, I would then need to append the result to the csv file as a new column, then break it down into a list so I can plot it in a bar chart.

The video quality wasn’t working with my eyes, so I couldn’t follow along carefully. And finding the right information elsewhere is tedious as well.

Let me break this down more:

What needs to be done

  1. Add a new column overweight to the data
  2. Access the weight property of the data
  3. Access the height property of the data
  4. Use this formula to give overweight the correct value: weight (kg) / height (m)², then 1 if the result is > 25 and 0 otherwise


What you have managed to do

  1. Access the height property…
  2. …and square the value

Hints

  • The same way you access height, you can access weight
  • The same way you access a property, you can set a property

Here is pseudo-code:

set_overweight = (access_weight / (access_height ** 2))
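A hedged sketch of how that pseudo-code could look in pandas, using made-up rows. It assumes the height column is in centimetres (worth checking against the file), so it converts to metres before applying the BMI formula.

```python
import pandas as pd

# Toy rows (made-up values); height is ASSUMED to be in centimetres.
df = pd.DataFrame({"height": [170, 160, 180], "weight": [60.0, 80.0, 70.0]})

height_m = df["height"] / 100        # centimetres -> metres
bmi = df["weight"] / height_m ** 2   # kilograms / metres squared

# The comparison gives True/False per row; astype(int) turns that into 1/0.
df["overweight"] = (bmi > 25).astype(int)

print(df)
```

The comparison `bmi > 25` is evaluated for every row at once, so no loop is needed.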

I really don’t mean to be a bother, but I’m stuck. I have already created the "overweight" column, but need some help inserting it into the data. Each time I run it I get “None”.

code so far:

import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

medfile = pd.read_csv("medical_examination.csv",)

overweight = ((medfile["height"]**2) / medfile["weight"])

print(medfile.insert(loc=13, column="overweight", value=overweight))

Let me know my areas for improvement.

It is not a bother. People are here to help.

I will refer back to the lesson I suggest you review. Here is the code from the challenge (not in the video):

certificates_earned = pd.DataFrame({
    'Certificates': [8, 2, 5, 6],
    'Time (in months)': [16, 5, 9, 12]
})
names = ['Tom', 'Kris', 'Ahmad', 'Beau']

certificates_earned.index = names
longest_streak = pd.Series([13, 11, 9, 7], index=names)
certificates_earned['Longest streak'] = longest_streak
  • certificates_earned is the dataframe
  • index is added to certificates_earned
  • 'Longest streak' is added to certificates_earned

Can you see how those properties are added?
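To connect this to the “None” output: `DataFrame.insert` changes the DataFrame in place and returns `None`, so printing the call itself prints “None” even though the column was added. A small sketch with made-up values (the column name `overweight_again` is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"height": [170, 160], "weight": [60.0, 80.0]})
flags = pd.Series([0, 1])  # placeholder values, not real results

# insert() modifies df in place and returns None.
result = df.insert(loc=len(df.columns), column="overweight", value=flags)
print(result)  # None

# Printing the DataFrame itself shows the new column.
print(df)

# Plain assignment, as with 'Longest streak' above, also adds a column.
df["overweight_again"] = flags
```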

I do not intend to be mean here: This kind of data/variable accessing/creating is very basic to every programming language, and looks almost identical in most programming languages. If this is new to you, I would recommend you go back over the Python for Everybody section which should cover this.

Hope this helps


Also, please read what I mentioned above about putting code in this forum. It is very important to correctly format Python code, because indentation is important.

Sure, thanks for the advice.


I have successfully added the “overweight” column. I’m now trying to visualize the data in the bar chart. How would I break it down?
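One possible way to break it down, sketched on made-up rows: reshape the data to long format, count how often each 0/1 value occurs per feature, and plot those counts as bars. The column choice and the split on cardio are assumptions for illustration, not necessarily what the example figure uses.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to open a window with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the real data.
df = pd.DataFrame({
    "cardio": [0, 0, 1, 1],
    "overweight": [0, 1, 1, 1],
    "smoke": [0, 0, 1, 0],
})

# Long format: one row per (person, feature) pair.
long_df = pd.melt(df, id_vars=["cardio"], value_vars=["overweight", "smoke"])

# Count how often each value occurs for each feature, split by cardio.
counts = (
    long_df.groupby(["cardio", "variable", "value"])
    .size()
    .reset_index(name="total")
)
print(counts)

# One group of bars per feature, one bar per value, for the cardio == 0 rows.
subset = counts[counts["cardio"] == 0]
pivoted = subset.pivot(index="variable", columns="value", values="total").fillna(0)
ax = pivoted.plot(kind="bar")
ax.set_ylabel("total")
```

The `counts` table is the information the chart needs: for each feature, how many people have the value 0 and how many have the value 1.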