Medical Data Visualizer

Hi all,
I'm not sure whether I am causing issues by posting my questions repeatedly; all I need is help. Please let me know if posting topics (seeking assistance) affects the marking of assignments and, in turn, certification completion.

I'm working on the Medical Data Visualizer; can someone tell me whether I am doing something seriously wrong?

Add an ‘overweight’ column to the data. To determine if a person is overweight, first calculate their BMI by dividing their weight in kilograms by the square of their height in meters. If that value is > 25 then the person is overweight. Use the value 0 for NOT overweight and the value 1 for overweight.
Normalize data by making 0 always good and 1 always bad. If the value of ‘cholesterol’ or ‘gluc’ is 1, make the value 0. If the value is more than 1, make the value 1.

My code so far is below; is this correct? How do I handle both parts, i.e. checking whether the BMI is > 25 and then setting overweight to 0 or 1?
Your help is much appreciated. Thank you in advance.

df['overweight'] = df[df['weight'] / (df['height']) ** 2]

Okay, so you need to do three things here:

  1. Determine the BMI.
  2. Perform a logical operation to see if the BMI is greater than 25.
  3. Create a df['overweight'] column whose elements are 1 if bmi>25 and 0 if bmi<=25.

Here is some help:

  1. BMI requires the height to be in metres, whereas the height column is in centimetres. So you need to divide df['height'] by 100 in your calculation (remember to use brackets around that).
  2. You can now do a logical operation to test if bmi>25 or not. Using something like df['weight']/(df['height']/100)**2 > 25 will return a series with True or False, depending on the bmi.
  3. Now you need to set df['overweight'] = 1 if bmi>25 and df['overweight'] = 0 if bmi<=25.

Putting the above steps together into two lines (there are several other ways to do this):
df.loc[df['weight']/(df['height']/100)**2 > 25, 'overweight'] = 1
df.loc[df['weight']/(df['height']/100)**2 <= 25, 'overweight'] = 0
The first line creates the overweight column, setting it to 1 where bmi>25 (the logical test returns True) and leaving NaN everywhere else (rows where the test returns False are simply not touched). The second line sets the remaining values to 0.
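As one of the "several other ways", here is a minimal sketch of the same three steps on a tiny made-up DataFrame (the sample rows are my own invention, not the course data). It casts the True/False series straight to integers, which should avoid the intermediate NaN values, so the column comes out as integers rather than floats. Treat it as an illustration, not the only correct answer.

```python
import pandas as pd

# Made-up sample rows for illustration; the real dataset has more columns.
df = pd.DataFrame({
    "height": [168, 156, 165, 180],       # centimetres
    "weight": [62.0, 85.0, 64.0, 90.0],   # kilograms
})

# Step 1: BMI = weight in kg divided by the square of height in metres.
bmi = df["weight"] / (df["height"] / 100) ** 2

# Steps 2 and 3: the comparison returns a True/False Series;
# casting it to int turns True into 1 and False into 0.
df["overweight"] = (bmi > 25).astype(int)

print(df)
```

With these sample rows, only the second and fourth person have a BMI above 25.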

Hope this helps.

Hi kitanikita,
Thank you for your reply. Yes; all I could come up with was .loc and the BMI calculation, and I couldn't put it all together the way you have shown me. I shall continue with the assignment and see how it goes.

Cheers
Akshay

You could do it in one step with something like this:

import numpy as np
df['overweight'] = np.where( (df.weight/(df.height ** 2) * 10000) > 25, 1, 0)
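The same np.where pattern also looks like a fit for the second task quoted in the question (making 0 always good and 1 always bad for cholesterol and gluc). A rough sketch, using made-up values rather than the course data:

```python
import numpy as np
import pandas as pd

# Made-up sample values for illustration only.
df = pd.DataFrame({
    "cholesterol": [1, 3, 2, 1],
    "gluc": [1, 1, 2, 3],
})

# A value of 1 becomes 0 (good); anything above 1 becomes 1 (bad).
for col in ["cholesterol", "gluc"]:
    df[col] = np.where(df[col] > 1, 1, 0)

print(df)
```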

That really helped me too! I bet you’ve helped a lot of us 🙂