Medical Data Visualizer Interpretation

Hi guys,

I need help understanding what the challenge wants me to do here:

Clean the data. Filter out the following patient segments that represent incorrect data:

diastolic pressure is higher than systolic (Keep the correct data with (df['ap_lo'] <= df['ap_hi']))
height is less than the 2.5th percentile (Keep the correct data with (df['height'] >= df['height'].quantile(0.025)))
height is more than the 97.5th percentile
weight is less than the 2.5th percentile
weight is more than the 97.5th percentile

Am I supposed to remove those values from the dataframe?

For example:

“height is more than the 97.5th percentile” → Am I supposed to remove the rows where the height is above the 97.5th percentile?

“weight is less than the 2.5th percentile” → Should I remove the weights below the 2.5th percentile, or keep them?

So, if the 2.5th percentile is 150 cm and the 97.5th percentile is 180 cm, we should remove the values less than or equal to 150 cm and the values above 180 cm.

Correct?

The exercise isn’t clear.

Maybe read the entire sentence again?
It literally says “Filter out the following patient segments”, and the two examples in parentheses show how to keep the rest. Just use the same logic three more times.
You are supposed to remove the data.

Also, just picture it in your head: which data is more likely to be flawed? The values in the middle or the ones at the edges?
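Here is a minimal sketch of that logic, assuming the column names from the challenge text (`ap_hi`, `ap_lo`, `height`, `weight`). The sample rows, the helper names (`h_lo`, `h_hi`, `w_lo`, `w_hi`, `keep`, `df_clean`) and the choice to compute all bounds on the unfiltered data are my own assumptions for illustration, not the official solution:

```python
import pandas as pd

# Tiny made-up sample, only for illustration (not the challenge data).
df = pd.DataFrame({
    "ap_hi":  [120, 110, 130, 90, 125, 140],
    "ap_lo":  [80, 70, 85, 100, 80, 90],
    "height": [150, 165, 170, 168, 172, 195],
    "weight": [50.0, 62.0, 70.0, 68.0, 75.0, 110.0],
})

# Percentile bounds, computed once on the unfiltered data.
h_lo, h_hi = df["height"].quantile(0.025), df["height"].quantile(0.975)
w_lo, w_hi = df["weight"].quantile(0.025), df["weight"].quantile(0.975)

# Each condition describes the rows to KEEP; "&" requires all of them to hold.
keep = (
    (df["ap_lo"] <= df["ap_hi"])    # diastolic not higher than systolic
    & (df["height"] >= h_lo)        # not below the 2.5th percentile
    & (df["height"] <= h_hi)        # not above the 97.5th percentile
    & (df["weight"] >= w_lo)
    & (df["weight"] <= w_hi)
)

# Rows that fail any condition are filtered out.
df_clean = df[keep]
print(df_clean)
```

Note that the two examples in the challenge use `<=` and `>=`, so a value exactly equal to a percentile bound is kept, not removed.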


You are the best, bro. Thanks for answering <3
