vs. Medical data visualisation

Dear all, it takes more than 5 minutes to edit the 70k-user dataframe as requested, and that is before I can even start on the 4 issues.
Is there a problem with the code I use to edit the dataframe?

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
# Import data
df = pd.read_csv('medical_examination.csv')
# Add 'overweight' column
df['overweight'] = None
# Normalize data by making 0 always good and 1 always bad. If the value of 'cholesterol' or 'gluc' is 1, make the value 0. If the value is more than 1, make the value 1
for index in df.index:
  if df.loc[index, 'cholesterol']==1:                               
    df.loc[index, 'cholesterol']=0
  elif df.loc[index, 'cholesterol']>1:
    df.loc[index, 'cholesterol']=1
  if df.loc[index, 'gluc']==1:
    df.loc[index, 'gluc']=0
  elif df.loc[index, 'gluc']>1:
    df.loc[index, 'gluc']=1

Try to find a way to do this without a for loop. Compared to the vectorized ways of modifying data that numpy and pandas offer, a plain Python loop that calls df.loc once per row is very slow.

The following lesson can be helpful:

Leads to:

df.loc[df['cholesterol']==1, 'cholesterol'] = 0
df.loc[df['cholesterol']>1, 'cholesterol'] = 1                    
df.loc[df['gluc']==1, 'gluc'] = 0
df.loc[df['gluc']>1, 'gluc'] = 1
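
For comparison, here is a minimal sketch of a one-pass alternative to the four lines above. It assumes that 'cholesterol' and 'gluc' only ever hold integers of 1 or higher, so that "greater than 1" alone decides the result. The small dataframe is made-up toy data standing in for medical_examination.csv, not values from the real file.

```python
import pandas as pd

# Toy stand-in for medical_examination.csv (values invented for illustration)
df = pd.DataFrame({
    'cholesterol': [1, 2, 3, 1],
    'gluc': [1, 1, 2, 3],
})

# Assumption: both columns contain only integers >= 1.
# Under that assumption, 1 -> 0 (good) and anything above 1 -> 1 (bad).
cols = ['cholesterol', 'gluc']
df[cols] = (df[cols] > 1).astype(int)

print(df)
```

One difference worth noting: in the two-step df.loc version the order of the assignments matters. If the ">1" line ran first, every value would be 1 by the time the "==1" line ran, and the whole column would end up 0. The comparison-based version reads the original values only once, so it does not have that pitfall.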
