IndentationError: unexpected indent - Anaconda

fngwira14 · December 17, 2019, 9:30pm

The Code Below:

 class_names = set(feat_df.loc[:,'label'])
    # Binarize the labels
    # print(class_names)
#    lb = label_binarize(y = y, classes = list(class_names))
    # classes.remove('unknown')
    # lb.fit(y) #for LabelBinarizer not lable_binerize()
    # lb.classes_ #for LabelBinarizer not lable_binerize

    # Split the training data for cross validation
    (X_train, X_test), (y_train, y_test) = train_test_split(X, y, test_size=0.2, 
                                                        random_state=0)
   
    df_y_train = pd.DataFrame(y_train, columns=['label']) #,'Date','group_idx'])
    
    print('df_y_train.shape', df_y_train.shape,'X_train', X_train.shape)
    ##### Dimensionality Reduction ####

Error Message:
File "<ipython-input-50-1c94ab12f530>", line 10
    (X_train, X_test), (y_train, y_test) = train_test_split(X, y, test_size=0.2,
    ^
IndentationError: unexpected indent

Sky020 · December 18, 2019, 9:22am

Hello fngwira.

I have edited your post for readability. In the future, use Markdown to format your posts, by placing any code in between backticks (`).

To answer your question:
Remove the parentheses around your split output variables.
X_train, X_test, y_train, y_test = ...
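
For context on why the parentheses matter: when given X and y, scikit-learn's train_test_split returns one flat list of four arrays, so a nested (a, b), (c, d) target cannot unpack it. That is a separate problem from the IndentationError. A minimal sketch in plain Python, where fake_split is a hypothetical stand-in that only mimics the flat four-item return shape:

```python
def fake_split(X, y):
    """Hypothetical stand-in: returns a flat list of four items,
    mimicking the return shape of train_test_split(X, y)."""
    cut = int(len(X) * 0.8)
    return [X[:cut], X[cut:], y[:cut], y[cut:]]


X = list(range(10))
y = [v % 2 for v in X]

# Flat unpacking matches the four returned items.
X_train, X_test, y_train, y_test = fake_split(X, y)
print(len(X_train), len(X_test))

# Nested unpacking expects two items, each holding a pair, so it fails.
try:
    (a, b), (c, d) = fake_split(X, y)
    nested_error = None
except ValueError as exc:
    nested_error = str(exc)
print(nested_error)
```

So the flat form is the one to keep, even though it does not by itself change the indentation of the line.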

Hope this helps

fngwira14 · December 18, 2019, 1:21pm

@Sky020 the error message is still there:

File "", line 10
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
    ^
IndentationError: unexpected indent

Sky020 · December 18, 2019, 2:29pm

Do you have that line inside of any functions, statements, or class definitions?

The error just means that Python did not expect that line to be indented (leading white space) as far as it is.

fngwira14 · December 18, 2019, 2:35pm

This is my code:

def ML_with_CV_feat(cv_feat_file='../data/cv_feat.csv', n_comp=100, 
                    plotting=False):
            
    # Importing the bottleneck features for each image
    feat_df = pd.read_csv(cv_feat_file, index_col=0, dtype='unicode')
    ##-- Dealing with NaN
    feat_df.fillna(0, inplace=True)  
    feat_df['blob_detected'] = feat_df['blob_detected']*1
    #['cell_area', 'cell_eccentricity', 'cell_solidity', 'average_blue', 'average_green', 'average_red', 'blob_detected', 'num_of_blobs', 'average_blob_area']
#    feat_df = feat_df.sample(frac=0.01)
    feat_df.drop(columns=['cell_area', 'cell_eccentricity', 'cell_solidity',
                           'average_blue', 'average_green', 'average_red'],
                 inplace=True)
    #Removing features that do not seperate populations of cell class
    
    column_names = feat_names = list(feat_df.columns)
    print(column_names)
    for X in ['label','fn']:
        feat_names.remove(x)
#    feat_df = feat_df.iloc[0:300,:]
    mask = feat_df.loc[:, 'label'].isin(['Infected', 'Uninfected'])
    feat_df = feat_df.loc[mask, :].drop_duplicates()
    
    print('Number of features:', len(feat_names))
    y = feat_df.loc[:,['label']].values
    print(type(y), y.shape)

    print('Number of samples for each label \n', feat_df.groupby('label')['label'].count())
#    print(feat_df.head())
    X = feat_df.loc[:, feat_names].astype(float).values
    print('/nColumn feat names after placing into X',
          list(feat_df.loc[:, feat_names].columns))
class_names = set(feat_df.loc[:,'label'])
    # Binarize the labels
    # print(class_names)
#    lb = label_binarize(y = y, classes = list(class_names))
    # classes.remove('unknown')
    # lb.fit(y) #for LabelBinarizer not lable_binerize()
    # lb.classes_ #for LabelBinarizer not lable_binerize

    # Split the training data for cross validation
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, 
                                                        random_state=0)
   
    df_y_train = pd.DataFrame(y_train, columns=['label']) #,'Date','group_idx'])
    
    print('df_y_train.shape', df_y_train.shape,'X_train', X_train.shape)
    ##### Dimensionality Reduction ####

Error Message:
File "", line 10
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
    ^
IndentationError: unexpected indent

Sky020 · December 18, 2019, 2:44pm

If you want this line inside the function ML_with_CV_feat():
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

Then add as many spaces as needed in front of this line:
class_names = set(feat_df.loc[:,'label'])
So that it sits at the same indentation level as the rest of the code inside the function. Because that line starts at column 0, Python treats the function body as finished there, and the indented lines after it no longer belong to any block.

If you do not want the train/test split to happen inside the function, give that line the same indentation as the class_names line instead.

In Python, the indentation of your code defines which block each line belongs to.
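
The same situation can be reproduced in a few lines. This is only a sketch: the helper name indentation_problem and the tiny function f are made up for illustration and are not part of the original code. It compiles two source strings, one where a line drops back to column 0 in the middle of a function body and one where every line is aligned:

```python
def indentation_problem(source):
    """Return a description of the IndentationError in source,
    or None if the source compiles cleanly."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


broken = (
    "def f():\n"
    "    a = 1\n"
    "b = 2\n"        # back at column 0: the function body ends here
    "    c = 3\n"    # indented again, with no block to belong to
)

fixed = (
    "def f():\n"
    "    a = 1\n"
    "    b = 2\n"
    "    c = 3\n"
)

print(indentation_problem(broken))
print(indentation_problem(fixed))
```

Python reports the problem on the indented line that follows the misaligned one, which is why the traceback points at the train_test_split line rather than at class_names.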

Hope this helps

fngwira14 · December 18, 2019, 2:49pm

That did not help much. If possible, could you offer some further editing support?

Sky020 · December 18, 2019, 3:01pm

Use this:

def ML_with_CV_feat(cv_feat_file='../data/cv_feat.csv', n_comp=100, plotting=False):
            
    feat_df = pd.read_csv(cv_feat_file, index_col=0, dtype='unicode')
    feat_df.fillna(0, inplace=True)  
    feat_df['blob_detected'] = feat_df['blob_detected']*1
    #['cell_area', 'cell_eccentricity', 'cell_solidity', 'average_blue', 'average_green', 'average_red', 'blob_detected', 'num_of_blobs', 'average_blob_area']
    #feat_df = feat_df.sample(frac=0.01)
    feat_df.drop(columns=['cell_area', 'cell_eccentricity', 'cell_solidity', 'average_blue', 'average_green', 'average_red'], inplace=True)
    
    column_names = feat_names = list(feat_df.columns)
    print(column_names)

    for X in ['label','fn']: #! THIS DOES NOT MAKE SENSE
        feat_names.remove(x) #CHOOSE TO USE 'X' OR 'x'...WHAT IS 'x'?

    #feat_df = feat_df.iloc[0:300,:]
    mask = feat_df.loc[:, 'label'].isin(['Infected', 'Uninfected'])
    feat_df = feat_df.loc[mask, :].drop_duplicates()
    
    print('Number of features:', len(feat_names))
    y = feat_df.loc[:,['label']].values
    print(type(y), y.shape)

    print('Number of samples for each label \n', feat_df.groupby('label')['label'].count())

    X = feat_df.loc[:, feat_names].astype(float).values
    print('\nColumn feat names after placing into X', list(feat_df.loc[:, feat_names].columns))
    class_names = set(feat_df.loc[:,'label'])

    # print(class_names)
    #lb = label_binarize(y = y, classes = list(class_names))
    # classes.remove('unknown')
    # lb.fit(y) #for LabelBinarizer not lable_binerize()
    # lb.classes_ #for LabelBinarizer not lable_binerize

    # Split the training data for cross validation
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
   
    df_y_train = pd.DataFrame(y_train, columns=['label']) #,'Date','group_idx'])
    
    print('df_y_train.shape', df_y_train.shape,'X_train', X_train.shape)
    ##### Dimensionality Reduction ####

Try that, and look out for the comments I added in CAPITAL LETTERS.

fillermark · December 3, 2020, 5:37am

As the error message indicates, you have an indentation error. This error occurs when a statement is unnecessarily indented or its indentation does not match that of the preceding statements in the same block. Python not only insists on indentation, it insists on consistent indentation. You are free to choose how many spaces to use per indentation level, but you then need to stick with it inside a block. If you indent one line by 4 spaces but the next by 2 (or 5, or 10, or …), you’ll get an IndentationError.

Mixing tabs and spaces is still allowed by default in Python 2, but it is highly recommended not to use this “feature”. Python 3 disallows inconsistent mixing of tabs and spaces for indentation and raises a TabError, which is a subclass of IndentationError. Using 4 spaces per indentation level instead of tabs is the recommended approach for writing Python code (PEP 8).
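
A small sketch of the difference, assuming Python 3. The helper name classify_indentation and the sample snippets are made up for illustration. It compiles three source strings and reports which kind of error, if any, each one raises:

```python
def classify_indentation(source):
    """Compile source and report which indentation error it raises, if any."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        # TabError must be caught first because it subclasses IndentationError.
        return "TabError"
    except IndentationError:
        return "IndentationError"
    return "ok"


# One line indented with a tab, the next with spaces, in the same block.
mixed = "if True:\n\tx = 1\n        y = 2\n"

# Spaces only, but the second line of the block is indented further.
uneven = "if True:\n    x = 1\n      y = 2\n"

# Spaces only, consistent 4-space indentation.
spaces_only = "if True:\n    x = 1\n    y = 2\n"

for name, src in [("mixed", mixed), ("uneven", uneven), ("spaces_only", spaces_only)]:
    print(name, classify_indentation(src))
```

Configuring the editor to insert spaces when the Tab key is pressed avoids the first case entirely.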

jwilkins.oboe · December 3, 2020, 5:39am

Hi @fillermark!

This post has not been active for over a year.

Please only reply to newer topics.

Thanks!
