Loop through all excel files and all worksheets

Using python 3. I need to loop through a folder that contains excel files and each file has multiple sheets. How do I loop through all the files and all the sheets and extract to a dataframe?

What I was able to accomplish only returns one excel file and all the worksheets for that file but I need for all files. Please help.

This is what I have so far:

from xlsxwriter import Workbook
import pandas as pd
import openpyxl
import glob
import os

path = 'filestoimport/*.xlsx'

for filepath in glob.glob(path):
    xl = pd.ExcelFile(filepath)


    # Define an empty list to store individual DataFrames
    list_of_dfs = []
    list_of_dferror= []
    for sheet_name in xl.sheet_names:
        df = xl.parse(sheet_name, usecols='A,D,N,B,C,E,F,G,H,I,J,K,L,M', header=0)
        df.columns = df.columns.str.replace(' ', '')

        df['sheetname'] = sheet_name  # this adds `sheet_name` into the column
        
        # using basename function from os
        # module to print file name
        file_name = os.path.basename(filepath)
        df['sourcefilename'] = file_name
  
        # only add sheets containing columns ['Status', 'ProjectID']
        column_names = ['Status', 'ProjectID']
        if set(column_names).issubset(df.columns):
         df['Status'].fillna('', inplace=True)
         df['Addedby'].fillna('', inplace=True)
      # And append it to the list
         list_of_dfs.append(df)
      
    # Combine all DataFrames into one
    data = pd.concat(list_of_dfs, ignore_index=True)

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.