If you have several Excel workbooks to analyze, Python can merge them into a single file for you.
Collecting the data by hand is tedious: you have to open every file and copy its contents across, and it is easy to lose track of what you have already copied. With Python, we can automate the whole job in about 10 to 15 lines of code.
We have two files, data1 and data2, and we will merge them into one.
First of all, we import the pandas module in our code.
For those of you who don't know it, pandas is a library used mainly to manipulate data in Python.
import pandas as pd
After that, we specify the locations of the Excel files that we want to merge.
This is the list we are going to loop through later to make sure we go over each file.
We also need to create a blank DataFrame and store it in a variable called merge.
excel_files = ['location_of_your_first_excel_file', 'location_of_your_second_excel_file']
merge = pd.DataFrame()
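If you have more than a handful of workbooks, typing every path by hand gets old quickly. Here is a minimal sketch that builds the list from a folder instead. The helper name find_excel_files and the folder name "reports" are made up for illustration, and the sketch assumes all your workbooks are .xlsx files sitting directly in one folder.

```python
from pathlib import Path


def find_excel_files(folder):
    """Return the .xlsx files directly inside `folder`, sorted by name."""
    return sorted(
        str(path)
        for path in Path(folder).glob("*.xlsx")
        # Excel creates "~$name.xlsx" lock files while a workbook is open; skip them.
        if not path.name.startswith("~$")
    )


# Hypothetical usage: "reports" is a placeholder folder name.
excel_files = find_excel_files("reports")
```

Sorting the paths keeps the order of the merged rows the same from one run to the next.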
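It's now time to loop through our list of Excel files and read each one into a DataFrame. The skiprows argument tells pandas to skip rows at the top of each workbook as it is read, so skiprows=1 ignores the first row. Keep in mind that read_excel already treats the first row it reads as the column header, so header rows never end up duplicated as data in the merged file. Use skiprows=1 only if your files have an extra line, such as a title, above the real header row; if the header is the very first row, leave skiprows out, otherwise the first data row is used as the header instead.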
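To see what skiprows does to the header, here is a small sketch you can run without any Excel files. It uses read_csv on an in-memory string as a stand-in, on the assumption that you only care about the skiprows and header behavior, which read_csv and read_excel share. The sample data is made up.

```python
import io

import pandas as pd

# Made-up sample: a header row followed by two data rows.
raw = "name,score\nAda,90\nBob,85\n"

# Default: the first row becomes the column names.
default = pd.read_csv(io.StringIO(raw))

# skiprows=1: the header row is skipped and the next row is used as the header.
skipped = pd.read_csv(io.StringIO(raw), skiprows=1)

print(list(default.columns), len(default))  # ['name', 'score'] 2
print(list(skipped.columns), len(skipped))  # ['Ada', '90'] 1
```

With skiprows=1 the first data row is consumed as column names, which is rarely what you want when the header sits in the first row.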
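for file in excel_files:
    # Read one workbook into a DataFrame, skipping its first row
    df = pd.read_excel(file, skiprows=1)
    # DataFrame.append was removed in pandas 2.0, so we use pd.concat instead
    merge = pd.concat([merge, df], ignore_index=True)
# Write the combined rows to a new workbook
merge.to_excel('Merged_Files.xlsx')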
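Inside the loop, each DataFrame is appended to merge, and that completes the logic.
At the end, we write everything to an output file, which we call "Merged_Files.xlsx". Pass index=False to to_excel if you don't want the row numbers written as an extra column.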
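Another way to do the merge step is to collect all the DataFrames in a list and combine them in one go, rather than growing merge on every pass through the loop. Here is a minimal sketch of that idea. The helper name merge_frames is made up for illustration, and the two small DataFrames stand in for what read_excel would return.

```python
import pandas as pd


def merge_frames(frames):
    """Stack DataFrames vertically and renumber the rows from 0."""
    return pd.concat(frames, ignore_index=True)


# Stand-ins for the DataFrames that pd.read_excel would return.
first = pd.DataFrame({"name": ["Ada", "Bob"], "score": [90, 85]})
second = pd.DataFrame({"name": ["Cy"], "score": [70]})

merged = merge_frames([first, second])
print(merged)
```

With real files, the call would look something like merge_frames([pd.read_excel(f) for f in excel_files]). Combining once at the end avoids copying the growing DataFrame on every iteration.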
The whole code looks like this:
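import pandas as pd

# Paths of the workbooks to merge
excel_files = ['location_of_your_first_excel_file', 'location_of_your_second_excel_file']
# Start with an empty DataFrame that will collect all the rows
merge = pd.DataFrame()
for file in excel_files:
    # Read one workbook into a DataFrame, skipping its first row
    df = pd.read_excel(file, skiprows=1)
    # DataFrame.append was removed in pandas 2.0, so we use pd.concat instead
    merge = pd.concat([merge, df], ignore_index=True)
# Write the combined rows to a new workbook
merge.to_excel('Merged_Files.xlsx')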
That is all for today.
Have a nice day.