Working with Pandas and NumPy¶
openpyxl is able to work with the popular libraries Pandas and NumPy
openpyxl has builtin support for the NumPy types float, integer and boolean. DateTimes are supported using the Pandas’ Timestamp type.
Working with Pandas Dataframes¶
openpyxl.utils.dataframe.dataframe_to_rows() function provides a
simple way to work with Pandas Dataframes:
from openpyxl.utils.dataframe import dataframe_to_rows wb = Workbook() ws = wb.active for r in dataframe_to_rows(df, index=True, header=True): ws.append(r)
While Pandas itself supports conversion to Excel, this gives client code additional flexibility including the ability to stream dataframes straight to files.
To convert a dataframe into a worksheet highlighting the header and index:
wb = Workbook() ws = wb.active for r in dataframe_to_rows(df, index=True, header=True): ws.append(r) for cell in ws['A'] + ws: cell.style = 'Pandas' wb.save("pandas_openpyxl.xlsx")
Alternatively, if you just want to convert the data you can use write-only mode:
from openpyxl.cell.cell import WriteOnlyCell wb = Workbook(write_only=True) ws = wb.create_sheet() cell = WriteOnlyCell(ws) cell.style = 'Pandas' def format_first_row(row, cell): for c in row: cell.value = c yield cell rows = dataframe_to_rows(df) first_row = format_first_row(next(rows), cell) ws.append(first_row) for row in rows: row = list(row) cell.value = row row = cell ws.append(row) wb.save("openpyxl_stream.xlsx")
This code will work just as well with a standard workbook.
Converting a worksheet to a Dataframe¶
To convert a worksheet to a Dataframe you can use the values property. This is very easy if the worksheet has no headers or indices:
df = DataFrame(ws.values)
If the worksheet does have headers or indices, such as one created by Pandas, then a little more work is required:
from itertools import islice data = ws.values cols = next(data)[1:] data = list(data) idx = [r for r in data] data = (islice(r, 1, None) for r in data) df = DataFrame(data, index=idx, columns=cols)