Working with Pandas and NumPy¶
openpyxl has builtin support for the NumPy types float, integer and boolean. DateTimes are supported using the Pandas’ Timestamp type.
Working with Pandas Dataframes¶
openpyxl.utils.dataframe.dataframe_to_rows() function provides a
simple way to work with Pandas Dataframes:
from openpyxl.utils.dataframe import dataframe_to_rows wb = Workbook() ws = wb.active for r in dataframe_to_rows(df, index=True, header=True): ws.append(r)
While Pandas itself supports conversion to Excel, this gives client code additional flexibility including the ability to stream dataframes straight to files.
To convert a dataframe into a worksheet highlighting the header and index:
wb = Workbook() ws = wb.active for r in dataframe_to_rows(df, index=True, header=True): ws.append(r) for cell in ws['A'] + ws: cell.style = 'Pandas' wb.save("pandas_openpyxl.xlsx")
Alternatively, if you just want to convert the data you can use write-only mode:
from openpyxl.cell.cell import WriteOnlyCell wb = Workbook(write_only=True) ws = wb.create_sheet() cell = WriteOnlyCell(ws) cell.style = 'Pandas' def format_first_row(row, cell): for c in row: cell.value = c yield cell rows = dataframe_to_rows(df) first_row = format_first_row(next(rows), cell) ws.append(first_row) for row in rows: row = list(row) cell.value = row row = cell ws.append(row) wb.save("openpyxl_stream.xlsx")
This code will work just as well with a standard workbook.
Converting a worksheet to a Dataframe¶
To convert a worksheet to a Dataframe you can use the values property. This is very easy if the worksheet has no headers or indices:
df = DataFrame(ws.values)
If the worksheet does have headers or indices, such as one created by Pandas, then a little more work is required:
data = ws.values cols = next(data)[1:] data = list(data) idx = [r for r in data] data = (islice(r, 1, None) for r in data) df = DataFrame(data, index=idx, columns=cols)