Performance¶
openpyxl attempts to balance functionality and performance. Where in doubt, we have focused on functionality over optimisation: performance tweaks are easier once an API has been established. Memory use is fairly high in comparison with other libraries and applications and is approximately 50 times the original file size, e.g. 2.5 GB for a 50 MB Excel file. As many use cases involve either only reading or writing files, the Read-only mode modes mean this is less of a problem.
Benchmarks¶
All benchmarks are synthetic and extremely dependent upon the hardware but they can nevertheless give an indication.
Write Performance¶
The benchmark code can be adjusted to use more sheets and adjust the proportion of data that is strings. Because the version of Python being used can also significantly affect performance, a driver script can also be used to test with different Python versions with a tox environment.
Performance is compared with the excellent alternative library xlsxwriter
Versions:
python: 2.7.1
openpyxl: 2.5.7
xlsxwriter: 1.0.9
Dimensions:
Rows = 1000
Cols = 50
Sheets = 4
Proportion text = 0.10
Times:
xlsxwriter : 2.69
xlsxwriter (optimised): 2.48
openpyxl : 4.59
openpyxl (optimised) : 3.65
Versions:
python: 3.5.6
openpyxl: 2.5.7
xlsxwriter: 1.0.9
Dimensions:
Rows = 1000
Cols = 50
Sheets = 4
Proportion text = 0.10
Times:
xlsxwriter : 3.20
xlsxwriter (optimised): 3.08
openpyxl : 5.54
openpyxl (optimised) : 3.95
Versions:
python: 3.6.6
openpyxl: 2.5.7
xlsxwriter: 1.0.9
Dimensions:
Rows = 1000
Cols = 50
Sheets = 4
Proportion text = 0.10
Times:
xlsxwriter : 3.18
xlsxwriter (optimised): 3.02
openpyxl : 4.28
openpyxl (optimised) : 3.34
Versions:
python: 3.7.0
openpyxl: 2.5.7
xlsxwriter: 1.0.9
Dimensions:
Rows = 1000
Cols = 50
Sheets = 4
Proportion text = 0.10
Times:
xlsxwriter : 3.02
xlsxwriter (optimised): 2.94
openpyxl : 4.11
openpyxl (optimised) : 3.12
Read Performance¶
Performance is measured using a file provided with a previous bug report and compared with the older xlrd library. xlrd is primarily for the older BIFF file format of .XLS files but it does have limited support for XLSX.
The code for the benchmark shows the importance of choosing the right options when working with a file. In this case disabling external links stops openpyxl opening cached copies of the linked worksheets.
One major difference between the libraries is that openpyxl’s read-only mode opens a workbook almost immediately making it suitable for multiple processes, this also readuces memory use significantly. xlrd does also not automatically convert dates and times into Python datetimes, though it does annotate cells accordingly but to do this in client code significantly reduces performance.
Versions:
python: 2.7.1
xlread: 1.1.0
openpyxl: 2.5.11
xlrd
Workbook loaded 61.26s
OptimizationData 0.18s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 0.07s
Store days 100% 0.07s
Total time 61.59s
openpyxl
Workbook loaded 131.36s
OptimizationData 4.03s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 3.46s
Store days 100% 3.50s
Total time 142.36s
openpyxl, read-only
Workbook loaded 0.97s
OptimizationData 23.39s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 21.69s
Store days 100% 17.02s
Total time 63.07s
openpyxl, read-only, values only
Workbook loaded 0.94s
OptimizationData 35.88s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 29.64s
Store days 100% 22.53s
Total time 89.00s
Versions:
python: 3.5.6
xlread: 1.1.0
openpyxl: 2.5.11
xlrd
Workbook loaded 67.22s
OptimizationData 0.25s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 0.08s
Store days 100% 0.08s
Total time 67.63s
openpyxl
Workbook loaded 140.84s
OptimizationData 4.27s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 3.67s
Store days 100% 3.88s
Total time 152.66s
openpyxl, read-only
Workbook loaded 1.30s
OptimizationData 37.25s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 26.24s
Store days 100% 20.37s
Total time 85.17s
openpyxl, read-only, values only
Workbook loaded 1.29s
OptimizationData 40.53s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 34.11s
Store days 100% 26.61s
Total time 102.54s
Versions:
python: 3.6.7
xlread: 1.1.0
openpyxl: 2.5.11
xlrd
Workbook loaded 51.13s
OptimizationData 0.24s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 0.08s
Store days 100% 0.07s
Total time 51.51s
openpyxl
Workbook loaded 114.45s
OptimizationData 3.27s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 2.70s
Store days 100% 3.08s
Total time 123.51s
openpyxl, read-only
Workbook loaded 1.07s
OptimizationData 24.99s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 20.25s
Store days 100% 16.48s
Total time 62.79s
openpyxl, read-only, values only
Workbook loaded 1.10s
OptimizationData 34.35s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 28.17s
Store days 100% 21.69s
Total time 85.32s
Versions:
python: 3.7.1
xlread: 1.1.0
openpyxl: 2.5.11
xlrd
Workbook loaded 59.14s
OptimizationData 0.24s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 0.08s
Store days 100% 0.07s
Total time 59.53s
openpyxl
Workbook loaded 114.90s
OptimizationData 3.27s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 2.78s
Store days 100% 3.16s
Total time 124.12s
openpyxl, read-only
Workbook loaded 0.95s
OptimizationData 24.66s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 19.98s
Store days 100% 15.51s
Total time 61.10s
openpyxl, read-only, values only
Workbook loaded 0.92s
OptimizationData 31.89s
Output Model 0.00s
>>DATA>> 0.00s
Store days 0% 27.09s
Store days 100% 21.85s
Total time 81.75s