Description
Context
Currently DataFrame.to_excel
supports freeze_panes
, and there are requests to support other Excel features (#42560). I've gone through the features of openpyxl and Xlsxwriter, here are other Excel features that pandas might support:
- autofilter
- hiding rows/column/worksheets
- password protection
- conditional formatting
- data validation
- dropdown lists
- defined names
- cell comments
Not listed here are styles as I believe the styler
supports most of these (cc @ahawryluk).
It seems difficult to justify pandas supporting some of these, but not others. On the other hand, implementing all or even just some of them for each engine adds a lot of complexity and maintenance burden to pandas. I haven't checked, but it may also be the case that some features are supported by certain engines but not by others.
While pandas documents the writer
of ExcelWriter
as having no public attributes, third party engines document using writer.book
e.g. xlsxwriter.
Proposal 1: Add more arguments to write_excel.
Add other arguments to to_excel
, such as autofilter, hiding row/columns/worksheets, conditional formatting. At least these three I perceive to be commonly used.
This is the most user friendly option in that it allows users access to these features with a single function call and the ability to switch engines without having to refactor their code. However as mentioned above, this adds complexity to pandas and is a maintenance burden.
Proposal 2: Make writer.book
public, prefix all or most other attributes with an _
.
This would allow users to use the features that the engine supports directly with ExcelWriter
, e.g.
with ExcelWriter("test.xlsx") as writer:
df.to_excel(writer)
worksheet = writer.book.get_worksheet_by_name("Sheet1")
worksheet...
This would require refactoring the writer.sheets
as in current state it can get out of sync with writer.book
if e.g. a user does writer.book.add_sheet(...)
.
This proposal is less complex/maintenance than Proposal 1, more user friendly than Proposal 3.
Proposal 3: Don't support any Excel features.
Users wanting these features would either be required to (a) write their DataFrames directly with the engine of choice or (b) write their DataFrames with pandas, then open a book with the engine of their choice if the engine supports modifying existing Excel workbooks.
Less complex/maintenance than Proposal 2, but also less user friendly.
I am currently in favor of Proposal 2. I also see a mix of Proposal 1 and Proposal 2 as viable, where some features are supported directly as arguments in to_excel
with other features require the user to use writer.book
for others.
cc @pandas-dev/pandas-core