Performance when serializing pandas DataFrames

The function `__get_cell_data` (https://github.com/kz26/PyExcelerate/blob/dev/pyexcelerate/Worksheet.py#L227) operates on each cell individually.
When serializing a pandas.DataFrame, each column usually has a single type (dtype), so a "columnar" approach (instead of going row by row, cell by cell) could speed things up:
- the `if` statements could be evaluated only once per column
- the conversion to string/XML could leverage pandas' `apply` / `applymap`
- ...
Have you already thought about ways to improve this for DataFrames by keeping the "columnar" info further down the pipeline (vs. transforming everything to cells)? It is quite specific, yet it is a case a lot of pandas users are hitting (slowness when exporting to Excel).
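
To make the idea concrete, here is a rough sketch of what I mean by "columnar". It is not PyExcelerate code: `column_to_cells` and `frame_to_rows` are hypothetical helper names, and the cell XML is a simplified stand-in for what the library actually writes. The point is only that the dtype check happens once per column instead of once per cell. Missing values (NaN/None) and datetimes are deliberately not handled.

```python
import pandas as pd
from pandas.api import types as ptypes
from xml.sax.saxutils import escape


def column_to_cells(series):
    """Hypothetical helper: serialize a whole column, picking the branch once."""
    # Check bool before numeric, because is_numeric_dtype is also True for bool.
    if ptypes.is_bool_dtype(series.dtype):
        return ['<c t="b"><v>%d</v></c>' % v for v in series.tolist()]
    if ptypes.is_numeric_dtype(series.dtype):
        # One vectorized conversion for the whole column.
        return ["<c><v>" + s + "</v></c>" for s in series.astype(str).tolist()]
    # Fallback: treat everything else as text and escape XML special characters.
    return [
        '<c t="inlineStr"><is><t>' + escape(str(v)) + "</t></is></c>"
        for v in series.tolist()
    ]


def frame_to_rows(df):
    """Hypothetical helper: build one simplified <row> string per DataFrame row."""
    # Serialize column by column, then transpose back to rows with zip.
    columns = [column_to_cells(col) for _, col in df.items()]
    return ["<row>" + "".join(cells) + "</row>" for cells in zip(*columns)]


if __name__ == "__main__":
    df = pd.DataFrame({"n": [1, 2], "s": ["a&b", "c"], "b": [True, False]})
    for row in frame_to_rows(df):
        print(row)
```

Whether this is actually faster than the per-cell path would need to be measured on real data; I have not benchmarked it.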


performance when serializing pandas DataFrames #107
