Pandas VS Polars benchmark PR (don't merge) #179

Open
armgilles wants to merge 46 commits into dev_pandas from dev_polars

armgilles (Owner) commented Nov 21, 2024

Just here to compare pandas vs Polars performance in the CodSpeed benchmarks!

* Fix install in front and got to be in PROD

Signed-off-by: Armand <arm.gilles@gmail.com>

* Just to check in site-package on dir below ROOT_DIR

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update Ruff version in pre commit

Signed-off-by: Armand <arm.gilles@gmail.com>

* Change check for prod with new structure and Front

Signed-off-by: Armand <arm.gilles@gmail.com>

* Remove useless reset_index in oslandia API data crunch

Signed-off-by: Armand <arm.gilles@gmail.com>

* Format

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add matplotlib as dep (visualisation.py)

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add seaborn as dep for deploy

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try to fix error install ci watcher 'error: Your local changes to the following files would be overwritten by checkout'

Signed-off-by: Armand <arm.gilles@gmail.com>

* minor update on notebook

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* Update for front (#111)

* Fix install in front and got to be in PROD

Signed-off-by: Armand <arm.gilles@gmail.com>

* Just to check in site-package on dir below ROOT_DIR

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update Ruff version in pre commit

Signed-off-by: Armand <arm.gilles@gmail.com>

* Change check for prod with new structure and Front

Signed-off-by: Armand <arm.gilles@gmail.com>

* Remove useless reset_index in oslandia API data crunch

Signed-off-by: Armand <arm.gilles@gmail.com>

* Format

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add matplotlib as dep (visualisation.py)

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add seaborn as dep for deploy

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try to fix error install ci watcher 'error: Your local changes to the following files would be overwritten by checkout'

Signed-off-by: Armand <arm.gilles@gmail.com>

* minor update on notebook

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>

* Bump to version 1.2.2a (#112)

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Migrate read_activity_vcub to Polars #117

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add Polars Pyarrow dep in package

Signed-off-by: Armand <arm.gilles@gmail.com>

* Improve migration #117

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update bench test #117

Signed-off-by: Armand <arm.gilles@gmail.com>

* Fix pandas output in Bench tests #117

Signed-off-by: Armand <arm.gilles@gmail.com>

* Typo

Signed-off-by: Armand <arm.gilles@gmail.com>

* Forgot to return output with pandas... #117

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Add Polars dependencies to the project

Signed-off-by: Armand <arm.gilles@gmail.com>

* Adding pyarrow as dep for Polars

Signed-off-by: Armand <arm.gilles@gmail.com>

* Migrate get_transactions_out into Polars #116

Signed-off-by: Armand <arm.gilles@gmail.com>

* Explicit return for the pandas use case #116

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update bench test #116

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* Migrate transactions_in in Polars #121

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update test #121

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update read_activity_vcub to fix mistake output_type default and type

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Migration transactions_all into Polars #124

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Migrate transform_json_api_bdx_station_data_to_df and oslandia func to polars #126

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update unit test & bench + data test for #126

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update test with #126

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update test with #126

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update test with #126

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Working on #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to remove numpy and using math to have pi #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench lit #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench lit2 #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench lit as 1 #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench lit as expr #130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench lit as expr with mul#130

Signed-off-by: Armand <arm.gilles@gmail.com>

* test improve bench lit as expr with no mul#130

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* dev_pandas version 1.2.2a (#113)

* Increase bench dataset (#134)

* Add bench test with bigger dataset #133

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>

* Increase bench dataset with polars #133

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update input/output of bigger function to polars #133

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update input/output of bigger function to polars #133

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Update unit test for #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* still wip #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* Function still in Pandas, but CI in Polars #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add in test data station eq 0 #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try an implementation of the Polars migration, but slow I guess #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* fix value in get_consecutive_no_transactions_out unit test

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try to improve perf on big dataset #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* try an optimize way #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* Cleaning for #129

Signed-off-by: Armand <arm.gilles@gmail.com>

* cleaning notebook #129

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* wip #136

Signed-off-by: Armand <arm.gilles@gmail.com>

* Comment out read_meto; the old function stays here for reference

Signed-off-by: Armand <arm.gilles@gmail.com>

* Migrate create_station_attribute & read_stations_attributes to polars with unit test #137

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Update omit coverage with no visualisation code

Signed-off-by: Armand <arm.gilles@gmail.com>

* migrate filter_periode to polars and add unit test

Signed-off-by: Armand <arm.gilles@gmail.com>

* To pass CI test with previous commit

Signed-off-by: Armand <arm.gilles@gmail.com>

* Migrate create & read station profile #136

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update & cleaning notebook #136

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Migrate create_activity_time_series & create_activity_time_series into partial polars with parquet & lazyframe #141

Signed-off-by: Armand <arm.gilles@gmail.com>

* Adapt some code with read_time_serie_activity in lazyframe (not fully polars) #141

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Adapt some code with read_time_serie_activity in lazyframe (not fully polars) #141

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Bench pipeline transf (#155)

* Add benchmark for pipeline transf from json API data #154

Signed-off-by: Armand <arm.gilles@gmail.com>

* Improve speed of creation of simulated data & fix big data creation setting #154

Signed-off-by: Armand <arm.gilles@gmail.com>

* Optimise creation of simulated data and reduce volume test #154

Signed-off-by: Armand <arm.gilles@gmail.com>

* Codspeed run forever, only on small test #154

Signed-off-by: Armand <arm.gilles@gmail.com>

* Codspeed run forever, only on small test #154

Signed-off-by: Armand <arm.gilles@gmail.com>

* Codspeed previous commit ok with no new test, check with one

Signed-off-by: Armand <arm.gilles@gmail.com>

* Reduce volume for #154 to run on codspeed

Signed-off-by: Armand <arm.gilles@gmail.com>

* Reduce volume for #154 to run on codspeed

Signed-off-by: Armand <arm.gilles@gmail.com>

* Reduce again volume for codspeed, reduce CI time

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>

* specify timezone to be as real data #154

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* Try lazy + expression function to check perf bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update code with get_transactions_out expr #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* have to collect with lazy #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* Lazy read_activity_vcub #148 and update notebook transactions_out

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update bench test with lazy vcub_keeper_py312 #148

Signed-off-by: Armand <arm.gilles@gmail.com>

* update docstring #148

Signed-off-by: Armand <arm.gilles@gmail.com>

* collect lazy df to be a fair bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* collect lazy df to be a fair bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* collect lazy df to be a fair bench #146

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update transactions_in to be lazy and expr function #149

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try to improve bench test on big result

Signed-off-by: Armand <arm.gilles@gmail.com>

* lazy and Expr function for transactions_all function #150

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to fix bad perf on big dataset lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to fix bad perf on big dataset lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* try to fix bad perf on big dataset lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* Lazy expr for get_consecutive_no_transactions_out #151

Signed-off-by: Armand <arm.gilles@gmail.com>

* Lazy transform_json_api_bdx_station_data_to_df function #152

Signed-off-by: Armand <arm.gilles@gmail.com>

* Encoding time in Expr function and process_data_cluster in lazy mode #153

Signed-off-by: Armand <arm.gilles@gmail.com>

* add todo for ML with pandas

Signed-off-by: Armand <arm.gilles@gmail.com>

* Adapt code for pipeline bench lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try new lazy for pipeline bench

Signed-off-by: Armand <arm.gilles@gmail.com>

* Try new lazy for pipeline bench

Signed-off-by: Armand <arm.gilles@gmail.com>

* process data with with_columns style & lazy #161

Signed-off-by: Armand <arm.gilles@gmail.com>

* forget previous commit

Signed-off-by: Armand <arm.gilles@gmail.com>

* have to collect these tests

Signed-off-by: Armand <arm.gilles@gmail.com>

* Small bench tests are in eager mode, big ones in lazy mode, for comparison

Signed-off-by: Armand <arm.gilles@gmail.com>

* Small bench tests are in eager mode, big ones in lazy mode, for comparison

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update notebook with with_columns style for feature creation

Signed-off-by: Armand <arm.gilles@gmail.com>

* Using pipe style with lazy

Signed-off-by: Armand <arm.gilles@gmail.com>

* cleaning

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* Update important viz function to polars #164

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update notebook with viz, some pandas are still here but it's ok #164

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* Fix bad condition for consecutive_no_transactions_out and available bike in station #170

Signed-off-by: Armand <arm.gilles@gmail.com>

* Update test to test available_bike less or equal 2 is consecutive_no_transactions_out = 0 #170

Signed-off-by: Armand <arm.gilles@gmail.com>

* Small update to give same type in test #170

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>
* Bump pyproject and add dep to polars to read from cloud

Signed-off-by: Armand <arm.gilles@gmail.com>

* Fix Deprecated warning with replace to replace_strict

Signed-off-by: Armand <arm.gilles@gmail.com>

* Fix Deprecated warning with check_dtype to check_dtypes

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Adding MANIFEST to exclude notebook

Signed-off-by: Armand <arm.gilles@gmail.com>

* Exclude notebook and md files during install

Signed-off-by: Armand <arm.gilles@gmail.com>

* remove .gitattributes lfs files

Signed-off-by: Armand <arm.gilles@gmail.com>

* Add a blank .gitattributes file

Signed-off-by: Armand <arm.gilles@gmail.com>

* Remove Git LFS and exclude notebooks from LFS tracking

Signed-off-by: Armand <arm.gilles@gmail.com>

* Remove .gitattributes file

Signed-off-by: Armand <arm.gilles@gmail.com>

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
* Fix Git LFS tracking for notebook and .ipynb files

* Add again .gitattribute

Signed-off-by: Armand <arm.gilles@gmail.com>

* Fix: Track notebooks with Git LFS

* Fix: Reindex notebooks for Git LFS

---------

Signed-off-by: Armand <arm.gilles@gmail.com>
Signed-off-by: Armand <arm.gilles@gmail.com>

codspeed-hq bot commented Nov 21, 2024

CodSpeed Performance Report

Merging #179 will improve performances by 48.16%

Comparing dev_polars (0139174) with dev_pandas (62da224)

🎉 Hooray! pytest-codspeed just leveled up to 3.0.0!

A heads-up, this is a breaking change and it might affect your current performance baseline a bit. But here's the exciting part - it's packed with new, cool features and promises improved result stability 🥳!
Curious about what's new? Visit our releases page to delve into all the awesome details about this new version.

Summary

⚡ 13 improvements

Benchmarks breakdown

| Benchmark | dev_pandas | dev_polars | Change |
|---|---|---|---|
| test_benchmark_get_consecutive_no_transactions_out | 19.5 ms | 2 ms | ×9.6 |
| test_benchmark_get_consecutive_no_transactions_out_big | 609.7 ms | 61.2 ms | ×10 |
| test_benchmark_get_transaction_all | 7.5 ms | 1.6 ms | ×4.8 |
| test_benchmark_get_transaction_all_big | 192.6 ms | 52 ms | ×3.7 |
| test_benchmark_get_transaction_in | 8.7 ms | 1.9 ms | ×4.7 |
| test_benchmark_get_transaction_in_big | 186.2 ms | 53.6 ms | ×3.5 |
| test_benchmark_get_transaction_out | 8.5 ms | 1.9 ms | ×4.5 |
| test_benchmark_get_transaction_out_big | 176.9 ms | 52.5 ms | ×3.4 |
| test_benchmark_process_data_cluster | 12.9 ms | 4.1 ms | ×3.1 |
| test_benchmark_process_data_cluster_big | 356.7 ms | 240.8 ms | +48.16% |
| test_benchmark_pipepline_transform | 834.6 ms | 175 ms | ×4.8 |
| test_benchmark_pipepline_transform_big | 8.4 s | 2 s | ×4.2 |
| test_benchmark_transf_json_to_df | 159.4 ms | 24.1 ms | ×6.6 |
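
For readers who want to re-derive the "Change" column, here is a minimal sketch that recomputes the speedups from the timings listed in the report. It is not CodSpeed's own calculation: the names `TIMINGS_MS`, `speedup`, and `summarize` are made up for this example, and the inputs are the rounded values shown above, so the ratios can differ slightly from the reported ones (for instance 19.5 / 2 gives 9.75 where the report shows ×9.6).

```python
# Timings copied from the CodSpeed report, in milliseconds: (dev_pandas, dev_polars).
# These are the rounded figures displayed in the report, not raw measurements.
TIMINGS_MS = {
    "test_benchmark_get_consecutive_no_transactions_out": (19.5, 2.0),
    "test_benchmark_get_consecutive_no_transactions_out_big": (609.7, 61.2),
    "test_benchmark_get_transaction_all": (7.5, 1.6),
    "test_benchmark_get_transaction_all_big": (192.6, 52.0),
    "test_benchmark_get_transaction_in": (8.7, 1.9),
    "test_benchmark_get_transaction_in_big": (186.2, 53.6),
    "test_benchmark_get_transaction_out": (8.5, 1.9),
    "test_benchmark_get_transaction_out_big": (176.9, 52.5),
    "test_benchmark_process_data_cluster": (12.9, 4.1),
    "test_benchmark_process_data_cluster_big": (356.7, 240.8),
    "test_benchmark_pipepline_transform": (834.6, 175.0),
    "test_benchmark_pipepline_transform_big": (8400.0, 2000.0),
    "test_benchmark_transf_json_to_df": (159.4, 24.1),
}


def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the new run is (before / after)."""
    return before_ms / after_ms


def summarize(timings: dict) -> list:
    """Return (benchmark name, speedup) pairs, smallest gain first."""
    rows = [(name, speedup(before, after)) for name, (before, after) in timings.items()]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, ratio in summarize(TIMINGS_MS):
        # (ratio - 1) * 100 expresses the same gain as a percentage.
        print(f"{name:<56} x{ratio:.2f}  ({(ratio - 1) * 100:+.0f}%)")
```

Sorted this way, the smallest gain is `test_benchmark_process_data_cluster_big`, the same benchmark that carries the 48.16% figure quoted in the report's headline.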


codecov bot commented Nov 21, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 27 lines in your changes missing coverage. Please review.

Project coverage is 68.70%. Comparing base (62da224) to head (0139174).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/vcub_keeper/create/creator.py | 60.00% | 14 Missing ⚠️ |
| src/vcub_keeper/reader/reader.py | 60.00% | 10 Missing ⚠️ |
| src/vcub_keeper/ml/train_cluster.py | 0.00% | 2 Missing ⚠️ |
| src/vcub_keeper/ml/cluster.py | 93.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@               Coverage Diff               @@
##           dev_pandas     #179       +/-   ##
===============================================
+ Coverage       43.10%   68.70%   +25.60%     
===============================================
  Files              10       10               
  Lines             457      310      -147     
===============================================
+ Hits              197      213       +16     
+ Misses            260       97      -163     
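
The percentages in the coverage diff can be checked from its Hits and Lines rows. The sketch below does that arithmetic; `coverage_pct` is a hypothetical helper, and truncating to two decimals (rather than rounding) is an assumption made here because it reproduces the displayed 43.10% and 68.70%.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated (assumption) to two decimals."""
    return math.floor(hits / lines * 10_000) / 100


# Figures taken from the coverage diff: (hits, misses, lines).
base_hits, base_misses, base_lines = 197, 260, 457   # dev_pandas
head_hits, head_misses, head_lines = 213, 97, 310    # this PR (#179)

# Sanity check: hits and misses should add up to the line count.
assert base_hits + base_misses == base_lines
assert head_hits + head_misses == head_lines

base = coverage_pct(base_hits, base_lines)
head = coverage_pct(head_hits, head_lines)

if __name__ == "__main__":
    print(f"base {base:.2f}%  head {head:.2f}%  delta {head - base:+.2f}%")
```

Note that most of the increase comes from the denominator: the tracked line count drops by 147 while hits rise by only 16.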


armgilles changed the title from "Dev polars" to "Pandas VS Polars benchmark PR" on Nov 21, 2024
armgilles changed the title from "Pandas VS Polars benchmark PR" to "Pandas VS Polars benchmark PR (don't merge)" on Feb 20, 2025