Problem
Currently, when loading from a source, the Polars eager API is used to load the full dataframe into memory, and each data manipulation statement then modifies this dataframe in memory (similar to how Pandas works). Polars also provides a lazy API, which defers data manipulation statements and combines them into a single optimized query that runs only when the result is collected. The Polars developers encourage using the lazy API where possible.
The benefits of the lazy API are:
- slightly faster performance
- much lower memory usage (e.g. reading only the first few entries of a Parquet file to get the starting timestamp instead of loading the full data).
Solution
Examine the load and data manipulation statements inside the Src and TimeSeriesFuser classes and see whether they can be refactored to support both the eager and lazy Polars APIs.