pandera validation is slow for bigger datasets when type="pandas" #457

@samlamont

Description

Running the pandera validation on a large-ish pandas dataframe took ~5 minutes.

Maybe we could convert the dtypes manually and run the validation checks ourselves instead of going through pandera (this could also be the place to handle NaN values?)
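
A rough sketch of what the manual approach could look like, using plain pandas only. The column names, target dtypes, and the `validate_manually` helper are all hypothetical placeholders, not the actual schema or code from this project; the idea is just one vectorized `astype` cast plus a null check on the columns that must not be empty.

```python
import pandas as pd

# Hypothetical column -> dtype mapping (the real schema is not shown in this issue).
EXPECTED_DTYPES = {
    "location_id": "string",
    "value_time": "datetime64[ns]",
    "value": "float32",
}


def validate_manually(df, dtypes=EXPECTED_DTYPES,
                      non_nullable=("location_id", "value_time")):
    """Cast columns to the expected dtypes and check required columns for nulls."""
    # Fail early if an expected column is absent.
    missing = set(dtypes) - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")

    # One vectorized cast per column; pandas raises if a value cannot be converted.
    out = df.astype(dtypes)

    # NaN handling: only the columns listed in non_nullable are rejected on nulls,
    # so NaN in e.g. "value" passes through untouched.
    null_counts = out[list(non_nullable)].isna().sum()
    bad = null_counts[null_counts > 0]
    if not bad.empty:
        raise ValueError(f"Null values found: {bad.to_dict()}")

    return out
```

This skips everything else pandera does (custom checks, error reports, coercion rules), so it would only be a replacement if dtype casting and null checks are all we need here.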

Pandera also supports Polars, which may be faster, but I couldn't get it working easily on teehr-hub
