Running the pandera validation on a large-ish pandas DataFrame took ~5 minutes.

Maybe we could convert the dtypes manually and run the validation checks ourselves (this could also be the place to handle NaN values?)
Pandera also supports Polars, which may be faster, but I couldn't get it to work easily on teehr-hub
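
A rough sketch of what the manual route could look like with plain pandas, in case it helps the discussion. The column names (`location_id`, `value_time`, `value`), the target dtypes, and the `coerce_and_check` helper are all made up for illustration and are not our real schema. Whether this actually beats pandera on the large dataframe is untested.

```python
import pandas as pd

# Hypothetical columns; the real schema would replace these.
REQUIRED_COLUMNS = ["location_id", "value_time", "value"]


def coerce_and_check(df: pd.DataFrame, drop_nan: bool = True) -> pd.DataFrame:
    """Cast columns to the expected dtypes, then run simple vectorized checks."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    out = df.copy()

    # Explicit dtype conversion, one column at a time.
    out["location_id"] = out["location_id"].astype("string")
    out["value_time"] = pd.to_datetime(out["value_time"])
    # Unparseable values become NaN instead of raising.
    out["value"] = pd.to_numeric(out["value"], errors="coerce").astype("float32")

    # NaN handling lives here: either drop the rows or fail loudly.
    nan_mask = out["value"].isna()
    if nan_mask.any():
        if drop_nan:
            out = out.loc[~nan_mask].reset_index(drop=True)
        else:
            raise ValueError(f"{int(nan_mask.sum())} NaN values in 'value'")

    # Example check: identifiers must not be null.
    if out["location_id"].isna().any():
        raise ValueError("Null location_id values found")

    return out


raw = pd.DataFrame(
    {
        "location_id": ["a", "b", "c"],
        "value_time": ["2020-01-01", "2020-01-02", "2020-01-03"],
        "value": ["1.5", None, "2.5"],
    }
)
clean = coerce_and_check(raw)
```

Passing `drop_nan=False` turns the NaN step into a hard failure instead of silently dropping rows, which might be the safer default while we are still figuring out where the NaNs come from.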