-
Notifications
You must be signed in to change notification settings - Fork 13
adds rcf ForecastLeadTimeBins, corrects misc. formatting #601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adds rcf ForecastLeadTimeBins, corrects misc. formatting #601
Conversation
samlamont
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting the error: TypeError: ForecastLeadTimeBins._validate_bin_size_dict() takes 2 positional arguments but 3 were given when running rcf.ForecastLeadTimeBins() in the test_add_row_udfs test
…eld-method-to-bin-by-forecast-leadtime
resolved this by adding in the |
| pd.Timedelta(days=2): pd.Timedelta(hours=12), | ||
| pd.Timedelta(days=3): pd.Timedelta(days=1) | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it might be a bit more clear to define the keys as just an integer (or maybe even string?) then the bin could be some time value (could also be a string?), for example:
bin_size = {
"1": "6 hours",
"2": "12 hours",
"3": "1 day"
}where bin 1 would be 0-6, 2 would be 6-18, etc. And the values (whether strings or datetime objects) could be converted to datetime objects is not already. Then the resulting field would just be the simple key value 1,2, or 3 rather than the time interval string, which could be hard to filter on if there's a long record of forecasts in the table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to consider inputs of format:
output:
{
"PT0H_PT6H",
"PT6H_PT12H",
"PT12H_PT1D",
"PT1D_PT2D",
}
input:
1. constant bin size, timedelta:
pd.Timedelta(days=5)
2. variable bin size, dict:
[
{start_inclusive: pd.Timedelta(hours=0), end_exclusive: pd.Timedelta(hours=6)},
{start_inclusive: pd.Timedelta(hours=12), end_exclusive: pd.Timedelta(hours=18)}
]
3. variable bin size, dict with string/datetime keys:
{
"bin1": {start_inclusive: pd.Timedelta(hours=0), end_exclusive: pd.Timedelta(hours=6)}
}
- allow string, datetime timedelta as well
samlamont
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your efforts on this!
* adds rcf ForecastLeadTimeBins, corrects misc. formatting * update docs and API formatting w.r.t ForecastLeadTimeBins * adds staticmethod decorators to class methods * refactors ForecastLeadTimeBins * adds new tests for ForecastLeadTimeBins * update tests * adds additional input type support, support for mixed input types --------- Co-authored-by: samlamont <sam.lamont@gmail.com>
ForecastLeadTimeBins()class to existing rcf'sAlso could include: