Batch requests when requesting multiple days of data #275
R-packages/covidcast/R/covidcast.R
Outdated
| max_locations <- meta_info[meta_info$data_source == data_source & | ||
| meta_info$signal == signal & | ||
| meta_info$geo_type == geo_type, ]$num_locations | ||
| if (length(max_locations) == 0) { |
Can you add a comment about
- why the max locations would be empty and
- why we default to number of geo values
?
I believe it's a pre-emptive fix for a similar effect of #270, but yeah, a comment would be good here
That's right, it's a defensive measure against signals that aren't documented in covidcast_meta: when the location count can't be found, we use the max as an upper bound. Added a comment.
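For readers following the thread, the fallback being discussed could be sketched roughly as below. This is only an illustration under stated assumptions: the helper name `max_locations_or_default`, its `default` argument, and the toy metadata are hypothetical and are not taken from the package; only the `meta_info` column names come from the diff above.

```r
# Hypothetical helper illustrating the defensive fallback; not package code.
max_locations_or_default <- function(meta_info, data_source, signal,
                                     geo_type, default) {
  rows <- meta_info[meta_info$data_source == data_source &
                      meta_info$signal == signal &
                      meta_info$geo_type == geo_type, ]
  max_locations <- rows$num_locations
  if (length(max_locations) == 0) {
    # The signal is not documented in the metadata, so fall back to the
    # caller-supplied upper bound instead of failing.
    return(default)
  }
  max(max_locations)
}

# Toy metadata, for illustration only.
meta_info <- data.frame(
  data_source = c("src_a", "src_a"),
  signal = c("sig_1", "sig_2"),
  geo_type = c("county", "state"),
  num_locations = c(3000, 52),
  stringsAsFactors = FALSE
)

documented <- max_locations_or_default(meta_info, "src_a", "sig_2", "state",
                                       default = 10)
undocumented <- max_locations_or_default(meta_info, "src_b", "sig_9", "state",
                                         default = 10)
```

Here `documented` comes from the metadata row, while `undocumented` falls back to `default` because no row matches.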
R-packages/covidcast/R/covidcast.R
Outdated
| } else { | ||
| nissues <- 1 | ||
| } | ||
| max_days_at_time <- floor(3649 / (ngeos * nissues)) |
This number would be better refactored into a file-level constant whose variable name is in all caps.
Done
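A minimal sketch of what the refactored calculation might look like is shown below. The constant name `MAX_RESULTS` appears in a later commit message in this PR and the value 3649 comes from the diff above; the wrapper function `max_days_per_request` is hypothetical and exists only to make the arithmetic easy to try out.

```r
# File-level constant: value taken from the diff above, name from the later
# commit message in this PR.
MAX_RESULTS <- 3649

# Hypothetical wrapper (not the package's function) around the calculation.
max_days_per_request <- function(ngeos, nissues = 1) {
  max_days_at_time <- floor(MAX_RESULTS / (ngeos * nissues))
  if (max_days_at_time == 0) {
    # Even a single day may exceed the limit; still request one day at a time.
    max_days_at_time <- 1
  }
  max_days_at_time
}

few_geos <- max_days_per_request(ngeos = 52)     # many days fit per request
many_geos <- max_days_per_request(ngeos = 5000)  # clamped to one day
```

With 52 locations and one issue per day, 70 days fit under the limit (52 * 70 = 3640); with 5000 locations the floor is 0, so the result is clamped to 1.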
R-packages/covidcast/R/covidcast.R
Outdated
| if (max_days_at_time == 0) { | ||
| max_days_at_time <- 1 | ||
| } | ||
| batch_days <- ceiling(ndays / max_days_at_time) |
Is this actually just the number of batches? If so, would prefer to see this variable name changed to reflect that.
Done
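To make the "number of batches" reading concrete, here is a small sketch of how a date range might be split once the batch count is known. The function name `split_into_batches` and the variable `num_batches` are assumptions for illustration; the package may group the days differently.

```r
# Hypothetical helper: split a vector of days into consecutive batches of at
# most max_days_at_time days each.
split_into_batches <- function(days, max_days_at_time) {
  num_batches <- ceiling(length(days) / max_days_at_time)
  # Label each day with the index of the batch it belongs to.
  batch_index <- rep(seq_len(num_batches),
                     each = max_days_at_time)[seq_along(days)]
  unname(split(days, batch_index))
}

days <- seq(as.Date("2020-10-01"), as.Date("2020-10-10"), by = "day")
batches <- split_into_batches(days, max_days_at_time = 4)
```

Ten days with at most four days per request yields three batches of 4, 4, and 2 days, so three API calls would replace ten.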
| geo_value = "*", | ||
| as_of = NULL, | ||
| issues = NULL, | ||
| lag = NULL |
should be indented more
Done
Please add a description of this PR.
This will also run into conflicts with #271, mainly just in the unit tests (should be easy to deal with). We should plan the order of operations here when we merge. (but thanks -- this should be a great benefit to download speed)
sgsmob left a comment
Just the minor newline nit, but otherwise this is good to go.
| ), | ||
| regexp = NA) | ||
| expect_called(m, 2) | ||
| }) |
Add a newline here.
Done
0b0453d to e5d83a1
dd94b08 to b1e64d3
Fix Python client shapefile packaging
This looks good, thanks. Let's not merge yet -- I'd like to merge into r-pkg-devel instead of main, so let me adjust and get that working, and in the meantime you can fix a couple of small nits.
R-packages/covidcast/R/covidcast.R
Outdated
| api_msg = dat[[i]]$message, | ||
| class = "covidcast_missing_geo_values" | ||
| ) | ||
| msg = dat[[i]]$message, |
This needs to go back to being called api_msg; maybe this changed back when you rebased on #252?
| msg = dat[[i]]$message, | |
| api_msg = dat[[i]]$message, |
I had some merge conflicts early on, seems likely I grabbed the wrong line there. Fixed.
| grDevices, | ||
| httr, | ||
| jsonlite, | ||
| lubridate, |
lubridate is only used in the tests, right? I think that means it can go in Suggests instead of Imports.
ymd is a lubridate function, which I used a few times in covidcast.R.
Have python module test on installed package
Counts taken from line numbers of files in https://github.com/cmu-delphi/covidcast-indicators/tree/fb-package-validation/validator/static
Various small code fixes - Add comments - Fix indenting - Factor out constant (MAX_RESULTS)
- covidcast_days now takes max_geos as an optional parameter instead of determining it internally - max_geo_values is now specific_meta, for retrieving more than just num_locations.
b1e64d3 to 2eab488
Well, that was satisfying to test out: That would have been 50 separate API calls before, so this is very nice.
Description:
Improve efficiency of covidcast_days by requesting multiple days of data when it can be determined that it is safe to do so.
Change Log:
mockandstub