new feature making clustermq "pipeable" #318

@wds15

Hi!

First, clustermq is really great; it powers a lot of what I do. Today I wrote a small utility function that makes the "Q" functions compatible with the pipe syntax that is widely used in R workflows. Could something like this be implemented in clustermq directly?

library(brms)
fit1 <- brm(count ~ zAge + zBase * Trt + (1|patient),
            data = epilepsy, family = poisson())

## adding predictions to the original data set can be done with a pipe approach
epilepsy |> tidybayes::add_predicted_rvars(fit1)

## This does not work with Q_rows, because Q_rows passes the individual
## columns as separate arguments to the function. The function below
## nests each row into its own one-row data frame, so that clustermq
## can be applied directly:

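## data: data frame to process row by row
## fun:  function to call on each row
## arg:  unquoted name of the argument of fun that receives the row
## ...:  further arguments passed on to clustermq::Q_rows
Q_rows_nested <- function(data, fun, arg, ...) {
    data |>
        ## number the rows and nest each one into a one-row data frame
        dplyr::mutate(.row = dplyr::row_number()) |>
        tidyr::nest(data = -.row) |>
        ## keep only the nested column, renamed to the argument name
        dplyr::select("{{arg}}" := data) |>
        clustermq::Q_rows(fun = fun, ...) |>
        dplyr::bind_rows()
}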


## now we can run the predictions in parallel over clustermq
epilepsy |> Q_rows_nested(tidybayes::add_predicted_rvars, newdata, const=list(object=fit1))

The above pays off mainly for large simulations and model fits. A useful addition would be chunking, so that the "data" is split into larger pieces rather than single rows; that should be easy to add.
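
The chunking idea could be sketched roughly as follows. This is only a sketch under assumptions, not something clustermq provides: `chunk_ids`, `Q_rows_chunked` and the `rows_per_chunk` argument are hypothetical names, and the code additionally uses `rlang` and `tibble`. It assumes an ungrouped data frame, a `fun` that accepts a data frame with several rows, and that `Q_rows` hands each element of a list-column to `fun` in the same way the per-row version relies on. The name `rows_per_chunk` is chosen so that it does not shadow clustermq's own `chunk_size` argument, which can still be passed through `...`.

```r
## Hypothetical helper (not part of clustermq): chunk index for each of n rows.
chunk_ids <- function(n, rows_per_chunk) {
    ceiling(seq_len(n) / rows_per_chunk)
}

## Hypothetical chunked variant of Q_rows_nested (not part of clustermq).
## Each call of fun receives up to rows_per_chunk rows instead of a single row.
Q_rows_chunked <- function(data, fun, arg, rows_per_chunk = 100, ...) {
    ## turn the unquoted argument name into a string
    arg_name <- rlang::as_name(rlang::ensym(arg))
    ## label the rows with a chunk index and nest each chunk into one data frame
    nested <- data |>
        dplyr::mutate(.chunk = chunk_ids(dplyr::n(), rows_per_chunk)) |>
        tidyr::nest(data = -.chunk)
    ## one row per chunk, with a single list-column named after the argument
    tibble::tibble(!!arg_name := nested$data) |>
        clustermq::Q_rows(fun = fun, ...) |>
        dplyr::bind_rows()
}
```

With the model from the example above, a call might then look like `epilepsy |> Q_rows_chunked(tidybayes::add_predicted_rvars, newdata, rows_per_chunk = 50, const = list(object = fit1))`.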

This is just a feature suggestion, as I think it could be useful for many others as well.
