
migrate() throws error when time values have overlapping characters #21

@mthomas-ketchbrook

Problem

When one of the distinct values for the time argument is a substring of the other, the column name replacement logic causes migrate() to throw the error "Can't subset elements that don't exist". For example, if one time value is "M1" and the other is "M12", the first gsub() call matches the column names for both time values, so the column for the second value is never renamed as expected.
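The substring collision can be sketched in isolation with base R. This is not the package's internal code: the column names and the "_start"/"_end" replacement targets below are assumptions chosen to mirror the error message in this issue.

```r
# Hypothetical column names, standing in for whatever migrate() builds
# internally from the state column and the two time values
cols <- c("risk_rating_M1", "risk_rating_M12")

# First replacement: intended to rename only the "M1" column,
# but "M1" is also the leading part of "M12"
step1 <- gsub("M1", "start", cols, fixed = TRUE)
print(step1)

# Second replacement: "M12" no longer appears anywhere, so nothing changes
step2 <- gsub("M12", "end", step1, fixed = TRUE)
print(step2)

# The expected end column was never created
"risk_rating_end" %in% step2
```

Under these assumptions, the second column ends up as "risk_rating_start2" rather than "risk_rating_end", which is consistent with the "Element `risk_rating_end` doesn't exist" message reported below.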

Reproducible Example

library(migrate)

mock_credit |>
  dplyr::mutate(
    problematic_time = dplyr::case_when(
      date == as.Date("2020-06-30") ~ "time_1",
      date == as.Date("2020-09-30") ~ "time_12"   # this string contains "time_1"
    )
  ) |>
  migrate(
    id = customer_id,
    time = problematic_time,
    state = risk_rating
  )

Running this throws the following error:

ℹ Migrating from time_1 to time_12
Error in `dplyr::group_by()`:
ℹ In argument: `dplyr::across(dplyr::all_of(c(state_start_name, state_end_name)))`.
Caused by error in `across()`:
ℹ In argument: `dplyr::all_of(c(state_start_name, state_end_name))`.
Caused by error in `dplyr::all_of()`:
! Can't subset elements that don't exist.
✖ Element `risk_rating_end` doesn't exist.
Run `rlang::last_trace()` to see where the error occurred.

Possible Solution

We need to be stricter about how we name/rename these columns. Perhaps we should create these column names without performing any string replacement.
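One way this could look, as a rough sketch rather than a proposed patch: build the old and new column names directly from the state name and the time values, then rename by exact match on the whole name. The data frame, the variable names, and the assumption that the first time value is the start period are all hypothetical.

```r
# Hypothetical inputs; assumes the first time value is the start period
state_name  <- "risk_rating"
time_values <- c("M1", "M12")

# Toy wide data frame with one state column per time value
df <- data.frame(
  id = 1:2,
  risk_rating_M1  = c("A", "B"),
  risk_rating_M12 = c("B", "B")
)

# Construct the old and new names explicitly (no gsub() involved)
old_names <- paste(state_name, time_values, sep = "_")
new_names <- paste(state_name, c("start", "end"), sep = "_")

# match() compares whole strings, so "risk_rating_M1" cannot hit "risk_rating_M12"
idx <- match(old_names, names(df))
names(df)[idx] <- new_names

print(names(df))
```

Because match() compares complete names, it does not matter whether one time value is a prefix of the other.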
