Here is an example table:
library(data.table)
library(splitstackshape)
dt1 <- fread("V1 V2 V3
x xA;xB;xC x1;x2;x3
y yD y1
z zF;zG z1")
I want to split it on both the V2 and V3 columns. Note that the last record is "wrong": V2 has 2 values while V3 has only one. Here is how cSplit() handles such cases:
# with default arguments:
cSplit(dt1, splitCols = c('V2', 'V3'), sep=';', direction = 'long')
# V1 V2 V3
#1: x xA x1
#2: x xB x2
#3: x xC x3
#4: y yD y1
#5: y NA NA
#6: y NA NA
#7: z zF z1
#8: z zG NA
#9: z NA NA
# with `makeEqual = TRUE`:
cSplit(dt1, splitCols = c('V2', 'V3'), sep=';', direction = 'long', makeEqual = TRUE)
# V1 V2 V3
#1: x xA x1
#2: x xB x2
#3: x xC x3
#4: y yD y1
#5: y NA NA
#6: y NA NA
#7: z zF z1
#8: z zG NA
#9: z NA NA
So by default it behaves as if makeEqual = TRUE, even though the help says it defaults to FALSE. Then I tried with FALSE:
cSplit(dt1, splitCols = c('V2', 'V3'), sep=';', direction = 'long', makeEqual = FALSE)
# Warning in `[.data.table`(indt, , `:=`(eval(splitCols), lapply(X, function(x) { :
# Supplied 5 items to be assigned to 6 items of column 'V3' (recycled leaving remainder of 1 items).
# V1 V2 V3
# 1: x xA x1
# 2: x xB x2
# 3: x xC x3
# 4: y yD y1
# 5: z zF z1
# 6: z zG x1
It recycles V3 elements, but it takes them from another group (the x1 in row 6 belongs to x, not z), which is unexpected. I think it would be more logical to give one of the following outputs:
# without recycling, fill with NA:
# V1 V2 V3
#1: x xA x1
#2: x xB x2
#3: x xC x3
#4: y yD y1
#5: z zF z1
#6: z zG NA
# with recycling:
# V1 V2 V3
#1: x xA x1
#2: x xB x2
#3: x xC x3
#4: y yD y1
#5: z zF z1
#6: z zG z1
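
For reference, here is a minimal sketch of how both expected outputs could be produced with plain data.table, outside of cSplit(). The helper name split_long and its recycle argument are hypothetical and not part of splitstackshape. It assumes the split columns hold strings with a fixed separator, and it pads or recycles within each row so that values never leak from another group.

```r
library(data.table)

# Same data as dt1 above, built without fread so the block is self-contained.
dt1 <- data.table(
  V1 = c("x", "y", "z"),
  V2 = c("xA;xB;xC", "yD", "zF;zG"),
  V3 = c("x1;x2;x3", "y1", "z1")
)

# Hypothetical helper (not a splitstackshape function): split several columns
# row by row, then make the pieces of each row the same length.
split_long <- function(dt, split_cols, sep = ";", recycle = FALSE) {
  keep_cols <- setdiff(names(dt), split_cols)
  pieces <- lapply(seq_len(nrow(dt)), function(i) {
    # Split every requested column of row i on the separator.
    parts <- lapply(split_cols, function(col) {
      strsplit(as.character(dt[[col]][i]), sep, fixed = TRUE)[[1]]
    })
    n <- max(lengths(parts))
    # Either recycle the row's own values or pad them with NA.
    parts <- lapply(parts, function(p) {
      if (recycle) rep_len(p, n) else c(p, rep(NA_character_, n - length(p)))
    })
    names(parts) <- split_cols
    # Repeat the untouched columns of row i to match the n split values.
    cbind(dt[rep(i, n), keep_cols, with = FALSE], as.data.table(parts))
  })
  # Stack the per-row results and restore the original column order.
  rbindlist(pieces)[, names(dt), with = FALSE]
}

res_na  <- split_long(dt1, c("V2", "V3"))                  # V3 ends with z1, NA
res_rec <- split_long(dt1, c("V2", "V3"), recycle = TRUE)  # V3 ends with z1, z1
```

The loop is row-wise for clarity rather than speed; the point is only that the padding or recycling is done per row, which is what keeps the z group from picking up x1.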