Hello,
After collecting a test set of fragCounter coverage profiles for 4 normal samples, I attempted to run the dryclean workflow.
I encountered the following error while trying the first step of creating the PoN in prepare_detergent:
pon_detergent <- prepare_detergent(normal.table.path = "/drycleanRun/test_ton.rds",
use.all = TRUE,
num.cores = 2,
build = "hg38",
path.to.save = "drycleanRun/",
nochr = T,
save.pon = T)
### OUTPUT ###
Starting the preparation of Panel of Normal samples a.k.a detergent
4 samples available
Using all samples
PAR file not provided, using hg38 default. If this is not the correct build, please provide a GRange object delineating for corresponding build
PAR read
Checking for existence of files
4 files present
|=====================================================================================================================| 100%, Elapsed 07:21
Error in setattr(ans, "names", c(keep.names, paste0("V", seq_len(length(ans) - :
'names' attribute [1] must be the same length as the vector [0]
While troubleshooting, I found that others have encountered the same error, but at a different stage of the workflow (#2).
Based on the output message, the error appears to occur within the pbmclapply call at line 259, although I am not sure exactly where.
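For anyone trying to narrow this down, here is a minimal sketch of how the failing sample could be isolated by running a per-sample step sequentially instead of in parallel. It is not part of dryclean: check_each_sample is a hypothetical helper, the default step (readRDS) only checks that each file loads, and the normal_cov column name in the usage comment is an assumption about how the normal table is laid out.

```r
# Hypothetical helper (not a dryclean function): apply `step` to each file
# one at a time and record whether it succeeded, so that an error can be
# attributed to a specific sample rather than to the parallel loop as a whole.
check_each_sample <- function(paths, step = readRDS) {
  results <- lapply(paths, function(p) {
    tryCatch(
      {
        obj <- step(p)
        data.frame(path = p, ok = TRUE, n = length(obj),
                   message = NA_character_, stringsAsFactors = FALSE)
      },
      error = function(e) {
        data.frame(path = p, ok = FALSE, n = NA_integer_,
                   message = conditionMessage(e), stringsAsFactors = FALSE)
      }
    )
  })
  do.call(rbind, results)
}

# Usage sketch; the column name `normal_cov` is an assumption:
# normal_table <- readRDS("/drycleanRun/test_ton.rds")
# check_each_sample(normal_table$normal_cov)
```

If the coverage objects are GRanges, GenomicRanges should be loaded first so that length() reports the number of bins. Passing a different step function would allow testing whatever per-sample processing is suspected of failing.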
I then decided to test prepare_detergent under the other possible approaches instead of using all samples.
Interestingly, both alternative options, choose.randomly = TRUE and choose.by.clustering = TRUE, executed without an error.
Here using choose.randomly = TRUE and selecting 2 of the 4 samples:
pon_detergent <- prepare_detergent(normal.table.path = "/drycleanRun/test_ton.rds",
use.all = FALSE,
choose.randomly = TRUE,
number.of.samples = 2,
choose.by.clustering = FALSE,
num.cores = 2,
build = "hg38",
path.to.save = "drycleanRun/",
nochr = T,
save.pon = T)
### OUTPUT ###
Starting the preparation of Panel of Normal samples a.k.a detergent
4 samples available
Selecting 2 normal samples randomly
PAR file not provided, using hg38 default. If this is not the correct build, please provide a GRange object delineating for corresponding build
PAR read
Checking for existence of files
2 files present
|============================================================================================================| 100%, Elapsed 03:28
Starting decomposition
This is version 2
Warning: Item 1 has 3031053 rows but longest item has 15155223; recycled with remainder.
Finished making the PON or detergent and saving it to the path provided
And here using choose.by.clustering = TRUE:
pon_detergent <- prepare_detergent(normal.table.path = "/drycleanRun/test_ton.rds",
use.all = FALSE,
choose.randomly = FALSE,
number.of.samples = 2,
choose.by.clustering = TRUE,
num.cores = 2,
build = "hg38",
path.to.save = "drycleanRun/",
nochr = T,
save.pon = T)
### OUTPUT ###
Starting the preparation of Panel of Normal samples a.k.a detergent
4 samples available
Starting the clustering
Starting decomposition on a small section of genome
This is version 2
Starting clustering
PAR file not provided, using hg38 default. If this is not the correct build, please provide a GRange object delineating for corresponding build
PAR read
Checking for existence of files
2 files present
|============================================================================================================| 100%, Elapsed 01:52
Starting decomposition
This is version 2
Warning: Item 1 has 3031053 rows but longest item has 15155223; recycled with remainder.
Finished making the PON or detergent and saving it to the path provided
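The recycling warning in both successful runs reports items of different lengths (3031053 vs 15155223 rows), which may mean the coverage profiles do not all have the same number of bins. A minimal sketch for checking this directly is below. summarise_lengths is a hypothetical helper, not a dryclean function, and it assumes each path points to an .rds object whose length() is its number of bins.

```r
# Hypothetical helper (not a dryclean function): report the length of each
# saved coverage object and flag any that differ from the longest one.
summarise_lengths <- function(paths) {
  n <- vapply(paths, function(p) length(readRDS(p)), integer(1))
  data.frame(path = paths,
             n = unname(n),
             matches_longest = unname(n == max(n)),
             stringsAsFactors = FALSE)
}

# Usage sketch; the column name `normal_cov` is an assumption:
# normal_table <- readRDS("/drycleanRun/test_ton.rds")
# summarise_lengths(normal_table$normal_cov)
```

Any row with matches_longest set to FALSE would be a candidate source of the recycling warning. As before, load GenomicRanges first if the objects are GRanges.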
The output detergent.rds appears to be in working order, as I was able to run start_wash_cycle with it without any problems.
I will likely use the clustering method for further analysis but wanted to point out this issue for others who encounter it.
Best,
Patrick