Skip to content

Error with merging paired ends using mergeAmpliconReads() #18

@APepey

Description

@APepey

Hello! Thank you for all your work!
I have a question about one the HaplotypR functions.

I am working on MiSeq data with the HaplotypR package. I have 35 samples with five P. falciparum markers: ama1-d3, cpmp, csp, cpp and msp7.
I could proceed with the following steps using an R code I received from the WEHI in December 2019:

  • demultiplexing by sample
  • demultiplexing by marker
  • read quality per marker

I am then trying to merge paired ends but I am encountering different issues depending on the syntax I am trying:

Syntax 1:

>   merged_reads <- 
+     marker_deplex %>%
+     select(SampleID, SampleName, MarkerID) %>%
+     bind_cols(mergeAmpliconReads(marker_deplex$FileR1, 
+                                                         marker_deplex$FileR2,
+                                                         outputDir = 'merge_reads'))
Processing file 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz and 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz ...1
Warning in if (method == "vsearch") { :
  the condition has length > 1 and only the first element will be used
Warning in system(call, intern = TRUE) :
  running command '"C:/Users/apepey/Documents/R/R-4.1.2/library/Rvsearch/vsearch" --fastq_mergepairs "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz" --reverse "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz" --fastqout "merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz" --fastq_truncqual 1 --fastq_maxns 0 --fastq_allowmergestagger' had status 1
Error: Input/Output
  no input files found
  dirPath: merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz
  pattern: character(0)

Syntax 2:

>   merged_reads <- 
+     marker_deplex %>%
+     mergeAmpliconReads(fastqFileR1 = marker_deplex$FileR1, 
+                                         fastqFileR2 = marker_deplex$FileR2,
+                                         outputDir = 'merge_reads')
Processing file 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz and 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz ...1
Warning in if (method == "vsearch") { :
  the condition has length > 1 and only the first element will be used
Warning in if (method == "NGmerge") { :
  the condition has length > 1 and only the first element will be used
Error in FUN(X[[i]], ...) : 
  Merge c("3D7-A", "3D7-A", "3D7-A", "3D7-A", "3D7-A", "3D7-B", "3D7-B", "3D7-B", "3D7-B", "3D7-B", "M0-176", "M0-176", "M0-176", "M0-176", "M0-176", "M0-316", "M0-316", "M0-316", "M0-316", "M0-316", "M0-441", "M0-441", "M0-441", "M0-441", "M0-441", "M0-445", "M0-445", "M0-445", "M0-445", "M0-445", "M0-5", "M0-5", "M0-5", "M0-5", "M0-5", "M0-561", "M0-561", "M0-561", "M0-561", "M0-561", "M0-562", "M0-562", "M0-562", "M0-562", "M0-562", "M0-687", "M0-687", "M0-687", "M0-687", "M0-687", "M0-801", "M0-801", 
"M0-801", "M0-801", "M0-801", "M0-96", "M0-96", "M0-96", "M0-96", "M0-96", "M1-316", "M1-316", "M1-316", "M1-316", "M1-316", "M1-561", "M1-561", "M1-561", "M1-561", "M1-561", "M1-687", "M1-687", "M1-687", "M1-687", "M1-687", "M1-801", "M1-801", "M1-801", "M1-801", "M1-801", "M1-831", "M1-831", "M1-831", "M1-831", "M1-831", "M11-441", "M11-441", "M11-441", "M11-441", "M11-441", "M11-792", "M11-792", "M11-792", "M11-792", "M11-792", "M2-445", "M2-445", "M2-445", "M2-445", "M2-445

Syntax 3:

>   merged_reads <-
+     marker_deplex %>%
+     select(SampleID, SampleName, MarkerID) %>% 
+     bind_cols(mergeAmpliconReads(fastqFileR1 = as.character(marker_deplex$FileR1), 
+                                                         fastqFileR2 = as.character(marker_deplex$FileR2),
+                                                         outputDir = './merge_reads'))
Processing file 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz and 3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz ...1
Warning in if (method == "vsearch") { :
  the condition has length > 1 and only the first element will be used
Warning in system(call, intern = TRUE) :
  running command '"C:/Users/apepey/Documents/R/R-4.1.2/library/Rvsearch/vsearch" --fastq_mergepairs "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_F.fastq.gz" --reverse "marker_deplex/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_R.fastq.gz" --fastqout "./merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz" --fastq_truncqual 1 --fastq_maxns 0 --fastq_allowmergestagger' had status 1
Error: Input/Output
  no input files found
  dirPath: ./merge_reads/3D7-A_BC_GTAAGGAG-CGTACTAG_cpmp_merge.fastq.gz
  pattern: character(0)

Syntax 4:

>   merged_reads <-
+     marker_deplex %>%
+     mergeAmpliconReads(as.character(marker_deplex$FileR1), 
+                                         as.character(marker_deplex$FileR2),
+                                         outputDir = './merge_reads')
Error in mergeAmpliconReads(., as.character(marker_deplex$FileR1), as.character(marker_deplex$FileR2),  : 
  Vector length of fastqFileR1 and fastqFileR2 not identical

What could I try to merge paired ends?
Any help is appreciated, thank you.
My marker_table is available here: https://gist.github.com/APepey/054e40b9a10f4b25987dc73b3421e2f5

Here is my session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rvsearch_0.99.1             cowplot_1.1.1               dada2_1.21.0                Rcpp_1.0.7                  HaplotypR_0.3.3            
 [6] devtools_2.4.3              usethis_2.1.5               ShortRead_1.52.0            GenomicAlignments_1.30.0    SummarizedExperiment_1.24.0
[11] Biobase_2.54.0              MatrixGenerics_1.6.0        matrixStats_0.61.0          Rsamtools_2.10.0            GenomicRanges_1.46.1       
[16] Biostrings_2.62.0           GenomeInfoDb_1.30.0         XVector_0.34.0              IRanges_2.28.0              S4Vectors_0.32.3           
[21] BiocParallel_1.28.3         BiocGenerics_0.40.0         pheatmap_1.0.12             forcats_0.5.1               stringr_1.4.0              
[26] dplyr_1.0.7                 purrr_0.3.4                 tidyr_1.1.4                 tibble_3.1.6                ggplot2_3.3.5              
[31] tidyverse_1.3.1             readr_2.1.1                

loaded via a namespace (and not attached):
 [1] colorspace_2.0-2       hwriter_1.3.2          ellipsis_0.3.2         rprojroot_2.0.2        fs_1.5.2               rstudioapi_0.13       
 [7] farver_2.1.0           remotes_2.4.2          bit64_4.0.5            fansi_0.5.0            lubridate_1.8.0        xml2_1.3.3            
[13] cachem_1.0.6           knitr_1.37             pkgload_1.2.4          jsonlite_1.7.3         broom_0.7.11           dbplyr_2.1.1          
[19] png_0.1-7              compiler_4.1.2         httr_1.4.2             backports_1.4.1        assertthat_0.2.1       Matrix_1.4-0          
[25] fastmap_1.1.0          cli_3.1.0              htmltools_0.5.2        prettyunits_1.1.1      tools_4.1.2            gtable_0.3.0          
[31] glue_1.6.0             GenomeInfoDbData_1.2.7 reshape2_1.4.4         cellranger_1.1.0       vctrs_0.3.8            xfun_0.29             
[37] ps_1.6.0               testthat_3.1.1         rvest_1.0.2            lifecycle_1.0.1        zlibbioc_1.40.0        scales_1.1.1          
[43] vroom_1.5.7            hms_1.1.1              parallel_4.1.2         RColorBrewer_1.1-2     yaml_2.2.1             memoise_2.0.1         
[49] latticeExtra_0.6-29    stringi_1.7.6          desc_1.4.0             pkgbuild_1.3.1         rlang_0.4.12           pkgconfig_2.0.3       
[55] bitops_1.0-7           evaluate_0.14          lattice_0.20-45        labeling_0.4.2         bit_4.0.4              tidyselect_1.1.1      
[61] processx_3.5.2         plyr_1.8.6             magrittr_2.0.1         R6_2.5.1               snow_0.4-4             generics_0.1.1        
[67] DelayedArray_0.20.0    DBI_1.1.2              pillar_1.6.4           haven_2.4.3            withr_2.4.3            RCurl_1.98-1.5        
[73] modelr_0.1.8           crayon_1.4.2           utf8_1.2.2             tzdb_0.2.0             rmarkdown_2.11         jpeg_0.1-9            
[79] grid_4.1.2             readxl_1.3.1           callr_3.7.0            reprex_2.0.1           digest_0.6.29          RcppParallel_5.1.5    
[85] munsell_0.5.0          sessioninfo_1.2.2  

Thank you again!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions