Releases: MorrellLAB/sequence_handling

Release v3.0.0: SNP calling with GATK 4.1 includes Slurm compatibility

02 Jun 19:27

This release includes the following changes.

Slurm workload manager is supported for all handlers.

GATK v4.1.2 on the Slurm queueing system is supported for the following handlers:

  • Haplotype_Caller
  • Genomics_DB_Import (new handler; combines GVCF files before the Genotype_GVCFs handler runs)
  • Genotype_GVCFs
  • Create_HC_Subset (preparation steps for GATK Variant Recalibrator)
  • Variant_Recalibrator

GATK v4.1.2 on PBS queueing systems is supported for the following handlers:

  • Haplotype_Caller
  • Genotype_GVCFs
  • Variant_Filtering

Additional changes:

  • VCF annotation visualization to assist filtering has also been added.
  • A Jupyter Notebook template for exploring VCF files prior to the variant recalibration/filtering steps is now available in the HelperScripts directory.
  • Realigner_Target_Creator and Indel_Realigner have been separated from the main pipeline because indel realignment is only available in GATK 3 and earlier, but is still needed for other downstream tools. To run the indel realignment steps, fill out Config_Indel_Realign.
  • The main Config file has been updated to match the handler changes, and a few new variables have been added.
  • Haplotype_Caller, Genomics_DB_Import, and Genotype_GVCFs now handle parallelizing across regions using job arrays.
  • You can now re-run specific job array indices by passing an optional -t custom_array_indices argument on the command line, instead of re-creating your sample list for failed or aborted jobs. For example:
./sequence_handling SAM_Processing /path/to/config -t 1-5,10,12

Without the -t flag, sequence_handling runs all samples in your list, so the original invocation still works:
./sequence_handling SAM_Processing /path/to/config
This works for any handler that uses job arrays.
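
To make the index syntax concrete, here is a minimal bash sketch of how a spec such as 1-5,10,12 expands into individual array indices. The function name expand_array_indices is hypothetical and is not taken from sequence_handling; the pipeline may simply pass the string through to the scheduler.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sequence_handling): expand an index spec
# such as "1-5,10,12" into one array index per line.
expand_array_indices() {
    local spec="$1" part start end i
    local -a parts
    # Split the spec on commas into individual entries.
    IFS=',' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        if [[ "$part" =~ ^([0-9]+)-([0-9]+)$ ]]; then
            # A range "start-end": print every index in it, inclusive.
            start="${BASH_REMATCH[1]}"
            end="${BASH_REMATCH[2]}"
            for ((i = start; i <= end; i++)); do
                echo "$i"
            done
        elif [[ "$part" =~ ^[0-9]+$ ]]; then
            # A single index.
            echo "$part"
        else
            echo "Invalid array index: $part" >&2
            return 1
        fi
    done
}
```

Calling expand_array_indices "1-5,10,12" prints the seven indices 1 through 5, 10, and 12, one per line. Slurm's sbatch --array option accepts the same comma-and-range syntax directly.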

  • Create_HC_Subset can now handle very large VCF files (>1 TB) in a reasonable manner.
  • Variant_Recalibrator now has additional features:
    • Can specify a recalibration "mode" to recalibrate both indels and SNPs, indels only, or SNPs only
    • Allows specification of a custom set of annotations in the config file
    • Allows specification of additional options/flags to include
    • Allows more control over setting resource datasets as known, training, or truth sets
    • Automatically indexes the raw VCF file and resource files if they are not already indexed
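
As an illustration of the automatic indexing check, the following bash sketch tests whether a VCF already has an index file beside it. It is not the pipeline's code: the function name vcf_needs_index is hypothetical, and it assumes the common conventions of a .idx file next to an uncompressed VCF and a .tbi file next to a bgzipped one (other index types, such as .csi, are not considered).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sequence_handling): exit 0 when the given
# VCF has no index next to it, exit 1 when an index already exists.
# Assumes ".idx" for plain VCFs and ".tbi" for bgzipped VCFs.
vcf_needs_index() {
    local vcf="$1"
    case "$vcf" in
        *.vcf.gz) [[ ! -e "${vcf}.tbi" ]] ;;  # bgzipped VCF: look for a tabix index
        *.vcf)    [[ ! -e "${vcf}.idx" ]] ;;  # plain VCF: look for a .idx index
        *)
            echo "Not a VCF file: $vcf" >&2
            return 2
            ;;
    esac
}
```

A caller could run its indexing command only when vcf_needs_index succeeds, so files that are already indexed are left untouched.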

Release v2.1.0: Last supported GATK 3.8 version.

18 Jun 18:55
063221f

This is the most complete version to use with GATK 3.8.

Release v2.0: SNP calling with GATK 3.8

01 Jun 21:44

The sequence_handling wiki is fully up to date with this release.

This release adds the following handlers:

  • Haplotype_Caller
  • Genotype_GVCFs
  • Create_HC_Subset
  • Variant_Recalibrator
  • Variant_Filtering
  • Variant_Analysis
  • Realigner_Target_Creator
  • Indel_Realigner

10x Genomics linked reads and Nanopore long reads processing support is planned for future versions.

Release v1.0: FastQ to BAM pipeline

04 Aug 17:24

The sequence_handling wiki is fully up to date with this release.

This release includes the following functional handlers:

  • Quality_Assessment
  • Adapter_Trimming
  • Quality_Trimming
  • Read_Mapping
  • SAM_Processing
  • Coverage_Mapping

This release also includes nonfunctional code for the following:

  • GBS_Demultiplexing
  • Coverage_Mapping plots with R