
Extended Output #31

@alex-b-chase

Hi,

Thanks for developing this. I have a question about the output files: after reading the README, I am still unsure what output to expect.

I am using PE150 MiSeq reads and performed a SPAdes assembly (its output is *.scaffolds.fasta). The resulting genomes are highly fragmented but have >99% ANI to a reference genome that I previously sequenced to completion with PacBio (complete genome; 3.77 Mb). However, when I run AlignGraph, the output files confuse me.

One example:

Sequence_ID	Total_Contigs	Genome_length	Largest_Contig	n50	GC_Percent
Desert-2-3.extended	16	3335782	1108581	343422	70
Desert-2-3.remain	19	707214	201094	182646	71
Desert-2-3.scaffolds	73	3740981	508547	119510	71

It appears to me that I would need to concatenate the extended contigs file (*.extended.fa) with the remaining contigs file (*.remain.fa) to get the desired genome. Is that correct? Any clarification would be great.
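To make the question concrete, here is a minimal sketch of the concatenation being asked about. It is not something the AlignGraph README prescribes. The helper name `combine_fasta` and the `.combined.fa` file name are hypothetical. It assumes plain, uncompressed FASTA files with Unix line endings, each ending in a newline.

```shell
# Hypothetical helper (not part of AlignGraph): concatenate the extended and
# remaining contig files, then print "<contig count><TAB><total bases>".
# Assumes uncompressed FASTA with Unix line endings and a trailing newline.
combine_fasta() {
    extended="$1"
    remain="$2"
    combined="$3"

    # Plain concatenation; FASTA records are independent of each other.
    cat "$extended" "$remain" > "$combined"

    # Header lines start with '>'; every other line is sequence.
    awk '/^>/ { n++; next } { len += length($0) } END { printf "%d\t%d\n", n, len }' "$combined"
}

# Example call with placeholder file names:
# combine_fasta Desert-2-3.extended.fa Desert-2-3.remain.fa Desert-2-3.combined.fa
```

Going by the table above, such a combined file would hold 16 + 19 = 35 contigs totalling 3,335,782 + 707,214 = 4,042,996 bp, which is larger than both the 3.77 Mb reference and the original scaffolds.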

Here is the command I am using:

AlignGraph --read1 $OUTDIR/${mate}_R1_001.fasta --read2 $OUTDIR/${mate}_R2_001.fasta \
--fastMap --contig $genome.scaffolds.fasta --genome $REFGENOME \
--distanceLow 550 --distanceHigh 1550 \
--extendedContig $genome.extended.fa --remainingContig $genome.remain.fa 

Thank you!
