SRST2 munges sample identifiers? #267

@hexylena

I'm looking at a run of PHAC's SRST2 wrapper:

/srv/galaxy/var/shed_tools/toolshed.g2.bx.psu.edu/repos/nml/srst2/e59fdf6145db/srst2/srst2.pl /data/galaxy/jobs/020/20188/outputs/dataset_cef3ef2e-e010-4768-b36b-1de43d89b28f.dat /data/galaxy/jobs/020/20188/outputs/dataset_582ba0b2-a2eb-4918-a8b2-9f300b6f212a.dat /data/galaxy/jobs/020/20188/outputs/dataset_d6eec7c7-a2c9-4bcd-a919-63b192c9935e.dat  g /data/galaxy/jobs/020/20188/outputs/dataset_14d30ba8-5d8d-4237-a672-243227aac1d7.dat /data/galaxy/jobs/020/20188/outputs/dataset_473b25c9-757c-4a56-8170-df8add14378b.dat \
"ResFinder.fasta,ARGannot_r2.fasta"  \
"SRX6855211_SRR10127028_1.fastq uncompressed" \
--input_pe "/data/galaxy/f/e/1/dataset_fe1245a6-287d-4732-af60-37021f7eaab1.dat" \
"/data/galaxy/5/6/e/dataset_56eedb18-918a-4c26-a5db-f3504dd763c2.dat" \
 --gene_db /data/galaxy/1/6/2/dataset_162c6169-c72f-4f07-a99d-dfd15075ab8e.dat /data/galaxy/5/6/d/dataset_56da7859-6beb-47cd-b701-3b9bcae0e427.dat \
--gene_max_mismatch 250  --read_type q  --save_scores  --other "'-p ${GALAXY_SLOTS:-1}'"  --output ${PWD}/out

(line breaks manually inserted for clarity)

However, the Sample column of the output table has only part of the sample identifier (SRX6855211, from the input name "SRX6855211_SRR10127028_1.fastq uncompressed"):

Column 1	Column 2	Column 3	Column 4	Column 5	Column 6	Column 7	Column 8	Column 9	Column 10	Column 11	Column 12	Column 13	Column 14
Sample	DB	gene	allele	coverage	depth	diffs	uncertainty	divergence	length	maxMAF	clusterid	seqid	annotation
SRX6855211	ARGannot_r2	AmpC2_Ecoli_Bla	AmpC2_346	99.735	86.838	22snp3indel		1.94	1134	0.091	99	346	no;no;AmpC2;Bla;CP002970;332756-333889;1134
SRX6855211	ARGannot_r2	MrdA_Bla	MrdA_836	100.0	66.382	25snp		1.314	1902	0.143	16	836	no;no;MrdA;Bla;CP002291;666340-664439;1902
SRX6855211	ARGannot_r2	MphA_MLS	MphA_1663	100.0	69.11			0.0	906	0.096	158	1663	no;no;MphA;MLS;KR091911;890-1795;906
SRX6855211	ARGannot_r2	CTX-M-9_Bla	CTX-M-27_109	100.0	60.304			0.0	876	0.078	190	109	no;no;CTX-M-27;Bla;AY156923;1-876;876

which means I can't match the results back up with their input sample once they've gone through hamronize (which I really love).
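To illustrate the symptom, here is a minimal sketch of one way this truncation could arise. It assumes the sample name is cut at the first underscore; that is a guess based only on the output above, not something I've verified against the wrapper or SRST2 itself. The helper names `truncated_sample_id` and `match_by_prefix` are hypothetical and only show how a prefix match could serve as a stopgap for pairing rows with inputs.

```python
def truncated_sample_id(input_name: str) -> str:
    """Hypothetical behaviour: keep only the text before the first underscore.

    This is an assumption about what happens to the identifier, not the
    actual SRST2 or wrapper code.
    """
    return input_name.split("_", 1)[0]


def match_by_prefix(reported_id: str, input_names: list[str]) -> list[str]:
    """Return every input name that starts with the reported (truncated) ID."""
    return [name for name in input_names if name.startswith(reported_id)]


if __name__ == "__main__":
    # Input name taken from the command line in this issue.
    input_name = "SRX6855211_SRR10127028_1.fastq uncompressed"

    # Under the first-underscore assumption this reproduces the Sample column.
    print(truncated_sample_id(input_name))

    # Stopgap: recover candidate inputs for a truncated ID.
    print(match_by_prefix("SRX6855211", [input_name]))
```

A prefix match like this is only unambiguous when no two inputs share the same leading accession, so it doesn't replace keeping the full identifier in the output.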
