
Compatibility with inDrops data? #33

@kfontanez

Hello,

I am attempting to use Starcode-UMI with data produced by inDrops, which has the read structure: cell barcode [8-12 bp] - fixed sequence [22 bp] - cell barcode [8 bp] - UMI [6 bp] - poly-T.

I am able to cluster the UMI portion with --umi-len set to 14 bases, but whatever value I give --seq-trim, the program hangs at the sequence clustering step of the pipeline. I tried trimming every base after the first 14, so that sequence clustering would have zero bases to work with, and I also tried trimming nothing. In both cases the program hangs at sequence clustering (I left it running for over 14 hours with no progress). I'm running with 32 virtual cores and 64 GB of RAM, so I don't think it's a memory issue.

Here is what I ran:
./starcode-umi --umi-len 14 --umi-threads 8 --seq-threads 8 --umi-cluster s --seq-cluster s --umi-d 2 --seq-d 2 --seq-trim 15 ~/path/to/file/filename_R1.fastq

Here is the structure of the input sequence which is 51 bases long:
TGACANTACTTGAGTGATTGCTTGTGACGCCTTAGTCCCTTCTTTTATTTT
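
To make the layout concrete, here is a minimal Python sketch that splits the read above into its parts. It is only an illustration and not part of starcode: the function name `split_indrops_read` is hypothetical, and the fixed 22-base sequence is an assumption read off the example read itself rather than taken from any reference.

```python
# Assumption: the fixed 22-base sequence, read off the example read above.
FIXED = "GAGTGATTGCTTGTGACGCCTT"


def split_indrops_read(read, fixed=FIXED, bc2_len=8, umi_len=6):
    """Split a read into barcode 1, fixed sequence, barcode 2, UMI and tail.

    Returns None if the fixed sequence is missing, if the first barcode is
    not 8-12 bases long, or if the read is too short to contain the UMI.
    """
    start = read.find(fixed)  # index of the first occurrence, or -1
    if start == -1 or not 8 <= start <= 12:
        return None
    bc2_start = start + len(fixed)
    umi_start = bc2_start + bc2_len
    umi_end = umi_start + umi_len
    if len(read) < umi_end:
        return None
    return {
        "barcode1": read[:start],
        "fixed": read[start:bc2_start],
        "barcode2": read[bc2_start:umi_start],
        "umi": read[umi_start:umi_end],
        "tail": read[umi_end:],
    }


read = "TGACANTACTTGAGTGATTGCTTGTGACGCCTTAGTCCCTTCTTTTATTTT"
parts = split_indrops_read(read)
for name, segment in parts.items():
    print(f"{name:9s} {len(segment):2d} {segment}")
```

Because the first barcode is 8 to 12 bases long, the positions of the second barcode and the UMI shift from read to read, so this sketch locates them relative to the fixed sequence rather than by absolute offset.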

I get several thousand UMI clusters, but the program then hangs at sequence clustering with a cluster of size 1 that never increases.

Has starcode ever been tested with inDrops data that has this cell barcode structure?

Thank you.
