Replies: 1 comment
Cc @DongzeHE!
Hello,
Thank you for your efforts in building this ecosystem! I'd like to try simpleaf on my dataset (previously analyzed using cellranger 8.0.1). I have a 10x GEM-X (3' V4) mouse dataset with GEX and HTO files. I have the following questions:
1. According to [1], the rlen parameter is based on the R2 read length for the chemistry. As far as I can see in [2], this would be 90 for 10X3'V4. However, in my FASTQ files, all reads (GEX+HTO) are 150 bp long rather than 90 (see question 4 for an example of R2). Do I have to adjust any settings because of this? For now, I have generated the GEX index via:

   simpleaf index --output $IDX_DIR --fasta /refdata/refdata-gex-GRCm39-2024-A/fasta/genome.fa --gtf /refdata/refdata-gex-GRCm39-2024-A/genes/genes.gtf --rlen 90 --threads 24 --use-piscem

2. What is the correct command to generate the index for the HTOs using piscem? I have two HTOs (Biolegend TotalSeq B) in the dataset. So far I have only found instructions for salmon [3,4], not for simpleaf+piscem. I'm also confused about when to use which whitelist (if any), and what kind of mapping is needed to reconcile the different GEX vs. HTO barcodes [4,5], if this is still needed today. Does cellranger count do this internally? I ask because the HTO and GEX cell barcodes match well in its output matrices.

3. The way I understand it, after indexing I have to run two quant commands, one for GEX and one for HTO. What would these commands look like today with simpleaf for my dataset? I'm assuming chemistry=10xv4-3p (even though the source code [6] says that this is identical to 10XV3, despite the different read length, 91 vs. 90 [2]), but I'm lost as to what to set for expected-ori, resolution, etc.

4. Specifically for the HTO quant step, I saw that, at least in the past, setting a custom chemistry was needed (1{b[16]u[10]x:}2{x[10]r[15]x:} [7]) if my R2 read for the barcode looks like this: GNGTGTTACA*CCTATGGACTTGGAC*TGTGCCCCCGCTTTAAGGCCGGTCCTAGCAACGACGACTGCCACTGCACAGATGGTTGCCTGTCTCTTATACACATCTGACGCTGCCGACGAACTTGTGTCAGTGTAGATCTCGGTGGTCGCCGT. Is this still the case with piscem today?

I realize that there is a template [7,8], but I'd like to use individual commands to get a better understanding of the process. Also, the barcode_translation link hardcoded in the template is no longer valid [9].
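To make clear how I am reading the custom geometry in question 4, here is a minimal Python sketch of the slicing I assume it describes: 16 bp cell barcode plus 10 bp UMI from R1, then 10 bp skipped and 15 bp kept from R2. The function name `split_by_geometry` and the R1 sequence are made up for illustration and are not part of simpleaf or alevin-fry; the R2 string is the first 40 bases of my read above with the asterisks removed.

```python
# Hypothetical helper illustrating 1{b[16]u[10]x:}2{x[10]r[15]x:}.
# This only mimics the slicing; it is not how simpleaf parses reads.

def split_by_geometry(r1: str, r2: str) -> tuple[str, str, str]:
    """Return (cell_barcode, umi, feature_seq) for one read pair."""
    cell_barcode = r1[0:16]   # b[16]: first 16 bases of R1
    umi = r1[16:26]           # u[10]: next 10 bases of R1
    feature_seq = r2[10:25]   # x[10] skipped, then r[15] from R2
    return cell_barcode, umi, feature_seq


# Placeholder R1 (invented): 16 bp barcode + 10 bp UMI + trailing bases.
r1_example = "ACGTACGTACGTACGT" + "TTTTGGGGCC" + "AA"
# First 40 bases of the R2 read quoted in question 4.
r2_example = "GNGTGTTACACCTATGGACTTGGACTGTGCCCCCGCTTTA"

barcode, umi, feature = split_by_geometry(r1_example, r2_example)
print(barcode, umi, feature)
```

If this reading is right, the 15 bases extracted from R2 are exactly the stretch I marked with asterisks, which is what I would expect to match against the HTO reference sequences.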
References
[1] https://combine-lab.github.io/alevin-fry-tutorials/2023/simpleaf-piscem/
[2] https://www.10xgenomics.com/support/single-cell-gene-expression/documentation/steps/sequencing/sequencing-requirements-for-single-cell-3
[3] https://divingintogeneticsandgenomics.com/post/how-to-use-salmon-alevin-to-preprocess-cite-seq-data/
[4] https://combine-lab.github.io/alevin-fry-tutorials/2021/af-feature-bc/
[5] COMBINE-lab/salmon#576
[6] Source code
[7] COMBINE-lab/alevin-fry#136
[8] https://github.com/COMBINE-lab/protocol-estuary/blob/main/protocols/10x-feature-barcode-antibody/10x-feature-barcode-antibody.jsonnet
[9] (dead) https://github.com/10XGenomics/cellranger/raw/master/lib/python/cellranger/barcodes/translation/3M-february-2018.txt.gz