Merged
204 commits
7aa9235
add barcode calling, single file only
andrewprzh Sep 20, 2023
a13ab72
fix barcode calling execution
andrewprzh Sep 21, 2023
6ff47ad
split barcodes by chromosomes
andrewprzh Sep 21, 2023
4aada54
minor aggregator refactoring
andrewprzh Sep 21, 2023
528abd9
fix requirements
andrewprzh Sep 21, 2023
2d77be9
minor refactoring of UMI filtering
andrewprzh Sep 30, 2023
0645385
score for double barcode calling
andrewprzh Sep 30, 2023
30ace23
fix imports for now
andrewprzh Sep 30, 2023
b3f0356
load barcodes with score
andrewprzh Sep 30, 2023
acb03f7
proper usage of UMIFilter
andrewprzh Sep 30, 2023
895ea22
report some stats while filtering
andrewprzh Oct 1, 2023
901ca5a
fixes
andrewprzh Oct 1, 2023
12f5a40
fix naming
andrewprzh Oct 1, 2023
99ae089
add UMI filtering to isoquant pipeline
andrewprzh Oct 1, 2023
5123043
FIX BUG
andrewprzh Oct 1, 2023
1f6bfee
fix option
andrewprzh Oct 1, 2023
9ad59f3
fix stats
andrewprzh Oct 12, 2023
d537ee1
fix clearing
andrewprzh Oct 13, 2023
4891bda
fx unique
andrewprzh Oct 13, 2023
fca2b25
do not skip reads in stats
andrewprzh Oct 13, 2023
48e3b11
fix imports
andrewprzh Oct 23, 2023
52258b8
do not ignore non-unique in stats counting
andrewprzh Oct 23, 2023
2e7cc0b
report spliced reads
andrewprzh Nov 1, 2023
6bcc1d3
do all distances
andrewprzh Oct 23, 2023
ddde404
fix imports
andrewprzh Nov 14, 2023
d32d89a
skip barcode calling in the resumed run
andrewprzh Nov 21, 2023
396432f
fix --resume in sc mode
andrewprzh Nov 26, 2023
f9738c5
argv in main
andrewprzh Feb 2, 2024
c53c909
array based index
andrewprzh Feb 2, 2024
c3e8ed6
set default k=6
andrewprzh Feb 5, 2024
8d8bc8b
ignore secondary in barcode calling
andrewprzh Feb 7, 2024
8c51473
fx imports
andrewprzh Feb 7, 2024
92475d6
allow multiple files for barcode calling
andrewprzh Mar 6, 2024
0136f49
enhance allinfo
andrewprzh Mar 6, 2024
9576208
proper attribute output
andrewprzh Mar 13, 2024
fd554a9
Barcode calling: handle gzipped read assignment file, correct number …
almiheenko May 9, 2024
143390c
Add editdistance to requirements
almiheenko May 9, 2024
20081c7
thread barcoded reads properly
andrewprzh Jun 11, 2024
fd033e3
move barcode caller into root
andrewprzh Jul 23, 2024
4f14d20
simple read info
andrewprzh Jul 23, 2024
4ce915d
create barcode callers within each thread
andrewprzh Jul 23, 2024
441a0e5
log
andrewprzh Jul 23, 2024
1d688ba
read storage class
andrewprzh Jul 24, 2024
ab8a183
do not duplicate headers
andrewprzh Jul 24, 2024
46a981c
do not forget header in single thread
andrewprzh Jul 24, 2024
f013640
do not clear chunks
andrewprzh Jul 24, 2024
ec2b827
switch from non-lazy ProcessExecutorPool.map to self-made asynchronou…
andrewprzh Jul 26, 2024
f933aa9
run barcode calling in a separate process due to memory issues with GC
andrewprzh Jul 30, 2024
16e9577
cosmetics
andrewprzh Jul 30, 2024
72cb102
fx
andrewprzh Jul 30, 2024
a75ad91
comment explaining separate process for barcode calling
andrewprzh Jul 31, 2024
2b4489b
allow gzipped barcodes
andrewprzh Jul 31, 2024
2470b40
proper exception handling
andrewprzh Jul 31, 2024
15d3ce6
fix chunk size
andrewprzh Jul 31, 2024
8f1a82b
propose a more elegant version for the future
andrewprzh Aug 1, 2024
22148bc
experimental extraction, if the last bases of primer are not aligned,…
andrewprzh Oct 29, 2024
09de600
fix delta 0
andrewprzh Nov 4, 2024
626a769
use annotated polyA sites for reported polyA sites within 50 bp
andrewprzh Nov 26, 2024
6016336
stereo
andrewprzh Dec 9, 2024
e606b34
fix
andrewprzh Dec 10, 2024
2e34027
count multiple linkers in stereo mode
andrewprzh Dec 10, 2024
bc68e12
fix stat
andrewprzh Dec 11, 2024
a03022f
fix
andrewprzh Dec 11, 2024
3a225e2
report empty
andrewprzh Dec 11, 2024
ab32523
more stats
andrewprzh Dec 13, 2024
ade0103
fix iterative linker search
andrewprzh Dec 13, 2024
9493dfa
load h5 barcodes
andrewprzh Dec 23, 2024
57da65a
fix decoding
andrewprzh Dec 24, 2024
0e7561c
improve h5 loading
andrewprzh Dec 24, 2024
ce64cbe
2bit indexing
andrewprzh Dec 27, 2024
bf7758c
implement stereo barcode calling
andrewprzh Dec 27, 2024
64eb5fa
old function
andrewprzh Dec 27, 2024
37266a2
2bit indexing for stereo
andrewprzh Dec 27, 2024
4ec6527
fx import
andrewprzh Dec 27, 2024
6f8a207
do not output large UMIs
andrewprzh Dec 27, 2024
89b1d02
fix barcode extraction and score
andrewprzh Dec 27, 2024
8528f92
try to reduce the number of SW calls
andrewprzh Dec 28, 2024
f0fbc9a
do not scan read multiple times for now
andrewprzh Dec 28, 2024
a87168c
fix minscore logic
andrewprzh Dec 28, 2024
d183fb0
minor fixes for primer
andrewprzh Dec 28, 2024
d1267eb
fix output
andrewprzh Dec 28, 2024
8eb3f13
use untrusted UMIs but only when trusted are not present
andrewprzh Dec 29, 2024
16768ef
add stereo mode
andrewprzh Dec 29, 2024
3507b1e
include stereo mode, improve output naming
andrewprzh Dec 29, 2024
903dd9f
use k=14 for stereo
andrewprzh Dec 29, 2024
1611e9d
read de-concatenation with TSO search
andrewprzh Feb 18, 2025
62d452a
draft for splitting barcode caller
andrewprzh Feb 19, 2025
ca9296a
improve printing
andrewprzh Feb 19, 2025
b137d6e
use multiple files for umi filtering
andrewprzh Feb 24, 2025
909d489
fix .gz
andrewprzh Feb 26, 2025
d459844
fix untrusted UMI filtering
andrewprzh Feb 28, 2025
cc8703b
one more filtering fix
andrewprzh Mar 4, 2025
524d0bd
properly deconcatenate reads, do not print non-informative results
andrewprzh Mar 5, 2025
660e883
debug mode
andrewprzh Mar 5, 2025
627fdcb
fix non splitting stereo barcode detector
andrewprzh Mar 5, 2025
082bcd0
print *, relax ED cutoffs, try to speed up some searches in bigger st…
andrewprzh Mar 6, 2025
67535c3
output FASTA in a proper way
andrewprzh Mar 6, 2025
7b18624
stats
andrewprzh Mar 6, 2025
27b7687
ensure the step is always made
andrewprzh Mar 7, 2025
7930ee8
print empty result
andrewprzh Mar 10, 2025
8339f16
embed read splitting into the pipeline
andrewprzh Mar 10, 2025
a5256db
umi ed for different modes
andrewprzh Mar 10, 2025
1140686
factor out assignment loading
andrewprzh Mar 11, 2025
b802b6a
properly process ambiguous reads
andrewprzh Nov 27, 2024
1b16297
refactor umi deduplication, use raw assignments instead of parsing TS…
andrewprzh Mar 12, 2025
f9e4f15
perform umi filtering right after the assignment step
andrewprzh Mar 12, 2025
a95be60
save umi-filtered reads to a save file
andrewprzh Mar 13, 2025
62e5d86
use umi filtered reads from counting and construction
andrewprzh Mar 14, 2025
32d4a9b
print more stats on duplicated molecules
andrewprzh Mar 14, 2025
178714f
back to normal indexing
andrewprzh Mar 14, 2025
805ad96
output only sequences with TSO when splitted, more stats
andrewprzh Mar 17, 2025
6c49cfc
try to speedup tso and linker look ups
andrewprzh Mar 17, 2025
07cc492
fix split 0
andrewprzh Mar 17, 2025
8b233c3
fix None as assigned gene in allinfo files
andrewprzh Apr 1, 2025
8a8a6df
update requirements
andrewprzh May 30, 2025
56d2e8c
fix tenx caller
andrewprzh Aug 1, 2025
d13c47f
remove min score
andrewprzh Aug 1, 2025
5123e24
fix imports for umi filtering
andrewprzh Aug 4, 2025
af6328c
single array index
andrewprzh Aug 5, 2025
93a57e0
visium hd caller
andrewprzh Aug 21, 2025
3b3faee
polyA dirty fix
andrewprzh Aug 23, 2025
a50eda1
use iterators instead of list, split barcodes and other performance o…
andrewprzh Aug 22, 2025
69fb145
fix modes
andrewprzh Sep 16, 2025
205c467
shared memory index implementation
andrewprzh Sep 16, 2025
2459d92
unify modes and barcode calling
andrewprzh Sep 17, 2025
e6b5205
fix curio barcode detection
andrewprzh Sep 17, 2025
a0af4e3
fix messages for stereo
andrewprzh Sep 18, 2025
6a8bf5f
fix args check
andrewprzh Sep 18, 2025
0b28321
implement shared memory index in a nice way
andrewprzh Sep 18, 2025
9b2df98
unify single and multiple threads, minor refactoring and simplification
andrewprzh Sep 19, 2025
f16449e
use output prefix, create outdir automatically
andrewprzh Sep 20, 2025
5255a55
umi filtering parallel version
andrewprzh Sep 18, 2025
6d90d39
fix gene barcode pair stats, simplify check
andrewprzh Sep 22, 2025
21702ea
fix resume logic
andrewprzh Sep 22, 2025
6e8b846
stereoseq no split parallel version, use usual k-mer indexer for low b…
andrewprzh Sep 22, 2025
34d80e0
fix dir creation
andrewprzh Sep 27, 2025
e041be7
fix after rebase, fix naming
andrewprzh Oct 13, 2025
2bffed8
more tests
andrewprzh Oct 14, 2025
80bdcd3
even more tests
andrewprzh Oct 14, 2025
be870e3
count ED in addition to score for visium calling
andrewprzh Oct 18, 2025
9fdd862
fix print
andrewprzh Oct 22, 2025
2bebfb8
add check_sq to process unmapped bams
andrewprzh Oct 22, 2025
6b4a15e
remove print
andrewprzh Oct 22, 2025
07181a8
fix visium
andrewprzh Oct 30, 2025
af53fdc
minor usability fixes
andrewprzh Oct 31, 2025
500592b
fix typo
andrewprzh Oct 31, 2025
b38129b
use score difference for non-exact matches
andrewprzh Oct 31, 2025
32fc222
fix second best score keeping
andrewprzh Nov 3, 2025
60c1c51
fix None
andrewprzh Nov 3, 2025
7455d01
fix clean up and allinfo merging
andrewprzh Nov 4, 2025
02124cd
--no_large_files
andrewprzh Nov 4, 2025
41faf09
fix r2t
andrewprzh Nov 4, 2025
a2ffd74
fix no_large_files and lock files
andrewprzh Nov 12, 2025
e10cf0a
merge large files only when needed
andrewprzh Nov 13, 2025
626cc39
proper --resume and sc pipeline interaction
andrewprzh Nov 14, 2025
1f7525f
add --barcode2spot option
andrewprzh Nov 14, 2025
88580d8
do not split regions united by genes
andrewprzh Nov 21, 2025
5252e58
merge consecutive genes
andrewprzh Nov 22, 2025
bb0437d
refactor various binary read assignment loader, implement merging assi…
andrewprzh Nov 23, 2025
4a843d2
fix unique gene-barcode pairs
andrewprzh Nov 23, 2025
8617a4d
sort molecules before detecting similar UMIs to avoid weird clustering
andrewprzh Nov 27, 2025
ebadfe1
use umi length when prioritizing UMIs
andrewprzh Nov 27, 2025
f96c61c
fix diff function
andrewprzh Nov 28, 2025
030ab7d
use actual read count for sorting
andrewprzh Nov 28, 2025
a076e7d
filter out ambiguous gene assignments
andrewprzh Dec 1, 2025
4895aca
fix empty set of filtered reads
andrewprzh Dec 5, 2025
7903661
parallel barcode table split
andrewprzh Dec 15, 2025
8627917
Support for multiple read groups
andrewprzh Dec 17, 2025
51dd373
Fix BasicReadAssignment deserialization for list read_group
andrewprzh Dec 17, 2025
a03d5c7
Use read_group[0] in counters (temporary fix)
andrewprzh Dec 17, 2025
082ad51
Join read_group list in TSV output
andrewprzh Dec 17, 2025
e240a3c
use first group in model construction for now
andrewprzh Dec 17, 2025
aaa03d3
Implement proper multi-group counters with strategy-based naming
andrewprzh Dec 18, 2025
77a7413
Fix references to grouped counters (now lists)
andrewprzh Dec 18, 2025
f76b9e4
Update test to expect new grouped file naming format
andrewprzh Dec 18, 2025
b27ca41
Fix multi-group support in GraphBasedModelConstructor
andrewprzh Dec 18, 2025
3c00e13
Fix technical replicas check to only use file_name group
andrewprzh Dec 18, 2025
01675b4
fix tests
andrewprzh Dec 18, 2025
bb9436a
Auto-add file_name grouping when multiple files are present
andrewprzh Dec 18, 2025
1a9c4e7
set use_technical_replicas later
andrewprzh Dec 18, 2025
84c036a
Make use_technical_replicas a per-sample property
andrewprzh Dec 18, 2025
8e16607
cosmetics
andrewprzh Dec 18, 2025
c0175ab
Implement improved parallel table splitting algorithm
andrewprzh Dec 18, 2025
2845c65
Replace chunked loading with line-by-line streaming for better effici…
andrewprzh Dec 18, 2025
f050c12
remove unused code
andrewprzh Dec 18, 2025
9dfb862
Add barcode/umi to ReadAssignment to eliminate redundant loading
andrewprzh Dec 18, 2025
6d09942
Add barcode_spot grouping to group reads by cell type/spot
andrewprzh Dec 18, 2025
9f03d7a
fix balancing
andrewprzh Dec 18, 2025
23bebd2
fix serialization and barcode split
andrewprzh Dec 19, 2025
c635583
refactor umi filtering, add type hints and remove unused code
andrewprzh Dec 21, 2025
bca7d04
use barcodes from read assignments
andrewprzh Dec 22, 2025
0b27351
fix stats
andrewprzh Dec 22, 2025
8d66462
TODOs
andrewprzh Dec 22, 2025
e2bfaf5
remove ReadAssignmentInfo, use ReadAssignment.additional_attributes
andrewprzh Dec 22, 2025
f47d3cd
remove unused barcode loading functions
andrewprzh Dec 22, 2025
2579bca
document single-cell/spatial options, update read_group
andrewprzh Dec 22, 2025
6db65f2
more TODOs
andrewprzh Dec 22, 2025
ef2f52e
fix imports
andrewprzh Dec 22, 2025
728edb6
add stereo toy data test
andrewprzh Dec 22, 2025
732357a
add unit tests for UMI filtering and read groups
andrewprzh Dec 22, 2025
f9b7f72
fix test imports and class names
andrewprzh Dec 22, 2025
c06fdb8
add tests and comments
andrewprzh Dec 22, 2025
ba74e4e
fix some tests
andrewprzh Dec 22, 2025
dd0b51b
fix test signatures and remove duplicate functions
andrewprzh Dec 22, 2025
31 changes: 31 additions & 0 deletions .github/workflows/Short_runs.yml
@@ -0,0 +1,31 @@
name: Short runs

on:
  workflow_dispatch:
  push:
    branches: [ "master", "sc_*" ]
  pull_request:
    branches: [ "master" ]

permissions:
  contents: read

jobs:
  integration-tests:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python 3.8
      uses: actions/setup-python@v3
      with:
        python-version: "3.8"
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pytest
        sudo apt-get install -y minimap2 samtools
        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    - name: Run integration tests
      run: |
        pytest tests/console_test.py -v
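The push trigger in this workflow lists the branch filters "master" and "sc_*". As a rough illustration of which branch names those filters select, the sketch below mimics the documented behaviour that a single `*` matches any run of characters except `/`. The helper names are hypothetical, and only literal characters and single `*` wildcards are handled; the other GitHub filter syntax (`**`, `?`, `!`, character classes) is deliberately left out.

```python
import re

# Branch filters copied from the push trigger of the workflow above.
PUSH_BRANCHES = ["master", "sc_*"]


def branch_matches(pattern: str, branch: str) -> bool:
    """Return True if `branch` matches a filter made of literals and single `*`.

    Assumption: `*` matches zero or more characters other than `/`.
    """
    regex = "".join("[^/]*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, branch) is not None


def push_triggers(branch: str) -> bool:
    """Would a push to `branch` be selected by any of the filters?"""
    return any(branch_matches(pattern, branch) for pattern in PUSH_BRANCHES)
```

Under these assumptions a push to `sc_dev` is selected, while `sc_dev/feature` and `develop` are not.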
44 changes: 44 additions & 0 deletions .github/workflows/Stereo_toy.yml
@@ -0,0 +1,44 @@
name: Stereo toy data test

on:
  workflow_dispatch:
  schedule:
  - cron: '0 2 * * 0,4'

env:
  RUN_NAME: Stereo_toy
  LAUNCHER: ${{github.workspace}}/tests/github/run_pipeline.py
  CFG_DIR: /abga/work/andreyp/ci_isoquant/data
  BIN_PATH: /abga/work/andreyp/ci_isoquant/bin/
  OUTPUT_BASE: /abga/work/andreyp/ci_isoquant/output/${{github.ref_name}}/

concurrency:
  group: ${{github.workflow}}
  cancel-in-progress: false

jobs:
  launch-runner:
    runs-on:
      labels: [isoquant]
    name: 'Running IsoQuant and QC'

    steps:
      - name: 'Cleanup'
        run: >
          set -e &&
          shopt -s dotglob &&
          rm -rf *

      - name: 'Checkout'
        uses: actions/checkout@v3
        with:
          fetch-depth: 1

      - name: 'Stereo toy data test'
        if: always()
        shell: bash
        env:
          STEP_NAME: STEREO.TOY
        run: |
          export PATH=$PATH:${{env.BIN_PATH}}
          python3 ${{env.LAUNCHER}} ${{env.CFG_DIR}}/${{env.STEP_NAME}}.cfg -o ${{env.OUTPUT_BASE}}
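The schedule in this workflow uses the cron expression '0 2 * * 0,4', that is, minute 0 of hour 2 on weekdays 0 (Sunday) and 4 (Thursday); GitHub Actions evaluates schedules in UTC. The sketch below hard-codes just that one expression to list upcoming run times. It is not a general cron parser, and the function and constant names are made up for this example.

```python
from datetime import datetime, timedelta, timezone
from typing import List

# Fields of the cron expression '0 2 * * 0,4' from the workflow above.
CRON_MINUTE = 0
CRON_HOUR = 2
CRON_WEEKDAYS = {0, 4}  # cron numbering: 0 = Sunday, 4 = Thursday


def cron_weekday(moment: datetime) -> int:
    """Convert datetime.weekday() (Monday = 0) to cron numbering (Sunday = 0)."""
    return (moment.weekday() + 1) % 7


def next_runs(after: datetime, count: int = 3) -> List[datetime]:
    """Return the next `count` scheduled times strictly after `after` (UTC)."""
    after = after.astimezone(timezone.utc)
    candidate = after.replace(hour=CRON_HOUR, minute=CRON_MINUTE,
                              second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)  # today's slot has already passed
    runs = []
    while len(runs) < count:
        if cron_weekday(candidate) in CRON_WEEKDAYS:
            runs.append(candidate)
        candidate += timedelta(days=1)
    return runs
```

For instance, starting from Monday 22 December 2025 at 12:00 UTC, the first scheduled run would fall on Thursday 25 December at 02:00 UTC.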
3 changes: 3 additions & 0 deletions .gitignore
@@ -104,3 +104,6 @@ venv.bak/
.mypy_cache/

.idea

# Claude Code documentation (private)
.claude/
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-3.10.0
+3.10.0