@louisegrimble
Collaborator

Description

- Updated the scanpy command to read in any h5 file
- Added validation for the h5 file
- Added a new input_utils function, proces_h5_file()
- Made the corresponding changes to merge_h5ad so that it reads in the same metadata file

Fixes #198

Some examples of the validation working as expected:

$ ./solosis-cli scrna scanpy --metadata ../../scanpy_cb.csv
  SOLOSIS    ~  version 0.4.2
INFO: Initialized with execution_uid: 92e8ae32-6c34-46a1-af35-64104b8f4e79
WARNING: Metadata file ../../scanpy_cb.csv is missing required columns: h5_path
ERROR: No valid samples provided. Use --metadata
Aborted!
lg28@farm22-head2:~/repos/solosis$ nano ../../scanpy_cb.csv 
lg28@farm22-head2:~/repos/solosis$ ./solosis-cli scrna scanpy --metadata ../../scanpy_cb.csv
  SOLOSIS    ~  version 0.4.2
INFO: Initialized with execution_uid: 9229ccfe-1629-4763-8480-3757917ab09e
WARNING: Invalid h5_path (must end with .h5): /lustre/scratch124/cellgen/haniffa/data/samples/HCA_SkO14189565/cellbender/HCA_SkO14189565_lg28_20251003_filtered
ERROR: No valid samples provided. Use --metadata
Aborted!
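
For readers who want to see the shape of these checks, here is a minimal sketch of the two validations shown in the output above. It is not the code in this PR: the function name validate_h5_metadata, the logger name, and the assumption that h5_path is the only required column are all hypothetical. Only the two warning messages are taken from the log.

```python
import csv
import logging

logger = logging.getLogger("solosis_sketch")  # hypothetical logger name

# Assumption: h5_path is the only required column (the log names no others).
REQUIRED_COLUMNS = {"h5_path"}


def validate_h5_metadata(metadata_path):
    """Return the metadata rows whose h5_path ends with .h5 (hypothetical helper)."""
    with open(metadata_path, newline="") as handle:
        reader = csv.DictReader(handle)

        # Check 1: the header must contain every required column.
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            logger.warning(
                "Metadata file %s is missing required columns: %s",
                metadata_path,
                ", ".join(sorted(missing)),
            )
            return []

        # Check 2: each h5_path must end with .h5; other rows are skipped.
        valid_rows = []
        for row in reader:
            h5_path = (row["h5_path"] or "").strip()
            if not h5_path.endswith(".h5"):
                logger.warning("Invalid h5_path (must end with .h5): %s", h5_path)
                continue
            valid_rows.append({**row, "h5_path": h5_path})
    return valid_rows
```

In this sketch an empty return value corresponds to the "No valid samples provided" case above, where the caller would log the error and abort. The sketch checks only the column name and the file extension; it does not check that the file exists.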

Type of change

  • Documentation (non-breaking change that adds or improves the documentation)
  • New feature (non-breaking change which adds functionality)
  • Optimization (non-breaking, back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (whatever its nature)

Checklist

  • All tests pass (eg. pytest)
  • Pre-commit hooks run successfully (eg. pre-commit run --all-files)

@codecov

codecov bot commented Oct 7, 2025

Codecov Report

❌ Patch coverage is 35.65891% with 166 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.80%. Comparing base (0ee1898) to head (5f4b5c4).
⚠️ Report is 4 commits behind head on main.

Files with missing lines                          Patch %   Lines
solosis/commands/scrna/merge_h5ad.py              29.82%    40 Missing ⚠️
solosis/commands/scrna/scanpy.py                  28.57%    40 Missing ⚠️
solosis/utils/input_utils.py                      25.49%    38 Missing ⚠️
solosis/commands/farm/single_job.py               40.74%    16 Missing ⚠️
solosis/commands/irods/iget_cellranger.py         15.78%    16 Missing ⚠️
solosis/commands/farm/run_notebook.py             44.00%    14 Missing ⚠️
solosis/commands/alignment/cellranger_count.py    80.00%    1 Missing ⚠️
solosis/utils/lsf_utils.py                        90.00%    1 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #199      +/-   ##
==========================================
- Coverage   63.42%   62.80%   -0.63%     
==========================================
  Files          28       22       -6     
  Lines        1277     1285       +8     
==========================================
- Hits          810      807       -3     
- Misses        467      478      +11     

@louisegrimble louisegrimble marked this pull request as ready for review October 7, 2025 12:09
@louisegrimble louisegrimble changed the base branch from main to dev October 7, 2025 12:10
@louisegrimble
Collaborator Author

@barbaratpferreira Hi Barbara, this PR changes how h5 files are read into the scanpy notebook. When we initially received the command and notebook from Vijay, the notebook specifically read in the filtered_feature_bc_matrix.h5 file. After further discussion with Vijay, we agreed that it should instead handle any h5 file (e.g. from cellbender). Is this also how you are constructing the new scanpy notebook, or are you expecting the filtered_feature_bc_matrix.h5 file to be read into the notebook?

@louisegrimble
Collaborator Author

Adding @keerthi-priya-c to this conversation

@keerthi-priya-c

> Adding @keerthi-priya-c to this conversation

Hi @louisegrimble! I agree that it should take any h5 file submitted by the user, i.e. from cellranger, cellbender, or SoupX. For context, I was testing the notebook with an h5 file from a cellranger output folder pulled in from irods using solosis, which is why the current code uses "filtered_feature_bc_matrix.h5". The code can be adapted to take in an h5 file in general (e.g. from cellbender, cellranger, or SoupX). I am happy to make this change, or if you are already looking into it, please feel free to adapt the code to take in any h5 file.

Development

Successfully merging this pull request may close these issues.

[Bug] scanpy reads in cellranger filtered_feature_bc_matrix.h5 instead of cellbender h5 file

4 participants