cellgeni/h5ad-cli

h5ad CLI

A command-line tool for exploring huge .h5ad (AnnData) files without loading them fully into memory. It streams data directly from disk, so structure, metadata, and matrices can be inspected efficiently.

Features

  • info – Show file structure and dimensions (n_obs × n_var)
  • table – Export obs/var metadata to CSV with chunked streaming
  • subset – Filter h5ad files by cell/gene names (supports dense and sparse CSR/CSC matrices)
  • Memory-efficient chunked processing for large files
  • Rich terminal output with colors and progress bars
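The chunked streaming idea behind the table export can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual implementation: `export_in_chunks`, `read_rows`, and the toy `table` are hypothetical names, and a plain list stands in for the on-disk datasets that a real reader would slice.

```python
import csv
import io
from typing import Callable, Sequence


def export_in_chunks(
    read_rows: Callable[[int, int], Sequence[Sequence[object]]],
    n_rows: int,
    header: Sequence[str],
    out,
    chunk_rows: int = 1000,
) -> int:
    """Write n_rows rows to `out` as CSV, fetching at most chunk_rows at a time.

    Returns the number of chunks that were read.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    chunks = 0
    for start in range(0, n_rows, chunk_rows):
        stop = min(start + chunk_rows, n_rows)
        # Only rows [start, stop) are held in memory at this point.
        writer.writerows(read_rows(start, stop))
        chunks += 1
    return chunks


# Toy stand-in for an on-disk table (hypothetical data).
table = [(f"cell_{i}", "T cell" if i % 2 else "B cell") for i in range(5)]


def read_rows(start: int, stop: int):
    return table[start:stop]


buffer = io.StringIO()
n_chunks = export_in_chunks(
    read_rows, len(table), ["cell", "cell_type"], buffer, chunk_rows=2
)
```

Peak memory is bounded by the chunk size rather than by the number of rows, which is the trade-off a chunk-size option such as `--chunk-rows` exposes.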

Installation

uv sync

For development and testing:

uv sync --extra dev

See docs/TESTING.md for testing documentation.

Usage

Invoke any subcommand via uv run h5ad ...:

uv run h5ad --help

Examples

Inspect overall structure and axis sizes:

uv run h5ad info data.h5ad

Export full obs metadata to CSV:

uv run h5ad table data.h5ad --axis obs --out obs_metadata.csv

Export selected obs columns to stdout:

uv run h5ad table data.h5ad --axis obs --cols cell_type,donor

Export var metadata with custom chunk size:

uv run h5ad table data.h5ad --axis var --chunk-rows 5000 --out var_metadata.csv

Subset by cell names:

uv run h5ad subset input.h5ad output.h5ad --obs cells.txt
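A names file such as cells.txt could be derived from an exported obs CSV along these lines. This sketch rests on two assumptions that the README does not confirm: that the names file holds one name per line, and that the first CSV column holds the cell name. The helper `select_names`, the `obs_names` header, and the barcodes are hypothetical; check `uv run h5ad subset --help` for the format the tool actually expects.

```python
import csv
import io
from typing import List


def select_names(csv_text: str, column: str, wanted: str) -> List[str]:
    """Return first-column values of the rows whose `column` equals `wanted`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    col = header.index(column)
    # Assumption: the first column of the exported CSV is the cell name.
    return [row[0] for row in reader if row[col] == wanted]


# Hypothetical obs export, for illustration only.
obs_csv = (
    "obs_names,cell_type,donor\n"
    "AAAC-1,T cell,D1\n"
    "AAAG-1,B cell,D1\n"
    "AACT-1,T cell,D2\n"
)

names = select_names(obs_csv, "cell_type", "T cell")

# Assumption: the names file is plain text with one name per line.
names_file_text = "\n".join(names) + "\n"
```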

Subset by both cells and genes:

uv run h5ad subset input.h5ad output.h5ad --obs cells.txt --var genes.txt
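Subsetting a sparse CSR matrix by rows (cells) and columns (genes) can be illustrated with plain Python lists. This is a sketch of the general technique, not the tool's code: `subset_csr` is a hypothetical helper, the matrix is toy data, and `keep_cols` is assumed to be in ascending order so that column indices stay sorted within each row.

```python
from typing import List, Sequence, Tuple


def subset_csr(
    data: Sequence[float],
    indices: Sequence[int],
    indptr: Sequence[int],
    keep_rows: Sequence[int],
    keep_cols: Sequence[int],
) -> Tuple[List[float], List[int], List[int]]:
    """Return (data, indices, indptr) of the CSR submatrix for the kept rows/cols."""
    # Map each kept original column index to its position in the output.
    col_map = {old: new for new, old in enumerate(keep_cols)}
    new_data: List[float] = []
    new_indices: List[int] = []
    new_indptr: List[int] = [0]
    for row in keep_rows:
        # Only the stored entries of this one row are visited.
        for k in range(indptr[row], indptr[row + 1]):
            new_col = col_map.get(indices[k])
            if new_col is not None:
                new_data.append(data[k])
                new_indices.append(new_col)
        new_indptr.append(len(new_data))
    return new_data, new_indices, new_indptr


# Toy 3 x 4 matrix (hypothetical data):
# [[1, 0, 2, 0],
#  [0, 0, 3, 0],
#  [4, 5, 0, 6]]
data = [1, 2, 3, 4, 5, 6]
indices = [0, 2, 2, 0, 1, 3]
indptr = [0, 2, 3, 6]

# Keep rows 0 and 2 and columns 0 and 3, giving [[1, 0], [4, 6]].
sub_data, sub_indices, sub_indptr = subset_csr(data, indices, indptr, [0, 2], [0, 3])
```

Because each kept row is processed independently through its `indptr` slice, the same loop can run over rows read from disk one batch at a time instead of over a fully loaded matrix.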

All commands stream from disk, so they stay responsive even on multi-GB .h5ad files.
