|
1 | 1 | # pyXenium |
2 | 2 |
|
3 | | -A toy Python package for analyzing 10x Xenium data. |
| 3 | +Utilities for loading and analyzing 10x Genomics Xenium exports in Python. |
4 | 4 |
|
5 | | -## Installation |
| 5 | +## Quickstart |
| 6 | + |
| 7 | +Install: |
6 | 8 |
|
7 | 9 | ```bash |
8 | | -pip install -U "pyXenium>=0.2.0" |
9 | | -pip install "git+https://github.com/hutaobo/pyXenium.git@main" |
| 10 | +pip install pyXenium |
10 | 11 | ``` |
11 | 12 |
|
12 | | -## Quickstart |
13 | | - |
14 | | -### Load a partial Xenium dataset from Hugging Face |
15 | | - |
16 | | -The snippet below uses the **public demo dataset** and the **v2 loader** that supports `base_url`. |
| 13 | +Load a dataset hosted online (e.g. Hugging Face). **By default, `load_anndata_from_partial` looks for the 10x MEX triplet under `<base>/cell_feature_matrix/`**: |
17 | 14 |
|
18 | | -<!-- START load_partial_example --> |
19 | 15 | ```python |
20 | 16 | from pyXenium.io.partial_xenium_loader import load_anndata_from_partial |
21 | 17 |
|
22 | 18 | BASE = "https://huggingface.co/datasets/hutaobo/pyxenium-gsm9116572/resolve/main" |
23 | 19 |
|
24 | 20 | adata = load_anndata_from_partial( |
25 | 21 | base_url=BASE, |
26 | | - analysis_name="analysis.zarr.zip", |
27 | | - cells_name="cells.zarr.zip", |
28 | | - transcripts_name="transcripts.zarr.zip", |
29 | | - # Optional: if you uploaded a 10x MEX triplet under BASE/mex/ |
30 | | - # mex_dir=BASE + "/mex", |
31 | | - # mex_matrix_name="matrix.mtx.gz", |
32 | | - # mex_features_name="features.tsv.gz", |
33 | | - # mex_barcodes_name="barcodes.tsv.gz", |
34 | | - build_counts_if_missing=True, |
| 22 | + analysis_name="analysis.zarr.zip", # optional, attaches clusters if present |
| 23 | + cells_name="cells.zarr.zip", # optional, attaches spatial centroids if present |
| 24 | + # By default it will read MEX from: BASE + "/cell_feature_matrix/" |
35 | 25 | ) |
36 | 26 | print(adata) |
37 | 27 | ``` |
38 | | -<!-- END load_partial_example --> |
39 | | - |
40 | | -> **Note:** Requires `pyXenium>=0.2.0`. |
41 | | -> The demo dataset is hosted at: |
42 | | -> - Hugging Face Datasets: [hutaobo/pyxenium-gsm9116572](https://huggingface.co/datasets/hutaobo/pyxenium-gsm9116572) |
43 | | - |
|
44 | | ---- |
45 | 28 |
|
46 | | -## Development |
| 29 | +**What gets loaded:** |
| 30 | +- **Counts**: from MEX (`cell_feature_matrix/{matrix.mtx.gz, features.tsv.gz, barcodes.tsv.gz}`). |
| 31 | +- **Clusters** (optional): from `analysis.zarr[.zip]` if provided. |
| 32 | +- **Spatial centroids** (optional): from `cells.zarr[.zip]` if provided. |
47 | 33 |
|
48 | | -To install with development dependencies (testing, docs, etc.): |
| 34 | +If MEX is missing: |
| 35 | +- The function returns an **empty-gene AnnData** (one row per cell if cell IDs can be inferred; otherwise no rows). |
| 36 | +- Clusters/spatial are still attached when possible. |
| 37 | +- To get real counts, upload MEX to `<base>/cell_feature_matrix/` or pass `mex_dir=...` explicitly. |
49 | 38 |
|
50 | | -```bash |
51 | | -pip install -e ".[dev]" |
52 | | -pytest |
| 39 | +### Override the MEX location (optional) |
| 40 | +```python |
| 41 | +adata = load_anndata_from_partial( |
| 42 | + base_url=BASE, |
| 43 | + mex_dir=BASE + "/cell_feature_matrix", # explicit |
| 44 | + analysis_name="analysis.zarr.zip", |
| 45 | + cells_name="cells.zarr.zip", |
| 46 | +) |
53 | 47 | ``` |
54 | 48 |
|
55 | | ---- |
56 | | - |
57 | | -## Links |
58 | | - |
59 | | -- 📦 PyPI: [pyXenium](https://pypi.org/project/pyXenium/) |
60 | | -- 📖 Documentation: [Read the Docs](https://pyxenium.readthedocs.io/en/latest/) |
61 | | -- 💻 Source code: [GitHub](https://github.com/hutaobo/pyXenium) |
62 | | - |
63 | | -## License |
| 49 | +### Local folder example |
| 50 | +```python |
| 51 | +adata = load_anndata_from_partial( |
| 52 | + base_dir="/path/to/xenium_export", |
| 53 | + analysis_name="analysis.zarr", |
| 54 | + cells_name="cells.zarr", |
| 55 | + # will look for /path/to/xenium_export/cell_feature_matrix/ |
| 56 | +) |
| 57 | +``` |
64 | 58 |
|
65 | | -MIT |
| 59 | +### Troubleshooting |
| 60 | +- **FileNotFoundError: MEX missing files** → Ensure the three files exist in `cell_feature_matrix/`: |
| 61 | + `matrix.mtx.gz`, `features.tsv.gz`, `barcodes.tsv.gz`. |
| 62 | +- **Different obs names** → We honor 10x barcodes (from MEX). If your Zarr stores numeric |
| 63 | + cell IDs, we normalize them to strings internally but prefer the barcodes from MEX. |
| 64 | +- **Large downloads** → Remote MEX files are downloaded once per session into a temporary directory. |
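
The troubleshooting entry for missing MEX files can be checked up front for a local export. The sketch below is not part of pyXenium: `missing_mex_files` and `MEX_FILES` are hypothetical names introduced here for illustration. It assumes only the folder layout described in the new README text (`cell_feature_matrix/` holding the three gzipped files) and uses nothing beyond the standard library.

```python
from pathlib import Path

# File names of the 10x MEX triplet, as listed in the README text in this diff.
MEX_FILES = ("matrix.mtx.gz", "features.tsv.gz", "barcodes.tsv.gz")


def missing_mex_files(export_dir, mex_subdir="cell_feature_matrix"):
    """Return the MEX file names absent from export_dir/mex_subdir.

    Hypothetical helper for illustration; it is not a pyXenium function.
    An empty list means all three files are present.
    """
    mex_dir = Path(export_dir) / mex_subdir
    return [name for name in MEX_FILES if not (mex_dir / name).is_file()]


# Example: report what is missing before calling the loader.
missing = missing_mex_files("/path/to/xenium_export")
if missing:
    print("Missing MEX files:", ", ".join(missing))
```

Running a check like this before `load_anndata_from_partial(base_dir=...)` would show whether the loader is likely to fall back to the empty-gene AnnData described in the README text, though the loader's own lookup logic may differ in detail.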