-
Notifications
You must be signed in to change notification settings - Fork 180
feat: accessors #2290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: accessors #2290
Conversation
❌ 12 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
ilan-gold
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned on slack, but I think the landing page needs to make clear A.obsm["pca"](adata) is possible!
I have to run now but I have a broader question - is there a desire to do some form of validation as well? Like A.obs["x"].validate(adata) or similar. And relatedly asserting types might be in-scope like A.obsm["X_pca"].is(np.ndarray).validate(adata) or similar
Super nice! Very exciting
|
|
||
| @dataclass(frozen=True) | ||
| class RefAcc[R: AdRef[I], I](abc.ABC): # type: ignore | ||
| """Abstract base class for reference accessors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a terminology page where you explain these concepts (either centralized on acc or within each class) would be good explaining full name i.e., "reference" for Ref and its meanings (and hten how they are composed)
| object.__setattr__(self, "obsp", GraphMapAcc("obs", ref_class=self.ref_class)) | ||
| object.__setattr__(self, "varp", GraphMapAcc("var", ref_class=self.ref_class)) | ||
|
|
||
| def to_json(self, ref: R) -> list[str | int | None]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have a to_json method? Genuinely wondering about the use-case here.
| You can refer to columns or the index in the way you would access them | ||
| in a :class:`~pandas.DataFrame`: | ||
|
|
||
| >>> from anndata.acc import A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a line
| >>> from anndata.acc import A | |
| >>> from anndata.acc import A, MetaAcc | |
| >>> isinstance(A.obs, MetaAcc) |
I think it's implicit in the description, but I would find this helpful (in all of the classes)
User API design
AdPathinstances by descending into a central accessor constantA:anndata/tests/test_accessors.py
Lines 25 to 42 in 61fc920
AdPathinstance, e.g. to figure out which axes the resulting vector spans:anndata/tests/test_accessors.py
Lines 101 to 107 in 61fc920
AdPathinstance to extract a vector (see 1. for examples)subclassing
… is a direct goal of this, as people should be able to use
AdPathsubclasses. This means thatAdPathAPI is minimal and inspection can be done through.acc(could be even more minimal by just putting it all into a container?):anndata/src/anndata/acc/__init__.py
Lines 56 to 57 in 61fc920
anndata/src/anndata/acc/__init__.py
Lines 63 to 64 in 61fc920
AdAccconstant that produces your ownAdPathsubclass:anndata/src/anndata/acc/__init__.py
Line 415 in 61fc920
VecAcc.__getitem__to get anAdPathor a list of them. In that process,__getitem__callsprocess_idxwhich verifies and simplifies the indexaxes,__repr__, andidx_reprget called too to validate things workAdPathcan be used, which basically just delegates toVecAcc’saxes,__repr__,idx_repr, and__call__So in the end everything except for
__call__is validated, i.e.VecAcc.__getitem__raises exceptions on misuseanndata/src/anndata/acc/__init__.py
Lines 101 to 119 in 61fc920
TODO:
Especially since we already call these accessors! Instance namespaces (
adata.custom) #1869__dir__)