Open
Labels: bug (Something isn't working)
Describe the bug
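Renaming the global mdata.obs_names changes mdata['airr'].obs_names in an unexpected way when the airr modality contains only a subset of the cells. The new names appear to be assigned to airr by position rather than by matching the old names, so 'cellC-1' becomes 'cellB' instead of 'cellC'.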
To Reproduce
import numpy as np
import pandas as pd
import anndata as ad
import mudata as md
# gex has 4 cells
gex = ad.AnnData(
    X=np.random.rand(4, 3),
    obs=pd.DataFrame(index=["cellA-1", "cellB-1", "cellC-1", "cellD-1"]),
    var=pd.DataFrame(index=["G1", "G2", "G3"]),
)
# airr has only 2 of those cells (subset)
airr = ad.AnnData(
    X=np.empty((2, 0)),
    obs=pd.DataFrame(
        {"VJ_1_cdr3_aa": ["CASSL", "CASRG"]},
        index=["cellA-1", "cellC-1"],
    ),
)
mdata = md.MuData({"gex": gex, "airr": airr})
old_airr_obs = list(mdata["airr"].obs_names)
m2 = mdata.copy()
m2.obs_names = [s.split("-", 1)[0] for s in m2.obs_names]
new_airr_obs = list(m2["airr"].obs_names)
print('Pre renaming')
print("GLOBAL:", list(mdata.obs_names))
print("GEX: ", list(mdata["gex"].obs_names))
print("AIRR: ", list(mdata["airr"].obs_names))
print('Post renaming')
print("GLOBAL:", list(m2.obs_names))
print("GEX: ", list(m2["gex"].obs_names))
print("AIRR: ", list(m2["airr"].obs_names))
Output:
Pre renaming
GLOBAL: ['cellA-1', 'cellB-1', 'cellC-1', 'cellD-1']
GEX: ['cellA-1', 'cellB-1', 'cellC-1', 'cellD-1']
AIRR: ['cellA-1', 'cellC-1']
Post renaming
GLOBAL: ['cellA', 'cellB', 'cellC', 'cellD'] # expected
GEX: ['cellA', 'cellB', 'cellC', 'cellD'] # expected
AIRR: ['cellA', 'cellB'] # UNEXPECTED
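For comparison, here is a minimal sketch of a label-based rename, which is presumably what a user would expect for the airr modality. It uses plain Python lists only and makes no claim about how mudata handles the assignment internally; `rename_by_label` is a hypothetical helper, not part of mudata or anndata.

```python
def rename_by_label(names, mapping):
    """Hypothetical helper: look up each old name in `mapping` (old -> new)."""
    missing = [n for n in names if n not in mapping]
    if missing:
        raise KeyError(f"no new name given for: {missing}")
    return [mapping[n] for n in names]


# Same names as in the reproduction above.
global_names = ["cellA-1", "cellB-1", "cellC-1", "cellD-1"]
airr_names = ["cellA-1", "cellC-1"]

# Build the old -> new mapping once, from the global names.
mapping = {old: old.split("-", 1)[0] for old in global_names}

# Apply it per modality by label, so a subset keeps the right cells.
new_gex = rename_by_label(global_names, mapping)
new_airr = rename_by_label(airr_names, mapping)

print("GEX: ", new_gex)   # GEX:  ['cellA', 'cellB', 'cellC', 'cellD']
print("AIRR:", new_airr)  # AIRR: ['cellA', 'cellC']
```

With a mapping like this, 'cellC-1' ends up as 'cellC' regardless of its position within the modality.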
Expected behaviour
Either emit a warning or raise an error when renaming the global obs_names would also rename the airr obs_names.
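The requested check could look roughly like the sketch below. This is only an illustration under stated assumptions: `check_global_rename` and its `on_partial` parameter are hypothetical names, not an existing mudata API, and the function works on plain lists of names rather than on MuData objects.

```python
import warnings


def check_global_rename(global_names, modality_names, on_partial="warn"):
    """Hypothetical guard: flag modalities that lack some of the global cells.

    `modality_names` maps a modality name to its list of obs names.
    Returns the sorted list of modalities that cover only a subset.
    """
    partial = sorted(
        mod
        for mod, names in modality_names.items()
        if set(names) != set(global_names)
    )
    if not partial:
        return partial

    msg = (
        "Renaming global obs_names would also rename modalities that "
        f"contain only a subset of the cells: {partial}"
    )
    if on_partial == "raise":
        raise ValueError(msg)
    warnings.warn(msg, UserWarning, stacklevel=2)
    return partial


# Names from the reproduction above: airr is a subset, gex is complete.
partial = check_global_rename(
    ["cellA-1", "cellB-1", "cellC-1", "cellD-1"],
    {
        "gex": ["cellA-1", "cellB-1", "cellC-1", "cellD-1"],
        "airr": ["cellA-1", "cellC-1"],
    },
)
print(partial)  # ['airr']
```

Passing `on_partial="raise"` turns the warning into a `ValueError`, which covers the "error" variant of the expected behaviour.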
System
mudata version '0.3.2'