push_obs transfers labels out of order #109

@racng

Describe the bug
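I have Leiden cluster labels stored in mdata.obs. I used mdata.push_obs to push the labels to the 'rna' and 'prot' modalities. After the push, the labels in each modality are out of order: they no longer match the labels in mdata.obs for the same obs_names.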

To Reproduce
Using my own data

mdata.push_obs([cluster_key], ['rna', 'prot'])
(mdata['rna'].obs[cluster_key] == mdata['rna'].obs_names.map(mdata.obs[cluster_key])).all()
# False
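
A plain `==` followed by `.all()` can also return False for reasons other than ordering, for example when either side contains missing values, because NaN never compares equal to NaN. The sketch below is one way to list exactly which cells differ. It uses only pandas; `label_mismatches` is a hypothetical helper name, not part of MuData, and it assumes obs_names are unique in both tables.

```python
import pandas as pd


def label_mismatches(global_labels: pd.Series, mod_labels: pd.Series) -> pd.DataFrame:
    """Return the rows where the pushed labels differ from the global labels.

    Hypothetical helper (not a MuData function). Assumes unique index values.
    """
    # Align the global labels to the modality's row order by name,
    # so the comparison does not depend on how the rows are ordered.
    expected = global_labels.reindex(mod_labels.index)
    both = pd.DataFrame(
        {
            "expected": expected.astype(object),
            "pushed": mod_labels.astype(object),
        }
    )
    # Two missing values count as a match; a plain == would flag them.
    both_missing = both["expected"].isna() & both["pushed"].isna()
    same = (both["expected"] == both["pushed"]) | both_missing
    return both[~same]
```

With the objects from this report, it would be called as `label_mismatches(mdata.obs[cluster_key], mdata['rna'].obs[cluster_key])`. An empty result means the labels agree once aligned by name; a non-empty result shows the affected obs_names and both values.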

I prefer to share the dataset privately over email/cloud drive. Please email me at rng@systemsbiology.org.

I cannot reproduce the bug with a toy example.

import mudata
import numpy as np
import pandas as pd
import scanpy as sc
obs_names = [f"A{i}" for i in range(100)]
g_names = [f"G{i}" for i in range(1000)]
p_names = [f"P{i}" for i in range(10)]

mdata = mudata.MuData({
    'airr': sc.AnnData(
        np.zeros((50, 1)), 
        obs = pd.DataFrame(index=obs_names[:-51:-1]),
        var = pd.DataFrame(index=['T'])
    ),
    'rna': sc.AnnData(
        np.zeros((100, 1000)), 
        obs = pd.DataFrame(index=obs_names),
        var = pd.DataFrame(index=g_names)
    ),
    'prot': sc.AnnData(
        np.zeros((100, 10)), 
        obs = pd.DataFrame(index=obs_names),
        var = pd.DataFrame(index=p_names)
    )
})

mdata.obs['cluster'] = ['0'] * 50 + ['1'] * 50
mdata.obs['cluster'] = mdata.obs['cluster'].astype('category')
mdata.push_obs(['cluster'], ['rna', 'prot'])
(mdata['rna'].obs['cluster'] == mdata['rna'].obs_names.map(mdata.obs['cluster'])).all() 
# True
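
Since the toy example does not trigger the problem, it may help to check how the obs_names of the real data relate to each other. The following is a minimal sketch using only pandas; `describe_index_relationship` is a hypothetical helper, not part of MuData, and the properties it reports are only guesses at what might differ between the real data and the toy data.

```python
import pandas as pd


def describe_index_relationship(global_names: pd.Index, mod_names: pd.Index) -> dict:
    """Summarise how a modality's obs_names relate to the global obs_names.

    Hypothetical diagnostic helper (not a MuData function).
    """
    # Global names restricted to those present in the modality, in global order.
    shared_in_global_order = global_names[global_names.isin(mod_names)]
    return {
        "global_has_duplicates": not global_names.is_unique,
        "modality_has_duplicates": not mod_names.is_unique,
        # Modality cells that do not appear in the global index at all.
        "missing_from_global": int((~mod_names.isin(global_names)).sum()),
        # True only if the shared names appear in the same order in both.
        "same_order": bool(shared_in_global_order.equals(mod_names)),
    }
```

It would be called as `describe_index_relationship(mdata.obs_names, mdata['rna'].obs_names)`, and likewise for 'prot'.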

Expected behaviour
After push_obs, the labels in each modality should match the labels in mdata.obs when aligned by obs_names, so the check under "To Reproduce" should return True.

System

  • OS: Ubuntu 24.04.3 LTS
  • Python version: 3.11.13
  • Library versions: MuData 0.3.2, AnnData 0.12.1

Labels: bug (Something isn't working)