Inputting nuclei counts into cell2location #344

@ssobt

Hi, thank you for this tool! I have a question about entering cell counts. I’m using an older version of cell2location (v.02-alpha) to input nuclei counts for the 'expected number of cells per location' hyperparameter. We’re having trouble getting the latest version (v.0.1.3) to assign cell probabilities to most of the tissue because of high RNA variability; we tried both 20 and 200 for alpha (see image below for alpha 200). Areas with low RNA content are assigned very low probabilities for every reference cell type. To try to alleviate the problem, we switched to the older version so that we could input custom cell/nuclei counts. In v.02-alpha, I passed a 1-dimensional numpy array with the nuclei count of each spot on the Visium slide (made by concatenating the rows of a 2D x,y array), and the following error occurs, asking for one value instead of location-specific values: Gamma has no finite default value to use, checked: ('median', 'mean', 'mode'). Pass testval argument or adjust so value is finite.

I tried entering the 2D array directly and got the same error. The model only started to run when I entered a single integer, so I was wondering how to input nuclei counts for each spot/location. Any advice on this would be great, thanks!
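For reference, here is a minimal sketch of the sanity checks that could be run on the per-spot count vector before passing it as a prior. `check_cells_per_spot` is a hypothetical helper, not part of cell2location, and it is only an assumption that zero, negative, or NaN entries could contribute to a non-finite Gamma default; the checks do not establish whether v.02-alpha accepts location-specific values at all.

```python
import numpy as np


def check_cells_per_spot(counts, n_spots):
    """Return counts as a 1-D float array after basic sanity checks.

    Hypothetical helper for illustration; not a cell2location function.
    """
    # Flatten whatever was passed (list, Series, 2-D array) to one dimension.
    arr = np.asarray(counts, dtype=float).ravel()
    # One value per spot is expected.
    if arr.shape[0] != n_spots:
        raise ValueError(f"expected {n_spots} values, got {arr.shape[0]}")
    # NaN or infinite entries cannot serve as a prior mean.
    if not np.all(np.isfinite(arr)):
        raise ValueError("counts contain NaN or infinite values")
    # Assumption: a prior mean of zero (or below) is not usable.
    if np.any(arr <= 0):
        raise ValueError("counts contain zero or negative values")
    return arr
```

With the objects from the setup below, this would be called as `check_cells_per_spot(nuclei_count_1d, slide.n_obs)`.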

image

Here is the model setup:

import numpy as np
import cell2location

# Per-spot nuclei counts as a 1-D numpy array
nuclei_count_1d = np.array(nuclei_counts_1149G['Count'])
r = cell2location.run_cell2location(

      # Single cell reference signatures as pd.DataFrame
      # (could also be data as anndata object for estimating signatures
      #  as cluster average expression - `sc_data=adata_snrna_raw`)
      sc_data=inf_aver,
      # Spatial data as anndata object
      sp_data=slide,

      # the column in sc_data.obs that gives cluster identity of each cell
      summ_sc_data_args={'cluster_col': "annotation_1",
                        },

      train_args={'use_raw': True, # By default uses raw slots in both of the input datasets.
                  'n_iter': 40000, # Increase the number of iterations if needed (see QC below)

                  # When analysing data that contains multiple experiments,
                  # cell2location automatically enters the mode which pools information across experiments
                  'sample_name_col': 'sample'}, # Column in sp_data.obs with experiment ID (see above)


      export_args={'path': results_folder, # path where to save results
                   'save_model': True,
                   'run_name_suffix': '' # optional suffix to modify the name of the run
                  },

      model_kwargs={ # Prior on the number of cells, cell types and co-located groups

                    'cell_number_prior': {
                        # - N - the expected number of cells per location:
                        'cells_per_spot': nuclei_count_1d, # < - change this
                        # - A - the expected number of cell types per location (use default):
                        'factors_per_spot': 7,
                        # - Y - the expected number of co-located cell type groups per location (use default):
                        'combs_per_spot': 7
                    },

                     # Prior beliefs on the sensitivity of spatial technology:
                    'gene_level_prior':{
                        # Prior on the mean
                        'mean': 1/2,
                        # Prior on standard deviation,
                        # a good choice of this value should be at least 2 times lower than the mean
                        'sd': 1/4
                    }
      }
)
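As a side note on how the 1-D vector was built: concatenating the rows of a 2D array is the same as row-major flattening. The sketch below uses a toy grid (hypothetical data) to show where each grid position ends up in the flattened vector. Presumably the resulting order also has to match the order of spots in `sp_data.obs`, which is an assumption rather than something confirmed here.

```python
import numpy as np

# Toy 2-D grid of nuclei counts, indexed as grid[row, col] (hypothetical data).
grid = np.array([[4, 7, 1],
                 [2, 9, 3]])

# Row-major flattening: rows are concatenated one after another.
flat = grid.ravel(order="C")


def flat_index(row, col, n_cols):
    """Position of grid[row, col] in the row-major flattened vector."""
    return row * n_cols + col


# The count at row 1, col 2 sits at position 5 of the flattened vector.
assert flat[flat_index(1, 2, grid.shape[1])] == grid[1, 2]
```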
