X. Python transfer
This page explains how to transfer different kinds of Seurat objects to Python, for use in Python-based tools (like scvelo or SCEPIA).
There are two options for transferring your Seurat object:
- Save the matrices from the Seurat object, load into an AnnData object in Python.
- Transfer with SeuratDisk to a .h5ad file (an HDF5-based file containing an AnnData object).
Different analyses in Seurat produce Seurat objects with a different structure! For instance, an object processed for RNA velocity contains different assays than an object from an integrated Seurat analysis.
Raw counts and normalized counts can be stored as sparse matrices, to save disk space.
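To illustrate why sparse storage pays off for count data, the sketch below compares the in-memory size of a dense matrix with its CSR representation. The toy matrix and the helper name `dense_and_sparse_bytes` are assumptions made for illustration. In-memory size is used here as a rough proxy for disk space.

```python
import numpy as np
from scipy import sparse


def dense_and_sparse_bytes(dense):
    """Return (dense_bytes, csr_bytes) for a 2-D count matrix."""
    csr = sparse.csr_matrix(dense)
    # A CSR matrix stores the non-zero values plus two index arrays.
    csr_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes
    return dense.nbytes, csr_bytes


rng = np.random.default_rng(0)
# Toy count matrix: 2,000 genes x 500 cells, roughly 95% zeros.
counts = rng.poisson(0.05, size=(2000, 500)).astype(np.int32)

dense_bytes, csr_bytes = dense_and_sparse_bytes(counts)
print(f"dense: {dense_bytes} bytes, sparse (CSR): {csr_bytes} bytes")
```

The saving depends on how many entries are zero. Scaled data is typically dense, which is why it is written to .csv rather than .mtx in the steps on this page.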
Set up:
- Needed dependencies
- Transfer Seurat object (for SCEPIA)
- Transfer Seurat object for RNA Velocity
- TO DO: Transfer Seurat object from Integrated analyses

In R:
library(Seurat)
library(Matrix)
## The object itself (we call it seuset!)
seuset <- readRDS("path_to_file/Seurat_object.rds")
In Python:
import anndata as ad
import pandas as pd
import numpy as np
import scipy.io
## To plot your data
import scanpy as sc
Steps in R to save important matrices:
## Save raw counts, cell metadata and UMAP embedding:
writeMM(seuset[["RNA"]]@counts, "seuset_counts.mtx") ## For normalized counts use: seuset[["RNA"]]@data
write.csv(seuset@meta.data, "seuset_metadata.csv", quote = FALSE)
write.csv(seuset@reductions$umap@cell.embeddings, "seuset_umap_embedding.csv", quote = FALSE)
## For SCEPIA specifically: scaled counts and HVG details
write.csv(seuset[["RNA"]]@scale.data, "seuset_scaled_counts.csv", quote = FALSE)
write.csv(seuset[["RNA"]]@meta.features[,c("vst.variance.standardized", "vst.variable")], "seuset_HVG.csv", quote = FALSE)
Steps in Python to build an AnnData object (example for SCEPIA):
## Load files:
raw_counts = scipy.io.mmread("seuset_counts.mtx").tocsr() ## Example for sparse matrices (CSR format supports slicing)
scaled_counts = pd.read_csv("seuset_scaled_counts.csv", index_col = 0).to_numpy()
metadata = pd.read_csv("seuset_metadata.csv", index_col = 0)
featuredata = pd.read_csv("seuset_HVG.csv", index_col = 0)
umap_embedding = pd.read_csv("seuset_umap_embedding.csv", index_col = 0).to_numpy()
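## Build the adata object:
## Matrices from Seurat are genes x cells and need to be transposed to cells x genes before adding.
## Example with scaled data as used for SCEPIA; for other downstream purposes, replace the scaled matrix with the raw or normalized count matrix.
## Note: var needs one row per gene in X. If scale.data only contains the variable features (the default of ScaleData), subset featuredata to those genes first.
adata = ad.AnnData(X = scaled_counts.T,
                   obs = metadata,
                   var = featuredata
                   )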
adata.obsm["X_umap"] = umap_embedding
## Plot UMAP with Seurat's clustering:
sc.pl.umap(adata, color = ['seurat_clusters'])
## Additional steps for SCEPIA:
adata.var['highly_variable'] = adata.var['vst.variable']
adata.var['dispersions_norm'] = adata.var['vst.variance.standardized']
adata.obs['louvain'] = adata.obs['seurat_clusters']
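The steps above assume that the rows of the UMAP embedding are in the same order as the rows of the metadata. This holds when both come from the same Seurat object, but it is cheap to enforce. The sketch below reorders the embedding by cell name before it goes into `adata.obsm`. The helper name `align_embedding` and the toy cell names are assumptions made for illustration.

```python
import pandas as pd


def align_embedding(metadata, embedding):
    """Return the embedding as a NumPy array, rows ordered like metadata.index."""
    missing = metadata.index.difference(embedding.index)
    if len(missing) > 0:
        raise ValueError(
            f"{len(missing)} cells have no embedding, e.g. {missing[0]}"
        )
    return embedding.loc[metadata.index].to_numpy()


# Toy data: the embedding rows are deliberately in a different order.
metadata = pd.DataFrame(
    {"seurat_clusters": [0, 1, 0]},
    index=["cell_a", "cell_b", "cell_c"],
)
embedding = pd.DataFrame(
    {"UMAP_1": [2.0, 0.0, 1.0], "UMAP_2": [2.5, 0.5, 1.5]},
    index=["cell_c", "cell_a", "cell_b"],
)

umap_embedding = align_embedding(metadata, embedding)
print(umap_embedding)
```

To use this with the files written from R, read the embedding with `pd.read_csv(..., index_col = 0)` and keep it as a DataFrame until after the alignment.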
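Steps in R to save important matrices (for RNA velocity; here the spliced counts are in the "sf" assay and the unspliced counts in the "uf" assay):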
writeMM(seuset[["sf"]]@counts, "seuset_spliced_counts.mtx")
writeMM(seuset[["uf"]]@counts, "seuset_unspliced_counts.mtx")
write.csv(seuset@meta.data, "seuset_metadata.csv", quote = FALSE)
write.csv(seuset@reductions$umap@cell.embeddings, "seuset_umap_embedding.csv", quote = FALSE)
## For SCEPIA specifically
write.csv(seuset[["sf"]]@scale.data, "seuset_scaled_counts.csv", quote = FALSE)
## HVG details needed for SCEPIA (as adata.var['dispersions_norm'] and adata.var["highly_variable"] respectively)
write.csv(seuset[["sf"]]@meta.features[,c("vst.variance.standardized", "vst.variable")], "seuset_HVG.csv", quote = FALSE)
Steps in Python to build an AnnData object (example for scvelo):
## Load files:
raw_spliced = scipy.io.mmread("seuset_spliced_counts.mtx").tocsr()
raw_unspliced = scipy.io.mmread("seuset_unspliced_counts.mtx").tocsr()
metadata = pd.read_csv("seuset_metadata.csv", index_col = 0)
featuredata = pd.read_csv("seuset_HVG.csv", index_col = 0)
umap_embedding = pd.read_csv("seuset_umap_embedding.csv", index_col = 0).to_numpy()
## Build the adata object; matrices from Seurat need to be transposed before adding.
adata = ad.AnnData(X = raw_spliced.T,
                   obs = metadata,
                   var = featuredata
                   )
adata.layers["spliced"] = raw_spliced.T
adata.layers["unspliced"] = raw_unspliced.T
adata.obsm["X_umap"] = umap_embedding
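The .mtx files do not contain cell or gene names, so a matrix that is not transposed, or that does not match the metadata, is easy to miss. The sketch below reads a genes x cells MatrixMarket file, transposes it to cells x genes, and checks the result against the metadata and feature tables. The helper name `load_counts_for_anndata` and the toy file are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import scipy.io
from scipy import sparse


def load_counts_for_anndata(mtx_path, metadata, featuredata):
    """Read a genes x cells .mtx file and return a cells x genes CSR matrix."""
    genes_by_cells = scipy.io.mmread(mtx_path)
    cells_by_genes = sparse.csr_matrix(genes_by_cells).T.tocsr()
    n_cells, n_genes = cells_by_genes.shape
    if n_cells != len(metadata):
        raise ValueError(f"{n_cells} cells in matrix, {len(metadata)} in metadata")
    if n_genes != len(featuredata):
        raise ValueError(f"{n_genes} genes in matrix, {len(featuredata)} in features")
    return cells_by_genes


# Toy example: 3 genes x 2 cells, written the way writeMM would store it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_counts.mtx")
    toy = sparse.csr_matrix(np.array([[1, 0], [0, 2], [3, 0]]))
    scipy.io.mmwrite(path, toy)
    metadata = pd.DataFrame({"seurat_clusters": [0, 1]},
                            index=["cell_1", "cell_2"])
    featuredata = pd.DataFrame({"vst.variable": [True, False, True]},
                               index=["gene_a", "gene_b", "gene_c"])
    X = load_counts_for_anndata(path, metadata, featuredata)

print(X.shape)  # (2, 3)
```

The returned matrix is already cells x genes, so it can be passed to `ad.AnnData` or `adata.layers` without a further `.T`.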
From here, continue with the normalization steps and run the scvelo analysis!
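Before normalizing, it can help to confirm that the spliced and unspliced matrices have the same shape and that the unspliced layer actually contains counts. The sketch below is a quick sanity check under those assumptions, not part of the scvelo workflow itself. The helper name `unspliced_fraction` and the toy matrices are illustrative.

```python
import numpy as np
from scipy import sparse


def unspliced_fraction(spliced, unspliced):
    """Return the fraction of all counts that are unspliced.

    Both inputs are cells x genes matrices and must have the same shape.
    """
    if spliced.shape != unspliced.shape:
        raise ValueError(
            f"shape mismatch: spliced {spliced.shape}, unspliced {unspliced.shape}"
        )
    total_spliced = float(spliced.sum())
    total_unspliced = float(unspliced.sum())
    total = total_spliced + total_unspliced
    if total == 0:
        raise ValueError("both matrices are empty")
    return total_unspliced / total


# Toy matrices: 2 cells x 3 genes.
spliced = sparse.csr_matrix(np.array([[3, 0, 1], [2, 2, 0]]))
unspliced = sparse.csr_matrix(np.array([[1, 0, 0], [0, 1, 0]]))

frac = unspliced_fraction(spliced, unspliced)
print(round(frac, 2))  # 0.2
```

With the object built above, the same check would be called as `unspliced_fraction(adata.layers["spliced"], adata.layers["unspliced"])`.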