scGHSOM: A Hierarchical Framework for Single-Cell Data Clustering and Visualization
The project is currently run from the WSL terminal in VS Code.
- Requires JRE (Java Runtime Environment)
- Python version 3.6 or higher
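The two requirements above can be checked before running anything. The following is a minimal sketch, not part of the repository; the function name `missing_prerequisites` is hypothetical, and it assumes the JRE exposes a `java` executable on the `PATH`.

```python
import shutil
import sys


def missing_prerequisites():
    """Return a list of human-readable prerequisites that were not found."""
    missing = []
    # The README asks for Python 3.6 or higher.
    if sys.version_info < (3, 6):
        missing.append("Python 3.6 or higher")
    # Assumption: an installed JRE puts a `java` executable on PATH.
    if shutil.which("java") is None:
        missing.append("Java Runtime Environment (no 'java' on PATH)")
    return missing


print(missing_prerequisites() or "All prerequisites found")
```

An empty list means both checks passed; otherwise each entry names what to install.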
- ./raw-data/Levine_13
- ./raw-data/Levine_32
- ./raw-data/CyTOF-Samusik
- raw-data (folder): Stores data to be clustered.
- raw-data/label (folder): Stores labels for clustering data.
  - File names should have the same prefix as the data file, with _label appended.
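The label naming rule above can be expressed as a small path helper. This is only an illustrative sketch: the function `label_path_for` is hypothetical, and the `.csv` extension for label files is an assumption (the README states CSV only for the input data).

```python
from pathlib import PurePosixPath


def label_path_for(data_name, raw_dir="raw-data"):
    """Build the expected label file path for a data file prefix.

    `data_name` is the data file name without its extension,
    e.g. "Levine_13dim_cleaned".
    """
    # Assumption: label files are CSV, like the input data.
    return PurePosixPath(raw_dir) / "label" / f"{data_name}_label.csv"


print(label_path_for("Levine_13dim_cleaned"))
# raw-data/label/Levine_13dim_cleaned_label.csv
```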
- Input data must be in CSV format.
- Columns: Represent training attributes (all columns).
- Rows: Represent data to be clustered.
- Before starting clustering, name the index column (the index name must be passed in the command).
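The input rules above (CSV format, a named and non-empty index column) can be checked up front. The sketch below uses only the standard library; `validate_input_csv` is a hypothetical helper, not something shipped with scGHSOM, and the marker column names in the sample are made up for illustration.

```python
import csv
import io


def validate_input_csv(text, index_name):
    """Return a list of problems found in CSV text; an empty list means OK."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    if index_name not in header:
        return [f"index column {index_name!r} not found in header"]
    idx = header.index(index_name)
    problems = []
    # Data rows start on line 2 (assuming no multi-line quoted fields).
    for line_number, row in enumerate(reader, start=2):
        if idx >= len(row) or row[idx].strip() == "":
            problems.append(f"line {line_number}: empty index value")
    return problems


# Illustrative sample; the column names CD3 and CD4 are placeholders.
sample = "Event,CD3,CD4\n1,0.5,1.2\n2,0.1,0.9\n"
print(validate_input_csv(sample, "Event"))  # []
```

To check a real file, read it with `open(path, newline="").read()` and pass the same index name that will be given to `--index`.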
Run the following commands in the terminal:
# for Levine_13
python3 execute.py --index=Event --data=Levine_13dim_cleaned --tau1=0.06 --tau2=0.1
# for Levine_32
python3 execute.py --index=Event --data=Levine_32dim_cleaned --tau1=0.1 --tau2=0.2
# for CyTOF-Samusik
python3 execute.py --index=Event --data=Samusik_01_cleaned --tau1=0.08 --tau2=0.2
- data and index are mandatory parameters (ensure the index column is named and not empty).
- If tau1 and tau2 are not provided:
  - tau1 defaults to 0.1
  - tau2 defaults to 0.01
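The parameter rules above (mandatory data and index, defaults for tau1 and tau2) map naturally onto `argparse`. The following is a sketch of such a parser under those stated rules; it is not the actual contents of execute.py, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Parser mirroring the documented flags and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(description="scGHSOM-style argument sketch")
    parser.add_argument("--index", required=True, help="name of the index column")
    parser.add_argument("--data", required=True, help="data file prefix")
    parser.add_argument("--tau1", type=float, default=0.1)
    parser.add_argument("--tau2", type=float, default=0.01)
    return parser


# Omitting tau1 and tau2 falls back to the documented defaults.
args = build_parser().parse_args(["--index=Event", "--data=Levine_13dim_cleaned"])
print(args.tau1, args.tau2)  # 0.1 0.01
```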
- execute.py: Runs all the process steps.
- format_ghsom_input_vector.py: Generates data in a format compatible with GHSOM.
- get_ghsom_dim.py: Retrieves the dimensions of the clustering results.
- save_cluster_with_clustered_label.py: Produces a data frame with clustering results (Leaf and each Layer) and saves it to the data folder.
- evaluation/clustering_scores: Calculates external and internal evaluation scores.
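As an illustration of what an external evaluation score measures, the sketch below computes cluster purity from true labels and cluster assignments. Purity is chosen here only as a simple example; the README does not say which scores evaluation/clustering_scores actually computes, and the cell-type labels are made up.

```python
from collections import Counter, defaultdict


def purity(true_labels, cluster_ids):
    """Fraction of samples that belong to the majority class of their cluster."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have equal length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    # Count how often each true label occurs inside each cluster.
    members = defaultdict(Counter)
    for label, cluster in zip(true_labels, cluster_ids):
        members[cluster][label] += 1
    majority_total = sum(c.most_common(1)[0][1] for c in members.values())
    return majority_total / len(true_labels)


# Illustrative labels: 4 of 6 cells sit in their cluster's majority class.
truth = ["T", "T", "B", "B", "NK", "NK"]
clusters = [0, 0, 0, 1, 1, 1]
print(purity(truth, clusters))
```

External scores like this need the files in raw-data/label, whereas internal scores use only the clustered data itself.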
Run the following commands in the terminal:
# Cluster Feature Map
python3 programs/Visualize/cluster_feature_map.py --data=Samusik_01_cleaned --tau1=0.08 --tau2=0.2
# Cluster Distribution Map
python3 programs/Visualize/cluster_distribution_map.py --data=Samusik_01_cleaned --tau1=0.08 --tau2=0.2
data, tau1, and tau2 should be set based on your dataset and analysis needs.
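When several datasets or threshold settings are compared, it can help to generate the two visualization commands rather than typing them by hand. This is a small sketch that only builds the command strings shown above and does not run them; `visualize_command` and the `SCRIPTS` mapping are hypothetical helpers.

```python
import shlex

# Script paths as they appear in the commands above.
SCRIPTS = {
    "feature": "programs/Visualize/cluster_feature_map.py",
    "distribution": "programs/Visualize/cluster_distribution_map.py",
}


def visualize_command(kind, data, tau1, tau2):
    """Return the shell command for one visualization as a single string."""
    parts = [
        "python3",
        SCRIPTS[kind],
        f"--data={data}",
        f"--tau1={tau1}",
        f"--tau2={tau2}",
    ]
    # Quote each part so unusual dataset names stay shell-safe.
    return " ".join(shlex.quote(part) for part in parts)


for kind in SCRIPTS:
    print(visualize_command(kind, "Samusik_01_cleaned", 0.08, 0.2))
```

The tau1 and tau2 values passed here should match those used for the clustering run being visualized.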
- Online Tutorial
- Shang-Jung Wen*, Jia-Ming Chang*, David Jing-Wei Chen, and Fang Yu. scGHSOM: A Hierarchical Framework for Single-Cell Data Clustering and Visualization. IEEE Transactions on Computational Biology and Bioinformatics, 2025, PP(99): 1–17. doi:10.1109/tcbbio.2025.3593632.
- Web server