Sally332/README.md

👩‍🔬 Computational Biologist | Genomics, Single-Cell Analysis & Interpretable AI

I develop interpretable machine learning frameworks that integrate multi-omic and spatial data to uncover the molecular logic of complex biological systems, with a focus on reproducibility, biological grounding, and transparent interpretation of high-dimensional genomics data.


🔬 Research Vision

My work focuses on developing interpretable and reproducible computational frameworks for genomics that unite biological prior knowledge with multi-omic and spatial data. The central goal is to move beyond black-box prediction toward mechanistic understanding: building models that not only perform well but also explain how genetic variation, regulatory programs, and perturbations reshape cellular states. Each framework emphasizes pathway- and network-level interpretability, cross-dataset generalization, and transparent benchmarking, with the aim of establishing reusable analytical standards for single-cell and multimodal genomics. Through this approach, I aim to bridge machine learning, systems biology, and data-driven biological interpretation, supporting robust and generalizable discovery across biological contexts.


📂 Key Projects

The following key projects are part of the MM-KPNN framework family, a unified effort to develop concept-bottleneck and biologically constrained models that embed prior knowledge directly into network architectures, ensuring interpretability, reproducibility, and mechanistic insight across multi-omic and spatial data.


Repo

A modular and interpretable graph framework for spatial transcriptomics in tissue microenvironments.

  • Combines Graph Attention Networks (GAT) with knowledge-primed decoding
  • Models cell–cell communication, immune exclusion, and stromal remodeling
  • Outputs attention maps, pathway overlays, and ligand–receptor driver rankings

Repo

Interpretable multimodal neural network integrating scRNA-seq and scATAC-seq using biological priors.

  • Decoder constrained by pathway and transcription factor nodes
  • Enables mechanistic attribution of regulatory programs and cell states
  • Designed for reproducible benchmarking across single-cell modalities
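As a rough illustration of what a decoder "constrained by pathway nodes" can mean, the sketch below masks a dense weight matrix so that each pathway node only connects to its member genes. This is a minimal, dependency-free sketch and not the repository's implementation; the pathway names, gene names, and the helper functions `build_mask` and `masked_decode` are hypothetical.

```python
# Hypothetical pathway -> gene membership (illustrative names only).
PATHWAYS = {
    "pathway_A": ["gene1", "gene2"],
    "pathway_B": ["gene2", "gene3"],
}
GENES = ["gene1", "gene2", "gene3"]


def build_mask(pathways, genes):
    """Return a pathways x genes 0/1 matrix: 1 where the gene belongs to the pathway."""
    return [[1.0 if gene in members else 0.0 for gene in genes]
            for members in pathways.values()]


def masked_decode(activations, weights, mask):
    """Map pathway activations to gene outputs using only mask-permitted weights."""
    n_genes = len(mask[0])
    out = [0.0] * n_genes
    for p, activation in enumerate(activations):
        for g in range(n_genes):
            # Weights outside the pathway's gene set are zeroed by the mask.
            out[g] += activation * weights[p][g] * mask[p][g]
    return out


mask = build_mask(PATHWAYS, GENES)
weights = [[0.5, 0.5, 0.5], [2.0, 2.0, 2.0]]  # arbitrary dense weights
decoded = masked_decode([1.0, 1.0], weights, mask)
```

Because every surviving weight links a named pathway to one of its own genes, each output can be traced back to the pathway nodes that produced it, which is the property the interpretability claims rest on.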

Repo

Pathway-bottleneck graph neural network for perturbation and drug-response prediction.

  • Integrates multi-omic features with prior knowledge graphs
  • Focuses on cross-dataset generalization across pharmacogenomic panels
  • Provides pathway-level interpretability and reproducible evaluation

Repo

Concept-bottleneck framework for modeling drug and CRISPR perturbation responses at single-cell resolution.

  • Implements pathway and TF bottlenecks for interpretability
  • Measures attribution stability across perturbation conditions
  • Supports counterfactual pathway analysis in single-cell datasets

Additional Repositories

Repo

A modular computational framework for the analysis of organoid systems.

  • Addresses reproducibility, heterogeneity, and data integration
  • Integrates RNA and protein modalities with interpretable ML
  • Demonstrates end-to-end reproducibility through documented notebooks

Repo

Spatial mapping of tissue architecture using 10x Visium transcriptomics.

  • Defines epithelial, immune, stromal, and proliferative regions
  • Reveals spatial organization and regional heterogeneity
  • Provides a fully documented, end-to-end analytical workflow

Repo

End-to-end pipeline for structural variant discovery and annotation using PacBio long-read sequencing.

  • Implements clinical annotation (ACMG/AMP) and variant filtering
  • Includes functional scoring and visualization modules
  • Designed for scalable deployment in HPC environments

Repo

Modular framework for rare-variant burden analysis in genomic cohorts.

  • Supports SKAT, SKAT-O, and extended statistical methods
  • Implements functional weighting and population correction
  • Provides reproducible filtering and QC workflows
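To make the idea of variant weighting concrete, here is a minimal sketch of a weighted burden score. It is not SKAT or SKAT-O and not the repository's code. As an assumption, it uses a Beta(1, 25) density of the minor allele frequency (a weighting commonly used with SKAT-style tests) as a stand-in for whatever functional weights the framework applies. The function names are hypothetical.

```python
def beta_1_25_weight(maf):
    """Beta(1, 25) density at a minor allele frequency: 25 * (1 - maf) ** 24."""
    return 25.0 * (1.0 - maf) ** 24


def burden_scores(genotypes, mafs):
    """Weighted sum of minor-allele counts per sample.

    genotypes: one row per sample, one column per variant (values 0, 1, or 2).
    mafs: minor allele frequency per variant, in the same column order.
    """
    weights = [beta_1_25_weight(maf) for maf in mafs]
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


# Toy data: two samples, two variants (assumed values, for illustration only).
scores = burden_scores([[0, 0], [1, 0]], [0.0, 0.5])
```

The weight falls steeply as frequency rises, so rare variants dominate the per-sample score.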

Repo

Systems biology workflow for reconstructing gene regulatory networks.

  • Integrates TF–target priors with expression-based inference
  • Performs network topology and modularity analysis
  • Identifies functionally enriched regulatory modules

Repo

Gene co-expression analysis pipeline using WGCNA.

  • Identifies expression modules and hub genes
  • Evaluates biological function and module preservation
  • Applies to bulk and single-cell RNA-seq datasets

Repo

Workflow for secure and efficient genomic data transfer using Globus.

  • Supports HPC environments and structured data sharing
  • Enables checksum validation and metadata tracking
  • Designed for collaborative, reproducible research
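Checksum validation can be sketched independently of any transfer tool. The example below does not use the Globus API and is not the repository's code; it only shows, under that assumption, how a SHA-256 digest computed before and after a transfer can confirm that a file arrived intact. The function names and file names are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source_path, dest_path):
    """Return True when source and destination files have identical digests."""
    return sha256_of(source_path) == sha256_of(dest_path)


# Small self-contained demonstration with temporary files.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "reads.fastq")
    good = os.path.join(workdir, "reads_copy.fastq")
    bad = os.path.join(workdir, "reads_truncated.fastq")
    for path, payload in [(src, b"ACGT\n"), (good, b"ACGT\n"), (bad, b"ACG")]:
        with open(path, "wb") as handle:
            handle.write(payload)
    intact = verify_transfer(src, good)
    corrupted = verify_transfer(src, bad)
```

Reading in chunks keeps memory use flat for large sequencing files.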

Contact

Sally Yepes
📧 sallyepes233@gmail.com
🔗 GitHub: Sally332
🔗 Portfolio: sally332.github.io
