Merged
22 commits
0e33f3c
Replace cat() with message() in logging utilities
Arshammik Feb 8, 2026
198a682
Replace cat/sink progress bar with txtProgressBar in train()
Arshammik Feb 8, 2026
8618258
Replace cat() progress bar with txtProgressBar in imputation accessors
Arshammik Feb 8, 2026
1fb1e25
Replace cat() progress bars with txtProgressBar in imputation
Arshammik Feb 8, 2026
18af0fc
Add verbose parameter to Seurat integration functions
Arshammik Feb 8, 2026
c1b8e1e
Replace installed.packages() with requireNamespace(), gate messages
Arshammik Feb 8, 2026
64e1ca4
Add Benchmark and .claude to .gitignore
Arshammik Feb 8, 2026
3d8ecf7
Replace unconditional message() with warning() for rasterization
Arshammik Feb 8, 2026
53946c2
Replace cat() with message() in list_h5_structure
Arshammik Feb 8, 2026
5918f9c
Replace cat() with message() in pathway analysis functions
Arshammik Feb 8, 2026
530862c
Bump version to 2.3.1, fix DESCRIPTION for CRAN compliance
Arshammik Feb 8, 2026
de57236
Move hdf5r from Imports to Suggests in R code
Arshammik Feb 8, 2026
70f62dd
Use CRAN-required LICENSE format
Arshammik Feb 8, 2026
ffcf104
Add missing @return tags to documented functions
Arshammik Feb 8, 2026
6c3c4ee
Replace non-ASCII characters in C++ source files
Arshammik Feb 8, 2026
cfafbc2
Add NEWS.md, cran-comments.md, update .Rbuildignore
Arshammik Feb 8, 2026
1eb820e
Regenerate NAMESPACE and man pages via roxygen2
Arshammik Feb 8, 2026
da11c66
Fix redirected GitHub URLs in DESCRIPTION
Arshammik Feb 8, 2026
52aeecf
fix: address R CMD check warnings and notes
Arshammik Feb 9, 2026
c55b638
fix: add missing Alpha to globalVariables for ggplot2 NSE
Arshammik Feb 9, 2026
842e079
fix: update 301-redirecting URLs, add CRAN badge, fix .Rbuildignore
Arshammik Feb 9, 2026
64617ed
chore: clean up leftover log files, update .Rbuildignore
Arshammik Feb 9, 2026
8 changes: 8 additions & 0 deletions .Rbuildignore
@@ -6,3 +6,11 @@
^\.DS_Store$
^README\.md$
^roadmap\.md$
^Benchmark$
^gedi\.Rcheck$
^cran-comments\.md$
^\.claude$
^Logo\.svg$
^check_output\.txt$
^check_run\.log$
^nohup\.out$
2 changes: 2 additions & 0 deletions .gitignore
@@ -58,3 +58,5 @@ rsconnect/

# Temp directory for development
temp/
Benchmark/
.claude/
29 changes: 12 additions & 17 deletions DESCRIPTION
@@ -1,33 +1,28 @@
Package: gedi
Type: Package
Title: Gene Expression Decomposition and Integration
Version: 2.3.0
Date: 2026-01-08
Version: 2.3.1
Date: 2026-02-08
Authors@R: c(
person("Arsham", "Mikaeili Namini", email = "arsham.mikaeilinamini@mail.mcgill.ca", role = c("aut", "cre")),
person("Hamed", "S.Najafabadi", email = "hamed.najafabadi@mcgill.ca", role = c("aut"))
)
Maintainer: Arsham Mikaeili Namini <arsham.mikaeilinamini@mail.mcgill.ca>
Description: A memory-efficient implementation for integrating gene expression data
from single-cell RNA sequencing experiments. GEDI v2 uses a C++ backend with
from single-cell RNA sequencing experiments. Uses a C++ backend with
thin R wrappers to enable analysis of large-scale single-cell datasets. The
package supports multiple data modalities including count matrices, paired
data (Splicing, Velocyto, CITE-seq), and binary indicators. It implements a latent variable model
with block coordinate descent optimization for dimensionality reduction and
batch effect correction.
data (splicing, RNA velocity, CITE-seq), and binary indicators. It implements
a latent variable model with block coordinate descent optimization for
dimensionality reduction and batch effect correction.
License: MIT + file LICENSE
URL: https://github.com/Arshammik/gedi
BugReports: https://github.com/Arshammik/gedi/issues
URL: https://github.com/csglab/gedi2
BugReports: https://github.com/csglab/gedi2/issues
Depends: R (>= 4.0.0)
Imports: Rcpp (>= 1.0.0), R6 (>= 2.5.0), Matrix (>= 1.3.0), hdf5r,
methods, stats, utils
Imports: Rcpp (>= 1.0.0), R6 (>= 2.5.0), Matrix (>= 1.3.0),
ggplot2, scales, methods, stats, utils
LinkingTo: Rcpp, RcppEigen
Suggests: Seurat, SingleCellExperiment
SystemRequirements: C++14, GNU make, Eigen3 (>= 3.3.0)
Suggests: hdf5r, uwot, digest, glmnet, Seurat, SingleCellExperiment
SystemRequirements: C++14, GNU make
Encoding: UTF-8
RoxygenNote: 7.3.3
Roxygen: list(markdown = TRUE)
NeedsCompilation: yes
Packaged: 2025-10-22 18:58:46 UTC; arshammikaeili
Author: Arsham Mikaeili Namini [aut, cre],
Hamed S.Najafabadi [aut]
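
Moving `hdf5r` from Imports to Suggests means the package can no longer assume it is installed, so any code path that touches it needs a runtime guard. The following is a minimal sketch of that pattern, not the code from this PR: `require_suggested()` and `open_h5_sketch()` are hypothetical names introduced here for illustration.

```r
# Hypothetical helper (not part of gedi): stop with an actionable message
# when a package listed under Suggests is not installed.
require_suggested <- function(pkg, purpose = "this feature") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is required for ", purpose, ". ",
      "Install it with install.packages(\"", pkg, "\").",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Hypothetical reader: call the optional package only after the guard passes,
# and always through `::`, because nothing from it is imported in NAMESPACE.
open_h5_sketch <- function(path) {
  require_suggested("hdf5r", purpose = "reading HDF5 files")
  hdf5r::H5File$new(path, mode = "r")
}
```

Unlike `installed.packages()`, which scans every library directory, `requireNamespace()` checks a single package and loads its namespace without attaching it.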
23 changes: 2 additions & 21 deletions LICENSE
@@ -1,21 +1,2 @@
MIT License

Copyright (c) 2025 Arsham Mikaeili Namini

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
YEAR: 2025
COPYRIGHT HOLDER: Arsham Mikaeili Namini
19 changes: 10 additions & 9 deletions NAMESPACE
@@ -1,5 +1,9 @@
# Generated by roxygen2: do not edit by hand

S3method(print,gedi_dynamics)
S3method(print,gedi_dynamics_svd)
S3method(print,gedi_imputation)
S3method(print,gedi_pathway_associations)
export(CreateGEDIObject)
export(check_optional_dependencies)
export(gedi_to_seurat)
@@ -11,10 +15,6 @@ export(plot_embedding)
export(plot_feature_ratio)
export(plot_features)
export(plot_vector_field)
export(print.gedi_dynamics)
export(print.gedi_dynamics_svd)
export(print.gedi_imputation)
export(print.gedi_pathway_associations)
export(read_h5)
export(read_h5ad)
export(seurat_to_gedi)
@@ -25,14 +25,15 @@ importFrom(Matrix,Matrix)
importFrom(Matrix,sparseMatrix)
importFrom(Matrix,t)
importFrom(Rcpp,sourceCpp)
importFrom(hdf5r,"h5attr<-")
importFrom(hdf5r,H5File)
importFrom(hdf5r,existsGroup)
importFrom(hdf5r,h5attr)
importFrom(methods,as)
importFrom(methods,is)
importFrom(stats,coef)
importFrom(stats,median)
importFrom(stats,rnorm)
importFrom(stats,runif)
importFrom(utils,installed.packages)
importFrom(stats,var)
importFrom(utils,object.size)
importFrom(utils,packageVersion)
importFrom(utils,setTxtProgressBar)
importFrom(utils,txtProgressBar)
useDynLib(gedi, .registration = TRUE)
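
The NAMESPACE change above swaps `export(print.<class>)` for `S3method(print, <class>)`, which registers each method for dispatch instead of exporting it as an ordinary function. As a rough sketch of the source-side pattern that produces such a directive with roxygen2, assuming a made-up class `demo_result` that does not exist in gedi:

```r
# Hypothetical constructor for an illustrative S3 class.
new_demo_result <- function(x) {
  structure(list(x = x), class = "demo_result")
}

#' Print a demo_result
#'
#' @param x A `demo_result` object.
#' @param ... Unused; present for compatibility with the print generic.
#' @return `x`, invisibly.
#' @export
print.demo_result <- function(x, ...) {
  # Writing to stdout with cat() is expected inside a print method.
  cat("<demo_result> with", length(x$x), "values\n")
  invisible(x)
}
```

With a plain `@export` tag on a function that roxygen2 recognizes as an S3 method, the generated NAMESPACE line is `S3method(print, demo_result)`. Users then call `print(obj)` or simply autoprint the object; they do not call `print.demo_result()` directly.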
29 changes: 29 additions & 0 deletions NEWS.md
@@ -0,0 +1,29 @@
# gedi 2.3.1

## CRAN Compliance

* Replace all `cat()` / `print()` calls with `message()` or `warning()` for
suppressible console output.
* Add `verbose` parameters to `seurat_to_gedi()`, `gedi_to_seurat()`,
`check_optional_dependencies()`, and `install_optional_dependencies()`.
* Replace `cat()`-based progress bars with `txtProgressBar()` across imputation
and training routines.
* Replace `installed.packages()` with `requireNamespace()` for dependency
checking.
* Move `hdf5r` from Imports to Suggests (optional dependency for H5AD I/O).
* Add `ggplot2` and `scales` to Imports; add `uwot` and `digest` to Suggests.
* Use CRAN-required two-line LICENSE format.
* Add `@return` documentation tags to all exported and documented functions.
* Replace non-ASCII characters in C++ source files with ASCII equivalents.
* Remove redundant Maintainer field from DESCRIPTION.

# gedi 2.3.0

* Initial public release with C++ backend and R6 interface.
* Support for multiple data modalities (count matrices, paired data, binary
indicators).
* Latent variable model with block coordinate descent optimization.
* Dimensionality reduction, batch correction, and imputation.
* Differential expression and pathway association analysis.
* H5AD file I/O for Python interoperability.
* Seurat and SingleCellExperiment integration.
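
The 2.3.1 entries above describe two related patterns: status text goes through `message()` behind a `verbose` flag, and hand-rolled `cat()` progress output is replaced by `txtProgressBar()`. A minimal sketch of both together follows; `process_samples()` is a hypothetical stand-in, and the squaring loop is a placeholder for real per-sample work.

```r
# Hypothetical worker illustrating suppressible console output.
process_samples <- function(n_samples, verbose = TRUE) {
  if (verbose) {
    # message() writes to stderr and can be silenced with suppressMessages().
    message("Processing ", n_samples, " samples")
  }

  # txtProgressBar() requires max > min, so skip the bar for empty input.
  show_bar <- verbose && n_samples > 0
  if (show_bar) {
    pb <- utils::txtProgressBar(min = 0, max = n_samples, style = 3)
    on.exit(close(pb), add = TRUE)  # close the bar even if the loop errors
  }

  out <- numeric(n_samples)
  for (i in seq_len(n_samples)) {
    out[i] <- i^2  # placeholder for the real computation
    if (show_bar) utils::setTxtProgressBar(pb, i)
  }
  out
}
```

Calling `process_samples(5, verbose = FALSE)` produces no console output at all. Note that `suppressMessages()` alone silences the `message()` line but not the progress bar, which is why the bar is also tied to `verbose`.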
36 changes: 18 additions & 18 deletions R/RcppExports.R
@@ -216,7 +216,7 @@ run_factorized_svd_cpp <- function(Z, projDB, verbose = 0L) {
#' @return Dense matrix (J x Ni) - residual Yi after removing sample effects
#'
#' @details
#' Computes: Yi - QiDBi - (si ⊗ 1ᵀ) - (o + oi) ⊗ 1ᵀ
#' Computes: Yi - QiDBi - (si (x) 1^T) - (o + oi) (x) 1^T
#' This leaves only the ZDBi component (shared biological signal).
#'
#' @keywords internal
@@ -238,7 +238,7 @@ Yi_resZ <- function(Yi, QiDBi, si, o, oi) {
#' @return Dense matrix (J x Ni) - predicted Yi
#'
#' @details
#' Computes: Ŷi = ZDBi + QiDBi + (si ⊗ 1ᵀ) + (o + oi) ⊗ 1ᵀ
#' Computes: Y_hati = ZDBi + QiDBi + (si (x) 1^T) + (o + oi) (x) 1^T
#' This is the full model prediction for log-expression.
#'
#' @keywords internal
@@ -262,7 +262,7 @@ predict_Yhat <- function(ZDBi, QiDBi, si, o, oi) {
#'
#' This comes from the Poisson-lognormal model where:
#' - Mi ~ Poisson(exp(Yi))
#' - Yi ~ N(Ŷi, sigma2)
#' - Yi ~ N(Y_hati, sigma2)
#'
#' The posterior variance decreases where:
#' - Counts are high (exp(Yi) large)
@@ -292,7 +292,7 @@ Yi_var_M <- function(Yi, sigma2) {
#'
#' This comes from the binomial-logistic-normal model where:
#' - M1i ~ Binomial(M1i + M2i, p)
#' - logit(p) = Yi ~ N(Ŷi, sigma2)
#' - logit(p) = Yi ~ N(Y_hati, sigma2)
#'
#' The variance depends on:
#' - Total counts M (more counts = less variance)
@@ -324,7 +324,7 @@ Yi_var_M_paired <- function(Yi, M1i, M2i, sigma2) {
#' for downstream dispersion analysis (binning and aggregation done in R).
#'
#' Memory optimization: Works only on sparse nonzero positions (typically 5-10% of matrix).
#' For 30K genes × 10K cells with 5% nonzero: samples from ~15M positions instead of 300M.
#' For 30K genes x 10K cells with 5% nonzero: samples from ~15M positions instead of 300M.
#'
#' @keywords internal
#' @noRd
@@ -362,7 +362,7 @@ Yi_SSE_M_paired <- function(Yi, M1i, M2i, ZDBi, QiDBi, si, o, oi, sigma2) {
#'
#' @param feature_weights Vector of length K (factor loadings for this feature)
#' @param D Scaling vector of length K
#' @param Bi_list List of sample-specific cell projection matrices (K × Ni each)
#' @param Bi_list List of sample-specific cell projection matrices (K x Ni each)
#' @param verbose Integer verbosity level
#'
#' @return Vector of length N (total cells) with projected values
@@ -378,12 +378,12 @@ compute_feature_projection <- function(feature_weights, D, Bi_list, verbose = 0L
#' Projects multiple features through the GEDI model simultaneously.
#' Computes: (feature_weights * D) %*% B for F features.
#'
#' @param feature_weights Matrix K × F (factor loadings for F features)
#' @param feature_weights Matrix K x F (factor loadings for F features)
#' @param D Scaling vector of length K
#' @param Bi_list List of sample-specific cell projection matrices (K × Ni each)
#' @param Bi_list List of sample-specific cell projection matrices (K x Ni each)
#' @param verbose Integer verbosity level
#'
#' @return Matrix N × F with projected values for each feature
#' @return Matrix N x F with projected values for each feature
#'
#' @keywords internal
#' @noRd
@@ -420,18 +420,18 @@ aggregate_vectors <- function(Dim1, Dim2, To1, To2, color, alpha, n_bins, min_pe
#' the concatenation of all sample-specific Bi matrices. This is the main
#' integrated representation of cells in the GEDI latent space.
#'
#' @param Z Shared metagene matrix (J × K), where J = genes, K = latent factors
#' @param Z Shared metagene matrix (J x K), where J = genes, K = latent factors
#' @param D Scaling vector (length K) representing the importance of each factor
#' @param Bi_list List of sample-specific cell projection matrices, where each
#' Bi is a K × Ni matrix (Ni = number of cells in sample i)
#' Bi is a K x Ni matrix (Ni = number of cells in sample i)
#' @param verbose Integer verbosity level:
#' \itemize{
#' \item 0: Silent (no output)
#' \item 1: Progress bar and summary statistics
#' \item 2: Detailed per-sample information
#' }
#'
#' @return Dense matrix ZDB of dimensions J × N, where N = sum(Ni) is the total
#' @return Dense matrix ZDB of dimensions J x N, where N = sum(Ni) is the total
#' number of cells across all samples. Each column represents a cell in the
#' integrated latent space.
#'
@@ -441,17 +441,17 @@ aggregate_vectors <- function(Dim1, Dim2, To1, To2, color, alpha, n_bins, min_pe
#'
#' Computational strategy:
#' \enumerate{
#' \item Pre-compute ZD = Z * diag(D) once (saves K×J×numSamples operations)
#' \item Pre-compute ZD = Z * diag(D) once (saves KxJxnumSamples operations)
#' \item For each sample i: compute ZD * Bi and concatenate
#' \item Use Eigen block operations for efficient memory access
#' }
#'
#' Memory: Allocates one J × N dense matrix. For large datasets (e.g., 20k genes,
#' Memory: Allocates one J x N dense matrix. For large datasets (e.g., 20k genes,
#' 50k cells), this requires ~8 GB RAM. Consider computing projections on demand
#' or working with subsets if memory is limited.
#'
#' Performance: OpenMP parallelization available if enabled during compilation.
#' Typical speed: ~100-200ms for 20k × 5k dataset on modern CPU.
#' Typical speed: ~100-200ms for 20k x 5k dataset on modern CPU.
#'
#' @keywords internal
#' @noRd
@@ -467,10 +467,10 @@ compute_ZDB_cpp <- function(Z, D, Bi_list, verbose = 0L) {
#'
#' @param D Scaling vector (length K) representing factor importance
#' @param Bi_list List of sample-specific cell projection matrices, where each
#' Bi is a K × Ni matrix
#' Bi is a K x Ni matrix
#' @param verbose Integer verbosity level (0 = silent, 1 = progress, 2 = detailed)
#'
#' @return Dense matrix DB of dimensions K × N, where N = sum(Ni). Each column
#' @return Dense matrix DB of dimensions K x N, where N = sum(Ni). Each column
#' represents a cell's coordinates in the latent factor space.
#'
#' @details
@@ -488,7 +488,7 @@ compute_ZDB_cpp <- function(Z, D, Bi_list, verbose = 0L) {
#' \item Apply diagonal scaling: diag(D) * B
#' }
#'
#' Memory: Much smaller than ZDB (K × N vs. J × N). For K=10 and N=50k cells,
#' Memory: Much smaller than ZDB (K x N vs. J x N). For K=10 and N=50k cells,
#' requires only ~4 MB vs. ~8 GB for ZDB when J=20k.
#'
#' Performance: Very fast (~10-50ms) since K << J typically.
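
The roxygen comments in this file describe the ZDB projection as "pre-compute ZD once, multiply by each Bi, concatenate" and quote memory figures for the resulting dense matrices. The sketch below restates that strategy in plain R as a reading aid. It is not the package's C++ implementation; `compute_zdb_sketch()` and `dense_bytes()` are hypothetical names, and the byte estimate assumes 8-byte doubles with no overhead.

```r
# Plain-R restatement of the documented strategy (illustrative only).
# Z: J x K matrix, D: length-K vector, Bi_list: list of K x Ni matrices.
compute_zdb_sketch <- function(Z, D, Bi_list) {
  stopifnot(ncol(Z) == length(D))
  # Scale column k of Z by D[k]; equivalent to Z %*% diag(D), computed once.
  ZD <- sweep(Z, 2, D, `*`)
  # One J x Ni block per sample, then column-bind into a J x N matrix.
  blocks <- lapply(Bi_list, function(Bi) ZD %*% Bi)
  do.call(cbind, blocks)
}

# Approximate size of a dense double-precision matrix, in bytes.
dense_bytes <- function(nrow, ncol) nrow * ncol * 8

dense_bytes(20000, 50000)  # ZDB for J = 20k, N = 50k: 8e9 bytes, about 8 GB
dense_bytes(10, 50000)     # DB for K = 10, N = 50k: 4e6 bytes, about 4 MB
```

The two `dense_bytes()` calls reproduce the ~8 GB and ~4 MB figures quoted in the comments, which is the reason to prefer the K x N matrix DB over the J x N matrix ZDB when only latent coordinates are needed.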
4 changes: 2 additions & 2 deletions R/gedi-package.R
@@ -55,7 +55,6 @@
#' @seealso
#' \itemize{
#' \item \code{\link{CreateGEDIObject}}: Create a GEDI model
#' \item \code{\link{GEDI}}: R6 class documentation
#' }
#'
#' @examples
@@ -86,5 +85,6 @@
#' @import R6
#' @import Matrix
#' @importFrom methods as is
#' @importFrom stats rnorm runif
#' @importFrom stats coef median rnorm runif var
#' @importFrom utils object.size setTxtProgressBar txtProgressBar
NULL