group1.mean and group2.mean do not fit with mean meth values per DMR (in-script)

Greetings,

I am analysing wgbs data from a total of 10 samples divided per 5 groups (n=2 per group). I calculated dmrs with the standart t.statistics for all pairwise comparisons.

For visualization in a heatmap, I generated a script (below) in which I generate a matrix with the mean methylation values of all CpGs within a dmr per sample. However, I noticed that the mean methylation values generated in my script do not match with the group.means in the t.stat output. In some cases, I see opposite behavior, a DMR that appears to be hypermethylated in one specific group in the heatmap, is actually hypomethylated in the same comparison in the t.stat results. Furthermore, I noticed that this behavior occurs often in DMRs with high intra-group variability.

Can you weight on this? Is this possible to happen with the t.stat method or am I missing some mistake in my workflow?

In-house script:
# Make genomic ranges for bsseq object

ranges <- as.data.frame(bsseq.fit@rowRanges)
ranges$seqnames <- gsub("\\.[0-9]+$", "", ranges$seqnames)
row.names(ranges) <- paste0(ranges$seqnames, "_", ranges$start, "_", ranges$end)
cpg_gr <- makeGRangesFromDataFrame(ranges,
                                   keep.extra.columns = FALSE,
                                   ignore.strand = TRUE,
                                   seqnames.field = "seqnames",
                                   start.field = "start",
                                   end.field = "end",
                                   strand.field = "strand",
                                   starts.in.df.are.0based = FALSE)
 
# Extract methylation matrix
Meth <- getMeth(bsseq.fit, type = "smooth") %>% as.data.frame()
colnames(Meth) <- bsseq.fit@colData$Sample
row.names(Meth) <- rownames(ranges)
 
# Create unique DMR names
dmr_results$Name <- paste0(dmr_results$chr, "_", dmr_results$start, "_", dmr_results$end)
dmr_dict <- split(dmr_results, dmr_results$Name)
dmr_list <- names(dmr_dict)
 
# Precompute overlaps
cpg_ids <- rownames(ranges)
cpg_gr$ID <- cpg_ids
dmr_gr <- makeGRangesFromDataFrame(dmr_results,
                                   seqnames.field = "chr",
                                   start.field = "start",
                                   end.field = "end",
                                   keep.extra.columns = TRUE,
                                   ignore.strand = TRUE)

dmr_gr$Name <- dmr_results$Name

overlaps <- findOverlaps(dmr_gr, cpg_gr, ignore.strand = TRUE)
overlap_list <- split(subjectHits(overlaps), queryHits(overlaps))

# Function using precomputed overlaps
calculate_dmr_means <- function(dmr_idx, dmr_list, Meth, overlap_list, cpg_ids) {
  ol <- overlap_list[[as.character(dmr_idx)]]
  if (!is.null(ol) && length(ol) > 0) {
    cpg_subset <- cpg_ids[ol]
   Bvalues_subset <- Meth[cpg_subset, , drop = FALSE]
   mean_beta <- colMeans(Bvalues_subset, na.rm = TRUE)
  } else {
    mean_beta <- rep(NA, ncol(Meth))
  }
  dmr_name <- dmr_list[dmr_idx]
  dmr_means <- as.data.frame(t(mean_beta))
 rownames(dmr_means) <- dmr_name
  colnames(dmr_means) <- colnames(Meth)
  return(dmr_means)
}
 
# Setup parallelism and run
message("Starting DMR Beta calculation...")
plan(multisession)
DMR_beta_list <- future_lapply(seq_along(dmr_list), calculate_dmr_means,
                               dmr_list = dmr_list,
                               Meth = Meth,
                               overlap_list = overlap_list,
                               cpg_ids = cpg_ids)
 
# Combine and write result
DMR_beta <- do.call(rbind, DMR_beta_list)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

group1.mean and group2.mean do not fit with mean meth values per DMR (in-script) #147

Make genomic ranges for bsseq object

Extract methylation matrix

Create unique DMR names

Precompute overlaps

Function using precomputed overlaps

Setup parallelism and run

Combine and write result

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

group1.mean and group2.mean do not fit with mean meth values per DMR (in-script) #147

Description

Make genomic ranges for bsseq object

Extract methylation matrix

Create unique DMR names

Precompute overlaps

Function using precomputed overlaps

Setup parallelism and run

Combine and write result

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions