Add median normalization functionality #170

Merged
scheidec merged 10 commits into SomaLogic:main from
scheidec:add-medNorm
Mar 16, 2026

Conversation

@scheidec
Contributor

Summary

Implements median normalization for study samples in a soma_adat object, including the ability to reverse existing normalization before normalizing to an external reference, and comprehensive console messaging for users.

Key Changes

  • Adds medianNormalize() function: Performs median normalization on soma_adat objects with support for internal/external references, grouped normalization, and comprehensive data validation. The function normalizes study samples only (SampleType == "Sample").
  • Adds reverseMedianNormalize() function: Reverses existing median or ANML normalization, resets scale factors to 1.0, and removes normalization metadata.
  • Includes robust validation and console messaging: Checks the existing data state, including required processing steps (HybNorm, PlateScale) and any existing normalization.
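To make the idea behind the first bullet concrete, here is a minimal sketch of median normalization against an internal or external reference. It is written in Python for illustration only and is not the package's R implementation. The function name `median_normalize`, the plain-dict data layout, and the assumption that one scale factor is computed per sample and dilution group as the median of reference/sample ratios are all hypothetical choices made for this example.

```python
from statistics import median


def median_normalize(samples, dilution_of, reference=None):
    """Illustrative median normalization (hypothetical helper, not the package API).

    samples:     list of dicts mapping analyte id -> RFU value
    dilution_of: dict mapping analyte id -> dilution group label
    reference:   optional dict mapping analyte id -> reference RFU;
                 if None, an internal reference is computed as the
                 per-analyte median across the supplied samples
    """
    analytes = list(dilution_of)
    if reference is None:
        # Internal reference: per-analyte median across all samples.
        reference = {a: median(s[a] for s in samples) for a in analytes}

    # Group analytes by dilution so each group gets its own scale factor.
    groups = {}
    for a in analytes:
        groups.setdefault(dilution_of[a], []).append(a)

    normalized, scale_factors = [], []
    for s in samples:
        out = dict(s)
        sf = {}
        for dil, members in groups.items():
            # Assumed scale factor: median of reference/sample ratios in the group.
            sf[dil] = median(reference[a] / s[a] for a in members)
            for a in members:
                out[a] = s[a] * sf[dil]
        normalized.append(out)
        scale_factors.append(sf)
    return normalized, scale_factors
```

Passing an explicit `reference` dict corresponds loosely to the external-reference case described above, while omitting it derives the reference from the samples themselves.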

scheidec and others added 10 commits March 16, 2026 10:14
- added new medianNormalize() function for median normalization
  of soma_adat objects
- supports different methods for passing a reference: calculating an
  internal reference from the provided soma_adat with the ability to
  select specific samples, or passing an external ADAT, tab- or
  comma-delimited file, or data.frame with Dilution and Reference columns
- includes validation for required hybridization and plate scale
  normalization steps
- updated documentation to provide guidance on what type of
  data the function can be applied to
- added data validation for hybridization normalization, plate
  scale factors, and existing processing
- implemented `reverse_existing` parameter to handle
  already-normalized data
- added dilution count validation, accepting 1 or 3, along with
  enhanced error messaging
- now recalculates RowCheck to adjust for new MedNorm values
- now adds `medNormSMP_ReferenceRFU` field to SeqId annotation to
  denote the median normalization reference
- now updates the header metadata to add `crossplate` entry to
  `MedNormReference` field
- use subset of the example data for unit tests
  to reduce execution time
- adjusted defaults to sample-only processing (SampleType == Sample)
- added ANML normalization reversal support
- simplified external reference to data.frame-only input
- now require only SeqId and Reference columns for external reference data
- added medNormSMP_ReferenceRFU annotation field with 2 decimal precision
- updated ProcessSteps formatting to use "MedNormSMP" with reversal tracking
- removed sample selection parameters (ref_field, ref_value,
  do_field, do_regexp) in favor of simplification in alignment
  with default of using study samples only
- removed `reverse_existing` parameter from `medianNormalize()`,
  which now requires an unnormalized adat input
- updated `medianNormalize()` to now point to
  reverseMedianNormalize() if an already normalized
  adat is passed
- properly handle data extraction from multiple scale factor columns
- added logic to preserve soma_adat attributes during rbind operations
- improved SampleType matching precision and clearing of
  ANMLFractionUsed column in reverseMedianNormalize()
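
The reversal behavior described in these commits (undoing an existing normalization and resetting scale factors to 1.0) can be sketched as follows. This is an illustrative Python sketch, not the package's R code. It assumes that normalized values were produced by multiplying raw RFUs by a per-sample, per-dilution scale factor, so reversal divides by that factor. The name `reverse_median_normalize` and the dict-based layout are hypothetical.

```python
def reverse_median_normalize(normalized, scale_factors, dilution_of):
    """Illustrative reversal of a scale-factor normalization (hypothetical helper).

    normalized:    list of dicts mapping analyte id -> normalized RFU
    scale_factors: list of dicts mapping dilution label -> applied scale factor
    dilution_of:   dict mapping analyte id -> dilution group label
    """
    restored, reset = [], []
    for sample, sf in zip(normalized, scale_factors):
        # Assumed: normalized = raw * scale factor, so divide to recover raw RFUs.
        restored.append(
            {a: rfu / sf[dilution_of[a]] for a, rfu in sample.items()}
        )
        # After reversal, every scale factor is reset to 1.0.
        reset.append({dil: 1.0 for dil in sf})
    return restored, reset
```

Under these assumptions, reversing and then normalizing again with the same reference returns the values that were there before the reversal.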
scheidec merged commit 019a321 into SomaLogic:main Mar 16, 2026
3 checks passed