Add median normalization functionality #170
Merged
scheidec merged 10 commits into SomaLogic:main on Mar 16, 2026
- added new medianNormalize() function for median normalization of soma_adat objects
- supports different methods for passing a reference: calculating an internal reference from the provided soma_adat with the ability to select specific samples, or passing an external ADAT, a tab- or comma-delimited file, or a data.frame with Dilution and Reference columns
- includes validation for required hybridization and plate scale normalization steps

- updated documentation to provide guidance on what type of data the function can be applied to
- added data validation for hybridization normalization, plate scale factors, and existing processing
- implemented `reverse_existing` parameter to handle already-normalized data
- added dilution count validation, accepting 1 or 3, along with enhanced error messaging
- now recalculates RowCheck to adjust for new MedNorm values
- now adds `medNormSMP_ReferenceRFU` field to SeqId annotation to denote the median normalization reference
- now updates the header metadata to add `crossplate` entry to `MedNormReference` field

- use a subset of the example data for unit tests to reduce execution time

- adjusted defaults to sample-only processing (SampleType == Sample)
- added ANML normalization reversal support
- simplified external reference to data.frame-only input
- now requires only SeqId and Reference columns for external reference data
- added medNormSMP_ReferenceRFU annotation field with 2 decimal precision
- updated ProcessSteps formatting to use "MedNormSMP" with reversal tracking

- removed sample selection parameters (ref_field, ref_value, do_field, do_regexp) to simplify the interface, in line with the default of using study samples only

- removed `reverse_existing` parameter from `medianNormalize()`, which now requires an unnormalized adat input
- updated `medianNormalize()` to point to reverseMedianNormalize() if an already normalized adat is passed

- properly handle data extraction from multiple scale factor columns
- added logic to preserve soma_adat attributes during rbind operations
- improved SampleType matching precision and clearing of ANMLFractionUsed column in reverseMedianNormalize()
Summary
Implements median normalization for study samples in a `soma_adat` object, including the ability to reverse existing normalization prior to normalizing to an external reference, and comprehensive console messaging for users.

Key Changes
- `medianNormalize()` function: Performs median normalization on soma_adat objects with support for internal/external references, grouped normalization, and comprehensive data validation. The function normalizes study samples only (SampleType == "Sample").
- `reverseMedianNormalize()` function: Reverses existing median or ANML normalization, also resetting scale factors to 1.0 and removing normalization metadata.
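
For readers unfamiliar with the technique, the following is a minimal, language-neutral sketch of median normalization and its reversal, written in Python rather than R. It is not the package's implementation. It assumes a common formulation in which each sample's scale factor is the median, across analytes, of the ratio of reference RFU to sample RFU, and it treats all analytes as one dilution group. All function names and the `seq.*` analyte keys are hypothetical.

```python
from statistics import median


def internal_reference(samples):
    """Build a per-analyte reference as the median RFU across samples."""
    analytes = samples[0].keys()
    return {a: median(s[a] for s in samples) for a in analytes}


def median_scale_factor(sample, reference):
    """Median over analytes of reference RFU / sample RFU (assumed formulation)."""
    ratios = [reference[a] / rfu for a, rfu in sample.items()
              if a in reference and rfu > 0]
    if not ratios:
        raise ValueError("no analytes shared with the reference")
    return median(ratios)


def median_normalize(samples, reference):
    """Scale each sample by its own factor; return new samples and the factors."""
    factors = [median_scale_factor(s, reference) for s in samples]
    normalized = [{a: rfu * sf for a, rfu in s.items()}
                  for s, sf in zip(samples, factors)]
    return normalized, factors


def reverse_median_normalize(normalized, factors):
    """Undo the scaling and reset every scale factor to 1.0."""
    restored = [{a: rfu / sf for a, rfu in s.items()}
                for s, sf in zip(normalized, factors)]
    return restored, [1.0] * len(factors)


# Hypothetical data: two samples, three analytes, one dilution group.
samples = [
    {"seq.1": 100.0, "seq.2": 200.0, "seq.3": 400.0},
    {"seq.1": 200.0, "seq.2": 400.0, "seq.3": 800.0},
]
reference = internal_reference(samples)
normalized, factors = median_normalize(samples, reference)
restored, reset_factors = reverse_median_normalize(normalized, factors)
```

In this toy example the second sample is uniformly twice as bright as the first, so normalization to the internal reference brings both onto the same scale, and reversing with the stored factors recovers the original values. Grouped normalization and per-dilution scale factors, which the PR describes, are omitted here for brevity.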