- Added `medianNormalize()` function
  - performs median normalization of study samples on `soma_adat` objects that have been hybridization normalized and plate scaled
  - includes validation to ensure required normalization steps have been applied
  - supports multiple reference approaches:
    - Internal reference built from study samples
    - Reference extracted from existing `soma_adat` object
    - External reference as a data.frame
  - supports custom grouping by multiple clinical variables
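
As a rough illustration of the general idea of median normalization toward a reference, the base R sketch below computes one scale factor per sample as the median ratio of reference to measured values and then applies it. The matrix `rfu`, the vector `ref`, and the helper `median_norm_sketch()` are hypothetical, and the sketch ignores validation and any grouping of samples or analytes. It is not the implementation of `medianNormalize()`.

```r
# Hypothetical toy data: 3 samples (rows) x 4 analytes (columns)
rfu <- matrix(
  c(100, 200, 400,  800,
    200, 400, 800, 1600,
     50, 100, 200,  400),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("analyte", 1:4))
)

# Hypothetical reference value for each analyte
ref <- c(analyte1 = 100, analyte2 = 200, analyte3 = 400, analyte4 = 800)

median_norm_sketch <- function(x, reference) {
  # ratio of reference to measurement, analyte by analyte (column-wise)
  ratios <- sweep(x, 2, reference, function(a, b) b / a)
  # one scale factor per sample: the median ratio across its analytes
  sf <- apply(ratios, 1, median)
  # sf has one entry per row, so recycling multiplies each row by its factor
  list(scale_factors = sf, normalized = x * sf)
}

res <- median_norm_sketch(rfu, ref)
res$scale_factors
res$normalized
```

In this toy input every sample is an exact multiple of the reference, so each normalized row ends up equal to `ref`.
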
- Added `reverseMedianNormalize()` function
  - reverses median normalization (including ANML) that was previously applied to study samples (`SampleType == "Sample"`)
  - designed to work with standard SomaScan deliverable ADAT files where study samples have undergone median or ANML normalization as the final sample processing step
- Removed restrictive file validation from `read_annotations()`
  - removed `md5sum` checksum validation and version dictionary checks, which resulted in misleading warnings about unknown annotations files
  - the warning was often misleading, as menu annotations file updates are not always aligned with the timing of CRAN releases
  - removed `getAnnoVer()` function and `ver_dict` object
  - removed `tools::md5sum` import dependency
- Updated `preProcessAdat()` to improve clarity
  - `filter.qc` parameter has been renamed to `filter.rowcheck`
  - for backward compatibility, `filter.qc` is still accepted but will generate a deprecation warning and will be removed in a future version
  - removed language that discusses implication of `ColCheck` SeqIds not being removed from `preProcessAdat()` output
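
A renamed argument with a backward-compatible alias is often handled along the lines of the base R sketch below. The function `pre_process_sketch()`, its body, and the `RowCheck == "PASS"` filter are hypothetical and only illustrate the pattern. They are not the code inside `preProcessAdat()`.

```r
pre_process_sketch <- function(x, filter.rowcheck = TRUE, filter.qc = NULL) {
  if (!is.null(filter.qc)) {
    # old argument was supplied: warn, then honor its value
    warning("`filter.qc` is deprecated; please use `filter.rowcheck`.",
            call. = FALSE)
    filter.rowcheck <- filter.qc
  }
  if (filter.rowcheck) {
    # assumed column name and value for the row-level QC flag
    x <- x[x$RowCheck == "PASS", , drop = FALSE]
  }
  x
}

toy <- data.frame(id = 1:3, RowCheck = c("PASS", "FLAG", "PASS"))

# new argument name: flagged row is dropped
n_new <- nrow(pre_process_sketch(toy))

# old argument name still works (warning silenced here for the demo)
n_old <- nrow(suppressWarnings(pre_process_sketch(toy, filter.qc = FALSE)))
```

Keeping the old name as a `NULL` default makes it possible to tell "not supplied" apart from an explicit `TRUE` or `FALSE`.
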
- Updated `read_annotations()` function and `ver_dict` object to recognize latest menu annotations files from Q3 2025
- updated statistical analysis workflow articles with clearer messaging comments around `preProcessAdat()`
- updated two-group and three-group analysis workflow articles to set `center.scale = FALSE` in `preProcessAdat()` to align with guidance to users for univariate analysis
- Updated `dplyr` verb tests to no longer explicitly test for ordering of attributes (#165)
- Added `updateColMeta()` function
  - added utility function to update the column metadata in an ADAT to match an annotations object from `read_annotations()`
- Added `is.AptName()` to check for AptName format
  - `is.apt()` may return TRUE for both SeqIds and AptNames
  - wrote new function, `is.AptName()`, to explicitly check if a SomaScan identifier is an AptName
  - `is.AptName()` code and tests based on code/tests from `is.SeqId()`
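
The distinction between the two identifier styles can be sketched with plain regular expressions. The helpers `looks_like_seqid()` and `looks_like_aptname()` and both patterns are hypothetical simplifications, assuming SeqIds look like `1234-56` and AptNames look like `seq.1234.56`. They are not the package's `is.SeqId()` or `is.AptName()`.

```r
# Assumed formats (illustrative only):
#   SeqId:   "1234-56"
#   AptName: "seq.1234.56"
looks_like_seqid   <- function(x) grepl("^[0-9]{4,5}-[0-9]{1,3}$", x)
looks_like_aptname <- function(x) grepl("^seq\\.[0-9]{4,5}\\.[0-9]{1,3}$", x)

ids <- c("seq.1234.56", "1234-56", "SampleId")

apt_hits <- looks_like_aptname(ids)  # only the first element matches
seq_hits <- looks_like_seqid(ids)    # only the second element matches
```
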
- Relaxed file version requirement in `read_annotations()`
  - now returns warning instead of error if annotations Excel file version does not match specific values
- Updated `preProcessAdat()` to handle missing `ColCheck`
  - added logic to `preProcessAdat()` to handle adats that are missing `ColCheck` in column annotation data
- Show package loading in stat workflow vignettes
  - move `library()` calls loading third-party packages to new code chunk that now appears in analysis workflow articles
- Added `preProcessAdat()` function
  - added new function `preProcessAdat()` to filter features, filter samples, generate data QC plots of normalization scale factors by covariates, and perform standard analyte RFU transformations including log10, centering, and scaling
- Added `calcOutlierMap()` function
  - added `calcOutlierMap()` and its print and plot S3 methods, along with `getOutlierIds()` for identifying sample level outliers from outlier map object
  - added `ggplot2` as a package dependency
- Added `ex_clin_data` object
  - a `tibble` object with additional sample annotation fields `smoking_status` and `alcohol_use` to demonstrate merging to a `soma_adat` object
- Added pre-processing vignette article
  - includes guidance on pre-processing SomaScan data for a typical analysis
  - provides an example of recommended workflow of filtering features, filtering samples, performing data QC checks, and transformations of RFU features
  - introduces usage of the `preProcessAdat()` function
- Improved adat ingest documentation in `README`
  - added comments to clarify file path input to `read_adat()` example in `README`
- Updated stat workflow articles to begin with reading in adat
  - updated data preparation chunks with comments about how to download and read in the `example_data.adat` object
  - data preparation chunks now use the `preProcessAdat()` function for pre-processing
- Added sample annotation merging guidance
  - updated `README` and loading and wrangling vignette article with section including code to join the `ex_clin_data` object to the `example_data` adat
- Added helper utility functions for snapshot plot unit tests
  - added helper utility functions `figure()`, `close_figure()`, `save_png()`, and `expect_snapshot_plot()` for saving plot snapshot output to `testthat/helper.R`
  - added snapshot unit tests for `preProcessAdat()` messaging, print and QC plot output
- Added `calc_eLOD()` function (#131)
  - calculates the estimated limit of detection (eLOD) for SeqId columns of an input `soma_adat` or `data.frame`
- Fixed `crayon` bug and `ui_bullet()` issue (#129, #130)
  - removed `crayon` and `usethis` as dependencies in favor of `cli`
  - fixed bug in R version 4.4.1 with `ui_bullet()` internal calls within `loadAdatsAsList()` and `write_adat()`
- Fixed bug in `Summary.soma_adat()` operations (#121)
  - these operations: `min()`, `max()`, `any()`, `range()`, etc. would return the incorrect value due to an `as.matrix()` conversion under the hood
  - now skips that conversion, trips a warning, and carries on
  - triggers an error if non-numerics are passed as part of the '...' outside of a `soma_adat`, just like `Summary.data.frame()`
- `collapseAdats()` now maintains `Cal.Set` entries of `Col.Meta` (#113)
  - collapsing ADATs can be problematic for the attributes, especially for large numbers of ADATs
  - `collapseAdats()` now attempts to smartly merge the (potentially numerous) `Col.Meta` attribute elements in the final object, preserving the "Cal.Set" and "ColCheck" columns in particular
  - the resulting `Col.Meta` attribute is a combined product of the individual ADAT elements, and the intersect of the analyte features (as is the case for the `rbind()` that is called)
- Updated checksums and versions for Annotations Excel files (#116)
  - updated the 7k and 11k file versions and md5sum checksums
  - now allows `read_annotations()` to load the individual Excel files
- Updated `lift_master` object to alpha sort columns
- Updated company name, license year, and maintainer (#137)
  - SomaLogic Operating Co., Inc is now Standard BioTools, Inc.
  - updated license and copyright year to 2025
  - updated package maintainer to Caleb Scheidel
- Updated article links in README, intro vignette (#123)
  - updated links to articles in README and introduction vignette to URLs to pkgdown website rather than `vignette()` code references
  - added clarification to above documents that articles are available on website only rather than traditional vignettes included with package
- Updates to example documentation
  - `read_annotations()` example documentation now points to the most recent 11k Excel annotations file
  - `parseHeader()` example now prints list elements separately, rather than full object, which slowed website rendering
- Updates to GitHub Action workflows
  - added `rhub.yaml` configuration file to comply with `rhub` v2
  - updated macOS version in `pkgdown.yaml` to macOS-14
  - added write permission to `pkgdown.yaml` file to enable deployment
  - changed GitHub Action R checks to macOS and Windows only
    - `ubuntu` machine was taking too long to build
- Increased package test coverage
  - added unit tests for `getSomaScanLiftCCC()`, `parseCheck()` and release utilities which were previously untested
  - increased test coverage for `pivotExpressionSet()`
- Added missing package anchors to .Rd files (#139)
  - fixed note from remote windows check related to Rd `\link{}` targets missing package anchors
- Updated README badge (#109)
  - now shows 'downloads' per month over total downloads
- Fixed link in DESCRIPTION; master -> main (#107)
- Major restructure of `lift_adat()` functionality (@stufield, #81, #78)
  - `lift_adat()` now takes a `bridge =` argument, replacing the `anno.tbl =` argument. Lifting is now performed internally for a better (and safer) user experience, without the necessity of an external annotations (Excel) file.
  - the majority of this refactoring was internal and the user should not experience a major disruption to the API.
  - much improved lifting/bridging documentation (#82)
- Added a new lifting and bridging vignette (@stufield, #77)
  - in addition to the improved lifting documentation this new vignette provides additional context, explanation, clear examples, and lifting guidance.
- `is_lifted()` is new and returns a boolean according to whether the signal space (RFU) has been previously lifted
- Lifting accessor function for Lin's CCC values (#88)
  - `getSomaScanLiftCCC()` accesses the lifting correlations between SomaScan versions for each analyte
  - returns a `tibble` split by sample matrix (serum or plasma)
- `merge_clin()` is newly exported (#80)
  - a thin wrapper that allows users to merge clinical variables to `soma_adat` objects easily
  - previously users had to either use the CLI merge tool or merge in clinical variables themselves with `dplyr`
- Newly exported ADAT "get*" helpers (#83)
  - functions to access properties of ADATs
    - `getAdatVersion()`
    - `getSomaScanVersion()`
    - `getSignalSpace()`
    - `checkSomaScanVersion()`
  - `getAdatVersion()` gets a new S3 method (#92)
    - this enables passing of different objects
    - namely `soma_adat` or `list` depending on the situation
- Newly exported functions that were previously internal only:
  - `addAttributes()`
  - `addClass()`
  - `cleanNames()`
- The package `README` is now simplified (#35)
  - example analysis workflows are now split out into their own vignettes/articles and cross-linked in the `README`
- Reorganization and expansion of statistical vignettes (#35, #47)
  - moved 3 existing statistical examples from `README` into their own vignettes
  - resulting in four new "Statistical Workflow" vignettes/articles:
    - Binary classification via logistic regression
    - Linear regression for continuous variables
    - Two-group comparison via t-test
    - Three-group analysis ANOVA
- Added new general analysis workflow vignettes
  - articles for the pkgdown website have been built out
  - new articles on:
    - safely mapping values among variables
    - safely renaming a data frame
    - loading-and-wrangling
    - typical train and test data splits
    - beginning the FAQs and/or Coming Soon pages
- Added a new vignette describing how to use the command-line interface merge tool (#45)
  - the new CLI merge tool is used to add new clinical data to an existing ADAT file
- `collapseAdats()` better combines `HEADER` information (#86)
  - certain information, e.g. `PlateScale` and `Cal*`, are better maintained in the final collapsed ADAT
  - other entries are combined by pasting into a single string
  - should result in less duplication of superfluous entries and retention of more "useful" `HEADER` information in the resulting (collapsed) `soma_adat`
- Update `read_annotations()` with `11k` content (#85)
- Update `transform()` and `scaleAnalytes()`
  - `scaleAnalytes()` (internal) now skips missing references and is much more like a "step" in the `recipes` package
  - `transform()` gets edge case protection with `drop = FALSE` in case a single-analyte `soma_adat` is scaled.
- New `row.names()` S3 method support for `soma_adat` class
  - dispatched on calls to `rownames()`
  - rather than calling `NextMethod()` which normally would invoke `data.frame`, we now force the `data.frame` method in case there are `tbl_df` or `grouped_df` classes present that would be dispatched. Those are bypassed in favor of the `data.frame` because `tbl_df` 1) can nuke the attributes, 2) triggers a warning about adding rownames to a `tibble`.
- New `grouped_df` S3 print support for the grouped `soma_adat`
  - now displays Grouping information from a call to the S3 print method for `soma_adat` class
- New `grouped_df` S3 method support for `soma_adat` class (#66)
  - `grouped_df` data objects were previously unsupported and were interfering with downstream S3 methods for `dplyr` verbs once `NextMethod()` was called
  - this support now ensures that the group methods are maintained, as well as the `soma_adat` class itself (and most importantly, with its attributes intact)
- `tidyr::separate.soma_adat()` S3 method was simplified (#72)
  - now uses `%||%` helper internally
  - expanded error messages inside `stopifnot()` to be more informative
- `is_intact_attr()` is now much quieter, signaling only when called indirectly (#71)
  - new conditional logic to silence signaling messages when called from within another function (indirectly)
  - these previously led to confusing messages when they appeared in wrappers, where `is_intact_attr()` can be, sometimes deeply, nested in the call stack
- Development and improvements to the `pkgdown` website
  - added new links and improved clarity in YAML
  - added new logo at footer
  - restyled side bar for easier hyperlinking and getting help
  - clicking on the SomaLogic logo in the GitHub `README` now links to the `pkgdown` website
  - new "Coming Soon" drop-down section in the website header to let users know about active progress (but not yet ready for external publication)
- `SomaDataIO` no longer depends on `desc` package
  - to generate the `README.md`
- Internal rowname helpers were upgraded
  - they now use internal cross-functions as originally intended, to avoid redundancy and to improve efficiency and debugging
- `sysdata.rda` no longer contains non-exported functions (#59)
  - new internal helper functions:
    - `convertColMeta()`
    - `genRowNames()`
    - `parseCheck()`
    - `syncColMeta()`
    - `scaleAnalytes()`
- Bug-fix for corner-case writing a single-analyte ADAT (#51)
  - RFU values are rounded to 1 decimal place when written by `write_adat()`, via a call to `apply()`, which expects a 2-dim object when replacing those values.
  - `write_adat()` no longer uses `apply()` and instead converts the entire RFU data frame to a matrix (maintains original dimensions), and uses vectorized format conversion via `sprintf()`
  - in theory this should be faster because `sprintf()` is only called once on a long vector, rather than 1000s of times on shorter vectors (inside `apply()`).
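
The vectorized approach described above can be reproduced with base R alone. This is a simplified sketch on a hypothetical one-analyte data frame, not the actual `write_adat()` code. The `"%0.1f"` format string is an assumption chosen to match the 1-decimal rounding mentioned in the entry.

```r
# Hypothetical single-analyte RFU table (3 samples, 1 analyte)
rfu <- data.frame(seq.1234.56 = c(1234.567, 89.01, 456.78))

# Convert once to a matrix; the 3 x 1 dimensions are kept
m <- as.matrix(rfu)

# One vectorized sprintf() call over all values; assigning into m[]
# keeps the matrix shape while the values become formatted strings
m[] <- sprintf("%0.1f", m)

dim(m)
m[, 1]
```

Because the data stay two-dimensional throughout, the same code path works for one analyte or for thousands.
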
- Fixed missing closing parenthesis in `SomaScanObjects.R` (thanks @Hijinx725!, #40)
- We are now on CRAN! 🥳
- New clinical data merge CLI tool (@stufield, #25)
  - `Rscript --vanilla merge_clin.R` for merging clinical variables into existing `*.adat` SomaScan data files
  - added 2 new example `meta.csv` and `meta2.csv` files to run examples with random data but with valid index keys
  - see `dir(system.file("cli", "merge", package = "SomaDataIO"))`
- Package data objects (@stufield, #32)
  - `example_data.adat` was reduced in size to `n = 10` samples (from 192) to conform to CRAN size requirements (< 5MB)
  - the current file was renamed `example_data10.adat` to reflect this change
  - this likely has far-reaching consequences for users who access this flat file via `system.file()`
  - the `example_data` object itself however remains true to its original file (https://github.com/SomaLogic/SomaLogic-Data/blob/master/example_data.adat)
  - the directory location `inst/example/` was renamed `inst/extdata/` to conform to CRAN package standard naming conventions
  - the file `single_sample.adat` was removed from package data as it is now redundant (however still used in unit testing)
  - `SomaDataObjects` was renamed and is now `SomaScanObjects`
- Gradual deprecation (@stufield)
  - `read.adat()` is now soft-deprecated; please use `read_adat()` instead
  - lifecycle for soft-deprecated `warn()` -> `stop()` for functions that have been soft deprecated since `v5.0.0`
    - `getSomamers()`
    - `getSomamerData()`
    - `meltExpressionSet()`
- New S3 print method default (@stufield)
  - `tibble` has new `max_extra_cols =` argument, which is set to `6` for the `print.soma_adat` method
- New S3 merge method (@stufield, #31)
  - calling `base::merge()` on a `soma_adat` is strongly discouraged
  - we now redirect users to use `dplyr::*_join()` alternatives which are designed to preserve `soma_adat` attributes
- Code hardening for `prepHeaderMeta()` (@stufield)
  - some ADATs do not have `CreatedDate` and `CreatedBy` in the HEADER entry. This currently breaks the writer
  - simplified to make more robust but also refactor to be more convenient (for abnormal ADATs not generated by standard SomaScan processing)
  - `CreatedDateHistory` was removed as an entry from written ADATs
  - `CreatedByHistory` was combined and dated for written ADATs
  - `NULL` behavior remains if keys are missing
  - `CreatedBy` and `CreatedDate` will be generated either as new entries or over-written as appropriate
- Numerous non-user-facing (API) changes for internal package maintenance, efficiency, and structural upgrades were included
- Bug-fix release related to `write_adat()`:
  - fixed bug in `write_adat()` that resulted from adding/removing clinical (non-SomaScan) variables to an ADAT. Export via `write_adat()` resulted in a broken ADAT file (@stufield, #18)
  - `write_adat()` now has much higher fidelity to original text file (`*.adat`) in full-cycle read-write-read operations; particularly in presence of bangs (`!`) in the Header section and in floating point decimals in the `?Col.Meta` section
  - `write_adat()` no longer converts commas (`,`) to semi-colons (`;`) in the `?Col.Meta` block (originally introduced to avoid cell alignment issues in `*.csv` formats)
  - `write_adat()` no longer concatenates written ADATs, when writing to the same file. Data is over-written to file to avoid mangled ADATs resulting from re-writing to the same connection and to match the default behavior of `write.table()`, `write.csv()`, etc.
- `read_adat()` now has more consistent character type
  - the `Barcode2` variable in standard ADATs now forces `character` class and does not allow R's `read.delim()` to "guess" the type
- Decreased dependency of `magrittr` pipes (`%>%`) in favor of the native R pipe (`|>`). As a result the package now depends on `R >= 4.1.0`
  - `SomaDataIO` will continue to re-export `magrittr` pipes for backward compatibility, but this should not be considered permanent. Please code accordingly
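
For readers migrating their own code, the two pipes are interchangeable in simple chains. The snippet below uses only base R (R >= 4.1.0) with hypothetical values, and shows the `magrittr` form as a comment for comparison.

```r
x <- c(1, 4, 9)

# native pipe, available from R 4.1.0 onward
total <- x |> sqrt() |> sum()

# magrittr equivalent (requires the magrittr package to be attached):
# total <- x %>% sqrt() %>% sum()

total
```
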
- Migration to the default branch in GitHub from `master` -> `main` (@stufield, #19)
- Numerous non-user-facing (API) changes for internal package maintenance, efficiency, and structural upgrades were included
- Upgrades primarily from improvements to SomaLogic internal code base, including: (@stufield)
  - general reduction on external package dependency to improve code stability
  - internal usage of base R alternatives to the `readr` package for parsing and importing ADATs (e.g. `read.delim()` over `readr::read_delim()`). This is mostly for code simplification, but can often result in marked speed improvements. As the SomaScan plex size increases, this speed improvement will become more important.
  - `parseHeader()` was dramatically simplified, now reading in lines 20L at a time until the RFU block is reached. In addition, once the block is reached, all header lines are read-in once and indexed (as opposed to line-by-line).
  - `read_adat()` now specifies column types via `colClasses =` which for the majority of the ADAT is type `double` for the RFU columns. This should dramatically improve speed of ingest.
  - `write_adat()` was simplified internally, with fewer nested `apply` and for-loops.
  - encoding for all input/output (I/O) is assumed to be `UTF-8`.
- New `getAnalytes()` S3 method for class `recipe` from the `recipes` package.
- New `loadAdatsAsList()` to load multiple ADAT files in a single call and optionally collapse them into a single data frame (@stufield, #8).
- New `getTargetNames()` function to map ADAT `seq.XXXX.XX` names to corresponding protein targets from the annotations table
- SomaLogic Inc. is now SomaLogic Operating Co. Inc.
- Added new documentation regarding `Col.Meta` (@stufield, #12).
  - documentation around column meta data, row meta data, where they are found in an ADAT, and how to access them.
- Research Use Only ("RUO") language was added to the README (@stufield, #10).
- Numerous internal code improvements from SomaLogic code-base (@stufield)
  - these consisted of reducing usage of external dependencies, e.g. using `stop()` over `ui_stop()` and `warning()` over `ui_warn()`, using `usethis`, `cli`, and `crayon` shims aliases.
  - package uses `purrr` very selectively and no longer uses `stringr`.
  - using base R alternatives in favor of increased stability for underlying, non-user-facing code.
- New `lift_adat()` was added to provide 'lifting' functionality (@stufield, #11)
  - provides mechanism to convert RFU space between SomaScan versions (e.g. v4.1 -> v4.0).
  - added new S3 `transform.soma_adat()` method which simplifies linear scaling of `soma_adat` columns (analytes).
  - uses an "Annotations file" (Excel) as source of scalars for transformation.
- Minor improvements and updates to the `README.Rmd` (@stufield, #7)
  - fixed a broken `adat2eSet()` link in README (#5).
  - clearer text to the `README` regarding `Biobase` installation.
  - added new links to external Bioconductor website in installation section of README.
  - new `pkgdown` and links to Issues (#4).
  - SomaLogic logo was added to README.
  - a lifecycle ("maturing") badge was added.
- Startup message was improved with dynamic width (@stufield).
- New `locateSeqId()` function to pull out `SeqId` regex. (@stufield).
- New `read_annotations()` function (@stufield, #2)
  - new function to parse/import SomaLogic annotations files (`*.xlsx`).
- New `set_rn()` drop-in replacement for `magrittr::set_rownames()`
- `getFeatures()` was renamed to be less ambiguous and better align with internal SomaLogic code usage. Now use `getAnalytes()` (@stufield)
- `getFeatureData()` was also renamed to `getAnalyteInfo()` (@stufield)
- various upgrades as required by code changes in external package dependencies, e.g. `tidyverse`.
- new alias for `read_adat()`, `read.adat()`, for backward compatibility to previous versions of `SomaDataIO` (@stufield)
- Initial public release to GitHub!