CellPhe provides functions to phenotype cells from time-lapse videos and accompanies the paper:
Wiggins, L., Lord, A., Murphy, K.L. et al.
The CellPhe toolkit for cell phenotyping using time-lapse imaging and pattern recognition.
Nat Commun 14, 1854 (2023).
https://doi.org/10.1038/s41467-023-37447-3
You can install the latest version of CellPhe from GitHub with:
# install.packages("devtools")
devtools::install_github("uoy-research/CellPhe")Included with the package is an example dataset to demonstrate CellPhe’s
capabilities, this data is available in example_data.zip and comprises
3 parts:
- The time-lapse stills as TIFF images (
05062019_B3_3_imagedata) - Existing pre-extracted features
(
05062019_B3_3_Phase-FullFeatureTable.csv) - Region-of-interest (ROI) boundaries already demarked in ImageJ
format (
05062019_B3_3_Phase)
These should be extracted into a suitable location before proceeding with the rest of the tutorial.
library(CellPhe)The first step is to prepare a dataframe containing metadata and any
pre-existing attributes. If PhaseFocus Livecyte or Trackmate software
has been used to generate the region-of-interest (ROI) files, then a
helper function is available to create the required metadata format:
copyFeatures. The dataframe format comprises each row corresponding to
a cell tracked in a given frame, indexed by columns FrameID and
CellID which contain numerical identifiers (NB: FrameID must be in
ascending chronological order). The only other required field is
ROI_filename, which specifies the filename of the ROI file
corresponding to the frame-cell combination. Any features can be
provided in additional columns, copyFeatures returns volume and
sphericity from PhaseFocus software.
The example below creates the metadata dataframe from a PhaseFocus experimental setup, only including cells that were tracked for at least 50 frames.
min_frames <- 50
input_feature_table <- "05062019_B3_3_Phase-FullFeatureTable.csv"
feature_table <- copyFeatures(input_feature_table, min_frames, source="Phase")In addition to any pre-calculated features, the extractFeatures()
function generates 74 descriptive features for each cell on every frame
using the frame images and pre-generated cell boundaries, based on size,
shape, texture, and the local cell density. The output is a dataframe
comprising the FrameID, CellID, and ROI_filename identifying
columns, the 74 features as columns, and any additional features that
may be present (such as from copyFeatures()) in further columns. The
program expects frames to be named according to the scheme
<experiment name>-<frameid>.tif, where <frameid> is a 4 digit
zero-padded integer corresponding to the FrameID column, and located
in the frame_folder directory, while ROI files are named according to
the ROI_filename column and located in the roi_folder directory.
roi_folder <- "05062019_B3_3_Phase"
image_folder <- "05062019_B3_3_imagedata"
new_features <- extractFeatures(feature_table, roi_folder, image_folder, framerate=0.0028)Variables are calculated from the time series for any pre-existing
features as well as the output of extractFeatures(), providing both
summary statistics and indicators of time-series behaviour at different
levels of detail obtained via wavelet analysis. 15 summary scores are
calculated for each feature, in addition to the cell trajectory, thereby
resulting in a default output of 1081 features (15x72 + 1). These are
output in the form of a dataframe with the first column being the
CellID used previously.
tsvariables <- varsFromTimeSeries(new_features)