changepoint.online: an R package for online (streaming) changepoint detection
This repository is a fork of my GSoC 2018 project, rkillick/changepoint.online, and aims to keep the same public interface while maintaining the codebase.
Given a time-ordered sequence $x_1, \dots, x_n$, the package detects the following kinds of change:
- change in mean,
- change in variance,
- change in mean and variance,
- (optionally) a nonparametric notion of change via energy-style statistics (“ECP”).
The package provides online versions of the familiar changepoint workflow by splitting the offline calls into:
- an initialisation step (create state), then
- repeated update steps (append new data, update estimates).
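Install from GitHub (in R):
# install.packages("remotes")
remotes::install_github("AndrewC1998/changepoint.online")

Or build and install from a local checkout (in a shell):
R CMD build .
R CMD INSTALL changepoint.online_*.tar.gz

Example: change in mean
library(changepoint.online)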
set.seed(1)
x <- c(rnorm(50, 0, 1), rnorm(50, 5, 1), rnorm(50, 10, 1), rnorm(50, 3, 1))
y <- c(rnorm(50, 15, 1), rnorm(50, 25, 1), rnorm(50, 33, 1), rnorm(50, 7, 1))
state <- ocpt.mean.initialise(x) # initialise on initial batch
state <- ocpt.mean.update(state, y) # update with new batch
cpts(state) # estimated changepoint locations (end-of-segment indices)

Example: change in variance
set.seed(1)
x <- c(rnorm(100, 0, 1), rnorm(100, 0, 3))
state <- ocpt.var.initialise(x, test.stat="Normal")
state <- ocpt.var.update(state, rnorm(100, 0, 0.5))
cpts(state)

Example: nonparametric (ECP) change
set.seed(1)
x <- matrix(c(rnorm(100, 50, 1), rnorm(100, 5, 1)), ncol=1)
y <- matrix(c(rnorm(100, 15, 1), rnorm(100, 25, 1)), ncol=1)
state <- ocpt.np.initialise(x)
state <- ocpt.np.update(state, y, K=3)
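state

Most routines here can be understood through the standard penalised segmentation problem:
Choose changepoints $0 = \tau_0 < \tau_1 < \dots < \tau_m < \tau_{m+1} = n$ to minimise

$$\sum_{i=1}^{m+1} \mathcal{C}(x_{\tau_{i-1}+1:\tau_i}) + \beta m,$$

where $\mathcal{C}$ is a segment cost (for example, twice the negative log-likelihood of the segment) and $\beta > 0$ is the penalty per changepoint.
For exact multiple changepoints, PELT solves the recursion

$$F(t) = \min_{0 \le s < t} \left[ F(s) + \mathcal{C}(x_{s+1:t}) + \beta \right], \qquad F(0) = -\beta,$$

with a pruning rule that removes candidate last-changepoint locations $s$ when they can no longer be optimal. This yields the exact minimiser of the penalised cost, with expected computational cost linear in $n$ in the offline setting under the conditions given by Killick et al. (2012).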
In this package, PELT.online operates on summary statistics so that segment costs can be computed quickly.
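
The effect of the pruning rule can be seen in a minimal base-R sketch for a change in mean with a residual-sum-of-squares cost. This is not the package's PELT.online code: the function name `pelt_mean_sketch`, its arguments, and the `evals` counter are hypothetical, the penalty value is only illustrative, and the pruning constant is taken to be zero, which is valid for this cost because splitting a segment never increases its cost.

```r
# Sketch only: PELT for a change in mean with a residual-sum-of-squares cost.
# pelt_mean_sketch and its arguments are illustrative, not package functions.
pelt_mean_sketch <- function(x, beta, prune = TRUE) {
  n  <- length(x)
  S1 <- c(0, cumsum(x))    # S1[t + 1] = sum(x[1:t])
  S2 <- c(0, cumsum(x^2))  # S2[t + 1] = sum(x[1:t]^2)
  Fopt  <- c(-beta, rep(NA_real_, n))  # Fopt[t + 1] = F(t), with F(0) = -beta
  last  <- integer(n)                  # optimal last changepoint before each t
  cand  <- 0L                          # candidate last-changepoint locations
  evals <- 0L                          # number of segment costs evaluated
  for (t in seq_len(n)) {
    # Cost of x[(s + 1):t] for every candidate s, from the cumulative sums.
    seg  <- (S2[t + 1] - S2[cand + 1]) -
            (S1[t + 1] - S1[cand + 1])^2 / (t - cand)
    vals <- Fopt[cand + 1] + seg + beta
    best <- which.min(vals)
    Fopt[t + 1] <- vals[best]
    last[t]     <- cand[best]
    evals <- evals + length(cand)
    # Pruning: drop s once F(s) + C(x[(s + 1):t]) > F(t); it cannot win later.
    if (prune) cand <- cand[vals - beta <= Fopt[t + 1]]
    cand <- c(cand, t)
  }
  # Backtrack through the stored last changepoints.
  cpts <- integer(0)
  t <- n
  while (t > 0) {
    if (last[t] > 0) cpts <- c(last[t], cpts)
    t <- last[t]
  }
  list(cpts = cpts, evals = evals)
}

set.seed(1)
x    <- c(rnorm(100, 0, 1), rnorm(100, 5, 1), rnorm(100, 10, 1))
beta <- 2 * log(length(x))  # illustrative penalty choice
pruned   <- pelt_mean_sketch(x, beta, prune = TRUE)
unpruned <- pelt_mean_sketch(x, beta, prune = FALSE)
pruned$cpts
c(pruned = pruned$evals, unpruned = unpruned$evals)
```

Both calls return the same segmentation, since pruning only discards candidates that cannot be optimal; the difference shows up in the number of segment costs evaluated.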
At time $t$, the state object stores:
- cumulative summary statistics needed to compute $\mathcal{C}(x_{s+1:t})$ fast,
- the current best segmentation according to the chosen method,
- any tuning constants (penalty, minseglen, distribution choice, etc.).
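
To see why cumulative summary statistics make segment costs cheap, here is a small base-R sketch for a Normal change in mean and variance. The two-column layout of the matrix and the helper names `make_sumstat` and `cost_meanvar` are assumptions made for illustration; the package's own summary-statistic layout is not documented here and may differ.

```r
# Sketch only: the column layout below is an assumption, not the package's.
# Row t + 1 holds the cumulative sums over x[1:t]; row 1 is the empty prefix.
make_sumstat <- function(x) {
  cbind(c(0, cumsum(x)), c(0, cumsum(x^2)))
}

# Twice the negative Normal log-likelihood of x[(s + 1):t], with the mean and
# variance replaced by their maximum-likelihood estimates on that segment.
cost_meanvar <- function(sumstat, s, t) {
  len <- t - s
  sx  <- sumstat[t + 1, 1] - sumstat[s + 1, 1]
  sx2 <- sumstat[t + 1, 2] - sumstat[s + 1, 2]
  sigma2 <- max(sx2 / len - (sx / len)^2, 1e-10)  # floor avoids log(0)
  len * (log(2 * pi) + log(sigma2) + 1)
}

set.seed(1)
x  <- c(rnorm(100, 0, 1), rnorm(100, 0, 3))
ss <- make_sumstat(x)

# Direct computation on the raw segment, for comparison.
seg    <- x[101:200]
direct <- -2 * sum(dnorm(seg, mean(seg), sqrt(mean((seg - mean(seg))^2)),
                         log = TRUE))
all.equal(cost_meanvar(ss, 100, 200), direct)
```

Each cost needs only two row lookups, whatever the segment length, which is what lets the recursion be advanced quickly as data arrive.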
When new data $x_{t+1}, \dots, x_{t+k}$ arrive, the update step:
- extends the summary statistics,
- recomputes the relevant dynamic program quantities for the new horizon,
- returns an updated ocpt* object with updated changepoint estimates.
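
The three update steps can be sketched in base R for a change in mean. The helpers `sketch_initialise`, `sketch_update`, and `sketch_cpts` and the fields of the state list are hypothetical stand-ins, not the package's ocpt.* functions or its object layout; the cost is a residual sum of squares and the penalty is only illustrative.

```r
# Sketch only: hypothetical helpers, not the package's ocpt.* functions.
# The state carries cumulative sums, optimal costs, and surviving candidates.
sketch_update <- function(state, newdata) {
  for (xt in newdata) {
    t <- length(state$last) + 1L
    # 1. Extend the summary statistics by one observation.
    state$S1 <- c(state$S1, state$S1[t] + xt)
    state$S2 <- c(state$S2, state$S2[t] + xt^2)
    # 2. Advance the dynamic program to the new horizon t.
    s    <- state$cand
    seg  <- (state$S2[t + 1] - state$S2[s + 1]) -
            (state$S1[t + 1] - state$S1[s + 1])^2 / (t - s)
    vals <- state$Fopt[s + 1] + seg + state$beta
    best <- which.min(vals)
    state$Fopt <- c(state$Fopt, vals[best])
    state$last <- c(state$last, s[best])
    # Keep only candidates that can still be optimal, then add t itself.
    state$cand <- c(s[vals - state$beta <= vals[best]], t)
  }
  state
}

# Initialisation is an update applied to an empty state.
sketch_initialise <- function(data, beta) {
  empty <- list(S1 = 0, S2 = 0, Fopt = -beta, last = integer(0),
                cand = 0L, beta = beta)
  sketch_update(empty, data)
}

# 3. Read off the current changepoint estimates by backtracking.
sketch_cpts <- function(state) {
  cpts <- integer(0)
  t <- length(state$last)
  while (t > 0) {
    if (state$last[t] > 0) cpts <- c(state$last[t], cpts)
    t <- state$last[t]
  }
  cpts
}

set.seed(1)
x <- c(rnorm(50, 0, 1), rnorm(50, 5, 1))
y <- c(rnorm(50, 5, 1), rnorm(50, 10, 1))
state <- sketch_initialise(x, beta = 2 * log(200))
state <- sketch_update(state, y)
sketch_cpts(state)
```

In this sketch, initialising on `x` and then updating with `y` performs exactly the same computations as initialising on `c(x, y)`, so the batch boundaries do not affect the result.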
When test.stat="ECP" (or using the explicit ocpt.np.*), the algorithm switches to a nonparametric change statistic based on distances between observations in different segments (with a moment index alpha and window size delta exposed as tuning parameters). This is useful when parametric assumptions (Normal/Poisson/etc.) are questionable.
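
As a rough illustration of a distance-based change statistic, the sketch below computes a simple (V-statistic) form of the energy divergence between two samples and scans for the split that maximises it. It is written in base R under stated assumptions: `energy_stat`, `best_split`, and the `min_size` argument are hypothetical names, and this is not the package's ECP implementation or its treatment of the window size.

```r
# Sketch only: a V-statistic form of the energy divergence between two samples.
# energy_stat and best_split are illustrative names, not package functions.
energy_stat <- function(x, y, alpha = 1) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  m <- nrow(x)
  n <- nrow(y)
  d  <- as.matrix(dist(rbind(x, y)))^alpha  # pairwise Euclidean distances^alpha
  ix <- seq_len(m)
  iy <- m + seq_len(n)
  between  <- mean(d[ix, iy])  # average distance across the two samples
  within_x <- mean(d[ix, ix])  # average distance inside the first sample
  within_y <- mean(d[iy, iy])  # average distance inside the second sample
  (m * n / (m + n)) * (2 * between - within_x - within_y)
}

# Scan every split that leaves at least min_size points on each side.
best_split <- function(z, min_size = 10, alpha = 1) {
  z <- as.matrix(z)
  N <- nrow(z)
  taus  <- min_size:(N - min_size)
  stats <- vapply(taus, function(tau) {
    energy_stat(z[1:tau, , drop = FALSE], z[(tau + 1):N, , drop = FALSE], alpha)
  }, numeric(1))
  taus[which.max(stats)]
}

set.seed(1)
z <- c(rnorm(100, 0, 1), runif(100, -4, 4))  # distribution changes after t = 100
best_split(z)
```

No distributional family is specified anywhere in the statistic, which is the appeal of this approach when Normal or Poisson assumptions are doubtful.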
Mean changes
- ocpt.mean.initialise(data, ...) / alias ocpt.mean.initialize(...)
- ocpt.mean.update(previousanswer, newdata)

Variance changes
- ocpt.var.initialise(data, ...) / alias ocpt.var.initialize(...)
- ocpt.var.update(previousanswer, newdata)

Mean + variance changes
- ocpt.meanvar.initialise(data, ...) / alias ocpt.meanvar.initialize(...)
- ocpt.meanvar.update(previousanswer, newdata)

Nonparametric / ECP
- ocpt.np.initialise(data, ...) / alias ocpt.np.initialize(...)
- ocpt.np.update(previousanswer, newdata, K)

Low-level PELT
- PELT.online(sumstat, pen, cost_func, ...): a low-level PELT step operating on a matrix of summary statistics (exported for developers; no argument checking).
- PELT.online.initialise(...), PELT.online.update(...): convenience wrappers used internally when method="PELT".
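
The initialise/update pairs listed above lend themselves to a loop over incoming batches. The sketch below uses only calls that appear in the examples earlier in this README (ocpt.mean.initialise, ocpt.mean.update, cpts); the batch size of 50 and the variable names are arbitrary choices, and the package calls are skipped when changepoint.online is not installed.

```r
set.seed(1)
stream  <- c(rnorm(100, 0, 1), rnorm(100, 5, 1))
# Split the stream into consecutive batches of 50 observations.
batches <- split(stream, ceiling(seq_along(stream) / 50))

if (requireNamespace("changepoint.online", quietly = TRUE)) {
  library(changepoint.online)
  # Initialise on the first batch, then feed the remaining batches one by one.
  state <- ocpt.mean.initialise(batches[[1]])
  for (b in batches[-1]) {
    state <- ocpt.mean.update(state, b)
    print(cpts(state))  # changepoint estimates after each batch
  }
}
```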
- Killick, R., Fearnhead, P., & Eckley, I. A. (2012). Optimal detection of changepoints with a linear computational cost. JASA, 107(500), 1590–1598.
GPL (see DESCRIPTION / LICENSE if present).