binsegRcpp

Efficient implementation of the binary segmentation heuristic algorithm for changepoint detection, using C++ std::multiset. Also contains functions for comparing empirical time complexity to best/worst case.

Installation

    install.packages("binsegRcpp")
    ## OR
    if(!require("remotes"))install.packages("remotes")
    remotes::install_github("tdhock/binsegRcpp")

Usage

The main function is binseg, for which you must specify at least the first two arguments:

  • distribution.str specifies the loss function to minimize.
  • data.vec is a numeric vector of data to segment.
    > x <- c(0.1, 0, 1, 1.1, 0.1, 0)
    > (models.dt <- binsegRcpp::binseg("mean_norm", x))
    binary segmentation model:
       segments   end          loss validation.loss
          <int> <int>         <num>           <num>
    1:        1     6  1.348333e+00               0
    2:        2     4  1.015000e+00               0
    3:        3     2  1.500000e-02               0
    4:        4     3  1.000000e-02               0
    5:        5     5  5.000000e-03               0
    6:        6     1 -3.339343e-16               0

The result above summarizes the data that are computed during the binary segmentation algorithm. It has a special class with dedicated methods:

    > class(models.dt)
    [1] "binsegRcpp" "list"
    > methods(class="binsegRcpp")
    [1] coef  plot  print
    see '?methods' for accessing help and source code
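
To make the loss column of the binseg output more concrete, here is a minimal base R sketch of greedy binary segmentation with the square loss (sum of squared residuals around each segment mean). The names seg_loss and naive_binseg are hypothetical helpers, not part of binsegRcpp, and this naive version re-evaluates every candidate split at each step rather than using the std::multiset approach of the package.

```r
# Square loss of one segment: sum of squared residuals around its mean.
seg_loss <- function(v) sum((v - mean(v))^2)

# Hypothetical helper (not part of binsegRcpp): naive greedy binary
# segmentation. At each step, try every split of every current segment
# and keep the one that decreases the total loss the most.
naive_binseg <- function(x, max.segments = length(x)) {
  ends <- length(x)  # sorted end positions of the current segments
  loss <- seg_loss(x)
  out <- data.frame(segments = 1L, end = length(x), loss = loss)
  while (nrow(out) < max.segments) {
    starts <- c(1L, head(ends, -1) + 1L)
    best <- NULL
    for (i in seq_along(ends)) {
      s <- starts[i]
      e <- ends[i]
      if (s < e) for (cand in s:(e - 1L)) {
        # Change in total loss if segment s..e is split after cand.
        dec <- seg_loss(x[s:cand]) + seg_loss(x[(cand + 1L):e]) -
          seg_loss(x[s:e])
        if (is.null(best) || dec < best$dec) {
          best <- list(dec = dec, end = cand)
        }
      }
    }
    if (is.null(best)) break  # no segment can be split further
    ends <- sort(c(ends, best$end))
    loss <- loss + best$dec
    out <- rbind(out, data.frame(
      segments = nrow(out) + 1L, end = best$end, loss = loss))
  }
  out
}

x <- c(0.1, 0, 1, 1.1, 0.1, 0)
(naive.df <- naive_binseg(x))
```

The loss column of this sketch should agree with the binseg output up to rounding. The end column may differ, because several splits of this data set tie (for example, splitting after position 2 or after position 4 gives the same two-segment loss), and the sketch makes no attempt to break ties the way the package does.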

The coef method returns a data table of segment means:

    > coef(models.dt, segments=2:3)
       segments start   end start.pos end.pos  mean
          <int> <int> <int>     <num>   <num> <num>
    1:        2     1     4       0.5     4.5  0.55
    2:        2     5     6       4.5     6.5  0.05
    3:        3     1     2       0.5     2.5  0.05
    4:        3     3     4       2.5     4.5  1.05
    5:        3     5     6       4.5     6.5  0.05
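
The columns of the coef table can be recomputed by hand from the segment ends. The sketch below uses a hypothetical helper, segment_means, which is not part of binsegRcpp; it assumes that start.pos and end.pos are the start and end indices shifted by half a position, which is consistent with the values printed above.

```r
x <- c(0.1, 0, 1, 1.1, 0.1, 0)

# Hypothetical helper (not part of binsegRcpp): given the end positions
# of the segments, compute each segment's start, end, and mean.
segment_means <- function(x, ends) {
  ends <- sort(ends)
  starts <- c(1, head(ends, -1) + 1)
  data.frame(
    start = starts,
    end = ends,
    start.pos = starts - 0.5,  # assumed: half a position before start
    end.pos = ends + 0.5,      # assumed: half a position after end
    mean = mapply(function(s, e) mean(x[s:e]), starts, ends))
}

two.df <- segment_means(x, c(6, 4))       # ends of the 2-segment model
three.df <- segment_means(x, c(6, 4, 2))  # ends of the 3-segment model
print(two.df)
print(three.df)
```

The ends passed in are the first two and first three values of the end column in the binseg output, so the means should match the corresponding rows of the coef table.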

Demo of Poisson loss and non-uniform weights:

    > data.vec <- c(3,4,10,20)
    > (fit1 <- binsegRcpp::binseg("poisson", data.vec, weight.vec=c(1,1,1,10)))
    binary segmentation model:
       segments   end      loss validation.loss
          <int> <int>     <num>           <num>
    1:        1     4 -393.8437               0
    2:        2     3 -411.6347               0
    3:        3     2 -413.9416               0
    4:        4     1 -414.0133               0
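
The loss values in this demo are consistent with a weighted Poisson loss of the form sum(w * (m - x * log(m))), where m is the weighted mean of the segment. The following base R sketch checks that reading against the table; pois_loss and total_loss are hypothetical helpers, and the formula is inferred from the printed numbers rather than taken from the package source.

```r
data.vec <- c(3, 4, 10, 20)
weight.vec <- c(1, 1, 1, 10)

# Assumed form of the weighted Poisson loss for one segment.
pois_loss <- function(x, w) {
  m <- sum(w * x) / sum(w)  # weighted mean of the segment
  sum(w * (m - x * log(m)))
}

# Total loss of a segmentation described by its segment end positions.
total_loss <- function(ends) {
  ends <- sort(ends)
  starts <- c(1, head(ends, -1) + 1)
  sum(mapply(function(s, e) {
    pois_loss(data.vec[s:e], weight.vec[s:e])
  }, starts, ends))
}

# Ends taken from the first 1, 2, 3, and 4 rows of the binseg output.
loss.vec <- c(
  total_loss(4),
  total_loss(c(4, 3)),
  total_loss(c(4, 3, 2)),
  total_loss(c(4, 3, 2, 1)))
print(loss.vec)
```

Because the last data point has weight 10, it dominates the loss, which is why the first split isolates it in its own segment.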

Demo of change in mean and variance for normal distribution:

    > sim <- function(mu,sigma)rnorm(10000,mu,sigma)
    > set.seed(1)
    > data.vec <- c(sim(5,1), sim(0, 5))
    > fit <- binsegRcpp::binseg("meanvar_norm", data.vec)
    > coef(fit, 2L)
       segments start   end start.pos end.pos        mean       var
          <int> <int> <int>     <num>   <num>       <num>     <num>
    1:        2     1 10000       0.5 10000.5  4.99346296  1.024763
    2:        2 10001 20000   10000.5 20000.5 -0.02095033 24.538556
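
Since the detected change falls exactly at position 10000, the first row of the coef table can be compared with the empirical mean and variance of the first simulated block. The sketch below assumes that var is the maximum likelihood estimate (dividing by n rather than n - 1); that is a guess not confirmed by the source, and with 10000 points the two estimates differ only by a factor of 0.9999.

```r
sim <- function(mu, sigma) rnorm(10000, mu, sigma)
set.seed(1)
data.vec <- c(sim(5, 1), sim(0, 5))

# First segment reported by coef(fit, 2L): positions 1 to 10000.
first <- data.vec[1:10000]
first.mean <- mean(first)
# Assumed estimator: maximum likelihood variance (divide by n).
first.var <- mean((first - first.mean)^2)
print(c(mean = first.mean, var = first.var))
```

Both values should be close to the simulation parameters (mean 5, variance 1) and to the mean and var columns in the first row of the table.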

Related work

Other implementations of binary segmentation include changepoint::cpt.mean(method="BinSeg") (quadratic storage in max number of segments), BinSeg::BinSegModel() (same linear storage as binsegRcpp), and ruptures.Binseg() (unknown storage). Figures comparing their timings are also available.

This version uses the Rcpp/.Call interface whereas the binseg package uses the .C interface.

See branches for variations of the interface to use as test cases in RcppDeepState development.
