From 04c67b43bab165bd0c0004d15dad02e6d9fe2f07 Mon Sep 17 00:00:00 2001 From: killick Date: Tue, 8 Nov 2022 11:51:59 +0000 Subject: [PATCH] Updating to CRAN version 2.2.4 Bringing Github back in sync with CRAN versioning - my bad. --- DESCRIPTION | 14 +- NEWS | 63 +-- R/cpt.class.R | 47 ++- R/exp.R | 10 +- R/gamma.R | 10 +- R/multiple.nonparametric.R | 8 +- R/multiple.norm.R | 12 +- R/poisson.R | 10 +- R/single.nonparametric.R | 12 +- R/single.norm.R | 16 +- inst/CITATION | 2 +- man/changepoint-package.Rd | 4 +- man/cpts--methods.Rd | 4 + man/cpts-methods.Rd | 5 + man/cpts.Rd | 5 +- src/BinSeg_one_func_minseglen.c | 387 +++++++++--------- src/PELT_one_func_minseglen.c | 226 +++++----- .../diagnostic-plot-default.svg | 55 +++ .../diagnostic-plot-histogram.svg | 187 +++++++++ tests/testthat/test-plot.R | 16 + 20 files changed, 674 insertions(+), 419 deletions(-) create mode 100644 tests/figs/plot-diagnostic-true-tests/diagnostic-plot-default.svg create mode 100644 tests/figs/plot-diagnostic-true-tests/diagnostic-plot-histogram.svg create mode 100644 tests/testthat/test-plot.R diff --git a/DESCRIPTION b/DESCRIPTION index c115717..4c07b2b 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,28 +1,28 @@ Package: changepoint Type: Package Title: Methods for Changepoint Detection -Version: 2.2.3 -Date: 2022-03-08 +Version: 2.2.4 +Date: 2022-10-31 Authors@R: c(person("Rebecca", "Killick", role=c("aut","cre"),email="r.killick@lancs.ac.uk"), person("Kaylea", "Haynes", role="aut"), person("Idris", "Eckley", role=c("ths")), - person("Paul","Fearnhead",role=c("ctb","ths")), person("Jamie","Lee",role="ctr")) + person("Paul","Fearnhead",role=c("ctb","ths")), person("Robin", "Long", role="ctb"),person("Jamie","Lee",role="ctr")) Maintainer: Rebecca Killick BugReports: https://github.com/rkillick/changepoint/issues URL: https://github.com/rkillick/changepoint/ Description: Implements various mainstream and specialised changepoint methods for finding single and multiple changepoints within data. Many popular non-parametric and frequentist methods are included. The cpt.mean(), cpt.var(), cpt.meanvar() functions should be your first point of call. Depends: R(>= 3.1), methods, stats, zoo(>= 0.9-1) -Suggests: testthat +Suggests: testthat, vdiffr License: GPL -LazyLoad: yes LazyData: true -Packaged: 2022-03-14 14:14:13 UTC; killick +Packaged: 2022-10-31 14:14:13 UTC; killick NeedsCompilation: yes Repository: CRAN -Date/Publication: 2022-03-14 15:50:02 UTC +Date/Publication: 2022-10-31 15:50:02 UTC Author: Rebecca Killick [aut, cre], Kaylea Haynes [aut], Idris Eckley [ths], Paul Fearnhead [ctb, ths], + Robin Long [ctb], Jamie Lee [ctr] diff --git a/NEWS b/NEWS index 331f686..1bcee24 100755 --- a/NEWS +++ b/NEWS @@ -1,7 +1,12 @@ +Version 2.2.4 +============= +* Updated C code to be compliant with new CRAN flags +* Updated check for 1D objects to also check for 1D arrays. Thanks to Github user ChristopherEeles for highlighting the bug. + Version 2.2.3 ============= * Updated documentation to reflect that the changepoint is the last observation of a segment, not the first observation of a new segment as previously described. Thanks to Alex Bertram for reporting this. -* Added fuctionality for cpt.range into cpts to allow specification of ncpts. Use as cpts(object,ncpts=X) to retrive the X changepoints segmentation from object which has a cpt.range class. Thanks to Craig Faunce for the suggestion. +* Added fuctionality for cpt.range into cpts to allow specification of ncpts. Use as cpts(object,ncpts=X) to retrive the X changepoints segmentation from an object which has cpt.range class. Thanks to Craig Faunce for the suggestion. * Corrected parameter estimation for cpt.reg class so that sig2 contains the variance MLE per segment. This addresses a consistency issue with lm when evaluating logLik when no changepoints are found. Thanks to Simon Taylor for spotting this. * Correction bug in parameter estimation for cpt.var with cpt.range class when ncpts!=NA. Previously this was using seg.len() function which returned NA for CROPS and incorrect segments lengths for BinSeg. Thanks to David Krider for asking a question which led to this bug being identified. * Added "type=" functionality to the "diagnostic=TRUE" graphs. Thanks go to github user hoxo_m for providing the code and tests for this. @@ -93,14 +98,10 @@ Version 2.0 Version 1.1.5 ============= -Changes to previous version - * Modified citation information Version 1.1.3 ============= -Changes to previous version - * Upgraded zoo from "imports" to "depends". This is needed because more functionality from zoo is required and this also gives users access to the zoo features directly whereas previously they had to library the package separately. * The segment neighbourhood algorithm has now been updated to make sure that the changes in variance or mean and variance do not return segments of length 1. This is now in line with the PELT and BinSeg algorithms. * Added citation information. @@ -108,66 +109,46 @@ Changes to previous version Version 1.1.2 ============= -Changes to previous version - * The check introduced in version 1.1.1 was causing problems in older versions as there was a mask between S3 and S4 generics. Thus we removed the masking effect and incremented the minimum version of R to 3.0. Thanks go to Philipp Boersch-Supan for highlighting the bug. * Added Kaylea Haynes as author of package. Version 1.1.1 ============= -Changes to previous version - * Added a check on the generics used to check if the generics "coef" and "logLik" exist and if not to create them. This was creating problems with backwards R compatibility as "coef" and "logLik" were only introduced as generics in recent versions of R. Version 1.1 =========== -Changes to previous version - * Adding printing of p-value to single changepoint likelihood-based functions. Thanks go to anonymous reviewers for this suggestion. * Changed all Segment Neighbourhood and Binary Segmentation functions to print a warning if the number of changes/segments identified equals Q. Version 1.0.6 ============= -Changes to previous version - * Fixed bug where the likelihood & logLik generics returns an error when there are no changes. This was due to using the cpts generic which no longer returns the length of the data as a change (since version 1.0). Thanks go to Gia-Thuy Pham for highlighting this bug. * Corrected calculation of segment scale parameter estimates for Gamma models (used in cpt.meanvar(...,test.stat='Gamma')). Previously this was dividing by seg.length+1 when it should be dividing by seg.length. Thanks go to Martyn Byng for highlighting the error. Version 1.0.5 ============= -Changes to previous version - * Added journal references to PELT documentation. Thanks go to Martyn Byng for pointing out that this required updating. * Added checks on minimum lengths of data input (minimum for mean is 1 per segment, for variance / mean & variance it is 2 observations per segment). Thanks go to Toby Hocking for highlighting the error. Version 1.0.4 ============= -Changes to previous version - * Changed all the Binary Segmentation multiple changepoint algorithms to have a default oldmax parameter of Inf. The old default (1000) was causing problems when users were wanting to do elbow plots for penalty choice. Thanks go to Brian Jessop for highlighting the problem. Note: There was no problem with the changepoints returned by the previous code. This change purely relates to elbow plot penalty choice. Version 1.0.3 ============= -Changes to previous version - * Changed a few commands where na.rm=F to na.rm=T. This was causing problems when users entered constant data as some test statistics could produce NAs. Thanks go to Harutyun Khachatryan for pointing out this bug. Version 1.0.2 ============= -Changes to previous version - * Changed argument name in cpt.mean, cpt.var and cpt.meanvar. Previously was named 'dist', now named 'test.stat'. Not backwards compatible. This is in line with the major changes in version 1.0 to the class structure. Version 1.0.1 ============= -Changes to previous version - * Corrected naming of columns returned by logLik and likelihood generics. These return the scaled negative likelihood and scaled negative likelihood + penalty. Version 1.0 =========== -Changes to previous version - * Removed printing and plotting of the last observation as the last cpt * The distribution class slot for the cpt class has been rename to test.stat (backwards compatibility remains - for now) * Changed value parameter in all functions to be pen.value (no backwards compatibility for named variables but ordering of arguments remains the same) @@ -181,61 +162,43 @@ Changes to previous version Version 0.8 =========== -Changes to previous version - * Restructured C implementation of PELT to remove unnecessary function calls (minor speed improvement) * Updated FTSE100 data * Added Lai2005fig3 and Lai2005fig4 data Version 0.7.1 ============= -Changes to previous version - * added BIC1, SIC1, AIC1, Hannan-Quinn1 penalties (counting the changepoint as a parameter in contrast to BIC, SIC, AIC, Hannan-Quinn which do not) Version 0.7 =========== -Changes to previous version - * added change in Poisson distribution to cpt.meanvar Version 0.6.1 ============= -Changes to previous version - * removed plotting of "n" as a changepoint for changes in variance * changed array allocation to calloc in PELT C code * added checks to C code for user interruption (including appropriate memory handling) Version 0.6 =========== -Changes to previous version - * PELT algorithms now run using C code * Binary Segmentation algorithms now run using C code Version 0.5 =========== -Changes to previous version - * added namespace to comply with R 2.13 Version 0.4.2 ============= -Changes to previous version - * corrected mismatch in default values for mu Version 0.4.1 ============= -Changes to previous version - * changed default value of the mu parameter from -1000 to NA Version 0.4 =========== -Changes to previous version - * removed regression for Normal distributed errors * added cusum for change in mean * added cusum of squares for change in variance @@ -244,26 +207,18 @@ Changes to previous version Version 0.3 =========== -Changes to previous version - * added change in regression for Normal distributed errors Version 0.2.1 ============= -Changes to previous version - * fixed multiple optimal changepoint bug in multiple algorithms Version 0.2 =========== -Changes to previous version - * added change in scale parameter of Gamma distribution to cpt.meanvar * added change in Exponential distribution to cpt.meanvar * added ncpts function to cpt and cpt.reg classes -Version 0.1: Original - -Rebecca Killick -Lancaster University -www.lancs.ac.uk/~killick +Version 0.1 +=========== +* Original diff --git a/R/cpt.class.R b/R/cpt.class.R index 5a394e1..d145d97 100644 --- a/R/cpt.class.R +++ b/R/cpt.class.R @@ -142,7 +142,7 @@ if (is.function("cpts")){ fun <- cpts } - else {fun <- function(object){ + else {fun <- function(object,...){ standardGeneric("cpts") } } @@ -150,7 +150,19 @@ } setMethod("cpts","cpt",function(object) object@cpts[-length(object@cpts)]) setMethod("cpts","cpt.reg",function(object) object@cpts[-length(object@cpts)]) + setMethod("cpts","cpt.range",function(object,ncpts=NA){ + if(is.na(ncpts)){return(object@cpts[-length(object@cpts)])} + else{ + ncpts.full=apply(cpts.full(object),1,function(x){sum(x>0,na.rm=TRUE)}) + row=try(which(ncpts.full==ncpts),silent=TRUE) + if(inherits(row,'try-error')){ + stop("Your input object doesn't have a segmentation with the requested number of changepoints.\n Possible ncpts are: ",paste(ncpts.full,collapse=',')) + } + else{return(cpts.full(object)[row,])} + } + }) + if(!isGeneric("cpts.full")) { if (is.function("cpts.full")){ fun <- cpts.full @@ -346,6 +358,16 @@ else{ object@cpts <- c(value,n) } return(object) }) + setReplaceMethod("cpts", "cpt.range", function(object, value) { + if((cpttype(object)=="meanar")|(cpttype(object)=="trendar")){ + n=length(object@data.set)-1 + } + else{n=length(object@data.set)} + + if(value[length(value)]==n){object@cpts <- value} + else{ object@cpts <- c(value,n) } + return(object) + }) setReplaceMethod("cpts", "cpt.reg", function(object, value) { if(value[length(value)]==nrow(object@data.set)){object@cpts <- value} else{ object@cpts <- c(value,nrow(object@data.set)) } @@ -547,7 +569,7 @@ param.var=function(object,cpts){ nseg=length(cpts)-1 data=data.set(object) - seglen=seg.len(object) + seglen=cpts[-1]-cpts[-length(cpts)] tmpvar=NULL for(j in 1:nseg){ tmpvar[j]=var(data[(cpts[j]+1):(cpts[j+1])]) @@ -578,8 +600,8 @@ thetaT=(6*cptsumstat[,2])/((seglen+1)*(2*seglen+1)) + (thetaS * (1-((3*seglen)/((2*seglen)+1)))) return(cbind(thetaS,thetaT)) } - param.meanar=function(object){ - seglen=seg.len(object) + param.meanar=function(object,cpts){ + seglen=cpts[-1]-cpts[-length(cpts)] data=data.set(object) n=length(data)-1 sumstat=cbind(cumsum(c(0,data[-1])),cumsum(c(0,data[-(n+1)])),cumsum(c(0,data[-1]*data[-(n+1)])),cumsum(c(0,data[-1]^2)),cumsum(c(0,data[-(n+1)]^2))) @@ -589,8 +611,8 @@ return(cbind(beta1,beta2)) } - param.trendar=function(object){ - seglen=seg.len(object) + param.trendar=function(object,cpts){ + seglen=cpts[-1]-cpts[-length(cpts)] data=data.set(object) n=length(data)-1 sumstat=cbind(cumsum(c(0,data[-1])),cumsum(c(0,data[-(n+1)])),cumsum(c(0,data[-1]*data[-(n+1)])),cumsum(c(0,data[-1]*c(1:n))),cumsum(c(0,data[-(n+1)]*c(0:(n-1))))) @@ -685,7 +707,7 @@ } tmpfit=eval(parse(text=paste('lm(data[',(cpts[j]+1),':',cpts[j+1],',1]~',formula,')',sep=''))) tmpbeta[j,]=tmpfit$coefficients - tmpsigma[j]=var(tmpfit$residuals) + tmpsigma[j]=sum(tmpfit$residuals^2)/(length(tmpfit$residuals)-length(tmpfit$coefficients)) ##var(tmpfit$residuals) } return(list(beta=tmpbeta,sig2=tmpsigma)) } @@ -805,7 +827,15 @@ setMethod("plot","cpt.range",function(x,ncpts=NA,diagnostic=FALSE,cpt.col='red',cpt.width=1,cpt.style=1,...){ if(diagnostic==TRUE){ - return(plot(apply(cpts.full(x),1,function(x){sum(x>0,na.rm=TRUE)}),pen.value.full(x),type='l',xlab='Number of Changepoints',ylab='Penalty Value',...)) + n.changepoints = apply(cpts.full(x), 1, function(x) sum(x > 0, na.rm = TRUE)) + penalty.values = pen.value.full(x) + if (is.null(list(...)$type)) { + # By default, the type of the diagnostic plots is "lines". + plot(x = n.changepoints, y = penalty.values, xlab = 'Number of Changepoints', ylab = 'Penalty Value', type = "l", ...) + } else { + plot(x = n.changepoints, y = penalty.values, xlab = 'Number of Changepoints', ylab = 'Penalty Value', ...) + } + return(invisible(NULL)) } plot(data.set.ts(x),...) if(is.na(ncpts)){ @@ -891,6 +921,7 @@ rss=sum((data.set(object)-means)^2) n=length(data.set(object)) like=n*(log(2*pi)+log(rss/n)+1) # -2*loglik + cpts=c(0,object@cpts) if(pen.type(object)=="MBIC"){ like=c(like, like+(nseg(object)-2)*pen.value(object)+sum(log(cpts[-1]-cpts[-(nseg(object)+1)]))) }else{ diff --git a/R/exp.R b/R/exp.R index 028941e..f0e82b9 100644 --- a/R/exp.R +++ b/R/exp.R @@ -20,7 +20,7 @@ single.meanvar.exp.calc <- } } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,extrainf,minseglen) return(cpt) @@ -48,7 +48,7 @@ single.meanvar.exp.calc <- single.meanvar.exp<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.estimates=TRUE,minseglen){ if(sum(data<0)>0){stop('Exponential test statistic requires positive data')} - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -59,7 +59,7 @@ single.meanvar.exp<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.es if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="meanvar.exp", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ tmp=single.meanvar.exp.calc(coredata(data),extrainf=TRUE,minseglen) if(penalty=="MBIC"){ tmp[3]=tmp[3]+log(tmp[1])+log(n-tmp[1]+1) @@ -267,7 +267,7 @@ multiple.meanvar.exp=function(data,mul.method="PELT",penalty="MBIC",pen.value=0, } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -278,7 +278,7 @@ multiple.meanvar.exp=function(data,mul.method="PELT",penalty="MBIC",pen.value=0, pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck=costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset out = data_input(data=data,method=mul.method,pen.value=pen.value,costfunc=costfunc,minseglen=minseglen,Q=Q) diff --git a/R/gamma.R b/R/gamma.R index 83c810a..52d86fd 100644 --- a/R/gamma.R +++ b/R/gamma.R @@ -21,7 +21,7 @@ function(data,shape=1,extrainf=TRUE,minseglen){ } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,shape,extrainf,minseglen) return(cpt) @@ -51,7 +51,7 @@ function(data,shape=1,extrainf=TRUE,minseglen){ single.meanvar.gamma<-function(data,shape=1,penalty="MBIC",pen.value=0,class=TRUE,param.estimates=TRUE,minseglen){ if(sum(data<=0)>0){stop('Gamma test statistic requires positive data')} - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -62,7 +62,7 @@ single.meanvar.gamma<-function(data,shape=1,penalty="MBIC",pen.value=0,class=TRU if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="meanvar.gamma", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ tmp=single.meanvar.gamma.calc(coredata(data),shape,extrainf=TRUE,minseglen) if(penalty=="MBIC"){ tmp[3]=tmp[3]+log(tmp[1])+log(n-tmp[1]+1) @@ -259,7 +259,7 @@ multiple.meanvar.gamma=function(data,shape=1,mul.method="PELT",penalty="MBIC",pe costfunc = "meanvar.gamma.mbic" } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) shape=shape[1] @@ -271,7 +271,7 @@ multiple.meanvar.gamma=function(data,shape=1,mul.method="PELT",penalty="MBIC",pe pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck = costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset out = data_input(data=data,method=mul.method,pen.value=pen.value,costfunc=costfunc,minseglen=minseglen,Q=Q,shape=shape) diff --git a/R/multiple.nonparametric.R b/R/multiple.nonparametric.R index 8a8f6df..13d8834 100644 --- a/R/multiple.nonparametric.R +++ b/R/multiple.nonparametric.R @@ -116,7 +116,7 @@ multiple.var.css=function(data,mul.method="BinSeg",penalty="MBIC",pen.value=0,Q= stop("MBIC penalty is not valid for nonparametric test statistics.") } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -126,7 +126,7 @@ multiple.var.css=function(data,mul.method="BinSeg",penalty="MBIC",pen.value=0,Q= if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck = costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset if(mul.method=="BinSeg"){ out=binseg.var.css(data,Q,pen.value) @@ -309,7 +309,7 @@ multiple.mean.cusum=function(data,mul.method="BinSeg",penalty="Asymptotic",pen.v stop("MBIC penalty is not valid for nonparametric test statistics.") } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -319,7 +319,7 @@ multiple.mean.cusum=function(data,mul.method="BinSeg",penalty="Asymptotic",pen.v if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck = costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset if(mul.method=="BinSeg"){ out=binseg.mean.cusum(coredata(data),Q,pen.value) diff --git a/R/multiple.norm.R b/R/multiple.norm.R index 9b27136..2f96697 100644 --- a/R/multiple.norm.R +++ b/R/multiple.norm.R @@ -493,7 +493,7 @@ multiple.var.norm=function(data,mul.method="PELT",penalty="MBIC",pen.value=0,Q=5 costfunc = "var.norm.mbic" } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) mu=mu[1] @@ -506,7 +506,7 @@ multiple.var.norm=function(data,mul.method="PELT",penalty="MBIC",pen.value=0,Q=5 pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck=costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset if((know.mean==FALSE)&(is.na(mu))){ mu=mean(coredata(data)) @@ -561,7 +561,7 @@ multiple.mean.norm=function(data,mul.method="PELT",penalty="MBIC",pen.value=0,Q= costfunc = "mean.norm.mbic" } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) # still works if data is of class ts } @@ -572,7 +572,7 @@ multiple.mean.norm=function(data,mul.method="PELT",penalty="MBIC",pen.value=0,Q= pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck = costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset out = data_input(data=data,method=mul.method,pen.value=pen.value,costfunc=costfunc,minseglen=minseglen,Q=Q) @@ -614,7 +614,7 @@ multiple.meanvar.norm=function(data,mul.method="PELT",penalty="MBIC",pen.value=0 costfunc = "meanvar.norm.mbic" } diffparam=2 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -625,7 +625,7 @@ multiple.meanvar.norm=function(data,mul.method="PELT",penalty="MBIC",pen.value=0 pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck = costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset out = data_input(data=data,method=mul.method,pen.value=pen.value,costfunc=costfunc,minseglen=minseglen,Q=Q) diff --git a/R/poisson.R b/R/poisson.R index 344869a..0b84f80 100644 --- a/R/poisson.R +++ b/R/poisson.R @@ -27,7 +27,7 @@ single.meanvar.poisson.calc <- } } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,extrainf,minseglen) return(cpt) @@ -56,7 +56,7 @@ single.meanvar.poisson.calc <- single.meanvar.poisson<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.estimates=TRUE,minseglen){ if((sum(data<0)>0)){stop('Poisson test statistic requires positive data')} if(sum(as.integer(data)==data)!=length(data)){stop('Poisson test statistic requires integer data')} - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -67,7 +67,7 @@ single.meanvar.poisson<-function(data,penalty="MBIC",pen.value=0,class=TRUE,para if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="meanvar.poisson", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ tmp=single.meanvar.poisson.calc(coredata(data),extrainf=TRUE,minseglen) if(penalty=="MBIC"){ tmp[3]=tmp[3]+log(tmp[1])+log(n-tmp[1]+1) @@ -168,7 +168,7 @@ multiple.meanvar.poisson=function(data,mul.method="PELT",penalty="MBIC",pen.valu } diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -178,7 +178,7 @@ multiple.meanvar.poisson=function(data,mul.method="PELT",penalty="MBIC",pen.valu if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck=costfunc, method=mul.method) - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset out = data_input(data=data,method=mul.method,pen.value=pen.value,costfunc=costfunc,minseglen=minseglen,Q=Q) diff --git a/R/single.nonparametric.R b/R/single.nonparametric.R index b93a02c..a8fc560 100644 --- a/R/single.nonparametric.R +++ b/R/single.nonparametric.R @@ -19,7 +19,7 @@ single.var.css.calc <- } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,extrainf,minseglen) return(cpt) @@ -49,7 +49,7 @@ single.var.css<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.estima if(length(pen.value)>1){stop('Only one dimensional penalties can be used for CSS')} if(penalty=="MBIC"){stop("MBIC penalty is not valid for nonparametric test statistics.")} diffparam=1 - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -60,7 +60,7 @@ single.var.css<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.estima if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam, asymcheck = "var.css", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ tmp=single.var.css.calc(coredata(data),extrainf=TRUE,minseglen) ans=decision(tau=tmp[1],null=tmp[2],penalty="Manual",n=n,diffparam=1,pen.value=pen.value) if(class==TRUE){ @@ -126,7 +126,7 @@ single.mean.cusum.calc <- } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,extrainf,minseglen) return(cpt) @@ -156,7 +156,7 @@ single.mean.cusum<-function(data,penalty="Asymptotic",pen.value=0.05,class=TRUE, if(length(pen.value)>1){stop('Only one dimensional penalties can be used for CUSUM')} if(penalty=="MBIC"){stop("MBIC penalty is not valid for nonparametric test statistics.")} - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -167,7 +167,7 @@ single.mean.cusum<-function(data,penalty="Asymptotic",pen.value=0.05,class=TRUE, if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="mean.cusum", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ tmp=single.mean.cusum.calc(coredata(data),extrainf=TRUE,minseglen) ans=decision(tau=tmp[1],null=tmp[2],penalty=penalty,n=n,diffparam=1,pen.value=pen.value) if(class==TRUE){ diff --git a/R/single.norm.R b/R/single.norm.R index c81876a..ff13331 100644 --- a/R/single.norm.R +++ b/R/single.norm.R @@ -22,7 +22,7 @@ function(data,extrainf=TRUE,minseglen){ } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,extrainf,minseglen) return(cpt) @@ -48,7 +48,7 @@ function(data,extrainf=TRUE,minseglen){ } single.mean.norm<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.estimates=TRUE,minseglen){ - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -59,7 +59,7 @@ single.mean.norm<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.esti if(n<(2*minseglen)){stop('Minimum segment legnth is too large to include a change in this data')} pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="mean.norm", method="AMOC") - if(is.null(dim(data))==TRUE){ # single dataset + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset tmp=single.mean.norm.calc(coredata(data),extrainf=TRUE,minseglen) if(penalty=="MBIC"){ tmp[3]=tmp[3]+log(tmp[1])+log(n-tmp[1]+1) @@ -134,7 +134,7 @@ single.var.norm.calc <- function(data,mu,extrainf=TRUE,minseglen){ single.var.norm<-function(data,penalty="MBIC",pen.value=0,know.mean=FALSE,mu=NA,class=TRUE,param.estimates=TRUE,minseglen){ - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) mu=mu[1] @@ -147,7 +147,7 @@ single.var.norm<-function(data,penalty="MBIC",pen.value=0,know.mean=FALSE,mu=NA, pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="var.norm", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ if((know.mean==FALSE)&(is.na(mu))){ mu=mean(coredata(data)) } @@ -240,7 +240,7 @@ function(data,extrainf=TRUE,minseglen){ } - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single data set cpt=singledim(data,extrainf,minseglen) return(cpt) @@ -266,7 +266,7 @@ function(data,extrainf=TRUE,minseglen){ } single.meanvar.norm<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.estimates=TRUE,minseglen){ - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ # single dataset n=length(data) } @@ -278,7 +278,7 @@ single.meanvar.norm<-function(data,penalty="MBIC",pen.value=0,class=TRUE,param.e pen.value = penalty_decision(penalty, pen.value, n, diffparam=1, asymcheck="meanvar.norm", method="AMOC") - if(is.null(dim(data))==TRUE){ + if(is.null(dim(data))==TRUE || length(dim(data)) == 1){ tmp=single.meanvar.norm.calc(coredata(data),extrainf=TRUE,minseglen) if(penalty=="MBIC"){ tmp[3]=tmp[3]+log(tmp[1])+log(n-tmp[1]+1) diff --git a/inst/CITATION b/inst/CITATION index 5d1b0fc..0aeae0b 100644 --- a/inst/CITATION +++ b/inst/CITATION @@ -11,7 +11,7 @@ bibentry(bibtype="article", volume = "58", number = "3", pages = "1--19", - url = "https://www.jstatsoft.org/v58/i03/", + url = "https://www.jstatsoft.org/article/view/v058i03", ) bibentry(bibtype="Manual", diff --git a/man/changepoint-package.Rd b/man/changepoint-package.Rd index af467cc..c614f53 100644 --- a/man/changepoint-package.Rd +++ b/man/changepoint-package.Rd @@ -12,8 +12,8 @@ Implements various mainstream and specialised changepoint methods for finding si \tabular{ll}{ Package: \tab changepoint\cr Type: \tab Package\cr -Version: \tab 2.2.3 \cr -Date: \tab 2022-03-08\cr +Version: \tab 2.2.4 \cr +Date: \tab 2022-10-31\cr License: \tab GPL\cr LazyLoad: \tab yes\cr } diff --git a/man/cpts--methods.Rd b/man/cpts--methods.Rd index 502958e..e484757 100644 --- a/man/cpts--methods.Rd +++ b/man/cpts--methods.Rd @@ -3,6 +3,7 @@ \alias{cpts<--methods} \alias{cpts<-,cpt-method} \alias{cpts<-,cpt.reg-method} +\alias{cpts<-,cpt.range-method} \title{ ~~ Methods for Function cpts<- ~~} \description{ ~~ Methods for function \code{cpts<-} ~~ @@ -17,6 +18,9 @@ \item{\code{signature(x = "cpt.reg")}}{ Assigns the value following <- to the cpts slot in x, from version 1.0 this no longer requires the last entry to be the length of the dataset. } +\item{\code{signature(x = "cpt.range")}}{ + Assigns the value following <- to the cpts slot in x, from version 1.0 this no longer requires the last entry to be the length of the dataset. +} }} \keyword{methods} \keyword{cpt} diff --git a/man/cpts-methods.Rd b/man/cpts-methods.Rd index 7b3aaa5..4dcb4e4 100644 --- a/man/cpts-methods.Rd +++ b/man/cpts-methods.Rd @@ -3,6 +3,7 @@ \alias{cpts-methods} \alias{cpts,cpt-method} \alias{cpts,cpt.reg-method} +\alias{cpts,cpt.range-method} \title{ ~~ Methods for Function cpts ~~} \description{ ~~ Methods for function \code{cpts} ~~ @@ -17,6 +18,10 @@ \item{\code{signature(object = "cpt.reg")}}{ Retrieves cpts slot from an object of class cpt.reg, from version 1.0 this no longer prints the length of the dataset. } + +\item{\code{signature(object = "cpt.range",ncpts=NA)}}{ + Retrieves the row in the cpts.full slot from an object of class cpt.range that has length ncpts, from version 1.0 this no longer prints the length of the dataset. +} }} \keyword{methods} \keyword{cpt} diff --git a/man/cpts.Rd b/man/cpts.Rd index 5993f56..d9715c5 100644 --- a/man/cpts.Rd +++ b/man/cpts.Rd @@ -7,11 +7,14 @@ Generic Function - cpts Generic function } \usage{ -cpts(object) +cpts(object,...) } \arguments{ \item{object}{ Depending on the class of \code{object} depends on the method used (and if one exists) +} + \item{...}{ + Depending on the class of \code{object} depends on the method used and whether ... is needed/warranted (and if one exists) } } \details{ diff --git a/src/BinSeg_one_func_minseglen.c b/src/BinSeg_one_func_minseglen.c index fb6e3f1..edab9e0 100644 --- a/src/BinSeg_one_func_minseglen.c +++ b/src/BinSeg_one_func_minseglen.c @@ -12,229 +12,228 @@ //#include "cost_general_functions.c" // commented out as implicitly in the workspace as using the package. -void binseg(cost_func,sumstat,n,pen,Q,cptsout,minseglen,likeout,op_cps, shape) - char **cost_func; //Descibe the cost function used i.e. norm.mean.cost (change in mean in normal distributed data) - double *sumstat; //array of summary statistics of the time series - int *n; // Length of the time series - double *pen; // Penalty used to decide if a changepoint is significant - int *Q; // Max number of changepoints - int *cptsout; // Q length vector of identified changepoint locations - int *minseglen; //minimum segment length - double *likeout; // Q length vector of likelihood ratio values for changepoints in cptsout - int *op_cps; // Optimal number of changepoint for pen supplied - double *shape; // only used when cost_func is the gamma likelihood - +void binseg(char** cost_func, //Descibe the cost function used i.e. norm.mean.cost (change in mean in normal distributed data) + double* sumstat, //array of summary statistics of the time series + int* n, //Length of the time series + double* pen, //Penalty used to decide if a changepoint is significant + int* Q, //Max number of changepoints + int* cptsout, //Q length vector of identified changepoint locations + int* minseglen, //minimum segment length + double* likeout, //Q length vector of likelihood ratio values for changepoints in cptsout + int* op_cps, //Optimal number of changepoint for pen supplied + double* shape //only used when cost_func is the gamma likelihood + ) { - double oldmax=1.7E+308,null,lambda[*n],maxout; - int q,p,i,j,whichout,st,end; - for(i=0;i<*n;i++){ - lambda[i]=-INFINITY; - } - int tau[*Q+2]; // max ncpts is Q, +2 is for 0 and n - tau[0]=0; - tau[1]= *n; - -double mll_var(); -double mll_mean(); -double mll_meanvar(); -double mll_meanvar_exp(); -double mll_meanvar_gamma(); -double mll_meanvar_poisson(); -double mbic_var(); -double mbic_mean(); -double mbic_meanvar(); -double mbic_meanvar_exp(); -double mbic_meanvar_gamma(); -double mbic_meanvar_poisson(); - - double (*costfunction)(); - if (strcmp(*cost_func,"var.norm")==0){ - costfunction = &mll_var; - } - else if (strcmp(*cost_func,"mean.norm")==0){ - costfunction = &mll_mean; - } - else if (strcmp(*cost_func,"meanvar.norm")==0){ - costfunction = &mll_meanvar; - } - else if (strcmp(*cost_func,"meanvar.exp")==0){ - costfunction = &mll_meanvar_exp; + double oldmax=1.7E+308,null,lambda[*n],maxout; + int q,p,i,j,whichout,st,end; + for(i=0;i<*n;i++){ + lambda[i]=-INFINITY; } - else if (strcmp(*cost_func,"meanvar.gamma")==0){ - costfunction = &mll_meanvar_gamma; + int tau[*Q+2]; // max ncpts is Q, +2 is for 0 and n + tau[0]=0; + tau[1]= *n; + + double mll_var(double, double, double, int, double); + double mll_mean(double, double, double, int, double); + double mll_meanvar(double, double, double, int, double); + double mll_meanvar_exp(double, double, double, int, double); + double mll_meanvar_gamma(double, double, double, int, double); + double mll_meanvar_poisson(double, double, double, int, double); + double mbic_var(double, double, double, int, double); + double mbic_mean(double, double, double, int, double); + double mbic_meanvar(double, double, double, int, double); + double mbic_meanvar_exp(double, double, double, int, double); + double mbic_meanvar_gamma(double, double, double, int, double); + double mbic_meanvar_poisson(double, double, double, int, double); + + double (*costfunction)(double, double, double, int, double); + if (strcmp(*cost_func,"var.norm")==0){ + costfunction = &mll_var; } - else if (strcmp(*cost_func,"meanvar.poisson")==0){ - costfunction = &mll_meanvar_poisson; + else if (strcmp(*cost_func,"mean.norm")==0){ + costfunction = &mll_mean; + } + else if (strcmp(*cost_func,"meanvar.norm")==0){ + costfunction = &mll_meanvar; } - else if (strcmp(*cost_func,"mean.norm.mbic")==0){ - costfunction = &mbic_mean; + else if (strcmp(*cost_func,"meanvar.exp")==0){ + costfunction = &mll_meanvar_exp; } - else if (strcmp(*cost_func,"var.norm.mbic")==0){ - costfunction = &mbic_var; + else if (strcmp(*cost_func,"meanvar.gamma")==0){ + costfunction = &mll_meanvar_gamma; } - else if (strcmp(*cost_func,"meanvar.norm.mbic")==0){ - costfunction = &mbic_meanvar; -} - else if (strcmp(*cost_func,"meanvar.exp.mbic")==0){ - costfunction = &mbic_meanvar_exp; -} - else if (strcmp(*cost_func,"meanvar.gamma.mbic")==0){ - costfunction = &mbic_meanvar_gamma; -} - else if (strcmp(*cost_func,"meanvar.poisson.mbic")==0){ -costfunction = &mbic_meanvar_poisson; -} - - void max_which(); - void order_vec(); - - for(q=0;q<*Q;q++){ - R_CheckUserInterrupt(); // checks if user has interrupted the R session and quits if true - for(p=0;p<*n;p++){lambda[p]=0;} - i=1; - st=tau[0]+1; - end=tau[1]; - //null= (-0.5) * mll_var(*(y2+end)-*(y2+st-1),end-st+1); - // null = (-0.5)* call_function(*cost_function,n,sumstat,end,st-1,end-(st-1),*shape); - null = (-0.5)*costfunction(*(sumstat+end)-*(sumstat+st-1),*(sumstat + *n + 1 +end)-*(sumstat + *n + st),*(sumstat + *n + *n + 2 +end)-*(sumstat + *n + *n + 1 +st), end - st + 1, *shape); - for(j=2;j<(*n-2);j++){ - if(j==end){ - st=end+1; - i=i+1; - end=tau[i]; - //null= (-0.5) * mll_var(*(y2+end)-*(y2+st-1),end-st+1); - // null = (-0.5)* call_function(*cost_function,n,sumstat,end,st-1,end-(st-1),*shape); - null = (-0.5)* costfunction(*(sumstat+end)-*(sumstat+st-1),*(sumstat + *n + 1 +end)-*(sumstat + *n + st),*(sumstat + *n + *n + 2 +end)-*(sumstat + *n + *n + 1 +st), end - st + 1, *shape); - } - else{ - if(((j-st)>=*minseglen)&&((end-j)>=*minseglen)){ - // lambda[j]= ((-0.5) * mll_var(*(y2+j)-*(y2+st-1),j-st+1)) + ((-0.5) * mll_var(*(y2+end)-*(y2+j),end-j)) - null; - //lambda[j] = ((-0.5)*call_function(*cost_function,n,sumstat,j,st-1,j-(st-1),*shape)) + ((-0.5)*call_function(*cost_function,n,sumstat,end,j,end-j,*shape)) - null; - lambda[j] = ((-0.5)*costfunction(*(sumstat+j)-*(sumstat+st-1),*(sumstat + *n + 1 +j)-*(sumstat + *n + st),*(sumstat + *n + *n + 2 +j)-*(sumstat + *n + *n + 1 +st), j - st + 1, *shape)) + - ((-0.5)*costfunction(*(sumstat+end)-*(sumstat+j),*(sumstat + *n + 1 +end)-*(sumstat + *n + j + 1),*(sumstat + *n + *n + 2 +end)-*(sumstat + *n + *n + 2 +j), end - j, *shape)) - null; - } - } + else if (strcmp(*cost_func,"meanvar.poisson")==0){ + costfunction = &mll_meanvar_poisson; + } + else if (strcmp(*cost_func,"mean.norm.mbic")==0){ + costfunction = &mbic_mean; + } + else if (strcmp(*cost_func,"var.norm.mbic")==0){ + costfunction = &mbic_var; + } + else if (strcmp(*cost_func,"meanvar.norm.mbic")==0){ + costfunction = &mbic_meanvar; + } + else if (strcmp(*cost_func,"meanvar.exp.mbic")==0){ + costfunction = &mbic_meanvar_exp; + } + else if (strcmp(*cost_func,"meanvar.gamma.mbic")==0){ + costfunction = &mbic_meanvar_gamma; + } + else if (strcmp(*cost_func,"meanvar.poisson.mbic")==0){ + costfunction = &mbic_meanvar_poisson; + } + + void max_which(double*, int, double*, int*); + void order_vec(int[], int); + + for(q=0;q<*Q;q++){ + R_CheckUserInterrupt(); // checks if user has interrupted the R session and quits if true + for(p=0;p<*n;p++){lambda[p]=0;} + i=1; + st=tau[0]+1; + end=tau[1]; + //null= (-0.5) * mll_var(*(y2+end)-*(y2+st-1),end-st+1); + // null = (-0.5)* call_function(*cost_function,n,sumstat,end,st-1,end-(st-1),*shape); + null = (-0.5)*costfunction(*(sumstat+end)-*(sumstat+st-1),*(sumstat + *n + 1 +end)-*(sumstat + *n + st),*(sumstat + *n + *n + 2 +end)-*(sumstat + *n + *n + 1 +st), end - st + 1, *shape); + for(j=2;j<(*n-2);j++){ + if(j==end){ + st=end+1; + i=i+1; + end=tau[i]; + //null= (-0.5) * mll_var(*(y2+end)-*(y2+st-1),end-st+1); + // null = (-0.5)* call_function(*cost_function,n,sumstat,end,st-1,end-(st-1),*shape); + null = (-0.5)* costfunction(*(sumstat+end)-*(sumstat+st-1),*(sumstat + *n + 1 +end)-*(sumstat + *n + st),*(sumstat + *n + *n + 2 +end)-*(sumstat + *n + *n + 1 +st), end - st + 1, *shape); + } + else{ + if(((j-st)>=*minseglen)&&((end-j)>=*minseglen)){ + // lambda[j]= ((-0.5) * mll_var(*(y2+j)-*(y2+st-1),j-st+1)) + ((-0.5) * mll_var(*(y2+end)-*(y2+j),end-j)) - null; + //lambda[j] = ((-0.5)*call_function(*cost_function,n,sumstat,j,st-1,j-(st-1),*shape)) + ((-0.5)*call_function(*cost_function,n,sumstat,end,j,end-j,*shape)) - null; + lambda[j] = ((-0.5)*costfunction(*(sumstat+j)-*(sumstat+st-1),*(sumstat + *n + 1 +j)-*(sumstat + *n + st),*(sumstat + *n + *n + 2 +j)-*(sumstat + *n + *n + 1 +st), j - st + 1, *shape)) + + ((-0.5)*costfunction(*(sumstat+end)-*(sumstat+j),*(sumstat + *n + 1 +end)-*(sumstat + *n + j + 1),*(sumstat + *n + *n + 2 +end)-*(sumstat + *n + *n + 2 +j), end - j, *shape)) - null; + } + } } - max_which(lambda,*n,&maxout,&whichout); + max_which(lambda,*n,&maxout,&whichout); *(cptsout+q)=whichout; - *(likeout+q)= (oldmax<=maxout) ? oldmax : maxout ; + *(likeout+q)= (oldmax<=maxout) ? oldmax : maxout ; oldmax= *(likeout+q); - tau[q+2]=whichout; + tau[q+2]=whichout; order_vec(tau,q+3); } - int stop=0; - *op_cps=0; - while((stop==0)&&(*op_cps < *Q)){ - if((2* *(likeout+*op_cps)) >= *pen){ (*op_cps)++; } - else{ stop=1; } - } + int stop=0; + *op_cps=0; + while((stop==0)&&(*op_cps < *Q)){ + if((2* *(likeout+*op_cps)) >= *pen){ (*op_cps)++; } + else{ stop=1; } + } } // Cost functions -/* -double mll_var(double *sumstat, int end, int start, int n, double shape){ - double x3=*(sumstat+end)-*(sumstat+start); - if(x3<=0){x3=0.00000000001;} - return(n*(log(2*M_PI)+log(x3/n)+1)); // M_PI is in Rmath.h -} + /* + double mll_var(double *sumstat, int end, int start, int n, double shape){ + double x3=*(sumstat+end)-*(sumstat+start); + if(x3<=0){x3=0.00000000001;} + return(n*(log(2*M_PI)+log(x3/n)+1)); // M_PI is in Rmath.h + } -double mll_meanvar(double *sumstat, int end, int start, int n, double shape){ - double x2=*(sumstat+n+1+end)-*(sumstat+n+1+start); // this relies on the R code doing things in the correct order! - double x=*(sumstat+end)-*(sumstat+start); - double sigsq=(x2-((x*x)/n))/n; - if(sigsq<=0){sigsq=0.00000000001;} - return(n*(log(2*M_PI)+log(sigsq)+1)); // M_PI is in Rmath.h -} + double mll_meanvar(double *sumstat, int end, int start, int n, double shape){ + double x2=*(sumstat+n+1+end)-*(sumstat+n+1+start); // this relies on the R code doing things in the correct order! + double x=*(sumstat+end)-*(sumstat+start); + double sigsq=(x2-((x*x)/n))/n; + if(sigsq<=0){sigsq=0.00000000001;} + return(n*(log(2*M_PI)+log(sigsq)+1)); // M_PI is in Rmath.h + } -double mll_mean(double *sumstat, int end, int start, int n, double shape){ - double x2=*(sumstat+n+1+end)-*(sumstat+n+1+start); // this relies on the R code doing things in the correct order! - double x=*(sumstat+end)-*(sumstat+start); - return(x2-(x*x)/n); -} + double mll_mean(double *sumstat, int end, int start, int n, double shape){ + double x2=*(sumstat+n+1+end)-*(sumstat+n+1+start); // this relies on the R code doing things in the correct order! + double x=*(sumstat+end)-*(sumstat+start); + return(x2-(x*x)/n); + } -double mll_meanvar_exp(double *sumstat, int end, int start, int n, double shape){ - double x=*(sumstat+end)-*(sumstat+start); - return(2*n*(log(x)-log(n))); -} + double mll_meanvar_exp(double *sumstat, int end, int start, int n, double shape){ + double x=*(sumstat+end)-*(sumstat+start); + return(2*n*(log(x)-log(n))); + } -double mll_meanvar_gamma(double *sumstat, int end, int start, int n, double shape){ - double x=*(sumstat+end)-*(sumstat+start); - return(2*n*shape*(log(x)-log(n*shape))); -} + double mll_meanvar_gamma(double *sumstat, int end, int start, int n, double shape){ + double x=*(sumstat+end)-*(sumstat+start); + return(2*n*shape*(log(x)-log(n*shape))); + } -double mll_meanvar_poisson(double *sumstat, int end, int start, int n, double shape){ - double x=*(sumstat+end)-*(sumstat+start); - if(x==0){return(0);} - else{return(2*x*(log(n)-log(x)));} -} + double mll_meanvar_poisson(double *sumstat, int end, int start, int n, double shape){ + double x=*(sumstat+end)-*(sumstat+start); + if(x==0){return(0);} + else{return(2*x*(log(n)-log(x)));} + } -// code to choose cost function - -const static struct { - char *name; - double (*func)(double *sumstat, int end, int start, int n, double shape); -} function_map [] = { - { "norm.var", mll_var}, - {"norm.mean", mll_mean}, - {"norm.meanvar", mll_meanvar}, - {"exp", mll_meanvar_exp}, - {"gamma", mll_meanvar_gamma}, - {"poisson", mll_meanvar_poisson}, - }; - -double call_function(const char *name,double *sumstat, int end, int start, int n, double shape) -{ - int k; - - for (k = 0; k <= (sizeof(function_map) / sizeof(function_map[0])); k++) { - if (!strcmp(function_map[k].name, name) && function_map[k].func) { - return function_map[k].func(sumstat,end,start, n, shape); - } - } - } + // code to choose cost function + const static struct { + char *name; + double (*func)(double *sumstat, int end, int start, int n, double shape); + } function_map [] = { + { "norm.var", mll_var}, + {"norm.mean", mll_mean}, + {"norm.meanvar", mll_meanvar}, + {"exp", mll_meanvar_exp}, + {"gamma", mll_meanvar_gamma}, + {"poisson", mll_meanvar_poisson}, + }; + double call_function(const char *name,double *sumstat, int end, int start, int n, double shape) + { + int k; -void min_which(double *array,int n,double *minout,int *whichout){ - // Function to find minimum of an array with n elements that is put in min - *minout=*array; - *whichout=0; - int i; - for(i=1;i *maxout){ - *maxout= *(array+i); - *whichout=i; - } - } -} -void order_vec( int a[], int n ){ - int i, j; - for(i = 0; i < n; i++){ // Make a pass through the array for each element - for(j = 1; j < (n-i); j++){ // Go through the array beginning to end - if(a[j-1] > a[j]) // If the the first number is greater, swap it - SWAP(a[j-1],a[j]); - } - } -} -*/ + + void min_which(double *array,int n,double *minout,int *whichout){ + // Function to find minimum of an array with n elements that is put in min + *minout=*array; + *whichout=0; + int i; + for(i=1;i *maxout){ + *maxout= *(array+i); + *whichout=i; + } + } + } + + void order_vec( int a[], int n ){ + int i, j; + for(i = 0; i < n; i++){ // Make a pass through the array for each element + for(j = 1; j < (n-i); j++){ // Go through the array beginning to end + if(a[j-1] > a[j]) // If the the first number is greater, swap it + SWAP(a[j-1],a[j]); + } + } + } + */ diff --git a/src/PELT_one_func_minseglen.c b/src/PELT_one_func_minseglen.c index f2f423d..da39530 100644 --- a/src/PELT_one_func_minseglen.c +++ b/src/PELT_one_func_minseglen.c @@ -18,99 +18,99 @@ static int *checklist; static double *tmplike; static int *tmpt; -void FreePELT(error) - int *error; /* Error code from PELT C function, non-zero => error */ - { - if(*error==0){ - // free((void *)lastchangecpts); - // free((void *)lastchangelike); - free((void *)checklist); - free((void *)tmplike); - free((void *)tmpt); - } +void FreePELT(int* error) +{ + if(*error==0){ + // free((void *)lastchangecpts); + // free((void *)lastchangelike); + free((void *)checklist); + free((void *)tmplike); + free((void *)tmpt); + } } -void PELTC(cost_func, sumstat,n,pen,cptsout,error, shape, minseglen, lastchangelike, lastchangecpts, numchangecpts) - char **cost_func; - double *sumstat; /* Summary statistic for the time series */ - int *n; /* Length of the time series */ - double *pen; /* Penalty used to decide if a changepoint is significant */ - int *cptsout; /* Vector of identified changepoint locations */ - int *error; /* 0 by default, nonzero indicates error in code */ - double *shape; // only used when cost_func is the gamma likelihood - int *minseglen; //minimum segment length - double *lastchangelike; // stores likelihood up to that time using optimal changepoint locations up to that time - int *lastchangecpts; // stores last changepoint locations - int *numchangecpts; //stores the current number of changepoints - { - // R code does know.mean and fills mu if necessary +void PELTC(char** cost_func, + double* sumstat, // Summary statistic for the time serie + int* n, // Length of the time series + double* pen, // Penalty used to decide if a changepoint is significant + int* cptsout, // Vector of identified changepoint locations + int* error, // 0 by default, nonzero indicates error in code + double* shape, // only used when cost_func is the gamma likelihood + int* minseglen, //minimum segment length + double* lastchangelike, // stores likelihood up to that time using + //optimal changepoint locations up to that time + int* lastchangecpts, // stores last changepoint locations + int* numchangecpts //stores the current number of changepoints + ) +{ + // R code does know.mean and fills mu if necessary - double (*costfunction)(); - double mll_var(); -double mll_mean(); -double mll_meanvar(); -double mll_meanvar_exp(); -double mll_meanvar_gamma(); -double mll_meanvar_poisson(); -double mbic_var(); -double mbic_mean(); -double mbic_meanvar(); -double mbic_meanvar_exp(); -double mbic_meanvar_gamma(); -double mbic_meanvar_poisson(); - - if (strcmp(*cost_func,"var.norm")==0){ - costfunction = &mll_var; - } - else if (strcmp(*cost_func,"mean.norm")==0){ - costfunction = &mll_mean; - } - else if (strcmp(*cost_func,"meanvar.norm")==0){ - costfunction = &mll_meanvar; - } - else if (strcmp(*cost_func,"meanvar.exp")==0){ - costfunction = &mll_meanvar_exp; - } - else if (strcmp(*cost_func,"meanvar.gamma")==0){ - costfunction = &mll_meanvar_gamma; - } - else if (strcmp(*cost_func,"meanvar.poisson")==0){ - costfunction = &mll_meanvar_poisson; - } - else if (strcmp(*cost_func,"mean.norm.mbic")==0){ - costfunction = &mbic_mean; - } - else if (strcmp(*cost_func,"var.norm.mbic")==0){ - costfunction = &mbic_var; - } - else if (strcmp(*cost_func,"meanvar.norm.mbic")==0){ - costfunction = &mbic_meanvar; -} - else if (strcmp(*cost_func,"meanvar.exp.mbic")==0){ - costfunction = &mbic_meanvar_exp; -} - else if (strcmp(*cost_func,"meanvar.gamma.mbic")==0){ - costfunction = &mbic_meanvar_gamma; -} - else if (strcmp(*cost_func,"meanvar.poisson.mbic")==0){ -costfunction = &mbic_meanvar_poisson; -} + double (*costfunction)(double, double, double, int, double); + double mll_var(double, double, double, int, double); + double mll_mean(double, double, double, int, double); + double mll_meanvar(double, double, double, int, double); + double mll_meanvar_exp(double, double, double, int, double); + double mll_meanvar_gamma(double, double, double, int, double); + double mll_meanvar_poisson(double, double, double, int, double); + double mbic_var(double, double, double, int, double); + double mbic_mean(double, double, double, int, double); + double mbic_meanvar(double, double, double, int, double); + double mbic_meanvar_exp(double, double, double, int, double); + double mbic_meanvar_gamma(double, double, double, int, double); + double mbic_meanvar_poisson(double, double, double, int, double); + + if (strcmp(*cost_func,"var.norm")==0){ + costfunction = &mll_var; + } + else if (strcmp(*cost_func,"mean.norm")==0){ + costfunction = &mll_mean; + } + else if (strcmp(*cost_func,"meanvar.norm")==0){ + costfunction = &mll_meanvar; + } + else if (strcmp(*cost_func,"meanvar.exp")==0){ + costfunction = &mll_meanvar_exp; + } + else if (strcmp(*cost_func,"meanvar.gamma")==0){ + costfunction = &mll_meanvar_gamma; + } + else if (strcmp(*cost_func,"meanvar.poisson")==0){ + costfunction = &mll_meanvar_poisson; + } + else if (strcmp(*cost_func,"mean.norm.mbic")==0){ + costfunction = &mbic_mean; + } + else if (strcmp(*cost_func,"var.norm.mbic")==0){ + costfunction = &mbic_var; + } + else if (strcmp(*cost_func,"meanvar.norm.mbic")==0){ + costfunction = &mbic_meanvar; + } + else if (strcmp(*cost_func,"meanvar.exp.mbic")==0){ + costfunction = &mbic_meanvar_exp; + } + else if (strcmp(*cost_func,"meanvar.gamma.mbic")==0){ + costfunction = &mbic_meanvar_gamma; + } + else if (strcmp(*cost_func,"meanvar.poisson.mbic")==0){ + costfunction = &mbic_meanvar_poisson; + } - // int *lastchangecpts; -// lastchangecpts = (int *)calloc(*n+1,sizeof(int)); -// if (lastchangecpts==NULL) { -// *error = 1; -// goto err1; -// } + // int *lastchangecpts; + // lastchangecpts = (int *)calloc(*n+1,sizeof(int)); + // if (lastchangecpts==NULL) { + // *error = 1; + // goto err1; + // } //double lastchangelike[*n]; /* stores likelihood up to that time using optimal changepoint locations up to that time */ -// double *lastchangelike; -// lastchangelike = (double *)calloc(*n+1,sizeof(double)); -// if (lastchangelike==NULL) { -// *error = 2; -// goto err2; - // } + // double *lastchangelike; + // lastchangelike = (double *)calloc(*n+1,sizeof(double)); + // if (lastchangelike==NULL) { + // *error = 2; + // goto err2; + // } //int checklist[*n]; int *checklist; @@ -125,7 +125,7 @@ costfunction = &mbic_meanvar_poisson; int nchecklist; double minout; -//double tmplike[*n]; + //double tmplike[*n]; double *tmplike; tmplike = (double *)calloc(*n+1,sizeof(double)); if (tmplike==NULL) { @@ -145,13 +145,13 @@ costfunction = &mbic_meanvar_poisson; int tstar,i,whichout,nchecktmp; - void min_which(); + void min_which(double*, int, double*, int*); lastchangelike[0]= -*pen; lastchangecpts[0]=0; numchangecpts[0]=0; - // lastchangelike[1]=costfunction(*(sumstat+1),*(sumstat + *n + 1 + 1),*(sumstat + *n + *n + 2 + 1),1, *shape); - // lastchangecpts[1]=0; lastchangecpts[*n+1]=1; + // lastchangelike[1]=costfunction(*(sumstat+1),*(sumstat + *n + 1 + 1),*(sumstat + *n + *n + 2 + 1),1, *shape); + // lastchangecpts[1]=0; lastchangecpts[*n+1]=1; int j; @@ -165,7 +165,7 @@ costfunction = &mbic_meanvar_poisson; lastchangecpts[j] = 0; } - for(j=*minseglen;j<(2*(*minseglen));j++){ + for(j=*minseglen;j<(2*(*minseglen));j++){ numchangecpts[j] =1; } @@ -177,29 +177,29 @@ costfunction = &mbic_meanvar_poisson; for(tstar=2*(*minseglen);tstar<(*n+1);tstar++){ R_CheckUserInterrupt(); /* checks if user has interrupted the R session and quits if true */ - if ((lastchangelike[tstar]) == 0){ - for(i=0;i<(nchecklist);i++){ + if ((lastchangelike[tstar]) == 0){ + for(i=0;i<(nchecklist);i++){ tmplike[i]=lastchangelike[checklist[i]] + costfunction(*(sumstat+tstar)-*(sumstat+checklist[i]),*(sumstat + *n + 1 +tstar)-*(sumstat + *n + 1 +checklist[i]),*(sumstat + *n + *n + 2 +tstar)-*(sumstat + *n + *n + 2 +checklist[i]), tstar-checklist[i], *shape)+*pen; } - min_which(tmplike,nchecklist,&minout,&whichout); /*updates minout and whichout with min and which element */ - lastchangelike[tstar]=minout; - lastchangecpts[tstar]=checklist[whichout]; - numchangecpts[tstar]=numchangecpts[lastchangecpts[tstar]]+1; - /* Update checklist for next iteration, first element is next tau */ + min_which(tmplike,nchecklist,&minout,&whichout); /*updates minout and whichout with min and which element */ + lastchangelike[tstar]=minout; + lastchangecpts[tstar]=checklist[whichout]; + numchangecpts[tstar]=numchangecpts[lastchangecpts[tstar]]+1; + /* Update checklist for next iteration, first element is next tau */ nchecktmp=0; - for(i=0;i + + + + + + + + + + + + + + + + + + + + + + +0 +50 +100 +150 +200 + + + + + + + + +0 +50 +100 +150 +200 +250 +300 + +Number of Changepoints +Penalty Value + diff --git a/tests/figs/plot-diagnostic-true-tests/diagnostic-plot-histogram.svg b/tests/figs/plot-diagnostic-true-tests/diagnostic-plot-histogram.svg new file mode 100644 index 0000000..e694adb --- /dev/null +++ b/tests/figs/plot-diagnostic-true-tests/diagnostic-plot-histogram.svg @@ -0,0 +1,187 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +0 +50 +100 +150 +200 + + + + + + + + +0 +50 +100 +150 +200 +250 +300 + +Number of Changepoints +Penalty Value + diff --git a/tests/testthat/test-plot.R b/tests/testthat/test-plot.R new file mode 100644 index 0000000..663b841 --- /dev/null +++ b/tests/testthat/test-plot.R @@ -0,0 +1,16 @@ +context("plot(diagnostic = TRUE) tests") + +# Generate cpt.range object +testdata <- changepoint::ftse100$V2 +obj.cpt.range <- cpt.var(testdata, method = "PELT", + penalty = "CROPS", pen.value = c(5, 500)) + +# For code coverage +plot(obj.cpt.range, diagnostic = TRUE) +plot(obj.cpt.range, diagnostic = TRUE, type = "h") + +# Tests for plots +vdiffr::expect_doppelganger("Diagnostic plot (default)", + plot(obj.cpt.range, diagnostic = TRUE)) +vdiffr::expect_doppelganger("Diagnostic plot (histogram)", + plot(obj.cpt.range, diagnostic = TRUE, type = "h"))