A quick tour of RHLP

Introduction

RHLP (Regression with a Hidden Logistic Process) provides flexible, user-friendly probabilistic segmentation of time series (or structured longitudinal data) with smooth and/or abrupt regime changes. It is a mixture-model-based regression approach governed by a hidden logistic process and fitted with the EM algorithm.

This document was written in R Markdown and rendered with the knitr package.

See help(package="samurais") for further details and references provided by citation("samurais").

Load data

library(samurais)
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y

Set up RHLP model parameters

K <- 5 # Number of regimes (mixture components)
p <- 3 # Order of the polynomial regressors (beta has p + 1 coefficients per regime)
q <- 1 # Order of the logistic regression (w has q + 1 coefficients; set q to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
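To make the role of p concrete, here is a minimal base-R sketch of a polynomial design matrix of order p. The inputs `x_demo` and the name `X_poly` are hypothetical and used only for illustration; this is not necessarily how samurais builds its regressors internally.

```r
p <- 3
x_demo <- seq(0, 1, length.out = 5)  # hypothetical inputs, not the toy dataset

# One column per power of x, from x^0 (intercept) up to x^p
X_poly <- outer(x_demo, 0:p, "^")
colnames(X_poly) <- c("1", paste0("X^", 1:p))

dim(X_poly)  # 5 rows, p + 1 = 4 columns
```

Each regime has its own coefficient vector for these p + 1 columns, which is why the fitted model reports four regression coefficients per regime when p = 3.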

Set up EM parameters

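n_tries <- 1 # Number of EM runs (the run with the highest log-likelihood is kept)
max_iter <- 1500 # Maximum number of EM iterations
threshold <- 1e-6 # Stopping threshold on the relative change in log-likelihood
verbose <- TRUE # Print the log-likelihood at each EM iteration
verbose_IRLS <- FALSE # Print the criterion optimized by IRLS at each EM step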

Estimation

rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries, 
               max_iter, threshold, verbose, verbose_IRLS)
## EM - RHLP: Iteration: 1 | log-likelihood: -2119.2730847863
## EM - RHLP: Iteration: 2 | log-likelihood: -1149.01040275042
## EM - RHLP: Iteration: 3 | log-likelihood: -1118.2038425746
## EM - RHLP: Iteration: 4 | log-likelihood: -1096.8826062752
## EM - RHLP: Iteration: 5 | log-likelihood: -1067.55719335696
## EM - RHLP: Iteration: 6 | log-likelihood: -1037.26620104185
## EM - RHLP: Iteration: 7 | log-likelihood: -1022.7174307707
## EM - RHLP: Iteration: 8 | log-likelihood: -1006.118254514
## EM - RHLP: Iteration: 9 | log-likelihood: -1001.18491882476
## EM - RHLP: Iteration: 10 | log-likelihood: -1000.91250762673
## EM - RHLP: Iteration: 11 | log-likelihood: -1000.62280599148
## EM - RHLP: Iteration: 12 | log-likelihood: -1000.30309886791
## EM - RHLP: Iteration: 13 | log-likelihood: -999.932334867598
## EM - RHLP: Iteration: 14 | log-likelihood: -999.484219689836
## EM - RHLP: Iteration: 15 | log-likelihood: -998.928118018318
## EM - RHLP: Iteration: 16 | log-likelihood: -998.234244639955
## EM - RHLP: Iteration: 17 | log-likelihood: -997.359536244659
## EM - RHLP: Iteration: 18 | log-likelihood: -996.15265481515
## EM - RHLP: Iteration: 19 | log-likelihood: -994.697863399405
## EM - RHLP: Iteration: 20 | log-likelihood: -993.186583927774
## EM - RHLP: Iteration: 21 | log-likelihood: -991.813523755133
## EM - RHLP: Iteration: 22 | log-likelihood: -990.611295180997
## EM - RHLP: Iteration: 23 | log-likelihood: -989.539226242094
## EM - RHLP: Iteration: 24 | log-likelihood: -988.553118850066
## EM - RHLP: Iteration: 25 | log-likelihood: -987.539963656861
## EM - RHLP: Iteration: 26 | log-likelihood: -986.073920058718
## EM - RHLP: Iteration: 27 | log-likelihood: -983.263549767648
## EM - RHLP: Iteration: 28 | log-likelihood: -979.340492092037
## EM - RHLP: Iteration: 29 | log-likelihood: -977.468559826356
## EM - RHLP: Iteration: 30 | log-likelihood: -976.653534229025
## EM - RHLP: Iteration: 31 | log-likelihood: -976.589338743393
## EM - RHLP: Iteration: 32 | log-likelihood: -976.589338067356
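The trace above can be checked by hand. The sketch below copies the last four log-likelihood values and tests two things: that the log-likelihood never decreases (a general property of EM), and why the run stops at iteration 32. It assumes the stopping rule compares the relative change in log-likelihood between consecutive iterations with `threshold`; the variable names are illustrative.

```r
# Last four log-likelihood values copied from the trace (iterations 29 to 32)
loglik <- c(-977.468559826356, -976.653534229025,
            -976.589338743393, -976.589338067356)
threshold <- 1e-6

# EM should never decrease the log-likelihood
non_decreasing <- all(diff(loglik) >= 0)

# Relative change between consecutive iterations (assumed stopping rule)
rel_change <- abs(diff(loglik) / head(loglik, -1))
below_threshold <- rel_change < threshold

non_decreasing
below_threshold
```

Only the last comparison (iteration 31 to 32) falls below the threshold, which is consistent with the algorithm stopping after 32 iterations, well before `max_iter`.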

Summary

rhlp$summary()
## ---------------------
## Fitted RHLP model
## ---------------------
## 
## RHLP model with K = 5 components:
## 
##  log-likelihood nu       AIC       BIC       ICL
##       -976.5893 33 -1009.589 -1083.959 -1083.176
## 
## Clustering table (Number of observations in each regimes):
## 
##   1   2   3   4   5 
## 100 120 200 100 150 
## 
## Regression coefficients:
## 
##       Beta(k = 1) Beta(k = 2) Beta(k = 3) Beta(k = 4) Beta(k = 5)
## 1    6.031875e-02   -5.434903   -2.770416    120.7698    4.027543
## X^1 -7.424718e+00  158.705091   43.879453   -474.5887   13.194260
## X^2  2.931652e+02 -650.592347  -94.194780    597.7947  -33.760602
## X^3 -1.823560e+03  865.329795   67.197059   -244.2385   20.402152
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2) Sigma2(k = 3) Sigma2(k = 4) Sigma2(k = 5)
##       1.220624      1.110243      1.079394     0.9779734      1.028332
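The criteria in the summary can be reproduced from the other printed quantities. The sketch below is a sanity check under stated assumptions: the formulas are the penalized log-likelihood forms (larger is better) that happen to match the printed values, and the parameter count is a reconstruction rather than something taken from the package source.

```r
K <- 5; p <- 3; q <- 1
loglik <- -976.5893                      # from the summary
n <- sum(c(100, 120, 200, 100, 150))     # clustering table: 670 observations

# Free parameters: regression coefficients, logistic weights, variances
nu <- K * (p + 1) + (K - 1) * (q + 1) + K

aic <- loglik - nu                 # assumed form of the AIC
bic <- loglik - nu * log(n) / 2    # assumed form of the BIC

c(nu = nu, AIC = aic, BIC = bic)
```

With these assumptions nu is 33, and the AIC and BIC agree with the summary (-1009.589 and -1083.959) up to rounding of the printed log-likelihood.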

Plots

Fitted regressors

rhlp$plot(what = "regressors")

Estimated signal

rhlp$plot(what = "estimatedsignal")

Log-likelihood

rhlp$plot(what = "loglikelihood")