A quick tour of NMoE

Introduction

NMoE (Normal Mixtures-of-Experts) provides a flexible modelling framework for heterogeneous data with Gaussian (Normal) distributions. NMoE consists of a mixture of K Normal expert regressors (polynomials of degree p) gated by a softmax gating network (polynomials of degree q), and is defined by the following parameters (the mixture density is sketched after the list):

  • The gating network parameters: the softmax coefficients alpha.
  • The experts network parameters: the location parameters (regression coefficients) beta and the variances sigma2.
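
Concretely, the conditional density has the standard mixture-of-experts form (a sketch in generic MoE notation, not quoted from the package documentation):

$$
f(y \mid x; \Psi) = \sum_{k=1}^{K} \pi_k(x; \alpha)\, \mathcal{N}\big(y;\, \beta_k^{\top} x,\, \sigma_k^2\big),
\qquad
\pi_k(x; \alpha) = \frac{\exp(\alpha_k^{\top} x)}{\sum_{\ell=1}^{K} \exp(\alpha_{\ell}^{\top} x)},
$$

where x denotes the vector of polynomial terms of the input (degree p for the experts, degree q for the gating network).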

This vignette was written in R Markdown, using the knitr package for production.

See help(package = "meteorits") for further details, and citation("meteorits") for the package references.
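
The examples below assume the package is attached first (assuming meteorits is installed, e.g. from CRAN):

library(meteorits)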

Application to a simulated dataset

Generate sample

n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivNMoE(alphak = alphak, betak = betak, sigmak = sigmak, x = x)
y <- sample$y
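
Before fitting, a quick scatter plot of the simulated pairs is a useful sanity check (optional; plain base graphics, nothing package-specific):

# Visual check of the simulated sample
plot(x, y, pch = 19, cex = 0.5, xlab = "x", ylab = "y")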

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1          # Number of EM runs (with different initializations; the best fit is kept)
max_iter <- 1500      # Maximum number of EM iterations per run
threshold <- 1e-5     # Stop once the log-likelihood improvement falls below this value
verbose <- TRUE       # Print the log-likelihood at each EM iteration
verbose_IRLS <- FALSE # Print the IRLS criterion within the gating-network M-step

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: -806.158510299976
## EM NMoE: Iteration: 2 | log-likelihood: -805.474723076792
## EM NMoE: Iteration: 3 | log-likelihood: -804.599409421661
## EM NMoE: Iteration: 4 | log-likelihood: -802.776614190871
## EM NMoE: Iteration: 5 | log-likelihood: -798.862000135275
## EM NMoE: Iteration: 6 | log-likelihood: -791.004758285789
## EM NMoE: Iteration: 7 | log-likelihood: -776.970680842872
## EM NMoE: Iteration: 8 | log-likelihood: -755.617124660873
## EM NMoE: Iteration: 9 | log-likelihood: -729.374018061859
## EM NMoE: Iteration: 10 | log-likelihood: -705.436877525156
## EM NMoE: Iteration: 11 | log-likelihood: -690.440039381914
## EM NMoE: Iteration: 12 | log-likelihood: -683.585987216521
## EM NMoE: Iteration: 13 | log-likelihood: -680.834342132805
## EM NMoE: Iteration: 14 | log-likelihood: -679.691346516783
## EM NMoE: Iteration: 15 | log-likelihood: -679.150059025065
## EM NMoE: Iteration: 16 | log-likelihood: -678.852576279762
## EM NMoE: Iteration: 17 | log-likelihood: -678.66716060468
## EM NMoE: Iteration: 18 | log-likelihood: -678.54011904612
## EM NMoE: Iteration: 19 | log-likelihood: -678.447010679379
## EM NMoE: Iteration: 20 | log-likelihood: -678.375505702636
## EM NMoE: Iteration: 21 | log-likelihood: -678.318773924676
## EM NMoE: Iteration: 22 | log-likelihood: -678.272701933882
## EM NMoE: Iteration: 23 | log-likelihood: -678.234628597116
## EM NMoE: Iteration: 24 | log-likelihood: -678.202731722941
## EM NMoE: Iteration: 25 | log-likelihood: -678.175707646804
## EM NMoE: Iteration: 26 | log-likelihood: -678.15259290826
## EM NMoE: Iteration: 27 | log-likelihood: -678.132657964758
## EM NMoE: Iteration: 28 | log-likelihood: -678.115339940966
## EM NMoE: Iteration: 29 | log-likelihood: -678.100197924551
## EM NMoE: Iteration: 30 | log-likelihood: -678.086882088052
## EM NMoE: Iteration: 31 | log-likelihood: -678.075111734904
## EM NMoE: Iteration: 32 | log-likelihood: -678.064659344906
## EM NMoE: Iteration: 33 | log-likelihood: -678.055338779846
## EM NMoE: Iteration: 34 | log-likelihood: -678.046996442114
## EM NMoE: Iteration: 35 | log-likelihood: -678.039504567551
## EM NMoE: Iteration: 36 | log-likelihood: -678.032756083302
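
Besides the summary() and plot() methods used below, the fitted object carries the estimated parameters; a sketch of direct access (the param$alpha/beta/sigma2 field names are an assumption about the object's structure; str(nmoe) shows the actual fields):

nmoe$param$alpha  # estimated gating coefficients (assumed field name)
nmoe$param$beta   # estimated regression coefficients (assumed field name)
nmoe$param$sigma2 # estimated expert variances (assumed field name)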

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df       AIC       BIC       ICL
##       -678.0328  8 -686.0328 -702.8912 -740.8888
## 
## Clustering table (Number of observations in each expert):
## 
##   1   2 
## 292 208 
## 
## Regression coefficients:
## 
##     Beta(k = 1) Beta(k = 2)
## 1     0.0788678  -0.1938003
## X^1  -2.5643243   2.0904461
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##      0.8489325     0.7327877
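
The displayed df and criteria are mutually consistent under the usual parameter count for this model and a penalized log-likelihood convention (higher is better). A quick check, where the df decomposition is an assumption that matches the displayed df = 8:

df <- (q + 1) * (K - 1) + (p + 1) * K + K # gating + regression + variance parameters = 8
-678.0328 - df                # AIC as displayed: -686.0328
-678.0328 - df * log(500) / 2 # BIC as displayed: -702.8912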

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")

Application to a real dataset

Load data

data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly
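
A quick look at the series before fitting (plain base graphics):

plot(x, y, type = "l", xlab = "Year", ylab = "Annual anomaly")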

Set up NMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1          # Number of EM runs (with different initializations; the best fit is kept)
max_iter <- 1500      # Maximum number of EM iterations per run
threshold <- 1e-5     # Stop once the log-likelihood improvement falls below this value
verbose <- TRUE       # Print the log-likelihood at each EM iteration
verbose_IRLS <- FALSE # Print the IRLS criterion within the gating-network M-step

Estimation

nmoe <- emNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, 
               threshold, verbose, verbose_IRLS)
## EM NMoE: Iteration: 1 | log-likelihood: 48.9228441140463
## EM NMoE: Iteration: 2 | log-likelihood: 49.7589444969456
## EM NMoE: Iteration: 3 | log-likelihood: 51.6845844776189
## EM NMoE: Iteration: 4 | log-likelihood: 56.0856243423045
## EM NMoE: Iteration: 5 | log-likelihood: 63.3421351430867
## EM NMoE: Iteration: 6 | log-likelihood: 70.363549104895
## EM NMoE: Iteration: 7 | log-likelihood: 74.806898138828
## EM NMoE: Iteration: 8 | log-likelihood: 77.7568228419545
## EM NMoE: Iteration: 9 | log-likelihood: 80.4628903123957
## EM NMoE: Iteration: 10 | log-likelihood: 83.7638068909149
## EM NMoE: Iteration: 11 | log-likelihood: 88.1635344392289
## EM NMoE: Iteration: 12 | log-likelihood: 92.7746985321795
## EM NMoE: Iteration: 13 | log-likelihood: 95.3095664485913
## EM NMoE: Iteration: 14 | log-likelihood: 96.2197673770326
## EM NMoE: Iteration: 15 | log-likelihood: 96.5835142356994
## EM NMoE: Iteration: 16 | log-likelihood: 96.7839801009007
## EM NMoE: Iteration: 17 | log-likelihood: 96.9368846038257
## EM NMoE: Iteration: 18 | log-likelihood: 97.0772705979879
## EM NMoE: Iteration: 19 | log-likelihood: 97.2153394899876
## EM NMoE: Iteration: 20 | log-likelihood: 97.3528649667427
## EM NMoE: Iteration: 21 | log-likelihood: 97.4886791567153
## EM NMoE: Iteration: 22 | log-likelihood: 97.6210421499387
## EM NMoE: Iteration: 23 | log-likelihood: 97.7488977395181
## EM NMoE: Iteration: 24 | log-likelihood: 97.8725082479884
## EM NMoE: Iteration: 25 | log-likelihood: 97.9935567592081
## EM NMoE: Iteration: 26 | log-likelihood: 98.1147180376743
## EM NMoE: Iteration: 27 | log-likelihood: 98.2389699526914
## EM NMoE: Iteration: 28 | log-likelihood: 98.3689110223236
## EM NMoE: Iteration: 29 | log-likelihood: 98.506375671909
## EM NMoE: Iteration: 30 | log-likelihood: 98.6524322280808
## EM NMoE: Iteration: 31 | log-likelihood: 98.8077561864715
## EM NMoE: Iteration: 32 | log-likelihood: 98.9731345548937
## EM NMoE: Iteration: 33 | log-likelihood: 99.1499539333167
## EM NMoE: Iteration: 34 | log-likelihood: 99.3405985095048
## EM NMoE: Iteration: 35 | log-likelihood: 99.5488465157777
## EM NMoE: Iteration: 36 | log-likelihood: 99.7804652968701
## EM NMoE: Iteration: 37 | log-likelihood: 100.044216984486
## EM NMoE: Iteration: 38 | log-likelihood: 100.35339053082
## EM NMoE: Iteration: 39 | log-likelihood: 100.727393912158
## EM NMoE: Iteration: 40 | log-likelihood: 101.189441681236
## EM NMoE: Iteration: 41 | log-likelihood: 101.7412762218
## EM NMoE: Iteration: 42 | log-likelihood: 102.277167691258
## EM NMoE: Iteration: 43 | log-likelihood: 102.63286180931
## EM NMoE: Iteration: 44 | log-likelihood: 102.7196610052
## EM NMoE: Iteration: 45 | log-likelihood: 102.720470611251

Summary

nmoe$summary()
## ------------------------------------------
## Fitted Normal Mixture-of-Experts model
## ------------------------------------------
## 
## NMoE model with K = 2 experts:
## 
##  log-likelihood df      AIC      BIC      ICL
##        102.7205  8 94.72047 83.06985 83.17676
## 
## Clustering table (Number of observations in each expert):
## 
##  1  2 
## 84 52 
## 
## Regression coefficients:
## 
##       Beta(k = 1)  Beta(k = 2)
## 1   -12.667361687 -42.36153502
## X^1   0.006474844   0.02149239
## 
## Variances:
## 
##  Sigma2(k = 1) Sigma2(k = 2)
##     0.01352348    0.01193134
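
Reading the fit: the X^1 coefficients are slopes per year, so the two experts capture markedly different warming rates (a back-of-the-envelope conversion, assuming the anomalies are in degrees Celsius):

10 * c(0.006474844, 0.02149239) # ~0.065 vs ~0.215 degrees per decade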

Plots

Mean curve

nmoe$plot(what = "meancurve")

Confidence regions

nmoe$plot(what = "confregions")

Clusters

nmoe$plot(what = "clusters")

Log-likelihood

nmoe$plot(what = "loglikelihood")