SNMoE (skew-normal mixtures-of-experts) provides a flexible modelling framework for heterogeneous data with possibly skewed distributions, generalising the standard normal mixture-of-experts model. SNMoE consists of a mixture of K skew-normal expert regressors (of degree p) gated by a softmax gating network (of degree q), and is parametrized by:

- the gating network parameters, i.e. the alpha's of the softmax net;
- the experts network parameters: the regression coefficients beta's, the scale parameters sigma's, and the skewness parameters lambda's.
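For reference, the mixture density this encodes can be written as follows; the notation is a sketch assembled from the description above (symbols such as $\pi_k$ and $\mu_k$ are ours, not taken verbatim from the package documentation):

$$
f(y \mid x; \Theta) \;=\; \sum_{k=1}^{K} \pi_k(x; \alpha)\, \mathrm{SN}\!\big(y;\, \mu_k(x; \beta_k),\, \sigma_k^2,\, \lambda_k\big),
$$

where $\pi_k(x; \alpha)$ is the softmax of degree-q polynomials in $x$ (the gating network), $\mu_k(x; \beta_k)$ is a degree-p polynomial mean (the k-th expert), and $\mathrm{SN}(\cdot;\mu,\sigma^2,\lambda)$ denotes the skew-normal density with location $\mu$, scale $\sigma$, and skewness $\lambda$.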
SNMoE thus generalises mixtures of (normal, skew-normal) distributions and mixtures of regressions with these distributions. For example, when q = 0, we retrieve mixtures of (skew-normal, or normal) regressions, and when both p = 0 and q = 0, it is a mixture of (skew-normal, or normal) distributions. It also reduces to the standard (normal, skew-normal) distribution when we use only a single expert (K = 1).

Model estimation/learning is performed by a dedicated expectation conditional maximization (ECM) algorithm, which maximizes the observed-data log-likelihood. We provide simulated examples to illustrate the use of the model in model-based clustering of heterogeneous regression data and in fitting non-linear regression functions.
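In the notation sketched above, the criterion the ECM algorithm climbs is the observed-data log-likelihood

$$
\log L(\Theta) \;=\; \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k(x_i; \alpha)\, \mathrm{SN}\!\big(y_i;\, \mu_k(x_i; \beta_k),\, \sigma_k^2,\, \lambda_k\big),
$$

maximized by alternating an E-step (conditional expectations of the expert memberships and of the skew-normal latent variables) with conditional M-steps; the verbose_IRLS argument in the fitting calls below indicates that the gating parameters alpha are updated by iteratively reweighted least squares (IRLS).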
This vignette was written in R Markdown, using the knitr package for production. See help(package = "meteorits") for further details and references provided by citation("meteorits").
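The examples below assume the package is attached and, for reproducibility, a seed is set; a minimal setup along these lines (the seed value is our own choice):

library(meteorits)
set.seed(123) # Any fixed seed works; 123 is an arbitrary illustrative choice

With the package attached, a two-expert sample is then simulated: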
n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
lambdak <- c(3, 5) # Skewness parameters of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)
# Generate sample of size n
sample <- sampleUnivSNMoE(alphak = alphak, betak = betak, sigmak = sigmak,
lambdak = lambdak, x = x)
y <- sample$y
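The emSNMoE call below references tuning variables (K, p, q, n_tries, max_iter, threshold, verbose, verbose_IRLS) that are defined in a chunk not shown in this extract. A plausible reconstruction, with values mirroring the package's other vignettes, is sketched here; treat the exact numbers as assumptions:

K <- 2              # Number of experts (matches the two-component sample above)
p <- 1              # Degree of the polynomial regression for the experts
q <- 1              # Degree of the logistic regression for the gating network

n_tries <- 1        # Number of EM runs; the best solution is kept
max_iter <- 1500    # Maximum number of ECM iterations
threshold <- 1e-6   # Log-likelihood convergence threshold
verbose <- TRUE     # Print the log-likelihood at each iteration
verbose_IRLS <- FALSE # Also print the IRLS criterion within the M-step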
snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter,
                 threshold, verbose, verbose_IRLS)
## EM - SNMoE: Iteration: 1 | log-likelihood: -659.364098455991
## EM - SNMoE: Iteration: 2 | log-likelihood: -532.877118445633
## EM - SNMoE: Iteration: 3 | log-likelihood: -528.02232626183
## EM - SNMoE: Iteration: 4 | log-likelihood: -527.11735186571
## EM - SNMoE: Iteration: 5 | log-likelihood: -526.74937351346
## EM - SNMoE: Iteration: 6 | log-likelihood: -526.456338861649
## EM - SNMoE: Iteration: 7 | log-likelihood: -526.17926051541
## EM - SNMoE: Iteration: 8 | log-likelihood: -525.919640457271
## EM - SNMoE: Iteration: 9 | log-likelihood: -525.682600335409
## EM - SNMoE: Iteration: 10 | log-likelihood: -525.469425428799
## EM - SNMoE: Iteration: 11 | log-likelihood: -525.279099930352
## EM - SNMoE: Iteration: 12 | log-likelihood: -525.109394941159
## EM - SNMoE: Iteration: 13 | log-likelihood: -524.957944030159
## EM - SNMoE: Iteration: 14 | log-likelihood: -524.822251207901
## EM - SNMoE: Iteration: 15 | log-likelihood: -524.700089183135
## EM - SNMoE: Iteration: 16 | log-likelihood: -524.589478847936
## EM - SNMoE: Iteration: 17 | log-likelihood: -524.488789599886
## EM - SNMoE: Iteration: 18 | log-likelihood: -524.396689275102
## EM - SNMoE: Iteration: 19 | log-likelihood: -524.312113413735
## EM - SNMoE: Iteration: 20 | log-likelihood: -524.234055001203
## EM - SNMoE: Iteration: 21 | log-likelihood: -524.162202553338
## EM - SNMoE: Iteration: 22 | log-likelihood: -524.096396825853
## EM - SNMoE: Iteration: 23 | log-likelihood: -524.036293514659
## EM - SNMoE: Iteration: 24 | log-likelihood: -523.981484933788
## EM - SNMoE: Iteration: 25 | log-likelihood: -523.931562385228
## EM - SNMoE: Iteration: 26 | log-likelihood: -523.886355724509
## EM - SNMoE: Iteration: 27 | log-likelihood: -523.845453069074
## EM - SNMoE: Iteration: 28 | log-likelihood: -523.808397137861
## EM - SNMoE: Iteration: 29 | log-likelihood: -523.774753110598
## EM - SNMoE: Iteration: 30 | log-likelihood: -523.743982962095
## EM - SNMoE: Iteration: 31 | log-likelihood: -523.715618850683
## EM - SNMoE: Iteration: 32 | log-likelihood: -523.689140023484
## EM - SNMoE: Iteration: 33 | log-likelihood: -523.664047614775
## EM - SNMoE: Iteration: 34 | log-likelihood: -523.639886519945
## EM - SNMoE: Iteration: 35 | log-likelihood: -523.616226649687
## EM - SNMoE: Iteration: 36 | log-likelihood: -523.592718000468
## EM - SNMoE: Iteration: 37 | log-likelihood: -523.569054744224
## EM - SNMoE: Iteration: 38 | log-likelihood: -523.545050178975
## EM - SNMoE: Iteration: 39 | log-likelihood: -523.520626137887
## EM - SNMoE: Iteration: 40 | log-likelihood: -523.495837114211
## EM - SNMoE: Iteration: 41 | log-likelihood: -523.4708772571
## EM - SNMoE: Iteration: 42 | log-likelihood: -523.445922769563
## EM - SNMoE: Iteration: 43 | log-likelihood: -523.42171490086
## EM - SNMoE: Iteration: 44 | log-likelihood: -523.398733082935
## EM - SNMoE: Iteration: 45 | log-likelihood: -523.377323807915
## EM - SNMoE: Iteration: 46 | log-likelihood: -523.357443266491
## EM - SNMoE: Iteration: 47 | log-likelihood: -523.339232395235
## EM - SNMoE: Iteration: 48 | log-likelihood: -523.32310682294
## EM - SNMoE: Iteration: 49 | log-likelihood: -523.309017551561
## EM - SNMoE: Iteration: 50 | log-likelihood: -523.296869112449
## EM - SNMoE: Iteration: 51 | log-likelihood: -523.286428744865
## EM - SNMoE: Iteration: 52 | log-likelihood: -523.277466961319
## EM - SNMoE: Iteration: 53 | log-likelihood: -523.269807429161
## EM - SNMoE: Iteration: 54 | log-likelihood: -523.263257287056
## EM - SNMoE: Iteration: 55 | log-likelihood: -523.257623391596
## EM - SNMoE: Iteration: 56 | log-likelihood: -523.252745171612
## EM - SNMoE: Iteration: 57 | log-likelihood: -523.248502537504
## EM - SNMoE: Iteration: 58 | log-likelihood: -523.244786721058
## EM - SNMoE: Iteration: 59 | log-likelihood: -523.241502187331
## EM - SNMoE: Iteration: 60 | log-likelihood: -523.238568528192
## EM - SNMoE: Iteration: 61 | log-likelihood: -523.235932146062
## EM - SNMoE: Iteration: 62 | log-likelihood: -523.233534487491
## EM - SNMoE: Iteration: 63 | log-likelihood: -523.231234718918
## EM - SNMoE: Iteration: 64 | log-likelihood: -523.229236244541
## EM - SNMoE: Iteration: 65 | log-likelihood: -523.227300052752
## EM - SNMoE: Iteration: 66 | log-likelihood: -523.225509521492
## EM - SNMoE: Iteration: 67 | log-likelihood: -523.223797086709
## EM - SNMoE: Iteration: 68 | log-likelihood: -523.222200187307
## EM - SNMoE: Iteration: 69 | log-likelihood: -523.220679459477
## EM - SNMoE: Iteration: 70 | log-likelihood: -523.219244892326
## EM - SNMoE: Iteration: 71 | log-likelihood: -523.217880442527
## EM - SNMoE: Iteration: 72 | log-likelihood: -523.216579848419
## EM - SNMoE: Iteration: 73 | log-likelihood: -523.215335607024
## EM - SNMoE: Iteration: 74 | log-likelihood: -523.214151965079
## EM - SNMoE: Iteration: 75 | log-likelihood: -523.21303387873
## EM - SNMoE: Iteration: 76 | log-likelihood: -523.211965338274
## EM - SNMoE: Iteration: 77 | log-likelihood: -523.210948447569
## EM - SNMoE: Iteration: 78 | log-likelihood: -523.209976968569
## EM - SNMoE: Iteration: 79 | log-likelihood: -523.209042141499
## EM - SNMoE: Iteration: 80 | log-likelihood: -523.208151147968
## EM - SNMoE: Iteration: 81 | log-likelihood: -523.207297676705
## EM - SNMoE: Iteration: 82 | log-likelihood: -523.206483520854
## EM - SNMoE: Iteration: 83 | log-likelihood: -523.205705127856
## EM - SNMoE: Iteration: 84 | log-likelihood: -523.204961448705
## EM - SNMoE: Iteration: 85 | log-likelihood: -523.204247971995
## EM - SNMoE: Iteration: 86 | log-likelihood: -523.203562624953
## EM - SNMoE: Iteration: 87 | log-likelihood: -523.202909668876
## EM - SNMoE: Iteration: 88 | log-likelihood: -523.202283513954
## EM - SNMoE: Iteration: 89 | log-likelihood: -523.201682764747
## EM - SNMoE: Iteration: 90 | log-likelihood: -523.201106116206
## EM - SNMoE: Iteration: 91 | log-likelihood: -523.200551519712
## EM - SNMoE: Iteration: 92 | log-likelihood: -523.20001958608
## EM - SNMoE: Iteration: 93 | log-likelihood: -523.199512857189
snmoe$summary()
## -----------------------------------------------
## Fitted Skew-Normal Mixture-of-Experts model
## -----------------------------------------------
##
## SNMoE model with K = 2 experts:
##
## log-likelihood df AIC BIC ICL
## -523.1995 10 -533.1995 -554.2726 -554.2698
##
## Clustering table (Number of observations in each expert):
##
## 1 2
## 251 249
##
## Regression coefficients:
##
## Beta(k = 1) Beta(k = 2)
## 1 0.9050196 0.9162288
## X^1 2.5374803 -2.5161610
##
## Variances:
##
## Sigma2(k = 1) Sigma2(k = 2)
## 0.3681353 0.6213428
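The second fit is run on a real dataset rather than the simulated sample: the clustering table below (70 + 66 = 136 observations) matches the tempanomalies data shipped with meteorits (annual temperature anomalies by year). The loading step is missing from this extract; it presumably resembles the following, with column names as documented for that dataset:

data("tempanomalies")
x <- tempanomalies$Year          # Observation years
y <- tempanomalies$AnnualAnomaly # Annual temperature anomalies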
snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter,
                 threshold, verbose, verbose_IRLS)
## EM - SNMoE: Iteration: 1 | log-likelihood: 83.9262407175937
## EM - SNMoE: Iteration: 2 | log-likelihood: 87.7188119212205
## EM - SNMoE: Iteration: 3 | log-likelihood: 88.8445797118665
## EM - SNMoE: Iteration: 4 | log-likelihood: 89.0852102605541
## EM - SNMoE: Iteration: 5 | log-likelihood: 89.2625930149186
## EM - SNMoE: Iteration: 6 | log-likelihood: 89.4730877383519
## EM - SNMoE: Iteration: 7 | log-likelihood: 89.6474003121459
## EM - SNMoE: Iteration: 8 | log-likelihood: 89.7509686750037
## EM - SNMoE: Iteration: 9 | log-likelihood: 89.806874689179
## EM - SNMoE: Iteration: 10 | log-likelihood: 89.8372612098151
## EM - SNMoE: Iteration: 11 | log-likelihood: 89.8560706433432
## EM - SNMoE: Iteration: 12 | log-likelihood: 89.8702894939668
## EM - SNMoE: Iteration: 13 | log-likelihood: 89.8822074613821
## EM - SNMoE: Iteration: 14 | log-likelihood: 89.8925244553486
## EM - SNMoE: Iteration: 15 | log-likelihood: 89.9017961640383
## EM - SNMoE: Iteration: 16 | log-likelihood: 89.9103334359088
## EM - SNMoE: Iteration: 17 | log-likelihood: 89.9182405289601
## EM - SNMoE: Iteration: 18 | log-likelihood: 89.9255416774617
## EM - SNMoE: Iteration: 19 | log-likelihood: 89.9322377598534
## EM - SNMoE: Iteration: 20 | log-likelihood: 89.9383303069481
## EM - SNMoE: Iteration: 21 | log-likelihood: 89.9438154977284
## EM - SNMoE: Iteration: 22 | log-likelihood: 89.9487855645687
## EM - SNMoE: Iteration: 23 | log-likelihood: 89.9532932545014
## EM - SNMoE: Iteration: 24 | log-likelihood: 89.9571981949638
## EM - SNMoE: Iteration: 25 | log-likelihood: 89.9605279574829
## EM - SNMoE: Iteration: 26 | log-likelihood: 89.9633100246732
## EM - SNMoE: Iteration: 27 | log-likelihood: 89.9658406879855
## EM - SNMoE: Iteration: 28 | log-likelihood: 89.9670066538147
## EM - SNMoE: Iteration: 29 | log-likelihood: 89.9681784712967
## EM - SNMoE: Iteration: 30 | log-likelihood: 89.9690797824211
## EM - SNMoE: Iteration: 31 | log-likelihood: 89.969859931756
## EM - SNMoE: Iteration: 32 | log-likelihood: 89.9704196465536
## EM - SNMoE: Iteration: 33 | log-likelihood: 89.9712719008371
## EM - SNMoE: Iteration: 34 | log-likelihood: 89.9715846944475
## EM - SNMoE: Iteration: 35 | log-likelihood: 89.9718222300817
## EM - SNMoE: Iteration: 36 | log-likelihood: 89.9719245875983
## EM - SNMoE: Iteration: 37 | log-likelihood: 89.9722547976426
## EM - SNMoE: Iteration: 38 | log-likelihood: 89.9723608875067
## EM - SNMoE: Iteration: 39 | log-likelihood: 89.9724083427288
snmoe$summary()
## -----------------------------------------------
## Fitted Skew-Normal Mixture-of-Experts model
## -----------------------------------------------
##
## SNMoE model with K = 2 experts:
##
## log-likelihood df AIC BIC ICL
## 89.97241 10 79.97241 65.40913 65.29856
##
## Clustering table (Number of observations in each expert):
##
## 1 2
## 70 66
##
## Regression coefficients:
##
## Beta(k = 1) Beta(k = 2)
## 1 -14.1039715 -33.80203804
## X^1 0.0072143 0.01719797
##
## Variances:
##
## Sigma2(k = 1) Sigma2(k = 2)
## 0.01497861 0.01736828
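Finally, the fitted object can be inspected graphically; the package's model objects expose a plot method showing the estimated mean curves, confidence regions, clustering, and the log-likelihood trace. A typical follow-up, assuming that method name:

snmoe$plot()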