Journal of Modern Power Systems and Clean Energy

ISSN 2196-5625 CN 32-1884/TK

网刊加载中。。。

使用Chrome浏览器效果最佳,继续浏览,你可能不会看到最佳的展示效果,

确定继续浏览么?

复制成功,请在其他浏览器进行阅读

A Robust Segmented Mixed Effect Regression Model for Baseline Electricity Consumption Forecasting  PDF

  • Xiaoyang Zhou
  • Yuanqi Gao
  • Weixin Yao
  • Nanpeng Yu
Department of Statistics, University of California, Riverside, CA, USA; Department of Electrical and Computer Engineering, University of California, Riverside, USA

Updated:2022-01-21

DOI:10.35833/MPCE.2020.000023

  • Full Text
  • Figs & Tabs
  • References
  • Authors
  • About
CITE
OUTLINE

Abstract

Renewable energy production has been surging around the world in recent years. To mitigate the increasing uncertainty and intermittency of the renewable generation, proactive demand response algorithms and programs are proposed and developed to further improve the utilization of load flexibility and increase the efficiency of power system operation. One of the biggest challenges to efficient control and operation of demand response resources is how to forecast the baseline electricity consumption and estimate the load impact from demand response resources accurately. In this paper, we propose a mixed effect segmented regression model and a new robust estimate for forecasting the baseline electricity consumption in Southern California, USA, by combining the ideas of random effect regression model, segmented regression model, and the least trimmed squares estimate. Since the log-likelihood of the considered model is not differentiable at breakpoints, we propose a new backfitting algorithm to estimate the unknown parameters. The estimation performance of the new estimation procedure has been demonstrated with both simulation studies and the real data application for the electric load baseline forecasting in Southern California.

I. Introduction

THE renewable energy sector has experienced exponential growth in the past five to ten years. The global annual growth rates of solar photovoltaic and wind energy are 42% and 17% from 2010 through 2015, respectively [

1]. The renewable penetration level in certain parts of the world is much higher than the global average penetration level. For example, the renewable energy penetration level in California reached 30% in 2017. The recently passed California Senate Bill No. 100 will further boost renewable penetration level up to 60% by 2030 and 100% by 2045. To mitigate the increasing uncertainty and intermittency of renewable generation, demand response resources are in critical need. In the past ten years, traditional and passive price-based and incentive-based demand response programs have been implemented throughout the USA. In recent years, proactive demand response algorithms and programs are proposed and developed to improve the utilization of load flexibility and dispatchability further [2]. Accurate load impact forecasting is needed to leverage the load flexibility from the demand response resources effectively. The load impact from a demand response resource is defined as the difference between load baselines and metered load when a demand response event is triggered. In practice, it is very challenging to develop a good estimation of the load baseline which represents the electric load that would have occurred without demand response event [3].

A sound baseline estimation methodology should represent an appropriate tradeoff between simplicity and accuracy. The existing baseline methodology can be categorized into two types: Type-I and Type-II. In Type-I methodology, the baseline is estimated by using a similar day-based algorithm, which depends on historical interval meter data and similarity metrics such as weather and calendar. Simplicity is the most significant advantage of Type-I methodology [

4], [5]. In Type-II methodology, more sophisticated statistical methods are adopted to estimate and forecast the baseline electricity consumption. Type-II methodology typically yields better forecasting accuracy and is undergoing rapid development. It can be further divided into three groups: statistical methods, machine learning/deep learning methods, and hybrid methods. In the first group, [6] proposes a refined multiple linear regression model. Reference [7] proposes a method to coherently convert a set of lower-level node forecasting to aggregate nodes using empirical copula and Monte-Carlo sampling. In the same vein, [8] proposes an aggregation of random forest load forecasting framework. The second group utilizes deep learning algorithms. Reference [9] proposes support vector regressions models to forecast the demand response baseline. In [10], an ensemble ResNet deep neural network model is proposed. The sequence-to-sequence recurrent neural network with attention mechanism is adopted in [11]. In the third group, hybrid methods have been developed, in which more than one forecasting algorithms serve as building blocks for an overall model. Reference [12] proposes a cooperative quantile regression forest and multivariate quantile regression framework. Reference [13] proposes a two-level hybrid ensemble of deep belief network model.

There are several limitations with the existing approaches. First, some of the methods do not exploit the structure of the forecasting problem effectively. For example, the segmented nature of calendar variables on the load profile is not well addressed. Second, deep learning based forecasting algorithms are typically computationally expensive to train. In addition, they yield un-interpretable results and can be sensitive to the selection of hyperparameters. Third, hybrid methods are generally complicated to build, thus can be error-prone to implement and benchmark. Lastly, most of the existing work build and train a separate model for each time series. This significantly limits the scalability of model, especially for large service territories operated by electric utilities.

In this paper, we propose a mixed effect segmented regression (MESR) model, which is a Type-II methodology, to forecast the hourly electric load baseline in Southern California, USA at the 220 kV transformer bank level. One commonly used method for electric power demand forecasting at each hour is the multiple linear regression with hour as a categorical variable and weather data as continuous covariates. An alternative model for hour is to include it as a linear predictor. However, it is expected that the linear effect of hour on electric demand does not hold in the whole range of time. To this end, we propose to model the hour effect by a segmented regression model [

14]-[17], which can be considered as a compromise between modelling the hour as a global linear predictor and a categorical variable. The nonlinear relationship with breakpoints is piece-wise, segmented, broken-line, or multi-phased. The breakpoints are also called change-points, transition-points or switch-points in some applications. Using a segmented regression model for the covariate hour, the effect on the electric consumption changes continuously across time so that we can borrow the information from other hours when estimating the impact of hour. The estimated breakpoints can also tell us how the linear effect of hour changes across different segmented areas. Segmented regression models have been widely used in many areas. In medication, segmented regression is a powerful statistical tool for estimating the intervention effects of interrupted time series studies [18]. The segmented regression is also used to identify the changes in the recent trend of cancer mortality and incidence data analysis [19]. In ecology area, the segmented regression is a widely used statistical tool to model ecological thresholds [20]. For the geometric purpose, the segmented regression statistically models the trends at groundwater levels [21]. Many other examples with piecewise linear terms have been studied in the literature including mortality studies [22], Stanford heart transplant data [23], and mouse leukemia [24]. Note that electric consumption data are essentially longitudinal/panel data. They exhibits very strong spatio-temporal dependencies [25]. To incorporate the correlation among the observations and the individual-specific heterogeneity from each transformer bank, we propose to use the random effect regression model [26], [27].

Note that it is not trivial to compute the maximum likelihood estimate (MLE) for the MESR, since its log-likelihood is not differentiable at breakpoints. Many standard computation algorithms such as the Newton-Raphson algorithm can not be used directly. In this paper, we propose a backfitting algorithm to combine the segmented regression estimation method proposed in [

28] and the mixed effect regression estimation method proposed in [29] to maximize the non-differentiable log-likelihood of the mixed effect segmented regression model. Note that the MLE is sensitive to outliers, which is the case of our electric consumption data collected in the Southern California. We further propose a robust estimation procedure for the considered model by extending the idea of the least trimmed squares (LTS) estimate [30]. Simulation studies demonstrate the effectiveness of the proposed estimation procedures. The LTS also provides much better forecasting performance than the standard MLE for the testing data when forecasting the hourly electric power consumption in Southern California.

The rest of the paper is organized as follows. Section II introduces the MESR and describes the proposed robust estimation algorithms. Section III illustrates the finite sample performance of the proposed method using a simulation study. In Section IV, we apply the new estimation procedure to forecast the hourly electric power demand in Southern California, USA. Section V concludes the paper with some discussions.

II. MESR and Proposed Robust Estimation Algorithm

Given a random sample {yij,xij,sij,zij}, i=1,2,,n,  j=1,2,,ni, where n is the number of subjects; ni is the number of observations collected for the ith subject; yij is the response variable; xij is the p-dimensional fixed-effect covariate; sij is the q-dimensional random-effect covariate; and zij is the breakpoint variable with the breakpoints {φk},k=1,2,,l. The proposed MESR model for the load baseline estimation can be written as:

yij=xijTϕ+sijTγi+β0zij+k=1lβk(zij-φk)++εij (1)

where ϕ is the regression coefficient for the random effect covariates; β0 and βk are the regression coefficients for the breakpoint variables; and the quantities with a subscript “+” means taking the positive part. For example, t+ equals t if t0 and 0 otherwise; γiNq(0,Σγ); εi=(εi1,εi2,,εini)~ Nni(0,Σε). In this paper, we assume that Σε=σ2Ini, where σ is the standard deviation of the error variable; and Ini is the ni×ni identity matrix. The MESR (1) consists of three parts: multiple linear regression xijTϕ, random effects sijTγi, and segmented regression β0zij+k=1lβk(zij-φk)+, which models the heterogeneous linear effect of zij on yij across different areas of the breakpoint variable. βk measures the difference of slopes (linear effect of zij on yij) before and after the breakpoint φk. We mainly focus on the situation where the segmented parts are fixed effects. But the proposed estimation procedure can be extended to the situation where the segmented parts also contain random effects [

31]-[33].

Suppose that yi=(yi1,yi2,,yini)T,  Xi=(xi1,xi2,,xini)T,  Si=(si1,si2,,sini)T, and Zi=(zi1*,zi2*,,zini*)T, where zij*=(zij,(zij-φ1)+,,(zij-φl)+)T. Then, (1) can be rewritten in matrix format as:

yi=Xiϕ+Siγi+Ziβ+εi (2)

where β=(β0,β1,,βl)T. Based on (2), E(yi|Xi,Zi,Si)=Xiϕ+Ziβ and var(yi|Xi,Zi,Si)=SiΣγSiT+σε2IniΣi, where E() and var() denote conditional expectation and conditional variances, respectively. Therefore, the random effectsγi make the observations within each correlated subject. The log-likelihood function of {yij,xij,sij,zij}, i=1,2,,n,  j=1,2,,ni is:

l(θ)i=1nln(|Σi|-1/2)-12i=1n(yi-Xiϕ-Ziβ)TΣi-1(yi-Xiϕ-Ziβ) (3)

where θ collects all the unknown parameters {ϕ,β,φ,σ,Σγ} in model (1). Unlike the traditional mixed effect model, maximizing (3) is not trivial since it is not differentiable at φk. We propose a backfitting algorithm to maximize (3) by alternately updating the segmented regression part and the linear mixed effect part. Next, we discuss in detail how to perform such two estimation procedures.

A. Estimation of Breakpoints

Given the estimate {ϕ,Σ̂i}, (1) will be a segmented regression model. The breakpoints and slopes in segmented regression can be estimated through many ways such as regression spline as well as Bayesian Markov chain Monte Carlo (MCMC) methods [

34], [35]. We will extend the linearization technique proposed in [28] to MESR (1) due to its simplicity of computation. According to the definition of breakpoints, the log-likelihood is not differentiable at φk. The breakpoint estimation can be performed via a first-order Taylor expansion of (zij-φk)+ around an initial value φk(0):

(zij-φk)+(zij-φk(0))++(φk-φk(0))(-1)I(zij>φk(0)) (4)

where I() is the indicator function. It equals 1 if the condition inside the parenthesis is true and 0 otherwise; (-1)I(zij>φk(0)) is the first derivative of (zij-φk)+ assessed in φk(0).

Let vij=((-1)I(zij>φ10), (-1)I(zij>φ20),, (-1)I(zij>φl(0)))T, z˜ij=(zij, (zij-φ10)+, (zij-φ20)+,, (zij-φl(0))+)T, and δk=βk(φk-φk(0)). Define Vi=(vi1,vi2,,vini)T, δ=(δ1,δ2,,δl)T, and Z˜i=(z˜i1,z˜i2,,z˜ini)T. Given the estimate {ϕ,Σ̂i}, the log-likelihood (3) can be simplified as:

l1(β,δ)-12i=1n(y˜i-Z˜iβ-Viδ)TΣ̂i-1(y˜i-Z˜iβ-Viδ) (5)

where y˜i=yi-Xiϕ. Therefore, β and δ in (5) can be easily found by weighted least squares estimate. Note that φk=(δk/βk)+φk(0). The iterative algorithm will terminate at δk=0. Given the estimate {ϕ,Σ̂i}, the algorithm to estimate the breakpoints is summarized in Algorithm 1.

Algorithm 1  : segmented regression estimation

1. Set initial values of all breakpoints φk(0) for k=1,2,...,l, and calculate the  variable Z˜i and the variable Vi

2. Repeat

3. Fit the regression model of y˜i on Z˜i and Vi by maximizing the log-like   lihood (5). Update the breakpoint with equation φk(s+1)=δk(s)/βk(s)+φk(s),  where φk(s) is the estimate of φk at the sth iteration

4. Until converge

B. Estimation of Mixed Effect Regression Models

In this sub-section, we discuss how to maximize (3) given the estimate β and φ, where φ=(φ1,φ2,,φl)T. Let Ẑi be the estimate of Zi after replacing φk by φ̂k. Plugging in the estimate {Ẑi,β} into the model (1), we can obtain:

yi*=Xiϕ+Siγi+εi (6)

where yi*=yi-Ẑiβ. Therefore, the model (6) is simply a traditional mixed effect regression model. We propose to employ the penalized weighted least square (PWLS) method to estimate the unknown parameters in (6). More details of computing the linear mixed effect regression model are given in [

29], which are also implemented in R package lme4.

C. Estimation of Mixed Effect Breakpoints

By combining the estimation procedures in Section II-A and II-B, we propose Algorithm 2 to maximize the log-likelihood (3) for the model (2).

Algorithm 2  : MLE

1. Set initial values of breakpoint φk(0) and β(0)

2. Repeat

3. Given current breakpoint values φk(s) and slopes β(s), calculate yi*(s)=yi-    Ẑi(s)β(s)

4. Fit mixed effect regression model by the PWLS estimation procedure    introduced in Section II-B to obtain covariance matrix Σr(s) and the     fixed effect regression estimate ϕ(s)

5. Calculate y˜i(s)=yi-Xiϕ(s)

6. Fit segmented regression model with y˜i(s) and Σr(s) using Algorithm 1 and  update segmented regression parameter estimate to φ(s+1) and β(s+1)

7. Until converge

D. Estimation of Robust Segmented Mixed Effect Regression

It is well known that the MLE is sensitive to outliers and might give misleading results when there are outliers in the data, which is the case for our collected electric power demand data in Southern California. More details will be given in Section IV. The issue of outlier is well recognized in the field of load forecasting, and is typically solved using robust regression algorithms. For example, [

36] considers Huber’s robust regression; [37] advocates the use of L1 regression model. In the statistics literature, many robust regression methods have been proposed, although not all of them are investigated in the load forecasting literature, e.g., M-estimates [38], R-estimates [39], least median of squares (LMS) estimates [40], LTS estimates [30], S-estimates [41], MM-estimates [42], robust and efficient weighted least squares estimator (REWLSE) [43], mean shift method [44]-[46]. Reference [47] provides a good review of some commonly used robust regression estimation methods. Next, we propose the idea of least trimmed squares estimate [30] to provide a robust estimate of the model (1). Given an integer trimming parameter hN, where N is the total number of training samples, the least trimmed squares minimize the sum of the smallest h squared residuals with objective function:

k=1hr(k)2 (7)

where r(k)2[r(1)2,r(N)2] are the ordered squared residuals {yij-ŷij,i=1,2,,n,  j=1,2,,ni} with ŷij=xijTϕ+sijTγi+β̂0zij+k=1lβ̂k(zij-φ̂k)+. The robust MESR estimation based on LTS is described in Algorithm 3.

Algorithm 3  : LTS

1. A subsample of size h* is selected randomly from the data and then the  model (1) is fitted to the subsample using Algorithm 2 of Section II-   C. Let θ(0) be the initial parameter estimate

2. Repeat

3. Based on current model parameter estimate θ(s), forecast N responses    ŷij(s), and calculate the residuals rij(s)=yij-ŷij(s). Rank the squared residu   als from smallest to largest and select the first h observations that cor  respond to the smallest h squared residuals

4. Fit the model (1) to the subsample selected in Step 3 using Algorithm 2  and get the model parameter estimate θ(s+1)

5. Until converge

To increase the chance of finding the global minimum, one might run Algorithm 3 from many random subsamples and choose the solution which has the smallest trimmed squares. Let r be the dimension of θ. The initial sample size h* can be any small number larger than r as long as the initial parameter estimate θ(0) can be computed based on the subsample. The maximum breakpoint of LTS [

48], i.e., the smallest fraction of contamination that can cause the estimator to take arbitrary large values, is 0.5, which is attained when h=[(N+r+1)/2], where [·] means rounding to the nearest integer. If we have the prior that the proportion of outliers is no more than α, we can also set h=[N(1-α)+1], where α is he trimming proportion. In practice, one might try several α values to evaluate LTS and check how the estimate behaves with different trimming proportions [49]-[51]. In our real data application, we use a validation data to choose the trimming proportion.

III. Simulation Study

In this section, we use a simulation study to illustrate the performance of the proposed estimation procedure for the MESR. All the computations are implemented in R. We use R package segmented::segmented [

52] for breakpoint estimation and lme4::lmer [53] for random-effect estimation. We generate observations {yij,xij,sij,zij,i=1,2,,n,j=1,2,,ni} from the following model:

yij=ϕ0+ϕ1xij+γi0+sijγi1+β0zij+β1(zij-φ1)++β2(zij-φ2)++εij (8)

where xijPois(10), the Poisson distribution with rate parameter 10; sijUniform(5,10), the uniform distribution with lower and upper limits 5 and 10, respectively. The breakpoint variables zij are ni arithmetic sequence ranging in (0,20), εijN(0,0.5), γi0γi1N00,σr12ρσr1σr2ρσr1σr2σr22, with σr1=σr2=1,  ρ=0.5,  i=1,2,,n,  j=1,2,,ni. The other parameters in (8) are set to be: ϕ0=-2.5; ϕ1=1.5; β0=1.5; β1=1.5; β2=-2.5; φ1=6.67; and φ2=13.33.

We consider the following four simulation scenarios:

1) n=50, and ni is randomly chosen in (90,110).

2) n=50, and ni is randomly chosen in (18,22).

3) n=200, and ni is randomly chosen in (450,550).

4) n=200, and ni is randomly chosen in (18,22).

First, we utilize (8) to simulate the dataset without outliers. The model is estimated using MLE. In Tables I-IV, we report the mean, median, and standard deviation (SD) for the estimates of fixed effect regression parameters, breakpoints, segmented regression parameters, and random effect covariance matrix, respectively, based on 500 replications.

Table I Simulation Results of Fixed Effect Parameter Estimates by MLE for Situation Without Outliers
Parameterϕ0=-2.5ϕ1=1.5
MeanMedianSDMeanMedianSD
n=50, niU(90,110) -2.505 -2.508 0.125 1.500 1.499 0.002
n=50, niU(18,22) -2.500 -2.502 0.169 1.500 1.500 0.005
n=200, niU(450,550) -2.498 -2.497 0.064 1.500 1.500 0.001
n=200, niU(18,22) -2.491 -2.498 0.125 1.500 1.500 0.004
Table II Simulation Results of Breakpoints Estimates by MLE for Situation Without Outliers
Parameterφ1=6.667φ2=13.333
MeanMedianSDMeanMedianSD
n=50, niU(90,110) 6.667 6.666 0.022 13.334 13.332 0.012
n=50, niU(18,22) 6.667 6.664 0.053 13.334 13.333 0.034
n=200, niU(450,550) 6.667 6.667 0.006 13.333 13.333 0.003
n=200, niU(18,22) 6.667 6.666 0.023 13.332 13.332 0.023
Table III Simulation Results of Breakpoint Slope Estimates by MLE for Situation Without Outliers
Parameterβ0=1.5β1=1.5β2=-2.5
MeanMedianSDMeanMedianSDMeanMedianSD
n=50, niU(90,110) 1.499 1.500 0.006 1.499 1.499 0.008 -2.499 -2.499 0.008
n=50, niU(18,22) 1.499 1.500 0.013 1.502 1.502 0.020 -2.502 -2.502 0.019
n=200, niU(450,550) 1.500 1.500 0.001 1.500 1.500 0.001 -2.500 -2.500 0.002
n=200, niU(18,22) 1.499 1.499 0.010 1.502 1.501 0.014 -2.501 -2.502 0.014
Table IV Simulation Results of Random Effect Estimates BY MLE for Situation Without Outliers
Parameterσr1=1σr2=1ρ=0.5
MeanMedianSDMeanMedianSDMeanMedianSD
n=50, niU(90,110) 0.969 0.959 0.108 0.978 0.976 0.101 0.504 0.503 0.121
n=50, niU(18,22) 0.988 0.988 0.141 0.993 0.990 0.096 0.490 0.501 0.150
n=200, niU(450,550) 0.990 0.991 0.050 0999 0.999 0.056 0.499 0.499 0.001
n=200, niU(18,22) 0.987 0.990 0.101 0.997 1.000 0.068 0.500 0.506 0.094

From Tables I-IV, we can see that the proposed MLE algorithm performs well when the dataset does not contain any outliers. Also, when the sample size increases, the SD of each parameter estimate decreases.

Next, we simulate the dataset with outliers based on model (8). The model parameters are estimated by both Algorithm 2 and Algorithm 3. In order to check how robust each estimate is against the outliers, we randomly choose 5% of each simulated data and add 30 to the response Y (the range of Y is (15,69)) and 10 to the value of X (the range of X is (0,10)). When applying LTS, we need to choose the trimming proportion α, which has long been a difficult problem. However, LTS can provide a robust model estimate as long as the proportion of outliers is less than α but with low efficiency if α is too large. Usually, a conservative choice of α is recommended in practice. In this paper, we report the results for both α=0.1 and α=0.2. Note that the results of LTS will be better if α=0.05 is used.

Tables V-VIII present the simulation results for the estimates of fixed effect regression parameters, breakpoints, segmented regression parameters, and random effect covariance matrix, respectively, based on 200 replications. From the tables, it is observed that the standard MLE fails to provide the reasonable estimates of fixed effect regression parameters and random effect covariance matrix when the data contain 5% outliers while LTS can provide reasonable estimates for all parameters with both α=0.1 and α=0.2.

Table V Simulation Results of Fixed Effect Estimates by MLE and LTS with Different α Levels for Situation with Outliers
ParameterMethodϕ0=-2.5ϕ1=1.5
MeanMedianSDMeanMedianSD
n=50, niU(90,110) MLE 3.332 3.334 0.663 1.019 1.017 0.026
LTS α=0.2 -2.535 -2.539 0.283 1.500 1.500 0.004
LTS α=0.1 -2.521 -2.522 0.210 1.500 1.500 0.003
n=50, niU(18,22) MLE 3.293 3.298 1.069 1.490 1.487 0.045
LTS α=0.2 -2.471 -2.438 0.616 1.500 1.500 0.015
LTS α=0.1 -2.496 -2.485 0.539 1.499 1.500 0.014
n=200, niU(450,550) MLE 3.310 3.314 0.180 1.017 1.017 0.006
LTS α=0.2 -2.502 -2.507 0.130 1.500 1.500 0.001
LTS α=0.1 -2.502 -2.505 0.089 1.500 1.500 0.001
n=200, niU(18,22) MLE 3.359 3.417 1.049 1.493 1.496 0.046
LTS α=0.2 -2.464 -2.437 0.290 1.500 1.500 0.007
LTS α=0.1 -2.488 -2.491 0.267 1.500 1.499 0.007
Table VI Simulation Results of Breakpoint Estimates by MLE and LTS with Different α Levels for Situation with Outliers
ParameterMethodφ1=6.667φ2=13.333
MeanMedianSDMeanMedianSD
n=50, niU(90,110) MLE 6.647 6.679 0.546 13.322 13.317 0.324
LTS α=0.2 6.670 6.380 0.755 13.351 13.342 0.519
LTS α=0.1 6.670 6.677 0.415 13.323 13.332 0.283
n=50, niU(18,22) MLE 6.319 6.317 0.204 12.656 12.651 0.139
LTS α=0.2 6.425 6.445 0.185 13.027 13.037 0.237
LTS α=0.1 6.603 6.616 0.170 13.116 13.117 0.145
n=200, niU(450,550) MLE 6.671 6.673 0.172 13.333 13.337 0.098
LTS α=0.2 6.670 6.685 0.293 13.325 13.330 0.166
LTS α=0.1 6.670 6.677 0.166 13.328 13.330 0.094
n=200, niU(18,22) MLE 6.307 6.316 0.209 12.643 12.649 0.144
LTS α=0.2 6.427 6.433 0.082 13.030 13.039 0.118
LTS α=0.1 6.606 6.610 0.076 13.190 13.116 0.067
Table VII Simulation Results of Breakpoint Slope Estimates by MLE and LTS with Different α Levels for Situation with Outliers
ParameterMethodβ0=1.5β1=1.5β2=-2.5
MeanMedianSDMeanMedianSDMeanMedianSD
n=50, niU(90,110) MLE 1.493 1.494 0.109 1.521 1.518 0.154 -2.516 -2.522 0.163
LTS α=0.2 1.501 1.502 0.123 1.510 1.500 0.166 -2.519 -2.505 0.168
LTS α=0.1 1.506 1.503 0.070 1.505 1.501 0.083 -2.509 -2.507 0.071
n=50, niU(18,22) MLE 1.590 1.590 0.039 1.544 1.536 0.074 -2.610 -2.602 0.135
LTS α=0.2 1.556 1.556 0.027 1.584 1.564 0.102 -2.602 -2.580 0.109
LTS α=0.1 1.575 1.576 0.029 1.557 1.557 0.069 -2.590 -2.588 0.070
n=200, niU(450,550) MLE 1.500 1.502 0.035 1.500 1.499 0.040 -2.500 -2.502 0.042
LTS α=0.2 1.500 1.502 0.054 1.502 1.500 0.051 -2.499 -2.502 0.054
LTS α=0.1 1.495 1.497 0.022 1.506 1.505 0.026 -2.506 -2.499 0.028
n=200, niU(18,22) MLE 1.587 1.586 0.037 1.549 1.544 0.075 -2.605 -2.611 0.138
LTS α=0.2 1.574 1.574 0.014 1.557 1.556 0.048 -2.499 -2.502 0.054
LTS α=0.1 1.553 1.553 0.014 1.571 1.566 0.035 -2.506 -2.499 0.028
Table VIII Simulation Results of Random Effect Estimates by MLE and LTS with Different α Levels for Situation with Outliers
ParameterMethodσr1=1σr2=1ρ=0.5
MeanMedianSDMeanMedianSDMeanMedianSD
n=50, niU(90,110) MLE 0.518 0.379 0.582 1.019 1.026 0.117 0.843 0.999 0.612
LTS α=0.2 1.000 0.993 0.097 0.993 1.003 0.097 0.509 0.510 0.115
LTS α=0.1 0.992 0.998 0.111 0.994 0.999 0.097 0.511 0.519 0.107
n=50, niU(18,22) MLE 0.529 0.529 0.111 0.533 0.523 0.318 0.815 0.816 0.158
LTS α=0.2 0.823 0.825 0.121 0.779 0.779 0.066 0.498 0.485 0.066
LTS α=0.1 0.746 0.744 0.114 1.080 1.073 0.079 0.494 0.493 0.079
n=200, niU(450,550) MLE 0.769 0.897 0.435 1.013 1.015 0.056 0.619 0.592 0.272
LTS α=0.2 0.992 0.990 0.048 0.998 0.998 0.047 0.495 0.496 0.050
LTS α=0.1 0.992 0.989 0.048 0.998 0.997 0.047 0.496 0.496 0.049
n=200, niU(18,22) MLE 0.535 0.531 0.118 0.553 0.550 0.315 0.787 0.787 0.148
LTS α=0.2 0.884 0.848 0.060 0.781 0.782 0.033 0.476 0.473 0.085
LTS α=0.1 0.763 0.763 0.056 1.082 1.082 0.038 0.493 0.493 0.071

IV. Real Data Analysis

In this Section, we illustrate the application of the proposed estimation procedure of MESR to forecast the electric load in Southern California, USA.

A. Data

The electric consumption data are aggregated to fifty-two 220 kV transformer banks from December 31, 2012 to November 1, 2013 in Southern California Edison’s service territory. The task is to build a forecasting model for the total electricity consumption of residential customers at each 220 kV transformer bank on weekdays.

The data cleansing of the raw dataset is done in two steps. First, we exclude daily observations for commercial customers and remove zero-usage records from the electric consumption data file. Second, we add daily temperature and humidity information for each bank according to its zipcodes.

The response variable Ut is the aggregated customers’ hourly electricity consumption recorded by the smart meters. We use the following transformation to make it comparative among 52 subgroups:

Lt=lnUtAtotal (9)

In (9), the transformed response variable is derived as follows. First, we divide the aggregated usage by the total air conditioning tonnage of a residential customer. The customer is in the air conditioning cycling program. Second, we apply the log-transformation. The electricity consumption is divided by the total AC tonnage because the latter determines the numerical magnitude of the load measurements. Since the new response variable represents the electricity consumption level per unit of air conditioning tonnage, the effects of other explanatory variables are comparable among different transformer banks, which allows to use common slopes to simplify the model. Figure 1 shows the transformed response variable for a few sample banks over 5 days.

Fig. 1 Data visualization of transformed response variable Lt (shown in the figure as lnUsageper,t) versus transformer bank indicator variable B (shown in the figure as A_Bank) and time.

The explanatory variables are collected and listed in Table IX. A two-day lagged electricity consumption variable is selected rather than the one-day lagged variable, because the estimates of the load impact of the demand response resources need to be submitted to the independent system operator one day before the actual operations. The average temperature and humidity are included because they are highly correlated with the electricity consumption. The duty cycle option variable indicates the percentage of participation rate of air conditioning load in the program, and has a substantial influence over the load impact for air conditioning cycling demand response program. The transformer bank indicator variable B is chosen as the random effect, because it contains information about the data from different geographic areas, which is thus expected to be heterogeneous with different baselines. A random effect model, assuming that B are sampled from a larger population, is able to incorporate the individual-specific heterogeneity of B while allowing to borrow information across B with much smaller number of parameters (compared with one fixed effect parameter for each of 52 values of B). In addition, this allows to extend the model to additional transformer banks.

Table Ix Seven Explanatory Variables in Real Data Application
NotationExplanatory variable
Ltlag Base e log of two-day lagged electricity consumption
Tt Daily average ambient temperature
Ht Humidity of the day
Hrt Hour/time of the day
At Duty cycle percentage
Atotal Total AC tonnage under the same transformer bank
B Indicator variable of transformer bank

Note:   B is the random effect variable; and Hrt is the segmented variable.

In this paper, the training dataset is chosen as the samples in the first 205 observed weekdays for all transformer banks. The testing dataset consists of the samples from the 10 observed weekdays immediately following the training dataset. The total number of testing sample is 12480.

B. Model and Result

We apply the proposed estimation procedure of MESR to forecast the electricity consumption. Figure 2 shows the hourly trend for average electric consumption and its forecasting results for a typical day. Two breakpoints locate between 02:00-3:00 a.m. and 06:00-08:00 p.m., respectively.

Fig. 2 Hourly trend between average hourly electric consumption Lt (shown in the figure as lnUsageper,t) with variable Hrt (shown in the figure as t) averaged over all transformer bank B.

It seems that the curve corresponding to the actual consumption (after the log transformation) indicates three segments with two breakpoints. The first breakpoint locates between 02:00 a.m. and 03:00 a.m., and the second breakpoint lies between 06:00 p.m. and 08:00 p.m.. We have also tried the model with three breakpoints (one more breakpoint in the middle segmented area), but the BIC for two breakpoints is smaller. The forecasting curve in Fig. 2 corresponds to the forecasted values across all transformer banks for the same forecasting day. As shown in Fig. 2, we can see that the proposed model can forecast the actual values well and the fitted values also match the breakpoint relationship.

The observations collected over time within the same transformer bank are correlated. The auto-correlation function of the observation time series of each transformer bank is shown in Fig. 3, where the time series demonstrates strong auto-correlation patterns.

Fig. 3 Auto-correlation function of observation time series of each transformer bank.

Ignoring such correlation by fixed effect model would result in inefficient estimates and lose forecasted power. To incorporate such correction, the transformer bank is treated as random effects. Using a random effect model can also drastically reduce the number of unknown parameters in the model, and thus lead to more efficient parameter estimates.

Next, we describe the construction of the fixed effects. The first six explanatory variables described in Table IX, along with their two-way and three-way interactions, are considered as potential explanatory variables. The model variables are selected via least absolute shrinkage and selection operator (LASSO) regression method [

54], which improves the forecasting accuracy and interpretability of the model. The final selected MESR is expressed as:

Lt=B+Hrt+(Hrt-φ1)++(Hrt-φ2)++Tt+Ht+At+Ltlag+[Hrt+(Hrt-φ1)++(Hrt-φ2)+]Tt+[Hrt+(Hrt-φ2)+]Ht+[Hrt+(Hrt-φ1)++(Hrt-φ2)+]At+(Tt+Ht)At+TtHt (10)

where BN(0,σA_Bank2) is the normal distribution with mean 0 and variance σA_Bank2. We apply both MLE and LTS algorithms to estimate the model and compare their forecasting performances. Since the true proportion of outliers is unknown, three proportions α=0.15,0.10,0.05 are selected for LTS to fit the model (10). In addition, the proposed algorithms are compared with two benchmarks, i.e., the multiple linear regression model [

55] and the cooperative quantile regression forest (QRF)/multivariate quantile regression (MQR) method [12]. We set the training/testing data partitioning of these benchmarks to be the same as the setup discussed in Section IV-A. We compare their performances by mean absolute percentage error (MAPE) and root mean squared error (RMSE) on the testing dataset. The formula for MAPE and RMSE are given by MAPE=1Nyij-ŷijyij and RMSE=1N(yij-ŷij)2, where N is the total number of testing samples. For better comparison, we also report three quartiles of absolute percentage error (APE) and absolute error (AE).

From Tables X and XI, the proposed robust segmented mixed effect regression model (LTS in the tables) outperforms the multiple linear regression model [

55] and the QRF/MQR model [55]. The improvements are more significant in terms of the RMSE and AE. The reason why the LTS has a slightly higher MAPE compared with the QRF/MQR baseline is that the LTS produces a bit larger estimation errors for some transformer banks with lighter loading. This results in a higher MAPE due to the small denominator. Within the LTS method, each evaluation criterion reaches the lowest value when α=0.1 and is much smaller than that of MLE. The breakpoint estimates shown in Table XII confirm the locations of the breakpoints shown in Fig. 2.

Table X Forecasting Results Evaluated by APE for Last 10 Days in October 2013 with MLE Compared with LTS with Different α Levels
MethodMAPE (%)25% APE (%)50% APE (%)75% APE (%)
GEFcom2012 [55] 17.97 5.95 12.06 19.80
QRF/MQR [12] 10.15 2.36 5.25 10.70
MLE 13.94 4.55 8.48 13.66
LTS α=0.05 11.08 2.78 5.45 9.37
LTS α=0.1 10.75 2.46 4.95 8.77
LTS α=0.15 10.88 2.55 5.10 9.01
Table XI Forecasting Results Evaluated by RMSE for Last 10 Days in October 2013 with MLE Compared with LTS with Different α Levels
MethodRMSE (kWh)25% AE (kWh)50% AE (kWh)75% AE (kWh)
GEFcom2012 [55] 1305.65 26.92 168.96 684.95
QRF/MQR [12] 742.73 10.01 67.77 298.06
MLE 672.88 5.78 42.19 164.08
LTS α=0.05 449.68 3.98 27.01 97.45
LTS α=0.1 414.42 3.65 24.73 86.33
LTS α=0.15 420.00 4.75 25.26 88.84
Table XII Breakpoint Estimation for Electric Power Demand Dataset via MLE and LTS with Different α Levels
Methodφ1 (hour)φ2 (hour)
MLE 2.64 20.47
LTS α=0.05 2.27 20.47
LTS α=0.1 2.27 20.77
LTS α=0.15 2.27 20.47

Table XIII displays the fixed effect and breakpoints slope estimates for LTS with α=0.1. The variance estimates of the random effects and the error term are 0.0015 and 0.0052, respectively.

Table XIII Parameter Estimates for Electric Power Demand Dataset Evaluated by LTS with α=0.1 and Significance Level of 0.05
ParameterEstimatep-value
Intercept 1.063×100 <0.0001
Hrt 2.100×10-2 0.0003
(Hrt-φ1)+ -4.713×10-2 <0.0001
(Hrt-φ2)+ 6.665×10-2 <0.0001
Tt -2.198×10-2 <0.0001
Ht 2.556×10-2 0.0010
At 8.838×10-1 <0.0001
Ltlag -2.223×100 <0.0001
HrtHt -2.493×10-5 <0.0001
(Hrt-φ2)+Ht 2.017×10-4 <0.0001
HrtTt -8.165×10-4 <0.0001
(Hrt-φ1)+Tt 1.018×10-3 <0.0001
(Hrt-φ2)+Tt -1.606×10-3 <0.0001
HrtAt 2.012×10-2 0.0002
TtHt -6.684×10-5 <0.0001
TtAt 2.763×10-2 <0.0001
HtAt 2.763×10-2 <0.0001

According to Table XIII, all the parameters are significant with α=0.05. When calculating the p-values, Satterthwaite method is used for approximating degrees of freedom of the t-distribution for the t-test statistics. The variable Hrt and its breakpoints have both positive and negative slopes, and the signs match the curves in Fig. 2. Also, there is a sensible positive relationship between At and electric load Ut.

V. Conclusion

In this paper, we propose a robust segmented mixed effect regression model to forecast the electric load baseline in Southern California. When estimating unknown parameters, we propose a backfitting algorithm by combining the ideas of the penalized least square method for random-effects regression model and the linearization technique [

28] for segmented regression. In addition, we extend the idea of LTS to MESR to provide a robust model estimate. Both simulation study and real data application demonstrate the effectiveness of the proposed new estimation procedures.

Since the model is built up with hourly data, we could also aggregate the data and construct a daily electric load model. In this paper, we assume that the number of breakpoints is known. If the number of breakpoints is unknown, one could apply the selection techniques proposed by [

56]-[59] to our model. In (1), all random effects are assumed to have a multivariate normal distribution. It will be interesting to extend the work of [60] to relax the normality assumption of the random effects in (1). In addition, for LTS, although an conservation α or serval α values can be used in practice, it requires some careful tuning of α so that LTS can have both robustness and high efficiency. Since the response variable by AC tonnage is normalized, it is expected that there is not too much heterogeneity for the effects of other variables after controlling the heterogeneity of different banks. But it is worthy of more research trying some more complicated models such as random slopes for all other variables and their interactions as well as nonparametric regression for hour, humidity, or temperature variables.

REFERENCES

1

R. Adib, H. Murdock, F. Appavou et al. (2016, Dec.). Renewables 2016 global status report. Global Status Report Renewable Energy Policy Network for the 21st Century (REN21). [Online]. Available: https://www.ren21.net/wp-content/uploads/2019/05/REN21_GSR2016_FullReport_en_11.pdf [Baidu Scholar

2

Q. Wang, C. Zhang, Y. Ding et al., “Review of real-time electricity markets for integrating distributed energy resources and demand response,” Applied Energy, vol. 138, pp. 695-706, Jan. 2015. [Baidu Scholar

3

S. Nolan and M. O’Malley, “Challenges and barriers to demand response deployment and evaluation,” Applied Energy, vol. 152, pp. 1-10, Aug. 2015. [Baidu Scholar

4

T. Wei, Q. Zhu, and N. Yu, “Proactive demand participation of smart buildings in smart grid,” IEEE Transactions on Computers, vol. 65, no. 5, pp. 1392-1406, May 2016. [Baidu Scholar

5

N. Yu, T. Wei, and Q. Zhu, “From passive demand response to proactive demand participation,” in Proceedings of 2015 IEEE International Conference on Automation Science and Engineering (CASE), Gothenburg, Sweden, pp. 1300-1306, Aug. 2015. [Baidu Scholar

6

N. Charlton and C. Singleton, “A refined parametric model for short term load forecasting,” International Journal of Forecasting, vol. 30, no. 2, pp. 364-368, Apr. 2014. [Baidu Scholar

7

S. B. Taieb, J. W. Taylor, and R. J. Hyndman, “Hierarchical probabilistic forecasting of electricity demand with smart meter data,” Journal of the American Statistical Association, vol. 6, pp. 1-17, Mar. 2020. [Baidu Scholar

8

B. Goehry, Y. Goude, P. Massart et al., “Aggregation of multi-scale experts for bottom-up load forecasting,” IEEE Transactions on Smart Grid, vol. 11, no. 3, pp. 1895-1904, May 2020. [Baidu Scholar

9

Y. Chen, P. Xu, Y. Chu et al., “Short-term electrical load forecasting using the support vector regression (SVR) model to calculate the demand response baseline for office buildings,” Applied Energy, vol. 195, pp. 659-670, Jun. 2017. [Baidu Scholar

10

K. Chen, K. Chen, Q. Wang et al., “Short-term load forecasting with deep residual networks,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 3943-3952, Jul. 2019. [Baidu Scholar

11

L. Sehovac and K. Grolinger, “Deep learning for load forecasting: sequence to sequence recurrent neural networks with attention,” IEEE Access, vol. 8, no. 8, pp. 36411-36426, Feb. 2020. [Baidu Scholar

12

A. Bracale, P. Caramia, P. De Falco et al., “Multivariate quantile regression for short-term probabilistic load forecasting,” IEEE Transactions on Power Systems, vol. 35, no. 1, pp. 628-638, Jan. 2020. [Baidu Scholar

13

Z. Cao, C. Wan, Z. Zhang et al., “Hybrid ensemble deep learning for deterministic and probabilistic low-voltage load forecasting,” IEEE Transactions on Power Systems, vol. 35, no. 3, pp. 1881-1897, May 2020. [Baidu Scholar

14

P. I. Feder, “The log likelihood ratio in segmented regression,” The Annals of Statistics, vol. 3, no. 1, pp. 84-97, Jan. 1975. [Baidu Scholar

15

R. Beckman and R. Cook, “Testing for two-phase regressions,” Technometrics, vol. 21, no. 1, pp. 65-69, Feb. 1979. [Baidu Scholar

16

J. E. Ertel and E. B. Fowlkes, “Some algorithms for linear spline and piecewise multiple linear regression,” Journal of the American Statistical Association, vol. 71, no. 355, pp. 640-648, Apr. 1976. [Baidu Scholar

17

A. Tishler and I. Zang, “A maximum likelihood method for piecewise regression models with a continuous dependent variable,” Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 30, no. 2, pp. 116-124, Jun. 1981. [Baidu Scholar

18

A. K. Wagner, S. B. Soumerai, F. Zhang et al., “Segmented regression analysis of interrupted time series studies in medication use research,” Journal of Clinical Pharmacy and Therapeutics, vol. 27, no. 4, pp. 299-309, Aug. 2002. [Baidu Scholar

19

H.-J. Kim, M. P. Fay, E. J. Feuer et al., “Permutation tests for joinpoint regression with applications to cancer rates,” Statistics in Medicine, vol. 19, no. 3, pp. 335-351, Jan. 2000. [Baidu Scholar

20

J. D. Toms and M. L. Lesperance, “Piecewise regression: a tool for identifying ecological thresholds,” Ecology, vol. 84, no. 8, pp. 2034-2041, Aug. 2003. [Baidu Scholar

21

Q. Shao and N. Campbell, “Applications: modelling trends in groundwater levels by segmented regression with constraints,” Australian & New Zealand Journal of Statistics, vol. 44, no. 2, pp. 129-141, Dec. 2002. [Baidu Scholar

22

A. E. Kunst, C. W. Looman, and J. P. Mackenbach, “Outdoor air temperature and mortality in the netherlands: a time-series analysis,” American Journal of Epidemiology, vol. 137, no. 3, pp. 331-341, Feb. 1993. [Baidu Scholar

23

N. Molinari, J.-P. Daurès, and J.-F. Durand, “Regression splines for threshold selection in survival data analysis,” Statistics in Medicine, vol. 20, no. 2, pp. 237-247, Jan. 2001. [Baidu Scholar

24

R. Rigby and D. Stasinopoulos, “Detecting break points in the hazard function in survival analysis,” Statistical Modelling, vol. 1992, pp. 303-311, Dec. 1992. [Baidu Scholar

25

J. Shi, Y. Liu, and N. Yu, “Spatio-temporal modeling of electric loads,” in Proceedings of North American Power Symposium (NAPS), Morgantown, USA, Sept. 2017, pp. 1-6. [Baidu Scholar

26

N. M. Laird and J. H. Ware, “Random-effects models for longitudinal data,” Biometrics, vol. 38, no. 4, pp. 963-974, Dec. 1982. [Baidu Scholar

27

P. Diggle, P. J. Diggle, P. Heagerty et al., Analysis of Longitudinal Data. Oxford: Oxford University Press, 2002. [Baidu Scholar

28

V. M. Muggeo, “Estimating regression models with unknown break-points,” Statistics in Medicine, vol. 22, no. 19, pp. 3055-3071, Sept. 2003. [Baidu Scholar

29

D. Bates. (2020, Nov.). Computational methods for mixed models. [Online]. Available: https://cran.r-project.org/web/packages/lme4/vignettes/Theory.pdf [Baidu Scholar

30

P. J. Rousseeuw, “Least median of squares regression,” Journal of the American Statistical Association, vol. 79, no. 388, pp. 871-880, Jan. 1984. [Baidu Scholar

31

V. M. Muggeo, D. C. Atkins, R. J. Gallop et al., “Segmented mixed models with random changepoints: a maximum likelihood approach with application to treatment for depression study,” Statistical Modelling, vol. 14, no. 4, pp. 293-313, May 2014. [Baidu Scholar

32

M. Muggeo. (2016, Feb.). Segmented mixed models with random changepoints in R. [Online]. Available: https://www.researchgate.net/publication/292629179 [Baidu Scholar

33

A. Dominicus, S. Ripatti, N. L. Pedersen et al., “A random change point model for assessing variability in repeated measures of cognitive function,” Statistics in Medicine, vol. 27, no. 27, pp. 5786-5798, Nov. 2008. [Baidu Scholar

34

T. Hastie and R. Tibshirani, Generalized Additive Models. Virginia Beach: Chapman and Hall/CRC, 1990. [Baidu Scholar

35

C. Gössl and H. Küchenhoff, “Bayesian analysis of logistic regression with an unknown change point and covariate measurement error,” Statistics in Medicine, vol. 20, no. 20, pp. 3109-3121, Oct. 2001. [Baidu Scholar

36

J. Jiao, Z. Tang, P. Zhang et al., “Ensuring cyberattack-resilient load forecasting with a robust statistical method,” in Proceedings of 2019 IEEE PES General Meeting (PESGM), Atlanta, USA, pp. 1-5, Aug. 2019. [Baidu Scholar

37

J. Luo, T. Hong, and S. Fang, “Robust regression models for load forecasting,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 5397-5404, Sept. 2019. [Baidu Scholar

38

P. J. Huber, Robust Statistics. New York: John Wiley and Sons, 1981. [Baidu Scholar

39

L. A. Jaeckel, “Estimating regression coefficients by minimizing the dispersion of the residuals,” The Annals of Mathematical Statistics, vol. 43, no. 5, pp. 1449-1458, Oct. 1972. [Baidu Scholar

40

A. F. Siegel, “Robust regression using repeated medians,” Biometrika, vol. 69, no. 1, pp. 242-244, Apr. 1982. [Baidu Scholar

41

P. Rousseeuw and V. Yohai, “Robust regression by means of s-estimators,” in Robust and Nonlinear Time Series Analysis. Berlin: Springer, 1984, pp. 256-272. [Baidu Scholar

42

V. J. Yohai, “High breakdown-point and high efficiency robust estimates for regression,” The Annals of Statistics, vol. 15, no. 2, pp. 642-656, Jun. 1987. [Baidu Scholar

43

D. Gervini and V. J. Yohai, “A class of robust and fully efficient regression estimators,” The Annals of Statistics, vol. 30, no. 2, pp. 583-616, Apr. 2002. [Baidu Scholar

44

Y. She and A. B. Owen, “Outlier detection using nonconvex penalized regression,” Journal of the American Statistical Association, vol. 106, no. 494, pp. 626-639, Jun. 2011. [Baidu Scholar

45

Y. Lee, S. N. MacEachern, and Y. Jung, “Regularization of case-specific parameters for robustness and efficiency,” Statistical Science, vol. 27, no. 3, pp. 350-372, Aug. 2012. [Baidu Scholar

46

C. Yu, K. Chen, and W. Yao, “Outlier detection and robust mixture modeling using nonconvex penalized likelihood,” Journal of Statistical Planning and Inference, vol. 164, no. 1, pp. 27-38, Sept. 2015. [Baidu Scholar

47

C. Yu and W. Yao, “Robust linear regression: a review and comparison,” Communications in Statistics-Simulation and Computation, vol. 46, no. 8, pp. 6261-6282, Mar. 2017. [Baidu Scholar

48

D. L. Donoho and P. J. Huber, “The notion of breakdown point,” in A Festschrift for Erich L. Lehmann. Belmon: Wadsworth International Group, 1983. [Baidu Scholar

49

N. Neykov, P. Filzmoser, R. Dimova et al., “Robust fitting of mixtures using the trimmed likelihood estimator,” Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 299-308, Sept. 2007. [Baidu Scholar

50

M. Li, S. Xiang, and W. Yao, “Robust estimation of the number of components for mixtures of linear regression models,” Computational Statistics, vol. 31, no. 4, pp. 1539-1555, Aug. 2016. [Baidu Scholar

51

L. Yang, S. Xiang, and W. Yao, “Robust fitting of mixtures of factor analyzers using the trimmed likelihood estimator,” Communications in Statistics–Simulation and Computation, vol. 46, no. 2, pp. 1280-1291, Feb. 2017. [Baidu Scholar

52

V. M. Muggeo, “Segmented: an R package to fit regression models with broken-line relationships,” R News, vol. 8, no. 1, pp. 20-25, Jan. 2008. [Baidu Scholar

53

D. Bates, M. Mächler, B. Bolker et al., “Fitting linear mixed-effects models using lme4,” Journal of Statistical Software, vol. 67, no. 1, pp. 1-48, Oct. 2015. [Baidu Scholar

54

R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 58, no. 1, pp. 267-288, Jan. 1996. [Baidu Scholar

55

T. Hong, P. Pinson, and S. Fan, “Global energy forecasting competition 2012,” International Journal of Forecasting, vol. 30, no. 2, pp. 357-363, Apr. 2014. [Baidu Scholar

56

J. Liu, S. Wu, and J. V. Zidek, “On segmented multivariate regression,” Statistica Sinica, vol. 7, no. 2, pp. 497-525, Apr. 1997. [Baidu Scholar

57

M. S. Ben Aïssa, M. Boutahar, and J. Jouini, “Bai and Perron’s and spectral density methods for structural change detection in the US inflation process,” Applied Economics Letters, vol. 11, no. 2, pp. 109-115, Feb. 2004. [Baidu Scholar

58

B. Strikholm and T. Teräsvirta, “Determing the number of regimes in a threshold autoregressive model using smooth transition autoregressions,” Tech. Rep. SSE/EFI Working Paper Series in Economics and Finance, Jan. 2005. [Baidu Scholar

59

R. Prodan, “Potential pitfalls in determining multiple structural changes with an application to purchasing power parity,” Journal of Business & Economic Statistics, vol. 26, no. 1, pp. 50-65, Jan. 2008. [Baidu Scholar

60

K. J. Lee and S. G. Thompson, “Flexible parametric models for random-effects distributions,” Statistics in Medicine, vol. 27, no. 3, pp. 418-434, May 2008. [Baidu Scholar