Abstract
Renewable energy production has surged around the world in recent years. To mitigate the increasing uncertainty and intermittency of renewable generation, proactive demand response algorithms and programs have been proposed and developed to improve the utilization of load flexibility and increase the efficiency of power system operation. One of the biggest challenges to the efficient control and operation of demand response resources is accurately forecasting the baseline electricity consumption and estimating the load impact of those resources. In this paper, we propose a mixed effect segmented regression model and a new robust estimate for forecasting the baseline electricity consumption in Southern California, USA, by combining the ideas of the random effect regression model, the segmented regression model, and the least trimmed squares estimate. Since the log-likelihood of the considered model is not differentiable at the breakpoints, we propose a new backfitting algorithm to estimate the unknown parameters. The performance of the new estimation procedure is demonstrated with both simulation studies and a real data application to electric load baseline forecasting in Southern California.
The renewable energy sector has experienced exponential growth in the past five to ten years. The global annual growth rates of solar photovoltaic and wind energy were 42% and 17%, respectively, from 2010 through 2015 [
A sound baseline estimation methodology should represent an appropriate tradeoff between simplicity and accuracy. Existing baseline methodologies can be categorized into two types: Type-I and Type-II. In a Type-I methodology, the baseline is estimated using a similar-day-based algorithm, which depends on historical interval meter data and similarity metrics such as weather and calendar. Simplicity is the most significant advantage of the Type-I methodology [
There are several limitations with the existing approaches. First, some of the methods do not exploit the structure of the forecasting problem effectively. For example, the segmented nature of calendar variables on the load profile is not well addressed. Second, deep learning based forecasting algorithms are typically computationally expensive to train. In addition, they yield uninterpretable results and can be sensitive to the selection of hyperparameters. Third, hybrid methods are generally complicated to build, and thus can be error-prone to implement and benchmark. Lastly, most existing work builds and trains a separate model for each time series. This significantly limits the scalability of the model, especially for large service territories operated by electric utilities.
In this paper, we propose a mixed effect segmented regression (MESR) model, which is a Type-II methodology, to forecast the hourly electric load baseline in Southern California, USA, at the 220 kV transformer bank level. One commonly used method for electric power demand forecasting at each hour is multiple linear regression with hour as a categorical variable and weather data as continuous covariates. An alternative is to include hour as a linear predictor. However, the linear effect of hour on electric demand is not expected to hold over the whole range of time. To this end, we propose to model the hour effect by a segmented regression model [
Note that it is not trivial to compute the maximum likelihood estimate (MLE) for the MESR, since its log-likelihood is not differentiable at the breakpoints. Many standard computation algorithms, such as the Newton-Raphson algorithm, cannot be used directly. In this paper, we propose a backfitting algorithm to combine the segmented regression estimation method proposed in [
The rest of the paper is organized as follows. Section II introduces the MESR and describes the proposed robust estimation algorithms. Section III illustrates the finite sample performance of the proposed method using a simulation study. In Section IV, we apply the new estimation procedure to forecast the hourly electric power demand in Southern California, USA. Section V concludes the paper with a discussion.
Given a random sample , where is the number of subjects; is the number of observations collected for the
(1) |
where is the regression coefficient for the random effect covariates; and are the regression coefficients for the breakpoint variables; and quantities with a subscript “+” denote the positive part. For example, equals if and 0 otherwise; ; . In this paper, we assume that , where is the standard deviation of the error variable; and is the identity matrix. The MESR (1) consists of three parts: multiple linear regression , random effects , and segmented regression , which models the heterogeneous linear effect of on across different ranges of the breakpoint variable. measures the difference of slopes (linear effect of on ) before and after the breakpoint . We mainly focus on the situation where the segmented parts are fixed effects, but the proposed estimation procedure can be extended to the situation where the segmented parts also contain random effects [
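To make the positive-part notation concrete, the following minimal sketch evaluates a single segmented term with one breakpoint. The names `slope`, `slope_change`, and `breakpoint` are illustrative stand-ins for the corresponding coefficients and breakpoint in (1), not the paper's notation, and the numeric values are hypothetical.

```python
def positive_part(x):
    """Return x if x > 0 and 0 otherwise."""
    return x if x > 0 else 0.0


def segmented_term(t, slope, slope_change, breakpoint):
    """Piecewise-linear effect of t with a single breakpoint.

    Before the breakpoint the slope is `slope`; after it the slope is
    `slope + slope_change`. All argument names are hypothetical.
    """
    return slope * t + slope_change * positive_part(t - breakpoint)


# Hypothetical example: slope 0.5 up to t = 10, slope 2.5 afterwards.
values = [segmented_term(t, 0.5, 2.0, 10.0) for t in (8, 10, 12)]
```

Because the positive part is zero up to the breakpoint, the fitted curve stays continuous there while its slope changes by `slope_change`.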
Suppose that , and , where . Then, (1) can be rewritten in matrix format as:
(2) |
where . Based on (2), and , where and denote conditional expectation and conditional variance, respectively. Therefore, the random effects make the observations within each subject correlated. The log-likelihood function of is:
(3) |
where collects all the unknown parameters in model (1). Unlike in the traditional mixed effect model, maximizing (3) is not trivial since (3) is not differentiable at . We propose a backfitting algorithm to maximize (3) by alternately updating the segmented regression part and the linear mixed effect part. Next, we discuss in detail how to perform these two estimation procedures.
Given the estimate , (1) reduces to a segmented regression model. The breakpoints and slopes in segmented regression can be estimated in many ways, such as regression splines and Bayesian Markov chain Monte Carlo (MCMC) methods [
(4) |
where is the indicator function, which equals 1 if the condition inside the parentheses is true and 0 otherwise; is the first derivative of evaluated at .
Let , , and . Define , , and . Given the estimate , the log-likelihood (3) can be simplified as:
(5) |
where . Therefore, and in (5) can be easily found by weighted least squares. Note that . The iterative algorithm will terminate at . Given the estimate , the algorithm to estimate the breakpoints is summarized in
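The iterative breakpoint update can be sketched in a simplified setting. The code below is an illustration under stated assumptions, not the paper's algorithm: it handles a single breakpoint, has no random effects, uses ordinary least squares in place of the weighted least squares in (5), and runs on noise-free hypothetical data. Each pass regresses the response on the positive part at the current breakpoint guess and on minus the indicator (the derivative of the positive part with respect to the breakpoint), then moves the guess by the ratio of the two fitted coefficients. All function and variable names are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def least_squares(X, y):
    """Ordinary least squares via the normal equations X^T X b = X^T y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


def estimate_breakpoint(t, y, psi0, tol=1e-8, max_iter=50):
    """Repeat the linearized fit until the breakpoint stops moving."""
    psi = psi0
    for _ in range(max_iter):
        # Columns: intercept, t, positive part at the current guess,
        # and minus the indicator of t exceeding the current guess.
        X = [[1.0, ti, max(ti - psi, 0.0), -1.0 if ti > psi else 0.0]
             for ti in t]
        _, _, slope_change, gap = least_squares(X, y)
        psi_new = psi + gap / slope_change
        if abs(psi_new - psi) < tol:
            return psi_new
        psi = psi_new
    return psi


# Hypothetical noise-free hourly data with a true breakpoint at hour 10.
t = [float(h) for h in range(24)]
y = [1.0 + 0.5 * ti + 2.0 * max(ti - 10.0, 0.0) for ti in t]
psi_hat = estimate_breakpoint(t, y, psi0=8.0)
```

Starting from an initial guess of 8, the estimate moves to the true breakpoint within a few passes, at which point the coefficient on the indicator column is essentially zero and the update stops changing.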
In this subsection, we discuss how to maximize (3) given the estimate and , where . Let be the estimate of after replacing by . Plugging the estimate into model (1), we obtain:
(6) |
where . Therefore, model (6) is simply a traditional mixed effect regression model. We propose to employ the penalized weighted least squares (PWLS) method to estimate the unknown parameters in (6). More details on computing the linear mixed effect regression model are given in [
By combining the estimation procedures in Section II-A and II-B, we propose
It is well known that the MLE is sensitive to outliers and might give misleading results when there are outliers in the data, which is the case for our collected electric power demand data in Southern California. More details will be given in Section IV. The issue of outliers is well recognized in the field of load forecasting and is typically addressed using robust regression algorithms. For example, [
(7) |
where are the ordered squared residuals with . The robust MESR estimation based on LTS is described in
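As a small illustration of the trimmed criterion, the sketch below sums only the smallest squared residuals. The trimming rule `h = n - floor(trim_fraction * n)` is an assumption made for this example, since the exact definition did not survive in the text above, and the residual values are hypothetical.

```python
def lts_objective(residuals, trim_fraction):
    """Sum of the h smallest squared residuals.

    Assumes h = n - floor(trim_fraction * n); this rule is an
    illustrative choice, not necessarily the one used in (7).
    """
    n = len(residuals)
    h = n - int(trim_fraction * n)
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


# Hypothetical residuals with one gross outlier (10.0).
residuals = [0.1, -0.2, 0.3, 10.0, -0.1]
full = lts_objective(residuals, 0.0)      # ordinary sum of squares
trimmed = lts_objective(residuals, 0.2)   # drops the largest residual
```

With no trimming the single outlier dominates the criterion, whereas trimming one observation out of five removes its influence entirely.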
To increase the chance of finding the global minimum, one might run
In this section, we use a simulation study to illustrate the performance of the proposed estimation procedure for the MESR. All the computations are implemented in R. We use the R package segmented::segmented [
(8) |
where , the Poisson distribution with rate parameter 10; , the uniform distribution with lower and upper limits 5 and 10, respectively. The breakpoint variables are arithmetic sequences ranging in , , , with . The other parameters in (8) are set to be: ; ; ; ; ; ; and .
We consider the following four simulation scenarios:
1) and is randomly chosen in .
2) and is randomly chosen in .
3) and is randomly chosen in .
4) and is randomly chosen in .
First, we utilize (8) to simulate the dataset without outliers. The model is estimated using MLE. In Tables I-IV, we report the mean, median, and standard deviation (SD) for the estimates of fixed effect regression parameters, breakpoints, segmented regression parameters, and random effect covariance matrix, respectively, based on 500 replications.
From Tables I-IV, we can see that the proposed MLE algorithm performs well when the dataset does not contain any outliers. Also, when the sample size increases, the SD of each parameter estimate decreases.
Next, we simulate the dataset with outliers based on model (8). The model parameters are estimated by both
Tables V-VIII present the simulation results for the estimates of fixed effect regression parameters, breakpoints, segmented regression parameters, and random effect covariance matrix, respectively, based on 200 replications. The tables show that the standard MLE fails to provide reasonable estimates of the fixed effect regression parameters and the random effect covariance matrix when the data contain outliers, while LTS provides reasonable estimates for all parameters with both and .
In this section, we illustrate the application of the proposed estimation procedure of MESR to forecast the electric load in Southern California, USA.
The electric consumption data are aggregated to fifty-two 220 kV transformer banks from December 31, 2012 to November 1, 2013 in Southern California Edison’s service territory. The task is to build a forecasting model for the total electricity consumption of residential customers at each 220 kV transformer bank on weekdays.
The raw dataset is cleaned in two steps. First, we exclude daily observations for commercial customers and remove zero-usage records from the electric consumption data file. Second, we add daily temperature and humidity information for each bank according to its ZIP codes.
The response variable is the aggregated customers’ hourly electricity consumption recorded by the smart meters. We use the following transformation to make it comparable across the 52 subgroups:
(9) |
In (9), the transformed response variable is derived as follows. First, we divide the aggregated usage by the total air conditioning tonnage of a residential customer who is in the air conditioning cycling program. Second, we apply the log-transformation. The electricity consumption is divided by the total AC tonnage because the latter determines the numerical magnitude of the load measurements. Since the new response variable represents the electricity consumption level per unit of air conditioning tonnage, the effects of other explanatory variables are comparable among different transformer banks, which allows us to use common slopes to simplify the model.
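The two-step transformation described above can be sketched as follows. The function and argument names are hypothetical, the natural logarithm is assumed, and the numbers are made up purely to show that banks of very different size map to the same scale when their usage per unit of tonnage is equal.

```python
import math


def transform_response(aggregated_usage, total_ac_tonnage):
    """Log of usage per unit of AC tonnage (natural log assumed).

    Both inputs must be positive, which is consistent with removing
    zero-usage records before the transformation.
    """
    if aggregated_usage <= 0 or total_ac_tonnage <= 0:
        raise ValueError("usage and tonnage must be positive")
    return math.log(aggregated_usage / total_ac_tonnage)


# Two hypothetical banks of different size with equal per-ton usage.
small_bank = transform_response(500.0, 250.0)
large_bank = transform_response(8000.0, 4000.0)
```

Both calls return the same value, which is the sense in which the transformed response is comparable across transformer banks.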

Fig. 1 Data visualization of transformed response variable (shown in the figure as ) versus transformer bank indicator variable (shown in the figure as ABank) and time.
The explanatory variables are collected and listed in
Note: B is the random effect variable; and is the segmented variable.
In this paper, the training dataset consists of the samples from the first 205 observed weekdays for all transformer banks. The testing dataset consists of the samples from the 10 observed weekdays immediately following the training dataset. The total number of testing samples is 12480.
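The stated testing sample size can be checked with simple arithmetic: 52 banks, 10 weekdays, and 24 hourly records per day give 12480. The sketch below reproduces that count. The training count it computes assumes every hourly record is present, so it should be read as an upper bound rather than a figure reported in the paper.

```python
BANKS = 52            # 220 kV transformer banks
HOURS_PER_DAY = 24    # hourly observations
TRAIN_WEEKDAYS = 205
TEST_WEEKDAYS = 10


def sample_count(banks, days, hours_per_day=HOURS_PER_DAY):
    """Number of bank-hour samples if no hourly record is missing."""
    return banks * days * hours_per_day


test_samples = sample_count(BANKS, TEST_WEEKDAYS)
# Upper bound only: assumes no training record was removed in cleaning.
train_samples_upper_bound = sample_count(BANKS, TRAIN_WEEKDAYS)
```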
We apply the proposed estimation procedure of MESR to forecast the electricity consumption.

Fig. 2 Hourly trend between average hourly electric consumption (shown in the figure as ) with variable (shown in the figure as t) averaged over all transformer banks .
The curve corresponding to the actual consumption (after the log transformation) suggests three segments with two breakpoints. The first breakpoint lies between 02:00 a.m. and 03:00 a.m., and the second breakpoint lies between 06:00 p.m. and 08:00 p.m. We have also tried the model with three breakpoints (one more breakpoint in the middle segment), but the BIC for two breakpoints is smaller. The forecasting curve in
The observations collected over time within the same transformer bank are correlated. The auto-correlation function of the observation time series of each transformer bank is shown in

Fig. 3 Auto-correlation function of observation time series of each transformer bank.
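For readers who want to reproduce this kind of diagnostic, the sketch below computes a sample auto-correlation function for one series. The sine-wave series is a hypothetical stand-in for a bank's hourly load with a 24-hour cycle, not the paper's data, and the function name is illustrative.

```python
import math


def autocorrelation(series, max_lag):
    """Sample auto-correlation at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]


# Hypothetical series: ten days of a pure 24-hour cycle.
series = [math.sin(2 * math.pi * h / 24) for h in range(24 * 10)]
acf = autocorrelation(series, 24)
```

For such a series the auto-correlation is strongly negative at lag 12 and strongly positive at lag 24, the pattern one would expect from a daily load cycle.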
Ignoring this correlation by using a fixed effect model would result in inefficient estimates and reduced forecasting power. To incorporate this correlation, the transformer bank is treated as a random effect. Using a random effect model can also drastically reduce the number of unknown parameters in the model, and thus lead to more efficient parameter estimates.
Next, we describe the construction of the fixed effects. The first six explanatory variables described in
(10) |
where is the normal distribution with mean 0 and variance . We apply both MLE and LTS algorithms to estimate the model and compare their forecasting performances. Since the true proportion of outliers is unknown, three proportions are selected for LTS to fit the model (10). In addition, the proposed algorithms are compared with two benchmarks, i.e., the multiple linear regression model [
From Tables
According to
In this paper, we propose a robust segmented mixed effect regression model to forecast the electric load baseline in Southern California. To estimate the unknown parameters, we propose a backfitting algorithm that combines the ideas of the penalized least squares method for the random-effects regression model and the linearization technique [
Since the model is built up with hourly data, we could also aggregate the data and construct a daily electric load model. In this paper, we assume that the number of breakpoints is known. If the number of breakpoints is unknown, one could apply the selection techniques proposed by [
REFERENCES
R. Adib, H. Murdock, F. Appavou et al. (2016, Dec.). Renewables 2016 global status report. Global Status Report Renewable Energy Policy Network for the 21st Century (REN21). [Online]. Available: https://www.ren21.net/wp-content/uploads/2019/05/REN21_GSR2016_FullReport_en_11.pdf
Q. Wang, C. Zhang, Y. Ding et al., “Review of real-time electricity markets for integrating distributed energy resources and demand response,” Applied Energy, vol. 138, pp. 695-706, Jan. 2015.
S. Nolan and M. O’Malley, “Challenges and barriers to demand response deployment and evaluation,” Applied Energy, vol. 152, pp. 1-10, Aug. 2015.
T. Wei, Q. Zhu, and N. Yu, “Proactive demand participation of smart buildings in smart grid,” IEEE Transactions on Computers, vol. 65, no. 5, pp. 1392-1406, May 2016.
N. Yu, T. Wei, and Q. Zhu, “From passive demand response to proactive demand participation,” in Proceedings of 2015 IEEE International Conference on Automation Science and Engineering (CASE), Gothenburg, Sweden, pp. 1300-1306, Aug. 2015.
N. Charlton and C. Singleton, “A refined parametric model for short term load forecasting,” International Journal of Forecasting, vol. 30, no. 2, pp. 364-368, Apr. 2014.
S. B. Taieb, J. W. Taylor, and R. J. Hyndman, “Hierarchical probabilistic forecasting of electricity demand with smart meter data,” Journal of the American Statistical Association, vol. 6, pp. 1-17, Mar. 2020.
B. Goehry, Y. Goude, P. Massart et al., “Aggregation of multi-scale experts for bottom-up load forecasting,” IEEE Transactions on Smart Grid, vol. 11, no. 3, pp. 1895-1904, May 2020.
Y. Chen, P. Xu, Y. Chu et al., “Short-term electrical load forecasting using the support vector regression (SVR) model to calculate the demand response baseline for office buildings,” Applied Energy, vol. 195, pp. 659-670, Jun. 2017.
K. Chen, K. Chen, Q. Wang et al., “Short-term load forecasting with deep residual networks,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 3943-3952, Jul. 2019.
L. Sehovac and K. Grolinger, “Deep learning for load forecasting: sequence to sequence recurrent neural networks with attention,” IEEE Access, vol. 8, no. 8, pp. 36411-36426, Feb. 2020.
A. Bracale, P. Caramia, P. De Falco et al., “Multivariate quantile regression for short-term probabilistic load forecasting,” IEEE Transactions on Power Systems, vol. 35, no. 1, pp. 628-638, Jan. 2020.
Z. Cao, C. Wan, Z. Zhang et al., “Hybrid ensemble deep learning for deterministic and probabilistic low-voltage load forecasting,” IEEE Transactions on Power Systems, vol. 35, no. 3, pp. 1881-1897, May 2020.
P. I. Feder, “The log likelihood ratio in segmented regression,” The Annals of Statistics, vol. 3, no. 1, pp. 84-97, Jan. 1975.
R. Beckman and R. Cook, “Testing for two-phase regressions,” Technometrics, vol. 21, no. 1, pp. 65-69, Feb. 1979.
J. E. Ertel and E. B. Fowlkes, “Some algorithms for linear spline and piecewise multiple linear regression,” Journal of the American Statistical Association, vol. 71, no. 355, pp. 640-648, Apr. 1976.
A. Tishler and I. Zang, “A maximum likelihood method for piecewise regression models with a continuous dependent variable,” Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 30, no. 2, pp. 116-124, Jun. 1981.
A. K. Wagner, S. B. Soumerai, F. Zhang et al., “Segmented regression analysis of interrupted time series studies in medication use research,” Journal of Clinical Pharmacy and Therapeutics, vol. 27, no. 4, pp. 299-309, Aug. 2002.
H.-J. Kim, M. P. Fay, E. J. Feuer et al., “Permutation tests for joinpoint regression with applications to cancer rates,” Statistics in Medicine, vol. 19, no. 3, pp. 335-351, Jan. 2000.
J. D. Toms and M. L. Lesperance, “Piecewise regression: a tool for identifying ecological thresholds,” Ecology, vol. 84, no. 8, pp. 2034-2041, Aug. 2003.
Q. Shao and N. Campbell, “Applications: modelling trends in groundwater levels by segmented regression with constraints,” Australian & New Zealand Journal of Statistics, vol. 44, no. 2, pp. 129-141, Dec. 2002.
A. E. Kunst, C. W. Looman, and J. P. Mackenbach, “Outdoor air temperature and mortality in the Netherlands: a time-series analysis,” American Journal of Epidemiology, vol. 137, no. 3, pp. 331-341, Feb. 1993.
N. Molinari, J.-P. Daurès, and J.-F. Durand, “Regression splines for threshold selection in survival data analysis,” Statistics in Medicine, vol. 20, no. 2, pp. 237-247, Jan. 2001.
R. Rigby and D. Stasinopoulos, “Detecting break points in the hazard function in survival analysis,” Statistical Modelling, vol. 1992, pp. 303-311, Dec. 1992.
J. Shi, Y. Liu, and N. Yu, “Spatio-temporal modeling of electric loads,” in Proceedings of North American Power Symposium (NAPS), Morgantown, USA, Sept. 2017, pp. 1-6.
N. M. Laird and J. H. Ware, “Random-effects models for longitudinal data,” Biometrics, vol. 38, no. 4, pp. 963-974, Dec. 1982.
P. Diggle, P. J. Diggle, P. Heagerty et al., Analysis of Longitudinal Data. Oxford: Oxford University Press, 2002.
V. M. Muggeo, “Estimating regression models with unknown break-points,” Statistics in Medicine, vol. 22, no. 19, pp. 3055-3071, Sept. 2003.
D. Bates. (2020, Nov.). Computational methods for mixed models. [Online]. Available: https://cran.r-project.org/web/packages/lme4/vignettes/Theory.pdf
P. J. Rousseeuw, “Least median of squares regression,” Journal of the American Statistical Association, vol. 79, no. 388, pp. 871-880, Jan. 1984.
V. M. Muggeo, D. C. Atkins, R. J. Gallop et al., “Segmented mixed models with random changepoints: a maximum likelihood approach with application to treatment for depression study,” Statistical Modelling, vol. 14, no. 4, pp. 293-313, May 2014.
M. Muggeo. (2016, Feb.). Segmented mixed models with random changepoints in R. [Online]. Available: https://www.researchgate.net/publication/292629179
A. Dominicus, S. Ripatti, N. L. Pedersen et al., “A random change point model for assessing variability in repeated measures of cognitive function,” Statistics in Medicine, vol. 27, no. 27, pp. 5786-5798, Nov. 2008.
T. Hastie and R. Tibshirani, Generalized Additive Models. Virginia Beach: Chapman and Hall/CRC, 1990.
C. Gössl and H. Küchenhoff, “Bayesian analysis of logistic regression with an unknown change point and covariate measurement error,” Statistics in Medicine, vol. 20, no. 20, pp. 3109-3121, Oct. 2001.
J. Jiao, Z. Tang, P. Zhang et al., “Ensuring cyberattack-resilient load forecasting with a robust statistical method,” in Proceedings of 2019 IEEE PES General Meeting (PESGM), Atlanta, USA, pp. 1-5, Aug. 2019.
J. Luo, T. Hong, and S. Fang, “Robust regression models for load forecasting,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 5397-5404, Sept. 2019.
P. J. Huber, Robust Statistics. New York: John Wiley and Sons, 1981.
L. A. Jaeckel, “Estimating regression coefficients by minimizing the dispersion of the residuals,” The Annals of Mathematical Statistics, vol. 43, no. 5, pp. 1449-1458, Oct. 1972.
A. F. Siegel, “Robust regression using repeated medians,” Biometrika, vol. 69, no. 1, pp. 242-244, Apr. 1982.
P. Rousseeuw and V. Yohai, “Robust regression by means of s-estimators,” in Robust and Nonlinear Time Series Analysis. Berlin: Springer, 1984, pp. 256-272.
V. J. Yohai, “High breakdown-point and high efficiency robust estimates for regression,” The Annals of Statistics, vol. 15, no. 2, pp. 642-656, Jun. 1987.
D. Gervini and V. J. Yohai, “A class of robust and fully efficient regression estimators,” The Annals of Statistics, vol. 30, no. 2, pp. 583-616, Apr. 2002.
Y. She and A. B. Owen, “Outlier detection using nonconvex penalized regression,” Journal of the American Statistical Association, vol. 106, no. 494, pp. 626-639, Jun. 2011.
Y. Lee, S. N. MacEachern, and Y. Jung, “Regularization of case-specific parameters for robustness and efficiency,” Statistical Science, vol. 27, no. 3, pp. 350-372, Aug. 2012.
C. Yu, K. Chen, and W. Yao, “Outlier detection and robust mixture modeling using nonconvex penalized likelihood,” Journal of Statistical Planning and Inference, vol. 164, no. 1, pp. 27-38, Sept. 2015.
C. Yu and W. Yao, “Robust linear regression: a review and comparison,” Communications in Statistics-Simulation and Computation, vol. 46, no. 8, pp. 6261-6282, Mar. 2017.
D. L. Donoho and P. J. Huber, “The notion of breakdown point,” in A Festschrift for Erich L. Lehmann. Belmont: Wadsworth International Group, 1983.
N. Neykov, P. Filzmoser, R. Dimova et al., “Robust fitting of mixtures using the trimmed likelihood estimator,” Computational Statistics & Data Analysis, vol. 52, no. 1, pp. 299-308, Sept. 2007.
M. Li, S. Xiang, and W. Yao, “Robust estimation of the number of components for mixtures of linear regression models,” Computational Statistics, vol. 31, no. 4, pp. 1539-1555, Aug. 2016.
L. Yang, S. Xiang, and W. Yao, “Robust fitting of mixtures of factor analyzers using the trimmed likelihood estimator,” Communications in Statistics–Simulation and Computation, vol. 46, no. 2, pp. 1280-1291, Feb. 2017.
V. M. Muggeo, “Segmented: an R package to fit regression models with broken-line relationships,” R News, vol. 8, no. 1, pp. 20-25, Jan. 2008.
D. Bates, M. Mächler, B. Bolker et al., “Fitting linear mixed-effects models using lme4,” Journal of Statistical Software, vol. 67, no. 1, pp. 1-48, Oct. 2015.
R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 58, no. 1, pp. 267-288, Jan. 1996.
T. Hong, P. Pinson, and S. Fan, “Global energy forecasting competition 2012,” International Journal of Forecasting, vol. 30, no. 2, pp. 357-363, Apr. 2014.
J. Liu, S. Wu, and J. V. Zidek, “On segmented multivariate regression,” Statistica Sinica, vol. 7, no. 2, pp. 497-525, Apr. 1997.
M. S. Ben Aïssa, M. Boutahar, and J. Jouini, “Bai and Perron’s and spectral density methods for structural change detection in the US inflation process,” Applied Economics Letters, vol. 11, no. 2, pp. 109-115, Feb. 2004.
B. Strikholm and T. Teräsvirta, “Determining the number of regimes in a threshold autoregressive model using smooth transition autoregressions,” Tech. Rep. SSE/EFI Working Paper Series in Economics and Finance, Jan. 2005.
R. Prodan, “Potential pitfalls in determining multiple structural changes with an application to purchasing power parity,” Journal of Business & Economic Statistics, vol. 26, no. 1, pp. 50-65, Jan. 2008.
K. J. Lee and S. G. Thompson, “Flexible parametric models for random-effects distributions,” Statistics in Medicine, vol. 27, no. 3, pp. 418-434, May 2008.