Abstract
To reduce environmental pollution and improve the efficiency of cascaded energy utilization, regional integrated energy system (RIES) has received extensive attention. An accurate multi-energy load prediction is significant for RIES as it enables stakeholders to make effective decisions for carbon peaking and carbon neutrality goals. To this end, this paper proposes a multivariate two-stage adaptive-stacking prediction (M2ASP) framework. First, a preprocessing module based on ensemble learning is proposed. The input data are preprocessed to provide a reliable database for M2ASP, and highly correlated input variables of multi-energy load prediction are determined. Then, the load prediction results of four predictors are adaptively combined in the first stage of M2ASP to enhance generalization ability. Predictor hyper-parameters and intermediate data sets of M2ASP are trained with a metaheuristic method named collaborative atomic chaotic search (CACS) to achieve the adaptive stacking of M2ASP. Finally, a prediction correction of the peak load consumption period is conducted in the second stage of M2ASP. The case studies indicate that the proposed framework has higher prediction accuracy, generalization ability, and stability than other benchmark prediction models.
THE accelerated construction of regional integrated energy systems (RIES) improves the comprehensive utilization efficiency of various energy forms and contributes to environmental protection. The RIES utilizes the advanced technologies of cyber-physical systems and novel management models to integrate heterogeneous energy sources such as electricity and heat within a certain temporal and spatial scope [
Many studies have been performed on short-term load prediction because of its indispensable role in dispatch. The existing single prediction models for short-term load prediction can be broadly divided into parametric and data-driven algorithms. However, uncertain meteorological and sociological events bring many uncertainties to the multi-energy loads in RIES, and parametric methods struggle to handle nonlinear relationships and complex influencing factors. Data-driven algorithms like random forest (RF) [
To this end, ensemble learning is adopted to predict RIES loads. Multiple data-driven models are integrated into an ensemble prediction framework to obtain better performance than any single model with a complementary mechanism [
It is worth noting that limited research has been published on improving the generalization ability and training efficiency of SEF. When the traditional SEF generates the intermediate data set that is used to integrate all base-predictor knowledge, the weighted average of each predictor's results can weaken the performance of strong predictors due to the interference of weak ones, and the correlation among predictor models is ignored. In addition, a massive increase in the number of predictors may in fact reduce the training speed of the entire SEF with limited accuracy improvement. The enormous computational cost makes it challenging to apply the SEF to large-scale RIES. Meanwhile, the prediction error of multi-energy loads mainly comes from the peak period, which most existing predictors cannot handle well due to meteorological variation, coupling relations, and other uncertainties. Therefore, it is worthwhile to dynamically change the SEF structure to accommodate the characteristics of the adopted predictor models. Prediction correction of the peak load based on a separate SEF with strong generalization ability can be an effective way to improve prediction accuracy.
It was found that the randomness of random search can match the stacking framework pattern very well [
In this paper, a multivariate two-stage adaptive-stacking prediction (M2ASP) framework is investigated, whose structure can be adaptively changed based on the performance of each predictor. An ensemble-learning-based module is employed to construct a data-driven short-term load prediction model as follows: ① a preprocessing module based on ensemble learning is first proposed. Triangular interpolation (TI) is adopted to fill in the null data. Isolation forest (IF) is applied to identify abnormal data. Adaptive variational mode decomposition (AVMD) is used to extract load feature components. Self-organizing maps (SOMs) are adopted to fuse approximate load components. The RF with recursive feature elimination (RF-RFE) is employed to determine the highly correlated input variables for M2ASP; ② the M2ASP is proposed to predict the multi-energy load and correct the prediction errors. A complete M2ASP contains two independent multivariate adaptive-stacking prediction frameworks (MASP); and ③ a collaborative atomic chaotic search (CACS) method is employed to optimize the M2ASP stacking structure and predictor hyperparameters. Although the optimization results of each run of the metaheuristic method are different, the difference in hyper-parameters of each independent predictor model can enhance the generalization ability of MASP. The major contributions of this paper can be highlighted as follows.
1) An M2ASP is proposed, alleviating the blind stacking of the predictor number in SEF. Multi-energy loads of RIES are initially predicted in the first stage, and peak load errors are reduced in the second stage.
2) A preprocessing module based on ensemble learning is first proposed, where the perceived multi-energy load data is restructured by TI, IF, AVMD, and SOM, and representative input variables of M2ASP are determined by RF-RFE.
3) For the correlated exogenous variables and uncertain electricity consumption behavior of users, the prediction errors of the peak load consumption period can be effectively reduced by M2ASP due to the strong generalization ability of MASP.
4) A population-based metaheuristic algorithm named CACS is adopted to optimize the hyper-parameters and intermediate data of M2ASP, which achieves the adaptive stacking of M2ASP predictors. The proposed CACS is suitable for the M2ASP structure.
5) The case studies demonstrate that the proposed framework achieves at least an 8.5% improvement in multi-energy load prediction accuracy, fitness, generalization ability, and stability compared with benchmark prediction models.
The remainder of this paper is organized as follows. Section II describes the framework of multi-energy load prediction in RIES. Section III proposes a preprocessing module based on ensemble learning. Section IV describes the working of CACS. Section V describes the principles of the M2ASP. Section VI provides the results of multi-energy load prediction in RIES using the proposed method and the comparison with benchmark prediction methods for validation. Section VII concludes this paper.
To handle the complex coupling relationships among different energy forms, this paper proposes a framework of multi-energy load prediction in RIES, as shown in Fig. 1.

Fig. 1 Framework of multi-energy load prediction in RIES.
As an effective measure to address the fossil energy crisis, climate change, and environmental pollution, and to advance the carbon peaking and carbon neutrality goals, the construction of RIES has received extensive attention. The first layer of
To detect essential features in RIES, a preprocessing module based on ensemble learning is employed to conduct the data reconstruction and variable dimension reduction. The essence of data reconstruction is to preprocess, decompose, and recombine the multi-energy loads, whose structure is shown in the second layer of Fig. 1.
Assume Le(w,dl,t), Lh(w,dl,t), and Lc(w,dl,t) are the electrical, heating, and cooling loads of a particular hour of the day (HD) in a week. 27 possible influencing factors that may affect the multi-energy loads are considered as follows: ① Le(w,dl,t−c), Lh(w,dl,t−c), and Lc(w,dl,t−c) are the c-hour-ahead electrical, heating, and cooling loads of the same day of the same week; ② Le(w,dl−1,t), Lh(w,dl−1,t), and Lc(w,dl−1,t) are the electrical, heating, and cooling loads of the same hour of the previous day of the same week; ③ Le(w−1,dl,t), Lh(w−1,dl,t), and Lc(w−1,dl,t) are the electrical, heating, and cooling loads of the same hour of the same day of the previous week; ④ humidity (H); ⑤ wind speed (WS); ⑥ solar radiation (SR); ⑦ temperature (T); ⑧ day type (off day or working day); ⑨ HD; ⑩ day of the week (DW); ⑪ month of the year (MY); and ⑫ week of the month (WM). By eliminating irrelevant interfering information, the preprocessing module based on ensemble learning selects input variables that are highly correlated with output variables. Multiple exogenous and meteorological variables are condensed into representative variables of M2ASP. The structure of this section is shown in the third layer of Fig. 1.
Based on the results of data reconstruction and variable dimension reduction, an independent MASP model is proposed to predict RIES multi-energy loads, and a CACS algorithm is proposed to achieve the adaptive stacking of MASP parameters. The structure of load prediction is shown in the fourth layer of Fig. 1.
In fact, load data loss is common due to unexpected failure or device maintenance, and it is necessary to preprocess the input data of M2ASP. TI is suitable for filling in null data in the training and test sets [
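As a minimal illustration of this step, the sketch below fills null samples by linear interpolation (a simple stand-in for the paper's triangular interpolation) and flags abnormal samples with scikit-learn's `IsolationForest`; the series values and contamination rate are illustrative assumptions, not data from the case study:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

def preprocess_load(series: pd.Series, contamination: float = 0.02) -> pd.Series:
    """Fill null samples by interpolation, then re-fill IF-flagged anomalies."""
    # Fill missing samples (linear stand-in for triangular interpolation).
    filled = series.interpolate(method="linear", limit_direction="both")
    # Isolation forest labels anomalous samples with -1.
    labels = IsolationForest(
        contamination=contamination, random_state=0
    ).fit_predict(filled.to_numpy().reshape(-1, 1))
    cleaned = filled.mask(labels == -1)  # set anomalies back to NaN
    return cleaned.interpolate(method="linear", limit_direction="both")

# Hourly load with one gap and one spike (synthetic illustration)
load = pd.Series([10.0, 11.0, np.nan, 12.0, 95.0, 12.5, 13.0, 12.0])
cleaned = preprocess_load(load, contamination=0.15)
print(cleaned)
```

In practice the contamination rate would be tuned to the expected anomaly share of the perceived load data.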
Separately predicting each component of load through data decomposition is an effective way for RIES load prediction [
$$\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K_v} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] \mathrm{e}^{-j\omega_k t} \right\|_2^2 \quad \text{s.t.} \quad \sum_{k=1}^{K_v} u_k(t) = f(t) \quad (1)$$

where Kv is the number of modes decomposed; {u_k} represents the set of all IMFs; {ω_k} represents the set of their center frequencies; δ(t) is the impulse function; * denotes convolution; f(t) is the original load signal; and ∂_t is the partial derivative with respect to t.
To render the problem unconstrained, the constraints are reconstructed by employing a quadratic penalty term and a Lagrangian multiplier as follows:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K_v} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] \mathrm{e}^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K_v} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K_v} u_k(t) \right\rangle \quad (2)$$

where α is the penalty coefficient; and λ(t) is the Lagrangian multiplier.
Based on the AVMD proposed in [
Although the separate prediction of each IMF may improve the accuracy, the increase in the number of predicted variables of the M2ASP model noticeably reduces the computation speed. Therefore, the SOM clustering algorithm [
As the RF method has in-built feature evaluation mechanisms that can measure the contribution of each variable to the prediction results, nonsignificant features can be eliminated. To decrease the redundant information between features (variables), a hybrid RF-RFE method is employed to reduce the input variable dimensions and select highly correlated input variables, which evaluates variable importance and calculates feature weights based on the in-built feature evaluation mechanisms of RF. Through iterative screening, the combination of input variables with the highest prediction accuracy is retained, as shown in Fig. 2.

Fig. 2 Implementation process of RF-RFE.
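The iterative screening of Fig. 2 can be approximated with scikit-learn's `RFE` wrapper around a random forest. The synthetic 27-variable data set below and the choice of 8 retained variables (borrowed from the worked example later in the paper) are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 27))  # 27 candidate influencing factors
# Only a few factors actually drive the (synthetic) load.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * X[:, 7] + 0.1 * rng.normal(size=500)

# Recursive feature elimination driven by RF's built-in importance scores,
# keeping the 8 most informative input variables.
selector = RFE(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    n_features_to_select=8,
    step=1,
).fit(X, y)

kept = np.flatnonzero(selector.support_)
print("retained variable indices:", kept)
```

The retained index set plays the role of the highly correlated input-variable combination fed to M2ASP.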
To optimize the predictor hyper-parameters and the intermediate dataset of M2ASP, this paper proposes a new CACS algorithm to enhance the atomic orbital search (AOS) algorithm. Based on collaborative behaviors, chaotic orbits, and a dynamic photon strategy, the proposed CACS can overcome the limitations of AOS. CACS is a general metaheuristic method that outperforms other benchmark optimizers in the adaptive stacking of M2ASP.
Concepts borrowed from the inspiring discipline are shown in Table SAI and Fig. SA1 of Supplementary Material A. The electron cloud around the nucleus is regarded as the search space. Each electron around the nucleus is considered as a solution candidate. The electron position can be defined by the decision variables. The energy state of each electron is regarded as the objective function value of each solution candidate. Besides, the electrons with lower energy levels represent the solution candidates with better objective function values. The schematic diagram of the quantum-based atomic model is shown in

Fig. 3 Schematic diagram of quantum-based atomic model.
Assume the solution candidates are as follows:

$$S = [X_1, X_2, \ldots, X_n]^{\mathrm{T}} \quad (3)$$

where S represents the electrons around the nucleus; X_i is the position of the i-th electron (solution candidate); and n is the number of electrons. The initial positions are generated as:

$$X_i^0 = X_{\min} + rand \cdot (X_{\max} - X_{\min}) \quad (4)$$

where X_i^0 is the initial position of the i-th solution candidate; X_min and X_max are the minimum and maximum bounds of the decision variables; and rand is a uniformly distributed random vector in [0,1]. The objective function values are stored as:

$$O = [O_1, O_2, \ldots, O_n]^{\mathrm{T}} \quad (5)$$

where O is a vector containing the objective function values of all electrons; and O_i is the energy level of the i-th electron.
The positions and energy levels of electrons in different imaginary layers can be expressed as:
$$X^k = [X_1^k, X_2^k, \ldots, X_{p_k}^k]^{\mathrm{T}} \quad (6)$$

$$E^k = [E_1^k, E_2^k, \ldots, E_{p_k}^k]^{\mathrm{T}} \quad (7)$$

where X^k is the vector of electrons in the k-th imaginary layer; E^k is the vector of their energy levels; E_i^k is the energy level of the i-th electron in the k-th layer; and p_k is the number of electrons in the k-th layer.
The binding state and binding energy in the k-th imaginary layer are calculated as the mean position and mean energy level of its electrons:

$$BS^k = \frac{1}{p_k} \sum_{i=1}^{p_k} X_i^k \quad (8)$$

$$BE^k = \frac{1}{p_k} \sum_{i=1}^{p_k} E_i^k \quad (9)$$

The binding state BS and binding energy BE of an atom are calculated as:

$$BS = \frac{1}{n} \sum_{i=1}^{n} X_i \quad (10)$$

$$BE = \frac{1}{n} \sum_{i=1}^{n} E_i \quad (11)$$
The energy level E_i^k of each electron in each imaginary layer is compared with the layer binding energy BE^k to decide whether the emission or absorption of a photon happens. If E_i^k ≥ BE^k, the electrons tend to emit a photon with a certain amount of energy to reach the energy LE and the state BS of the atom, which is expressed as:
(12)
where and are the and iteration positions for the
If E_i^k < BE^k, the electrons tend to absorb a photon with a certain amount of energy to reach the binding state BS^k and the electron position with the lowest energy level in the k-th layer, which is expressed as:
(13)
where is the solution candidate with the best objective function values in the
The probability of a photon acting on an electron depends on a randomly generated number and the photon rate PR. If the randomly generated number is not smaller than PR, the probability of a photon acting on the electron is 0. Therefore, the movement of electrons will mainly depend on magnetic fields, which is expressed as:
(14)
where is a randomly generated scalar in the range of [0,1].
Although AOS has high search efficiency, it remains challenged when dealing with complex optimization problems [
Aiming to maintain a reasonable balance of exploration and exploitation, a collaborative atom model is proposed to evolve the electron search-ability based on the state-of-the-art knowledge of multiple atoms. Assume an atom group contains a leader atom and multiple follower atoms. The leader atom is responsible for exploration, and the follower atoms are responsible for exploitation. In an atom group, the atom containing the electron with the lowest energy level in the collaborative atom model is regarded as the leader atom. Other atoms are regarded as follower atoms. The position updating equations of the electrons in the follower atoms remain consistent with the original AOS method. Each electron of the leader atom updates its position based on the knowledge of itself and the follower atoms, using the historical best position of itself and the follower electrons as follows.
Assume represents the energy level of the electron in the
(15)

(16)

(17)
where and are the and iteration positions of the
If , the electrons tend to absorb a photon based on multi-atomic cooperation. The solution candidates aim to reach the binding state of the layer and the state of the electron with the lowest energy level inside the considered leader layer. The collaborative position updating equations are as follows:
(18)

(19)

(20)
where and are the total electron numbers of the leader atom and follower atoms in the
The original AOS simulates the effects of magnetic fields on electrons by randomly generated vectors. However, purely random vectors can miss significant points in the search space, and the previously accumulated exploration experience is easily wasted. Due to its easy implementation and remarkable ability to escape local optima, chaos theory has been applied to optimization. The seemingly disordered motion of chaotic variables has inherent regularity; a chaotic search exploits the randomness, ergodicity, and regularity of chaotic variables to find the global optimum. Chaotic motion can traverse all states without repetition according to its own law within a certain range. Therefore, replacing the random vectors with a chaotic orbit can effectively enhance the search efficiency and help electrons escape local optima. To achieve efficient exploration, electrons perform a chaotic local search in chaotic orbits to explore an irregular space. The entry process of chaotic orbits can be represented as follows.
Step 1: chaotic space mapping. Assume x_i is the position vector of the i-th electron. It is mapped into the chaotic space as:

$$z_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \quad (21)$$

where z_i is the chaotic variable corresponding to x_i; and x_min and x_max are the lower and upper bounds of the search space.
Step 2: chaotic local search. An iterative logistic equation is adopted as:
$$z_i^{(m+1)} = \mu z_i^{(m)} \left( 1 - z_i^{(m)} \right) \quad (22)$$

where μ is the control coefficient. When μ is 4, the logistic map is fully chaotic. With enough iterations, the chaotic orbit of an electron can travel ergodically over the whole search space, exhibiting pseudo-randomness, ergodicity, and irregularity.
Step 3: solution space mapping. The chaotic variable can be mapped to the original solution space as:
$$x_i' = x_{\min} + z_i' \left( x_{\max} - x_{\min} \right) \quad (23)$$
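The three steps above can be sketched directly in NumPy; the bounds, starting position, and number of iterations below are illustrative assumptions:

```python
import numpy as np

def chaotic_local_search(x, x_min, x_max, mu=4.0, steps=20):
    """Map a position into chaotic space, iterate the logistic map,
    and map each chaotic iterate back to the solution space."""
    z = (x - x_min) / (x_max - x_min)  # Step 1: chaotic space mapping
    orbit = []
    for _ in range(steps):
        z = mu * z * (1.0 - z)         # Step 2: logistic iteration, mu = 4
        orbit.append(x_min + z * (x_max - x_min))  # Step 3: back-mapping
    return np.array(orbit)

orbit = chaotic_local_search(x=2.3, x_min=0.0, x_max=10.0)
print(orbit[:5])
```

Each orbit point is a candidate position; in CACS the electron keeps the best of these chaotic candidates.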
This paper presents a dynamic photon strategy based on the optimization demand of different energy levels. By dynamically adjusting PR, fine electrons tend to search locally, and weak electrons tend to perform large modifications such as chaotic orbits. Assume and are the objective function values of the
1) When , the studied electron is far from the electron position with the lowest energy level. The electron should perform large modifications to explore wider space for exploration. The photon rate PR becomes larger to lead the electron into the chaotic orbits, which is expressed as:
(24)
where PRmax and PRmin are the maximum and minimum photon rates, respectively; and rPR is a random coefficient in [0,1].
2) When , the studied electron surrounds the electron with the lowest energy level. The electron should gradually search locally as the iteration proceeds for exploitation. The nonlinear decrease of the photon rate PR can be expressed as:
(25)
where is the maximum number of iterations. The improvement of CACS for AOS is mainly due to the adoption of collaborative behaviors, chaotic orbits, and dynamic photon strategy.
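Since the exact expressions of (24) and (25) are not reproduced here, the sketch below only illustrates the described behavior: a weak electron receives a large randomized PR (driving it into the chaotic orbits for exploration), while a fine electron's PR decays nonlinearly with the iteration count (exploitation). The badness threshold of 0.5 and the quadratic decay are assumptions of this sketch, not the paper's exact formulas:

```python
PR_MAX, PR_MIN = 0.3, 0.0

def photon_rate(obj_i, obj_best, obj_worst, k, k_max, r_pr):
    """Illustrative dynamic photon rate for the electron being updated."""
    # Normalized quality of the studied electron (0 = best, 1 = worst).
    badness = (obj_i - obj_best) / max(obj_worst - obj_best, 1e-12)
    if badness > 0.5:
        # Far from the lowest-energy electron: large randomized PR (explore).
        return PR_MIN + r_pr * (PR_MAX - PR_MIN)
    # Near the best electron: nonlinear decrease over iterations (exploit).
    return PR_MAX - (PR_MAX - PR_MIN) * (k / k_max) ** 2

print(photon_rate(0.9, 0.1, 1.0, k=80, k_max=100, r_pr=0.7))
```

The [0, 0.3] range of PR matches the configuration discussed in the case studies.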
A complete SEF contains predictors for multi-energy load prediction and a stacking method that fuses predictors. The proposed M2ASP contains two independent MASP models. The load prediction results of three base-predictors and a meta-predictor are adaptively combined in the first stage of M2ASP, and a prediction correction of the peak load consumption period is adopted in the second stage of M2ASP. The final prediction result of M2ASP is the sum of the multi-energy load prediction result calculated by the first MASP model and the prediction error calculated by the second MASP model. The schematic diagram of the MASP is provided in

Fig. 4 Schematic diagram of MASP framework.
MASP is a key submodel of M2ASP. The essence of MASP is to obtain preliminary prediction information through the base-predictors, dynamically generate new datasets based on the performance of each base-predictor, and integrate appropriate prediction knowledge to obtain more accurate prediction results than a single predictor.
To avoid overfitting, the intermediate data sets of MASP are generated based on 5-fold cross-validation. First, the original training data is divided into five folds (training K), as shown in
$$P_j^{\mathrm{new}} = \left[ \hat{y}_{j,1}^{\mathrm{T}}, \hat{y}_{j,2}^{\mathrm{T}}, \ldots, \hat{y}_{j,5}^{\mathrm{T}} \right]^{\mathrm{T}} \quad (26)$$

where P_j^new is the matrix containing the new training set generated by the j-th base-predictor; and ŷ_{j,k} is the prediction of the k-th fold sub-model on its validation fold.
The new test set of a base-predictor is obtained by weighting the five groups of predicted test data, which is defined as:

$$T_j^{\mathrm{new}} = \sum_{k=1}^{5} \omega_{j,k} T_{j,k} \quad (27)$$

where T_j^new is the new test set generated by the j-th base-predictor; T_{j,k} is the test-set prediction of the sub-model trained on the k-th fold; and ω_{j,k} is the corresponding weight, with the five weights of each base-predictor summing to 1.
The meta-predictor prediction is to summarize the knowledge provided by the base-predictors of base-predictor training and obtain the preliminary prediction results through a meta-predictor. First, the three groups of new training datasets obtained in base-predictor training are concatenated into an intermediate training set. Next, the original training labels and coalescent training set are brought into the AdaBoost method for training. Finally, the AdaBoost method predicts future multi-energy loads using the new test set obtained in base-predictor training.
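The 5-fold generation of the intermediate sets and the AdaBoost meta-predictor can be sketched as follows. For brevity the sketch is simplified to two base-predictors and a scalar target, uses equal fold weights in place of the CACS-tuned weights, and works on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, AdaBoostRegressor
from sklearn.model_selection import KFold
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X_train, X_test = rng.normal(size=(500, 8)), rng.normal(size=(100, 8))
y_train = X_train[:, 0] + 0.3 * rng.normal(size=500)

base_models = [RandomForestRegressor(n_estimators=50, random_state=0), SVR()]
kf = KFold(n_splits=5, shuffle=True, random_state=0)

new_train, new_test = [], []
for model in base_models:
    oof = np.zeros(len(y_train))
    test_views = []
    for tr_idx, va_idx in kf.split(X_train):
        model.fit(X_train[tr_idx], y_train[tr_idx])
        oof[va_idx] = model.predict(X_train[va_idx])  # out-of-fold predictions
        test_views.append(model.predict(X_test))      # one test view per fold
    new_train.append(oof)
    # Equal fold weights here; MASP instead tunes these weights with CACS.
    new_test.append(np.average(test_views, axis=0))

# Meta-predictor (AdaBoost) learns from the concatenated base outputs.
meta = AdaBoostRegressor(random_state=0).fit(np.column_stack(new_train), y_train)
y_pred = meta.predict(np.column_stack(new_test))
print(y_pred.shape)
```

The out-of-fold construction is what prevents the meta-predictor from seeing base-predictor outputs on their own training samples.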
The hyper-parameter setting is more complicated when the MASP is integrated with more predictors. Besides, the determination of the weights remains a controversial issue. As the performance and correlation of each base-predictor model are ignored, stacking based solely on the base-predictor number may result in unacceptable calculation speed and limited accuracy improvement. To achieve adaptive stacking in different scenarios and improve the generalization ability, the weights should be associated with the performance of different base-predictor models in the MASP. Therefore, the hyperparameters of the four predictors and the 15 weights shown in

Fig. 5 Implementation process of M2ASP.

Fig. 6 Implementation of proposed framework.
In ensemble learning, it is necessary to select prediction methods with noticeable differences as the base-predictors. Based on case study results, three base-predictors and one meta-predictor with better stacking prediction performance are selected as follows: ① RF is an ensemble method consisting of decision trees, in which each decision tree depends on the values of a random vector sampled independently [
Affected by the irregular electricity consumption behaviors of users and the high penetration of distributed generators (DGs), current prediction methods struggle to eliminate the prediction error of peak multi-energy loads. Due to the reinforced generalization ability of MASP, peak value correction based on the MASP is an effective way to handle this issue. Therefore, a two-stage MASP model named M2ASP is presented for load prediction and peak load correction.
M2ASP contains two independent MASP models, one for load prediction and the other for error correction. After the process of the preprocessing module based on ensemble learning, future multi-energy loads in RIES can be preliminarily predicted with the first MASP model. Although the prediction accuracy of the MASP is improved compared with a single predictor, the existing prediction errors are mainly concentrated around the peak of the multi-energy loads. Therefore, the preliminary prediction results and training errors of the first MASP model are transferred to the second stage.
To improve the results of the first stage, the peak load errors are estimated in the second stage of M2ASP as follows.
Step 1: data preprocessing. Based on the preprocessing module in Section III, the training error obtained in the first stage and influencing factors are processed to meet the requirements of the following processes.
Step 2: variable dimension reduction. Based on the preprocessing module in Section III, the representative influencing factors on the prediction error of peak loads are selected using the RF-RFE method.
Step 3: error estimation. According to the load prediction (the first stage of M2ASP) results, the peak load range with a large prediction error can be identified. The data range of the second MASP model only includes peak load periods with large prediction errors. When training the second MASP model, the original training labels are replaced with preliminary training errors of the first MASP model. The prediction error of peak multi-energy loads on the test set can be estimated.
Step 4: error correction. The prediction error on the test set is fed back to the first MASP model to correct the prediction result. IMF sequences for each type of load are added linearly. The final prediction result of M2ASP is the sum of the multi-energy load prediction result calculated by the first MASP model and the prediction error calculated by the second MASP model.
Combining the proposed preprocessing module based on ensemble learning,
Assume the input data of the raw training set is a 27×500 matrix, where 27 represents the dimension of the input variables, and 500 means that the input data contains 500 hours of sampling points. RF-RFE selects a suitable combination of input variables to achieve the dimensionality reduction of input variables. Assume that only eight input variables are retained.
Therefore, the dimension of the input data of the training set is 8×500. Assume the time dimension of the test set is 100 hours. In the same way, the test set is processed into an 8×100 matrix of input data.
The raw training labels are 500 hours of heating load samples (1×500 matrix). Then, the training labels (predicted objective) are decomposed into 11 IMFs by AVMD. SOM reorganizes 11 IMFs into 4 IMF sequences. As a result, training labels become a 500×4 matrix.
Taking RF as an example, 5-fold cross-validation establishes five sub-models (models I-V), as shown in
The training and prediction of the meta-predictor are based on a new training set (500×12 matrix) and a new test set (100×12 matrix) generated by the base-predictor. Training labels remain a 500×4 matrix. The meta-predictor is trained based on the new training set (500×12 matrix) and training labels (500×4 matrix). Next, the meta-predictor will make predictions on the new test set (100×12 matrix) to obtain the predicted IMF sequence (100×4 matrix). Each column of the matrix is added linearly to get prediction results of the heating loads for 100 hours (100×1 matrix).
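The dimension bookkeeping of this worked example can be checked with a few lines of NumPy; the arrays below are placeholders that only carry the shapes described in the text:

```python
import numpy as np

n_train, n_test = 500, 100   # sampling hours of training and test sets
n_vars, n_imf = 8, 4         # retained input variables, fused IMF sequences
n_base = 3                   # base-predictors

X_train = np.zeros((n_train, n_vars))  # 500 x 8 after RF-RFE
labels = np.zeros((n_train, n_imf))    # 500 x 4 after AVMD + SOM fusion

# Each base-predictor contributes one 4-column block per sample,
# so the intermediate sets have 3 x 4 = 12 columns.
new_train = np.zeros((n_train, n_base * n_imf))
new_test = np.zeros((n_test, n_base * n_imf))

# The meta-predictor outputs one value per IMF sequence per test hour;
# summing the columns yields the final 100 x 1 heating-load prediction.
pred_imfs = np.zeros((n_test, n_imf))
final_pred = pred_imfs.sum(axis=1, keepdims=True)
print(new_train.shape, new_test.shape, final_pred.shape)
```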
The purpose of this step is to determine the best hyperparameters and intermediate weights (new test set weights) for the model, which is generally done before training. The hyperparameters of the four predictors and the 15 weights shown in
The climate data are obtained from the China National Weather Station database. The distribution diagram of multi-energy loads of distribution networks in Binhai District, Tianjin, China, is shown in Fig. SA2 of Supplementary Material A. To achieve the adaptive stacking of M2ASP, the validating data for the predictor hyper-parameter setting and the intermediate dataset weights are from February, May, August, and November 2017. The training data are from January, March, April, June, July, September, October, and December 2017 for predictor training and peak error correction. The testing data are the representative days of the four quarters in 2017 that are not part of the validating and training sets. The time resolution of the electrical, heating, and cooling loads is set to an hour, represented by time t; in other words, a day is divided into 24 intervals. The intermediate weights are limited to [0,1] and normalized to ensure that the sum of the weights of each base-predictor is 1. The peak load range is selected by the first stage of M2ASP. The objective of the proposed framework is to predict the multi-energy loads one hour ahead. In addition to historical multi-energy loads, exogenous and meteorological variables such as T, H, WS, SR, HD, DW, MY, WM, and day type are considered in predictions. One-hot encoding is used to quantify calendar information.
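The one-hot encoding of calendar information can be done, e.g., with pandas; the three sample rows and column names below are hypothetical:

```python
import pandas as pd

# Calendar features for three sample hours (hypothetical values).
calendar = pd.DataFrame({
    "day_type": ["working", "off", "working"],
    "DW": [1, 6, 3],  # day of the week
})
# Each categorical value becomes its own binary indicator column.
encoded = pd.get_dummies(calendar, columns=["day_type", "DW"])
print(list(encoded.columns))
```

Unlike integer codes, the resulting indicator columns carry no spurious ordering into the predictors.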
Each evaluation indicator has different emphases and limitations. Therefore, it is necessary to carry out comparative experiments on the proposed M2ASP using various evaluation indicators. Three evaluation indicators selected from the machine learning field that can overall reflect the performance of the prediction methods are as follows.
1) MAPE is adopted to compare the performance of the proposed framework and benchmark prediction models. MAPE normalizes the prediction error for each measurement and weakens the interference of individual outliers on the absolute error, which is expressed as:
$$MAPE = \frac{1}{M_l} \sum_{i=1}^{M_l} \left| \frac{L_{act,i} - L_{pre,i}}{L_{act,i}} \right| \times 100\% \quad (28)$$

where Ml is the number of observation points; and Lpre,i and Lact,i are the predicted load and actual load at the i-th observation point, respectively.
2) Root mean square error (RMSE) utilizes an average prediction error that is sensitive to abnormal values. The value of RMSE is more sensitive to outliers than MAPE, which can evaluate the overall stability of the prediction method. RMSE is expressed as:
$$RMSE = \sqrt{\frac{1}{M_l} \sum_{i=1}^{M_l} \left( L_{pre,i} - L_{act,i} \right)^2} \quad (29)$$
3) Although the values of RMSE and MAPE are sufficient to evaluate the prediction accuracy, they cannot evaluate how well the predicted values fit the actual values. R-square (R²) is therefore adopted to measure the goodness of fit, which is expressed as:

$$R^2 = 1 - \frac{\sum_{i=1}^{M_l} \left( L_{act,i} - L_{pre,i} \right)^2}{\sum_{i=1}^{M_l} \left( L_{act,i} - \bar{L}_{act} \right)^2} \quad (30)$$

where L̄_act is the mean of the actual loads.
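The three indicators can be written compactly in NumPy; the four-point example is synthetic:

```python
import numpy as np

def mape(actual, pred):
    """Mean absolute percentage error, eq. (28) without the 100% scaling."""
    return np.mean(np.abs(actual - pred) / np.abs(actual))

def rmse(actual, pred):
    """Root mean square error, eq. (29)."""
    return np.sqrt(np.mean((actual - pred) ** 2))

def r_square(actual, pred):
    """Coefficient of determination, eq. (30)."""
    ss_res = np.sum((actual - pred) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

actual = np.array([100.0, 120.0, 90.0, 110.0])
pred = np.array([102.0, 118.0, 93.0, 108.0])
print(mape(actual, pred), rmse(actual, pred), r_square(actual, pred))
```

Lower MAPE/RMSE and an R² closer to 1 indicate better predictions, which is how the comparison tables below should be read.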
In the preprocessing module based on ensemble learning, the original time series of multi-energy loads can be decomposed into multiple IMFs and a residual component by AVMD. The numbers of IMFs for electrical, heating, and cooling loads are 6, 13, and 6, respectively. A schematic diagram of the IMFs for multi-energy loads decomposed by AVMD is shown in Fig. 7.

Fig. 7 Schematic diagram of IMFs for multi-energy loads decomposed by AVMD. (a) IMFs of electrical load. (b) IMFs of heating load. (c) IMFs of cooling load.
Because multi-energy loads include electrical, heating, and cooling loads, three independent MASP models are constructed to predict the IMF fusions of electrical, heating, and cooling loads, respectively. To reduce peak load errors, three additional MASP models are constructed to estimate the prediction errors of IMF fusion. Therefore, a total of six MASP models are established, which form three M2ASP models to predict the multi-energy loads. To determine six groups of highly correlated input variables of the different MASP models, RF-RFE is used to analyze the importance of the above 27 influencing factors and select the optimal input variable combinations. Through the iteration of RF-RFE, the relationship between the number of retained input variables and the RMSE of the prediction results is shown in Fig. 8.

Fig. 8 RF-RFE results of the proposed prediction framework.
To further study the impact of input variable changes on multi-energy load prediction, a sensitivity analysis is conducted on the determined input variables. Since WS and SR are not included in the input variables, this paper only analyzes the environmental characteristics of H and T. The values of the two variables are varied by ±15%. When conducting the sensitivity analysis, only a single input variable is changed at a time, and the other input variables remain unchanged.
Thereby, the influence of environmental factors on RIES load prediction can be analyzed. The MAPE of the prediction results under the change of environmental variables is shown in

Fig. 9 Sensitivity analysis results.
Compared with H, the proposed framework is more sensitive to T. When the offset of T is +15%, the RMSEs of electrical, heating, and cooling loads increase by 8.41%, 12.48%, and 12.29%, respectively, compared with the initial prediction value. When the offset of T is −15%, the RMSEs of electrical, heating, and cooling loads increase by 6.46%, 14.48%, and 7.93%, respectively. Compared with the other two loads, environmental factors have less influence on the electrical load. Therefore, the precision of dry-bulb temperature measurements should be strictly controlled in subsequent studies; otherwise, excessive environmental data errors will lead to poor accuracy of RIES load prediction.
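The one-variable-at-a-time protocol can be sketched as follows; the data, the stand-in linear model, and the ±15% offsets are illustrative assumptions:

```python
import numpy as np

def perturb_one_variable(X, var_idx, offset):
    """Shift a single input variable by a relative offset, others unchanged."""
    X_shift = X.copy()
    X_shift[:, var_idx] *= (1.0 + offset)
    return X_shift

rng = np.random.default_rng(2)
X = rng.uniform(10, 30, size=(100, 8))  # e.g. column 0 plays the role of T
predict = lambda X: 2.0 * X[:, 0] + 0.2 * X[:, 1]  # stand-in trained model
baseline = predict(X)

for offset in (-0.15, 0.15):
    shifted = predict(perturb_one_variable(X, var_idx=0, offset=offset))
    delta = np.sqrt(np.mean((shifted - baseline) ** 2))
    print(f"T offset {offset:+.0%}: RMSE shift {delta:.2f}")
```

In the paper's setting, `predict` would be the trained M2ASP and the RMSE shift would be measured against the actual loads rather than the baseline prediction.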
The predictor hyper-parameters and intermediate dataset weights of M2ASP are optimized by CACS. In this subsection, the hyper-parameter configuration of CACS is discussed. The hyper-parameters of CACS include the maximum number of iterations, PR, qL, and qF. The range of the photon rate PR is set to [0, 0.3] to limit the photon dynamics. After more than 100 iterations, the iterative curve of CACS generally converges. To ensure that the optimizer converges reliably, the maximum number of iterations is set to 100. To determine the other hyper-parameters of CACS, the MAPE of M2ASP using different qL and qF is shown in Fig. 10.

Fig. 10 MAPE of different qL and qF for CACS.
To study the difference between AOS and CACS, the total number of electrons and iterations of AOS and CACS should be consistent. CACS in M2ASP is replaced by AOS to compare the performance of the two optimizers in adaptive stacking. The average MAPEs of the multi-energy load prediction in different months are shown in
| Month | MAPE (CACS) | MAPE (AOS) | Time (s) (CACS) | Time (s) (AOS) | Wilcoxon p-value | Winner |
|---|---|---|---|---|---|---|
| January | 0.0463 | 0.0451 | 436 | 423 | 1.08×1 | - |
| March | 0.0392 | 0.0417 | 451 | 444 | 2.67×1 | + |
| April | 0.0431 | 0.0437 | 421 | 432 | 7.83×1 | + |
| June | 0.0490 | 0.0513 | 461 | 419 | 3.23×1 | + |
| July | 0.0538 | 0.0559 | 451 | 425 | 4.39×1 | + |
| September | 0.0472 | 0.0482 | 422 | 441 | 3.31×1 | + |
| October | 0.0439 | 0.0425 | 411 | 401 | 2.66×1 | - |
| December | 0.0581 | 0.0459 | 419 | 398 | 5.93×1 | - |
To demonstrate the superiority of the proposed CACS method, TSA [
| Optimizer | MAPE | RMSE | R² | Time (s) | Average rank | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| CACS | 0.0471 | 0.1103 | 0.9732 | 437 | 2.00 | |
| IPSO-COA | 0.0486 | 0.1093 | 0.9692 | 592 | 3.00 | 5.13×1 |
| AOS | 0.0526 | 0.1123 | 0.9513 | 401 | 3.25 | 7.34×1 |
| HBA | 0.0584 | 0.1212 | 0.9501 | 467 | 5.00 | 1.99×1 |
| SDSA | 0.0763 | 0.1321 | 0.9402 | 398 | 5.00 | 1.21×1 |
| BCA | 0.0541 | 0.1201 | 0.9689 | 589 | 4.25 | 5.47×1 |
| TSA | 0.0804 | 0.1367 | 0.9308 | 359 | 5.50 | 9.53×1 |
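The "Average rank" column above is the standard nonparametric ranking: each optimizer is ranked within every independent run (1 = best MAPE) and the ranks are averaged. A minimal sketch with a toy MAPE matrix:

```python
# Average-rank computation across independent runs (toy MAPE values).
import numpy as np
from scipy.stats import rankdata

optimizers = ["CACS", "IPSO-COA", "AOS", "HBA"]
# rows: independent runs; columns: optimizers (hypothetical MAPE values)
mape = np.array([
    [0.046, 0.049, 0.052, 0.058],
    [0.048, 0.047, 0.053, 0.060],
    [0.047, 0.050, 0.051, 0.057],
    [0.045, 0.049, 0.054, 0.059],
])
ranks = np.apply_along_axis(rankdata, 1, mape)   # rank within each run
avg_rank = ranks.mean(axis=0)
for name, r in zip(optimizers, avg_rank):
    print(f"{name}: average rank {r:.2f}")
```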
As the heuristic optimization algorithm is only a tool for M2ASP to achieve adaptive stacking, this paper focuses mainly on whether CACS is suited to M2ASP; the performance of CACS in other optimization domains is not discussed. If CACS-based M2ASP achieves better accuracy than other benchmark predictors and CACS outperforms other benchmark optimizers within M2ASP, CACS is employed as the adaptive stacking method of M2ASP. The iterative curves of the different optimizers are shown in Fig. 11.

Fig. 11 Iterative curves for different optimizers.
To verify the performance of the proposed framework, eleven prediction models that perform well in the field of machine learning are selected as benchmarks: RF, SVR, LSTM, AdaBoost, LightGBM, GBR, XGBoost, GBRT, KNN, LSF, and MEFF.
Considering the strong seasonality and temperature sensitivity of multi-energy loads, the load characteristics differ across seasons. Representative daily load data are therefore extracted to verify the feasibility of the proposed model, and the prediction effects of the benchmark models are compared and analyzed. The comparison of prediction results in different seasons is shown in Fig. 12.

Fig. 12 Comparison chart of prediction results in different seasons. (a) Electrical load in January. (b) Electrical load in April. (c) Electrical load in July. (d) Electrical load in October. (e) Heating load in January. (f) Heating load in April. (g) Heating load in July. (h) Heating load in October. (i) Cooling load in January. (j) Cooling load in April. (k) Cooling load in July. (l) Cooling load in October.
To compare single predictors with ensemble frameworks, the average performance of the benchmark models over four quarters is compared in Table III. Among the single predictors, RF, SVR, LSTM, and AdaBoost achieve lower MAPEs than the others, which indicates that the base-predictors and meta-predictor of M2ASP are credible. RF attains the highest prediction accuracy: its MAPE and RMSE are 16.92% and 7.84% lower than those of KNN, respectively. By integrating base-predictors, the ensemble frameworks (M2ASP, LSF, and MEFF) outperform every single predictor. Since an ensemble framework requires cross-validation to prevent overfitting, its training time is longer than that of a single prediction model. Owing to its lightweight structure and fewer integrated predictors, the computation time of M2ASP is reduced by 10.25% and 27.36% compared with LSF and MEFF, respectively. In terms of MAPE, RMSE, and R², M2ASP achieves the best performance among all models in Table III.
| Model | MAPE | RMSE | R² | Time (s) | p-value |
| --- | --- | --- | --- | --- | --- |
| RF | 0.054 | 0.141 | 0.957 | 28 | 3.66×1 |
| SVR | 0.063 | 0.147 | 0.951 | 25 | 4.67×1 |
| LSTM | 0.058 | 0.146 | 0.954 | 26 | 5.31×1 |
| AdaBoost | 0.055 | 0.143 | 0.951 | 32 | 8.32×1 |
| LightGBM | 0.059 | 0.152 | 0.952 | 24 | 7.12×1 |
| GBR | 0.061 | 0.158 | 0.955 | 26 | 3.64×1 |
| XGBoost | 0.063 | 0.149 | 0.949 | 30 | 9.31×1 |
| GBRT | 0.061 | 0.148 | 0.953 | 26 | 4.12×1 |
| KNN | 0.065 | 0.153 | 0.947 | 23 | 1.35×1 |
| M2ASP | 0.043 | 0.121 | 0.977 | 429 | |
| M2ASP without peak error correction | 0.046 | 0.131 | 0.969 | 306 | |
| LSF | 0.049 | 0.137 | 0.961 | 478 | 6.41×1 |
| MEFF | 0.047 | 0.133 | 0.965 | 591 | 4.51×1 |
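The three accuracy metrics reported in Table III can be computed with standard scikit-learn calls (toy arrays below; note scikit-learn's MAPE is a fraction, matching the table's ~0.04-0.07 scale):

```python
# MAPE, RMSE, and R² on a toy prediction example.
import numpy as np
from sklearn.metrics import (mean_absolute_percentage_error,
                             mean_squared_error, r2_score)

y_true = np.array([10.0, 12.0, 15.0, 11.0, 9.0])
y_pred = np.array([10.5, 11.8, 14.2, 11.3, 9.4])

mape = mean_absolute_percentage_error(y_true, y_pred)
rmse = mean_squared_error(y_true, y_pred) ** 0.5
r2 = r2_score(y_true, y_pred)
print(f"MAPE={mape:.4f}, RMSE={rmse:.4f}, R2={r2:.4f}")
```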
To evaluate the contribution of the second stage, M2ASP without peak error correction is introduced into the comparison. As shown in Table III, the peak error correction helps M2ASP reduce MAPE and RMSE by 6.52% and 7.6%, respectively, and improves R² from 0.969 to 0.977.
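The ensemble-versus-single-predictor comparison above rests on stacked generalization. A minimal sketch using scikit-learn's `StackingRegressor` is given below; it reproduces only plain stacking with cross-validated base outputs, not the paper's CACS-trained adaptive weights or peak correction, and the data and hyper-parameters are toy assumptions:

```python
# Plain stacking ensemble: base-predictors feed a meta-predictor via
# out-of-fold predictions (cv=5) to limit overfitting.
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(300, 4))
y = X @ np.array([3.0, -2.0, 1.5, 0.5]) + rng.normal(0, 0.1, 300)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("svr", SVR(C=10.0)),
        ("ada", AdaBoostRegressor(random_state=0)),
    ],
    final_estimator=Ridge(),   # meta-predictor combining base outputs
    cv=5,
)
stack.fit(X[:250], y[:250])
r2 = stack.score(X[250:], y[250:])
print(f"hold-out R^2: {r2:.3f}")
```

The cross-validation inside `StackingRegressor` is what lengthens ensemble training relative to a single predictor, as noted above.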
Special days refer to days with low-frequency and irregular energy consumption. To test the generalization ability and application value of M2ASP, this paper predicts the multi-energy loads on Chinese New Year, and the results are shown in Fig. 13.

Fig. 13 MAPE of multi-energy load on Chinese New Year.
In this paper, an M2ASP is proposed to predict multi-energy loads in the RIES, which achieves an adaptive combination of four different predictors. Predictor hyper-parameters and intermediate data sets of M2ASP are trained with a new optimization method named CACS to improve prediction performance. Inputs are processed with a preprocessing module based on ensemble learning, using multiple preprocessing technologies. In the first stage of M2ASP, the outputs of base-predictors are adaptively combined to predict the multi-energy loads preliminarily. The second stage of M2ASP conducts a prediction correction in the peak load consumption period. The final prediction result is the sum of the IMF sequences of various loads. The case studies demonstrate that the proposed framework has better prediction accuracy, generalization ability, and stability than other benchmark prediction models.
References
M. Silberberg. (1969, Apr.). Principles of general chemistry. [Online]. Available: https://doi.org/10.1021/ed046p260.1
S. Wang, K. Wu, Q. Zhao et al., “Multienergy load forecasting for regional integrated energy systems considering multienergy coupling of variation characteristic curves,” Frontiers in Energy Research, vol. 9, pp. 1-14, Apr. 2021.
D. Liu, L. Wang, G. Qin et al., “Power load demand forecasting model and method based on multi-energy coupling,” Applied Sciences-Basel, vol. 10, no. 2, pp. 1-24, Jan. 2020.
L. Alfieri and P. De Falco, “Wavelet-based decompositions in probabilistic load forecasting,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1367-1376, Mar. 2020.
Y. Dai and P. Zhao, “A hybrid load forecasting model based on support vector machine with intelligent methods for feature selection and parameter optimization,” Applied Energy, vol. 279, p. 115332, Dec. 2020.
L. Ge, Y. Li, J. Yan et al., “Short-term load prediction of integrated energy system with wavelet neural network model based on improved particle swarm optimization and chaos optimization algorithm,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 6, pp. 1490-1499, Nov. 2021.
J. Sun, G. Liu, B. Sun et al., “Light-stacking strengthened fusion based building energy consumption prediction framework via variable weight feature selection,” Applied Energy, vol. 303, p. 117694, Dec. 2021.
M. Q. Raza, N. Mithulananthan, J. Li et al., “Multivariate ensemble forecast framework for demand prediction of anomalous days,” IEEE Transactions on Sustainable Energy, vol. 11, no. 1, pp. 27-36, Jan. 2020.
S. Wang, S. Wang, H. Chen et al., “Multi-energy load forecasting for regional integrated energy systems considering temporal dynamic and coupling characteristics,” Energy, vol. 195, p. 116964, Mar. 2020.
J. Lee, J. Kim, and W. Ko, “Day-ahead electric load forecasting for the residential building with a small-size dataset based on a self-organizing map and a stacking ensemble learning method,” Applied Sciences-Basel, vol. 9, no. 6, pp. 1-19, Mar. 2019.
J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” Journal of Machine Learning Research, vol. 13, pp. 281-305, Feb. 2012.
F. A. Hashim, E. H. Houssein, K. Hussain et al., “Honey badger algorithm: new metaheuristic algorithm for solving optimization problems,” Mathematics and Computers in Simulation, vol. 192, pp. 84-110, Feb. 2022.
M. Azizi, “Atomic orbital search: a novel metaheuristic algorithm,” Applied Mathematical Modelling, vol. 93, pp. 657-683, May 2021.
J. Selva, “Convolution-based trigonometric interpolation of band-limited signals,” IEEE Transactions on Signal Processing, vol. 56, no. 11, pp. 5465-5477, Nov. 2008.
X. Li, X. Gao, B. Yan et al., “An approach of data anomaly detection in power dispatching streaming data based on isolation forest algorithm,” Power System Technology, vol. 43, no. 4, pp. 1447-1456, Jun. 2019.
J. Zhu, H. Dong, S. Li et al., “Review of data-driven load forecasting for integrated energy system,” in Proceedings of the Chinese Society of Electrical Engineering, vol. 41, no. 23, pp. 7905-7923, May 2021.
S. Gul, M. F. Siddiqui, and N. u. Rehman, “FPGA-based design for online computation of multivariate empirical mode decomposition,” IEEE Transactions on Circuits and Systems, vol. 67, no. 12, pp. 5040-5050, Dec. 2020.
K. Dragomiretskiy and D. Zosso, “Variational mode decomposition,” IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 531-544, Feb. 2014.
X. Zhang, D. Li, J. Li et al., “Grey wolf optimization-based variational mode decomposition for magnetotelluric data combined with detrended fluctuation analysis,” Acta Geophysica, vol. 70, no. 1, pp. 111-120, Jan. 2022.
A. D. Ramos, E. Lopez-Rubio, and E. J. Palomo, “The forbidden region self-organizing map neural network,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 1, pp. 201-211, Jan. 2020.
F. Magoulès and H. Zhao, Data Mining and Machine Learning in Building Energy Analysis, New York: Wiley-IEEE Press, 2016.
Z. Qi, F. Meng, Y. Tian et al., “Adaboost-LLP: a boosting method for learning with label proportions,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 8, pp. 3548-3559, Aug. 2018.
J. Derrac, S. Garcia, D. Molina et al., “A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms,” Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3-18, Mar. 2011.
S. Kaur, L. K. Awasthi, A. L. Sangal et al., “Tunicate swarm algorithm: a new bio-inspired based metaheuristic paradigm for global optimization,” Engineering Applications of Artificial Intelligence, vol. 90, p. 103541, Apr. 2020.
D. Yadav, “Blood coagulation algorithm: a novel bio-inspired meta-heuristic algorithm for global optimization,” Mathematics, vol. 9, no. 23, pp. 1-40, Dec. 2021.
S. M. Zandavi, V. Y. Y. Chung, and A. Anaissi, “Stochastic dual simplex algorithm: a novel heuristic optimization algorithm,” IEEE Transactions on Cybernetics, vol. 51, no. 5, pp. 2725-2734, May 2021.
J. Hu, H. Chen, A. A. Heidari et al., “Orthogonal learning covariance matrix for defects of grey wolf optimizer: insights, balance, diversity, and feature selection,” Knowledge-based Systems, vol. 213, p. 106684, Feb. 2021.
J. Yan, Y. Xu, Q. Cheng et al., “LightGBM: accelerated genomically designed crop breeding through ensemble learning,” Genome Biology, vol. 22, no. 1, pp. 1-24, Sept. 2021.
T. D. Pham, N. N. Le, N. T. Ha et al., “Estimating mangrove above-ground biomass using extreme gradient boosting decision trees algorithm with fused Sentinel-2 and ALOS-2 PALSAR-2 data in can gio biosphere reserve, Vietnam,” Remote Sensing, vol. 12, no. 5, pp. 1-20, Mar. 2020.
Y. Qiu, J. Zhou, M. Khandelwal et al., “Performance evaluation of hybrid WOA-XGBoost, GWO-XGBoost and BO-XGBoost models to predict blast-induced ground vibration,” Engineering with Computers, vol. 38, no. 5, pp. 4145-4162, Apr. 2021.
Z. Liu, G. Gilbert, J. M. Cepeda et al., “Modelling of shallow landslides with machine learning algorithms,” Geoscience Frontiers, vol. 12, no. 1, pp. 385-393, Jan. 2021.
H. Shahabi, A. Shirzadi, K. Ghaderi et al., “Flood detection and susceptibility mapping using sentinel-1 remote sensing data and a machine learning approach: hybrid intelligence of bagging ensemble based on K-nearest neighbor classifier,” Remote Sensing, vol. 12, no. 2, pp. 1-30, Jan. 2020.
M. Mao, S. Zhang, L. Chang et al., “Schedulable capacity forecasting for electric vehicles based on big data analysis,” Journal of Modern Power Systems and Clean Energy, vol. 7, no. 6, pp. 1651-1662, Nov. 2019.
P. Razmi, M. O. Buygi, and M. Esmalifalak, “A machine learning approach for collusion detection in electricity markets based on nash equilibrium theory,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 1, pp. 170-180, Jan. 2021.
S. Li, W. Hu, D. Cao et al., “Electric vehicle charging management based on deep reinforcement learning,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 3, pp. 719-730, May 2022.