Abstract
Accurate regional wind power prediction plays an important role in the security and reliability of power systems. To improve the performance of very short-term prediction intervals (PIs), a novel probabilistic prediction method based on composite conditional nonlinear quantile regression (CCNQR) is proposed. First, a hierarchical clustering method based on weighted multivariate time series motifs (WMTSM) is studied to account for the static, dynamic, and meteorological differences of wind power time series. Then, the correlations are used as sample weights for the conditional linear programming (CLP) of CCNQR. To optimize the performance of PIs, a composite evaluation including the accuracy of PI coverage probability (PICP), the average width (AW), and the offsets of points outside PIs (OPOPI) is used to quantify the appropriate upper and lower bounds. Moreover, the adaptive boundary quantiles (ABQs) are quantified for the optimal performance of PIs. Finally, based on real wind farm data, the superiority of the proposed method is verified by comparisons with conventional methods.
With the increasing capacity of renewable energy, the randomness and dynamic fluctuations of electrical magnitudes set new requirements for power system security, efficiency, and flexibility [
Based on the result of point prediction, prediction intervals (PIs) can be utilized for quantification of uncertainties within prescribed confidence level [
With the clustering algorithm of numerical weather prediction (NWP), the similarity of samples was used to improve the accuracy of point prediction [
The above-mentioned methods were applied for power prediction of a single wind farm. For the dispatching function, however, the significance of regional power prediction is much higher than that of a single wind farm. In [
Hence, based on the aforementioned literature, a probabilistic prediction method of very short-term PIs for regional wind power based on composite conditional nonlinear quantile regression (CCNQR) is proposed in this paper, with the following main contributions:
1) A hierarchical clustering method based on weighted multivariate time series motifs (WMTSM) is used to analyze the static characteristic, dynamic characteristic and meteorological characteristic of regional wind power.
2) Based on the clustering analysis, the correlation coefficients are used as sample weights in the cost function of conditional linear programming (CLP), so that the samples are utilized more accurately. In addition, to further improve the performance of PIs, a composite evaluation considering reliability, sharpness, and OPOPI, combined with the adaptive boundary quantiles (ABQs), is studied.
The rest of the paper is organized as follows. In Section II, the proposed WMTSM and CLP are described. Besides, combined with the ABQs, composite optimization considering reliability, average width (AW), and OPOPI is presented. The model construction process is demonstrated in Section III. Case studies are presented in Section IV, which illustrate the effectiveness of the proposed method. Conclusions are drawn in Section V.
The flowchart of the proposed method is illustrated in Fig. 1.

Fig. 1 Flowchart of proposed method.
To improve the accuracy of hierarchical clustering by calculating the similarity based on Euclidean distance [
The process of WMTSM for sample distance is listed as follows.
1) According to the regional wind power time series, the matrix is defined as:
(1) |
2) Due to the complex correlation between the input variables and output variables, the Spearman correlation analysis can be utilized to quantify the correlation between and [
(2) |
3) For the th input vector , the difference between the adjacent explanatory variables represented by is utilized to quantify the fluctuation of adjacent variables. is defined as:
(3) |
4) Based on the above analysis, the distance of WMTSM between the th input vector and the th input vector is formulated as:
(4) |
(5) |
(6) |
(7) |
The hierarchical clustering method based on WMTSM aims at weighting ,, and to quantify the distances between samples. can be used as the weighting coefficient of wind speed for each wind farm in the regional wind farms [
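The weighting of the three distance components can be illustrated with a minimal Python sketch. The functional forms below are assumptions made for illustration only and are not taken from (1)-(7): the static distance is assumed to be a Euclidean distance weighted by the absolute Spearman coefficients, the dynamic distance is assumed to be the Euclidean distance between first differences of the inputs, and the meteorological distance is assumed to be the absolute difference of a capacity-weighted wind speed. The names `wmtsm_distance`, `w_static`, `w_dynamic`, and `w_meteo` are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based average ranks; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def spearman(a, b):
    """Spearman coefficient, computed as the Pearson correlation of ranks."""
    return pearson(ranks(a), ranks(b))


def wmtsm_distance(x_i, x_j, v_i, v_j, rho, w_static, w_dynamic, w_meteo):
    """Hypothetical composite distance between two samples.

    x_i, x_j: input vectors (e.g., lagged regional power values)
    v_i, v_j: capacity-weighted NWP wind speeds (assumed scalar per sample)
    rho:      absolute Spearman weight of each input variable (assumed use)
    """
    # Static difference: correlation-weighted Euclidean distance (assumed).
    d_static = math.sqrt(sum(r * (a - b) ** 2 for r, a, b in zip(rho, x_i, x_j)))
    # Dynamic difference: distance between adjacent-variable differences.
    dx_i = [b - a for a, b in zip(x_i, x_i[1:])]
    dx_j = [b - a for a, b in zip(x_j, x_j[1:])]
    d_dynamic = math.sqrt(sum((a - b) ** 2 for a, b in zip(dx_i, dx_j)))
    # Meteorological difference: gap between weighted wind speeds (assumed).
    d_meteo = abs(v_i - v_j)
    return w_static * d_static + w_dynamic * d_dynamic + w_meteo * d_meteo


# Toy usage with made-up normalized data (illustration only).
X = [[0.1, 0.2, 0.4], [0.5, 0.4, 0.3], [0.2, 0.3, 0.5], [0.6, 0.6, 0.2]]
y = [0.5, 0.2, 0.6, 0.1]
rho = [abs(spearman([row[k] for row in X], y)) for k in range(3)]
d_02 = wmtsm_distance(X[0], X[2], 0.4, 0.5, rho, 0.5, 0.3, 0.2)
```

The resulting pairwise distances could then be passed to any agglomerative clustering routine; in the proposed method the three weights are searched by PSO rather than fixed by hand.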
With WMTSM analysis, the correlation of samples is defined as:
(8) |
In NQR [
(9) |
s.t.
(10) |
(11) |
(12) |
(13) |
In CLP, different correlations are utilized for the samples from different clusters. Meanwhile, the samples with low correlations are also considered, instead of being simply removed. Comprehensive use of the samples with weighting coefficients is studied to improve the accuracy of samples’ utilization. The magnitude of influence is directly determined by the correlations which are quantified by calculating the distances between the cluster centers.
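The effect of the sample weights can be illustrated with the pinball (quantile) loss that underlies linear-programming-based quantile regression. The Python sketch below is a minimal illustration, assuming that each sample's loss term is simply multiplied by its correlation weight; the function name `weighted_pinball_loss` and this multiplicative form are assumptions, not a transcription of (9)-(13).

```python
def weighted_pinball_loss(y_true, q_pred, tau, weights):
    """Weighted pinball loss at quantile level tau.

    In LP form, each residual y - q is split into non-negative parts
    u_plus and u_minus with y - q = u_plus - u_minus, and the objective is
    sum(w * (tau * u_plus + (1 - tau) * u_minus)).
    """
    total = 0.0
    for y, q, w in zip(y_true, q_pred, weights):
        u_plus = max(y - q, 0.0)    # observation above the quantile estimate
        u_minus = max(q - y, 0.0)   # observation below the quantile estimate
        total += w * (tau * u_plus + (1.0 - tau) * u_minus)
    return total


# Toy usage: the second sample comes from a weakly correlated cluster,
# so its residual contributes less to the cost (made-up numbers).
loss = weighted_pinball_loss(
    y_true=[0.50, 0.80], q_pred=[0.60, 0.60], tau=0.95, weights=[1.0, 0.3]
)
```

With this form, samples with low correlation still influence the fitted quantile, only with a smaller contribution, which matches the idea of weighting samples instead of removing them.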
The performance of PIs is evaluated considering both reliability and overall performance based on the deviation between the PI coverage probability (PICP) and PI nominal confidence (PINC), and the interval score, respectively [
(14) |
(15) |
The sharpness is defined as:
(16) |
(17) |
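The reliability and sharpness criteria can be computed directly from the PI bounds and the observations. The following Python sketch uses the usual definitions (empirical coverage, its absolute deviation from the PINC, and the mean interval width); the function names are illustrative, and the data are assumed to be normalized.

```python
def picp(y, lower, upper):
    """PI coverage probability: share of observations inside their PI."""
    covered = sum(1 for yt, lo, up in zip(y, lower, upper) if lo <= yt <= up)
    return covered / len(y)


def coverage_deviation(y, lower, upper, pinc):
    """Absolute deviation between empirical coverage and nominal confidence."""
    return abs(picp(y, lower, upper) - pinc)


def average_width(lower, upper):
    """Average width (AW) of the PIs, used as the sharpness measure."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


# Toy usage with made-up values: 2 of the 4 observations are covered,
# so PICP = 0.5 here.
y = [0.2, 0.5, 0.9, 0.4]
lower = [0.1, 0.4, 0.5, 0.5]
upper = [0.3, 0.6, 0.8, 0.7]
coverage = picp(y, lower, upper)
deviation = coverage_deviation(y, lower, upper, pinc=0.9)
aw = average_width(lower, upper)
```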
To comprehensively evaluate the performance of PIs including the sharpness, the interval score [
(18) |
(19) |
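A minimal Python sketch of the interval score is given below. It follows the negatively oriented form commonly used for wind power PIs, in which the width is penalized by 2(1 - PINC) and the distance of an uncovered observation from the nearest bound is penalized by a factor of 4; this form is assumed to correspond to (18) and (19), and the function names are illustrative.

```python
def point_score(y, lower, upper, pinc):
    """Score of one PI; higher (closer to zero) is better."""
    alpha = 1.0 - pinc
    width = upper - lower
    if y < lower:
        # Observation below the PI: width penalty plus offset penalty.
        return -2.0 * alpha * width - 4.0 * (lower - y)
    if y > upper:
        # Observation above the PI: width penalty plus offset penalty.
        return -2.0 * alpha * width - 4.0 * (y - upper)
    # Observation covered: only the width is penalized.
    return -2.0 * alpha * width


def interval_score(y, lower, upper, pinc):
    """Average of the point scores over all test samples."""
    scores = [point_score(yt, lo, up, pinc) for yt, lo, up in zip(y, lower, upper)]
    return sum(scores) / len(scores)


# Toy usage with made-up values: the second observation lies above its PI
# and therefore receives a much lower score than the first.
s_in = point_score(0.5, 0.4, 0.6, pinc=0.9)
s_out = point_score(0.9, 0.4, 0.6, pinc=0.9)
s_avg = interval_score([0.5, 0.9], [0.4, 0.4], [0.6, 0.6], pinc=0.9)
```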
As analyzed above, the interval score is a significant criterion for the overall performance of PIs. In (19), the interval score is quantified by the average score of all PIs. As shown in (18), when the actual points are outside the PIs, the larger the actual point deviating from PIs, the lower the score. Thus, to evaluate the overall performance, not only the reliability and sharpness, but also the OPOPI should be considered in the cost function of model training. The cost function of CCNQR based on CLP and composite optimization is described as (20), whose constraints contain (10)-(13) and (21)-(28).
(20) |
s.t.
(21) |
(22) |
(23) |
(24) |
(25) |
(26) |
(27) |
(28) |
When the actual value lies in the PI, denotes the width of the PI. Otherwise, can be quantified based on distances between the bounds and , reflecting both the sharpness and OPOPI. By considering , the overall performance of PIs can be directly optimized based on the efficient LP. is defined as:
(29) |
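One LP-friendly quantity with the behavior described above is the sum of the distances from the actual value to the two bounds. The Python sketch below illustrates this idea; it is an assumed form chosen for illustration and is not claimed to be identical to (29). The name `width_with_offset` is hypothetical.

```python
def width_with_offset(y, lower, upper):
    """Sum of distances from the observation to both PI bounds.

    If lower <= y <= upper, this equals the PI width (upper - lower).
    If y lies outside the PI, it equals the width plus twice the offset
    of y from the nearest bound, so sharpness and OPOPI are both reflected.
    """
    return abs(upper - y) + abs(y - lower)


# Toy usage with made-up values.
inside = width_with_offset(0.5, 0.4, 0.6)    # equals the width
outside = width_with_offset(0.9, 0.4, 0.6)   # width + 2 * offset
```

Since absolute values can be expressed through auxiliary non-negative variables and linear constraints, a term of this kind can be embedded in an LP cost function.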
The conventional PIs quantify the nominal proportions of boundary quantiles based on (30) and (31).
(30) |
(31) |
To further improve the accuracy of boundary quantiles, ABQs are studied to optimize the bounds of PIs. The nominal proportion of upper quantile can be optimized by meta-heuristic algorithm adaptively, and the nominal proportion of lower quantile is quantified by:
(32) |
s.t.
(33) |
(34) |
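The difference between conventional and adaptive boundary quantiles can be sketched in Python as follows. The conventional case uses the central PI with equal tail probabilities. For the adaptive case, it is assumed here that the lower nominal proportion equals the upper one minus the PINC, with the upper proportion restricted to the range from PINC to 1; this is an illustrative reading of (32)-(34), and the function names are hypothetical.

```python
def symmetric_proportions(pinc):
    """Conventional central PI: equal probability mass in both tails."""
    lower = (1.0 - pinc) / 2.0
    upper = (1.0 + pinc) / 2.0
    return lower, upper


def adaptive_proportions(pinc, upper):
    """Adaptive boundary quantiles for a given upper nominal proportion.

    Assumption: the two proportions always differ by exactly PINC, so the
    nominal coverage is kept while the PI can shift up or down.
    """
    if not (pinc <= upper <= 1.0):
        raise ValueError("upper proportion must lie in [PINC, 1]")
    return upper - pinc, upper


# Toy usage: for PINC = 90%, the conventional choice is about (0.05, 0.95);
# an optimizer could instead propose, e.g., an upper proportion of 0.97.
conventional = symmetric_proportions(0.9)
shifted = adaptive_proportions(0.9, 0.97)
```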
In the proposed method, the training samples are clustered based on WMTSM. Then, with the clustering coefficients of training samples for CLP and composite optimization, CCNQR is performed. The training samples from the same cluster have the same in (20). Particle swarm optimization (PSO) [
Step 1: initialize the coefficients of NQR and PSO, and set the confidence of PIs. Import and normalize the dataset for training and testing samples.
Step 2: quantify the correlations between outputs and variables of input vectors.
Step 3: for each iteration in the search space of PSO for optimal , , and , based on WMTSM, the prediction error is utilized as the objective of the cost function.
Step 4: obtain the optimal coefficients and the clustering labels of training samples, and quantify the correlation coefficients of clusters as the weights of in CLP.
Step 5: for each resolution in the search space of PSO to obtain the optimal and nominal proportions of boundary quantiles for each cluster, based on the optimization function of CCNQR given in (10)-(13), (20)-(28), and (32)-(34), the quantification considering interval score and reliability is performed.
Step 6: based on the application result of CCNQR, the output weights of upper and lower quantiles in each cluster in the training process are calculated.
Step 7: by comparing the weighted distances between the inputs of testing samples and each cluster center, the labels of testing samples are obtained. Then, with the result of the training process, PIs can be quantified.
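The search performed in Steps 3 and 5 can be illustrated with a minimal PSO routine in Python. This is a generic textbook-style sketch, not the authors' implementation: the inertia and acceleration coefficients, the swarm size, and the toy objective are assumptions, and in the proposed method the objective would be the prediction error (Step 3) or the composite PI criterion (Step 5).

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) within box bounds by basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search space.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(p):
    """Made-up stand-in for the PI criterion; minimum at (0.95, 0.3)."""
    return (p[0] - 0.95) ** 2 + (p[1] - 0.3) ** 2


# Search an upper nominal proportion in [0.9, 1] and a weighting
# coefficient in [0, 1] (both ranges chosen for illustration).
best, best_val = pso_minimize(toy_objective, [(0.9, 1.0), (0.0, 1.0)])
```

In practice, each objective evaluation in Step 5 would involve solving the LP of CCNQR for the candidate coefficients, so the swarm size and iteration count trade accuracy against training time.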
Remark 1: different from the high-fluctuating generation of single wind farm [
Remark 2: different from LP [
Remark 3: based on the criterion of interval score and considering in a reasonable range, the coefficient in (20) is utilized to regulate different characteristics for the optimization of PIs. The sharpness and OPOPI are both considered in the composite optimization of CCNQR. This helps to fine-tune the PIs for better performance. Different from the conventional QR cost function that only considers the coverage accuracy of PIs [
To fully verify the effectiveness of the proposed methodologies, two datasets are considered, which are given as follows.
1) Dataset 1: the wind power data of 20 wind farms located in the northeast of China with 15-min resolution covering the first half of 2019 and the corresponding wind speed data at 100 meters are studied. The data of the last 4 days in each month are used for testing and the data of the latest 11 days are used for training.
2) Dataset 2: the wind power data of 7 wind farms in Global Energy Forecasting Competition 2012 (GEFCom2012) with hourly resolution covering the second half of 2010 and the corresponding wind speed data at 10 meters are studied [
The wind speed at the time of wind power generation outage is set to be 0 in order to improve the accuracy and synchronization between the NWP data and wind farm outputs according to the outage plans. Historical power time series with fewer zero outputs are selected to study the performance of PIs in each month. The results of different probabilistic prediction methods are then compared, and the datasets are used after normalization. The regional wind power as well as its variation is influenced by the nature of the wind farm itself, the different seasons, and the periods of time, i.e., recent observations of wind power output are more important than those observed earlier [
For numerical comparison of clustering-based deterministic predictions via ELM, the hourly-ahead prediction errors of the WMTSM-based method, K-means based method [

Fig. 2 Prediction errors of different clustering methods in each month covering datasets 1 and 2.
In this subsection, the data in January and June from dataset 1 and the data in September-October and November-December from dataset 2 are utilized to study the performance of PIs based on the weighting coefficients and the nominal proportions of upper quantiles.
The and interval scores according to the nominal proportions of upper quantiles in ABQs and weighting coefficients in the training process are shown in Figs.

Fig. 3 Performances of PIs in January from dataset 1 with PINC of 90% and 95%. (a) Reliability with PINC of 90%. (b) Overall performance with PINC of 90%. (c) Reliability with PINC of 95%. (d) Overall performance with PINC of 95%.

Fig. 4 Performances of PIs in June from dataset 1 with PINC of 90% and 95%. (a) Reliability with PINC of 90%. (b) Overall performance with PINC of 90%. (c) Reliability with PINC of 95%. (d) Overall performance with PINC of 95%.

Fig. 5 Performances of PIs in September-October from dataset 2 with PINC of 90% and 95%. (a) Reliability with PINC of 90%. (b) Overall performance with PINC of 90%. (c) Reliability with PINC of 95%. (d) Overall performance with PINC of 95%.

Fig. 6 Performances of PIs in November-December from dataset 2 with PINC of 90% and 95%. (a) Reliability with PINC of 90%. (b) Overall performance with PINC of 90%. (c) Reliability with PINC of 95%. (d) Overall performance with PINC of 95%.
Figures
For the numerical analysis of the proposed method, the performances of BELM [
To further verify the effectiveness of the proposed method in different periods, the numerical comparisons with different look-ahead time based on datasets 1 and 2 are presented in Tables VII-X. The conventional PIs for the regional wind power based on deterministic prediction and Gaussian error distribution such as smoothing method [
To reveal the result of proposed method with 1-hour look-ahead time,

Fig. 7 PIs of wind power in January and April from dataset 1. (a) January. (b) April.

Fig. 8 PIs of wind power in September-October and November-December from dataset 2. (a) September-October. (b) November-December.
In this paper, a novel probabilistic prediction method based on CCNQR is proposed for very short-term PIs of regional wind power, which implements the following four tasks. Firstly, WMTSM-based clustering of the samples, which considers the static difference, dynamic difference, meteorological difference, and the importance of variables, is verified by numerical comparison of deterministic predictions. Secondly, CNQR considering reliability, sharpness, and OPOPI is studied to improve the performance of PIs, while the ABQs are studied to improve the flexibility and robustness of PIs.
As verified by the analysis of coefficients and numerical comparisons with different PINCs and look-ahead time, the composite optimization and ABQs improve the forecasting performance. Thirdly, with the result of clustering, the CLP for each cluster is quantified, which can improve the accuracy of samples’ utilization, and further enhance the performance of CNQR. Finally, the numerical comparisons with existing methods for different PINCs and look-ahead time demonstrate the effectiveness of the proposed method.
The future work may focus on the advanced method of dynamic analysis, which can accurately describe the characteristics of wind power time series. Besides, the analysis of STC can also be utilized to improve the performance of PIs for regional output.
Nomenclature
Symbol | —— | Definition |
---|---|---|
—— | Indicator of quantile | |
, , , | —— | Auxiliary variables |
—— | Weight of dynamic difference | |
—— | Weight of static difference | |
—— | Weight of meteorological difference | |
—— | Difference between adjacent explanatory variables | |
—— | Wind speed in numerical weather prediction (NWP) | |
—— | Nominal proportion of prediction intervals (PIs) | |
—— | Nominal proportion of upper quantile | |
—— | Nominal proportion of lower quantile | |
—— | Average width of PIs with | |
—— | Absolute value of proportion deviation | |
—— | Correlation coefficient | |
—— | Wind farm capacity | |
—— | Distance of dynamic difference | |
—— | Distance of static difference | |
—— | Distance of weighted multivariate time series motifs (WMTSM) | |
—— | Distance of meteorological difference | |
—— | Composite optimization considering offsets of points outside PIs (OPOPI), sharpness, and reliability | |
—— | Output function of extreme learning machine (ELM) | |
, , , | —— | Common indices |
—— | PI in a time point with | |
—— | Interval score with | |
—— | Spearman correlation coefficient | |
—— | Weighting coefficient | |
—— | Number of wind farms | |
—— | Number of input variables in each sample | |
—— | Upper quantile of PI with | |
—— | Lower quantile of PI with | |
—— | Score in a time point with | |
—— | Wind power observation | |
—— | Number of testing samples | |
—— | Number of training samples | |
—— | Output weight of ELM | |
—— | Width of PI | |
—— | Input variable of ELM | |
—— | Explanatory variable vector | |
—— | Matrix consisting of input variables in training samples | |
—— | Prediction target of training sample | |
—— | Output vector of training samples |
References
B. Mohandes, M. S. E. Moursi, N. Hatziargyriou et al., “A review of power system flexibility with high penetration of renewables,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 3140-3155, Jul. 2019.
B. Liu, K. Meng, Z. Dong et al., “Marginal bottleneck identification in power system considering correlated wind power prediction errors,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 1, pp. 187-192, Jan. 2020.
L. Ge, Y. Xian, J. Yan et al., “A hybrid model for short-term PV output forecasting based on PCA-GWO-GRNN,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 6, pp. 1268-1275, Nov. 2020.
Y. Wang, Y. Sun, V. Dinavahi et al., “Robust forecasting-aided state estimation for power system against uncertainties,” IEEE Transactions on Power Systems, vol. 35, no. 1, pp. 691-702, Aug. 2020.
A. Cerejo, S. J. P. S. Mariano, P. M. S. Carvalho et al., “Hydro-wind optimal operation for joint bidding in day-ahead market: storage efficiency and impact of wind forecasting uncertainty,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 1, pp. 142-149, Jan. 2020.
Y. Zhao, L. Ye, P. Pinson et al., “Correlation-constrained and sparsity-controlled vector autoregressive model for spatio-temporal wind power forecasting,” IEEE Transactions on Power Systems, vol. 33, no. 5, pp. 5029-5040, Sept. 2018.
Y. Lin, M. Yang, C. Wan et al., “A multi-model combination approach for probabilistic wind power forecasting,” IEEE Transactions on Sustainable Energy, vol. 10, no. 1, pp. 226-237, Jan. 2019.
J. T. G. Hwang and A. Ding, “Prediction intervals for artificial neural networks,” Journal of the American Statistical Association, vol. 92, no. 438, pp. 748-757, Jun. 1997.
Z. Wang, W. Wang, C. Liu et al., “Forecasted scenarios of regional wind farms based on regular vine copulas,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 1, pp. 77-85, Jan. 2020.
Y. Sun, P. Wang, S. Zhai et al., “Ultra short-term probability prediction of wind power based on LSTM network and condition normal distribution,” Wind Energy, vol. 23, no. 2, pp. 63-76, Oct. 2019.
C. Wan, Z. Xu, P. Pinson et al., “Probabilistic forecasting of wind power generation using extreme learning machine,” IEEE Transactions on Power Systems, vol. 29, no. 3, pp. 1033-1044, May 2014.
C. Wan, Z. Xu, P. Pinson et al., “Direct interval forecasting of wind power,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4877-4878, Nov. 2013.
R. Koenker and G. Bassett, “Regression quantiles,” Econometrica, vol. 46, no. 1, pp. 33-50, Jan. 1978.
C. Wan, J. Lin, J. Wang et al., “Direct quantile regression for nonparametric probabilistic forecasting of wind power generation,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 2767-2778, Jul. 2016.
C. Wan, J. Wang, J. Lin et al., “Nonparametric prediction intervals of wind power via linear programming,” IEEE Transactions on Power Systems, vol. 33, no. 1, pp. 1074-1076, Jan. 2018.
A. Xu, T. Yang, J. Ji et al., “Application of cluster analysis in short-term wind forecasting model,” The Journal of Engineering, vol. 2019, no. 9, pp. 5423-5426, Apr. 2019.
Q. Xu, D. He, N. Zhang et al., “A short-term wind power forecasting approach with adjustment of numerical weather prediction input by data mining,” IEEE Transactions on Sustainable Energy, vol. 6, no. 4, pp. 1283-1291, Jun. 2015.
G. Sideratos and N. Hatziargyriou, “Probabilistic wind power forecasting using radial basis function neural networks,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 1788-1796, Nov. 2012.
G. Sideratos and N. Hatziargyriou, “A distributed memory RBF-based model for variable generation forecasting,” International Journal of Electrical Power & Energy Systems, vol. 120, p. 106041, Sept. 2020.
G. Sideratos and N. Hatziargyriou, “An advanced statistical method for wind power forecasting,” IEEE Transactions on Power Systems, vol. 22, no. 1, pp. 258-265, Mar. 2007.
G. Sideratos, A. Ikonomopoulos, and N. Hatziargyriou, “A novel fuzzy-based ensemble model for load forecasting using hybrid deep neural networks,” Electric Power Systems Research, vol. 178, p. 106025, Jan. 2020.
M. Marinelli, P. Maul, A. N. Hahmann et al., “Wind and photovoltaic large-scale regional models for hourly production evaluation,” IEEE Transactions on Sustainable Energy, vol. 6, no. 3, pp. 916-923, Sept. 2015.
M. B. Ozkan and P. Karagoz, “A novel wind power forecast model: statistical hybrid wind power forecast technique (SHWIP),” IEEE Transactions on Industrial Informatics, vol. 11, no. 2, pp. 375-387, Apr. 2015.
M. B. Ozkan and P. Karagoz, “Data mining-based upscaling approach for regional wind power forecasting: regional statistical hybrid wind power forecast technique (regional SHWIP),” IEEE Access, vol. 7, pp. 171790-171800, Nov. 2019.
Q. Zhu, J. Chen, D. Shi et al., “Learning temporal and spatial correlations jointly: a unified framework for wind speed prediction,” IEEE Transactions on Sustainable Energy, vol. 11, no. 1, pp. 509-523, Sept. 2015.
M. He, L. Yang, J. Zhang et al., “A spatio-temporal analysis approach for short-term forecast of wind farm generation,” IEEE Transactions on Power Systems, vol. 29, no. 4, pp. 1611-1622, Jul. 2014.
M. G. Lobo and I. Sanchez, “Regional wind power forecasting based on smoothing techniques, with application to the Spanish peninsular system,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 1990-1997, Nov. 2012.
Q. Liang, Y. Xiong, and K. Liu, “Weather division-based wind power forecasting model with feature selection,” IET Renewable Power Generation, vol. 13, no. 16, pp. 3050-3060, Oct. 2019.
W. Xie, R. Han, and W. Zhou, “Time series classification based on triadic time series motifs,” International Journal of Modern Physics A, vol. 33, no. 21, pp. 1-14, Jan. 2019.
G. Karypis, E. Han, and V. Kumar, “Chameleon: hierarchical clustering using dynamic modeling,” IEEE Computer Society, vol. 32, no. 8, pp. 68-75, Sept. 1999.
C. Spearman, “The proof and measurement of association between two things,” International Journal of Epidemiology, vol. 39, no. 5, pp. 1137-1150, Oct. 2010.
H. Wang and B. Zou, “Probabilistic computational model for correlated wind speed, solar irradiation, and load using Bayesian network,” IEEE Access, vol. 8, pp. 51653-51663, Mar. 2020.
C. Wan, Z. Xu, P. Pinson et al., “Optimal prediction intervals of wind power generation,” IEEE Transactions on Power Systems, vol. 29, no. 3, pp. 1166-1174, May 2014.
J. Kennedy and R. C. Eberhart, “Particle swarm optimization,” in Proceedings of 1995 IEEE International Conference on Neural Networks, Perth, Australia, Nov. 1995, pp. 1942-1948.
M. Yang, X. Chen, and B. Huang, “Ultra-short-term multi-step wind power prediction based on fractal scaling factor transformation,” Journal of Renewable and Sustainable Energy, vol. 10, no. 5, pp. 1-17, Oct. 2018.
M. Yang, R. Zhang, Y. Cui et al., “Investigating the wind power smoothing effect using set pair analysis,” IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1161-1172, Jul. 2020.
L. Ye, C. Zhang, Y. Tang et al., “Hierarchical model predictive control strategy based on dynamic active power dispatch for wind power cluster integration,” IEEE Transactions on Power Systems, vol. 34, no. 6, pp. 4617-4629, Nov. 2019.
T. Hong, P. Pinson, and S. Fan, “Global energy forecasting competition 2012,” International Journal of Forecasting, vol. 30, no. 2, pp. 351-363, Apr.-Jun. 2014.
X. Huang, Y. Ye, and H. Zhang, “Extensions of K means-type algorithms: a new clustering framework by integrating intracluster compactness and intercluster separation,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 25, no. 8, pp. 1433-1446, Aug. 2014.
M. Yang, C. Shi, and H. Liu, “Day-ahead wind power forecasting based on the clustering of equivalent power curves,” Energy, vol. 218, p. 119515, Mar. 2021.