Abstract
The increasing penetration of highly intermittent wind generation could seriously jeopardize the operation reliability of power systems and increase the risk of electricity outages. To this end, this paper proposes a novel data-driven method for operation risk assessment of wind-integrated power systems. Firstly, a new approach is presented to model the uncertainty of wind power in lead time. The proposed approach employs k-means clustering and mixture models (MMs) to construct time-dependent probability distributions of wind power. The proposed approach can also capture the complicated statistical features of wind power such as multimodality. Then, a non-sequential Monte Carlo simulation (NSMCS) technique is adopted to evaluate the operation risk indices. To improve the computation performance of NSMCS, a cross-entropy based importance sampling (CE-IS) technique is applied. The CE-IS technique is modified to include the proposed model of wind power. The method is validated on a modified IEEE 24-bus reliability test system (RTS) and a modified IEEE 3-area RTS while employing the historical data of wind generation. The simulation results verify the importance of accurate modeling of short-term uncertainty of wind power for operation risk assessment. Further case studies have been performed to analyze the impact of transmission systems on operation risk indices. The computational performance of the framework is also examined.
Keywords
Cross entropy; mixture model; Monte Carlo simulation; operation risk; power system reliability
The penetration of wind generation in modern power systems is on the rise. According to a recent forecast by the Global Wind Energy Council, the total global installed capacity of wind generation will reach 840 GW by 2022, i.e., a 42% increase from the current level [
The consideration of wind generation in long-term risk assessment of power systems is studied in [
For operation risk assessment of wind-integrated power systems, the existing methods can be broadly classified into two main categories: analytical methods and simulation techniques. Reference [
Analytical methods have limited applications and are not suitable for operation risk assessment of composite generation-transmission systems [
A common determinant of the existing techniques in [
To address the problems envisaged previously, a new approach is proposed to model the short-term uncertainty of wind power for operation risk assessment. Firstly, k-means clustering is used to obtain sufficient historical data of wind power for fitting time-dependent PDFs. Then for each cluster, mixture models (MMs) are utilized to develop multivariate PDFs, which can capture the complicated statistical features of wind power. MMs are semi-parametric probabilistic models that can represent arbitrarily complex PDFs with great flexibility. Gaussian MMs (GMMs) have been used to model spatial correlation of wind speed in long-term reliability evaluation [
Afterward, using the law of total expectation, an analytical expression for integrating the proposed wind power model into operation risk assessment is obtained. An NSMCS technique is then adopted to evaluate this expression. The computation speed of the NSMCS is greatly enhanced by employing the CE-IS technique [
In summary, the main contributions of this paper are:
1) This paper presents a novel approach based on k-means clustering and GMMs to represent the uncertainty of wind power for operation risk assessment. To avoid overfitting and singularities, the proposed approach obtains the GMM parameters through maximum a posteriori (MAP) estimation instead of the widely used maximum likelihood estimation (MLE) technique.
2) Based on the proposed probabilistic modeling of wind power, the paper presents a new framework for operation risk assessment. The clustering-based GMM modeling of wind power is integrated into the operation risk assessment using the law of total expectation.
3) An NSMCS technique is adopted to estimate the risk indices. To improve the computation performance, CE-IS is adapted so that it can also be applied to the GMMs of wind power. Simulation studies are performed on two test systems to demonstrate the efficacy of the proposed modeling approach and the operation risk assessment framework.
The rest of the paper is organized as follows. In Section II, the fundamental concepts of operation risk assessment in composite power systems are introduced. Section III describes the proposed probabilistic modeling approach for wind power. In Section IV, the proposed operation risk assessment is put forward. Simulation results are presented in Section V. Finally, Section VI concludes the paper.
Consider a power system with $N_G$ conventional generation stations, $N_L$ transmission lines, $N_W$ wind generation units, and $N_B$ buses. To represent the uncertainties arising from the unplanned outages of conventional generation units, transmission lines, and wind power at time $t$, a random vector $\mathbf{X}_t = [\mathbf{G}_t, \mathbf{L}_t, \mathbf{P}_t]$ is defined. $\mathbf{G}_t = [G_{1,t}, \ldots, G_{N_G,t}]$, where $G_{i,t}$ is the number of available generation units in generation station $i$; $\mathbf{L}_t = [L_{1,t}, \ldots, L_{N_L,t}]$, where $L_{j,t}$ is the status of transmission line $j$ and is 1 if the transmission line is available and 0 if it is on outage; $\mathbf{P}_t = [P_{1,t}, \ldots, P_{N_W,t}]$, where $P_{w,t}$ is the wind power of wind farm $w$. Using the above notation, the risk can be mathematically expressed as:
$$R = \mathbb{E}_f\left[H(\mathbf{X}_t)\right] = \int_{\Omega} H(\mathbf{x}_t)\, f(\mathbf{x}_t)\, d\mathbf{x}_t \tag{1}$$
where $R$ is the risk index; $H(\cdot)$ is the limit-state function (LSF) or test function; $f(\cdot)$ is the joint multivariate PDF of random vector $\mathbf{X}_t$; $\Omega$ is the state space; $\mathbb{E}$ is the expectation operator; and $\mathbf{x}_t$ is a particular realization of $\mathbf{X}_t$. As can be deduced from (1), the choice of $f$ significantly impacts the risk indices. A more accurate estimation of $f$ would invariably lead to a more accurate evaluation of the risk indices [
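Assuming that the outages of conventional generation units, the outages of transmission lines, and the variability of wind power are mutually independent, the joint PDF $f(\mathbf{x}_t)$ can be expressed as:
$$f(\mathbf{x}_t) = f_G(\mathbf{g}_t)\, f_L(\mathbf{l}_t)\, f_P(\mathbf{p}_t) \tag{2}$$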
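For the conventional generation units, it is assumed that each generation station $i$ further comprises $n_i$ identical generation units. The failure events of these generation units are also assumed to be independent of each other [
$$f_G(\mathbf{g}_t) = \prod_{i=1}^{N_G} f_{G_i}(g_{i,t}) \tag{3}$$
where
$$f_{G_i}(g_{i,t}) = \binom{n_i}{g_{i,t}} \left(1 - u_i^G\right)^{g_{i,t}} \left(u_i^G\right)^{n_i - g_{i,t}} \tag{4}$$
$$u_i^G = 1 - e^{-\lambda_i^G T} \approx \lambda_i^G T \tag{5}$$
where $\lambda_i^G$ is the failure rate of a generation unit in generation station $i$; $T$ is the lead time (typically 1 hour); and $u_i^G$ is the outage replacement rate [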
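Similar to the conventional generation units, it is assumed that the line outages are independent of each other. The PDF of the line statuses, $f_L$, is represented by a product of Bernoulli distributions:
$$f_L(\mathbf{l}_t) = \prod_{j=1}^{N_L} f_{L_j}(l_{j,t}) \tag{6}$$
where
$$f_{L_j}(l_{j,t}) = \left(1 - u_j^L\right)^{l_{j,t}} \left(u_j^L\right)^{1 - l_{j,t}} \tag{7}$$
$$u_j^L = 1 - e^{-\lambda_j^L T} \approx \lambda_j^L T \tag{8}$$
where $\lambda_j^L$ is the failure rate of transmission line $j$.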
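The outage models for generation units and transmission lines can be illustrated with a short numerical sketch. The Python code below is not taken from the paper: it assumes the standard exponential form of the outage replacement rate, $1 - e^{-\lambda T}$, and all failure rates and unit counts are hypothetical values chosen only for illustration.

```python
import math
import random


def outage_replacement_rate(failure_rate_per_hour, lead_time_hours=1.0):
    """Probability that a component fails within the lead time and is not replaced.

    Assumes exponentially distributed times to failure (an assumption of this sketch).
    """
    return 1.0 - math.exp(-failure_rate_per_hour * lead_time_hours)


def sample_available_units(n_units, orr, rng):
    """Binomial draw: number of identical, independent units that remain available."""
    return sum(1 for _ in range(n_units) if rng.random() >= orr)


def sample_line_status(orr, rng):
    """Bernoulli draw: 1 if the line is available, 0 if it is on outage."""
    return 1 if rng.random() >= orr else 0


rng = random.Random(42)

# Hypothetical data: (number of identical units, failure rate per hour) for each station.
stations = [(4, 1.0 / 1960.0), (2, 1.0 / 1200.0), (6, 1.0 / 2940.0)]
# Hypothetical line failure rates per hour.
line_failure_rates = [0.24 / 8760.0, 0.51 / 8760.0]

unit_orrs = [outage_replacement_rate(lam) for _, lam in stations]
generation_state = [
    sample_available_units(n, u, rng) for (n, _), u in zip(stations, unit_orrs)
]
line_state = [
    sample_line_status(outage_replacement_rate(lam), rng) for lam in line_failure_rates
]

print("unit ORRs:", unit_orrs)
print("available units per station:", generation_state)
print("line statuses:", line_state)
```

For a one-hour lead time the outage replacement rate is very close to the product of the failure rate and the lead time, which is why outage events are rare in operation studies.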
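Finally, for wind power, it is assumed that the spatial correlation among the wind farms is negligible, which implies that the PDF of wind power, $f_P$, can be expressed as:
$$f_P(\mathbf{p}_t) = \prod_{w=1}^{N_W} f_w(p_{w,t}) \tag{9}$$
where $f_w$ is the PDF for wind farm $w$ at time $t$.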
In this section, a novel approach founded on k-means clustering and GMMs is presented to model the PDF of wind power for operation risk assessment. Before delving any further, the motivation behind the proposed approach is presented. As mentioned in Section I, the existing approaches in operation risk assessment are primarily based on modeling wind speed PDF. Wind power PDF is then obtained through a wind power curve [

Fig. 1 Wind power curve with box plots of actual wind power data.
The wind power also possesses certain sophisticated statistical features that cannot be captured by simple parametric PDFs, e.g., Weibull, Beta, and Gaussian, which are used in the existing literature on power system reliability. One such feature is the multimodality.

Fig. 2 Histogram of historical wind power data for a month.
In the light of the above discussion, GMMs are used to model the short-term uncertainty of wind power for specific hours of a specific day. To employ GMMs, additional random variables are defined. The random vector delineated in the previous section represents the wind power of wind farms during time t. To study the temporal dependence, another random vector is considered. It should be noted that both and are defined for specific instances of the time at a specific day. As highlighted in [
(10) |
As this PDF is defined for two specific hours, it should be constructed using the wind power data of those specific hours. For instance, if the two hours are 14:00 and 15:00 on January 3, all the historical data for these two hours on this day would be employed to estimate the PDF.
In order to estimate this PDF for specific time periods, a substantial amount of historical data for those time periods is required. However, as only a limited amount of historical data is available, a clustering approach is proposed. In particular, by using k-means clustering, the historical data for a specific month, e.g., January, are grouped into $C$ clusters. Then, for each cluster, the historical data for the two particular hours, e.g., 14:00 and 15:00, are used to estimate the required joint PDF. The joint PDF for each cluster $c$ is associated with a probability $\rho_c$, which is obtained through k-means clustering. This approach ensures that sufficient data are available for fitting the PDFs. The approach is pictorially depicted in

Fig. 3 Clustering of historical data.
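The clustering step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that each day of the month is represented by its 24-hour wind power profile, that the cluster probability is the relative frequency of days in the cluster, and that the synthetic profiles (in per unit) stand in for historical data. The function `kmeans` is a plain Lloyd's algorithm written for this example.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Synthetic stand-in for historical data: 90 daily profiles of 24 hourly values.
rng = random.Random(1)
HOURS = 24
daily_profiles = []
for level in (0.15, 0.50, 0.85):  # hypothetical low, medium, and high wind days
    for _ in range(30):
        daily_profiles.append(
            [min(1.0, max(0.0, rng.gauss(level, 0.08))) for _ in range(HOURS)]
        )

K = 3
labels, centers = kmeans(daily_profiles, K)

# Cluster probability taken as the share of days assigned to each cluster.
cluster_prob = [labels.count(c) / len(labels) for c in range(K)]

# Data of the two hours of interest (14:00 and 15:00) within each cluster.
two_hour_data = {
    c: [(day[14], day[15]) for day, lab in zip(daily_profiles, labels) if lab == c]
    for c in range(K)
}

print("cluster probabilities:", cluster_prob)
print("samples per cluster:", {c: len(v) for c, v in two_hour_data.items()})
```

Each entry of `two_hour_data` is the data set to which a bivariate PDF would then be fitted for that cluster.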
After obtaining sufficient data for specific hours of each cluster, different approaches can be utilized to estimate for each cluster. One approach is the kernel density estimation (KDE) which is non-parametric [
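Using MMs, the PDF for wind power in two particular hours is estimated as:
$$f(\mathbf{y}) = \sum_{m=1}^{M} \pi_m\, p(\mathbf{y} \mid \boldsymbol{\theta}_m) \tag{11}$$
where $\mathbf{y}$ is the vector of wind power in the two hours; $p(\mathbf{y} \mid \boldsymbol{\theta}_m)$ is a parametric bivariate PDF with parameter $\boldsymbol{\theta}_m$; $M$ is the number of mixtures; and $\pi_m$ is the mixing weight of the $m$-th mixture, which satisfies:
$$0 \le \pi_m \le 1 \tag{12}$$
$$\sum_{m=1}^{M} \pi_m = 1 \tag{13}$$
There are three evident advantages of using the MM approach of (11). Firstly, by employing a mixture of parametric distributions, the multimodality of the wind power PDF can be captured. Secondly, by modeling the two hours jointly, the temporal correlation can be included in the model. Thirdly, because (11) is a mixture of bivariate PDFs, the existing risk assessment frameworks can be easily modified to include it.
In this paper, the Gaussian PDF is used to model $p(\mathbf{y} \mid \boldsymbol{\theta}_m)$. Consequently, (11) can also be expressed as:
$$f(\mathbf{y}) = \sum_{m=1}^{M} \pi_m\, \mathcal{N}(\mathbf{y} \mid \boldsymbol{\mu}_m, \boldsymbol{\Sigma}_m) \tag{14}$$
where $\mathcal{N}(\cdot \mid \boldsymbol{\mu}_m, \boldsymbol{\Sigma}_m)$ is the Gaussian PDF with the mean $\boldsymbol{\mu}_m$ and the covariance $\boldsymbol{\Sigma}_m$. For brevity, the GMM parameters are grouped as $\boldsymbol{\pi} = [\pi_1, \ldots, \pi_M]$, $\boldsymbol{\mu} = [\boldsymbol{\mu}_1, \ldots, \boldsymbol{\mu}_M]$, and $\boldsymbol{\Sigma}$, which is a three-dimensional matrix of covariance matrices $\boldsymbol{\Sigma}_m$.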
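To make the mixture form concrete, the sketch below evaluates a two-component bivariate GMM density in plain Python. The weights, means, and covariances are hypothetical values chosen to produce two modes (a low-wind and a high-wind regime); they are not fitted to any data from the paper.

```python
import math


def bivariate_gaussian_pdf(y, mean, cov):
    """Density of a 2-D Gaussian; cov = ((s11, s12), (s12, s22))."""
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    d1, d2 = y[0] - mean[0], y[1] - mean[1]
    # Quadratic form d^T cov^{-1} d written out for the 2x2 case.
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def gmm_pdf(y, weights, means, covs):
    """Weighted sum of the component densities."""
    return sum(
        w * bivariate_gaussian_pdf(y, m, c) for w, m, c in zip(weights, means, covs)
    )


# Hypothetical parameters (wind power in per unit at the two hours).
weights = [0.6, 0.4]
means = [(0.15, 0.18), (0.80, 0.75)]
covs = [
    ((0.010, 0.006), (0.006, 0.010)),
    ((0.010, 0.006), (0.006, 0.010)),
]

density_low = gmm_pdf(means[0], weights, means, covs)
density_high = gmm_pdf(means[1], weights, means, covs)
density_mid = gmm_pdf((0.475, 0.465), weights, means, covs)

print("density at low-wind mode: ", density_low)
print("density at high-wind mode:", density_high)
print("density between the modes:", density_mid)
```

The density is high at both component means and low between them, a bimodal shape that a single Weibull, Beta, or Gaussian PDF cannot reproduce.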
The main task is to determine the GMM parameters using the historical wind power data. In addition, needs to be set. Popular techniques for determining include the split-and-merge method [
(15) |
The MLE approach for estimating GMM parameters suffers from two major shortcomings. Firstly, given the limited amount of data, it is prone to overfitting. Secondly, due to the collapsing variance problem, singularities could occur [
(16) |
where the two prior terms in (16) are the prior distributions on the GMM parameters. In particular, the prior on the mixing weights is a Dirichlet distribution and the prior on the component means and covariances is a Normal-Inverse-Wishart distribution. These prior distributions regularize the parameter fitting and thus avoid overfitting and singularities. Using the expectation maximization (EM) method, the MAP estimates in (16) are obtained as [
(17) |
(18) |
(19) |
(20) |
(21) |
(22) |
where ; ; is the scalar parameter of Dirichlet distribution; , , , are the parameters of the Normal-Inverse-Wishart distribution, is a D×1 vector parameter, and are scalar parameters, and is a D×D matrix parameter, is the number of dimensions in the data.
After obtaining the joint PDF, the conditional PDF for each cluster can be obtained. Due to the characteristic of Gaussian PDFs, this conditional PDF is also a univariate GMM. Subsequently, it is employed in operation reliability assessment.
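The conditioning step follows from the standard conditional-Gaussian identities applied to each component, with the mixing weights rescaled by how well each component explains the observed value. The sketch below assumes that the first coordinate of the bivariate GMM is the observed (initial) wind power and the second is the wind power at the lead time; this ordering and all parameter values are assumptions of the example.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def condition_gmm(weights, means, covs, observed_first):
    """Condition a bivariate GMM on its first coordinate.

    Returns [(weight, mean, variance), ...] of the univariate GMM that
    describes the second coordinate given the observed first coordinate.
    """
    unnormalized = []
    for w, (m1, m2), ((s11, s12), (_, s22)) in zip(weights, means, covs):
        # Weight is proportional to prior weight times marginal likelihood of the observation.
        evidence = w * normal_pdf(observed_first, m1, s11)
        cond_mean = m2 + s12 / s11 * (observed_first - m1)
        cond_var = s22 - s12 * s12 / s11
        unnormalized.append((evidence, cond_mean, cond_var))
    total = sum(e for e, _, _ in unnormalized)
    return [(e / total, m, v) for e, m, v in unnormalized]


# Hypothetical bivariate GMM (per-unit wind power at the two hours).
weights = [0.6, 0.4]
means = [(0.15, 0.18), (0.80, 0.75)]
covs = [
    ((0.010, 0.006), (0.006, 0.010)),
    ((0.010, 0.006), (0.006, 0.010)),
]

conditional = condition_gmm(weights, means, covs, observed_first=0.15)
for w, m, v in conditional:
    print(f"weight={w:.6f}  mean={m:.4f}  variance={v:.4f}")
```

With an observed value close to the low-wind mode, nearly all of the conditional weight moves to the low-wind component, so the initial wind power shapes the lead-time PDF.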
In this section, the proposed method of operation risk assessment is presented, which integrates the previously developed GMM model of wind power. As an example, the method is explained using the loss of load probability (LOLP) index. However, the method could be easily extended to estimate other reliability indices.
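The risk in (1) is defined for a specific PDF $f$. As there are multiple PDFs for multiple clusters, (1) needs to be modified. The law of total expectation can be used to obtain the total risk considering different PDFs for different clusters as:
$$R = \sum_{c=1}^{C} \rho_c\, \mathbb{E}_{f_c}\left[H(\mathbf{X}_t)\right] \tag{23}$$
where $C$ is the number of clusters; $\rho_c$ is the probability of cluster $c$; and $f_c$ is similar to $f$ in (2) with the exception that the wind power PDF is replaced by the PDF of cluster $c$.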
Due to a large number of states in the state space and high dimensionality of the integral, it is difficult to evaluate (23) analytically [
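$$\mathbb{E}_{f_c}\left[H(\mathbf{X}_t)\right] \approx \frac{1}{N} \sum_{n=1}^{N} H(\mathbf{x}_n) \tag{24}$$
where $\mathbf{x}_1, \ldots, \mathbf{x}_N$ are independent and identically distributed (IID) samples drawn from $f_c$; and $N$ is the total number of samples. The LSF is defined as: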
(25) |
(26) |
where is the load demand during lead time; is the cumulative available generation at bus b; and is the load supplied to bus b. The DC OPF is used to evaluate and .
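A crude non-sequential Monte Carlo estimate of LOLP, combined over clusters by the law of total expectation, can be sketched as below. This is a deliberately simplified illustration: it uses a single-bus power balance instead of a DC OPF, ignores transmission lines, clips sampled wind power to the range 0 to 1 p.u., and uses an exaggerated unit outage probability so that loss-of-load events are visible with few samples. All numerical values are hypothetical.

```python
import random

N_UNITS = 10              # identical conventional units
UNIT_MW = 100.0
UNIT_ORR = 0.02           # exaggerated on purpose for this illustration
WIND_CAPACITY_MW = 200.0
LOAD_MW = 950.0


def sample_gmm_1d(components, rng):
    """Draw from a univariate GMM [(weight, mean, std), ...], clipped to [0, 1] p.u."""
    weights = [w for w, _, _ in components]
    _, mean, std = rng.choices(components, weights=weights)[0]
    return min(1.0, max(0.0, rng.gauss(mean, std)))


def limit_state(available_units, wind_pu):
    """1 if load is lost, 0 otherwise (single-bus balance, no network model)."""
    supply_mw = available_units * UNIT_MW + wind_pu * WIND_CAPACITY_MW
    return 1 if supply_mw < LOAD_MW else 0


def crude_lolp(components, n_samples, rng):
    """Sample mean of the limit-state function over IID system states."""
    hits = 0
    for _ in range(n_samples):
        available = sum(1 for _ in range(N_UNITS) if rng.random() >= UNIT_ORR)
        hits += limit_state(available, sample_gmm_1d(components, rng))
    return hits / n_samples


# Hypothetical clusters: (cluster probability, conditional wind power GMM).
clusters = [
    (0.5, [(0.7, 0.10, 0.05), (0.3, 0.30, 0.08)]),   # low-wind days
    (0.3, [(1.0, 0.50, 0.10)]),                      # medium-wind days
    (0.2, [(0.6, 0.85, 0.06), (0.4, 0.60, 0.10)]),   # high-wind days
]

rng = random.Random(7)
lolp_per_cluster = [crude_lolp(gmm, 5000, rng) for _, gmm in clusters]

# Law of total expectation: weight each cluster's risk by its probability.
lolp_total = sum(p * r for (p, _), r in zip(clusters, lolp_per_cluster))

print("LOLP per cluster:", lolp_per_cluster)
print("total LOLP:", lolp_total)
```

With realistic outage probabilities almost every sample would return 0, which is the motivation for the variance-reduction step that follows.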
Because failure probabilities are low during power system operation, most of the samples correspond to an LSF value of 0. Thus, a large number of samples is required to correctly estimate the risk indices. This would significantly increase the computation burden [
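$$\mathbb{E}_{f_c}\left[H(\mathbf{X}_t)\right] \approx \frac{1}{N} \sum_{n=1}^{N} H(\mathbf{x}_n)\, W(\mathbf{x}_n) \tag{27}$$
$$W(\mathbf{x}_n) = \frac{f_c(\mathbf{x}_n)}{g_c(\mathbf{x}_n)} \tag{28}$$
In (27), the IID samples are now drawn from the PDF $g_c$ instead of $f_c$, and $W$ is the likelihood ratio. Similar to (2), $g_c$ can be written as:
$$g_c(\mathbf{x}_t) = g_G(\mathbf{g}_t)\, g_L(\mathbf{l}_t)\, g_{P,c}(\mathbf{p}_t) \tag{29}$$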
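The effect of sampling from a distorted density and correcting with the likelihood ratio can be checked on a case small enough to have an exact answer. The sketch below considers only independent Bernoulli unit outages, with no wind power and no network, and uses a hand-picked distorted outage probability rather than one obtained by CE optimization; all values are hypothetical.

```python
import math
import random

N_UNITS = 10
TRUE_ORR = 0.001         # original outage probability of each unit
DISTORTED_ORR = 0.2      # hand-picked for illustration, not CE-optimized
MIN_UNITS_NEEDED = 9     # load is lost when fewer than 9 units are available


def loss_indicator(n_failed):
    return 1 if N_UNITS - n_failed < MIN_UNITS_NEEDED else 0


def likelihood_ratio(n_failed):
    """Original density divided by distorted density for independent Bernoulli outages."""
    n_ok = N_UNITS - n_failed
    return (
        (TRUE_ORR / DISTORTED_ORR) ** n_failed
        * ((1.0 - TRUE_ORR) / (1.0 - DISTORTED_ORR)) ** n_ok
    )


def importance_sampling_lolp(n_samples, rng):
    """Average of indicator times likelihood ratio, with states drawn from the distorted density."""
    total = 0.0
    for _ in range(n_samples):
        n_failed = sum(1 for _ in range(N_UNITS) if rng.random() < DISTORTED_ORR)
        total += loss_indicator(n_failed) * likelihood_ratio(n_failed)
    return total / n_samples


# Exact value for comparison: probability that at least two of the ten units fail.
exact_lolp = 1.0 - sum(
    math.comb(N_UNITS, k) * TRUE_ORR ** k * (1.0 - TRUE_ORR) ** (N_UNITS - k)
    for k in range(2)
)

lolp_is = importance_sampling_lolp(20000, random.Random(3))

print("exact LOLP:              ", exact_lolp)
print("importance sampling LOLP:", lolp_is)
```

Because the event has a probability of only a few in 100,000, a crude Monte Carlo run with the same 20,000 samples would expect to observe fewer than one loss-of-load event, whereas the distorted density produces them in a large share of the samples.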
The PDF from which the samples are drawn, also known as the importance sampling density, can be obtained using the widely used CE optimization [
Considering the GMM PDF of wind farm w for cluster c, $f_{w,c}$, the following transformation is used to obtain a new random variable $Z$ as:
$$Z = F_Z^{-1}\left(F_{w,c}(P_{w,t})\right) \tag{30}$$
where $F_{w,c}$ is the cumulative distribution function (CDF) of $P_{w,t}$ under $f_{w,c}$; and $F_Z$ is the CDF of $Z$. After transformation, the PDF for $Z$ is distorted using the CE optimization. As the PDF of $Z$ belongs to the exponential family of distributions, analytical rules can be applied to obtain the CE parameters. In the calculation of (25) and (26) through DC-OPF, the random variable $Z$ is transformed back to the actual wind power random variable using the distorted PDF via the inverse of (30). Thus, $Z$ acts as a proxy random variable to distort the GMM. The complete CE algorithm is depicted as
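The proxy-variable transformation and its inverse can be sketched with the Python standard library. The example assumes that the proxy variable is standard normal, which is one member of the exponential family and may differ from the choice made in the paper; the univariate GMM parameters are hypothetical, and the inverse is computed by bisection because the GMM CDF has no closed-form inverse.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumed distribution of the proxy variable


def gmm_cdf(p, components):
    """CDF of a univariate GMM given as [(weight, mean, std), ...]."""
    return sum(w * NormalDist(mu, sd).cdf(p) for w, mu, sd in components)


def to_proxy(p, components):
    """Forward transformation: wind power -> proxy variable."""
    return STANDARD_NORMAL.inv_cdf(gmm_cdf(p, components))


def from_proxy(z, components, lo=-1.0, hi=2.0, n_iter=60):
    """Inverse transformation by bisection on the monotone GMM CDF."""
    target = STANDARD_NORMAL.cdf(z)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if gmm_cdf(mid, components) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical conditional wind power GMM (per unit).
components = [(0.6, 0.20, 0.08), (0.4, 0.70, 0.10)]

z = to_proxy(0.40, components)
p_back = from_proxy(z, components)

# Moving the proxy toward lower values, as a distorted density might,
# maps back to lower wind power.
p_shifted = from_proxy(z - 1.0, components)

print("proxy value:", z)
print("round trip wind power:", p_back)
print("wind power after shifting the proxy by -1:", p_shifted)
```

Distorting a single Gaussian proxy requires updating only its mean and variance, which is simpler than distorting every weight, mean, and variance of the GMM directly.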
In this section, case studies are performed to depict the effectiveness of the proposed probabilistic modeling of wind power and the proposed operation risk assessment method. The simulations are performed on a modified IEEE 24-bus RTS as shown in

Fig. 4 Modified IEEE 24-bus RTS.
The efficacy of the GMM model is explained in this subsection.

Fig. 5 Bivariate histogram for given dataset for one of the three clusters.

Fig. 6 GMM for given dataset when MAP estimation is employed.

Fig. 7 GMM for given dataset when MLE estimation is employed.

Fig. 8 Bivariate Gaussian approximation to given dataset.
In this section, the operation risk indices are evaluated. The proposed method is compared with another method (Method B) in which the PDF of wind power is modeled using a bivariate Gaussian distribution as shown in
The results indicate that there is a stark difference between the risks obtained from the two methods. For all values of initial wind power, the risk indices obtained by the proposed approach are higher than those obtained by Method B. A reason behind this observation can be deduced by investigating
This subsection examines the computation performance of the proposed method.

Fig. 9 Convergence behavior of proposed method for three clusters.
In this subsection, the effect of the location of the wind farm on the operation risk indices is investigated to understand the impact of transmission system constraints. As can be observed from Table II, the operation risk indices are markedly different for different buses. This variation stems from the difference in the total capacities of transmission lines connected to these buses. The operation risk indices for bus 4 are the highest as it has the lowest total capacity of transmission lines connected to it, i.e., 350 MW.
Therefore, the output of the wind farm is highly constrained in this case. For bus 10, the total capacity of transmission lines is 1545 MW. Therefore, the operation risk indices for this bus are lower than those of bus 4. Finally, for bus 19, the transmission capacity available to the wind farm is 1000 MW. Hence, the operation risk indices for this bus lie between those of bus 4 and bus 10. This effect is more pronounced when the wind power PDF for the lead time is more skewed toward the maximum capacity, e.g., when the initial wind power is 1.0. These results highlight the effects of the transmission system on operation risk indices. These effects can only be considered through the risk assessment of composite power systems.
This subsection presents studies on the IEEE 3-area RTS. Similar to the previous case studies, a 155 MW conventional generator on bus 16 of each area is removed. A 500 MW wind farm is installed on bus 19 of each area. The load is set to the peak value of 8550 MW. The probabilistic model for wind power is similar to that in the previous studies. Table III presents the risk indices obtained for this test system. As expected, the risk indices are lower than those of the IEEE 24-bus RTS. This is because the interconnection of the three areas improves the overall reliability of the system.

Fig. 10 Convergence behavior of proposed method for three clusters when IEEE 3-area RTS is employed.
In this paper, a new data-driven method for operation risk assessment of composite power systems considering wind power is proposed. The short-term uncertainty of wind power is directly modeled using GMMs. k-means clustering and MAP estimation are adopted to address the issue of limited availability of wind power data. The proposed GMM is then incorporated in the operation risk assessment framework using the law of total expectation. NSMCS is then applied to obtain the risk indices. The computation performance of NSMCS is improved by adapting the CE-IS.
Case studies have shown that modeling the complicated statistical features of wind power, which is achieved here through GMMs, is necessary to obtain accurate operation risk indices. In particular, the multimodality of the wind power PDF affects the calculated operation risk indices. The computation performance of the proposed data-driven method is suitable for application in power system operation. The transmission system constraints significantly affect the operation risk indices. Therefore, it is crucial to include the transmission system in any operation risk study. Compared to the existing approaches, the proposed data-driven approach avoids the pitfalls of using wind speed data. Moreover, the extensive modeling of the uncertainty of wind generation leads to a more accurate estimation of risk indices.
REFERENCES
Global Wind Energy Council (GWEC). (2018, May). Global wind report: annual market update 2017. [Online]. Available: https://www.researchgate.net/publication/324966225_GLOBAL_WIND_REPORT_-_Annual_Market_Update_2017
R. Billinton, B. Karki, R. Karki et al., “Unit commitment risk analysis of wind integrated power systems,” IEEE Transactions on Power Systems, vol. 24, no. 2, pp. 930-939, May 2009.
R. Billinton and R. N. Allan, Reliability Evaluation of Power Systems. New York: Plenum, 1996.
W. Li, Risk Assessment of Power Systems-Models, Methods, and Applications. New York: IEEE Press, 2005.
M. Liu, W. Li, J. Yu et al., “Reliability evaluation of tidal and wind power generation system with battery energy storage,” Journal of Modern Power Systems and Clean Energy, vol. 4, no. 4, pp. 636-647, Sept. 2016.
R. Billinton, R. Karki, Y. Gao et al., “Adequacy assessment considerations in wind integrated power systems,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 2297-2305, Nov. 2012.
S. Sulaeman, M. Benidris, J. Mitra et al., “A wind farm reliability model considering both wind variability and turbine forced outages,” IEEE Transactions on Sustainable Energy, vol. 8, no. 2, pp. 629-637, Apr. 2017.
Z. Parvini, A. Abbaspour, M. Fotuhi-Firuzabad et al., “Operational reliability studies of power systems in presence of energy storage systems,” IEEE Transactions on Power Systems, vol. 33, no. 4, pp. 3691-3700, Jul. 2018.
S. Thapa, R. Karki, and R. Billinton, “Utilization of the area risk concept for operational reliability evaluation of a wind-integrated power system,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4771-4779, Nov. 2013.
P. Wang, Z. Gao, and L. B. Tjernberg, “Operational adequacy studies of power systems with wind farms and energy storage,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 2377-2384, Nov. 2012.
Y. Ding, L. Cheng, Y. Zhang et al., “Operational reliability evaluation of restructured power systems with wind power penetration utilizing reliability network equivalent and time-sequential simulation approaches,” Journal of Modern Power Systems and Clean Energy, vol. 2, no. 4, pp. 329-340, Dec. 2014.
A. M. L. Silva, J. F. C. Castro, and R. Billinton, “Probabilistic assessment of spinning reserve via cross-entropy method considering renewable sources and transmission restrictions,” IEEE Transactions on Power Systems, vol. 33, no. 4, pp. 4574-4582, Jul. 2018.
O. A. Ansari and C. Y. Chung, “A hybrid framework for short-term risk assessment of wind-integrated composite power systems,” IEEE Transactions on Power Systems, vol. 34, no. 3, pp. 2334-2344, May 2019.
Y. Wang, Q. Hu, and S. Pei, “Wind power curve modeling with asymmetric error distribution,” IEEE Transactions on Sustainable Energy. DOI: 10.1109/TSTE.2019.2920386.
B. Khorramdel, C. Y. Chung, N. Safari et al., “A fuzzy adaptive probabilistic wind power prediction framework using diffusion kernel density estimators,” IEEE Transactions on Power Systems, vol. 33, no. 6, pp. 7109-7121, Nov. 2018.
L. Geng, Y. Zhao, and W. Li, “Enhanced cross entropy method for composite power system reliability evaluation,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 3129-3139, Jul. 2019.
J. Cai, Q. Xu, M. Cao et al., “A novel importance sampling method of power system reliability assessment considering multi-state units and correlation between wind speed and load,” International Journal of Electrical Power & Energy Systems, vol. 109, pp. 217-226, Jul. 2019.
F. Ge, Y. Ju, Z. Qi et al., “Parameter estimation of a Gaussian mixture model for wind power forecast error by Riemann L-BFGS optimization,” IEEE Access, vol. 6, pp. 38892-38899, Jul. 2018.
D. Ke, C. Y. Chung, and Y. Sun, “A novel probabilistic optimal power flow model with uncertain wind power generation described by customized Gaussian mixture model,” IEEE Transactions on Sustainable Energy, vol. 7, no. 1, pp. 200-212, Jan. 2016.
C. M. Bishop, Pattern Recognition and Machine Learning. New York: Springer, 2006.
R. Y. Rubinstein and D. P. Kroese, Simulation and the Monte Carlo Method. New York: Wiley, 2007.
R. Y. Rubinstein and D. P. Kroese, The Cross-Entropy Method. New York: Springer, 2004.
D. W. Scott, Multivariate Density Estimation: Theory, Practice and Visualization. Hoboken: Wiley, 2015.
A. L. Rojas. (2005, Jul.). Conditional density estimation using finite mixture models with an application to astrophysics. Carnegie Mellon University, Pittsburgh, USA. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.61.2002
G. McLachlan and D. Peel, Finite Mixture Models. New York: Wiley, 2000.
K. P. Murphy, Machine Learning: A Probabilistic Perspective. Cambridge: The MIT Press, 2012.
A. M. L. Silva, J. F. C. Castro, and R. A. González-Fernández, “Spinning reserve assessment under transmission constraints based on cross-entropy method,” IEEE Transactions on Power Systems, vol. 31, no. 2, pp. 1624-1632, Mar. 2016.
Probability Methods Subcommittee, “IEEE reliability test system,” IEEE Transactions on Power Apparatus and Systems, vol. PAS-98, no. 6, pp. 2047-2054, Nov. 1979.
P. E. E. Sotavento. (2019, Oct.). Wind Power Data. [Online]. Available: http://www.sotaventogalicia.com/es