Data-driven Operation Risk Assessment of Wind-integrated Power Systems via Mixture Models and Importance Sampling

Osama Aslam Ansari, Yuzhong Gong, Weijia Liu, Chi Yung Chung

Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, Canada

Updated: 2020-05-28

DOI: 10.35833/MPCE.2019.000163

Abstract

The increasing penetration of highly intermittent wind generation could seriously jeopardize the operation reliability of power systems and increase the risk of electricity outages. To this end, this paper proposes a novel data-driven method for operation risk assessment of wind-integrated power systems. Firstly, a new approach is presented to model the uncertainty of wind power in lead time. The proposed approach employs k-means clustering and mixture models (MMs) to construct time-dependent probability distributions of wind power. The proposed approach can also capture the complicated statistical features of wind power such as multimodality. Then, a non-sequential Monte Carlo simulation (NSMCS) technique is adopted to evaluate the operation risk indices. To improve the computation performance of NSMCS, a cross-entropy based importance sampling (CE-IS) technique is applied. The CE-IS technique is modified to include the proposed model of wind power. The method is validated on a modified IEEE 24-bus reliability test system (RTS) and a modified IEEE 3-area RTS while employing the historical data of wind generation. The simulation results verify the importance of accurate modeling of short-term uncertainty of wind power for operation risk assessment. Further case studies have been performed to analyze the impact of transmission systems on operation risk indices. The computational performance of the framework is also examined.

Keywords

Cross entropy; mixture model; Monte Carlo simulation; operation risk; power system reliability

I. Introduction

The penetration of wind generation in modern power systems is on the rise. According to a recent forecast by the Global Wind Energy Council, the total global installed capacity of wind generation will reach 840 GW by 2022, i.e., a 42% increase from the current level [1]. This ever-growing utilization of wind generation inevitably brings several challenges to power systems. One of the critical challenges is to maintain and improve the reliability of power systems and reduce the risk of electricity outages. In particular, during power system operation, the short-term reliability can be significantly affected by the constrained availability, or outright lack, of remedial resources amid unexpected variability of wind generation [2]-[4]. Thus, there is a pressing need to develop frameworks that can accurately assess the short-term or operation risk of wind-integrated power systems. These frameworks could enable power system operators to make risk-informed decisions ahead of time for mitigating the adverse impacts of wind generation on power system reliability.

The consideration of wind generation in long-term risk assessment of power systems is studied in [5]-[7]. For instance, different autoregressive moving average (ARMA) series for wind speed are developed for long-term reliability in [5] and [6]. Reference [7] formulates capacity outage probability tables considering both the variability of wind speed and the outages of wind turbines. Nonetheless, these techniques are not applicable to operation risk assessment. The main reason is that the long-term reliability models of wind speed and wind power are not appropriate to represent the time-dependent short-term uncertainty of wind power during power system operation [2].

For operation risk assessment of wind-integrated power systems, the existing methods can be broadly classified into two main categories: analytical methods and simulation techniques. Reference [8] formulates a discrete probability distribution function (PDF) of wind power using wind speed time series, which is then employed in an analytical technique known as the PJM method [3]. In [2] and [9], the ARMA series of wind speed is adapted to construct discrete PDFs, which are conditioned on the initial wind speed. These PDFs of wind speed are then utilized in the area-risk method, which is an extension of the PJM method. In [10], the ARMA series of wind speed is directly adopted in a contingency-list based analytical method for operation risk assessment.

Analytical methods have limited applications and are not suitable for operation risk assessment of composite generation-transmission systems [4]. In this case, simulation techniques such as Monte Carlo simulation (MCS) provide an attractive alternative for operation risk assessment. In [11], a continuous-time Markov chain is employed to model the wind speed. The Markov chain is then used in conjunction with sequential MCS for operation risk evaluation. Reference [12] employs quasi-sequential MCS, where wind power is modeled using a fixed number of scenarios. The computational efficiency of MCS is enhanced by adopting the cross-entropy (CE) based importance sampling (IS) technique. Different from pure analytical and simulation methods, a hybrid framework is proposed in [13]. The wind speed uncertainty is modeled using multiple conditional Weibull PDFs. Then, the area-risk method is combined with non-sequential MCS (NSMCS) based on CE-IS to evaluate the short-term risk indices.

A common feature of the existing techniques in [8]-[11], [13] is that either discrete or parametric continuous PDFs are employed to model the short-term uncertainty of wind speed. There are two critical issues with this approach. Firstly, the process of modeling the wind speed and converting it to wind power unavoidably inherits the inaccuracy of the wind power curve [14]. Secondly, the unimodal PDFs employed in these studies, e.g., Weibull and Gaussian, are not well-suited to model the complicated statistical features of wind speed and wind power [15]. Neglecting these features, including the multimodality of the PDF and the temporal correlation, might lead to inaccurate short-term risk indices.

To address the problems identified above, a new approach is proposed to model the short-term uncertainty of wind power for operation risk assessment. Firstly, k-means clustering is used to obtain sufficient historical data of wind power for fitting time-dependent PDFs. Then, for each cluster, mixture models (MMs) are utilized to develop multivariate PDFs, which can capture the complicated statistical features of wind power. MMs are semi-parametric probabilistic models that can represent arbitrarily complex PDFs with great flexibility. Gaussian MMs (GMMs) have been used to model the spatial correlation of wind speed in long-term reliability evaluation [16], [17]. Researchers have also employed GMMs to model the forecast error of wind power [18] and to represent the uncertainty of wind power in probabilistic optimal power flow (OPF) [19]. In contrast to [16] and [17], this paper adopts GMMs to construct time-dependent PDFs of wind power for specific hours in order to render them suitable for short-term reliability evaluation. The drawback of using wind speed data is thus avoided. Compared to [18] and [19], as the data available for fitting GMMs for specific hours are scarce, this paper adopts maximum a posteriori (MAP) estimation for GMMs as opposed to maximum likelihood estimation (MLE). This is because MLE is susceptible to overfitting and is also prone to singularities in the case of GMMs [20].

Afterward, using the law of total expectation, an analytical expression for integrating the proposed wind power modeling in operation risk assessment is obtained. An NSMCS technique is then adopted to evaluate the analytical expression for operation risk assessment. The computation speed of the NSMCS is greatly enhanced by employing the CE-IS technique [21], [22]. The CE-IS technique is also modified to include the proposed short-term uncertainty model of wind power. In particular, a proxy distribution is used to obtain the distorted parameters of GMMs. This ensures that the operational indices are evaluated with an acceptable computation burden. The proposed data-driven framework is tested on a modified IEEE 24-bus reliability test system (RTS) and a modified IEEE 73-bus 3-area RTS. Actual wind power data from a wind farm in Spain are adopted for the short-term uncertainty modeling of wind power.

In summary, the main contributions of this paper are:

1) This paper presents a novel approach based on k-means clustering and GMMs to represent the uncertainty of wind power for operation risk assessment. The proposed approach also adopts MAP estimation to obtain GMM parameters instead of the widely-used MLE technique to avoid overfitting and singularities.

2) Based on the proposed probabilistic modeling of wind power, the paper presents a new framework for operation risk assessment. The clustering-based GMM modeling of wind power is integrated into the operation risk assessment using the law of total expectation.

3) An NSMCS technique is adopted to estimate the risk indices. To improve the computation performance, CE-IS is adopted and extended to the GMMs of wind power. Simulation studies are also performed on two test systems to demonstrate the efficacy of the proposed modeling approach and the operation risk assessment framework.

The rest of the paper is organized as follows. In Section II, the fundamental concepts of operation risk assessment in composite power systems are introduced. Section III describes the proposed probabilistic modeling approach for wind power. In Section IV, the proposed operation risk assessment is put forward. Simulation results are presented in Section V. Finally, Section VI concludes the paper.

II. Preamble of Operation Risk Assessment

Consider a power system with $N_{G}$ conventional generation stations, $N_{L}$ transmission lines, $N_{W}$ wind generation units, and $N_{B}$ buses. To represent the uncertainties arising from the unplanned outages of conventional generation units and transmission lines, and from wind power at time $t$, a random vector $X_{t} = [X_{G, t}, X_{L, t}, X_{W, t}]$ is defined. Here, $X_{G, t} = [n_{1, t}, \dots, n_{s, t}, \dots, n_{N_{G}, t}]$, where $n_{s, t}$ is the number of available generation units in generation station $s$; $X_{L, t} = [ζ_{1, t}, \dots, ζ_{l, t}, \dots, ζ_{N_{L}, t}]$, where $ζ_{l, t}$ is the status of transmission line $l$, equal to 1 if the line is available and 0 if it is on outage; and $X_{W, t} = [g_{1, t}, \dots, g_{w, t}, \dots, g_{N_{W}, t}]$, where $g_{w, t}$ is the wind power of wind farm $w$. Using the above notation, the risk can be mathematically expressed as:

$R_{t} = E (H (X)) = \int_{Ω} H (x) f_{t} (x) d x$

(1)

where $R_{t}$ is the risk index; $H (\cdot)$ is the limit-state function (LSF) or test function; $f_{t} (\cdot)$ is the joint multivariate PDF of random vector $X$; $Ω$ is the state space; $E (\cdot)$ is the expectation operator; and $x$ is a particular realization of $X$. As can be deduced from (1), the choice of $f_{t} (\cdot)$ significantly impacts the risk indices. A more accurate estimation of $f_{t} (\cdot)$ leads to a more accurate evaluation of the risk indices [13].

Assuming that the outages of conventional generation units and transmission lines, and variability of wind power are mutually independent of each other, $f_{t} (\cdot)$ can be expressed as:

$f_{t} (x) = f_{G, t}^{} (x_{G}) f_{L, t}^{} (x_{L}) f_{W, t}^{} (x_{W})$

(2)

For the conventional generation units, it is assumed that each generation station $s$ further comprises $N_{s}$ identical generation units. The failure events of these generation units are also assumed to be independent of each other [3]. Therefore, $f_{G, t} (\cdot)$ can be modeled as a product of binomial distributions:

$f_{G, t}^{} (x_{G}) = \overset{N_{G}}{\prod_{s = 1}} B i n (N_{s}, p_{s} (Δ t))$

(3)

where

$B i n (N_{s}, p_{s} (Δ t)) = C_{n_{s, t}}^{N_{s}} {(1 - p_{s} (Δ t))}^{n_{s, t}^{}} {(p_{s} (Δ t))}^{N_{s} - n_{s, t}^{}}$

(4)

$p_{s} (Δ t) = 1 - e^{- λ_{s} Δ t} \approx λ_{s} Δ t$

(5)

where $λ_{s}$ is the failure rate of a generation unit in generation station $s$; $Δ t$ is the lead time (typically 1 hour); and $p_{s} (Δ t)$ is the outage replacement rate [2].
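Equations (4) and (5) can be illustrated with a minimal sketch. The station size and the failure rate below are hypothetical values chosen only for illustration; they are not taken from the test systems.

```python
import math

def outage_replacement_rate(failure_rate_per_hour, lead_time_hours):
    """Eq. (5): p(dt) = 1 - exp(-lambda * dt)."""
    return 1.0 - math.exp(-failure_rate_per_hour * lead_time_hours)

def prob_units_available(n_available, n_units, orr):
    """Eq. (4): probability that n_available of n_units identical units are up."""
    return (math.comb(n_units, n_available)
            * (1.0 - orr) ** n_available
            * orr ** (n_units - n_available))

# Hypothetical station: 4 identical units, 3 failures per unit per year.
lam = 3.0 / 8760.0                      # failure rate in 1/h (assumed)
orr = outage_replacement_rate(lam, 1.0)  # lead time of 1 hour
pmf = [prob_units_available(n, 4, orr) for n in range(5)]
```

For a one-hour lead time the exact expression and the linear approximation $λ_{s} Δ t$ are nearly indistinguishable, which is why (5) uses the approximation.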

Similar to the conventional generation units, it is assumed that the line outages are independent of each other. $f_{L, t}^{} (\cdot)$ is represented by a product of Bernoulli distributions:

$f_{L, t}^{} (x_{L}) = \overset{N_{L}}{\prod_{l = 1}} B e r (p_{l} (Δ t))$

(6)

where

$B e r (p_{l} (Δ t)) = {(1 - p_{l} (Δ t))}^{ζ_{l, t}} {(p_{l} (Δ t))}^{1 - ζ_{l, t}}$

(7)

$p_{l} (Δ t) = 1 - e^{- λ_{l} Δ t} \approx λ_{l} Δ t$

(8)

where $λ_{l}$ is the failure rate of transmission line $l$ .

Finally, for wind power, it is assumed that the spatial correlation among the wind farms is negligible, which implies that $f_{W, t}^{} (\cdot)$ can be expressed as:

$f_{W, t}^{} (x_{W}) = \overset{N_{W}}{\prod_{w = 1}} f_{w, t}^{} (g_{w, t}^{})$

(9)

where $f_{w, t}^{} (g_{w, t}^{})$ is the PDF for wind farm $w$ at time $t$ .

III. Proposed Probabilistic Modeling of Wind Power

In this section, a novel approach based on k-means clustering and GMMs is presented to model the PDF of wind power for operation risk assessment. The motivation behind the proposed approach is presented first. As mentioned in Section I, the existing approaches in operation risk assessment are primarily based on modeling the wind speed PDF. The wind power PDF is then obtained through a wind power curve [2], [13]. Figure 1 portrays a typical wind power curve along with box plots representing measured wind power and wind speed data for a real wind farm near Swift Current, Canada. Two observations can be made from Fig. 1. Firstly, there is a high degree of discrepancy between the wind power estimated by the wind power curve and the actual wind power. On the one hand, the underestimation of wind power would yield risk indices higher than the actual risk. On the other hand, the overestimation of wind power would yield risk indices lower than the actual risk. This observation implies that operation risk indices would be inaccurate if only the wind speed is modeled. Secondly, the uncertainty of wind power is substantial in the region between the cut-in and rated wind speeds. Accurate modeling of this uncertainty is essential in order to calculate precise risk indices.

Fig. 1 Wind power curve with box plots of actual wind power data.

Wind power also possesses certain sophisticated statistical features that cannot be captured by simple parametric PDFs, e.g., Weibull, Beta, and Gaussian, which are used in the existing literature on power system reliability. One such feature is multimodality. Figure 2 plots the histogram of actual wind power measured for an entire month at the same wind farm as in Fig. 1. From Fig. 2, at least two modes can be easily identified. Apart from multimodality, the temporal correlation between wind power values at different hours is also essential for operation risk analysis [2], [13]. A univariate PDF cannot capture this correlation. These statistical features of wind power necessitate the use of non-parametric or semi-parametric multivariate PDF estimation techniques for wind power.

Fig. 2 Histogram of historical wind power data for a month.

In light of the above discussion, GMMs are used to model the short-term uncertainty of wind power for specific hours of a specific day. To employ GMMs, additional random variables are defined. The random vector $X_{W, t}$ defined in the previous section represents the wind power of the wind farms at time $t$. To study the temporal dependence, another random vector $X_{W, t - 1}$ is considered. It should be noted that both $t$ and $t - 1$ are defined for specific time instants on a specific day. As highlighted in [13], there is a strong correlation between $g_{w, t}$ and $g_{w, t - 1}$, where $g_{w, t - 1}$ is the wind power of wind farm $w$ in the current hour. Hence, the uncertainty of $g_{w, t}$ is best represented by the conditional PDF $f_{w, t} (g_{w, t} | g_{w, t - 1})$. Using the concept of conditional probabilities, $f_{w, t} (g_{w, t} | g_{w, t - 1})$ can be obtained from $f_{w, t} (g_{w, t - 1}, g_{w, t})$, which represents the joint density over $g_{w, t - 1}$ and $g_{w, t}$. If $g_{w, t - 1}$ is deterministically known, which is generally the case during power system operation [2], $f_{w, t} (g_{w, t} | g_{w, t - 1})$ can be directly used to evaluate risk indices. Otherwise, $f_{w, t} (g_{w, t})$ can be obtained by marginalizing over $g_{w, t - 1}$ as:

$f_{w, t}^{} (g_{w, t}^{}) = \int f_{w, t}^{} (g_{w, t - 1}^{}, g_{w, t}^{}) d g_{w, t - 1}^{}$

(10)

As $f_{w, t}^{} (g_{w, t - 1}^{}, g_{w, t}^{})$ is defined for two specific hours, this PDF should be constructed using the appropriate wind power data of those specific hours. For instance, if $t - 1$ and $t$ are 14:00 and 15:00 on January 3, respectively, all the historical data for the two hours on this day would be employed to estimate the PDF.

In order to estimate $f_{w, t} (g_{w, t - 1}, g_{w, t})$ for specific time periods, a substantial amount of historical data for those time periods is required. However, as only a limited amount of historical data is available, a clustering approach is proposed. In particular, by using k-means clustering, the historical data for a specific month, e.g., January, are grouped into clusters. Then, for each cluster, the historical data for the two particular hours, e.g., 14:00 and 15:00, are used to estimate the required joint PDF. The joint PDF for each cluster is associated with a probability $λ_{c}$, which is obtained through k-means clustering. This approach ensures that sufficient data are available for fitting the PDFs. The approach is depicted in Fig. 3, where $T_{h}$ is the current time instant; and $[T_{h}, T_{h + 1}]$ is the time interval for risk assessment.

Fig. 3 Clustering of historical data.
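The clustering step can be sketched as follows. This is only an illustration under stated assumptions: each sample is taken to be one day's wind power profile, $λ_{c}$ is taken to be the fraction of days assigned to cluster $c$, and the deterministic initialization and the function names (`kmeans`, `cluster_probabilities`, `hour_pairs`) are hypothetical choices, not details given in the paper.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, iters=50):
    """Plain Lloyd iterations; returns one cluster label per daily profile."""
    n = len(profiles)
    # Assumed initialization: spread the initial centers over profiles sorted by energy.
    ordered = sorted(profiles, key=sum)
    centers = [list(ordered[round(i * (n - 1) / max(k - 1, 1))]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in profiles]
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_probabilities(labels, k):
    """Assumed definition: lambda_c is the share of days in cluster c."""
    return [labels.count(c) / len(labels) for c in range(k)]

def hour_pairs(profiles, labels, cluster, hour):
    """Samples (g_{t-1}, g_t) of one cluster for two consecutive hours."""
    return [(p[hour - 1], p[hour]) for p, lab in zip(profiles, labels) if lab == cluster]

# Toy data in p.u.: three low-wind days and two high-wind days, three hours each.
days = [[0.10, 0.12, 0.11], [0.12, 0.08, 0.10], [0.09, 0.11, 0.13],
        [0.80, 0.90, 0.85], [0.85, 0.80, 0.90]]
labels = kmeans(days, 2)
lam = cluster_probabilities(labels, 2)
```

The pairs returned by `hour_pairs` for each cluster are the bivariate samples to which a joint PDF is subsequently fitted.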

After obtaining sufficient data for specific hours of each cluster, different approaches can be utilized to estimate $f_{w, c, t} (g_{w, t - 1}, g_{w, t})$ for each cluster. One approach is kernel density estimation (KDE), which is non-parametric [23]. In this paper, GMMs are adopted to model the PDF for the following reasons. Compared to KDE, a GMM requires less data [24]. As only limited wind power data are available in practice, the GMM is a natural choice. Also, as GMMs involve parametric PDFs, they can be interpreted more easily and included in the existing risk assessment frameworks. A minor drawback of GMMs is the assumption about the form of the distribution, which is absent in KDE [23].

Using MMs, the PDF for wind power in two particular hours is estimated as:

${\hat{f}}_{w, c, t}^{} (g_{w, t - 1}^{}, g_{w, t}^{}) = \overset{K}{\sum_{k = 1}} π_{k} Ψ_{k} (g_{w, t - 1}^{}, g_{w, t}^{} | θ_{k})$

(11)

where $Ψ_{k} (\cdot)$ is a parametric bivariate PDF with parameter $θ_{k}$; $K$ is the number of mixture components; and $π_{k}$ is the $k$-th mixing proportion satisfying the following conditions.

$\overset{K}{\sum_{k = 1}} π_{k} = 1$

(12)

$0 \leq π_{k} \leq 1$

(13)

There are three evident advantages of using the MM approach of (11). Firstly, by employing a mixture of parametric distributions, the multimodality of the wind power PDF can be captured. Secondly, through the inclusion of $g_{w, t - 1}$, the temporal correlation can be included in the model. Thirdly, because (11) is simply a mixture of bivariate parametric PDFs, the existing risk assessment frameworks can be easily modified to include it.

In this paper, the Gaussian PDF is used to model $Ψ_{k} (\cdot)$ . Consequently, (11) can also be expressed as:

${\hat{f}}_{w, c, t}^{} (g_{w, t - 1}^{}, g_{w, t}^{}) = \overset{K}{\sum_{k = 1}} π_{k} ϕ (g_{w, t - 1}^{}, g_{w, t}^{} | μ_{k}, Σ_{k})$

(14)

where $ϕ (g_{w, t - 1}, g_{w, t} | μ_{k}, Σ_{k})$ is the Gaussian PDF with the mean $μ_{k}$ and the covariance $Σ_{k}$. For brevity, the GMM parameters are grouped as $π = [π_{1}, π_{2}, \dots, π_{K}]^{T}$, $μ = [μ_{1}^{T}, μ_{2}^{T}, \dots, μ_{K}^{T}]^{T}$, and $Σ$, which is a three-dimensional array of the covariance matrices $Σ_{k}$.

The main task is to determine the GMM parameters using the historical wind power data. In addition, $K$ needs to be set. Popular techniques for determining $K$ include the split-and-merge method [16], cross-validation, and the Akaike information criterion (AIC) [17]. For a given $K$, the remaining GMM parameters can then be estimated by maximizing the log-likelihood of (14) [25]. For a given set of $N$ samples of bivariate wind power data $\{g_{w, t - 1, i}, g_{w, t, i}\}$, where $i = 1, 2, \dots, N$, the GMM parameters can be estimated using the MLE approach as:

$π, μ, Σ = \arg\max \lg \overset{N}{\prod_{i = 1}} {\hat{f}}_{w, c, t} (g_{w, t - 1, i}, g_{w, t, i})$

(15)

The MLE approach for estimating GMM parameters suffers from two major shortcomings. Firstly, given the limited amount of data, it is prone to overfitting. Secondly, due to the collapsing variance problem, singularities could occur [26]. Therefore, to avoid these drawbacks, the MAP approach is employed. Using the MAP approach, (15) is modified to:

$\begin{array}{l} π, μ, Σ = \\ \arg\max (\lg \overset{N}{\prod_{i = 1}} {\hat{f}}_{w, c, t} (g_{w, t - 1, i}, g_{w, t, i}) + \lg f_{D} (π) + \lg f_{N I W} (μ, Σ)) \end{array}$

(16)

where $f_{D} (π)$ and $f_{N I W} (μ, Σ)$ are the prior distributions on the GMM parameters. In particular, $f_{D} (π)$ is a Dirichlet distribution and $f_{N I W} (μ, Σ)$ is a Normal-Inverse-Wishart distribution. These prior distributions act to regularize the parameter fitting and thus avoid overfitting and singularities. Using the expectation maximization (EM) method, the MAP estimates in (16) are obtained as [26]:

$r_{i, k} = \frac{π_{k} ϕ (g_{w, t - 1, i}^{}, g_{w, t, i}^{} | μ_{k}, Σ_{k})}{\sum_{l} π_{l} ϕ (g_{w, t - 1, i}^{}, g_{w, t, i}^{} | μ_{l}, Σ_{l})}$

(17)

$π_{k} = \frac{r_{k} + α_{k} - 1}{N + \sum_{k} α_{k} - K}$

(18)

$μ_{k} = \frac{r_{k} {\bar{g}}_{k} + κ_{0} m_{0}}{r_{k} + κ_{0}}$

(19)

${\bar{g}}_{k} = \frac{\sum_{i} r_{i, k} g_{i}}{r_{k}}$

(20)

$Σ_{k} = \frac{S_{0} + S_{k} + \frac{κ_{0} r_{k}}{κ_{0} + r_{k}} ({\bar{g}}_{k} - m_{0}) {({\bar{g}}_{k} - m_{0})}^{T}}{r_{k} + v_{0} + D + 2}$

(21)

$S_{k} = \sum_{i} r_{i, k} (g_{i} - {\bar{g}}_{k}) {(g_{i} - {\bar{g}}_{k})}^{T}$

(22)

where $r_{k} = \sum_{i} r_{i, k}$; $g_{i} = [g_{w, t - 1, i}, g_{w, t, i}]^{T}$; $α_{k}$ is the scalar parameter of the Dirichlet distribution; $m_{0}$, $v_{0}$, $κ_{0}$, and $S_{0}$ are the parameters of the Normal-Inverse-Wishart distribution, where $m_{0}$ is a $D \times 1$ vector, $v_{0}$ and $κ_{0}$ are scalars, and $S_{0}$ is a $D \times D$ matrix; and $D$ is the number of dimensions of the data. Equation (17) represents the E-step, and (18)-(22) correspond to the M-step. These two steps are conducted iteratively.
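The M-step in (18)-(22) can be sketched as a single function. This is a minimal illustration that assumes the responsibilities from (17) are already available and that every $r_{k}$ is positive; the prior values in the example are hypothetical, and the function name `map_m_step` is not from the paper.

```python
def map_m_step(data, resp, alpha, m0, kappa0, nu0, S0):
    """One MAP M-step following (18)-(22).

    data: N samples, each a list of length D; resp: N x K responsibilities r_{i,k};
    alpha: K Dirichlet parameters; m0, kappa0, nu0, S0: Normal-Inverse-Wishart parameters.
    """
    N, D, K = len(data), len(data[0]), len(alpha)
    pis, mus, sigmas = [], [], []
    for k in range(K):
        r_k = sum(resp[i][k] for i in range(N))
        # (18): mixing proportion with Dirichlet prior
        pis.append((r_k + alpha[k] - 1.0) / (N + sum(alpha) - K))
        # (20): responsibility-weighted sample mean
        g_bar = [sum(resp[i][k] * data[i][d] for i in range(N)) / r_k for d in range(D)]
        # (19): mean shrunk toward the prior mean m0
        mus.append([(r_k * g_bar[d] + kappa0 * m0[d]) / (r_k + kappa0) for d in range(D)])
        # (21)-(22): covariance regularized by S0
        shrink = kappa0 * r_k / (kappa0 + r_k)
        sigma = [[0.0] * D for _ in range(D)]
        for a in range(D):
            for b in range(D):
                s_k = sum(resp[i][k] * (data[i][a] - g_bar[a]) * (data[i][b] - g_bar[b])
                          for i in range(N))
                prior = shrink * (g_bar[a] - m0[a]) * (g_bar[b] - m0[b])
                sigma[a][b] = (S0[a][b] + s_k + prior) / (r_k + nu0 + D + 2)
        sigmas.append(sigma)
    return pis, mus, sigmas

# Hypothetical example: component 0 owns a single sample, so its scatter S_k is zero.
data = [[0.20, 0.25], [0.80, 0.85], [0.82, 0.80]]
resp = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
pis, mus, sigmas = map_m_step(data, resp, alpha=[2.0, 2.0], m0=[0.5, 0.5],
                              kappa0=0.01, nu0=4.0, S0=[[0.01, 0.0], [0.0, 0.01]])
```

In the example, the maximum-likelihood covariance of component 0 would collapse to zero, whereas the prior term $S_{0}$ in (21) keeps it strictly positive. This is the regularizing effect described in the text.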

After obtaining the joint PDF, the conditional PDF for each cluster ${\hat{f}}_{w, c, t}^{} (g_{w, t}^{} |g_{w, t - 1}^{})$ can be obtained. Due to the characteristic of Gaussian PDFs, this conditional PDF is also a univariate GMM. Subsequently, it is employed in operation reliability assessment.
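The conditioning step follows from the standard conditional-Gaussian identities applied to each component. The sketch below assumes a bivariate GMM whose first coordinate is $g_{w, t - 1}$ and whose second is $g_{w, t}$; the parameter values are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def condition_gmm(weights, means, covs, g_prev):
    """Return weights, means, and variances of the univariate GMM of g_t given g_{t-1} = g_prev."""
    w, m, v = [], [], []
    for pi_k, mu, S in zip(weights, means, covs):
        # New weight is proportional to pi_k times the marginal density of g_prev.
        w.append(pi_k * normal_pdf(g_prev, mu[0], S[0][0]))
        # Conditional mean and variance of a bivariate Gaussian.
        m.append(mu[1] + S[0][1] / S[0][0] * (g_prev - mu[0]))
        v.append(S[1][1] - S[0][1] ** 2 / S[0][0])
    total = sum(w)
    return [x / total for x in w], m, v

# Hypothetical two-component joint GMM in p.u.
cov = [[0.01, 0.005], [0.005, 0.01]]
cw, cm, cv = condition_gmm([0.5, 0.5], [[0.2, 0.2], [0.8, 0.8]], [cov, cov], g_prev=0.2)
```

Note that the conditional weights depend on the observed $g_{w, t - 1}$, so the shape of the lead-time PDF changes with the initial wind power.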

IV. Proposed Method of Short-term Risk Assessment

In this section, the proposed method of operation risk assessment is presented, which integrates the previously developed GMM model of wind power. As an example, the method is explained using the loss of load probability (LOLP) index. However, the method could be easily extended to estimate other reliability indices.

The risk in (1) is defined for a specific PDF $f_{t} (\cdot)$ . As there are multiple PDFs for multiple clusters, (1) needs to be modified. The law of total expectation can be used to obtain the total risk considering different PDFs for different clusters as:

$R = \sum_{c} λ_{c} (\int_{Ω} H (x) f_{t, c} (x) d x)$

(23)

where $f_{t, c} (\cdot)$ is similar to $f_{t} (\cdot)$ in (2) with the exception that $f_{W, t}^{} (\cdot)$ is replaced by PDF of each cluster $f_{W, c, t}^{} (\cdot)$ .

Due to the large number of states in the state space $Ω$ and the high dimensionality of the integral, it is difficult to evaluate (23) analytically [21]. Crude NSMCS could be used to estimate (23) as:

$\hat{R} = \sum_{c} λ_{c} (\frac{1}{N_{s}} \overset{N_{s}}{\sum_{n = 1}} H (x_{c, n}^{}))$

(24)

where $\{x_{c, 1}, x_{c, 2}, \dots, x_{c, N_{s}}\}$ are independent and identically distributed (IID) samples drawn from $f_{t, c} (\cdot)$; and $N_{s}$ is the total number of samples. The LSF $H (\cdot)$ is defined as:

$H (x) = \begin{cases} 0, & S (x) \geq L \\ 1, & S (x) < L \end{cases}$

(25)

$S (x) = \begin{cases} \overset{N_{B}}{\sum_{b = 1}} P_{b}, & \overset{N_{B}}{\sum_{b = 1}} P_{b} \geq L \\ \overset{N_{B}}{\sum_{b = 1}} l_{b}, & \overset{N_{B}}{\sum_{b = 1}} P_{b} < L \end{cases}$

(26)

where $L$ is the load demand during the lead time; $P_{b}$ is the cumulative available generation at bus $b$; and $l_{b}$ is the load supplied at bus $b$. The DC OPF is used to evaluate $P_{b}$ and $l_{b}$.
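The crude estimator in (24) can be sketched in a few lines. The samplers and the LSF are passed in as callables and are hypothetical placeholders for the state sampling and the DC OPF. The helper `required_samples` relies on an assumption that is not stated in the text: the accuracy target is the coefficient of variation $β$ of the estimator of a probability-type index $R$, for which $N \approx (1 - R) / (R β^{2})$.

```python
import random

def crude_nsmcs(cluster_probs, samplers, lsf, n_samples, seed=0):
    """Eq. (24): cluster-weighted average of the limit-state function."""
    rng = random.Random(seed)
    risk = 0.0
    for lam_c, sample in zip(cluster_probs, samplers):
        hits = sum(lsf(sample(rng)) for _ in range(n_samples))
        risk += lam_c * hits / n_samples
    return risk

def required_samples(risk, beta):
    """Samples needed for coefficient of variation beta (assumed accuracy measure)."""
    return (1.0 - risk) / (risk * beta ** 2)

# A risk index of 1e-6 with a 5% coefficient of variation needs about 4e8 samples.
n_needed = required_samples(1e-6, 0.05)
```

The sample count grows inversely with the risk index, which makes crude sampling expensive when failure probabilities are low.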

Because of the low failure probabilities during power system operation, most of the samples correspond to $H (\cdot) = 0$. Thus, a large number of samples is required to correctly estimate the risk indices. This would significantly increase the computation burden [13], [21]. To circumvent this issue, the IS technique is adopted. The IS technique proposes another joint PDF $f_{t, c}^{*} (\cdot)$ that is biased to obtain samples for which $H (\cdot)$ is non-zero. The risk index is then calculated using:

$\hat{R} = \sum_{c} λ_{c} (\frac{1}{N_{s}} \overset{N_{s}}{\sum_{n = 1}} H (x_{c, n}) W_{t, c} (x_{c, n}))$

(27)

$W_{t, c} (x) = \frac{f_{t, c} (x)}{f_{t, c}^{*} (x)}$

(28)

In (27), the IID samples are now drawn from $f_{t, c}^{*} (\cdot)$ . Similar to (2), $f_{t, c}^{*} (\cdot)$ can be written as:

$f_{t, c}^{*} (x) = f_{G, t}^{*} (x_{G}) f_{L, t}^{*} (x_{L}) f_{W, t, c}^{*} (x_{W})$

(29)
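The effect of the likelihood ratio in (27) and (28) can be seen on a toy generation-only system. The system data and the biased outage probabilities below are hypothetical, a single cluster is assumed, and the loss-of-load test replaces the DC OPF by a simple capacity balance.

```python
import random

def is_estimate(p, p_star, unit_mw, load_mw, n_samples, seed=1):
    """Importance-sampling estimate of the loss-of-load probability.

    p: original outage probabilities; p_star: biased ones used for sampling.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        weight, capacity = 1.0, 0.0
        for p_i, q_i, cap in zip(p, p_star, unit_mw):
            if rng.random() < q_i:                   # unit on outage under the biased PDF
                weight *= p_i / q_i
            else:                                    # unit available
                weight *= (1.0 - p_i) / (1.0 - q_i)
                capacity += cap
        if capacity < load_mw:                       # H(x) = 1
            total += weight                          # accumulate H(x) W(x)
    return total / n_samples

# Three 100 MW units with outage probability 0.01 and a 150 MW load:
# load is lost when at least two units are out.
p = [0.01, 0.01, 0.01]
exact = 3 * 0.01 ** 2 * 0.99 + 0.01 ** 3
estimate = is_estimate(p, [0.5, 0.5, 0.5], [100.0] * 3, 150.0, 20000)
```

Sampling from the biased probabilities makes loss-of-load states frequent, while the weights restore an unbiased estimate of the original probability.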

The PDF $f_{t, c}^{*} (\cdot)$, which is also known as the importance sampling density, can be obtained using the widely-used CE optimization [22], [27]. For CE optimization, closed-form analytical updating rules are available for $f_{G, t}^{*} (\cdot)$ and $f_{L, t}^{*} (\cdot)$, as these PDFs belong to the exponential family of distributions [22]. However, for the GMMs of wind power, such a closed-form analytical solution is not available. To overcome this problem, a transformation strategy is adopted.

Considering the GMM PDF of wind farm $w$ for cluster $c$, $f_{w, c, t} (\cdot)$, the following transformation is used to obtain a new random variable $χ_{w}$:

$χ_{w} = Φ^{- 1} (F_{w, c, t}^{} (g_{w, t}^{} | g_{w, t - 1}^{}))$

(30)

where $Φ (\cdot)$ is the cumulative distribution function (CDF) of $ϕ (χ_{w} | μ_{w}, σ_{w}^{2})$; and $F_{w, c, t} (\cdot)$ is the CDF of $f_{w, c, t} (\cdot)$. After transformation, the PDF of $χ_{w}$ is distorted using the CE optimization. As the PDF of $χ_{w}$ belongs to the exponential family of distributions, analytical rules can be applied to obtain the CE parameters. In the calculation of (25) and (26) through DC OPF, the random variable $χ_{w}$ is transformed back to the actual wind power random variable using the distorted $ϕ^{*} (\cdot)$ via the inverse of (30). Thus, $χ_{w}$ acts as a proxy random variable to distort the GMM. The complete CE algorithm is given in Algorithm 1. After obtaining $f_{t, c}^{*} (\cdot)$ through Algorithm 1, NSMCS is employed to estimate (27).
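The transformation in (30) and its inverse can be sketched as follows. Several points are assumptions rather than statements from the paper: the undistorted proxy is taken to be standard normal, the inverse is computed by bisection on the GMM CDF, and truncation of wind power to its physical range is ignored. The GMM parameters are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed undistorted proxy distribution

def gmm_cdf(g, weights, means, variances):
    """CDF of a univariate GMM, e.g. the conditional PDF of next-hour wind power."""
    return sum(w * NormalDist(m, v ** 0.5).cdf(g)
               for w, m, v in zip(weights, means, variances))

def to_proxy(g, weights, means, variances):
    """Eq. (30): chi = Phi^{-1}(F(g))."""
    return STD_NORMAL.inv_cdf(gmm_cdf(g, weights, means, variances))

def from_proxy(chi, weights, means, variances, lo=-5.0, hi=5.0, tol=1e-10):
    """Inverse of (30): solve F(g) = Phi(chi) by bisection (F is monotone)."""
    target = STD_NORMAL.cdf(chi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gmm_cdf(mid, weights, means, variances) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical bimodal conditional GMM in p.u.
gmm = ([0.6, 0.4], [0.2, 0.7], [0.05 ** 2, 0.1 ** 2])
chi = to_proxy(0.3, *gmm)
g_back = from_proxy(chi, *gmm)
```

Under this reading, drawing $χ_{w}$ from a distorted Gaussian with a lower mean and mapping it back through `from_proxy` yields more low-wind samples, while the Gaussian proxy keeps the CE updates in closed form.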

Algorithm 1: CE optimization for GMM-integrated NSMCS
Input: $p_{s}, \forall s$, $p_{l}, \forall l$, $f_{w, c, t} (\cdot)$, $ϕ (\cdot | μ_{w}, σ_{w}^{2}), \forall w$
Output: $p_{s}^{*}, \forall s$, $p_{l}^{*}, \forall l$, $ϕ^{*} (\cdot | μ_{w}^{*}, (σ_{w}^{*})^{2}), \forall w$
1	Set CE parameters $N_{C E}$, $ρ$, $α$, and $c^{m a x}$
2	Set $p_{s}^{c} = p_{s}$, $p_{l}^{c} = p_{l}$, $ϕ^{c} (\cdot | μ_{w}^{c}, (σ_{w}^{c})^{2}) = ϕ (\cdot | μ_{w}, σ_{w}^{2})$
3	For $c = 1$ to $c = c^{m a x}$
4	Sample $\{x_{1}, x_{2}, \dots, x_{N_{C E}}\}$ from (2) using $p_{s}^{c}$ in (3), $p_{l}^{c}$ in (6), and $μ_{w}^{c}$ and $σ_{w}^{c}$ in the inverse of (30)
5	For each sample, evaluate $S (x)$ using (26) and sort the $S (x_{i})$ values in ascending order to obtain the order statistics $S [1] \leq S [2] \leq \dots \leq S [N_{C E}]$
6	If $S [ρ N_{C E}] \geq L$, set $L^{c} = S [ρ N_{C E}]$; otherwise set $L^{c} = L$
7	For each sample, evaluate $H (x)$ using (25) with $L^{c}$ instead of $L$, and evaluate $W (x)$ using (28)
8	Calculate the updated parameters of the PDFs as:
	$p_{s}^{c + 1} = α (1 - \frac{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i}) n_{i}^{s}}{N_{s} \overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i})}) + (1 - α) p_{s}^{c}$
	$p_{l}^{c + 1} = α (1 - \frac{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i}) ζ_{i}^{l}}{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i})}) + (1 - α) p_{l}^{c}$
	$μ_{w}^{c + 1} = \frac{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i}) χ_{i}^{w}}{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i})}$
	${(σ_{w}^{c + 1})}^{2} = \frac{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i}) (χ_{i}^{w} - μ_{w}^{c + 1})^{2}}{\overset{N_{C E}}{\sum_{i = 1}} H (x_{i}) W (x_{i})}$
9	If $L^{c} = L$, break the "For" loop
10	The final parameters of the PDFs are $p_{s}^{*} = p_{s}^{c + 1}$, $p_{l}^{*} = p_{l}^{c + 1}$, $μ_{w}^{*} = μ_{w}^{c + 1}$, $σ_{w}^{*} = σ_{w}^{c + 1}$
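The structure of Algorithm 1 can be illustrated on a deliberately reduced problem. The sketch below is not the paper's implementation: it covers only single-unit generating stations (no lines, no wind, no DC OPF), takes $S (x)$ as the available capacity, and uses hypothetical values for $N_{C E}$, $ρ$, $α$, and the system data.

```python
import random

def likelihood_ratio(state, p, q):
    """W(x) = f(x) / f*(x) for independent units; state[j] is True if unit j is out."""
    w = 1.0
    for out, p_j, q_j in zip(state, p, q):
        w *= (p_j / q_j) if out else ((1.0 - p_j) / (1.0 - q_j))
    return w

def ce_optimize(p, unit_mw, load, n_ce=2000, rho=0.1, alpha=0.9, c_max=10, seed=0):
    """Toy version of steps 2-10 of Algorithm 1 for unit outage probabilities only."""
    rng = random.Random(seed)
    q = list(p)                                                   # step 2
    for _ in range(c_max):                                        # step 3
        states = [tuple(rng.random() < q_j for q_j in q) for _ in range(n_ce)]  # step 4
        supply = [sum(c for out, c in zip(s, unit_mw) if not out) for s in states]
        level = max(sorted(supply)[int(rho * n_ce)], load)        # steps 5-6
        hw = [likelihood_ratio(s, p, q) if cap < level else 0.0   # step 7: H(x) W(x)
              for s, cap in zip(states, supply)]
        denom = sum(hw)
        if denom > 0.0:                                           # step 8 (smoothed update)
            q = [alpha * sum(w * s[j] for w, s in zip(hw, states)) / denom
                 + (1.0 - alpha) * q[j] for j in range(len(q))]
        if level == load:                                         # step 9
            break
    return q                                                      # step 10

def is_estimate(p, q, unit_mw, load, n_samples, seed=1):
    """Estimate of the loss-of-load probability using the distorted probabilities q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        state = tuple(rng.random() < q_j for q_j in q)
        capacity = sum(c for out, c in zip(state, unit_mw) if not out)
        if capacity < load:
            total += likelihood_ratio(state, p, q)
    return total / n_samples

# Five 100 MW units, outage probability 0.01 each, 250 MW load (hypothetical).
p = [0.01] * 5
units = [100.0] * 5
q_star = ce_optimize(p, units, 250.0)
lolp = is_estimate(p, q_star, units, 250.0, 20000)
```

In this toy case the exact loss-of-load probability is about $9.85 \times 10^{- 6}$ (three or more units out), which crude sampling with the same 20000 samples would rarely observe even once.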

V. Results

In this section, case studies are performed to demonstrate the effectiveness of the proposed probabilistic modeling of wind power and the proposed operation risk assessment method. The simulations are performed on a modified IEEE 24-bus RTS as shown in Fig. 4 [28], and a modified IEEE 73-bus 3-area RTS. In the original IEEE 24-bus RTS, a wind farm with a total capacity of 1000 MW is integrated at bus 19. A 155 MW conventional generation station at bus 16 is removed. The total wind power penetration is therefore equal to 23.5%. Ten years of real wind power data from the Sotavento wind farm in Spain are used [29]. All simulations are performed for January. The operation risk is evaluated for a lead time of one hour. The particular time interval for the lead time is set from 03:00 to 04:00. The load is set to the peak value. For CE optimization, the parameters are set according to [13], and the stopping criterion for NSMCS is set to 5%. For k-means clustering, the number of clusters is set to 3.

Fig. 4 Modified IEEE 24-bus RTS.

A. GMM Model

The efficacy of the GMM model is explained in this subsection. Figure 5 portrays the bivariate histogram of the dataset for a specific cluster. Figure 6 depicts the GMM obtained for this cluster, when parameters are obtained using the MAP approach. By comparing Fig. 5 with Fig. 6, it can be concluded that the GMM accurately captures the variability of wind power in the lead time. From close observation of Fig. 6, two conclusions can be made. Firstly, the PDFs for wind power during the lead time are markedly different for different initial wind power. Secondly, the multimodality of PDF is evident. For instance, when the initial wind power lies in the interval [0.2, 0.4), the wind power PDF in the lead time has three distinct modes. The effect of multimodality on operation risk indices will be discussed in the next subsection. Figure 7 represents the GMM model for the same dataset. However, the MLE approach is used to obtain the parameters. Certain Gaussian components in Fig. 7 have low variance and sharp peaks, which indicates overfitting. Finally, a bivariate Gaussian PDF is estimated in Fig. 8. It is seen from this figure that such a PDF cannot truly represent the distribution of wind power in the lead time.

Fig. 5 Bivariate histogram for given dataset for one of the three clusters.

Fig. 6 GMM for given dataset when MAP estimation is employed.

Fig. 7 GMM for given dataset when MLE estimation is employed.

Fig. 8 Bivariate Gaussian approximation to given dataset.

B. Operation Risk Indices for IEEE RTS

In this subsection, the operation risk indices are evaluated. The proposed method is compared with another method (Method B) in which the PDF of wind power is modeled using a bivariate Gaussian distribution, as shown in Fig. 8. The results are listed in Table I.

The results indicate a stark difference between the risk indices obtained from the two methods. For all values of initial wind power, the risk indices obtained by the proposed approach are higher than those obtained by Method B. The reason can be deduced by comparing Fig. 6 with Fig. 8. As noted earlier, Method B is unable to capture the multiple modes of the wind power PDF. Some of these modes occur at lower values of wind power. For instance, as visible in Fig. 6, there is a mode when the next-hour wind generation is around 0.2 p.u. With these modes missing, Method B assumes higher wind generation than the actual values and therefore overestimates the reliability of the power system. In the proposed approach, because these modes are captured, a higher number of samples are drawn from low wind power states during NSMCS, which contributes to higher risk indices.
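The effect of a missing low-wind mode can be illustrated with a small numerical sketch. It compares the probability of low wind generation under a two-component mixture with that under a single Gaussian having the same mean and variance. The mixture parameters and the 0.3 p.u. threshold are hypothetical and are not taken from Fig. 6 or Fig. 8.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a univariate Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


# Hypothetical lead-time wind power PDF in p.u.: (weight, mean, std) per component.
MIXTURE = [(0.3, 0.20, 0.05), (0.7, 0.70, 0.08)]


def mixture_cdf(x, mixture):
    """CDF of a univariate Gaussian mixture."""
    return sum(w * normal_cdf(x, m, s) for w, m, s in mixture)


def moment_matched_gaussian(mixture):
    """Mean and std of the single Gaussian with the same first two moments."""
    mean = sum(w * m for w, m, _ in mixture)
    second_moment = sum(w * (s ** 2 + m ** 2) for w, m, s in mixture)
    return mean, math.sqrt(second_moment - mean ** 2)


threshold = 0.3  # assumed "low wind" threshold in p.u.
p_mix = mixture_cdf(threshold, MIXTURE)
mean, std = moment_matched_gaussian(MIXTURE)
p_single = normal_cdf(threshold, mean, std)
print(f"P(wind < {threshold}) mixture: {p_mix:.3f}, single Gaussian: {p_single:.3f}")
```

For these illustrative parameters, the mixture assigns roughly 0.29 probability to wind generation below the threshold, against roughly 0.15 for the single Gaussian. A model without the low-wind mode would therefore draw fewer low-wind states during sampling.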

Table I Operation Risk Indices for IEEE 24-bus RTS

Initial wind power (p.u.)	Operation risk index
Initial wind power (p.u.)	Proposed method	Method B
0.1	$5.1417 \times 10^{-6}$	$2.0783 \times 10^{-6}$
0.3	$5.2571 \times 10^{-6}$	$2.2170 \times 10^{-6}$
0.5	$2.7805 \times 10^{-6}$	2.0449×10^
0.8	$1.6245 \times 10^{-6}$	7.5509×10^
1.0	$9.1727 \times 10^{-7}$	4.3659×10^

C. Computation Performance

This subsection examines the computation performance of the proposed method. Figure 9 illustrates the computation burden of the proposed method when the initial wind generation is 0.8 p.u. In this case, the operation risk indices are evaluated within 2500 samples of NSMCS. In contrast, crude NSMCS requires a total of $4 \times 10^{8}$ samples to evaluate a risk index in the order of $10^{-6}$ [21]. The improved computation performance is achieved by adopting the CE-IS technique and modifying it for GMM-based modeling of wind power.
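The sample count quoted for crude NSMCS follows from the coefficient of variation of a crude Monte Carlo estimate of a probability. The sketch below assumes that the 5% stopping criterion is a coefficient of variation and that the risk index is a probability-type index. It also shows a plain importance sampling estimator on a hypothetical five-unit system. The distorted unavailability `v` is picked by hand here and is not optimized by the cross-entropy method, so this is not the CE-IS procedure of this paper.

```python
import math
import random


def crude_samples_needed(p, cov):
    """Samples needed by crude Monte Carlo to estimate probability p with the given COV."""
    return (1.0 - p) / (p * cov ** 2)


def exact_risk(n_units, u, min_outages):
    """Exact probability that at least min_outages of n_units identical units are out."""
    return sum(
        math.comb(n_units, k) * u ** k * (1.0 - u) ** (n_units - k)
        for k in range(min_outages, n_units + 1)
    )


def is_estimate(n_units, u, v, min_outages, n_samples, rng):
    """Importance sampling: draw outages with unavailability v, reweight back to u."""
    total = 0.0
    for _ in range(n_samples):
        outages = sum(1 for _ in range(n_units) if rng.random() < v)
        if outages >= min_outages:  # failure state in this toy system
            # Likelihood ratio between the original and the distorted distributions.
            weight = (u / v) ** outages * ((1.0 - u) / (1.0 - v)) ** (n_units - outages)
            total += weight
    return total / n_samples


print(f"Crude samples for p = 1e-6, COV = 5%: {crude_samples_needed(1e-6, 0.05):.3g}")

rng = random.Random(0)
estimate = is_estimate(n_units=5, u=0.01, v=0.5, min_outages=3, n_samples=20000, rng=rng)
exact = exact_risk(n_units=5, u=0.01, min_outages=3)
print(f"IS estimate: {estimate:.3e}, exact: {exact:.3e}")
```

With `p = 1e-6` and a 5% coefficient of variation, the formula gives approximately $4 \times 10^{8}$ samples, in line with the figure quoted above. The importance sampling estimator reaches a comparable accuracy on the toy system with far fewer samples because failure states are drawn much more often and then reweighted.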

Fig. 9 Convergence behavior of proposed method for three clusters.

D. Impact of Wind Farm Location

In this subsection, the effect of the location of the wind farm on the operation risk indices is investigated to understand the impact of transmission system constraints. As can be observed from Table II, the operation risk indices are markedly different for different buses. This variation stems from the difference in the total capacities of transmission lines connected to these buses. The operation risk indices for bus 4 are the highest as it has the lowest total capacity of transmission lines connected to it, i.e., 350 MW.

Table II Operation Risk Indices for Different Locations of Wind Farm

Initial wind power (p.u.)	Operation risk index
Initial wind power (p.u.)	Bus 4	Bus 10	Bus 19
0.1	$5.2098 \times 10^{-6}$	$5.1127 \times 10^{-6}$	$5.1417 \times 10^{-6}$
0.3	$5.0623 \times 10^{-5}$	$5.3847 \times 10^{-11}$	$5.2571 \times 10^{-6}$
0.5	$1.8012 \times 10^{-5}$	$1.5909 \times 10^{-9}$	$2.7805 \times 10^{-6}$
0.8	$2.3407 \times 10^{-6}$	$1.5908 \times 10^{-6}$	$1.6245 \times 10^{-6}$
1.0	$1.4133 \times 10^{-6}$	$8.9567 \times 10^{-7}$	$9.1727 \times 10^{-7}$

Therefore, the output of the wind farm is highly constrained in this case. For bus 10, the total capacity of transmission lines is 1545 MW. Therefore, the operation risk indices for this bus are lower than those of bus 4. Finally, for bus 19, the transmission capacity available to the wind farm is 1000 MW. Hence, the operation risk indices for this bus lie between those of bus 4 and bus 10. This effect is more pronounced when the wind power PDF for the lead time is more skewed toward the maximum capacity, e.g., when the initial wind power is 1.0 p.u. These results highlight the effects of the transmission system on operation risk indices. These effects can only be considered through the risk assessment of composite power systems.
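The constraint imposed by the connected lines can be pictured with a deliberately simplified sketch that caps the wind farm output at the total line capacity of its bus. It ignores local load and the network power flow, both of which matter in the actual composite system assessment, so it only indicates where curtailment can arise. The function name is illustrative.

```python
WIND_FARM_MW = 1000.0
# Total capacity of transmission lines connected to each candidate bus, from the text.
LINE_CAPACITY_MW = {"bus 4": 350.0, "bus 10": 1545.0, "bus 19": 1000.0}


def exportable_wind(output_pu, line_capacity_mw, farm_mw=WIND_FARM_MW):
    """Wind power that can leave the bus, ignoring local load and network flows."""
    return min(output_pu * farm_mw, line_capacity_mw)


for bus, capacity in LINE_CAPACITY_MW.items():
    delivered = [exportable_wind(pu, capacity) for pu in (0.1, 0.5, 1.0)]
    print(bus, delivered)
```

In this simplified view, the wind farm at bus 4 is capped at 350 MW whenever its output exceeds that level, whereas the lines at bus 10 and bus 19 can carry its full output.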

E. Operation Risk Indices for IEEE 3-area RTS

This subsection presents studies on the IEEE 3-area RTS. As in the previous case studies, a 155 MW conventional generator at bus 16 of each area is removed, and a 500 MW wind farm is installed at bus 19 of each area. The load is set to the peak value of 8550 MW. The probabilistic model for wind power is similar to that in the previous studies. Table III presents the risk indices obtained for this test system. Compared to the IEEE 24-bus RTS, the risk indices are lower, as expected, because the interconnection of the three areas improves the overall reliability of the system. Figure 10 depicts the computation performance of the method for this test system when the initial wind generation is set to 0.5 p.u. The maximum number of samples for NSMCS in this case is 10000. Thus, the computation burden is higher than that for the IEEE 24-bus RTS, as expected. However, it is still considerably lower than that of crude NSMCS.

Fig. 10 Convergence behavior of proposed method for three clusters when IEEE 3-area RTS is employed.

Table III Operation Risk Indices for IEEE 3-area RTS

Initial wind power (p.u.)	Index of proposed method
0.1	$8.7893 \times 10^{-11}$
0.3	$3.7503 \times 10^{-11}$
0.5	$2.1835 \times 10^{-11}$
0.8	$1.2957 \times 10^{-11}$
1.0	$7.3894 \times 10^{-12}$

VI. Conclusion

In this paper, a new data-driven method for operation risk assessment of composite power systems considering wind power is proposed. The short-term uncertainty of wind power is directly modeled using GMMs. k-means clustering and MAP estimation are adopted to address the issue of limited availability of wind power data. The proposed GMM is then incorporated in the operation risk assessment framework using the law of total expectation. NSMCS is then applied to obtain the risk indices. The computation performance of NSMCS is improved by adapting the CE-IS technique.

Case studies have shown that the complicated statistical features of wind power, which are modeled by GMM, are necessary to obtain accurate operation risk indices. In particular, the multimodality of wind power PDF affects the calculated operation risk indices. The computation performance of the proposed data-driven method is suitable for application in power systems operation. The transmission system constraints significantly affect the operation risk indices. Therefore, it is crucial to include the transmission system in any operation risk studies. Compared to the existing approaches, the proposed data-driven approach avoids the pitfalls of using wind speed data. Moreover, the extensive modeling of uncertainty of wind generation leads to more accurate estimation of risk indices.

REFERENCES

[1] Global Wind Energy Council (GWEC). (2018, May). Global wind report: annual market update 2017. [Online]. Available: https://www.researchgate.net/publication/324966225_GLOBAL_WIND_REPORT_-_Annual_Market_Update_2017

[2] R. Billinton, B. Karki, R. Karki et al., “Unit commitment risk analysis of wind integrated power systems,” IEEE Transactions on Power Systems, vol. 24, no. 2, pp. 930-939, May 2009.

[3] R. Billinton and R. N. Allan, Reliability Evaluation of Power Systems. New York: Plenum, 1996.

[4] W. Li, Risk Assessment of Power Systems: Models, Methods, and Applications. New York: IEEE Press, 2005.

[5] M. Liu, W. Li, J. Yu et al., “Reliability evaluation of tidal and wind power generation system with battery energy storage,” Journal of Modern Power Systems and Clean Energy, vol. 4, no. 4, pp. 636-647, Sept. 2016.

[6] R. Billinton, R. Karki, Y. Gao et al., “Adequacy assessment considerations in wind integrated power systems,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 2297-2305, Nov. 2012.

[7] S. Sulaeman, M. Benidris, J. Mitra et al., “A wind farm reliability model considering both wind variability and turbine forced outages,” IEEE Transactions on Sustainable Energy, vol. 8, no. 2, pp. 629-637, Apr. 2017.

[8] Z. Parvini, A. Abbaspour, M. Fotuhi-Firuzabad et al., “Operational reliability studies of power systems in presence of energy storage systems,” IEEE Transactions on Power Systems, vol. 33, no. 4, pp. 3691-3700, Jul. 2018.

[9] S. Thapa, R. Karki, and R. Billinton, “Utilization of the area risk concept for operational reliability evaluation of a wind-integrated power system,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4771-4779, Nov. 2013.

[10] P. Wang, Z. Gao, and L. B. Tjernberg, “Operational adequacy studies of power systems with wind farms and energy storage,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 2377-2384, Nov. 2012.

[11] Y. Ding, L. Cheng, Y. Zhang et al., “Operational reliability evaluation of restructured power systems with wind power penetration utilizing reliability network equivalent and time-sequential simulation approaches,” Journal of Modern Power Systems and Clean Energy, vol. 2, no. 4, pp. 329-340, Dec. 2014.

[12] A. M. L. Silva, J. F. C. Castro, and R. Billinton, “Probabilistic assessment of spinning reserve via cross-entropy method considering renewable sources and transmission restrictions,” IEEE Transactions on Power Systems, vol. 33, no. 4, pp. 4574-4582, Jul. 2018.

[13] O. A. Ansari and C. Y. Chung, “A hybrid framework for short-term risk assessment of wind-integrated composite power systems,” IEEE Transactions on Power Systems, vol. 34, no. 3, pp. 2334-2344, May 2019.

[14] Y. Wang, Q. Hu, and S. Pei, “Wind power curve modeling with asymmetric error distribution,” IEEE Transactions on Sustainable Energy. DOI: 10.1109/TSTE.2019.2920386.

[15] B. Khorramdel, C. Y. Chung, N. Safari et al., “A fuzzy adaptive probabilistic wind power prediction framework using diffusion kernel density estimators,” IEEE Transactions on Power Systems, vol. 33, no. 6, pp. 7109-7121, Nov. 2018.

[16] L. Geng, Y. Zhao, and W. Li, “Enhanced cross entropy method for composite power system reliability evaluation,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 3129-3139, Jul. 2019.

[17] J. Cai, Q. Xu, M. Cao et al., “A novel importance sampling method of power system reliability assessment considering multi-state units and correlation between wind speed and load,” International Journal of Electrical Power & Energy Systems, vol. 109, pp. 217-226, Jul. 2019.

[18] F. Ge, Y. Ju, Z. Qi et al., “Parameter estimation of a Gaussian mixture model for wind power forecast error by Riemann L-BFGS optimization,” IEEE Access, vol. 6, pp. 38892-38899, Jul. 2018.

[19] D. Ke, C. Y. Chung, and Y. Sun, “A novel probabilistic optimal power flow model with uncertain wind power generation described by customized Gaussian mixture model,” IEEE Transactions on Sustainable Energy, vol. 7, no. 1, pp. 200-212, Jan. 2016.

[20] C. M. Bishop, Pattern Recognition and Machine Learning. New York: Springer, 2006.

[21] R. Y. Rubinstein and D. P. Kroese, Simulation and the Monte Carlo Method. New York: Wiley, 2007.

[22] R. Y. Rubinstein and D. P. Kroese, The Cross-Entropy Method. New York: Springer, 2004.

[23] D. W. Scott, Multivariate Density Estimation: Theory, Practice and Visualization. Hoboken: Wiley, 2015.

[24] A. L. Rojas. (2005, Jul.). Conditional density estimation using finite mixture models with an application to astrophysics. Carnegie Mellon University, Pittsburgh, USA. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.61.2002

[25] G. McLachlan and D. Peel, Finite Mixture Models. New York: Wiley, 2000.

[26] K. P. Murphy, Machine Learning: A Probabilistic Perspective. Cambridge: The MIT Press, 2012.

[27] A. M. L. Silva, J. F. C. Castro, and R. A. González-Fernández, “Spinning reserve assessment under transmission constraints based on cross-entropy method,” IEEE Transactions on Power Systems, vol. 31, no. 2, pp. 1624-1632, Mar. 2016.

[28] Probability Methods Subcommittee, “IEEE reliability test system,” IEEE Transactions on Power Apparatus and Systems, vol. PAS-98, no. 6, pp. 2047-2054, Nov. 1979.

[29] P. E. E. Sotavento. (2019, Oct.). Wind power data. [Online]. Available: http://www.sotaventogalicia.com/es
