Abstract
Scenario forecasting methods have been widely studied in recent years to cope with the wind power uncertainty problem. The main difficulty of this problem is to accurately and comprehensively reflect the time-series characteristics and spatial-temporal correlation of wind power generation. In this paper, the marginal distribution model and the dependence structure are combined to describe these complex characteristics. On this basis, a scenario generation method for multiple wind farms is proposed. For the marginal distribution model, the autoregressive integrated moving average-generalized autoregressive conditional heteroskedasticity-t (ARIMA-GARCH-t) model is proposed to capture the time-series characteristics of wind power generation. For the dependence structure, a time-varying regular vine mixed Copula (TRVMC) model is established to capture the spatial-temporal correlation of multiple wind farms. Based on the data from 8 wind farms in Northwest China, sufficient scenarios are generated. The effectiveness of the scenarios is evaluated in 3 aspects. The results show that the generated scenarios have similar fluctuation characteristics, autocorrelation, and cross-correlation to the actual wind power sequences.
To achieve a clean and low-carbon energy supply, wind power has attracted extensive attention worldwide in recent decades. However, at present, the accurate forecasting of wind power generation is not an easy goal to achieve [
On the other hand, wind farms are often clustered [
In view of the uncertainty of wind power generation, the scenario forecasting method, which is an important method of probabilistic forecasting, has been extensively studied. Its basic principle is to establish the probability density function (PDF) of wind power or forecasting error by statistical methods, and then generate the scenarios by sampling methods [
The aforementioned works aim to reflect the long-term frequency distribution of wind power generation. Considering the autocorrelation of wind power over a period of time, studies in recent years have begun to focus on the time-series characteristics. In [
However, with the increase in the number and scale of wind farms, the traditional methods have limitations when applied to regional multiple wind farms. The key problem is that the spatial-temporal correlation should be fully described while the advantages of the original method are retained. In this respect, a good method is to adopt the Copula model, which can be decomposed into two parts: the marginal distribution model and the dependence structure. The first part is the independent PDF of each single wind power sequence, which ensures good continuity with the existing works mentioned above. The second part describes the spatial-temporal correlation of multiple wind farms. In [
The literature review shows that some works have introduced the Copula model to analyze the correlation of multiple wind farms. However, the existing works have three limitations: ① the time-series characteristics and spatial-temporal correlation are not effectively combined, resulting in frequent unreasonable fluctuations in the scenarios; ② the models are reliable only when dealing with low-dimensional data; for high-dimensional wind power data, the dependence structures cannot fully describe the spatial-temporal correlation; ③ the complexity of the joint-distribution between two arbitrary wind farms is underestimated. Specifically, the pair Copulas, as the basic units of the high-dimensional Copula model, are too simple to capture the tail characteristics, thus reducing the accuracy of the whole model.
Therefore, this paper aims to propose a scenario generation method that can effectively describe both the time-series characteristics and the spatial-temporal correlation of the power output of multiple wind farms. The contributions of this work are briefly summarized as follows.
1) The time-varying regular vine mixed Copula (TRVMC) model is established to fit the joint probability distribution of the output of regional multiple wind farms. By making the probability distribution capture the characteristics of the joint-frequency distribution, the TRVMC model can reflect the correlation between the wind farms. Compared with the Gaussian, C-vine, and D-vine Copula models in the existing research, the TRVMC model does not need to make strict assumptions about the correlation of input data, but fits the appropriate model structure for different input data. Consequently, the TRVMC model has higher fitting accuracy for the joint-probability distribution.
2) The autoregressive integrated moving average-generalized autoregressive conditional heteroskedasticity-t (ARIMA-GARCH-t) model is established to fit the marginal distribution model of the output of each wind farm. The model can capture the time-series characteristics of wind power output. Compared with the commonly-used static models such as the KDE model, the ECDF model, and the Student-t model, the ARIMA-GARCH-t model has higher fitting accuracy, which can provide more reliable input data for the Copula model.
3) The time-varying mixed Copula (TMC) model is established as the pair Copulas, i.e., the basic units of the TRVMC model. On one hand, the TMC model integrates the advantages of multiple basic bivariate Copula models. On the other hand, by introducing the dynamic correlation calculation models (including the dynamic conditional correlation (DCC) model and the Patton model) to fit the parameters of the TMC model, the model can capture the time-varying characteristics of the dependence structure of two-dimensional wind power sequences, so as to improve the accuracy of the entire model.
4) Based on the established ARIMA-GARCH-t model and the TRVMC model, the forecasting scenario generation method for the output of regional multiple wind farms is proposed. The scenarios present similar time-series characteristics and spatial-temporal correlation with the actual wind power sequence. Compared with the methods which ignore the above characteristics, the scenarios generated by the proposed method have a more reasonable fluctuation range and frequency. Besides, the scenarios can better envelop the actual wind power output sequence.
The rest of the paper is organized as follows. In Sections II and III, the ARIMA-GARCH-t model and the TRVMC model are described in detail, respectively. The forecasting scenario generation method is proposed in Section IV. Section V provides an overall description of the modeling and scenario generation process. The evaluation framework is introduced in Section VI. In Section VII, the forecasting scenarios are generated and evaluated based on the data from 8 wind farms. Finally, some concluding remarks are provided in Section VIII.
The marginal distribution model is the independent probability distribution model of each wind power sequence. By calculating the cumulative probability, the original wind power sequence is transformed into a uniform sequence bounded by [0, 1]. The converted sequence will be used as input data for the Copula model.
The cumulative distribution function (CDF) of the power output of each wind farm can be expressed as:
(1) |
where , , and are the measured power, forecasting power, and forecasting error of the
The ARIMA model can be expressed as [
(2) |
where is the forecasting error sequence; B denotes the lag operator, namely ; denotes the d-order difference computation process, which transforms the input data into a stationary sequence, and the augmented Dickey-Fuller (ADF) test can be used to evaluate the stationarity of the sequences [
(3) |
where , , and are the constant parameter, AR coefficient, and MA coefficient obtained by the fitting, respectively; and are the orders of the AR and MA coefficients, respectively; and rp and rq are the count variables.
Through the ARIMA model, the estimation of the PDF/CDF of is converted into that of the residual error sequence .
The function of the GARCH-t model is to fit the PDF/CDF of . According to the existing research [
(4) |
where N is the normal distribution; denotes all historical information before time ; is the conditional variance; is an independent and identically-distributed variable that obeys the standard normal distribution ; , , and are the fixed parameter, the autoregressive conditional heteroskedasticity (ARCH) coefficient, and the GARCH coefficient obtained by the fitting, respectively; and are the orders of the GARCH and ARCH coefficients, respectively; and lp and lq are the count variables.
By combining the calculation results of the ARIMA model and the GARCH-t model, the PDF of the original wind power output sequence can be obtained as:
(5) |
where and are the calculation results of the ARIMA model and GARCH-t model, respectively; and and are the measured power and forecasting power of the
Taking the historical output data of a wind farm for 1 day as an example, the calculation results of ARIMA and GARCH-t models are shown in

Fig. 1 Calculation results of ARIMA and GARCH-t models. (a) ARIMA model. (b) GARCH-t model.
The parameters of the ARIMA model and the GARCH-t model can be estimated by optimizing the log-likelihood function as:
(6) |
where and are the parameter sets of the ARIMA model and the GARCH-t model, respectively; and are the functions of the ARIMA model and the GARCH-t model, respectively; and and are the estimation results, respectively.
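As an illustration of the likelihood that such an estimation optimizes, the following is a minimal sketch of the negative log-likelihood of a GARCH(1,1) model with standardized Student-t innovations. The model orders, the function name `garch11_t_neg_loglik`, and the argument names are assumptions made for this example and are not taken from the paper; the residuals are assumed to be the zero-mean output of the ARIMA stage.

```python
import math

def garch11_t_neg_loglik(residuals, omega, alpha, beta, nu):
    """Negative log-likelihood of a GARCH(1,1) model with standardized
    Student-t innovations (unit variance, nu > 2 degrees of freedom).
    All names here are hypothetical and chosen for illustration."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1 or nu <= 2:
        return math.inf  # outside the admissible parameter region
    # Start the recursion from the unconditional variance.
    h = omega / (1.0 - alpha - beta)
    # Log of the normalizing constant of the standardized t density.
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(math.pi * (nu - 2)))
    nll = 0.0
    for eps in residuals:
        z2 = eps * eps / h
        log_density = (log_c - 0.5 * math.log(h)
                       - 0.5 * (nu + 1) * math.log1p(z2 / (nu - 2)))
        nll -= log_density
        # Conditional variance used for the next residual.
        h = omega + alpha * eps * eps + beta * h
    return nll
```

Passing this function to any numerical minimizer over `(omega, alpha, beta, nu)` would give a maximum likelihood fit under these assumptions; returning infinity outside the admissible region keeps the search inside the stationarity constraints.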
An R-vine Copula model can be defined as follows [
1) The R-vine Copula is a nested set of layers of tree structures. Each tree is composed of the node set and the edge set . The nodes and edges are the input data and pair Copulas, respectively.
2) For the first tree , the node set is the calculation results of the marginal distribution models. For other trees, , i.e., each edge in tree corresponds to a node in tree .
3) A constraint is that two edges in only share one node in .
Except for the first tree, the nodes and edges are all composed of the conditioning and conditioned sets. For example, suppose that an edge e is defined as , , then and are the conditioned sets, is the conditioning set. Further, suppose , the edges (in tree ) corresponding to node a and node b are , and , , respectively, then the correlation between the edges and the nodes can be expressed as [
(7) |
(8) |
Based on (7) and (8), the pair Copula corresponding to edge e can be written as . The input data for the pair Copula are two conditional cumulative probability sequences, which can be written as and . Then the joint-PDF of an M-dimensional data set can be expressed as [
(9) |
where ; is the joint-PDF of ; is the marginal distribution model of ; and is the pair Copula in the tree.
For convenience, the parentheses in the pair Copulas are omitted in the rest of this paper.
The pair Copulas are the basic units of the R-vine Copula model. Their function is to fit the joint-PDF of bivariate data sequences. In this paper, the TMC model is established as the pair Copulas, which can be expressed as:
(10) |
where u and v are the input data of the pair Copula model; is the TMC model; is the basic bivariate Copula model selected to compose ; n is the number of basic bivariate Copula models; is the weight of ; and and are the parameter sets of and , respectively.
In this paper, the commonly-used t Copula model, Clayton Copula model, and Gumbel Copula model are selected as . The expressions are as follows [
(11) |
(12) |
(13) |
where , , and are the t Copula, Clayton Copula, and Gumbel Copula models, respectively; , , , and are the model parameters; and is the inverse function of the t Copula model.
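To make the mixture in (10) concrete, the sketch below evaluates the standard Clayton and Gumbel Copula distribution functions and a weighted combination of the two. It is a simplified illustration rather than the paper's implementation: the t Copula component is omitted because its distribution function has no closed form, and the function names and argument layout are assumptions. Inputs are assumed to lie strictly inside (0, 1).

```python
import math

def clayton_cdf(u, v, theta):
    """Clayton Copula C(u, v) for theta > 0 (lower-tail dependence)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def gumbel_cdf(u, v, theta):
    """Gumbel Copula C(u, v) for theta >= 1 (upper-tail dependence)."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def mixed_cdf(u, v, weights, thetas):
    """Convex combination of the two Copulas; the weights must sum to 1."""
    w_clayton, w_gumbel = weights
    th_clayton, th_gumbel = thetas
    return (w_clayton * clayton_cdf(u, v, th_clayton)
            + w_gumbel * gumbel_cdf(u, v, th_gumbel))
```

Because a convex combination of Copulas is itself a Copula, the mixture inherits lower-tail dependence from the Clayton term and upper-tail dependence from the Gumbel term, which is the motivation for mixing basic models.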
To track the time-varying process of the correlation between the wind farms, the parameters of the above 3 models need to be calculated dynamically. For this purpose, the DCC model and the Patton model are introduced in this paper.
For the t Copula model, the DCC model is used to calculate the parameters [
(14) |
where and are the parameters of the DCC model; is the covariance coefficient of the input data; is the input data at time , i.e., ; and the symbol ' denotes the transposition.
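The recursion behind a DCC-type model can be sketched as follows for two series. This assumes the common DCC(1,1) form, in which the quasi-covariance matrix is updated as Q_t = (1 - a - b) Q̄ + a z_{t-1} z'_{t-1} + b Q_{t-1} and then rescaled to a correlation; the function name, the use of the sample second moments as Q̄, and the start value are assumptions for this example.

```python
import math

def dcc_correlations(z1, z2, a, b):
    """Time-varying correlation of two standardized series under an
    assumed DCC(1,1) recursion with parameters a >= 0, b >= 0, a + b < 1."""
    n = len(z1)
    # Unconditional second moments (Q-bar), used as intercept and start value.
    qb11 = sum(x * x for x in z1) / n
    qb22 = sum(y * y for y in z2) / n
    qb12 = sum(x * y for x, y in zip(z1, z2)) / n
    q11, q22, q12 = qb11, qb22, qb12
    rhos = []
    for x, y in zip(z1, z2):
        # Correlation at time t uses information up to time t - 1.
        rhos.append(q12 / math.sqrt(q11 * q22))
        q11 = (1 - a - b) * qb11 + a * x * x + b * q11
        q22 = (1 - a - b) * qb22 + a * y * y + b * q22
        q12 = (1 - a - b) * qb12 + a * x * y + b * q12
    return rhos
```

The resulting sequence would serve as the time-varying correlation parameter of the t Copula under these assumptions.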
For the Clayton Copula model and the Gumbel Copula model, the Patton model is used to calculate the parameters [
(15) |
where , , and are the parameters of the Patton model; is the length of the historical data sequence used to fit the correlation coefficient at time t, which is usually set to be 10; and is the logistic function which keeps the calculation results of the Patton model within the required range.
Moreover, the calculation results of the Patton model need to be further converted into the parameters of the Copula models. The calculation method is [
(16) |
(17) |
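The sketch below illustrates a Patton-type recursion and a conversion to Copula parameters. Two assumptions are made, since the original expressions are not recoverable here: the forcing term is taken to be the mean absolute difference of the most recent pairs of inputs, as in Patton's time-varying specification, and the conversion uses the standard Kendall rank correlation relations θ = 2τ/(1 - τ) for the Clayton Copula and θ = 1/(1 - τ) for the Gumbel Copula. All function names and the start value are hypothetical.

```python
import math

def logistic(x):
    """Logistic map that keeps the result inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def patton_tau(u, v, omega, beta, alpha, m=10):
    """Time-varying dependence measure driven by the mean absolute
    difference of (at most) the last m observation pairs."""
    taus = [logistic(omega)]  # assumed start value
    for t in range(1, len(u)):
        start = max(0, t - m)
        forcing = sum(abs(u[j] - v[j]) for j in range(start, t)) / (t - start)
        taus.append(logistic(omega + beta * taus[-1] + alpha * forcing))
    return taus

def clayton_theta(tau):
    """Clayton parameter from Kendall's tau (tau = theta / (theta + 2))."""
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta(tau):
    """Gumbel parameter from Kendall's tau (tau = 1 - 1 / theta)."""
    return 1.0 / (1.0 - tau)
```

For instance, under these relations a Kendall correlation coefficient of 0.761 would correspond to a Clayton parameter of roughly 6.37 and a Gumbel parameter of roughly 4.18.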
Having adopted the DCC and Patton models, the parameter set in (10) can be expressed as:
(18) |
where , , and are the parameter sets of the t Copula, Clayton Copula, and Gumbel Copula, respectively; are the parameters of the t Copula; are the parameters of the Clayton Copula; and are the parameters of the Gumbel Copula. The parameters of the TMC model can then be estimated by optimizing the log-likelihood function as:
(19) |
where , and , , and are the weights of the t Copula, Clayton Copula, and Gumbel Copula, respectively; and is the estimation result.
In this subsection, the structure generation method of the TRVMC model is proposed based on the MST algorithm in [
Step 1: establish the TMC model as the pair Copula model.
Step 2: take the calculation results of the marginal distribution models as the input data of the first tree. Fit the pair Copulas of each two input data sequences with the TMC model.
Step 3: evaluate the accuracy of the pair Copulas in Step 2 with the quantitative index. In this paper, the Akaike information criterion (AIC) is adopted [
$$\mathrm{AIC} = 2k - 2\ln L \tag{20}$$
where $\mathrm{AIC}$ is the AIC index; $k$ is the number of model parameters; and $L$ is the value of the maximum likelihood function.
Step 4: generate the structure of each tree. This process is an optimization problem, and the objective is to minimize the sum of the AIC values of all pair Copulas in the tree. For the first tree, the Prim algorithm in [
Step 5: when the structure of the tree is generated, calculate the input data of the tree as [
(21) |
Step 6: except for the first tree, the structure of each tree is constrained by that of the previous tree, which limits the number of possible structures. When the structure of the tree is generated, list all possible structures of the tree. The validity of the structures can be judged as [
(22) |
where and are the edges in the tree; is the set of all edges in the tree; and # denotes the cardinality of the set.
Step 7: for each possible structure in Step 6, fit the pair Copulas with the TMC model.
Step 8: repeat Step 4 to Step 7 until the structures of all trees are generated, which together make up the structure of the R-vine Copula model.
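The tree selection in Step 4 can be illustrated for the first tree with a plain implementation of the Prim algorithm, where the edge weight between two nodes is the AIC of the pair Copula fitted to them. This is only a sketch: the function name and the matrix layout are assumptions, and the proximity constraint that applies to the higher trees is not handled here.

```python
def prim_tree(weights):
    """Minimum spanning tree over a symmetric weight matrix (Prim algorithm).
    weights[i][j] is, for example, the AIC of the pair Copula fitted to
    nodes i and j. Returns the selected edges as (i, j) tuples."""
    n = len(weights)
    in_tree = {0}  # grow the tree from an arbitrary start node
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                if best is None or weights[i][j] < weights[best[0]][best[1]]:
                    best = (i, j)
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Hypothetical AIC values for 4 wind farms (smaller means a better fit).
aic_matrix = [
    [0, -50, -10, -5],
    [-50, 0, -40, -8],
    [-10, -40, 0, -30],
    [-5, -8, -30, 0],
]
first_tree = prim_tree(aic_matrix)
```

With these illustrative weights the selected tree is the chain 0-1-2-3, whose AIC sum is the smallest among all spanning trees.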
The TRVMC model mainly aims at the wind farms located in the same region. More specifically, the model is more suitable for wind farms with a strong correlation. The reasons are as follows. The main function of the TRVMC model is to establish the joint-probability distribution model of the output of multiple wind farms. By making the probability distribution of the TRVMC model capture the characteristics of the statistical joint-frequency distribution of the wind power output sequences, the model can reflect the correlation of multiple wind farms.
Take the power output of 2 wind farms in the same region for example. The Kendall correlation coefficient is 0.761. The joint-frequency distribution of the power output sequences is shown in

Fig. 2 Joint-frequency distribution and probability distribution of power output of wind farms. (a) Joint-frequency distribution. (b) Probability distribution.
However, there is no clear boundary between strong correlation and weak correlation. Since the model is data driven, the data of any wind farm can be selected as the input data. If the wind farms are far away from each other and the power output sequences are independent, according to the Bayesian formula, the joint-PDF can be expressed as [
$$f(x_1, x_2, \ldots, x_M) = \prod_{i=1}^{M} f_i(x_i) \tag{23}$$
In (23), the joint-PDF is equal to the multiplication of each independent PDF. In other words, the models considering the correlation are the same as those ignoring the correlation. Therefore, the calculation results of the models will become the same. Due to the above reasons, the TRVMC model is mainly applicable to regional wind farms.
Suppose that scenarios are generated for wind farms, and each scenario contains sampling points. Having fitted parameters of the ARIMA-GARCH-t model and the TRVMC model, the forecasting scenarios can be generated through the following steps.
Step 1: generate an random matrix Rnd, in which all elements obey the uniform distribution U(0,1).
Step 2: decompose the joint-PDF and reorder the input data sequences to according to the generated R-vine structure, as shown in (9).
Step 3: assign the values in Rnd to the conditional cumulative probability values as:
(24) |
where is the scenario; and is the sampling point.
Step 4: calculate the cumulative probability of the power output of each wind farm at the sampling point.
Supposing two variables and have been given by (24) or calculated by (25) in the previous iteration, the two variables are connected by a pair Copula model as:
(25) |
For the pair Copula , and are the corresponding input data. can be calculated by the interpolation methods [
Step 5: repeat Step 4 until the cumulative probability of the power output of all wind farms at the sampling point has been calculated, namely .
Step 6: taking as the input data, calculate the power output of each wind farm through the inverse function of the marginal distribution model:
(26) |
where is the power output of the wind farm at the sampling point; and is the inverse function of the marginal distribution model.
Step 7: based on the ARIMA-GARCH-t model, calculate the CDF of each wind power output sequence at the next sampling point.
Step 8: repeat Step 3 to Step 7 until all sampling points in the scenario are generated.
Step 9: repeat Step 3 to Step 8 until all scenarios are generated.
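The core of Step 4, i.e., turning a uniform random number into a cumulative probability through the inverse of a conditional distribution, can be sketched for a single pair Copula. The example assumes a static Clayton Copula because its conditional CDF can be inverted in closed form; for the TMC model an interpolation-based inverse would be needed instead. The function names are hypothetical.

```python
import random

def clayton_h(v, u, theta):
    """Conditional CDF F(v | u) of the Clayton Copula (theta > 0)."""
    a = u ** -theta + v ** -theta - 1.0
    return u ** (-theta - 1.0) * a ** (-1.0 / theta - 1.0)

def clayton_h_inv(w, u, theta):
    """Inverse of clayton_h in v: returns v such that F(v | u) = w."""
    scaled = (w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0
    return scaled ** (-1.0 / theta)

def sample_pair(theta, rng):
    """Draw one dependent pair (u, v) of cumulative probabilities."""
    # 1 - random() lies in (0, 1], which avoids raising 0 to a negative power.
    u = 1.0 - rng.random()
    w = 1.0 - rng.random()
    return u, clayton_h_inv(w, u, theta)

rng = random.Random(0)
pairs = [sample_pair(2.0, rng) for _ in range(5)]
```

In a vine structure the same inversion is applied recursively along the trees, after which each cumulative probability is mapped back to power output through the inverse marginal distribution as in Step 6.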
The basic principle of the models in this paper is Sklar's theorem. According to the theorem, the joint-PDF of high-dimensional data can be calculated by (9), and the corresponding joint-CDF is [
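$$F(x_1, x_2, \ldots, x_M) = C(u_1, u_2, \ldots, u_M; \theta) \tag{27}$$
where $x_i$ ($i = 1, 2, \ldots, M$) is the wind power output sequence; $u_i$ is the cumulative probability of $x_i$; $F$ is the joint-CDF; $C$ is the Copula model; and $\theta$ is the parameter set of $C$.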
According to (27), the joint-CDF can be calculated by combining the marginal distribution models with the Copula model. To this end, the ARIMA-GARCH-t model and the TRVMC model are established in this paper.
The modeling process is shown in part 1 of

Fig. 3 Overall process of modeling and scenario generation.
The scenario generation process is shown in part 2 of
The function of the marginal distribution model is to calculate the independent probability distribution of each wind power sequence. The model can be evaluated by the following steps.
Step 1: calculate the model parameters based on the historical output data of a wind farm.
Step 2: taking the wind power output of a period in the future as the test data, fit the probability distribution intervals of the test data under the preset confidence levels with the marginal distribution model.
Step 3: compare the fitted probability distribution intervals with the statistical characteristics of the test data.
According to [
The reliability index reflects the deviation between the probability distribution fitted by the model and the frequency distribution of the actual data sequence [
(28) |
where is the reliability index; is the number of data points in the interval; is the length of the test data sequence; and is the preset confidence level.
The sharpness index reflects the redundancy of the probability distribution intervals [
(29) |
where is the sharpness index; and and are the upper and lower boundaries of the interval of the data point, respectively.
The marginal distribution model with smaller reliability and sharpness values has better effectiveness.
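A minimal sketch of how the two indexes could be computed is given below. It assumes that the reliability index is the difference between the empirical coverage of the intervals and the preset confidence level, and that the sharpness index is the average interval width; the function names and the sample values are hypothetical.

```python
def reliability(actual, lower, upper, confidence):
    """Deviation between the empirical coverage and the confidence level."""
    hits = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return hits / len(actual) - confidence

def sharpness(lower, upper):
    """Average width of the probability distribution intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

# Hypothetical normalized power values and interval boundaries.
actual = [0.2, 0.5, 0.9, 0.4]
lower = [0.1, 0.3, 0.5, 0.45]
upper = [0.3, 0.6, 0.8, 0.7]
rel = reliability(actual, lower, upper, 0.75)  # 2 of 4 points are covered
shp = sharpness(lower, upper)
```

Under this reading, the reliability is judged by its absolute value: a value near zero means that the intervals cover the test data about as often as the confidence level prescribes.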
The AIC and the Bayesian information criterion (BIC) are commonly used to evaluate the accuracy of the Copula models. The expression of the AIC index has been given in (20). The BIC index can be expressed as [
$$\mathrm{BIC} = k\ln n - 2\ln L \tag{30}$$
where $\mathrm{BIC}$ is the BIC index; $k$ is the number of model parameters; $n$ is the number of sample points; and $L$ is the value of the maximum likelihood function.
Both the AIC and the BIC are based on the maximum likelihood function. The AIC introduces a penalty factor for the complexity of the model, and the BIC further takes into account the influence of the sample size. The Copula model with smaller AIC and BIC values has better effectiveness.
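Both criteria are simple to compute once the maximized log-likelihood is available. The sketch below uses the standard definitions; the function and argument names are chosen for this example, and the log-likelihood (rather than the likelihood itself) is passed in to avoid numerical underflow.

```python
import math

def aic(num_params, log_likelihood):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * num_params - 2 * log_likelihood

def bic(num_params, num_samples, log_likelihood):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return num_params * math.log(num_samples) - 2 * log_likelihood
```

For a fixed log-likelihood, the BIC penalty per parameter exceeds that of the AIC as soon as the sample contains more than about 7 points (ln n > 2), so the BIC favors more parsimonious Copula models on long sequences.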
In this paper, the generated scenarios are evaluated from the following 3 aspects.
The ES index evaluates the difference between the actual wind power sequence and the generated scenarios [
(31) |
where is the ES index; is the number of generated scenarios; is the actual wind power sequence; and is the
In this paper, the time-series characteristics are evaluated from 2 aspects: the fluctuation characteristics and the autocorrelation function (ACF) [
The fluctuation is defined as the first-order difference sequence of wind power output, as shown in (32). The quantile-quantile (Q-Q) diagram and the cumulative probability curve are introduced to compare the fluctuation characteristics of the generated scenarios and the actual wind power sequence.
$$\Delta x_t = \nabla x_t = x_t - x_{t-1} \tag{32}$$
where $x_t$ and $\Delta x_t$ are the wind power output sequence and the corresponding fluctuation sequence, respectively; and $\nabla$ denotes the first-order difference computation.
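The ACF index is the correlation between the wind power sequences $x_t$ and $x_{t+k}$. It intuitively reflects the time-series characteristics of wind power output. The expression is [
$$R_{\mathrm{ACF}}(k) = \frac{E\left[(x_t - \mu_t)(x_{t+k} - \mu_{t+k})\right]}{\sigma_t \sigma_{t+k}} \tag{33}$$
where $R_{\mathrm{ACF}}(k)$ is the ACF index; $\mu_t$ and $\mu_{t+k}$ are the mean values of $x_t$ and $x_{t+k}$, respectively; $\sigma_t$ and $\sigma_{t+k}$ are the standard deviations; and $k$ is the delay time. When the delay time $k = 0$, $R_{\mathrm{ACF}}(0) = 1$.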
The scenarios with ACF values closer to the actual wind power sequence have better effectiveness.
In [
$$R_{\mathrm{CCF}}(k) = \frac{E\left[(x_t - \mu_x)(y_{t+k} - \mu_y)\right]}{\sigma_x \sigma_y} \tag{34}$$
where $R_{\mathrm{CCF}}(k)$ is the CCF index; $x_t$ and $y_t$ are the output sequences of 2 wind farms; $\mu_x$ and $\mu_y$ are the mean values of $x_t$ and $y_t$, respectively; and $\sigma_x$ and $\sigma_y$ are the standard deviations of $x_t$ and $y_t$, respectively.
When the delay time $k = 0$, the CCF is the commonly-used Pearson correlation coefficient, which reflects the overall correlation of the 2 wind farms. With the change of delay time, the CCF reflects the correlation with time-series characteristics.
The scenarios with CCF values closer to the actual power sequence have better effectiveness.
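The fluctuation sequence, the ACF, and the CCF used in this evaluation can be computed with a few lines of code. The sketch below treats the ACF and CCF at delay k as the Pearson correlation between a sequence and a copy shifted by k points, with means and standard deviations taken over the overlapping segments; the function names are assumptions for this example.

```python
import math

def first_difference(x):
    """Fluctuation sequence: first-order difference of the power series."""
    return [b - a for a, b in zip(x, x[1:])]

def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def ccf(x, y, k):
    """Correlation between x_t and y_{t+k} for a delay k >= 0."""
    return pearson(x[:len(x) - k], y[k:])

def acf(x, k):
    """Correlation between x_t and x_{t+k}, i.e., the CCF of x with itself."""
    return ccf(x, x, k)
```

Evaluating `acf` and `ccf` over a range of delays for every generated scenario, and comparing the curves with those of the measured sequences, gives the kind of comparison described above.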
In this subsection, the historical data from 8 wind farms in Northwest China is used for analysis. The sampling time is 3 months (from January
For the power output of the 8 wind farms, the average Kendall correlation coefficient of each wind farm to other wind farms is shown in
To better verify the superiority of the proposed model, the simulation work is divided into three parts. In Sections VI-B and VI-C, the comparisons between different marginal distribution models and different Copula models are conducted. In Section VI-D, the effectiveness of the generated scenarios is evaluated from 3 aspects.
In addition, to verify the efficacy of the proposed method, based on the data from the 2012 Global Energy Forecasting Competition (GEFCom 2012) [
Four marginal distribution models are selected for comparison: ① the ARIMA-GARCH-t model; ② the Student-t model [
The historical output data of 1 wind farm is taken to carry out the simulation. The data of the first 85 days are used to fit the model parameters, while the data of the last 5 days are used as test data. The reliability and the sharpness indexes are used for evaluation.

Fig. 4 Comparison of reliability index of different marginal distribution models at different confidence levels. (a) Comparison based on data of the third day. (b) Comparison based on data of all 5 days.

Fig. 5 Comparison of sharpness index of different marginal distribution models. (a) Comparison under 75% confidence level. (b) Comparison under 95% confidence level.
As shown in Figs.
In theory, the Student-t, KDE, and ECDF models all simulate the long-term frequency distribution of historical samples and take it as the PDF of the test data. However, due to the time-varying characteristics, the short-term probability distribution of wind power might be significantly different from the long-term probability distribution, which may lead to reliability defects in these 3 models.
Compared with the other 3 models, the ARIMA-GARCH-t model has two advantages. First, it can track the time-varying process of the probability distribution of wind power. On this basis, it can provide an accurate PDF for wind power at each moment, as shown in
In this part, 4 high-dimensional Copula models are selected for comparison: ① the TRVMC model in this paper; ② the static C-vine Copula model [
Taking the AIC and the BIC as the evaluation indexes, the simulation results are as follows.
As shown in
Through theoretical analysis, the TRVMC model is superior to the other 3 Copula models from 3 aspects.
1) The model does not need to make assumptions on the correlation between multiple wind farms. The flexible dependence structure enables the TRVMC model to capture the spatial-temporal correlation of multiple wind farms effectively.
2) The TMC model is used as the pair Copulas, which has higher accuracy than the classic bivariate Copula models in describing the complex joint distribution of every 2 wind farms.
3) The dynamic calculation method of the model parameters is introduced into the TRVMC model.
Therefore, the model can track the time-varying process of the correlation between the wind farms. The above factors greatly enhance the applicability of the TRVMC to wind farms in different regions and different periods.
In this part, 4 scenario generation models are selected for comparison: ① model 1, the proposed model in this paper; ② model 2, the independent ARIMA-GARCH-t model; ③ model 3, the static D-vine Copula model [
The typical characteristics of the 4 models are compared as in
Take the wind power data on March 2
The evaluation work consists of the following 3 parts.
In

Fig. 6 Generated scenarios of joint-output of 8 wind farms. (a) Model 1. (b) Model 2. (c) Model 3. (d) Model 4.

Fig. 7 SE index comparison of each wind farm.
As shown in
Through theoretical analysis, model 2 is the marginal distribution of model 1. Therefore, the SE values of the two models are similar when generating scenarios for a single wind farm. For the same reason, the SE values of model 3 and model 4 are also similar. The scenarios of 2 random wind farms are provided in Supplementary Material Part B.
When generating scenarios for the joint-output of 8 wind farms, the fitting results of the 4 models are quite different. The comparison can be divided into 2 aspects.
On one hand, compared with model 1 and model 2, model 3 and model 4 ignore the time-series characteristics, i.e., the auto-correlation of wind power in the temporal dimension. As a result, the scenarios generated by model 3 and model 4 fluctuate more frequently and sharply. Theoretically, over a short period of time, the wind power output at the next moment is correlated with that in the previous period. For example, when the current wind power is large, the wind power is unlikely to be quite small at the next moment. If the temporal-dimensional correlation of wind power output is ignored, the adjacent data points in the generated scenarios are more independent. When generating the output scenario at the next moment, the trend of the previous sequence is ignored, and the randomness of the calculation result is stronger. As a result, the generated scenarios fluctuate more frequently with greater amplitude. In some cases, the fluctuation range might be much larger than the actual wind power output sequence, as shown in
On the other hand, compared with model 1 and model 3, model 2 and model 4 ignore the spatial-temporal correlation between multiple wind farms. As a result, the scenarios of different wind farms are more independent. After superposition calculation, the fluctuation range of the joint-output scenarios is largely reduced. Theoretically, since the wind farms are located in the same region, the environmental factors are similar. Therefore, the output of the wind farms is correlated. And the changing process of the output sequences tends to follow similar trends. If the spatial-temporal correlation is ignored, in the generated scenarios, when the output of one wind farm is large, the output of the other wind farms may be small. In the superposition calculation, the peak output of one wind farm may be added with the valley output of the other wind farms. As a result, the fluctuation range of the joint-output scenarios is largely reduced. In some cases, the scenarios might not be able to envelop the actual joint-output sequence, as shown in
In this part, the scenarios of the joint-output of 8 wind farms are used for simulation.
The fluctuation characteristics of the scenarios are evaluated by the Q-Q diagram shown in

Fig. 8 Comparison of probability distribution of fluctuations. (a) Q-Q diagram of fluctuations. (b) Cumulative probability curve of fluctuations.

Fig. 9 ACF comparison of scenarios generated by different models. (a) Model 1. (b) Model 2. (c) Model 3. (d) Model 4.
As shown in Figs.
Through theoretical analysis, since model 3 and model 4 ignore the time-series characteristics, the adjacent data points in the generated scenarios are relatively independent. More specifically, when the wind power is large at this moment, the wind power might be rather small at the next moment. As a result, the overall fluctuation range of the scenarios largely exceeds that of the actual joint-output sequence, which is directly reflected in the generated scenario, as shown in
Compared with model 1, model 2 ignores the correlation of multiple wind farms. Therefore, the fluctuation range of the joint-output scenarios generated by model 2 is greatly reduced after the superposition calculation. As the direct performances, in
In this part, the generated scenarios of 2 wind farms are used for simulation. The effectiveness of the scenarios is evaluated by the CCF index. The simulation results are shown in

Fig. 10 CCF comparison of output scenarios of 2 wind farms. (a) Model 1. (b) Model 2. (c) Model 3. (d) Model 4.
As shown in
Through theoretical analysis, since model 2 and model 4 ignore the correlation between the wind farms, the CCF values are almost always smaller than the actual values. Besides, the distribution of the CCF curves is relatively dispersed, which indicates that the correlation between the wind farms in different scenarios is quite different. This is not consistent with the actual situation.
Although model 1 and model 3 both fit the correlation of the multiple wind farms by the Copula model, model 1 further considers the time-series characteristics of wind power output and the time-varying characteristics of the correlations. As a result, the CCF values of all scenarios generated by model 1 are close to the actual values. On the contrary, model 3 only considers the overall correlation of the wind farms, more specifically, the Kendall correlation coefficient. Consequently, the CCF values of the scenarios generated by model 3 are close to the actual values only when the delay time is 0 and still smaller than the actual value.
In this paper, a scenario generation method for the output of multiple wind farms considering the time-series characteristics and spatial-temporal correlation is proposed. The main conclusions are as follows.
1) The ARIMA-GARCH-t model can accurately fit the marginal distribution of wind power output, i.e., the independent CDF. For 1-day wind power output data, the reliability index value is within 10%, and the sharpness index value is within 0.1. Therefore, it can provide reliable input data for the Copula model.
2) Compared with the Copula models in the existing research, the TRVMC model fits the joint distribution of the output of multiple wind farms more accurately, as indicated by its smaller AIC and BIC values.
3) The ARIMA-GARCH-t model and the TRVMC model are combined to generate the output scenarios of multiple wind farms. The generated scenarios have time-series characteristics and spatial-temporal correlation similar to those of the actual wind power sequences. Specifically, the scenarios have good SE index performance, and their fluctuation characteristics, ACF, and CCF are similar to those of the actual wind power sequences.
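As a reminder of how the AIC and BIC comparison in conclusion 2) trades goodness of fit against model complexity, the following minimal sketch uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The log-likelihood values, parameter counts, and sample size in the example are hypothetical and are not results from this paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (smaller is better)."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return n_params * math.log(n_obs) - 2 * log_likelihood


if __name__ == "__main__":
    # Hypothetical fitted models: (name, maximized log-likelihood, parameter count).
    candidates = [("copula_A", -512.4, 6), ("copula_B", -498.7, 14)]
    n_obs = 960  # hypothetical number of observations
    for name, ll, k in candidates:
        print(name, round(aic(ll, k), 2), round(bic(ll, k, n_obs), 2))
```

Because the BIC penalty grows with ln n, it penalizes additional parameters more heavily than the AIC once n exceeds about 7 observations, so the two criteria can rank candidate models differently.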
Moreover, the proposed scenario generation method can be further applied to decision-making problems such as dispatch planning and the optimization of trading strategies. Further studies are planned and will be reported.
References
A. Kavousi-Fard, A. Khosravi, and S. Nahavandi, “A new fuzzy-based combined prediction interval for wind power forecasting,” IEEE Transactions on Power Systems, vol. 31, no. 1, pp. 18-26, Jan. 2016.
A. Shukla and S. N. Singh, “Clustering based unit commitment with wind power uncertainty,” Energy Conversion and Management, vol. 111, pp. 89-102, Mar. 2016.
P. Pinson, N. Siebert, and G. Kariniotakis, “Forecasting of regional wind generation by a dynamic fuzzy-neural networks based upscaling approach,” in Proceedings of European Wind Energy Conference, Madrid, Spain, Jun. 2003, pp. 16-19.
M. G. Lobo and I. Sanchez, “Regional wind power forecasting based on smoothing techniques, with application to the Spanish peninsular system,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 1990-1997, Nov. 2012.
X. Peng, L. Xiong, J. Wen et al., “A summary of the state of the art for short-term and ultra-short-term wind power prediction of regions,” Proceedings of the CSEE, vol. 36, no. 23, pp. 6596-6596, Dec. 2016.
P. Meibom, R. Barth, B. Hasche et al., “Stochastic optimization model to study the operational impacts of high wind penetrations in Ireland,” IEEE Transactions on Power Systems, vol. 26, no. 3, pp. 1367-1379, Aug. 2011.
H. Bludszuweit, “Statistical analysis of wind power forecast error,” IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 983-991, Aug. 2008.
X. Ma, Y. Sun, and H. Fang, “Scenario generation of wind power based on statistical uncertainty and variability,” IEEE Transactions on Sustainable Energy, vol. 4, no. 4, pp. 894-904, Oct. 2013.
J. B. Bremnes, “A comparison of a few statistical models for making quantile wind power forecasts,” Wind Energy, vol. 9, no. 1, pp. 3-11, Apr. 2006.
P. Pinson and G. Kariniotakis, “Conditional prediction intervals of wind power generation,” IEEE Transactions on Power Systems, vol. 25, no. 4, pp. 1845-1856, Nov. 2010.
R. J. Bessa, V. Miranda, A. Botterud et al., “Time adaptive conditional kernel density estimation for wind power forecasting,” IEEE Transactions on Sustainable Energy, vol. 3, no. 4, pp. 660-669, Oct. 2012.
Y. Zhang, J. Wang, and X. Luo, “Probabilistic wind power forecasting based on logarithmic transformation and boundary kernel,” Energy Conversion and Management, vol. 96, pp. 440-451, May 2015.
G. Papaefthymiou and B. Klockl, “MCMC for wind power simulation,” IEEE Transactions on Energy Conversion, vol. 23, no. 1, pp. 234-240, Apr. 2008.
A. Tuohy, P. Meibom, E. Denny et al., “Unit commitment for systems with significant wind penetration,” IEEE Transactions on Power Systems, vol. 24, no. 2, pp. 592-601, May 2009.
J. M. Morales, R. Minguez, and A. J. Conejo, “A methodology to generate statistically dependent wind speed scenarios,” Applied Energy, vol. 87, no. 3, pp. 843-855, Mar. 2010.
D. D. Le, G. Gross, and A. Berizzi, “Probabilistic modeling of multisite wind farm production for scenario-based applications,” IEEE Transactions on Sustainable Energy, vol. 6, no. 3, pp. 748-758, Jul. 2015.
P. Pierre, M. Henrik, N. H. Aa et al., “From probabilistic forecasts to statistical scenarios of short-term wind power production,” Wind Energy, vol. 12, no. 1, pp. 51-62, Jan. 2010.
M. Yang, Y. Lin, S. Zhu et al., “Multi-dimensional scenario forecast for generation of multiple wind farms,” Journal of Modern Power Systems and Clean Energy, vol. 3, no. 3, pp. 361-370, May 2015.
W. Wu, K. Wang, B. Han et al., “A versatile probability model of photovoltaic generation using pair copula construction,” IEEE Transactions on Sustainable Energy, vol. 6, no. 4, pp. 1337-1345, Oct. 2015.
H. V. Haghi and S. Lotfifard, “Spatiotemporal modeling of wind generation for optimal energy storage sizing,” IEEE Transactions on Sustainable Energy, vol. 6, no. 1, pp. 113-121, Jan. 2015.
L. Kamal and Y. Z. Jafri, “Time series models to simulate and forecast hourly averaged wind speed in Quetta, Pakistan,” Solar Energy, vol. 61, no. 1, pp. 23-32, Jul. 1997.
C. Park, Y. Sun, K. T. Yoon et al., “Dickey-Fuller test for an extended MA model,” Quantitative Bio-Science, vol. 38, no. 1, pp. 1-21, May 2019.
L. Li, S. Miao, Q. Tu et al., “Dynamic dependence modeling of wind power uncertainty considering heteroscedastic effect,” International Journal of Electrical Power and Energy Systems, vol. 116, pp. 105556-105558, Mar. 2020.
H. Liu, S. Jing, and X. Qu, “Empirical investigation on using wind speed volatility to estimate the operation probability and power output of wind turbines,” Energy Conversion and Management, vol. 67, pp. 8-17, Mar. 2013.
J. Dissmann, E. C. Brechmann, C. Czado et al., “Selecting and estimating regular vine copulae and application to financial returns,” Computational Statistics & Data Analysis, vol. 59, pp. 52-69, Nov. 2013.
X. Li, Copula Method and Its Application, Beijing: Economy and Management Publishing House, 2014.
R. Chou, C. Wu, and N. Liu, “Forecasting time-varying covariance with a range-based dynamic conditional correlation model,” Review of Quantitative Finance and Accounting, vol. 33, no. 4, pp. 327-345, Mar. 2009.
A. J. Patton, “Modelling asymmetric exchange rate dependence,” International Economic Review, vol. 47, no. 2, pp. 527-556, Jun. 2006.
E. C. Brechmann, C. Czado, and K. Aas, “Truncated regular vines in high dimensions with application to financial data,” Canadian Journal of Statistics, vol. 40, no. 1, pp. 68-85, Jan. 2012.
H. Akaike, Information Theory and an Extension of the Maximum Likelihood Principle, New York: Springer, 1998.
Y. Zhang, “The development of Bayesian theory and its applications in business and bioinformatics,” in Proceedings of IOP Conference, Beijing, China, Dec. 2017, pp. 28-31.
Z. Wang, W. Wang, C. Liu et al., “Probabilistic forecast for multiple wind farms based on regular vine copulas,” IEEE Transactions on Power Systems, vol. 33, no. 1, pp. 578-589, Apr. 2018.
M. Bogdan, “Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci,” Genetics, vol. 167, no. 2, pp. 989-999, Jun. 2004.
P. Pinson and R. Girard, “Evaluating the quality of scenarios of short-term wind power generation,” Applied Energy, vol. 96, pp. 12-20, Aug. 2012.
D. Li, W. Yan, W. Li et al., “A two-tier wind power time series model considering day-to-day weather transition and intraday wind power fluctuations,” IEEE Transactions on Power Systems, vol. 31, no. 6, pp. 1-10, Dec. 2016.
Z. Wang, W. Wang, C. Liu et al., “Forecasted scenarios of regional wind farms based on regular vine copulas,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 1, pp. 77-85, Jan. 2020.
IEEE Power and Energy Society and IEEE Working Group on Energy Forecasting. (2021, Mar.). Global energy forecasting competition 2012-wind forecasting. [Online]. Available: https://www.kaggle.com/c/GEF2012-wind-forecasting/overview