Abstract
High-precision day-ahead short-term photovoltaic (PV) output forecasting is essential for integrating PV into smart distribution networks and multi-energy systems, and provides the foundation for the secure, stable, and economic operation of PV systems. This paper proposes a hybrid model based on principal component analysis, grey wolf optimization, and generalized regression neural network (PCA-GWO-GRNN) for day-ahead short-term PV output forecasting, considering the multiple influencing factors and strong uncertainty of PV output. First, PCA is used to reduce the dimension of the meteorological features. Then, high-precision day-ahead short-term PV output forecasting is realized with the GWO-GRNN model: GRNN performs regression on the input features after dimension reduction, and its parameter is optimized by GWO, which has strong global searching ability and fast convergence. A case study on a real PV plant in Jiangsu province, China, demonstrates that the proposed PCA-GWO-GRNN model achieves high precision in day-ahead short-term PV output forecasting and validates its accuracy and applicability in real scenarios.
Photovoltaic (PV) power generation technology is becoming an essential component of the smart grid as PV plants are being connected to the smart distribution network (SDN) on a large scale. The effective way to make full use of a PV generation system is to keep the total output power of the multi-energy system relatively stable, reduce power fluctuation, improve power quality, and reduce the impact on the power grid. However, the output of a PV plant is strongly random and intermittent because it is significantly affected by a number of factors such as solar irradiance, temperature, and humidity. Therefore, large-scale PV integration in the SDN has a strong impact on the safety and stability of power system operation. In summary, to reduce the impact of large-scale PV penetration and ensure the secure and economic operation of power systems, it is imperative to achieve accurate day-ahead short-term PV output forecasting with an effective model [
Existing short-term PV output forecasting approaches can be mostly categorized as physical or statistical methods. The physical method is built on solar irradiance transfer equation, PV module operation equation, and/or other physical equations. This category relies on detailed and precise geographic location information as well as the weather and solar irradiance data of the PV plants to create the model in an often-complicated process [
PV output is directly related to solar irradiance, but it is also affected by multiple complex meteorological factors, including temperature, humidity, precipitation, and others [

Fig. 1 PV output forecasting model based on PCA-GWO-GRNN.
1) To eliminate the collinearity among the meteorological input features and prevent the model from overfitting, PCA is used to reduce the dimensionality of the meteorological input features.
2) A day-ahead short-term PV output forecasting model based on GWO-GRNN is proposed. GRNN is used to fit the complex nonlinear relationship between PV output and input features, and the parameter of GRNN is optimized by GWO, which has strong global searching capacity. The case study verifies that the proposed model has high forecasting accuracy and robustness.
The rest of the paper is organized as follows. Section II adopts PCA to reduce the dimension of the multiple weather factors and extract the features. Section III illustrates the basic theories of GRNN and GWO. Section IV presents a case study using a real PV plant in Jiangsu province, China, which verifies the accuracy and robustness of the proposed forecasting model. Section V draws the conclusions.
The energy used for PV output is completely derived from solar irradiance; thus, solar irradiance directly affects the PV output. In addition, the output of PV plants is affected by many other meteorological factors such as temperature, atmospheric pressure, and humidity. However, too many meteorological input features complicate the structure of the forecasting model, increase the training burden, and slow down learning. Moreover, they reduce the sensitivity of the forecasting model to solar irradiance.
The meteorological input features for PV output forecasting often have strong correlation. In this paper, the PCA is adopted to simplify the meteorological input features into a comprehensive meteorological factor. The PCA mainly finds a small set of linear combination variables to replace the original variables, so as to achieve the purpose of effectively separating the commonality between the data vectors while retaining the original variable information [
Assume that there are n samples and each sample has p variables, so that an n × p data matrix can be created. The process of PCA includes the following six steps.
Step 1: standardize the original data into valid data.
Step 2: calculate the correlation coefficient matrix R.
Step 3: compute eigenvalues and eigenvectors. Firstly, the characteristic equation $|\lambda I - R| = 0$ is solved and the Jacobi method is used to find the eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, p$), where $I$ is the identity matrix and $\lambda$ is the eigenvalue. The eigenvalues are arranged in descending order, i.e., $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p \geq 0$. Then, the eigenvectors corresponding to the eigenvalues are found.
Step 4: calculate the principal component contribution rate and the cumulative contribution rate. The contribution rate of the $i$-th principal component is formulated as $\lambda_i / \sum_{k=1}^{p} \lambda_k$. The cumulative contribution rate is defined by $\sum_{i=1}^{m} \lambda_i / \sum_{k=1}^{p} \lambda_k$, where $m$ is the number of principal components. In general, the principal components corresponding to the eigenvalues with a cumulative contribution rate of 85%-95% are retained.
Step 5: calculate the principal component value. The value of each principal component is calculated according to (1).
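$$z_i = \sum_{j=1}^{p} a_{ij} x_j^{*}, \quad i = 1, 2, \ldots, m \tag{1}$$
where $a_{ij}$ is the element of the eigenvector matrix, i.e., the $j$-th element of the $i$-th eigenvector; $x_j^{*}$ is the $j$-th standardized variable; and $z_i$ is the value of the $i$-th principal component.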
Step 6: calculate the comprehensive meteorological factor. The comprehensive meteorological factor F can be obtained from the linear weighted sum of the above m principal components, as shown in (2).
$$F = \sum_{i=1}^{m} w_i z_i \tag{2}$$
where $w_i$ is the contribution rate of the $i$-th principal component $z_i$.
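The six steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation (the paper uses SPSS 19): the function name `comprehensive_factor` is hypothetical, `numpy.linalg.eigh` is swapped in for the Jacobi method, and the weights are assumed to be the raw contribution rates of the retained components.

```python
import numpy as np

def comprehensive_factor(X, threshold=0.85):
    """Return (F, m, weights) for an n x p data matrix X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardize each variable to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation coefficient matrix R (p x p).
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors (eigh instead of the Jacobi method),
    # sorted so that the largest eigenvalue comes first.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: contribution rate and cumulative contribution rate.
    contrib = eigvals / eigvals.sum()
    cumulative = np.cumsum(contrib)
    # Smallest m whose cumulative contribution rate reaches the threshold.
    m = min(int(np.searchsorted(cumulative, threshold)) + 1, X.shape[1])
    # Step 5: principal component values for every sample (n x m).
    scores = Z @ eigvecs[:, :m]
    # Step 6: weighted sum of the m components (assumed weights: contribution rates).
    F = scores @ contrib[:m]
    return F, m, contrib[:m]
```

The sign of each eigenvector is arbitrary, so the sign of an individual component (and hence of F) may differ between numerical libraries; only the relative variation across samples carries information.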
Recently, ANNs have entered the PV output forecasting realm, and a number of approaches are developed based on Elman neural network [
The GWO algorithm is a swarm intelligence algorithm proposed by [
GRNN is composed of the input, pattern, summation, and output layers. The corresponding input and output vectors can be denoted by $X = [x_1, x_2, \ldots, x_n]^{\mathrm{T}}$ and $Y = [y_1, y_2, \ldots, y_k]^{\mathrm{T}}$, respectively. The GRNN structure is shown in Fig. 2.

Fig. 2 GRNN structure.
The number of input layer neurons is the same as the input dimension of the training samples, and each neuron transmits the input data directly to the pattern layer.
The number of pattern layer neurons is consistent with the number of training samples, and the transfer function is the RBF:
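$$p_i = \exp\left[-\frac{(X - X_i)^{\mathrm{T}}(X - X_i)}{2\sigma^2}\right], \quad i = 1, 2, \ldots, N \tag{3}$$
where $p_i$ is the transfer function of the $i$-th pattern neuron; $X$ is the input vector; $X_i$ is the $i$-th training sample; $N$ is the number of training samples; and $\sigma$ is the spread parameter.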
The summation layer utilizes two summation ways: one is to calculate the weighted sum of the output of each neuron in the pattern layer; the other is to calculate the arithmetic sum of the outputs of the neurons in the pattern layer. The two types of formulas are shown in (4) and (5), respectively.
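$$S_{N,j} = \sum_{i=1}^{N} y_{ij} p_i \tag{4}$$
$$S_D = \sum_{i=1}^{N} p_i \tag{5}$$
where $S_{N,j}$ is the weighted sum; $S_D$ is the arithmetic sum; $j = 1, 2, \ldots, k$; $p_i$ is the output of the $i$-th pattern neuron; $N$ is the number of training samples; and $y_{ij}$ is the weight of the $i$-th pattern neuron for output $j$, i.e., the $j$-th element of the $i$-th training output.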
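The output layer adopts a linear function to output the result, and the estimation of the corresponding neuron j is:
$$y_j = \frac{S_{N,j}}{S_D}, \quad j = 1, 2, \ldots, k \tag{6}$$
where $S_{N,j}$ and $S_D$ are the weighted sum and the arithmetic sum given by (4) and (5), respectively.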
GRNN has only one parameter that needs to be determined, i.e., the spread parameter $\sigma$. If $\sigma$ is too large, the forecasted value will approximate the mean of the target values of all training samples. If $\sigma$ is too small, the generalization ability of the forecasting model will be limited. Therefore, GWO is applied to find the optimal value of $\sigma$ and improve the forecasting accuracy of GRNN.
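The four layers described above amount to a kernel-weighted average of the training targets. The following is a minimal sketch assuming the standard Gaussian pattern-layer function; the function name `grnn_predict` and its argument names are hypothetical and not taken from the paper, whose implementation is in MATLAB.

```python
import numpy as np

def grnn_predict(X_train, Y_train, X_query, sigma):
    """Forecast targets for X_query with a GRNN of spread parameter sigma."""
    X_train = np.asarray(X_train, dtype=float)   # (N, d) training inputs
    Y_train = np.asarray(Y_train, dtype=float)   # (N,) or (N, k) training targets
    X_query = np.asarray(X_query, dtype=float)   # (q, d) inputs to forecast
    if Y_train.ndim == 1:
        Y_train = Y_train[:, None]
    # Squared Euclidean distance from each query to each training sample (q x N).
    diff = X_query[:, None, :] - X_train[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    # Shift by the row minimum for numerical stability; the shift cancels
    # in the ratio of the two sums below.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    # Pattern layer: one Gaussian RBF neuron per training sample.
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation layer: weighted sum and arithmetic sum.
    S_N = P @ Y_train
    S_D = P.sum(axis=1, keepdims=True)
    # Output layer: ratio of the two sums, shape (q, k).
    return S_N / S_D
```

With three training targets 0, 10, and 20, a very large `sigma` returns a value close to their mean of 10 for any query, while a very small `sigma` reproduces the target of the nearest training sample, which is the trade-off described in the paragraph above.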
The GWO algorithm imitates the leadership hierarchy and hunting mechanism of grey wolves in nature. Compared with the particle swarm optimization (PSO) algorithm, the GWO algorithm does not depend on the setting of the parameters, and has both stronger search ability and faster search speed. Wolves $\alpha$, $\beta$, $\delta$, and $\omega$ are employed to simulate the leadership hierarchy as shown in Fig. 3.

Fig. 3 Leadership hierarchy of wolf group.
To simulate the social hierarchy of wolves in the GWO algorithm, we define three wolves $\alpha$, $\beta$, and $\delta$, and denote the remaining wolves as $\omega$, according to the hierarchy. Among them, wolf $\alpha$ is the optimal solution, wolves $\beta$ and $\delta$ are the sub-optimal solutions, and the remaining wolves $\omega$ are candidate solutions. The wolf pack approaches the optimal solution in the search space guided by the three individual wolves $\alpha$, $\beta$, and $\delta$. The locations of the wolves are then updated and evolved, while the distance from the prey is updated until the optimal solution is obtained.

Fig. 4 Location of grey wolf and prey in search space.
The distance between a wolf and the prey should be determined in advance before hunting:
$$D = \left| C \cdot X_p(i) - X(i) \right| \tag{7}$$
where $X_p(i)$ and $X(i)$ are the locations of the prey and the wolf at the iteration $i$, respectively; and $C$ is the coefficient vector.
$$C = 2 r_1 \tag{8}$$
where $r_1$ is the spatial distance coefficient in [0, 1].
As the distance between the individual wolf and the prey decreases, the position of the individual wolf is constantly updated by:
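$$X(i+1) = X_p(i) - A \cdot D \tag{9}$$
where $X_p(i)$ is the location of the prey; $D$ is the distance given by (7); and $A$ is the coefficient vector.
$$A = 2 a r_2 - a \tag{10}$$
where $a$ decreases from 2 to 0 as the number of iterations increases; and $r_2$ is a random coefficient in [0, 1], like the spatial distance coefficient in (8).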
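Wolves $\alpha$, $\beta$, and $\delta$ are assumed to be the three wolves closest to the prey in the wolf pack. The positions of the remaining wolves are updated by:
$$D_{\alpha} = \left| C_1 \cdot X_{\alpha} - X \right| \tag{11}$$
$$D_{\beta} = \left| C_2 \cdot X_{\beta} - X \right| \tag{12}$$
$$D_{\delta} = \left| C_3 \cdot X_{\delta} - X \right| \tag{13}$$
$$X_1 = X_{\alpha} - A_1 \cdot D_{\alpha} \tag{14}$$
$$X_2 = X_{\beta} - A_2 \cdot D_{\beta} \tag{15}$$
$$X_3 = X_{\delta} - A_3 \cdot D_{\delta} \tag{16}$$
$$X(i+1) = \frac{X_1 + X_2 + X_3}{3} \tag{17}$$
where $X_{\alpha}$, $X_{\beta}$, and $X_{\delta}$ are the current positions of wolves $\alpha$, $\beta$, and $\delta$, respectively; $C_1$, $C_2$, and $C_3$ are the coefficient vectors of wolves $\alpha$, $\beta$, and $\delta$, respectively; $A_1$, $A_2$, and $A_3$ are the coefficient vectors; $X$ is the current position of a remaining wolf $\omega$; and $D_{\alpha}$, $D_{\beta}$, and $D_{\delta}$ are the distances between the individual wolves $\alpha$, $\beta$, $\delta$ and the remaining wolves $\omega$, respectively.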
The GWO algorithm locates the range of prey (optimal solution) through the positions of wolves , , and , as it gradually reduces the distance from the prey before finally catching it. Compared with other intelligent algorithms that search for the optimal solution, the GWO algorithm is capable of a multi-position search, which significantly improves the global search capacity.
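The search procedure described in this section can be sketched as follows. This is a minimal version of the standard grey wolf optimizer, not the authors' MATLAB code: the function name `gwo_minimize`, the bound clipping, the linear decay of a, and the independent random draws for each leader are assumptions of this sketch. The defaults (20 wolves, 50 iterations, one variable) follow the settings reported later in the paper.

```python
import numpy as np

def gwo_minimize(fitness, lower, upper, dim=1, n_wolves=20, n_iter=50, seed=0):
    """Minimize fitness(x) over the box [lower, upper]^dim with basic GWO."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lower, upper, size=(n_wolves, dim))   # initial wolf positions
    best_x, best_f = None, np.inf
    for it in range(n_iter):
        scores = np.array([fitness(x) for x in X])
        order = np.argsort(scores)
        # The three best wolves lead the pack: alpha, beta, delta.
        leaders = [X[order[j]].copy() for j in range(3)]
        if scores[order[0]] < best_f:
            best_x, best_f = leaders[0].copy(), float(scores[order[0]])
        a = 2.0 * (1.0 - it / n_iter)          # decreases linearly from 2 toward 0
        candidates = []
        for leader in leaders:
            r1 = rng.random((n_wolves, dim))
            r2 = rng.random((n_wolves, dim))
            A = 2.0 * a * r2 - a               # step coefficient
            C = 2.0 * r1                       # distance coefficient
            D = np.abs(C * leader - X)         # distance to this leader
            candidates.append(leader - A * D)  # position suggested by this leader
        # New position: average of the three suggestions, kept inside the bounds.
        X = np.clip(np.mean(candidates, axis=0), lower, upper)
    # Also score the final population before returning the best solution seen.
    scores = np.array([fitness(x) for x in X])
    i = int(np.argmin(scores))
    if scores[i] < best_f:
        best_x, best_f = X[i].copy(), float(scores[i])
    return best_x, best_f
```

To tune GRNN, `fitness` would map a candidate spread parameter to a validation error of the network; any scalar-valued function of a length-`dim` array works.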
This paper proposes a hybrid model based on PCA-GWO-GRNN for the PV output forecasting, which can be divided into the following five steps.
Step 1: data preprocessing.
Step 2: dimension reduction. The PCA is adopted to simplify the meteorological input features into a comprehensive meteorological factor.
Step 3: sample selection. The historical weather type, temperature, and day of year (DOY) are used as indicators to identify similar days for GRNN training. According to the weather forecast for the day of forecasting, the historical samples with the same weather type as the day of forecasting are selected to constitute set A. Samples in set A whose daily maximum temperature is within ±3 ℃ of that of the day of forecasting are selected to form set B. Similarly, the samples in set B whose DOY is within 30 days of the day of forecasting are selected to form set C, which is called the set of similar days and is used for parameter optimization and model training.
Step 4: parameter optimization. The samples in set C are divided into 10 folds for cross-validation, among which one fold is selected as the validation set, and the other 9 folds are combined as the training set. Then, we can use the forecasting error of GRNN as the fitness function of GWO algorithm to optimize the parameter of GRNN. The relevant initial parameters of the GWO algorithm are set as follows: the number of wolves is 20, the number of iterations is 50, and the variable dimension is 1. Finally, the optimal value among the 10 validations will be chosen.
Step 5: offline training. After the optimal parameter of the GRNN model is determined, the training sample set is used for offline training of the GRNN model. The PV output forecast is then obtained by feeding the input feature data for the day of forecasting into the trained GRNN model. The flowchart of short-term PV output forecasting based on PCA-GWO-GRNN is shown in Fig. 5.

Fig. 5 Flowchart of short-term PV output forecasting based on PCA-GWO-GRNN.
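The similar-day selection in Step 3 is a chain of three filters, which can be sketched as below. The `Day` record and the function name `select_similar_days` are hypothetical; the sketch also assumes that the DOY difference is a plain absolute difference without wrap-around at the year boundary, which the text does not specify.

```python
from dataclasses import dataclass

@dataclass
class Day:
    weather: str    # weather type, e.g. "sunny", "cloudy", "overcast", "rainy"
    t_max: float    # daily maximum temperature in degrees Celsius
    doy: int        # day of year

def select_similar_days(history, target, temp_tol=3.0, doy_tol=30):
    """Return set C: the historical days similar to the day of forecasting."""
    # Set A: same weather type as the day of forecasting.
    set_a = [d for d in history if d.weather == target.weather]
    # Set B: daily maximum temperature within +/- temp_tol of the target day.
    set_b = [d for d in set_a if abs(d.t_max - target.t_max) <= temp_tol]
    # Set C: day of year within doy_tol days of the target day
    # (assumption: no wrap-around across the year boundary).
    set_c = [d for d in set_b if abs(d.doy - target.doy) <= doy_tol]
    return set_c
```

Set C would then be split into 10 folds for the cross-validation in Step 4.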
In this case study, the focus is on verifying the accuracy and robustness of the proposed model for day-ahead short-term PV output forecasting [
In this paper, after obtaining the final PV output forecasting value, the nominal mean absolute error (nMAE) and root mean square error (RMSE) are used to evaluate the forecasting accuracy [
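$$\mathrm{nMAE} = \frac{1}{n} \sum_{t=1}^{n} \frac{\left| P_{f,t} - P_{a,t} \right|}{P_{\mathrm{cap}}} \times 100\% \tag{18}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( P_{f,t} - P_{a,t} \right)^2} \tag{19}$$
where $n$ is the number of forecasting points; $P_{f,t}$ is the PV output forecasting value at time $t$; $P_{a,t}$ is the actual PV output at time $t$; and $P_{\mathrm{cap}}$ is the installed PV capacity.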
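The two error metrics can be computed as in the sketch below. It assumes the conventional definitions: nMAE is the mean absolute error normalized by the installed capacity and expressed in percent, and RMSE is left in the units of the PV output. The function names are hypothetical, and the paper's exact normalization may differ.

```python
import numpy as np

def nmae(forecast, actual, capacity):
    """Mean absolute error normalized by installed capacity, in percent."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(forecast - actual)) / capacity * 100.0)

def rmse(forecast, actual):
    """Root mean square error, in the same units as the PV output."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((forecast - actual) ** 2)))
```

For example, forecasts of 1, 2, and 3 against actual outputs of 1, 2, and 5 with a capacity of 10 give a mean absolute error of 2/3, hence an nMAE of about 6.67%.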
Taking the features of PV into account, the irradiance, temperature, atmospheric pressure, wind speed, relative humidity, and precipitation are used as input features. The input variables and output variable y of the forecasting model are shown in
According to
As shown in
The sample values of the first three principal components can be obtained as follows:
(20) |
The weighted summation is based on the weighted contribution rate of each principal component to obtain a comprehensive meteorological factor:
(21) |
The input features for the proposed model are the comprehensive meteorological factor and the solar irradiance at forecasting time and 15 min before and after the forecasting time.
To verify the accuracy and superiority of the PCA-GWO-GRNN model, GWO-GRNN, PCA-LSTM, PCA-PSO-BP, and the proposed model are used to forecast the output of the PV plant from July 4 to 31, 2018 (28 days in four weeks). All four models use the similar-day selection approach proposed in this paper, and their input and output variables are also the same. The neural networks and their optimization algorithms are implemented in MATLAB R2018b, and PCA is implemented in SPSS 19.
The 28 consecutive forecasting days include 6 sunny days, 12 cloudy days, 1 overcast day, and 9 rainy days. For each of the four weather types, 1 day is selected for qualitative analysis. The actual output and forecasting results of the four models are shown in Figs. 6-9.

Fig. 6 Forecasting results on a sunny day.

Fig. 7 Forecasting results on a cloudy day.

Fig. 8 Forecasting results on an overcast day.

Fig. 9 Forecasting results on a rainy day.
The comparison of accuracy about Figs.
To further analyze the model performance, due to the large dispersion of the daily error of the PV output forecasting, the forecasting error of different weather types for 28 days is shown in
According to the error statistics in
The PCA-GWO-GRNN model proposed in this paper addresses the problems of a large number of input features and strong randomness in day-ahead short-term PV output forecasting. This paper makes the following contributions:
1) PCA is adopted to reduce the dimension of the meteorological input features and extract variables containing more than 85% of the original information. It reduces the dimensionality of the model input while preserving accuracy.
2) GRNN fits the complex nonlinear relationship between PV output and input features well, and its regression ability is further improved by introducing the GWO algorithm to optimize its parameter. Thus, the proposed model is an appropriate mathematical tool for high-precision PV output forecasting. Furthermore, because of its good forecasting performance, the proposed model can also provide a reference for wind power output, power load, and heat load forecasting in the future.
3) The results show that the proposed forecasting model fully exploits the effective information in the input features with high robustness and forecasting accuracy. It can therefore offer an effective solution to day-ahead short-term PV output forecasting and provide a basis for the optimal operation of multi-energy systems.
References
B. Elsinga and W. van Sark, “Short-term peer-to-peer solar forecasting in a network of photovoltaic systems,” Applied Energy, vol. 206, pp. 1464-1483, Nov. 2017.
M. N. Akhter, S. Mekhilef, H. Mokhlis et al., “Review on forecasting of photovoltaic power generation based on machine learning and metaheuristic techniques,” IET Renewable Power Generation, vol. 13, no. 7, pp. 1009-1023, Nov. 2019.
X. Zhang, Y. Li, S. Lu et al., “A solar time based analog ensemble method for regional solar power forecasting,” IEEE Transactions on Sustainable Energy, vol. 10, no. 1, pp. 268-279, Jan. 2019.
C. Feng, M. Cui, B. Hodge et al., “Unsupervised clustering-based short-term solar forecasting,” IEEE Transactions on Sustainable Energy, vol. 10, no. 4, pp. 2174-2185, Oct. 2019.
C. Wan, J. Zhao, Y. Song et al., “Photovoltaic and solar power forecasting for smart grid energy management,” CSEE Journal of Power and Energy Systems, vol. 1, no. 4, pp. 38-46, Dec. 2015.
C. Cui, Y. Zou, L. Wei et al., “Evaluating combination models of solar irradiance on inclined surfaces and forecasting photovoltaic power generation,” IET Smart Grid, vol. 2, no. 1, pp. 123-130, Mar. 2019.
C. Lai, J. Li, B. Chen et al., “Review of photovoltaic power output prediction technology,” Transactions of China Electrotechnical Society, vol. 34, no. 6, pp. 1201-1217, Mar. 2019.
E. Ogliari, A. Dolara, G. Manzolini et al., “Physical and hybrid methods comparison for the day ahead PV output power forecast,” Renewable Energy, vol. 113, pp. 11-21, Dec. 2017.
M. N. Akhter, S. Mekhilef, H. Mokhlis et al., “A review on forecasting of photovoltaic power generation based on machine learning and metaheuristic techniques,” IET Renewable Power Generation, vol. 13, no. 7, pp. 1009-1023, Feb. 2019.
B. Chen and J. Li, “Combined probabilistic forecasting method for photovoltaic power using an improved Markov chain,” IET Generation, Transmission & Distribution, vol. 13, no. 19, pp. 4364-4373, Oct. 2019.
J. Shi, W. Lee, Y. Yang et al., “Forecasting power output of photovoltaic system based on weather classification and support vector machine,” IEEE Transactions on Industry Applications, vol. 48, no. 3, pp. 1064-1069, May 2012.
C. Huang and P. Kuo, “Multiple-input deep convolutional neural network model for short-term photovoltaic power forecasting,” IEEE Access, vol. 7, pp. 74822-74834, Jun. 2019.
M. Abdel-Nasser and K. Mahmoud, “Accurate photovoltaic power forecasting models using deep LSTM-RNN,” Neural Computing & Applications, vol. 31, no. 7, pp. 2727-2740, Jul. 2019.
L. Li, S. Wen, and M. Tseng, “Renewable energy prediction: a novel short-term prediction model of photovoltaic output power,” Journal of Cleaner Production, vol. 228, pp. 359-375, Aug. 2019.
L. Liu, M. Zhan, and Y. Bai, “A recursive ensemble model for forecasting the power output of photovoltaic systems,” Solar Energy, vol. 189, pp. 291-298, Sept. 2019.
M. Yang and L. Meng, “Short-term photovoltaic power dynamic weighted combination forecasting based on least squares method,” IEEJ Transactions on Electrical and Electronic Engineering, vol. 14, no. 12, pp. 1739-1746, Dec. 2019.
H. Zhou, Y. Zhang, L. Yang et al., “Short-term photovoltaic power forecasting based on long short term memory neural network and attention mechanism,” IEEE Access, vol. 7, pp. 78063-78074, Jun. 2019.
A. Basilevsky, Statistical Factor Analysis and Related Methods: Theory and Applications. New York: Wiley, 1994, pp. 351-352.
G. A. Licciardi, R. Dambreville, J. Chanussot et al., “Spatiotemporal pattern recognition and nonlinear PCA for global horizontal irradiance forecasting,” IEEE Geoscience and Remote Sensing Letters, vol. 12, no. 2, pp. 284-288, Jul. 2014.
F. M. Bianchi, E. de Santis, A. Rizzi et al., “Short-term electric load forecasting using echo state networks and PCA decomposition,” IEEE Access, vol. 3, pp. 1931-1943, Oct. 2015.
X. Yao, Z. Wang, and H. Zhang, “A novel photovoltaic power forecasting model based on echo state network,” Neurocomputing, vol. 325, pp. 182-189, Jan. 2019.
P. Lin, Z. Peng, Y. Lai et al., “Short-term power prediction for photovoltaic power plants using a hybrid improved Kmeans-GRA-Elman model based on multivariate meteorological factors and historical power datasets,” Energy Conversion and Management, vol. 177, pp. 704-717, Dec. 2018.
H. Zhu, W. Lian, L. Lu et al., “An improved forecasting method for photovoltaic power based on adaptive BP neural network with a scrolling time window,” Energies, vol. 10, no. 10, pp. 1542-1547, Oct. 2017.
J. Ospina, A. Newaz, and M. O. Faruque, “Forecasting of PV plant output using hybrid wavelet-based LSTM-DNN structure model,” IET Renewable Power Generation, vol. 13, no. 7, pp. 1087-1095, May 2019.
K. Nose-Filho, A. D. P. Lotufo, and C. R. Minussi, “Short-term multinodal load forecasting using a modified general regression neural network,” IEEE Transactions on Power Delivery, vol. 26, no. 4, pp. 2862-2869, Oct. 2011.
L. Yi, N. Dongxiao, and H. Wei-Chiang, “Short term load forecasting based on feature extraction and improved general regression neural network model,” Energy, vol. 166, pp. 653-663, Jan. 2019.
R. Hu, S. Wen, Z. Zeng et al., “A short-term power load forecasting model based on the generalized regression neural network with decreasing step fruit fly optimization algorithm,” Neurocomputing, vol. 221, pp. 24-31, Jan. 2017.
Z. Ming, X. Song, W. Zhijie et al., “Short-term load forecasting of smart grid systems by combination of general regression neural network and least squares-support vector machine algorithm optimized by harmony search algorithm method,” Applied Mathematics & Information Sciences, vol. 7, pp. 291-298, Feb. 2013.
L. Liu, Y. Zhao, D. Chang et al., “Prediction of short-term PV power output and uncertainty analysis,” Applied Energy, vol. 28, pp. 700-711, Jul. 2018.
S. Mirjalili, S. M. Mirjalili, and A. Lewis, “Grey wolf optimizer,” Advances in Engineering Software, vol. 69, no. 3, pp. 46-61, Mar. 2014.
K. Li, G. Cheng, X. Sun et al., “A nonlinear flux linkage model for bearingless induction motor based on GWO-LSSVM,” IEEE Access, vol. 7, pp. 36558-36567, May 2019.
S. Dai, D. Niu, and Y. Li, “Daily peak load forecasting based on complete ensemble empirical mode decomposition with adaptive noise and support vector machine optimized by modified grey wolf optimization algorithm,” Energies, vol. 11, pp. 1-25, Jan. 2018.
L. Ge, Y. Xian, J. Yan et al., “A FA-GWO-GRNN method for short-term photovoltaic output prediction,” in Proceedings of 2020 IEEE PES General Meeting, Montreal, Canada, Aug. 2020, pp. 1-9.
Y. Li, H. Zhang, X. Liang et al., “Event-triggered-based distributed cooperative energy management for multienergy systems,” IEEE Transactions on Industrial Informatics, vol. 15, no. 4, pp. 2008-2022, Apr. 2019.
H. Zhang, Y. Li, D. Gao et al., “Distributed optimal energy management for energy internet,” IEEE Transactions on Industrial Informatics, vol. 13, no. 6, pp. 3081-3097, Dec. 2017.
L. Fu, Y. Yang, X. Yao et al., “A regional photovoltaic output prediction method based on hierarchical clustering and the mRMR criterion,” Energies, vol. 12, no. 20, pp. 3817-3826, Oct. 2019.
P. Du, G. Zhang, P. Li et al., “The photovoltaic output prediction based on variational mode decomposition and maximum relevance minimum redundancy,” Applied Sciences, vol. 9, no. 17, pp. 3593-3599, Sept. 2019.