Abstract
The accurate prediction of photovoltaic (PV) power generation is essential to ensuring the economic and safe operation of power systems. To this end, this paper establishes a new digital twin (DT)-empowered PV power prediction framework that is capable of ensuring reliable data transmission and employing the DT to achieve high power prediction accuracy. Within this framework, considering potential data contamination in the collected PV data, a generative adversarial network (GAN) is employed to restore the historical dataset, which is a prerequisite for accurate mapping from the physical space to the digital space. Further, a new DT-empowered PV power prediction method is proposed. Therein, we model a DT that encompasses a digital physical model for reflecting the physical operation mechanism and a neural network model (i.e., a parallel network of convolution and bidirectional long short-term memory model) for capturing the hidden spatiotemporal features. The proposed method enables the DT to take advantage of both the digital physical model and the neural network model, resulting in enhanced prediction accuracy. Finally, experiments on a real dataset are conducted to assess the effectiveness of the proposed method.
With the increasing integration of PV power generation, its nonlinearity, periodicity, and volatility pose great challenges to the stable operation of power systems. The uncertainty of PV power generation and the randomness of power demand may lead to an imbalance between power supply and demand. Accurate prediction models can mitigate the impacts of the uncertainty of PV power generation, improve power system stability, and reduce the maintenance costs of additional equipment [
Currently, several studies on PV power prediction have been proposed, which can be roughly divided into three categories: ① physical methods; ② statistical methods; and ③ artificial intelligence (AI)-based methods. The concept of physical methods is to use physical models to construct the relationship between PV power output and other factors such as numerical weather prediction (NWP) data [
To cope with the shortcomings of physical and statistical methods, AI-based methods for PV power prediction have been proposed and have gained significant attention. For instance, convolutional neural networks (CNNs) were used for extracting spatial features [
The aforementioned PV power prediction models are built on the assumption that the dataset is complete [
The aforementioned AI-based methods have yielded remarkable outcomes. However, there exist two challenges. On the one hand, although the consideration of data recovery is presented in [
To tackle these challenges, this paper establishes a DT-empowered PV power prediction framework and proposes a DT-empowered PV power generation prediction method. The main contributions are described as follows.
1) We propose a new DT-empowered PV power prediction framework, which is composed of a physical layer, a data transmission layer, a DT layer, and a service layer, while defining the detailed functionality of each layer. This is a universal reference framework that enables the integration of the DT to empower the PV power prediction.
2) To ensure accurate mapping from the physical to the digital space, a GAN is employed to restore the historical dataset, considering potential data contamination in the collected PV data. This restoration process serves as a prerequisite for reliable data analysis and prediction within the DT framework.
3) A DT-empowered PV power prediction method is proposed, where the DT is constructed with a digital physical model and a parallel CNN and bidirectional long short-term memory (CNN-BiLSTM) model. The proposed method captures both the physical operation mechanism and hidden spatiotemporal features, leveraging the strengths of both models to increase the prediction accuracy.
The remainder of this paper is organized as follows. Section II presents the DT-empowered PV power prediction framework. Section III provides the DT-empowered prediction method within the proposed framework. Section IV presents the simulations to evaluate the performance of the proposed method. Finally, Section V concludes this paper.

Fig. 1 Proposed DT-empowered PV power prediction framework.
This layer refers to physical objects in the real world such as PV panels and sensors. The layer collects and stores device parameters, the PV power generation data, and the meteorological data. Device parameters include the short-circuit current, the open-circuit voltage, data at the maximum power point (current, voltage, and the maximum power), and the volt-ampere characteristic curve of the PV panel. According to different sampling time points, historical datasets can be expressed as:
(1) |
(2) |
(3) |
where is the historical data including temperature, wind speed, solar radiation, relative humidity, and PV power generation, etc., collected at the
This layer serves as the connection channel between the physical and virtual spaces, enabling the collection and transmission of relevant data from the PV power station. During data collection, data packets may be lost, leading to incomplete time series in the historical PV power generation data. To address this issue, we propose the utilization of a GAN for data recovery, which will be discussed in Section III-A. The historical data restored by the GAN and the parameter data of PV panels, sensors, and other devices are transmitted from the physical space to the virtual space in a single batch, participating in the construction of the DT model of the PV power station. The real-time weather data are transmitted in real time from the physical space to the virtual space, where they participate in the power prediction of the DT layer.
As the main part of this paper, this layer focuses on creating the DT model and using it to achieve PV power prediction. In order to accurately reflect the real world and create a high-fidelity DT model, it is necessary to consider the physical characteristics of the PV system and extract the inherent relationships within the historical data simultaneously. In the virtual space, we set up a digital physical model that can reflect the physical operation mechanism and a parallel CNN-BiLSTM model to capture hidden spatiotemporal features. These components are combined using a fusion formula to accomplish the prediction of PV power. The detailed DT modeling process and the prediction procedure will be discussed in Section III-B and Section III-C, respectively.
Within the proposed framework, we propose the DT-empowered PV power prediction method that contains three phases: ① data preparation phase; ② DT modeling phase; and ③ power prediction phase.

Fig. 2 Flowchart of DT empowered PV prediction method.
The data preparation phase is performed at the data transmission layer. In this phase, the data transmission layer retrieves pertinent data from the PV power station. The historical meteorological and power data are fed into the GAN. Subsequently, the recovered historical data and device parameters are transferred from the physical layer to the DT layer in a single batch to participate in the construction of the DT model. Real-time weather data are transmitted from the physical layer to the DT layer and are used for subsequent PV power prediction.
Historical weather and power data collected from PV sites are combined and modeled as a tensor.
First, adjacent vectors are established for , with a time interval of one sampling interval, i.e., 15 min. The adjacent vectors and adjacent to and are also established. For example, has two adjacent vectors and , where is the time step of the vector. That is, each vector contains data of sampling time points. The corresponding adjacent vectors of the mask matrix can also be obtained by using the same procedure.
We represent as the
(4) |
where , , and is the set of all training samples input into the GAN; , and . is the number of padding vectors , where is the number of meteorological factors, and returns the remainder. A binary mask matrix with the same shape as is created to mark the positions of missing elements. For the missing elements in , the corresponding elements in are set to 0, while the remaining elements are set to 1.
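The windowing and masking steps described above can be sketched as follows. This is a minimal illustration, assuming that missing readings are stored as `None` in a plain list of rows and that the masking operator is applied element-wise; the helper names `build_mask`, `apply_mask`, and `sliding_windows` are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]


def build_mask(data: Sequence[Row]) -> List[List[int]]:
    """Binary mask: 1 where a reading is present, 0 where it is missing."""
    return [[0 if value is None else 1 for value in row] for row in data]


def apply_mask(data: Sequence[Row], mask: Sequence[Sequence[int]]) -> List[List[float]]:
    """Element-wise product of data and mask; missing entries become 0.0."""
    return [
        [0.0 if m == 0 else float(v) for v, m in zip(row, mask_row)]
        for row, mask_row in zip(data, mask)
    ]


def sliding_windows(rows: Sequence, length: int) -> List[list]:
    """Overlapping windows of `length` samples, shifted by one sampling interval."""
    return [list(rows[i:i + length]) for i in range(len(rows) - length + 1)]


# Toy example (assumed values): 4 sampling points x 3 variables.
data = [
    [21.5, 640.0, 3.2],
    [21.9, None, 3.4],
    [None, 655.0, None],
    [22.4, 660.0, 3.6],
]
mask = build_mask(data)
masked = apply_mask(data, mask)
windows = sliding_windows(masked, length=2)
```

The same `sliding_windows` call can be applied to `mask`, so that every data window has a mask window of identical shape.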
After modeling the historical data into a tensor, the problem of historical data recovery becomes the recovery of missing elements in the tensor.
To achieve effective data recovery, we employ a GAN consisting of a generator and a discriminator, which is capable of learning the temporal features of the data and capturing the intrinsic relationship between meteorological data and power data. The generator uses a CNN-based encoder-decoder structure. The encoder takes the missing dataset as input and generates the latent feature representation of , where is the dot product operator. Then, the decoder takes the latent feature representation and outputs , which includes the recovered part of the missing data. Furthermore, to make full use of the reliable data that are already present in set during the data generation process, a U-Net is adopted in the generator to enhance feature extraction. The discriminator takes the restored matrix and the original complete matrix as inputs. The generator is trained to generate the restored matrix , while the discriminator is trained to judge whether the recovered data are realistic enough. The employed generator and discriminator network structures are shown in
Fig. 3 Generator network structure.

Fig. 4 Discriminator network structure.
Based on the description of the model structure, the loss functions of the generator and the discriminator are defined as follows.
The loss function of the generator includes the adversarial loss and the recovery loss. The adversarial loss is defined based on the output of the discriminator, which represents the quality of recovery of missing data, i.e.,
(5) |
where is the discriminant value of the output of discriminator. The recovery loss is defined as the masked root-mean-squared error (RMSE) between and . Since has already dealt with the missing data part, the recovery loss mainly focuses on the part of intact data. The mathematical expression of the recovery loss is given by:
(6) |
Next, the loss function of the generator is defined as:
(7) |
The objective of the discriminator is to maximize the discriminative value of real historical data and minimize the discriminative value of the output of the generator. Therefore, the loss function of the discriminator is defined as:
(8) |
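As a rough illustration of how these loss terms fit together, the sketch below assumes the standard logarithmic GAN losses and a weighted sum of the adversarial and recovery terms. The exact forms of (5)-(8) are not reproduced here, so the function names, the weight `weight`, and the log-based formulation are all assumptions rather than the authors' definitions.

```python
import math

EPS = 1e-12  # guards against log(0)


def masked_rmse(restored, original, mask) -> float:
    """RMSE over entries whose mask value is 1 (the intact part of the data)."""
    sq_sum, count = 0.0, 0
    for r_row, o_row, m_row in zip(restored, original, mask):
        for r, o, m in zip(r_row, o_row, m_row):
            if m == 1:
                sq_sum += (r - o) ** 2
                count += 1
    return math.sqrt(sq_sum / count) if count else 0.0


def generator_loss(d_fake: float, recovery: float, weight: float = 1.0) -> float:
    """Adversarial term (assumed -log D(fake)) plus weighted recovery term."""
    return -math.log(d_fake + EPS) + weight * recovery


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Assumed standard form: reward high scores on real data, low on generated."""
    return -(math.log(d_real + EPS) + math.log(1.0 - d_fake + EPS))


# Toy matrices (assumed values); the masked-out entry is ignored by the recovery loss.
restored = [[1.0, 2.0], [3.0, 5.0]]
original = [[1.0, 0.0], [3.0, 4.0]]
mask = [[1, 0], [1, 1]]
recovery = masked_rmse(restored, original, mask)
```

With this formulation, a generator output that the discriminator scores close to 1 and that matches the intact entries yields a small generator loss.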
In order to build a virtual model at the DT layer that can accurately reflect the process of PV power generation in the real world, we receive the device parameters and the historical dataset after the data recovery from the transmission layer. First, we construct a digital physical model to simulate the internal mechanism of PV panel power generation. Then, a parallel CNN-BiLSTM model is built and trained to extract the inherent characteristics of meteorological factors and PV power generation. Eventually, a combination formula is applied to connect the two models to form the DT model.
This part is composed of the underlying physical model and the power deviation correction module. Specifically, a PV power plant converts solar radiation into direct-current electricity. It primarily consists of solar cells, which are semiconductor thin films that directly generate electricity when exposed to sunlight of a specific irradiance. These solar cells produce voltage and current when connected in a circuit. The power output of solar cells varies with fluctuations in weather conditions. Solar radiation plays a crucial role in determining the power output. Higher temperatures can reduce the efficiency of power generation components, while strong winds can help to reduce the temperature of solar cells, thereby increasing power generation. This behavior can be effectively modeled using an equivalent circuit.
The formula for describing the output current of a single diode equivalent circuit is given by:
(9) |
where is the photocurrent generated by the cell due to incident solar radiation; is the shunt (leakage) current caused by leakage at the edge of the cell and the formation of metal bridges; and is the diode current given by the Shockley equation. The mathematical expressions of and are given by:
(10) |
(11) |
where is the voltage drop across the cell due to incident solar radiation; is the series resistance; is the shunt resistance; is the reverse saturation current; is the electron charge; is the ideality factor of the diode; is the Boltzmann constant; and is the actual temperature of the PV module defined as:
(12) |
where is the ambient temperature; is the real-time irradiance; is the irradiance-induced shading effect; is the effect of wind speed; and is the real-time wind speed.
The generated power of the solar cell, denoted as , is calculated as:
(13) |
There exist five unknown parameters, i.e., , , , , and . By establishing five equations based on the short-circuit current , open-circuit voltage , the maximum power , at the maximum power point, and at the short-circuit point, the unknown parameters can be obtained. With those components, a physical model of the PV power station can be constructed. The input data are , , and , while the output data are , , and .
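For readers who want to experiment with the equivalent circuit, the sketch below evaluates the textbook single-diode model, in which the output current equals the photocurrent minus the Shockley diode current minus the shunt current. It is only an illustration: the parameter names and values are assumptions for a single cell, the module temperature is passed in directly rather than computed from the temperature model above, and bisection is used here as a simple way to solve the implicit equation, not necessarily the authors' procedure.

```python
import math

Q = 1.602176634e-19   # electron charge, C
K_B = 1.380649e-23    # Boltzmann constant, J/K


def cell_current(v, i_ph, i_0, n, r_s, r_sh, temp_k, tol=1e-10):
    """Solve the implicit single-diode equation for the output current by bisection."""
    v_t = n * K_B * temp_k / Q  # ideality factor times thermal voltage

    def residual(i):
        i_d = i_0 * (math.exp((v + i * r_s) / v_t) - 1.0)  # Shockley diode current
        i_sh = (v + i * r_s) / r_sh                         # shunt (leakage) current
        return i_ph - i_d - i_sh - i

    lo, hi = -i_ph, i_ph  # the residual decreases monotonically in i
    if residual(lo) < 0.0 or residual(hi) > 0.0:
        raise ValueError("voltage outside the bracketed operating range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cell_power(v, **params):
    """Generated power as the product of terminal voltage and output current."""
    return v * cell_current(v, **params)


# Assumed single-cell parameters, for illustration only.
params = dict(i_ph=8.0, i_0=1e-9, n=1.3, r_s=0.01, r_sh=100.0, temp_k=298.15)
i_sc = cell_current(0.0, **params)
voltages = [k * 0.01 for k in range(76)]  # 0.00 V to 0.75 V
v_mpp = max(voltages, key=lambda v: cell_power(v, **params))
p_mpp = cell_power(v_mpp, **params)
```

Scanning the voltage axis as above gives a coarse estimate of the maximum power point, which is one of the operating points used to identify the five unknown parameters.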
The aforementioned underlying physical model considers only ambient temperature, real-time irradiance, and real-time wind speed as inputs. It therefore fails to account for the complex practical conditions of the PV power station and other weather factors, leading to certain deviations in the prediction results. To address this issue, a deviation correction process is introduced. This process exploits the similarity in PV power output under similar external climate conditions, i.e., in the same season and at the same sampling time within a day. By calculating and storing the difference between the output power of the underlying physical model and the actual historical power, a correction value can be determined. This correction value is then used to adjust the predicted power from the underlying physical model, resulting in more accurate prediction results within the digital physical model. To implement the deviation correction, the historical weather data that have been restored by the GAN are used as input to the underlying physical model. Let represent the output power and be the actual power. The difference between the predicted power of the underlying physical model and the actual power can be calculated as:
(14) |
According to (14), the revised value, denoted as , can be calculated as:
(15) |
(16) |
where is an adjustable hyperparameter between 0 and 1.
As the historical dataset used for constructing the digital physical model typically contains a large amount of data, spanning more than one year, it is essential to fully utilize this dataset while ensuring the stability of the revised value calculation. To achieve this, the calculation result of the revised value is averaged on a yearly basis, resulting in . The formula for calculating the elements in the array is expressed as:
(17) |
where ; and is used to round up to an integer; and is the number of data sampling times per day.
After is obtained, the power value of the corrected output power at the
(18) |
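One possible reading of the deviation correction is sketched below: the difference between actual and physical-model power is averaged over the same position within each year, and a scaled version of that average is added to the physical-model output. The way the hyperparameter (named `alpha` here) enters, the clipping at zero, and all function names are assumptions made for illustration; the exact forms of (14)-(18) are not reproduced.

```python
from typing import List, Sequence


def yearly_average_deviation(
    actual: Sequence[float], physical: Sequence[float], samples_per_year: int
) -> List[float]:
    """Average (actual - physical) over the same slot of every year in the history."""
    sums = [0.0] * samples_per_year
    counts = [0] * samples_per_year
    for t, (p_act, p_phy) in enumerate(zip(actual, physical)):
        slot = t % samples_per_year
        sums[slot] += p_act - p_phy
        counts[slot] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def corrected_power(physical_now: float, slot: int, correction: Sequence[float], alpha: float) -> float:
    """Physical-model output adjusted by the stored correction (clipped at zero, an assumption)."""
    return max(0.0, physical_now + alpha * correction[slot])


# Toy history (assumed values): two "years" of four sampling slots each.
actual = [10.0, 20.0, 30.0, 40.0, 12.0, 22.0, 32.0, 42.0]
physical = [8.0, 18.0, 33.0, 40.0, 8.0, 18.0, 33.0, 40.0]
correction = yearly_average_deviation(actual, physical, samples_per_year=4)
adjusted = corrected_power(18.0, slot=1, correction=correction, alpha=0.5)
```

Averaging over years keeps the stored correction array at one year in length, regardless of how many years of history are available.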
To capture the underlying relationships among diverse meteorological data and the temporal dependencies within the data, we propose a parallel CNN-BiLSTM model, as depicted in

Fig. 5 Structure of parallel CNN-BiLSTM model.
Tensor modeling is conducted on the recovered historical meteorological data and historical power data, denoted as and , respectively. Meanwhile, the predicted power of the neural network model is defined as :
(19) |
(20) |
(21) |
(22) |
where is the input data required to predict the power at the
(23) |
where is the batch of samples for each training.
In order to leverage the advantages of the digital physical model and the parallel CNN-BiLSTM model, we design a combination formula that is a linear combination of the prediction results from the two models. The combined result is used as the final predicted PV power. We define and to represent the predicted values of the digital physical model and the parallel CNN-BiLSTM model. The difference between the real power and the predicted power from the digital physical model as well as the difference between the real power and the predicted power from the parallel CNN-BiLSTM model can be calculated as (24) and (25), respectively:
(24) |
(25) |
In order to reduce the amount of data, maximize the use of recovered historical data, and avoid incidental fluctuations in the calculation results, the above two difference values are averaged annually. The calculation formula is given by:
(26) |
where or 2.
According to (26), we can obtain the averaged differences and as:
(27) |
(28) |
where ; and .
The combined formula of the power prediction results is defined as:
(29) |
where and are the weight coefficients of the predicted power from the digital physical model and the parallel CNN-BiLSTM model, respectively. The mathematical definitions of the weight coefficients are designed as:
(30) |
(31) |
where is a hyperparameter.
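To make the idea of an error-dependent linear combination concrete, the sketch below uses inverse-error weighting with an exponent `k` as a stand-in. This weighting rule, the function names, and the numbers are assumptions chosen for illustration; the actual weight definitions in (30) and (31) are not reproduced here and may differ.

```python
from typing import Tuple


def combination_weights(err_physical: float, err_network: float, k: float,
                        eps: float = 1e-12) -> Tuple[float, float]:
    """Weights that sum to 1 and favor the model with the smaller averaged error."""
    a = abs(err_physical) ** k
    b = abs(err_network) ** k
    total = a + b
    if total < eps:
        return 0.5, 0.5  # both models are error-free at this slot
    return b / total, a / total


def combine(p_physical: float, p_network: float, w_physical: float, w_network: float) -> float:
    """Final prediction as a linear combination of the two model outputs."""
    return w_physical * p_physical + w_network * p_network


# Toy values (assumed): the physical model historically errs three times as much.
w_phy, w_net = combination_weights(err_physical=3.0, err_network=1.0, k=1.0)
final_power = combine(40.0, 44.0, w_phy, w_net)
```

In this stand-in, a larger `k` pushes the combination further toward whichever model has the smaller averaged error at that time slot.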
After finishing the phases of data preparation and DT modeling, we proceed to the final power prediction phase. Taking the real-time weather data as input, we use the digital physical model and the parallel CNN-BiLSTM model to calculate the prediction results and . Then, the final predicted power is obtained through the calculation of the combined formula of the power prediction results.
Remark 1: The data augmentation method is a data preprocessing technique for expanding training data through a series of transformations and extensions of the original dataset to generate new training samples. Distinguished from the data augmentation methods, the DT focuses on creating the digital counterpart of the physical systems to provide simulation and analysis. In the aspect of solving the prediction problem, the data augmentation method enables the extension of training data to deal with data imbalance and improve the prediction accuracy. In this paper, the DT is used to create digital physical models that reflect the intrinsic mechanisms of physical systems, and to use machine learning models to capture hidden features that are difficult to analyze based on physical models. This enables the integration of physical knowledge and data-driven methods to achieve accurate modeling and prediction of real systems. In this paper, we have a complete real dataset and do not need to generate new data. Thus, we introduce the DT to increase the prediction accuracy.
The real dataset comes from the global intelligent evolution simulation experiment platform and engineering demonstration application project of distributed information energy system at Northeastern University in China. This dataset contains historical records of power generation and weather conditions. Specifically, it covers the period between 2016 and 2018 and includes the data recorded from 08:00 to 17:00 daily. The sampling interval is 15 min. The data types include temperature, wind speed, solar irradiance, relative humidity, and PV output power. The first 24 months and the last 12 months of the historical dataset are taken as training and testing samples, respectively. The time dimension of the data, the number of meteorological factors, and the data sampling frequency per day are , , and , respectively. To handle the missing and abnormal data, invalid data are identified and set to zero in the mask matrix M. To remove the influence of differing units and enhance data features, the historical dataset is normalized and then fed into the GAN for data recovery to improve the quality of the dataset.
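The normalization scheme is not specified above. As an illustration only, the sketch below assumes per-variable min-max scaling and leaves missing readings (`None`) untouched so that they can still be flagged in the mask matrix; the function names and sample values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Column = Sequence[Optional[float]]


def fit_min_max(column: Column) -> Tuple[float, float]:
    """Minimum and maximum of the valid (non-missing) readings."""
    valid = [x for x in column if x is not None]
    return min(valid), max(valid)


def normalize(column: Column, lo: float, hi: float) -> List[Optional[float]]:
    """Scale valid readings to [0, 1]; missing readings stay None."""
    span = hi - lo
    if span == 0.0:
        return [None if x is None else 0.0 for x in column]
    return [None if x is None else (x - lo) / span for x in column]


def denormalize(column: Column, lo: float, hi: float) -> List[Optional[float]]:
    """Map normalized values back to physical units."""
    return [None if x is None else lo + x * (hi - lo) for x in column]


irradiance = [0.0, 250.0, None, 1000.0]  # toy values in W/m^2
lo, hi = fit_min_max(irradiance)
scaled = normalize(irradiance, lo, hi)
restored = denormalize(scaled, lo, hi)
```

The bounds `lo` and `hi` would be fitted on the training period only and reused for the test period, so that no information from the test data leaks into the scaling.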
The parameters of the generator and discriminator networks are listed in Tables
Layer | Part | Kernel size | Number of kernels |
---|---|---|---|
1 | Convolution | | 16 |
2 | Convolution | | 32 |
3 | Attention | | |
4 | Convolution | | 64 |
5 | Deconvolution | | 64 |
6 | Attention | | |
7 | Deconvolution | | 64 |
8 | Deconvolution | | 32 |
Layer | Part | Kernel size | Number of kernels |
---|---|---|---|
1 | Convolution | | 8 |
2 | Convolution | | 16 |
3 | Convolution | | 32 |
4 | Attention | | |
5 | Convolution | | 64 |
6 | Convolution | | 1 |
The generator takes input data with a time step of and , resulting in and . The convolutional layers in the generator network employ SAME padding with a stride of . Similarly, the convolutional layers in the discriminator network also adopt SAME padding, with a stride of , except for the last convolutional layer, which has a stride of . The GAN is trained with the Adam optimizer, using the Leaky ReLU activation function and a keep probability of 0.8. The input data of the parallel CNN-BiLSTM model in the DT layer have a time step of and . We set the batch size to 64, the number of epochs to 50, and the learning rate to 0.0002. The parameters of the parallel CNN-BiLSTM model are shown in
Part | Kernel size or hidden size | Number of convolutional kernels |
---|---|---|
Conv2D 1 | | 6 |
Conv2D 2 | | 6 |
Conv2D 3 | | 8 |
BiLSTM 1 | 64 | |
BiLSTM 2 | 64 | |
FC 1 | 128 | |
FC 2 | 64 | |
We evaluate the accuracy of PV power prediction models by using the RMSE and the mean absolute error (MAE), which are defined as:
(32) |
(33) |
where is the measured PV power at the
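Both indicators follow their usual definitions: the RMSE is the square root of the mean squared difference between measured and predicted power, and the MAE is the mean absolute difference. A minimal sketch in plain Python, with made-up sample values, is:

```python
import math
from typing import Sequence


def rmse(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-squared error between measured and predicted power."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between measured and predicted power."""
    n = len(measured)
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / n


# Toy values (not from the dataset): errors are 2, -2, and 3.
measured = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
sample_rmse = rmse(measured, predicted)
sample_mae = mae(measured, predicted)
```

Because the errors are squared before averaging, the RMSE is never smaller than the MAE and reacts more strongly to occasional large deviations.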
The hyperparameter of the deviation correction module in the digital physical model and the hyperparameter in the combination formula of power prediction results are determined by grid search. The selection principle for and is that the higher the accuracy of the predicted power, the better the choice of hyperparameters; that is, each hyperparameter is chosen to minimize the RMSE. The search results for the RMSE of and k are shown in
Fig. 6 Searching result for RMSE of .

Fig. 7 Searching result for RMSE of .
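The grid search described above amounts to evaluating the RMSE for each candidate value and keeping the minimizer. The sketch below shows this on toy data; the candidate grid, the data, the correction rule, and the name `alpha` for the deviation-correction hyperparameter are assumptions for illustration only.

```python
import math
from typing import Callable, Iterable, Optional, Tuple


def rmse(measured, predicted) -> float:
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def grid_search(candidates: Iterable[float],
                score: Callable[[float], float]) -> Tuple[Optional[float], float]:
    """Return the candidate with the lowest score, together with that score."""
    best_value, best_score = None, math.inf
    for value in candidates:
        current = score(value)
        if current < best_score:
            best_value, best_score = value, current
    return best_value, best_score


# Toy data (not from the paper): physical-model output, stored correction, measured power.
physical = [8.0, 18.0, 33.0, 40.0]
correction = [3.0, 3.0, -2.0, 1.0]
measured = [p + 0.6 * c for p, c in zip(physical, correction)]


def score_alpha(alpha: float) -> float:
    """RMSE of the corrected prediction for a given candidate value."""
    predicted = [p + alpha * c for p, c in zip(physical, correction)]
    return rmse(measured, predicted)


alphas = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
best_alpha, best_rmse = grid_search(alphas, score_alpha)
```

The same `grid_search` helper can be reused for the combination hyperparameter k by passing a different scoring function.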
In this case study, we focus on evaluating the performance of the proposed DT-empowered PV power prediction method by comparing with several baselines. The baselines include CNN [
Weather type | Winter | Spring | Summer | Autumn |
---|---|---|---|---|
Sunny | January 9 | April 7 | July 13 | October 27 |
Sunny | January 10 | April 8 | July 14 | October 28 |
Sunny | January 11 | April 9 | July 15 | October 29 |
Sunny | January 12 | April 10 | July 16 | October 30 |
Rainy | January 2 | April 16 | July 2 | October 14 |
Rainy | January 5 | April 29 | July 3 | October 15 |
Rainy | January 19 | May 17 | July 4 | October 16 |
Rainy | January 21 | May 27 | July 24 | October 21 |
Extreme | January 3 | March 7 | June 28 | November 5 |
Extreme | January 4 | April 5 | July 6 | November 7 |
Extreme | January 27 | April 23 | July 22 | November 8 |
Extreme | February 19 | May 5 | August 17 | November 26 |
The results of PV power prediction on typical days using the proposed method and baselines are presented in

Fig. 8 Results of PV power prediction on typical days using proposed method and baselines. (a) Sunny in spring. (b) Rainy in spring. (c) Extreme weather in spring. (d) Sunny in summer. (e) Rainy in summer. (f) Extreme weather in summer. (g) Sunny in autumn. (h) Rainy in autumn. (i) Extreme weather in autumn. (j) Sunny in winter. (k) Rainy in winter. (l) Extreme weather in winter.
Season | Weather type | Evaluation indicator | DT | CNN | LSTM | CNN-LSTM | GCN |
---|---|---|---|---|---|---|---|
Spring | Sunny | RMSE | 5.4841 | 13.8077 | 9.1976 | 7.6780 | 11.3677 |
Spring | Sunny | MAE | 3.8838 | 7.3492 | 5.3634 | 5.3930 | 5.2880 |
Spring | Rainy | RMSE | 6.9357 | 12.8645 | 10.2409 | 8.0464 | 11.7896 |
Spring | Rainy | MAE | 5.0822 | 7.9312 | 5.9123 | 5.2222 | 6.2912 |
Spring | Extreme | RMSE | 3.6891 | 9.3243 | 7.4335 | 5.6493 | 8.7926 |
Spring | Extreme | MAE | 2.3016 | 5.3195 | 4.3060 | 3.2325 | 5.0059 |
Summer | Sunny | RMSE | 4.2208 | 7.0925 | 6.6126 | 5.5649 | 7.7595 |
Summer | Sunny | MAE | 3.2375 | 4.3243 | 4.2353 | 3.4159 | 5.1515 |
Summer | Rainy | RMSE | 5.4557 | 12.8795 | 10.3968 | 8.6488 | 13.4175 |
Summer | Rainy | MAE | 3.5505 | 8.0703 | 5.9831 | 4.8015 | 7.7678 |
Summer | Extreme | RMSE | 4.2045 | 13.2859 | 9.5298 | 7.7692 | 12.3455 |
Summer | Extreme | MAE | 2.7993 | 7.4703 | 5.4042 | 4.4423 | 7.0225 |
Autumn | Sunny | RMSE | 3.9254 | 10.3079 | 6.8957 | 6.2815 | 8.5103 |
Autumn | Sunny | MAE | 2.8185 | 5.6227 | 4.4287 | 4.0632 | 4.7555 |
Autumn | Rainy | RMSE | 3.3121 | 11.1165 | 8.2280 | 6.9399 | 9.8849 |
Autumn | Rainy | MAE | 2.2533 | 6.3080 | 4.7879 | 4.4101 | 5.4118 |
Autumn | Extreme | RMSE | 2.7799 | 6.1071 | 4.8034 | 3.6499 | 5.7009 |
Autumn | Extreme | MAE | 1.5666 | 3.4169 | 2.6687 | 2.0296 | 3.1732 |
Winter | Sunny | RMSE | 4.9257 | 12.4006 | 7.9110 | 6.6768 | 9.4003 |
Winter | Sunny | MAE | 3.6179 | 7.7115 | 5.3117 | 4.3161 | 6.2213 |
Winter | Rainy | RMSE | 3.0111 | 6.4340 | 5.3785 | 4.7058 | 6.0889 |
Winter | Rainy | MAE | 1.7395 | 4.1226 | 3.3040 | 2.5032 | 3.8803 |
Winter | Extreme | RMSE | 3.0069 | 5.6353 | 3.7990 | 3.3466 | 5.1164 |
Winter | Extreme | MAE | 1.9092 | 3.3758 | 2.6538 | 2.1242 | 3.0018 |
Weather type | Evaluation indicator | DT | CNN | LSTM | CNN-LSTM | GCN |
---|---|---|---|---|---|---|
Sunny | RMSE | 4.6787 | 11.7701 | 7.7518 | 6.6092 | 9.3590 |
Sunny | MAE | 3.3787 | 6.2900 | 4.5375 | 4.2887 | 5.2065 |
Rainy | RMSE | 4.7853 | 11.4384 | 8.6185 | 7.0423 | 10.4900 |
Rainy | MAE | 2.7853 | 6.4790 | 4.9233 | 4.1279 | 5.8891 |
Extreme | RMSE | 3.3973 | 8.7974 | 6.5928 | 5.2062 | 8.2042 |
Extreme | MAE | 2.1973 | 5.3658 | 3.8859 | 3.1043 | 4.7576 |
Season | Evaluation indicator | DT | CNN | LSTM | CNN-LSTM | GCN |
---|---|---|---|---|---|---|
Spring | RMSE | 5.2201 | 11.2352 | 8.2694 | 6.1115 | 9.8017 |
Spring | MAE | 3.4201 | 7.1627 | 4.7939 | 4.1282 | 5.6451 |
Summer | RMSE | 4.5776 | 10.9193 | 8.6484 | 7.4751 | 9.6631 |
Summer | MAE | 3.1776 | 6.6673 | 5.3002 | 4.1911 | 6.7146 |
Autumn | RMSE | 3.1688 | 8.9542 | 6.3748 | 4.9967 | 7.7242 |
Autumn | MAE | 2.1688 | 5.3829 | 3.5284 | 3.2643 | 4.4611 |
Winter | RMSE | 3.6216 | 7.7566 | 5.1221 | 4.9871 | 6.6926 |
Winter | MAE | 2.4216 | 4.9165 | 3.7249 | 2.8665 | 4.1703 |
Method | RMSE | MAE |
---|---|---|
DT | 4.2934 | 2.7841 |
CNN | 9.8195 | 6.2591 |
LSTM | 6.9598 | 4.4019 |
CNN-LSTM | 5.8476 | 3.7675 |
GCN | 9.1588 | 5.3338 |
1) The proposed method obtains the lowest RMSE and MAE values compared with the baselines, regardless of the season and weather conditions. Because the RMSE penalizes large errors more heavily, the lowest RMSE value indicates that the proposed method produces the fewest large deviations, i.e., its prediction performance is the most stable. The lowest MAE value indicates that the average difference between the predicted results of the proposed method and the actual observed values is the smallest.
2) In comparison to the LSTM and CNN models, the proposed method is capable of extracting spatiotemporal features from the dataset more effectively and has stronger abilities in mining data features. Compared with the CNN-LSTM model, the proposed method considers not only the inherent hidden features of weather and power data, but also the practical conditions of PV panels and other devices. The GCN relies primarily on the adjacency relationships of nodes, which limits information propagation and leads to lower prediction accuracy. Consequently, the prediction accuracy of the proposed method is significantly superior to that of the baselines.
In order to further demonstrate the effectiveness of the proposed method, ablation analysis is conducted in this case study.

Fig. 9 Prediction results for three different weather types under ablation analysis. (a) Sunny. (b) Rainy. (c) Extreme.
Tables
Weather type | Evaluation indicator | DT | Digital physical model | Parallel CNN-BiLSTM model |
---|---|---|---|---|
Sunny | RMSE | 4.6787 | 14.1536 | 6.1925 |
Sunny | MAE | 3.3787 | 8.7514 | 3.2086 |
Rainy | RMSE | 4.7853 | 7.8433 | 6.6547 |
Rainy | MAE | 2.7853 | 4.8217 | 3.8305 |
Extreme | RMSE | 3.3973 | 3.7055 | 3.2356 |
Extreme | MAE | 2.1973 | 2.4293 | 2.3866 |
Season | Evaluation indicator | DT | Digital physical model | Parallel CNN-BiLSTM model |
---|---|---|---|---|
Spring | RMSE | 5.2201 | 11.3062 | 5.3032 |
Spring | MAE | 3.4201 | 7.2975 | 3.2701 |
Summer | RMSE | 4.5776 | 9.7968 | 6.1744 |
Summer | MAE | 3.1776 | 5.8107 | 3.8261 |
Autumn | RMSE | 3.1688 | 6.2832 | 3.9064 |
Autumn | MAE | 2.1688 | 4.2208 | 2.2982 |
Winter | RMSE | 3.6216 | 9.5294 | 4.9965 |
Winter | MAE | 2.4216 | 5.0249 | 2.7553 |
Evaluation indicator | DT | Digital physical model | Parallel CNN-BiLSTM model |
---|---|---|---|
RMSE | 4.2934 | 9.6687 | 5.3674 |
MAE | 2.7841 | 5.5808 | 3.2338 |
The results indicate that the combined version achieves the highest overall prediction accuracy compared with the digital physical model and the parallel CNN-BiLSTM model, although the parallel CNN-BiLSTM model alone attains a slightly lower MAE on sunny days and in spring and a slightly lower RMSE under extreme weather. This is because the proposed method takes advantage of both the physical characteristics of the PV power station and the inherent data features between meteorological and power data. This method enables better simulation of real-world PV power generation processes and achieves accurate PV power prediction.
In this paper, we have established a DT-empowered PV power prediction framework to achieve reliable data transmission and power prediction with high accuracy. We have employed a GAN for data recovery from historical data, which is capable of significantly improving the quality of constructing a DT virtual power station. This enhances the reliability of mapping from the physical space to the digital space. We have proposed a new DT-empowered PV power prediction method. By integrating the digital physical model and the parallel CNN-BiLSTM model, the proposed method effectively enhances the prediction accuracy for PV power generation. Finally, the testing results on the real dataset from Northeastern University show that the proposed method can achieve higher prediction accuracy than the baselines in different scenarios. In future work, we would like to investigate the integration of federated learning to enhance the privacy of the proposed method.
References
A. Shafi, H. Sharadga, and S. Hajimirza, “Design of optimal power point tracking controller using forecasted photovoltaic power and demand,” IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1820-1828, Jul. 2020. [Baidu Scholar]
H. Zhang, Y. Li, D. W. Gao et al., “Distributed optimal energy management for energy internet,” IEEE Transactions on Industrial Informatics, vol. 13, no. 6, pp. 3081-3097, Dec. 2017. [Baidu Scholar]
Y. Li, H. Zhang, X. Liang et al., “Event-triggered-based distributed cooperative energy management for multienergy systems,” IEEE Transactions on Industrial Informatics, vol. 15, no. 4, pp. 2008-2022, Apr. 2019. [Baidu Scholar]
X. Zhang, Y. Li, S. Lu et al., “A solar time based analog ensemble method for regional solar power forecasting,” IEEE Transactions on Sustainable Energy, vol. 10, no. 1, pp. 268-279, Jan. 2019. [Baidu Scholar]
K. Hu, S. Cao, L. Wang et al., “A new ultra-short-term photovoltaic power prediction model based on ground-based cloud images,” Journal of Cleaner Production, vol. 200, pp. 731-745, Nov. 2018. [Baidu Scholar]
B. Kim, D. Suh, M.-O. Otto et al., “A novel hybrid spatio-temporal forecasting of multisite solar photovoltaic generation,” Remote Sensing, vol. 13, no. 13, pp. 2605, Jul. 2021. [Baidu Scholar]
K. Doubleday, S. Jascourt, W. Kleiber et al., “Probabilistic solar power forecasting using bayesian model averaging,” IEEE Transactions on Sustainable Energy, vol. 12, no. 1, pp. 325-337, Jan. 2021. [Baidu Scholar]
L. F. Tratar and E. Strmcnik, “The comparison of holt-winters method and multiple regression method: a case study,” Energy, vol. 109, pp. 266-276, Aug. 2016. [Baidu Scholar]
H. T. C. Pedro and C. F. M. Coimbra, “Assessment of forecasting techniques for solar power production with no exogenous inputs,” Solar Energy, vol. 86, no. 7, pp. 2017-2028, Jul. 2012.
Y. Tang, K. Yang, S. Zhang et al., “Photovoltaic power forecasting: a hybrid deep learning model incorporating transfer learning strategy,” Renewable and Sustainable Energy Reviews, vol. 162, no. 11, p. 112473, Jul. 2022.
J. Yan, L. Hu, Z. Zhen et al., “Frequency-domain decomposition and deep learning based solar PV power ultra-short-term forecasting model,” IEEE Transactions on Industry Applications, vol. 57, no. 4, pp. 3282-3295, Jul.-Aug. 2021.
H. Li, Z. Ren, Y. Xu et al., “A multi-data driven hybrid learning method for weekly photovoltaic power scenario forecast,” IEEE Transactions on Sustainable Energy, vol. 13, no. 1, pp. 91-100, Jan. 2022.
F. Wang, J. Li, Z. Zhen et al., “Cloud feature extraction and fluctuation pattern recognition based ultrashort-term regional PV power forecasting,” IEEE Transactions on Industry Applications, vol. 58, no. 5, pp. 6752-6767, Sept.-Oct. 2022.
Y. Zhang, C. Qin, A. K. Srivastava et al., “Data-driven day-ahead PV estimation using autoencoder-LSTM and persistence model,” IEEE Transactions on Industry Applications, vol. 56, no. 6, pp. 7185-7192, Nov.-Dec. 2020.
L. Cheng, H. Zang, Z. Wei et al., “Short-term solar power prediction learning directly from satellite images with regions of interest,” IEEE Transactions on Sustainable Energy, vol. 13, no. 1, pp. 629-639, Jan. 2022.
R. Zhang, H. Ma, T. K. Saha et al., “Photovoltaic nowcasting with bi-level spatio-temporal analysis incorporating sky images,” IEEE Transactions on Sustainable Energy, vol. 12, no. 3, pp. 1766-1776, Jul. 2021.
J. Li, C. Zhang, and B. Sun, “Two-stage hybrid deep learning with strong adaptability for detailed day-ahead photovoltaic power forecasting,” IEEE Transactions on Sustainable Energy, vol. 14, no. 1, pp. 193-205, Jan. 2023.
S. Chai, Z. Xu, Y. Jia et al., “A robust spatiotemporal forecasting framework for photovoltaic generation,” IEEE Transactions on Smart Grid, vol. 11, no. 6, pp. 5370-5382, Nov. 2020.
J. Simeunovic, B. Schubnel, P.-J. Alet et al., “Spatio-temporal graph neural networks for multi-site PV power forecasting,” IEEE Transactions on Sustainable Energy, vol. 13, no. 2, pp. 1210-1220, Apr. 2022.
L. Cheng, H. Zang, Z. Wei et al., “Solar power prediction based on satellite measurements: a graphical learning method for tracking cloud motion,” IEEE Transactions on Power Systems, vol. 37, no. 3, pp. 2335-2345, May 2022.
M. Zhang, Z. Zhen, N. Liu et al., “Optimal graph structure based short-term solar PV power forecasting method considering surrounding spatio-temporal correlations,” IEEE Transactions on Industry Applications, vol. 59, no. 1, pp. 345-357, Jan.-Feb. 2023.
T. Yao, J. Wang, Y. Wang et al., “Very short-term forecasting of distributed PV power using GSTANN,” CSEE Journal of Power and Energy Systems, doi: 10.17755/CSEEJPES.2022.00110
L. Cheng, H. Zang, T. Ding et al., “Multi-meteorological-factor-based graph modeling for photovoltaic power forecasting,” IEEE Transactions on Sustainable Energy, vol. 12, no. 3, pp. 1593-1603, Jul. 2021.
J. Choi, J.-I. Lee, I.-W. Lee et al., “Robust PV-BESS scheduling for a grid with incentive for forecast accuracy,” IEEE Transactions on Sustainable Energy, vol. 13, no. 1, pp. 567-578, Jan. 2022.
Q. Li, Y. Xu, B. S. H. Chew et al., “An integrated missing-data tolerant model for probabilistic PV power generation forecasting,” IEEE Transactions on Power Systems, vol. 37, no. 6, pp. 4447-4459, Nov. 2022.
W. Liu, C. Ren, and Y. Xu, “PV generation forecasting with missing input data: a super-resolution perception approach,” IEEE Transactions on Sustainable Energy, vol. 12, no. 2, pp. 1493-1496, Apr. 2021.
Y. Nie, A. S. Zamzam, and A. Brandt, “Resampling and data augmentation for short-term PV output prediction based on an imbalanced sky images dataset using convolutional neural networks,” Solar Energy, vol. 224, pp. 341-354, Aug. 2021.
T. Polasek and M. Cadik, “Predicting photovoltaic power production using high-uncertainty weather forecasts,” Applied Energy, vol. 339, p. 120989, Jun. 2023.
S. Goudarzi, A. Asif, and H. Rivaz, “Fast multi-focus ultrasound image recovery using generative adversarial networks,” IEEE Transactions on Computational Imaging, vol. 6, pp. 1272-1284, Aug. 2020.
L. Han, K. Zheng, L. Zhao et al., “Content-aware traffic data completion in ITS based on generative adversarial nets,” IEEE Transactions on Vehicular Technology, vol. 69, no. 10, pp. 11950-11962, Oct. 2020.
Y. Li and Y. Zhang, “Digital twin for industrial internet,” Fundamental Research, vol. 4, no. 1, pp. 21-24, Jan. 2024.
A. Marot, A. Kelly, M. Naglic et al., “Perspectives on future power system control centers for energy transition,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 2, pp. 328-344, Mar. 2022.
S. Mihai, M. Yaqoob, D. V. Hung et al., “Digital twins: a survey on enabling technologies, challenges, trends and future prospects,” IEEE Communications Surveys and Tutorials, vol. 24, no. 4, pp. 2255-2291, Sept. 2022.
Y. Wu, K. Zhang, and Y. Zhang, “Digital twin networks: a survey,” IEEE Internet of Things Journal, vol. 8, no. 18, pp. 13789-13804, Sept. 2021.
C. Wang and Y. Li, “Digital-twin-aided product design framework for IoT platforms,” IEEE Internet of Things Journal, vol. 9, no. 12, pp. 9290-9300, Jun. 2022.
H. Elayan, M. Aloqaily, M. Guizani et al., “Digital twin for intelligent context-aware IoT healthcare systems,” IEEE Internet of Things Journal, vol. 8, no. 23, pp. 16749-16757, Dec. 2021.
D. Liu, Y. Du, W. Chai et al., “Digital twin and data-driven quality prediction of complex die-casting manufacturing,” IEEE Transactions on Industrial Informatics, vol. 18, no. 11, pp. 8119-8128, Nov. 2022.
H. Xu, A. Berres, S. B. Yoginath et al., “Smart mobility in the cloud: enabling real-time situational awareness and cyber-physical control through a digital twin for traffic,” IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 3, pp. 3145-3156, Mar. 2023.
L. Cascone, M. Nappi, F. Narducci et al., “DTPAAL: digital twinning pepper and ambient assisted living,” IEEE Transactions on Industrial Informatics, vol. 18, no. 2, pp. 1397-1404, Feb. 2022.