Abstract
Power flow (PF) is one of the most important calculations in power systems. The most widely used PF methods are the Newton-Raphson PF (NRPF) method and the fast-decoupled PF (FDPF) method. In smart grids, power generation and loads become intermittent and much more uncertain, and the topology changes more frequently, which may cause significant state shifts and make it difficult for NRPF or FDPF to converge. To address this problem, we propose a data-driven PF (DDPF) method based on historical/simulated data, which includes an offline learning stage and an online computing stage. In the offline learning stage, a learning model is constructed based on the proposed exact linear regression equations and is then solved by the ridge regression (RR) method to suppress the effect of data collinearity. In the online computing stage, no nonlinear iterative calculation is needed. Simulation results demonstrate that the proposed DDPF method has no convergence problem and is much more computationally efficient than NRPF or FDPF while ensuring similar calculation accuracy.
Power flow (PF) calculation, one of the most important calculations in power systems, is widely used in power system planning, operation, and control. The earliest proposed PF methods include the Gauss-Seidel method and the Newton-Raphson PF (NRPF) method [
In the emerging smart grids, model-driven PF (MDPF) methods may have significant shortcomings. First, power generation and loads become intermittent and much more uncertain, and the topology also changes frequently, resulting in significant state shifts. This makes it difficult for the widely used NRPF or FDPF to converge [
In smart grids, a large amount of historical/simulation data is available. Based on these data, data-driven methods are an effective means of improving the calculation accuracy and efficiency of traditional model-driven power system analysis methods [
DDPF methods generally need to use historical or simulation data as sample data and construct a learning model to obtain the mapping relationship between the boundary conditions of PF calculation and state variables. Through in-depth analysis, we find that the difficulty of DDPF modeling is mostly attributed to the nonlinearity of the PF equations. The nonlinear PF equations may lead to the over-learning problems of the existing DDPF learning models and affect their calculation accuracy and computing efficiency. If the original nonlinear PF equations can be accurately transformed into linear equations, the calculation accuracy and the computing efficiency of the existing DDPF learning models may be improved. This motivates us to propose a novel DDPF method based on exact linear regression equations (ELREs).
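The original nonlinear PF equations are as follows:
$$P_i = V_i \sum_{j \in k(i)} V_j \left( G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij} \right) \tag{1}$$
$$Q_i = V_i \sum_{j \in k(i)} V_j \left( G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij} \right) \tag{2}$$
where $P_i$ and $Q_i$ are the active and reactive power injections at bus $i$, respectively; $k(i)$ denotes the set of all buses directly connected to bus $i$ (including $i$); $V_i$ and $\theta_i$ are the voltage magnitude and phase angle of bus $i$, with $\theta_{ij} = \theta_i - \theta_j$; and $G_{ij} + \mathrm{j}B_{ij}$ is the element of the bus admittance matrix associated with buses $i$ and $j$. Without loss of generality, we assume that bus 1 is the slack bus and that the state variables are the voltage magnitudes and phase angles of the remaining buses.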
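Based on the nonlinear PF equations (1) and (2), the following linear PF equations can be obtained:
$$P_i = G_{ii}U_i + \sum_{j \in k(i),\, j \neq i} \left( G_{ij}K_{ij} + B_{ij}L_{ij} \right) \tag{3}$$
$$Q_i = -B_{ii}U_i + \sum_{j \in k(i),\, j \neq i} \left( G_{ij}L_{ij} - B_{ij}K_{ij} \right) \tag{4}$$
where $U_i$, $K_{ij}$, and $L_{ij}$ are the auxiliary state variables, $U_i = V_i^2$, $K_{ij} = V_i V_j \cos\theta_{ij}$, and $L_{ij} = V_i V_j \sin\theta_{ij}$.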
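The equivalence between the nonlinear PF equations and their linear counterparts in the auxiliary variables can be checked numerically. The following minimal Python sketch evaluates the bus injections of a small network in both ways. It assumes that the auxiliary variables are the products $V_i V_j\cos\theta_{ij}$ and $V_i V_j\sin\theta_{ij}$, whose diagonal entries reduce to $V_i^2$ and 0; the 3-bus line data and the helper names `build_ybus`, `nonlinear_injections`, and `auxiliary_injections` are illustrative assumptions and are not taken from this letter.

```python
import numpy as np

def build_ybus(n, lines):
    """Bus admittance matrix from (i, j, r, x) series branches (no shunts)."""
    Y = np.zeros((n, n), dtype=complex)
    for i, j, r, x in lines:
        y = 1.0 / complex(r, x)
        Y[i, i] += y
        Y[j, j] += y
        Y[i, j] -= y
        Y[j, i] -= y
    return Y

def nonlinear_injections(V, theta, G, B):
    """Polar-form PF equations: nonlinear in V and theta."""
    dtheta = theta[:, None] - theta[None, :]   # theta_ij = theta_i - theta_j
    VV = V[:, None] * V[None, :]               # V_i * V_j
    P = np.sum(VV * (G * np.cos(dtheta) + B * np.sin(dtheta)), axis=1)
    Q = np.sum(VV * (G * np.sin(dtheta) - B * np.cos(dtheta)), axis=1)
    return P, Q

def auxiliary_injections(K, L, G, B):
    """Same injections, linear in the auxiliary variables K and L.

    Entries of G and B are zero for unconnected bus pairs, so summing over
    all j is the same as summing over the neighbours k(i).
    """
    P = np.sum(G * K + B * L, axis=1)
    Q = np.sum(G * L - B * K, axis=1)
    return P, Q

# Illustrative 3-bus chain 0-1-2 (per-unit values chosen arbitrarily).
Ybus = build_ybus(3, [(0, 1, 0.01, 0.05), (1, 2, 0.02, 0.08)])
G, B = Ybus.real, Ybus.imag
V = np.array([1.00, 0.98, 1.02])
theta = np.array([0.00, -0.05, 0.03])

# Auxiliary variables (assumed definitions): K_ii = V_i^2 and L_ii = 0.
dtheta = theta[:, None] - theta[None, :]
K = V[:, None] * V[None, :] * np.cos(dtheta)
L = V[:, None] * V[None, :] * np.sin(dtheta)

P_nl, Q_nl = nonlinear_injections(V, theta, G, B)
P_aux, Q_aux = auxiliary_injections(K, L, G, B)
print(np.allclose(P_nl, P_aux), np.allclose(Q_nl, Q_aux))
```

Both evaluations agree to machine precision in this sketch, because the linear form only renames products of the original state variables.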
The following equation should be built for each PV bus:
(5) |
where , and is the given voltage magnitude of the
After the introduction of auxiliary state variables, the original nonlinear PF equations are accurately transformed into linear counterparts. For each PQ bus, linear PF equations (
Equations (
$$z = Ay \tag{6}$$
where $z$ is the auxiliary boundary vector; $y$ is the auxiliary state vector; and $A$ is a constant matrix whose elements are determined by the network topology and parameters.
According to (6), the following ELRE between the auxiliary boundary vector $z$ and the auxiliary state vector $y$ can be obtained:
$$y = Hz + e \tag{7}$$
where $H$ is the unknown constant mapping matrix, which depends on the network topology and parameters and is obtained in the offline learning stage; and $e$ is the possible error in the historical/simulation data, whose expected value is $E(e) = 0$.
Multiple historical or simulation PF snapshots are used to learn $H$. Suppose $s$ historical or simulation PF snapshots with the same topology are available, and each snapshot includes the given boundary conditions and the results of the PF calculation. The auxiliary boundary vectors and the auxiliary state vectors of all historical or simulation PF snapshots are aggregated into:
$$Z = [z_1, z_2, \ldots, z_s] \tag{8}$$
$$Y = [y_1, y_2, \ldots, y_s] \tag{9}$$
where and are the auxiliary boundary vector and the auxiliary state vector in the
According to (7), the ELRE between $Z$ and $Y$ is obtained as:
$$Y = HZ + E \tag{10}$$
where $E = [e_1, e_2, \ldots, e_s]$ is the error matrix.
The task of the offline learning stage is to aggregate the historical or simulation PF snapshots with the same topology, which will be addressed in Section III-D, and then to estimate the mapping matrix $H$, as addressed below.
If matrix $Z$ has full row rank, the weighted least squares (WLS) method can be used to estimate $H$ directly. However, the matrix $Z$ obtained from historical or simulation PF data may not have full row rank, i.e., collinearity may exist in the data. To suppress the effect of this collinearity, the ridge regression (RR) method is used to estimate the $H$ corresponding to each topology:
$$\hat{H} = \arg\min_{H} \left( \|Y - HZ\|_F^2 + \lambda \|H\|_F^2 \right) = YZ^{\mathrm{T}} \left( ZZ^{\mathrm{T}} + \lambda I \right)^{-1} \tag{11}$$
where $\hat{H}$ is the estimated value of $H$; $Y$ and $Z$ collect the auxiliary state vectors and the auxiliary boundary vectors of the snapshots, respectively; $\|\cdot\|_F$ is the Frobenius norm; $\lambda$ is a tuning parameter and $\lambda > 0$; and $I$ is an identity matrix whose dimension equals the number of rows of $Z$.
Model (11) can be run offline to obtain the estimated mapping matrix $\hat{H}$. The ratio of the number of historical/simulation PF snapshots $s$ to the number of auxiliary state variables affects the accuracy of $\hat{H}$. It is therefore preferable to ensure that sufficient historical/simulation PF snapshots are available.
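The offline estimation step can be sketched in a few lines of Python. The sketch below assumes that the snapshots are stacked column-wise (one snapshot per column), that the mapping matrix is the minimizer of a Frobenius-norm ridge objective, and that the data are synthetic; the function name `ridge_mapping`, the matrix sizes, and the value of the tuning parameter are illustrative assumptions rather than settings from this letter.

```python
import numpy as np

def ridge_mapping(Z, Y, lam):
    """Ridge estimate of H in Y ~ H Z (columns of Z and Y are snapshots).

    Minimizes ||Y - H Z||_F^2 + lam * ||H||_F^2, whose closed form is
    H_hat = Y Z^T (Z Z^T + lam I)^-1.
    """
    m = Z.shape[0]
    gram = Z @ Z.T + lam * np.eye(m)
    # gram is symmetric, so solving gram @ X = Z @ Y.T gives X = H_hat.T.
    return np.linalg.solve(gram, Z @ Y.T).T

rng = np.random.default_rng(0)

# Synthetic data: 4 boundary quantities, 3 state quantities, 50 snapshots.
Z = rng.normal(size=(4, 50))
Z[3] = Z[0]                      # exact collinearity: Z loses full row rank
H_true = rng.normal(size=(3, 4))
Y = H_true @ Z                   # noise-free snapshots for clarity

H_hat = ridge_mapping(Z, Y, lam=1e-3)

# Online use: one matrix-vector product per new boundary vector.
z_new = Z[:, 0]
y_new = H_hat @ z_new
print(np.linalg.matrix_rank(Z @ Z.T), np.max(np.abs(y_new - Y[:, 0])))
```

With a duplicated row, $ZZ^{\mathrm{T}}$ is singular and an unregularized least squares solve would fail or be ill-conditioned, whereas the regularized system remains solvable and still reproduces the snapshot data closely.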
The task of the online computing stage is to quickly find the historical or simulation PF snapshots with the same topology as the current snapshot, so as to obtain the corresponding $\hat{H}$ matrix. Then, the auxiliary state vector of the current snapshot can be obtained by:
$$\hat{y} = \hat{H}z \tag{12}$$
where $z$ and $\hat{y}$ are the auxiliary boundary vector and the estimated auxiliary state vector of the current PF snapshot, respectively.
Note that the PF calculation is carried out for the auxiliary state variables, and the estimated values of the original state variables can then be obtained from them. In short, the main computational burden of the online computing stage is a matrix multiplication, which is less than that of the iterative solution of nonlinear algebraic equations in the traditional NRPF or FDPF methods.
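One possible way to recover the original state variables from the auxiliary ones is sketched below. It is an assumption of this sketch, not a procedure stated in this letter, that the auxiliary variables are $U_i = V_i^2$ per bus and $K_{ij} = V_iV_j\cos\theta_{ij}$, $L_{ij} = V_iV_j\sin\theta_{ij}$ per branch, so that $V_i = \sqrt{U_i}$ and $\theta_i - \theta_j = \operatorname{atan2}(L_{ij}, K_{ij})$; the angles are then propagated from the slack bus along the branches. The function name `recover_states` and the 3-bus data are hypothetical.

```python
from collections import deque

import numpy as np

def recover_states(n, slack, U, branch_aux, theta_slack=0.0):
    """Recover V and theta from auxiliary variables (assumed definitions).

    U          : array of length n with U_i = V_i^2
    branch_aux : dict {(i, j): (K_ij, L_ij)} for each branch i-j
    """
    V = np.sqrt(U)
    adj = {i: [] for i in range(n)}
    for (i, j), (K, L) in branch_aux.items():
        d = np.arctan2(L, K)        # theta_i - theta_j
        adj[i].append((j, -d))      # theta_j = theta_i - d
        adj[j].append((i, d))       # theta_i = theta_j + d
    theta = np.full(n, np.nan)
    theta[slack] = theta_slack
    queue = deque([slack])
    while queue:                    # breadth-first walk from the slack bus
        a = queue.popleft()
        for b, delta in adj[a]:
            if np.isnan(theta[b]):
                theta[b] = theta[a] + delta
                queue.append(b)
    return V, theta

# Illustrative 3-bus chain 0-1-2 with bus 0 as the slack bus.
V_true = np.array([1.00, 0.98, 1.02])
theta_true = np.array([0.00, -0.05, 0.03])
U = V_true ** 2
branch_aux = {
    (i, j): (
        V_true[i] * V_true[j] * np.cos(theta_true[i] - theta_true[j]),
        V_true[i] * V_true[j] * np.sin(theta_true[i] - theta_true[j]),
    )
    for i, j in [(0, 1), (1, 2)]
}

V_rec, theta_rec = recover_states(3, 0, U, branch_aux)
print(np.allclose(V_rec, V_true), np.allclose(theta_rec, theta_true))
```

The recovery involves only square roots, arctangents, and a single pass over the branches, so it does not reintroduce any iteration into the online stage.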
Obviously, it is necessary to judge whether different snapshots have the same topology in both the offline learning stage and online computing stage.
According to [
(13) |
where ; ; ; and are the
In the offline learning stage, the historical/simulation snapshots with the same topology are clustered based on (13). In the online computing stage, the historical/simulation snapshots with the same topology as the current snapshot are also found based on (13), and then the corresponding $\hat{H}$ can be obtained.
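The grouping step can be illustrated with a short Python sketch. It computes the Pearson correlation coefficient (PCC) between boundary vectors with `numpy.corrcoef` and assigns a snapshot to an existing group when its PCC with the group representative reaches a threshold. The threshold rule, its value of 0.9, the function names `pcc` and `group_by_topology`, and the toy vectors are assumptions made only for illustration; they are not the decision rule or data of this letter.

```python
import numpy as np

def pcc(u, v):
    """Pearson correlation coefficient between two vectors."""
    return float(np.corrcoef(u, v)[0, 1])

def group_by_topology(vectors, threshold=0.9):
    """Greedy grouping: a vector joins the first group whose representative
    it correlates with at or above `threshold` (hypothetical rule)."""
    representatives = []
    labels = []
    for z in vectors:
        for g, rep in enumerate(representatives):
            if pcc(z, rep) >= threshold:
                labels.append(g)
                break
        else:
            representatives.append(z)
            labels.append(len(representatives) - 1)
    return labels

# Toy vectors standing in for auxiliary boundary vectors of six snapshots.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
b = np.array([3.0, -1.0, 4.0, -1.0, 5.0, -9.0])
c = np.array([2.0, 2.0, -3.0, -3.0, 1.0, 1.0])
snapshots = [a, b, 2 * a + 1, c, 0.5 * b - 2, 3 * c]

labels = group_by_topology(snapshots)
pcc_matrix = np.corrcoef(np.vstack(snapshots))   # 6 x 6 PCC matrix
print(labels)
```

In the online stage, the same comparison would be made between the current boundary vector and one representative per stored topology, after which the mapping matrix learned for the matching group is used.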
Remark: because there is no need for nonlinear iterative computation in the offline learning stage and online computing stage, the proposed DDPF method has no convergence problem.
The performance of the proposed DDPF method is tested and compared with NRPF, FDPF, and two existing DDPF methods in [
To verify the correctness of the topology identification method based on PCC, six sets of auxiliary boundary vectors are taken from the IEEE 300-bus system. They correspond to three different topologies, i.e., the six vectors form three pairs, and the two vectors in each pair have the same topology. The PCC correlation matrix is calculated and shown in

Fig. 1 PCC correlation matrix of 6 sets of auxiliary boundary vectors.
To test the effect of the number of snapshots used in the offline learning stage on the estimation accuracy of the proposed DDPF method, let . For different ratios (from 0.1 to 1.2), the mean absolute error of voltage magnitude (denoted by ) and the mean absolute error of phase angle (denoted by ) between the true values and the proposed results of DDPF are given in Figs.

Fig. 2 Mean absolute error of voltage magnitude versus ratio for different IEEE systems.

Fig. 3 Mean absolute error of phase angle versus ratio for different IEEE systems.
When , both and obtained by the proposed DDPF method are smaller than those given in [
Suppose the uncertainty of power generations and loads makes the voltage magnitudes vary randomly between 0.95 and 1.05 and the phase angles vary randomly between and . One hundred tests are performed and the state vector of the previous snapshot is used as the initial guess for NRPF and FDPF.
The number of times that NRPF and FDPF do not converge in 100 tests (denoted by ) is given in
Furthermore, in 100 continuous PF sampling snapshots, the topology is assumed to change continuously. For each sampling snapshot, NRPF, FDPF, the DDPF method in [
In the emerging smart grids, traditional MDPF methods may have convergence problems because of the intermittent and uncertain power generation and loads as well as the frequent changes of the topology. At the same time, the accuracy of traditional MDPF methods may also be affected by inaccurate parameters. To solve these problems, this letter proposes a novel DDPF method that includes an offline learning stage and an online computing stage and can make full use of historical/simulation PF big data. The proposed DDPF method has the following advantages: ① it has no convergence problem and has very high computational efficiency; ② it can deal with different topological structures and can adapt to the frequent changes of topological structures in smart grids.
Note that there may be gross errors in real-time PF snapshots caused by accidental faults or malicious attacks. In these circumstances, it may be impossible to obtain an accurate state vector by using the proposed DDPF method directly. To this end, the proposed method can be combined with state estimators to obtain the estimated state vector quickly and accurately. In addition, the proposed DDPF method can be extended to the integrated energy system (IES) to solve the problem of fast and accurate multi-energy flow calculation of IES under uncertainty. These are our future research areas.
References
W. F. Tinney and C. E. Hart, “Power flow solution by Newton’s method,” IEEE Transactions on Power Apparatus and Systems, vol. 86, no. 11, pp. 1449-1460, Nov. 1967.
W. F. Tinney and J. W. Walker, “Direct solution of sparse network equations by optimally ordered triangular factorization,” Proceedings of the IEEE, vol. 55, no. 11, pp. 1801-1809, Nov. 1967.
B. Stott and O. Alsac, “Fast decoupled load flow,” IEEE Transactions on Power Apparatus and Systems, vol. 93, no. 3, pp. 859-869, May 1974.
Y. Weng, R. Negi, C. Faloutsos et al., “Robust data-driven state estimation for smart grid,” IEEE Transactions on Smart Grid, vol. 8, no. 4, pp. 1956-1967, Jul. 2017.
C. Chen, M. Cui, F. Li et al., “Model-free emergency frequency control based on reinforcement learning,” IEEE Transactions on Industrial Informatics, vol. 17, no. 4, pp. 2336-2346, Apr. 2021.
J. Yu, Y. Weng, and R. Rajagopal, “Mapping rule estimation for power flow analysis in distribution grids,” in Proceedings of 2017 North American Power Symposium, Boston, USA, Sept. 2017, pp. 1-6.
Y. Liu, N. Zhang, Y. Wang et al., “Data-driven power flow linearization: a regression approach,” IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2569-2580, May 2019.
R. A. Jabr, “Radial distribution load flow using conic programming,” IEEE Transactions on Power Systems, vol. 21, no. 3, pp. 1458-1459, Aug. 2006.