Analytical Verification of Performance of Deep Neural Network Based Time-synchronized Distribution System State Estimation

Behrouz Azimian; Shiva Moshtagh; Anamitra Pal; Shanshan Ma


Behrouz Azimian (Student Member, IEEE), Shiva Moshtagh (Student Member, IEEE), Anamitra Pal (Senior Member, IEEE), Shanshan Ma

Updated: 2024-07-26

DOI: 10.35833/MPCE.2023.000432

Abstract

Recently, we demonstrated the success of a time-synchronized state estimator using deep neural networks (DNNs) for real-time unobservable distribution systems. In this paper, we provide analytical bounds on the performance of the state estimator as a function of perturbations in the input measurements. It has already been shown that evaluating performance based only on the test dataset might not effectively indicate the ability of a trained DNN to handle input perturbations. As such, we analytically verify the robustness and trustworthiness of DNNs to input perturbations by treating them as mixed-integer linear programming (MILP) problems. The ability of batch normalization in addressing the scalability limitations of the MILP formulation is also highlighted. The framework is validated by performing time-synchronized distribution system state estimation for a modified IEEE 34-node system and a real-world large distribution system, both of which are incompletely observed by micro-phasor measurement units.

Keywords

Deep neural network (DNN); distribution system state estimation (DSSE); mixed-integer linear programming (MILP); robustness; trustworthiness

I. Introduction

Distribution system state estimation (DSSE) utilizing micro-phasor measurement units (µPMUs) and deep neural networks (DNNs) is currently a topic of active research interest in the power system community. This is because their combination can provide high-speed situational awareness in real-time unobservable distribution systems, as already demonstrated in [1] and [2]. However, DNNs are vulnerable to perturbations in their inputs that lead to errors in their outputs, many of which are not captured during DNN training/validation [3], [4]. This is because DNN hyper-parameters are chosen to minimize validation loss and not to be robust against input perturbations. In this paper, we establish formal guarantees of DNN performance to theoretically prove that for a bounded perturbation in the inputs, the errors in the outputs of a trained DNN are also bounded.

Providing performance guarantees to machine learning (ML) algorithms (DNN being a type of ML algorithm) is particularly important for power system problems as the electric power grid is a mission-critical system. In line with this realization, [5] and [6] provided performance guarantees to ML algorithms for power system classification problems, while [7] focused on a power system regression problem. However, the ML algorithms investigated in [5]-[7] were shallow models. For example, [5] and [6] used support vector classification and decision trees, respectively, while [7] used linear regression and support vector regression. These shallow ML algorithms are not suitable for performing time-synchronized DSSE in modern distribution systems due to the severity of real-time unobservability and the increased uncertainty caused by behind-the-meter generation. Recently, a nontrivial certified lower bound of the minimum adversarial distortion has been derived for a general class of ML problems involving DNNs in [8]-[11], and for a DNN-based classification problem of power systems in [12]. The methodology developed in [12] was extended in [13] to provide worst-case performance guarantees by integrating physics-based constraints directly into the trained DNN. However, the scarcity of µPMUs in the distribution system makes it impossible to analytically relate their measurements with every state. In summary, to the best of our knowledge, no prior work has investigated the verification of DNNs for power system regression problems in which the underlying analytical relation between inputs and outputs is unavailable.

In this paper, we exploit the piecewise linear nature of the rectified linear unit (ReLU) activation function, which is one of the most commonly used activation functions, to analytically examine robustness and trustworthiness of a trained DNN for performing time-synchronized DSSE in µPMU-unobservable distribution systems. We first express the ReLU operation integrated with batch normalization (BN) as a mixed-integer linear programming (MILP) problem. We then introduce two sets of verification formulations. For robustness verification, we show that for a prespecified range of perturbation in the input, the deviation in the output from its reference value is guaranteed to lie within a bounded region. For trustworthiness verification, we find the minimum perturbation required in the input to generate a given error in the output. Extensive simulations carried out using a modified IEEE 34-node distribution system demonstrate that the proposed formulations ensure robustness and trustworthiness of DNNs for time-synchronized DSSE. We have also tested our verification formulations on a real-world large distribution system to demonstrate their scalability and widespread applicability.

The salient contributions of this paper are as follows.

1) Providing bounds on the estimation error of a DNN-based time-synchronized distribution system state estimator given a bounded perturbation in the input measurements (robustness verification).

2) Quantifying the minimum perturbation in the input measurements required to create a given amount of error in the state estimates (trustworthiness verification).

3) Integrating BN with the verification formulations to improve scalability.

The rest of the paper is structured as follows. Section II presents the time-synchronized DSSE using DNNs. The proposed formulations are developed in Section III. The results and discussion are presented in Section IV, while the conclusion is presented in Section V.

II. Time-synchronized DSSE Using DNNs

In [1], we formulated a Bayesian approach to perform time-synchronized DSSE for systems that are incompletely observed by µPMUs. The resulting minimum mean squared error (MMSE) estimator minimized the estimation error of each state, $x_{i}$, for a given µPMU measurement vector, $z$, as shown below:

$$\min_{\hat{x}_{i}(\cdot)} E\left(\| x_{i} - \hat{x}_{i}(z) \|^{2}\right) \Rightarrow \hat{x}_{i}^{*}(z) = E(x_{i} \mid z) \quad \forall i \in [1, M] \tag{1}$$

where $\hat{x}_{i}^{*}$ is the optimal solution of the optimization problem; and $M$ denotes the total number of states to be estimated. The conditional expectation in (1) can be expressed in terms of the conditional probability density, $p(x_{i} \mid z)$, and hence the joint probability density of $x_{i}$ and $z$, $p(x_{i}, z)$, as shown in (2).

$$E(x_{i} \mid z) = \int_{-\infty}^{+\infty} x_{i} \, p(x_{i} \mid z) \, d x_{i} = \int_{-\infty}^{+\infty} x_{i} \frac{p(x_{i}, z)}{p(z)} \, d x_{i} \tag{2}$$

For µPMU-unobservable distribution systems, the joint probability density function (PDF) of the µPMU data and all the voltage phasors (states) is unknown or impossible to specify, making direct computation of $\hat{x}_{i}^{*}(z)$ intractable. Even if the underlying joint PDF is known, finding a closed-form solution to (2) can be hard. The role of the DNN is to approximate the MMSE estimator as it has excellent approximation capabilities [14], i.e., the DNN for DSSE finds a mapping function that relates $x_{i}$ and $z$.
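To make (1) and (2) concrete, the following minimal sketch evaluates the conditional mean on a small discrete joint distribution and compares its mean squared error with that of an estimator that ignores the measurement. The probability table, the state values, and all function names are hypothetical and serve only as an illustration; they are not taken from the DSSE problem.

```python
# Toy discrete joint distribution p(x, z); all values are illustrative only.
# Keys are (state value x, measurement value z); probabilities sum to 1.
joint = {
    (0.95, 0): 0.10, (1.00, 0): 0.30, (1.05, 0): 0.10,
    (0.95, 1): 0.05, (1.00, 1): 0.15, (1.05, 1): 0.30,
}


def conditional_mean(joint, z):
    """E(x | z) = sum_x x * p(x, z) / p(z), the discrete analogue of (2)."""
    p_z = sum(p for (x, zz), p in joint.items() if zz == z)
    return sum(x * p for (x, zz), p in joint.items() if zz == z) / p_z


def mse(joint, estimator):
    """Expected squared error E((x - estimator(z))^2) under the joint pmf."""
    return sum(p * (x - estimator(z)) ** 2 for (x, z), p in joint.items())


def mmse_estimator(z):
    return conditional_mean(joint, z)


def constant_estimator(z):
    return 1.0  # ignores the measurement entirely


mse_mmse = mse(joint, mmse_estimator)
mse_const = mse(joint, constant_estimator)
print(mse_mmse, mse_const)
```

In this toy case the conditional mean attains the smaller error, which is the property that the DNN is trained to approximate when the joint PDF itself is unavailable.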

III. Proposed Formulations

A well-trained DNN that gives satisfactory response for unseen test data cannot necessarily ensure similar performance for all possible combinations of its inputs. This brings into question the rationale of using DNNs for decision-making in mission-critical systems. The goals of this paper are to address this concern through verification-based methodologies and build credibility of DNNs for time-synchronized DSSE.

A. Reformulating ReLU Activation Function with BN Based on MILP

The DNN used in this analysis is a fully-connected feed-forward neural network with $K$ hidden layers integrated with BN, each having $N$ neurons, as shown in Fig. 1, where $x$ denotes the outputs obtained from the inputs $z$ . The inputs and outputs are denoted by $z \in R^{N_{0}}$ and $x \in R^{M}$ , respectively, and $N_{0} ≪ M$ for unobservable distribution systems.

Fig. 1 DNN architecture with $K$ hidden layers.

The hidden layers, denoted by $h_{k} \in R^{N}$ , are equipped with ReLU activation function. The input to the ReLU activation function is a linear transformation of the output of the previous layer denoted by ${\hat{h}}_{k} \in R^{N}$ . The output of each neuron in every hidden layer is sent to a BN operator. Hence, for each layer, we have:

$$\hat{h}_{k} = W_{k} \, \mathrm{BN}_{k-1} + b_{k} \quad \forall k \in [1, K] \tag{3}$$

$$h_{k}^{n} = \max(\hat{h}_{k}^{n}, 0) \quad \forall k \in [1, K], \forall n \in [1, N] \tag{4}$$

$$\mathrm{BN}_{k}^{n} = \frac{γ_{k}^{n} (h_{k}^{n} - μ_{k}^{n})}{\sqrt{\mathrm{var}_{k}^{n} + ε}} + β_{k}^{n} \quad \forall k \in [1, K], \forall n \in [1, N] \tag{5}$$

where $W_{k}$ is the weight matrix; $b_{k}$ is the bias vector; $ε$ is a small configurable constant; $μ_{k}^{n}$ and $\mathrm{var}_{k}^{n}$ are the moving mean and moving variance of the batches observed during the training process, respectively; and $γ_{k}^{n}$ and $β_{k}^{n}$ are the scaling and offset factors, respectively. The values of $W_{k}$, $b_{k}$, $γ_{k}^{n}$, and $β_{k}^{n}$ are learned during the training process. Note that in (3), $\mathrm{BN}_{0} = z$. $μ_{k}^{n}$ and $\mathrm{var}_{k}^{n}$ are non-trainable variables that are updated each time the layer is called during the training process based on the given batch, as follows:

$$μ = η μ + (1 - η) E(\mathrm{batch}) \tag{6}$$

$$\mathrm{var} = η \cdot \mathrm{var} + (1 - η) \left(E(\mathrm{batch}^{2}) - E^{2}(\mathrm{batch})\right) \tag{7}$$

where $η$ is a configurable constant called momentum. In accordance with [8] and [9], the ReLU activation function defined in (4) is reformulated as an MILP problem. Defining the binary variable, $r_{k}^{n} \in \{0,1\}$, for all hidden layers $k \in [1, K]$, and each neuron $n \in [1, N]$, we can rewrite the ReLU activation function as:

$$\begin{cases} h_{k}^{n} = \max(\hat{h}_{k}^{n}, 0) \\ r_{k}^{n} \in \{0,1\} \end{cases} \Rightarrow \begin{cases} h_{k}^{n} \leq \hat{h}_{k}^{n} - \hat{h}_{k_{\min}}^{n} (1 - r_{k}^{n}) \\ h_{k}^{n} \geq \hat{h}_{k}^{n} \\ h_{k}^{n} \leq \hat{h}_{k_{\max}}^{n} r_{k}^{n} \\ h_{k}^{n} \geq 0 \end{cases} \tag{8}$$

where $r_{k}^{n}$ indicates whether the corresponding ReLU neuron is active (equals 1) or inactive (equals 0); and $\hat{h}_{k_{\max}}^{n}$ and $\hat{h}_{k_{\min}}^{n}$ are the upper and lower bounds on the input to the ReLU activation function, respectively. The two bounds are calculated using the following equations:

$$\hat{h}_{k_{\max}} = \max(W_{k}, 0) \cdot \max(h_{k_{\max}}, 0) + \min(W_{k}, 0) \cdot \max(h_{k_{\min}}, 0) + b_{k} \tag{9a}$$

$$\hat{h}_{k_{\min}} = \max(W_{k}, 0) \cdot \max(h_{k_{\min}}, 0) + \min(W_{k}, 0) \cdot \max(h_{k_{\max}}, 0) + b_{k} \tag{9b}$$

where $\hat{h}_{k_{\max}}$ and $\hat{h}_{k_{\min}}$ are vectors that contain the maximum and minimum input values for all the neurons in layer $k$, respectively.

For example, for the first hidden layer, $h_{k_{\max}}$ and $h_{k_{\min}}$ correspond to the normalized inputs, implying that $h_{1_{\max}} = \mathbf{1}$ and $h_{1_{\min}} = \mathbf{0}$. This means that the bounds on the first hidden layer are $\hat{h}_{1_{\max}} = \max(W_{1}, 0) \cdot \mathbf{1} + b_{1}$ and $\hat{h}_{1_{\min}} = \min(W_{1}, 0) \cdot \mathbf{1} + b_{1}$, respectively, i.e., the row-wise sums of the positive and negative entries of $W_{1}$ plus the bias. The bounds for the remaining layers can be obtained by applying (9a) and (9b) sequentially. Lastly, $x$ is calculated based on the linear transformation of the output of the BN operator in the last hidden layer, $\mathrm{BN}_{K}$.
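The interplay between the bounds in (9) and the big-M constraints in (8) can be sketched in a few lines of plain Python. The weights, biases, and function names below are hypothetical; the sketch only checks, for a single layer, that the four inequalities in (8) admit exactly the value $\max(\hat{h}, 0)$ once the bounds from (9) are supplied. It is not the Pyomo model used in the paper.

```python
def preactivation_bounds(W, b, h_min, h_max):
    """Apply (9a)-(9b) to one layer.

    W is a list of rows (one per neuron), b is the bias list, and h_min/h_max
    are element-wise bounds on the quantities entering the layer.
    """
    lower, upper = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, lo_in, hi_in in zip(row, h_min, h_max):
            lo_in, hi_in = max(lo_in, 0.0), max(hi_in, 0.0)
            hi += max(w, 0.0) * hi_in + min(w, 0.0) * lo_in
            lo += max(w, 0.0) * lo_in + min(w, 0.0) * hi_in
        lower.append(lo)
        upper.append(hi)
    return lower, upper


def relu_bigm_values(h_hat, h_hat_min, h_hat_max):
    """Return every value of h that (8) admits over r in {0, 1}."""
    values = set()
    for r in (0, 1):
        lo = max(h_hat, 0.0)                                  # h >= h_hat, h >= 0
        hi = min(h_hat - h_hat_min * (1 - r), h_hat_max * r)  # two upper bounds
        if lo <= hi:                                          # feasible for this r
            values.update({lo, hi})
    return values


# Hypothetical first hidden layer with two inputs normalized to [0, 1].
W1 = [[1.0, -2.0], [0.5, 0.5]]
b1 = [0.1, -0.3]
lower1, upper1 = preactivation_bounds(W1, b1, [0.0, 0.0], [1.0, 1.0])

# Pre-activations for one particular input, then the ReLU outputs via (8).
z = [0.2, 0.9]
h_hat = [sum(w * zj for w, zj in zip(row, z)) + bias for row, bias in zip(W1, b1)]
h = [relu_bigm_values(hh, lo, hi) for hh, lo, hi in zip(h_hat, lower1, upper1)]
print(lower1, upper1, h)
```

Each set in `h` contains a single element, which shows that for a valid pair of bounds the binary variable leaves no slack: the encoding reproduces the ReLU exactly and does not relax it.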

In the proposed formulation, one integer variable is assigned to each neuron for the linearization of the ReLU activation function. Hence, the number of integer variables quickly increases as deeper and wider DNNs are used. BN enables us to significantly reduce the number of integer variables while maintaining the high accuracy of the DNN. This is due to three main reasons [15], [16]. Firstly, the BN layer combats the vanishing gradient problem by normalizing activations, which prevents the creation of very small gradients during training. Secondly, it counters internal covariate shift by maintaining a consistent input distribution in each layer, which promotes stable and efficient training. Lastly, it acts as a form of regularization by introducing noise, which helps prevent overfitting and enhances the ability of the DNN to generalize to new data. These abilities of the BN layer help reduce the size of the DNN without compromising its accuracy, resulting in a direct improvement in the efficacy of the proposed formulations.
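A further point worth illustrating is that, once training is finished, the BN operator in (5) is an affine function of $h_{k}^{n}$ with fixed coefficients, so it contributes only linear constraints to the MILP and no additional integer variables. The sketch below implements (5) through (7) in plain Python. The default values of `eta` and `eps` and the function names are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def update_moving_stats(mu, var, batch, eta=0.99):
    """One training-time update of the moving mean and variance per (6)-(7)."""
    mean_b = sum(batch) / len(batch)
    mean_sq_b = sum(val * val for val in batch) / len(batch)
    new_mu = eta * mu + (1 - eta) * mean_b
    new_var = eta * var + (1 - eta) * (mean_sq_b - mean_b ** 2)
    return new_mu, new_var


def batch_norm(h, mu, var, gamma, beta, eps=1e-3):
    """Inference-time BN of one neuron's output per (5)."""
    return gamma * (h - mu) / math.sqrt(var + eps) + beta


# Moving statistics after one batch (hypothetical numbers).
new_mu, new_var = update_moving_stats(mu=0.0, var=1.0, batch=[1.0, 3.0], eta=0.9)

# With fixed statistics, BN is affine in h: equal steps in h give equal steps out.
outputs = [batch_norm(h, new_mu, new_var, gamma=1.5, beta=0.1) for h in (1.0, 2.0, 3.0)]
step_1 = outputs[1] - outputs[0]
step_2 = outputs[2] - outputs[1]
print(new_mu, new_var, step_1, step_2)
```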

B. Formulating Robustness for Regression Problems

In this subsection, we examine the robustness of DNNs for time-synchronized DSSE. Given a prespecified bounded perturbation in the input that deviates it from the actual value (reference), a DNN will be deemed robust if the output deviation is guaranteed to be within an acceptable threshold. This is pictorially depicted in Fig. 2 for a two-input one-output DNN.

Fig. 2 Robustness analysis for an operating condition described by $z_{r e f}$ .

The distortion-free input measurement, $z_{ref}$, can be perturbed in either or both dimensions, with the maximum perturbation limit indicated by the black rectangle. For the time-synchronized DSSE application, the limit is specified in terms of the permissible error in µPMU measurements (denoted by $α$). Now, for any randomly selected perturbed sample obtained during training/validation (purple dot), there can be a perturbed adversarial sample (black dot) encountered during testing that causes the maximum error in the output (red oval). The goal of the robustness analysis is to quantify this maximum output error given the prespecified input perturbation limit, $α$. To this end, the following formulation is proposed:

$$\max_{x} \| x - x_{ref} \|_{p} \tag{10a}$$

s.t.

$$\| z - z_{ref} \|_{p} \leq α \tag{10b}$$

$$(3), (5), (8) \tag{10c}$$

where $x_{ref}$ is a known output (e.g., true value of the state); and $p \geq 1$ denotes the order of the norm. Equations (3) and (5) are the DNN and BN constraints, respectively, and (8) denotes the linearized constraints of the ReLU activation function. By its very definition, (10) finds the maximum deviation in the outputs corresponding to an input perturbation that is bounded by $α$. Consequently, it provides formal guarantees of robustness of a DNN for any regression problem involving the ReLU activation function.

To verify the robustness of the trained DNN for all possible input combinations, we choose $p$ to be the infinity norm as it ensures that the DNN error is bounded throughout. For infinity norm maximization, we rewrite the objective function of (10a) as: $\max_{x} \max(| x_{1} - x_{1\,ref} |, | x_{2} - x_{2\,ref} |, \dots, | x_{M} - x_{M\,ref} |)$. Next, we convert the overall maximization problem to one maximization problem and one minimization problem for each state. Finally, we pick the maximum absolute value between the two as:

$$\max\left( \left| \max_{x_{i}} (x_{i} - x_{i\,ref}) \right|, \left| \min_{x_{i}} (x_{i} - x_{i\,ref}) \right| \right) \quad \forall i \in [1, M] \tag{11a}$$

s.t.

$$- α \leq z_{j} - z_{j\,ref} \leq α \quad \forall j \in [1, N_{0}] \tag{11b}$$

$$(3), (5), (8) \tag{11c}$$
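The structure of (11) can be illustrated on a deliberately tiny case that needs no MILP solver. For a network with a scalar input and one hidden ReLU layer, the output is piecewise linear in $z$, so the maximum and minimum of $x - x_{ref}$ over the interval in (11b) are attained at the interval ends or at a ReLU breakpoint. The sketch below exploits this fact as a stand-in for the branch-and-bound solve; the network weights, `z_ref`, `alpha`, and the choice of the unperturbed DNN output as `x_ref` are all hypothetical.

```python
def toy_dnn(z, w, b, v, c):
    """Scalar-input, one-hidden-layer ReLU network with hypothetical weights."""
    return sum(vn * max(wn * z + bn, 0.0) for wn, bn, vn in zip(w, b, v)) + c


def worst_case_error(z_ref, x_ref, alpha, w, b, v, c):
    """Exact value of (11a) for the toy network.

    Candidate maximizers are the interval ends and the ReLU breakpoints
    z = -b_n / w_n that fall inside [z_ref - alpha, z_ref + alpha].
    """
    lo, hi = z_ref - alpha, z_ref + alpha
    candidates = [lo, hi]
    for wn, bn in zip(w, b):
        if wn != 0.0 and lo < -bn / wn < hi:
            candidates.append(-bn / wn)
    deviations = [toy_dnn(z, w, b, v, c) - x_ref for z in candidates]
    # One maximization and one minimization, then the larger absolute value.
    return max(abs(max(deviations)), abs(min(deviations)))


w = [1.0, -1.0, 2.0]    # hidden-layer weights
b = [-0.5, 0.6, -1.2]   # hidden-layer biases
v = [1.0, 0.5, -1.0]    # output weights
c = 0.2                 # output bias

z_ref, alpha = 0.55, 0.1
x_ref = toy_dnn(z_ref, w, b, v, c)  # reference taken as the unperturbed output
worst = worst_case_error(z_ref, x_ref, alpha, w, b, v, c)
print(worst)
```

For realistic input dimensions the breakpoint enumeration no longer applies, which is why the formulation relies on the binary variables of (8) and a MILP solver instead.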

C. Formulating Trustworthiness of a DNN Trained for Regression Problems

In this subsection, we present a formulation for analyzing the trustworthiness of a DNN trained for regression problems. If a perturbation in the input vector is denoted by $δ$, the objective of trustworthiness analysis is to determine the smallest input perturbation (i.e., $\min(δ) = δ_{min}$) that can create erroneous results exceeding a threshold, $β$, in the output. Afterwards, we compare the resulting $δ_{min}$ with the actual level of perturbation allowed in the given application. For the time-synchronized DSSE problem, this will be the permissible error in µPMU measurements, specified by $α$. If $δ_{min}$ consistently surpasses $α$, we can trust that the errors in the estimates provided by our trained DNN are always within $β$.

The above-mentioned logic is pictorially depicted in Fig. 3 for a two-input, one-output DNN.

Fig. 3 Trustworthiness analysis for an operating condition described by $z_{r e f}$ .

In Fig. 3, the blue arrow represents the smallest input perturbation, $δ_{m i n}$ , which yields an error of $β$ in the output, while the green arrow represents $α$ . As long as the blue arrow is longer than the green arrow for all scenarios, we can say with certainty that the estimation error will never surpass $β$ . To find $δ_{m i n}$ , the following verification formulations are proposed.

$$\min δ \tag{12a}$$

s.t.

$$\| z - z_{ref} \|_{p} \leq δ \tag{12b}$$

$$\| x - x_{ref} \|_{p} \geq β \tag{12c}$$

$$(3), (5), (8) \tag{12d}$$

Equation (12) is solved in a manner similar to how (11) is solved in the previous subsection.
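As a hedged illustration of (12), the sketch below finds $δ_{min}$ for a toy scalar-input ReLU network by bisection on $δ$. It relies on the fact that the worst-case output error over $|z - z_{ref}| \leq δ$ cannot decrease as $δ$ grows, and it returns `None` when the threshold $β$ cannot be reached within a search limit, mirroring the case in which no $δ_{min}$ is found. The network weights, `delta_max`, and the tolerance are hypothetical, and the bisection replaces the MILP solve only for this one-dimensional example.

```python
def toy_dnn(z, w, b, v, c):
    """Scalar-input, one-hidden-layer ReLU network with hypothetical weights."""
    return sum(vn * max(wn * z + bn, 0.0) for wn, bn, vn in zip(w, b, v)) + c


def worst_case_error(z_ref, x_ref, delta, w, b, v, c):
    """Largest |x - x_ref| over |z - z_ref| <= delta (exact for this toy case)."""
    lo, hi = z_ref - delta, z_ref + delta
    candidates = [lo, hi]
    candidates += [-bn / wn for wn, bn in zip(w, b) if wn != 0.0 and lo < -bn / wn < hi]
    return max(abs(toy_dnn(z, w, b, v, c) - x_ref) for z in candidates)


def min_perturbation(z_ref, x_ref, beta, w, b, v, c, delta_max=1.0, tol=1e-9):
    """Smallest delta whose worst-case error reaches beta, or None if unreachable."""
    if worst_case_error(z_ref, x_ref, delta_max, w, b, v, c) < beta:
        return None
    lo, hi = 0.0, delta_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if worst_case_error(z_ref, x_ref, mid, w, b, v, c) >= beta:
            hi = mid
        else:
            lo = mid
    return hi


w, b, v, c = [1.0, -1.0, 2.0], [-0.5, 0.6, -1.2], [1.0, 0.5, -1.0], 0.2
z_ref = 0.55
x_ref = toy_dnn(z_ref, w, b, v, c)

delta_min = min_perturbation(z_ref, x_ref, beta=0.02, w=w, b=b, v=v, c=c)
unreachable = min_perturbation(z_ref, x_ref, beta=2.0, w=w, b=b, v=v, c=c)
print(delta_min, unreachable)
```

If the application permitted, say, $α = 0.01$ for this toy input, the computed `delta_min` would exceed it, and the output error could not reach $β = 0.02$ under admissible perturbations.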

D. Data Preparation and Implementation of Proposed Formulations

The DNN described in Section II is trained on historical smart meter data in the offline learning stage and tested using µPMU measurements in the online execution stage [1]. As the proposed formulations provide guarantees to the performance of this trained DNN, it is important to ensure that the learning and execution are done properly. For example, bad or missing data present in the smart meter measurements must be corrected, which was done by employing the data cleaning procedures described in [17]. Similarly, a Wald test-based bad data detection and correction procedure [18] was used to identify bad/missing µPMU measurements in real time. These two procedures ensured that inaccurate or incomplete data did not limit the performance of the proposed formulation.

To implement the proposed formulation, the following steps were performed.

Step 1: cleaned smart meter data were used to create a big dataset by solving many power flows. The voltage phasors obtained from the power flow solution were saved as $x_{r e f}$ . The voltage and current phasors corresponding to µPMU locations were saved as $z_{r e f}$ . $z$ was obtained from $z_{r e f}$ by using an appropriate perturbation limit, $α$ .

Step 2: the big dataset was split into training and testing datasets, and the former was used to train a DNN that finds a mapping function that relates $x_{r e f}$ and $z$ . Cleaned µPMU data were then fed into the trained DNN during the test to determine $x$ , and calculate the maximum testing dataset error.

Step 3: the robustness verification formulation given by (11) was solved. The solution gave the maximum error that the trained DNN would have for an input perturbation that was bounded by $α$ . If this solution is greater than the testing error found in Step 2, it means that the robustness formulation has found an input perturbation for which the trained DNN performs worse than what the testing accuracy indicates.

Step 4: the trustworthiness verification formulation given by (12) was solved for every node to determine $δ_{m i n}$ that was needed to create an error greater than $β$ in any of the state estimates. If no perturbation is found for a given node or the smallest perturbation found is greater than $α$ used in Step 1, the DNN is deemed trustworthy in the sense that it will not give an error greater than $β$ for any input perturbation that is bounded by $α$ .
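The decision logic in Steps 1, 3, and 4 can be summarized with a few helper functions. This is only a sketch: the function names are hypothetical, the uniform distribution used to draw the perturbation in Step 1 is an assumption (the text only states that the perturbation is limited by $α$), and the numerical inputs are illustrative.

```python
import random


def perturb(z_ref, alpha, rng):
    """Step 1: draw z with every entry within +/- alpha of z_ref.

    Uniform noise is an assumption made for this sketch.
    """
    return [zj + rng.uniform(-alpha, alpha) for zj in z_ref]


def robustness_gap(verified_max_error, test_max_error):
    """Step 3: a positive gap means verification found a worse case than testing."""
    return verified_max_error - test_max_error


def is_trustworthy(delta_min, alpha):
    """Step 4: None means no perturbation producing an error above beta was found."""
    return delta_min is None or delta_min > alpha


rng = random.Random(0)
z_ref = [1.0, 0.98, 0.35]          # illustrative normalized measurements
z = perturb(z_ref, alpha=0.0005, rng=rng)

print(z)
print(robustness_gap(verified_max_error=0.004, test_max_error=0.003))
print(is_trustworthy(None, 0.0005), is_trustworthy(0.0001, 0.0005))
```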

IV. Results and Discussion

The DNNs created based on the logic proposed in [1] were able to estimate the voltage magnitude and angle of every phase of all the nodes of µPMU-unobservable distribution systems. We used these DNNs to demonstrate the validity of the formulations proposed in Section III. The simulations were performed on a modified IEEE 34-node distribution system (henceforth called System S1) and a real-world distribution system located in a metropolitan city in the USA (henceforth called System S2 [19]). The optimization problems were solved using the branch and bound approach in the Pyomo environment with Gurobi 10.0.1 as the solver on a computer with 384 GB of RAM and an Intel Xeon Platinum 8368 CPU @ 2.40 GHz.

A. System S1

1) Robustness Results

System S1 has three distributed generation units with ratings of 135 kW, 60 kW, and 60 kW placed on nodes 822, 848, and 860, respectively. To train a DNN for DSSE, we created a database comprising input, $z$, and output, $x_{ref}$. The database was then split into training and testing datasets. Note that according to the sensor placement algorithm of [1], three µPMUs placed on nodes 800, 850, and 832 of System S1 are sufficient for performing time-synchronized DSSE using DNNs. A total vector error (TVE) of 0.05% [20] was employed to simulate erroneous voltage and current phasor measurements for these three µPMUs and determine $α$. For a TVE of 0.05%, $α$ was equal to 0.05% and 0.0005 for angle and magnitude, respectively. Next, the measurements were normalized and fed as inputs to the DNN, which had two hidden layers with BN and 30 neurons per layer. Lastly, to verify the robustness of the trained DNN for every node of the system, the optimization problem in (11) was solved $2M$ times for each operating condition.
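For readers unfamiliar with TVE, the sketch below computes it from its standard definition, $|X_{meas} - X_{true}| / |X_{true}|$, and lists the largest relative magnitude deviation and the largest angle deviation that are geometrically compatible with a given TVE limit. How $α$ was actually derived from the TVE in the simulations is not detailed in the text, so the mapping shown here is an assumption based on that definition alone, and the function names are hypothetical.

```python
import cmath
import math


def tve(measured, true):
    """Total vector error of a measured phasor relative to the true phasor."""
    return abs(measured - true) / abs(true)


def worst_case_deviations(tve_limit):
    """Largest relative magnitude error and angle error (rad) within a TVE limit.

    The measured phasor lies in a disc of radius tve_limit * |true| centered on
    the true phasor, so the magnitude can deviate by at most tve_limit (relative)
    and the angle by at most asin(tve_limit).
    """
    return tve_limit, math.asin(tve_limit)


true_phasor = cmath.rect(1.0, math.radians(-2.0))          # 1.0 p.u. at -2 degrees
magnitude_only = cmath.rect(1.0005, math.radians(-2.0))    # pure magnitude error
mag_limit, angle_limit = worst_case_deviations(0.0005)      # TVE of 0.05%

print(tve(magnitude_only, true_phasor), mag_limit, angle_limit)
```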

In the first set of simulations, we compared the output of (11) obtained from the trained DNN using the test dataset with the estimation errors produced by the same DNN for the same (test) dataset. Due to space limitations, we only present the results for phase A voltage magnitude estimation. However, similar observations were made when the magnitudes of the other phases as well as the angles were analyzed. The maximum absolute error over all the test samples is shown in Fig. 4. The blue line shows the maximum absolute error in the output of the DNN for every node where phase A is present. The orange line shows the maximum absolute error for the same nodes, found using (11). It is observed from the figure that for all the nodes, the maximum absolute error calculated using the robustness analysis is greater than the maximum absolute error calculated from the DNN output. This signifies the importance of robustness analysis for trained DNNs, as evaluating the performance of a trained DNN based only on the testing dataset may give overly optimistic results (as shown by the blue line). In addition, the maximum absolute errors found by (11) ensure that as long as the perturbation in the input is less than $α$ ($= 0.05\%$), the error in the state estimates is guaranteed not to exceed the values indicated by the orange line of Fig. 4.

Fig. 4 Comparison of DNN-based voltage magnitude estimation error and DNN robustness analysis for phase A of System S1.

Next, we tested the sensitivity of the proposed formulation to different sample sizes. The results presented in Fig. 5 were obtained from datasets of 7500, 10000, and 12500 samples, each of which was divided into 80% for training and 20% for testing. It can be observed from the figure that the maximum absolute errors found by robustness analysis progressively decrease as the number of samples increases, with the size of the decrease shrinking as the sample size grows. Since the computational complexity of the proposed formulation is a function of the number of samples (and we did not see much improvement after 12500 samples), we deduced from this analysis that a sample size of 12500 is sufficient for drawing valid conclusions for this system.

Fig. 5 Robustness analysis for System S1 with different data sizes.

2) Trustworthiness Results

To identify the minimum perturbation in µPMU measurements capable of inducing a prespecified error in the state estimates, (12) was employed. We assumed a maximum allowable error of 1% in voltage magnitude estimation, i.e., $β = 0.01$ . Then, we determined $δ_{m i n}$ specific to each operating condition that resulted in $β = 0.01$ . To ensure trust in our trained DNN for DSSE, we must verify whether $δ_{m i n}$ consistently exceeds $α$ ( $= 0.05$ %). The results obtained are shown in Table I.

TABLE I Trustworthiness Results for Phase A Voltage Magnitude Estimation for Each Node of System S1

Node No.	$δ_{min}$ (%)	Node No.	$δ_{min}$ (%)	Node No.	$δ_{min}$ (%)
800	-	822	4.6729	864	0.9582
802	-	824	-	834	0.9667
806	-	828	-	842	0.9668
808	10.5745	830	-	844	0.9671
812	6.2330	854	-	846	0.9665
814	4.7759	852	6.8931	848	0.9664
850	-	832	0.9553	860	0.9684
816	-	888	0.9445	836	0.9691
818	-	890	0.8169	840	0.9692
820	3.5278	858	0.9591	862	0.9692

Table I shows the least amount of input perturbation, expressed as a percentage of the magnitude measurement, that is required to achieve a phase A voltage magnitude estimation error greater than 1% for every node in System S1. For example, node 814 necessitates a perturbation of at least 4.7759% in the µPMU measurements to generate a 1% error in its phase A voltage magnitude estimation. For the nodes shown without any value, a $δ_{m i n}$ value that can create a 1% voltage magnitude estimation error in phase A could not be found, implying that it is not possible to induce an error of $β = 0.01$ in them. However, it should be noted that by lowering the value of $β$ , a corresponding $δ_{m i n}$ could be found for these nodes.

The findings presented in Table I instill trust in the trained DNN since for the nodes for which $δ_{m i n}$ was found, it always exceeded the value of $α$ (= $0.05$ %). It can be further implied from Table I that based on the assumed TVE accuracy of the µPMUs, we can be confident that the trained DNN will consistently provide estimates with a voltage magnitude error of no more than 1%. Finally, the results obtained in Fig. 4 and Table I prove the robustness and trustworthiness of the created DNN for System S1 and its ability to give accurate voltage magnitude estimations within the prespecified measurement error bounds.
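The comparison between Table I and $α$ can be reproduced with a short script. The values below are copied from Table I (nodes without a reported $δ_{min}$ are omitted); the variable names are illustrative.

```python
# delta_min (%) for phase A voltage magnitude, copied from Table I.
delta_min_percent = {
    808: 10.5745, 812: 6.2330, 814: 4.7759, 820: 3.5278, 822: 4.6729,
    852: 6.8931, 832: 0.9553, 888: 0.9445, 890: 0.8169, 858: 0.9591,
    864: 0.9582, 834: 0.9667, 842: 0.9668, 844: 0.9671, 846: 0.9665,
    848: 0.9664, 860: 0.9684, 836: 0.9691, 840: 0.9692, 862: 0.9692,
}
alpha_percent = 0.05  # permissible µPMU measurement error (%)

# The node with the smallest delta_min is the one closest to violating trust.
weakest_node = min(delta_min_percent, key=delta_min_percent.get)
margin = delta_min_percent[weakest_node] / alpha_percent
all_exceed_alpha = all(d > alpha_percent for d in delta_min_percent.values())

print(weakest_node, delta_min_percent[weakest_node], round(margin, 1), all_exceed_alpha)
```

The smallest reported $δ_{min}$ belongs to node 890 and is still more than 16 times larger than $α$.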

B. System S2

1) Robustness Results

In this test system, µPMU measurements were only possible at the feeder-head (see Fig. 6 for a depiction of this system).

Fig. 6 System S2 with one µPMU available at feeder-head.

Note that having real-time measurements only at the feeder-head is common for most distribution systems. Therefore, it is of interest to evaluate the performance of DNN-based time-synchronized DSSE in situations where additional µPMU placement cannot be done due to budget constraints. There are 642, 665, and 637 nodes in phase A, phase B, and phase C of this system, respectively, whose voltages must be estimated under different operating conditions. Additionally, this feeder has 766 household/commercial rooftop solar photovoltaic units, implying that it has a high penetration of renewable energy resources. Thus, this was an ideal test system for investigating the scalability of the proposed formulations as well as their ability to handle renewable-rich systems.

Due to the sheer number of nodes in this system, we show the difference between the robustness analysis and the DNN testing dataset results for phase A voltage magnitude estimation as a histogram in Fig. 7. Let the maximum absolute error found by robustness analysis for the $i^{th}$ node be denoted by $R_{i}$, and the maximum absolute error based on the testing dataset for the same node be denoted by $T_{i}$. The histogram shows the numerical difference between $R_{i}$ and $T_{i}$, i.e., $R_{i} - T_{i}$. The X-axis indicates the ranges of the differences, while the Y-axis denotes the number of nodes belonging to a given range. For example, there are 188 nodes for which the difference between $R_{i}$ and $T_{i}$ lies between $0.6 \times 10^{-4}$ p.u. and $2.6 \times 10^{-4}$ p.u. It is evident from the histogram that, for all nodes, the difference is always positive, as there are zero nodes for which $R_{i} - T_{i}$ is less than $0.6 \times 10^{-4}$ p.u. This indicates that robustness analysis consistently finds adversarial examples that result in larger errors than those observed on the testing dataset. Similar observations were made when the analyses were conducted for the other phase magnitudes and angles. This implies that when reporting the accuracy of the trained DNN for DSSE, it is more appropriate to present the robustness analysis results than only the testing dataset results. In summary, the proposed robustness analysis offers a means to provide guarantees for ReLU-based regression DNNs by accounting for the existence of potential errors beyond what is evident from the testing dataset alone.

Fig. 7 Difference in maximum absolute errors obtained using $R_{i}$ and $T_{i}$ for all 642 nodes.

2) Trustworthiness Results

The results of the trustworthiness analysis for System S2 are presented in the form of a histogram in Fig. 8. Since System S2 is a real-world distribution system that requires reliable operation, we have chosen a smaller value of $β = 0.002$ to ensure that the estimation error for all nodes and operating conditions never exceeds 0.2%. For 527 nodes of this system, no $δ_{min}$ was found using (12), similar to the nodes without any value in Table I. As such, they are not included in the figure. Hence, Fig. 8 shows the number of nodes for which a $δ_{min}$ was found (the remaining 115 of the 642 phase A nodes), as well as the corresponding value intervals. For example, the first non-zero bar indicates that there are 16 nodes in System S2 for which $δ_{min}$ lies between 5.4% and 8.6%. The fact that the minimum value in Fig. 8 is 5.4% shows that in order to have a 0.2% error in voltage magnitude estimation, an error of at least 5.4% must be injected into the input data. Since this value is much higher than the $α$ value of 0.05%, this analysis instills trust in our trained DNN for DSSE as it ensures that the estimation error will always be less than 0.2%.

Fig. 8 Trustworthiness results for phase A voltage magnitude estimation for 115 nodes of System S2.

C. Discussion

1) Strategies to Address Computational Burden

The proposed verification approach is built on an MILP formulation whose worst-case run-time complexity is exponential. Specifically, the computational burden of the verification formulations developed in Section III is of the order of $O(2^{KN} S)$, where $S$ denotes the total number of samples. Since optimization formulations with exponential time complexity face scalability issues when applied to problems involving large numbers of variables, we employed three strategies to lower the severity of this issue for the application considered here, namely, time-synchronized DSSE in µPMU-unobservable distribution systems.

1) Strategy 1: incorporation of BN layer. It was observed that by incorporating BN layers within the DNN, a smaller-sized DNN could give similar validation accuracy as a larger-sized DNN that did not have BN layers. For example, in the absence of BN, a DNN with eight hidden layers and 500 neurons/layer was needed for System S1 for achieving similar accuracy as the DNN with BN described in Section IV-A. This considerable reduction in the size of the DNN also reduced the number of integer variables in the proposed formulation by a significant amount, resulting in faster convergence of the optimization process.

2) Strategy 2: identifying always-dead and always-active neurons. During the training process, the output of each neuron was monitored. It was observed that some neurons were always-active, while others became always-dead, outputting zero. For these neurons, the corresponding binary integer variable, $r_{k}^{n}$ , was fixed to 1 and 0, respectively, reducing the number of integer variables required for post-training robustness and trustworthiness analyses. This strategy improves the efficacy of the proposed formulation even for large DNNs.

3) Strategy 3: effective parallelization. This was done in two ways. First, the verification formulations were run in parallel for the three phases (since DSSE is performed on a per-phase basis). Second, the verification formulations were specific to a given power system node and could be solved independently of any other node. Therefore, different nodes of the test system were grouped together into clusters (e.g., 100 nodes for System S2), and these clusters were solved in parallel. Since neither of these two ways depends on the DNN size, they can be easily applied to large DNNs for DSSE.
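The monitoring step of Strategy 2 can be illustrated with a minimal NumPy sketch. It assumes that the pre-activation values (the inputs to each ReLU) have been recorded for a set of samples. The function names `stable_neuron_masks` and `count_free_binaries` and the toy data are hypothetical and are not taken from the paper. Neurons whose pre-activation is positive for every sample are flagged as always-active (binary fixed to 1), those whose pre-activation is never positive are flagged as always-dead (binary fixed to 0), and only the remaining neurons keep a free binary variable.

```python
import numpy as np


def stable_neuron_masks(pre_activations):
    """Flag always-active and always-dead ReLU neurons in each hidden layer.

    pre_activations: list with one array per hidden layer, each of shape
    (num_samples, num_neurons), holding the inputs to the ReLU.
    Returns a list of (always_active, always_dead) boolean masks.
    """
    masks = []
    for z in pre_activations:
        always_active = np.all(z > 0.0, axis=0)   # ReLU passes every sample
        always_dead = np.all(z <= 0.0, axis=0)    # ReLU outputs zero for every sample
        masks.append((always_active, always_dead))
    return masks


def count_free_binaries(masks):
    """Number of neurons whose binary variable is not fixed to 0 or 1."""
    return sum(int(np.sum(~(active | dead))) for active, dead in masks)


# Toy recorded pre-activations: 3 samples, two hidden layers (made-up values).
layer1 = np.array([[1.0, -1.0, 0.5],
                   [2.0, -0.5, -0.3],
                   [0.1, -2.0, 0.7]])
layer2 = np.array([[-0.2, 0.4],
                   [-1.5, 0.9],
                   [-0.1, 0.3]])

masks = stable_neuron_masks([layer1, layer2])
total = layer1.shape[1] + layer2.shape[1]
free = count_free_binaries(masks)
print(f"binary variables: {total} before fixing, {free} after fixing")
print(f"worst-case activation patterns: 2**{total} -> 2**{free}")
```

Since the worst-case number of activation patterns grows as a power of two in the number of free binaries, every fixed neuron halves that count. The sketch only reproduces the monitoring of neuron outputs on recorded samples; whether a neuron keeps the same status for every admissible perturbed input is a separate question that it does not address.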

2) Practical Significance of Proposed Formulations

Trained ML models are prone to poor performance due to adversarial examples that can be present in the input space but are not seen during the training and testing stages. In the context of DNN-based DSSE, this can be due to the presence of non-Gaussian noise in the µPMU measurements. Most studies have modeled the noise as a zero-mean Gaussian distribution, but in reality, the noise model could be non-Gaussian [21]. In such a scenario, if a DNN is trained for Gaussian measurement noise but is tested on non-Gaussian measurement noise, then relying on the testing accuracy alone may not give the correct picture (in the worst case, it may give a false sense of security). Moreover, as long as the noise amount is low, it will not be detected/corrected by any bad data detection/correction module.

This is precisely the scenario where the proposed formulations become crucial. Consider the robustness verification formulation, which calculates the maximum error in the state estimation caused by a perturbation bounded by $α$ in the input measurements. Once this maximum error is found, one can say with certainty that as long as the input is corrupted by a perturbation bounded by $α$ (irrespective of the distribution that the perturbation may have), the DNN-based DSSE error will be less than or equal to this calculated maximum error value. This is a powerful result that clearly indicates the practical significance of the proposed formulation for mission-critical systems such as the electric power grid.
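The distribution-free nature of such a guarantee can be illustrated numerically. The sketch below is not the MILP formulation of Section III: it swaps in interval bound propagation, a simpler technique that yields a conservative (generally looser) upper bound rather than the exact worst case. The tiny random ReLU network, the additive per-input perturbation model, and all names (`ibp_output_bounds`, `forward`, `sample_perturbations`) are assumptions made for illustration, and the error is measured as the deviation from the unperturbed output. Perturbations drawn from uniform, Gaussian, and Laplace distributions are clipped to the bound $α$, and the largest observed deviation is compared with the certified bound.

```python
import numpy as np


def forward(weights, biases, x):
    """Plain forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h


def ibp_output_bounds(weights, biases, x, alpha):
    """Interval bound propagation: output interval that is valid for every
    input whose entries differ from x by at most alpha (additive bound)."""
    lo, hi = x - alpha, x + alpha
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lo, hi = W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b
        if i < len(weights) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi


rng = np.random.default_rng(0)
sizes = [4, 8, 8, 1]  # hypothetical toy network, not the DNN of the paper
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]
x = rng.normal(size=sizes[0])
alpha = 0.05

y_nominal = forward(weights, biases, x)
lo, hi = ibp_output_bounds(weights, biases, x, alpha)
certified = float(np.max(np.maximum(hi - y_nominal, y_nominal - lo)))


def sample_perturbations(kind, n):
    """Draw n perturbations from the named distribution, clipped to [-alpha, alpha]."""
    if kind == "uniform":
        e = rng.uniform(-alpha, alpha, size=(n, sizes[0]))
    elif kind == "gaussian":
        e = rng.normal(scale=alpha / 3, size=(n, sizes[0]))
    else:  # "laplace"
        e = rng.laplace(scale=alpha / 3, size=(n, sizes[0]))
    return np.clip(e, -alpha, alpha)


observed = {}
for kind in ("uniform", "gaussian", "laplace"):
    devs = [abs(float(forward(weights, biases, x + e)[0] - y_nominal[0]))
            for e in sample_perturbations(kind, 2000)]
    observed[kind] = max(devs)

print("certified bound:", certified)
for kind, worst in observed.items():
    print(f"{kind:8s} worst observed deviation: {worst}")
```

Whatever distribution generates the perturbation, the observed deviation cannot exceed the certified bound as long as the perturbation stays within $α$. An exact formulation such as the MILP one tightens the bound to the true worst case, at a higher computational cost.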

V. Conclusion

The black box nature of a DNN often makes power system operators question the validity of the obtained results. This is because although a well-trained DNN can make accurate predictions, it might lack requisite robustness to (adversarial) input perturbations. Therefore, providing formal guarantees of DNN performance is necessary for ensuring their acceptability in the power system. To this end, we formulated two verification problems, namely, robustness and trustworthiness, for DNN-based time-synchronized DSSE using MILP. The robustness formulation finds the maximum error in the output for a given bounded perturbation in the input, while the trustworthiness formulation finds the minimum perturbation in the input that is required to produce a given error in the output. The proposed formulations are also applicable to DNN-based regression problems in other domains.

The analytical verification of DNN-based time-synchronized DSSE was first performed on a modified IEEE 34-node system. It was confirmed that the robustness analysis conducted using the testing data on a DNN resulted in a higher error than what was observed when the same data were fed as an input into that DNN. This implied that relying on the outputs of the testing data alone (i.e., without a robustness analysis) might result in a false sense of security, which is dangerous for mission-critical systems such as power systems. Through trustworthiness analysis, it was observed that we could verify the adherence of the estimation error to a prespecified threshold that was based on the characteristics of the inputs (e.g., permissible error in µPMU measurements). Lastly, the applicability of the proposed formulation to a real-world, large-scale, and renewable-rich distribution system was demonstrated, confirming its practical utility. Future work will address the exponential run-time complexity of the proposed formulations by creating verification problems that do not involve MILP.

Disclaimer

This paper was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

References

[1] B. Azimian, R. S. Biswas, S. Moshtagh et al., “State and topology estimation for unobservable distribution systems using deep neural networks,” IEEE Transactions on Instrumentation and Measurement, vol. 71, p. 9003514, Jan. 2022.

[2] B. Zargar, A. Angioni, F. Ponci et al., “Multiarea parallel data-driven three-phase distribution system state estimation using synchrophasor measurements,” IEEE Transactions on Instrumentation and Measurement, vol. 69, no. 9, pp. 6186-6202, Sept. 2020.

[3] H. Xu, Y. Ma, H. Liu et al., “Adversarial attacks and defenses in images, graphs and text: a review,” International Journal of Automation and Computing, vol. 17, no. 2, pp. 151-178, Feb. 2020.

[4] Y. Chen, Y. Tan, and D. Deka, “Is machine learning in power systems vulnerable?” in Proceedings of 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Aalborg, Denmark, Oct. 2018, pp. 1-6.

[5] Z. Zhang, M. Sun, R. Deng et al., “Physics-constrained robustness evaluation of intelligent security assessment for power systems,” IEEE Transactions on Power Systems, vol. 38, no. 1, pp. 872-884, Jan. 2023.

[6] Z. Zhang, K. Zuo, R. Deng et al., “Cybersecurity analysis of data-driven power system stability assessment,” IEEE Internet of Things Journal, vol. 10, no. 17, pp. 15723-15735, Nov. 2023.

[7] Y. Liu, B. Xu, A. Botterud et al., “Bounding regression errors in data-driven power grid steady-state models,” IEEE Transactions on Power Systems, vol. 36, no. 2, pp. 1023-1033, Mar. 2021.

[8] V. Tjeng, K. Xiao, and R. Tedrake, “Evaluating robustness of neural networks with mixed integer programming,” in Proceedings of 7th International Conference on Learning Representations (ICLR), New Orleans, USA, May 2019, pp. 1-21.

[9] R. Anderson, J. Huchette, W. Ma et al., “Strong mixed-integer programming formulations for trained neural networks,” Mathematical Programming, vol. 183, no. 1, pp. 3-39, Jan. 2020.

[10] M. I. Khedher, H. I. Khedher, and M. Hadji, “Dynamic and scalable deep neural network verification algorithm,” in Proceedings of the 13th International Conference on Agents & Artificial Intelligence, Vienna, Austria, Feb. 2021, pp. 1122-1130.

[11] E. Wong, T. Schneider, J. Schmitt et al., “Neural network virtual sensors for fuel injection quantities with provable performance specifications,” in Proceedings of 2020 IEEE Intelligent Vehicles Symposium, Las Vegas, USA, Oct. 2020, pp. 1753-1758.

[12] A. Venzke and S. Chatzivasileiadis, “Verification of neural network behaviour: formal guarantees for power system applications,” IEEE Transactions on Smart Grid, vol. 12, no. 1, pp. 383-397, Jan. 2021.

[13] A. Venzke, G. Qu, S. Low et al., “Learning optimal power flow: worst-case guarantees for neural networks,” in Proceedings of 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Tempe, USA, Nov. 2020, pp. 1-7.

[14] S. Sonoda and N. Murata, “Neural network with unbounded activation functions is universal approximator,” Applied and Computational Harmonic Analysis, vol. 43, no. 2, pp. 233-268, Sept. 2017.

[15] S. Ioffe and C. Szegedy. (2015, Feb.). Batch normalization: accelerating deep network training by reducing internal covariate shift. [Online]. Available: http://arxiv.org/abs/1502.03167

[16] J. Bjorck, C. Gomes, B. Selman et al., “Understanding batch normalization,” in Proceedings of 32nd Conference Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada, Nov. 2018, pp. 1-12.

[17] B. Azimian, R. S. Biswas, and A. Pal, “Application of AI and machine learning algorithms in power system state estimation,” in Cyber-physical Power Systems: Challenges and Solutions by AI/ML, Big Data, Blockchain, IoT, and Information Theory Paradigms, New Jersey: Wiley-IEEE Press.

[18] K. R. Mestav, J. Luengo-Rozas, and L. Tong, “State estimation for unobservable distribution systems via deep neural networks,” in Proceedings of 2018 IEEE PES General Meeting (PESGM), Portland, USA, Aug. 2018, pp. 1-5.

[19] K. Montano-Martinez, S. Thakar, S. Ma et al., “Detailed primary and secondary distribution system model enhancement using AMI data,” IEEE Open Access Journal of Power and Energy, vol. 9, pp. 2-15, Jun. 2022.

[20] G. Cavraro and R. Arghandeh, “Power distribution network topology detection with time-series signature verification method,” IEEE Transactions on Power Systems, vol. 33, no. 4, pp. 3500-3509, Jul. 2018.

[21] A. C. Varghese, A. Pal, and G. Dasarathy, “Transmission line parameter estimation under non-Gaussian measurement noise,” IEEE Transactions on Power Systems, vol. 38, no. 4, pp. 3147-3162, Jul. 2023.

