1 Introduction

Power system state estimation (SE) is a core function of energy management system (EMS) [1]. As a data filter, SE can provide reliable data to EMS, thereby improve the safety of power network operation. With the development of smart grid, SE will play an increasingly important role in power system operation and control. The model and implementation of SE were firstly proposed by Schweppe and Wildes in [24] in 1970. From then on, various SE models have been proposed, among which the weighted least square (WLS) approach and the fast-decoupled SE (FDSE) approach [5] are the most popular SE methods; but WLS and FDSE are very sensitive to bad measurements, i.e. bad data (BD). To suppress the effect of bad measurements on the estimation value of WLS or FDSE, the largest normal residual (LNR) test [1] or other BD identification approaches based on residual [6, 7] are always used to detect and identify any existing bad measurements, but these methods cannot effectively identify conforming bad measurements and leverage bad measurements [1].

For retaining unbiased estimation despite the existence of different types of bad measurements, many robust state estimation (RSE) approaches have also been proposed, including the weighted least absolute value (WLAV) estimation [812], the quadratic-linear (QL) estimator [13, 14], and the quadratic-constant (QC) estimator [15, 16], etc. Recently, the maximum normal measurement rate (MNMR) estimator, the maximum exponential square (MES) estimator and the maximum exponential absolute value (MEAV) estimator have been suggested in [17], [18] and [19], respectively, showing good performance in suppressing the effect of bad measurements.

Mathematically, traditional SE models boil down to solve an optimization problem that is nonlinear and non-convex in general. Thus, several issues are inevitably concerned: ① The global optimum cannot be guaranteed theoretically, whereas a local optimum is meaningless for SE; ② Iterative algorithms are generally required for solving the nonlinear programs, the process may become time-consuming as the number of iterations increases and in certain severe circumstances, the iterative algorithms may fail to converge; ③ Leverage bad measurements will affect the estimation performance at certain extent. In literature, some research work has been devoted to address these issues. For example, a backtracking and trust region based method is proposed to enhance the convergence properties of SE in [20]. In [21], a novel RSE approach using mixed integer nonlinear programming (MINP) formulation is proposed. Since it is not susceptible to leverage bad measurements, this approach shows strong robustness even in pathological cases. However, the above three problems have not yet been comprehensively solved due to the intrinsic non-convexity and nonlinearity of traditional SE models.

Reference [22] proposes a factorized approach for WLS, further giving rise to a bilinear state estimation approach [23]. Both of the approaches actually imply an exact linearization scheme of measurement equations. Motivated by this, we propose a mixed integer linear programming (MILP) formulation for RSE. The main idea is to use the exactly linearized measurement equations instead of the original nonlinear ones in the MINP model. Since the global optimum of MILP can be efficiently obtained by employing mature solvers, such as CPLEX, this approach has a very good prospect of online application.

Main contributions of our paper are twofold: ① A methodology for obtaining the global optimum of SE is proposed, and a MILP model for RSE is presented; ② A mixed integer quadratic programming (MIQP) formulation for comprehensive SE is proposed.

The rest of this paper is organized as follows: traditional SE models are shortly reviewed in Section 2. Section 3 proposes a MILP formulation for RSE and a MIQP formulation for comprehensive SE. Case studies on a rudimentary 3-bus system and several IEEE benchmark systems are given in Section 4. Finally, conclusions are drawn in Section 5.

2 Short reviews on traditional SE models

2.1 Traditional nonlinear measurement equation

The state variables of power systems generally refer to the voltage magnitudes and the phase angles of all buses (except for the reference bus). For traditional SE, the relationship between the state variables and measurements can be described by the following nonlinear measurement equation

$$ {\varvec{z}} = {\varvec{h}}({\varvec{x}}) + {\varvec{e}} $$
(1)

where, \( {\varvec{z}} \) is a m-dimensional measurement vector, usually including the power flows, bus power injections, bus voltage magnitudes, etc.; \( {\varvec{x}} \) the n-dimensional state vector (voltage magnitudes and phase angles) with n = 2N − 1; N the number of buses; \( {\varvec{h}}: R^{n} \to R^{m} \) the nonlinear vector function mapping the state vector to the measurement vector; \( {\varvec{e}} \) is a m-dimensional measurement error vector with variance R (an m × m diagonal matrix). The details about the measurement equation can be found in [1].

2.2 Traditional SE models

Based on (1), most of the existing SE models can be unified and boiled down into a general nonlinear optimization model as

$$ \begin{aligned} {\text{Min(or Max) }} & \quad J({\varvec{x}}) = \sum\nolimits_{i =1}^{m} {f (r_{i} )} \\ {\text{s.t.}} & \quad {\varvec{z}} = {\varvec{h}}({\varvec{x}}) + {\varvec{r}} \\ \end{aligned} $$
(2)

where r is a m-dimensional residual vector; r i the ith component of r; and f (r i ) a certain function of the residual r i , depending on different SE models.

Various solvers can be employed to solve the nonlinear optimization problem. However, from the mathematical point of view, two key issues should be attended to. Firstly, (2) is a nonlinear and non-convex optimization problem in general because of (1). Thus, there may be multiple local optimums and it is not easy to obtain the global optimum with gradient-based solvers in theory (such as the Gauss-Newton method or interior point method). Secondly, as the model is nonlinear, iterative algorithms are required. Consequently, solving SE problem may have convergence issue. These inherent drawbacks might limit the applications of existing SE methods based on (2).

2.3 MINP model

The MINP formulation of RSE proposed by [21] is given by

$$ \begin{gathered} {\text{Min.}} \, J({\varvec{x}},{\varvec{b}}) = \sum\nolimits_{i = 1}^{m} {b_{i} } \hfill \\ {\text{s.t.}} \ \left\{ \begin{gathered} h_{i} ({\varvec{x}}) \le z_{i} + t_{i}^{+} + Mb_{i} \hfill \\ h_{i} ({\varvec{x}}) \ge z_{i} - t_{i}^{-} - Mb_{i} \hfill \\ \end{gathered} \right. \quad i = 1,2, \ldots ,m \hfill \\ \end{gathered} $$
(3)

where z i and h i are the ith components of the measurement vector \( {\varvec{z}} \) and \( {\varvec{h}} \), respectively; \( t_{i}^{+}/t_{i}^{ - } \) upper/lower tolerance for measurement i; M an arbitrarily large positive number; b i binary variable for measurement i. For bad measurement, b i  = 1, else b i  = 0; and \( {\varvec{b}} = [b_{1} ,b_{2} , \ldots ,b_{m} ]^{\text{T}} \in R^{m} \). For details, please refer to [21].

Apparently, in the MINP model, a tolerance range is associated with each measurement and an estimation value of state vector is chosen to maximize the number of estimated measurements that remain within tolerance [21]. Reference [21] points out that MINP is not susceptible to leverage bad measurements and it shows strong robustness even in pathological cases.

Mathematically, the aforementioned (in the introduction) drawbacks of MINP model stem from its nature of nonlinearity. That means, if the nonlinear measurement equations can be transformed to linear ones, then the MINP model will be converted to MILP model, and the above drawbacks can be overcome. This motivates us to develop a MILP formulation for RSE.

3 Proposed MILP formulation for RSE

3.1 Linear measurement equation

As mentioned above, the approaches in [22] and [23] essentially boil down to an exact linearization scheme of measurement equations. It is achieved by introducing an auxiliary state vector:

$$ {\varvec{y}} = \{ U_{i} ,K_{ij} ,L_{ij} \} $$
(4)

where \( \varvec{y} \in R^{N + 2b} \) is the auxiliary state vector; N the number of buses; b the number of branches; U i  = v 2 i the square of voltage magnitude; and K ij  = v i v j  cos θ ij and L ij  = v i v j  sin θ ij the contribution of branch ij (from bus i to bus j) to y.

At the same time, an auxiliary measurement vector is defined as

$$ {\tilde{\varvec{z}}} = \{ U_{i} ,P_{ij} ,Q_{ij} ,P_{i} ,Q_{i} ,I_{ij}^{2} \} $$
(5)

where \( {\tilde{\varvec{z}}} \in R^{m} \) is the auxiliary measurement vector; P ij and Q ij the power flow measurements of branch ij (from bus i to bus j); and \( I_{ij}^{2} \) the square of current magnitude of branch ij (from bus i to bus j).

Based on the auxiliary state vector and the auxiliary measurement vector, (1) can be converted to

$$ {\tilde{\varvec{z}}} = {\varvec{By}} + {\tilde{\varvec{e}}} $$
(6)

where \( {\varvec{B}} \in R^{m \times (N + 2b)} \) is the constant Jacobian matrix and \( \tilde{\varvec{e}} \in R^{m} \) is the auxiliary noise vector. The details can be found in [22] and [23].

Comparing (6) with (1), it can be found that the original nonlinear measurement equation is converted to be a linear one through the introduction of auxiliary state vector and auxiliary measurement vector. If (6) is used for the modeling of SE, then the issues of traditional SE models (such as global optimum and convergence problems) might be solved.

3.2 MILP model

1) First linear stage

According to [21], each auxiliary measurement (normal or bad data) can be represented by a pair of inequality constraints:

$$ \left\{ \begin{gathered} {\varvec{B}}_{i} {\varvec{y}} \le \tilde{z}_{i} + \tilde{t}_{i}^{ + } + Mb_{i} , \quad \, i = 1,2, \ldots ,m \hfill \\ {\varvec{B}}_{i} {\varvec{y}} \ge \tilde{z}_{i} - \tilde{t}_{i}^{-} - Mb_{i} , \quad \, i = 1,2, \ldots ,m \hfill \\ \end{gathered} \right. $$
(7)

where \( \tilde{z}_{i} \) is the ith component of \( {\tilde{\varvec{z}}} \); \( {\varvec{B}}_{i} \) the ith row of \( {\varvec{B}} \); and \( \tilde{t}_{i}^{ + } /\tilde{t}_{i}^{ - } \) the upper/lower tolerance for ith auxiliary measurement.

According to the formulation methodology of MINP, the criteria for the estimation of the auxiliary state vector can be selecting a state vector \( {\varvec{y}} \) which make as few auxiliary measurements as possible being ignored, thus the following model can be got as

$$ \begin{gathered} {\text{Min}} .\, { }J({\varvec{y}},{\varvec{b}}) = \sum\nolimits_{{i{ = }1}}^{m} {b_{i} } \hfill \\ {\text{s.t.}} \ \left\{ \begin{array}{l} {\varvec{B}}_{i} {\varvec{y}} \le \tilde{z}_{i} + \tilde{t}_{i}^{ + } + Mb_{i} \hfill \\ {\varvec{B}}_{i} {\varvec{y}} \ge \tilde{z}_{i} - \tilde{t}_{i}^{-} - Mb_{i} \hfill \\ \end{array} \right. \quad i = 1,2, \ldots ,m \hfill \\ \end{gathered} $$
(8)

Compared with traditional SE methods based on (2) (including MINP model), (8) is a MILP problem, thus it possesses the following advantages: ① the global optimal solution can be guaranteed mathematically; ② there is no convergence problem for (8); ③ it can effectively suppress bad measurements (including leverage bad measurements), which will be proved by the tests in the next section.

Eq. (8) can be efficiently solved using CPLEX. As soon as (8) is solved, the estimation value of the auxiliary state vector can be gotten, and then the method proposed in [22, 23] can be used to obtain the estimation value of the original state vector described by (1). However, a simple and convenient alternative nonlinear transformation and an alternative second linear stage will be presented below.

2) Nonlinear transformation

An alternative nonlinear transformation is given as

$$ {\tilde{\varvec{u}}} = {\tilde{\varvec{g}}}({\varvec{y}}) $$
(9)

where \( {\tilde{\varvec{u}}} = \{v_{i} ,\theta_{ij}^{(c)} ,\theta_{ij}^{(s)} \} \in R^{(N + 2b)} \) is the pseudo-measurement vector obtained by nonlinear transformation, \( v_{i} = (U_{i} )^{0.5}, \theta_{ij}^{(c)} = {\text{arccos}}(K_{ij} / (v_{i} v_{j} )), \theta_{ij}^{(s)} = {\text{arcsin}}L_{ij} / (v_{i} v_{j})).\)

Through the above nonlinear transformation, the bus voltage magnitudes of all the buses as well as the phase angle differences between both ends of all the lines (two values for each line) can be obtained. The next task is to estimate the bus voltage angles of all the buses from the angle differences of all the lines, which will be completed in the second linear stage.

3) Second linear stage

An alternative second linear model is given as

$$ {\varvec{\theta}}_{b} = {\varvec{A\theta }} $$
(10)

where \( \varvec{\theta}_{b} = \{ {( {\theta_{ij}^{(c)} + \theta_{ij}^{(s)} } )/2} \} \in R^{b} \) is the pseudo- measurement vector; \( {\varvec{\theta}} = [\theta_{2} ,\theta_{3} , \ldots ,\theta_{N} ]^{\text{T}} \in R^{N - 1} \) the voltage angle vector (bus 1 is set as the reference bus); and \( {\varvec{A}} = [a_{ij} ] \) (1 ≤ i ≤ b, 1 ≤ j ≤ N  1) the reduced branch-bus incidence matrix (without the column corresponding to the reference bus). Then \( {\varvec{A}} = [a_{ij} ] \) is a \( b \times (N - 1) \) matrix with

$$ a_{ij} = \left\{\begin{array}{ll} 1& {\text{if bus}}\;j\; {\text{is the sending terminal of branch}} \; i \\ -1 &{\text{if bus}} \; j \; {\text{is the receiving terminal of branch}} \; i \\ 0 & {\rm{otherwise}} \end{array} \right.$$

However, since \( {\varvec{\theta}}_{b} \) is obtained by the nonlinear transformation of MILP, (10) does not hold in general. Thus, we can alternatively regard \( {\varvec{\theta}}_{b} \) as one type of special “measurements” with noises and \( {\varvec{\theta}} \) as an unknown state variable vector. Then we have a fictional measurement equation as

$$ {\varvec{\theta}}_{b} = {\varvec{A\theta }} + {\varvec{\tau}} $$
(11)

where τ is the b-dimensional noise vector.

Apparently, θ can be obtained by solving SE problem, the WLS problem is used here with the model as

$$ {\text{Min }}J ({\varvec{\theta}} )= ({\varvec{\theta}}_{b} - {\varvec{A\theta }})^{\text{T}} {\varvec{W}}_{\theta } ({\varvec{\theta}}_{b} - {\varvec{A\theta }}) $$
(12)

where W θ is the weighted matrix.

Without loss of generality, assume W θ  = I. To minimize (11), we have

$$ {\varvec{A}}^{\text{T}} {\varvec{A\theta }} = {\varvec{A}}^{\text{T}} {\varvec{\theta}}_{b} $$
(13)

As the gain matrix in (13) always has a very small condition number, it can be directly solved by using the Cholesky decomposition and conventional forward/back substitutions; note that (12) is a quadratic programming (QP) problem, and it also can be solved by CPLEX.

To summarize, the overall procedure of the proposed MILP algorithm is presented as follows.

  • Step 1 (forming matrix B and A): Form the constant Jacobian matrix B and the reduced branch-bus incidence matrix A.

  • Step 2 (solving MILP): Solve the MILP (8) by CPLEX software.

  • Step 3 (nonlinear transformation): Obtain all the bus voltage magnitudes and the phase angle differences between both ends of all the lines by (9).

  • Step 4 (solving QP): Solve (11) by CPLEX software.

  • Step 5: END.

Note that K ij and L ij is relaxed in the first linear stage of MILP, thereby affecting the estimation accuracy of MILP. To solve this problem, the two-stage algorithm presented in [21] can be employed: ① to identify and eliminate bad measurements by the overall algorithm of MILP given in Table 1; ② regarding the estimation value of MILP as the initial value, to process WLS algorithm on the polished measurements. For ease of expression, we call the above method MILP+WLS, which contains an additional WLS estimation following the run of MILP. In this way, the three problems of traditional SE models might be comprehensively solved.

Table 1 The network data of the 3-bus system

Remarks are as follows.

1) In order to further improve the computation efficiency of (8), b i can be viewed as a continuous variable within 0 to 1, and then the original MILP problem is converted to be a linear programming (LP) problem, thereby greatly improving the computational efficiency. By solving this LP, an estimation value of bi close to 1 is an indication of the corresponding measurement being bad measurement, while a value of b i approaching 0 implies a normal measurement.

2) Reference [24] suggests a comprehensive RSE approach that simultaneously considers bad measurement identification, parameter estimation and topology errors identification. When (6) are applied to the generalized SE model presented in [24], only the inequalities corresponding to the measurements related to suspicious parameters are nonlinear, while all other inequalities are linear.

The uncertainty of the suspicious parameters can be represented by a pair of linear inequality constraints as

$$ \left\{ \begin{gathered} \hat{p}_{k} \le p_{k} + t_{k}^{ + } + Mb_{k} , \, \quad k = 1,2, \ldots ,m_{p} \hfill \\ \hat{p}_{k} \ge p_{k} - t_{k}^{-} - Mb_{k} , \,\quad k = 1,2, \ldots ,m_{p} \hfill \\ \end{gathered} \right. $$
(14)

where p k is the kth suspicious parameter; \( \hat{p}_{k} \) is its estimated value; t + k /t k upper/lower tolerance for kth suspicious parameter; m p the number of the suspicious parameters; and b k binary variable, for wrong parameter, b k  = 1, else b k  = 0.

Suppose the topology status of the lth link is suspicious, the uncertainty of this suspicious link can be represented by

$$ \left\{ \begin{aligned} & - Mb_{l} \le U_{i} - U_{j} \le Mb_{l} \\ & - Mb_{l} \le L_{ij} \le Mb_{l} \\ & - M(1 - b_{l} ) \le P_{ij} \le M(1 - b_{l} ) \\ & - M(1 - b_{l} ) \le Q_{ij} \le M(1 - b_{l} ) \\ \end{aligned} \right. \quad l = 1,2, \ldots ,m_{s} $$
(15)

where i and j are the sending terminal and the receiving terminal of the lth link, respectively; m s the number of the suspicious links; and b l binary variable, for open link, b l  = 1, else b l  = 0.

Then a comprehensive RSE model can be obtained as

$$ \begin{array} {l}{\text{Min}} . { }\sum\nolimits_{i = 1}^{m} {b_{i} } + \sum\nolimits_{k = 1}^{{m_{p} }} {b_{k} } + \sum\nolimits_{l = 1}^{{m_{s} }} {b_{l} } \\ {\text{s}} . {\text{t}} . { }\left\{\! \begin{array}{ll} \tilde{z}_{i} - \tilde{t}_{i}^{-} - Mb_{i} \le {\varvec{B}}_{i} {\varvec{y}} \le \tilde{z}_{i} + \tilde{t}_{i}^{ + } + Mb_{i} , \,\quad i = 1,2, \ldots ,m \\ p_{k} - t_{k}^{-} - Mb_{k} \le \hat{p}_{k} \le p_{k} + t_{k}^{-} + Mb_{k} , \,\quad k = 1,2, \ldots ,m_{p} \\ \left. \begin{array}{ll} \!\!- Mb_{l} \le U_{i} - U_{j} \le Mb_{l} \\ \!\! - Mb_{l} \le L_{ij} \le Mb_{l} \\ \!\! - M(1 - b_{l} ) \le P_{ij} \le M(1 - b_{l} ) \\ \!\! - M(1 - b_{l} ) \le Q_{ij} \le M(1 - b_{l} ) \\ \end{array} \right\}, \quad l = 1,2, \ldots ,m_{s} \\ b_{i} ,b_{k} ,b_{l} = 0 \; {\hbox{or}} \; 1 \\ \end{array} \right. \\ \end{array} $$
(16)

Note (16) is not a MILP problem, but a MINP problem. However, since the nonlinearity only involves quadratic terms, it is actually a MIQP problem that can also be efficiently solved using CPLEX. Specifically, if only bad measurements and topology errors are considered, the resulting model is still a MILP problem, since no nonlinearity will be involved.

3.3 Observability analysis

In this subsection, the network observability analysis of MILP is discussed. Because the reduced branch-bus incidence matrix A retains full column rank, the observability condition is naturally satisfied in the second linear stage of MILP. Hence, we only need to discuss the observability condition of the first stage. Since the measurement errors have no effect on the observability analysis,the observability analysis for MILP can be simplified to consider the following measurement equation

$$ {\tilde{\varvec{z}}} = \left[ {\begin{array}{l} {{\tilde{\varvec{z}}}_{R} } \\ {{\tilde{\varvec{z}}}_{A} } \\ \end{array} } \right] = \left[ {\begin{array}{ll} {{\varvec{B}}_{RR} } & {{\varvec{B}}_{RA} } \\ {{\varvec{B}}_{AR} } & {{\varvec{B}}_{AA} } \\ \end{array} } \right]\left[ {\begin{array}{l} {{\varvec{y}}_{R} } \\ {{\varvec{y}}_{A} } \\ \end{array} } \right] $$
(17)

where \( {\varvec{y}}_{R} = \{ U_{i} ,K_{ij} \}, \ {\varvec{y}}_{R} \in R^{N + b} \); \( {\varvec{y}}_{A} = \{ L_{ij} \}, \ {\varvec{y}}_{A} \in R^{b} \); \( {\tilde{\varvec{z}}}_{R} = \{ U_{i} ,Q_{ij} ,Q_{i} \} \) the auxiliary measurement vector associated with reactive power, \( {\tilde{\varvec{z}}}_{A} = \{ P_{ij} ,P_{i} \} \) is the auxiliary measurement vector associated with active power, \( I_{ij}^{2} \) is not used here; and B RR , B RA , B AR and B AA the corresponding Jacobian submatrices. Nonzero elements of B RR and B AA are composed of susceptance or 1, while that of B RA and B AR are composed of conductance.

In observability analysis, the system observability is generally independent of the branch parameters. Thus, without loss of generality, we assume the impedance of each branch to be j 1.0 p.u., and the conductance to be 0. This yields B RA  = 0 and B AR  = 0. Then (17) can be simplified to be

$$ \left[ {\begin{array}{l} {{\tilde{\varvec{z}}}_{R} } \\ {{\tilde{\varvec{z}}}_{A} } \\ \end{array} } \right] = \left[ {\begin{array}{ll} {{\varvec{B}}_{RR} } & {{0}} \\ {{0}} & {{\varvec{B}}_{AA} } \\ \end{array} } \right]\left[ {\begin{array}{l} {{\varvec{y}}_{R} } \\ {{\varvec{y}}_{A} } \\ \end{array} } \right] $$
(18)

According to the expression of B, it is apparent that if B RR holds full column rank, then B AA must be of full column rank, provided P, Q measurements come in pairs. Therefore, the observability condition of MILP is that B RR is of full column rank, provided that P, Q measurements come in pairs. Unobservable lines and their corresponding measurements should be removed in estimation by MILP.

Note that the dimension of y R is N + b, larger than that of the state vector associated with the reactive problem in the traditional SE models. Thus the observability condition of MILP is more rigorous than that of the traditional SE models. However, the measurement status in current power systems is usually good enough to guarantee the observability condition of the proposed MILP. This has been verified by a great number of trials on various benchmark systems in our tests. To check whether the observability condition is satisfied or not, a number of conventional numerical approaches can be used [2528].

4 Case studies

In this section, numerical experiments are carried out to evaluate the performance of the proposed model and algorithm. The test systems include a rudimentary 3-bus system and seven benchmark IEEE systems. All tests are performed on a laptop, with an Intel(R) Core(TM) i5, 2.60 GHz CPU and 2 GB RAM.

4.1 Case 1: 3 bus system

  1. 1)

    Correctness test

Consider a rudimentary 3-bus system shown in Fig. 1. The network data and measurements (in p.u.) are given in Table 1 and Table 2, respectively. Bus 1 is set as the reference bus. The true value of the complex phasor voltages for the three buses are \( 1\angle 0 \), \( 0.9732\angle { - }0.0217 \) and \( 0.9431\angle { - }0.0482 \), respectively.In the test, WLS, MINP, MILP and MILP+WLS are independently executed based on the measurements shown as Table 2, the estimation results given by the above four SE approaches are shown as Table 4. As can be seen from Table 3, the estimation values of MINP and MILP are correct, but the accuracy is not high enough; while the estimation results given by WLS and MINP+WLS are the same and are very close to the true value. Note in this test, no nonlinear iterations are needed by MILP, while WLS needs four nonlinear iterations (the convergence precision is 1e−6 in all the tests) and MINP needs seven, MILP+WLS needs three nonlinear iterations. The computation efficiency of MILP+WLS is more than ten times as high as that of MINP for this test.

Fig. 1
figure 1

One-line diagram and measurement configuration of a 3-bus system

Table 2 Measurements of the 3-bus system
Table 3 The estimated results given by WLS, MINP, MILP and MILP+WLS
  1. 2)

    Robustness test

Furthermore, the reactance of branch 1–3 is reduced to 1/10 of its original value so as to create a leverage point, and Q13 is intentionally changed (by adding 10% error) to simulate bad measurement, while other measurements shown as Table 3 keep the same, then WLS (with the largest normal residual test for bad measurement identification, denoted by WLS+LNR), MINP, MILP and MILP+WLS are independently executed. For WLS, after the first estimation of WLS, the LNR (larger than the threshold 3.0) corresponds to P2, eliminate P2 and run WLS again; after the second estimation, the largest normal residual (larger than the threshold 3.0) corresponds to Q31. Apparently, WLS+LNR cannot correctly identify bad measurements when leverage point exists. However, the bad measurement Q31 is correctly identified by MINP, MILP and MILP+WLS even leverage point exists, illustrating the good robustness of MINP, MILP and MILP+WLS. In terms of the computation efficiency, the computation efficiency of MILP+WLS is more than twelve times as high as that of MINP for this test.

  1. 3)

    Global optimality test

This test is performed for verifying the global optimality of MILP. As aforementioned, the conventional SE methods based on (2) are non-convex and the global optimum cannot be guaranteed to be found, especially for heavily loaded, stressed systems. The proposed MILP, however, can theoretically guarantee the global optimality. To demonstrate this, we test WLS and the proposed MILP with another group of measurements (given in Table 4). One might argue that such measurements are rare in practice; however, such situation can occur with voltage collapse.

Table 4 Another group of measurements of the 3-bus system

The estimation results are shown in Table 5. The traditional WLS using flat start converges after 18 iterations. However, it can be seen from Table 6 that WLS converges to a local minimum instead of to the global optimum, making the estimation result inacceptable; whereas the proposed MILP successfully finds the unique global optimum, showing its capability of guaranteeing the global optimality. As for MILP+WLS, because of the good initial value provided by MILP for WLS, its estimation value is closer to the true value compared with the estimation value of MILP, illustrating that MILP+WLS can also find the global optimum.

Table 5 The estimation results of WLS and MILP
Table 6 The maximum deviations of the estimation results

4.2 Case 2: IEEE bus systems

In this case, numerical tests are carried on seven benchmark systems, including IEEE 9, 14, 30, 39, 57, 118, 300-bus systems. For comparison, the traditional WLS, MILP and MILP+WLS are tested. In the tests, WLS is solved by the Gauss-Newton method, while MILP is solved by CPLEX. The measurements are created using load flow results with additional small Gaussian noises. The standard deviation of noise is set to be 0.001.

  1. 1)

    Estimation accuracy test

The maximum deviations between the estimation results (by WLS, MILP and MILP+WLS) and the corresponding true states are presented in Table 6. In Table 6, \( \left\| {\Delta {\varvec{v}}} \right\| \) (p.u.) and \( \left\| {\Delta {\varvec{\theta}}} \right\| \) (rad) denote the maximum deviation of the voltage magnitudes and the maximum deviation of the angles, respectively. It can be seen that the estimation values of MILP are always correct; and the estimation values of WLS and MILP+WLS are always the same, they are closer to the true value compared with the estimation values of MILP. With the standard deviation of noise decreasing, all the maximum deviations approach to zero, implying that the estimate values approach to the true states.

  1. 2)

    Robustness test

As for robustness, we take the IEEE-300 bus system as example. In the test, the reactance of branch 1–5 is reduced to 1/10 of its original value so as to create a leverage point, and 4 correlated measurements are set as bad measurements. Then the proposed MILP+WLS is used to identify the bad measurements. Test results show that all the bad measurements are correctly identified by MILP+WLS. For the purpose of comparison, two other estimators are also tested on the same problem (including the WLS+LNR and the WLAV). Both the two estimators fail to correctly identify the bad measurements due to the existence of leverage point. These test results illustrate strong robustness of the proposed methodology.

  1. 3)

    Computational efficiency

As for the computational efficiency, Table 7 gives the number of measurements as well as CPU time of MINP and MILP for different IEEE benchmark systems. Note that the computational efficiency of the MINP model (by using polar coordinate) is quite low and not suitable for online application. In contrast, Table 7 indicates that the computation efficiency of our MILP model is very high. Furthermore, Fig. 2 illustrates that the computation time of MILP model grows approximately and linearly with the increase of system scale, showing a very good prospect of online application.

Table 7 Number of measurements and CPU time of MINP and MILP for different IEEE bus systems
Fig. 2
figure 2

Relationship of CPU time and system scale

In the next test, the MILP model is slightly modified to incorporate topology errors. Then, 10 link errors and 20 bad measurements are set on the IEEE 300 bus system for test. After estimation, all the link errors and bad measurements are correctly identified. The total estimation time is 2632 ms, which is acceptable for online application. This test indicates that the generalized estimation using MILP model is capable of detecting/rejecting bad measurements and topology errors simultaneously. For the comprehensive SE considering bad measurements, topology and parameter errors, a MIQP formulation is required and the estimation algorithm for SE in the second-stage may need further investigation. We leave this issue to the future work.

5 Conclusions

In this paper, MILP formulation for RSE is proposed. It can be easily solved by using mature software such as CPLEX. Numerical experiments illustrate its strong robustness and high efficiency of the proposed methodology, showing great promise to online applications.