Abstract
During state perception of a power system, fragments of harmonic data are inevitably lost owing to the loss of synchronization signals, transmission delays, instrument failures, or other factors. A harmonic data recovery method is proposed based on multivariate norm matrix in this paper. The proposed method involves dynamic time warping for correlation analysis of harmonic data, normalized cuts for correlation clustering of power-quality monitoring devices, and adaptive alternating direction method of multipliers for multivariable norm joint optimization. Compared with existing data recovery methods, our proposed method maintains excellent recovery accuracy without requiring prior information or optimization of the power-quality monitoring device. Simulation results on the IEEE 39-bus and IEEE 118-bus test systems demonstrate the low computational complexity of the proposed method and its robustness against noise. In addition, the application of the proposed method to field data from a real-world system provides consistent results with those obtained from simulations.
STATE perception is essential for the correct interconnection of all the components in a power system to guarantee safe and stable operation [
Missing data are generally represented with zeroes. In terms of recovery hierarchy, existing data recovery methods can be classified into local and distributed recovery methods. Local recovery is only applicable to historical data through the methods such as interpolation [
Distributed recovery applies multi-bus data to recover missing data mainly using machine learning [
To address the above-mentioned problems, we propose a harmonic data recovery (HDR) method based on multivariate norm matrix. The main contributions of this paper are summarized as follows.
1) Based on the dynamic time warping (DTW) and normalized cut algorithms, the proposed method can select appropriate PQMDs when the system topology information is unknown or the parameters change, thereby improving applicability to large-scale power systems.
2) An adaptive penalty factor reduces the number of iterations and ensures convergence stability for data recovery.
3) Pollution matrices (for noise and sparse outliers) are introduced to extract the features from original data and thus improve the anti-interference ability.
The remainder of this paper is organized as follows. Section II introduces clustering steps of PQMDs based on harmonic data analysis. Section III details the proposed HDR method for PQMDs based on multivariate norm matrix. Simulation analysis and verification on real data are reported in Section IV. Section V summarizes the paper.
Harmonic data from different PQMDs are sequences of harmonic measurements acquired at different physical locations throughout a power system. Thus, PQMD data at similar locations tend to have similar dynamic properties.
Before analyzing data, harmonic data with similar dynamic properties can be represented as a harmonic data matrix:
(1) |
where n is the number of PQMDs with similar dynamic properties; m is the number of measurements from a PQMD; and is a harmonic datapoint from the monitoring device at instant j.
We perform singular value thresholding decomposition on X:
(2) |
where U and V are the orthogonal matrices; and S is a diagonal matrix that can be expressed as:
(3) |
where s1 is a diagonal matrix with non-zero diagonal elements; and s1, s2, ..., sr are the singular values of X that satisfy .
can be said to have a strict low-rank property if (4) holds.
(4) |
If the matrix does not satisfy (4) but satisfies (5), it can be considered to have an approximate low-rank property.
(5) |
A strict low-rank property only occurs theoretically, whereas matrices constructed from measurements generally show an approximate low-rank property. Nevertheless, the approximate low-rank property still allows to perform matrix complementation [
In practice, a matrix can be assumed to have a low-rank property when one or more of its singular values are considerably greater than the remaining singular values. To determine the low rank of data matrices, the following expression can be defined as [
(6) |
where . The low-rank property of a matrix can be determined based on the value of l: when , a smaller l indicates a better low rankness.
Singular value thresholding decomposition can be performed on matrix X1 constructed from random harmonic data. In addition, matrix X2 is constructed from partially similar data, and matrix X3 is constructed from completely similar data. The low-rank properties of the matrices are determined using (6).
As shown in
Matrix | β1 | β2 | β3 | β4 |
---|---|---|---|---|
X1 | 0.763 | 0.776 | 0.787 | 0.796 |
X2 | 0.746 | 0.834 | 0.893 | 0.917 |
X3 | 0.993 | 0.996 | 0.997 | 0.998 |
The analysis above indicates that the recovery of harmonic data requires finding data with similar dynamic properties. Hence, we use the DTW algorithm to perform correlation analysis on harmonic data from different PQMDs [
Let and be two sets of harmonic data containing voltage or current amplitudes. Distance matrix D is constructed with elements given by:
(7) |
where x and y are the datapoints from datasets and , respectively; ax and by are the datapoints from the x and y elements, respectively; and n and m are the numbers of samples in the two datasets.
As shown in

Fig. 1 Diagram of DTW process.
Many curved paths satisfy the above-mentioned constraints [
(8) |
where is used to calculate the distance between sets A and B. A smaller DTW value indicates a higher correlation between the two datasets. The correlation between any pair of PQMDs can be evaluated using (8), thus establishing a foundation for correlation clustering of PQMDs in power systems.
The correlation between two PQMDs can be calculated using (8), which establishes correlation matrix C, i.e., , where i and j denote different monitoring devices. To facilitate correlation analysis, the elements of C can be normalized as:
(9) |
We use data correlation to perform spectral clustering on , achieving reasonable clustering of different PQMDs, as illustrated in

Fig. 2 Schematic of correlation subgroups for different PQMDs.
First, Laplacian matrix L can be constructed to map the measurements from a low-dimensional space onto a high-dimensional one.
(10) |
The elements of degree matrix K can be defined as:
(11) |
where t denotes the number of PQMDs that require correlation subgroups.
As illustrated in
1) The sum of data correlations (DTW values) within the subgraph should be maximized.
2) The data correlation between subgraphs should be minimized.
3) Each subgraph should contain as many measurements as possible.
Based on these principles, the normalized cut algorithm can be expressed as [
(12) |
where Oi denotes different subgraphs that can be divided into q subgraphs; and is the sum of the weights for each subgraph. We introduce indicator vector hj given by:
(13) |
Based on (13), (12) can be rewritten as:
(14) |
where denotes the matrix trace. Thus, the normalized cut algorithm aims to minimize :
(15) |
where , and T is the characteristic matrix.
(16) |
Because causes partial information loss, T cannot be used directly to segment the subgraph. The row vector of can be used as a sample for k-means clustering to obtain the final subgroup results.
Algorithm 1 : PQMD clustering |
---|
Input: harmonic data of different PQMDs Output: subgroup results of PQMDs 1: Construct distance matrix D using (7) 2: Construct correlation matrix using (8) and (9) 3: Construct Laplacian matrix L using (10) and (11) 4: Introduce indicator vector using (13) 5: Combined with hj, convert (12) into (14) 6: Combining (15) and (16), obtain characteristic matrix T 7: Use row vectors of T as samples for k-means clustering 8: Obtain subgroup results of PQMDs |
Noise and sparse outliers often pollute harmonic data. As noise has a high density and small amplitude, we use the Frobenius norm for mathematical description, while outliers with sparsity are described using the L1-norm.
We formulate an HDR model based on multivariate norm matrices considering the low-rank matrix model as:
(17) |
where is a matrix with recovered data; denotes the kernel norm of the matrix; E is a sparse outlier matrix; denotes the L1-norm; G is a noise matrix; denotes the Frobenius norm; N is an auxiliary matrix, which stores opposite values of missing data; and and are the weight coefficients.
(18) |
where Y is a Lagrangian multiplier; is a penalty factor; and denotes the internal product of the matrix.
Because (18) is a multidimensional multi-norm joint optimization problem, we solve it using an adaptive alternating direction method of multipliers; that is, alternating iterations are required to update .
1) Update : fixing , the objective function of can be expressed as:
(19) |
where .
2) Update E: fixing , the objective function of E can be expressed as:
(20) |
where .
3) Update G: fixing , the objective function of G can be expressed as:
(21) |
where .
The unconstrained optimization problem with a double Frobenius norm can be solved using the matching method.
(22) |
where and are the elements in the row and column of and , respectively.
From (22), is consistently greater than 0. Consequently, for , the optimal solution of (22) can be obtained as:
(23) |
4) Update N:
(24) |
The solution of N can be obtained directly by letting the norm in (24) to be 0.
(25) |
5) Update Y:
(26) |
The convex functions above have closed-form solutions, and the convergence of their overall and individual parts has been proven [
From (26), it is evident that affects the iteration step length and convergence efficiency of the proposed method. Thus, we propose an adaptive strategy. A larger step is set when the solution is far from the optimal point to accelerate convergence, while a smaller step is set when it is close to optimality to avoid fluctuations and ensure convergence.
(27) |
(28) |
where is the gradient change rate at iteration k; is a variation coefficient; and and are the thresholds of the gradient change rate. The variables can be updated alternately from (19) to (26) until the errors between the two calculations are below threshold , which is set to be 0.001.
Algorithm 2 : solving HDR model |
---|
Input: harmonic data matrix X; parameters Output: matrices , E, G 1: Construct HDR model based on multivariate norm matrix using (17) 2: Combining the augmented Lagrangian algorithm, transform (17) into (18) 3: for to k According to (18)-(28), update in sequence if , , and Output , E, and G else
end end 4: Complete recovery of missing data |
The flowchart of the proposed HDR method based on multivariate norm matrix is shown in

Fig. 3 Flowchart of proposed HDR method based on multivariate norm matrix.
Step 1: correlation analysis of harmonic data. First, we construct distance matrix D using (7). Then, (8) is used to evaluate the data correlations of the PQMDs.
Step 2: correlation clustering of PQMDs. Correlation matrix is constructed using (8) and (9). Spectral clustering is then performed on via (10)-(16), and the subgroup results are obtained.
Step 3: construction of HDR model based on multivariate norm matrices. The characteristics of harmonic data are analyzed. Then, the HDR model based on multivariate norm matrices is applied based on the conventional low-rank matrix model, as expressed in (17).
Step 4: solution of HDR model. By combining with the augmented Lagrangian algorithm, (17) is transformed into (18). Then, an adaptive alternating direction method of multipliers is applied, as shown in (19)-(28). Finally, the errors of the two calculations are used to determine whether to terminate the algorithm and retrieve the recovered harmonic data.
Simulation analysis was performed for the verification of the HDR method. Subsequently, field data from a power system in southeastern China were used to verify the reliability of our proposal. The simulation results were compared with those of the Lagrange interpolation (LIP) [
Most data loss events in existing studies have involved at most three continuous datapoints [

Fig. 4 Models of data loss.
1) Model I: single datapoint loss. A datapoint is randomly lost in a dataset.
2) Model II: continuous data loss. A segment of datapoints is randomly lost in a dataset.
3) Model III: mixed data loss. Models I and II are combined.
The mean squared error (MSE) can be used to quantify the accuracy of HDR as:
(29) |
where and are the restored and true values, respectively; and n is the number of recovered datapoints.
In (17), weight coefficient of E is generally set as [

Fig 5 Rectification of weight coefficient .
As shown in

Fig. 6 Convergence of adaptive penalty factor. (a) Parameters and β. (b) Parameter .

Fig. 7 Convergence rate of penalty factors.
We used the IEEE 39-bus test system (New England) for simulation analysis (

Fig. 8 Diagram of IEEE 39-bus test system (New England) [
Subgroup | Bus |
---|---|
I | 2, 3, 9 |
II | 6, 7, 10, 12 |
III | 4, 13, 14, 16, 18 |
IV | 19, 21, 23 |
V | 26, 27, 28 |
To make the simulation environment more realistic, we injected the
To examine the applicability of the proposed method, we randomly performed single, continuous, and mixed data loss simulations on 30 datasets collected from buses 6, 14, and 19. The MSE values of the LIP, PCA, IPR, and proposed methods for the data recovery of the

Fig. 9 Data recovery for bus 19 under a definite 30% mixed data loss. (a)
1) Correctness Verification of HDR
Tables III-V show that the four evaluated methods work well for single datapoint loss. The errors of the four methods fluctuate by approximately 0.02% for the considered harmonics. However, the proposed method outperforms the LIP, PCA, and IPR methods regarding 30% continuous data loss and 30% mixed data loss. For bus 6, the MSE values of the proposed method are 0.39%, 0.42%, 0.36%, 0.35%, 0.39%, 0.41%, and 0.41% under 30% continuous data loss of the
Order | Model | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| I (single loss) | 0.02 | 0.02 | 0.01 | 0.02 |
II (30% loss) | 42.26 | 30.22 | 29.86 | 0.39 | |
III (30% loss) | 50.26 | 35.74 | 35.01 | 0.48 | |
| I (single loss) | 0.02 | 0.02 | 0.01 | 0.02 |
II (30% loss) | 44.28 | 31.16 | 29.93 | 0.42 | |
III (30% loss) | 51.36 | 37.27 | 36.82 | 0.51 | |
1 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.01 |
II (30% loss) | 39.28 | 30.37 | 29.91 | 0.36 | |
III (30% loss) | 47.89 | 36.17 | 35.27 | 0.46 | |
2 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 39.16 | 30.83 | 29.36 | 0.35 | |
III (30% loss) | 48.92 | 37.16 | 34.90 | 0.48 | |
3 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 40.88 | 31.75 | 29.74 | 0.39 | |
III (30% loss) | 48.86 | 36.26 | 35.63 | 0.49 | |
3 | I (single loss) | 0.02 | 0.02 | 0.01 | 0.02 |
II (30% loss) | 40.39 | 30.99 | 27.72 | 0.41 | |
III (30% loss) | 50.62 | 37.18 | 34.26 | 0.50 | |
5 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 41.64 | 31.26 | 30.11 | 0.41 | |
III (30% loss) | 50.51 | 36.02 | 35.25 | 0.48 |
Order | Model | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 40.91 | 30.45 | 27.65 | 0.41 | |
III (30% loss) | 48.32 | 34.87 | 32.89 | 0.46 | |
| I (single loss) | 0.02 | 0.02 | 0.02 | 0.01 |
II (30% loss) | 42.14 | 31.28 | 28.35 | 0.44 | |
III (30% loss) | 48.73 | 36.38 | 34.51 | 0.52 | |
1 | I (single loss) | 0.02 | 0.02 | 0.01 | 0.02 |
II (30% loss) | 40.26 | 32.27 | 30.26 | 0.39 | |
III (30% loss) | 51.86 | 36.95 | 34.73 | 0.48 | |
2 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 42.16 | 31.08 | 27.87 | 0.40 | |
III (30% loss) | 52.19 | 35.98 | 33.94 | 0.49 | |
3 | I (single loss) | 0.01 | 0.02 | 0.02 | 0.01 |
II (30% loss) | 40.02 | 29.86 | 28.52 | 0.37 | |
III (30% loss) | 48.27 | 35.37 | 34.29 | 0.45 | |
3 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.01 |
II (30% loss) | 41.26 | 32.21 | 30.01 | 0.39 | |
III (30% loss) | 47.48 | 37.18 | 36.49 | 0.48 | |
5 | I (single loss) | 0.02 | 0.03 | 0.01 | 0.02 |
II (30% loss) | 40.67 | 31.22 | 27.33 | 0.42 | |
III (30% loss) | 50.25 | 36.67 | 34.12 | 0.53 |
Order | Model | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 40.27 | 29.83 | 28.41 | 0.39 | |
III (30% loss) | 49.55 | 36.24 | 34.19 | 0.47 | |
| I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 42.85 | 32.24 | 28.87 | 0.41 | |
III (30% loss) | 46.27 | 36.79 | 35.24 | 0.49 | |
1 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.01 |
II (30% loss) | 40.29 | 31.38 | 30.05 | 0.37 | |
III (30% loss) | 49.96 | 35.98 | 34.98 | 0.42 | |
2 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 42.09 | 31.19 | 28.63 | 0.39 | |
III (30% loss) | 50.27 | 37.28 | 35.28 | 0.49 | |
3 | I (single loss) | 0.02 | 0.03 | 0.01 | 0.01 |
II (30% loss) | 39.97 | 32.28 | 27.78 | 0.42 | |
III (30% loss) | 49.24 | 35.94 | 35.29 | 0.51 | |
3 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 41.28 | 30.08 | 29.28 | 0.39 | |
III (30% loss) | 51.71 | 36.27 | 34.28 | 0.49 | |
5 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.01 |
II (30% loss) | 42.39 | 30.94 | 28.86 | 0.41 | |
III (30% loss) | 49.91 | 36.24 | 35.22 | 0.45 |
When the PQMDs with data loss are located in different areas, the proposed HDR method still exhibits a good recovery performance and high stability. For bus 14, the errors of the proposed method are approximately 0.02%, 0.40%, and 0.48% for single, 30% continuous, and 30% mixed data loss, respectively. This is similar to the performance of the HDR method for the recovery in buses 6 and 19. Meanwhile, the recovery performances of the IPR, PCA, and LIP methods fluctuate. Under 30% mixed data loss, the recovery errors fluctuate in 32%-37% for the IPR method, 34%-38% for the PCA method, and 46%-52% for the LIP method.
Tables III-V reveal that the IPR method outperforms the LIP and PCA methods for the HDR but is still inferior to the proposed method. This is because the IPR method propagates the residual values to nearby buses and adds them to the results of the previous iteration to obtain the output of the current iteration. Using this method, reconstruction residuals obtained from insufficient local information cannot compensate for the fluctuations of unknown information. Consequently, the IPR method is vulnerable to the data loss ratio under continuous or mixed data loss.
The PCA method decomposes data into components using the direction of maximum variance as the main feature. However, the data loss ratio can affect decomposition. Moreover, harmonic data do not necessarily follow a Gaussian distribution, and the principal components obtained using the PCA method are not always optimal. Multiple factors can cause the ineffectiveness of the PCA method in recovering harmonic data. The LIP method uses known data to build a suitable interpolation function to estimate unknown data. With increasing continuous missing data, the error of the LIP method increases. Moreover, this method is more dependent on sample data, hindering the accurate estimation of data trends. Consequently, the LIP method is poor at recovering missing harmonic data with transient and disordered characteristics.
2) Running Time of HDR
The computational complexity of the four evaluated methods was investigated, obtaining the running time for single, 30% continuous, and 30% mixed data loss listed in
Bus (order) | Model | Running time (s) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
4 ( | I (single loss) | 0.04 | 0.58 | 0.62 | 0.56 |
II (30% loss) | 1.17 | 2.05 | 2.31 | 1.89 | |
III (30% loss) | 1.39 | 2.57 | 2.87 | 2.26 |

Fig. 10 Running time of four evaluated methods for different data loss ratios of model III.
3) Applicability Verification of HDR
1) Scenario 1: different data loss ratios
The effectiveness of the four evaluated methods was compared by varying the data loss ratios of the
Loss ratio (%) | MSE (%) | |||
---|---|---|---|---|
LIP | PCA | IPR | Proposed | |
10 | 20.94 | 9.72 | 11.29 | 0.29 |
20 | 42.18 | 26.29 | 20.35 | 0.42 |
30 | 51.36 | 37.27 | 36.82 | 0.51 |
40 | 67.20 | 46.03 | 42.15 | 1.26 |
50 | 58.14 | 49.27 | 2.35 | |
60 | 72.93 | 61.27 | 4.75 | |
70 | 74.84 | 9.51 |
The recovery performance of the four methods decreases with increasing data loss ratio. For high data loss ratios (e.g., 50%), the LIP and PCA methods are ineffective for recovery, losing their practical significance. For continuous data loss, the error of the LIP method is large. When the data loss ratio is less than 30%, the PCA and IPR methods are effective. When data loss ratios are above 40%, the errors of the PCA, IPR, and HDR methods increase sharply. The PCA and IPR methods stop recovering data, but the proposed method maintains a high recovery accuracy. When the data loss ratio is 70%, the MSE of the proposed method is 9.51%. Overall, under severe harmonic data loss, the proposed method retains an acceptable recovery accuracy.
To better explore the applicability of the proposed method, 18 PQMDs were sequentially subjected to mixed data loss ratios ranging from 30% to 70% (at 10% intervals). The MSE values of the proposed method at the PQMDs are shown in

Fig. 11 MSE values of proposed method under different ratios of model III.
2) Scenario 2: noise and outliers
Noise often interferes with measurements of harmonic data. To investigate the noise robustness of the HDR method, white Gaussian noise and sparse outliers were added to the data from bus 14. The signal-to-noise ratio (SNR) ranged from 30 dB to 80 dB (at 10 dB intervals). Sparse outliers were randomly added to the data. The MSE values at bus 14 with 30% mixed data loss are listed in
Order | SNR | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| 30 | 70.27 | 4.68 | ||
40 | 57.29 | 3.29 | |||
50 | 70.23 | 52.38 | 2.53 | ||
60 | 75.27 | 56.25 | 45.28 | 1.83 | |
70 | 63.29 | 47.93 | 40.11 | 1.16 | |
80 | 52.26 | 39.26 | 35.27 | 0.72 | |
| 30 | 75.26 | 4.99 | ||
40 | 60.71 | 3.46 | |||
50 | 75.92 | 53.01 | 2.68 | ||
60 | 77.12 | 59.49 | 46.92 | 1.79 | |
70 | 67.26 | 48.79 | 41.23 | 1.19 | |
80 | 53.27 | 39.87 | 36.27 | 0.81 | |
1 | 30 | 72.33 | 4.73 | ||
40 | 61.23 | 3.06 | |||
50 | 72.37 | 55.28 | 2.38 | ||
60 | 82.14 | 55.26 | 47.06 | 1.72 | |
70 | 70.26 | 48.27 | 40.96 | 1.35 | |
80 | 55.26 | 40.15 | 37.28 | 0.76 |
Overall, the proposed method exhibits good noise robustness. The LIP and PCA methods are the most affected by noise because they neglect noise modeling, thus being sensitive to Gaussian noise and pulse noise. The IPR method provides better noise immunity by first propagating the residuals to the other buses in each iteration before projecting them to the low-pass filter subspace. The proposed method suitably separates harmonic data, Gaussian noise, and sparse outliers through the recovery model based on multivariate norm matrices. The data and noise spaces are separated by joint multidimensional multinorm optimization. This method weakens the influence of noise on the solution of low-rank data matrices. Thus, the proposed method achieves high immunity against noise.
The IEEE 118-bus test system was used to further verify the effectiveness of the proposed method. The diagram of the IEEE 118-bus test system is shown in

Fig. 12 Diagram of IEEE 118-bus test system.
Subgroup | PQMD |
---|---|
I | 3, 7, 9, 11, 12 |
II | 17, 21, 25, 28, 32 |
III | 24, 68, 71, 75 |
IV | 34, 37, 38, 41 |
V | 45, 49, 53, 56 |
VI | 59, 62, 64, 65 |
VII | 77, 79, 92, 96, 100 |
VIII | 83, 85, 90 |
IX | 103, 105, 106, 110 |
As subgroup VIII has the fewest PQMDs, its data recovery environment is the most hostile (fewer devices without data loss mean less recovery information available). Therefore, bus 83 was randomly selected for data recovery in subgroup VIII. Single, continuous, and mixed data loss models were randomly applied to bus 83.
As shown in
Order | Model | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| I (single loss) | 0.01 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 40.38 | 32.14 | 29.58 | 0.40 | |
III (30% loss) | 51.22 | 36.77 | 34.05 | 0.49 | |
| I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 42.93 | 32.53 | 31.23 | 0.38 | |
III (30% loss) | 48.85 | 37.27 | 34.73 | 0.46 | |
1 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 39.95 | 30.21 | 32.29 | 0.42 | |
III (30% loss) | 48.37 | 36.18 | 35.76 | 0.51 | |
2 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 43.10 | 30.48 | 29.64 | 0.38 | |
III (30% loss) | 50.61 | 35.72 | 35.01 | 0.46 | |
3 | I (single loss) | 0.02 | 0.03 | 0.02 | 0.01 |
II (30% loss) | 44.27 | 32.29 | 28.49 | 0.41 | |
III (30% loss) | 49.58 | 37.55 | 34.19 | 0.52 | |
5 | I (single loss) | 0.02 | 0.02 | 0.02 | 0.02 |
II (30% loss) | 40.22 | 30.99 | 30.26 | 0.36 | |
III (30% loss) | 50.24 | 36.18 | 35.94 | 0.48 |
As shown in
Order | Loss ratio (%) | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| 10 | 20.16 | 11.38 | 10.32 | 0.29 |
20 | 40.28 | 25.83 | 22.15 | 0.43 | |
30 | 49.95 | 37.49 | 34.91 | 0.51 | |
40 | 72.36 | 46.01 | 44.53 | 1.17 | |
50 | 60.37 | 53.36 | 2.29 | ||
60 | 71.99 | 66.47 | 4.52 | ||
70 | 78.72 | 9.25 | |||
| 10 | 22.05 | 12.26 | 10.35 | 0.30 |
20 | 44.73 | 23.34 | 20.82 | 0.43 | |
30 | 54.24 | 37.38 | 36.35 | 0.49 | |
40 | 69.89 | 49.72 | 43.27 | 1.06 | |
50 | 60.30 | 51.83 | 2.16 | ||
60 | 73.88 | 62.18 | 4.28 | ||
70 | 74.49 | 8.66 | |||
1 | 10 | 26.53 | 12.45 | 10.75 | 0.21 |
20 | 44.90 | 23.78 | 18.89 | 0.38 | |
30 | 53.07 | 39.84 | 35.47 | 0.45 | |
40 | 72.25 | 50.01 | 41.25 | 0.97 | |
50 | 64.97 | 48.32 | 1.95 | ||
60 | 57.68 | 3.82 | |||
70 | 69.93 | 7.55 | |||
2 | 10 | 23.80 | 11.84 | 9.95 | 0.28 |
20 | 42.25 | 24.05 | 19.47 | 0.43 | |
30 | 50.27 | 35.82 | 33.85 | 0.55 | |
40 | 63.36 | 47.37 | 41.01 | 1.32 | |
50 | 78.83 | 59.26 | 55.34 | 2.46 | |
60 | 74.49 | 70.28 | 4.19 | ||
70 | 8.28 | ||||
3 | 10 | 20.14 | 10.24 | 12.23 | 0.24 |
20 | 39.27 | 23.46 | 20.47 | 0.39 | |
30 | 49.93 | 38.51 | 35.59 | 0.51 | |
40 | 58.89 | 51.93 | 42.21 | 1.05 | |
50 | 73.21 | 64.28 | 51.04 | 2.27 | |
60 | 65.52 | 4.01 | |||
70 | 74.49 | 8.94 | |||
5 | 10 | 25.53 | 10.23 | 11.14 | 0.32 |
20 | 43.28 | 22.38 | 23.92 | 0.44 | |
30 | 51.94 | 36.04 | 35.14 | 0.52 | |
40 | 64.42 | 44.28 | 45.03 | 1.25 | |
50 | 57.70 | 53.22 | 2.29 | ||
60 | 75.34 | 66.35 | 4.17 | ||
70 | 72.34 | 9.01 |
Overall, the simulation analysis for the IEEE 118-bus system is consistent with that for the IEEE 39-bus system. The effect of harmonic orders on the proposed method is small, and the proposed method retains a high recovery accuracy even under severe data loss.
We used field data from a power system in southeastern China to validate the proposed method. The PQMDs in the region were correlated into clusters to obtain a test platform, as illustrated in

Fig. 13 Diagram of 35 kV regional network for HDR validation.
The MSE values of data recovery for PQMD 2 are listed in
Order | Model | MSE (%) | |||
---|---|---|---|---|---|
LIP | PCA | IPR | Proposed | ||
| I (single loss) | 0.03 | 0.03 | 0.02 | 0.02 |
II (40% loss) | 69.20 | 50.27 | 44.38 | 1.61 | |
III (40% loss) | 75.26 | 55.18 | 48.21 | 1.89 | |
| I (single loss) | 0.02 | 0.03 | 0.02 | 0.02 |
II (40% loss) | 69.28 | 51.06 | 44.57 | 1.63 | |
III (40% loss) | 76.12 | 55.26 | 47.73 | 1.82 | |
| I (single loss) | 0.03 | 0.03 | 0.02 | 0.02 |
II (40% loss) | 70.05 | 52.91 | 42.95 | 1.72 | |
III (40% loss) | 75.12 | 56.38 | 45.26 | 1.91 | |
1 | I (single loss) | 0.03 | 0.03 | 0.02 | 0.02 |
II (40% loss) | 71.02 | 51.98 | 45.72 | 1.61 | |
III (40% loss) | 77.28 | 56.84 | 50.97 | 1.84 |
The LIP method has the worst recovery performance with large errors for single, 40% continuous, and 40% mixed data loss of approximately 0.03%, 70%, and 76%, respectively. The errors of the PCA and IPR methods are lower but still considerably higher than those of the proposed method. In particular, the PCA method has errors of approximately 0.03%, 51%, and 56% for the three data loss models. The error of the IPR method is slightly lower than that of the PCA method but also reaches approximately 48% under 40% mixed data loss. In contrast, the errors of the proposed method are approximately 0.02%, 1.62%, and 1.90% under the three models, respectively, showing that it retains high recovery accuracy and stability for practical applications.
The reliability and accuracy of the proposed method were verified by varying the data loss ratios of the
Loss ratio (%) | MSE (%) | |||
---|---|---|---|---|
LIP | PCA | IPR | Proposed | |
10 | 36.97 | 18.42 | 20.64 | 0.43 |
20 | 53.89 | 32.28 | 29.20 | 0.51 |
30 | 60.56 | 40.19 | 40.81 | 1.18 |
40 | 75.26 | 55.18 | 48.21 | 1.89 |
50 | 79.02 | 65.94 | 3.76 | |
60 | 74.63 | 6.86 | ||
70 | 12.28 |
Model | Running time (s) | |||
---|---|---|---|---|
LIP | PCA | IPR | Proposed | |
I (single loss) | 0.04 | 1.35 | 1.04 | 1.07 |
II (40% loss) | 2.75 | 3.42 | 4.26 | 3.25 |
III (40% loss) | 3.36 | 4.59 | 5.01 | 4.47 |
We introduce an HDR method based on multivariate norm matrices. Through extensive simulation and field data verifications, the following conclusions are drawn.
1) Based on the DTW and normalized cut algorithms, the proposed method can select appropriate PQMDs when the system topology is unknown or parameters change, thereby improving the applicability to large-scale power systems.
2) The adaptive penalty factor can reduce the number of iterations and ensure the convergence stability.
3) Pollution matrices (noise and sparse outliers) are introduced to separate features from original data, thereby improving the anti-interference ability.
4) Comparison experiments at different power system scales and recovery scenarios demonstrate the advantages of the proposed method in terms of speed, accuracy, and applicability.
The proposed method leverages low-rank properties for HDR, being suitable for practical engineering applications. However, the method cannot fully explore system data relationships because the system models are simplified. In future research, we will aim to recover complete data loss (total data loss for all devices in an area) and compensate for systematic errors owing to power transformer deviations or device measurement algorithms.
REFERENCES
C. Zhuang, J. An, Z. Liu et al., “Data completion for power load analysis considering the low-rank property,” CSEE Journal of Power and Energy Systems, vol. 8, no. 6, pp. 1751-1759, Nov. 2022. [Baidu Scholar]
F. L. Grando, A. E. Lazzaretti, and M. Moreto, “The impact of PMU data precision and accuracy on event classification in distribution systems,” IEEE Transactions on Smart Grid, vol. 13, no. 2, pp. 1372-1382, Mar. 2022. [Baidu Scholar]
R. Xu, X. Ma, Y. Wang et al., “Spectral graph theory-based recovery method for missing harmonic data,” IEEE Transactions Power Delivery, vol. 37, no. 5, pp. 3688-3697, Oct. 2022. [Baidu Scholar]
W. Yao, Y. Liu, D. Zhou et al., “Impact of GPS signal loss and its mitigation in power system synchronized measurement devices,” IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1141-1149, Mar. 2018. [Baidu Scholar]
M. Wu and L. Xie, “Online detection of low-quality synchrophasor measurements: a data driven approach,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 2817-2827, Jul. 2017. [Baidu Scholar]
A. Sundararajan, T. Khan, A. Moghadasi et al., “Survey on synchrophasor data quality and cybersecurity challenges, and evaluation of their interdependencies,” Journal of Modern Power Systems and Clean Energy, vol. 7, no. 3, pp. 449-467, May 2019. [Baidu Scholar]
M. Netto, J. Zhao, and L. Mili, “A robust extended Kalman filter for power system dynamic state estimation using PMU measurements,” in Proceedings of IEEE PES General Meeting, Boston, USA, Jul. 2016, pp. 1-5. [Baidu Scholar]
J. Zhao, G. Zhang, K. Das et al., “Power system real-time monitoring by using PMU-based robust state estimation method,” IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 300-309, Jan. 2016. [Baidu Scholar]
Q. Cui and Y. Weng, “Enhance high impedance fault detection and location accuracy via μ-PMUs,” IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 797-809, Jan. 2020. [Baidu Scholar]
K. D. Jones, A. Pal, and J. S. Thorp, “Methodology for performing synchrophasor data conditioning and validation,” IEEE Transactions on Power Systems, vol. 30, no. 3, pp. 1121-1130, May 2015. [Baidu Scholar]
C. Huang, F. Li, L. Zhan et al., “Data quality issues for synchrophasor applications Part II: problem formulation and potential solutions,” Journal of Modern Power Systems and Clean Energy, vol. 4, no. 3, pp. 353-361, Jul. 2016. [Baidu Scholar]
Z. Yang, H. Liu, T. Bi et al., “An adaptive PMU missing data recovery method,” International Journal of Electrical Power & Energy Systems, vol. 116, p. 105577, Mar. 2020. [Baidu Scholar]
J. Yu, A. Y. S. Lam, D. J. Hill et al., “Delay aware power system synchrophasor recovery and prediction framework,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 3732-3742, Jul. 2019. [Baidu Scholar]
Y. Qin and J. G. Zhu, “TDMR with machine learning data detection channel,” IEEE Transactions on Magnetics, vol. 58, no. 2, pp. 1-5, Feb. 2022. [Baidu Scholar]
J. C. Fan and T. Chow, “Deep learning based matrix completion,” Neurocomputing, vol. 266, pp. 540-549, Nov. 2017. [Baidu Scholar]
C. Genes, I. Esnaola, S. M. Perlaza et al., “Robust recovery of missing data in electricity distribution systems,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 4057-4067, Jul. 2019. [Baidu Scholar]
X. Wang, M. Wang, and Y. Gu, “A distributed tracking algorithm for reconstruction of graph signals,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 4, pp. 728-740, Jun. 2015. [Baidu Scholar]
G. Lee, S. H. Kim, D. I. Kim et al., “Multi-channel recovery for distributed quality management of synchrophasor data,” IEEE Transactions on Power Systems, vol. 37, no. 3, pp. 2384-2396, May 2022. [Baidu Scholar]
T. Wu, Y. Zhang, Y. Liu et al., “Missing data recovery in large power systems using network embedding,” IEEE Transactions on Smart Grid, vol. 12, no. 1, pp. 680-691, Jan. 2021. [Baidu Scholar]
J. Yu, D. J. Hill, V. Li et al., “Synchrophasor recovery and prediction: a graph-based deep learning approach,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 7348-7359, Oct. 2019. [Baidu Scholar]
C. Ren and Y. Xu, “A fully data-driven method based on generative adversarial networks for power system dynamic security assessment with missing data,” IEEE Transactions on Smart Grid, Vol. 34, no. 6, pp. 5044-505, Nov. 2019. [Baidu Scholar]
T. Wu, W. Xue, H. Wang et al., “Extreme learning machine-based state reconstruction for automatic attack filtering in cyber physical power system,” IEEE Transactions on Industrial Informatics, vol. 17, no. 3, pp. 1892-1904, Mar. 2021. [Baidu Scholar]
J. Miao and K. I. Kou, “Color image recovery using low-rank quaternion matrix completion algorithm,” IEEE Transactions on Image Processing, vol. 31, pp. 190-201, Nov. 2022. [Baidu Scholar]
Z. Zhang, M. Dong, K. Ota et al., “Alleviating new user cold-start in user-based collaborative filtering via bipartite network,” IEEE Transactions on Computational Social Systems, vol. 7, no. 3, pp. 672-685, Jun. 2020. [Baidu Scholar]
J. Yi and W. Xu, “Necessary and sufficient null space condition for nuclear norm minimization in low-rank matrix recovery,” IEEE Transactions on Information Theory, vol. 66, no. 10, pp. 6597-6604, Oct. 2020. [Baidu Scholar]
J. Lin and G. Michailidis, “System identification of high-dimensional linear dynamical systems with serially correlated output noise components,” IEEE Transactions on Signal Processing, vol. 68, pp. 5573-5587, Aug. 2020. [Baidu Scholar]
P. Gao, M. Wang, S. G. Ghiocel et al., “Missing data recovery by exploiting low-dimensionality in power system synchrophasor measurements,” IEEE Transactions on Power Systems, vol. 31, no. 2, pp. 1006-1013, Mar. 2016. [Baidu Scholar]
S. Konstantinopoulos, G. M. De Mijolla, J. H. Chow et al., “Synchrophasor missing data recovery via data-driven filtering,” IEEE Transactions on Smart Grid, vol. 11, no. 5, pp. 4321-4330, Sept. 2020. [Baidu Scholar]
K. Mahapatra and N. R. Chaudhuri, “Online robust PCA for malicious attack-resilience in wide-area mode metering application,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 2598-2610, Jul. 2019. [Baidu Scholar]
K. Chatterjee, K. Mahapatra, and N. R. Chaudhuri, “ Robust recovery of PMU signals with outlier characterization and stochastic subspace selection,” IEEE Transactions on Smart Grid, vol. 11, no. 4, pp. 3346-3358, Jul. 2020. [Baidu Scholar]
Y. Hao, M. Wang, and J. H. Chow, “Modeless streaming synchrophasor data recovery in nonlinear systems,” IEEE Transactions on Power Systems, vol. 35, no. 2, pp. 1166-1177, Mar. 2020. [Baidu Scholar]
B. Foggo and N. Yu, “Online PMU missing value replacement via event-participation decomposition,” IEEE Transactions on Power Systems, vol. 37, no. 1, pp. 488-496, Jan. 2022. [Baidu Scholar]
E. J. Candes and T. Tao, “The power of convex relaxation: near-optimal matrix completion,” IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2053-2080, May 2010. [Baidu Scholar]
M. Liao, D. Shi, Z. Yu et al., “An alternating direction method of multipliers based approach for PMU data recovery,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 4554-4565, Jul. 2019. [Baidu Scholar]
J. Mei, M. Liu, Y. Wang et al., “Learning a Mahalanobis distance-based dynamic time warping measure for multivariate time series classification,” IEEE Transactions on Cybernetics, vol. 46, no. 6, pp. 1363-1374, Jun. 2016. [Baidu Scholar]
D. J. Berndt and J. Clifford, “Using dynamic time warping to find patterns in time series,” in Proceedings of KDD Workshop, Seattle, USA, Jul. 1994, pp. 359-370. [Baidu Scholar]
U. V. Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395-416, Aug. 2007. [Baidu Scholar]
E. J. Candès and B. Recht, “Exact matrix completion via convex optimization,” Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717-772, Apr. 2009. [Baidu Scholar]
K. W. Chen, C. C. Lai, Y. Huang et al., “An adaptive learning method for target tracking across multiple cameras,” in Proceedings of IEEE Conference on Computer Vision & Pattern Recognition, Anchorage, USA, Jun. 2008, pp. 1-8. [Baidu Scholar]
A. Bruckstein, D. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Review, vol. 51, no. 1, pp. 34-81, Feb. 2009. [Baidu Scholar]
P. Jain, R. Meka, and I. S. Dhillon, “Guaranteed rank minimization via singular value projection,” in Proceedings of Advances in Neural Information Processing Systems, Red Hook, USA, Dec. 2010, pp. 937-945. [Baidu Scholar]
E. J. Candes, X. Li, Y. Ma et al., “Robust principal component analysis,” Journal of the ACM, vol. 58, no. 3, pp. 1-37, May 2011. [Baidu Scholar]