Abstract
This paper presents a new method for the estimation of the injection state and power factor of distributed energy resources (DERs) using voltage magnitude measurements only. A physics-based linear model is used to develop estimation heuristics for net injections of real and reactive power at a set of buses under study, allowing a distribution engineer to form a robust estimate for the operating state and the power factor of the DER at those buses. The method demonstrates and exploits a mathematical distinction between the voltage sensitivity signatures of real and reactive power injections for a fixed power system model. Case studies on various test feeders for a model of the distribution circuit and statistical analyses are presented to demonstrate the validity of the estimation method. The results of this paper can be used to improve the limited information about inverter parameters and operating state during renewable planning, which helps mitigate the uncertainty inherent in their integration.
The success of the ongoing global energy transition is contingent upon the development of accurate models of distributed energy resources (DERs) for use in the planning and operation of power systems. The rapid deployment of these DERs has led to limited data availability and to uncertainty regarding their unobservable impacts on the distribution system.
One of the remaining limiting factors for high penetrations of DERs is the risk of unforeseen violations of engineering constraints due to unacceptable voltage rises from volatile power injections inherent in DERs. Direct curtailment is typically one of the solutions to this issue. However, this is unfavorable due to the loss of clean generation and revenue for the DER owner.
Advanced inverter technologies have emerged as central elements of the solution to these issues, and research works on their impact on distribution networks have appeared swiftly in [
The primary contributions of this paper are the solutions to the inverse problem of estimating an unknown fixed power factor control setting for a DER. The methods presented here have remarkably low data input requirements, in that only voltage magnitude measurements are required. Although the deployment of advanced metering infrastructure (AMI) continues to expand, most utilities only have access to net energy measurements and do not have access to reactive power measurements, making the calculation of a DER power factor setting impossible. Through the methods developed in this paper, we show that it is possible to recover this setting with highly incomplete measurement data. This is achieved through a physics-based linearization of the AC power flow manifold, through which we demonstrate a novel result that the voltage magnitude sensitivities to real and reactive power injections are linearly independent for a fixed distribution network model. The result allows for a simultaneous estimation of the real and reactive power injection states at a set of DER interconnection points under study, which is used to construct a robust estimate for the power factor.
Furthermore, we enhance the predictive power and applicability of the method by deriving both a regularized regression model and a sparse approximation model, recasting the estimation as convex optimization problems. These reformulations improve the robustness of the estimation method in cases of a poor or inaccurate distribution system model or user uncertainty of the DER’s location, respectively.
The theorem in [
(1) |
(2) |
where and are the complex nodal voltages; and and are the complex conjugate nodal voltages.
These equations and the research work in [
Researchers have taken an interest in data-driven modeling of advanced inverters, but little work has been done on the estimation of their operational parameters and behaviors. Reference [
Research works such as in [
Interest has grown in analyzing how distributed inverter-based systems behave with fixed control parameters [
The literature has only recently explored inverse problems in this domain. Approaches such as those in [
However, the literature lacks exploration of the inverse problems of estimating DER settings and operational parameters such as the power factor using data-driven methods.
Throughout this paper, we consider a distribution network with nodes and possible points of interconnection for a DER system, where .
1) Voltage Magnitude Sensitivities
The sensitivity matrix of node voltages to both real and reactive power injections is well understood [
Assuming circuit parameters are fixed and injection magnitudes are normalized, the voltage sensitivities to real and reactive power injections, while linearly related, are intrinsically different functions. This distinction has been noted [
We define the voltage sensitivity matrix of a circuit as the change in voltage magnitude at any node due to the presence of a real or reactive power injection at another “candidate” injection bus . The formulation we will use in this paper alternates the columns between real and reactive injections, as shown in (3).
(3) |
where is the voltage magnitude measurement. This gives us an sensitivity matrix that appears as follows:
(4) |
This matrix effectively captures two linearly independent signatures of the power system and describes the response of the node voltages to real and reactive power injections.
In practice, these matrices can be formed using perturb and observe methods [
We will denote each column of the interleaved matrix as , where . Note that this matrix has full column rank when formulated using a power system model without any voltage regulation equipment, and that pairs of columns for electrically identical buses such as those connected by switches or fuses are excluded. This verifies that the columns are linearly independent, and implies that the signatures of voltage sensitivities to real and reactive power are inherently different from the perspective of signal processing.
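The construction of an interleaved sensitivity matrix by perturb and observe can be sketched as follows. This is a minimal illustration and not the procedure of Algorithm 1: the voltage model `toy_voltage` is a hypothetical linearized stand-in for a power flow solver, and the names `R`, `X`, `candidates`, and `step` are assumptions introduced here. The feeder is a made-up 4-bus radial line in per unit.

```python
import numpy as np

def toy_voltage(p, q, R, X, v0=1.0):
    """Hypothetical linearized model: |v| ~ v0 + R p + X q (per unit)."""
    return v0 + R @ p + X @ q

def interleaved_sensitivity(R, X, candidates, step=1e-3):
    """Perturb and observe: one real and one reactive column per candidate bus."""
    n = R.shape[0]
    base = toy_voltage(np.zeros(n), np.zeros(n), R, X)
    cols = []
    for k in candidates:
        for kind in ("p", "q"):
            p, q = np.zeros(n), np.zeros(n)
            (p if kind == "p" else q)[k] = step      # perturb one injection
            cols.append((toy_voltage(p, q, R, X) - base) / step)
    return np.column_stack(cols)                     # columns: P_k, Q_k, P_l, Q_l, ...

# Made-up segment impedances with a non-uniform r/x ratio (assumption).
r = np.array([0.010, 0.020, 0.015, 0.030])
x = np.array([0.020, 0.015, 0.030, 0.010])
n = len(r)
# Entry (i, j) is the impedance of the path shared by buses i and j.
idx = np.minimum.outer(np.arange(n), np.arange(n))
R = np.cumsum(r)[idx]
X = np.cumsum(x)[idx]

S = interleaved_sensitivity(R, X, candidates=[1, 3])
rank = np.linalg.matrix_rank(S)
print(S.shape, rank)
```

In this toy model, the real and reactive columns are independent only because the r/x ratio differs between line segments; with a uniform ratio, each reactive column would be a scaled copy of the corresponding real column.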
This distinction can also be motivated through a graphical example using simulated results from the IEEE 13-bus test feeder. If the submatrices of the matrix associated with real and reactive power injections are compared graphically, the functions behave very differently. Submatrices of containing the alternating sensitivity columns corresponding to real or reactive power are shown in

Fig. 1 Submatrices of containing alternating sensitivity columns corresponding to real or reactive power. (a) Sensitivity of node voltage to real power. (b) Sensitivity of node voltage to reactive power.
The real and reactive power injection columns of , while linearly independent, may still exhibit some multicollinearity. We will later show how this contributes to the variance of the least-squares estimator in some use cases.
2) Measurement Data
When working with time-series data, we assume that networked voltage meters will give the user access to voltage profile data at bus for nodes across time horizon . By taking the simple difference between the voltage samples across time, we can form a difference matrix as follows:
(5) |
(6) |
where is the voltage deviation.
Furthermore, we define a measurement vector as a column vector containing samples of each row of the difference matrix at a specific time differential point of interest :
(7) |
(8) |
When considering solar PV, time intervals with low irradiance such as nighttime or cloudy days will have near-zero changes in voltage due to the real and reactive power injections of the PV and its corresponding inverter. Thus, it can be challenging to select in practice for each bus .
One choice is to simply select during midday, during which the irradiance, power magnitude as well as the variance of the injection between and are likely to be the highest [

Fig. 2 A subset of voltage deviation entries in from 08:00 to 16:20 for 100 kW solar PV installation with power factor of 0.8 on 633(1) of the IEEE 13-bus feeder.
Generally, the measurement vector is most likely to best approximate a linear combination of columns of during times of maximum variance in the voltage profile between time points. captures the change in voltage magnitude during the time differential due to a PV injection change at a location . When the vector is normalized by the size of the PV system, the contributions of the columns of corresponding to real and reactive power injections from a PV and var-dispatching inverter located at bus can be estimated with linear regression models.
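Forming the difference matrix and picking a measurement vector can be sketched as below. The selection rule used here (the time differential with the largest Euclidean norm across buses) is only one plausible reading of "maximum variance in the voltage profile between time points" and is an assumption, as are the names `V`, `dV`, and `t_star` and the synthetic data.

```python
import numpy as np

def difference_matrix(V):
    """V has shape (n_buses, T); returns the (n_buses, T-1) simple differences in time."""
    return np.diff(V, axis=1)

def pick_measurement_vector(dV):
    """Assumed heuristic: choose the time differential with the largest norm."""
    t_star = int(np.argmax(np.linalg.norm(dV, axis=0)))
    return t_star, dV[:, t_star]

# Synthetic voltage profiles: each bus responds with its own gain to one PV profile.
gain = np.array([0.001, 0.002, 0.004])              # per-unit voltage per unit injection
profile = np.array([0.0, 0.1, 0.6, 0.7, 0.4, 0.0])  # made-up daily injection shape
V = 1.0 + np.outer(gain, profile)

dV = difference_matrix(V)
t_star, dv = pick_measurement_vector(dV)
print(t_star, dv)
```

Dividing `dv` by the size of the PV system, as described above, would be a separate normalization step before any regression.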
Since sensor data noise will primarily be driven by additive white Gaussian noise (AWGN), we assume that all model errors are distributed according to a standard normal .
1) Least-squares Regression
The measurements obtained from the sensors will contain noise due to manufacturing errors, sensor class, or model inaccuracies. Hence, when projecting the vector on the subspace of , we obtain close but not exactly equal injection contributions. Using the measurement vector and the voltage sensitivity matrix , we can form a multiple linear regression approximation to estimate the corresponding injection type, location, and quantity as:
(9) |
As in many parameter estimation problems, we estimate a coefficient vector that, when multiplied by the design matrix, yields an approximation of the measurement targets:
(10) |
where is the coefficient vector that results in the minimization of the sum of squared errors [
(11) |
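A minimal sketch of the least-squares step and the resulting power factor estimate is given below. The variable names `S`, `dv`, and `beta_hat`, the interleaved coefficient ordering, and the synthetic data are assumptions made for illustration. The power factor is computed without sign, so leading and lagging operation are not distinguished here.

```python
import numpy as np

def estimate_injections(S, dv):
    """Ordinary least-squares fit of dv ~ S @ beta (minimizes the sum of squared errors)."""
    beta_hat, *_ = np.linalg.lstsq(S, dv, rcond=None)
    return beta_hat

def power_factors(beta_hat):
    """Unsigned power factor |P| / sqrt(P^2 + Q^2) per candidate bus.

    Assumes interleaved coefficients [P_0, Q_0, P_1, Q_1, ...].
    """
    pq = beta_hat.reshape(-1, 2)
    return np.abs(pq[:, 0]) / np.hypot(pq[:, 0], pq[:, 1])

rng = np.random.default_rng(0)
S = rng.normal(size=(12, 4))                  # stand-in for a 12-bus, 2-candidate model
beta_true = np.array([0.8, 0.6, 1.0, 0.0])    # bus 0 at pf 0.8, bus 1 at unity pf
dv = S @ beta_true + 1e-4 * rng.normal(size=12)

beta_hat = estimate_injections(S, dv)
pf_hat = power_factors(beta_hat)
print(beta_hat, pf_hat)
```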
2) Shrinkage Methods
A properly formed matrix with will be full rank, allowing for simultaneous estimation of real and reactive power using voltage data only and the distinct real and reactive power sensitivity signatures. However, there may be multicollinearity between the columns of the interleaved . Therefore, the least-squares estimator, while unbiased, may have high variance [
There are several instances where an ill-posed estimation could be faced. As the number of candidate injection columns in approaches the number of measurement buses, i.e., , we are exposed to a higher risk of overfitting and multicollinearity in the sensitivities as . When , the estimation will become underdetermined.
If the distribution model used to derive the sensitivity matrix is significantly incorrect or out of date, some of the sensitivity signatures may be a poor basis for the model, and there may be instability in the fit against the observed measurement data. In these instances, the high variance of the least-squares estimator can be reduced by applying ridge regression, which allows the user to place a penalty on the norm of the solution, thus biasing the model toward solutions that are more regular. This may significantly improve the predictive accuracy of the model.
Forming an underdetermined sensitivity matrix (more columns than rows) may also be necessary if the distribution engineer is uncertain about the DER location but still wishes to estimate the power factor settings. In this case, a higher-dimensional underdetermined will result in few coefficients of the injection estimation vector being relevant to the estimation of the DER’s power factor. This can be cast as a sparse approximation problem. To form our model, we apply an ℓ1-norm penalty to the least-squares model that we developed previously and optimize for the weight of the penalty that yields the strongest predictive power. This technique is also known as the least absolute shrinkage and selection operator (LASSO) and has the convenient ability to perform simultaneous model selection and feature extraction [
(12) |
where is the Lagrangian penalty factor set by the user. This method, also known as the basis pursuit, promotes sparsity. This means that the resulting estimation vector will have a small number of nonzero components. Unlike the least-squares formulation, there is no closed-form solution [
(13) |
The equivalent Lagrangian form can also be expressed as:
(14) |
where is an element of the measurement vector observed; and is a particular row of .
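Because the LASSO problem has no closed-form solution, an iterative solver is needed. The sketch below uses iterative soft-thresholding (ISTA), a standard proximal gradient method chosen here for illustration; it is not necessarily the solver used for the results in this paper. The objective scaling `0.5*||y - S b||^2 + lam*||b||_1`, the names `lasso_ista` and `lasso_objective`, and the synthetic underdetermined data are assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_objective(S, y, b, lam):
    return 0.5 * np.sum((y - S @ b) ** 2) + lam * np.sum(np.abs(b))

def lasso_ista(S, y, lam, n_iter=5000):
    """Minimize 0.5*||y - S b||_2^2 + lam*||b||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(S, 2) ** 2        # Lipschitz constant of the gradient
    b = np.zeros(S.shape[1])
    for _ in range(n_iter):
        grad = S.T @ (S @ b - y)         # gradient of the squared-error term
        b = soft_threshold(b - grad / L, lam / L)
    return b

rng = np.random.default_rng(1)
S = rng.normal(size=(10, 20))            # 10 measured buses, 10 candidate buses (P and Q)
beta_true = np.zeros(20)
beta_true[6:8] = [0.8, 0.6]              # a single DER at candidate bus 3 (assumption)
y = S @ beta_true

lam_max = np.max(np.abs(S.T @ y))        # at or above this penalty, the solution is all zeros
beta_lasso = lasso_ista(S, y, lam=0.1 * lam_max)
print(np.round(beta_lasso, 3))
```

In practice the penalty weight would be chosen by cross-validation rather than fixed as a fraction of `lam_max`.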
For cases where there is a high degree of multicollinearity in the sensitivity matrix, e.g., a poor or inaccurate distribution model, we propose the use of ridge regression [
(15) |
This can be converted to the standard least-squares solution in (11) by concatenating to the bottom of and zeros to the bottom of , allowing for a closed-form solution.
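The augmentation described above can be checked numerically. In the sketch below, the block appended to the bottom of the design matrix is assumed to be the square root of the penalty times the identity, which is the standard construction; the function names and synthetic data are illustrative assumptions.

```python
import numpy as np

def ridge_augmented(S, y, lam):
    """Ridge estimate via ordinary least squares on augmented data."""
    k = S.shape[1]
    S_aug = np.vstack([S, np.sqrt(lam) * np.eye(k)])   # append sqrt(lam) * I below S
    y_aug = np.concatenate([y, np.zeros(k)])           # append zeros below y
    beta, *_ = np.linalg.lstsq(S_aug, y_aug, rcond=None)
    return beta

def ridge_closed_form(S, y, lam):
    """Direct solution of (S^T S + lam I) beta = S^T y."""
    k = S.shape[1]
    return np.linalg.solve(S.T @ S + lam * np.eye(k), S.T @ y)

rng = np.random.default_rng(3)
S = rng.normal(size=(15, 6))
y = rng.normal(size=15)
lam = 0.5

beta_aug = ridge_augmented(S, y, lam)
beta_cf = ridge_closed_form(S, y, lam)
beta_ls, *_ = np.linalg.lstsq(S, y, rcond=None)
print(np.linalg.norm(beta_aug), np.linalg.norm(beta_ls))
```

Both routes give the same coefficients, and the ridge solution has a smaller norm than the unpenalized least-squares solution, which is the shrinkage effect described above.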
This technique smooths out singularities in the sensitivity data. Similar to least-squares, it is assumed that the coefficients of and the residuals of this estimate will also be normally distributed [
For both of the regularized regression models described in (12) and (15), the norm penalty can easily be chosen in practice using modern cross-validation algorithms such as those in [
1) Chi-squared Goodness-of-fit
A chi-squared goodness-of-fit test can be used for the least-squares procedure by noting that the estimated voltage magnitude deviations can be given by . By the central limit theorem, the normalized residuals of these voltage deviations are assumed to be distributed according to a standard normal, i.e.,
(16) |
where is the element of ; and is the residual.
Voltage meters typically have an error of less than one percent, so it is assumed that . Note that this is a worst-case scenario, and most meters in practice have errors of less than half a percent.
The least-squares solution minimizes the sum of squares of the residuals according to the definitions in (11). Hence, we can compute the chi-squared test statistic as:
(17) |
The value of can be written succinctly as:
(18) |
where is defined as:
(19) |
Using the observations above, we can then perform a chi-squared goodness-of-fit test with degrees of freedom:
(20) |
where is the goodness-of-fit heuristic of the injection estimates used as the basis for the power factor estimation given the sensitivity model.
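A hedged sketch of the goodness-of-fit computation for the least-squares estimate follows. The degrees of freedom are taken as the number of measurements minus the number of estimated coefficients, which is the usual choice for a linear fit and is an assumption here, as are the value `sigma = 0.01` (a one percent meter error) and the synthetic data. SciPy's `chi2.sf` is used for the upper-tail probability.

```python
import numpy as np
from scipy.stats import chi2

def chi2_goodness_of_fit(S, dv, beta_hat, sigma):
    """Chi-squared statistic of the normalized residuals and its upper-tail p-value."""
    resid = dv - S @ beta_hat
    stat = float(np.sum((resid / sigma) ** 2))
    dof = S.shape[0] - S.shape[1]          # assumption: measurements minus coefficients
    p_value = float(chi2.sf(stat, dof))    # probability of a statistic at least this large
    return stat, dof, p_value

rng = np.random.default_rng(2)
S = rng.normal(size=(40, 4))
beta_true = np.array([0.8, 0.6, 1.0, 0.0])
sigma = 0.01                               # assumed measurement error level
dv = S @ beta_true + sigma * rng.normal(size=40)

beta_hat, *_ = np.linalg.lstsq(S, dv, rcond=None)
stat, dof, p_value = chi2_goodness_of_fit(S, dv, beta_hat, sigma)
print(stat, dof, p_value)
```

A very small p-value would suggest that the sensitivity model does not explain the observed voltage deviations at the assumed noise level.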
However, (17) is not valid for the shrinkage estimates and , as these estimates by definition do not always reach the global minimum sum of squared errors. Goodness-of-fit tests for LASSO, ridge, and other regularized regression estimates are a relatively recent topic; see [
2) Confidence Intervals
Reference [
(21) |
where is the diagonal element of the variance-covariance matrix for the estimated regression coefficients, defined as:
(22) |
where is the mean squared error (MSE) of the model.
Similar to the chi-squared goodness-of-fit test, this analytic confidence interval can only be obtained for the least-squares estimator. However, a well-known method for achieving a confidence interval that is applicable to all of the aforementioned methods, including least-squares, is the bootstrap method [
Note that the entries of our parameter estimate are the projections of onto the subspace , as described in (9). This interpretation of the model holds for all estimators developed. To obtain confidence intervals for each injection coefficient, we can draw subsets of random rows of and entries, yielding augmented regression data denoted as and . At each iteration, we then estimate a new by projecting onto the rows of . This method captures the variance in the estimate attributable to the data [
We can obtain a maximum likelihood estimate of the expected value for the injection estimates by finding the sample mean of the bootstrapped statistic:
(23) |
For least-squares regression models, it holds that bootstrap confidence intervals are exactly equivalent to the analytic confidence intervals described in (21) as [
We propose the use of a nonparametric bootstrap confidence interval, specifically the percentile methodology, which is frequently used for the estimators we have selected. Through the resampling process described above, we can obtain an approximate distribution for the coefficient of interest using this method. Finally, a confidence interval for the coefficient can be obtained using the percentiles. In symbols, this would be the such that:
(24) |
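The percentile bootstrap described above can be sketched as follows. Rows of the design matrix and the matching measurement entries are resampled together with replacement, which is the common pairs bootstrap and an assumption about the exact resampling scheme; the names `fit`, `n_boot`, and `alpha` are also illustrative. Any of the estimators discussed earlier could be passed in as `fit`.

```python
import numpy as np

def bootstrap_percentile_ci(S, dv, fit, n_boot=500, alpha=0.05, seed=0):
    """Pairs bootstrap: resample rows, refit, and take percentile bounds per coefficient."""
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    draws = np.empty((n_boot, S.shape[1]))
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)      # sample row indices with replacement
        draws[b] = fit(S[rows], dv[rows])
    lo, hi = np.percentile(draws, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return draws.mean(axis=0), lo, hi

def least_squares(S, dv):
    return np.linalg.lstsq(S, dv, rcond=None)[0]

rng = np.random.default_rng(4)
S = rng.normal(size=(30, 4))
beta_true = np.array([0.8, 0.6, 1.0, 0.0])
dv = S @ beta_true + 0.05 * rng.normal(size=30)

beta_mean, ci_lo, ci_hi = bootstrap_percentile_ci(S, dv, least_squares)
print(beta_mean, ci_lo, ci_hi)
```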
The sensitivity matrix can be understood as a static, model-based quantity fixed intertemporally for the power system under analysis. The construction of this matrix is straightforward and is outlined in Algorithm 1.
As discussed previously, must be full rank to perform the estimation. Depending on the power system modeled, some columns may have linear dependencies, because the nodes that they represent may be electrically identical. For instance, the IEEE 13-bus test feeder has a switch between buses 671 and 692 [
(25) |
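The filtering of electrically identical candidate buses can be illustrated with the sketch below. The exact similarity metric is not reproduced here; a relative Euclidean distance between the column pairs of two buses is used as a stand-in, and the tolerance, function names, and synthetic matrix are assumptions. Only columns are dropped in this sketch; removing the matching rows would follow the same pattern.

```python
import numpy as np

def near_identical_buses(S, tol=1e-6):
    """Pairs (i, j) of candidate buses whose real and reactive columns nearly coincide."""
    m = S.shape[1] // 2
    pairs = []
    for i in range(m):
        for j in range(i + 1, m):
            a = S[:, 2 * i:2 * i + 2]
            b = S[:, 2 * j:2 * j + 2]
            rel = np.linalg.norm(a - b) / max(np.linalg.norm(a), np.linalg.norm(b))
            if rel < tol:                     # stand-in similarity test (assumption)
                pairs.append((i, j))
    return pairs

def drop_buses(S, buses):
    """Remove the interleaved real and reactive columns of the listed candidate buses."""
    m = S.shape[1] // 2
    keep = [c for k in range(m) if k not in buses for c in (2 * k, 2 * k + 1)]
    return S[:, keep]

rng = np.random.default_rng(5)
S0 = rng.normal(size=(8, 4))                  # two distinct candidate buses
S = np.hstack([S0, S0[:, 2:4]])               # bus 2 duplicates bus 1, as across a closed switch

pairs = near_identical_buses(S)
S_clean = drop_buses(S, {j for _, j in pairs})
print(pairs, np.linalg.matrix_rank(S), np.linalg.matrix_rank(S_clean))
```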
For the IEEE 13-bus test feeder, we discard the rows and columns associated with bus 692, yielding a full rank matrix. We also use the similarity metric (25) for a hierarchical clustering of the sensitivities, using the single linkage clustering algorithm described in [
The visualization of sensitivity columns in an example matrix formed with 100 kW or 100 kvar injections on the IEEE 13-bus test feeder is shown in

Fig. 3 Visualization of sensitivity columns in formed with 100 kW or 100 kvar injections on IEEE 13-bus test feeder.
Using the injection estimation methods, we form
In this section, we present the results for two case studies using a small (IEEE 13-bus) feeder and a larger (IEEE 123-bus) feeder to demonstrate the performance of the algorithms. The sensitivity matrix models for the feeders are constructed according to (1) and Algorithm 1. As described previously, identical points are identified and filtered using (25), with an tolerance of , and the removal decisions are verified by referencing [
Typically, a DER system outputs a mixture of real and reactive power at the interconnection point. Thus, we will show the results obtained when static injections of real and reactive power are placed on the buses of interest. This model represents a particular measurement instance where a grid-connected PV system simultaneously generates real power, and the PCC voltage is being regulated by a reactive power injection or absorption by its advanced inverter system.
For this experiment, we consider two static three-phase injections on buses 633 and 671 of the IEEE 13-bus feeder, as shown in

Fig. 4 Graph plot of IEEE 13-bus feeder showing interconnection locations for static systems.
By using Algorithm 1, the sensitivity matrix is constructed and preprocessed as described previously using (25). Using the first control flow of
When the estimation problem is well posed, i.e., , least-squares regression can be used without issue, yielding accurate estimates for the injections and power factor shown in Figs.

Fig. 5 Well-posed () least-squares injection estimate for three-phase injections.

Fig. 6 Well-posed least-squares power factor estimation for three-phase injections.
The least-squares injection estimate is unbiased, as is the case for least-squares estimators under the standard linear model assumptions. However, these estimators typically have a high variance.
To illustrate this tradeoff,

Fig. 7 Bootstrap sampling distributions for a subset of least-squares injection estimate coefficients of IEEE 13-bus model.
With least-squares, while the mean of the coefficient distribution in principle represents an accurate estimate of the true parameter, the precision may be low for small feeders. This is because for small feeder models such as the IEEE 13-bus case, the matrix has limited samples , which contributes to the variance in the coefficient distribution.
As we expand the number of candidate injections in , the problem becomes ill-posed, and the risk of overfitting to the sensitivities increases dramatically. In our experiments, a large amount of instability in the least-squares solution is observed when candidate injection columns are used in . However, using the shrinkage estimator , a much more accurate solution is obtained.

Fig. 8 Ill-posed ridge regression injection estimation vector for three-phase injections.
Distribution engineers may be interested in a wider range of candidate buses when considering feeder models such as the IEEE 123-bus case. Furthermore, in large feeder models, it may be more difficult to preprocess the matrix so that electrically identical rows and columns are removed due to incomplete information regarding the state of the switches.

Fig. 9 LASSO regression coefficients for single-phase injections.
Like ridge, LASSO coefficients also exhibit significantly less variance, as shown in

Fig. 10 Example of bootstrap sampling distributions for a subset of LASSO injection estimate coefficients for IEEE 123-bus model.
Further details of the behavior of the optimization algorithm are shown in

Fig. 11 LASSO injection estimate coefficients and MSE trace plots versus . (a) Coefficients weights versus . (b) MSE versus .
In
To generate the trace in
There are several key limitations to the results in this paper. The preprocessing of time-series data in forming the vector has not yet been fully solved. If the user only has access to this type of data, determining which subset of to sample as may be challenging.
A practical solution is to select a window of values in the midday period of the time-series. Recent developments in the field of statistical learning could be used to perform feature extraction or selection to identify the timepoints of interest [
As for many physics-based inverse problems, the estimation methods provided in this paper may have a high degree of variance. Additional improvements could be studied to determine the proper amount of bias to add to the estimators to enable more interpretable models, particularly when working with non-midday time-series data.
An additional limitation of these methods is that the rough location of the distributed generator is assumed to be known. However, [
Future work will include a more robust time-series feature extraction method as well as the extension of these methods to the estimation of other inverter parameters such as control settings and curtailment.
This research has presented a novel physics-based and data-driven method to estimate the injection state and power factor of inverter-based DERs. Based purely on voltage magnitude measurements and model-derived sensitivities, flexible estimation approaches have been developed for various realistic use cases.
Firstly, we have shown that for a fixed power system model with a properly constructed sensitivity matrix, the real and reactive power voltage sensitivity signatures for a set of candidate buses under study are linearly independent, in which case, is full rank. Notably, we have also shown that there exists a significant enough difference between these signatures to estimate the real and reactive power injection states of inverter-based DERs using purely voltage measurements and linear parameter estimation models.
Additionally, we have shown that regularization methods are highly effective at improving the precision of the linear models in certain use case scenarios. Often, the least squares model may not be a viable option from a practical standpoint. For models with poor goodness of fit as derived in Section II, or when the model characteristics meet those described in Section III, the regularized methods will often be necessary, and will improve the estimation performance.
As the demand and need for solar PV and other inverter-based DERs increase, it is vital to access the information that characterizes these oftentimes unobservable distributed systems to ensure a smooth transition to a decarbonized grid. These algorithms can be used in power system planning, operation, and control applications to give utilities and ISOs the ability to estimate important control parameters and the operating point of these distributed generation systems. This can be of service in bolstering sustainable power system planning and decarbonization efforts.
References
M. U. Qureshi, S. Grijalva, and M. J. Reno, “A fast quasi-static time series simulation method for PV smart inverters with VAR control using linear sensitivity model,” in Proceedings of 2018 IEEE 7th World Conference on Photovoltaic Energy Conversion (WCPEC), Waikoloa, USA, Jun. 2018, pp. 1614-1619.
S. Deshmukh, B. Natarajan, and A. Pahwa, “State estimation and voltage/var control in distribution network with intermittent measurements,” IEEE Transactions on Smart Grid, vol. 5, no. 1, pp. 200-209, Jan. 2014.
Q. Long, J. Wang, D. Lubkeman et al., “Volt-var optimization of distribution systems for coordinating utility voltage control with smart inverters,” in Proceedings of 2019 IEEE PES Innovative Smart Grid Technologies Conference (ISGT), Washington, USA, Feb. 2019, pp. 1-5.
A. O’Connell and A. Keane, “Volt-var curves for photovoltaic inverters in distribution systems,” IET Generation, Transmission & Distribution, vol. 11, no. 3, pp. 730-739, Mar. 2017.
IEEE Standard for Interconnection and Interoperability of Distributed Energy Resources with Associated Electric Power Systems Interfaces, IEEE Standard 1547-2018, 2018.
L. Blakely, M. J. Reno, and J. Peppanen, “Identifying common errors in distribution system models,” in Proceedings of 2019 IEEE 46th Photovoltaic Specialists Conference (PVSC), Chicago, USA, Jun. 2019, pp. 3132-3139.
K. Christakou, J. LeBoudec, M. Paolone et al., “Efficient computation of sensitivity coefficients of node voltages and line currents in unbalanced radial electrical distribution networks,” IEEE Transactions on Smart Grid, vol. 4, no. 2, pp. 741-750, Jun. 2013.
C. McEntee, N. Lu, and D. Lubkeman. (2020, Oct.). A regression-based voltage estimation method for distribution volt-var control with limited data. [Online]. Available: https://arxiv.org/abs/2010.12456
M. Rylander, M. J. Reno, J. E. Quiroz et al., “Methods to determine recommended feeder-wide advanced inverter settings for improving distribution system performance,” in Proceedings of 2016 IEEE 43rd Photovoltaic Specialists Conference (PVSC), Portland, USA, Jun. 2016, pp. 1393-1398.
M. Rylander, J. Smith, and H. Li, “Determination of smart inverter control settings to improve distribution system performance,” in Proceedings of CIGRE US National Committee 2014 Grid of the Future Symposium, Houston, USA, Oct. 2014, pp. 1-6.
R. Dobbe, P. Hidalgo-Gonzalez, S. Karagiannopoulos et al., “Learning to control in power systems: design and analysis guidelines for concrete safety problems,” Electric Power Systems Research, vol. 189, p. 106615, Dec. 2020.
M. Emmanuel, J. I. G. Miner, P. Gotseff et al., “Estimation of solar photovoltaic energy curtailment due to volt-watt control,” IET Renewable Power Generation, vol. 14, no. 4, p. 6, Jan. 2020.
S. Grijalva, A. Khan, J. S. Mbeleg et al., “Estimation of PV location in distribution systems based on voltage sensitivities,” in Proceedings of IEEE North American Power Symposium (NAPS), Tempe, USA, Apr. 2021, p. 5.
C. Gomez-Peces, S. Grijalva, M. J. Reno et al., “Estimation of PV location based on voltage sensitivities in distribution systems with discrete voltage regulation equipment,” in Proceedings of IEEE PowerTech Madrid, Madrid, Spain, Jun. 2021, pp. 1-6.
A. Dubey and S. Santoso, “On estimation and sensitivity analysis of distribution circuit’s photovoltaic hosting capacity,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 2779-2789, Jul. 2017.
J. L. Devore and K. N. Berk, Modern Mathematical Statistics with Applications, New York: Springer, 2nd ed., 2012.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer, Jan. 2017.
W. N. van Wieringen. (2020, Aug.). Lecture notes on ridge regression. [Online]. Available: https://arxiv.org/abs/1509.09169
F. Pedregosa, G. Varoquaux, A. Gramfort et al., “Scikit-learn: machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825-2830, Dec. 2011.
R. Lockhart, J. Taylor, R. J. Tibshirani et al., “A significance test for the lasso,” The Annals of Statistics, vol. 42, no. 2, pp. 413-468, Apr. 2014.
A. Davison and D. Hinkley, “Bootstrap methods and their application,” Journal of the American Statistical Association, vol. 94, p. 5, Jan. 1997.
W. H. Kersting and G. Shirek, “Short circuit analysis of IEEE test feeders,” in Proceedings of 2012 IEEE PES Transmission and Distribution Conference and Exposition, New Orleans, USA, May 2012, pp. 1-9.
W. H. Kersting, Distribution System Modeling and Analysis, Boca Raton: CRC Press, 2018.
D. Müllner. (2011, Sept.). Modern hierarchical, agglomerative clustering algorithms. [Online]. Available: https://arxiv.org/abs/1109.2378
W. H. Kersting, “Radial distribution test feeders,” IEEE Transactions on Power Systems, vol. 6, no. 3, pp. 975-985, Aug. 1991.
Y. Sun, J. Li, J. Liu et al., “Using causal discovery for feature selection in multivariate numerical time series,” Machine Learning, vol. 101, pp. 377-395, Oct. 2015.
J. Taylor and R. J. Tibshirani, “Statistical learning and selective inference,” Proceedings of the National Academy of Sciences, vol. 112, pp. 7629-7634, Jun. 2015.
K. Mason, M. J. Reno, L. Blakely et al., “A deep neural network approach for behind-the-meter residential PV size, tilt and azimuth estimation,” Solar Energy, vol. 196, pp. 260-269, Jan. 2020.