Journal of Modern Power Systems and Clean Energy

ISSN 2196-5625 CN 32-1884/TK


  • Xiaoge Huang 1 (Student Member, IEEE)
  • Zhenhuan Ding 2 (Member, IEEE)
  • Zhao Liu 3 (Member, IEEE)
  • Tianqiao Zhao 4 (Member, IEEE)
  • Pei Zhang 5 (Fellow, IEEE)
  • Xiaojun Wang 3 (Senior Member, IEEE)
1. Department of Electrical and Computer Engineering, State University of New York at Binghamton, Binghamton, USA; 2. School of Artificial Intelligence, Anhui University, Hefei 230601, China; 3. School of Electrical Engineering, Beijing Jiaotong University, Beijing 100044, China; 4. Brookhaven National Lab, Upton, USA; 5. School of Electrical Automation and Information Engineering, Tianjin University, Tianjin 300072, China

Updated:2025-01-22

DOI:10.35833/MPCE.2024.000279


Abstract

The hybrid photovoltaic (PV)-battery energy storage system (BESS) plant (HPP) can gain revenue by performing energy arbitrage in low-carbon power systems. However, multiple operational uncertainties challenge the profitability and reliability of HPP in the day-ahead market. This paper proposes two coherent models to address these challenges. Firstly, a knowledge-driven penalty-based bidding (PBB) model for HPP is established, considering forecast errors of PV generation, market prices, and under-generation penalties. Secondly, a data-driven dynamic error quantification (DEQ) model is used to capture the variational pattern of the distribution of forecast errors. The role of the DEQ model is to guide the knowledge-driven bidding model. Notably, the DEQ model aims at the statistical optimum, but the knowledge-driven PBB model aims at the operational optimum. These two models have independent optimizations based on misaligned objectives. To address this, the knowledge-data-complementary learning (KDCL) framework is proposed to align data-driven performance with knowledge-driven objectives, thereby enhancing the overall performance of the bidding strategy. A tailored algorithm is proposed to solve the bidding strategy. The proposed bidding strategy is validated by using data from the National Renewable Energy Laboratory (NREL) and the New York Independent System Operator (NYISO).

A. Indices

d Column index of standardized datasets

k Index of iterations

n Sample index of datasets

r Index of scenarios

t Index of time

B. Parameters

\( \xi_1 \) Forecast error of photovoltaic (PV) generation (MW)

\( \xi_2 \) Forecast error of market clearing price ($/MWh)

\( \xi_3 \) Forecast error of under-generation penalty ($/MWh)

\( \underline{\xi}_1, \overline{\xi}_1 \) The minimum and maximum values of \( \xi_1 \) (MW)

\( \underline{\xi}_2, \overline{\xi}_2 \) The minimum and maximum values of \( \xi_2 \) ($/MWh)

\( \underline{\xi}_3, \overline{\xi}_3 \) The minimum and maximum values of \( \xi_3 \) ($/MWh)

\( \eta \) Auxiliary variable

\( \sigma_{f,d}, \sigma_{k,d} \) Kernel hyperparameters

\( \eta_c, \eta_d \) Charging and discharging efficiencies of battery energy storage system (BESS)

\( \epsilon \) Risk of uncertainty outlier

\( \pi \) Dual variable

\( \gamma_r \) Probability of the rth scenario

\( \mathcal{A} \) Number of modified training sets

\( \Sigma \) Covariance matrix

\( \hat{\lambda} \) Forecast value of market clearing price ($/MWh)

\( \lambda \) Actual value of market clearing price ($/MWh)

\( \hat{\rho} \) Forecast value of under-generation penalty ($/MWh)

\( \rho \) Actual value of under-generation penalty ($/MWh)

\( c_B \) Operational cost of charge or discharge ($/MWh)

\( E^{max} \) The maximum capacity of BESS (MWh)

\( E \) Capacity of BESS (MWh)

\( E_t^{Day} \) BESS storage level of Day at time t (MWh)

\( \mathcal{G} \) Gaussian process

\( K_d(\cdot,\cdot) \) Covariance matrix

\( M_{big} \) A large auxiliary constant value

\( \hat{P}_{PV} \) Forecast value of PV generation (MW)

\( P(\theta_d^{(N+1)}) \) Distribution of \( \theta_d^{(N+1)} \)

\( P_{PV} \) Actual value of PV generation (MW)

\( P_{PV}^{max} \) Capacity of PV (MW)

\( P_B^{max} \) Rated power of BESS (MW)

\( v_{mp} \) Vector of master problem (MP) binary variables

\( v_{sp} \) Vector of subproblem (SP) binary variables

C. Matrices and Sets

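\( \mathcal{X} \) Dataset matrix of forecast values

\( \mathcal{Y} \) Dataset matrix of forecast errors

\( \mu_{\mathcal{X}}, \mu_{\mathcal{Y}} \) Sample means of standardized dataset matrices \( \Theta_{\mathcal{X}} \) and \( \Theta_{\mathcal{Y}} \)

\( \Sigma_{\mathcal{X}}, \Sigma_{\mathcal{Y}} \) Sample covariances of standardized dataset matrices \( \Theta_{\mathcal{X}} \) and \( \Theta_{\mathcal{Y}} \)

\( \Xi_{1,t} \) Set of forecast errors of PV generation at time t

\( \Xi_{2,t} \) Set of forecast errors of market clearing price at time t

\( \Xi_{3,t} \) Set of forecast errors of under-generation penalty at time t

\( T \) Set of hours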

D. Uncertain Parameters

\( \xi_{1,t} \) Forecast error of PV generation at time t (MW)

\( \xi_{2,t} \) Forecast error of market clearing price at time t ($/MWh)

\( \xi_{3,t} \) Forecast error of under-generation penalty at time t ($/MWh)

\( \dot{\xi}_{2,t} \) Equivalent market clearing price at time t ($/MWh)

\( \dot{\xi}_{3,t} \) Equivalent under-generation penalty at time t ($/MWh)

E. Decision Variables

\( \lambda_t^{bid} \) Bidding price of hybrid PV-BESS plant (HPP) at time t ($/MWh)

\( E_t \) BESS storage level at time t (MWh)

\( P_t^{bid} \) Bidding quantity of HPP at time t (MW)

\( P_{PV,t}^{sell} \) PV generation scheduled to be sold at time t (MW)

\( P_{B,t}^{arb+}, P_{B,t}^{arb-} \) Charging and discharging power of BESS for arbitrage at time t (MW)

\( P_{B,t}^{com+}, P_{B,t}^{com-} \) Reserve power to compensate for over- and under-generation with BESS at time t (MW)

\( P_{PV,t}^{UG} \) Power of PV under-generation at time t (MW)

\( P_{B,t}^{adjust} \) Adjustment of charge scheduling in stage I at time t (MW)

PI Profitability index of bidding strategy

PRBI Balance index of profitability-reliability

RI Reliability index of bidding strategy

\( u_t^{+}, u_t^{-} \) Variables denoting charging and discharging of BESS at time t

I. Introduction

The movement towards a low-carbon power system spurs the increasing integration of clean energy production, particularly large-scale photovoltaic (PV) installations, into the energy network [1]. However, this movement poses challenges to the reliability and safety of power system operation due to the unpredictability of clean energy generation [2]. In this context, the application of the hybrid PV-battery energy storage system (BESS) plant (HPP) has gained considerable attention as it enhances the resilience of PV plants against PV generation uncertainties [3]. With the support of BESS, HPP offers a controllable and friendly power supply and can participate in the day-ahead market [4]. However, the involvement of HPP in the market introduces imbalances between system demand and supply. From a system-level perspective, these imbalances can be mitigated by expanding the reserve market or strengthening the system operator intervention [5]. Alternatively, these imbalances can be managed at the plant level. To achieve this, an HPP must account for all uncertainties in its bidding strategy. Thus, the development of an effective bidding strategy to mitigate the impact of uncertainties has gained significant attention.

Two fundamental methods are employed to address multiple uncertainties: stochastic optimization (SO) and robust optimization (RO). While SO focuses on risk-neutral decision-making [6]-[9], RO aims to devise risk-averse strategies [10]-[13]. Recently, researchers have explored approaches to combine SO and RO. For example, the stochastic robust methods in [14]-[16] and stochastic adaptive robust methods in [17] are used in energy trading tasks. The general idea of these methods is to apply RO and SO individually to specific uncertainties. However, these combinations introduce ambiguity about the nature of risk, making it challenging to determine whether the solution is risk-neutral or risk-averse. Distributionally robust optimization (DRO) inherits the merits of SO and RO, making it a popular paradigm in scheduling and operation tasks [18]-[20]. However, DRO typically requires optimization for each historical sample [21], [22], demanding significant computation power. In addition, it is difficult to select an appropriate ambiguity set for DRO when multiple uncertainties exist. The above-mentioned methods primarily encapsulate the engineering knowledge of HPP into tractable optimization models, which are referred to as knowledge-driven models in this paper. In contrast to the knowledge-driven models, fully data-driven deep reinforcement learning (DRL) models are proposed for bidding in [4], [23], and [24], but designing reward functions that ensure convergence remains challenging.

Quantifying forecast errors for operational uncertainties is crucial for the uncertain optimization of HPP operation. Traditional data-driven models such as maximum percentage error [11] and historical confidence intervals [10] are widely used. Recently, advanced data-driven models such as Gaussian mixture [25] and Dirichlet process mixture models [14] have provided more flexibility in capturing uncertainties from large datasets. Optimization-based models are also explored to customize uncertainty sets [20]. Notably, the operational uncertainties are highly volatile and can be affected by non-random factors such as day length, temperature, and clearness indices [26]. While the data-driven models can capture overall dataset uncertainties, they cannot reveal the dynamic volatility of the uncertainties. To fill this gap, a data-driven dynamic error quantification (DEQ) model is proposed to learn the variational pattern of the distribution of forecast errors and estimate the dynamic distribution of forecast errors.

Considering the recent surge in data-driven models, the combination of data-driven and knowledge-driven models is the trend for uncertainty-dependent optimization in power systems. For instance, support vector machine models are integrated with microgrid economic dispatch in [27]. Graph deep learning is combined with security-constrained unit commitment in [28]. A clustering technique is embedded in the unit commitment problem in [29]. A framework merging reinforcement learning and distribution voltage management is proposed in [30]. These implementations showcase the potential of the knowledge-data-combined paradigm. In this paper, we combine the data-driven DEQ model and the knowledge-driven bidding model to improve the performance of HPP.

Furthermore, we notice that the DEQ model and the bidding model exhibit misaligned optimization objectives when being simply linked together. The DEQ model is technically a black box of mined data, aiming at statistical minimization. However, the bidding model is oriented towards the operational objectives of HPP. Although the DEQ model is designed to improve the bidding model, the operational performance of HPP is not considered in the optimization of DEQ model, leading to indirect optimization guidance. Note that this misalignment of optimization objectives is a prevalent issue in current practices of knowledge-data-combined methods. To address this issue, we propose a knowledge-data-complementary learning (KDCL) framework to combine the DEQ and penalty-based bidding (PBB) models, which aligns the DEQ model with the operational objectives of HPP. The major contributions are summarized as follows.

1) A PBB model is proposed for HPP. Innovatively, we propose the concept of potential generation cost of HPP to quantify the impact of PV uncertainties based on the market penalty for generation deviations. The PBB model optimizes the HPP revenue against PV uncertainties by integrating the potential generation cost in the formulation. A novel pricing method is designed in the PBB model based on potential generation cost. The PBB model supports both risk-neutral and risk-averse solutions, catering to diverse user preferences.

2) A DEQ model is proposed to quantify the dynamic distributions of various forecast errors in an HPP. Based on the DEQ model, the dynamic data-driven constraints can be constructed, reducing the conservatism of static error modeling.

3) A KDCL framework is proposed to reinforce the PBB model by combining DEQ and PBB models. The KDCL framework uses the output of the PBB model to enhance the training data, further improving the profitability and reliability of the bidding strategy. Notably, the KDCL framework is a general framework that can be easily applied to other topics with knowledge-data-combined strategies.

4) A solution methodology is proposed for the PBB model and KDCL framework. Mathematical techniques are used to reformulate the PBB model, and a tailored column-and-constraint generation (C&CG) algorithm is developed for higher computational efficiency.

The rest of this paper is organized as follows. Section II outlines the general knowledge-data-complementary procedure in power system operations. Section III formulates the proposed bidding strategy for HPP via the KDCL framework. Section IV describes the solution methodology. Numerical results are presented in Section V. The conclusion of this paper is given in Section VI.

II. General Knowledge-data-complementary Procedure in Power System Operations

This section first outlines the limitations of the regular knowledge-data-combined procedure in power systems. Then, the necessity and advantages of the KDCL framework are presented.

A. Limitations of Regular Knowledge-data-combined Procedure

In power system applications, the data-driven and knowledge-driven models adopt distinct technical routes. Data-driven models, typically based on learning algorithms, extract and quantify operational uncertainties from historical data. In contrast, knowledge-driven models formulate engineering knowledge as an operation decision-making model, commonly using optimization techniques in practice.

Figure 1(a) depicts a regular knowledge-data-combined procedure in power systems. It begins with the data-driven model, which optimizes its accuracy in quantifying uncertainty offline and outputs data-driven parameters to the knowledge-driven model online. The data-driven parameters are diverse, including point forecasts, uncertainty distributions, etc. The knowledge-driven model is used fully online with data-driven parameters. It is optimized with respect to the operational interests, and its solution is adopted as the optimal decision.

Fig. 1  Regular knowledge-data-combined procedure and KDCL framework in power system operations. (a) Regular knowledge-data-combined procedure. (b) KDCL framework.

The regular knowledge-data-combined procedure seems reasonable if point forecast errors are zero or the relationship between quantification accuracy and operational objective is monotonic. However, achieving zero error is almost impossible. Also, the relationship between quantification accuracy and operational objective can be asymmetric to zero error and non-monotonic, as revealed in [31] and [32].

In this context, the learning objective of data-driven models should be extended to go beyond accuracy and be aligned with knowledge-driven objectives.

B. KDCL Framework

To align the objectives of data-driven and knowledge-driven models, one can embed the operational objective in the learning loss [31] or formulate the data-driven model as a constraint in the knowledge-driven model [32]. However, the data-driven and knowledge-driven models themselves can be non-linear and non-differentiable, which poses challenges to these approaches.

To this end, a KDCL framework that offers a model-free route is proposed to align the objectives of data-driven and knowledge-driven models, as depicted in Fig. 1(b). Unlike regular learning, the KDCL framework involves the operation optimization process in offline learning. A key innovation is the knowledge-guided data enhancement model, which fine-tunes the training data based on the optimization result, enabling the knowledge-driven model to guide the data-driven learning. Importantly, KDCL framework preserves the link between data-driven and knowledge-driven models via data-driven parameters, allowing them to use their standard solutions. The KDCL framework removes restrictions on the formats of the data-driven and knowledge-driven models and improves the optimal operation.

III. Bidding Strategy for HPP via KDCL Framework

This section presents a proof-of-concept study for the KDCL framework introduced in Section II, applying it to the bidding task of HPP. To suit this task, three models are proposed: a data-driven DEQ model, a knowledge-driven PBB model, and a knowledge-guided data enhancement model. These models are interconnected as depicted in Fig. 1(b), forming a bidding strategy for the HPP via KDCL framework. Detailed descriptions of these models are provided below.

A. Data-driven DEQ Model

The data-driven DEQ model quantifies forecast errors of PV generation, market price, and under-generation penalty. The distributions of these forecast errors are shaped by forecast methods, leading to variations across different HPPs. Given these variations, we choose the Gaussian distribution to model the forecast errors. The Gaussian distribution is chosen for its robustness against inaccurate distribution assumptions [33], effectively accommodating error variations from different forecast methods. Additionally, we use the Gaussian process [34] to capture the temporal variability in the error distribution of a specific forecast method. Therefore, the distribution of forecast errors is no longer static but a self-adapting function that responds to highly volatile forecast errors.

The errors between forecast values and actual values are defined as:

\[ \xi_1=P_{PV}-\hat{P}_{PV} \tag{1a} \]
\[ \xi_2=\lambda-\hat{\lambda} \tag{1b} \]
\[ \xi_3=\rho-\hat{\rho} \tag{1c} \]

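Given N samples, the dataset matrices \( \mathcal{X} \) and \( \mathcal{Y} \) are expressed as:

\[ \mathcal{X}=\begin{bmatrix}\hat{P}_{PV}^{(1)} & \hat{\lambda}^{(1)} & \hat{\rho}^{(1)}\\ \hat{P}_{PV}^{(2)} & \hat{\lambda}^{(2)} & \hat{\rho}^{(2)}\\ \vdots & \vdots & \vdots\\ \hat{P}_{PV}^{(N)} & \hat{\lambda}^{(N)} & \hat{\rho}^{(N)}\end{bmatrix},\quad \mathcal{Y}=\begin{bmatrix}\xi_1^{(1)} & \xi_2^{(1)} & \xi_3^{(1)}\\ \xi_1^{(2)} & \xi_2^{(2)} & \xi_3^{(2)}\\ \vdots & \vdots & \vdots\\ \xi_1^{(N)} & \xi_2^{(N)} & \xi_3^{(N)}\end{bmatrix} \tag{2} \]

The standardized dataset matrices \( \Theta_{\mathcal{X}} \) and \( \Theta_{\mathcal{Y}} \) can be defined as:

\[ \Theta_{\mathcal{X}}=(\mathcal{X}-\mu_{\mathcal{X}})\Sigma_{\mathcal{X}}^{-\frac{1}{2}}=\begin{bmatrix}\hat{\theta}_1^{(1)} & \hat{\theta}_2^{(1)} & \hat{\theta}_3^{(1)}\\ \hat{\theta}_1^{(2)} & \hat{\theta}_2^{(2)} & \hat{\theta}_3^{(2)}\\ \vdots & \vdots & \vdots\\ \hat{\theta}_1^{(N)} & \hat{\theta}_2^{(N)} & \hat{\theta}_3^{(N)}\end{bmatrix} \tag{3} \]
\[ \Theta_{\mathcal{Y}}=(\mathcal{Y}-\mu_{\mathcal{Y}})\Sigma_{\mathcal{Y}}^{-\frac{1}{2}}=\begin{bmatrix}\theta_1^{(1)} & \theta_2^{(1)} & \theta_3^{(1)}\\ \theta_1^{(2)} & \theta_2^{(2)} & \theta_3^{(2)}\\ \vdots & \vdots & \vdots\\ \theta_1^{(N)} & \theta_2^{(N)} & \theta_3^{(N)}\end{bmatrix} \tag{4} \]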
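The standardization in (3) and (4) can be illustrated with a short numerical sketch. It assumes NumPy is available, that the sample covariance is positive definite, and that the symmetric inverse square root is the intended \( \Sigma^{-\frac{1}{2}} \); the function names `standardize` and `destandardize` are hypothetical and not taken from the paper.

```python
import numpy as np

def standardize(data):
    """Whiten an (N, 3) dataset as (data - mean) @ cov^(-1/2)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    # Symmetric inverse square root via eigendecomposition
    # (assumption: cov is positive definite).
    eigval, eigvec = np.linalg.eigh(cov)
    inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    return (data - mean) @ inv_sqrt, mean, cov

def destandardize(theta, mean, cov):
    """Map standardized values back: theta @ cov^(1/2) + mean."""
    eigval, eigvec = np.linalg.eigh(cov)
    sqrt = eigvec @ np.diag(eigval ** 0.5) @ eigvec.T
    return np.asarray(theta, dtype=float) @ sqrt + mean
```

After this transformation, the sample covariance of the standardized data is the identity matrix, which is what makes the per-column treatment in the following equations convenient.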

Assuming that the forecast values of PV generation, market price, and under-generation penalty are output by three independent forecast algorithms, we can define the mappings from standardized forecasts to errors as:

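\[ \theta_d=f_d(\hat{\theta}_d)\qquad d=1,2,3 \tag{5} \]

We use the Gaussian process to characterize errors:

\[ f_d\sim\mathcal{G}(0,\Sigma_d) \tag{6} \]

The \( (l_1,l_2) \) element in covariance \( \Sigma_d \) is calculated by (7). In line with other applications of the Gaussian process [35], a kernel function \( k_d(\cdot) \) is used for ease of computation.

\[ \Sigma_d(l_1,l_2)=k_d(\hat{\theta}_d^{(l_1)},\hat{\theta}_d^{(l_2)})=\sigma_{f,d}^{2}\exp\left(-\frac{\|\hat{\theta}_d^{(l_1)}-\hat{\theta}_d^{(l_2)}\|^{2}}{2\sigma_{k,d}^{2}}\right) \tag{7} \]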

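Let \( \Theta_{\mathcal{X},d} \) and \( \Theta_{\mathcal{Y},d} \) denote the dth column in standardized matrices \( \Theta_{\mathcal{X}} \) and \( \Theta_{\mathcal{Y}} \), respectively. Given a new standardized forecast \( \hat{\theta}_d^{(N+1)} \), the joint distribution of \( \Theta_{\mathcal{Y},d} \) and standardized forecast error \( \theta_d^{(N+1)} \) is expressed as:

\[ \begin{bmatrix}\Theta_{\mathcal{Y},d}\\ \theta_d^{(N+1)}\end{bmatrix}\sim N\left(0,\begin{bmatrix}K_d(\Theta_{\mathcal{X},d},\Theta_{\mathcal{X},d}) & K_d(\Theta_{\mathcal{X},d},\hat{\theta}_d^{(N+1)})\\ K_d(\hat{\theta}_d^{(N+1)},\Theta_{\mathcal{X},d}) & K_d(\hat{\theta}_d^{(N+1)},\hat{\theta}_d^{(N+1)})\end{bmatrix}\right) \tag{8} \]

where \( N(0,\cdot) \) is the Gaussian distribution.

For vectors \( X^A=[x_1^A,x_2^A,\ldots,x_n^A]^{\mathrm{T}} \) and \( X^B=[x_1^B,x_2^B,\ldots,x_m^B]^{\mathrm{T}} \), we have:

\[ K_d(X^A,X^B)=\begin{bmatrix}k_d(x_1^A,x_1^B) & k_d(x_1^A,x_2^B) & \cdots & k_d(x_1^A,x_m^B)\\ k_d(x_2^A,x_1^B) & k_d(x_2^A,x_2^B) & \cdots & k_d(x_2^A,x_m^B)\\ \vdots & \vdots & & \vdots\\ k_d(x_n^A,x_1^B) & k_d(x_n^A,x_2^B) & \cdots & k_d(x_n^A,x_m^B)\end{bmatrix} \tag{9} \]

Based on the joint distribution in (8), we can obtain the dynamic distribution of \( \theta_d^{(N+1)} \) as:

\[ P(\theta_d^{(N+1)}\,|\,\Theta_{\mathcal{X},d},\Theta_{\mathcal{Y},d},\hat{\theta}_d^{(N+1)})\sim N(\mu_d^{(N+1)},\sigma_d^{(N+1)}) \tag{10a} \]
\[ \mu_d^{(N+1)}=K_d(\hat{\theta}_d^{(N+1)},\Theta_{\mathcal{X},d})K_d^{-1}(\Theta_{\mathcal{X},d},\Theta_{\mathcal{X},d})\Theta_{\mathcal{Y},d} \tag{10b} \]
\[ \sigma_d^{(N+1)}=\sigma_{f,d}^{2}-K_d(\hat{\theta}_d^{(N+1)},\Theta_{\mathcal{X},d})K_d^{-1}(\Theta_{\mathcal{X},d},\Theta_{\mathcal{X},d})K_d(\Theta_{\mathcal{X},d},\hat{\theta}_d^{(N+1)}) \tag{10c} \]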
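As an illustration of (7) and (10), the following sketch computes the conditional mean and variance for one column d with the squared-exponential kernel. It assumes NumPy, scalar standardized forecasts, and fixed kernel hyperparameters; the small `jitter` term is a numerical-stability assumption that does not appear in the paper, and the function names are hypothetical.

```python
import numpy as np

def sq_exp_kernel(a, b, sigma_f, sigma_k):
    """Kernel matrix between two sets of scalar inputs, following (7) and (9)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return sigma_f ** 2 * np.exp(-((a - b) ** 2) / (2.0 * sigma_k ** 2))

def gp_posterior(theta_hat_train, theta_train, theta_hat_new,
                 sigma_f=1.0, sigma_k=1.0, jitter=1e-8):
    """Return the conditional mean (10b) and variance (10c) at a new forecast."""
    K = sq_exp_kernel(theta_hat_train, theta_hat_train, sigma_f, sigma_k)
    K = K + jitter * np.eye(K.shape[0])  # assumption: keeps K well conditioned
    k_star = sq_exp_kernel(theta_hat_new, theta_hat_train, sigma_f, sigma_k)
    y = np.asarray(theta_train, dtype=float)
    mean = k_star @ np.linalg.solve(K, y)
    var = sigma_f ** 2 - k_star @ np.linalg.solve(K, k_star.T)
    return mean.item(), var.item()
```

Near a training input the variance shrinks towards zero, and far from all training inputs it returns to the prior value \( \sigma_{f,d}^{2} \), which is the self-adapting behavior the DEQ model relies on. In practice the hyperparameters would be fitted to data, a step this sketch leaves out.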

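Based on the dynamic distribution in (10), the DEQ model outputs the following two types of data-driven parameters for the PBB model.

1) The dynamic distribution in (10) can be represented by a large number of sampled scenarios. Denoting the sampled scenarios as \( \theta_{d,t,r} \), the representative scenarios of forecast errors can be obtained by:

\[ [\xi_{1,t,r},\xi_{2,t,r},\xi_{3,t,r}]=[\theta_{1,t,r},\theta_{2,t,r},\theta_{3,t,r}]\Sigma_{\mathcal{X}}^{\frac{1}{2}}+\mu_{\mathcal{X}} \tag{11} \]

2) Dynamic uncertainty set: the uncertainty set of the vector \( \theta_t=[\theta_{1,t},\theta_{2,t},\theta_{3,t}] \) is calculated by (12). \( \epsilon \) in (12c) is the adjustable probability of an uncertainty outlier. Formula (12c) requires that the probability that \( \theta_t \) falls outside \( [\underline{\theta}_t,\overline{\theta}_t] \) be less than \( \epsilon \), where \( \underline{\theta}_t \) and \( \overline{\theta}_t \) are the minimum and maximum of \( \theta_t \), respectively. The objective function (12a) minimizes the size of the dynamic uncertainty set. Note that (12a) is designed for the standardized error because its three-dimensional distribution is closer to a cube than that of the original error.

\[ \min_{\overline{\theta}_t,\underline{\theta}_t}\{\mathbf{1}(\overline{\theta}_t-\underline{\theta}_t)\} \tag{12a} \]

s.t.

\[ \underline{\theta}_t\le\overline{\theta}_t \tag{12b} \]
\[ P(\underline{\theta}_t\le\theta_t\le\overline{\theta}_t)\ge 1-\epsilon \tag{12c} \]

The uncertain dynamic forecast errors can be obtained as:

\[ \underline{\theta}_t\Sigma_{\mathcal{X}}^{\frac{1}{2}}+\mu_{\mathcal{X}}\le[\xi_{1,t},\xi_{2,t},\xi_{3,t}]\le\overline{\theta}_t\Sigma_{\mathcal{X}}^{\frac{1}{2}}+\mu_{\mathcal{X}} \tag{13} \]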
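One simple way to obtain a box that satisfies (12b) and (12c) is sketched below. It is not the optimization in (12): it assumes the three standardized errors are independent Gaussians, splits the coverage equally across dimensions, and centers each interval on its mean, so the resulting box is feasible but not necessarily the smallest. The function name `box_uncertainty_set` is hypothetical, and `stds` are standard deviations (square roots of the variances in (10c)).

```python
from statistics import NormalDist

def box_uncertainty_set(means, stds, eps):
    """Return (lower, upper) bounds whose joint coverage is 1 - eps.

    Assumptions: independent Gaussian dimensions, equal coverage per
    dimension, symmetric intervals around each mean.
    """
    per_dim = (1.0 - eps) ** (1.0 / len(means))
    # Two-sided quantile of the standard normal for the per-dimension coverage.
    z = NormalDist().inv_cdf(0.5 + per_dim / 2.0)
    lower = [m - z * s for m, s in zip(means, stds)]
    upper = [m + z * s for m, s in zip(means, stds)]
    return lower, upper
```

The bounds in standardized space would then be mapped back to the original units by (13).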

B. Knowledge-driven PBB Model

The knowledge-driven PBB model formulates the bidding-related engineering knowledge as two decision stages: here-now (H&N) stage and wait-see (W&S) stage, corresponding to PV-uncertainty-independent and PV-uncertainty-dependent decisions, respectively.

1) In stage I, i.e., the H&N stage, the day-ahead bidding decision and the arbitrage schedules are determined.

2) Based on the bidding decision, the HPP may receive the under-generation penalty. We assume that HPP is penalized only for under-generation, as over-generation can be managed with in-plant PV curtailment.

3) In stage II, i.e., the W&S stage, the schedules of HPP are adjusted to minimize the economic losses and compensate for under-generation. We assume that HPP may not join the real-time market due to the challenge of estimating tradable energy in highly renewable markets [36].

4) Decisions can be made in risk-averse or risk-neutral mode.

The objective function in stage I, \( Obj_1(\cdot) \), includes the income and costs of HPP, as expressed in (14). The net income in stage I is the anticipated revenue from power trading minus the operating cost of the BESS. The variables in stage I are grouped in a vector \( q \) in (14).

\[ Obj_1(q)=\sum_{t\in T}(\hat{\lambda}_t+\xi_{2,t})P_t^{bid}-\sum_{t\in T}c_B(P_{B,t}^{arb+}+P_{B,t}^{arb-}),\quad q=[P_{PV,t}^{sell},P_{B,t}^{arb+},P_{B,t}^{arb-},u_t^{+},u_t^{-},P_t^{bid}]^{\mathrm{T}}\in\mathbb{R}^{6\times|T|} \tag{14} \]

The constraints in stage I are expressed in (15a)-(15f). In (15a), the PV generation is limited by its maximum capacity. The PV generation can be sold to the grid or used to charge the BESS. In (15b), the bidding decisions are made by superimposing the PV generation used for selling and the discharging power of BESS for arbitrage. In (15c)-(15f), BESS is scheduled for arbitrage. We use two binary variables, \( u_t^{+} \) and \( u_t^{-} \), to switch the charging and discharging states of BESS. Formulas (15e) and (15f) ensure that \( u_t^{+} \) and \( u_t^{-} \) cannot both be 1 at the same time, so the BESS is either charging or discharging at any given time t.

\[ 0\le P_{PV,t}^{sell}+P_{B,t}^{arb+}\le P_{PV}^{max} \tag{15a} \]
\[ P_t^{bid}=P_{PV,t}^{sell}+P_{B,t}^{arb-} \tag{15b} \]
\[ 0\le P_{B,t}^{arb+}\le P_B^{max}u_t^{+} \tag{15c} \]
\[ 0\le P_{B,t}^{arb-}\le P_B^{max}u_t^{-} \tag{15d} \]
\[ u_t^{+}+u_t^{-}\le 1 \tag{15e} \]
\[ u_t^{+}\in\{0,1\},\quad u_t^{-}\in\{0,1\} \tag{15f} \]
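The stage I constraints can be read as a feasibility check on a candidate decision for a single hour. The sketch below only verifies (15a)-(15f) for given numbers and is not an optimization model; the function name, argument names, and the tolerance `tol` are illustrative assumptions.

```python
def stage_one_feasible(p_sell, p_arb_ch, p_arb_dis, u_ch, u_dis, p_bid,
                       p_pv_max, p_b_max, tol=1e-9):
    """Check constraints (15a)-(15f) for one hour t."""
    checks = [
        -tol <= p_sell + p_arb_ch <= p_pv_max + tol,   # (15a) PV capacity
        abs(p_bid - (p_sell + p_arb_dis)) <= tol,      # (15b) bid composition
        -tol <= p_arb_ch <= p_b_max * u_ch + tol,      # (15c) charging limit
        -tol <= p_arb_dis <= p_b_max * u_dis + tol,    # (15d) discharging limit
        u_ch + u_dis <= 1,                             # (15e) mutual exclusion
        u_ch in (0, 1) and u_dis in (0, 1),            # (15f) binary states
    ]
    return all(checks)
```

For example, with a 5 MW PV capacity and a 2 MW BESS, selling 3 MW of PV while charging 1 MW is feasible, whereas charging and discharging in the same hour is rejected by (15e).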

The objective function in stage II, \( Obj_2(\cdot) \), is defined in (16), which quantifies the economic losses of HPP in response to both the decision in stage I and the realization of the PV generation. The first term of (16) is the under-generation penalty after reduction by the BESS. The second term is the operational cost of the BESS. The variables in stage II are grouped in a vector \( x \) in (16).

\[ Obj_2(x)=-\sum_{t\in T}(\hat{\rho}_t+\xi_{3,t})(P_{PV,t}^{UG}-P_{B,t}^{com-})-\sum_{t\in T}c_B(P_{B,t}^{com+}+P_{B,t}^{com-}-P_{B,t}^{adjust}),\quad x=[P_{PV,t}^{UG},P_{B,t}^{adjust},P_{B,t}^{com-},P_{B,t}^{com+},E]^{\mathrm{T}}\in\mathbb{R}^{5\times|T|} \tag{16} \]

The constraints in stage II are expressed in (17a)-(17i). Formulas (17a) and (17b) use slack variables \( P_{PV,t}^{UG} \) and \( P_{B,t}^{adjust} \) to balance the scheduled PV generation \( P_{PV,t}^{sell}+P_{B,t}^{arb+}+P_{B,t}^{com+} \) and the actual PV generation \( \hat{P}_{PV,t}+\xi_{1,t} \). Equation (17c) squeezes the solution space of the model formed by (17a)-(17i). The advantage of squeezing the solution space is shown in Section IV-A. The reserve power for under-generation/over-generation is modeled in (17d)-(17g). While the BESS schedules are decoupled, (17f) and (17g) limit the total charging and discharging power of BESS below the rated power. The BESS storage energy is modeled in (17h) and is limited by the capacity of BESS in (17i).

\[ (P_{PV,t}^{sell}-P_{PV,t}^{UG})+(P_{B,t}^{arb+}-P_{B,t}^{adjust})+P_{B,t}^{com+}=\hat{P}_{PV,t}+\xi_{1,t} \tag{17a} \]
\[ P_{PV,t}^{UG}\le P_{PV,t}^{sell},\quad P_{B,t}^{adjust}\le P_{B,t}^{arb+} \tag{17b} \]
\[ (P_{PV,t}^{UG}+P_{B,t}^{adjust})P_{B,t}^{com+}=0 \tag{17c} \]
\[ 0\le P_{B,t}^{com+}\le P_B^{max}u_t^{+} \tag{17d} \]
\[ 0\le P_{B,t}^{com-}\le P_{PV,t}^{UG} \tag{17e} \]
\[ 0\le P_{B,t}^{arb+}+P_{B,t}^{com+}-P_{B,t}^{adjust}\le P_B^{max} \tag{17f} \]
\[ 0\le P_{B,t}^{arb-}+P_{B,t}^{com-}\le P_B^{max}u_t^{-} \tag{17g} \]
\[ E_t=E_{t-1}+P_{B,t}^{arb+}\eta_c-P_{B,t}^{adjust}\eta_c-P_{B,t}^{arb-}\frac{1}{\eta_d}t+P_{B,t}^{com+}\eta_c-P_{B,t}^{com-}\frac{1}{\eta_d}t \tag{17h} \]
\[ 0\le E_t\le E^{max} \tag{17i} \]

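Inspired by [37], when the risk-averse mode is chosen, the objective functions \( Obj_1^{RA}(q) \) and \( Obj_2^{RA}(x) \) are formulated considering the worst-case scenario of uncertain parameters \( \xi_{2,t} \) and \( \xi_{3,t} \), respectively.

\[ Obj_1^{RA}(q)=\min_{\xi_{2,t}\in\Xi_{2,t}}\sum_{t\in T}(\hat{\lambda}_t+\xi_{2,t})P_t^{bid}-\sum_{t\in T}c_B(P_{B,t}^{arb+}+P_{B,t}^{arb-}),\quad \Xi_{2,t}=\{\xi_{2,t}:\underline{\xi}_{2,t}\le\xi_{2,t}\le\overline{\xi}_{2,t}\} \tag{18a} \]
\[ Obj_2^{RA}(x)=\min_{\xi_{3,t}\in\Xi_{3,t}}-\sum_{t\in T}(\hat{\rho}_t+\xi_{3,t})(P_{PV,t}^{UG}-P_{B,t}^{com-})-\sum_{t\in T}c_B(P_{B,t}^{com+}+P_{B,t}^{com-}-P_{B,t}^{adjust}),\quad \Xi_{3,t}=\{\xi_{3,t}:\underline{\xi}_{3,t}\le\xi_{3,t}\le\overline{\xi}_{3,t}\} \tag{18b} \]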

The PBB model in the risk-averse mode is modeled as (19a)-(19d). The \( \min(\cdot) \) function in (19a) represents the worst case of the forecast error of PV generation. \( F^{RA}(q,\xi_1) \) represents that the W&S decision \( x \) is made after the H&N decision \( q \) and the uncertainty \( \xi_1 \) are determined. The data-driven parameters used in (19d) are obtained from (13).

\[ \max_{q}\ Obj_1^{RA}(q)+\min_{\xi_{1,t}\in\Xi_{1,t}}F^{RA}(q,\xi_1) \tag{19a} \]

s.t.

\[ \text{(15a)-(15f)} \tag{19b} \]
\[ F^{RA}(q,\xi_1)=\max_{x}\{Obj_2^{RA}(x):\ \text{(17a)-(17i)}\} \tag{19c} \]
\[ \Xi_{1,t}=\{\xi_{1,t}:\underline{\xi}_{1,t}\le\xi_{1,t}\le\overline{\xi}_{1,t}\}\qquad \forall t\in T \tag{19d} \]

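In the risk-neutral mode, following [38], the objective functions \( Obj_1^{RN}(q) \) and \( Obj_2^{RN}(x) \) are formulated based on the probability-weighted scenarios of uncertain parameters \( \xi_{2,t} \) and \( \xi_{3,t} \), respectively.

\[ Obj_1^{RN}(q)=\sum_{r}\gamma_r\left[\sum_{t\in T}(\hat{\lambda}_t+\xi_{2,t,r})P_t^{bid}-\sum_{t\in T}c_B(P_{B,t}^{arb+}+P_{B,t}^{arb-})\right] \tag{20a} \]
\[ Obj_2^{RN}(x)=\sum_{r}\gamma_r\left[-\sum_{t\in T}(\hat{\rho}_t+\xi_{3,t,r})(P_{PV,t}^{UG}-P_{B,t}^{com-})-\sum_{t\in T}c_B(P_{B,t}^{com+}+P_{B,t}^{com-}-P_{B,t}^{adjust})\right] \tag{20b} \]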

The PBB model in the risk-neutral mode is modeled as (21a)-(21d). \( \mathbb{E}_{\xi_1}(F^{RN}(q,\xi_1)) \) in (21a) is the expectation of \( F^{RN}(q,\xi_1) \) with the probability distribution of \( \xi_1 \) specified in (21d). The data-driven scenarios of uncertain parameters are obtained from (11).

\[ \max_{q}\ Obj_1^{RN}(q)+\mathbb{E}_{\xi_1}(F^{RN}(q,\xi_1)) \tag{21a} \]

s.t.

\[ \text{(15a)-(15f)} \tag{21b} \]
\[ F^{RN}(q,\xi_1)=\max_{x}\{Obj_2^{RN}(x):\ \text{(17a)-(17i)}\} \tag{21c} \]
\[ \xi_{1,t}\sim P(\xi_{1,t}\,|\,\hat{P}_{PV,t}) \tag{21d} \]

An HPP can bid at zero prices to ensure that bids are accepted [16]. However, such a practice can cause high under-generation penalties and harm modern markets. Instead, we use the potential generation cost given in (22) as the bidding price. The potential generation cost is viewed as the marginal price of energy generation from the HPP, thus maximizing the accepted bids while preventing high under-generation penalties.

\[ \lambda_t^{bid}=c_B(P_{B,t}^{arb+}+P_{B,t}^{arb-}-P_{B,t}^{adjust}+P_{B,t}^{com+}+P_{B,t}^{com-})+(\hat{\rho}_t+\dot{\xi}_{3,t}-\hat{\lambda}_t-\dot{\xi}_{2,t})(P_{PV,t}^{UG}-P_{B,t}^{com-}) \tag{22} \]

The expression in (22) consists of two terms. The first term represents the cost incurred from the optimal scheduling of BESS operation. The second term depicts the potential economic loss caused by penalties.
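A direct evaluation of (22) for one hour can be sketched as follows. The function and argument names are hypothetical, and the numbers in the usage note are made up purely to show the arithmetic.

```python
def potential_generation_cost(c_b, p_arb_ch, p_arb_dis, p_adjust,
                              p_com_ch, p_com_dis, p_ug,
                              rho_hat, xi3_eq, lam_hat, xi2_eq):
    """Evaluate the bidding price in (22) for a single hour t."""
    # First term: cost of the scheduled BESS operation.
    bess_cost = c_b * (p_arb_ch + p_arb_dis - p_adjust + p_com_ch + p_com_dis)
    # Second term: potential loss from the uncompensated under-generation.
    penalty_loss = (rho_hat + xi3_eq - lam_hat - xi2_eq) * (p_ug - p_com_dis)
    return bess_cost + penalty_loss
```

With an illustrative BESS cost of 2, a net BESS schedule of 0.75, a penalty-price gap of 27, and 0.75 of uncompensated under-generation, the two terms are 1.5 and 20.25, which gives 21.75.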

C. Knowledge-guided Data Enhancement Model

The knowledge-guided data enhancement model fine-tunes the training data of DEQ model based on the performance of PBB model. In bidding tasks, the reliability of operational decisions often inversely relates to profitability. The data enhancement aims for an optimal balance of reliability and profitability.

The actual income of HPP can directly evaluate the profitability of a strategy. Define PI in (23a) as the profitability index of the bidding strategy.

\[ \mathrm{PI}=IC^{Act}(q,x) \tag{23a} \]

The deviation between the actual income \( IC^{Act}(q,x) \) and the day-ahead expected income \( IC^{DA}(q,x) \) can evaluate the reliability of the bidding strategy. Let RI be the reliability index of the bidding strategy, which is calculated as:

\[ \mathrm{RI}=\frac{1}{|IC^{Act}(q,x)-IC^{DA}(q,x)|} \tag{23b} \]

A balance index of profitability and reliability, PRBI, is defined in (23c) to reflect the profitability and reliability balance level. The knowledge-guided data enhancement model is designed to edit the training set for the DEQ model to approach the maximum PRBI.

\[ \mathrm{PRBI}=\mathrm{PI}\cdot\mathrm{RI}=\frac{IC^{Act}(q,x)}{|IC^{Act}(q,x)-IC^{DA}(q,x)|} \tag{23c} \]
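The three indices in (23a)-(23c) reduce to a few lines of arithmetic. In the sketch below, the function name is hypothetical, and returning infinity when the actual and day-ahead incomes coincide is an assumption, since the source does not specify that case.

```python
def bidding_indices(ic_act, ic_da):
    """Return (PI, RI, PRBI) from actual and day-ahead expected income."""
    pi = ic_act                              # (23a)
    gap = abs(ic_act - ic_da)
    if gap == 0:
        # Assumption: a perfect match is treated as unbounded reliability.
        return pi, float("inf"), float("inf")
    ri = 1.0 / gap                           # (23b)
    prbi = ic_act / gap                      # (23c)
    return pi, ri, prbi
```

For instance, an actual income of 100 against a day-ahead expectation of 120 gives RI = 0.05 and PRBI = 5.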

The data enhancement is based on the observed engineering knowledge, i.e., higher extreme levels in the training data set of DEQ model can increase the reliability of operational decision but reduce the profitability. Thus, approaching the maximum PRBI can be defined as a process of adjusting the extreme level of the dataset by executing the following five steps.

Step 1: calculate the extreme level for each sample in the original training set of the DEQ model.

Step 2: sort the samples in the original training set by their extreme levels.

Step 3: replicate \( \mathcal{A} \) copies of the original training set and, in the \( \alpha \)th copied set, drop out the \( \frac{\alpha}{\mathcal{A}+1}\times 100\% \) of samples with the most extreme levels.

Step 4: train the DEQ model with the original training set and the \( \mathcal{A} \) modified training sets to obtain \( \mathcal{A}+1 \) independent DEQ models. Use the original training set as input of the \( \mathcal{A}+1 \) DEQ+PBB models and obtain \( \mathcal{A}+1 \) sets of bids and schedules.

Step 5: calculate the \( \mathcal{A}+1 \) values of PRBI based on the bids and schedules from Step 4, and select the training set with the maximum PRBI as the enhanced dataset.
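The five steps above can be sketched as a small search loop. Here `extreme_level` and `evaluate_prbi` are hypothetical callables supplied by the user: the first scores one sample, and the second stands in for the whole of Steps 4 and 5 for one candidate set (training a DEQ model, running the PBB model, and computing PRBI). Rounding the number of dropped samples down is also an assumption.

```python
def enhance_training_set(samples, extreme_level, evaluate_prbi, num_copies):
    """Return the candidate training set with the highest PRBI and its score."""
    # Steps 1-2: rank samples from least to most extreme.
    ranked = sorted(samples, key=extreme_level)
    # Step 3: the original set plus num_copies trimmed copies.
    candidates = [list(samples)]
    for alpha in range(1, num_copies + 1):
        drop = int(len(ranked) * alpha / (num_copies + 1))
        candidates.append(ranked[:len(ranked) - drop])
    # Steps 4-5: score every candidate and keep the best one.
    scores = [evaluate_prbi(candidate) for candidate in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]
```

Each call to `evaluate_prbi` involves a full training and optimization run, so the cost grows linearly with `num_copies`.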

The data enhancement essentially employs grid search [39] to approach the maximum PRBI. Fundamentally, \( \mathcal{A} \) is the number of the modified training sets. An increase of \( \mathcal{A} \) results in denser grids, enhancing the precision of the search result but also raising the computation costs. Typically, selecting \( \mathcal{A} \) is empirical, aiming to balance computational efficiency and the precision of the search result.

Remark 1: the data enhancement is performed in the two modes separately. We assume that the case where the actual income is lower than the worst-case income is undesirable for users in the risk-averse mode. Therefore, we set PRBI = 0 if the risk-averse mode is selected and \( IC^{Act}(q,x)-IC^{DA}(q,x)<0 \).

Remark 2: the extreme level of a sample in Step 1 is defined as the entropy reduction of the dataset when the sample is dropped. Entropy reflects the disorder level of a dataset. Following the definition, the extreme level \( E_{extreme} \) measures the contribution of a sample to the disorder level of the dataset, as calculated in (24). \( E_{entropy}(\mathcal{Y}-\xi) \) denotes the entropy of the training dataset matrix \( \mathcal{Y} \) without \( \xi \). The entropy of a dataset can be estimated via the method in [40].

\[ E_{extreme}(\xi)=E_{entropy}(\mathcal{Y})-E_{entropy}(\mathcal{Y}-\xi) \tag{24} \]
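To make (24) concrete, the sketch below scores each sample by the drop in entropy when it is removed. The entropy estimator of [40] is not reproduced here; as a stand-in assumption, the entropy is taken as the differential entropy of a Gaussian fitted to the data. NumPy is assumed and the function names are hypothetical.

```python
import numpy as np

def gaussian_entropy(data):
    """Differential entropy of a Gaussian fitted to an (N, D) dataset.

    Assumption: replaces the estimator of [40]; needs a positive definite
    sample covariance.
    """
    data = np.asarray(data, dtype=float)
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    dim = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (dim * np.log(2.0 * np.pi * np.e) + logdet)

def extreme_level(data, index):
    """Entropy of the full dataset minus the entropy without sample `index`."""
    data = np.asarray(data, dtype=float)
    reduced = np.delete(data, index, axis=0)
    return gaussian_entropy(data) - gaussian_entropy(reduced)
```

Under this stand-in, a sample far from the bulk of the data inflates the fitted covariance, so removing it lowers the entropy most and it receives the highest extreme level.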

IV. Solution Methodology

The solution procedure of the proposed bidding strategy is illustrated in Fig. 2. To execute the solution procedure, we need the solution methodology of the PBB model in the risk-averse and risk-neutral modes described below.

Fig. 2  Solution procedure of proposed bidding strategy.

A. Solution Procedure of PBB Model in Risk-averse Mode

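With any given \( P_t^{bid}\ge 0 \), we always have (25a). Therefore, we set the equivalent price in the risk-averse mode as (25b) for easy calculation.

\[ \underline{\xi}_{2,t}\in\left\{\xi_{2,t}:\min_{\xi_{2,t}\in\Xi_{2,t}}\sum_{t\in T}(\hat{\lambda}_t+\xi_{2,t})P_t^{bid}\right\} \tag{25a} \]
\[ \dot{\xi}_{2,t}^{RA}=\underline{\xi}_{2,t} \tag{25b} \]

According to (17e), we have \( P_{PV,t}^{UG}-P_{B,t}^{com-}\ge 0 \). Similar to (25a), with any given \( P_{PV,t}^{UG}-P_{B,t}^{com-} \), we have (26a). The equivalent penalty in the risk-averse mode is set as (26b).

\[ \overline{\xi}_{3,t}\in\left\{\xi_{3,t}:\min_{\xi_{3,t}\in\Xi_{3,t}}-\sum_{t\in T}(\hat{\rho}_t+\xi_{3,t})(P_{PV,t}^{UG}-P_{B,t}^{com-})\right\} \tag{26a} \]
\[ \dot{\xi}_{3,t}^{RA}=\overline{\xi}_{3,t} \tag{26b} \]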

Then, (18a) and (18b) can be reformulated as:

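\[ Obj_1^{RA}(q)=\sum_{t\in T}(\hat{\lambda}_t+\dot{\xi}_{2,t}^{RA})P_t^{bid}-\sum_{t\in T}c_B(P_{B,t}^{arb+}+P_{B,t}^{arb-}),\quad Obj_2^{RA}(x)=-\sum_{t\in T}(\hat{\rho}_t+\dot{\xi}_{3,t}^{RA})(P_{PV,t}^{UG}-P_{B,t}^{com-})-\sum_{t\in T}c_B(P_{B,t}^{com+}+P_{B,t}^{com-}-P_{B,t}^{adjust}) \tag{27} \]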

According to the risk-averse mode, (27) can be cast as:

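\[ \min_{q}\left\{k^{\mathrm{T}}q+\max_{\xi_1\in\Xi_1}\mathcal{Q}(q,\xi_1)\right\} \tag{28a} \]

s.t.

\[ Nq\ge g \tag{28b} \]
\[ \mathcal{Q}(q,\xi_1)=\min_{x,x^{-},x^{+}}\{c^{\mathrm{T}}x:\ Gx\ge h-Eq-M\xi_1\} \tag{28c} \]
\[ x^{-}x^{+}=0 \tag{28d} \]
\[ x^{-}=[x_1,x_2,\ldots,x_{|T|}]+[x_{|T|+1},x_{|T|+2},\ldots,x_{2|T|}] \tag{28e} \]
\[ x^{+}=[x_{3|T|+1},x_{3|T|+2},\ldots,x_{4|T|}] \tag{28f} \]

The \( \max\{\min(\cdot)\} \) structure in (19a) is converted to the \( \min(\cdot)+\max(\cdot) \) structure in (28a) by adding a minus sign to the objective function. Formula (28b) is cast from (15a)-(15f), and (28c) is cast from (17a), (17b), and (17d)-(17i). Formulas (28d)-(28f) are cast from (17c) to squeeze the solution space.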

The model in (28a)-(28c) is a typical two-stage mixed-integer linear (MIL) problem that can be solved by the C&CG algorithm [41]. The key challenge arises when (28d)-(28f) are integrated, which transforms the problem into a mixed-integer non-linear format that is beyond the scope of the standard C&CG algorithm. To address this, we develop a tailored C&CG algorithm. Inspired by the C&CG algorithm, we decompose (28a)-(28c) into a master problem (MP) and a subproblem (SP). Building on the C&CG algorithm, constraints (28d)-(28f) are reformulated and added to both MP and SP. When solving SP, the critical scenarios of PV generation are identified. The identified scenarios are added to MP as generated constraints to cut the solution space. The optimum can be obtained by solving MP and SP iteratively.

The MP associated with (28) is expressed as:

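$$\min_{q,\ \eta \in \mathbb{R},\ x^k \in \mathbb{R}^{5 \times |T|},\ x^{k-}, x^{k+} \in \mathbb{R}^{|T|}} \left\{ k^T q + \eta \right\} \tag{29a}$$

s.t.

$$Nq \ge g \tag{29b}$$

$$\eta \ge c^T x^k \quad \forall k \in K \tag{29c}$$

$$Gx^k \ge h - Eq - M\xi^{*k} \quad \forall k \in K \tag{29d}$$

$$x^{k-} \circ x^{k+} = 0 \quad \forall k \in K \tag{29e}$$

$$x^{k-} = [x_1^k, x_2^k, \dots, x_{|T|}^k] + [x_{|T|+1}^k, x_{|T|+2}^k, \dots, x_{2|T|}^k] \quad \forall k \in K \tag{29f}$$

$$x^{k+} = [x_{3|T|+1}^k, x_{3|T|+2}^k, \dots, x_{4|T|}^k] \quad \forall k \in K \tag{29g}$$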

where $K = \{1, 2, \dots, k_{max}\}$; and $\xi^{*k}$ represents the critical scenarios identified from the SP.

A new $\xi^{*k_{max}}$ is collected and stored in a set $\mathcal{E} = \{\xi^{*1}, \xi^{*2}, \dots, \xi^{*k_{max}}\}$. $x^k$ is introduced to add the collected critical scenarios to MP. To write (29) in a tractable form, (29e) can be equivalently converted as:

$$x^{k+} \le M_{big} v_{mp} \tag{30a}$$

$$x^{k-} \le M_{big} (1 - v_{mp}) \tag{30b}$$

The binary variable $v_{mp}(i)$ ensures that one of $x^{k-}(i)$ and $x^{k+}(i)$ is 0, and thus (29e) is met. Therefore, MP is reformulated as an MIL problem that can be solved by commercial solvers. By solving (29), the optimal solution $q^*$ can be obtained.
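The effect of the big-M pair (30a)-(30b) can be checked with a small Python sketch. It is only an illustration under stated assumptions: both variables are taken to be nonnegative scalars below $M_{big}$, and the helper name `big_m_feasible` is hypothetical.

```python
M_BIG = 1e3  # same order of magnitude as M_big in Table I

def big_m_feasible(x_minus, x_plus, m_big=M_BIG):
    """Return True if some binary v satisfies x_plus <= M*v and x_minus <= M*(1-v)."""
    if x_minus < 0 or x_plus < 0:  # assumption: both parts are nonnegative
        return False
    return any(
        x_plus <= m_big * v and x_minus <= m_big * (1 - v) for v in (0, 1)
    )

# Candidate (x_minus, x_plus) pairs; only the last one violates complementarity.
candidates = [(0.0, 3.0), (2.0, 0.0), (0.0, 0.0), (2.0, 3.0)]
results = {pair: big_m_feasible(*pair) for pair in candidates}
```

For these values, a pair is feasible exactly when the product of its two entries is zero, which is the complementarity condition that the binary variable encodes.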

The SP associated with (28) can be expressed as:

$$\mathcal{Q}(q^*) = \max_{\xi_1 \in \Xi_1} \ \min_{x, x^-, x^+} c^T x \tag{31a}$$

s.t.

$$Gx \ge h - Eq^* - M\xi_1 \tag{31b}$$

$$\text{(28d)-(28f)} \tag{31c}$$

The max-min problem in (31a) can be converted to (32a)-(32f) using the Karush-Kuhn-Tucker (KKT) conditions.

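$$\mathcal{Q}(q^*) = \max_{\xi_1, x, \pi, x^-, x^+} c^T x \tag{32a}$$

s.t.

$$Gx \ge h - Eq^* - M\xi_1, \quad \xi_1 \in \Xi_1 \tag{32b}$$

$$G^T \pi \le c \tag{32c}$$

$$(Gx - h + Eq^* + M\xi_1) \circ \pi = 0 \tag{32d}$$

$$(c - G^T \pi) \circ x = 0 \tag{32e}$$

$$\text{(28d)-(28f)} \tag{32f}$$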

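To handle (32d), (32e), and (28d), we further transform (32) into (33), which is an MIL problem.

$$\mathcal{Q}(q^*) = \max_{\xi_1, x, \pi, x^-, x^+, v_{sp1}, v_{sp2}, v_{sp3}} c^T x \tag{33a}$$

s.t.

$$\text{(28e), (28f), (32b), (32c)} \tag{33b}$$

$$(Gx - h + Eq^* + M\xi_1) \le M_{big} v_{sp1} \tag{33c}$$

$$\pi \le M_{big} (1 - v_{sp1}) \tag{33d}$$

$$(c - G^T \pi) \le M_{big} v_{sp2} \tag{33e}$$

$$x \le M_{big} (1 - v_{sp2}) \tag{33f}$$

$$x^+ \le M_{big} v_{sp3} \tag{33g}$$

$$x^- \le M_{big} (1 - v_{sp3}) \tag{33h}$$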

The process of the tailored C&CG algorithm is shown in Algorithm 1.

Algorithm 1: tailored C&CG algorithm

Initialization: lower bound $LB = -\infty$, upper bound $UB = +\infty$, $k_{max} = 1$, $K = \{\cdot\}$, and $\mathcal{E} = \{\cdot\}$.

1.  Solve (10) and obtain the uncertainty sets by (13)
2.  while $|UB - LB|$ is larger than tolerance TOL do
3.     Solve MP and obtain the optimum $q^*$ and $\eta^*$
4.     $LB = k^T q^* + \eta^*$
5.     Solve SP to obtain optimum $x^*$ and critical scenario $\xi^{*k_{max}}$
6.     $UB = \min\{UB, k^T q^* + \mathcal{Q}(q^*)\}$
7.     if $|UB - LB| \le TOL$ break
8.     else $k_{max} = k_{max} + 1$, $K = \{K \cup k_{max}\}$, $\mathcal{E} = \{\mathcal{E} \cup \xi^{*k_{max}}\}$
9.     end if
10. end while

Output: $q^*$, $x^*$

To summarize, Algorithm 1 describes the procedure for solving (28a)-(28f).
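The bound-tightening loop of Algorithm 1 can be sketched on a deliberately tiny example. The Python code below is not the tailored C&CG algorithm itself: it assumes a single-period problem with a scalar bid, a finite set of PV forecast errors, and a closed-form recourse cost, and it replaces the MIL solves of MP and SP with brute-force enumeration. All names and numbers are hypothetical.

```python
def recourse_cost(q, xi, pv_forecast=8.0, penalty=50.0):
    """Second-stage cost: penalty on the shortfall between bid q and actual PV."""
    return penalty * max(0.0, q - (pv_forecast + xi))

def toy_ccg(q_grid, xi_set, price=30.0, tol=1e-6, max_iter=50):
    lb, ub = float("-inf"), float("inf")
    critical = []  # collected critical scenarios
    best_q = None
    for iteration in range(1, max_iter + 1):
        # MP: first-stage cost plus eta, where eta bounds the recourse cost of
        # every collected scenario (eta >= 0 keeps the first MP bounded).
        def mp_objective(q):
            eta = max((recourse_cost(q, xi) for xi in critical), default=0.0)
            return -price * q + eta

        q_star = min(q_grid, key=mp_objective)
        lb = mp_objective(q_star)

        # SP: worst-case scenario for the fixed first-stage decision q_star.
        xi_star = max(xi_set, key=lambda xi: recourse_cost(q_star, xi))
        candidate_ub = -price * q_star + recourse_cost(q_star, xi_star)
        if candidate_ub < ub:
            ub, best_q = candidate_ub, q_star

        if abs(ub - lb) <= tol:
            break
        critical.append(xi_star)  # add the critical scenario to MP
    return best_q, lb, ub, iteration

q_opt, lb, ub, n_iter = toy_ccg(q_grid=range(0, 11), xi_set=[-3, -2, -1, 0, 1, 2])
```

In this toy setting the first MP bids the maximum quantity because no scenario has been collected yet; the SP then returns the largest PV shortfall, and the second MP reduces the bid to the level that is safe under that scenario, at which point the two bounds coincide.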

Remark 3: the complementarity constraints (33c)-(33f) induced by the KKT conditions notably increase the computation time [42]. In the tailored C&CG algorithm, (33g) and (33h) squeeze the solution space of $x$, thus reducing the computational burden caused by the complementarity constraints (33c)-(33f).

B. Solution Procedure of PBB Model in Risk-neutral Mode

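The equivalent price and penalty in risk-neutral mode are defined as:

$$\dot{\xi}_{2,t}^{RN} = \sum_{r} \gamma_r (\hat{\lambda}_t + \xi_{2,t,r}) \tag{34a}$$

$$\dot{\xi}_{3,t}^{RN} = \sum_{r} \gamma_r (\hat{\lambda}_t + \xi_{3,t,r}) \tag{34b}$$

Therefore, (20a) and (20b) can be reformulated as:

$$\begin{cases} \mathrm{Obj}_1^{RN}(q) = \sum_{t \in T} (\hat{\lambda}_t + \dot{\xi}_{2,t}^{RN}) P_t^{bid} - \sum_{t \in T} c_B (P_{B,t}^{arb+} + P_{B,t}^{arb-}) \\ \mathrm{Obj}_2^{RN}(x) = -\sum_{t \in T} (\hat{\rho}_t + \dot{\xi}_{3,t}^{RN}) (P_{PV,t}^{UG} - P_{B,t}^{com-}) - \sum_{t \in T} c_B (P_{B,t}^{com+} + P_{B,t}^{com-} - P_{B,t}^{adjust}) \end{cases} \tag{35}$$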

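The PBB model in risk-neutral mode, given by (21a)-(21d) and (35), is cast as:

$$\min_{q} \ k^T q + \mathbb{E}_{\xi_1}(F(q, \xi_1)) \tag{36a}$$

s.t.

$$Nq \ge g \tag{36b}$$

$$F(q, \xi_1) = \min_{x, x^-, x^+} c^T x \quad \text{s.t.} \quad Gx \ge h - Eq - M\xi_1 \tag{36c}$$

$$x^- \circ x^+ = 0 \tag{36d}$$

$$x^- = [x_1, x_2, \dots, x_{|T|}] + [x_{|T|+1}, x_{|T|+2}, \dots, x_{2|T|}] \tag{36e}$$

$$x^+ = [x_{3|T|+1}, x_{3|T|+2}, \dots, x_{4|T|}] \tag{36f}$$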

The scenarios $\xi_{1,r}$ ($r \in R$) are prepared via the scenario generation in (11). The model in (36a)-(36f) can then be reformulated into the tractable form defined in (37a)-(37f). To reformulate (36c), we introduce the auxiliary variable $\beta_r$ and rewrite the minimum in (36c) as (37c). We transform (36d)-(36f) into (37e) and (37f) by using the big-M method. Therefore, the PBB model in risk-neutral mode is reformulated as a solvable MIL problem.

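$$\min_{q,\ \beta \in \mathbb{R}^{|R|},\ x_r \in \mathbb{R},\ v_{RN} \in \{0,1\}^{|T|}} \ k^T q + \frac{1}{|R|} \sum_{r \in R} \beta_r \tag{37a}$$

s.t.

$$Nq \ge g \tag{37b}$$

$$\beta_r \ge c^T x_r \tag{37c}$$

$$Gx_r \ge h - Eq - M\xi_{1,r} \tag{37d}$$

$$[x_{r,3|T|+1}, x_{r,3|T|+2}, \dots, x_{r,4|T|}] \le M_{big} v_{RN} \tag{37e}$$

$$[x_{r,1}, x_{r,2}, \dots, x_{r,|T|}] + [x_{r,|T|+1}, x_{r,|T|+2}, \dots, x_{r,2|T|}] \le M_{big} (1 - v_{RN}) \tag{37f}$$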
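The scenario-averaged objective in (37a) can be illustrated with a short Python sketch. It assumes a single-period toy problem with equiprobable scenarios, a scalar bid, and a closed-form recourse cost, and it enumerates the bid grid rather than solving an MIL problem; every name and number is hypothetical.

```python
def expected_cost(q, scenarios, price=30.0, penalty=50.0, pv_forecast=8.0):
    """k^T q + (1/|R|) * sum_r beta_r, with beta_r the recourse cost of scenario r."""
    betas = [penalty * max(0.0, q - (pv_forecast + xi)) for xi in scenarios]
    return -price * q + sum(betas) / len(betas)

scenarios = [-3, -2, -1, 0, 1, 2]  # hypothetical PV forecast errors (MW)
q_grid = range(0, 11)              # candidate bids (MW)

# Enumerate the grid and keep the bid with the lowest expected cost.
q_rn = min(q_grid, key=lambda q: expected_cost(q, scenarios))
cost_rn = expected_cost(q_rn, scenarios)
```

With these numbers the expected-cost bid exceeds the bid that would be safe under the single worst scenario (5 MW here), since the averaged penalty is outweighed by the extra revenue up to 8 MW.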

V. Numerical Results

This section reports numerical results. We first use a representative one-day case to demonstrate the execution of the proposed bidding strategy. Then, the proposed bidding strategy is tested with a one-year dataset. We separate the one-year dataset into a training set (181 days) and a testing set (184 days). The training set is used to demonstrate offline learning with the KDCL framework, and the testing set is used to analyze the economic performance of the proposed bidding strategy.

The HPP used in the simulation contains a 21 MW PV and a 10 MW/10 MWh BESS. The characteristics of the BESS are the same as those in [43]. The PV data of the HPP, including the forecast and actual data, are taken from the National Renewable Energy Laboratory (NREL) solar power data [44]. Specifically, we select data from a PV site at 40°51'00.0"N 73°51'00.0"W, near the New York Metropolitan area. Additionally, we collect the data of the clearing price and under-generation penalty from the New York Independent System Operator (NYISO) historical dataset for the same year as the PV data [45], focusing on the New York City zone. The under-generation penalty is calculated based on the NYISO service tariff policy [46]. We conduct the price and penalty forecasts separately based on model A in [47] to obtain the forecast data. The major related parameters are listed in Table I. All case studies are conducted with an Intel Core i7 CPU using CVX + Gurobi as the solver. The kernel hyperparameters of the DEQ model are optimized using the MATLAB Bayesian Optimization toolbox [49].

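TABLE I  Related Parameter Setting

| Parameter | Value |
| --- | --- |
| $\eta_c$, $\eta_d$ | 98%, 98% |
| $E_t^{Day}$ | $E_1^1 = 0\%$, $E_1^{n+1} = E_{24}^n$ |
| $c_B$ | 0.5 $/MWh [48] |
| $P_{PV}^{max}$ | 21 MW |
| $P_B^{max}$ | 10 MW |
| $E^{max}$ | 10 MWh |
| $M_{big}$ | $10^3$ |
| $\epsilon$ | 5% |
| Tolerance of $\lvert UB - LB \rvert$ in C&CG algorithm | $10^{-6}$ |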

A. Results Based on One-day Case

Figure 3 shows the day-ahead forecast and actual values of PV generation, market price, and under-generation penalty, which are the inputs of the proposed bidding strategy.

Fig. 3  Day-ahead forecast and actual values. (a) PV generation. (b) Market price. (c) Under-generation penalty.

The output of the proposed bidding strategy is shown in Fig. 4. The curves of bidding quantities in the two modes are similar in trend. Bidding quantities in both modes reach their first peak during 10:00-11:00 and their second peak during 14:00-15:00. However, the risk-neutral mode bids more power throughout the day than the risk-averse mode. Meanwhile, both modes reserve power during 14:00-15:00 because the under-generation penalty reaches its daily peak in this period. We also observe that most of the power stored throughout the day is reserved for under-generation in the risk-averse mode. On the contrary, the decision in risk-neutral mode prefers using the BESS for arbitrage.

Fig. 4  One-day bidding decisions and BESS schedules. (a) Risk-neutral mode. (b) Risk-averse mode.

Figure 5 depicts the hourly bidding prices in risk-averse and risk-neutral modes.

Fig. 5  Hourly bidding prices. (a) Risk-averse mode. (b) Risk-neutral mode.

As discussed in Section III-B, the bidding prices quantify the under-generation risk of an HPP schedule in monetary terms. The risk-averse mode has near-zero bidding prices as it is conservative and includes sufficient reserves. Naturally, the risk-neutral mode corresponds to high bidding prices. Bidding prices in the risk-averse mode are notably lower than those in the risk-neutral mode all day, which means the bidding prices in risk-averse mode are more likely to be accepted. From the grid perspective, these bidding prices enable the operator to hedge against the risk of PV generation shortage, facilitating the practical participation of more HPPs in the market.

B. Adaptiveness to Varying Under-generation Penalty

In diverse regions and power pools, the unit price of under-generation penalties can vary markedly, sometimes by several multiples, depending on the specific rules for renewable resources [50]. This highlights the necessity for HPPs to adaptively adjust operational modes and decisions in response to penalties. In this context, we compare deterministic bidding, the PBB model in risk-averse mode, and the PBB model in risk-neutral mode using the one-day case in Section V-A, considering scenarios where the under-generation penalty varies. Deterministic bidding optimizes the expected income by utilizing forecast values as static inputs.

Figure 6 shows the impact of varying under-generation penalties on the three models. As the penalty increases, the income of the deterministic bidding decreases near-linearly, while its total penalty grows near-linearly. Excessive penalties even lead to negative profitability with the deterministic bidding. The income in the risk-neutral mode also trends downwards as the penalty increases, yet at a markedly slower pace than the income of deterministic bidding. This slower decline is credited to integrating penalties into the PBB model formulation. Therefore, the PBB model can actively reduce bidding quantities based on penalties to mitigate under-generation risks, as shown in Fig. 7. The PBB model in risk-averse mode, which employs conservative bidding, initially underperforms in low-penalty settings. However, as penalties rise, it gradually emerges as the best strategy in terms of profitability and reliability.

Fig. 6  Impact of varying under-generation penalties on three models. (a) Impact of unit penalty price on daily income. (b) Impact of unit penalty price on total under-generation penalties.

Fig. 7  Impact of varying under-generation penalties in risk-neutral mode and deterministic bidding.

C. Analysis of Uncertainty Sets

The scatter plots of the 181×24 forecast errors in the training set and the 184×24 forecast errors in the testing set are shown in Fig. 8(a) and Fig. 8(b), respectively. The training and testing errors generally share similar distribution patterns, but some differences still exist. For example, the testing errors have a wider distribution on the $\xi_2$ axis.

Fig. 8  Forecast errors and different uncertainty sets. (a) Forecast errors in training set. (b) Forecast errors in testing set. (c) Static data-driven uncertainty set of training errors. (d) Static data-driven uncertainty set of testing errors. (e) Uncertainty set of DEQ model of training errors. (f) Uncertainty set of DEQ model of testing errors.

Figures 8(c) and (d) show the static data-driven uncertainty sets of training and testing errors, respectively. The static data-driven uncertainty set is sparse because it must cover extreme samples. The uncertainty set of DEQ model overcomes this drawback. Considering that the uncertainty set of DEQ model is constantly changing, we use the union set of DEQ model, as defined in (38), to visualize the dynamics of the DEQ model uncertainty sets. Figures 8(e) and (f) show the union set of DEQ model for training and testing errors, respectively. Note that the testing errors are unseen data for the DEQ model. The union set of DEQ model includes most errors in accordance with the risk level setting without creating too much redundancy. The shape of the union set of DEQ model varies conspicuously to adapt to the error distribution. As observed in Fig. 8(a) and (b), testing errors have a wider distribution than training errors on the $\xi_2$ axis. Correspondingly, Figs. 8(e) and (f) show that the union set of DEQ model covers a broader range on the $\xi_2$ axis. The flexibility and accuracy of the union set of DEQ model can boost the income of the HPP.

$$\Xi^{Union} = \Xi_1 \cup \Xi_2 \cup \cdots \cup \Xi_{181 \times 24 \ \text{or} \ 184 \times 24} \tag{38}$$
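The union in (38) and the coverage claim above can be illustrated with a simplified Python sketch. It assumes, for illustration only, that each time-indexed uncertainty set is a one-dimensional closed interval, whereas the sets in Fig. 8 are multi-dimensional; the helper names and the numbers are hypothetical.

```python
def merge_intervals(intervals):
    """Union of closed 1-D intervals, returned as sorted disjoint intervals."""
    merged = []
    for low, high in sorted(intervals):
        if merged and low <= merged[-1][1]:
            # Overlaps the previous interval: extend its upper end if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged

def coverage(errors, intervals):
    """Fraction of errors that fall inside their own time-indexed interval."""
    hits = sum(low <= e <= high for e, (low, high) in zip(errors, intervals))
    return hits / len(errors)

# Hypothetical per-hour uncertainty sets and realized forecast errors.
hourly_sets = [(-1.0, 0.5), (-0.2, 1.5), (2.0, 3.0), (-3.0, -2.5)]
errors = [0.1, 1.6, 2.4, -2.7]

union_set = merge_intervals(hourly_sets)
rate = coverage(errors, hourly_sets)
```

Coverage is evaluated against each hour's own set rather than against the union, since the union is used here only to visualize how the sets move over time.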

D. Offline Learning of KDCL Framework

In the offline learning of KDCL framework, the data enhancement is applied to the training set following the five steps in Section III-C. To balance the computational efficiency and outcome precision, 𝒜 is set to be 10. Namely, we will have 11 modified/candidate training sets after Step 3. Step 4 and Step 5 calculate PRBI for 11 candidate sets based on the unmodified training set to select the data-enhanced training set. The calculation results are depicted in Fig. 9.

Fig. 9  Results of data enhancement model in offline learning. (a) Incomes and PRBIs for candidate training sets in risk-averse mode. (b) Incomes and PRBIs for candidate training sets in risk-neutral mode.

Figure 9(a) shows the enhancement results in the risk-averse mode. It is observed that both the worst-case and validated incomes rise with the dropout of the extreme cases from the training set. The worst-case and validated income curves intersect at 20%-30% dropout percentages, indicating that the risk-averse mode is no longer satisfied with a 30% dropout percentage. We note that beyond a 30% dropout percentage, the gap between worst-case and validated income widens, suggesting a reduced strategy robustness. Based on the setting in Remark 1, the candidate sets with 30%-90% dropout percentages have PRBI=0 as the worst-case incomes are higher than validated incomes. The candidate set with a 20% dropout percentage is chosen as the enhanced data for its highest PRBI.

Figure 9(b) shows the enhancement results in the risk-neutral mode. We observe that both expected and validated incomes converge, yet at different dropout percentages. The expected incomes converge later than validated incomes. To avoid the unnecessary loss of the reliability, the highest PRBI falls on the converging point of validated incomes.

E. Economic Performance Analysis

We compare six bidding strategies listed in Table II. As part of this comparison, we obtain validated income of HPP based on the following assumptions.

TABLE II  Comparison of Six Bidding Strategies

| Strategy No. | Error quantification | Bidding model |
| --- | --- | --- |
| 1 | Static | RO bidding model [11] |
| 2 | Static | SO bidding model [6] |
| 3 | Static | PBB model |
| 4 | DEQ | PBB model |
| 5 | DEQ + KDCL | PBB model |
| 6 | 100% accurate forecast | Deterministic model |

1) Bids can be accepted only if the bidding prices are lower than the market clearing price.

2) Accepted bids are settled at the market clearing price.

3) The under-generation penalty is incurred whenever the HPP fails to deliver the committed power to the grid.

4) Testing data are unseen to DEQ model in the offline learning.
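Assumptions 1)-3) can be expressed as a small settlement routine. The Python sketch below is a simplified reading of those assumptions: it ignores the BESS degradation cost and any payment for energy delivered above the commitment, and the function name and all numbers are hypothetical.

```python
def validated_income(bid_qty, bid_price, clearing_price, delivered, penalty_price):
    """Hourly settlement under assumptions 1)-3); degradation cost is ignored."""
    income = 0.0
    for q, b, lam, p, rho in zip(
        bid_qty, bid_price, clearing_price, delivered, penalty_price
    ):
        if b < lam:                          # 1) accepted only below clearing price
            income += lam * q                # 2) settled at the clearing price
            income -= rho * max(0.0, q - p)  # 3) penalty on undelivered power
    return income

# Hypothetical 3-hour example (MW, $/MWh).
income = validated_income(
    bid_qty=[5.0, 8.0, 6.0],
    bid_price=[2.0, 50.0, 10.0],
    clearing_price=[30.0, 40.0, 35.0],
    delivered=[5.0, 8.0, 4.0],
    penalty_price=[45.0, 60.0, 50.0],
)
```

In this example the second-hour bid is rejected because its price exceeds the clearing price, and the third hour incurs a penalty on a 2 MW shortfall.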

Table III summarizes the incomes of test set in the risk-averse mode. We observe that all validated incomes are higher than the corresponding worst-case incomes, which shows the robustness of the proposed bidding strategy (strategy 5). We also find that the PBB model, the DEQ model, and the KDCL framework all positively affect the profitability of HPP. Table IV summarizes the incomes of test set in the risk-neutral mode. It shows the same advantage of the proposed bidding strategy we observed in Table III. Meanwhile, we notice that selecting the risk-averse mode would inevitably sacrifice potential incomes. Compared with the results in the risk-averse mode, the expected income and validated income of strategy 5 in risk-neutral mode are close to the ideal optimum.

TABLE III  Incomes of Test Set in Risk-averse Mode

| Strategy No. | Worst-case income ($) | Validated income ($) |
| --- | --- | --- |
| 1 | 0.80×10^5 | 0.97×10^5 |
| 3 | 1.63×10^5 | 1.96×10^5 |
| 4 | 6.05×10^5 | 6.30×10^5 |
| 5 | 7.78×10^5 | 7.94×10^5 |
| 6 | 12.16×10^5 | 12.16×10^5 |

TABLE IV  Incomes of Test Set in Risk-neutral Mode

| Strategy No. | Expected income ($) | Validated income ($) |
| --- | --- | --- |
| 2 | 9.77×10^5 | 9.29×10^5 |
| 3 | 10.68×10^5 | 10.21×10^5 |
| 4 | 11.22×10^5 | 10.79×10^5 |
| 5 | 11.72×10^5 | 11.21×10^5 |
| 6 | 12.16×10^5 | 12.16×10^5 |
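The relative gains implied by Table IV can be recomputed with a few lines of Python. The sketch assumes that strategy 2 serves as the baseline, which is an inference from the numbers rather than something the table states; because the table entries are rounded, the results agree with the 9.9%, 16.2%, and 20.7% quoted in the conclusion only to within about 0.1 percentage point.

```python
# Validated incomes ($) in risk-neutral mode, copied from Table IV.
validated = {2: 9.29e5, 3: 10.21e5, 4: 10.79e5, 5: 11.21e5, 6: 12.16e5}

baseline = validated[2]  # assumption: strategy 2 is the baseline
gains = {
    s: 100.0 * (validated[s] - baseline) / baseline  # relative gain in percent
    for s in (3, 4, 5)
}
# Share of the ideal (perfect-forecast) income reached by strategy 5.
share_of_ideal = validated[5] / validated[6]
```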

F. Computation Time Analysis

Table V lists the computation time of different bidding strategies. It shows that integrating under-generation penalties into the bidding model increases the computation time. However, by squeezing the solution space, the tailored C&CG algorithm can significantly improve the computation efficiency of PBB model in risk-averse mode compared with the original C&CG algorithm.

TABLE V  Computation Time of Different Bidding Strategies

| Bidding strategy | Computation time (s) |
| --- | --- |
| Strategy 1 | 0.73 |
| Strategy 2 | 1.12 |
| PBB model in risk-neutral mode | 11.32 |
| PBB model in risk-averse mode + original C&CG algorithm | 152.54 |
| PBB model in risk-averse mode + tailored C&CG algorithm | 2.41 |

VI. Conclusion

This paper introduces a KDCL framework to align the optimization objectives of models in knowledge-data-combined strategies. We apply and verify the KDCL framework in the context of the HPP bidding problem. Additionally, we propose a data-driven DEQ model and a knowledge-driven PBB model specific to the HPP bidding problem. Case studies based on the data of NREL and NYISO are conducted. In the simulations, the PBB (strategy 3), DEQ + PBB (strategy 4), and DEQ + PBB + KDCL (strategy 5) outperform the baseline by 9.9%, 16.2%, and 20.7% in half-year validated income, respectively. Moreover, the DEQ model shows better adaptivity and flexibility than static models in capturing the forecast errors of PV generation, market price, and under-generation penalty. Case studies also demonstrate that the KDCL framework can automatically compute the dropout percentage of the training set and generate enhanced data. The effectiveness of KDCL framework in the HPP bidding problem verifies it can enhance the overall performance of knowledge-data-combined strategies.

In future work, we will analyze the automatic selection criteria for the operation mode of the bidding model.

References

[1] Y. Yuan, Y. Cao, X. Zhang et al., “Optimal proportion of wind and PV capacity in provincial power systems based on bilevel optimization algorithm under low-carbon economy,” Journal of Modern Power Systems and Clean Energy, vol. 3, no. 1, pp. 33-40, Feb. 2015.

[2] C. Byers and A. Botterud, “Additional capacity value from synergy of variable renewable energy and energy storage,” IEEE Transactions on Sustainable Energy, vol. 11, no. 2, pp. 1106-1109, Apr. 2020.

[3] U.S. Department of Energy Office of Energy Efficiency and Renewable Energy. (2020, Jan.). Solar energy technologies office fiscal year 2020 funding program. [Online]. Available: https://www.hydro.org/wp-content/uploads/2020/02/SETO_FY_2020_FOA_hybridSystem.pdf

[4] B. Huang and J. Wang, “Deep-reinforcement-learning-based capacity scheduling for PV-battery storage system,” IEEE Transactions on Smart Grid, vol. 12, no. 3, pp. 2272-2283, May 2021.

[5] National Renewable Energy Laboratory. (2013, May). Impacts of variability and uncertainty in solar photovoltaic generation at multiple timescales. [Online]. Available: https://www.osti.gov/servlets/purl/1081387

[6] O. Lak, M. Rastegar, M. Mohammadi et al., “Risk-constrained stochastic market operation strategies for wind power producers and energy storage systems,” Energy, vol. 215, p. 119092, Jan. 2021.

[7] D. Xiao, H. Chen, C. Wei et al., “Statistical measure for risk-seeking stochastic wind power offering strategies in electricity markets,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 5, pp. 1437-1442, Sept. 2022.

[8] D. Xiao, M. K. AlAshery, and W. Qiao, “Optimal price-maker trading strategy of wind power producer using virtual bidding,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 3, pp. 766-778, May 2022.

[9] A. Samimi, “Probabilistic day-ahead simultaneous active/reactive power management in active distribution systems,” Journal of Modern Power Systems and Clean Energy, vol. 7, no. 6, pp. 1596-1607, Jul. 2019.

[10] L. Baringo and A. J. Conejo, “Offering strategy via robust optimization,” IEEE Transactions on Power Systems, vol. 26, no. 3, pp. 1418-1425, Aug. 2011.

[11] A. A. Thatte, L. Xie, D. E. Viassolo et al., “Risk measure based robust bidding strategy for arbitrage using a wind farm and energy storage,” IEEE Transactions on Smart Grid, vol. 4, no. 4, pp. 2191-2199, Dec. 2013.

[12] A. Attarha, N. Amjady, and S. Dehghan, “Affinely adjustable robust bidding strategy for a solar plant paired with a battery storage,” IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2629-2640, May 2019.

[13] M. Farahani, A. Samimi, and H. Shateri, “Robust bidding strategy of battery energy storage system (BESS) in joint active and reactive power of day-ahead and real-time markets,” Journal of Energy Storage, vol. 59, pp. 106520-106535, Mar. 2023.

[14] F. Fang, S. Yu, and X. Xin, “Data-driven-based stochastic robust optimization for a virtual power plant with multiple uncertainties,” IEEE Transactions on Power Systems, vol. 37, no. 1, pp. 456-466, Jan. 2022.

[15] M. Farahani, A. Samimi, and H. Shateri, “Optimal scheduling of energy storage systems with private ownership based on a stochastic-robust hybrid optimization model in energy and ancillary services markets,” Journal of Modeling in Engineering, vol. 22, pp. 1-12, Aug. 2023.

[16] M. Daneshvar, B. Mohammadi-Ivatloo, K. Zare et al., “Two-stage robust stochastic model scheduling for transactive energy based renewable microgrids,” IEEE Transactions on Industrial Informatics, vol. 16, no. 11, pp. 6857-6867, Nov. 2020.

[17] A. Baringo and L. Baringo, “A stochastic adaptive robust optimization approach for the offering strategy of a virtual power plant,” IEEE Transactions on Power Systems, vol. 32, no. 5, pp. 3492-3504, Sept. 2017.

[18] X. Zheng and H. Chen, “Data-driven distributionally robust unit commitment with Wasserstein metric: tractable formulation and efficient solution method,” IEEE Transactions on Power Systems, vol. 35, no. 6, pp. 4940-4943, Nov. 2020.

[19] X. Zheng, B. Zhou, X. Wang et al., “Day-ahead network-constrained unit commitment considering distributional robustness and intraday discreteness: a sparse solution approach,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 2, pp. 489-501, Mar. 2023.

[20] R. Zhu, H. Wei, and X. Bai, “Wasserstein metric based distributionally robust approximate framework for unit commitment,” IEEE Transactions on Power Systems, vol. 34, no. 4, pp. 2991-3001, Jul. 2019.

[21] C. Zhao and Y. Guan, “Data-driven risk-averse stochastic optimization with Wasserstein metric,” Operations Research Letters, vol. 46, no. 2, pp. 262-267, Mar. 2018.

[22] C. A. Gamboa, D. M. Valladão, A. Street et al., “Decomposition methods for Wasserstein-based data-driven distributionally robust problems,” Operations Research Letters, vol. 49, no. 5, pp. 696-702, Sept. 2021.

[23] Y. Ye, D. Qiu, M. Sun et al., “Deep reinforcement learning for strategic bidding in electricity markets,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1343-1355, Mar. 2020.

[24] Y. Dong, Z. Dong, T. Zhao et al., “A strategic day-ahead bidding strategy and operation for battery energy storage system by reinforcement learning,” Electric Power Systems Research, vol. 196, pp. 107229-107235, Jul. 2021.

[25] Y. Yang, W. Wu, B. Wang et al., “Analytical reformulation for stochastic unit commitment considering wind power uncertainty with Gaussian mixture model,” IEEE Transactions on Power Systems, vol. 35, no. 4, pp. 2769-2782, Jul. 2020.

[26] D. W. van der Meer, J. Widen, and J. Munkhammar, “Review on probabilistic forecasting of photovoltaic power production and electricity consumption,” Renewable and Sustainable Energy Reviews, vol. 81, pp. 1484-1512, Jan. 2018.

[27] S. Karagiannopoulos, P. Aristidou, and G. Hug, “Data-driven local control design for active distribution grids using off-line optimal power flow and machine learning techniques,” IEEE Transactions on Smart Grid, vol. 10, no. 6, pp. 6461-6471, Nov. 2019.

[28] X. Tang, X. Bai, Z. Weng et al., “Graph convolutional network-based security-constrained unit commitment leveraging power grid topology in learning,” Energy Reports, vol. 9, pp. 3544-3552, Dec. 2023.

[29] C. Ning and X. Ma, “Data-driven Bayesian nonparametric Wasserstein distributionally robust optimization,” IEEE Control Systems Letters, vol. 7, pp. 3597-3602, Nov. 2023.

[30] Z. Ding, X. Huang, Z. Liu et al., “A two-level scheduling algorithm for battery systems and load tap changers coordination in distribution networks,” IEEE Transactions on Power Delivery, vol. 37, no. 4, pp. 3027-3037, Aug. 2022.

[31] J. Han, L. Yan, and Z. Li, “A task-based day-ahead load forecasting model for stochastic economic dispatch,” IEEE Transactions on Power Systems, vol. 36, no. 6, pp. 5294-5304, Nov. 2021.

[32] X. Chen, Y. Yang, Y. Liu et al., “Feature-driven economic improvement for network-constrained unit commitment: a closed-loop predict-and-optimize framework,” IEEE Transactions on Power Systems, vol. 37, no. 4, pp. 3104-3118, Jul. 2022.

[33] E. T. Jaynes, “Information theory and statistical mechanics: II,” Physical Review, vol. 108, no. 2, pp. 171-190, Oct. 1957.

[34] E. Schulz, M. Speekenbrink, and A. Krause, “A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions,” Journal of Mathematical Psychology, vol. 85, pp. 1-16, Aug. 2018.

[35] H. Sheng, J. Xiao, Y. Cheng et al., “Short-term solar power forecasting based on weighted Gaussian process regression,” IEEE Transactions on Industrial Electronics, vol. 65, no. 1, pp. 300-308, Jan. 2018.

[36] P. Sheikhahmadi and S. Bahramara, “The participation of a renewable energy-based aggregator in real-time market: a bi-level approach,” Journal of Cleaner Production, vol. 276, p. 123149, Dec. 2020.

[37] F. Si, J. Wang, Y. Han et al., “Risk-averse multiobjective optimization for integrated electricity and heating system: an augment epsilon-constraint approach,” IEEE Systems Journal, vol. 16, no. 4, pp. 5142-5153, Dec. 2022.

[38] S. Bruno, S. Ahmed, A. Shapiro et al., “Risk neutral and risk averse approaches to multistage renewable investment planning under uncertainty,” European Journal of Operational Research, vol. 250, no. 3, pp. 979-989, May 2016.

[39] H. Alibrahim and S. A. Ludwig, “Hyperparameter optimization: comparing genetic algorithm against grid search and Bayesian optimization,” in Proceedings of 2021 IEEE Congress on Evolutionary Computation (CEC), Kraków, Poland, Jun. 2021, pp. 1551-1559.

[40] H. A. Noughabi, “Entropy estimation using numerical methods,” Annals of Data Science, vol. 2, no. 2, pp. 231-241, Oct. 2015.

[41] B. Zeng and L. Zhao, “Solving two-stage robust optimization problems using a column-and-constraint generation method,” Operations Research Letters, vol. 41, no. 5, pp. 457-461, Sept. 2013.

[42] H. Zhao, B. Wang, X. Wang et al., “Active dynamic aggregation model for distributed integrated energy system as virtual power plant,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 5, pp. 831-840, Sept. 2020.

[43] X. Huang, Z. Zhang, Y. Lin et al., “Arbitrage and capacity firming in coordination with day-ahead bidding of a hybrid PV plant,” in Proceedings of 2022 IEEE PES General Meeting, Denver, USA, Jul. 2022, pp. 1-5.

[44] National Renewable Energy Laboratory. (2022, Jan.). Solar power data for integration studies. [Online]. Available: https://www.nrel.gov/grid/solar-power-data.html

[45] New York Independent System Operator. (2023, May). Energy market & operational data. [Online]. Available: https://www.nyiso.com/energy-market-operational-data

[46] New York Independent System Operator. (2023, May). Market administration and control area services tariff. [Online]. Available: https://nyisoviewer.etariff.biz/ViewerDocLibrary/MasterTariffs/9FullTariffNYISOMST.pdf

[47] I. P. Panapakidis and A. S. Dagoumas, “Day-ahead electricity price forecasting via the application of artificial neural network based models,” Applied Energy, vol. 172, pp. 132-151, Jun. 2016.

[48] G. He, Q. Chen, C. Kang et al., “Optimal bidding strategy of battery storage in power markets considering performance-based regulation and battery cycle life,” IEEE Transactions on Smart Grid, vol. 7, no. 5, pp. 2359-2367, Sept. 2016.

[49] MathWorks. (2024, Jan.). Bayesian optimization. [Online]. Available: https://www.mathworks.com/help/stats/bayesianoptimization.html

[50] E. Du, N. Zhang, C. Kang et al., “Managing wind power uncertainty through strategic reserve purchasing,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 2547-2559, Jul. 2017.