Journal of Modern Power Systems and Clean Energy

ISSN 2196-5625 CN 32-1884/TK

Scenario-based Optimal Real-time Charging Strategy of Electric Vehicles with Bayesian Long Short-term Memory Networks

  • Hongtao Ren 1
  • Chung-Li Tseng 2
  • Fushuan Wen 3 (Fellow, IEEE)
  • Chongyu Wang 1
  • Guoyan Chen 4
  • Xiao Li 5
1. College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China; 2. UNSW Business School, The University of New South Wales, NSW 2052, Sydney, Australia; 3. Hainan Institute, Zhejiang University, Sanya 572000, China; 4. Department of Electrical Engineering, College of Information Science and Engineering, Huaqiao University, Xiamen, China; 5. Guangzhou Power Supply Bureau of Guangdong Power Grid Co., Ltd., Guangzhou, China

Updated:2024-09-24

DOI:10.35833/MPCE.2023.000512

Abstract

Joint operation optimization for electric vehicles (EVs) and on-site or adjacent photovoltaic generation (PVG) is pivotal to maintaining the security and economics of the operation of the power system concerned. Conventional offline optimization algorithms lack real-time applicability due to uncertainties involved in the charging service of an EV charging station (EVCS). Firstly, an optimization model for real-time EV charging strategy is proposed to address these challenges, which accounts for environmental uncertainties of an EVCS, encompassing EV arrivals, charging demands, PVG outputs, and the electricity price. Then, a scenario-based two-stage optimization approach is formulated. The scenarios of the underlying uncertain environmental factors are generated by the Bayesian long short-term memory (B-LSTM) network. Finally, numerical results substantiate the efficacy of the proposed optimization approach, and demonstrate superior profitability compared with prevalent approaches.

I. Introduction

WITH recent advances in electric vehicle (EV) batteries and charging technologies, EVs play an increasingly important role in reducing the consumption of fossil fuels and the emission of carbon dioxide [1], [2]. However, an extensive integration of EV charging loads to a distribution network (DN) may result in adverse effects to the DN, e.g., increased peak-valley difference in the power grid and enhanced cost of power transmission losses [3], which may indirectly impede the promotion of EVs and the implementation of the zero-carbon target. These adverse impacts, however, can be mitigated by optimizing the EV charging strategy in an EV charging station (EVCS) [4].

Developing an optimal EV charging strategy is a viable solution to maintain the security of the power grid and increase renewable energy utilization with a high penetration of EVs. However, faced with stochastic traffic conditions [5], various habits of EV users [6], and dynamic energy prices and renewable energy generation outputs [7], it is challenging to efficiently optimize the EV charging/discharging [8].

To formulate EV charging scheduling as a stochastic optimization problem, a real-time optimization scheduling method that emphatically considers uncertain traffic flows of EVs is proposed using the well-developed model predictive control (MPC) in [9] and [10]. In a similar manner, the MPC is used to depict uncertain charging behaviors of EV users for developing a real-time EV charging strategy in [11]. A two-stage management framework for an island microgrid considering renewable intermittency at the day-ahead time scale is proposed in [12]. However, due to unavoidable forecasting error, the EV charging scheduling obtained by the MPC may be far from the optimal one.

Recently, model-free reinforcement learning (RL) approaches have achieved great success in dealing with problems with high-dimensional EV data and uncertainties [13], [14]. The charging scheduling problem is solved using Q-learning in [15]. A reward table is employed to estimate the optimal action-value function by discretizing the charging/discharging actions. A neural network (NN) based deep Q-network (DQN) is utilized for overcoming the downside of using a Q-table [16]. A probability-based optimal EV charging strategy integrated with a DQN algorithm is developed for an EV aggregator (EVA) in [17]. However, the dimensionality of the inputs to the Q-network is fixed [18], which implies that the charging features of EVs dynamically arriving at the EVCS cannot be used as inputs to the DQN. Some studies address this issue by using aggregated features of the EVCS, such as its total charging demand, as inputs rather than the specific features of each EV [19]; such approaches are less efficient at extracting valuable information from the available data than model-based approaches.

Scenario analysis is a popular model-based approach to address the uncertainties involved in the EV charging scheduling problem. A scalable method is proposed in [20] to cater for uncertain traffic flows by extracting the expected overall cost using the Monte Carlo simulation. A dynamic method is proposed in [21] to forecast the renewable energy future in Germany by using a multi-scenario probability approach. Each candidate solution is evaluated with respect to various scenarios and the objective function is then obtained by weighting the evaluation results according to the probabilities of the considered scenarios. However, these methods manifest the superiority in an optimal real-time charging strategy only when they can well model the environmental uncertainties. Moreover, they may not be suitable for the optimization of real-time EV charging strategy due to heavy computational burden when traditional statistical methods are employed to build scenarios.

With the development of the Bayesian neural network (BNN) [22], the performance of model-based approaches can be improved by robustly learning the dynamic nature of an environment. Following this approach, an optimization model for real-time EV charging strategy is proposed based on scenario analysis. The uncertain environmental factors in various scenarios, including photovoltaic generation (PVG) output, electricity prices, and EV charging demands, are forecasted based on the Bayesian long short-term memory (B-LSTM) network. The main contributions of this paper include three aspects.

1) Based on the analysis of EV charging process, an optimization model for EV charging strategy is formulated for an EVCS considering uncertain environmental factors such as EV arrivals, EV charging demands, PVG outputs, and electricity prices.

2) A scenario-based two-stage optimization approach is proposed to devise farsighted real-time EV charging strategies to increase the expected profitability with these uncertainties.

3) The B-LSTM network is applied to forecast some uncertain environmental factors with randomness for an EVCS, aiming to generate typical scenarios in support of the proposed optimization approach.

The remainder of this paper is organized as follows. The optimization model for EV charging strategy is formulated in Section II. The proposed optimization model is solved by a scenario-based two-stage optimization approach in Section III. Then, the B-LSTM network is employed in Section IV as the stochastic forecast functions to forecast some uncertain environmental factors. In Section V, numerical simulations are carried out to demonstrate the effectiveness of the proposed optimization approach. Finally, this paper is concluded in Section VI.

II. Formulation of Optimization Model for EV Charging Strategy

The optimization of EV charging strategy in this paper primarily targets EVs with extended parking durations, typically found in workplace and residential areas. These EVs tend to remain parked for considerably longer periods than required for a full charge at their rated power, indicating a high potential for adjustment. The process for an EV to receive charging service provided by such an EVCS is shown in Fig. 1. Each parking spot at the EVCS is assumed to be connected to the charging network through a charging port. When the $n$th EV arrives at the EVCS, the EVCS records its arrival time $T_n^{A}$. The EVCS also records the initial state of charge (SOC) of EV battery $\mathrm{SOC}_n^{I}$. The driver then announces the departure time $T_n^{D}$, when the charging service would be completed with an expected SOC $\mathrm{SOC}_n^{E}$. The driver then surrenders charging control to the EVCS and may wait or leave for personal reasons [23]. When the charging service finishes at $T_n^{D}$, the $n$th EV leaves the EVCS.

Fig. 1  Process for an EV to receive charging service provided by EVCS.

The EVCS purchases electricity from a DN to supply the EV charging demands. Depending on the requested charging duration of the $n$th EV, which is $T_n^{D}-T_n^{A}$, the EVCS decides its charging strategy. For example, if the charging duration is short, fast charging with the maximum power may be needed, which may have a negative impact on the DN. If the charging duration is much longer, it is possible for the EVCS to maximize its profit by delaying the charging to the hours when the electricity price is expected to be lower.

Figure 2 is a Gantt chart that illustrates the charging durations of EVs at the EVCS. Each bar represents the charging duration of an EV, starting at time $T_n^{A}$ and ending at $T_n^{D}$ for the $n$th EV. At each time $t$, EVs are in different statuses. $I_t$ records the EVs that are still under charging at time $t$, while $L_t$ tracks the EVs that have just finished charging and are going to leave the EVCS. The two sets are defined as $I_t=\{n \mid T_n^{A}\le t<T_n^{D}\}$ and $L_t=\{n \mid t=T_n^{D}\}$. Taking Fig. 2 as an example, we have $I_t=\{2,3,4\}$ and $L_t=\{1\}$.
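The two set definitions can be illustrated with a minimal Python sketch. The arrival and departure times below are hypothetical values chosen only so that the result matches the Fig. 2 example; they are not taken from the paper's dataset.

```python
def active_and_leaving(arrivals, departures, t):
    """Return (I_t, L_t) for EVs indexed from 1.

    I_t = {n | T_n^A <= t < T_n^D}: EVs still under charging at time t.
    L_t = {n | t == T_n^D}: EVs whose charging service ends at time t.
    """
    pairs = list(zip(arrivals, departures))
    I_t = {n for n, (a, d) in enumerate(pairs, start=1) if a <= t < d}
    L_t = {n for n, (_, d) in enumerate(pairs, start=1) if d == t}
    return I_t, L_t


# Hypothetical arrival/departure hours for four EVs (illustrative only).
arrivals = [1, 3, 4, 5]
departures = [6, 9, 8, 10]
I_t, L_t = active_and_leaving(arrivals, departures, t=6)
print(I_t, L_t)
```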

Fig. 2  Gantt chart illustrating charging durations of EVs at EVCS.

The EVCS determines the charging power $P_{n,t}$ for each EV at time $t$ based on the information of electricity price $c_t$ and PVG output $P_t^{PV}$ at time $t$. Meanwhile, the following constraints regarding the SOC of the $n$th EV at time $t$ during the charging process, denoted by $\mathrm{SOC}_{n,t}$, need to be tracked for all $n\in I_t$.

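$$Q_{n,t}=\begin{cases}\eta^{c}P_{n,t}\Delta t & P_{n,t}\ge 0\\ P_{n,t}\Delta t/\eta^{d} & P_{n,t}<0\end{cases} \tag{1}$$

$$\mathrm{SOC}^{\min}\le \mathrm{SOC}_{n,t}+Q_{n,t}/Q_n^{EV}\le \mathrm{SOC}^{\max} \tag{2}$$

$$\sum_{n\in I_t}Q_{n,t}/\Delta t-P_t^{PV}\le P_{t,\max}^{EVCS} \tag{3}$$

$$Q_{n,t}+\left(T_n^{D}-t-1\right)P_{\max}^{EV}\Delta t\ge Q_n^{EV}\left(\mathrm{SOC}_n^{E}-\mathrm{SOC}_{n,t}\right) \tag{4}$$

$$\mathrm{SOC}_{n,t+1}=\mathrm{SOC}_{n,t}+Q_{n,t}/Q_n^{EV} \tag{5}$$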

Formulas (1) and (2) prevent the nth EV from being over-charged or over-discharged. Formula (3) represents the security constraints of the DN integrated with the EVCS, including voltage constraints and transformer capacity constraints, etc., which are simplified as the sum of charging power of the EVCS being within the cap on the charging power of the EVCS. Formula (4) ensures that the expected SOC could be achieved when the EV leaves the EVCS with the maximum charging power. Formula (5) describes the transition of the SOCs of these EVs. In particular, for the EVs arriving at the EVCS at time t, their SOCs are the initial states the EVCS recorded.
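As an illustration, the per-EV constraints (1), (2), (4), and (5) can be checked for a candidate charging power with a short Python sketch. The efficiency values, SOC bounds, and the 7 kW power limit used here are assumptions for the example, not parameters reported in the paper; the station-level constraint (3) is omitted because it couples all EVs.

```python
def charged_energy(p, dt, eta_c, eta_d):
    """Eq. (1): battery energy change (kWh) for power p (kW) over dt (h)."""
    return eta_c * p * dt if p >= 0 else p * dt / eta_d


def is_feasible(p, soc, t, t_dep, soc_exp, q_ev, p_max, dt=1.0,
                eta_c=0.95, eta_d=0.95, soc_min=0.1, soc_max=1.0):
    """Check Eqs. (2) and (4) for one EV; return (feasible, next SOC)."""
    q = charged_energy(p, dt, eta_c, eta_d)
    next_soc = soc + q / q_ev                          # Eq. (5)
    within_limits = soc_min <= next_soc <= soc_max     # Eq. (2)
    # Eq. (4): energy delivered now plus the most that can still be
    # delivered in the remaining (t_dep - t - 1) periods covers the demand.
    reachable = q + (t_dep - t - 1) * p_max * dt >= q_ev * (soc_exp - soc)
    return within_limits and reachable, next_soc


# A 60 kWh EV at 50% SOC that must reach 90% within four 1 h periods.
print(is_feasible(0.0, 0.5, t=0, t_dep=4, soc_exp=0.9, q_ev=60.0, p_max=7.0))
print(is_feasible(7.0, 0.5, t=0, t_dep=4, soc_exp=0.9, q_ev=60.0, p_max=7.0))
```

In this example, idling in the first period is infeasible because the three remaining periods can deliver at most 21 kWh against a demand of 24 kWh.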

The EVCS collects revenue when each charging service is completed, i.e., at $T_n^{D}$ for the $n$th EV. Therefore, the reward $R_t'$ that the EVCS receives from charging service at time $t$ is composed of three parts, as shown in (6).

$$R_t'(P_t)=\sum_{n\in L_t}c^{EV}Q_n^{EV}\left(\mathrm{SOC}_n^{E}-\mathrm{SOC}_n^{I}\right)-c_t\max\left\{\left(\sum_{n\in I_t}P_{n,t}-P_t^{PV}\right)\Delta t,0\right\}+\beta c_t\max\left\{\left(P_t^{PV}-\sum_{n\in I_t}P_{n,t}\right)\Delta t,0\right\} \tag{6}$$

The first term in (6) represents the fee collected at the end of each charging service. The second and third terms measure the electricity purchasing cost and selling revenue from and to the DN at time $t$ considering the PVG output, respectively.

In addition, EVCSs can also gain more revenue by providing ancillary service. Demand response helps grid operators manage peak demand, reduce the need for additional power generation, and enhance grid reliability. Generally, the grid operator provides timing and price signals to EVCSs participating in demand response in a variety of ways (e.g., e-mail, short messaging, or automated alerts). EVCS operators will be compensated based on the amount of electricity they use or supply during the demand response, as formulated in (7).

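$$R_t(P_t)=R_t'(P_t)+c_t^{P}\max\left\{\left(P_t^{PV}-\sum_{n\in I_t}P_{n,t}\right)\Delta t,0\right\}+c_t^{V}\max\left\{\left(\sum_{n\in I_t}P_{n,t}-P_t^{PV}\right)\Delta t,0\right\} \tag{7}$$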
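The reward in (6) and (7) can be sketched in Python as follows. The charging price of 2.5 ¥/kWh follows the data specification of this paper; the value of $\beta$ and the demand response prices (named `c_peak` and `c_valley` here) are illustrative assumptions.

```python
def station_reward(p, pv, price, leaving, dt=1.0, beta=0.8,
                   c_ev=2.5, c_peak=0.0, c_valley=0.0):
    """Eq. (6) plus the demand response terms of Eq. (7).

    p:       charging power (kW) of each EV in I_t
    pv:      PVG output (kW)
    price:   grid electricity price c_t (yuan/kWh)
    leaving: list of (Q_EV, SOC_E, SOC_I) tuples for the EVs in L_t
    """
    net = (sum(p) - pv) * dt                 # > 0: purchase, < 0: surplus
    fee = sum(c_ev * q * (soc_e - soc_i) for q, soc_e, soc_i in leaving)
    purchase = price * max(net, 0.0)
    sale = beta * price * max(-net, 0.0)
    response = c_peak * max(-net, 0.0) + c_valley * max(net, 0.0)
    return fee - purchase + sale + response


# Two EVs charging at 7 kW, 4 kW of PVG, one EV leaving after gaining 50% SOC.
print(station_reward([7.0, 7.0], pv=4.0, price=0.5, leaving=[(60.0, 0.9, 0.4)]))
```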

The expected value of long-term profit needs to be considered in optimizing the charging strategy of EVCS, so one must consider the underlying uncertainties, including the future charging demands ($T_n^{A}$, $T_n^{D}$, $\mathrm{SOC}_n^{I}$, and $\mathrm{SOC}_n^{E}$), PVG outputs, and electricity prices. This stochastic optimization problem can be described as a multi-stage decision-making problem as:

$$V_t(\mathrm{SOC}_t)=\max_{P_t}\left\{R_t(P_t)+E_{U_{t+1}}\left[V_{t+1}(\mathrm{SOC}_{t+1})\right]\right\} \tag{8}$$

To simplify notations, let $\mathrm{SOC}_t=\{\mathrm{SOC}_{n,t}\}_{n\in I_t}$ and $P_t=\{P_{n,t}\}_{n\in I_t}$. It should be noted that since the number of EVs at EVCS is constantly changing, the dimensions of $\mathrm{SOC}_t$ and $P_t$ are also dynamic.

Given the current operational and market states $\mathrm{SOC}_t$ at time $t$, $V_t(\mathrm{SOC}_t)$ represents the expected total profit for the EVCS over the remaining period of assessment; and $E_{U_{t+1}}$ represents the expectation operator with respect to the uncertainties $U_{t+1}$. Formula (8) can be applied by the EVCS on a rolling horizon basis.

III. Scenario-based Two-stage Optimization Approach for EV Charging Strategy

A. Formulation of Two-stage Optimization Approach

Instead of tackling the multi-stage decision-making problem formulated in (8), we formulate a two-stage optimization approach to approximate (8), in which the first stage is the current period ($t=0$), and the second stage covers all remaining periods ($t\ge 1$). The approximate formula for (8) is expressed as:

$$V_0^{*}(\mathrm{SOC}_t)\approx\max_{P_0,P_t}\left\{R_0(P_0)+E\left[\sum_{t=1}^{T}R_t(P_t;U_t)\right]\right\}\quad \text{s.t. (1)-(4)} \tag{9}$$

Note that in (9), we have explicitly shown the set for uncertainties $U_t$ as arguments of the reward function $R_t$ because now we are dealing with the uncertainties. Our approach is to generate many ($K$) scenarios for the uncertain environmental factors $U_1^{(j)},U_2^{(j)},\ldots,U_T^{(j)}$, $j=1,2,\ldots,K$. To further solve the two-stage problem, (9) can be reduced to a deterministic and multi-period EV charging problem, as expressed in (10), since all future uncertainties $U_t^{(j)}$ in the $j$th scenario can be known.

$$V_0^{*}(\mathrm{SOC}_t)\approx\max_{P_0,P_t^{(j)}}\left\{R_0(P_0)+\frac{1}{K}\sum_{j=1}^{K}\sum_{t=1}^{T}R_t\left(P_t^{(j)};U_t^{(j)}\right)\right\} \tag{10}$$

Formula (10) involves the maximum operation in (7), i.e., $\max\left\{\left(\sum_{n\in I_t}P_{n,t}-P_t^{PV}\right)\Delta t,0\right\}$, which is a nonlinear term. The nonlinear term in (7) can be easily converted into a mixed-integer linear function by a piecewise linear approximation algorithm [24]. The whole problem can then be modeled in YALMIP and solved by a commercial solver such as CPLEX.
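The structure of the scenario-averaged objective in (10) can be illustrated on a deliberately simplified toy case: a single EV, no PVG, no discharging, unit efficiency, and 1 h periods. Under these assumptions the second-stage problem for each scenario reduces to buying the remaining energy in the cheapest future hours, so a greedy rule stands in for the mixed-integer program. This is a sketch of the two-stage idea only, not the paper's solver setup, and all names below are hypothetical.

```python
def second_stage_cost(energy, prices, p_max):
    """Cheapest cost of buying `energy` kWh over the future hours of one
    price scenario, with at most p_max kWh per hour (assumed feasible)."""
    cost = 0.0
    for c in sorted(prices):
        buy = min(p_max, energy)
        cost += c * buy
        energy -= buy
        if energy <= 0:
            break
    return cost


def best_first_stage(energy, c0, scenarios, p_max, candidates):
    """Pick the first-stage power P_0 maximizing the sample-average profit."""
    def expected_profit(p0):
        future = sum(second_stage_cost(energy - p0, s, p_max) for s in scenarios)
        return -c0 * p0 - future / len(scenarios)
    return max(candidates, key=expected_profit)


# Two equally weighted price scenarios for the next two hours (K = 2).
scenarios = [[0.3, 0.9], [0.8, 0.9]]
print(best_first_stage(10.0, c0=0.5, scenarios=scenarios, p_max=5.0, candidates=[0.0, 5.0]))
print(best_first_stage(10.0, c0=1.0, scenarios=scenarios, p_max=5.0, candidates=[0.0, 5.0]))
```

With a current price of 0.5, charging now is preferred because one future hour is expensive in every scenario; with a current price of 1.0, deferring all charging is preferred.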

B. Generating EV Charging Scenarios Based on Probability Forecasts

Considering the future environmental uncertainties, how to obtain the scenarios $U_1^{(j)},U_2^{(j)},\ldots,U_T^{(j)}$ in (10) is the key to attaining an accurate solution of the two-stage problem. The uncertain variables in one scenario include the number of EV arrivals $N_t^{A}$, the PVG outputs $P_t^{PV}$, and the electricity price $c_t$. Moreover, the demand information of arriving EVs, including their initial SOC $\mathrm{SOC}_n^{I}$, expected SOC $\mathrm{SOC}_n^{E}$, and departure time $T_n^{D}$, is also included in each scenario. Most existing scenario generation methods directly approximate the probability distributions of the variables based on historical data [25]. While the probability distributions of some variables may be estimated purely based on historical data, they may be better forecasted based on their influencing factors. For example, PVG outputs depend on weather, and EV charging demand varies from weekdays to weekends and workdays to holidays.

In this paper, we identify variables and their influencing factors, including the time information $T_t$ (year, month, day, hour, and a binary variable indicating whether it is a workday) and meteorological information $W_t^{M}$ (weather, temperature, humidity, and wind strength), to construct a forecast function for forecasting the probability distributions of these uncertain variables in the future.

While these uncertain variables can be forecasted given sufficient data for training, they only provide point estimates. To avoid making overly confident forecasting and furthermore to produce random samples for the proposed optimization approach, it is more desirable that the outputs of the forecast function are probability distributions rather than fixed values.

Parameters in the stochastic forecast function are no longer a set of determined values obtained by training, but a set of random variables satisfying the probability density function $q(w)$ [26]. We can build stochastic forecast functions for each uncertain variable to approximate their future values from $t+1$ to $t+T$, as shown in (11).

$$\left(y_{t+1},y_{t+2},\ldots,y_{t+T}\right)=f\left(T_t,T_{t+1},\ldots,T_{t+T},W_t^{M},W_{t+1}^{M},\ldots,W_{t+T}^{M};w\right),\quad w\sim q(w) \tag{11}$$

where the deterministic parameters $w$ are randomly sampled based on the probability density function $q(w)$.

Figure 3 illustrates the process of generating EV charging scenarios. Firstly, the time and meteorological information from the past $T$ time periods are input into the stochastic forecast functions, enabling the forecasting of electricity prices $c_{t+1},c_{t+2},\ldots,c_{t+T}$, PVG outputs $P_{t+1}^{PV},P_{t+2}^{PV},\ldots,P_{t+T}^{PV}$, and the number of EV arrivals $N_{t+1}^{A},N_{t+2}^{A},\ldots,N_{t+T}^{A}$ in the subsequent $T$ time periods. For the forecasted $N_t^{A}$ EVs at time $t$, the forward propagation process in B-LSTM network is executed $N_t^{A}$ times to forecast the initial SOC, expected SOC, and parking time, obtaining the charging demands for all $N_t^{A}$ EVs. Since the parameters in B-LSTM network are random variables, the distribution of charging demands for $N_t^{A}$ EVs can be forecasted with multiple executions. So far, a typical scenario is obtained.

Fig. 3  Process of generating EV charging scenarios.

The process described in Fig. 3 can be used repeatedly to generate scenarios for formula (10). What remains to show is how the stochastic forecast function $f(\cdot)$ is designed to generate random scenarios of the uncertain environmental factors, which will be detailed in Section IV.
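The sampling pattern behind (11) can be sketched in Python: each scenario is produced by drawing one parameter vector $w\sim q(w)$ and running a forward pass. The linear function `toy_forecast` below is a hypothetical stand-in for $f(\cdot)$, not a B-LSTM network, and the Gaussian form of $q(w)$ with means `mu` and standard deviations `rho` is assumed for illustration.

```python
import random


def sample_weights(mu, rho, rng):
    """Draw w ~ q(w), taken here as independent N(mu_k, rho_k^2)."""
    return [m + r * rng.gauss(0.0, 1.0) for m, r in zip(mu, rho)]


def toy_forecast(features, w):
    """Hypothetical stand-in for f(.; w): a linear map, not an LSTM."""
    return [w[0] + w[1] * x for x in features]


def generate_scenarios(features, mu, rho, k, seed=0):
    """Generate k scenarios, one forward pass per sampled parameter vector."""
    rng = random.Random(seed)
    return [toy_forecast(features, sample_weights(mu, rho, rng)) for _ in range(k)]


scenarios = generate_scenarios([0.0, 1.0, 2.0], mu=[1.0, 0.5], rho=[0.1, 0.05], k=4)
for s in scenarios:
    print(s)
```

Setting every entry of `rho` to zero collapses all scenarios onto the point forecast, which is the deterministic special case.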

IV. Developing Stochastic Forecast Functions Based on B-LSTM Networks

A. Probabilistic Forecast Model Based on B-LSTM Networks

A classical forecast model is often constructed as a neural-networks (NNs) based regression model [27], [28]. The nonlinear relationship between the inputs (time and meteorological information) and outputs (forecast values in the future) can be fitted by NNs. In this paper, the underlying environmental uncertainties preserve some temporal correlations, which means the outputs at each time step depend not only on current inputs but also on previous states. The relationships of hidden layers at different steps are established in recurrent neural networks (RNNs), where the historical memories of hidden layers are transmitted to consider past states, namely, the values of the hidden layers at the current step are related not only to the input at the current step but also to the values of the hidden layers at the previous steps. LSTM networks are further applied to deal with the vanishing and exploding gradient problems of standard RNNs when learning long-term temporal dependencies [29], [30].

The structure of the lower LSTM cell is illustrated in the lower left part of Fig. 4, inside the blue dashed box.

Fig. 4  Structure of stacked LSTM cell.

In Fig. 4, the superscripts U and L represent the LSTM upper and lower cells, respectively. The key to an LSTM network includes a series of cell states $C_t$, which can maintain information for a long period of time. Unlike in RNNs where values of hidden layers are all updated, the cell states of an LSTM network can be either newly integrated, forgotten, or output, depending on the current input and the output at the previous moment.

The effects of past states and current inputs on the cell states are modulated by an intermediate gate, which is composed of a neuron with a sigmoid activation function. The intermediate gate produces a vector with values between 0 and 1, multiplied by the information to optionally filter the information. The intermediate gates in an LSTM cell contain a forget gate, an input gate, and an output gate, with each of them processing input/output following (12), where the superscripts are omitted for simplicity. The outputs of the gate function could be obtained through the element-wise sigmoid transformation of the sum of the weighted input and the weighted hidden state with the bias vector added.

$$g_t=\sigma\left(W\left[H_{t-1},x_t\right]+B\right) \tag{12}$$

where $\left[H_{t-1},x_t\right]$ represents an action to stack $H_{t-1}$ and $x_t$ into one vector; and $\sigma(\cdot)$ is the sigmoid function.

The update of the cell states at step t includes inheriting information from the previous LSTM cell and integrating information from the new inputs, which is a joint effect of the forget gate and the input gate.

$$C_t=g_t^{F}\odot C_{t-1}+g_t^{I}\odot\tanh\left(W\left[H_{t-1},x_t\right]+B\right) \tag{13}$$

The output gate optionally outputs the information of the cell states at step t to obtain the forecast result.

$$H_t=g_t^{O}\odot\tanh\left(C_t\right) \tag{14}$$
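A single step of the cell update in (12)-(14) can be written out in plain Python as a minimal sketch. The dictionary layout of `params`, with one weight matrix and bias vector per gate plus one for the candidate cell state, is an assumption made for this example.

```python
import math


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def affine(W, z, b):
    """Compute W z + b for a matrix W stored as a list of rows."""
    return [sum(wij * zj for wij, zj in zip(row, z)) + bi for row, bi in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps a name to a (W, B) pair."""
    z = h_prev + x                                    # stack [H_{t-1}, x_t]

    def pre(name):
        W, B = params[name]
        return affine(W, z, B)

    g_f, g_i, g_o = (sigmoid(pre(k)) for k in ("forget", "input", "output"))  # Eq. (12)
    cand = [math.tanh(a) for a in pre("cell")]
    c = [f * cp + i * ca for f, cp, i, ca in zip(g_f, c_prev, g_i, cand)]     # Eq. (13)
    h = [o * math.tanh(ci) for o, ci in zip(g_o, c)]                          # Eq. (14)
    return h, c


# One hidden unit and one input, all weights zero: every gate equals 0.5.
zero = ([[0.0, 0.0]], [0.0])
params = {"forget": zero, "input": zero, "output": zero, "cell": zero}
h, c = lstm_step([1.0], [0.0], [2.0], params)
print(h, c)
```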

Because the features of the input vector $x_t$ are complex and nonlinear, the stacked LSTM network as a deep learning technique is adopted. Stacking LSTM cells enables the model to learn a deeper and more accurate representation. In the stacked model shown on the left side in Fig. 4, the outputs of the lower LSTM cell $H_t^{L}$ are taken as the inputs of the upper LSTM cell, whose dimension is consistent with the input vector to store the input information completely. The dimension of the outputs of the upper LSTM cell $H_t^{U}$ is the same as that of the forecast results, namely, the outputs of the upper LSTM cell are the forecast results so that $y_t=H_t^{U}$.

Since the sequence of an LSTM network is usually too long to entirely unfold, the difficulties of forecasting and training can be addressed by truncating and unfolding the LSTM network, as shown on the right side of Fig. 4. Since we need to generate scenarios for future $T$ time periods, we truncate and unfold the LSTM networks for $T$ steps. The values of cell states and outputs at the time before $t-T$ are set to be 0, which means the information before $t-T$ will not affect the subsequent output.

The process of forecasting is the forward propagation of the networks. Information propagated between steps, including $C_t$ and $H_t$, can be obtained by rolling (13) and (14) from $t-T$ to $t+T$. The outputs of the upper LSTM cell of each forward propagation are the forecast results at that time.

In the above forecast model based on an LSTM network, the historical observations of the uncertain variables have not been used, which may lead to an accumulation of errors. To avoid this problem, we replace the inputs of the upper LSTM cell with the historical observation values before step $t$, i.e., $H_{t-1}^{U}$ in Fig. 4 is replaced by $y_{t-1}$ from $t-T$ to $t$. Therefore, the forecast function based on the LSTM network becomes:

$$\left(y_{t+1},y_{t+2},\ldots,y_{t+T}\right)=f\left(T_{t-T},T_{t-T+1},\ldots,T_{t+T},W_{t-T}^{M},W_{t-T+1}^{M},\ldots,W_{t+T}^{M},y_{t-T},y_{t-T+1},\ldots,y_t;w\right) \tag{15}$$

The topology of a B-LSTM network is inherited from its LSTM counterpart with the same nonlinearity and scalability. The uncertain output is estimated by extending the conventional LSTM network to a B-LSTM network [31], [32] with all network parameters becoming random variables described by the probability density function $q(w)$. Note that since the parameters in a B-LSTM network now follow probability distributions, both the cell states and the outputs in the B-LSTM network follow probability distributions as well. We can, therefore, use it to make probabilistic forecasts and generate random scenarios.

B. Training Parameters of Forecast Model

The parameters of a conventional NN can be learned by the maximum likelihood estimation (MLE) given training dataset D using backpropagation.

$$\frac{\partial p(D\mid w)}{\partial w}=\frac{\partial\prod_{i\in D}p\left(y_i\mid x_i,w\right)}{\partial w} \tag{16}$$

In the LSTM networks, to reduce the variance in the gradients, more than one sequence is trained at one time. Let the dataset $D$ be divided into $M$ minibatches $D_m$. Then, we can write the MLE of the $m$th minibatch as:

$$p\left(D_m\mid w\right)=\prod_{s=1}^{S}p\left(D_{m,s}\mid w\right)=\prod_{s=1}^{S}\prod_{t=1}^{T}p\left(y_t\mid x_t,H_t,C_t,w\right) \tag{17}$$

where $H_t$ and $C_t$ can be obtained from the previous state through the forward propagation of the LSTM networks. By $M$ backpropagations of this MLE shown in (17), we apply all the data to the training of this network.

Different from conventional LSTM networks, the training for the B-LSTM network is to calculate the posterior distribution of the parameters $p(w\mid D)$. The posterior distribution for the parameters $w$ of a network can be calculated by Bayes’ rule, as shown in (18).

$$p(w\mid D)=\frac{p(D\mid w)p(w)}{p(D)}=\frac{p(D\mid w)p(w)}{\int_{w}p(D\mid w)p(w)\,\mathrm{d}w} \tag{18}$$

where $p(w)$ represents the prior distribution of the parameters; and $p(D)$ represents the evidence based on the dataset $D$.

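Since the calculation of the integrals in (18) is intractable, a variational distribution $q(w;\theta)$ defined by parameters $\theta$ is used to approximate it by minimizing the Kullback-Leibler (KL) divergence as shown in (19), which is a trade-off between the prior distribution and the influence of historical data.

$$\min_{\theta}\mathrm{KL}\left[q(w;\theta)\,\|\,p(w\mid D)\right]=\min_{\theta}\int q(w;\theta)\log\frac{q(w;\theta)}{p(w)p(D\mid w)}\,\mathrm{d}w=\min_{\theta}\int q(w;\theta)\left[\log q(w;\theta)-\log p(w)-\log p(D\mid w)\right]\mathrm{d}w \tag{19}$$

The minibatch method is also applied to the training of the B-LSTM network. The KL penalty is equally distributed to each minibatch. The KL divergence of the $m$th minibatch is shown as:

$$\min_{\theta}\mathrm{KL}\left[q(w;\theta)\,\|\,p\left(w\mid D_m\right)\right]=\min_{\theta}\int q(w;\theta)\left[\frac{\log q(w;\theta)-\log p(w)}{M}-\log p\left(D_m\mid w\right)\right]\mathrm{d}w \tag{20}$$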

To simplify the notation, we define:

$$g(w,\theta)=\frac{\log q(w;\theta)-\log p(w)}{M}-\log p\left(D_m\mid w\right) \tag{21}$$

The prior distribution of parameter $w$ in the B-LSTM network is assumed to follow a Gaussian distribution equivalent to a weighted L2 regularization, so the variational posterior distribution is a Gaussian distribution as well. As a result, the parameters of the variational distribution $\theta$ can be defined as a combination of $(\mu,\rho)$.

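A gradient-descent algorithm for training is used to minimize the KL divergence function. Since it is difficult to calculate the gradient of the integral in (20), a Gaussian reparameterization trick has been implemented. The parameters of the B-LSTM network can be expressed as $w=\mu+\rho\delta$, where $\delta$ follows the standard normal distribution with density $p(\delta)$. Therefore, the differential $q(w)\,\mathrm{d}w$ is equivalent to $p(\delta)\,\mathrm{d}\delta$ [33]. So, the derivative of the integral in (20) can be expressed as the integral of a derivative:

$$\frac{\partial}{\partial\theta}\int q(w)g(w,\theta)\,\mathrm{d}w=\int p(\delta)\frac{\partial g(w,\theta)}{\partial\theta}\,\mathrm{d}\delta=\int p(\delta)\left[\frac{\partial g(w,\theta)}{\partial w}\frac{\partial w}{\partial\theta}+\frac{\partial g(w,\theta)}{\partial\theta}\right]\mathrm{d}\delta \tag{22}$$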

The training algorithm of B-LSTM network is summarized as follows.

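Algorithm 1: training algorithm of B-LSTM network

Input: data sample $D$

Output: posterior parameters $(\mu,\rho)$

1: Set $m=0$, $\Delta\mu=\infty$, $\Delta\rho=\infty$, and $\alpha$

2: While $\|\Delta\mu\|_1\ge\varepsilon$ or $\|\Delta\rho\|_1\ge\varepsilon$, do

3:   Randomly sample a vector $\delta$ from the standard normal distribution $N(0,1)$

4:   Calculate $w=\mu+\rho\delta$

5:   Select the next minibatch $D_{m+1}$ if $m<M$; otherwise, select the first minibatch $D_{m=1}$, and calculate $g(w,\theta)$

6:   Calculate $\Delta\mu$ and $\Delta\rho$:

     $\Delta\mu=\partial g(w,\theta)/\partial w+\partial g(w,\theta)/\partial\mu$
     $\Delta\rho=\left(\partial g(w,\theta)/\partial w\right)\delta+\partial g(w,\theta)/\partial\rho$

7:   Update $\mu$ and $\rho$:

     $\mu=\mu-\alpha\Delta\mu$
     $\rho=\rho-\alpha\Delta\rho$

8: End While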
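The update rule of Algorithm 1 can be tried on a one-parameter toy problem. The sketch below assumes a scalar model $y=wx$ with Gaussian noise of known standard deviation, a Gaussian prior $N(0,\sigma_p^2)$, and a Gaussian variational distribution with mean $\mu$ and standard deviation $\rho$; none of these toy settings come from the paper. Two deviations from Algorithm 1 are made for robustness and are assumptions of this sketch: a fixed iteration count replaces the stopping rule, and $\rho$ is floored at a small positive value.

```python
import random


def train_bbb(batches, sigma_prior=1.0, noise_std=0.5, alpha=1e-3,
              iters=5000, seed=0):
    """Toy Bayes-by-backprop for y = w * x with q(w) = N(mu, rho^2)."""
    rng = random.Random(seed)
    M = len(batches)
    mu, rho = 0.0, 0.5
    for k in range(iters):
        delta = rng.gauss(0.0, 1.0)              # step 3
        w = mu + rho * delta                     # step 4
        batch = batches[k % M]                   # step 5: cycle through minibatches
        # d(-log p(D_m | w))/dw for the likelihood y ~ N(w x, noise_std^2)
        nll_grad = -sum((y - w * x) * x for x, y in batch) / noise_std ** 2
        # Partial derivatives of g(w, theta) in Eq. (21)
        dg_dw = (-(w - mu) / rho ** 2 + w / sigma_prior ** 2) / M + nll_grad
        dg_dmu = ((w - mu) / rho ** 2) / M
        dg_drho = (-1.0 / rho + (w - mu) ** 2 / rho ** 3) / M
        d_mu = dg_dw + dg_dmu                    # step 6
        d_rho = dg_dw * delta + dg_drho
        mu -= alpha * d_mu                       # step 7
        rho = max(rho - alpha * d_rho, 1e-2)     # floor: added safeguard
    return mu, rho


# Noise-free data from y = 2x, split into M = 2 minibatches.
batches = [[(0.5, 1.0), (1.0, 2.0)], [(1.5, 3.0), (2.0, 4.0)]]
mu, rho = train_bbb(batches)
print(mu, rho)
```

For this conjugate toy problem the exact posterior is Gaussian with mean $60/31\approx1.94$ and standard deviation $1/\sqrt{31}\approx0.18$, so the learned $(\mu,\rho)$ should settle near these values up to stochastic gradient noise.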

The EVCS operator employs the trained B-LSTM network based stochastic forecast function to allocate charging power to EVs during each time period. During each time period, the historical time information, meteorological information, and EV charging demand are input into the B-LSTM network based stochastic forecast function, generating a series of typical EV charging scenarios. Subsequently, the environmental factors from multiple scenarios are fed into the two-stage optimization approach, as illustrated in (10), in order to derive the optimal EV charging power allocation strategy $P_t$.

It should be noted that in the actual EVCS operation, EV arrivals are continuous and do not align precisely with the whole time period intervals. In this context, we recalculate the optimal charging power for each EV whenever a new EV arrives at the EVCS, approximating the current environmental factors as the values during the whole time period.

V. Numerical Simulations

A. Data Specification

In the numerical simulations, the real-world PVG output data and meteorological data over 182 days are employed. The charging data of EVs are based on the statistics from an EVCS in Nanshan District, Shenzhen, China, which encompass the charging demand and the initial SOC of EVs. The arrival and departure times of EVs are provided by the parking lot operator of this EVCS [34]. Electricity prices over the 182 days are obtained from the intra-day clearing price of a power exchange center in China [35]. According to the notice of power demand response work in China [36], the time and price signals of the demand response participation are set by the following principle: when the grid load is higher than 80% of the maximum load or less than 1.2 times the minimum load of the day, the demand response incentives for peak cutting and valley filling are provided, respectively. In order to train and evaluate the performance of the proposed B-LSTM network, 70% of the data are utilized for training and the remaining 30% are reserved for testing. Then, the obtained stochastic forecast model based on B-LSTM network, together with the test dataset, is employed to demonstrate the effectiveness of the proposed optimization approach for EV charging strategy. The electricity purchase price of EVs from the EVCS is assumed to be 2.5 ¥/kWh and the battery capacity of each EV is 60 kWh.
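The demand response trigger described above can be expressed as a small Python function. The function name, the return labels, and the choice to test the peak condition first when both thresholds overlap are assumptions of this sketch; the load values are illustrative.

```python
def demand_response_signal(load, daily_loads):
    """Return 'peak', 'valley', or None for the current grid load.

    'peak':   load above 80% of the day's maximum load (peak cutting).
    'valley': load below 1.2 times the day's minimum load (valley filling).
    """
    if load > 0.8 * max(daily_loads):
        return "peak"
    if load < 1.2 * min(daily_loads):
        return "valley"
    return None


daily = [50.0, 60.0, 80.0, 100.0]   # illustrative daily load profile
print(demand_response_signal(90.0, daily))
print(demand_response_signal(55.0, daily))
print(demand_response_signal(70.0, daily))
```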

The uncertain environmental factors for the EVCS, including PVG outputs, electricity prices, and EV arrivals, are shown in Fig. 5. It can be observed from Fig. 5 that there is a significant difference between the PVG outputs on cloudy days and sunny days, while the difference of electricity price is less dependent on the weather. The number of EV arrivals at the EVCS varies significantly between weekdays and weekends. Therefore, a probabilistic forecast model based on B-LSTM network is more suitable.

Fig. 5  Distributions of uncertain environmental factors in dataset. (a) PVG output. (b) Electricity price. (c) Number of EV arrivals at EVCS.

B. Generating Scenarios Using a Probabilistic Forecast Model

The proposed B-LSTM network is trained for 500000 epochs to learn the probabilistic characteristics of the uncertain environmental factors. The prior distributions of parameters in the B-LSTM network are all set as $N(0, 0.05^2)$, i.e., a normal distribution with a standard deviation of 0.05, which is equivalent to an L2 penalty for deterministic models. The numbers of neurons in the LSTM upper and lower cells are both set to be 64, which strikes a balance between computational power and model expressiveness. The length of the truncated sequence is determined such that the historical data beyond this length are not used for forecasting the current time step. Since the scheduling cycle for EV charging is typically one day, it is reasonable to approximate that the EV charging demand beyond 24 hours will not significantly influence the current charging strategy, so the length of truncated sequence is set to be 24. For the real-time optimal charging strategy at time $t$, the environmental factors for the subsequent 24 hours are forecasted to generate typical scenarios. This encompasses EV charging demand, electricity price, and PVG output from $t+1$ to $t+24$. To capture the regularity of EV charging sequence and ensure the efficacy of the charging strategy, the length of a time interval is set to be 1 hour. The historical data of one week are used for parameter update in each epoch, and thus the minibatch size of the B-LSTM network is set to be 7.
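The stated equivalence between the Gaussian prior and an L2 penalty can be made explicit with a short worked step. For independent zero-mean Gaussian priors with standard deviation $\sigma=0.05$ on each parameter $w_k$, the negative log-prior is a quadratic penalty; the symbol $\lambda$ for the penalty weight is introduced here for illustration only.

```latex
-\log p(w) = \sum_{k} \frac{w_k^{2}}{2\sigma^{2}} + \text{const}
           = \lambda \lVert w \rVert_2^{2} + \text{const},
\qquad
\lambda = \frac{1}{2\sigma^{2}} = \frac{1}{2 \times 0.05^{2}} = 200 .
```

In the per-minibatch objective (21), this term appears divided by $M$.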

The probability distribution $N(\mu, \sigma^2)$ of each parameter of the B-LSTM network can be obtained from the training result. The time and meteorological information over the previous 24 hours is taken as the input at the beginning of the day. Then, the distributions of the number of EV arrivals, the PVG outputs, and the electricity prices over the future 24 hours can be obtained by the forward propagation of the B-LSTM network. The forecast accuracy of the above environmental factors is shown in Table II.

TABLE II  Forecast Accuracy of Environmental Factors

Environmental factor | RMSE | Proportion of actual values falling within 95% confidence interval of forecast values (%)
PVG output | 13.600 | 96.7
Electricity price | 0.026 | 95.2
Number of EV arrivals | 1.230 | 93.5

It can be observed from Table II that while there are deviations between the actual values and their corresponding mean forecast values, a substantial majority of the observed values fall within the 95% confidence interval of the forecast values, demonstrating the effectiveness of the proposed B-LSTM network.
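The two metrics reported in Table II can be computed as in the following Python sketch. How the paper constructs the 95% interval is not stated in this section; here it is assumed to be the empirical 2.5% to 97.5% band of sampled forecasts at each time step.

```python
import math


def rmse(actual, forecast_mean):
    """Root mean square error between actual values and mean forecasts."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast_mean)) / n)


def coverage_95(actual, samples):
    """Percentage of actual values inside the empirical 95% band.

    samples[i] holds the forecast draws for time step i.
    """
    hits = 0
    for a, draws in zip(actual, samples):
        s = sorted(draws)
        lo = s[int(0.025 * (len(s) - 1))]
        hi = s[int(math.ceil(0.975 * (len(s) - 1)))]
        hits += lo <= a <= hi
    return 100.0 * hits / len(actual)


print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(coverage_95([50, 200], [list(range(101)), list(range(101))]))
```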

The actual values and forecast values of these environmental factors obtained by the proposed B-LSTM networks during a one-week period are compared, as shown in Fig. 6.

Fig. 6  Comparison between actual values and forecast values of environmental factors. (a) PVG output. (b) Electricity price. (c) Number of EV arrivals at EVCS.

It can also be observed from Fig. 6 that, although the overall PVG outputs are higher on sunny days (Monday, Wednesday, and Friday are all sunny days in this test case), the variability of the PVG outputs also increases since the solar irradiance is affected by clouds. For the electricity price forecast, the confidence band is wider in the afternoon, representing a higher level of variability. For EV arrivals, there are more EVs on weekends, with higher variability than on weekdays.

In each scenario, the three charging demand variables of each EV also need to be forecasted, including the initial SOC, the expected SOC, and the departure time. Compared with traditional LSTM networks, which only obtain point estimates, the B-LSTM network can obtain the probability distributions at a specific time. Then, the charging demand variables of all EVs can be forecasted by repeatedly sampling from these probability distributions, as demonstrated in Fig. 7.

Fig. 7  Comparison between actual distribution and forecast distribution for charging demand variables of each EV. (a) Initial SOC. (b) Expected SOC. (c) Charging duration.

The Jensen-Shannon (JS) divergence is introduced to quantify the similarity between the actual and forecast distributions [37]. The mean values of the JS divergences between the actual and forecast distributions for the above three EV charging demand variables are 0.0126, 0.0139, and 0.0852, respectively. The actual distributions of EV charging demands are close to those forecasted by the B-LSTM network, which demonstrates its ability in forecasting the charging demand variables of specific EVs.
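For reference, the JS divergence between two discrete distributions can be computed as sketched below. The logarithm base is not stated in the text; base 2 is assumed here, which bounds the divergence between 0 (identical distributions) and 1 (disjoint supports). The function names and the example histograms are hypothetical.

```python
import math


def kl_divergence(p, q, base=2.0):
    """Kullback-Leibler divergence; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q, base=2.0):
    """Jensen-Shannon divergence between two discrete distributions."""
    # The mixture m is positive wherever p or q is positive, so the
    # KL terms below never divide by zero.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m, base) + 0.5 * kl_divergence(q, m, base)


# Made-up normalized histograms over four bins.
actual_hist = [0.10, 0.40, 0.35, 0.15]
forecast_hist = [0.12, 0.38, 0.33, 0.17]
similarity_gap = js_divergence(actual_hist, forecast_hist)
```

Unlike the Kullback-Leibler divergence, this measure is symmetric in its two arguments and remains finite when one histogram has empty bins.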

C. Optimization of Real-time EV Charging Strategy

We first use an offline method to obtain the optimal profit, which can serve as a benchmark, and an upper bound to evaluate the performance of the proposed optimization approach. Four typical scenarios are selected to assess the performance of the proposed optimization approach under different PVG output and EV number conditions. Based on the probability distribution of PVG outputs and EV numbers across various weather and date scenarios in Fig. 5, one sunny weekday, one sunny weekend, one rainy weekday, and one rainy weekend are selected as the test scenarios.

The EV charging strategies obtained by the proposed optimization approach are shown in Fig. 8, compared with the benchmark. Because different methods obtain the same revenues from the charging service, the comparison of electricity purchase costs is shown in Fig. 8, instead of the profit of the EVCS.

Fig. 8  EV charging strategies and electricity purchase costs of proposed optimization approach compared with benchmark. (a) Sunny weekday. (b) Sunny weekend. (c) Rainy weekday. (d) Rainy weekend.

On the rainy days, the PVG output is relatively low, even lower than the EV charging demand on a weekday. Therefore, the optimal EV charging strategy mainly focuses on lowering the electricity cost. In this situation, the performance of the proposed optimization approach is almost the same as that of the benchmark. On the sunny weekday, since the EV charging demand is lower than the PVG output, EVs are preferentially charged by the PVG output. Therefore, the accuracy of scenario generation has little influence on the performance of the proposed optimization approach. However, when there are more EVs on weekends, the EVCS will arrange EVs to be charged during the period of a lower electricity price, since the forecast of the PVG output is not fully reliable between 13:00 and 15:00. In conclusion, the performance of the proposed optimization approach remains close to that of the benchmark.

To demonstrate the effectiveness of the proposed optimization approach, it is compared with two other online optimization approaches.

1) One is the MPC approach [11], where the deterministic long-term profit is taken into account in the charging strategy optimization for the EVCS. First, the environmental factors for the future 24 hours, denoted as $U_t$, including the EV arrivals, EV charging demands, PVG outputs, and the electricity price, are obtained by a deterministic forecast model based on LSTM networks. Then, the resulting deterministic optimization model, as shown in (27), can be solved by a commercial solver such as CPLEX.

$\max_{P_0, P_t} \; R_0(P_0) + \sum_{t=1}^{T} R_t(P_t; U_t)$ (27)
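The receding-horizon idea behind the MPC approach can be illustrated with a heavily simplified sketch. It assumes a single EV, a point forecast of the electricity price, a linear purchase cost, and only a charger power cap; under these assumptions, filling the cheapest hours first is optimal, so no solver is needed. PVG output, charging revenue, and the EVCS power cap are ignored, and the function names are hypothetical. It is not the model in (27).

```python
def plan_charging(price_forecast, energy_needed, p_max, dt=1.0):
    """Cheapest-hours-first plan over the forecast horizon (power in kW per step)."""
    plan = [0.0] * len(price_forecast)
    remaining = energy_needed
    # Visit time steps from the lowest to the highest forecast price.
    for t in sorted(range(len(price_forecast)), key=lambda k: price_forecast[k]):
        if remaining <= 1e-9:
            break
        plan[t] = min(p_max, remaining / dt)
        remaining -= plan[t] * dt
    return plan


def mpc_first_action(price_forecast, energy_needed, p_max, dt=1.0):
    """Apply only the first step of the plan; the rest is re-optimized next hour."""
    return plan_charging(price_forecast, energy_needed, p_max, dt)[0]


# Made-up forecast: four hourly prices, 10 kWh still needed, 7 kW charger.
forecast = [0.8, 0.3, 0.5, 0.3]
plan = plan_charging(forecast, energy_needed=10.0, p_max=7.0)
action_now = mpc_first_action(forecast, energy_needed=10.0, p_max=7.0)
```

Because the plan relies on a single forecast trajectory, any forecast error passes directly into the decision, which is the limitation that a scenario-based formulation is intended to reduce.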

2) The other is the DQN approach [18], where the value function $V_t(SOC_t)$ is fitted directly by fully connected NNs based on historical data. The parameters of the NNs are updated by minimizing the temporal difference (TD) error.
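The TD error that drives such an update can be shown with a tabular sketch. This swaps the NN of the DQN approach for a lookup table to keep the example short; the state labels, the step size `alpha`, and the discount factor `gamma` are assumptions introduced here for illustration.

```python
def td_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """One TD(0) step: move V(state) toward reward + gamma * V(next_state)."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical two-state value table.
values = {"low_soc": 0.0, "high_soc": 1.0}
error = td_update(values, "low_soc", reward=0.5, next_state="high_soc")
```

In a DQN, the same error term is used as the loss signal for a gradient step on the NN parameters rather than for a direct table update.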

All approaches are trained and tested on a personal computer equipped with an Intel i9 12900k CPU and 64 GB of RAM. The average revenues of the EVCS on various days under different meteorological conditions are calculated, as shown in Table III.

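TABLE III  Performances of Different Optimization Approaches

| Approach | Average revenue, sunny weekdays ($10^{3}$ ¥) | Average revenue, sunny weekends ($10^{3}$ ¥) | Average revenue, rainy weekdays ($10^{3}$ ¥) | Average revenue, rainy weekends ($10^{3}$ ¥) | Total profit ($10^{3}$ ¥) | Average CPU time for making decision (s) |
|---|---|---|---|---|---|---|
| DQN | 20.2 | 38.6 | 6.3 | 10.4 | 75.5 | 13.6 |
| MPC | 23.4 | 36.5 | 7.3 | 10.9 | 78.1 | 28.6 |
| Proposed | 23.6 | 42.3 | 7.2 | 11.5 | 84.6 | 45.4 |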

The proposed optimization approach takes all environmental uncertainties of the EVCS into full consideration. Therefore, it achieves the highest total profit for the EVCS among the three approaches and the highest average revenue on sunny weekdays, sunny weekends, and rainy weekends; on rainy weekdays, the MPC approach is marginally higher (7.3 versus 7.2). The proposed optimization approach takes longer CPU time than the other two approaches, as shown in Table III. Considering that the time scale for the optimization of the real-time EV charging strategy is 1 hour, the computation time of all approaches is short enough to meet the real-time requirement.
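The figures in Table III can be cross-checked with simple arithmetic. The sketch below uses only values copied from the table; the variable and function names are hypothetical. It confirms that each total equals the sum of the four day-type columns, and it expresses the gain of the proposed approach and the decision time relative to the 1-hour interval.

```python
# Values from Table III (10^3 yuan); order: sunny weekdays, sunny weekends,
# rainy weekdays, rainy weekends.
average_revenue = {
    "DQN": [20.2, 38.6, 6.3, 10.4],
    "MPC": [23.4, 36.5, 7.3, 10.9],
    "Proposed": [23.6, 42.3, 7.2, 11.5],
}
reported_total = {"DQN": 75.5, "MPC": 78.1, "Proposed": 84.6}
cpu_time_s = {"DQN": 13.6, "MPC": 28.6, "Proposed": 45.4}

# Sum of the four columns, rounded to the precision used in the table.
totals = {name: round(sum(vals), 1) for name, vals in average_revenue.items()}


def relative_gain_percent(name, baseline):
    """Percentage gain in total profit of one approach over another."""
    return 100.0 * (totals[name] - totals[baseline]) / totals[baseline]


# Share of the 1-hour decision interval (3600 s) spent on computation.
share_of_interval = {name: 100.0 * t / 3600.0 for name, t in cpu_time_s.items()}
```

With these numbers, the proposed approach improves the total by roughly 8.3% over MPC and 12.1% over DQN, while its decision time occupies about 1.3% of the 1-hour interval.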

The charging strategy and total cost of the EVCS on a sunny weekend and a rainy weekday are shown in Fig. 9. On the rainy weekday, the PVG output has little influence on the optimization of the EV charging strategy, and the MPC approach achieves almost the same result as the proposed optimization approach. However, on the sunny weekend, the PVG output and the EV charging demands are both harder to forecast accurately. By using the B-LSTM network, the proposed optimization approach avoids overconfidence in a single forecast and considers the possibility of occurrence of various scenarios in the charging strategy optimization, resulting in a lower total cost.

Fig. 9  Charging strategy and total cost of EVCS on a sunny weekend and a rainy weekday with different optimization methods. (a) Sunny weekend. (b) Rainy weekday.

Since the dimensionality of the set of EVs receiving charging service in the EVCS is dynamic, the DQN approach cannot take the features of each EV as NN inputs. Instead, only aggregate EVCS features such as the total charging demand are used as the inputs to the Q-network, which limits its ability to handle per-EV constraints, e.g., that the SOC of each EV should reach its expected SOC when leaving the EVCS. On the sunny weekend, even though the electricity price is lower after hour 22, the EVs have to be charged at hour 21 because they will leave the EVCS at hour 22, which explains the poorer performance of the DQN approach.
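The per-EV departure constraint discussed above can be illustrated by computing the smallest charging power that keeps the expected SOC reachable. This is a minimal sketch under stated assumptions: charging efficiency is ignored, the energy still needed is the expected SOC minus the current SOC in kWh, and the function name and numbers are hypothetical.

```python
def min_power_now(energy_needed, p_max, steps_left, dt=1.0):
    """Smallest power (kW) at the current step that keeps the target reachable.

    steps_left counts the remaining time steps before departure, including
    the current one.
    """
    if steps_left <= 0:
        raise ValueError("the EV has already departed")
    # Energy that can still be delivered in the later steps at full power.
    later_capacity = p_max * dt * (steps_left - 1)
    return max(0.0, (energy_needed - later_capacity) / dt)


# Hypothetical EV needing 5 kWh with a 7 kW charger.
forced_at_last_step = min_power_now(5.0, p_max=7.0, steps_left=1)
free_to_wait = min_power_now(5.0, p_max=7.0, steps_left=3)
```

With one step left, the full 5 kWh must be delivered immediately regardless of the price, whereas with three steps left the EV can wait for a cheaper hour. A returned value above `p_max` would indicate that the target can no longer be met.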

D. Discharge of EVs to DN

Considering the discharge of EVs to the DN, the EVCS can increase its revenue through the time-of-use electricity price and ancillary service provision. The comparison between revenues of two charging schemes for EVs, i.e., only charging from DN (scheme 1) and simultaneously charging from DN and discharging to DN (scheme 2), is shown in Table IV.

TABLE IV  Comparison Between Revenue of Two Charging Schemes for EVs

| Charging scheme | Electricity purchase cost ($10^{3}$ ¥) | Total revenue ($10^{3}$ ¥) | Ancillary service provision revenue ($10^{3}$ ¥) |
|---|---|---|---|
| Scheme 1 | 99.2 | 183.8 | |
| Scheme 2 | 102.8 | 256.2 | 72.4 |

The EVCS will prioritize the provision of ancillary services due to the attractive incentive price offered for such services, which is higher than the battery loss and power purchase costs. The operation of the EVCS with schemes 1 and 2 on a typical day is shown in Fig. 10.

Fig. 10  Operation of EVCS with schemes 1 and 2.

It can be observed from Fig. 10 that the EVCS provides the ancillary services of peak shaving during hours 10-11 and valley filling during hours 21-22. The ancillary service provision leads to a higher electricity purchase cost, but it also generates a revenue of ¥260, which is much higher than the increase in the electricity purchase cost.
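The same trade-off can be checked at the aggregate level using the figures in Table IV. The sketch below uses only values from that table; treating total revenue minus electricity purchase cost as a net figure is an assumption made here (battery loss costs are not itemized in the table), and the variable names are hypothetical.

```python
# Values from Table IV (10^3 yuan).
scheme_1 = {"purchase_cost": 99.2, "total_revenue": 183.8}
scheme_2 = {"purchase_cost": 102.8, "total_revenue": 256.2, "ancillary_revenue": 72.4}

# Changes when moving from scheme 1 to scheme 2.
extra_cost = scheme_2["purchase_cost"] - scheme_1["purchase_cost"]
extra_revenue = scheme_2["total_revenue"] - scheme_1["total_revenue"]

# Assumed net figure: total revenue minus electricity purchase cost.
net_1 = scheme_1["total_revenue"] - scheme_1["purchase_cost"]
net_2 = scheme_2["total_revenue"] - scheme_2["purchase_cost"]
```

The additional revenue (72.4) equals the reported ancillary service revenue and is about 20 times the additional purchase cost (3.6). The net figure for scheme 1 (84.6) coincides with the total profit of the proposed approach in Table III.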

VI. Conclusion

This paper addresses an optimization problem for real-time EV charging strategy, where long-term expected profit is intricately linked with environmental uncertainty. The proposed optimization model for real-time EV charging strategy is solved by a two-stage scenario-based optimization approach, aiming to maximize the long-term expected profit for the EVCS. A B-LSTM network is employed to generate the probability distributions of scenarios in real time considering the uncertainty of the EV charging demands, electricity prices, and PVG outputs. Numerical simulations show that the performance of the proposed optimization approach is close to that of the offline benchmark with perfect information and is superior to other online approaches examined. At the same time, the EVCS can achieve a higher operating profit by providing a variety of ancillary services to the power grid.

One direction for future work is to apply a more flexible approximation for the prior distribution of the B-LSTM network (e.g., a Gaussian mixture distribution) to further enhance the forecast accuracy.

Nomenclature

A. Indices

$i$ Index for samples of dataset, $i = 1, 2, \ldots, D$

$j$ Index for samples of scenarios, $j = 1, 2, \ldots, K$

$m$ Index for minibatches, $m = 1, 2, \ldots, M$

$n$ Index for electric vehicles (EVs), $n = 1, 2, \ldots, N$

$s$ Index for truncated sequences, $s = 1, 2, \ldots, S$

$t$ Index for time periods, $t = 1, 2, \ldots, T$

B. Uncertainties

$c_t$ Real-time electricity price at time $t$ (¥/kWh)

$N_t^{A}$ Number of EV arrivals at time $t$

$P_t^{PV}$ Real-time power of photovoltaic generation (PVG) output at time $t$ (kW)

$SOC_n^{E}$ Requested state of charge (SOC) of the $n$th EV at the end of charging

$SOC_n^{I}$ Initial SOC of the $n$th EV

$T_n^{A}$ Arrival time of the $n$th EV

$T_n^{D}$ Departure time of the $n$th EV

$U_t$ Set of uncertainties at time $t$

C. Decision Variables

$P_{n,t}$ Charging/discharging power of the $n$th EV at time $t$ (kW)

$P_t$ Vector of charging power of all EVs in EV charging station (EVCS) at time $t$

$P_t^{(j)}$ Vector of charging power of all EVs in the $j$th scenario at time $t$

D. States

$SOC_{n,t}$ SOC of the $n$th EV at time $t$ (kWh)

$SOC_t$ Vector of SOCs of all EVs in EVCS at time $t$

E. Parameters of Bayesian Long Short-term Memory (B-LSTM) Network

$\delta$ Auxiliary random vector

$\alpha$ Learning rate

$\varepsilon$ Convergence parameter

$\theta$ Distribution parameter vector of weights and biases of B-LSTM network, comprising mean ($\mu$) and variance ($\rho$)

$B$ Bias vector of a fully-connected layer in B-LSTM network

$C_t$ State vector of LSTM cells at step $t$

$g_t^{F}, g_t^{I}, g_t^{O}$ Output vectors of forget gate, input gate, and output gate of LSTM cell at step $t$

$H_t$ Output vector of LSTM cells at step $t$

$M$ Number of samples contained in a minibatch for training LSTM network

$S$ Length of truncated sequence for training LSTM network

$W$ Weight matrix of a fully-connected layer in B-LSTM network

$x_t$ Input vector of B-LSTM network at step $t$

$y_t$ Output vector of B-LSTM network at step $t$

F. Sets and Vectors

$D_{m,s}$ Data vector of the $s$th truncated sequence in the $m$th minibatch, formulated as $D_{m,s} = \{(x_1, y_1), (x_2, y_2), \ldots, (x_T, y_T)\}$

$D_m$ Dataset of the $m$th minibatch, formulated as $D_m = \{D_{m,1}, D_{m,2}, \ldots, D_{m,S}\}$

$D$ Dataset of minibatches, formulated as $D = \{D_1, D_2, \ldots, D_M\}$

$I_t$ Set of EVs receiving charging service in EVCS at time $t$

$L_t$ Set of EVs completing charging and leaving EVCS at time $t$

$T_t$ Time information vector at time $t$

$w$ Parameter vector of forecast model, including weights and biases

$W_t^{M}$ Meteorological information vector at time $t$

G. Other Parameters

$\Delta t$ Length of a time interval

$\beta$ Ratio of electricity price sold to EVs and purchased from distribution network by EVCS

$\eta_c$, $\eta_d$ Charging and discharging efficiencies of EV chargers

$c^{EV}$ Benefits of EVCS from selling electricity to EVs (¥/kWh)

$c_t^{P}$ Peak shaving incentive at time $t$ (¥/kW)

$c_t^{V}$ Valley filling incentive at time $t$ (¥/kW)

$P_{t,\max}^{EVCS}$ Cap on charging power of EVCS due to power grid operation constraints at time $t$ (kW)

$P_{\max}^{EV}$ The maximum charging power of charging piles (kW)

$Q_{n,t}$ Charging/discharging energy of the $n$th EV at time $t$ (kWh)

$Q_n^{EV}$ Battery capacity of the $n$th EV

$R_t$ Reward received by EVCS at time $t$

$SOC_{\max}$, $SOC_{\min}$ The maximum and minimum SOCs of EVs

References

[1] Z. Jia, J. Li, X.-P. Zhang et al., “Review on optimization of forecasting and coordination strategies for electric vehicle charging,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 2, pp. 389-400, Mar. 2023.

[2] A. S. A. Awad, M. F. Shaaban, T. H. M. EL-Fouly et al., “Optimal resource allocation and charging prices for benefit maximization in smart PEV-parking lots,” IEEE Transactions on Sustainable Energy, vol. 8, no. 3, pp. 906-915, Jul. 2017.

[3] H. Patil and V. N. Kalkhambkar, “Grid integration of electric vehicles for economic benefits: a review,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 1, pp. 13-26, Jan. 2021.

[4] Y. Sun, Z. Chen, Z. Li et al., “EV charging schedule in coupled constrained networks of transportation and power system,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 4706-4716, Sept. 2019.

[5] H. Wang, M. Shi, P. Xie et al., “Electric vehicle charging scheduling strategy for supporting load flattening under uncertain electric vehicle departures,” Journal of Modern Power Systems and Clean Energy, vol. 11, no. 5, pp. 1634-1645, Sept. 2023.

[6] İ. Şengör, O. Erdinç, B. Yener et al., “Optimal energy management of EV parking lots under peak load reduction based DR programs considering uncertainty,” IEEE Transactions on Sustainable Energy, vol. 10, no. 3, pp. 1034-1043, Jul. 2019.

[7] S. I. Vagropoulos, D. K. Kyriazidis, and A. G. Bakirtzis, “Real-time charging management framework for electric vehicle aggregators in a market environment,” IEEE Transactions on Smart Grid, vol. 7, no. 2, pp. 948-957, Mar. 2016.

[8] J. Zhao, C. Wan, Z. Xu et al., “Risk-based day-ahead scheduling of electric vehicle aggregator using information gap decision theory,” IEEE Transactions on Smart Grid, vol. 8, no. 4, pp. 1609-1618, Jul. 2017.

[9] Y. Jin, B. Yu, M. Seo et al., “Optimal aggregation design for massive V2G participation in energy market,” IEEE Access, vol. 8, pp. 211794-211808, Nov. 2020.

[10] C. Li, T. Ding, X. Liu et al., “An electric vehicle routing optimization model with hybrid plug-in and wireless charging systems,” IEEE Access, vol. 6, pp. 27569-27578, May 2018.

[11] B. Wang, Y. Wang, H. Nazaripouya et al., “Predictive scheduling framework for electric vehicles with uncertainties of user behaviors,” IEEE Internet of Things Journal, vol. 4, no. 1, pp. 52-63, Feb. 2017.

[12] Y. Guo, J. Xiong, S. Xu et al., “Two-stage economic operation of microgrid-like electric vehicle parking deck,” IEEE Transactions on Smart Grid, vol. 7, no. 3, pp. 1703-1712, May 2016.

[13] S. Li, W. Hu, D. Cao et al., “Electric vehicle charging management based on deep reinforcement learning,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 3, pp. 719-730, May 2022.

[14] S. Li, W. Hu, D. Cao et al., “Electric vehicle charging management based on deep reinforcement learning,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 3, pp. 719-730, May 2022.

[15] Z. Wen, D. O’Neill, and H. Maei, “Optimal demand response using device-based reinforcement learning,” IEEE Transactions on Smart Grid, vol. 6, no. 5, pp. 2312-2324, Sept. 2015.

[16] Z. Wan, H. Li, H. He et al., “Model-free real-time EV charging scheduling based on deep reinforcement learning,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 5246-5257, Sept. 2019.

[17] C. Dong, J. Sun, F. Wu et al., “Probability-based energy reinforced management of electric vehicle aggregation in the electrical grid frequency regulation,” IEEE Access, vol. 8, pp. 110598-110610, Jun. 2020.

[18] Y. Gao, J. Yang, M. Yang et al., “Deep reinforcement learning based optimal schedule for a battery swapping station considering uncertainties,” IEEE Transactions on Industry Applications, vol. 56, no. 5, pp. 5775-5784, Sept. 2020.

[19] Y. Zhang, Q. Yang, D. An et al., “Multistep multiagent reinforcement learning for optimal energy schedule strategy of charging stations in smart grid,” IEEE Transactions on Cybernetics, vol. 53, no. 7, pp. 4292-4305, Jul. 2023.

[20] J. Andrade, L. F. Ochoa, and W. Freitas, “Regional-scale allocation of fast charging stations: travel times and distribution system reinforcements,” IET Generation, Transmission & Distribution, vol. 14, no. 19, pp. 4225-4233, Oct. 2020.

[21] D. Jost, M. Speckmann, F. Sandau et al., “A new method for day-ahead sizing of control reserve in Germany under a 100% renewable energy sources scenario,” Electric Power Systems Research, vol. 119, pp. 485-491, Feb. 2015.

[22] Z. Wei, Y. Li, and L. Cai, “Electric vehicle charging scheme for a park-and-charge system considering battery degradation costs,” IEEE Transactions on Intelligent Vehicles, vol. 3, no. 3, pp. 361-373, Sept. 2018.

[23] N. M. Pindoriya, S. N. Singh, and S. K. Singh, “An adaptive wavelet neural network-based energy price forecasting in electricity markets,” IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 1423-1432, Aug. 2008.

[24] E. L. Allgower and K. Georg, “Piecewise linear methods for nonlinear equations and optimization,” Journal of Computational and Applied Mathematics, vol. 124, no. 1-2, pp. 245-261, Dec. 2000.

[25] P. Pflaum, M. Alamir, and M. Y. Lamoudi, “Probabilistic energy management strategy for EV charging stations using randomized algorithms,” IEEE Transactions on Vehicular Technology, vol. 26, no. 3, pp. 1099-1106, May 2018.

[26] Y. Gal and Z. Ghahramani, “Dropout as a Bayesian approximation: representing model uncertainty in deep learning,” in Proceedings of 33rd International Conference on Machine Learning, New York, USA, Jun. 2016, pp. 1050-1059.

[27] S. P. Durrani, S. Balluff, L. Wurzer et al., “Photovoltaic yield prediction using an irradiance forecast model based on multiple neural networks,” Journal of Modern Power Systems and Clean Energy, vol. 6, no. 2, pp. 255-267, Mar. 2018.

[28] X. Zhang, S. Kuenzel, N. Colombo et al., “Hybrid short-term load forecasting method based on empirical wavelet transform and bidirectional long short-term memory neural networks,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 5, pp. 1216-1228, Sept. 2022.

[29] B. Stappers, N. G. Paterakis, K. Kok et al., “A class-driven approach based on long short-term memory networks for electricity price scenario generation and reduction,” IEEE Transactions on Power Systems, vol. 35, no. 4, pp. 3040-3050, Jul. 2020.

[30] B. Huang, Q. Ding, G. Sun et al., “Stock prediction based on Bayesian-LSTM,” in Proceedings of 10th International Conference on Machine Learning and Computing, New York, USA, Feb. 2018, pp. 128-133.

[31] S. Zhang and J. J. Q. Yu, “Bayesian deep learning for dynamic power system state prediction considering renewable energy uncertainty,” Journal of Modern Power Systems and Clean Energy, vol. 10, no. 4, pp. 913-922, Jul. 2022.

[32] S. Thakur, H. V. Hoof, J. Higuera et al., “Uncertainty aware learning from demonstrations in multiple contexts using Bayesian neural networks,” in Proceedings of International Conference of Robotics and Automation, Montreal, Canada, May 2019, pp. 768-774.

[33] D. Kingma and M. Welling, “Auto-encoding variational Bayes,” in Proceedings of 2nd International Conference on Learning Representations, Banff, Canada, Apr. 2014, pp. 1-14.

[34] Shenzhen Government. (2016, Nov.). Data open platform of Shenzhen Government. [Online]. Available: https://opendata.sz.gov.cn/

[35] China Electricity Council. (2023, Jul.). National electricity price monitoring system. [Online]. Available: http://cep.cec.org.cn/

[36] Zhejiang Provincial Energy Bureau. (2021, Jun.). Provincial Energy Bureau’s notice on electricity demand response for 2021. [Online]. Available: https://fzggw.zj.gov.cn/art/2021/6/8/art_1229629046_4906648.html

[37] B. Fuglede and F. Topsoe, “Jensen-Shannon divergence and Hilbert space embedding,” in Proceedings of International Symposium on Information Theory 2004, Chicago, USA, Jul. 2004, p. 31.