Abstract
As typical prosumers, commercial buildings equipped with electric vehicle (EV) charging piles and solar photovoltaic panels require an effective energy management method. However, the conventional optimization-model-based building energy management system faces significant challenges regarding prediction and calculation in online execution. To address this issue, a long short-term memory (LSTM) recurrent neural network (RNN) based machine learning algorithm is proposed in this paper to schedule the charging and discharging of numerous EVs in commercial-building prosumers. Under the proposed system control structure, the LSTM algorithm can be separated into offline and online stages. At the offline stage, the LSTM is used to map states (inputs) to decisions (outputs) based on the network training. At the online stage, once the current state is input, the LSTM can quickly generate a solution without any additional prediction. A preliminary data processing rule and an additional output filtering procedure are designed to improve the decision performance of LSTM network. The simulation results demonstrate that the LSTM algorithm can generate near-optimal solutions in milliseconds and significantly reduce the prediction and calculation pressures compared with the conventional optimization algorithm.
f Mapping between input and output by LSTM
, H Set of time steps of rolling horizon, and length of rolling horizon
l Time-dependent hypothesis
N, i Number of charging piles installed in a parking lot, and index of each charging pile
T, t, Number of time steps in one day, index of each time step, and length of each time step
^, ~ Indicators of estimated and filtered values
, Complementary 0-1 binary variables to ensure that has only one state at any time step
, Complementary 0-1 binary variables to ensure that has only one state at any time step
Charging or discharging power of EV connected to the
, Charging (non-negative) and discharging (non-positive) power of the
Exchanged power between commercial building and grid
, Imported (non-negative) and exported (non-positive)
si State of charge (SOC) of the
Efficiencies of charging and discharging processes
, Upper and lower limits of
Battery capacity of the
, Time-of-use (TOU) tariff and feed-in tariff
Electrical demand of commercial building (non-negative)
PV output (non-negative)
, Rated values of charging and discharging power
Desired SOC of the
, Upper and lower limits of the
, Arrival time and departure time of the
, Start time and end time of peak period of TOU tariff
To mitigate environmental pollution, the energy crisis, and climate change, the development of distributed photovoltaics (PVs), electric vehicles (EVs), and other distributed energy resources (DERs) has become the focus of societal attention in recent years [
The energy management undertaken by the BEMS is a sequential decision-making process that depends on the characteristics of the DERs. The existing research usually obtains optimal building energy management results by formulating an optimization model. This kind of method is model-driven and strictly follows the physical laws. Owing to the different charging and discharging efficiencies of EVs, the EV-related BEMS optimization model usually needs to introduce complementarity constraints, which guarantee that the charging and discharging of each EV are mutually exclusive [
With the rapid growth of advanced computing infrastructures in recent years, machine learning methods appear to be suitable for overcoming the limitations of optimization models. Instead of building a physical model with complex constraints, these methods can acquire the tacit knowledge and formulate a mapping between the input and output by performing successive transformations of historical data. Subsequently, they can make decisions quickly in online execution, which greatly reduces the calculation pressure of BEMS in online execution. To achieve an efficient home-based demand response, a deep reinforcement learning (DRL) method based on a neural network and Q-learning algorithm is developed in [
However, it is still challenging to adapt the existing machine learning methods to the coordinated scheduling of EVs in commercial buildings. Firstly, because each EV differs in charging demand, scheduling time limitation, and capacity, the training efficiency and generalization ability of the aforementioned machine learning methods have to be improved. Secondly, most of the aforementioned references ignore the influence of the temporal correlation in the BEMS problem and still rely on prediction in online implementation. However, the temporal correlation needs to be considered, as the SOC of an EV is coupled across adjacent time steps. Thirdly, as the machine learning model is data-driven, its internal logic and physical meaning are not transparent, and the output results depend entirely on the generalization ability of the model. There is no guarantee that the output of these machine learning methods always stays within the physical limitations and meets the scheduling requirements.
In this study, a long short-term memory (LSTM) RNN-based machine learning algorithm is proposed to quickly solve the online BEMS scheduling problem. Compared with the aforementioned studies, this study provides the following contributions.
1) An LSTM-based system scheduling structure is constructed. In this structure, the training and execution of the LSTM network can be separated. The LSTM can be trained offline to acquire generalization ability and quickly generate each EV’s scheduling result online in a fully decentralized manner.
2) An LSTM-based BEMS model is proposed. As one of the most advanced DL architectures for time-series prediction problems, the LSTM has powerful memorization capability. Thus, the LSTM can map the temporal correlation of the BEMS problem well based on historical data and there is no need for additional prediction.
3) A preliminary data processing rule and an additional output filtering procedure are designed to enhance the decision performance of machine learning. The optimal scheduling of the EVs can be learned better by the LSTM after the preliminary data processing. Filtering is performed to remove unsafe, unsatisfactory, and fluctuating LSTM scheduling results.
The remainder of this paper is organized as follows. In Section II, the structure and task of a BEMS are introduced. Subsequently, the MILP-based optimization method, which is used to provide the training dataset for the proposed machine learning method, is reviewed in Section III. Section IV presents the proposed LSTM-based machine learning algorithm. Simulation results and discussions are presented in Section V. Finally, the conclusion is given in Section VI.
This section describes the system structure and main task of the BEMS.
As shown in

Fig. 1 BEMS structure and illustration of power and information flows in it.
(1) |
Considering that the electricity price incentives include the TOU tariff and feed-in tariff, the commercial-building prosumers are known to have significant potential for electricity cost savings, as well as load leveling. In order to maximize the benefits of the commercial-building prosumers, the BEMS is installed to schedule . It is usually expected that the PV output can be fully utilized to meet the electric power demand. Thus, the only schedulable resources considered here are the EVs. The main task of the BEMS is to coordinate the charging and discharging power of EVs with the PV output and electrical demand under the incentives of TOU tariff and feed-in tariff . Note that the EVs’ parking time in commercial office buildings is relatively long enough to provide large scheduling flexibility, so we take commercial office buildings as the research object in this paper.
The BEMS can collect and store the information of , , , , EV behavior such as and , and EV characteristics such as , , and . In the conventional optimization-model-based BEMS, the power control signal is sent to each charging pile after a centralized optimization calculation. However, in the proposed LSTM-based machine learning method, the EV power control can be realized in a fully decentralized manner.
In this section, we formulate a daily BEMS optimization model based on conventional MILP as (2)-(16) to provide the training dataset for the proposed machine learning method.
(2) |
s.t.
(3) |
(4) |
(5) |
(6) |
(7) |
(8) |
(9) |
(10) |
(11) |
(12) |
(13) |
(14) |
(15) |
(16) |
where the first term and second term of (2) are the payment to import electricity and the revenue of export electricity to the grid, respectively. The power balance at the PCC is ensured by (3), in which is divided into and . Equations (
Solving this model relies on full-day complete information. This model can be used in offline scheduling. In this study, we use this model to generate the training dataset (historical data) for the proposed machine learning method (see Section IV) and reference results for the proposed machine learning method in practical application (see Section V). For convenience, we will call this model MILP with complete information later in this paper.
Moreover, a rolling-horizon MILP-based model optimization method (see Appendix A), which we call MILP with incomplete information in this paper, is applied to online scheduling problems. However, this model still faces a computational burden in execution because of the difficulty of solving integer variables and the high prediction dependency in each solving process. A comparison between the online scheduling results obtained by using the MILP with incomplete information model and those obtained by the proposed machine learning method is presented in Section V.
In this section, an LSTM RNN-based machine learning method with significantly less computational burden and no prediction dependency on the power of PV, EV, or other electrical demand of the building is presented to address the BEMS problem. First, preliminaries of the method are introduced in Section IV-A. Subsequently, in Section IV-B, the control structure of LSTM-based system is constructed. Some details of the network training and execution procedure are provided in Section IV-C and Section IV-D, respectively. In Section IV-E, a filtering procedure is designed to enhance the decision performance of LSTM network. Finally, the overall procedure and structure of the proposed LSTM-based algorithm are summarized in Section IV-F.
As described in Section II, the main task of the BEMS is to schedule the charging and discharging of the numerous EVs parked in the commercial building. If we can obtain the SOC of all EVs at time step t+1, the charging or discharging power for each EV at time step t can be calculated as:
(17) |
Thus, we can generate the EV power scheduling by predicting the SOC at the next time step.
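As a minimal sketch of this idea, the conversion from two consecutive SOC values to a power set-point can be written as follows. The efficiency model used here (charging losses increase the power drawn, discharging losses reduce the power delivered) is a common convention and an assumption on our part, and the function name `power_from_soc` is hypothetical. The default values (15 min time step, 95% efficiencies) are taken from the case study in Section V.

```python
def power_from_soc(soc_now, soc_next, capacity_kwh,
                   dt_h=0.25, eta_ch=0.95, eta_dis=0.95):
    """Return the EV power in kW for one time step.

    Sign convention follows the nomenclature: charging power is
    non-negative and discharging power is non-positive.
    The efficiency model is an assumption, not the paper's equation.
    """
    # Change in energy stored in the battery over the time step (kWh).
    delta_kwh = (soc_next - soc_now) * capacity_kwh
    if delta_kwh >= 0:
        # Charging: more energy is drawn than is finally stored.
        return delta_kwh / (eta_ch * dt_h)
    # Discharging: less energy is delivered than leaves the battery.
    return delta_kwh * eta_dis / dt_h
```

For a 30 kWh battery, raising the SOC from 0.60 to 0.62 within one 15 min step corresponds to roughly 2.5 kW of charging power under these assumptions, which is below the 3.3 kW rating used in the case study.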
Considering that the SOC has temporal correlation and is related to its previous state, its prediction can be regarded as a time-series prediction problem [
As an updated variant of the RNN, the LSTM is one of the most advanced DL architectures for time-series prediction problems [
Generally, an LSTM model is built based on the training of a dataset. The LSTM network consists of an input layer, an LSTM layer, a fully connected layer, and an output layer. Before the training process, the number of LSTM layers, number of nodes in each layer, and learning stopping criteria must be specified. The first step in training is presenting the data of previous states and previous inputs to the input layer. The weights of the network are then continuously adjusted according to the error between the network output and the output value in the training dataset until the algorithm converges.
After training, the LSTM can map inputs and outputs well. Once we input a new set of data to the input layer, the LSTM network can generate the corresponding output with temporal correlation.
Next, we will apply the LSTM network to the energy management system.
To achieve a fast and optimal schedule for the BEMS, we use a system structure in which the training and execution of the LSTM network are separated.
The network training is done at the BEMS level. After training, the LSTM in the BEMS has the generalization ability to map states (inputs) to optimal decisions (outputs), and the trained LSTM network parameters can be obtained. The execution, in contrast, is done at the charging-pile level. We assign each charging pile an LSTM network with the same structure as the LSTM in the BEMS. The LSTM in each charging pile copies the trained network parameters from the BEMS, so that all the EVs connected to one building can be scheduled in a fully decentralized manner.
Based on the system structure, the LSTM in the BEMS can be trained offline, and the LSTM in each charging pile can carry out the scheduling online. Thus, the LSTM network solution process can be divided into offline training and online execution. This structure-based energy management has no prediction dependency on the PV output, electrical demand, or other system data, and takes only milliseconds to generate power scheduling outputs, as validated in Section V. Note the offline training will be re-implemented after a certain period at the BEMS level to retrain a new model based on up-to-date data, and the new model will be updated to the charging pile accordingly.
This subsection introduces the details of the offline training in practical application. First, we need to decide on the input and output of the LSTM network. Second, the training dataset is specified. A preliminary data processing rule is then proposed to make the training more targeted. Finally, details of the hyperparameter setting and the final structure are presented.
1) Input and Output
As described in Section IV-A, we have selected the LSTM as the tool for predicting the SOC of the EV connected to each charging pile. The output of each LSTM network is therefore the SOC at the next time step.
The prediction of the SOC is a time-series prediction problem related to the previous state. Hence, the SOC at the current time step is an input. The SOC at the next time step is also related to the PV output, building electrical demand, and TOU tariff at the current time step. The feed-in tariff is constant in our case. Therefore, it is not an input to the network.
Thus, there are four nodes in the input layer, corresponding to the current SOC, PV output, building electrical demand, and TOU tariff, respectively, and there is one node in the output layer, corresponding to the SOC at the next time step.
Owing to the powerful memorization capability of the LSTM (see Section IV-A), the relationship between the inputs and outputs in this paper can be described as (18), where the output state at time step t+1 is related to the inputs and the states of the previous l time steps.
(18) |
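To make the input-output relationship concrete, the following sketch builds supervised training samples from one EV's chronological record: each sample holds the four inputs over the last few time steps, and the target is the SOC at the next step. The function name `make_windows`, the tuple layout, and the parameter name `lookback` (standing in for l) are illustrative assumptions.

```python
def make_windows(rows, lookback):
    """Build (inputs, targets) pairs for next-step SOC prediction.

    rows: chronological list of (soc, pv, demand, tariff) tuples
          for a single EV session (assumed layout).
    lookback: number of past time steps fed to the network.
    """
    inputs, targets = [], []
    # t is the index of the current time step; t + 1 must exist.
    for t in range(lookback - 1, len(rows) - 1):
        # The window ends at the current step t (inclusive).
        inputs.append(rows[t - lookback + 1 : t + 1])
        # The target is the SOC (first field) at the next step.
        targets.append(rows[t + 1][0])
    return inputs, targets
```

A session shorter than `lookback + 1` steps yields no samples, which is one reason the window length has to be chosen with the typical parking duration in mind.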
2) Training Dataset
The training dataset consists of historical data of inputs and outputs. The historical data of , , , EV behavior such as and , and EV characteristics such as , , , and collected by the BEMS are used to build this dataset. Based on these historical data, the historical optimal output in each charging pile can be generated by solving the MILP with complete information discussed in Section III. Subsequently, the historical data of , , , and the corresponding optimal () are stored in the training dataset in chronological order.
Considering that the ranges of the aforementioned historical data are different and the LSTM is sensitive to the data scale, these data are scaled to by the min-max normalization method.
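A minimal sketch of min-max normalization is given below. It assumes the common target range of 0 to 1, and the helper names are hypothetical. The minimum and maximum are fitted on the training data and then reused unchanged for online inputs and for mapping the network output back to a physical SOC.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) of the training values."""
    return min(values), max(values)


def scale(value, lo, hi):
    """Map value from [lo, hi] to [0, 1] (assumed target range)."""
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return 0.0
    return (value - lo) / (hi - lo)


def unscale(scaled, lo, hi):
    """Inverse of scale: map a value in [0, 1] back to [lo, hi]."""
    return lo + scaled * (hi - lo)
```

Each quantity (SOC, PV output, electrical demand, tariff) would be scaled with its own minimum and maximum, since their ranges differ by orders of magnitude.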
Since the LSTM network-based building energy management is a data-driven technology, to ensure the execution performance of the LSTM network, all possible situations of the system should be considered. It implies that a training dataset with massive data is required to realize good generalization ability in different situations.
3) Preliminary Data Processing
Generally, the data sequences in most time-series prediction problems have the characteristic of temporal correlation. To solve the time-series prediction problems, the neural network can be simply trained by sending the data sequences to the network without further processing.
However, in this paper, two aspects must be considered. First, the EV connected to a specific charging pile can differ from day to day. Second, an EV does not stay in the same building parking lot and keep charging for the whole day. This means that the SOC data of one day can also be discontinuous in time. If we train the network with the training dataset directly, the LSTM would simply try to fit the historical data without learning to make feasible predictions, as there is no temporal correlation in some data sequences. Therefore, it would be difficult to train the LSTM network, and even if the network reached a good training target, it would only overfit and perform poorly in application. The poor result of this training mode, i.e., the LSTM with unprocessed data, is discussed in Section V.
To prevent the LSTM networks from overfitting the temporal correlations between the SOC of different EVs connected to the same charging pile, a preliminary data processing is necessary. First, the training dataset is divided into several segments according to the arrival and departure times of each independent EV connected to the
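The segmentation step can be sketched as follows. The sketch covers only the splitting of one charging pile's daily record into per-EV sessions by arrival and departure time; the function name `split_sessions` and the data layout are assumptions made for illustration.

```python
def split_sessions(day_rows, sessions):
    """Split one charging pile's daily record into per-EV segments.

    day_rows: list with one row per time step of the day.
    sessions: list of (arrival_step, departure_step) pairs,
              one pair per EV that used this pile (assumed layout).
    Returns one list of rows per EV, covering the steps from
    arrival to departure inclusive. Invalid pairs are skipped.
    """
    segments = []
    for arrival, departure in sessions:
        if 0 <= arrival <= departure < len(day_rows):
            segments.append(day_rows[arrival : departure + 1])
    return segments
```

Training windows are then built within each segment only, so that no window spans two different EVs or the idle steps in which no EV is connected.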
4) Setting of Network Structure and Hyperparameters
To train this network, we select Adam [

Fig. 2 Final structure of each LSTM neural network proposed in this paper.
After the LSTM in each charging pile copies the trained network parameters from the BEMS, it can generate the power scheduling for each connected EV. When an EV is connected to a charging pile, the charging pile detects the actual SOC of that EV and updates it in the input vector. Meanwhile, the LSTM in the charging pile obtains the PV output, building electrical demand, and TOU tariff at the current time step from the BEMS. With all the inputs in (18) updated, the network can predict the SOC at the next time step, after which the predicted SOC is used to generate the power scheduling of the current time step based on (17).
After applying the LSTM, the SOC of the EVs at the next time step can be predicted and the charging or discharging power control can be realized. As a data-driven model, the LSTM-based BEMS generates outputs that depend entirely on the generalization ability of the LSTM network. Owing to the lack of strict physical constraints and the limited generalization ability of the LSTM, the outputs generated by the LSTM may have two issues. On the one hand, these outputs may violate some constraints of the EVs, including the maximum SOC, the minimum SOC, the maximum charging and discharging rates, and the EV charging demand. On the other hand, there may be some fluctuations among the outputs due to the limited generalization ability of the LSTM. For example, when the trend of the SOC curve fluctuates slightly, we can observe from (17) that these slight fluctuations have a direct impact on the power and thus result in errors. When the errors of many EVs occur simultaneously, the accumulated error in power can be significant even if the accuracy of the predicted SOC curve is high, which affects the electricity cost as well as the load leveling effect.
Considering that the interpretability of machine learning is poor, and that the internal logic and physical meaning of machine learning models are not clear enough, it is hard to completely avoid these issues within the LSTM network. Therefore, it is necessary to design a filter outside the LSTM network to guarantee that the output of these machine learning methods is within the physical limitations and meets the scheduling requirements. Here, we propose a filtering process with two parts. In the first part, we illustrate our criteria to avoid violation of the operational constraints of the EV. In the second part, we aim to filter out the fluctuations due to the limited generalization ability of the LSTM networks.
Note that the filtering process is used only to avoid abnormal outputs of the LSTM networks, which do not always occur, and the result still mainly relies on the LSTM networks.
1) The First Part of Filter
In the first part, we ensure that the outputs of the LSTM do not violate the constraints of the EVs. Here, five criteria are set up to determine whether these violations occur, and they are formulated as (19)-(23).
(19) |
(20) |
(21) |
(22) |
(23) |
If (19) or (20) is not satisfied, the SOC of the EV will be set to the maximum SOC or the minimum SOC, respectively. If (21) or (22) is violated, the SOC at the next time step will be recalculated based on charging or discharging at the maximum rate.
The first part forms the basic aspect to ensure the feasible power scheduling of the EV.
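The SOC-limit and rate-limit corrections described above can be sketched as a pair of clipping operations. This is a simplified illustration under stated assumptions: the charging-demand criterion (23) is not covered, the discharging rate is passed as a positive magnitude, the efficiency model is assumed, and the function name and parameter names are hypothetical.

```python
def filter_part_one(soc_now, soc_pred, soc_min, soc_max, capacity_kwh,
                    p_ch_max, p_dis_max,
                    dt_h=0.25, eta_ch=0.95, eta_dis=0.95):
    """Correct a predicted SOC so it respects rate and SOC limits.

    p_ch_max and p_dis_max are rated powers in kW, both given as
    positive magnitudes (an assumption of this sketch).
    """
    # Largest SOC changes reachable in one step at rated power.
    max_up = p_ch_max * eta_ch * dt_h / capacity_kwh
    max_down = p_dis_max * dt_h / (eta_dis * capacity_kwh)

    # Rate limits: fall back to charging/discharging at the maximum rate.
    soc = min(soc_pred, soc_now + max_up)
    soc = max(soc, soc_now - max_down)

    # SOC limits: clamp to the admissible SOC range.
    return min(max(soc, soc_min), soc_max)
```

As long as the current SOC lies within its limits, the clamped value stays between the current SOC and the rate-limited prediction, so applying the SOC limits last does not reintroduce a rate violation.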
2) The Second Part of Filter
In the second part, we filter out the fluctuations caused by the lack of strict physical constraints and the limited generalization ability of the LSTM networks. For specificity, we classify the fluctuations into two components, i.e., abnormal charging and discharging behaviors. Our criteria for determining the abnormal behavior are elucidated in the following paragraphs.
First, when the TOU tariff is not at the peak, the discharging is considered abnormal, because it would reduce the scheduling flexibility during the peak period, i.e., the discharging may cause extra charging to occur in the peak time to satisfy the EV charging demand. Such discharging process leads to a higher electricity cost.
Second, during the peak time of the TOU tariff, the charging is considered abnormal because it leads to a higher cost, unless the current SOC needs inevitable charging to occur at the peak time to satisfy the EV charging demand. This is because the LSTM should evenly arrange inevitable charging over the entire period. Otherwise, to satisfy the SOC demands, the full-speed charging of numerous EVs near the departure time leads to an additional peak of the load curve, which affects the load leveling. The criterion of detecting abnormal charging is described as follows: charging at from the current SOC to starts from the peak period of the TOU tariff. This criterion is also given in (24).
(24) |
Moreover, during the peak time of the TOU tariff, the discharging is considered abnormal if it would necessitate additional inevitable charging during another peak time of the tariff. This criterion is similar to (24). However, the SOC in the equation should be the predicted value. It can be given as:
(25) |
Once abnormal charging or discharging behavior is detected, the SOC at the next time step is kept equal to the current SOC, i.e., the EV neither charges nor discharges in this time step.
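One possible reading of these rules is sketched below. The way "inevitable charging" is tested here is our assumption: the latest time step at which full-rate charging must start to reach the desired SOC by departure is computed, and charging is treated as inevitable during the peak if that step falls before the peak ends. A single daily peak period is assumed, and all names are hypothetical.

```python
import math


def filter_part_two(t, soc_now, soc_pred, soc_desired, capacity_kwh,
                    p_ch_max, t_depart, peak_start, peak_end,
                    dt_h=0.25, eta_ch=0.95):
    """Hold the SOC when the predicted behavior is judged abnormal.

    Time arguments are time-step indices; the peak covers the steps
    peak_start <= t < peak_end (assumed convention).
    """
    def steps_needed(soc_from):
        # Steps of full-rate charging needed to reach the desired SOC.
        energy = max(soc_desired - soc_from, 0.0) * capacity_kwh
        return math.ceil(energy / (p_ch_max * eta_ch * dt_h))

    in_peak = peak_start <= t < peak_end
    charging = soc_pred > soc_now
    discharging = soc_pred < soc_now

    if discharging and not in_peak:
        return soc_now  # off-peak discharging is abnormal
    if charging and in_peak:
        # Charging can wait until the peak is over: abnormal.
        if t_depart - steps_needed(soc_now) >= peak_end:
            return soc_now
    if discharging and in_peak:
        # Discharging would force extra charging inside the peak.
        if t_depart - steps_needed(soc_pred) < peak_end:
            return soc_now
    return soc_pred
```

With 15 min steps, a peak from 08:00 to 16:00 corresponds to `peak_start=32` and `peak_end=64`. Off-peak charging always passes through unchanged in this sketch.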
The filter designed is applied to to obtain a filtered , which is a feasible and reliable prediction of the SOC.
The entire filtering procedure is given in
Based on the analysis, the overall implementation procedure of our proposed LSTM-based BEMS model is illustrated in Fig. 3.

Fig. 3 Overall implementation procedure of LSTM-based BEMS model.
At the offline stage, the BEMS collects the historical data of all the charging piles, PV generation, building electrical demand, and the TOU tariff. Based on those historical data, the BEMS can obtain the historical optimal EV power scheduling results by solving the MILP with complete information. All the historical data and optimal solutions are stored in the LSTM training dataset. After the preliminary dataset processing, the LSTM in the BEMS can be trained and the trained LSTM network parameters can be obtained. Considering that the data distribution may not be time-invariant, we implement such an offline training stage after a certain period to retrain a new model based on up-to-date data.
At the online stage, the LSTM in each charging pile can copy the trained network parameters from the BEMS. Subsequently, the SOC at the next time step of the connected EV can be predicted by the LSTM in each charging pile based on the real-time input data as shown in (18). Then, the predicted SOC goes through a filtering process to enhance the decision performance of LSTM network. Afterwards, the charging and discharging control can be generated according to (17). The decentralized online scheduling procedure repeats to generate power control of the connected EV until the EV departs. The EV-related data are also stored and would be used at the offline training stage, as mentioned before.
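The online stage for a single charging pile can be summarized in a short loop. In this sketch the trained network is replaced by an arbitrary callable `predict`, the filter is reduced to a simple SOC clamp, and the realized SOC is assumed to equal the filtered prediction, whereas in practice the charging pile measures the actual SOC at every step. All names and the efficiency model are assumptions.

```python
def run_online_session(predict, soc_arrival, inputs, capacity_kwh,
                       dt_h=0.25, eta_ch=0.95, eta_dis=0.95,
                       soc_min=0.0, soc_max=1.0):
    """Repeat predict -> filter -> power until the EV departs.

    predict: callable (soc, pv, demand, tariff) -> predicted next SOC,
             standing in for the trained LSTM.
    inputs:  list of (pv, demand, tariff) tuples, one per time step
             between arrival and departure.
    Returns (list of power set-points in kW, final SOC).
    """
    soc = soc_arrival
    schedule = []
    for pv, demand, tariff in inputs:
        soc_pred = predict(soc, pv, demand, tariff)
        # Stand-in for the two-part filter: clamp to the SOC range.
        soc_next = min(max(soc_pred, soc_min), soc_max)
        # Convert the SOC change to a power set-point (assumed model).
        delta_kwh = (soc_next - soc) * capacity_kwh
        if delta_kwh >= 0:
            power = delta_kwh / (eta_ch * dt_h)
        else:
            power = delta_kwh * eta_dis / dt_h
        schedule.append(power)
        soc = soc_next
    return schedule, soc
```

Because each step needs only the current measurements and one forward pass of the network, no forecast of PV output or electrical demand enters the loop.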
In this section, we provide a series of simulations to demonstrate the application of the proposed LSTM method to the scheduling of EVs in commercial-building energy management. All simulations are run on a computer with an Intel® Core i7-7500U CPU @2.70 GHz-2.90 GHz and 8 GB of RAM. The LSTM algorithm is implemented on the MATLAB R2019a platform, and the MILP model is solved using the YALMIP toolbox together with the intlinprog solver.
The case of a medium-scale office building with 500 kW PV onsite generation and 100 EV charging piles is studied [

Fig. 4 Simulation data. (a) Electrical demand on weekdays of one year. (b) PV output on weekdays of one year. (c) TOU tariff and feed-in tariff. (d) EV availability on weekdays of one year.
Note that
The scheduling horizon for BEMS optimization in one day is from 00:00 to 24:00, and one time step is 15 min; thus, one day is divided into 96 time steps. In the rolling-horizon optimization, considering the trade-off between the prediction accuracy and solving efficiency, the rolling horizon is set to 4 time steps.
We assume that only one EV is connected to a charging pile in the office building during a weekday. Each EV can be charged at any rate from 0 to 3.3 kW with different currents. The average battery capacity of each EV is 30 kWh and the average initial SOC is 0.6. The charging and discharging efficiencies are both 95%. Each EV should be charged to an SOC of at least 0.85 before departure.
To verify the accuracy and efficiency of the proposed LSTM-based algorithm, we solve the case using five different methods: ① MILP with complete information (see Section III); ② MILP with incomplete information (see Appendix A); ③ the LSTM algorithm (see Section IV); ④ the LSTM without filter 2, which is the complete LSTM algorithm with only the first part of the filtering process applied; and ⑤ the LSTM with unprocessed data, which omits the preliminary data processing. In addition, the non-scheduling case is also simulated, in which an EV is charged at full rate as soon as it is connected to the charging pile and is disconnected once it is full.
Firstly, the learning performance of the LSTM network is evaluated. The loss function of the LSTM network during iterative training is shown in Fig. 5.

Fig. 5 Loss function of LSTM network during training process.
1) SOC Scheduling Results
In this subsection, we present the energy management results of a day, e.g., the 25
Since the optimal SOC obtained by the MILP with complete information is the learning target of the LSTM network, the SOC scheduling result is an important aspect of the BEMS performance.

Fig. 6 SOC scheduling results of 8 randomly selected EVs using different methods. (a) EV 5. (b) EV 16. (c) EV 22. (d) EV 6. (e) EV 34. (f) EV 56. (g) EV 80. (h) EV 98.
The SOC of the LSTM is the closest to that of the MILP with complete information. The second-closest one is that of the LSTM without filter 2, whereas that of the LSTM with unprocessed data is the most divergent.
In fact, we can observe that the SOC of the LSTM without filter 2 is almost coincident with that of the LSTM in
As for the solution of the LSTM with unprocessed data, the large errors mainly result from the lack of preliminary data processing before training. The unprocessed training dataset contains many time steps in which the SOC is 0, i.e., no EV is plugged into the charging pile. The SOC values in those redundant time steps are also learned by the LSTM network. The preliminary data processing allows the LSTM to focus on the SOC information when an EV is available (connected to the charging pile), resulting in a better solution.
Given that the optimal solution obtained by the MILP with complete information cannot be applied to the BEMS online scheduling, we can use the near-optimal solution obtained by the LSTM network in online scheduling, because it is close to that of the MILP with complete information.
2) Power Scheduling Results
To illustrate the effect of BEMS load leveling, the corresponding power scheduling results of the commercial building using different methods in comparison with the base electrical demand are plotted in Fig. 7.

Fig. 7 Power scheduling results of commercial building using different methods.
Similar to
The power scheduling result of the MILP with incomplete information also deviates slightly from the optimal solution of the MILP with complete information. This deviation is mainly because of the prediction inaccuracy in the rolling-horizon optimization. From the results of the power scheduling, both the LSTM and MILP with incomplete information methods can obtain near-optimal solutions.
As the MILP with incomplete information is a commonly used online scheduling method at present, a more detailed comparison between the MILP with incomplete information and LSTM is presented in the following subsection in terms of solution time and electricity cost.
3) Solution Time
Based on several simulations, we collect statistics on the online solution time of both the MILP with incomplete information and LSTM. For simplicity, the time of prediction in the MILP with incomplete information is not considered, and only the calculation time of optimization is counted.
Our simulation results show that the solution time of one LSTM network can reach a millisecond level, i.e., 0.002 s on average, including the LSTM output time and the filtering process. Considering the LSTM-based system structure, the LSTM in each charging pile will output results simultaneously; thus, the number of EVs has very limited impact on the output time.
The average solution time of the MILP with incomplete information is 1.81 s for one calculation. However, the solution time of the MILP increases exponentially with the increase of the EV numbers [
The difference in solution time between the two methods is mainly because the MILP problem involves numerous matrix inversions, which are time-consuming. By comparison, the LSTM does not require this procedure. Furthermore, the MILP with incomplete information must be combined with a prediction to obtain optimal scheduling results in practical application. Therefore, considering the additional prediction time, the solution time of the MILP is longer. The proposed LSTM does not bring huge computation or prediction burden to the BEMS, which makes the LSTM method more suitable for online scheduling.
To further demonstrate the long-term performance and stability of the LSTM method, we present the BEMS results of the 22
1) Electricity Cost
We compare the electricity cost of the commercial building on 30 consecutive weekdays with different methods and a non-scheduling case, as depicted in Fig. 8.

Fig. 8 Comparison of different methods in terms of electricity cost on 30 consecutive weekdays.
When there is no scheduling, the electricity cost is the highest, reaching $71247.87 in total. The scheduled cost given by the MILP with complete information is the lowest, at $66875.47 in total, because the scheduling is a full-horizon optimization based on complete information. The LSTM yields the second-lowest cost, $67095.45 in total, which is 0.33% higher than the lowest one.
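The relative figures can be checked with a few lines of arithmetic. The three totals below are the ones reported above; the saving relative to the non-scheduling case is derived here from those totals, and the helper name `percent_above` is illustrative.

```python
# 30-day electricity cost totals reported in the text (USD).
costs = {
    "no scheduling": 71247.87,
    "MILP with complete information": 66875.47,
    "LSTM": 67095.45,
}


def percent_above(value, reference):
    """Return how far value lies above reference, in percent."""
    return 100.0 * (value - reference) / reference


# Gap between the LSTM and the complete-information optimum.
gap = percent_above(costs["LSTM"], costs["MILP with complete information"])

# Saving of the LSTM relative to the non-scheduling case.
saving = 100.0 * (costs["no scheduling"] - costs["LSTM"]) / costs["no scheduling"]

print(f"LSTM gap to optimum: {gap:.2f}%")
print(f"LSTM saving vs. no scheduling: {saving:.2f}%")
```

The gap evaluates to about 0.33%, matching the stated value, and the derived saving of the LSTM relative to the non-scheduling case is about 5.83% over the 30 weekdays.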
According to the analysis of the power scheduling results in
We can also observe from
2) EV Power Scheduling Results
Based on

Fig. 9 Comparison of summed power scheduling results of all EVs (100 in total) on 30 consecutive weekdays. (a) Using MILP with complete information. (b) Using LSTM.
The power scheduling results of 30 consecutive days using the LSTM are slightly different from those of the MILP with complete information, particularly at around 08:00 (start time of the peak period of the TOU tariff) and 16:00 (end time of the peak period of the TOU tariff). Such differences are acceptable at the online execution stage, as analyzed above from the perspectives of electricity cost and load leveling. Therefore, once the LSTM model is trained, it can run stably in the BEMS over a certain period of time.
In this study, to achieve effective cost saving and load leveling of the commercial-building prosumer, an LSTM-based machine learning method is proposed to schedule the EV charging and discharging considering the PV output and other electrical demands of the building. A preliminary data processing rule and an output filter have also been designed to improve the LSTM mapping performance. The proposed method can be separated into offline and online stages. At the offline stage, the LSTM network is trained to acquire the generalization ability to map from the states (inputs) to decisions (outputs) based on historical data. At the online stage, each EV’s scheduling result can be quickly generated in a fully decentralized manner. The performance of the proposed LSTM method is compared with a commonly used MILP algorithm through a case study. The simulation results demonstrate that the electricity cost of the proposed LSTM method is close to the theoretical optimal solution, and better than that of the conventional MILP method, even without any prediction of the system data. Meanwhile, the solution time of the proposed LSTM method is of the order of milliseconds, which also means that it can help reduce the computational burden to a large extent.
Furthermore, the added preliminary data processing step significantly improves the accuracy of the network output, and the additional filtering enhances the load leveling effect. In general, the proposed LSTM-based algorithm not only relieves the prediction and calculation pressures, but also achieves better results than the commonly used method in commercial-building prosumer energy management.
Although the EV in the commercial building is used as an example to illustrate the effectiveness of the proposed method, the method can also be extended to optimize the power of other flexible loads in the commercial building, such as heating, ventilation, and air conditioning (HVAC). The HVAC may be a more suitable dispatching resource in a shopping mall. At present, some related research has been carried out on the modeling of EV-HVAC using the virtual battery model. On this basis, the method proposed in this paper can be readily applied to the energy management of a shopping mall by optimizing the power of the HVAC.
Appendix
In this appendix, the formulation of the MILP with incomplete information is presented. To apply the MILP model to real-time scheduling, a rolling-horizon-optimization-based MILP model, which we call the MILP with incomplete information, is formulated in this paper. Solving this model relies on real-time information and some ultra-short-term prediction information of the rolling horizon, i.e., incomplete information.
At the beginning of each time step t, the BEMS carries out the optimization for the coming time step t and the ultra-short-term prediction-based rolling horizon of length H. For each optimization horizon, the objective function (2) and constraints (3)-(16) remain valid. The optimization solution includes the power scheduling of both time step t and the rolling horizon. However, only the scheduling of time step t is implemented. When the next time step comes, the BEMS carries out the optimization again based on the updated available information and a new ultra-short-term prediction, and then implements only the scheduling of that time step. The BEMS obtains the online scheduling by repeating this process.
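The rolling-horizon procedure described above can be sketched as a simple loop. This is only an illustrative outline under stated assumptions, not the formulation used in the paper: the callables `observe`, `predict`, and `solve` are hypothetical placeholders, the toy solver stands in for the MILP with objective (2) and constraints (3)-(16), and truncating the horizon at the end of the day is an assumption.

```python
from typing import Callable, List, Sequence


def rolling_horizon_schedule(
    num_steps: int,
    horizon: int,
    observe: Callable[[int], float],
    predict: Callable[[int, int], List[float]],
    solve: Callable[[float, Sequence[float]], List[float]],
) -> List[float]:
    """Run the rolling-horizon loop and return the implemented decisions."""
    implemented: List[float] = []
    for t in range(num_steps):
        # Assumption: the horizon is shortened so it never passes the last step.
        h = min(horizon, num_steps - 1 - t)
        current = observe(t)             # real-time information for step t
        forecast = predict(t, h)         # ultra-short-term prediction for the next h steps
        plan = solve(current, forecast)  # decisions for step t and the h predicted steps
        implemented.append(plan[0])      # only the decision for step t is implemented
    return implemented


# Toy stand-ins (hypothetical): a price series, a perfect predictor, and a
# solver that charges (1.0) only in the cheapest step(s) of the current window.
prices = [3.0, 1.0, 2.0, 1.5, 4.0]


def observe(t: int) -> float:
    return prices[t]


def predict(t: int, h: int) -> List[float]:
    return prices[t + 1 : t + 1 + h]


def toy_solve(current: float, forecast: Sequence[float]) -> List[float]:
    window = [current, *forecast]
    cheapest = min(window)
    return [1.0 if p == cheapest else 0.0 for p in window]


schedule = rolling_horizon_schedule(len(prices), 2, observe, predict, toy_solve)
print(schedule)
```

In a real deployment, `solve` would call an MILP solver on the full model and `predict` would be an ultra-short-term forecaster. Each iteration therefore incurs both a prediction and an optimization, which is the per-step cost discussed in the solution-time comparison.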
REFERENCES
A. Samimi and M. Nikzad, “Complete active-reactive power resource scheduling of smart distribution system with high penetration of distributed energy resources,” Journal of Modern Power Systems and Clean Energy, vol. 5, no. 6, pp. 863-875, Nov. 2017.
J. Wang, H. Zhong, J. Qin et al., “Incentive mechanism for sharing distributed energy resources,” Journal of Modern Power Systems and Clean Energy, vol. 7, no. 4, pp. 837-850, Jul. 2019.
M. Khorasany, Y. Mishra, B. Babaki et al., “Enhancing scalability of peer-to-peer energy markets using adaptive segmentation method,” Journal of Modern Power Systems and Clean Energy, vol. 7, no. 4, pp. 791-801, Jul. 2019.
Y. Xue and X. Yu, “Beyond smart grid-cyber-physical-social system in energy future,” Proceedings of the IEEE, vol. 105, no. 12, pp. 2290-2292, Dec. 2017.
Y. Song, Y. Ding, S. Pierluigi et al., “Optimization methods and advanced applications for smart energy systems considering grid-interactive demand response,” Applied Energy, vol. 259, pp. 1-3, Feb. 2020.
D. Azuatalam, A. C. Chapman, and G. Verbič, “Probabilistic assessment of impact of flexible loads under network tariffs in low-voltage distribution networks,” Journal of Modern Power Systems and Clean Energy, vol. 9, no. 4, pp. 951-962, Jul. 2021.
Z. Liu, Q. Wu, M. Shahidehpour et al., “Transactive real-time electric vehicle charging management for commercial buildings with PV on-site generation,” IEEE Transactions on Smart Grid, vol. 10, no. 5, pp. 4939-4950, Sept. 2019.
J. Hu, H. Zhou, Y. Li et al., “Multi-time scale energy management strategy of aggregator characterized by photovoltaic generation and electric vehicles,” Journal of Modern Power Systems and Clean Energy, vol. 8, no. 4, pp. 727-736, Jul. 2020.
K. Sou, J. Weimer, H. Sandberg et al., “Scheduling smart home appliances using mixed integer linear programming,” in Proceedings of 2011 50th IEEE Conference on Decision and Control and European Control Conference, Orlando, USA, Dec. 2011, pp. 5144-5149.
K. Paridari, A. Parisio, H. Sandberg et al., “Energy and CO2 efficient scheduling of smart appliances in active houses equipped with batteries,” in Proceedings of 2014 IEEE International Conference on Automation Science and Engineering (CASE), Taipei, China, Aug. 2014, pp. 632-639.
T. Sousa, H. Morais, Z. Vale et al., “Intelligent energy resource management considering vehicle-to-grid: a simulated annealing approach,” IEEE Transactions on Smart Grid, vol. 3, no. 1, pp. 535-542, Mar. 2012.
M. A. A. Pedrasa, T. D. Spooner, and I. F. MacGill, “Coordinated scheduling of residential distributed energy resources to optimize smart home energy services,” IEEE Transactions on Smart Grid, vol. 1, no. 2, pp. 134-143, Sept. 2010.
M. Fukushima and G.-H. Lin, “Smoothing methods for mathematical programs with equilibrium constraints,” in Proceedings of International Conference on Informatics Research for Development of Knowledge Society Infrastructure, Kyoto, Japan, Mar. 2004, pp. 206-213.
C. Hu, “Exactness of penalty functions for solving MPEC model of the transportation network optimization problems with user equilibrium constraints,” in Proceedings of 2006 International Conference on Management Science and Engineering, Lille, France, Oct. 2006, pp. 2045-2049.
X. Xu, Y. Jia, Y. Xu et al., “A multi-agent reinforcement learning based data-driven method for home energy management,” IEEE Transactions on Smart Grid, vol. 11, no. 4, pp. 3201-3211, Jul. 2020.
F. Ruelens, B. J. Claessens, S. Vandael et al., “Demand response of a heterogeneous cluster of electric water heaters using batch reinforcement learning,” in Proceedings of 2014 Power Systems Computation Conference (PSCC), Wrocław, Poland, Aug. 2014, pp. 1-7.
H. Berlink and A. H. R. Costa, “Batch reinforcement learning for smart home energy management,” in Proceedings of 1st International Workshop on Social Influence Analysis/24th International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, Jul. 2015, pp. 2561-2567.
E. Mocanu, D. C. Mocanu, P. H. Nguyen et al., “On-line building energy optimization using deep reinforcement learning,” IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 3698-3708, Jul. 2019.
B. Wang, Y. Li, W. Ming et al., “Deep reinforcement learning method for demand response management of interruptible load,” IEEE Transactions on Smart Grid, vol. 11, no. 4, pp. 3146-3155, Jul. 2020.
C. Keerthisinghe, A. C. Chapman, and G. Verbic, “Energy management of PV-storage systems: policy approximations using machine learning,” IEEE Transactions on Industrial Informatics, vol. 15, no. 1, pp. 257-265, Jan. 2019.
K. Paridari, D. Azuatalam, A. C. Chapman et al., “A plug-and-play home energy management algorithm using optimization and machine learning techniques,” in Proceedings of 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Aalborg, Denmark, Oct. 2018, pp. 1-6.
Y. Bao, T. Xiong, and Z. Hu, “Multi-step-ahead time series prediction using multiple-output support vector regression,” Neurocomputing, vol. 129, pp. 482-493, Apr. 2014.
S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, Nov. 1997.
F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: continual prediction with LSTM,” Neural Computation, vol. 12, no. 10, pp. 2451-2471, Oct. 2000.
D. P. Kingma and J. Ba. (2014, Dec.). Adam: a method for stochastic optimization. [Online]. Available: https://arxiv.org/abs/1412.6980
S. M. Frank and P. K. Sen, “Estimation of electricity consumption in commercial buildings,” in Proceedings of 2011 North American Power Symposium, Boston, USA, Aug. 2011, pp. 1-7.