1 Introduction

The electricity industry worldwide is undergoing significant changes, gradually evolving from a centralized industry into a distributed and competitive one. The restructuring has necessitated the decomposition of the three components of the power system: generation, transmission and distribution [1]. This decomposition typically begins on the supply side, separating the power producers from the transmission network by establishing independent power plants (IPPs) and the independent system operator (ISO). In the deregulated environment, generation companies (GenCos) compete to sell energy by submitting competitive bids to the ISO, which significantly changes the traditional pattern.

However, in markets where only the supply side is restructured, the pricing mechanism is not fully developed. The ISO only accepts bids from GenCos, and the load demand is regarded as a constant value determined by the load forecast. End users have no choice but to passively accept the result. Because the price remains unchanged or changes little in the long term, the response potential of consumers is not motivated, and the demand elasticity is, if any, extremely low [2]. This could cause many problems, such as the abuse of market power [3, 4] and large investment in the long run [5].

In symmetric electricity markets, where the ISO accepts bids from both GenCos and retailers, the clearing price is decided by both sides, so the competition mechanism is more complete. In addition, demand response (DR), considered an important alternative for improving power system reliability and avoiding surging prices, enables customers to manage load consumption in response to ever-changing supply conditions [6, 7]. DR programs can also be implemented under critical circumstances to prevent the system from collapsing [8]. Electricity markets in many countries have now incorporated DR programs, such as the emergency response service and economic DR programs in the PJM market [9, 10], and the participation of load resources in the fast-responding regulation service in ERCOT [11].

As an effective approach to studying distributed systems, multi-agent computational economic simulation has been widely applied in the research of electricity markets [12-14]. Reference [15] presented an integrative method to evaluate different wholesale market rules and the effect of market power mitigation. Reference [16] modeled the market power in forward and spot electricity markets using agent-based models. Reference [17] applied two experimental economics methods to a market test suite and discussed the market outcomes under both methods to illustrate the difference between the behavior of humans and agents. The agent-based wholesale market model presented in [18] uses a predictive bidding method and multi-step optimization to find bidding curves that maximize the expected discounted profit. The work described above mainly focused on wholesale market issues; the difference between trading mechanisms was not compared and the role of retailers was not incorporated. References [19-21] developed a multi-agent simulator of a competitive electricity market considering virtual power plants to study possible trading mechanisms, but only gave a rough comparison between symmetric and asymmetric markets.

Recently, many studies have investigated DR characteristics using the agent-based approach. References [22] and [23] proposed bidding strategies for GenCos and load serving entities (LSEs), respectively. However, consumers’ responsive characteristics, namely how end users react to time-varying prices, were not modeled and analyzed in a market environment. Reference [24] modeled the consumption behavior of commercial buildings and studied the impact of commercial buildings with price-responsive demand at different levels of DR penetration, but only GenCo agents were equipped with learning algorithms. Reference [25] presented the concept of a new market role, the “Decentralized Market Agent”, which optimized system operation and expansion at the distribution grid level using demand side management. However, the relationship and interaction between market participants were not described clearly.

As the electricity market develops, the participation of retailers and the response of end customers play an increasingly important role in price formation and market operation. Therefore, more specific research and analysis are needed to describe their impact, as well as the interaction between different participants.

This paper applies an agent-based modelling and simulation method to explore the impact of the symmetric market mechanism and price-based DR on electricity markets. The overall simulation framework is presented in Fig. 1. Firstly, the trading and clearing mechanism of the symmetric day-ahead market is introduced in Section 2, together with the simulation procedure. The relationship between different market participants is also described. In Section 3, detailed models of market participants are established according to their behaviors: the bidding behavior of IPPs is described with a cost-based bidding strategy and a reinforcement learning algorithm; the decision-making method of retailers is designed from two aspects, namely purchasing and selling electricity; and the response characteristics of consumers under the time-of-use (TOU) mechanism are modeled based on consumer psychology. Finally, the level of clearing prices and market power is analyzed and compared under different market structures, as well as the variation in prices, load consumption and social welfare.

Fig. 1 Market simulation framework

2 Simulation for symmetric electricity market

2.1 Trading mechanism

Day-ahead symmetric markets adopt the pool trading pattern and the TOU pricing mechanism. The market structure is shown in Fig. 2. The 24 hours of one day are divided into three periods (i.e., peak period, flat period and valley period). The prices differ between the three periods according to the supply and demand conditions.

Fig. 2 Market structure

In day-ahead markets, the ISO defines a deadline before which bids can be accepted. GenCos submit their bids before the deadline, each of which should include the quantity of electricity supplied and the price. Similarly, buyers, usually large consumers and retailers, are also required to submit offers. The supply curve and the demand curve can then be obtained. The crossing point of the two curves is the market equilibrium point. All the bids from GenCos that are lower than or equal to the clearing price are accepted, and GenCos arrange their production plans according to the clearing result. Because the clearing price is decided by both sides, high bids from sellers lead to a loss of profit and low offers from buyers lead to unsatisfied load demand, which helps to lower the price level.
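The crossing of the supply and demand curves can be illustrated with a minimal Python sketch of a uniform-price double auction. It is only an illustration under stated assumptions: the function name `clear_market`, the `(price, quantity)` tuple format and the rule that the clearing price equals the price of the last accepted supply block are hypothetical choices, not taken from this paper, and all network and unit constraints are ignored.

```python
def clear_market(supply_bids, demand_offers):
    """Match stepwise supply and demand curves for a single period.

    supply_bids   : list of (price, quantity) blocks from GenCos
    demand_offers : list of (price, quantity) blocks from buyers
    Returns (clearing_price, traded_quantity); the price is None if nothing trades.
    ASSUMPTION: the clearing price is the price of the last accepted supply block.
    """
    supply = sorted(supply_bids, key=lambda s: s[0])       # cheapest first
    demand = sorted(demand_offers, key=lambda d: -d[0])    # highest offer first
    i = j = 0
    traded = 0.0
    clearing_price = None
    s_left = supply[0][1] if supply else 0.0
    d_left = demand[0][1] if demand else 0.0
    # Walk along both curves while the seller asks no more than the buyer offers.
    while i < len(supply) and j < len(demand) and supply[i][0] <= demand[j][0]:
        q = min(s_left, d_left)
        if q > 0:
            traded += q
            clearing_price = supply[i][0]
        s_left -= q
        d_left -= q
        if s_left <= 1e-12:            # supply block exhausted, move to the next one
            i += 1
            s_left = supply[i][1] if i < len(supply) else 0.0
        if d_left <= 1e-12:            # demand block exhausted, move to the next one
            j += 1
            d_left = demand[j][1] if j < len(demand) else 0.0
    return clearing_price, traded


# Illustrative data only (not from the case study).
price, quantity = clear_market(
    [(20, 100), (30, 100), (50, 100)],
    [(60, 120), (40, 50), (25, 80)],
)
```

With this sample data the curves cross inside the second supply block, so 170 units are traded at a price of 30; the supply block priced at 50 and the offer priced at 25 are rejected.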

2.2 Price clearing

When solving for the market clearing price in real markets, relevant constraints should be considered in order to ensure the safe operation of the system [26]. At present, there are many solutions to this problem, including the merit-order method, linear programming and dynamic programming [27]. This paper adopts the multi-period linear programming method.

In markets where only the wholesale competition exists, the objective function is to minimize the overall purchasing fee, while in markets where the wholesale competition and the retail competition coexist, it is to maximize the social welfare.

$$\hbox{min} \,F_{1} = \hbox{min} \,\sum\limits_{t = 1}^{T} {\sum\limits_{i} {B_{i,t} \lambda_{t} } }$$
(1)
$$\hbox{max} \,F_{2} \, = \hbox{max} \,\sum\limits_{t = 1}^{T} {\left( {\sum\limits_{j} {D_{j,t} } \lambda_{t} - \sum\limits_{i} {B_{i,t} C_{i,B} } } \right)}$$
(2)

where \(F_{1}\) is the purchasing fee; \(F_{2}\) is the social welfare; T is the total simulation time horizon; \(B_{i,t}\) is the bid of GenCo i at time t, and \(C_{i,B}\) is the cost to generate that amount of energy; \(\lambda_{t}\) is the system clearing price at time t; and \(D_{j,t}\) is the offer of retailer j at time t.

The constraints associated with this optimization problem include supply-demand balance, unit capacity, unit ramp rate, transmission line capacity, etc.

2.3 Interaction between market participants

In the simulation model of day-ahead market proposed in this paper, agents representing different market participants interact with each other to pursue the maximization of their own profits. The market information is incomplete, meaning that agents do not have access to the strategies of others.

In perfectly competitive markets, all GenCo agents bid according to their real cost, while in more realistic scenarios, some GenCos may have dominant market power and could manipulate the market price by capacity withholding. However, the pressure from retailers could force GenCos to make a reasonable evaluation of clearing prices, because high bids may cause a loss in market share.

On the other hand, the profit of retailers equals the revenue minus the purchasing cost. The retail price could greatly influence the consumption of end users. The decrease in consumption, in turn, could also affect the profits of retailers and GenCos. In markets where the price is time-varying, customers could cut down their expenses by adjusting their load plans. As a consequence, the revenue retailers earn may decline.

From another perspective, however, the rearrangement of load may cause market prices to drop in peak hours and rise in valley ones. Similar changes can be seen in the purchasing cost of retailers, and the total cost within one day will decrease to some extent. During this interaction process, the overall production cost is also reduced.

2.4 Simulation procedures

During the simulation, GenCo agents and retailer agents keep adjusting their bidding strategies, while customers adjust their load profiles, until an equilibrium point is reached, which usually takes dozens of rounds. The procedure of a single simulation round is as follows.

1) GenCo agents apply the decision-making approach, which is based on unit parameters and known market information, to obtain the best bidding strategy, and then submit the multi-period bids to the ISO in the predefined format.

2) Retailer agents conduct the load forecast based on the load characteristics of consumers, determine the quantity needed and the price, and submit the offer.

3) After receiving all the bids, the ISO applies the clearing algorithm to work out the market prices for each period, as well as the successful bids and offers.

4) GenCos calculate the generation cost and expected profit, which are used as feedback to improve the decision-making.

5) Retailers calculate their purchasing cost based on the clearing result, and determine the retail price.

6) Customers purchase electricity from retailers, and respond to price signals by adjusting the load profile.

7) Retailers calculate the profit and apply the Roth-Erev (R-E) algorithm to improve the decision-making.

3 Agent models of market participants

3.1 GenCo agents

The model is made up of three modules: the calculation of generation cost, the selection of bidding strategy and the learning algorithm.

The unit cost function can be expressed using a ladder diagram, as shown in Fig. 3. Let b be the number of segments, q be the quantity generated, and p be the corresponding cost; the cost function can then be written as:

Fig. 3 Cost curve of units

$$C = \left\{ {(p_{1} ,q_{1} ),(p_{2} ,q_{2} ), \cdots ,(p_{b} ,q_{b} )} \right\}$$
(3)

The bid is based on the cost function [28]. Let A be the set of alternative strategies, of which the elements, also called strategy coefficients, refer to the value deviating from the cost.

$$A = \{ A_{0} ,A_{1} ,A_{2} , \cdots ,A_{m} \}$$
(4)

where m is the number of strategies; \(A_{0} = 0\), meaning that units bid the marginal cost.

In this way, the bidding function of GenCo i, for a certain \(A_{i}\), can be expressed as:

$$\{ (A_{i} p_{1} ,q_{1} ),(A_{i} p_{2} ,q_{2} ), \cdots ,(A_{i} p_{b} ,q_{b} )\}$$
(5)

Once the cost function has been confirmed, the key to GenCos’ strategy is to select an appropriate strategy coefficient in order to maximize the profit.
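As a hedged illustration of how a bid is built from the cost curve (3) and a strategy coefficient, consider the short Python sketch below. Read literally, (5) with \(A_{0} = 0\) would give a zero bid price, so the sketch assumes that the coefficient is a relative markup over cost, i.e. the bid price is \((1 + A)p\). This interpretation, the function name `build_bid` and the sample numbers are assumptions made for illustration only.

```python
def build_bid(cost_curve, coefficient):
    """Turn a stepwise cost curve into a stepwise bid.

    cost_curve  : list of (p, q) segments, as in (3)
    coefficient : strategy coefficient A chosen from the strategy set (4)
    ASSUMPTION: A is a relative markup, so A = 0 reproduces the cost curve.
    """
    return [((1.0 + coefficient) * p, q) for p, q in cost_curve]


# Illustrative two-segment unit (not taken from Table 2).
cost_curve = [(20.0, 50.0), (30.0, 50.0)]
marginal_cost_bid = build_bid(cost_curve, 0.0)   # identical to the cost curve
marked_up_bid = build_bid(cost_curve, 0.1)       # every segment priced 10% above cost
```

Only the prices change with the coefficient; the segment quantities are always those of the cost curve.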

The electricity market is modeled as a repetitive auction, which can be studied through repeated stochastic bidding. We adopt the roulette wheel method for GenCo agents to randomly select \(A_{i}\), and R-E reinforcement learning algorithm to model agents’ learning behavior.

According to the R-E algorithm, selection probabilities and propensities will be continuously updated on the basis of the historic profit. If strategy \(A_{k}\) was chosen in the \(d{\text{th}}\) round, and the profit earned is \(P_{d}\), the probabilities and propensities can be updated through

$$q_{i,d + 1} = (1 - r)q_{i,d} + R_{i,d}$$
(6)
$$R_{i,d} = \left\{ \begin{aligned} (1 - e)P_{d} \,\,\,\,\,\,\,\,\,i = k \hfill \\ \frac{{P_{d} }}{m - 1}\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,i \ne k \hfill \\ \end{aligned} \right.$$
(7)
$$p_{i,d + 1} = \frac{{\exp \left( {\frac{{q_{i,d + 1} }}{c}} \right)}}{{\sum\limits_{j = 1}^{m} {\exp \left( {\frac{{q_{j,d + 1} }}{c}} \right)} }}$$
(8)

where \(q_{i,d + 1}\) and \(p_{i,d + 1}\) are the propensity and the probability of strategy i in the \((d + 1){\text{th}}\) round, respectively; \(R_{i,d}\) is the response factor; r is the forgetting rate; c is the cooling coefficient; and e is the experience factor.
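The update rules (6)-(8) and the roulette wheel selection can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in the simulation: the function names and the parameter values in the example are assumptions, and (7) is implemented exactly as printed above, whereas the original Roth-Erev formulation additionally scales the term for the non-chosen strategies by e.

```python
import math
import random


def update_propensities(q, chosen, profit, r, e):
    """Eqs. (6)-(7): q is the list of propensities, chosen is the index k."""
    m = len(q)
    updated = []
    for i, q_i in enumerate(q):
        if i == chosen:
            response = (1.0 - e) * profit      # first branch of (7)
        else:
            response = profit / (m - 1)        # second branch of (7), as printed
        updated.append((1.0 - r) * q_i + response)
    return updated


def selection_probabilities(q, c):
    """Eq. (8): Gibbs-Boltzmann probabilities with cooling coefficient c."""
    scaled = [q_i / c for q_i in q]
    top = max(scaled)                          # shift for numerical stability only
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def roulette_wheel(probabilities, rng=random):
    """Draw a strategy index with the given selection probabilities."""
    x = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probabilities):
        cumulative += p
        if x < cumulative:
            return i
    return len(probabilities) - 1              # guard against rounding error


# One learning step with illustrative values (r, e and c are assumptions).
q = update_propensities([1.0, 1.0, 1.0], chosen=0, profit=9.0, r=0.1, e=0.2)
p = selection_probabilities(q, c=1.0)
next_strategy = roulette_wheel(p, random.Random(0))
```

Subtracting the largest scaled propensity before exponentiation does not change the probabilities in (8); it only avoids overflow when the propensities grow large.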

3.2 Retailer agents

A retailer, which can be viewed as an aggregate of consumers, is in charge of the power supply in a certain region. This aggregation makes it easier to maximize the overall profits of consumers [29]. Being a retailer does not entail large physical assets, so the market entry threshold is relatively low. Retailers make profits by purchasing energy from GenCos and selling it to consumers at a higher price. A large number of retailers could significantly increase the competition. Generally speaking, no single retailer can then hold a dominant share of the market, which leaves no room for monopoly and thus improves market operation efficiency.

The entire purchasing cost can be expressed as:

$$C_{\text{pur}} = C_{\text{gen}} + C_{\text{tran}} + C_{\text{cong}}$$
(9)

where \(C_{\text{gen}}\) is the cost of purchasing energy from GenCos; \(C_{\text{tran}}\) is the cost of electricity transmission; and \(C_{\text{cong}}\) is the congestion cost when congestion occurs.

The decision-making process of retailer agents contains two parts: determining the purchasing offer and determining the retail price. This paper adopts the derivative-following method to work out the purchasing price, and again applies the roulette wheel method and the R-E algorithm to settle the retail price.

The derivative-following method adjusts the offer by making a small change to the price offered in the previous round. The adjustment is based on the revenue earned previously and on the difference between the current result and the expected result. If the price offered before cannot guarantee that all the demand is satisfied, the agent raises the offer until all the energy needed is bought. Furthermore, if the previous adjustment produced more revenue per unit than the round before it, a change in the same direction is made; otherwise, the direction is reversed.

Assume that the price offered in round i is \(p_{i}^{\text{R}}\) and the price adjustment is \(\varDelta_{i}\); then the price offered in the next round is:

$$p_{i + 1}^{\text{R}} = p_{i}^{\text{R}} + \varDelta_{i + 1}$$
(10)
$$\varDelta_{i + 1} = p_{i}^{\text{R}} (\beta + \frac{{q_{\exp } - q_{\text{pur}} }}{{q_{\exp } \alpha }})$$
(11)
$$0 \le q_{\text{pur}} \le q_{\exp }$$
(12)

where \(q_{\text{pur}}\) is the quantity purchased; \(q_{\exp }\) is the quantity needed; \(\alpha\) and \(\beta\) are relevant coefficients, related to the changing rate of \(\varDelta_{i}\). \(\varDelta_{i}\) changes every round, decided by the difference between the quantity needed and purchased.
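A minimal Python sketch of the price update (10)-(12) is given below. The function name `next_offer_price` and the numerical values are assumptions; in particular, how the sign and size of \(\beta\) follow from the revenue comparison is not specified above, so \(\beta\) is simply passed in by the caller.

```python
def next_offer_price(price, q_needed, q_purchased, alpha, beta):
    """Return the offer price for the next round according to (10)-(11).

    price       : price offered in the current round
    q_needed    : quantity needed (q_exp), must be positive
    q_purchased : quantity actually purchased (q_pur)
    alpha, beta : tuning coefficients (values are the caller's assumption)
    """
    if q_needed <= 0 or not 0 <= q_purchased <= q_needed:
        raise ValueError("constraint (12) requires 0 <= q_pur <= q_exp")
    shortfall = (q_needed - q_purchased) / q_needed   # unmet share of demand
    delta = price * (beta + shortfall / alpha)        # eq. (11)
    return price + delta                              # eq. (10)


# Illustrative values: 20% of the demand was not bought, so the offer rises by 5%.
raised = next_offer_price(40.0, q_needed=100.0, q_purchased=80.0, alpha=4.0, beta=0.0)
# Demand fully met and a negative beta: the offer is lowered by 1%.
lowered = next_offer_price(40.0, q_needed=100.0, q_purchased=100.0, alpha=4.0, beta=-0.01)
```

The shortfall term is never negative under (12), so it can only push the offer upwards; any downward movement has to come from \(\beta\).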

The selling price is:

$$p_{\text{sell}} = C_{\text{pur}} + p_{\text{ser}}$$
(13)
$$0 \le p_{\text{ser}} \le p_{\hbox{max} } - C_{\text{pur}}$$
(14)

where \(p_{\text{ser}}\) is the fee charged by retailers for energy provision service, decided by the random bidding method discussed before, usually accounting for 0.5%~5.0% of the total price; and \(p_{\hbox{max} }\) is the maximum price value.

3.3 Consumer agents with DR behaviour

The objective of consumer agents is to minimize the expense without jeopardizing their load demand. For that purpose, customers would modify their original load profile to consume more energy in low-price periods and less energy in high-price periods. In this section, we apply consumer psychology theory to model the responsive characteristics of agents under TOU pricing mechanism. The mathematical model of this mechanism is shown in Fig. 4.

Fig. 4 TOU mechanism

The general impact of the price change on consumers’ electricity consumption is illustrated in Fig. 5. Consumers would not respond if the price change is less than a certain threshold value (point a). As the change increases above it, customers will adjust their load consumption accordingly. The quantity adjusted has an approximate linear relation to the price incentive. However, there is a limit to users’ response ability, which reflects the rigid demand. Point b in Fig. 5 is defined as the limit value above which the stimulation loses effect. As is shown in the figure, the responsive characteristics curve is mainly decided by the threshold value, the slope and the limit value.

Fig. 5 Consumers’ responsive characteristics

The load shift rate is defined as the ratio of the load transferred from high-price periods to low-price periods to the load in high-price periods. For example, the peak-valley load shift rate \(\mu_{\text{pv}}\) can be expressed by (15).

$$\mu_{\text{pv}} = \left\{ \begin{array} {ll}0 &\quad 0 \le \Delta \lambda_{\text{pv}} < a_{\text{pv}} \hfill \\ k_{\text{pv}} (\Delta \lambda_{\text{pv}} - a_{\text{pv}} )&\quad a_{\text{pv}} \le \Delta \lambda_{\text{pv}} \le b_{\text{pv}} \hfill \\ \mu_{\text{pv}}^{\hbox{max} } &\quad \Delta \lambda_{\text{pv}} > b_{\text{pv}} \hfill \\ \end{array} \right.$$
(15)

where \(\Delta \lambda_{\text{pv}}\) is the difference between the price in peak hours and that in valley hours; \(a_{\text{pv}}\) is the threshold value; \(b_{\text{pv}}\) is the limit value; \(\mu_{\text{pv}}^{\hbox{max} }\) is the maximum response; and \(k_{\text{pv}}\) is the slope.
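The piecewise characteristic (15) translates directly into a small Python function. The sketch below is illustrative only: the function name and the parameter values are assumptions, not the values used in the case study.

```python
def load_shift_rate(price_gap, threshold, limit, slope, max_rate):
    """Peak-valley load shift rate according to (15).

    price_gap : price difference between peak and valley hours
    threshold : a_pv, below which consumers do not respond
    limit     : b_pv, above which the response saturates
    slope     : k_pv, slope of the linear zone
    max_rate  : maximum response in the saturation zone
    """
    if price_gap < threshold:
        return 0.0                                   # dead zone (also covers gaps below 0)
    if price_gap <= limit:
        return slope * (price_gap - threshold)       # linear zone
    return max_rate                                  # saturation zone


# Illustrative parameters (assumptions): a = 0.1, b = 0.5, k = 0.25, max = 0.1.
no_response = load_shift_rate(0.05, 0.1, 0.5, 0.25, 0.1)
linear_response = load_shift_rate(0.30, 0.1, 0.5, 0.25, 0.1)
saturated_response = load_shift_rate(0.80, 0.1, 0.5, 0.25, 0.1)
```

In this example the maximum rate equals \(k_{\text{pv}} (b_{\text{pv}} - a_{\text{pv}})\), so the curve is continuous at the limit value; the function itself does not enforce this.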

The peak-flat shift rate \(\mu_{\text{pf}}\) and the flat-valley shift rate \(\mu_{\text{fv}}\) can be expressed by similar equations. We assume that the load transferred from one period to another is evenly distributed by hours, as indicated in (16)-(18).

$$\Delta L_{{{\text{pv}}(k)}} = \Delta L_{{{\text{pv}}(k + 1)}} = \cdots = \Delta L_{{{\text{pv}}(k + N_{\text{v}} )}} = \Delta L_{\text{pv}}$$
(16)
$$\Delta L_{{{\text{vp}}(k)}} = \Delta L_{{{\text{vp}}(k + 1)}} = \cdots = \Delta L_{{{\text{vp}}(k + N_{\text{p}} )}} = \Delta L_{\text{vp}}$$
(17)
$$\Delta L_{\text{pv}} N_{\text{v}} = \Delta L_{\text{vp}} N_{\text{p}}$$
(18)

where \(N_{\text{v}}\) and \(N_{\text{p}}\) are the numbers of valley hours and peak hours, respectively; \(\Delta L_{{{\text{pv}}(k)}}\) is the load increment at the kth valley hour caused by the load shift; and \(\Delta L_{{{\text{vp}}(k)}}\) is the load decrement at the kth peak hour.

According to the model proposed, the load after adjustment can be obtained by (19).

$$L_{t} = \left\{ \begin{aligned} L_{t0} + \mu_{\text{pv}} \bar{L}_{\text{p}} + \mu_{\text{fv}} \bar{L}_{\text{f}} \,\,\,\,\,t \in T_{\text{v}} \hfill \\ L_{t0} + \mu_{\text{pf}} \bar{L}_{\text{p}} - \mu_{\text{fv}} \bar{L}_{\text{f}} \,\,\,\,\,t \in T_{\text{f}} \hfill \\ L_{t0} - \mu_{\text{pv}} \bar{L}_{\text{p}} - \mu_{\text{pf}} \bar{L}_{\text{p}} \,\,\,\,\,t \in T_{\text{p}} \hfill \\ \end{aligned} \right.$$
(19)

where \(T_{\text{p}} ,T_{\text{f}} ,T_{\text{v}}\) are peak, flat and valley hours, respectively; \(L_{t0}\) and \(L_{t}\) are the load at hour t before and after TOU is implemented; \(\bar{L}_{\text{p}}\) and \(\bar{L}_{\text{f}}\) are the average peak-hour load and flat-hour load, respectively.
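The load adjustment (19) can be sketched in Python as follows. The function name, the encoding of periods as the strings "p", "f" and "v", and the six-hour sample profile are assumptions chosen for illustration; the shift rates are taken as given inputs.

```python
def adjusted_load(load, periods, mu_pv, mu_pf, mu_fv):
    """Apply (19) to an hourly load profile.

    load    : list of hourly loads before TOU (L_t0)
    periods : list of the same length with "p", "f" or "v" for each hour
    mu_pv, mu_pf, mu_fv : peak-valley, peak-flat and flat-valley shift rates
    """
    def average(tag):
        values = [l for l, p in zip(load, periods) if p == tag]
        return sum(values) / len(values)

    avg_peak = average("p")
    avg_flat = average("f")
    # Hourly change for each period type, one entry per branch of (19).
    change = {
        "v": mu_pv * avg_peak + mu_fv * avg_flat,
        "f": mu_pf * avg_peak - mu_fv * avg_flat,
        "p": -mu_pv * avg_peak - mu_pf * avg_peak,
    }
    return [l + change[p] for l, p in zip(load, periods)]


# Illustrative six-hour profile with two hours per period (not the case-study data).
before = [100.0, 100.0, 200.0, 200.0, 300.0, 300.0]
tags = ["v", "v", "f", "f", "p", "p"]
after = adjusted_load(before, tags, mu_pv=0.1, mu_pf=0.05, mu_fv=0.02)
```

In this example each valley hour gains 34, each flat hour gains 11 and each peak hour loses 45. The daily total is unchanged here because the three periods have the same number of hours; (19) alone does not guarantee this for periods of unequal length.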

Consumers can be divided into three types (i.e., industrial, commercial and household), whose responsive potential differs considerably, as reflected by the variation in the parameters. Factors affecting the parameters of the responsive model include the business type, the production procedure, and the proportion of electricity cost in total cost [30]. For example, in the iron manufacturing industry, high electricity quality and reliability are required, and the proportion of shiftable load is limited. Thus the maximum shift rate and the limit value are relatively small. In contrast, cement enterprises usually work in three shifts, and the electricity expense makes up approximately 15% of the total expense. In this case, they are more willing to transfer load in order to cut electricity fees, and the limit value and the maximum rate are larger. As for commercial and domestic users, a large part of the electricity is consumed by air conditioning and lighting equipment, whose response potential is considerable. As a result, their maximum load shift rate is relatively large, as are the threshold value and the limit value.

4 Case study

In this section, the Java-based multi-agent simulation platform “Repast” is used to simulate day-ahead electricity markets. The generator parameters are shown in Table 1. In this case, the number of bidding segments is assumed to be 4, the number of available strategies to be 10 and the price cap to be 1.2 times the marginal cost. The electricity is traded at the system clearing price. Segmented cost information of units is shown in Table 2, and the typical load curve is shown in Fig. 6. There are five retailers in the market, whose load information in each period is presented in Table 3. The separation of periods is shown in Table 4. In the region studied, the proportions of industrial load, commercial load and domestic load are 60%, 25% and 15%, respectively. The parameters of customers’ responsive models are set based on the analysis in [31, 32]. Given that the automation level of industrial users is higher than that of commercial and residential users, the parameters of the former are generally larger.

Table 1 Generator parameters
Table 2 Segmented cost of GenCos
Fig. 6 Typical daily load curve

Table 3 Information of retailers
Table 4 Separation of periods

The asymmetric market, where only bids from GenCos are accepted, is simulated first. Figure 7 shows how the valley-hour clearing price fluctuates during the simulation, while the change in the selection probability of GenCo 4’s optimal action is shown in Fig. 8. At the beginning, the price fluctuates because of the random bidding strategy of GenCos, and the selection probability changes slowly. However, the probability value, which is continuously updated throughout the simulation, rises swiftly after dozens of rounds. In the meantime, the price gradually converges to a certain value.

Fig. 7 Fluctuation of system clearing price

Fig. 8 Fluctuation of selection probability

Table 5 shows the optimal bids and profits of GenCos when the equilibrium has been reached, where “+” means the price added to the marginal cost. As the results show, GenCo 4 seizes a large share of the market. Its marginal cost is the lowest, so it could exercise market power by bidding much higher than its cost. On the other hand, in order to maintain enough revenue, other GenCos also have to raise their bids because their shares are much smaller. As a result, the market price settles at a relatively high level.

Table 5 Profits and optimal bids of GenCos

To investigate the impact of market power on price, the case with ten GenCos is also simulated; the result shows an apparent decline in clearing prices. Since more GenCos increase the competition, each one is less likely to seize a dominant market share. As a consequence, the price is closer to the system marginal cost.

Next, the symmetric market is simulated and the prices in different scenarios are compared in Fig. 9. As can be seen, the participation of buyers could lead to a further drop in market prices compared with the other two scenarios. GenCos are forced to lower their bids in order to avoid a loss in market share, as shown in Table 6. Accordingly, the prices are lower than those in the asymmetric market. In addition, the selling prices set by the five retailers are presented in Fig. 10.

Fig. 9 Comparison of prices in different scenarios

Table 6 Comparison of GenCos’ bids in different mechanisms
Fig. 10 Selling prices of retailers

Changes in prices and electricity consumption after the implementation of the TOU mechanism are presented in Fig. 11 and Fig. 12, respectively. Customers rearrange their load schedules by shifting load from high-price hours to low-price ones in response to the period-varying prices. The load adjustment causes a corresponding change in prices: the peak-hour price gradually decreases, whereas the prices in flat and valley hours grow to different degrees. In addition, the total load consumption rises slightly, as do the profits of retailers, as shown in Table 7.

Fig. 11 Price change during simulation

Fig. 12 Load curve before and after response

Table 7 Profit information of retailers

5 Conclusion

This paper proposes a multi-agent simulation model of the symmetric electricity market to study the impact of trading mechanisms and DR on the electricity market. Agent models of different market players are established according to their behavior. Moreover, the response characteristics of customers based on consumer psychology are presented. The numerical analysis compares the results for four and ten units in the market and discusses the impact of market power on the clearing price. A comparison of the simulation results of the symmetric and asymmetric cases shows that the participation of retailers could effectively lower clearing prices and avoid monopoly. In addition, the implementation of TOU could encourage consumers to adjust their original load profiles by shifting load from peak hours to off-peak hours, which has a similar effect on market prices.

In future work, we will further improve the agent model of consumers by taking into account more factors affecting the response characteristics, such as user satisfaction and interaction between different customers.