1 Introduction

The electricity market is moving from a market where energy is produced in a centralized fashion from traditional and often environmentally harmful sources to a liberalized, competitive and possibly distributed market that exploits renewable energy sources (RESs) [1]. A major challenge in this new environment is the alignment between the varying and to a large extent unpredictable energy supply (e.g. from RESs) and the ad-hoc energy demand of the end users. In addition, innovative concepts such as flexibility markets, energy poverty and energy efficiency are continuously emerging in the energy sector. To address this alignment challenge, the research community focuses on the development of pricing mechanisms that affect energy consumption by enabling a dynamic and sophisticated interaction between the pricing of energy (incentives) and the way end users consume it (scheduling). Studies under this premise develop algorithms that belong to the generic family of demand side management (DSM) algorithms. This is a promising approach that aims to affect energy consumption and create an additional tool for the optimization and stability of energy systems.

As analyzed in [2], residential participation in DSM is commonly envisaged via aggregated participation because of implementation and scalability issues.

Along with these technical and socio-economic changes, there is a rise of innovative business models for aggregating the DSM participation of a set of users. In particular, collective DSM participation can be undertaken by a non-profit organization representing the interests of its portfolio of users [3], a public (regulated) entity or a private company. In this paper, we assume that the aggregating entity only passes the energy costs to the consumers without extracting profit [4]. This use case represents the cases where: ① the private aggregating company operates in a highly competitive environment; ② the profit margins of the private aggregating company are regulated; ③ users form a cooperative organization to represent their interests; ④ the aggregating company is a public and non-profit entity.

Throughout this paper, we refer to the aggregating entity as the electricity service provider (ESP), a term that covers all four use cases.

In [5], we try to facilitate the easy, rich and deep communication between energy efficiency stakeholders and end users, allowing them to: ① discover each other; ② educate themselves so as to understand the difficulties and challenges each one faces; ③ interact and trade with each other.

Under this perspective, we focus on the development of pricing mechanisms that give end users the opportunity to derive direct financial benefits from the actions they undertake regarding their energy consumption. In more detail, through community pricing [6] or the personalized pricing mechanisms that we developed, we avoid the well-known problem of the tragedy of the commons [7]. This is a phenomenon in which users do not change their behavior (energy consumption in this case) because the change would have only a small impact on their own bill. In contrast, a personalized pricing mechanism is able to treat different users in different ways, according to their flexibility, and thus achieve a specific behavioral change efficiently.

More specifically, in this paper we refer to “system efficiency” as the maximization of Social Welfare, which is defined as the aggregated users’ welfare (AUW) and relates to the difference between the users’ satisfaction from electricity consumption and the users’ bills.

The challenge lies in the fact that each user’s satisfaction function is private and not known to the ESP, while users are generally considered as selfish, which means that each one opts for maximizing her own welfare, which is not necessarily aligned with the system’s objective.

Moreover, for the use cases of the ESP that we consider, it is very important that a DSM algorithm also exhibits two positive externalities apart from efficiency. Those are:

  1) Reduction of the system’s cost, which relates to systems with higher energy efficiency, more stable and sustainable networks, lower capital expenditure in overprovisioned grid facilities, lower CO2 emissions, etc.

  2) Fair allocation of the system’s resources among the users. This is particularly important for the business cases considered, because users will remain with the ESP only if they know that they receive a fair share of the benefits that they have generated in the first place. In our case, we want to allocate the system’s energy savings to the users that bring about those savings.

In such an environment, it is the job of the ESP to set the rules of energy trading in a smart way, such that: the system possesses the budget-balance property; selfish users’ actions bring the system to an equilibrium; and their deliberate choices bring the system to an outcome with desirable properties, namely high users’ welfare (KPI1), low system’s cost (KPI2) and fairness (KPI3).

The design of such rules is studied by a branch of game theory called ‘mechanism design’. The desirable properties above constitute the mechanism’s key performance indicators (KPIs) and are widely adopted in the literature.

Energy pricing models for DSM started with the enhancement of the traditional flat electricity tariff (a fixed price per consumed unit of energy, identical at all time instances) with inclining block rates (IBRs) [8, 9]. In IBR, the price of each unit depends on the total amount of energy a customer consumes. IBR was the first simple solution to incentivize energy curtailments, usually over a large time interval. A more sophisticated approach is time-of-use (ToU) pricing, where prices are predetermined based on a prediction of the relationship between aggregate production and consumption. However, ToU is insensitive to the users’ response to the prices and often creates reverse peaks. Finally, real time pricing (RTP) mechanisms set the price per energy unit depending on the total cost of energy production and the total consumption.

2 Related work

Liberalized electricity markets, smart grids and high penetration of RES led to the development of novel markets whose objective is the harmonization between production and demand (i.e. flexibility markets). This necessitates the development of novel pricing schemes able to allow ESPs to exploit flexibility in the energy consumption curves of their consumers.

The general idea described above has been approached in different ways in the literature, including ex-post [10, 11] & ex-ante pricing methods [12,13,14,15,16,17,18,19,20,21,22,23,24]. Many pricing mechanisms [2, 12,13,14,15] opt for system efficiency (KPI1), but at a risk of either running a deficit or extracting a large surplus from the users as explained in [4] and are not compatible with the emerging environments described. In particular, the authors in [12, 13], achieve an efficient allocation, but the system does not possess the budget-balance property described in the introduction. Moreover, users are considered to be price-takers, that is, they do not consider the effect that their choices have on the price. In [14], the users are considered as price-anticipators and the efficient Vickrey-Clarke-Groves (VCG) mechanism is applied, which is inherently not budget-balanced and additionally requires a simple and well-defined form of the user’s utility function in order to remain tractable.

Another class of DSM algorithms [4, 8, 16,17,18,19,20] has been designed to guide the users’ behavior towards more desirable demand profiles. This class of algorithms possesses the budget-balance property. In particular, in [8, 17, 18, 20] the authors opt for minimizing the system’s cost (KPI2), under the constraint that each load will be fully satisfied within its defined interval. The efficiency of the system is defined as the minimization of the system’s cost. In this class of studies, the users’ dissatisfaction from deviating from their desired consumption profile is not modeled. In [4, 19], where budget-balanced mechanisms are also proposed, the model captures only load shifting and not load curtailments. Moreover, none of the above works considers the property of fairness.

Finally, a third class of studies [21,22,23,24], opts for enhancing the system’s fairness (KPI3). In particular, the authors in [21] propose a pricing model based on the principle that each user should be billed according to her contribution to the system’s cost. The Shapley value from cooperative game theory is used to express this contribution. The same authors in their later work [22] argue that the model of [21] sacrifices efficiency to achieve fairness. In [22] the trade-off between fairness and cost minimization in the design of pricing mechanisms is assessed. However, the users are assumed to distribute evenly their load throughout the eligible timeslots and the user’s satisfaction is again disregarded.

Thus, through the study of the literature, one can confirm that the generally desired KPIs in the design of a pricing mechanism are the ones that we presented in the previous section and adopt in this paper’s context.

As analyzed in the previous paragraphs, the models proposed so far in the literature cope only with one or two of the above KPIs. To the best of our knowledge, there is no prior work that directly assesses the issue of designing a pricing mechanism that achieves an attractive trade-off among all three of the above KPIs. Our approach for the design of such a pricing mechanism is to adopt the concept of personalized–real time pricing (P-RTP).

Motivated by the above, the major contributions of this work are:

  1) A P-RTP algorithm that reduces the energy cost without sacrificing the AUW. Moreover, the proposed scheme achieves a fair allocation of the energy cost savings among the users.

  2) An analysis of the proposed algorithm’s convergence properties.

  3) A comparison of the proposed P-RTP with the existing RTP mechanisms that demonstrates its superiority with respect to the aforementioned KPIs.

  4) An analysis of the findings, with useful guidelines towards the design of pricing mechanisms in open and competitive markets.

3 System model and problem formulation

In this section, we describe prerequisites that will facilitate the presentation of our pricing mechanism and existing widely accepted models (i.e. user model, energy cost model) that will act as the testbed in order to objectively evaluate and compare the proposed pricing mechanism.

We consider a set (community) \(N = \left\{ {1, 2, \ldots , n} \right\}\) of \(n\) energy consumers (users). Each user is equipped with a smart meter, tracking his/her consumption at all time instances and an energy management system that schedules his/her consumption. We consider a finite time horizon, which is divided into \(h\) time slots \(H = \left\{ {1, 2, \ldots , h} \right\}\) of equal duration. An ESP, in coordination with the distribution system operator (DSO), installs the necessary equipment to each user and is responsible for the possible failures and upgrades. Various parties, such as utilities and DSOs, may act as ESPs, depending on the legislation of each country. A communication network lies on top of the electric grid and all parties are able to exchange messages with each other.

The consumption of user i in timeslot t is denoted as \(x_{i}^{t}\), where \(t\in H\) and \(i \in N\). The comfort of user \(i\) at a time-slot \(t\) is expressed by a utility function \(u_{i}^{t} \left( {x_{i}^{t} ,\omega_{i}^{t} } \right)\), where \(\omega_{i}^{t}\) is an appropriate elasticity parameter. The utility function expresses, in monetary units, how much user \(i\) values the consumption \(x_{i}^{t}\) at time t. To better characterize the properties of the utility function, the DSM literature draws on two concepts from microeconomics [25]. The first concept is that of diminishing returns, which, in our context, means that:

  1) The more a user consumes, the more utility he/she gains (\(u_{i}^{t} \left( {x_{i}^{t} ,\omega_{i}^{t} } \right)\) is increasing with \(x_{i}^{t}\)).

  2) The more a user consumes, the less the added utility (\(u_{i}^{t} \left( {x_{i}^{t} ,\omega_{i}^{t} } \right)\) is concave).

The second concept relates to demand elasticity, defined as the rate of change of the utility function with respect to small changes in the consumption quantity. This is expressed through parameter \(\omega_{i}^{t}\), where low values of \(\omega_{i}^{t}\) correspond to elastic demand (very responsive to price), whereas higher values of \(\omega_{i}^{t}\) correspond to inelastic demand (less responsive to price). The dependence of \(\omega_{i}^{t}\) on i and t captures the fact that different users, at different times, value consumption differently.

In what follows, we will sometimes use the shorthand notation \(\dot{u}_{i}^{t}\), with the dot denoting that it is a function. In the evaluation of the results, we show that the performance of the proposed mechanism is not affected by the particular choice of \(\dot{u}_{i}^{t}\) as long as it is based on the two concepts presented above.

By the concavity of \(\dot{u}_{i}^{t}\), it is clear that there is a saturation point beyond which utility no longer increases with \(x_{i}^{t}\). This is regarded as the user’s maximum desired consumption and is denoted as \(\tilde{x}_{i}^{t}\). The respective \(u_{i}^{t} \left( {\tilde{x}_{i}^{t} ,\omega_{i}^{t} } \right)\) is denoted as \(\tilde{u}_{i}^{t}\). In this paper, we assume that the user’s \(\tilde{x}_{i}^{t}\) is known to the ESP (e.g. through statistical data and machine learning) but the particular form of the user’s utility function, as well as the user’s elasticity parameter \(\omega_{i}^{t}\), remain private. The model can also be extended to capture the comfort derived from the consumption of each electric appliance, in which case the total comfort of the user would be the sum of the concave utility functions of the different appliances that the user possesses, and would again be concave. For the scope of the current work and without loss of generality (as in [2, 8, 13, 14, 21, 22]), we assume only one continuous, dispatchable and positive load \(x_{i}^{t} > 0\) for user i, representing the sum of the consumptions of all his/her electric appliances.

The supply side is usually modeled either as a game (e.g. a market that admits a Nash equilibrium [15, 26]) or (more simplistically) as a cost function that approximately relates the aggregate demand to the cost of the energy supplied (e.g. [12,13,14, 18, 22]). In this work, we adopt the latter approach, in which the system’s cost (denoted as \(G_{N}^{t}\)) depends on the total load \(\mathop \sum \limits_{i \in N} x_{i}^{t}\) of the users in set N at timeslot \(t\in H\) through an increasing convex function:

$$G_{N}^{t} = G\left( {\mathop \sum \limits_{i \in N} x_{i}^{t} } \right)$$
(1)

The cost function is commonly approximated by a quadratic cost function in the literature:

$$G_{N}^{t} = c\left( {\mathop \sum \limits_{i \in N} x_{i}^{t} } \right)^{2}$$
(2)

where \(c\) is a cost parameter. Equation (2) represents the cost for the ESP to buy an amount of energy equal to the total demand. As described in the introduction, the system needs to be budget-balanced (the sum of the bills of the participating users needs to be equal to the total system’s cost). The aforementioned function offers a fair test-bed for evaluating and comparing pricing mechanisms and for this reason it is widely accepted.
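As a rough illustration of (2), the following Python sketch evaluates the quadratic cost for a vector of loads. The function name `system_cost` and the sample loads are illustrative assumptions; only the functional form and the value \(c = 0.02\) (used later in the evaluation) come from the text.

```python
def system_cost(loads, c=0.02):
    """Quadratic cost of Eq. (2): G = c * (sum of loads)^2."""
    total = sum(loads)
    return c * total ** 2


# Hypothetical loads of three users in one timeslot.
loads = [4.0, 3.0, 3.0]
cost = system_cost(loads)                       # 0.02 * 10^2
doubled = system_cost([2 * x for x in loads])   # doubling the load quadruples the cost
```

Because the cost is convex in the total load, each additional unit of demand is more expensive than the previous one, which is what gives curtailments their value in this model.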

The objective at each timeslot t is to find the users’ consumptions \(\hat{x}_{i}^{t} , \forall i\in N\) that maximize the system’s efficiency (maximize the user comfort and minimize the energy cost):

$$\mathop {\text{max }}\limits_{{x_{i}^{t} ,\, i\in N}} \left\{ { \mathop \sum \limits_{i \in N} \left[ {\dot{u}_{i}^{t} } \right] - G_{N}^{t} } \right\}$$
(3)
$${\text{s}}.{\text{t}}. \mathop \sum \limits_{i \in N} \left[ {p_{i}^{t} x_{i}^{t} } \right] = G_{N}^{t}$$
(4)

Constraint (4) expresses the budget-balanced (non-profit) property. We present a model that deals only with load curtailments, implying a memoryless system. This means that the scheduling problem can be solved for the time horizon \(H\) by solving for each timeslot independently [12, 13, 19]. Solving (3) directly would require all users in N to disclose their comfort functions to the ESP and to accept direct ESP control over their loads. Since these requirements are not generally met in practice, the research community focuses on iterative pricing mechanisms that converge to an equilibrium (a set of prices) that satisfies the KPIs analyzed in the introduction. Considering (4), the prices set by the ESP are meant to efficiently distribute the energy cost to the users and thus inherently depend on \(G_{N}^{t}\).

At the user’s side, we consider selfish users that choose their \(x_{i}^{t}\), so as to maximize their own welfare under the ESP’s pricing:

$$x_{i}^{t} = \mathop {\text{argmax}}\limits_{{x_{i}^{t} }} \left\{ {\dot{u}_{i}^{t} - p^{t} x_{i}^{t} } \right\}$$
(5)

Equation (5) implies a price-taking user. This models a user who either is very small compared to the aggregated system’s consumption, so that his/her choice of \(x_{i}^{t}\) does not affect the price \(p^{t}\), or does not understand/consider the effect of his/her choice of \(x_{i}^{t}\) on the price \(p^{t}\). In that case, (3) can be solved via dual decomposition, where the ESP applies an efficient algorithm for finding the optimal set of prices by exchanging messages with each user (as presented in [13]). In contrast, we consider price-anticipating users, who also take into account the effect of their \(x_{i}^{t}\) on the price. Thus, the user’s problem (5) is converted into:

$$x_{i}^{t} = \mathop {\text{argmax}}\limits_{{x_{i}^{t} }} \left\{ {\dot{u}_{i}^{t} - p^{t} \left( {x_{i}^{t} , \varvec{x}_{ - i}^{t} } \right) x_{i}^{t} } \right\}$$
(6)

where the expression to be maximized is referred to as the user’s welfare and vector \(\varvec{x}_{ - i}^{t}\) denotes the consumptions of users other than \(i\). This interdependence among the users’ choices motivates a game \(\varGamma\) in which the participants are the users \(i\in N\), a user’s strategy is his/her choice of \(x_{i}^{t}\), and a user’s payoff is his/her welfare.
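The difference between the price-taking problem (5) and the price-anticipating problem (6) can be made concrete with a small numerical sketch. Everything below is an illustrative assumption rather than part of the model: the helper `best_response`, the grid search, the quadratic utility, the sample numbers, and the price rule (average cost under the quadratic cost (2), i.e. \(c\) times the total load).

```python
def best_response(utility, price_fn, x_max, steps=10000):
    """Grid search for the argmax over (0, x_max] of utility(x) - price_fn(x) * x."""
    best_x, best_w = None, float("-inf")
    for k in range(1, steps + 1):
        x = x_max * k / steps
        w = utility(x) - price_fn(x) * x
        if w > best_w:
            best_x, best_w = x, w
    return best_x


# Hypothetical user and environment (all numbers are illustrative).
omega, x_tilde, c, others = 1.0, 10.0, 0.02, 100.0
utility = lambda x: omega * x_tilde ** 2 - omega * (x - x_tilde) ** 2

# Price-taker, Eq. (5): the price is treated as a constant.
p_fixed = c * (others + x_tilde)
x_taker = best_response(utility, lambda x: p_fixed, x_tilde)

# Price-anticipator, Eq. (6): the price reacts to the user's own load.
x_anticipator = best_response(utility, lambda x: c * (others + x), x_tilde)
```

In this toy setting the price-anticipating user curtails slightly more than the price-taker, because he/she accounts for the price increase caused by his/her own consumption.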

Notice that the VCG mechanism is proved to converge to the unique allocation \(\hat{x}_{i}^{t}\) that optimizes (3). However, constraint (4) excludes VCG from consideration, as argued in the related work.

Moreover, efficient allocations in general require disclosure of the users’ utility functions to the ESP. Such an assumption would make the model convenient for analytical treatment. It is, however, a strong assumption that does not properly capture the intricacies of household energy usage, while also raising privacy as well as representation issues. In contrast, we choose to remain agnostic to the particular form of the user’s utility function. Because of this, the efficiency of equilibria cannot be established for the general case. Nonetheless, we focus on designing a pricing mechanism such that:

  1) Game \(\varGamma\) converges to a Nash equilibrium (NE).

  2) The system at equilibrium achieves an attractive trade-off among efficiency, low cost and fairness.

4 Real time pricing

We start the description of our personalized pricing mechanism by first presenting the existing RTP approach.

For timeslot \(t\in H\), at the ESP-level, the users’ scheduled energy consumptions \(x_{i}^{t}\) are taken as input and the price \(p^{t}\) of timeslot t (electricity per unit price, which under RTP is common for all users i) is calculated according to:

$$p^{t} = \frac{{G_{N}^{t} }}{{\mathop \sum \limits_{i \in N} x_{i}^{t} }}$$
(7)

Equation (7) leads to a user’s bill that is proportional to the user’s consumption \(\left( {\frac{{x_{i}^{t} }}{{\mathop \sum \limits_{i \in N} x_{i}^{t} }}G_{N}^{t} } \right)\), which ensures that the system is budget-balanced (the sum of the users’ bills equals the total energy cost).
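A minimal sketch of (7), assuming the quadratic cost (2), is given below. The function name `rtp_price` and the sample consumptions are hypothetical; the sketch only illustrates that proportional bills add up to the total cost.

```python
def rtp_price(loads, c=0.02):
    """Uniform per-unit price of Eq. (7) under the quadratic cost of Eq. (2)."""
    total = sum(loads)
    cost = c * total ** 2
    return cost / total


loads = [4.0, 3.0, 3.0]              # hypothetical consumptions x_i^t
price = rtp_price(loads)             # equals c * total load
bills = [price * x for x in loads]   # each bill is proportional to consumption
```

With the quadratic cost, the uniform price reduces to \(c\) times the total load, so every user faces the same price regardless of how much he/she has curtailed.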

At user-level, users sequentially choose their \(x_{i}^{t}\) from (6). During this calculation, \(\varvec{x}_{ - i}^{t}\) is considered fixed. Notice that although user \(i\) might not know the function \(p^{t} \left( {x_{i}^{t} , \varvec{x}_{ - i}^{t} } \right)\), he/she can detect the pricing trend by exchanging messages with the ESP. More specifically, by trying different \(x_{i}^{t}\) and receiving the respective \(p^{t}\), the user can estimate \(p^{t} \left( {x_{i}^{t} , \varvec{x}_{ - i}^{t} } \right)\) by applying a polynomial fitting algorithm. This approach allows for a distributed implementation, which is in line with state-of-the-art requirements [22, 27, 28].

After a limited number of sequential iterations (calculations) of each user’s updated \(x_{i}^{t}\), the system converges to the equilibrium price where no user wishes to further modify his/her \(x_{i}^{t}\). A user’s final \(x_{i}^{t}\) at equilibrium is denoted as \(\hat{x}_{i}^{t,RTP} , \forall i \in N\). The procedure is described in Algorithm 1 (where k denotes the algorithm’s iterations).

Table 1
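The listing of Algorithm 1 is not reproduced in this text, so the following Python sketch is only one plausible reading of the sequential best-response procedure described above. It assumes the quadratic cost (2) and a quadratic utility of the form used later in the evaluation, for which the maximizer of (6) has a closed form; the function names, starting point, tolerance and sample users are all hypothetical.

```python
def rtp_best_response(x_tilde_i, omega_i, others, c):
    """Closed-form maximizer of u_i(x) - c * (x + others) * x on [0, x_tilde_i]."""
    x = (2 * omega_i * x_tilde_i - c * others) / (2 * omega_i + 2 * c)
    return min(max(x, 0.0), x_tilde_i)


def run_rtp(x_tilde, omega, c=0.02, tol=1e-9, max_iters=1000):
    """Sequential best-response updates until no user changes by more than tol."""
    x = list(x_tilde)  # start from the desired consumptions
    for k in range(max_iters):
        largest_change = 0.0
        for i in range(len(x)):
            others = sum(x) - x[i]  # loads of the other users, held fixed
            new_x = rtp_best_response(x_tilde[i], omega[i], others, c)
            largest_change = max(largest_change, abs(new_x - x[i]))
            x[i] = new_x
        if largest_change < tol:
            break
    price = c * sum(x)  # Eq. (7) with the quadratic cost
    return x, price


x_eq, p_eq = run_rtp([10.0, 8.0, 6.0], [1.0, 2.0, 0.5])
```

In a real deployment each user would compute the best response locally from his/her private utility; the closed form is used here only to keep the sketch short.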

5 Personalized–real time pricing

In this section we propose the concept of P-RTP, meaning that the price will no longer be a scalar \(p^{t}\) (same for all users \(i \in N\)) but each user will receive a different price \(p_{i}^{t}\).

From the class of all possible P-RTP mechanisms, we formulate a particular mechanism that is designed to perform well in the three KPIs described in Section 1. The proposed mechanism allocates lower prices to those users who consume a lower percentage of their desired consumption (\(\tilde{x}_{i}^{t}\)), compared to users who consume a higher percentage of their desired consumption. In particular, for a user \(i\) and a timeslot \(t\) we allocate the price \(p_{i}^{t}\) according to the degree to which the user curtails his/her consumption. Elastic users receive lower prices and inelastic users receive higher prices. Note that P-RTP assumes knowledge of the desired energy consumption (\(\tilde{x}_{i}^{t}\)). If a user were allowed to declare a fake (larger) desired consumption, P-RTP would favor him/her. Thus, this pricing mechanism is suitable for automated environments (through ICT systems) where users do not manually declare their consumption. On the other hand, the exploitation of the desired energy consumption leads to very effective pricing mechanisms. In this paper, we present a pricing model for the use case of automated environments, while in [29] we also address the other use case (private \(\tilde{x}_{i}^{t}\)).

In order to achieve prices with a discount proportional to the percentage of curtailments, we set:

$$(p_{i}^{t} - \tilde{p}^{t} ) / \tilde{p}^{t} = (x_{i}^{t} - \tilde{x}_{i}^{t} ) /\tilde{x}_{i}^{t}$$
(8)

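where \(\tilde{p}^{t}\) is introduced in order to tune the prices so that constraint (4) holds. Let us denote as \(\gamma_{i}^{t}\) the percentage of the curtailment of user i at timeslot t, expressed as the relative deviation from the desired consumption (so that \(\gamma_{i}^{t} \le 0\) when the user curtails):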

$$\gamma_{i}^{t} = (x_{i}^{t} - \tilde{x}_{i}^{t} ) / \tilde{x}_{i}^{t}$$
(9)

Thus, (8) through the use of (9) becomes:

$$p_{i}^{t} = \tilde{p}^{t} \left( {1 + \gamma_{i}^{t} } \right)$$
(10)

Substituting (10) into the budget-balance constraint (4) and solving for \(\tilde{p}^{t}\), we have:

$$\tilde{p}^{t} = \frac{{G_{N}^{t} }}{{\mathop \sum \limits_{i \in N} \left[ {x_{i}^{t} \left( {1 + \gamma_{i}^{t} } \right)} \right]}}$$
(11)

If we now combine (10) and (11) we have:

$$p_{i}^{t} = \left( {1 + \gamma_{i}^{t} } \right)G_{N}^{t} / \mathop \sum \limits_{i \in N} \left[ {x_{i}^{t} \left( {1 + \gamma_{i}^{t} } \right)} \right]$$
(12)
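To make (12) concrete, the sketch below computes personalized prices for two hypothetical users under the quadratic cost (2). The function name `prtp_prices` and the sample values are assumptions made for illustration only.

```python
def prtp_prices(loads, desired, c=0.02):
    """Personalized prices of Eq. (12) under the quadratic cost of Eq. (2)."""
    cost = c * sum(loads) ** 2
    # 1 + gamma_i = x_i / x_tilde_i, from Eq. (9).
    weights = [x / xd for x, xd in zip(loads, desired)]
    denom = sum(x * w for x, w in zip(loads, weights))
    return [w * cost / denom for w in weights]


desired = [10.0, 10.0]   # hypothetical desired consumptions
loads = [5.0, 10.0]      # user 0 curtails by half, user 1 not at all
prices = prtp_prices(loads, desired)
bills = [p * x for p, x in zip(prices, loads)]
```

In this example the user who curtails by half pays half the per-unit price of the user who does not curtail, while the two bills still sum to the total cost. Under the uniform price of (7) both users would instead pay the same per-unit price.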

In the proposed mechanism, we iteratively solve (6) and calculate the prices from (12). The process is described in Algorithm 2.

Table 2
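Since the listing of Algorithm 2 is not reproduced in this text, the following Python sketch shows one possible reading of the iteration between (6) and (12). It assumes the quadratic cost (2) and the quadratic utility of (29), and it replaces the user's optimization with a plain grid search; the function names, grid resolution, stopping rule and sample users are hypothetical, and the sketch makes no claim about convergence in general.

```python
def prtp_bill(x, x_tilde_i, others_load, others_weight, c):
    """Bill x * p_i from Eq. (12); others_weight = sum_j x_j^2 / x_tilde_j."""
    cost = c * (others_load + x) ** 2
    own = x * x / x_tilde_i
    return cost * own / (others_weight + own)


def run_prtp(x_tilde, omega, c=0.02, steps=2000, max_rounds=50):
    """Sequential grid-search best responses under personalized prices."""
    n = len(x_tilde)
    x = list(x_tilde)  # start from the desired consumptions
    for _ in range(max_rounds):
        changed = False
        for i in range(n):
            load = sum(x) - x[i]
            weight = sum(x[j] ** 2 / x_tilde[j] for j in range(n) if j != i)
            best_x, best_w = x[i], float("-inf")
            for k in range(1, steps + 1):
                z = x_tilde[i] * k / steps
                # Quadratic utility, as in Eq. (29), minus the bill.
                u = omega[i] * (x_tilde[i] ** 2 - (z - x_tilde[i]) ** 2)
                w = u - prtp_bill(z, x_tilde[i], load, weight, c)
                if w > best_w:
                    best_x, best_w = z, w
            changed = changed or abs(best_x - x[i]) > 1e-12
            x[i] = best_x
        if not changed:
            break
    return x


x_tilde = [10.0, 8.0, 6.0]
x_eq = run_prtp(x_tilde, [1.0, 2.0, 0.5])
```

The loop stops either when a full round leaves every user unchanged or after `max_rounds` rounds, whichever comes first.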

Theorem

Algorithm 2 converges to a NE after a finite number of iterations via best response dynamics.

Proof

The strategy for the proof of the convergence of P-RTP is to find a function that is bounded from above and increases in every iteration of P-RTP. We consider the AUW according to (13).

$$AUW = \mathop \sum \limits_{i \in N} \left( {\dot{u}_{i}^{t} - p_{i}^{t} x_{i}^{t} } \right)$$
(13)

\(\text {AUW}\) is bounded from above (the theoretical maximum is in the case in which every user consumes all the energy that (s)he needs and the price is zero). It remains now to prove that AUW increases in every iteration of P-RTP. Note that we cannot study the monotonicity of \(\text {AUW}\) by exploiting its derivative, because no assumption is made on the differentiability of \(\dot{u}_{i}^{t}\).

Consider an arbitrary instance of game Γ where it is user \(i\)’s turn. User \(i\)’s state is \(x_{i}^{t}\) and the state of the users other than \(i\) is fixed. We denote the latter as \(\varvec{x}_{j}^{t}\), where \(j \in N, j \ne i\). Holding \(\varvec{x}_{j}^{t}\) fixed, suppose \(i\) deviates to \(\hat{x}_{i}^{t}\). The change in \(AUW\) breaks down into the change in the welfare of user \(i\), whose price follows (12), and the change in the welfare of the users \(j \ne i\). According to (13) and the above notation, with \(U\left( \cdot \right)\) used as shorthand for the respective user’s utility, in order to prove that \(AUW\) increases in every iteration of P-RTP it must be proven that:

$$\begin{aligned} U\left( {\hat{x}_{i}^{t} } \right) - \hat{x}_{i}^{t} p_{i}^{t} \left( {\hat{x}_{i}^{t} , \varvec{x}_{j}^{t} } \right) + \mathop \sum \limits_{j \ne i} U\left( {x_{j}^{t} } \right) - \mathop \sum \limits_{j \ne i} x_{j}^{t} p_{j}^{t} \left( {\hat{x}_{i}^{t} , \varvec{x}_{j}^{t} } \right) \hfill \\ > U\left( {x_{i}^{t} } \right) - x_{i}^{t} p_{i}^{t} \left( {x_{i}^{t} , \varvec{x}_{j}^{t} } \right) + \mathop \sum \limits_{j \ne i} U\left( {x_{j}^{t} } \right) \hfill \\ - \mathop \sum \limits_{j \ne i} x_{j}^{t} p_{j}^{t} \left( {x_{i}^{t} , \varvec{x}_{j}^{t} } \right) \hfill \\ \end{aligned}$$
(14)

Best response dynamics means that each user at any instance selects a strategy that maximizes her/his own welfare. So, since user \(i\) deviates, it holds by definition:

$$U\left( {\hat{x}_{i}^{t} } \right) - \hat{x}_{i}^{t} p_{i}^{t} \left( {\hat{x}_{i}^{t} , \varvec{x}_{j}^{t} } \right) > U\left( {x_{i}^{t} } \right) - x_{i}^{t} p_{i}^{t} \left( {x_{i}^{t} , \varvec{x}_{j}^{t} } \right)$$
(15)

From (14) and (15), it suffices to prove that:

$$\mathop \sum \limits_{j \ne i} x_{j}^{t} p_{j}^{t} \left( {x_{i}^{t} , \varvec{x}_{j}^{t} } \right) > \mathop \sum \limits_{j \ne i} x_{j}^{t} p_{j}^{t} \left( {\hat{x}_{i}^{t} , \varvec{x}_{j}^{t} } \right)$$
(16)

We present here the case \(\hat{x}_{i}^{t} > x_{i}^{t}\); the same proof holds symmetrically for \(\hat{x}_{i}^{t} < x_{i}^{t}\). Since \(\hat{x}_{i}^{t} > x_{i}^{t}\) and the cost function is increasing, we have:

$$G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + \hat{x}_{i}^{t} } \right) > G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + x_{i}^{t} } \right)$$
(17)

which means that the system cost has increased by:

$$\Delta G = G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + \hat{x}_{i}^{t} } \right) - G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + x_{i}^{t} } \right)$$
(18)

In addition the bill of user i has increased by:

$$\Delta B_{i} = \hat{x}_{i}^{t} p_{i}^{t} \left( {\hat{x}_{i}^{t} , \varvec{x}_{j}^{t} } \right) - x_{i}^{t} p_{i}^{t} \left( {x_{i}^{t} , \varvec{x}_{j}^{t} } \right)$$
(19)

We now study the relation between \(\Delta B_{i}\) and \(\Delta G\). If \(\Delta B_{i} > \Delta G\), user \(i\) pays more than the cost difference that she/he creates, and thus the bills of the other users are lower in the new state, which means that (16) holds. More formally, because of the budget-balance property of P-RTP:

$$\Delta B_{i} + \Delta \left( {\mathop \sum \limits_{j \ne i} B_{j} } \right) = \Delta G$$
(20)

which means that (16) holds for:

$$\Delta B_{i} - \Delta G > 0$$
(21)

Substituting (12) into (21), with \(\gamma_{i}^{t}\) in the first fraction evaluated at \(\hat{x}_{i}^{t}\) through (9), gives:

$$\begin{aligned} \Delta B_{i} - \Delta G = {} & \frac{{\hat{x}_{i}^{t} \left( {1 + \gamma_{i}^{t} } \right)G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + \hat{x}_{i}^{t} } \right)}}{{\mathop \sum \limits_{i \in N} \left[ {\hat{x}_{i}^{t} \left( {1 + \gamma_{i}^{t} } \right)} \right]}} - \frac{{x_{i}^{t} \left( {1 + \gamma_{i}^{t} } \right)G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + x_{i}^{t} } \right)}}{{\mathop \sum \limits_{i \in N} \left[ {x_{i}^{t} \left( {1 + \gamma_{i}^{t} } \right)} \right]}} \\ & - G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + \hat{x}_{i}^{t} } \right) + G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + x_{i}^{t} } \right) \end{aligned}$$
(22)

Substituting \(\gamma_{i}^{t}\) from (9), so that \(1 + \gamma_{i}^{t} = x_{i}^{t} /\tilde{x}_{i}^{t}\), and rearranging, we have:

$$\begin{aligned} \Delta B_{i} - \Delta G = {} & G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + \hat{x}_{i}^{t} } \right)\left[ {\frac{{\left( {\hat{x}_{i}^{t} } \right)^{2} }}{{\tilde{x}_{i}^{t} \left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }} + \frac{{\left( {\hat{x}_{i}^{t} } \right)^{2} }}{{\tilde{x}_{i}^{t} }}} \right)}} - 1} \right] \\ & - G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + x_{i}^{t} } \right)\left[ {\frac{{\left( {x_{i}^{t} } \right)^{2} }}{{\tilde{x}_{i}^{t} \left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }} + \frac{{\left( {x_{i}^{t} } \right)^{2} }}{{\tilde{x}_{i}^{t} }}} \right)}} - 1} \right] \end{aligned}$$
(23)

Observe that (23) can be written in the form \(\Delta B_{i} - \Delta G = \varPhi \left( {\hat{x}_{i}^{t} } \right) - \varPhi \left( {x_{i}^{t} } \right)\) with:

$$\varPhi \left( z \right) = G_{N}^{t} \left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} + z} \right)\left[ {\frac{{z^{2} }}{{\tilde{x}_{i}^{t} \left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }} + \frac{{z^{2} }}{{\tilde{x}_{i}^{t} }}} \right)}} - 1} \right]$$
(24)

Since \(\hat{x}_{i}^{t} > x_{i}^{t}\), it suffices to show that \(\varPhi\) is increasing, i.e.

$$\frac{{{\text{d}}\varPhi \left( z \right)}}{{{\text{d}}z}} > 0$$
(25)

Substituting the quadratic cost (2) into (24) and differentiating with respect to \(z\), condition (25) becomes:

$$\frac{{2c\,\tilde{x}_{i}^{t} \left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }}} \right)\left( {z + \mathop \sum \limits_{j \ne i} x_{j}^{t} } \right)\left[ {\left( {\mathop \sum \limits_{j \ne i} x_{j}^{t} } \right)z - \left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }}} \right)\tilde{x}_{i}^{t} } \right]}}{{\left[ {z^{2} + \left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }}} \right)\tilde{x}_{i}^{t} } \right]^{2} }} > 0$$
(26)

Since every factor other than the bracketed term in the numerator is positive, (26) reduces to

$$z > \frac{{\left[ {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }}} \right]\tilde{x}_{i}^{t} }}{{\mathop \sum \limits_{j \ne i} x_{j}^{t} }}$$
(27)

Observe that \({{x_{j}^{t}}}/{{\tilde{x}_{j}^{t} }} < 1\) for every \(j \ne i\) (since the denominator is by definition the upper limit of the numerator). We have that:

$$\frac{{\left[ {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }}} \right]\tilde{x}_{i}^{t} }}{{\mathop \sum \limits_{j \ne i} x_{j}^{t} }} = \frac{{\left[ {\mathop \sum \limits_{j \ne i} \left( {\frac{{x_{j}^{t} }}{{\tilde{x}_{j}^{t} }}x_{j}^{t} } \right)} \right]\tilde{x}_{i}^{t} }}{{\mathop \sum \limits_{j \ne i} x_{j}^{t} }} < \tilde{x}_{i}^{t}$$
(28)

Thus, because of (28), there is a non-empty feasible region \(x_{i}^{t} \in \left[ {\frac{{\left( {\mathop \sum \limits_{j \ne i} \frac{{\left( {x_{j}^{t} } \right)^{2} }}{{\tilde{x}_{j}^{t} }}} \right)\tilde{x}_{i}^{t} }}{{\mathop \sum \limits_{j \ne i} x_{j}^{t} }},\tilde{x}_{i}^{t} } \right]\) for which condition (16) holds.

6 Performance evaluation and comparisons

In this section, we present simulation results that demonstrate the performance of the proposed P-RTP mechanism with respect to the KPIs of interest. As a benchmark, we use the simple RTP mechanism (Algorithm 1). The evaluation considers scenarios under a variety of assumptions for the values of the parameters in the two models.

In order to evaluate mechanisms, the research community (e.g. [2, 4, 12,13,14,15,16]) usually models the utility of end users as a concave and increasing function of \(x_{i}^{t}\) and \(\omega_{i}^{t}\) with a constant maximum value after a saturation point:

$$u_{i}^{t} \left( {x_{i}^{t} ,\omega_{i}^{t} } \right) = \left\{ {\begin{array}{*{20}l} {\tilde{u}_{i}^{t} - \omega_{i}^{t} \left( {x_{i}^{t} - \tilde{x}_{i}^{t} } \right)^{2} } \hfill & {0 < x_{i}^{t} < \tilde{x}_{i}^{t} } \hfill \\ {\tilde{u}_{i}^{t} } \hfill & {x_{i}^{t} \ge \tilde{x}_{i}^{t} } \hfill \\ \end{array} } \right.$$
(29)

The utility function’s general form is assumed to be the same for all \(i\) and \(t\). In what follows, we present simulations for a representative set of 100 users. Moreover, the optimization problem can be solved for each timeslot independently, as explained in Section 4. Thus, without loss of generality, we run the simulation for one timeslot (\(h = 1\)) and present the results. Parameter \(\tilde{u}_{i}^{t}\) expresses the user’s maximum utility (i.e. the utility at \(x_{i}^{t} \ge \tilde{x}_{i}^{t}\)) and was set to \(\tilde{u}_{i}^{t} = \omega_{i}^{t} \left( {\tilde{x}_{i}^{t} } \right)^{2}\). Unless stated otherwise, parameter \(c\) was set to \(c = 0.02\). The flexibility parameter \(\omega_{i}^{t}\) for each user \(i\) was selected randomly in the interval [0.1, 5]. These choices are in line with the literature [2, 4, 12,13,14,15,16] as well as with datasets taken from real users and real-life tests undertaken within [30].
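As a minimal sketch, the utility model (29) and the parameter choices described above could be set up as follows. The function name `utility`, the random seed and, in particular, the range used to draw the desired consumption \(\tilde{x}_{i}^{t}\) are assumptions made for illustration only; the text does not state how \(\tilde{x}_{i}^{t}\) was chosen.

```python
import random


def utility(x, omega, x_tilde):
    """Utility (29): quadratic below the saturation point x_tilde, flat above it."""
    u_max = omega * x_tilde ** 2  # maximum utility, set as in the text
    if x >= x_tilde:
        return u_max
    return u_max - omega * (x - x_tilde) ** 2


random.seed(0)   # hypothetical seed, only for reproducibility of the sketch
N_USERS = 100    # representative set of users, as in the text
C = 0.02         # default cost parameter c, as in the text

# Flexibility parameters drawn uniformly from [0.1, 5], as in the text.
omegas = [random.uniform(0.1, 5.0) for _ in range(N_USERS)]

# ASSUMPTION: desired consumption range is illustrative, not taken from the paper.
x_tildes = [random.uniform(1.0, 10.0) for _ in range(N_USERS)]
```

With \(\tilde{u}_{i}^{t} = \omega_{i}^{t} (\tilde{x}_{i}^{t})^{2}\), the utility is zero at zero consumption and rises to its maximum at \(x_{i}^{t} = \tilde{x}_{i}^{t}\).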

In correspondence with the three KPIs presented in Section 1, we define four index metrics for the evaluation:

  1) AUW is a straightforward index for system efficiency (KPI1).

    $$AUW = \mathop \sum \limits_{{{i} \in {N}}} \left( {\dot{u}_{i}^{t} - p_{i}^{t} x_{i}^{t} } \right)$$
    (30)
  2) The allocation’s cost \(G\) is also a straightforward index metric of system cost (KPI2).

    $$G = c\left( {\mathop \sum \limits_{{{i} \in {N}}} \left[ {x_{i}^{t} } \right]} \right)^{2}$$
    (31)

    We evaluate P-RTP and simple RTP with respect to these two KPIs for different values of \(c\) and \(\omega_{i}^{t}\) in order to show that the performance of our mechanism does not depend on the parameters of the system. KPI1 and KPI2 are generally in conflict with each other; for example, a lower system cost can lead to lower users’ welfare (because of lower consumption) unless the users are compensated with lower prices. We define behavioral reciprocity (\({\text{BR}}\)) as a metric that captures this trade-off:

  3) BR of user \(i\) is the degree of correlation between the behavioral change of \(i\) and the reward that \(i\) gets for it:

    $$BR_{i} = \frac{{D_{i}^{A} }}{{D_{i}^{R} }}\quad \forall i \in N$$
    (32)

    where

    $$D_{i}^{A} = \left( {\tilde{x}_{i}^{t} - x_{i}^{t} } \right)\frac{{G\left( {\mathop \sum \limits_{i = 1}^{N} \tilde{x}_{i}^{t} } \right) - G \left( {\mathop \sum \limits_{i = 1}^{N} x_{i}^{t} } \right)}}{{ \mathop \sum \limits_{i = 1}^{N} \tilde{x}_{i}^{t} - \mathop \sum \limits_{i = 1}^{N} x_{i}^{t} }}$$
    (33)

    represents the discount achieved, i.e. the system cost reduction, for which user \(i\) is responsible and:

    $$D_{i}^{R} = \tilde{x}_{i}^{t} \frac{{ G\left( {\mathop \sum \limits_{i = 1}^{N} \tilde{x}_{i}^{t} } \right)}}{{\mathop \sum \limits_{i = 1}^{N} \tilde{x}_{i}^{t} }} - x_{i}^{t} p_{i}^{t}$$
    (34)

    represents the discount received, i.e. the difference between the user’s bill in the original system state (\(x_{i}^{t} = \tilde{x}_{i}^{t}\)) and the user’s actual bill (after applying RTP or P-RTP). Values of \(BR_{i}\) close to 1 indicate a better trade-off between AUW and G, and thus a fairer pricing mechanism.

  4) User welfare deviation (\({\text{UWD}}\)) is defined to capture the degree of the deviation of user \(i\) from the average user’s welfare:

$$UWD_{i} = \frac{{\left[ {\left( {\dot{u}_{i}^{t} - p_{i}^{t} x_{i}^{t} } \right) - \frac{AUW}{n}} \right]}}{{\frac{AUW}{n}}}\quad \forall i \in N$$
(35)

Its purpose is to verify that a mechanism’s performance does not come at the expense of treating a subset of users unfairly. A low UWD means that there are no users with very high welfare alongside users with very low welfare (the latter would leave the ESP in case of competition or would be very unhappy in case of monopoly). Thus, the objective here is to keep UWD low.
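The four metrics (30) to (35) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the simulation code used for the results: all function and argument names are hypothetical, the cost function is the quadratic form of (31), and the prices \(p_{i}^{t}\) are taken as given inputs (the RTP and P-RTP pricing algorithms themselves are not reproduced here).

```python
def system_cost(total_load, c=0.02):
    """System cost (31): G = c * (total load)^2."""
    return c * total_load ** 2


def auw(utilities, prices, loads):
    """AUW (30): sum over users of utility minus bill."""
    return sum(u - p * x for u, p, x in zip(utilities, prices, loads))


def behavioral_reciprocity(loads, desired, prices, c=0.02):
    """BR_i (32): discount achieved (33) divided by discount received (34).

    Assumes total curtailment is non-zero and every user's discount
    received is non-zero; otherwise the ratios are undefined.
    """
    total_x, total_d = sum(loads), sum(desired)
    g_x, g_d = system_cost(total_x, c), system_cost(total_d, c)
    saving_per_unit = (g_d - g_x) / (total_d - total_x)  # cost saved per unit curtailed
    original_price = g_d / total_d                       # per-unit cost in the original state
    br = []
    for x, d, p in zip(loads, desired, prices):
        achieved = (d - x) * saving_per_unit   # D_i^A, eq. (33)
        received = d * original_price - x * p  # D_i^R, eq. (34)
        br.append(achieved / received)
    return br


def uwd(utilities, prices, loads):
    """UWD_i (35): relative deviation of each user's welfare from the average."""
    welfare = [u - p * x for u, p, x in zip(utilities, prices, loads)]
    mean = sum(welfare) / len(welfare)
    return [(w - mean) / mean for w in welfare]
```

For instance, with two users whose desired loads are 4 and 6, actual loads 3 and 5, and a hypothetical uniform budget-balanced price of 0.16 per unit, `behavioral_reciprocity([3, 5], [4, 6], [0.16, 0.16])` gives values of roughly 1.125 and 0.9. The first user is under-rewarded and the second over-rewarded relative to the cost reduction each one brought about.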

Having defined the metrics of interest, we now proceed to the presentation of the results obtained. In all figures, we normalize each metric by dividing by its highest value. Figure 1 compares the energy costs (\(G\)) with RTP and P-RTP pricing under various values of parameter \(c\).

Fig. 1 Energy costs as a function of cost parameter

As Fig. 1 shows, the proposed P-RTP reduces the cost of energy for every value of \(c\), which indicates that P-RTP achieves a lower system cost regardless of the cost function we use. This is because P-RTP leads to a smaller load level than RTP. In order to show that the results are not affected by the elasticity parameter we use, we multiply \(\omega_{i}^{t}\) by a factor (omega factor) \(\omega_{f}\) in \(\left[ {0.1, 3} \right]\). Accordingly, Fig. 2 compares the energy costs (\(G\)) with RTP and P-RTP pricing as a function of \(\omega_{f}\). From Fig. 2 we observe that P-RTP always brings a reduction in the energy cost. Thus, its performance is consistent and significant for any choice of the flexibility parameter for the participating users.

Fig. 2 Energy costs of P-RTP and RTP as a function of omega factor

The reason behind the reduction of the energy costs is clarified through Fig. 3, where we present the cumulative distribution function (CDF) of the \(BR_{i}\) metric exhibited by the users \(i\) in \(N\). The dotted vertical lines represent the average \({\text{BR}}\) of all users. As depicted in Fig. 3, under P-RTP, users obtain benefits (discounts received) according to their behavioral change (discount achieved). In more detail, we observe that P-RTP not only offers a better trade-off between \({\text{AUW}}\) and \(G\) (the average \({\text{BR}}\) for P-RTP is closer to 1 than the average \({\text{BR}}\) for RTP) but also results in a much narrower distribution of users around the average. This means that the behavioral change that the users offer is better and more fairly reciprocated. In other words, with the proposed P-RTP, inflexible users do not benefit from the actions of flexible users. This implies that, with P-RTP, flexible users have stronger incentives to adapt their behavior, as they know that they will benefit from such an adaptation, while non-adaptive users will not receive benefits.

Fig. 3 CDF of metric \(BR_{i}\) among participating users under RTP and P-RTP pricing

The following figures show that the reduction in the energy cost is achieved without sacrificing the users’ welfare at all. In more detail, Figs. 4 and 5 present metric \({\text{AUW}}\) for the RTP and the P-RTP mechanisms as a function of \(c\) and \(\omega_{f}\), respectively.

Fig. 4 AUW under P-RTP and RTP as a function of cost parameter \(c\)

Fig. 5 AUW under P-RTP and RTP as a function of omega factor \(\omega_{f}\)

By comparing Figs. 1 and 4, one can see that the system’s cost has been reduced and the system’s fairness has been enhanced without loss in the users’ aggregated welfare, that is, without sacrificing efficiency. This is explained by the fact that P-RTP allocates the financial savings to the users who bring about the cost reduction and not to the inflexible ones. In comparison with the simple RTP model, this leads to an increase in the flexible users’ welfare and a decrease in the inflexible users’ welfare, so that the total \({\text{AUW}}\) remains the same.

Although the \({\text{AUW}}\) metric under P-RTP is no worse than under RTP, we also want to make sure that the cost reduction does not come with a sacrifice of welfare for a particular subset of users. In Fig. 6, we present the CDF for \(UWD_{i}\). The dotted vertical lines represent the average \({\text{UWD}}\) of all users in the set \(N\). The averages coincide with each other, while the distribution with P-RTP is marginally narrower.

Fig. 6 CDF of metric UWD in P-RTP and RTP

7 Future work

We considered a business model of a budget-balanced aggregating entity serving as ESP for its registered users. We proposed a P-RTP mechanism and evaluated its performance against that of the classic RTP mechanism in terms of the most well-established KPIs derived in the literature. In order to focus on the merits of the main idea, we kept the system model simple so as not to harm the generality of the results. Future research can extend the results to more advanced system models that include: ① the possibility of load shifting in addition to load curtailment; ② RES and energy storage systems (ESS). In addition, the user’s utility function and the way the user makes decisions are still open areas for research. Distinct models for different devices could be considered and applied under the P-RTP paradigm. Moreover, in electricity markets, different pricing mechanisms (P-RTP, RTP, flat-price, etc.) are to be offered to real users as options, making the co-existence of different pricing mechanisms for different users in a given market an interesting problem. Finally, the new prospects of electricity pricing offered by P-RTP will impact, if adopted, the sizing (investment cost) of RES and ESSs. We believe that the integration of RES and ESS sizing with P-RTP mechanism design may give rise to new capabilities for self-sufficient micro-grids and advanced demand side management.