Approximating Nash Equilibrium in Day-ahead Electricity Market Bidding with Multi-agent Deep Reinforcement Learning

doi:10.35833/MPCE.2020.000502

Volume 9, Issue 3, 2021, pp. 534-544

Author:

Yan Du¹, Fangxing Li¹, Helia Zandi², Yaosuo Xue²

Affiliation:

1. University of Tennessee, Knoxville, USA; 2. Oak Ridge National Laboratory, Oak Ridge, USA

Fund Project:

This work was supported in part by the US Department of Energy (DOE), Office of Electricity and Office of Energy Efficiency and Renewable Energy under contract DE-AC05-00OR22725, in part by CURENT, an Engineering Research Center funded by US National Science Foundation (NSF) and DOE under NSF award EEC-1041877, and in part by NSF award ECCS-1809458.

Abstract:

This paper studies a day-ahead electricity market bidding problem with multiple strategic generation company (GENCO) bidders. The problem is formulated as a Markov game in which GENCO bidders interact with each other to develop their optimal day-ahead bidding strategies. Because some information in the problem is unobservable, a model-free, data-driven approach known as multi-agent deep deterministic policy gradient (MADDPG) is applied to approximate the Nash equilibrium (NE) of this Markov game. The MADDPG algorithm generalizes well owing to the automatic feature extraction ability of deep neural networks. The algorithm is tested on an IEEE 30-bus system with three competitive GENCO bidders in both an uncongested case and a congested case. Comparisons with a truthful bidding strategy and with state-of-the-art deep reinforcement learning methods, including deep Q network and deep deterministic policy gradient (DDPG), demonstrate that the applied MADDPG algorithm can find a superior bidding strategy for all the market participants with increased profit gains. In addition, a comparison with a conventional model-based method shows that the MADDPG algorithm has higher computational efficiency, making it feasible for real-world applications.

Key words: Bidding strategy; day-ahead electricity market; deep reinforcement learning; Markov game; multi-agent deep deterministic policy gradient (MADDPG); Nash equilibrium (NE).
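
To make the notion of a Nash equilibrium in a bidding game concrete, the following is a minimal sketch of a single-period, uniform-price market with three bidders. It is not the paper's MADDPG algorithm or its IEEE 30-bus model: network constraints are ignored, and the costs, capacities, demand, markup grid, and all function names are hypothetical values chosen only for illustration. The sketch clears the market in merit order and checks whether any bidder can raise its profit by changing its own bid while the others hold theirs fixed.

```python
from itertools import product

# Hypothetical toy data (not from the paper).
COSTS = [20.0, 25.0, 30.0]     # marginal cost of each GENCO, $/MWh
CAPS = [100.0, 100.0, 100.0]   # capacity of each GENCO, MW
DEMAND = 150.0                 # inelastic demand, MW
MARKUPS = [1.0, 1.5, 2.0]      # allowed bids: markup * marginal cost


def clear_market(markups):
    """Merit-order clearing with a uniform price; returns (price, dispatch)."""
    bids = [m * c for m, c in zip(markups, COSTS)]
    # Cheapest bid first; ties are broken by bidder index.
    order = sorted(range(len(bids)), key=lambda i: (bids[i], i))
    dispatch = [0.0] * len(bids)
    remaining = DEMAND
    price = 0.0
    for i in order:
        if remaining <= 0:
            break
        dispatch[i] = min(CAPS[i], remaining)
        remaining -= dispatch[i]
        price = bids[i]  # the last accepted bid sets the price
    return price, dispatch


def profits(markups):
    """Profit of each bidder: (clearing price - marginal cost) * dispatch."""
    price, dispatch = clear_market(markups)
    return [(price - c) * q for c, q in zip(COSTS, dispatch)]


def max_deviation_gain(markups):
    """Largest profit gain any one bidder obtains by deviating alone."""
    base = profits(markups)
    best = 0.0
    for i in range(len(markups)):
        for m in MARKUPS:
            trial = list(markups)
            trial[i] = m
            best = max(best, profits(trial)[i] - base[i])
    return best


def epsilon_nash_profiles(eps=1e-9):
    """All grid profiles where no unilateral deviation gains more than eps."""
    return [p for p in product(MARKUPS, repeat=len(COSTS))
            if max_deviation_gain(p) <= eps]
```

Calling `epsilon_nash_profiles()` enumerates every joint bid on the grid, which is only practical for a game this small. A learning-based method such as the one the abstract describes aims to approximate such a profile without exhaustive enumeration.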