Improved Proximal Policy Optimization Algorithm for Sequential Security-constrained Optimal Power Flow Based on Expert Knowledge and Safety Layer

doi:10.35833/MPCE.2023.000232

Volume 12, Issue 3, 2024, pp. 742-753

Author:

Affiliation:

1. State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources and School of Electrical & Electronic Engineering, North China Electric Power University, Beijing 102206, China
2. School of Engineering, Xining University, Xining 810008, China
3. Key Laboratory of Smart Operation of New Energy Power System, Ministry of Education, Qinghai University, Xining 810016, China
4. China Electric Power Research Institute, Nanjing 210003, China
5. National University of Sciences and Technology, Islamabad 44000, Pakistan

Fund Project:

This work was supported in part by the National Natural Science Foundation of China (No. 52077076) and in part by the National Key R&D Plan (No. 2021YFB2601502).

Abstract:

In recent years, reinforcement learning (RL) has emerged as a solution for model-free dynamic programming problems that cannot be effectively solved by traditional optimization methods. Owing to its strong self-learning and self-optimizing capabilities, it has gradually been applied in fields such as the economic dispatch of power systems. However, existing RL-based economic dispatch methods ignore the security risks that the agent may introduce during exploration, so the agent may issue instructions that threaten the safe operation of the power system. Therefore, we propose an improved proximal policy optimization algorithm for sequential security-constrained optimal power flow (SCOPF), based on expert knowledge and a safety layer, to determine the active power dispatch strategy, the voltage optimization scheme of the units, and the charging/discharging dispatch of energy storage systems. Expert experience is introduced to improve the ability to enforce constraints such as power balance during training while guiding the agent to improve the utilization rate of renewable energy. Additionally, to avoid line overload, we add a safety layer at the end of the policy network that introduces transmission constraints, which prevents dangerous actions and addresses the sequential SCOPF problem. Simulation results on an improved IEEE 118-bus system verify the effectiveness of the proposed algorithm.
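
The abstract does not state how the safety layer is formulated. As a minimal sketch of the general idea only, the Python example below assumes that line flows respond linearly to the agent's action through a hypothetical sensitivity matrix `sens`, and corrects a raw action by repeatedly projecting it onto the most violated line-limit half-space. The function name `safety_layer` and all parameters and numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def safety_layer(action, sens, base_flow, flow_limit, max_iter=50, tol=1e-9):
    """Correct `action` so that |base_flow + sens @ action| <= flow_limit.

    Assumed linear flow model (hypothetical, not taken from the paper):
        flow = base_flow + sens @ action
    """
    a = np.asarray(action, dtype=float).copy()
    sens = np.asarray(sens, dtype=float)
    base_flow = np.asarray(base_flow, dtype=float)
    flow_limit = np.asarray(flow_limit, dtype=float)

    # Express both flow directions as half-spaces G @ a <= h.
    G = np.vstack([sens, -sens])
    h = np.concatenate([flow_limit - base_flow, flow_limit + base_flow])

    for _ in range(max_iter):
        violation = G @ a - h
        worst = int(np.argmax(violation))
        if violation[worst] <= tol:
            break  # every line limit is satisfied
        g = G[worst]
        denom = g @ g
        if denom == 0.0:
            break  # the action cannot influence this line
        # Euclidean projection onto the most violated half-space.
        a -= (violation[worst] / denom) * g
    return a

# Toy case: one line, two controllable injections (made-up numbers).
sens = np.array([[0.5, -0.5]])
raw_action = np.array([3.0, -1.0])   # gives a flow of 2.0, above the limit 1.0
safe_action = safety_layer(raw_action, sens, base_flow=[0.0], flow_limit=[1.0])
```

With a single violated constraint this correction is the smallest Euclidean change that restores feasibility. With several interacting constraints the loop is only a heuristic, and a quadratic-programming projection would be the more rigorous choice.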

History

Received: April 14, 2023
Revised: June 21, 2023
Online: May 20, 2024
