Abstract: Stealthy false data injection attacks (SFDIAs) targeting state estimation can bypass the bad data detection module, mislead operators with false system states, and potentially lead to erroneous decisions and physical damage. While most existing studies focus on single-step SFDIAs, multi-step SFDIAs pose a greater threat because they are forward-looking: each step is strategically planned to amplify the cumulative impact. This paper therefore focuses on multi-step SFDIAs and presents a vulnerability assessment framework that combines a Markov decision process (MDP) with bi-level optimization to quantify system vulnerability to this type of attack. The MDP models the sequential and strategic nature of these attacks, with states reflecting the evolving system conditions that result from prior actions. At each state, the available actions are attack vectors obtained through bi-level optimization that maximize line overloads, potentially triggering the tripping of transmission lines. The MDP is solved using Q-learning, which enables the calculation of a vulnerability index that helps operators assess the impact of multi-step SFDIAs and identify the attacker's most critical action at each step. The effectiveness of the proposed framework is validated through simulations on the IEEE 39-bus test system.
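To make the Q-learning step concrete, the following is a minimal sketch of tabular Q-learning on a toy attack MDP. Everything specific here is an assumption for illustration and is not taken from the paper: the three states, the two actions, the rewards standing in for overload severity, the hyperparameters, and the names `TOY_MDP` and `q_learning`. In the actual framework, the actions at each state would come from the bi-level optimization rather than a fixed table, and the vulnerability index may be defined differently from the simple stand-in used below.

```python
import random

# Hypothetical toy MDP (not from the paper).
# State 0: all lines in service; 1: one line tripped; 2: two lines tripped (absorbing).
# TOY_MDP[state][action] = (next_state, reward), where the reward stands in for
# the overload severity caused by the chosen attack vector.
TOY_MDP = [
    [(0, 0.0), (1, 0.5)],  # state 0: action 0 has no effect, action 1 trips a line
    [(1, 0.0), (2, 1.0)],  # state 1: action 0 has no effect, action 1 trips a second line
    [(2, 0.0), (2, 0.0)],  # state 2: absorbing, no further reward
]


def q_learning(mdp, episodes=2000, horizon=3, alpha=0.1, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration on a deterministic MDP."""
    rng = random.Random(seed)
    n_actions = len(mdp[0])
    q = [[0.0] * n_actions for _ in mdp]
    for _ in range(episodes):
        state = 0  # every attack sequence starts from the intact system
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])  # exploit
            next_state, reward = mdp[state][action]
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


Q = q_learning(TOY_MDP)
# Most critical action per state: the greedy action under the learned Q-table.
critical_actions = [max(range(len(row)), key=lambda a: row[a]) for row in Q]
# Illustrative stand-in for a vulnerability index: best achievable
# discounted return from the initial state.
vulnerability_index = max(Q[0])
```

In this sketch, the greedy action at each state plays the role of the attacker's most critical action at that step, and the largest Q-value at the initial state summarizes the cumulative impact an attacker could achieve over the whole sequence.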