Advancements and Future Directions in the Application of Machine Learning to AC Optimal Power Flow: A Critical Review

Jiang, Bozhen; Wang, Qin; Wu, Shengyu; Wang, Yidi; Lu, Gang

doi:10.3390/en17061381

Open Access Review

by Bozhen Jiang ¹, Qin Wang ¹,*, Shengyu Wu ², Yidi Wang ³ and Gang Lu ²

¹ Department of Electrical and Electronic Engineering, Hong Kong Polytechnic University, Hong Kong SAR, China

² State Grid Energy Research Institute, Beijing 102209, China

³ China Electric Power Research Institute, Beijing 100055, China

* Author to whom correspondence should be addressed.

Energies 2024, 17(6), 1381; https://doi.org/10.3390/en17061381

Submission received: 28 January 2024 / Revised: 27 February 2024 / Accepted: 11 March 2024 / Published: 13 March 2024

(This article belongs to the Special Issue Advances in Simulations and Analysis of Electrical Power Systems: Enhancing Efficiency, Reliability and Sustainability)

Abstract:

Optimal power flow (OPF) is a crucial tool in the operation and planning of modern power systems. However, as power system optimization shifts towards larger-scale frameworks, and with the growing integration of distributed generation, the computational time and memory required to solve alternating current (AC) OPF problems can increase exponentially with system size, posing computational challenges. In recent years, machine learning (ML) has demonstrated notable advantages in efficient computation and has been extensively applied to tackle OPF challenges. This paper presents five commonly employed OPF transformation techniques that leverage ML, offering a critical overview of the latest applications of advanced ML in solving OPF problems. The future directions in the application of machine learning to AC OPF are also discussed.

Keywords:

optimal power flow; machine learning; artificial neural network; active set; reinforcement learning; optimization method

1. Introduction

Over the past few decades, the optimal power flow (OPF) concept has garnered significant attention in the field of power system operation [1]. First introduced by Carpentier [2], OPF is widely acknowledged as a vital tool for efficiently planning and enhancing the operation of electric power systems. Its primary objective is to determine the optimal or most secure operating point, expressed through the control variables, while ensuring compliance with system constraints and optimizing specific objective functions. However, several challenging developments have emerged in recent years, such as the advent of smart grid (SG) technologies, microgrid systems, distributed energy resources (DERs), and the integration of large-scale renewable energy sources (RESs). These developments have introduced three distinct characteristics for OPF in the future grid.

Firstly, the optimization of power systems is anticipated to shift towards a large-scale framework. Simultaneously, the advanced metering infrastructure (AMI) and two-way communication systems have placed higher demands on real-time performance for power system optimization tasks. The efficiency of conventional optimization tools has demonstrated limitations in meeting these requirements [3].

Secondly, the widespread integration of distributed generations (DGs) such as RES, energy storage systems (ESSs), and fuel cells in power systems has introduced significant security constraints that impede the accuracy of conventional optimization tools. For instance, the Midcontinent Independent System Operator (MISO) oversees one of the world’s largest electricity markets, encompassing over 45,000 buses and 1400 generation resources. Its OPF model encompasses extensive resource-level constraints and system-wide constraints [4]. To simplify the large-scale OPF, MISO has implemented Lagrangian Relaxation (LR), decomposing the problem into smaller subproblems to enhance efficiency [5,6]. With notable advancements in mixed-integer programming (MIP) solvers, MISO has transitioned from LR to MIP in its market clearing engines [7]. The complete OPF model comprises tens of thousands of buses, transmission lines, and thousands of post-contingency scenarios, making it impossible to solve within the market clearing time constraints. Consequently, MISO resorts to solving a reduced OPF model for its daily operations. Furthermore, Chen [8] has highlighted that MIP solvers may not quickly solve the problem, even with an excellent initial solution, as determining optimality for existing commitment solutions can be time-consuming. Hence, there is an urgent need for faster and more accurate OPF solution methods.

Thirdly, OPF problems are typically solved with sampling intervals of a few hours or minutes. Consequently, OPF solutions encompass a multi-period framework that incorporates inter-temporal constraints, such as ramping up and down, minimum on/off time, and others, to ensure a smooth and realistic operational trajectory over time [9,10]. Thus, the inclusion of time-coupling constraints in large-scale problems results in complex, multi-period, mixed-integer programming optimization problems [11,12]. Additionally, RESs exhibit distinct characteristics compared to conventional generating resources, characterized by high stochasticity and intermittency in their production output [13]. The incorporation of RESs into power systems brings about considerable uncertainty in short-term and real-time operations due to its deep integration. Consequently, it is necessary to formulate OPF problems as dynamic/stochastic problems rather than static problems. However, traditional optimization methods have limited efficiency when addressing these multi-period dynamic/stochastic OPF problems [14].

Several classical (deterministic) and more recent (nondeterministic) heuristic optimization techniques have been proposed to solve the OPF problem. Classical methods include Newton’s method, network flow programming, linear programming, nonlinear programming, quadratic programming, and the interior point method [15,16,17,18]. However, the nonlinearity of OPF problems can cause classical methods to converge to local optima. As a result, metaheuristic optimization techniques are widely employed to address OPF problems. These optimization techniques can be categorized based on their inspirations, such as nature-swarm-inspired methods, human-inspired algorithms, evolutionary-inspired algorithms, physics-inspired algorithms, and artificial neural networks (ANNs).

The advancements in deep architectures of ANNs, particularly in representation learning and data handling capabilities, have contributed to significant progress in various fields, including natural language processing [19,20,21,22], computer vision [23,24,25,26], and power system operation [27,28,29,30,31]. These advancements have enabled the development of data-driven solutions that provide highly accurate numerical solutions in shorter time durations. For example, Nair et al. [32] incorporated machine learning (ML) techniques into two main sub-tasks of an MIP solver: generating a high-quality joint variable assignment and reducing the objective value gap between this assignment and an optimal solution. Their approach was validated using various real-world datasets, most of which contained between $10^{3}$ and $10^{6}$ variables and constraints. By comparing the primal–dual gap reached by SCIP (an MIP solver) within the given time limits, they demonstrated that one of the best learning-augmented SCIP variants improved the gap by a factor of $10^{4}$. Additionally, Xia et al. [33] proposed a bi-projection neural network (BPNN) for solving a class of constrained quadratic optimization problems, which was proven to be globally stable in the sense of Lyapunov. Numerical results showed that BPNN outperformed conventional algorithms in terms of speed. Apart from ANNs, various other ML algorithms are used indirectly in solving the OPF problem, such as support vector machine (SVM) [34,35], random forest [36,37], K-means [38,39], and others.

This article offers a comprehensive review of ML techniques applied to solve AC OPF problems. It introduces how to apply ML into AC OPF problems from five perspectives based on the general mathematical form of AC OPF: the power flow equation, the active constraint of AC OPF, the Markov form of AC OPF, the AC OPF warm-start point prediction, and the optimization process of traditional optimization methods. The techniques are thoroughly compared and discussed, providing researchers with a better understanding of how advanced ML techniques are applied in AC OPF and how they can enhance the efficiency, accuracy, and scalability of AC OPF solutions. Furthermore, in consideration of the future development of power systems, particularly with the emergence of SG, DER, and RES, this review provides an overview of the future development of ML in addressing open problems in AC OPF while acknowledging current limitations, including the need for a hierarchical solution framework to ensure mathematical guarantees, exploring unsupervised learning approaches for enhanced training efficiency, discussing the integration of first principles into the attention mechanism, and utilizing large models.

2. Problem Formulation

The objective of solving the OPF problem is to optimize a specific objective function by making optimal adjustments to power system control variables, while also satisfying several equality and inequality constraints. The mathematical representation of this optimization problem is typically as follows:

$\min F(x, u)$

(1)

subject to

$g_{j}(x, u) = 0, \quad j = 1, 2, \dots, m$

(2)

$h_{j}(x, u) \leq 0, \quad j = 1, 2, \dots, p$

(3)

where $F$ is the objective function, $x$ is the vector of dependent (state) variables, $u$ is the vector of independent (control) variables, $g_{j}$ and $h_{j}$ are the equality and inequality constraints, respectively, and $m$ and $p$ are the numbers of equality and inequality constraints, respectively.

The dependent variables (x) in the power system can be described as follows:

$x = [P_{G,1}, V_{L,1}, \dots, V_{L,NPQ}, Q_{G,1}, \dots, Q_{G,NG}, S_{TL,1}, \dots, S_{TL,NTL}]$

(4)

where $P_{G,1}$ is the slack bus power, $V_{L}$ is the voltage of a load bus, $Q_{G}$ is the generator reactive power output, $S_{TL}$ is the apparent power flow in a transmission line, $NPQ$ is the number of load buses, $NG$ is the number of generation buses, and $NTL$ is the number of transmission lines.

The independent variable u of the power system can be described as follows:

$u = [P_{G,2}, \dots, P_{G,NG}, V_{G,1}, \dots, V_{G,NG}, Q_{C,1}, \dots, Q_{C,NC}, T_{1}, \dots, T_{NT}]$

(5)

where $P_{G}$ is the active power output of a generator, $V_{G}$ is the voltage of a generation bus, $Q_{C}$ is the reactive power injected by a shunt compensator, $T$ is the tap setting of a transformer, $NC$ is the number of shunt compensator units, and $NT$ is the number of transformers.

2.1. Objective Functions

2.1.1. Quadratic Fuel Cost

The objective function is the total generation fuel cost, modeled as a quadratic function of each generator’s active power output:

$F = \sum_{i=1}^{NG} F_{i}(P_{Gi}) = \sum_{i=1}^{NG} (a_{i} + b_{i} P_{Gi} + c_{i} P_{Gi}^{2})$

(6)

where $F_{i}$ is the fuel cost of the $i$th generator, and $a_{i}$, $b_{i}$, and $c_{i}$ are the cost coefficients of the $i$th generator.
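As a small illustration of the quadratic fuel cost in (6), the following sketch evaluates the total cost for a given dispatch. The function name `total_fuel_cost` and the two-generator coefficients are hypothetical and chosen only for the example; they are not taken from the reviewed works.

```python
def total_fuel_cost(p_g, a, b, c):
    """Quadratic fuel cost of Eq. (6): sum_i (a_i + b_i*P_Gi + c_i*P_Gi**2)."""
    if not (len(p_g) == len(a) == len(b) == len(c)):
        raise ValueError("all inputs must have one entry per generator")
    return sum(ai + bi * p + ci * p * p for p, ai, bi, ci in zip(p_g, a, b, c))


# Hypothetical two-generator data (illustrative values only).
p_g = [100.0, 50.0]   # active power outputs
a = [10.0, 20.0]      # constant cost coefficients
b = [2.0, 3.0]        # linear cost coefficients
c = [0.01, 0.02]      # quadratic cost coefficients
cost = total_fuel_cost(p_g, a, b, c)
```

Because the cost is separable across generators, the same function can score any candidate dispatch produced by a solver or by an ML model.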

2.1.2. Real Power Loss Minimization

The objective function is the total active power loss, which is to be minimized and can be formulated as follows:

$P_{loss} = \sum_{i=1}^{NTL} G_{ij} (V_{i}^{2} + V_{j}^{2} - 2 V_{i} V_{j} \cos \delta_{ij})$

(7)

where $G_{ij}$ is the conductance of the transmission line between buses $i$ and $j$, $NTL$ is the number of transmission lines, and $\delta_{ij}$ is the phase difference between the voltages at buses $i$ and $j$.
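The loss expression in (7) can be evaluated line by line, as in the sketch below. It assumes each line is given as a hypothetical tuple `(i, j, g_ij)` of end-bus indices and series conductance, with voltage magnitudes in per unit and angles in radians; the single-line data are made up for illustration.

```python
import math


def real_power_loss(lines, v_mag, v_ang):
    """Sum G_ij * (V_i^2 + V_j^2 - 2*V_i*V_j*cos(delta_ij)) over all lines."""
    loss = 0.0
    for i, j, g_ij in lines:
        delta_ij = v_ang[i] - v_ang[j]  # phase difference between end buses
        loss += g_ij * (
            v_mag[i] ** 2 + v_mag[j] ** 2
            - 2.0 * v_mag[i] * v_mag[j] * math.cos(delta_ij)
        )
    return loss


# Hypothetical single line between bus 0 and bus 1.
lines = [(0, 1, 2.0)]
loss = real_power_loss(lines, v_mag=[1.0, 1.0], v_ang=[0.0, 0.1])
```

With identical voltage magnitudes and angles at both ends, the expression evaluates to zero, which is a quick sanity check on any implementation.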

2.2. Constraints

The transmission system has several constraints which can be categorized as follows:

2.2.1. Equality Constraints

The equality constraints are the load flow (power balance) equations:

$P_{Gi} - P_{Di} = |V_{i}| \sum_{j=1}^{NB} |V_{j}| (G_{ij} \cos \delta_{ij} + B_{ij} \sin \delta_{ij})$

(8)

$Q_{Gi} - Q_{Di} = |V_{i}| \sum_{j=1}^{NB} |V_{j}| (G_{ij} \sin \delta_{ij} - B_{ij} \cos \delta_{ij})$

(9)

where $P_{Gi}$ and $Q_{Gi}$ are the generated active and reactive power at bus $i$, respectively; $P_{Di}$ and $Q_{Di}$ are the active and reactive load demand at bus $i$, respectively; $NB$ is the number of buses; and $G_{ij}$ and $B_{ij}$ are the conductance and susceptance between bus $i$ and bus $j$, respectively.
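To make the power balance constraints concrete, the sketch below computes the bus injections in the standard polar form, where the active power uses $G_{ij}\cos\delta_{ij} + B_{ij}\sin\delta_{ij}$ and the reactive power uses $G_{ij}\sin\delta_{ij} - B_{ij}\cos\delta_{ij}$. The function name and the two-bus admittance data are assumptions made for this example only.

```python
import math


def power_injections(v_mag, v_ang, G, B):
    """Return (P, Q) injected at every bus for the given voltages."""
    n = len(v_mag)
    p = [0.0] * n
    q = [0.0] * n
    for i in range(n):
        for j in range(n):
            d = v_ang[i] - v_ang[j]
            vv = v_mag[i] * v_mag[j]
            p[i] += vv * (G[i][j] * math.cos(d) + B[i][j] * math.sin(d))
            q[i] += vv * (G[i][j] * math.sin(d) - B[i][j] * math.cos(d))
    return p, q


# Hypothetical two-bus system with one line of series admittance 1 - j5 p.u.
G = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-5.0, 5.0], [5.0, -5.0]]
p_flat, q_flat = power_injections([1.0, 1.0], [0.0, 0.0], G, B)
p, q = power_injections([1.0, 1.0], [0.0, -0.1], G, B)
```

The equality constraints hold when `p[i]` equals $P_{Gi} - P_{Di}$ and `q[i]` equals $Q_{Gi} - Q_{Di}$ at every bus. In this lossy two-bus case, the sum of the active injections equals the line loss.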

2.2.2. Inequality Constraints

The inequality constraints can be classified as follows:

Generator active power output

$P_{Gi}^{\min} \leq P_{Gi} \leq P_{Gi}^{\max}, \quad i = 1, 2, \dots, NG$

(10)

Generator bus voltage

$V_{Gi}^{\min} \leq V_{Gi} \leq V_{Gi}^{\max}, \quad i = 1, 2, \dots, NG$

(11)

Generator reactive power output

$Q_{Gi}^{\min} \leq Q_{Gi} \leq Q_{Gi}^{\max}, \quad i = 1, 2, \dots, NG$

(12)

Transformer tap settings

$T_{i}^{\min} \leq T_{i} \leq T_{i}^{\max}, \quad i = 1, 2, \dots, NT$

(13)

Shunt VAR compensator

$Q_{Ci}^{\min} \leq Q_{Ci} \leq Q_{Ci}^{\max}, \quad i = 1, 2, \dots, NC$

(14)

Apparent power flow in transmission lines

$S_{TL,i} \leq S_{TL,i}^{\max}, \quad i = 1, 2, \dots, NTL$

(15)

Voltage magnitude of load buses

$V_{Li}^{\min} \leq V_{Li} \leq V_{Li}^{\max}, \quad i = 1, 2, \dots, NPQ$

(16)
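All of the limits in (10) to (16) are simple bounds, so a single helper can report which entries of a candidate solution violate them. The sketch below is one possible way to do this; the function name, the tolerance, and the voltage values are hypothetical.

```python
def bound_violations(values, lower, upper, tol=1e-6):
    """Return {index: violation amount} for entries outside [lower, upper]."""
    violations = {}
    for i, (v, lo, hi) in enumerate(zip(values, lower, upper)):
        if v < lo - tol:
            violations[i] = lo - v      # below the lower limit
        elif v > hi + tol:
            violations[i] = v - hi      # above the upper limit
    return violations


# Hypothetical load-bus voltages (p.u.) checked against limits as in Eq. (16).
v_load = [1.02, 1.07, 0.93]
violated = bound_violations(v_load, lower=[0.95] * 3, upper=[1.05] * 3)
```

A check of this kind is useful after any ML prediction, since a learned model does not by itself guarantee that the predicted operating point respects the limits.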

3. Transformation of AC OPF Formulation for Machine Learning

Applying ML to the OPF formulation in Section 2 usually requires transforming the problem into a form that a learning algorithm can handle. Five transformations are commonly used:

Transform form 1: Direct utilization of the equality constraints according to (8). In this approach, Baker [40,41,42] showed that, given $P_{Di}$, $Q_{Di}$, $G_{ij}$, and $B_{ij}$, the values of $P_{Gi}$ and $V_{i}$ can be obtained according to (8). Therefore, the general process involves using traditional optimization algorithms to obtain a large number of pairs $[(P_{Di}, Q_{Di}), (P_{Gi}, V_{i})]$. Subsequently, an ANN is constructed to fit those pairs, as illustrated in Figure 1.
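The workflow of transform form 1 (generate input and solution pairs with a conventional method, then fit a model to them) can be sketched on a deliberately simple case. The example below makes two loud assumptions: a closed-form, lossless two-generator economic dispatch without limits stands in for the AC OPF solver, and a one-dimensional least-squares line stands in for the ANN. It is not the method of the cited works, only an illustration of the data flow.

```python
def optimal_dispatch(demand, b, c):
    """Toy 'solver': equal incremental cost, b1 + 2*c1*P1 = b2 + 2*c2*P2,
    together with the balance P1 + P2 = demand (no generator limits)."""
    p1 = (b[1] - b[0] + 2.0 * c[1] * demand) / (2.0 * (c[0] + c[1]))
    return p1, demand - p1


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical cost coefficients; training pairs come from the toy solver.
b, c = [2.0, 3.0], [0.01, 0.02]
demands = [100.0 + 10.0 * k for k in range(11)]
targets = [optimal_dispatch(d, b, c)[0] for d in demands]
slope, intercept = fit_line(demands, targets)

# Predict the first generator's output for an unseen demand level.
predicted_p1 = slope * 155.0 + intercept
```

In this toy case the optimal dispatch is linear in the demand, so the fitted line reproduces it. The AC OPF mapping is nonlinear, which is why the reviewed works use ANNs instead.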

Transform form 2: Formulating OPF problems into a Markov decision process (MDP). The OPF problem can be described as an MDP with four components $(S, A, P, R)$. Here, $S$ represents the state space, where $s_{t} \in S$ denotes the state at time $t$. The state includes variables such as $P_{Di}$, $Q_{Di}$, $G_{ij}$, and $B_{ij}$. $A$ represents the action space, where $a_{t} \in A$ represents the action at time $t$. The actions include variables like $P_{Gi}$ and $V_{i}$. $P(s_{t+1} | s_{t}, a_{t})$ represents the state transition function, which indicates the probability of transitioning from state $s_{t}$ to $s_{t+1}$ when action $a_{t}$ is taken. It corresponds to the simulation model of the power grid. $R$ represents the reward function, where $r_{t} = R(s_{t}, a_{t}, s_{t+1})$ represents the immediate reward obtained when action $a_{t}$ is taken in state $s_{t}$. The reward consists of objective functions, inequality constraints, and other factors. Therefore, the general process involves modeling the power grid and developing the power grid environment. Then, an ANN-based agent is designed to interact with the power grid environment, capturing the state, action, and reward triad $(s_{t}, a_{t}, r_{t})$. Finally, the agent is trained based on the rewards $r_{t}$ to update the agent parameters, as illustrated in Figure 2.
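A minimal sketch of such an environment is given below. Everything in it is an assumption made for illustration: the class name `ToyDispatchEnv`, the single-bus model, the uniformly sampled demand used as the state, and the penalty weight. A real study would replace the `step` logic with a power flow simulation of the grid.

```python
import random


class ToyDispatchEnv:
    """Hypothetical single-bus dispatch environment (not a real grid simulator)."""

    def __init__(self, b, c, p_min, p_max, penalty=100.0, seed=0):
        self.b, self.c = b, c
        self.p_min, self.p_max = p_min, p_max
        self.penalty = penalty
        self.rng = random.Random(seed)
        self.demand = 0.0

    def _sample_demand(self):
        # Assumption: demand is exogenous and drawn uniformly at each step.
        self.demand = self.rng.uniform(100.0, 200.0)
        return (self.demand,)

    def reset(self):
        return self._sample_demand()

    def step(self, action):
        # Reward = -(fuel cost + penalties for imbalance and limit violations).
        cost = sum(bi * p + ci * p * p for p, bi, ci in zip(action, self.b, self.c))
        imbalance = abs(sum(action) - self.demand)
        limit_violation = sum(
            max(0.0, lo - p) + max(0.0, p - hi)
            for p, lo, hi in zip(action, self.p_min, self.p_max)
        )
        reward = -(cost + self.penalty * (imbalance + limit_violation))
        next_state = self._sample_demand()
        return next_state, reward


env = ToyDispatchEnv(b=[2.0, 3.0], c=[0.01, 0.02],
                     p_min=[0.0, 0.0], p_max=[200.0, 200.0])
state = env.reset()
action = [state[0] / 2.0, state[0] / 2.0]  # stand-in for the agent's policy
next_state, reward = env.step(action)
```

The agent would replace the hand-written `action` with the output of its policy network and use the returned reward to update its parameters.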

Transform form 3: Predicting the active constraints in the OPF problem. Recognizing the active and inactive sets before solving the optimization problem can greatly reduce the problem’s complexity. This approach involves two stages: learning and predicting. In the learning stage, sampling is performed to obtain the possible active sets, and the optimal active set is identified from these sets. Subsequently, a classifier is trained to determine the optimality of an active set. In the predicting stage, the classifier is used to evaluate the optimality of an active set, as depicted in Figure 3.
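The learning stage relies on labeling each sampled solution with its active set. The sketch below shows one way to extract such labels for bound constraints and to count how often each active set occurs. The function name, tolerance, and the four sample solutions are hypothetical, and the classifier itself is omitted.

```python
from collections import Counter


def active_set(values, lower, upper, tol=1e-6):
    """Return the bounds that are binding, as a frozenset of (index, side)."""
    active = set()
    for i, (v, lo, hi) in enumerate(zip(values, lower, upper)):
        if abs(v - lo) <= tol:
            active.add((i, "lower"))
        if abs(v - hi) <= tol:
            active.add((i, "upper"))
    return frozenset(active)


# Hypothetical optimal generator outputs for four sampled load levels.
lower, upper = [0.0, 0.0], [100.0, 80.0]
solutions = [[60.0, 30.0], [100.0, 55.0], [100.0, 80.0], [100.0, 60.0]]

# Each distinct active set is a candidate class label for the classifier.
counts = Counter(active_set(s, lower, upper) for s in solutions)
most_common_set, frequency = counts.most_common(1)[0]
```

In practice only a few active sets tend to recur across samples, which is what makes classification over active sets tractable.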

Transform form 4: Predicting the warm-start point of OPF. A better warm-start point can improve convergence and computational speed, although the benefits of a warm start are solver dependent [43]. This approach involves generating a large dataset using traditional optimization algorithms and forming pairs of system states and optimal solutions. ML algorithms are then employed to learn the mapping between them, and the learned model is used to predict the optimal solution in practical scenarios. In this approach, however, the predicted solution serves only as the initial point for the OPF optimization problem.
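The effect of a warm start can be illustrated on a toy problem. The sketch below solves the single-line power transfer equation $b \sin\delta = P$ for the angle $\delta$ with Newton's method, once from a flat start and once from a point close to the solution, which stands in for an ML prediction. The function name, the numbers, and the choice of equation are assumptions for illustration; real OPF solvers may respond differently to warm starts.

```python
import math


def newton_angle(p_target, b, delta0, tol=1e-10, max_iter=50):
    """Solve b*sin(delta) = p_target by Newton's method.

    Returns the angle and the number of iterations used."""
    delta = delta0
    for k in range(1, max_iter + 1):
        residual = b * math.sin(delta) - p_target
        step = residual / (b * math.cos(delta))
        delta -= step
        if abs(step) < tol:
            return delta, k
    raise RuntimeError("Newton iteration did not converge")


# Flat start (delta = 0) versus a hypothetical predicted starting point.
cold_delta, cold_iters = newton_angle(p_target=0.8, b=1.0, delta0=0.0)
warm_delta, warm_iters = newton_angle(p_target=0.8, b=1.0, delta0=0.92)
```

Both runs reach the same angle, but the run that starts near the solution needs fewer iterations, which is the benefit that warm-start prediction aims for.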

Transform form 5: Learning the solving process. Baker [41] highlighted that several quasi-Newton methods have been developed to find solutions to optimality conditions by utilizing an approximate Jacobian matrix. First, the solver records the pairs of successive iterates $[x_{k}, x_{k+1}]$ at each step. Since $x_{k+1} = x_{k} - \alpha J^{-1}(x_{k}) d(x_{k})$, Baker [41] proposed applying an ANN to fit $F_{R}(x_{k})$, which represents $x_{k} - \alpha J^{-1}(x_{k}) d(x_{k})$, where $d(\cdot)$ and $J^{-1}(\cdot)$ are a vector and the inverse Jacobian matrix of the Karush–Kuhn–Tucker (KKT) conditions, respectively. These data pairs are then utilized as input and output for training an ANN. Convergence is considered achieved when the difference $x_{k} - x_{k+1}$ is smaller than a predefined threshold ($\epsilon$). In the prediction process, the ANN takes the place of the solver update to perform fast iterations until convergence, followed by solving the power flow to restore feasibility, as illustrated in Figure 4. Some research further incorporates gradients, including the Jacobian matrix, to guide ANN training [44,45,46].
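The iteration structure of transform form 5 can be sketched as a loop around a pluggable update function. In the sketch below, the exact Newton update for a toy scalar condition $d(x) = x^{2} - 2$ stands in for the trained network $F_{R}$; the function names and the toy condition are assumptions, and no ANN is trained here. The loop also records the pairs $(x_{k}, x_{k+1})$ that would serve as training data.

```python
def newton_step(x, alpha=1.0):
    """x_{k+1} = x_k - alpha * J^{-1}(x_k) * d(x_k) for the toy d(x) = x**2 - 2."""
    d = x * x - 2.0
    jacobian = 2.0 * x
    return x - alpha * d / jacobian


def iterate(step_fn, x0, eps=1e-8, max_iter=100):
    """Apply step_fn until |x_k - x_{k+1}| < eps; also return the recorded pairs."""
    pairs = []
    x = x0
    for _ in range(max_iter):
        x_next = step_fn(x)
        pairs.append((x, x_next))  # training pair for a surrogate of step_fn
        if abs(x - x_next) < eps:
            return x_next, pairs
        x = x_next
    raise RuntimeError("iteration did not converge")


# The exact step is used here; a trained surrogate would be passed instead.
solution, recorded_pairs = iterate(newton_step, x0=3.0)
```

Because `iterate` only needs a callable, swapping in a learned approximation of the update leaves the stopping rule and the rest of the loop unchanged.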

4. Machine Learning Applications in OPF

Based on the aforementioned understanding, existing applications in OPF can be categorized into the direct mapping of OPF variables, prediction of active constraints, learning control policy for OPF, predicting warm-start points, and learning solving processes, which are summarized in Table 1.

4.1. Direct Mapping of OPF Variables

Zamzam and Baker [47] learned a mapping between the system loading and the optimal generation values, enabling the ANN to find near-optimal and feasible AC OPF solutions. This bypasses solving the nonconvex AC OPF problem directly, resulting in a significant decrease in computational burden for grid operators. However, such applications of ML techniques to OPF face several challenges, including the high demand for quality and quantity of training data, the generation of physically infeasible solutions, and the limited generalizability and interpretability of ML models. Zhou et al. [48] embedded the discrete topology representation into the continuous admittance space and trained a deep neural network (DNN) to learn the mapping from load and admittance to the corresponding OPF solution. They then employed the trained DNN to solve AC OPF problems over any power network with the same bus, generation, and line capacity configurations but different topology and/or line admittances. This method has two advantages. First, embedding the line topology as prior information in the neural network can greatly improve its interpretability and reduce, to a certain extent, its dependence on a large amount of training data. Second, the model generalizes well when the topology changes.

Nellikkath et al. [30] followed a similar idea by incorporating physics-based rules or laws into the ANN to overcome the above obstacles. This method is named the physics-informed neural network (PINN), and Huang and Wang [49] provided a detailed overview of the application of PINN in power systems. Specifically, Nellikkath et al. introduced the physical equations, in the form of the AC OPF KKT conditions, into the neural network training. As a result, the neural network depends less on the size and quality of the training dataset, and it can determine its optimal parameters based on the actual equations that it aims to emulate. Falconer et al. [50] embedded physical information into a neural network from another perspective. They pointed out that most work in this area employs fully connected neural networks (FCNNs), whereas graph neural networks (GNNs) are better suited to embedding the topological information of the power grid into the neural network architecture. Thus, Falconer et al. employed GNNs to model the transmission lines, buses, nodes, and so on, and the result outperformed both FCNN and convolutional neural network models. Additionally, they demonstrated the marginal utility of applying GNN architectures compared to FCNN for a fixed grid topology and found that GNN models can straightforwardly take changes in topological information into account.

To simplify the learning process, Lei et al. [51] introduced the Lagrange multipliers as intermediate variables to assist the stacked extreme learning machine (SELM) in solving the OPF problem. The Lagrange multipliers capture the allocation of energy at the optimal operating points, which is closely related to the AC OPF solutions. Hence, using them makes learning the AC OPF solution significantly more efficient.

With the significant increase in new energy units, the OPF problem must also decide the output of these units. Unlike traditional thermal power units, new energy units are not subject to ramp-rate constraints; they are limited only by the maximum power available at the current time. Christian et al. [52] incorporated new energy units into the OPF problem by learning a mapping from the nodal loads and photovoltaic active powers to the optimal photovoltaic reactive powers. They first solved a standard AC OPF and then replaced the AC OPF with a less computationally expensive ANN to perform centralized control of reactive power in photovoltaic systems. Further, they utilized Shapley additive explanations, an explainability technique, to provide further insights into the behavior of the centralized ANN controller.

An ANN has many parameters that need to be adjusted. Because the parameter count, the complexity of the optimization problem, and the dataset size must all be matched to one another, the parameters often need to be adjusted repeatedly. Compared with deep learning algorithms, SELM offers a rapid training speed and eliminates the need for time-consuming parameter tuning. Lei et al. [29] introduced a data-driven approach for OPF utilizing the SELM framework. Nonetheless, Lei et al. also highlighted that employing SELM directly for OPF poses challenges due to the intricate correlation between the system’s operational state and OPF solutions. They followed an idea similar to that in [51] and further decomposed the OPF model features into three stages by introducing internal variables, which greatly simplified the training process.

Although the above methods have great advantages in terms of efficiency, they often produce infeasible solutions. Lotfi et al. [34] first classified AC OPF problems as feasible or infeasible according to the load configuration, with the feasible and infeasible labels obtained from a traditional optimization method. The classification results then let system operators check the feasibility of the current load configuration before feeding it to an ANN that predicts system variables such as generator voltages and active power outputs. Finally, another DNN is used to predict the desired output of the OPF problem for each feasible pair of active and reactive power demand points. Unlike the work in [34], Minas et al. [53] adopted a divide-and-conquer approach: they decomposed the electric network, first learned the currents and voltages in the coupling area, and then predicted each decomposed sub-network. They pointed out that the predictions result in only minor violations, and only of the operational bound constraints. Xiang et al. [54] emphasized the importance of inequality constraints, arguing that ML-based OPF solutions should satisfy them. Therefore, they employed a penalty approach with a zero-order gradient estimation technique in the training process to help guarantee the inequality constraints.

Direct current (DC) OPF is an extensive simplification of the OPF problem and is essentially a convex problem; nevertheless, generalization performance cannot be guaranteed when a neural network is trained to solve DC OPF. Ling et al. [55] utilized the convexity of the DC OPF problem and trained an input convex neural network. Further, in [30], they constructed the training loss based on Karush–Kuhn–Tucker optimality conditions. By combining these two techniques, the trained model has provable generalization properties. The aforementioned methods primarily assume a fixed system topology, implying that modifications to the system’s structure (initiated by the system operator) necessitate retraining the DNN. This process incurs substantial training overhead and demands a substantial amount of training data specific to the new system topology. Chen et al. [56] employed meta-learning for DNN-based OPF predictor training, which can find a common initialization vector that enables fast training for any system topology.

4.2. Predicting Active Constraints

Solving the OPF problem directly using ML methods poses challenges, leading researchers to explore simplified versions of the original OPF to alleviate solver complexity. One area of current focus is predicting active constraints, which involves identifying and eliminating non-active constraints beforehand, thereby simplifying the optimization problem. Misra et al. [57,58] emphasize that utilizing active sets as features preserves the underlying system physics, enables interpretable models, considers important safety constraints, and facilitates straightforward representation and encoding. Additionally, it is possible to obtain the optimal solution by solving the reduced OPF problem for each important active set and checking the resulting solutions for feasibility.

Unlike the work in [34], which classifies AC OPF problems as feasible or infeasible, Baker and Bernstein [40] eliminated zero-probability events, such as inactive constraints, to transform the joint chance constraints into deterministic constraints. Specifically, they used the forecasted load and solar generation at each node as the features of a support vector classifier (SVC) that classifies overvoltage constraints at those nodes. Yeesian et al. [59] provided an affine control policy. By utilizing the same set of active constraints, they proposed an ensemble control policy that combines several basis policies to improve performance. Although the number of potential bases grows exponentially with the system’s size, their research revealed that only a small subset of these bases is pertinent to system operation. However, this method can only be applied to DC OPF because it relies on convexity. Misra et al. [58] classified the active and inactive constraints and then identified the important active sets, after which an ML-based classifier was constructed and trained. It should be noted that the proposed active margin is a continuous index rather than a binary active or inactive label, which better quantifies how likely each security constraint is to be active. Further, Hasan et al. [60] used a mixture of classification and regression learners to predict active and inactive inequality constraints. Specifically, they first trained a regression learner to perform the direct mapping of OPF variables. Subsequently, the predicted variables were used for constraint classification.

With extensive access to new energy sources, the impact of input uncertainty on the active set becomes significant. Deepjyoti and Sidhant [57] employed the ML-based classifier to learn the mapping between the uncertainty inputs and the active set of constraints at optimality, thus further enhancing the computational efficiency of the real-time prediction. Similarly, Liu et al. [61] used historical data and simulation data about power system conditions (e.g., load fluctuation, temperature, uncertainties of solar and wind generation) and security-constrained optimal power flow (SCOPF) calculation results to prepare input features and output labels.

Zhang et al. [62] discovered that the sets of inactive line flow constraints identified by previously proposed constraint screening methods were actually observed to be active across various scenarios. As a result, they investigated the line characteristics commonly associated with active flow constraints and explored the relationships among sets of concurrently active line flow constraints. Liu et al. [63] proposed active likelihood functions to measure the conditions of transmission line capacity constraints. Then, a DNN model was developed to predict active likelihood functions and then predict an active constraint set.

4.3. Learning Control Policy for OPF

Reinforcement learning focuses on how to make decisions based on the context to maximize the expected benefits [90]. In this framework, a basic reinforcement learning agent interacts with its environment in discrete time steps. At each time step $t$, the reinforcement learning agent receives the current state $S_{t}$ and reward $R_{t}$. It then selects an action $A_{t}$ from the available set of actions and sends it to the environment. The environment executes this action and transitions to a new state $S_{t+1}$, and the associated reward $R_{t+1}$ for the transition $(S_{t}, A_{t}, S_{t+1})$ is determined. The primary goal of the agent is to maximize the anticipated cumulative reward. In the practical context of power system operation, the OPF problem is addressed by utilizing real-time system state detection to determine the most favorable operational actions. These actions are then executed through the corresponding actuators, while the system state is monitored at the subsequent time step. This process bears similarity to the general framework of reinforcement learning.

Woo et al. [64] employed the twin delayed deep deterministic policy gradient approach to improve the computational performance of AC OPF. Specifically, they designed a reward function for the training process that accounts for the cost function, equality constraints, and inequality constraints. If the agent outputs an unreasonable action, the reward is very low or even negative, which discourages the agent from taking this action again. By contrast, if the reward is high, the agent tends to output a similar action the next time. Furthermore, since the action space is continuous and generally very high-dimensional, they also incorporated random exploration into the training process. Additionally, random Gaussian noise was added to the individual net loads to represent the uncertainty introduced by renewable energy sources. Similar work can be seen in [65,66,67]. Wang et al. [68] further proposed a novel human–machine collaborative (HMC) framework for line flow control. Specifically, HMC analyzes the actions produced by a reinforcement learning agent and by a human operator to determine which action the power system should take. Zeng et al. [69] employed RL to devise an adaptive policy for selecting penalty parameters in the AC OPF problem, which is solved using the alternating direction method of multipliers. The objective of their approach is to minimize the number of iterations required for convergence.

In the face of global warming and the ongoing depletion of fossil energy resources, power systems must be operated and developed in ways that minimize carbon emissions. Emissions can be reduced on both the production and consumption sides, for example by using renewable energy alternatives and aggregating distributed resources. Qin et al. [70] presented an approach for solving the multi-objective optimal carbon emission flow problem using reinforcement learning. Their method accounts for both the economic indicators traditionally used in OPF and the reduction of unnecessary carbon emissions during electricity transmission.

Under abnormal grid conditions, a microgrid operates in islanded mode to provide an uninterrupted supply to loads and to improve stability and resilience. Islanded operation depends on the effective operation of the connected distributed renewable energy sources (DRESs). In [71,72,73], deep reinforcement learning agents were employed to provide optimal power dispatch, and accurate control of the connected DRESs enabled the grid to restore service.

The challenges of SCOPF mainly come from the contingency constraints. Conventional supervised learning and reinforcement learning do not directly consider the constraints in the learning process [74]. Recent research shows that, by incorporating a model into the training procedure, model-based RL can outperform model-free methods in domains that require precision. Therefore, combining primal-dual methods with the deep deterministic policy gradient (DDPG) [74,75] to replace the agent cost offers a potential way to address the challenges of SCOPF problems. Inspired by this work, instead of building reward critic networks and cost critic networks by interacting with the environment (i.e., the power flow equations), Yan and Xu [45] proposed deriving the actor gradients by solving the KKT conditions of the Lagrangian. Specifically, using the sparse Jacobians of the constraints and the sparse Hessians of the Lagrangian, the interior point method is incorporated into DDPG to derive the parameter update rule of the DRL agent.
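The primal-dual idea can be illustrated on a toy constrained problem. The sketch below is not the algorithm of [74,75] or [45]; it is a generic primal-dual gradient iteration on a hypothetical one-variable problem (minimize x² subject to x ≥ 1), chosen only to show how a Lagrange multiplier is updated alongside the decision variable. In constrained RL, the policy parameters would take the place of `x` and the expected constraint cost would take the place of the constraint function.

```python
def primal_dual(steps=2000, lr_x=0.1, lr_lam=0.1):
    """Minimize x**2 subject to x >= 1 by primal descent and dual ascent."""
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Lagrangian: L(x, lam) = x**2 + lam * (1 - x), with lam >= 0.
        grad_x = 2.0 * x - lam      # dL/dx
        grad_lam = 1.0 - x          # dL/dlam, i.e., the constraint violation
        x -= lr_x * grad_x          # primal step: descend on L in x
        # Dual step: ascend on L in lam, projected onto lam >= 0.
        lam = max(0.0, lam + lr_lam * grad_lam)
    return x, lam


x_opt, lam_opt = primal_dual()
```

The iterates approach x = 1 and a multiplier of 2, which together satisfy the KKT conditions of this toy problem (stationarity 2x − λ = 0 and an active constraint).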

Even with deep reinforcement learning, the AC OPF problem itself may be infeasible because of the numerous security constraints on the power system. Zhou et al. [76] designed an ancillary classifier to identify the feasibility of the AC OPF problem before invoking the DRL agent. To accelerate training, they also used supervised learning to generate good initial weights for the neural networks, and then applied the proximal policy optimization algorithm to train and test the agents for stable and robust performance. Sayed et al. [77] proposed a convex-constrained soft actor–critic (CC-SAC) deep reinforcement learning algorithm for the AC OPF problem. CC-SAC combines data-driven and physics-driven approaches: the data-driven part shortens the solution time by predicting near-optimal control actions, while the physics-driven part guarantees solution feasibility.
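One simple way to picture a feasibility-restoring step is a projection of the predicted action onto a convex constraint set. The sketch below is a generic stand-in and not the CC-SAC algorithm of [77]: it assumes a lossless system with only generator limits and a power balance constraint, and the function name `project_dispatch` and all numbers are hypothetical.

```python
def project_dispatch(p_pred, p_min, p_max, demand, tol=1e-9):
    """Euclidean projection onto {p_min <= p <= p_max, sum(p) == demand}."""
    if not sum(p_min) <= demand <= sum(p_max):
        raise ValueError("demand cannot be met within generator limits")

    def clipped(shift):
        # Shift every unit by the same amount, then clip to its limits.
        return [min(hi, max(lo, p - shift))
                for p, lo, hi in zip(p_pred, p_min, p_max)]

    # sum(clipped(shift)) is non-increasing in shift, so bisection can
    # locate the shift at which total generation equals demand.
    lo_shift = min(p - hi for p, hi in zip(p_pred, p_max))  # all at p_max
    hi_shift = max(p - lo for p, lo in zip(p_pred, p_min))  # all at p_min
    while hi_shift - lo_shift > tol:
        mid = 0.5 * (lo_shift + hi_shift)
        if sum(clipped(mid)) > demand:
            lo_shift = mid
        else:
            hi_shift = mid
    return clipped(0.5 * (lo_shift + hi_shift))


# Hypothetical prediction that violates a unit limit and the power balance.
p_fixed = project_dispatch([120.0, 30.0, 10.0],
                           [0.0, 0.0, 0.0], [100.0, 50.0, 50.0], 150.0)
```

Here the first unit is pulled back to its upper limit and the remaining units are shifted equally so that total generation matches demand. A full AC OPF feasibility step would also need to handle voltage and line flow constraints, which this sketch omits.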

4.4. Predicting Warm-Start Points

The quality of the starting point greatly influences the result and convergence efficiency of an optimization algorithm, especially for the non-convex, constrained AC OPF problem [78,79,80,81,82]. Predicting warm-start points is similar to directly mapping OPF variables. Baker [83] used OPF data to train a random forest to predict solutions for future AC OPF problems. The inputs (features) to the random forest are simply the loads at each bus, and the outputs are the optimal generation values and the voltage magnitudes.
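The effect of the starting point can be seen even on a single equation. The sketch below assumes a hypothetical two-bus lossless line with susceptance `b`, for which the transferred power is b·sin(θ), and solves for the angle with Newton's method from a flat start and from a start close to the solution. The value 0.52 simply stands in for the output of a trained predictor; no ML model is involved here.

```python
import math


def newton_angle(p_target, b=10.0, theta0=0.0, tol=1e-9, max_iter=50):
    """Solve b*sin(theta) = p_target by Newton's method.

    Returns the angle and the number of Newton updates performed.
    """
    theta = theta0
    for k in range(max_iter):
        residual = b * math.sin(theta) - p_target
        if abs(residual) < tol:
            return theta, k
        # Newton update using the scalar Jacobian b*cos(theta).
        theta -= residual / (b * math.cos(theta))
    raise RuntimeError("Newton iteration did not converge")


# Flat start (theta = 0) versus an assumed predicted warm start.
flat_theta, flat_iters = newton_angle(5.0, theta0=0.0)
warm_theta, warm_iters = newton_angle(5.0, theta0=0.52)
```

Both runs reach the same angle, but the warm start needs fewer Newton updates. In large AC OPF problems each iteration involves a large sparse linear solve, so saved iterations translate directly into saved time.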

Cao et al. [78] pointed out that simply using black-box models lacks interpretability. Therefore, they proposed a fast and explainable warm-start point learning method based on a multi-target binary decision tree with a post-pruning module. After learning, the method produces a set of detailed decision rules for selecting warm-start points. These rules help power system operators identify important loads and thereby make the model interpretable.

4.5. Learning Solution Process

While the ML-based OPF solution methods mentioned above have received considerable attention in terms of solution feasibility and optimality, they still lack theoretical and mathematical guarantees. As a result, numerous studies have focused on the solution process of the optimization method itself, aiming to use ML to reveal gradient information and ensure solution feasibility.

The Jacobian matrix is at the core of power flow analysis, which is the basis for power system planning and operations. Chen et al. [84] introduced a measurement-based approach for calculating the power flow Jacobian matrix, from which they were able to extract insights about the system topology in near real time. Zeng et al. [85] proposed a GPU-based sparse modified Newton’s method that introduces a fixed Jacobian matrix and integrates vectorization and parallelization techniques to accelerate power flow calculations. He et al. [86] estimated the Jacobian matrix in high-dimensional space using the least squares method and an ANN. They found that the ANN-based method is sensitive to up-to-date topology parameters and state variables.
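The idea of estimating a Jacobian entry from data by least squares can be sketched on a scalar example. This is a simplified illustration, not the estimators of [84] or [86]: the two-bus line model, the operating point, and the noise-free synthetic "measurements" are all assumptions made for the example.

```python
import math


def estimate_sensitivity(d_theta, d_p):
    """Least-squares slope of d_p against d_theta (one Jacobian entry)."""
    return (sum(dt * dp for dt, dp in zip(d_theta, d_p))
            / sum(dt * dt for dt in d_theta))


B = 10.0         # assumed line susceptance (p.u.)
THETA_OP = 0.3   # assumed operating angle (rad)


def line_flow(theta):
    """Active power over a lossless line with unit voltages (assumed model)."""
    return B * math.sin(theta)


# Synthetic measurements: small angle deviations around the operating
# point and the flow deviations they produce.
d_theta = [-0.002, -0.001, 0.001, 0.002]
d_p = [line_flow(THETA_OP + dt) - line_flow(THETA_OP) for dt in d_theta]

j_est = estimate_sensitivity(d_theta, d_p)   # data-driven estimate
j_true = B * math.cos(THETA_OP)              # analytic dP/dtheta for comparison
```

For a full network, the same least-squares fit is carried out with vectors of angle and voltage deviations, which yields one row of the Jacobian per measured injection. Measurement noise and nearly collinear deviations then become the main practical concerns.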

Calculating the Jacobian and its inverse is computationally expensive, which may make it unsuitable for large problems or fast-timescale optimization. Baker [41,42] proposed a quasi-Newton method in which an ANN with a proper choice of weights and activation functions approximates the Jacobian calculation and guarantees convergence. Radial basis function ANNs were also used to solve the nonlinear load flow equations without calculating partial derivatives or an inverse Jacobian matrix [87,88]. Veerasamy et al. [89] proposed a generalized linear Hopfield neural network-based power flow analysis technique that uses the Moore–Penrose inverse (MPI) to solve the nonlinear power flow equations (PFEs). A Hopfield neural network (HNN) with a linear activation function, augmented by a feed-forward layer, is used to compute the MPI; that is, the inverse of the Jacobian matrix needed to solve the PFEs is obtained by combining a feed-forward network with the feedback network.
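A minimal sketch of an iteration that avoids recomputing and inverting the Jacobian is given below. It uses a fixed approximate inverse (the classical chord method) on a hypothetical scalar power flow equation. In the learning-based methods cited above the approximation would come from a trained model; here the constant `1.0 / B` is only a stand-in, and the problem data are assumptions.

```python
import math


def chord_iteration(residual, inv_jac_approx, x0, tol=1e-10, max_iter=200):
    """Quasi-Newton iteration x <- x - inv_jac_approx * residual(x).

    The approximate inverse Jacobian is held fixed for all iterations.
    """
    x = x0
    for k in range(max_iter):
        r = residual(x)
        if abs(r) < tol:
            return x, k
        x -= inv_jac_approx * r
    raise RuntimeError("iteration did not converge")


B, P_TARGET = 10.0, 5.0   # assumed susceptance and power transfer (p.u.)


def mismatch(theta):
    """Power mismatch of a lossless two-bus line (assumed model)."""
    return B * math.sin(theta) - P_TARGET


# Inverse Jacobian evaluated once at the flat start: 1 / (B * cos(0)).
theta, iters = chord_iteration(mismatch, 1.0 / B, x0=0.0)
```

The fixed approximation trades the quadratic convergence of Newton's method for linear convergence, so more iterations are needed, but each one is cheap because no derivative is evaluated and no matrix is factorized.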

5. Limitations and Path Forward

The successful applications of ML in AC OPF reveal the potential of data-driven solutions from various perspectives. This section aims to provide a comprehensive and general outlook for the future development of ML in addressing open problems in AC OPF, while also highlighting current limitations. The advancement of ML in AC OPF should align with the modern and future development of power systems. It is important to note that Section 4 presents extensive applications of ML in AC OPF, particularly with the emergence of SG, DER, and RES. However, the AC OPF problem has not yet been entirely solved through ML due to the lack of mathematical guarantees. Therefore, a common approach is to establish a hierarchical solution framework. The first layer employs a data-driven scheme to quickly obtain an initial solution, while the second layer utilizes a math-driven scheme to refine the initial solution, ensuring adherence to the constraints and a continuous improvement of solution quality.

Furthermore, a substantial portion of power flow data, including voltage, power, and impedance, is expressed in complex numbers. Incorporating this information into ANNs and devising unsupervised learning approaches are promising ways to improve training efficiency. In addition, both power flow and OPF rely on the power flow equations, so general first principles based on these equations can provide further guidance for PINNs. Apart from linearity, other convexities observed in OPF research can also be integrated into PINNs, thereby expanding their capabilities [49].
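To make these two points concrete, the sketch below evaluates the complex power flow equations for a hypothetical two-bus system, turns the mismatch into a penalty that could serve as a physics-informed loss term, and splits complex quantities into real and imaginary parts for a real-valued network. The function names and the test system are assumptions for illustration and do not come from any of the reviewed papers.

```python
import cmath
import math


def injections(v, ybus):
    """Complex power injections S_i = V_i * conj(sum_k Y_ik * V_k)."""
    n = len(v)
    return [v[i] * sum(ybus[i][k] * v[k] for k in range(n)).conjugate()
            for i in range(n)]


def physics_loss(v, s_spec, ybus):
    """Mean squared power flow mismatch, usable as a physics-based penalty."""
    s_calc = injections(v, ybus)
    return sum(abs(sc - ss) ** 2 for sc, ss in zip(s_calc, s_spec)) / len(v)


def to_real_features(values):
    """Split complex quantities into [re, im] pairs for a real-valued ANN."""
    return [part for z in values for part in (z.real, z.imag)]


# Assumed two-bus system: one lossless line with reactance 0.1 p.u.
y = 1 / 0.1j
ybus = [[y, -y], [-y, y]]
# Unit voltage magnitudes; bus 1 leads bus 2 by asin(0.5) rad.
v = [cmath.rect(1.0, math.asin(0.5)), complex(1.0, 0.0)]
s = injections(v, ybus)
```

The penalty is zero when the predicted voltages reproduce the specified injections and grows as the prediction drifts from the power flow equations, so it can be computed without labeled OPF solutions.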

Moreover, the grid topology is subject to variations caused by various external and internal factors. Therefore, it is crucial to develop robust reinforcement learning agents and leverage meta-learning techniques to enable ML algorithms to adapt quickly to new tasks based on past learning experiences. Meta-learning algorithms leverage common features and patterns across tasks to improve inference and generalization to new tasks. Transfer learning is also highly valued in the deep learning community as it significantly enhances sample efficiency and training efficiency, particularly for problems with a series of similar tasks.

Finally, the attention mechanism, which has shown success in natural language processing and has recently been applied in power systems [91,92], is one of the most popular ML architectures. Integrating first principles into the attention mechanism is an exciting avenue for future research. Rapidly advancing attention-based large models such as GPT-4 [93] can also serve as valuable references for designing large models that address OPF problems.

6. Conclusions

This article provides a comprehensive review of machine learning (ML) techniques applied to solve AC OPF problems. It summarizes and discusses five specific techniques based on the general mathematical form of AC OPF: the power flow equation, the active constraint of AC OPF, the Markov form of AC OPF, the AC OPF warm-start point prediction, and the optimization process of traditional optimization methods. This review aims to assist researchers in quickly understanding the application of advanced ML technology in AC OPF problems and provide a framework for improving ML techniques based on the specific characteristics of AC OPF. By deepening the understanding of AC OPF and the aforementioned techniques, researchers can conduct more efficient and reliable research on AC OPF solution technologies.

Furthermore, considering the future development of power systems and the state-of-the-art ML techniques, this review explores potential research directions for ML in AC OPF. These include establishing mathematical guarantees, improving training efficiency, integrating first principles, and utilizing large models. These advancements have the potential to significantly enhance the speed and reliability of AC OPF solutions, supporting the safe and economically stable operation of future power systems.

Author Contributions

Formal analysis, B.J., Q.W., S.W., Y.W. and G.L.; investigation, Q.W.; writing—original draft preparation, B.J., Q.W., S.W., Y.W. and G.L.; writing—review and editing, B.J., Q.W., S.W., Y.W. and G.L.; visualization, B.J. and Q.W.; supervision, Q.W.; project administration, Q.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The Hong Kong Polytechnic University grant number P0047690.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Authors Shengyu Wu and Gang Lu were employed by the company State Grid Energy Research Institute. Author Yidi Wang was employed by the company China Electric Power Research Institute. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Zobaa, A.F.; Aleem, S.A.; Abdelaziz, A.Y. Classical and Recent Aspects of Power System Optimization; Academic Press: Cambridge, MA, USA, 2018.
2. Carpentier, J. Contribution a l’etude du dispatching economique. Bull. Soc. Fr. Elec. Ser. 1962, 3, 431.
3. Capitanescu, F. Critical review of recent advances and further developments needed in AC optimal power flow. Electr. Power Syst. Res. 2016, 136, 57–68.
4. Chen, Y.; Pan, F.; Qiu, F.; Xavier, A.S.; Zheng, T.; Marwali, M.; Knueven, B.; Guan, Y.; Luh, P.B.; Wu, L.; et al. Security-constrained unit commitment for electricity market: Modeling, solution methods, and future challenges. IEEE Trans. Power Syst. 2022, 38, 4668–4681.
5. Conejo, A.J.; Aguado, J.A. Multi-area coordinated decentralized DC optimal power flow. IEEE Trans. Power Syst. 1998, 13, 1272–1278.
6. Guan, X.; Zhai, Q.; Papalexopoulos, A. Optimization based methods for unit commitment: Lagrangian relaxation versus general mixed integer programming. In Proceedings of the 2003 IEEE Power Engineering Society General Meeting (IEEE Cat. No. 03CH37491), Toronto, ON, Canada, 13–17 July 2003; Volume 2, pp. 1095–1100.
7. Bixby, R.; Rothberg, E. Progress in computational mixed integer programming—A look back from the other side of the tipping point. Ann. Oper. Res. 2007, 149, 37.
8. Chen, Y.; Casto, A.; Wang, F.; Wang, Q.; Wang, X.; Wan, J. Improving large scale day-ahead security constrained unit commitment performance. IEEE Trans. Power Syst. 2016, 31, 4732–4743.
9. Wang, Q.; Yang, A.; Wen, F.; Li, J. Risk-based security-constrained economic dispatch in power systems. J. Mod. Power Syst. Clean Energy 2013, 1, 142–149.
10. Wang, Q.; Hodge, B.M. Enhancing Power System Operational Flexibility with Flexible Ramping Products: A Review. IEEE Trans. Ind. Inform. 2017, 13, 1652–1664.
11. Lu, X.; Chan, K.W.; Xia, S.; Zhou, B.; Luo, X. Security-constrained multiperiod economic dispatch with renewable energy utilizing distributionally robust optimization. IEEE Trans. Sustain. Energy 2018, 10, 768–779.
12. Flores-Quiroz, A.; Strunz, K. A distributed computing framework for multi-stage stochastic planning of renewable power systems with energy storage as flexibility option. Appl. Energy 2021, 291, 116736.
13. Li, W.; Wang, Q. Stochastic production simulation for generating capacity reliability evaluation in power systems with high renewable penetration. Energy Convers. Econ. 2020, 1, 210–220.
14. Faulwasser, T.; Engelmann, A.; Mühlpfordt, T.; Hagenmeyer, V. Optimal power flow: An introduction to predictive, distributed and stochastic control challenges. At-Automatisierungstechnik 2018, 66, 573–589.
15. Momoh, J.A.; Adapa, R.; El-Hawary, M. A review of selected optimal power flow literature to 1993. I. Nonlinear and quadratic programming approaches. IEEE Trans. Power Syst. 1999, 14, 96–104.
16. Momoh, J.A.; El-Hawary, M.; Adapa, R. A review of selected optimal power flow literature to 1993. II. Newton, linear programming and interior point methods. IEEE Trans. Power Syst. 1999, 14, 105–111.
17. Wang, Q.; McCalley, J.D.; Zheng, T.; Litvinov, E. Solving corrective risk-based security-constrained optimal power flow with Lagrangian relaxation and Benders decomposition. Int. J. Electr. Power Energy Syst. 2016, 75, 255–264.
18. Wang, Q.; McCalley, J.D.; Zheng, T.; Litvinov, E. A Computational Strategy to Solve Preventive Risk-Based Security-Constrained OPF. IEEE Trans. Power Syst. 2013, 28, 1666–1675.
19. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
20. Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 2022, 35, 27730–27744.
21. Knox, W.B.; Stone, P. Augmenting reinforcement learning with human feedback. In Proceedings of the ICML 2011 Workshop on New Developments in Imitation Learning, Washington, DC, USA, 2 July 2011; Volume 855, p. 3.
22. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
23. Voulodimos, A.; Doulamis, N.; Doulamis, A.; Protopapadakis, E. Deep learning for computer vision: A brief review. Comput. Intell. Neurosci. 2018, 2018, 7068349.
24. Xu, S.; Wang, J.; Shou, W.; Ngo, T.; Sadick, A.M.; Wang, X. Computer vision techniques in construction: A critical review. Arch. Comput. Methods Eng. 2021, 28, 3383–3397.
25. Wiley, V.; Lucas, T. Computer vision and image processing: A paper review. Int. J. Artif. Intell. Res. 2018, 2, 29–36.
26. Liu, Y.; Wang, F.; Liu, K.; Mostacci, M.; Yao, Y.; Sfarra, S. Deep convolutional autoencoder thermography for artwork defect detection. Quant. Infrared Thermogr. J. 2023, 1–17.
27. Ibrahim, M.S.; Dong, W.; Yang, Q. Machine learning driven smart electric power systems: Current trends and new perspectives. Appl. Energy 2020, 272, 115237.
28. Jiang, B.; Liu, Y.; Geng, H.; Wang, Y.; Zeng, H.; Ding, J. A holistic feature selection method for enhanced short-term load forecasting of power system. IEEE Trans. Instrum. Meas. 2022, 72, 2500911.
29. Lei, X.; Yang, Z.; Yu, J.; Zhao, J.; Gao, Q.; Yu, H. Data-driven optimal power flow: A physics-informed machine learning approach. IEEE Trans. Power Syst. 2020, 36, 346–354.
30. Nellikkath, R.; Chatzivasileiadis, S. Physics-informed neural networks for ac optimal power flow. Electr. Power Syst. Res. 2022, 212, 108412.
31. Jiang, B.; Liu, Y.; Geng, H.; Zeng, H.; Ding, J. A Transformer Based Method with Wide Attention Range for Enhanced Short-term Load Forecasting. In Proceedings of the 2022 4th International Conference on Smart Power & Internet Energy Systems (SPIES), Beijing, China, 9–12 December 2022; pp. 1684–1690.
32. Nair, V.; Bartunov, S.; Gimeno, F.; Von Glehn, I.; Lichocki, P.; Lobov, I.; O’Donoghue, B.; Sonnerat, N.; Tjandraatmadja, C.; Wang, P.; et al. Solving mixed integer programs using neural networks. arXiv 2020, arXiv:2012.13349.
33. Xia, Y.; Wang, J. A bi-projection neural network for solving constrained quadratic optimization problems. IEEE Trans. Neural Netw. Learn. Syst. 2015, 27, 214–224.
34. Lotfi, A.; Pirnia, M. Constraint-guided deep neural network for solving optimal power flow. Electr. Power Syst. Res. 2022, 211, 108353.
35. Pan, W.; Zhao, C.; Fan, L.; Huang, S. Efficient Optimal Power Flow Flexibility Assessment: A Machine Learning Approach. In Proceedings of the 2023 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 16–19 January 2023; pp. 1–5.
36. Rahman, J.; Feng, C.; Zhang, J. Machine learning-aided security constrained optimal power flow. In Proceedings of the 2020 IEEE Power & Energy Society General Meeting (PESGM), IEEE, Montreal, QC, Canada, 2–6 August 2020; pp. 1–5.
37. Rahman, J.; Feng, C.; Zhang, J. A learning-augmented approach for AC optimal power flow. Int. J. Electr. Power Energy Syst. 2021, 130, 106908.
38. Sun, L.; Hu, J.; Chen, H. Artificial Bee Colony Algorithm Based on K-Means Clustering for Multiobjective Optimal Power Flow Problem. Math. Probl. Eng. 2015, 2015, 762853.
39. Hashish, M.S.; Hasanien, H.M.; Ullah, Z.; Alkuhayli, A.; Badr, A.O. Giant Trevally Optimization Approach for Probabilistic Optimal Power Flow of Power Systems Including Renewable Energy Systems Uncertainty. Sustainability 2023, 15, 13283.
40. Baker, K.; Bernstein, A. Joint chance constraints reduction through learning in active distribution networks. In Proceedings of the 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Anaheim, CA, USA, 26–29 November 2018; pp. 922–926.
41. Baker, K. A learning-boosted quasi-newton method for ac optimal power flow. arXiv 2020, arXiv:2007.06074.
42. Baker, K. Emulating AC OPF Solvers with Neural Networks. IEEE Trans. Power Syst. 2022, 37, 4950–4953.
43. Zamzam, A.S.; Fu, X.; Sidiropoulos, N.D. Data-driven learning-based optimization for distribution system state estimation. IEEE Trans. Power Syst. 2019, 34, 4796–4805.
44. Yu, J.; Lu, L.; Meng, X.; Karniadakis, G.E. Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems. Comput. Methods Appl. Mech. Eng. 2022, 393, 114823.
45. Yan, Z.; Xu, Y. A hybrid data-driven method for fast solution of security-constrained optimal power flow. IEEE Trans. Power Syst. 2022, 37, 4365–4374.
46. Yan, Z.; Xu, Y. Real-Time Optimal Power Flow: A Lagrangian Based Deep Reinforcement Learning Approach. IEEE Trans. Power Syst. 2020, 35, 3270–3273.
47. Zamzam, A.S.; Baker, K. Learning Optimal Solutions for Extremely Fast AC Optimal Power Flow. In Proceedings of the 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Virtual, 11–13 November 2020; pp. 1–6.
48. Zhou, M.; Chen, M.; Low, S.H. DeepOPF-FT: One Deep Neural Network for Multiple AC-OPF Problems with Flexible Topology. IEEE Trans. Power Syst. 2023, 38, 964–967.
49. Huang, B.; Wang, J. Applications of physics-informed neural networks in power systems-a review. IEEE Trans. Power Syst. 2022, 38, 572–588.
50. Falconer, T.; Mones, L. Leveraging power grid topology in machine learning assisted optimal power flow. IEEE Trans. Power Syst. 2022, 38, 2234–2246.
51. Lei, X.; Yu, J.; Aini, H.; Wu, W. Data-driven alternating current optimal power flow: A Lagrange multiplier based approach. Energy Rep. 2022, 8, 748–755.
52. Utama, C.; Meske, C.; Schneider, J.; Ulbrich, C. Reactive power control in photovoltaic systems through (explainable) artificial intelligence. Appl. Energy 2022, 328, 120004.
53. Chatzos, M.; Mak, T.W.; Van Hentenryck, P. Spatial network decomposition for fast and scalable AC-OPF learning. IEEE Trans. Power Syst. 2021, 37, 2601–2612.
54. Pan, X.; Chen, M.; Zhao, T.; Low, S.H. DeepOPF: A feasibility-optimized deep neural network approach for AC optimal power flow problems. IEEE Syst. J. 2022, 17, 673–683.
55. Zhang, L.; Chen, Y.; Zhang, B. A convex neural network solver for DCOPF with generalization guarantees. IEEE Trans. Control Netw. Syst. 2021, 9, 719–730.
56. Chen, Y.; Lakshminarayana, S.; Maple, C.; Poor, H.V. A meta-learning approach to the optimal power flow problem under topology reconfigurations. IEEE Open Access J. Power Energy 2022, 9, 109–120.
57. Deka, D.; Misra, S. Learning for DC-OPF: Classifying active sets using neural nets. In Proceedings of the 2019 IEEE Milan PowerTech, Milan, Italy, 23–27 June 2019; pp. 1–6.
58. Misra, S.; Roald, L.; Ng, Y. Learning for constrained optimization: Identifying optimal active constraint sets. INFORMS J. Comput. 2022, 34, 463–480.
59. Ng, Y.; Misra, S.; Roald, L.A.; Backhaus, S. Statistical learning for DC optimal power flow. In Proceedings of the IEEE 2018 Power Systems Computation Conference (PSCC), Dublin, Ireland, 11–15 June 2018; pp. 1–7.
60. Hasan, F.; Kargarian, A.; Mohammadi, J. Hybrid Learning Aided Inactive Constraints Filtering Algorithm to Enhance AC OPF Solution Time. IEEE Trans. Ind. Appl. 2021, 57, 1325–1334.
61. Liu, S.; Guo, Y.; Tang, W.; Sun, H.; Huang, W.; Hou, J. Varying Condition SCOPF Optimization Based on Deep Learning and Knowledge Graph. IEEE Trans. Power Syst. 2022, 38, 3189–3200.
62. Zhang, Z.J.; Mana, P.T.; Yan, D.; Sun, Y.; Molzahn, D.K. Study of Active Line Flow Constraints in DC Optimal Power Flow Problems. In Proceedings of the 2020 SoutheastCon, Raleigh, NC, USA, 28–29 March 2020; pp. 1–8.
63. Liu, S.; Guo, Y.; Tang, W.; Sun, H.; Huang, W. Predicting Active Constraints Set in Security-Constrained Optimal Power Flow via Deep Neural Network. In Proceedings of the 2021 IEEE Power & Energy Society General Meeting (PESGM), Washington, DC, USA, 16–29 July 2021; pp. 1–5.
64. Woo, J.H.; Wu, L.; Park, J.B.; Roh, J.H. Real-time optimal power flow using twin delayed deep deterministic policy gradient algorithm. IEEE Access 2020, 8, 213611–213618.
65. Wang, Z.; Menke, J.H.; Schäfer, F.; Braun, M.; Scheidler, A. Approximating multi-purpose AC optimal power flow with reinforcement trained artificial neural network. Energy AI 2022, 7, 100133.
66. Wu, S.; Hu, W.; Lu, Z.; Gu, Y.; Tian, B.; Li, H. Power System Flow Adjustment and Sample Generation Based on Deep Reinforcement Learning. J. Mod. Power Syst. Clean Energy 2020, 8, 1115–1127.
67. Zhou, Y.; Zhang, B.; Xu, C.; Lan, T.; Diao, R.; Shi, D.; Wang, Z.; Lee, W.J. A Data-driven Method for Fast AC Optimal Power Flow Solutions via Deep Reinforcement Learning. J. Mod. Power Syst. Clean Energy 2020, 8, 1128–1139.
68. Wang, C.; Du, Y.; Chang, Y.; Guo, Z.; Huang, Y. Human–Machine Collaborative Reinforcement Learning for Power Line Flow Regulation. IEEE Trans. Ind. Inform. 2023, 1–13.
69. Zeng, S.; Kody, A.; Kim, Y.; Kim, K.; Molzahn, D.K. A reinforcement learning approach to parameter selection for distributed optimal power flow. Electr. Power Syst. Res. 2022, 212, 108546.
70. Qin, P.; Ye, J.; Hu, Q.; Song, P.; Kang, P. Deep reinforcement learning based power system optimal carbon emission flow. Front. Energy Res. 2022, 10, 1017128.
71. Tianjing, W.; Yong, T. Parallel deep reinforcement learning-based power flow state adjustment considering static stability constraint. IET Gener. Transm. Distrib. 2020, 14, 6276–6284.
72. Jeyaraj, P.R.; Asokan, S.P.; Kathiresan, A.C.; Nadar, E.R.S. Deep reinforcement learning-based network for optimized power flow in islanded DC microgrid. Electr. Eng. 2023, 105, 2805–2816.
73. Wang, T.; Tang, Y. An unsolvable power flow adjustment method for weak power grid based on transmission channel positioning and deep reinforcement learning. Electr. Power Syst. Res. 2022, 210, 108050.
74. Chow, Y.; Nachum, O.; Faust, A.; Duenez-Guzman, E.; Ghavamzadeh, M. Lyapunov-based safe policy optimization for continuous control. arXiv 2019, arXiv:1901.10031.
75. Liang, Q.; Que, F.; Modiano, E. Accelerated primal-dual policy optimization for safe reinforcement learning. arXiv 2018, arXiv:1802.06480.
76. Zhou, Y.; Lee, W.J.; Diao, R.; Shi, D. Deep reinforcement learning based real-time AC optimal power flow considering uncertainties. J. Mod. Power Syst. Clean Energy 2021, 10, 1098–1109.
77. Sayed, A.R.; Wang, C.; Anis, H.I.; Bi, T. Feasibility Constrained Online Calculation for Real-Time Optimal Power Flow: A Convex Constrained Deep Reinforcement Learning Approach. IEEE Trans. Power Syst. 2023, 38, 5215–5227.
78. Cao, Y.; Zhao, H.; Liang, G.; Zhao, J.; Liao, H.; Yang, C. Fast and explainable warm-start point learning for AC Optimal Power Flow using decision tree. Int. J. Electr. Power Energy Syst. 2023, 153, 109369.
79. Yu, J.; Li, Z.; Zhang, J.; Bai, X.; Ge, H.; Zheng, J.; Wu, Q. Efficient contingency analysis of power systems using linear power flow with generalized warm-start compensation. Int. J. Electr. Power Energy Syst. 2024, 156, 109692.
80. Demirovic, N.; Tesnjak, S.; Tokic, A. Hot Start and Warm start in LP based Interior Point Method and it’s Application to Multiperiod Optimal Power Flows. In Proceedings of the 2006 IEEE PES Power Systems Conference and Exposition, Atlanta, GA, USA, 29 October–1 November 2006; pp. 699–704.
81. Kim, Y.; Anitescu, M. A real-time optimization with warm-start of multiperiod AC optimal power flows. Electr. Power Syst. Res. 2020, 189, 106721.
82. Wu, Y.C.; Debs, A. Initialisation, decoupling, hot start, and warm start in direct nonlinear interior point algorithm for optimal power flows. IEE Proc.-Gener. Transm. Distrib. 2001, 148, 67–75.
83. Baker, K. Learning warm-start points for AC optimal power flow. In Proceedings of the 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), Pittsburgh, PA, USA, 13–16 October 2019; pp. 1–6.
84. Chen, Y.C.; Wang, J.; Domínguez-García, A.D.; Sauer, P.W. Measurement-based estimation of the power flow Jacobian matrix. IEEE Trans. Smart Grid 2015, 7, 2507–2515.
85. Zeng, L.; Alawneh, S.G.; Arefifar, S.A. GPU-Based Sparse Power Flow Studies with Modified Newton’s Method. IEEE Access 2021, 9, 153226–153239.
86. He, X.; Chu, L.; Qiu, R.; Ai, Q.; Huang, W. Data-driven estimation of the power flow jacobian matrix in high dimensional space. arXiv 2019, arXiv:1902.06211.
87. Baghaee, H.R.; Mirsalim, M.; Gharehpetian, G.B.; Talebi, H.A. Generalized three phase robust load-flow for radial and meshed power systems with and without uncertainty in energy resources using dynamic radial basis functions neural networks. J. Clean. Prod. 2018, 174, 96–113.
88. Baghaee, H.R.; Mirsalim, M.; Gharehpetian, G.B.; Talebi, H.A. Three-phase AC/DC power-flow for balanced/unbalanced microgrids including wind/solar, droop-controlled and electronically-coupled distributed energy resources using radial basis function neural networks. IET Power Electron. 2017, 10, 313–328.
89. Veerasamy, V.; Abdul Wahab, N.I.; Ramachandran, R.; Kamel, S.; Othman, M.L.; Hizam, H.; Farade, R. Power flow solution using a novel generalized linear Hopfield network based on Moore–Penrose pseudoinverse. Neural Comput. Appl. 2021, 33, 11673–11689.
90. Li, G.; Or, S.W.; Chan, K.W. Intelligent Energy-Efficient Train Trajectory Optimization Approach Based on Supervised Reinforcement Learning for Urban Rail Transits. IEEE Access 2023, 11, 31508–31521.
91. Jiang, B.; Yang, H.; Liu, Y.a. Dynamic Temporal Dependency Model for Multiple Steps Ahead Short-term Load Forecasting of Power System. IEEE Trans. Ind. Appl. 2024, in press.
92. Wu, J.; Tang, S.; Huang, C.; Zhang, D.; Zhao, Y. Review of attention mechanism in electric power systems. In Proceedings of the Advances in Artificial Intelligence and Security: 7th International Conference, ICAIS 2021, Dublin, Ireland, 19–23 July 2021; Proceedings, Part I 7; Springer: Berlin/Heidelberg, Germany, 2021; pp. 618–627.
93. Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. Gpt-4 technical report. arXiv 2023, arXiv:2303.08774.

Figure 1. Directly utilizing the equality constraints.

Figure 2. Formulating the optimal power flow problem into Markov form.

Figure 3. Predicting active constraints.

Figure 4. Simulating the steps of traditional optimization algorithms.

Table 1. Machine learning methods applied in optimal power flow (OPF).

Method	Strength	References
Direct Mapping of OPF Variables	Decrease computational time	[47]
	Incorporate physics-based rules	[30,48,49,50]
	Simplify the learning process	[51]
	Consider new energy units	[29,52]
	Guarantee solution feasibility	[34,53,54]
	Combine the DC OPF problem with a convex ANN	[30,55,56]
Predicting Active Constraints	Explore simplified OPF versions	[57,58]
	Classify active and inactive constraints	[40,58,59,60]
	Consider uncertainty	[57,61]
	Explore the relationships among sets of concurrently active constraints	[62,63]
Learning Control Policy for OPF	Apply RL to power system operation	[64,65,66,67,68,69]
	Consider multiple objectives	[70]
	Apply RL to microgrid operation	[71,72,73]
	Develop model-based RL	[45,74,75]
	Guarantee solution feasibility	[76,77]
Predicting Warm-Start Points	Explore simplified OPF versions	[78,79,80,81,82,83]
	Enhance interpretability	[78]
Learning Solution Process	Approximate the Jacobian matrix	[41,42,84,85,86,87,88,89]

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
