The Evasion Game: Finding the Perfect Zigzag

Author: Denis Avetisyan

New research proves an optimal strategy for dodging pursuers relies on sharp, on-off maneuvers, and presents a practical method for implementing it in real-time.

The analysis of expected cost $J(u_T^n)$ under a uniform future input model demonstrates that optimal control consistently resides at the control bounds, confirming a characteristic bang-bang structure for this system.

This paper demonstrates the stochastic optimality of bang-bang evasion strategies and introduces a Terminal-Set-Based Evasion (TSE) approach that maximizes miss distance.

While optimal control strategies are well-established in deterministic pursuit-evasion scenarios, realistically imperfect information and bounded maneuverability present significant challenges. This paper, ‘Bang-Bang Evasion: Its Stochastic Optimality and a Terminal-Set-Based Implementation’, addresses this by proving the existence of an optimal evasion strategy retaining a bang-bang control structure, even within a fully stochastic framework. A novel closed-loop Terminal-Set-Based Evasion (TSE) strategy is then proposed and validated via simulation, demonstrably outperforming existing stochastic evasion techniques. Could this approach pave the way for more robust and efficient guidance and control systems in complex, uncertain environments?

Decoding the Intercept Challenge: Predicting the Unpredictable

The pursuit of an intercept – guiding a projectile to collide with a moving target – forms a cornerstone of guidance and control systems, yet achieving this seemingly simple goal is profoundly complex. This challenge isn’t merely about calculating a collision course; it fundamentally requires accurate prediction of the target’s future position. Unlike scenarios with fixed trajectories, real-world targets rarely travel in straight lines. They maneuver – accelerating, turning, and altering course unpredictably. Consequently, intercept systems must move beyond simple ballistic calculations and embrace sophisticated algorithms capable of anticipating these evasive actions. The accuracy of this predictive capability directly dictates the effectiveness of the guidance law, influencing factors like response time, required energy expenditure, and ultimately, mission success. A robust solution must therefore account for the target’s dynamic behavior, effectively ‘solving’ for a future point of intersection in a constantly shifting kinematic landscape, a problem that underpins applications ranging from missile defense to autonomous aerial vehicles.

Conventional guidance systems often rely on predicting a target’s future position based on its current velocity and heading, a strategy that proves inadequate when confronted with deliberate evasive actions. These traditional laws, frequently built on proportional navigation or similar approaches, assume a relatively predictable trajectory; however, a maneuvering target can rapidly alter its course, introducing accelerations and turns that quickly invalidate these predictions. The resulting guidance errors compound over time, diminishing the intercept probability, particularly in scenarios involving high-agility targets or complex flight paths. Consequently, systems designed solely around these established principles exhibit a limited capacity to counteract sophisticated maneuvers, necessitating the development of guidance algorithms capable of adapting to unpredictable target behavior and maintaining a reliable path to interception.

Overcoming the intercept challenge demands more than simply reacting to a target’s movements; it requires a proactive system capable of anticipating and responding to complex maneuvers. Robust estimation techniques, such as Kalman filtering and particle filtering, are critical for continuously refining the target’s predicted trajectory, even amidst noisy sensor data and deceptive actions. However, accurate estimation is insufficient without adaptable control strategies. These strategies must dynamically adjust the interceptor’s course, leveraging real-time estimations to optimize performance and counteract evasive maneuvers. The most effective systems employ model predictive control or reinforcement learning, allowing the interceptor to not only react to the target’s current trajectory, but also to predict and neutralize future evasive actions, ultimately ensuring a successful intercept despite the target’s best efforts.

The planar engagement geometry shows the relative configuration of the pursuer and the evader.

Untangling Complexity: Estimation and Separation in Guidance

The Generalized Separation Theorem, when applied to the intercept problem, establishes that optimal control can be achieved by designing a controller based on the estimated state, rather than requiring knowledge of the full posterior probability distribution. This decoupling is enabled by the theorem’s assertion that the optimal control law is a function only of the estimated state and, potentially, time. Consequently, the estimation and control problems become independently solvable; an estimator can be designed to provide the best possible state estimate, and a controller can then be designed based on that estimate, without needing to account for the complexities of jointly optimizing both processes. This simplifies the design process and allows for modularity, where improvements in estimation do not necessarily require redesign of the control law, and vice-versa.

The decoupling of estimation and control, facilitated by the Generalized Separation Theorem, enables independent optimization of the estimation process. This is achieved through techniques such as the Multiple Model Adaptive Estimator (MMAE), which operates by maintaining a bank of filters, each representing a hypothesized target guidance law. The MMAE calculates the probability of each hypothesis based on observed target maneuvers and sensor data, effectively identifying the most likely guidance law being employed. This probabilistic assessment, derived from a weighted combination of individual filter outputs, provides a robust estimate of the target’s trajectory and allows for adaptation to changing or unpredictable target behavior. The resulting estimate, independent of the control law, can then be utilized by the guidance system for optimal intercept calculations.
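The probability update at the heart of a filter bank can be sketched in a few lines. The sketch below assumes scalar measurement residuals with Gaussian statistics; the function names and the example numbers are hypothetical and are not taken from the paper. Each hypothesis is reweighted by how well its filter explained the latest measurement, and the fused estimate is the probability-weighted sum of the per-filter estimates.

```python
import math


def gaussian_likelihood(residual, variance):
    """Scalar Gaussian density of a filter residual with the given innovation variance."""
    return math.exp(-0.5 * residual ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def mmae_update(weights, residuals, variances):
    """One Bayes update of the hypothesis probabilities in a filter bank."""
    unnormalized = [
        w * gaussian_likelihood(r, s)
        for w, r, s in zip(weights, residuals, variances)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        # Every likelihood underflowed: keep the prior rather than divide by zero.
        return list(weights)
    return [u / total for u in unnormalized]


def fused_estimate(weights, estimates):
    """Probability-weighted combination of the per-filter state estimates."""
    return sum(w * x for w, x in zip(weights, estimates))


# Hypothetical example: three guidance-law hypotheses with equal priors.
# The first filter has the smallest residual, so its probability should grow.
weights = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
residuals = [0.1, 1.5, 3.0]   # measurement minus each filter's prediction
variances = [1.0, 1.0, 1.0]   # innovation variance reported by each filter
weights = mmae_update(weights, residuals, variances)
```

In a full estimator each residual and variance would come from a Kalman filter matched to one hypothesized guidance law; here they are supplied by hand to keep the update itself visible.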

Striebel Sufficiency, a key result in Bayesian decision theory, establishes that for a linear Gaussian system with known process and measurement noise, the optimal control law is a function only of the posterior probability distribution of the state. This means that given an estimate of the target’s state derived from available measurements, the best possible control action – maximizing intercept probability or minimizing miss distance – can be computed directly from this posterior distribution, without needing to consider the entire history of measurements or the specific estimation algorithm used to generate the estimate. Formally, if $x_k$ is the posterior estimate at time $k$, the optimal control $u^*(x_k)$ satisfies $u^*(x_k) = \arg\min_u \mathbb{E}[J(x_k, u) \mid x_k]$, where $J$ is a cost function and the expectation is with respect to the posterior distribution.

Revealing Optimal Strategies: The Power of Bang-Bang Control

Analysis of optimal intercept scenarios demonstrates that effective evasion policies consistently employ a ‘bang-bang’ control structure. This means the optimal strategy involves abrupt transitions between maximum and minimum acceleration levels, rather than smooth, continuous adjustments. Mathematical models of these scenarios reveal that maintaining intermediate acceleration values does not contribute to minimizing time-to-intercept or maximizing miss distance; the system performs best when operating at its limits. This behavior arises from the non-linear dynamics of the intercept problem, where discrete changes in velocity provide more efficient trajectory adjustments than incremental ones, ultimately leading to demonstrably superior performance compared to continuous control approaches.

Analysis of optimal control problems in both evasion and guidance scenarios demonstrates that strategies employing continuous, nuanced control inputs are frequently less effective than those utilizing discrete, abrupt changes in control. Specifically, solutions consistently favor switching rapidly between extreme control values – maximum and minimum acceleration or deflection – rather than maintaining intermediate states. This indicates that the computational cost and complexity associated with continuous control do not typically justify the performance gains, as the optimal solution often lies within the space of bang-bang control policies. Consequently, algorithms designed for these applications can be significantly simplified by restricting the search space to these discrete control options.
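A small numerical check illustrates why restricting the search to extreme values can lose nothing. The sketch assumes a toy model that is not the paper's: the miss distance is the absolute value of a quantity that depends linearly on a short sequence of bounded controls, which makes it a convex function of those controls. The names `z0` and `gains` and all numbers are hypothetical. A search over a grid that includes intermediate values is compared with a search over the extreme values only.

```python
from itertools import product


def miss(z0, gains, controls):
    """Toy miss distance: magnitude of a terminal quantity that is linear in the controls."""
    return abs(z0 + sum(h * u for h, u in zip(gains, controls)))


def best_over(levels, z0, gains):
    """Exhaustively search all control sequences built from the allowed levels."""
    best = max(product(levels, repeat=len(gains)), key=lambda u: miss(z0, gains, u))
    return best, miss(z0, gains, best)


z0 = 0.3                               # hypothetical uncontrolled terminal offset
gains = [0.5, -0.2, 0.8, -0.4]         # hypothetical effect of each control step
fine_levels = [-1.0, -0.5, 0.0, 0.5, 1.0]   # includes intermediate commands
bang_levels = [-1.0, 1.0]                   # control bounds only

u_fine, j_fine = best_over(fine_levels, z0, gains)   # 5**4 = 625 candidates
u_bang, j_bang = best_over(bang_levels, z0, gains)   # 2**4 = 16 candidates

print("fine grid :", u_fine, j_fine)
print("bang-bang :", u_bang, j_bang)
```

Both searches select the same sequence, and every entry of it sits at a bound, because a convex function over a box attains its maximum at a corner. The bang-bang search reaches that answer with 16 candidates instead of 625.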

Terminal-Set-Based Evasion (TSE) leverages the identified bang-bang control structure by defining a series of reachable “terminal sets” in the state space. Instead of continuously calculating control inputs, TSE focuses on maneuvering the evader to reach these pre-defined sets, which represent safe or advantageous states. Upon reaching a terminal set boundary, the control switches instantaneously between maximum and minimum acceleration – the bang-bang characteristic – to transition to the next desired set. This discrete switching approach simplifies the computational burden of optimal control while maintaining performance, as the optimal control problem is decomposed into a series of subproblems focused on reaching these defined sets, rather than requiring continuous optimization of control signals.

Acceleration command profiles reveal that the TSE, RTS, Singer, and weaving evasion strategies each produce distinct maneuvering patterns during representative engagements.

Validating Evasion Tactics: The Power of Simulation

Monte Carlo simulation, a computational technique relying on repeated random sampling to obtain numerical results, is employed to assess the efficacy of various evasion strategies due to its ability to model complex, stochastic systems. The Kalman Filter, a recursive algorithm, is frequently integrated within these simulations to estimate the state of a dynamic system from a series of incomplete and noisy measurements, thereby providing a realistic representation of tracking and interception scenarios. By running numerous simulations with randomized initial conditions and sensor noise, statistically significant performance metrics – such as mean miss distance and Single-Shot Kill Probability (SSKP) – can be obtained for each tested evasion maneuver. This methodology allows for a comparative analysis of strategies like the Random Telegraph Signal, Singer Process, and Weaving maneuvers under defined, yet variable, operational conditions, providing data-driven insights into their relative strengths and weaknesses.
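The two summary metrics can be computed from a list of per-trial miss distances as follows. This is a minimal sketch under stated assumptions: SSKP is taken here to be the fraction of trials whose miss distance falls within a hypothetical lethal radius, and `toy_trial` is a placeholder that draws a random miss distance instead of simulating an engagement with a Kalman filter in the loop. Neither the lethal radius nor the noise level comes from the paper.

```python
import math
import random


def summarize(miss_distances, lethal_radius):
    """Return (mean miss distance, SSKP) for a batch of Monte Carlo trials.

    SSKP is estimated as the fraction of trials with miss <= lethal_radius,
    which is an assumed definition for this sketch.
    """
    n = len(miss_distances)
    mean_miss = sum(miss_distances) / n
    sskp = sum(1 for m in miss_distances if m <= lethal_radius) / n
    return mean_miss, sskp


def toy_trial(rng, sigma):
    """Placeholder engagement: miss distance is the norm of a 2-D Gaussian error."""
    return math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(0)                      # fixed seed for repeatability
misses = [toy_trial(rng, sigma=2.0) for _ in range(10_000)]
mean_miss, sskp = summarize(misses, lethal_radius=1.0)
print(f"mean miss = {mean_miss:.2f} m, SSKP = {sskp:.3f}")
```

Replacing `toy_trial` with a full engagement simulation, one call per evasion strategy, would yield the kind of side-by-side comparison described above.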

Comparative analysis of evasion strategies – specifically the Random Telegraph Signal (RTS), Singer Process, and Weaving Maneuver – is facilitated through simulation environments that model realistic operational conditions. These simulations allow for the systematic variation of parameters such as relative velocities, sensor noise, and maneuver durations to quantify performance differences. Performance metrics commonly used in these comparisons include mean miss distance – representing the average closest-approach distance between the evading target and the incoming threat – and Single-Shot Kill Probability (SSKP), which indicates the likelihood of a successful intercept. By controlling for confounding variables within the simulated environment, researchers can isolate the effectiveness of each maneuver and identify strategies that maximize miss distance and survivability.
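For readers unfamiliar with the three baseline maneuvers, the sketch below generates one acceleration command profile of each kind using their common textbook forms. All parameter names and values are hypothetical, and the paper's exact parameterization may differ: the random telegraph signal flips between the two acceleration bounds at random times, the Singer process is modeled as a first-order Gauss-Markov sequence, and weaving is a sinusoid.

```python
import math
import random


def random_telegraph(rng, a_max, switch_rate, dt, steps):
    """Acceleration that flips between +a_max and -a_max at Poisson-distributed times.

    The per-step switch probability switch_rate * dt assumes switch_rate * dt << 1.
    """
    a = a_max if rng.random() < 0.5 else -a_max
    profile = []
    for _ in range(steps):
        profile.append(a)
        if rng.random() < switch_rate * dt:
            a = -a
    return profile


def singer(rng, sigma, tau, dt, steps):
    """First-order Gauss-Markov acceleration with std sigma and correlation time tau."""
    rho = math.exp(-dt / tau)
    drive = sigma * math.sqrt(1.0 - rho ** 2)   # keeps the variance stationary
    a = rng.gauss(0.0, sigma)
    profile = []
    for _ in range(steps):
        profile.append(a)
        a = rho * a + drive * rng.gauss(0.0, 1.0)
    return profile


def weave(a_max, omega, dt, steps, phase=0.0):
    """Sinusoidal (weaving) acceleration with amplitude a_max and frequency omega."""
    return [a_max * math.sin(omega * k * dt + phase) for k in range(steps)]


rng = random.Random(1)
rts_profile = random_telegraph(rng, a_max=30.0, switch_rate=1.0, dt=0.01, steps=500)
singer_profile = singer(rng, sigma=10.0, tau=2.0, dt=0.01, steps=500)
weave_profile = weave(a_max=30.0, omega=2.0, dt=0.01, steps=500)
```

Only the random telegraph signal is itself bang-bang, but its switching times are random rather than chosen from the engagement state.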

Monte Carlo simulations evaluating evasion tactics demonstrate the efficacy of bang-bang control strategies. Specifically, the Terminal-Set-Based Evasion (TSE) maneuver resulted in a mean miss distance of 6.29 m. The other techniques produced smaller values: the Random Telegraph Signal (RTS) achieved 5.04 m, the Singer process 1.93 m, and weaving 0.63 m. Because the evader's goal is to maximize miss distance, these results rank TSE first: it exceeds RTS by 1.25 m, the Singer process by 4.36 m, and weaving by 5.66 m.

Analysis of evasion strategies, conducted via Monte Carlo simulation, indicates that the Terminal-Set-Based Evasion (TSE) maneuver achieves a Single-Shot Kill Probability (SSKP) of 0.2. This is an improvement over the Random Telegraph Signal (RTS) strategy, which yielded an SSKP of 0.4, and a much larger improvement over the Singer Process and the Weaving maneuver, both of which yielded an SSKP of 0.9. The SSKP metric represents the probability that a single attempt to intercept the evading target will be successful, so from the evader's perspective a lower value indicates more effective evasion.

Across 10,000 Monte Carlo trials, the TSE, RTS, Singer, and weaving evasion strategies exhibited distinct miss distance distributions, as shown by their empirical cumulative distribution functions.

Beyond Proportional Navigation: The Pursuit of Optimal Guidance

Traditional Proportional Navigation, while effective against constant-velocity targets, struggles when adversaries employ evasive maneuvers. Augmented Proportional Navigation addresses this limitation by intelligently modifying the classic guidance law with additional terms that account for target acceleration and predicted movements. These augmentations essentially ‘look ahead’, anticipating where the target will be rather than reacting solely to its current position. This proactive approach significantly enhances intercept probability, particularly in scenarios involving high-g turns and unpredictable flight paths. The added terms often incorporate estimations of target jerk – the rate of change of acceleration – allowing for even finer adjustments and improved tracking performance. Consequently, Augmented Proportional Navigation represents a crucial advancement in missile guidance technology, bridging the gap between simple reactive systems and the complexities of optimal control.
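The difference between the two laws is easiest to see in their standard textbook forms, sketched below. This is the common planar formulation, in which the augmentation is half the navigation gain times the estimated target acceleration normal to the line of sight; it does not include the jerk terms mentioned above and is not taken from the paper. Function and parameter names are hypothetical.

```python
def pn_command(nav_gain, closing_speed, los_rate):
    """Proportional navigation: acceleration command proportional to line-of-sight rate.

    nav_gain      -- effective navigation ratio (dimensionless, often 3 to 5)
    closing_speed -- closing velocity in m/s
    los_rate      -- line-of-sight angular rate in rad/s
    """
    return nav_gain * closing_speed * los_rate


def apn_command(nav_gain, closing_speed, los_rate, target_accel_estimate):
    """Augmented PN: adds a feed-forward term for estimated target acceleration (m/s^2)."""
    return pn_command(nav_gain, closing_speed, los_rate) + 0.5 * nav_gain * target_accel_estimate


# Hypothetical engagement snapshot: the target is pulling an estimated 20 m/s^2.
pn = pn_command(3.0, 1000.0, 0.01)
apn = apn_command(3.0, 1000.0, 0.01, 20.0)
print(f"PN: {pn:.1f} m/s^2, APN: {apn:.1f} m/s^2")
```

When the target acceleration estimate is zero, the two commands coincide, so the augmented law reduces to classic proportional navigation against a non-maneuvering target.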

Achieving truly optimal guidance necessitates framing the interception problem as an optimal control challenge, frequently addressed through the mathematical framework of a Linear Quadratic Differential Game. This approach moves beyond reactive strategies, instead proactively calculating the control inputs – acceleration and direction – that minimize a cost function. This cost function typically balances fuel expenditure with the probability of a successful intercept, considering the target’s potential maneuvers. The ‘Linear Quadratic’ designation stems from the mathematical form of the problem: linear system dynamics and a quadratic cost function allow for analytical or numerical solutions, yielding guidance laws that are demonstrably superior to simpler methods like proportional navigation, particularly in scenarios involving aggressive or unpredictable target behavior. Solving this game results in a guidance law that isn’t just responsive, but anticipates and strategically counters the target’s evasive actions, maximizing the likelihood of a successful intercept with minimal resource consumption.

The evolution of guidance techniques beyond proportional navigation signifies a substantial leap in the field of intercept technology. While traditional methods offer a baseline for target interception, recent advancements – particularly those rooted in optimal control theory and differential game formulations – promise markedly improved performance in complex, real-world scenarios. These techniques don’t simply react to target movements; they proactively calculate intercept trajectories that account for predicted maneuvers, drastically increasing the probability of a successful intercept even against highly evasive targets. This shift towards optimality isn’t merely about refining existing systems; it represents a fundamental change in approach, enabling the development of more robust and reliable defensive and offensive capabilities, and opening doors to applications where precise, predictable interception is paramount, such as in advanced missile defense systems and autonomous aerial vehicles.

The pursuit of optimal evasion, as detailed in this work, echoes a fundamental principle of systems analysis: understanding is revealed through the identification of inherent patterns. Just as a scientist meticulously examines a specimen under a microscope, this paper dissects the evasion problem to expose its underlying, bang-bang structure. The demonstration of Terminal-Set-Based Evasion’s superiority over stochastic methods isn’t merely a technical achievement; it’s a confirmation that rigorous logic, applied to the dynamics of pursuit and evasion, yields predictable and ultimately controllable outcomes. As Henry David Thoreau observed, “It is not enough to be busy; so are the ants. The question is: What are we busy with?” This research provides a clear answer – a focused investigation into the patterns governing this dynamic interaction, resulting in a demonstrably effective solution.

Beyond the Bang: Future Directions

The demonstration of stochastic optimality for a bang-bang evasion strategy, while satisfying from a theoretical perspective, merely shifts the interesting problems forward. The current framework relies on a linear guidance law against which to optimize; real-world interceptors are rarely so obliging. Future work must address the implications of non-linear pursuit, potentially revealing that the elegant simplicity of bang-bang control is an artifact of this simplifying assumption. Exploring the robustness of the Terminal-Set-Based Evasion (TSE) method under model uncertainties, such as parameter estimation errors and aerodynamic disturbances, will be crucial for practical deployment.

Furthermore, the current analysis largely treats the evader as a solitary agent. The introduction of multiple evaders, or a mix of cooperative and competitive evaders, introduces a game-theoretic complexity that promises both challenges and opportunities. The resulting strategies are likely to be less ‘clean’ than the current bang-bang solution, but potentially more effective in cluttered environments. The miss distance metric, while convenient, also warrants re-examination; minimizing risk, rather than solely minimizing distance, may prove a more fruitful objective.

Ultimately, the pursuit-evasion problem is less about finding the optimal solution (such a thing may not exist) and more about understanding the inherent trade-offs between maneuverability, energy expenditure, and predictability. Each model error is not a failure, but a boundary condition, a constraint within which the next, more nuanced, solution must lie.

Original article: https://arxiv.org/pdf/2511.21633.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/