
LLM-Enabled Agent Stack for Denied Flight

  • Carlos Osorio
  • Mar 30
  • 3 min read


Autonomous flight in GPS-denied environments is one of the most demanding challenges in modern robotics. Drones operating indoors, in forests, near collapsed infrastructure, inside industrial plants, or in dense urban canyons cannot depend on satellite navigation for stable localization and mission execution. In these scenarios, the aircraft must perceive the scene, interpret the mission, reason about obstacles, and generate safe actions using onboard intelligence.

A promising solution is the LLM-enabled agent stack for denied flight, where large language model reasoning is combined with perception, planning, control, and multi-agent coordination. Rather than using a single monolithic autonomy block, this architecture organizes the navigation system as a hierarchy of task-specific agents that cooperate through shared scene understanding and mission intent.


Why denied flight needs intelligent agent stacks

In denied environments, the drone faces several simultaneous difficulties:

  • no continuous GPS position reference

  • limited visibility caused by smoke, dust, shadows, or clutter

  • narrow corridors and tight obstacle spacing

  • uncertain or changing terrain structure

  • dynamic mission goals and time pressure

Traditional navigation pipelines can handle parts of this problem, but they often struggle when the mission requires higher-level interpretation. A drone may detect obstacles and estimate motion, yet still fail to decide the best route, adapt to new objectives, or coordinate with other drones. This is where an LLM-enabled architecture becomes valuable. The language model acts as a reasoning layer that connects mission context, scene interpretation, route selection, and action priorities.


Architecture overview


The proposed architecture organizes denied-flight autonomy into a set of interacting task-specific modules. Each module contributes a dedicated function while exchanging information through structured scene and mission representations. The process begins with a perception layer, which extracts environmental cues from RGB images, depth information, and semantic masks. This layer identifies obstacles, free space, structures, and local scene features required for navigation.
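To make the perception layer's role concrete, here is a minimal sketch of how environmental cues might be extracted from depth and semantic inputs. The `PerceptionOutput` structure, the `extract_cues` function, and the 1.5 m safety range are illustrative assumptions, not part of the described system; a real stack would operate on full RGB, depth, and semantic-mask tensors from onboard sensors.

```python
from dataclasses import dataclass, field

@dataclass
class PerceptionOutput:
    obstacles: list        # (row, col) cells closer than the safety range
    free_space: list       # (row, col) cells considered traversable
    labels: dict = field(default_factory=dict)  # cell -> semantic class

def extract_cues(depth_grid, semantic_grid, min_range_m=1.5):
    """Classify each cell as obstacle or free space from depth + semantics.

    Hypothetical example: grids of floats (metres) and semantic labels
    stand in for real sensor data."""
    out = PerceptionOutput(obstacles=[], free_space=[])
    for r, row in enumerate(depth_grid):
        for c, d in enumerate(row):
            label = semantic_grid[r][c]
            # A cell is an obstacle if it is too close or semantically solid.
            if d < min_range_m or label == "wall":
                out.obstacles.append((r, c))
            else:
                out.free_space.append((r, c))
            out.labels[(r, c)] = label
    return out

depth = [[0.8, 3.2], [4.0, 5.1]]
labels = [["wall", "floor"], ["floor", "floor"]]
cues = extract_cues(depth, labels)
print(cues.obstacles)  # [(0, 0)]
```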


The output of the perception layer is passed into a scene model, which constructs a compact representation of walls, hazards, traversable regions, and 3D geometry. This model provides a grounded description of the environment that can be used by both planning and reasoning modules. A prompt-driven task planning block introduces mission-level flexibility. Instead of relying only on hard-coded rules, the system can interpret high-level instructions such as corridor preference, inspection priority, formation behavior, or fallback conditions. This enables faster adaptation to new tasks and changing environments.
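A compact scene model of the kind described above can be sketched as a small summary that both geometric planners and the language model can consume. The `summarize_scene` function and its fields are hypothetical; the point is only that grid-level perception gets compressed into a grounded, structured description.

```python
def summarize_scene(obstacles, free_space, hazards=()):
    """Compress cell-level perception into a compact, LLM-readable scene model.

    Illustrative sketch: real scene models would also carry 3D geometry
    and traversable-region boundaries."""
    total = len(obstacles) + len(free_space)
    return {
        "num_obstacle_cells": len(obstacles),
        "num_free_cells": len(free_space),
        "hazards": sorted(hazards),                       # e.g. smoke, dust
        "traversable_ratio": round(len(free_space) / max(1, total), 2),
    }

scene = summarize_scene(
    obstacles=[(0, 0), (0, 1)],
    free_space=[(1, 0), (1, 1), (1, 2)],
    hazards={"smoke"},
)
print(scene["traversable_ratio"])  # 0.6
```

A summary like this can be serialized directly into the prompt of the task-planning block, which is what makes prompt-driven mission flexibility practical.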


LLM agent navigation enables a drone to interpret mission context, analyze visual and state information, and make adaptive flight decisions in real time. By combining high-level reasoning with trajectory tracking and safety control, the agent can follow goals, avoid obstacles, and maintain stable navigation in dynamic or GPS-denied environments.

At the center of the architecture is the LLM reasoning layer. The LLM processes mission context, scene abstractions, and vehicle status to generate structured decisions that guide downstream modules. Its role is strategic rather than low-level. It may recommend path preferences, task priorities, coordination changes, or safety-aware responses when uncertainty increases.

For team operations, a mission coordinator allocates roles and synchronizes multiple drones. This module supports cooperative exploration, target search, area partitioning, and mission reallocation. The LLM enhances this process by adding semantic mission awareness to geometric planning.

The route selector and waypoint builder translate mission intent into executable subgoals. These modules select candidate paths and generate feasible intermediate waypoints that satisfy environmental and mission constraints.
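Because the reasoning layer emits structured decisions that downstream planners consume, its output needs validation before it can drive flight behavior. The sketch below assumes a small JSON decision schema and a fixed action vocabulary; both are hypothetical, and the actual LLM call is stubbed out.

```python
import json

# Hypothetical action vocabulary the downstream planners would accept.
ALLOWED_ACTIONS = {"continue", "reroute", "hold", "return_to_base"}

def parse_llm_decision(raw_json):
    """Validate a structured decision emitted by the reasoning layer.

    A real stack would call an LLM here; this sketch only checks that the
    model's output fits the schema the downstream modules expect."""
    decision = json.loads(raw_json)
    if decision.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {decision.get('action')}")
    decision.setdefault("priority", "normal")   # fill optional fields
    return decision

# Example: output the LLM might return when smoke blocks the main corridor.
raw = '{"action": "reroute", "reason": "smoke in corridor A", "priority": "high"}'
print(parse_llm_decision(raw)["action"])  # reroute
```

Constraining the model to a closed action vocabulary is one common way to keep the LLM strategic: it recommends, but only pre-approved behaviors ever reach the planners.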


A local guidance and safety layer then ensures physically safe execution. This block combines path-following control, obstacle avoidance, speed regulation, and action filtering. It acts as a safety shield between high-level reasoning and flight execution. Finally, a command aggregator fuses outputs from planning, coordination, and local control into a consistent kinematic command for the drone or drone group.
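The action-filtering role of the safety shield can be illustrated with a simple speed-regulation rule: velocity commands are clamped as the nearest obstacle gets closer. The thresholds and the `safety_filter` function are illustrative assumptions; a real safety layer would also account for vehicle dynamics and sensor uncertainty.

```python
def safety_filter(cmd_vx, cmd_vy, nearest_obstacle_m,
                  v_max=3.0, slow_zone_m=2.0, stop_zone_m=0.5):
    """Clamp a planar velocity command based on obstacle proximity.

    Hypothetical thresholds: full stop inside 0.5 m, linear slowdown
    inside 2.0 m, and a 3.0 m/s ceiling otherwise."""
    if nearest_obstacle_m <= stop_zone_m:
        return 0.0, 0.0                      # hard stop inside the stop zone
    scale = min(1.0, nearest_obstacle_m / slow_zone_m)
    limit = v_max * scale                    # allowed speed at this range
    mag = (cmd_vx**2 + cmd_vy**2) ** 0.5
    if mag > limit and mag > 0:
        # Preserve direction, reduce magnitude to the allowed limit.
        cmd_vx, cmd_vy = cmd_vx * limit / mag, cmd_vy * limit / mag
    return cmd_vx, cmd_vy

print(safety_filter(4.0, 0.0, nearest_obstacle_m=1.0))  # (1.5, 0.0)
```

This is the "shield" behavior in miniature: whatever the reasoning and planning layers request, the command that reaches the vehicle is always physically conservative.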


Functional significance


The main advantage of this stack lies in its separation of responsibilities. Perception grounds the system in real observations. The scene model organizes the environment into a usable structure. The LLM introduces contextual decision-making. Planning modules convert intent into motion objectives. Safety and control ensure that all generated commands remain feasible and secure.

This layered design improves robustness in denied-flight conditions because the system is not dependent on a single navigation source. If localization becomes uncertain, mission logic can trigger fallback behaviors. If the environment changes, the planner can re-route. If multiple drones are available, coordination can be adjusted dynamically.
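The graceful-degradation logic described above can be sketched as a small behavior selector. The behavior names and the priority ordering are illustrative assumptions, not part of the described system.

```python
def select_behavior(localization_ok, path_blocked, teammates_available):
    """Layered fallback sketch: degrade gracefully instead of depending
    on a single navigation source. Behavior names are hypothetical."""
    if not localization_ok:
        return "hover_and_relocalize"      # mission logic triggers fallback
    if path_blocked and teammates_available:
        return "handoff_and_reroute"       # coordination adjusted dynamically
    if path_blocked:
        return "replan_route"              # planner re-routes on its own
    return "continue_mission"

print(select_behavior(True, True, False))  # replan_route
```

Note the ordering: localization health is checked first, because every other behavior assumes the vehicle knows where it is.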


LLM-based multi-UAV mission planning system that converts user commands and external intelligence into coordinated flight tasks and formation control.

The LLM-enabled agent stack for denied flight provides a modular and intelligent framework for autonomous aerial navigation in challenging environments. By combining perception, scene understanding, prompt-driven planning, mission coordination, and safety-constrained control, the architecture supports adaptive and cooperative drone behavior when GPS is not available. This approach represents a strong direction for next-generation aerial autonomy, especially in missions that require resilience, flexibility, and higher-level reasoning.

