LLM-Enabled Agent Stack for Denied Flight
- Carlos Osorio
- Mar 30
- 3 min read

Autonomous flight in GPS-denied environments is one of the most demanding challenges in modern robotics. Drones operating indoors, in forests, near collapsed infrastructure, inside industrial plants, or in dense urban canyons cannot depend on satellite navigation for stable localization and mission execution. In these scenarios, the aircraft must perceive the scene, interpret the mission, reason about obstacles, and generate safe actions using onboard intelligence.
A promising solution is the LLM-enabled agent stack for denied flight, where large language model reasoning is combined with perception, planning, control, and multi-agent coordination. Rather than using a single monolithic autonomy block, this architecture organizes the navigation system as a hierarchy of task-specific agents that cooperate through shared scene understanding and mission intent.
Why denied flight needs intelligent agent stacks
In denied environments, the drone faces several simultaneous difficulties:
- no continuous GPS position reference
- limited visibility caused by smoke, dust, shadows, or clutter
- narrow corridors and tight obstacle spacing
- uncertain or changing terrain structure
- dynamic mission goals and time pressure
Traditional navigation pipelines can handle parts of this problem, but they often struggle when the mission requires higher-level interpretation. A drone may detect obstacles and estimate motion, yet still fail to decide the best route, adapt to new objectives, or coordinate with other drones. This is where an LLM-enabled architecture becomes valuable. The language model acts as a reasoning layer that connects mission context, scene interpretation, route selection, and action priorities.
Architecture overview
The proposed architecture organizes denied-flight autonomy into a set of interacting task-specific modules. Each module contributes a dedicated function while exchanging information through structured scene and mission representations. The process begins with a perception layer, which extracts environmental cues from RGB images, depth information, and semantic masks. This layer identifies obstacles, free space, structures, and local scene features required for navigation.
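As a rough illustration of what the perception layer might hand downstream, the sketch below models environmental cues as simple structured records. All names here (`SceneCue`, `PerceptionOutput`, the label strings) are illustrative assumptions, not an API from the original stack:

```python
from dataclasses import dataclass, field

@dataclass
class SceneCue:
    """One environmental cue extracted from RGB, depth, or semantic masks."""
    label: str        # e.g. "obstacle", "free_space", "structure"
    position: tuple   # (x, y, z) in the body frame, metres
    confidence: float # detector confidence in [0, 1]

@dataclass
class PerceptionOutput:
    """Structured result passed from the perception layer to the scene model."""
    cues: list = field(default_factory=list)

    def obstacles(self, min_confidence=0.5):
        """Return only obstacle cues above a confidence threshold."""
        return [c for c in self.cues
                if c.label == "obstacle" and c.confidence >= min_confidence]

# Example frame: two detections from an RGB-D pipeline
frame = PerceptionOutput(cues=[
    SceneCue("obstacle", (2.0, 0.5, 1.2), 0.91),
    SceneCue("free_space", (4.0, 0.0, 1.2), 0.80),
])
print(len(frame.obstacles()))  # → 1
```

The point of the structure is that downstream modules consume typed cues rather than raw pixels, which keeps the planning and reasoning layers sensor-agnostic.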
The output of the perception layer is passed into a scene model, which constructs a compact representation of walls, hazards, traversable regions, and 3D geometry. This model provides a grounded description of the environment that can be used by both planning and reasoning modules.

A prompt-driven task planning block introduces mission-level flexibility. Instead of relying only on hard-coded rules, the system can interpret high-level instructions such as corridor preference, inspection priority, formation behavior, or fallback conditions. This enables faster adaptation to new tasks and changing environments.
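A minimal sketch of what prompt-driven task planning could look like: a function that assembles mission intent, a scene summary, and vehicle status into one structured prompt for the reasoning layer. The field names and response schema are hypothetical, chosen only for illustration:

```python
def build_planning_prompt(mission_intent, scene_summary, vehicle_status):
    """Compose a structured planning prompt for the LLM reasoning layer.
    All section names and the JSON schema are illustrative, not a fixed format."""
    return (
        "MISSION INTENT:\n" + mission_intent + "\n\n"
        "SCENE SUMMARY:\n" + scene_summary + "\n\n"
        "VEHICLE STATUS:\n" + vehicle_status + "\n\n"
        'Respond with JSON: {"route_preference": ..., '
        '"task_priority": ..., "fallback": ...}'
    )

prompt = build_planning_prompt(
    mission_intent="Inspect corridor A; prefer wide passages",
    scene_summary="2 obstacles ahead, corridor width 1.4 m",
    vehicle_status="battery 62%, localization drift low",
)
print(prompt)
```

Because the mission instruction is plain text, swapping "inspection priority" for "formation behavior" requires no code change, only a different intent string.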

At the center of the architecture is the LLM reasoning layer. The LLM processes mission context, scene abstractions, and vehicle status to generate structured decisions that guide downstream modules. Its role is strategic rather than low-level. It may recommend path preferences, task priorities, coordination changes, or safety-aware responses when uncertainty increases.

For team operations, a mission coordinator allocates roles and synchronizes multiple drones. This module supports cooperative exploration, target search, area partitioning, and mission reallocation. The LLM enhances this process by adding semantic mission awareness to geometric planning.

The route selector and waypoint builder translate mission intent into executable subgoals. These modules select candidate paths and generate feasible intermediate waypoints that satisfy environmental and mission constraints.
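Because the LLM's output steers real flight behavior, its structured decisions need validation before they reach planning. The sketch below assumes a JSON response and falls back to a conservative decision on anything malformed; the route names and schema are invented for the example:

```python
import json

# Illustrative whitelist of route decisions the planner will accept
ALLOWED_ROUTES = {"corridor_left", "corridor_right", "hold"}

def parse_llm_decision(raw_text):
    """Validate the LLM's structured decision before it guides planning.
    Any malformed or unexpected output degrades to a conservative 'hold'."""
    safe_default = {"route_preference": "hold", "task_priority": "safety"}
    try:
        decision = json.loads(raw_text)
    except json.JSONDecodeError:
        return safe_default
    if decision.get("route_preference") not in ALLOWED_ROUTES:
        return safe_default
    return decision

print(parse_llm_decision('{"route_preference": "corridor_left", '
                         '"task_priority": "inspect"}'))
print(parse_llm_decision("not valid json"))  # degrades to the safe default
```

This keeps the LLM strategic: it proposes, but only schema-valid proposals ever become motion objectives.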
A local guidance and safety layer then ensures physically safe execution. This block combines path-following control, obstacle avoidance, speed regulation, and action filtering. It acts as a safety shield between high-level reasoning and flight execution. Finally, a command aggregator fuses outputs from planning, coordination, and local control into a consistent kinematic command for the drone or drone group.
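One concrete form the safety shield's action filtering could take is speed regulation against obstacle clearance. The thresholds and scaling rule below are illustrative tuning values, not parameters from the original stack:

```python
def filter_velocity_command(v_cmd, nearest_obstacle_dist,
                            v_max=2.0, stop_dist=0.5, slow_dist=2.0):
    """Safety shield: clamp commanded speed (m/s) by obstacle clearance (m).
    Thresholds are illustrative; a real system would tune them per vehicle."""
    if nearest_obstacle_dist <= stop_dist:
        return 0.0  # inside the stop radius: hold position
    if nearest_obstacle_dist < slow_dist:
        # Scale speed linearly between the stop and slow-down radii
        scale = (nearest_obstacle_dist - stop_dist) / (slow_dist - stop_dist)
        return min(v_cmd, v_max) * scale
    return min(v_cmd, v_max)  # clear space: only enforce the speed cap

print(filter_velocity_command(3.0, 5.0))   # → 2.0 (clamped to v_max)
print(filter_velocity_command(3.0, 0.4))   # → 0.0 (inside stop radius)
```

Filtering at this layer means even an aggressive upstream plan cannot command an unsafe speed near an obstacle.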
Functional significance
The main advantage of this stack lies in its separation of responsibilities. Perception grounds the system in real observations. The scene model organizes the environment into a usable structure. The LLM introduces contextual decision-making. Planning modules convert intent into motion objectives. Safety and control ensure that all generated commands remain feasible and secure.

This layered design improves robustness in denied-flight conditions because the system is not dependent on a single navigation source. If localization becomes uncertain, mission logic can trigger fallback behaviors. If the environment changes, the planner can re-route. If multiple drones are available, coordination can be adjusted dynamically.
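The fallback logic mentioned above can be sketched as a simple selector over coarse health signals. The thresholds and behavior names are assumptions made for the example, not part of the original design:

```python
def select_fallback(localization_drift_m, link_ok, battery_pct):
    """Pick a fallback behavior from coarse vehicle-health signals.
    All thresholds and behavior names here are illustrative."""
    if battery_pct < 15:
        return "land_in_place"          # energy is the hardest constraint
    if localization_drift_m > 0.5:
        return "hover_and_relocalize"   # pause until the pose estimate recovers
    if not link_ok:
        return "return_along_last_path" # retrace known-safe geometry
    return "continue_mission"

print(select_fallback(0.1, True, 62))   # → continue_mission
print(select_fallback(0.8, True, 62))   # → hover_and_relocalize
```

Keeping the fallback rules explicit and prioritized makes the degraded-mode behavior auditable, which matters when the nominal path depends on learned components.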

The LLM-enabled agent stack for denied flight provides a modular and intelligent framework for autonomous aerial navigation in challenging environments. By combining perception, scene understanding, prompt-driven planning, mission coordination, and safety-constrained control, the architecture supports adaptive and cooperative drone behavior when GPS is not available. This approach represents a strong direction for next-generation aerial autonomy, especially in missions that require resilience, flexibility, and higher-level reasoning.


