Electrical and Computer Engineering Faculty Publications and Presentations
Document Type
Article
Publication Date
10-7-2025
Abstract
Introduction: Developing a reliable and trustworthy navigation policy with deep reinforcement learning (DRL) for mobile robots is extremely challenging, particularly in real-world, highly dynamic environments. In particular, exploring and navigating unknown environments without prior knowledge, while avoiding obstacles and collisions, is a cumbersome task for mobile robots.
Methods: This study introduces Trust-Nav, a novel trustworthy navigation framework that uses variational policy learning to quantify uncertainty in the estimates of the robot’s action, localization, and map representation. Trust-Nav employs a Bayesian variational approximation of the posterior distribution over the policy network’s parameters. Policy-based and value-based learning are combined to guide the robot’s actions in unknown environments. We derive the propagation of variational moments through all layers of the policy network, employing a first-order approximation for the nonlinear activation functions. The uncertainty in the robot’s action is measured by the variational covariance propagated through the DRL policy network, while the uncertainty in the robot’s localization and mapping is embedded in the reward function and stems from the classical Theory of Optimal Experimental Design. The total loss function optimizes the parameters of the policy and value networks to maximize the robot’s cumulative reward in an unknown environment.
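To illustrate the moment-propagation idea described in the abstract, the following is a minimal NumPy sketch of propagating a mean and variance through one dense layer and a tanh activation via a first-order (delta-method) approximation. The diagonal-covariance simplification, function names, and layer shapes are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def dense_moments(mean_in, var_in, W_mean, W_var, b_mean):
    """Propagate input mean/variance through a dense layer with
    variational (mean, variance) weights; diagonal-covariance assumption."""
    # Mean of the pre-activation: E[Wx + b] = W_mean @ E[x] + b
    mean_out = W_mean @ mean_in + b_mean
    # Variance for independent W and x:
    # Var(y) = E[W]^2 Var(x) + Var(W) (Var(x) + E[x]^2)
    var_out = (W_mean ** 2) @ var_in + W_var @ (var_in + mean_in ** 2)
    return mean_out, var_out

def tanh_moments(mean_in, var_in):
    """First-order approximation of a tanh nonlinearity:
    f(x) ~ f(mu) + f'(mu)(x - mu), so Var(f(x)) ~ f'(mu)^2 Var(x)."""
    mean_out = np.tanh(mean_in)
    deriv = 1.0 - mean_out ** 2          # d/dx tanh(x) = 1 - tanh(x)^2
    var_out = (deriv ** 2) * var_in
    return mean_out, var_out
```

In such a scheme, the variance propagated to the final layer serves as a per-action uncertainty estimate, which the framework can use to flag unreliable decisions.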
Results: Experiments conducted using the Gazebo robotics simulator demonstrate the superior performance of the proposed Trust-Nav model in achieving robust autonomous navigation and mapping.
Discussion: Trust-Nav consistently outperforms deterministic DRL approaches, particularly in challenging settings involving noisy conditions and adversarial attacks. Integrating uncertainty into the policy network promotes safer and more reliable navigation, especially in complex or unpredictable environments. Trust-Nav is a step toward deployable, self-aware robotic systems capable of recognizing and responding to their own limitations.
Recommended Citation
Dera, Dimah, Karla Van Aardt, Liam Ernst, Rohaan Nadeem, and Bryan Pedraza. "Trustworthy navigation with variational policy in deep reinforcement learning." Frontiers in Robotics and AI 12 (2025): 1652050. https://doi.org/10.3389/frobt.2025.1652050
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Publication Title
Frontiers in Robotics and AI
DOI
10.3389/frobt.2025.1652050

Comments
Student publication.
© 2025 Bockrath, Ernst, Nadeem, Pedraza and Dera. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).