Theses and Dissertations

Effects of Saltatory Rewards and Generalized Advantage Estimation on Reference-Based Deep Reinforcement Learning of Humanlike Motions

Md Rysul Kabir, The University of Texas Rio Grande Valley

Date of Award

8-2021

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Dr. Dong-Chul Kim

Second Advisor

Dr. Zhixiang Chen

Third Advisor

Dr. Emmett Tomai

Abstract

When learning physics-based character skills, deep reinforcement learning (DRL) can suffer from slow convergence and locally optimal solutions while training a reinforcement learning (RL) agent. In an environment with reward saltation, those saltatory rewards can be magnified, from the perspective of sample usage, to enlarge the agent’s experience pool during training. In this work, we propose two modified algorithms. The first adds a parameter-based reward optimization process to the proximal policy optimization (PPO) algorithm; it magnifies the saltatory rewards and thereby increases the agent’s utilization of previous experiences. The second introduces generalized advantage estimation into the advantage computation of the advantage actor-critic (A2C) algorithm, which resulted in faster convergence to the global optimal solutions of DRL. We conducted all experiments in a custom reinforcement learning environment built with the PyBullet physics engine. In that environment, the RL agent has a humanoid body that learns humanlike motions (e.g., walk, run, spin, cartwheel, spinkick, and backflip) by imitating example reference motions. Our experiments show significant improvements in the performance and convergence speed of DRL for learning humanlike motions with the modified versions of PPO and A2C compared with their vanilla versions.
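
For readers unfamiliar with generalized advantage estimation (GAE), the following is a minimal sketch of the standard estimator as it is commonly defined in the literature, not the thesis's own implementation. The function name `gae_advantages`, its parameters, and the default values of `gamma` and `lam` are assumptions chosen for illustration, and the sketch assumes a single trajectory segment with no episode termination inside it.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float,
    gamma: float = 0.99,  # discount factor (illustrative default)
    lam: float = 0.95,    # GAE smoothing parameter (illustrative default)
) -> List[float]:
    """Compute GAE advantages for one trajectory segment.

    rewards[t] is the reward received after acting in state t,
    values[t] is the critic's estimate for state t, and last_value
    is the critic's estimate for the state that follows the segment.
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the accumulated sum.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference residual.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of residuals from t onward.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With `lam=0` the estimator reduces to the one-step temporal-difference residual, and with `lam=1` it becomes the discounted return minus the value baseline; intermediate values trade bias against variance.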

Comments

https://go.openathens.net/redirector/utrgv.edu?url=https://www.proquest.com/dissertations-theses/effects-saltatory-rewards-generalized-advantage/docview/2596055293/se-2?accountid=7119

Recommended Citation

Kabir, Md Rysul, "Effects of Saltatory Rewards and Generalized Advantage Estimation on Reference-Based Deep Reinforcement Learning of Humanlike Motions" (2021). Theses and Dissertations. 899.
https://scholarworks.utrgv.edu/etd/899

Included in

Computer Sciences Commons
